Building Software While You Sleep: The Fully Automated Software Development Lifecycle

We don’t solve software issues at Astrohacker anymore. We specify them. The issue gets a clear goal, explicit verification criteria, and a set of guardrails — and then a command runs the rest. One agent designs experiments, writes the code, and runs the tests. A second agent reviews its work adversarially at every step. The loop runs end to end, and the software gets built while we sleep.

This is not a demo or a roadmap. It is how the entire portfolio ships today — the methodology Astrohacker founder Ryan X. Charles developed across TermSurf, Shannon, KeyPears, and EarthBucks, and now runs as a single automated pipeline. We’ve crossed into the automated software development age, and the commit logs are public.

The lifecycle, end to end

Here is the whole loop, with no human in the middle of it.

A goal is written down: not a plan, a goal. What “solved” means is stated as explicit verification criteria. The guardrails go next: which tests to write, which linters and type checkers must pass, which code patterns the solution has to follow. Then the machine takes over. It designs one experiment. A second agent reviews that design. If the design passes, the first agent implements it, runs the verification, and records the result, which the second agent reviews in turn. Then the first agent designs the next experiment, informed by what the last one taught it. The cycle repeats (design, review, implement, verify, record, review) until the verification criteria pass and the issue is closed.
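As a rough illustration, the loop can be sketched in a few lines of Python. This is not Astrohacker's actual tooling: every name here (`run_issue`, `Result`, the four callables) is hypothetical, the agents are stand-in functions, and the `max_experiments` cap is an assumption added so the sketch always terminates. The second review gate on the result is the one described in the adversarial review section of this post.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Result:
    experiment: str   # the design that was attempted
    passed: bool      # verified AND accepted by the reviewer
    notes: str = ""


def run_issue(
    design: Callable[[List[Result]], str],
    review_design: Callable[[str], bool],
    implement_and_verify: Callable[[str], bool],
    review_result: Callable[[str], bool],
    max_experiments: int = 50,  # assumed safety cap, not from the post
) -> List[Result]:
    """Repeat design, review, implement, verify, record until an experiment is accepted."""
    log: List[Result] = []
    for _ in range(max_experiments):
        plan = design(log)  # informed by every earlier result
        if not review_design(plan):
            log.append(Result(plan, False, "design rejected in review"))
            continue
        verified = implement_and_verify(plan)
        accepted = verified and review_result(plan)
        if accepted:
            note = ""
        elif verified:
            note = "result rejected in review"
        else:
            note = "verification failed"
        log.append(Result(plan, accepted, note))  # record pass or fail
        if accepted:
            break  # criteria pass and the review agrees: issue closed
    return log
```

In a real run each callable would wrap an agent invocation or a test command; here they are plain functions so the control flow is easy to read.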

That is the fully automated software development lifecycle. Specification in, shipped software out, and the iteration in between belongs to the machine.

The goal is the whole job

What’s left for a human is the front of the loop, not the inside of it. The work is writing the issue well enough that solving it becomes mechanical: a clear description of the goal, concrete criteria for what counts as done, and the guardrails that keep the solution honest. Once that is written down — in the issue itself or in the project’s AGENTS.md — the job is essentially over. We are not the ones who solve the issue. We are the ones who define it precisely enough that the solving can be automated. This is the front edge of a shift the founder has called setting the goal and walking away.
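One way to picture a well-written issue is as a small data structure: a goal, a set of named checks, and a list of guardrails. The sketch below is only an assumption about shape. `IssueSpec`, its fields, and the example values are hypothetical, and the checks are stubbed with lambdas where a real setup would run a test suite, linter, or type checker.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class IssueSpec:
    goal: str                                 # what "solved" is aiming at
    criteria: Dict[str, Callable[[], bool]]   # named checks that define "done"
    guardrails: List[str] = field(default_factory=list)

    def unmet(self) -> List[str]:
        """Names of the verification criteria that currently fail."""
        return [name for name, check in self.criteria.items() if not check()]

    def is_done(self) -> bool:
        """The issue is done only when every criterion passes."""
        return not self.unmet()


# Hypothetical example values; the lambdas stand in for real commands.
spec = IssueSpec(
    goal="Render PDFs inline in the terminal browser",
    criteria={
        "tests pass": lambda: True,
        "type checker clean": lambda: False,
    },
    guardrails=[
        "write a test for each new behavior",
        "linters must pass before commit",
    ],
)
```

With these stubbed checks, `spec.is_done()` is false and `spec.unmet()` names the one failing criterion, which is the kind of signal the loop needs to decide whether to keep going.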

Experiments, not plans

The loop does not march through a plan. It runs experiments. Each experiment fully solves the issue, partially solves it, or fails. Failure is fine, because a failed experiment is progress: it means something was poorly understood, and now it isn’t. Every result is logged whether it passes or fails, so the record narrows toward the goal with each iteration. The documentation is the primary artifact; the code is what happens when an experiment returns a pass.

This is research-driven development, and it is the substrate the automation runs on. You never list the experiments up front, because the result of experiment N is what tells you what experiment N+1 should be. The lab notebook stays current without effort because the same agents that write the code write the notebook — which is exactly what makes the notebook cheap enough to keep, and what makes automating the loop possible at all.
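A lab notebook of this kind can be modeled as an append-only list of experiment records, where the full record is what gets fed into the design of the next experiment. The following is a minimal sketch under that assumption; `Outcome`, `Experiment`, `Notebook`, and the text format are all hypothetical, not the format used in the Astrohacker repos.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Outcome(Enum):
    SOLVED = "solved"
    PARTIAL = "partial"
    FAILED = "failed"


@dataclass
class Experiment:
    number: int
    hypothesis: str
    outcome: Outcome
    learned: str      # what this result taught, pass or fail


@dataclass
class Notebook:
    entries: List[Experiment] = field(default_factory=list)

    def record(self, hypothesis: str, outcome: Outcome, learned: str) -> Experiment:
        """Append a result; failures are recorded the same way as passes."""
        entry = Experiment(len(self.entries) + 1, hypothesis, outcome, learned)
        self.entries.append(entry)
        return entry

    def context_for_next(self) -> str:
        """Render the history that experiment N+1 is designed from."""
        return "\n".join(
            f"Experiment {e.number} [{e.outcome.value}]: {e.hypothesis} -> {e.learned}"
            for e in self.entries
        )
```

Nothing is ever removed from `entries`, so the notebook doubles as the engineering history: the next design step reads `context_for_next()` rather than a plan written in advance.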

Adversarial review is the keystone

The thing that makes an unattended loop trustworthy is a second agent trying to tear the work apart. It is not enough for an agent to write and pass its own tests — marking your own homework is how slop ships. So a separate agent performs an adversarial review at the two stages that matter: the experiment design must pass before any code is written, and the result must pass before the next experiment begins.

The reviewer doesn’t need to be a different model. It needs to come at the work cold, with a fresh context rather than a fork of the author’s. A clean built-in subagent works; so does a call out to a separate CLI tool. Different context is all you need, as long as both agents are state of the art. That cold second look is what keeps an overnight run from drifting, and it is the difference between an automated pipeline you can trust and one you have to babysit.
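The distinction between a fork and a fresh context is easy to show in code. In this sketch a context is just a list of messages: a fork copies the author's entire history, while a fresh review context is built from nothing but the criteria and the artifact under review. `AgentContext` and `fresh_review_context` are hypothetical names, and the reviewer prompt wording is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AgentContext:
    messages: List[str] = field(default_factory=list)

    def fork(self) -> "AgentContext":
        # A fork inherits everything the author said and assumed,
        # so it tends to inherit the author's blind spots too.
        return AgentContext(list(self.messages))


def fresh_review_context(artifact: str, criteria: List[str]) -> AgentContext:
    """Start the reviewer cold: only the criteria and the artifact, no author history."""
    messages = ["Review adversarially. Look for reasons this should not pass."]
    messages += [f"Criterion: {c}" for c in criteria]
    messages.append(f"Artifact under review:\n{artifact}")
    return AgentContext(messages)
```

The reviewer built this way never sees the author's reasoning, only the design or result it is asked to judge, which is the property the cold second look depends on.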

The receipts

The claim is falsifiable, and the proof is in the repos. With this flow we added full PDF support to TermSurf — a GPU-accelerated Chromium browser inside the terminal — in about a day; rendering PDFs through a Chromium embedder by hand would have cost months. PDFs now render inline, scroll, select, and print. And as this post goes out, the loop is doing something no one would attempt by hand: porting Ghostty from Zig into Rust, one terminal subsystem at a time. That issue is over a hundred experiments deep, nearly all passing, with hundreds of commits landed in a matter of days — tabstops, page storage, selection, formatters, escape-sequence handlers — each one designed, reviewed, implemented, tested, and reviewed again, with no hands on the keyboard. Click into the issues/ folder of any Astrohacker repo and you are reading the actual engineering history, not a sanitized summary.

Why now

This works today because three things converged at once. The models finally crossed the quality bar where their code is better than what most engineers write by hand. The tooling grew goal-level orchestration, so a whole issue can be handed off in a single command. And the best practices — tests, explicit verification, adversarial review — matured into a loop that produces output you can actually trust. None of these alone is enough. Together they mean the software development lifecycle, end to end, can run without a human inside it.

So that’s how Astrohacker builds now. Set the goal. Hack the universe.