Loop engineering: stop prompting, start designing the loop.

For two years the unit of AI work was the prompt. You typed, you read, you typed again. The teams pulling ahead now have stopped prompting turn by turn. They design the loop once, and watch instead of drive.

The prompt isn't the unit anymore.

Here is a pattern you will recognize. You open Claude Code, give it a task, and then you babysit. Run the tests. Read the output. Tell it what broke. Tell it to try again. Approve. Nudge. Wait. You are not doing the work; you are driving the thing that does the work, one prompt at a time, and you cannot step away because the moment you do, it stops.

That babysitting is a tax. And it is the single biggest thing standing between "AI helps me code" and "AI does the work while I do something else." The fix is not a better prompt. It is a different unit of work entirely.

The unit of work is no longer the prompt. It is the loop.

What loop engineering actually is

Loop engineering is the practice of designing an autonomous agent loop instead of prompting an agent step by step. You state what "done" looks like as a condition something can verify, and the agent iterates until that condition is provably true. You design the loop once. You stop prompting each step.

It rests on four ideas, and the whole discipline fits in them:

  1. Goal: A recursive intent with a verifiable stopping condition. You state what done looks like ("all tests in tests/auth pass and lint is clean"), and the agent iterates until that is provably true. Not "improve the auth module." Something a script can check.
  2. Loop: The system that runs the goal on a heartbeat instead of you prompting turn by turn. It discovers work, dispatches it, checks the result, writes down what happened, and picks the next move.
  3. Task: A discrete unit the loop finds and triages, then hands to a sub-agent, usually inside an isolated worktree so parallel agents do not collide on the same files.
  4. Completion: The verified stop. One agent writes; a second, with different instructions and often a cheaper model, verifies against the spec and real command output. The agent that did the work is not the one allowed to declare victory.
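
The four ideas above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Claude Code implements anything: `run_goal`, `make`, and `is_done` are hypothetical names, and the maker in the usage note is a toy stand-in for an agent.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LoopResult:
    done: bool          # did the stopping condition hold when we stopped?
    iterations: int     # how many times the maker ran
    log: List[str] = field(default_factory=list)


def run_goal(
    make: Callable[[int], None],   # hypothetical maker: does one unit of work
    is_done: Callable[[], bool],   # hypothetical checker: the verifiable condition
    max_iterations: int = 10,      # hard cap so the loop cannot run forever
) -> LoopResult:
    log: List[str] = []
    for i in range(1, max_iterations + 1):
        # Completion is decided by the checker, never by the maker.
        if is_done():
            log.append(f"iteration {i}: condition holds, stopping")
            return LoopResult(True, i - 1, log)
        make(i)
        log.append(f"iteration {i}: maker ran")
    # Cap reached: report whatever the checker says now.
    return LoopResult(is_done(), max_iterations, log)
```

For example, with a set of three failing test names and a maker that removes one per call, `run_goal(lambda i: failing.pop(), lambda: not failing)` stops after three maker runs with `done` set to true.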

Why a loop beats a cron job

The instinct is to call this automation and move on. But there is a precise line between a loop and the automation you already know.

A cron job runs a fixed script. A loop runs a model that reads the current state and chooses its next action.

That difference is the entire value. A script does the same thing every time, whether or not it still makes sense. A loop reads what changed since the last run and decides what to do about it. It can find work you did not enumerate, triage it, skip what is already handled, and stop when the goal is met. The intelligence lives in the decision, not in the schedule.
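
The contrast can be made concrete with a small sketch. In a real loop a model makes the decision; here a plain rule stands in for it, and `cron_style`, `loop_style`, `handled`, and `goal_met` are hypothetical names chosen for illustration.

```python
from typing import List, Set


def cron_style(items: List[str]) -> List[str]:
    """Fixed script: act on every item on every run, whatever the state."""
    return [f"process {item}" for item in items]


def loop_style(items: List[str], handled: Set[str], goal_met: bool) -> List[str]:
    """State-aware run: stop if the goal is met, skip what is already handled."""
    if goal_met:
        return []
    return [f"process {item}" for item in items if item not in handled]
```

Given the same two items, `cron_style` always returns two actions, while `loop_style` returns one if the other is already handled and none once the goal is met.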

The one rule that makes it work or fail

Most loops that fail, fail for the same reason: the goal was not checkable. "Refactor the billing module" gives the evaluator nothing to test, so the loop either never stops or stops on a vibe. "Billing tests green and no type errors" gives it a fact to verify.

If you cannot phrase the stopping condition as something a command or a second model can check by reading real output, the loop is not ready. Narrow the task until you can. This is the discipline that separates loop engineering from "let the AI run wild and hope."
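
One way to make a stopping condition checkable is to express it as a list of commands that must all exit with status 0. The sketch below assumes that convention; `condition_holds` is a hypothetical helper, and the two stand-in commands only simulate a passing and a failing check so the example runs anywhere.

```python
import subprocess
import sys
from typing import Sequence


def condition_holds(commands: Sequence[Sequence[str]]) -> bool:
    """Return True only if every command exits with status 0."""
    for cmd in commands:
        completed = subprocess.run(list(cmd), capture_output=True, text=True)
        if completed.returncode != 0:
            return False
    return True


# Stand-ins for real checks. In practice these would be your own test and
# type-check commands (an assumption; the article does not name specific tools).
PASSING = [sys.executable, "-c", "raise SystemExit(0)"]
FAILING = [sys.executable, "-c", "raise SystemExit(1)"]
```

If a goal cannot be written as an input to something like `condition_holds`, that is a sign it still needs narrowing.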

The maker / checker split

Models grade their own homework too generously. So one agent makes the change and a second one, with different instructions and usually a faster, cheaper model, checks it against the real test, lint, and build output. Self-verification does not count as done. This split is the thing that makes walking away from the loop safe instead of reckless.
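
A rough sketch of the split: the checker receives the maker's report but bases its verdict only on real command output. All names here (`MakerReport`, `CommandOutput`, `Verdict`, `check`) are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class MakerReport:
    claims_done: bool   # what the maker says about its own work
    summary: str


@dataclass
class CommandOutput:
    exit_code: int      # exit status of the real test, lint, or build run
    text: str


@dataclass
class Verdict:
    accepted: bool
    reason: str


def check(report: MakerReport, output: CommandOutput) -> Verdict:
    """Accept only on real output; the maker's claim is recorded, not trusted."""
    if output.exit_code != 0:
        return Verdict(
            False,
            f"checks failed (exit {output.exit_code}); "
            f"maker claimed done={report.claims_done}",
        )
    return Verdict(True, "checks passed on real output")
```

A maker that reports success while the test run exits non-zero is rejected, and a maker that reports doubt while the run exits 0 is accepted.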

The fast path: three commands

You do not need a framework to start. In Claude Code, most loops are a single native command. Hand off an objective and walk away; a separate evaluator checks after every turn whether the condition holds, and if it does not, the agent keeps going.

run until done
# iterate until the condition is provably true
/goal all tests in tests/auth pass and `npm run lint` exits clean

# cap the spend before you walk away
/goal --tokens 250K all tests in tests/auth pass and lint is clean

Want it to run on a cadence instead of once? Cycle on an interval for "watch this for a while" work.

run on a schedule
/loop 5m check error logs and open an issue for any new stack trace

And for a large change across many files, run parallel agents in isolated worktrees so their edits cannot collide.

parallel work
/batch migrate every component in src/legacy/ to the new design tokens, one worktree per directory, verify each against the existing snapshot tests before merging

That is the on-ramp. The full discipline adds a state file the loop reads first and writes last (so tomorrow's run resumes where today's stopped), the maker and checker as separate sub-agents, and a hard budget. But you can feel the shift from the one-liners alone.
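
The read-first, write-last state file and the hard budget can be sketched as follows. This assumes a JSON file and a flat token cost per task purely for illustration; the function names and field names are hypothetical, and the skill's own loop-state.md template may look quite different.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def load_state(path: Path) -> Dict:
    """Read the state file, or start fresh if this is the first run."""
    if path.exists():
        return json.loads(path.read_text())
    return {"completed": [], "tokens_spent": 0}


def run_cycle(path: Path, tasks: List[str], cost_per_task: int, budget: int) -> Dict:
    state = load_state(path)  # read first
    for task in tasks:
        if task in state["completed"]:
            continue  # already handled on an earlier run
        if state["tokens_spent"] + cost_per_task > budget:
            break  # hard budget: stop before overspending
        # Stand-in for dispatching the task to a sub-agent.
        state["completed"].append(task)
        state["tokens_spent"] += cost_per_task
    path.write_text(json.dumps(state, indent=2))  # write last
    return state


def demo() -> List[Dict]:
    """Two runs against the same state file; the second resumes the first."""
    with tempfile.TemporaryDirectory() as tmp:
        state_path = Path(tmp) / "loop-state.json"
        today = run_cycle(state_path, ["a", "b", "c"], cost_per_task=100, budget=250)
        tomorrow = run_cycle(state_path, ["a", "b", "c"], cost_per_task=100, budget=450)
        return [today, tomorrow]
```

In `demo`, the first run completes two tasks and stops at the budget; the second run, given a larger cumulative budget, skips those two and finishes the third.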

Where this actually pays off in marketing

This is not just an engineering trick. The work we do at SuperMarketers is full of recurring, checkable jobs that were quietly eating human hours:

The loop | Stopping condition it verifies
Competitive gap closing | Every query where a competitor is cited in AI search and we are not now has a brief that passes the AEO rubric, or is ticketed with a reason.
Content decay audit | Every published page either still earns its citations or is flagged for refresh, checked weekly.
CI / link integrity | No broken internal links and no failing build on the marketing site, every morning.
Voice gate | Every drafted asset passes the founder voice check before it reaches a human for the publish decision.

Notice what every one of these shares: a clear "done," a separate check, and a hard stop at the gate. The loop does the finding, the drafting, and the checking. The publish decision stays human. That is the contract.
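
To show what a checkable "done" looks like for one of these, here is a sketch of the link-integrity condition. It assumes the site has already been crawled into a mapping from page path to outbound links; `broken_internal_links` and the sample paths are hypothetical.

```python
from typing import Dict, List, Tuple


def broken_internal_links(pages: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Return (source page, target) pairs whose internal target does not exist.

    A link counts as internal if it starts with "/". External links are ignored.
    """
    return sorted(
        (source, target)
        for source, links in pages.items()
        for target in links
        if target.startswith("/") and target not in pages
    )


def site_is_clean(pages: Dict[str, List[str]]) -> bool:
    """The stopping condition: no broken internal links remain."""
    return not broken_internal_links(pages)
```

The loop keeps working while `site_is_clean` is false and has a concrete list of pairs to fix or ticket on each pass.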

The guardrails nobody mentions

Three problems get sharper, not easier, as the loop gets better. Pretending otherwise is how teams get burned.

Verification is still on you. An unattended loop is an unattended mistake-maker. The maker / checker split is what makes "done" mean something, and even then "done" is a claim, not a proof. Watch the first cycle every time. Confirm the checker actually fails bad work before you trust it to run alone.
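
One way to confirm the checker actually fails bad work is to feed it known-bad and known-good samples before trusting it. The sketch below assumes the checker is a function from output text to a pass or fail boolean; the helper name and the sample strings are illustrative, not real logs.

```python
from typing import Callable, Iterable


def checker_is_trustworthy(
    checker: Callable[[str], bool],
    known_bad: Iterable[str],
    known_good: Iterable[str],
) -> bool:
    """A checker earns trust only if it rejects every bad sample and accepts every good one."""
    rejects_bad = all(not checker(sample) for sample in known_bad)
    accepts_good = all(checker(sample) for sample in known_good)
    return rejects_bad and accepts_good


# Illustrative samples only.
BAD_OUTPUT = ["FAILED tests/auth/test_login.py", ""]
GOOD_OUTPUT = ["3 passed"]


def lenient(output: str) -> bool:
    """A checker that approves everything, which is the failure to catch."""
    return True


def strict(output: str) -> bool:
    """Approves only output that reports passes and no failures."""
    return "FAILED" not in output and "passed" in output
```

Here `strict` passes the canary and `lenient` does not, which is the observation to make on cycle one before letting the loop run unattended.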

Comprehension debt. The faster the loop ships work you did not do yourself, the wider the gap between what exists and what you understand. Read what it produced. A smooth loop just grows this debt faster.

Cognitive surrender. It is tempting to stop having an opinion and take whatever comes back. Two people run the identical loop: one moves faster on work they understand, the other quietly stops understanding their own system. The loop cannot tell the difference. You can.

Build the loop. Stay the engineer.

We are open-sourcing the loop-engineering skill

We packaged everything above into a skill for Claude Code, and we are releasing it for free. It teaches the agent the model, the build recipe, and the guardrails, and ships templates for the maker, the checker, and the state file so you are not assembling the scaffold by hand.

Get it on GitHub

github.com/genwfurukawa/supermarketers-skills/tree/main/loop-engineering — the full SKILL.md plus three asset templates: maker.md (explores and implements one task), checker.md (verifies output against real signals), and loop-state.md (the state file plus a headless cron harness).

How to use it

  1. Drop it into your skills folder. Clone the repo, or copy the loop-engineering directory into .claude/skills/. Claude Code picks it up automatically; no config.
  2. Invoke it in plain language. Say something like: "Use the loop-engineering skill to set up a loop that audits broken links every morning until the site is clean, and ping me for anything it cannot fix." The skill stands up the maker, the checker, and the state file for you.
  3. Or just start with a one-liner. If you only need the fast path, /goal, /loop, and /batch are native. The skill is for when you want the full unattended scaffold with a verifier you can trust.
  4. Watch cycle one, then walk away. Confirm the checker fails a weak result before you let it run alone. That single observation is what earns the right to stop babysitting.

If you are building agentic workflows and any of this resonates, take the skill, fork it, tell us what broke. And if you want a loop wired into your own go-to-market motion, that is exactly the kind of thing we install.

◆ ◆ ◆
Stop driving the agent. Start designing the loop.

Book a 30-minute install conversation.