Coding agents

AI coding loops

AI coding loops are repo-safe agent cycles for repeated coding work: inspect, patch, test, review, and stop with evidence.

[Image: abstract developer workbench showing the inspect, patch, test, and review loop]

Answer-first definition

An AI coding loop gives a coding agent durable task state, an isolated workspace when needed, allowed actions, a verifier such as tests or CI, and an approval boundary before merge or production change.
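The parts named in this definition can be sketched as a small control loop. This is a minimal illustration, not a reference implementation: `LoopState`, `run_loop`, `propose_patch`, `verify`, and the status strings are all hypothetical names chosen for the example, and a real loop would persist its state and call an actual agent and test runner.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class LoopState:
    """Durable task state (assumption: a real loop would persist this to disk)."""
    task: str
    attempts: int = 0
    evidence: List[str] = field(default_factory=list)
    status: str = "open"  # open -> needs_approval | blocked


def run_loop(
    state: LoopState,
    propose_patch: Callable[[LoopState], str],  # hypothetical agent step
    verify: Callable[[str], bool],              # hypothetical verifier, e.g. tests or CI
    allowed_actions: Set[str],
    max_attempts: int = 3,
) -> LoopState:
    """Patch and verify until the verifier passes or attempts run out."""
    # Allowed actions: refuse to start if patching is not permitted.
    if "patch" not in allowed_actions:
        state.status = "blocked"
        state.evidence.append("patching is not an allowed action")
        return state

    while state.attempts < max_attempts:
        state.attempts += 1
        patch = propose_patch(state)
        passed = verify(patch)
        outcome = "passed" if passed else "failed"
        state.evidence.append(f"attempt {state.attempts}: verifier {outcome}")
        if passed:
            # Approval boundary: the loop stops here and never merges on its own.
            state.status = "needs_approval"
            return state

    # Out of attempts: stop with the evidence collected so far.
    state.status = "blocked"
    return state
```

The loop ends in one of two states, both carrying an evidence log: `needs_approval` when the verifier passes, or `blocked` when it cannot make progress. Merging stays outside the function by design.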

Common coding loop types

| Loop type | Verifier | Safe output |
| --- | --- | --- |
| CI Failure Sweeper | CI job passes or failing log is summarized | Patch or report |
| PR Babysitter | PR checks and reviewer comments are resolved | Updated PR, not auto-merge |
| Flaky Test Stabilizer | Repeated test runs prove stability | Patch plus evidence log |
| Maker-Checker Review | Separate agent or human reviews diff against rubric | Review notes or gated approval |
| Dependency Update Loop | Tests, lockfile diff, vulnerability status | Approval-gated PR |
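As one illustration of a verifier from this table, the "repeated test runs prove stability" check for a flaky test can be sketched as a consecutive-pass streak. The function name `passes_streak` and its parameters are assumptions made for this example; `run_once` stands in for whatever actually executes the test.

```python
from typing import Callable, List, Tuple


def passes_streak(
    run_once: Callable[[], bool],  # hypothetical: runs the test once, True on pass
    required: int,                 # consecutive passes needed to call it stable
    max_runs: int,                 # hard cap so the loop always terminates
) -> Tuple[bool, List[bool]]:
    """Return (stable, log), where log records every run for the evidence file."""
    log: List[bool] = []
    streak = 0
    for _ in range(max_runs):
        ok = run_once()
        log.append(ok)
        # Any failure resets the streak; only consecutive passes count.
        streak = streak + 1 if ok else 0
        if streak >= required:
            return True, log
    return False, log
```

In practice `run_once` might wrap a test command and report whether it exited with status 0. The returned log doubles as the "evidence log" that accompanies the patch.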

Relevant loops

| Loop | Category | Difficulty | Cadence | Verification |
| --- | --- | --- | --- | --- |
| API Contract Drift | Engineering | Intermediate | After API changes or SDK releases | Server behavior, client types, examples, and docs agree on request/response contracts. |
| Acceptance Scenario Lockstep | Engineering | Intermediate | Before and during ambiguous feature work | The same scenarios written before implementation pass after the change, and any scope expansion is explicitly approved. |
| Agent Instructions After-Action | Engineering | Beginner | After a successful or painful agent coding session | Repo instructions contain only reusable, source-grounded lessons and the next similar task can start without rediscovering the same trap. |
| Architecture Rubric Refactor | Engineering | Advanced | When architecture work has a defined scope | Scoped module meets the written rubric, tests pass, and unresolved objections are explicit. |
| Behavior Ladder TDD | Engineering | Intermediate | When implementing logic-heavy features | Each behavior test fails before implementation, passes after the smallest change, and remains green through final refactor. |
| CI Optimization | Engineering | Advanced | Monthly or when CI is painful | CI p50/p95 improves against the same workflow without weakening tests or hiding failures. |
| Claude Code Repo Readiness | Engineering | Beginner | Before major agent work | Repo has agent instructions, documented commands, architecture notes, risk areas, and a docs/loops scaffold. |
| Cold Load Trim | Engineering | Advanced | When first visit feels heavy | Initial screen downloads fewer bytes while screenshots and behavior remain unchanged. |
| Completion Promise Loop | Engineering | Intermediate | For scoped implementation tasks where half-finished output is the main risk | Every acceptance criterion is checked with tests, browser evidence, logs, screenshots, or a clear blocker report before the agent stops. |
| Fresh Clone Onboarding | Engineering | Intermediate | Before onboarding | A clean machine reaches the documented ready state using only the README. |
| Parallel Agent Worktree Sweep | Engineering | Advanced | When several independent repo improvements can run at once | Each agent branch has isolated scope, passing checks, a summary, and no conflicting files before integration review. |
| Project Docs Freshness | Engineering | Beginner | Nightly or after meaningful code changes | Changed behavior, APIs, CLI commands, config, and workflows are reflected in docs. Docs checks pass. |
| Reference Oracle Implementation | Engineering | Advanced | Before implementing tricky behavior with an external source of truth | Generated outputs match the reference oracle across the agreed fixture set, with tolerances documented for legitimate differences. |
| Spec to Task Shards | Engineering | Intermediate | Before multi-file feature work or migrations | A written spec, non-goals, acceptance checks, and ordered task shards exist before implementation begins. |
| Test Flake Stabilizer | Engineering | Intermediate | When tests are inconsistent | The repaired test and full suite pass for the required consecutive-run streak. |
| Test and Logging Coverage | Engineering | Intermediate | Weekly or before release | Critical flows have useful tests and structured logs for representative success and failure paths. |
| Trace-First Debugging | Engineering | Intermediate | When an agent is tempted to patch a bug from a hunch | The bug is reproduced, the root cause is evidenced by traces or tests, and the fix includes a regression check. |
| Adversarial PR Review | Evaluation | Advanced | For meaningful PRs | An independent critic approves the unchanged version or only accepted findings remain. |
| Agent Merge Queue Review | Evaluation | Advanced | After multiple agent-generated PRs or branches accumulate | Only branches with passing checks, clear intent, non-conflicting scope, and human-readable evidence are merged or promoted. |
| Browser Quality Streak | Evaluation | Intermediate | Before release | N realistic scenarios pass consecutively, and earlier failures have regression coverage. |
| Dependency CVE Burndown | Security | Advanced | After security scan | No exploitable high or critical CVE remains without an explicit risk decision. |
| Sandboxed YOLO Probe | Security | Advanced | Before allowing autonomous shell-heavy agent runs | The agent can run needed commands inside the sandbox, cannot reach forbidden files/secrets, and produces a replayable diff or report before host-side changes. |
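Several verification rules in this table reduce to mechanical checks. As a rough sketch, the gate shared by Parallel Agent Worktree Sweep and Agent Merge Queue Review (passing checks, a summary, no conflicting files) might look like the following. `BranchReport` and `promotable` are hypothetical names, and treating any shared file path as a conflict is a simplifying assumption; the result is only a shortlist for human integration review.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class BranchReport:
    """Hypothetical per-branch summary an agent hands back for review."""
    name: str
    checks_passed: bool
    changed_files: FrozenSet[str]
    summary: str


def promotable(reports: List[BranchReport]) -> List[str]:
    """Names of branches that pass checks, have a summary, and share no files."""
    ready: List[str] = []
    for report in reports:
        # Assumption: touching the same path as any other branch is a conflict.
        overlaps = any(
            other.name != report.name
            and bool(report.changed_files & other.changed_files)
            for other in reports
        )
        if report.checks_passed and report.summary.strip() and not overlaps:
            ready.append(report.name)
    return ready
```

Branches left off the list are not rejected. They need a human to resolve the overlap, the failing check, or the missing summary before integration.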