Security

Sandboxed YOLO Probe

Let the agent run wild somewhere boring.

Use when

You want a coding agent to iterate freely, but the command surface, repo contents, or prompt-injection risk makes direct host execution unsafe.

Cadence

Before allowing autonomous shell-heavy agent runs

Verification

The agent can run needed commands inside the sandbox, cannot reach forbidden files/secrets, and produces a replayable diff or report before host-side changes.
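The "cannot reach forbidden files/secrets" part of this check can be automated by running a small probe inside the sandbox before the agent starts. The sketch below is one possible shape, not a prescribed verifier: `probe_sandbox` and its parameters are hypothetical names, and the deny-list in the usage note is only an illustration.

```python
import os
from pathlib import Path


def probe_sandbox(forbidden_paths, forbidden_env_vars, environ=None):
    """Return a list of violations found; an empty list means the probe passed.

    forbidden_paths    -- paths the agent must NOT be able to read (assumed deny-list)
    forbidden_env_vars -- variable names that must NOT carry a value
    environ            -- mapping to inspect; defaults to this process's environment
    """
    environ = os.environ if environ is None else environ
    violations = []

    for raw in forbidden_paths:
        path = Path(raw).expanduser()
        # os.access returns False for missing or unreadable paths, so only
        # paths this process could actually read are reported.
        if os.access(path, os.R_OK):
            violations.append(f"readable path: {path}")

    for name in forbidden_env_vars:
        # An unset or empty variable is fine; a non-empty one is a leak.
        if environ.get(name):
            violations.append(f"secret present in environment: {name}")

    return violations
```

Run inside the sandbox, a call such as `probe_sandbox(["~/.ssh", "~/.aws/credentials"], ["GITHUB_TOKEN"])` should return an empty list. Any entry in the result is a reason to rebuild the sandbox before letting the agent loose. The probe only covers read access and environment variables; network reachability needs a separate check.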

Advanced spec

Structured loop spec

Field | Value
Name | Sandboxed YOLO Probe
Category | Security
Trigger | Before allowing autonomous shell-heavy agent runs
Objective | Let the agent run wild somewhere boring.
Allowed inputs | Relevant files, source notes, logs, tests, screenshots, metrics, or task state for this loop
Allowed actions | Create a disposable container, codespace, VM, or worktree with only the files and secrets required for the task. Disable or restrict network access unless specific hosts are required. Run the agent's exploratory loop inside the sandbox with clear budget, scope, and forbidden actions. Export only the patch, metrics, logs, and reproduction steps needed for review. Apply to the real repo only after human or policy review of the diff and verification evidence.
Verification | The agent can run needed commands inside the sandbox, cannot reach forbidden files/secrets, and produces a replayable diff or report before host-side changes.
Stop condition | Stop when the verifier passes, the budget is exhausted, no progress is made, a blocker appears, or approval is required.
Budget | Set a time, turn, token, retry, file, or dollar cap before running the loop.
Approval boundary | Human approval required before publishing, sending, deleting, spending, changing accounts, touching production, or making reputational/legal/financial commitments.
Safe output | Draft, report, checklist, table, or approval-gated recommendation
Works with | Claude, ChatGPT, Gemini, any tool-using AI assistant
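The stop condition and budget rows in the spec can be enforced by a small driver around the agent. The following is a minimal sketch under stated assumptions: `Budget`, `run_loop`, `step`, and `verify` are hypothetical names, only turn and time caps are modeled, and the "blocker" and "approval required" conditions are left to the caller.

```python
import time
from dataclasses import dataclass


@dataclass
class Budget:
    max_turns: int = 20          # hard cap on agent turns
    max_seconds: float = 900.0   # wall-clock cap for the whole run
    max_stalled_turns: int = 3   # consecutive turns without progress


def run_loop(step, verify, budget, clock=time.monotonic):
    """Drive the agent until a stop condition fires; return the reason.

    step()   -> True if the turn made progress (hypothetical agent hook)
    verify() -> True once the verifier passes (hypothetical check hook)
    """
    started = clock()
    stalled = 0
    for _turn in range(budget.max_turns):
        # Check the time cap before spending another turn.
        if clock() - started > budget.max_seconds:
            return "budget exhausted: time"
        made_progress = step()
        if verify():
            return "verifier passed"
        # Reset the stall counter on progress, otherwise count the idle turn.
        stalled = 0 if made_progress else stalled + 1
        if stalled >= budget.max_stalled_turns:
            return "no progress"
    return "budget exhausted: turns"
```

Returning the reason as a string keeps the outcome easy to log next to the exported patch, so a reviewer can see whether the run ended on success or on a cap.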
Runbook

Steps

  1. Create a disposable container, codespace, VM, or worktree with only the files and secrets required for the task.
  2. Disable or restrict network access unless specific hosts are required.
  3. Run the agent's exploratory loop inside the sandbox with clear budget, scope, and forbidden actions.
  4. Export only the patch, metrics, logs, and reproduction steps needed for review.
  5. Apply to the real repo only after human or policy review of the diff and verification evidence.
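Steps 1 and 2 can be sketched as a helper that assembles a locked-down `docker run` command. This is an illustration that assumes Docker is the chosen sandbox; `sandbox_command`, the image name, and the default limits are hypothetical. The flags used (`--rm`, `--cap-drop`, `--security-opt no-new-privileges`, `--memory`, `--cpus`, `--pids-limit`, `--volume`, `--workdir`, `--network none`) are standard `docker run` options.

```python
def sandbox_command(image, workdir, agent_cmd, allow_network=False,
                    memory="2g", cpus="2", pids_limit=256):
    """Build the argv for a throwaway container that sees only `workdir`.

    image         -- container image with the agent's tooling (assumed to exist)
    workdir       -- disposable copy or worktree of the repo, NOT the real checkout
    agent_cmd     -- list of strings: the command the agent runs inside
    allow_network -- leave False unless specific hosts are required
    """
    argv = [
        "docker", "run",
        "--rm",                                   # discard the container afterwards
        "--cap-drop", "ALL",                      # drop Linux capabilities
        "--security-opt", "no-new-privileges",    # block privilege escalation
        "--memory", memory,
        "--cpus", cpus,
        "--pids-limit", str(pids_limit),
        "--volume", f"{workdir}:/work",           # the only host path mounted
        "--workdir", "/work",
    ]
    if not allow_network:
        argv += ["--network", "none"]             # no network interfaces at all
    argv.append(image)
    argv += list(agent_cmd)
    return argv
```

For example, `sandbox_command("agent-sandbox:latest", "/tmp/probe-worktree", ["pytest", "-q"])` yields a list that can be passed to `subprocess.run(argv, check=True)`. Allowing only named hosts, rather than all or nothing, needs a proxy or firewall rule that this sketch does not cover. For step 4, running `git diff` inside the disposable worktree is one way to produce the patch for review.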

Prompt

Run the Sandboxed YOLO Probe loop. Create a disposable sandbox with only required files and no unnecessary secrets. Restrict network access unless named hosts are required. Let the agent iterate inside that boundary with explicit budget, scope, and forbidden actions. Export a patch, metrics, logs, and reproduction steps. Do not apply host-side changes until the diff and verification evidence have been reviewed.
Metadata

Tags

sandbox, permissions, YOLO mode, safe agents