Evaluation

Browser Quality Streak

Run real scenarios until the streak proves the product is stable.

Use when

You need confidence that the product works in realistic flows.

Cadence

Before release

Verification

N realistic scenarios pass consecutively, and earlier failures have regression coverage.
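
The verification rule above can be sketched as a small check. This is a minimal illustration, not a prescribed implementation: the names `ScenarioRun`, `current_streak`, and `verification_passes` are hypothetical. It also assumes that runs are recorded in chronological order and that regression coverage is tracked as a set of scenario names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioRun:
    """One recorded scenario execution (hypothetical record shape)."""
    scenario: str  # scenario identifier, e.g. "checkout"
    passed: bool


def current_streak(runs):
    """Count consecutive passes at the end of the chronological run list."""
    streak = 0
    for run in reversed(runs):
        if not run.passed:
            break
        streak += 1
    return streak


def verification_passes(runs, regression_covered, required_streak):
    """True when the trailing streak reaches N and every scenario that
    ever failed has a regression test recorded for it."""
    failed = {run.scenario for run in runs if not run.passed}
    return (
        current_streak(runs) >= required_streak
        and failed <= set(regression_covered)
    )


# Example: one early failure, then three consecutive passes.
runs = [
    ScenarioRun("checkout", False),
    ScenarioRun("checkout", True),
    ScenarioRun("login", True),
    ScenarioRun("search", True),
]
```

With these runs, `verification_passes(runs, {"checkout"}, 3)` holds only because the earlier `checkout` failure is listed as covered. A failure anywhere resets the streak to zero, so N should be chosen before the loop starts.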

Advanced spec

Structured loop spec

Field: Value
Name: Browser Quality Streak
Category: Evaluation
Trigger: Before release
Objective: Run real scenarios until the streak proves the product is stable.
Allowed inputs: Relevant files, source notes, logs, tests, screenshots, metrics, or task state for this loop
Allowed actions: Define the exact scope, source of truth, and approval boundary.; Inspect current state and rank the highest-risk gap.; Make one small, reversible improvement.; Run the stated verification and record evidence.; Stop on success, budget, no progress, or approval required.
Verification: N realistic scenarios pass consecutively, and earlier failures have regression coverage.
Stop condition: Stop when the verifier passes, the budget is exhausted, no progress is made, a blocker appears, or approval is required.
Budget: Set a time, turn, token, retry, file, or dollar cap before running the loop.
Approval boundary: Human approval required before publishing, sending, deleting, spending, changing accounts, touching production, or making reputational/legal/financial commitments.
Safe output: Draft, report, checklist, table, or approval-gated recommendation
Works with: Claude, ChatGPT, Gemini, any tool-using AI assistant

Runbook

Steps

  1. Define the exact scope, source of truth, and approval boundary.
  2. Inspect current state and rank the highest-risk gap.
  3. Make one small, reversible improvement.
  4. Run the stated verification and record evidence.
  5. Stop on success, budget, no progress, or approval required.
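
The steps above amount to a bounded loop with explicit stop reasons. The sketch below shows one way to structure it. It assumes the budget is a turn cap and that "no progress" means a fixed number of consecutive iterations without a change. The names `Stop`, `run_loop`, `step`, `verify`, `needs_approval`, and `max_stalled` are all hypothetical.

```python
from enum import Enum


class Stop(Enum):
    SUCCESS = "success"
    BUDGET = "budget exhausted"
    NO_PROGRESS = "no progress"
    APPROVAL = "approval required"


def run_loop(step, verify, needs_approval, max_turns, max_stalled=2):
    """Run bounded iterations and return the reason the loop stopped.

    step()           -> bool: make one small, reversible improvement;
                        returns True if something actually changed.
    verify()         -> bool: run the stated verification.
    needs_approval() -> bool: True when the next action crosses the
                        approval boundary.
    """
    stalled = 0
    for _ in range(max_turns):
        if verify():
            return Stop.SUCCESS
        if needs_approval():
            return Stop.APPROVAL
        stalled = 0 if step() else stalled + 1
        if stalled >= max_stalled:
            return Stop.NO_PROGRESS
    # Budget spent: check once more so the final step's result counts.
    return Stop.SUCCESS if verify() else Stop.BUDGET
```

Checking `needs_approval()` before `step()` keeps the loop from acting first and asking afterwards, which matches the approval boundary in the spec.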
Copy prompt

Prompt

Run the Browser Quality Streak loop. Use it when you need confidence that the product works in realistic flows. Work in bounded iterations: inspect current state, choose the highest-risk gap, make one reversible improvement, verify it, and record evidence. Stop when N realistic scenarios pass consecutively and earlier failures have regression coverage, or when you are blocked, the budget is exhausted, or approval is required.
Metadata

Tags

QA, browser, product