Review agent work

Review agent work the same way you review a teammate's branch, with one extra habit: check how the agent got there.

The terminal tells you what the agent said. The diff tells you what it did. The task tells you whether it mattered.

Start with the task

Before opening changed files, read the task description. Look for three things:

  • What problem was the agent asked to solve?
  • What was explicitly out of scope?
  • What checks were expected?

If the task is vague, tighten it before asking for more work. A vague task usually becomes a vague diff.

Read the plan

Open the latest implementation plan if the brew has one. You are checking direction, not grading prose.

Ask:

  • Did the agent touch the areas it said it would touch?
  • Did it avoid the areas it said were out of scope?
  • Did it name the right checks?
  • Did the plan miss a risk that now appears in the diff?

If the plan and the diff disagree, trust the diff and find out why the agent departed from the plan.
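The first two questions can be checked mechanically once you have the list of changed files. The following is a minimal sketch, not a feature of any tool: `findScopeDrift` and `ScopeReport` are hypothetical names, and it assumes you copy the planned areas out of the plan by hand as path prefixes.

```typescript
// Hypothetical helper for illustration; not part of any product API.
interface ScopeReport {
  unplanned: string[]; // changed files the plan never mentioned
  untouched: string[]; // planned areas where nothing changed
}

// plannedPrefixes: path prefixes copied by hand from the plan, e.g. "docs/".
// changedFiles: for example, the output of `git diff --name-only <base>...HEAD`.
function findScopeDrift(plannedPrefixes: string[], changedFiles: string[]): ScopeReport {
  const unplanned = changedFiles.filter(
    (file) => !plannedPrefixes.some((prefix) => file.startsWith(prefix)),
  );
  const untouched = plannedPrefixes.filter(
    (prefix) => !changedFiles.some((file) => file.startsWith(prefix)),
  );
  return { unplanned, untouched };
}

// Example input with made-up paths.
const report = findScopeDrift(
  ["docs/", "src/review/"],
  ["docs/review.md", "package-lock.json"],
);
console.log(report.unplanned); // files to ask about
console.log(report.untouched); // planned work that may be missing
```

Neither list is a verdict. An unplanned file can be a legitimate follow-on change, and an untouched area can mean the plan was too broad. Both are prompts for where to read the diff first.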

Read the session log

The session log should explain what happened after the plan:

  • Files changed.
  • Commands run.
  • Failures or skipped checks.
  • Follow-up the agent could not finish.

A clean session log is useful. A missing or fluffy session log is a reason to inspect more carefully, not a reason to reject the work by default.
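If your session logs are structured, the four items above can be turned into a quick completeness check. This is a sketch under a loud assumption: the `SessionLog` shape and the `missingLogSections` function are hypothetical, and your tool's log may be free text with no such fields.

```typescript
// Hypothetical shape for illustration; a real session log may look nothing like this.
interface SessionLog {
  filesChanged?: string[];
  commandsRun?: string[];
  failuresOrSkippedChecks?: string[];
  unfinishedFollowUps?: string[];
}

// Returns the labels of sections the log does not report at all.
// An empty array counts as present: it is an explicit "nothing to report".
function missingLogSections(log: SessionLog): string[] {
  const sections: Array<[keyof SessionLog, string]> = [
    ["filesChanged", "Files changed"],
    ["commandsRun", "Commands run"],
    ["failuresOrSkippedChecks", "Failures or skipped checks"],
    ["unfinishedFollowUps", "Follow-up the agent could not finish"],
  ];
  return sections
    .filter(([key]) => log[key] === undefined)
    .map(([, label]) => label);
}
```

A non-empty result is a cue to inspect the diff and rerun the checks yourself, not a reason to reject the work.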

Inspect the diff

Open the changed files in the editor and review them against the task.

Confirm the ordinary engineering basics:

  • The change solves the requested problem.
  • The scope did not drift.
  • The code follows local patterns.
  • Tests cover the risky path.
  • Generated files or lockfiles changed only when expected.
  • Secrets, machine paths, or debug leftovers did not sneak in.

Agents are good at producing plausible code. Plausible is not the same as correct.
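The last item in the list, secrets, machine paths, and debug leftovers, is the easiest one to pre-screen before you read the diff by eye. Here is a rough sketch that scans only the added lines of a unified diff. The `scanAddedLines` function and the three patterns are illustrative assumptions, not an exhaustive or authoritative detector; expect both false positives and misses.

```typescript
interface Finding {
  line: string;
  reason: string;
}

// Illustrative patterns only; tune them for your own repo.
const SUSPICIOUS: Array<{ reason: string; pattern: RegExp }> = [
  { reason: "debug leftover", pattern: /\bconsole\.log\(|\bdebugger\b/ },
  { reason: "machine path", pattern: /\/Users\/\w+|\/home\/\w+|[A-Z]:\\Users\\/ },
  {
    reason: "possible secret",
    pattern: /(api[_-]?key|secret|token|password)\s*[:=]\s*["'][^"']+["']/i,
  },
];

// unifiedDiff: text in unified diff format, such as the output of `git diff`.
function scanAddedLines(unifiedDiff: string): Finding[] {
  const findings: Finding[] = [];
  for (const raw of unifiedDiff.split("\n")) {
    // Keep only added lines; skip the "+++ b/file" header and everything else.
    if (!raw.startsWith("+") || raw.startsWith("+++ ")) continue;
    const line = raw.slice(1);
    for (const { reason, pattern } of SUSPICIOUS) {
      if (pattern.test(line)) findings.push({ line: line.trim(), reason });
    }
  }
  return findings;
}
```

A scan like this only narrows where to look. It cannot tell you whether the change solves the requested problem or follows local patterns, which still takes a human read of the diff.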

Run the checks

Use the smallest checks that prove the change:

npm run lint
npm run types:check
npm test -- path/to/relevant.test.ts

Use the commands that fit your repo. If an agent claims a check passed, read enough output to know which command actually ran and where it ran.

Send focused feedback

When the work needs another pass, send feedback that names the exact problem and the expected fix.

Good:

The docs page still says worktrees are a sandbox. Change that to say they isolate Git working directories, not machine permissions. Then rerun the docs lint command.

Bad:

Make it better.

Focused feedback gives the next brew a finish line.

Finish deliberately

A review ends in one of three ways:

  • Accept the change and move it toward merge.
  • Ask the agent or a teammate for another pass.
  • Discard the branch because the direction is wrong.

Do not archive the context until you know which one happened.

Next: run parallel agents when work can be split into separate branches.
