Why does Codex wander outside the files I asked it to change?

Usually because the repo gave it no local boundary. Codex reads AGENTS.md, and without a nested rule near the feature it treats the whole repo as fair game. Drop a short AGENTS.md in the feature directory that says to keep changes local unless the task asks otherwise, and most scope leaks stop.

What makes a Codex diff actually reviewable on Windows?

A diff is reviewable when the reviewer can see the change, the verification command, and its result without replaying your session. End every Codex CLI pass with a check your team already trusts, and record the exact local path you ran it from on Windows. That turns "looks fine" into a decision someone can stand behind.

Do I really need to note every MCP connector a task used?

Yes, but one line per task is enough. Connector scope is part of the change: what system was touched, what permission was assumed, what was left alone. Without that note, a reviewer has to reverse-engineer the boundary from logs, which is slow and error-prone. The note keeps the review honest and fast.

Is the review receipt just more bureaucracy?

No, it replaces guesswork, not work. The receipt is a single block recording scope, checks, and connector use, and it lives in the PR description. Reviewers stop reconstructing intent from a chat transcript, and approvals get faster because the proof is right there next to the diff.

Where does this fit in a larger workflow?

It sits in the review step. Let Codex handle first-pass decomposition and edits, then keep ownership of scope, checks, and the merge decision. If that split feels fuzzy, the workflow is too, and the receipt is the cheapest way to make it sharp again.

Codex Windows review loop

A Codex Windows review loop is a short, repeatable habit that makes the agent's work legible before you make it fast: scope it with AGENTS.md, end every pass with a check your team trusts, and write down what any external connector touched. Codex CLI, OpenAI's coding agent, can edit files faster than most teams can read the result. The hard part of shipping with it is not whether it can change code; it is whether you can still trust the diff.

So lead with trust. The goal is not faster edits. It is faster reviewable diffs, the kind a teammate can approve without replaying your whole chat.

Keep Codex inside the lines with AGENTS.md

The first thing that breaks is scope. The agent starts in one file and wanders into adjacent code, because the repo gave it no local boundary to respect.

Put the rule where the work lives. Codex reads AGENTS.md, and nested files are a supported feature, not a hack: a short rule next to a feature directory beats a long rule in the repo root, because a reviewer can see exactly what governed the change.

# AGENTS.md (inside the feature directory)
Keep changes inside this directory unless the task explicitly asks for cross-cutting edits.
Run the local verification command before reporting a task as done.

The payoff is dull in the best way: fewer surprise edits and less time spent reconstructing what the agent was supposed to do.
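One way to back the nested rule with a check is to compare the list of changed files against the feature directory. The sketch below is a minimal illustration and not part of Codex: the function name `out_of_scope` and the example paths are hypothetical, and it assumes you feed it repo-relative paths with forward slashes, such as those printed by `git diff --name-only`.

```python
from pathlib import PurePosixPath


def out_of_scope(changed_files, feature_dir):
    """Return the changed paths that fall outside feature_dir.

    Paths are expected to be repo-relative with forward slashes,
    the form `git diff --name-only` prints.
    """
    root = PurePosixPath(feature_dir)
    leaks = []
    for name in changed_files:
        path = PurePosixPath(name)
        # In scope means the path is the feature directory itself or sits below it.
        if path != root and root not in path.parents:
            leaks.append(name)
    return leaks


if __name__ == "__main__":
    # Hypothetical file list for illustration only.
    changed = ["features/billing/invoice.py", "shared/utils.py"]
    print(out_of_scope(changed, "features/billing"))
```

An empty list means the pass stayed inside the boundary. Anything else is worth a line in the review, whether the cross-cutting edit was requested or not.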

End every pass with a check the team already trusts

The second thing that breaks is the unverified patch. The diff looks clean, nobody reran the checks that matter, and review turns into a queue of "looks fine" comments that were not fine.

Fix it by ending every Codex CLI pass with a command your team already trusts. The Codex CLI docs cover CLI use and the slash commands you can standardize around. On Windows, write down the path you actually ran the command from, not the path you wish you had. After that, a reviewer can read the patch, see the command, see the result, and decide.
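To make "see the command, see the result" concrete, a small wrapper can run the check and capture the working directory and exit code together. This is a sketch under assumptions: `run_check` is a hypothetical helper, not a Codex feature, and the command in the demo is a stand-in for whatever check your team already trusts.

```python
import subprocess
import sys
from pathlib import Path


def run_check(command, cwd="."):
    """Run a verification command and return what a reviewer needs to see."""
    workdir = Path(cwd).resolve()
    completed = subprocess.run(command, cwd=workdir, capture_output=True, text=True)
    return {
        "command": " ".join(command),  # joined for display only
        "cwd": str(workdir),  # the path actually used, fully resolved
        "exit_code": completed.returncode,
        "passed": completed.returncode == 0,
    }


if __name__ == "__main__":
    # Stand-in check; replace with your team's real verification command.
    result = run_check([sys.executable, "-c", "print('checks passed')"])
    for key, value in result.items():
        print(f"{key}: {value}")
```

Resolving the working directory before recording it means the note reflects where the command really ran, which matters when the same repo is reachable through more than one path on a Windows machine.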

Treat MCP connector scope as part of the change

The third thing that breaks is the hidden connector. The agent reaches into GitHub, Slack, or Jira through MCP, and the team forgets that connector scope is part of the diff.

So write one line in the handoff for any task that uses an external connector: what system it touched, what permission it assumed, and what it left alone. The Model Context Protocol spec lists user consent and control over data access and tool use among its key principles, so your review should make the same boundary visible. A reviewer can then check the boundary instead of reverse-engineering it from logs.
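The one-line note stays consistent if it is built from the same three fields every time. The following is a minimal sketch: the `ConnectorNote` class, its field names, and the output layout are all hypothetical choices for illustration, not a format defined by MCP or Codex.

```python
from dataclasses import dataclass


@dataclass
class ConnectorNote:
    system: str      # what system the task touched
    permission: str  # what permission it assumed
    left_alone: str  # what it did not touch

    def to_line(self):
        """Render the note as the single line that goes in the handoff."""
        return (
            f"Connector: {self.system} | "
            f"permission: {self.permission} | "
            f"left alone: {self.left_alone}"
        )


if __name__ == "__main__":
    # Example values are made up for illustration.
    note = ConnectorNote(
        system="GitHub (via MCP)",
        permission="read issues, comment on one PR",
        left_alone="branch protection, repo settings",
    )
    print(note.to_line())
```

Requiring all three fields is the point of the design: the note cannot be written without stating what was left alone, which is the part teams most often skip.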

This is also where memory drift creeps in: an agent following a stale instruction, or a teammate assuming a skill is always on. If a task depends on repeatable behavior, package it as a skill or a short instruction block and name when it runs. The openai/skills repo is a good reminder that reusable instructions are a real surface, not a one-off prompt.

Ship a review receipt, not a bigger ceremony

The last gap is the receipt. The PR lands, and the reviewer has no compact proof of what was tried. You do not need a heavier process. You need a smaller proof, one artifact that records scope, checks, and connector use.

# Codex Review Receipt

Task:
- What changed:
- Why Codex was used:

Scope:
- Files expected to change:
- Files actually changed:
- Any nested AGENTS.md or AGENTS.override.md rules used:

Verification:
- Command run:
- Result:
- Any failing check and why it is acceptable:

Connectors:
- MCP server or external system touched:
- Permission or boundary note:

Review notes:
- What a human reviewer should inspect first:
- What is still uncertain:

Paste this into the PR description. It is the shortest path to a merge a reviewer can defend.
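If you want a light guard against half-filled receipts, a few lines can check that a PR description still carries every section header from the template above. This is a sketch, assuming the headers are kept verbatim and each on its own line. The function name `missing_sections` and the sample draft are hypothetical.

```python
REQUIRED_SECTIONS = ("Task:", "Scope:", "Verification:", "Connectors:", "Review notes:")


def missing_sections(pr_description):
    """Return receipt section headers that do not appear on their own line."""
    present = {line.strip() for line in pr_description.splitlines()}
    return [section for section in REQUIRED_SECTIONS if section not in present]


if __name__ == "__main__":
    # Made-up draft that stops after the Verification section.
    draft = "\n".join([
        "# Codex Review Receipt",
        "Task:",
        "- What changed: tightened invoice rounding",
        "Scope:",
        "- Files actually changed: features/billing/invoice.py",
        "Verification:",
        "- Command run: (your team's check)",
    ])
    print(missing_sections(draft))
```

The check only confirms that the sections exist. Whether the answers under them are good enough is still the reviewer's call.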

Start here

Pick one feature directory, add a three-line AGENTS.md, and run your next Codex task with the receipt above. For a worked version of this loop, see the related training topic.

Related training topics

Codex CLI training for engineering teams

OpenAI Codex training for engineering teams

OpenAI Codex CLI team workflows

Codex code review training for engineering teams
