Codex-CLI 0.125.0: Make Agent Loops Reviewable

Do not approve codex-cli 0.125.0 because the version number changed. Approve it when each run leaves enough evidence for a reviewer to trace intent, scope, and verification. A reviewable agent loop is a Codex run whose work can be audited without replaying the chat.

Codex CLI, OpenAI's command-line coding agent, can speed up edits. It does not remove the need for ownership, test evidence, and connector boundaries.

Treat the version as a review boundary

The release number is a useful marker, not a review strategy. As of April 2026, the safer question is whether your repository contract makes Codex work inspectable after the run ends.

A team can run several agent streams in parallel and still lose control. The common failure is not slow output. It is duplicated edits, unresolved conflicts, and tool calls that look routine until credentials or production data appear in the transcript.
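One way to spot duplicated edits before they become conflicts is to compare the files each run touched. The sketch below is illustrative only: the run names and paths are hypothetical, and it assumes you can already list the changed files per run (for example, from each branch's diff).

```python
from collections import defaultdict


def overlapping_paths(runs: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each path touched by more than one run to the sorted run names."""
    touched_by: dict[str, list[str]] = defaultdict(list)
    for run_name, paths in runs.items():
        for path in paths:
            touched_by[path].append(run_name)
    return {
        path: sorted(names)
        for path, names in sorted(touched_by.items())
        if len(names) > 1
    }


if __name__ == "__main__":
    # Hypothetical run names and file paths, for illustration only.
    runs = {
        "run-a": {"src/invoice.ts", "src/tax.ts"},
        "run-b": {"src/tax.ts"},
        "run-c": {"docs/README.md"},
    }
    for path, names in overlapping_paths(runs).items():
        print(f"{path}: touched by {', '.join(names)}")
```

Any path that appears in the result is a candidate for a named parent task and an explicit owner before the runs are merged.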

Use the upgrade as a forcing function. Require the same evidence for agent-authored changes that you require for human-authored changes: what changed, why it changed, what was checked, and who accepted the risk.

Make every Codex run leave receipts

The most important rule is a verification latch. A Codex run is not done when code is generated. It is done when the transcript or PR shows the command that checked the work.

For a TypeScript service, that might be pnpm test, pnpm lint, and a focused regression test for the touched package. For a Rails app, it might be bundle exec rspec spec/models/invoice_spec.rb plus the relevant system test. The exact commands vary, but the receipt should not.
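A verification latch can be as small as a wrapper that runs each check and keeps the command, exit code, and last lines of output. The following is a minimal sketch, not part of Codex CLI: `Receipt`, `run_with_receipt`, and `format_receipts` are hypothetical names, and the stand-in command exists only so the example runs anywhere.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Receipt:
    command: list[str]
    exit_code: int
    output_tail: str  # last few lines of combined stdout and stderr


def run_with_receipt(command: list[str], tail_lines: int = 5) -> Receipt:
    """Run one verification command and keep what a reviewer needs to see."""
    result = subprocess.run(command, capture_output=True, text=True)
    lines = (result.stdout + result.stderr).strip().splitlines()
    return Receipt(command, result.returncode, "\n".join(lines[-tail_lines:]))


def format_receipts(receipts: list[Receipt]) -> str:
    """Render receipts as plain text suitable for pasting into a PR body."""
    blocks = []
    for r in receipts:
        status = "PASS" if r.exit_code == 0 else f"FAIL (exit {r.exit_code})"
        blocks.append(f"$ {' '.join(r.command)}\n{status}\n{r.output_tail}")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; replace it with your
    # repository's real checks, such as ["pnpm", "test"] or ["pnpm", "lint"].
    receipts = [run_with_receipt([sys.executable, "-c", "print('2 passed')"])]
    print(format_receipts(receipts))
```

The design choice is that the receipt records the exit code rather than a human summary, so a reviewer does not have to trust a paraphrase of the result.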

Do not let screenshots replace terminal evidence. A screenshot proves a demo existed. A transcript gives the reviewer something to replay when a regression appears later.

Keep browser, model, and connector changes visible

Browser work needs a bridge back to the CLI. If an agent validates a staging flow in Chrome, the PR should name the staging URL, the test account boundary, and the matching CLI check. Otherwise reviewers see two stories: a demo that passed and a test suite they cannot connect to it.

Model changes also need a rule. Put the default model and escalation path in AGENTS.md, so reviewers know when a run used the normal path and when a higher-risk path was chosen.
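The model rule can be checked mechanically once each run records which model it used. This is a sketch under assumptions: the model names are placeholders, and `classify_model_use` is a hypothetical helper, not something Codex CLI provides.

```python
from typing import Optional


def classify_model_use(
    default_model: str, run_model: str, escalation_reason: Optional[str]
) -> str:
    """Return 'default', 'escalated', or 'unexplained' for one run."""
    if run_model == default_model:
        return "default"
    if escalation_reason and escalation_reason.strip():
        return "escalated"
    # A non-default model with no recorded reason is what reviewers should flag.
    return "unexplained"


if __name__ == "__main__":
    # Placeholder model names, for illustration only.
    print(classify_model_use("default-model", "default-model", None))
    print(classify_model_use("default-model", "larger-model", "cross-package refactor"))
    print(classify_model_use("default-model", "larger-model", None))
```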

Model Context Protocol (MCP), the open protocol introduced by Anthropic for connecting models to tools and data, needs a connector roster. Each MCP server expands the blast radius of a run. Keep the roster in the repo root, with an owner and allowed actions for each connector.
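A roster is easier to enforce when a script can reject incomplete entries. The sketch below assumes the roster has already been loaded into memory; the file format, the connector names, and the `Connector` fields are illustrative assumptions, not an MCP or Codex CLI schema.

```python
from dataclasses import dataclass, field


@dataclass
class Connector:
    name: str
    owner: str
    allowed_actions: list[str] = field(default_factory=list)
    credential_boundary: str = ""


def roster_problems(roster: list[Connector]) -> list[str]:
    """List every roster entry that is duplicated or incomplete."""
    problems = []
    seen: set[str] = set()
    for c in roster:
        if c.name in seen:
            problems.append(f"{c.name}: listed more than once")
        seen.add(c.name)
        if not c.owner.strip():
            problems.append(f"{c.name}: missing owner")
        if not c.allowed_actions:
            problems.append(f"{c.name}: no allowed actions listed")
        if not c.credential_boundary.strip():
            problems.append(f"{c.name}: missing credential boundary")
    return problems


def is_action_allowed(roster: list[Connector], name: str, action: str) -> bool:
    """Unknown connectors and unlisted actions are denied by default."""
    return any(c.name == name and action in c.allowed_actions for c in roster)


if __name__ == "__main__":
    # Hypothetical connectors, for illustration only.
    roster = [
        Connector("tickets", "support-eng", ["read"], "read-only token"),
        Connector("deploy", "", [], ""),
    ]
    print("\n".join(roster_problems(roster)))
```

Denying by default means a connector that is missing from the roster fails the check instead of silently passing.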

Put the rule where reviewers already look

The contract belongs in the repository, not in a private chat thread. Start with AGENTS.md, then mirror the key receipts in the PR template.

This is also where training helps. Teams that practice the Review step in our methodology tend to catch missing receipts earlier, and the same habits carry over into practical CLI workflow drills.

Paste this into AGENTS.md and adjust the commands for your repo:

Codex review contract

- State the task scope before starting a Codex CLI run.
- Paste or link the transcript section that shows verification commands and results.
- Pair browser evidence with the normal CLI checks before merge.
- Record the default model and the rule for changing it.
- List each MCP connector with owner, allowed actions, and credential boundary.
- If multiple agent runs touch the same area, name the parent task and each child responsibility.
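If the PR template mirrors the contract above, a small check can flag receipts that were left blank. This is a sketch only: the field labels (`Scope`, `Verification`, `Model`, `Connectors`) are an assumed template convention, so rename them to match whatever your template actually uses.

```python
# Assumed PR template labels; adjust to your own template.
REQUIRED_FIELDS = ["Scope", "Verification", "Model", "Connectors"]


def missing_receipts(pr_body: str) -> list[str]:
    """Return the required fields that are absent or empty in the PR body."""
    found: dict[str, str] = {}
    for line in pr_body.splitlines():
        label, sep, value = line.partition(":")
        if sep and label.strip() in REQUIRED_FIELDS:
            found[label.strip()] = value.strip()
    return [name for name in REQUIRED_FIELDS if not found.get(name)]


if __name__ == "__main__":
    # Hypothetical PR body, for illustration only.
    body = (
        "Scope: fix invoice rounding\n"
        "Verification: pnpm test (see linked transcript)\n"
        "Model:\n"
    )
    print(missing_receipts(body))
```

The check only confirms that a field was filled in. Whether the linked transcript actually supports the claim is still the reviewer's call.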

Use one gate in every PR review

Reviewers need a short, repeatable gate. The gate should answer four questions before merge.

Can someone unfamiliar with the work trace the intent without replaying the chat? Were sensitive folders, deployment files, or credential paths touched, and who approved that risk? Which commands prove the regression guards ran? Does the PR body match the actual transcript, not a summary written after the fact?
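The four questions can be expressed as one function that returns every reason a PR should not merge yet. The sketch below makes several assumptions: the `PullRequest` fields are hypothetical, the sensitive path patterns are examples to replace with your own, and two of the answers (stated intent, body matches transcript) still come from a human reviewer.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import Optional

# Example patterns only; replace with your repository's sensitive paths.
SENSITIVE_PATTERNS = ["deploy/*", ".github/workflows/*", "*.env", "secrets/*"]


@dataclass
class PullRequest:
    intent: str
    changed_paths: list[str]
    risk_approver: Optional[str]
    verification_commands: list[str]
    body_matches_transcript: bool


def gate_failures(pr: PullRequest) -> list[str]:
    """Return one message per gate question the PR does not yet answer."""
    failures = []
    if not pr.intent.strip():
        failures.append("intent is not stated")
    sensitive = [
        path
        for path in pr.changed_paths
        if any(fnmatch(path, pattern) for pattern in SENSITIVE_PATTERNS)
    ]
    if sensitive and not pr.risk_approver:
        failures.append(
            "sensitive paths touched without a named approver: "
            + ", ".join(sensitive)
        )
    if not pr.verification_commands:
        failures.append("no verification commands recorded")
    if not pr.body_matches_transcript:
        failures.append("PR body not confirmed against the transcript")
    return failures


if __name__ == "__main__":
    # Hypothetical PR, for illustration only.
    pr = PullRequest(
        intent="Fix invoice rounding",
        changed_paths=["deploy/prod.yaml", "src/invoice.ts"],
        risk_approver=None,
        verification_commands=["pnpm test"],
        body_matches_transcript=True,
    )
    for failure in gate_failures(pr):
        print(failure)
```

An empty list means the gate passed. Returning every failure at once, rather than stopping at the first, lets the author fix the PR in a single round.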

This is governance hygiene. It keeps review effort tied to evidence instead of confidence borrowed from the last successful run.

Common questions

Does codex-cli 0.125.0 make agent loops reviewable by itself?

No. Reviewability comes from the operating contract around the tool, not from the release alone. The useful contract is small: state scope, capture verification, document model changes, and roster connectors. codex-cli 0.125.0 may change what the agent can do, but your repository rules decide what reviewers can prove.

What should go in AGENTS.md for Codex CLI work?

Put the review rules that must survive across runs. Include the expected verification commands, the default model, the escalation rule for model changes, browser evidence requirements, and MCP connector boundaries. Keep it short enough that engineers will follow it during real work, not just during audits.

Why does MCP need a connector roster?

MCP needs a roster because each connector adds access, authority, and review risk. A server that can read tickets, query databases, or call deployment tools changes the blast radius of an agent run. A checked-in roster gives security and reviewers a shared source of truth before another connector is added.

Is a screenshot ever enough evidence for a Codex change?

Usually no. A screenshot can support a browser workflow, but it should not replace command output or a linked transcript. Reviewers need evidence they can connect to the repository state, test suite, and PR diff. Use screenshots for visual confirmation, then attach the CLI checks that make the change reproducible.

Make the next run auditable

Before the next Codex CLI run, add the review contract to AGENTS.md and require one PR to follow it end to end.
