Codex CLI 0.123.0: workflows that hold up
Codex CLI 0.123.0 workflows that hold up in review: replay recipes in the diff, a pinned model, a connector roster, and a ten-line done checklist.

A Codex CLI workflow holds up when a reviewer can replay it from the diff alone, without the person who ran it in the room. Codex CLI, OpenAI's coding agent, is fast on its own; the part that scales is the trail it leaves behind. The diff has to answer one question on its own: why did the agent touch this file?
Most review friction comes from a missing answer. The reason lives in a chat session that scrolled away two days ago, and nobody can replay it. That is a workflow gap, not a tool gap, and you fix it by putting the receipts where reviewers already look.
Write the answer into the PR body
Reviewers do not absorb intent by osmosis. If the only record of why a change happened is a transcript nobody reads, you get permission drift that nobody signed off on, and that becomes the expensive bug.
So write the answer down where it belongs. The PR body should list the scopes the run touched and the verification transcript a reviewer can replay. When the diff carries its own replay recipe, anyone unfamiliar with the session can trace intent and move on.
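A receipt like this can be generated rather than typed by hand. The sketch below is one possible layout, not a format Codex CLI produces: the function name `render_receipt`, the section labels, and the sample command are all hypothetical.

```python
def render_receipt(scopes, command, transcript):
    """Build a PR-body receipt: scopes touched plus a replayable transcript.

    Hypothetical layout for illustration; adapt the labels to your PR template.
    """
    if not scopes:
        raise ValueError("a receipt needs at least one scope")
    lines = ["Scopes touched:"]
    # Sort so two runs over the same folders produce the same receipt.
    lines += [f"- {scope}" for scope in sorted(scopes)]
    lines += ["", "Verification (replay with the command below):", f"$ {command}"]
    # Indent the transcript so it reads as quoted output, not as prose.
    lines += [f"    {line}" for line in transcript.strip().splitlines()]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_receipt(["src/api", "docs"], "pytest -q", "3 passed\n"))
```

The point of the fixed layout is that a reviewer always finds scopes first and the replay command second, whoever ran the session.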
Pin the model and the connector roster
Casual model swaps unsettle review expectations. The failure is rarely the model; it is the missing operating contract. Pin the default model and the escalation rule inside AGENTS.md, and leads can reason about blast radius again because they know what did the work.
Connectors are the other quiet drift. They accumulate, least-privilege erodes, and every server widens what an agent can reach. Keep a Markdown roster of connectors checked into the repo root, consistent with the MCP specification, so security reviews start from a known list instead of a guess. Boundaries the repo cannot state are boundaries the agent will invent.
Browser tasks need their own note for the same reason. Chrome workflows that diverge from CLI habits show reviewers two truths at once, so document staging URLs and credential boundaries beside each browser task. Then demos stop contradicting CI artifacts.
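The connector roster from this section only helps if something reads it. As a minimal sketch, assuming the roster is a Markdown table with the columns Connector, Owner, and Allowed actions (a hypothetical layout, as are the connector names and the list of servers that fired), a check might look like this:

```python
# Hypothetical roster contents; in practice this would be read from the repo root.
SAMPLE_ROSTER = """# Connector roster

| Connector | Owner | Allowed actions |
|---|---|---|
| issues | platform-team | read, comment |
| docs-search | docs-team | read |
"""


def parse_roster(markdown):
    """Parse rows of `| connector | owner | allowed actions |` into a dict."""
    roster = {}
    for line in markdown.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Skip anything that is not a three-cell data row: prose, header, separator.
        if len(cells) != 3:
            continue
        if cells[0].lower() == "connector" or set(cells[0]) <= set("-: "):
            continue
        name, owner, actions = cells
        roster[name] = {
            "owner": owner,
            "actions": [a.strip() for a in actions.split(",") if a.strip()],
        }
    return roster


def unexpected_connectors(roster, fired):
    """Return connectors that fired during a run but are missing from the roster."""
    return sorted(set(fired) - set(roster))


if __name__ == "__main__":
    roster = parse_roster(SAMPLE_ROSTER)
    print(unexpected_connectors(roster, ["issues", "payments"]))
```

How you collect the list of servers that fired depends on your setup; the sketch takes it as plain input so the roster check stays independent of any one tool.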
Cap AGENTS.md with a done checklist
When AGENTS.md grows without a ceiling, Codex optimizes the wrong definition of done. The fix is small: put the Definition of Done at the top of the file, ten lines or fewer, so completion criteria stay visible and output matches your team's vocabulary.
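The ten-line cap is easy to enforce mechanically. The sketch below assumes the checklist sits under a heading named "Definition of Done" at the very top of the file and is written as Markdown bullets; the heading text and the function name are assumptions for illustration, not a Codex requirement.

```python
def done_checklist_problems(agents_md, heading="Definition of Done", max_lines=10):
    """Return problems with the done checklist; an empty list means it passes."""
    # Blank lines are ignored so spacing around the heading does not matter.
    content = [line.rstrip() for line in agents_md.splitlines() if line.strip()]
    # The checklist must be the first thing in the file, before any other section.
    if not content or content[0].lstrip("# ").strip().lower() != heading.lower():
        return [f"'{heading}' is not the first heading in the file"]
    bullets = []
    for line in content[1:]:
        if line.lstrip().startswith(("- ", "* ")):
            bullets.append(line)
        else:
            break  # the checklist ends at the first non-bullet line
    problems = []
    if not bullets:
        problems.append("the checklist has no bullets")
    if len(bullets) > max_lines:
        problems.append(f"the checklist has {len(bullets)} lines; the cap is {max_lines}")
    return problems


if __name__ == "__main__":
    sample = "# Definition of Done\n\n- Tests pass\n- Transcript pasted\n\n## Setup\n"
    print(done_checklist_problems(sample))
```

Run as a pre-merge check, this keeps the ceiling from eroding one bullet at a time.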
Here is a verification snippet you can paste straight into AGENTS.md:
# AGENTS.md verification snippet
- Every Codex CLI run ends with the transcript snippet reviewers can replay.
- Pair browser evidence with the project's normal CLI checks before merge.
- If MCP servers are enabled, list allowed actions beside each connector name.
This pairs with our methodology: Test proves the behavior, Review proves the team can explain it. The loop-level version of the same contract is in "codex-cli 0.125.0: reviewable agent loops", and the rehearsal drills live under CLI workflows.
Run this gate on your next PR
These four gates turn the idea into something a reviewer can check in two minutes:
| Gate | Question |
|---|---|
| Receipt match | Does the PR body list scopes plus the verification transcript? |
| Rules precedence | Which AGENTS.md, .mdc, SKILL.md, or CLAUDE.md governed behavior? |
| Connector truth | Which MCP servers fired, and were they expected? |
| Reviewer path | Can someone unfamiliar trace intent without chat replay? |
And a short PR-proof checklist to paste into the description:
- Scopes in the PR body match folders in the diff.
- Primary-doc links were smoke-checked after publishing edits.
- MCP connectors mentioned (if any) list owners.
- Verification command output is pasted or linked.
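The first item on that checklist, scopes matching folders in the diff, is the easiest one to automate. A minimal sketch, assuming the scopes are copied from the PR body as folder paths and the changed files come from something like `git diff --name-only`; the function name `unlisted_paths` is hypothetical:

```python
from pathlib import PurePosixPath


def unlisted_paths(scopes, changed_files):
    """Return changed files that fall outside every scope listed in the PR body."""
    scope_paths = [PurePosixPath(scope.strip("/")) for scope in scopes]
    outside = []
    for name in changed_files:
        path = PurePosixPath(name)
        # A file is covered when it is the scope itself or sits somewhere beneath it.
        covered = any(scope == path or scope in path.parents for scope in scope_paths)
        if not covered:
            outside.append(name)
    return outside


if __name__ == "__main__":
    changed = ["src/api/users.py", "docs/index.md", "infra/deploy.yml"]
    print(unlisted_paths(["src/api/", "docs"], changed))
```

Comparing path components rather than string prefixes matters here: a scope of `src` should not silently cover a file under `src2`.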
The mechanics are documented if you want to go deeper: the Codex CLI docs, the features page, the slash commands reference, the quickstart, and the openai/skills repository for what ships around the CLI.
Common questions
- What makes Codex CLI 0.123.0 workflows hold up in review?
  A workflow holds up when the diff carries its own replay recipe. The PR body lists scopes plus the verification transcript, the connector roster says which MCP servers may fire, and the model pin says which model did the work. A reviewer who never saw the session can trace intent without hunting for a chat log.
- What is AGENTS spaghetti, and how do you fix it?
  AGENTS spaghetti is an AGENTS.md that grows unchecked until Codex starts optimizing the wrong definition of done. The fix is a done checklist: Definition of Done bullets, ten lines or fewer, pinned at the top of the file. That keeps completion criteria visible and aligns output with your team's vocabulary.
- Why do browser tasks need their own note?
  Because Chrome workflows that diverge from CLI habits show reviewers two truths at once, and the gap between them breeds doubt. A browser bridge note documents staging URLs and credential boundaries beside each browser task. That explicit handoff language is what stops demos from contradicting CI artifacts.
- Which gates belong in the review table?
  Four. Receipt match checks that the PR body lists scopes plus the transcript. Rules precedence names which rules file governed behavior. Connector truth lists which MCP servers fired and whether they were expected. Reviewer path asks whether someone unfamiliar can trace intent without chat replay.
Next move
Paste the receipt template into your PR body before the next review, and if you want the habit drilled into the whole team, our training does exactly that.