Codex CLI /goal workflows with AGENTS.md
A practical Codex CLI /goal workflow guide: persistent goals carry intent, AGENTS.md carries boundaries, and receipts carry proof across handoffs.

The branch ships across a 6pm handoff and the goal that shaped it stays behind in the author's chat scrollback; by morning, parent intent and child scope disagree quietly. Engineers searching "/goal codex agents.md" are trying to prevent exactly that. A /goal workflow is a team routine where the persistent goal carries intent, AGENTS.md carries the boundaries, and receipts carry proof. Receipts are a kindness to the engineer who inherits your branch.
Where goals go to die
Counter-thesis: the /goal command is not the feature; the survival of intent across a handoff is.
The wrong path: We believed tighter prompts could substitute for repo contracts. We ran cohorts that way before scopes lived in boring, explicit files, and the expensive bug was duplicated edits nobody reconciled.
Diagnosis: Ward Cunningham's technical debt, applied to process. We borrowed review speed and skipped the explainability principal, and the interest comes due the first time someone asks why the agent touched a file and the answer lives only in chat.
Thesis: persistent goals carry intent, AGENTS.md carries boundaries, receipts carry proof.
Search traffic tells the same story. Queries like "codex cli slash commands /goal", "openai codex cli documentation goal", and even the quoted "/goal" "codex" "agents.md" land here because the official slash commands reference explains the command and stops before the team routine. The same goes for "codex cli 0.128.0 release notes" and its shorthand "codex cli 0.128 release notes": a release line item tells you /goal exists, not how a team keeps it honest.
Four fixes for the handoff
Named fix: Verification latch. Exec shortcuts that skip tests let regressions return quietly, so require a transcript snippet showing tests ran after codegen. Speed wins once; discipline wins weekly, and after the latch, green merges correlate with the actual ritual instead of the mood.
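The latch can be made mechanical with a small check over the pasted transcript. The sketch below is illustrative only and rests on assumptions that are not part of Codex CLI: commands appear on lines starting with `$ `, agent edits are tagged with a hypothetical `[codegen]` marker, and the listed prefixes are what counts as a test command in your repo. Substitute whatever your transcripts actually contain.

```python
from typing import Iterable, Sequence

# Assumption: these prefixes identify test commands in your repo.
TEST_PREFIXES: Sequence[str] = ("pytest", "npm test", "cargo test", "go test")
# Assumption: your transcript format tags agent edits with this marker.
CODEGEN_MARKER = "[codegen]"


def latch_holds(
    transcript_lines: Iterable[str],
    test_prefixes: Sequence[str] = TEST_PREFIXES,
    codegen_marker: str = CODEGEN_MARKER,
) -> bool:
    """Return True only if a test command appears after the last codegen line."""
    last_codegen = -1
    last_test = -1
    for index, raw in enumerate(transcript_lines):
        line = raw.strip()
        if codegen_marker in line:
            last_codegen = index
        elif line.startswith("$ ") and line[2:].lstrip().startswith(tuple(test_prefixes)):
            last_test = index
    # A transcript with no test command at all never satisfies the latch.
    return last_test > last_codegen
```

A transcript where tests ran before the final edit fails the check, which is the case the latch exists to catch.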
Named fix: Browser bridge note. When Chrome workflows diverge from CLI habits, reviewers see two truths. Dual rails need explicit handoff language: document staging URLs and credential boundaries beside every browser task, and demos stop contradicting CI artifacts.
Named fix: Model pin note. Casual model swaps wobble review expectations because different models imply different risk appetite. Pin the default model and the escalation rule inside AGENTS.md, and leads can reason about blast radius again.
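A pin is only useful if its absence is noticed. As a rough sketch, assuming the team writes the pin and the escalation rule as `default-model:` and `escalation-rule:` bullets (hypothetical keys chosen for this example, not a format AGENTS.md or Codex CLI prescribes), a pre-review check could look like this:

```python
import re

# Assumption: pins are written as "- default-model: <value>" style bullets.
PIN_PATTERN = re.compile(
    r"^\s*[-*]?\s*(default-model|escalation-rule)\s*:\s*(.+?)\s*$",
    re.IGNORECASE,
)
REQUIRED_PINS = ("default-model", "escalation-rule")


def read_model_pins(agents_md: str) -> dict[str, str]:
    """Collect the pin bullets found in the AGENTS.md text."""
    pins: dict[str, str] = {}
    for line in agents_md.splitlines():
        match = PIN_PATTERN.match(line)
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


def missing_pins(agents_md: str) -> list[str]:
    """Return the required pin keys that the file does not state."""
    found = read_model_pins(agents_md)
    return [key for key in REQUIRED_PINS if key not in found]
```

The check deliberately says nothing about which model is pinned; it only confirms that the file states one and names the rule for escalating past it.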
Named fix: Connector roster. Connectors accumulate quietly and least-privilege erodes. A Markdown roster checked into the repo root grounds security reviews before they start, and it turns agent output back into team-owned work.
# AGENTS.md verification snippet
- Every Codex CLI run ends with the transcript snippet reviewers can replay.
- Pair browser evidence with the project's normal CLI checks before merge.
- If MCP servers are enabled, list allowed actions beside each connector name.
We map this discipline to the Review step in our methodology: evidence beats narration when merges touch shared surfaces. The wider artifact set (model pin, roster, done checklist, slash catalog) is laid out in Codex CLI workspace tools, and the rehearsal drills live under CLI workflows.
Evidence pack
| Gate | Question |
|---|---|
| Replay proof | Which commands prove the regression guards ran? |
| Receipt match | Does the PR body list scopes + verification transcript? |
| Rules precedence | Which AGENTS.md, .mdc, SKILL.md, or CLAUDE.md governed behavior? |
| Connector truth | Which MCP servers fired, and were they expected? |
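The Connector truth gate is easier to answer when the roster is machine-readable. The following is a minimal sketch, assuming a hypothetical roster format of one `- connector-name: action, action` bullet per connector, and assuming you can collect the observed `(server, action)` pairs from your own run logs; neither the format nor the log source is defined by Codex CLI or MCP.

```python
from typing import Iterable


def parse_roster(roster_md: str) -> dict[str, set[str]]:
    """Read '- name: action, action' bullets into a connector -> actions map."""
    roster: dict[str, set[str]] = {}
    for raw in roster_md.splitlines():
        line = raw.strip()
        if not line.startswith("- ") or ":" not in line:
            continue  # headings and prose are ignored
        name, actions = line[2:].split(":", 1)
        roster[name.strip()] = {a.strip() for a in actions.split(",") if a.strip()}
    return roster


def unexpected_calls(
    roster: dict[str, set[str]],
    observed: Iterable[tuple[str, str]],
) -> list[tuple[str, str]]:
    """Return observed (server, action) pairs the roster does not allow."""
    return [
        (server, action)
        for server, action in observed
        if action not in roster.get(server, set())
    ]
```

An empty result means every call that fired was expected; anything else is the list to raise in review.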
Review strip
- Verification command output is pasted or linked.
- Forked agent work lists parent + child responsibilities.
- Red-folder paths received explicit human acknowledgement.
- Scopes in the PR body match folders in the diff.
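The last check in the strip, scopes matching folders, lends itself to a quick comparison. The sketch below assumes scopes are declared as top-level folder names and that the changed paths come from something like the output of `git diff --name-only`; the function names are illustrative, and files at the repo root are reported as `.`.

```python
from pathlib import PurePosixPath
from typing import Iterable


def top_level_folders(changed_paths: Iterable[str]) -> set[str]:
    """Map each changed path to its top-level folder ('.' for root files)."""
    folders: set[str] = set()
    for path in changed_paths:
        parts = PurePosixPath(path).parts
        folders.add(parts[0] if len(parts) > 1 else ".")
    return folders


def scope_mismatch(
    declared_scopes: Iterable[str],
    changed_paths: Iterable[str],
) -> dict[str, list[str]]:
    """Compare scopes named in the PR body with folders touched by the diff."""
    declared = {scope.strip().strip("/") for scope in declared_scopes}
    touched = top_level_folders(changed_paths)
    return {
        "undeclared": sorted(touched - declared),  # edited but not in the PR body
        "untouched": sorted(declared - touched),   # promised but not edited
    }
```

For example, declaring `api/` and `docs` while the diff touches `api/routes.py`, `web/app.ts`, and `README.md` reports `web` and `.` as undeclared and `docs` as untouched.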
For the mechanics themselves: the Codex CLI docs and features page cover what runs can do, the quickstart gets a new machine going, and the openai/codex and openai/skills repositories track what ships next. None of them will write your receipt format, and a repo that cannot state boundaries plainly leaves agents guessing, which scales poorly.
One image: treat automation like radio discipline. More radios with sloppy callsigns only add noise. The /goal is the callsign, AGENTS.md is the frequency plan, and the receipt is the read-back.
Best ways to use this research
- Best for: Codex teams deciding which AGENTS.md instruction, /goal routine, MCP boundary, or verification loop to standardize next.
- Best first artifact: turn one named fix into an AGENTS.md rule, verification checklist, MCP note, or review receipt before the next automated run.
- Best comparison angle: compare the routine against your current Codex CLI review loop and evidence trail; keep the path that leaves the shortest auditable trail.
Common questions
- How do /goal and AGENTS.md work together in a Codex CLI workflow?
The persistent goal set by /goal carries intent across the session while AGENTS.md carries the boundaries: the pinned default model, the escalation rule, and the verification snippet every run must end with. That split keeps parent intent and child scope from disagreeing quietly during handoffs.
- What should reviewers check before merging a /goal-driven change?
The review strip lists four checks: verification command output is pasted or linked, forked agent work lists parent and child responsibilities, red-folder paths received explicit human acknowledgement, and scopes in the PR body match the folders in the diff. The checks are boring on purpose, which makes them repeatable under crunch.
- Why require a verification transcript after codegen?
Because exec shortcuts that skip tests let regressions return quietly. The verification latch requires a transcript snippet showing tests ran after codegen, and the mundane evidence pays off: green merges start correlating with actual ritual instead of mood.
- How do these workflows stop MCP privilege creep?
A Markdown connector roster checked into the repo root keeps least-privilege from eroding as connectors accumulate. Each MCP server expands the blast radius, so the roster names what is allowed before security reviews begin, and reviews start grounded instead of speculative.
Next step
Bring one live repo and one recent handoff to a training session and we will wire the /goal routine into both.