Codex 5.5: pin the model before you swap it
Codex 5.5 questions are model governance questions: pin the default model and escalation rule in AGENTS.md, and keep browser checks bridged to CLI receipts.

The codex 5.5 question always arrives the same way: someone proposes swapping Codex to GPT-5.5 the week of a release cut, and review expectations start to wobble before the first diff lands. During AGENTS.md readiness reviews we kept meeting branches that looked complete but could not say which instruction owned the final result. A model pin note is the AGENTS.md entry that fixes the team's default model and its escalation rule. It is the difference between a swap and a surprise.
The night before the cut
Counter-thesis: the model swap is the easy decision; the hard one is what the team is allowed to expect from review afterwards.
The wrong path: We believed local greens equaled team-wide confidence. We ran that playbook with squads whose durable contracts were still mushy, and the branches passed everything except the question of who owned the diff.
Diagnosis: normalization of deviance, Diane Vaughan's term. Each casual swap that ends fine becomes the new normal, until the day the risk appetite changed and nobody had written it down.
Thesis: Codex CLI loops win when AGENTS.md owns truth.
Half the queries arrive as codex gtp 5.5, typo included, and the intent is identical: can we run the new model without breaking the loop? The answer is yes, provided the workspace file carries proof rather than preference.
Workspace moves
Slash-command roulette gets worse during swaps, because operators improvise. Slash commands spread undocumented and become private folklore. Named fix: Slash catalog. Maintain docs/codex-commands.md linked from AGENTS.md, with the slash commands reference for the built-ins. The team converges on shared choreography.
Verification bypass is the swap-week temptation. Exec shortcuts skip tests because everyone wants to see the new model move; regressions return quietly. Named fix: Verification latch. Require a transcript snippet showing tests ran after codegen. Green merges correlate with actual ritual again.
Headless vs browser drift doubles under a new model, since demos are the first thing people reach for. Chrome workflows diverge from CLI habits and reviewers see two truths. Named fix: Browser bridge note. Document staging URLs and credential boundaries beside browser tasks. Demos stop contradicting CI artifacts.
Model mismatch is the headline failure. Teams swap models casually and different models imply different risk appetite. Named fix: Model pin note. Pin the default model and the escalation rule inside AGENTS.md. Leads can reason about blast radius before the swap, not after the incident review.
# AGENTS.md verification snippet
- Every Codex CLI run ends with the transcript snippet reviewers can replay.
- Pair browser evidence with the project's normal CLI checks before merge.
- If MCP servers are enabled, list allowed actions beside each connector name.
That trade routes through our methodology at the Review step: parallel agent output must be inspectable without replaying sessions, whatever model produced it. The same argument run against a version bump instead of a model swap is in the codex-cli 0.130.0 contract, and the broader practice lives under Codex CLI workflows.
| Gate | Question |
|---|---|
| Risk routing | Were red folders touched, and who approved? |
| Replay proof | Which commands prove regression guards? |
| Receipt match | Does the PR body list scopes + verification transcript? |
| Rules precedence | Which AGENTS.md or SKILL.md rule governed behavior? |
The merge check before any model graduates:
- Forked agent work lists parent + child responsibilities.
- Red-folder paths received explicit human acknowledgement.
- Scopes in the PR body match folders in the diff.
- Primary-doc links were smoke-checked after publishing edits.
The Codex CLI docs and features page describe what each build supports; openai/codex and openai/skills show where it is heading; the MCP specification anchors connector language. None of it replaces architecture judgement. Agents accelerate execution, not ownership.
Synthesis: agents are signal amplifiers. They multiply whatever clarity already exists in files, hooks, and scopes, and a model swap just turns up the volume.
Best ways to use this research
- Best for: leads fielding a GPT-5.5 swap request who want the review consequences settled before the model changes.
- Best first artifact: the model pin note; two lines in AGENTS.md, default plus escalation rule, written before the swap branch exists.
- Best comparison angle: take your last model change and check whether anyone could say, from the repo alone, which instruction owned the final diff.
Common questions
-
What does codex 5.5 mean for a team workflow?
Codex 5.5 questions are really questions about running GPT-5.5 in Codex, and the workflow answer is model governance: pin the default model and an escalation rule inside AGENTS.md. Casual swaps make review expectations wobble, because different models imply different risk appetite and leads lose track of blast radius.
-
How does a team adopt a new model without breaking review?
Through the model pin note: AGENTS.md fixes the default model, and the escalation rule says when a different model may step in. After the pin, leads can reason about blast radius instead of discovering the swap while reading an unfamiliar diff at merge time.
-
What are browser checks in this memo?
Browser checks are the Chrome-side verifications that stay paired with the project's normal CLI checks before merge. The browser bridge note documents staging URLs and credential boundaries beside browser tasks, which is how demos stop contradicting CI artifacts when work runs on dual rails.
-
How do teams prove which instruction owned a diff?
The workspace file carries the proof: every Codex CLI run ends with a transcript snippet reviewers can replay, and the rules precedence gate asks which AGENTS.md or SKILL.md rule governed behavior. Branches that look complete but cannot answer that question are the failure this memo opens with.
Where to go next
Model governance is one afternoon of our training: pin, escalation rule, and browser bridge installed on your own repo. Bring the repo you are about to swap.
Related training topics
Related research

Codex CLI 0.123.0: workflows that hold up
Codex CLI 0.123.0 workflows that hold up in review: replay recipes in the diff, a pinned model, a connector roster, and a ten-line done checklist.

Codex CLI 0.124.0: tighter rollback loops
Codex CLI 0.124.0 as a workflow moment: shrink the rollback contract, pin the model, and keep a connector roster and done checklist where reviewers live.

Codex CLI 0.132.0: workflows and integrations
Codex CLI 0.132.0 tightens AGENTS.md, MCP, and verification loops for reviewable diffs.
Continue through the research archive
Newer research
Always-on AI code review governance
AI code review governance for always-on agents: receipts, scopes, and owners that answer why a file changed without replaying chat.
Earlier research
AI agent boundaries that hold under pressure
A boundary-setting guide to AI agent boundaries: connector cards, scope ledgers, child receipts, and decision stubs that stop permission drift.