
Codex workflows for mobile handoffs

The repo contract of model pins, connector rosters, done checklists, and slash catalogs that lets agent work change hands.

Rogier Muller · May 15, 2026 · 6 min read

A mobile handoff survives only when the work travels in the repo, not in chat. Codex CLI, OpenAI's coding agent, picks up wherever you point it, so a task you start on a laptop and finish from your phone is fine as long as the rules are written down. A Codex workflow, for handoff purposes, is the repo contract that lets agent work change hands without losing context: the model pin, the connector roster, the done checklist, and the slash catalog. If those four live in the repo, the second device inherits everything. If they live in someone's head, you get two diffs and an argument.

The version-control caveat is older than agents. Every extra hand or device adds another path for context to leak, and whatever was never written down lands back in the review queue as confusion. The fix is the same fix good teams have always used: put the contract where both ends can read it.

Put the model and its escalation rule in AGENTS.md

A handoff breaks before it starts when the two devices disagree on which model is running. Different models carry different risk appetite, so the same prompt can produce a careful diff on one machine and a sprawling one on the other.

Pin the default model and the escalation rule at the top of AGENTS.md. Whoever resumes the task inherits the same blast-radius rules without asking. This is the one contract the second device genuinely cannot see in chat, so write it first.
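One way to keep the pin honest is a small check that fails when it goes missing. The sketch below is illustrative only: the `Default model:` and `Escalation:` labels are a hypothetical convention for this example, not a format Codex itself defines or parses, and `<model-name>` is a placeholder.

```python
import re

# Hypothetical labels: this sketch assumes the team writes the pin as
# "Default model: ..." and "Escalation: ..." near the top of AGENTS.md.
PIN_PATTERN = re.compile(r"^\s*[-*]?\s*Default model:\s*(\S.*)$", re.IGNORECASE)
ESCALATION_PATTERN = re.compile(r"^\s*[-*]?\s*Escalation:\s*(\S.*)$", re.IGNORECASE)


def read_model_pin(agents_md: str, head_lines: int = 15) -> dict:
    """Return the model pin and escalation rule found in the first lines.

    Only the top of the file is scanned, because the pin is meant to be
    the first thing a resuming device reads. Missing entries are None.
    """
    found = {"model": None, "escalation": None}
    for line in agents_md.splitlines()[:head_lines]:
        pin = PIN_PATTERN.match(line)
        if pin and found["model"] is None:
            found["model"] = pin.group(1).strip()
        rule = ESCALATION_PATTERN.match(line)
        if rule and found["escalation"] is None:
            found["escalation"] = rule.group(1).strip()
    return found


def missing_pins(agents_md: str) -> list:
    """List which of the two entries are absent, in a stable order."""
    pins = read_model_pin(agents_md)
    return [key for key, value in pins.items() if value is None]


# Example file content; "<model-name>" is a placeholder, not a real model.
EXAMPLE = """# AGENTS.md

- Default model: <model-name>
- Escalation: switch models only with a note in the PR body

## Definition of Done
"""
```

Run in CI or a pre-push hook, a non-empty result from `missing_pins` is a reason to stop the handoff before the second device starts from a different model.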

Keep a connector roster so MCP scope is visible

MCP connectors accumulate quietly. Each server you enable widens what the agent can touch, and across two devices that scope is invisible unless you list it. A security review that starts from a chat scroll starts blind.

Keep a Markdown roster at the repo root, written in the vocabulary of the MCP specification. List each connector with the actions it is allowed to take. Now a reviewer on either end of the handoff can answer "which servers can fire here" without guessing.
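A roster is easier to trust when something reads it mechanically. The following is a minimal sketch under stated assumptions: the roster is a Markdown pipe table with the columns connector, allowed actions, and owner, and every connector name shown is invented for illustration. None of this layout comes from the MCP specification.

```python
from dataclasses import dataclass


@dataclass
class Connector:
    name: str
    allowed_actions: list
    owner: str


def parse_roster(markdown: str) -> list:
    """Parse rows of a pipe table: connector | allowed actions | owner."""
    connectors = []
    for raw in markdown.splitlines():
        line = raw.strip()
        if not line.startswith("|"):
            continue
        # Drop exactly one leading and one trailing pipe so empty cells survive.
        inner = line[1:-1] if line.endswith("|") else line[1:]
        cells = [cell.strip() for cell in inner.split("|")]
        if len(cells) != 3:
            continue
        # Skip the header row, the |---|---|---| separator, and nameless rows.
        if cells[0].lower() == "connector" or set(cells[0]) <= set("-: "):
            continue
        actions = [a.strip() for a in cells[1].split(",") if a.strip()]
        connectors.append(Connector(cells[0], actions, cells[2]))
    return connectors


def roster_gaps(connectors: list) -> list:
    """Report connectors that list no allowed actions or no owner."""
    gaps = []
    for connector in connectors:
        if not connector.allowed_actions:
            gaps.append(f"{connector.name}: no allowed actions listed")
        if not connector.owner:
            gaps.append(f"{connector.name}: no owner listed")
    return gaps


# Hypothetical roster; the connector and team names are made up.
ROSTER = """
| Connector | Allowed actions | Owner |
|---|---|---|
| docs-search | read | platform-team |
| issue-tracker | read, comment | |
| deploy-hook | | release-team |
"""
```

Here `roster_gaps(parse_roster(ROSTER))` flags the two incomplete rows, which is the same question a reviewer would otherwise have to answer from a chat scroll.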

Cap AGENTS.md with a done checklist

An unchecked AGENTS.md grows until Codex optimizes the wrong definition of done. The person resuming the task cannot tell which "done" is current, so they either redo work or ship the wrong thing.

Cap the file with a Definition of Done, ten lines or fewer, at the top. Catalog your slash commands in docs/codex-commands.md and link it from AGENTS.md so private command folklore stops dying with the device that invented it. Here is a verification snippet that closes each run with replayable evidence:

# AGENTS.md verification snippet

- Every Codex CLI run ends with the transcript snippet reviewers can replay.
- Pair browser evidence with the project's normal CLI checks before merge.
- If MCP servers are enabled, list allowed actions beside each connector name.
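The ten-line cap and the catalog link are both easy to check by script. This is a rough sketch, assuming the checklist sits under a Markdown heading that contains the words "Definition of Done"; that heading layout is an assumption of the example, while the cap and the `docs/codex-commands.md` path come from the text above.

```python
import re

MAX_DONE_LINES = 10  # the cap recommended above
CATALOG_PATH = "docs/codex-commands.md"  # the catalog path named above


def done_checklist_lines(agents_md: str, heading: str = "Definition of Done") -> list:
    """Return the non-blank lines under the first heading matching `heading`."""
    collected = []
    inside = False
    for line in agents_md.splitlines():
        if re.match(r"^#{1,6}\s+", line):
            if inside:
                break  # the next heading closes the section
            inside = heading.lower() in line.lower()
            continue
        if inside and line.strip():
            collected.append(line.strip())
    return collected


def check_agents_md(agents_md: str) -> list:
    """Return a list of problems; an empty list means both rules hold."""
    problems = []
    done = done_checklist_lines(agents_md)
    if not done:
        problems.append("no Definition of Done section found, or it is empty")
    elif len(done) > MAX_DONE_LINES:
        problems.append(
            f"Definition of Done has {len(done)} lines; cap is {MAX_DONE_LINES}"
        )
    if CATALOG_PATH not in agents_md:
        problems.append(f"AGENTS.md does not link {CATALOG_PATH}")
    return problems
```

Counting non-blank lines rather than bullets is a deliberate choice in this sketch: a wrapped or nested item still costs the reader a line, so it counts against the cap.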

Run a review strip on every handoff

Before any handoff hits review, walk a short gate. Receipts in the repo beat narration in chat, every time.

| Gate | Question |
|---|---|
| Rules precedence | Which AGENTS.md or SKILL.md rule governed behavior? |
| Connector truth | Which MCP servers fired, and were they expected? |
| Reviewer path | Can someone unfamiliar trace intent without chat replay? |
| Risk routing | Were red folders touched, and who approved? |

Paste this checklist into the PR body so the second device fills it in before merge:

  • Red-folder paths received explicit human acknowledgement.
  • Scopes in the PR body match folders in the diff.
  • Primary-doc links were smoke-checked after publishing edits.
  • MCP connectors mentioned (if any) list owners.
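Two of the checklist items above, scope matching and red-folder paths, can be checked against the list of changed files before a human signs off. The sketch below assumes the changed paths arrive as plain strings (for example from `git diff --name-only`), that a scope means a top-level folder, and that the red-folder list is maintained by the team; the folder names used are hypothetical.

```python
from pathlib import PurePosixPath


def top_level_folders(changed_paths: list) -> set:
    """Map each changed path to its top-level folder; root files map to '.'."""
    folders = set()
    for path in changed_paths:
        parts = PurePosixPath(path).parts
        folders.add(parts[0] if len(parts) > 1 else ".")
    return folders


def scope_mismatches(declared_scopes: list, changed_paths: list) -> dict:
    """Compare scopes declared in the PR body with folders in the diff."""
    touched = top_level_folders(changed_paths)
    declared = set(declared_scopes)
    return {
        "undeclared": sorted(touched - declared),  # in the diff, not in the PR body
        "untouched": sorted(declared - touched),   # in the PR body, not in the diff
    }


def red_paths_touched(changed_paths: list, red_folders: list) -> list:
    """Return changed paths that sit inside any red folder."""
    hits = []
    for path in changed_paths:
        parts = PurePosixPath(path).parts
        for red in red_folders:
            red_parts = PurePosixPath(red).parts
            # Compare whole path components so "infra/prod" never matches "infra/production".
            if parts[: len(red_parts)] == red_parts:
                hits.append(path)
                break
    return hits
```

A script like this only surfaces the paths. The explicit human acknowledgement the checklist asks for still has to come from a person.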

The quickstart gets a single laptop going. The Codex CLI docs and features page describe what each build can do, and openai/codex plus openai/skills show what is coming. If your repo cannot state its boundaries plainly, the agent will guess, and guessing is worst at the moment work changes hands.

Common questions

  • What lets Codex workflows survive a mobile handoff?

    Repo-visible contracts survive; chat summaries do not. The four that matter are a model pin in AGENTS.md, a Markdown connector roster, a done checklist of ten lines or fewer, and a slash catalog at docs/codex-commands.md. They prevent the one failure that scales worst across devices: intent that lives only in chat, where nobody can answer why the agent touched a given file.

  • What does a Codex handoff include before review?

    It includes the review strip. Red-folder paths got explicit human acknowledgement, the scopes in the PR body match the folders in the diff, primary-doc links were smoke-checked after publishing edits, and any MCP connectors mentioned list their owners. Each line is a receipt the next reviewer can check against the diff instead of trusting a chat recap.

  • Why cap AGENTS.md with a done checklist?

    Because an uncapped AGENTS.md drifts until Codex optimizes a definition of done nobody agreed to. The fix is the operating contract, not better tooling. A Definition of Done in ten lines or fewer at the top of the file keeps output aligned with the vocabulary your team reviews against, so the person resuming knows which "done" is current.

  • Why pin the model before handing work across devices?

    Because casual model swaps make review expectations wobble, and different models imply different risk appetite. The model pin note fixes the default model and an escalation rule inside AGENTS.md, so a lead can reason about blast radius even when the handoff happens mid-sprint. It is the contract the second device cannot infer from chat.

  • Where should the slash commands live?

    In docs/codex-commands.md, linked from AGENTS.md. Slash commands spread fast and turn into private dialect the next operator never learned, so a catalog keeps the choreography shared instead of remembered. Linking it from AGENTS.md means the file the agent already reads points to the commands it should run.

Try it on a real handoff

Replay your last messy handoff against the review strip and mark which items existed in the repo versus in someone's head. Then bring that repo to our training, or read more patterns under Codex CLI workflows.
