Codex workflows for Chrome and the CLI

Codex workflows that cross into Chrome: the browser bridge note, model pin, connector roster, and done checklist that keep two surfaces telling one story.

The Kiss of Peace and Justice, landscape painting by Laurent de La Hyre (1654).
Rogier Muller · May 10, 2026 · 5 min read

When Codex runs in both Chrome and the CLI, the one artifact that keeps them honest is a short browser bridge note: staging URLs and credential boundaries written down beside every browser task. A Codex workflow is the operating contract around an agent run: scoped tasks, approval gates, and verification notes. Codex CLI, OpenAI's coding agent, is happy to work across both surfaces, but only the contract makes them tell the same story. Without it, a demo that sailed through Chrome and a CI run an hour later end up arguing, and the release sits blocked while two versions of the truth fight it out.

The fix is boring on purpose. Write the contract down so any reviewer can replay the run without you in the room.

Why the browser is the same run, not a second one

The browser is not a separate workflow. It is the same Codex run on a surface your CI cannot see. That gap is the whole problem.

When tool access widens to unblock senior engineers, connectors tend to multiply faster than anyone tracks who owns them. An MCP call looks harmless right up until credentials land in the transcript. Tooling moves faster than the story the repo can defend, and forks without receipts quietly eat the sprint.

So the rule is simple: browser work and CLI work have to leave the same evidence. If they do not, reviewers are stuck picking between two truths, and they will pick wrong.

Write four notes that survive the handoff

Four small artifacts cover most of the drift. None of them are clever. That is the point.

Start with the browser bridge note. Document staging URLs and credential boundaries beside every browser task, so a Chrome demo can no longer contradict a CI artifact.
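
A bridge note only helps if it is complete, and that part can be checked mechanically. The sketch below is one way to do it, not a Codex feature: the `BridgeNote` shape, its field names, and `check_bridge_note` are all hypothetical, chosen here to mirror the two things the note has to record.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class BridgeNote:
    """Hypothetical shape for a browser bridge note; field names are assumptions."""
    task: str
    staging_urls: list[str] = field(default_factory=list)
    credential_boundary: str = ""  # which account or scope the browser session may use


def check_bridge_note(note: BridgeNote) -> list[str]:
    """Return a list of problems; an empty list means the note is complete."""
    problems = []
    if not note.staging_urls:
        problems.append("no staging URL recorded")
    for url in note.staging_urls:
        parsed = urlparse(url)
        # A replayable URL needs a scheme and a host, not just a path.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"not a full URL: {url}")
    if not note.credential_boundary.strip():
        problems.append("credential boundary is empty")
    return problems


note = BridgeNote(
    task="checkout demo",
    staging_urls=["https://staging.example.com/checkout"],  # placeholder host
    credential_boundary="read-only test account",
)
print(check_bridge_note(note))
```

Running a check like this before the demo is cheaper than discovering afterwards that nobody wrote down which account the browser was signed in as.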

Then pin the model. A casual model swap unsettles review expectations, and the failure looks like tool quality when it is really a missing contract. Pin the default model and the escalation rule inside AGENTS.md so leads can reason about blast radius on both surfaces.
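
A pin is easier to trust when a script can read it back. The sketch below assumes a convention that is not part of Codex or AGENTS.md itself: two plain lines, `Default model:` and `Escalation rule:`, somewhere in the file. The line formats, the function name, and the model name are all placeholders.

```python
import re

# Assumed line formats; AGENTS.md is free-form Markdown, so this is a team convention.
PIN_PATTERN = re.compile(r"^Default model:[ \t]*(\S+)[ \t]*$", re.MULTILINE)
ESCALATION_PATTERN = re.compile(r"^Escalation rule:[ \t]*(.+)$", re.MULTILINE)


def read_model_pin(agents_md: str) -> dict:
    """Return the pinned model and escalation rule, or raise if the pin is ambiguous."""
    pins = PIN_PATTERN.findall(agents_md)
    if len(pins) != 1:
        raise ValueError(f"expected exactly one default model pin, found {len(pins)}")
    escalation = ESCALATION_PATTERN.search(agents_md)
    if escalation is None:
        raise ValueError("no escalation rule recorded")
    return {"model": pins[0], "escalation": escalation.group(1).strip()}


EXAMPLE = """\
Default model: example-model
Escalation rule: switch models only with lead approval noted in the PR
"""
print(read_model_pin(EXAMPLE))
```

Failing on zero pins and on two pins is deliberate: both cases mean a reviewer cannot say which model the run was supposed to use.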

Keep a connector roster too. Each server widens blast radius, and connectors that pile up quietly erode least privilege. A Markdown roster checked into the repo root means security reviews start grounded instead of guessing.
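
If the roster is a Markdown table, a few lines of code can flag connectors with no owner or no stated limits. The column layout (`Connector`, `Owner`, `Allowed actions`), the connector names, and both function names below are assumptions made for illustration.

```python
def parse_roster(markdown: str) -> list[dict]:
    """Read rows from a Markdown table with Connector, Owner, Allowed actions columns."""
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        first = cells[0]
        # Skip the header row and the |---|---| separator row.
        if first.lower() == "connector" or (first and set(first) <= set("-:")):
            continue
        cells += [""] * (3 - len(cells))  # pad short rows so unpacking is safe
        name, owner, actions = cells[:3]
        rows.append({"name": name, "owner": owner, "actions": actions})
    return rows


def roster_gaps(rows: list[dict]) -> list[str]:
    """List every connector that is missing an owner or its allowed actions."""
    gaps = []
    for row in rows:
        if not row["owner"]:
            gaps.append(f"{row['name']}: no owner")
        if not row["actions"]:
            gaps.append(f"{row['name']}: no allowed actions")
    return gaps


ROSTER = """
| Connector | Owner | Allowed actions |
| --- | --- | --- |
| issue-tracker | platform team | read issues |
| staging-db | | run read-only queries |
"""
print(roster_gaps(parse_roster(ROSTER)))
```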

Last, cap AGENTS.md with a definition of done, ten lines or fewer, at the top of the file. Left unchecked, the file grows and Codex starts optimizing the wrong "done."
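
The ten-line cap is simple enough to lint. This sketch assumes the definition of done sits under a heading spelled `## Definition of done` and that it is the first non-blank thing in the file; the heading text and the function name are hypothetical.

```python
MAX_LINES = 10
HEADING = "## Definition of done"  # assumed heading text, not a Codex requirement


def check_definition_of_done(agents_md: str) -> list[str]:
    """Return problems with the definition-of-done block; empty means it passes."""
    nonblank = [line.rstrip() for line in agents_md.splitlines() if line.strip()]
    if not nonblank or nonblank[0] != HEADING:
        return ["definition of done is not at the top of the file"]
    body = []
    for line in nonblank[1:]:
        if line.startswith("#"):  # the next heading ends the block
            break
        body.append(line)
    if not body:
        return ["definition of done is empty"]
    if len(body) > MAX_LINES:
        return [f"definition of done has {len(body)} lines; limit is {MAX_LINES}"]
    return []


EXAMPLE = """\
## Definition of done

- Tests pass locally and in CI.
- Verification transcript is linked in the PR.

## Other notes
"""
print(check_definition_of_done(EXAMPLE))
```

Counting only non-blank lines keeps the limit about content rather than spacing.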

# AGENTS.md verification snippet

- Every Codex CLI run ends with the transcript snippet reviewers can replay.
- Pair browser evidence with the project's normal CLI checks before merge.
- If MCP servers are enabled, list allowed actions beside each connector name.

The drills for both surfaces live under CLI workflows.

Give the reviewer something to check against

A reviewer should not have to reconstruct intent. Hand them a table of gates and a short checklist, and the review starts grounded.

| Gate | Question |
| --- | --- |
| Risk routing | Were red folders touched, and who approved? |
| Replay proof | Which commands prove the regression guards hold? |
| Receipt match | Does the PR body list scopes and the verification transcript? |
| Rules precedence | Which rules file (AGENTS.md, .mdc, SKILL.md, or CLAUDE.md) governed behavior? |

Pair that with a PR proof checklist the author fills in before review:

  • Any MCP connectors mentioned are listed with their owners.
  • Verification command output is pasted or linked.
  • Forked agent work lists parent and child responsibilities.
  • Red-folder paths received explicit human acknowledgement.
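
If the checklist is pasted into the PR body as a Markdown task list (`- [ ]` and `- [x]`), a short script can report what is still unticked before a reviewer is assigned. The function name and the sample PR body are illustrative, and the wording of each item is the team's own.

```python
import re

# Matches Markdown task-list items such as "- [ ] text" or "* [x] text".
TASK = re.compile(r"^\s*[-*]\s+\[([ xX])\]\s+(.*)$")


def unchecked_items(pr_body: str) -> list[str]:
    """Return the text of every task-list item that is still unticked."""
    missing = []
    for line in pr_body.splitlines():
        match = TASK.match(line)
        if match and match.group(1) == " ":
            missing.append(match.group(2).strip())
    return missing


PR_BODY = """\
## PR proof
- [x] MCP connectors listed with owners
- [ ] Verification command output pasted or linked
- [x] Forked agent work lists parent and child responsibilities
- [ ] Red-folder paths acknowledged by a human
"""
print(unchecked_items(PR_BODY))
```

A ticked box is still only a claim. The script catches omissions; the reviewer checks the evidence behind each tick.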

Tool mechanics sit upstream of all this. The Codex CLI docs, features page, and slash commands reference describe what the agent can do, the quickstart sets up a machine, and the openai/codex and openai/skills repositories track releases. Architecture judgement stays human. Agents speed up execution, not ownership.

Common questions

  • How do Codex workflows keep Chrome and CLI runs telling one story?

    With a browser bridge note. Document staging URLs and credential boundaries beside every browser task. When Chrome work diverges from CLI habits, reviewers see two truths and stall. The note is the explicit handoff language that makes a demo stop contradicting its CI artifact, so the run reads the same on both surfaces.

  • What is a Codex workflow, concretely?

    A Codex workflow is the operating contract around an agent run: scoped tasks, approval gates, and verification notes. In practice you install four pieces of it: a browser bridge note, a model pin in AGENTS.md, a connector roster in the repo root, and a definition of done of ten lines or fewer. Together they keep the run reviewable.

  • Where do MCP credentials go wrong in browser work?

    MCP calls look harmless until credentials enter the transcript. That is why every connector is listed in a Markdown roster checked into the repo root. Each added server widens the blast radius, and least privilege erodes a little more. A security review can only start grounded once the roster names exactly what each connector is allowed to do.

  • What does the PR need before review starts?

    The PR proof checklist. MCP connectors list their owners, verification command output is pasted or linked, forked agent work names parent and child responsibilities, and every red-folder path has an explicit human acknowledgement. The test is plain: the handoff has to survive without the original operator in the room.

Try it on one flow

Pick your last Chrome demo and the CI run that followed it, and put them side by side. If they disagree, write the browser bridge note for that one staging flow before the next demo books a release slot.
