Agentic workflows from PR to merge
A PR review workflow for agentic coding teams: connector ownership, scoped tasks, replay transcripts, and human approval lanes from PR to merge.

Agentic workflows from PR to merge hold up when every handoff carries receipts: the scope the agent worked in, the commands it ran, and proof the change passes. A PR review workflow is the repo contract that lets a reviewer trust agent output without scrolling back through the chat. We watch teams trip on the same thing during readiness drills: the parent's intent and the child agent's scope quietly disagree, and nobody feels it until weeks later, when someone asks why a change ever merged.
The lever here is not faster reviewers. A reviewer cannot check work the repo never wrote down. So you make the repo write it down at every handoff.
Make each handoff leave a receipt
Pick one receipt format per kind of handoff, and keep the receipt in the repo, not in a chat window. Four show up again and again. Each one fixes a specific way that agent output slips past review.
The first is replay. If your team uses Codex CLI, OpenAI's coding agent, you will merge green builds where no reviewer ever saw the transcript. Fix it with a replay sandwich: AGENTS.md asks for an intent line, then the command transcript, then a diff summary, all before the PR opens. Now review is reproducible and nobody has to stand behind your terminal.
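One way to enforce the sandwich is a small check that runs before the PR opens. The sketch below is illustrative only and is not part of Codex CLI or any AGENTS.md convention: the section labels `Intent:`, `Transcript:`, and `Diff summary:` and the function name `replay_problems` are assumptions chosen for the example.

```python
import re

# Hypothetical labels: the article names the three parts, not their exact spelling.
REQUIRED_SECTIONS = ("Intent", "Transcript", "Diff summary")


def replay_problems(pr_body: str) -> list[str]:
    """Report replay-sandwich sections that are missing or out of order."""
    problems = []
    last_pos = -1
    for name in REQUIRED_SECTIONS:
        # A section counts only if its label starts a line, e.g. "Intent: ...".
        match = re.search(
            rf"^{re.escape(name)}:", pr_body, flags=re.MULTILINE | re.IGNORECASE
        )
        if match is None:
            problems.append(f"missing section: {name}")
        elif match.start() < last_pos:
            problems.append(f"out of order: {name}")
        else:
            last_pos = match.start()
    return problems
```

An empty list means the body carries all three parts in the expected order. Anything else is a reason to hold the PR until the receipt is complete.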
The second is connector reach. Wire up MCP fast and some connector will touch data nobody put on the diagram. Connectors ship as capability demos, so trust boundaries have to be explicit. Write one connector card per MCP server: allowed actions, forbidden actions, owner, rollback. When something goes wrong, the operator already knows what "off" looks like.
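A connector card can also be kept as structured data so a script can read it. This is a minimal sketch under assumptions: the class name `ConnectorCard`, its field names, and the deny-by-default rule in `permits` are hypothetical and do not come from the MCP specification.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ConnectorCard:
    """One card per MCP server (hypothetical shape, not an MCP-defined schema)."""

    server: str
    owner: str
    rollback: str
    allowed: frozenset[str] = field(default_factory=frozenset)
    forbidden: frozenset[str] = field(default_factory=frozenset)

    def permits(self, action: str) -> bool:
        # Deny by default: an action must be listed as allowed and not forbidden.
        return action in self.allowed and action not in self.forbidden


def card_problems(card: ConnectorCard) -> list[str]:
    """Flag cards that would leave an operator guessing during an incident."""
    problems = []
    if not card.owner.strip():
        problems.append("no owner")
    if not card.rollback.strip():
        problems.append("no rollback")
    overlap = card.allowed & card.forbidden
    if overlap:
        problems.append(f"both allowed and forbidden: {sorted(overlap)}")
    return problems
```

Treating unlisted actions as denied is a design choice for this sketch. It keeps a new connector capability from becoming reachable just because nobody wrote it down.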
The third is chained agents. When a parent hands work to a child, the summary that comes back tends to drop the paths the child actually edited. A child receipt block fixes that: every child returns the paths it touched, the commands it ran, and the tests that prove the regression guards still hold. The parent stops green-lighting diffs it never read.
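The parent's side of that contract can be checked mechanically. The sketch below assumes a hypothetical `ChildReceipt` structure with the three fields the receipt block calls for. It compares the paths in the diff against the paths the children claimed.

```python
from dataclasses import dataclass


@dataclass
class ChildReceipt:
    """What a child agent hands back (hypothetical structure for illustration)."""

    agent: str
    paths_touched: list[str]
    commands_run: list[str]
    tests_passed: list[str]


def unreceipted_paths(diff_paths: list[str], receipts: list[ChildReceipt]) -> list[str]:
    """Paths that changed in the diff but that no child receipt accounts for."""
    claimed = {path for receipt in receipts for path in receipt.paths_touched}
    return sorted(set(diff_paths) - claimed)


def doubly_claimed_paths(receipts: list[ChildReceipt]) -> dict[str, list[str]]:
    """Paths edited by more than one child, mapped to the agents that touched them."""
    owners: dict[str, list[str]] = {}
    for receipt in receipts:
        for path in receipt.paths_touched:
            owners.setdefault(path, []).append(receipt.agent)
    return {path: agents for path, agents in owners.items() if len(agents) > 1}
```

A non-empty result from either function is a prompt for the parent to read the diff before approving, not an automatic rejection.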
The fourth is the silent "why." CI passes, yet a reviewer still wants to know why this approach and not another, with no answer written anywhere. A decision stub in the PR template forces three lines: constraints considered, alternatives rejected, verification proof. The debate moves from taste to tradeoffs you can point at.
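To make the stub enforceable, a check can refuse PR bodies where any of the three lines is blank. This is a sketch under assumptions: it expects each label to start its own line in the form `Label: value`, and the function names are hypothetical.

```python
# Labels taken from the decision stub; the "Label: value" line format is an assumption.
STUB_FIELDS = ("Constraints considered", "Alternatives rejected", "Verification proof")


def parse_decision_stub(pr_body: str) -> dict[str, str]:
    """Collect the value written after each known stub label."""
    found = {}
    for line in pr_body.splitlines():
        label, sep, value = line.partition(":")
        if sep and label.strip() in STUB_FIELDS:
            found[label.strip()] = value.strip()
    return found


def empty_stub_fields(pr_body: str) -> list[str]:
    """Stub fields that are absent or left blank, in template order."""
    stub = parse_decision_stub(pr_body)
    return [name for name in STUB_FIELDS if not stub.get(name)]
```

The check only confirms that something was written. Whether the tradeoffs are sound is still the reviewer's call.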
Drop in a delegation snapshot
Here is a small file you can adapt and commit. It names what each agent is allowed to assume about scope, so a reviewer reads one place instead of three chats.
---
description: Delegation boundary snapshot (adapt globs to your repo)
globs:
- "**/*"
alwaysApply: false
---
- Cursor: keep scopes explicit in `.mdc`; forbid undeclared MCP domains.
- Claude Code: cite `CLAUDE.md` precedence before expanding bash scope.
- Codex: ensure `AGENTS.md` carries replay-friendly verification notes for CLI runs.
Cursor is Anysphere's AI code editor, and its .mdc rules are the natural home for scope. Claude Code and Codex carry the same idea in CLAUDE.md and AGENTS.md. The point is that the boundary lives in a file the reviewer can read, not in a conversation they have to reconstruct.
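A boundary file earns its keep when the diff can be checked against it. The sketch below assumes the declared scopes are available as a plain list of glob strings. It uses Python's `fnmatch`, whose `*` also matches `/`, so its matching may differ from the glob dialect your editor or agent uses. The function name `out_of_scope` is hypothetical.

```python
from fnmatch import fnmatchcase


def out_of_scope(diff_paths: list[str], declared_globs: list[str]) -> list[str]:
    """Return changed paths that match none of the declared scope globs."""
    return [
        path
        for path in diff_paths
        # fnmatchcase is case-sensitive on every platform, which keeps results stable in CI.
        if not any(fnmatchcase(path, glob) for glob in declared_globs)
    ]
```

For example, with scopes `["src/api/*", "tests/*"]`, a change to `infra/prod.tf` is returned as out of scope, and the reviewer can ask why before reading any code.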
Check four gates before you approve
A PR is approvable when these four questions have written answers in the PR body. If any one is blank, the receipt is missing and the review is guessing.
| Gate | Question |
|---|---|
| Replay proof | Which commands prove regression guards? |
| Receipt match | Does the PR body list scopes plus a verification transcript? |
| Rules precedence | Which .mdc, SKILL.md, or CLAUDE.md governed behavior? |
| Connector truth | Which MCP servers fired, and were they expected? |
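The four gates can be sketched as a lookup that reports which answers are still blank. The gate names and questions come from the table above. The `answers` dictionary, and how it gets filled in from a PR body, are assumptions left to your own tooling.

```python
# Gate names and questions mirror the table; everything else here is illustrative.
GATES = {
    "Replay proof": "Which commands prove regression guards?",
    "Receipt match": "Does the PR body list scopes plus a verification transcript?",
    "Rules precedence": "Which .mdc, SKILL.md, or CLAUDE.md governed behavior?",
    "Connector truth": "Which MCP servers fired, and were they expected?",
}


def blank_gates(answers: dict[str, str]) -> list[str]:
    """Gates with no written answer, in table order."""
    return [gate for gate in GATES if not answers.get(gate, "").strip()]


def approvable(answers: dict[str, str]) -> bool:
    """A PR is approvable only when every gate has a written answer."""
    return not blank_gates(answers)
```

This only checks that each answer exists. It cannot judge whether the answer is true, which stays with the human reviewer.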
A quick checklist to paste into the PR template:
- Primary-doc links were smoke-checked after publishing edits.
- MCP connectors mentioned (if any) list owners.
- Verification command output is pasted or linked.
- Forked agent work lists parent and child responsibilities.
These receipts are part of the wider agentic coding governance playbook, and they matter twice over once agents gain browser control, the case we work through in browser control guardrails for AI coding agents.
Keep ownership with people
None of this hands architecture judgment to the agent. Agents speed up execution; ownership stays with the team. Tooling is load-bearing language, so if the repo cannot say "allowed" and "forbidden," neither can the agent.
You will know it is working when standups stop being archaeology. The conversation goes back to design instead of digging through old diffs to figure out who decided what.
Docs to keep open
- Google Search Central: generative AI content guidance
- Model Context Protocol specification
- Cursor: Agent overview
- Claude Code: getting started
- OpenAI Developers: Codex quickstart
- OWASP: Top 10 for Large Language Model Applications
- OpenAI Skills repository
Common questions
What do agentic workflows from PR to merge need before approval?
The PR body needs scopes that match the folders in the diff, plus a verification transcript a reviewer can replay. The decision stub adds three forced lines: constraints considered, alternatives rejected, and verification proof. With those in place, the review checks evidence instead of arguing about taste, and nobody is reconstructing intent from a chat log.

How does the replay sandwich fix Codex review gaps?
The replay sandwich makes `AGENTS.md` require an intent line, the command transcript, and a diff summary before the PR opens. Review becomes reproducible without standing behind someone's terminal. That is what turns a merged green build into work the team actually owns, because the reviewer saw the same run the agent did.

What is a connector card for MCP servers?
A connector card is one markdown card per MCP server that lists allowed actions, forbidden actions, owner, and rollback. Connectors ship as capability demos, so least privilege needs explicit trust boundaries written down somewhere. Once the cards exist, incidents shrink, because the operator already knows what turning the connector off should look like.

Why do chained agents blur ownership before merge?
Chained agents blur ownership when summaries replace receipts. The child receipt block fixes it: every child returns the paths it touched, the commands it ran, and the tests proving the regression guards hold. Parents stop green-lighting mystery diffs. The duplicated edit that nobody reconciled is the expensive bug this prevents.
Where to go next
We rehearse this PR-to-merge drill with teams on their own repos, receipts enforced from the first run. The format is on the training page.