Recursive agents, guardrails that hold
Practical agentic coding governance for team training, MCP boundaries, and reviewable artifacts across tools.

The situation
The new signal is not just that an agent can do more. It is that one task can now be split into a parent agent plus recursive sub-agents, with the parent handling research, implementation, and verification. That changes how teams think about agentic coding governance: the risk is no longer only bad code, but also bad delegation, hidden tool use, and context drift that never gets reviewed.
For engineering teams running an AI coding workshop or standardizing AI coding training, the useful question is simple: what needs to be written down so multi-agent work stays reviewable? Usually the answer is not a bigger prompt. It is a small set of durable artifacts: scoped rules, memory files, skills, connector boundaries, and a verification loop people can inspect.
The official signal from Codex is that /orchestrate can recursively spawn agents through the Codex SDK, and that the team used it to research internal skills and improve cold-start performance. Treat that as an early pattern, not a finished operating model. The lesson is broader than one product: once agents can delegate, teams need clear guardrails for what each agent may read, change, and call.
If you are comparing Cursor, Claude Code, and Codex, the shared standard is straightforward: keep instructions close to the repo, keep connectors narrow, and make every agent-authored change land in a reviewable artifact. That is the practical center of AI coding governance.
Walkthrough
- Start with one team rule that defines the boundary.
  Write down what the parent agent may delegate, what it may not touch, and what must be verified by a human. In Cursor, that usually means a scoped `.cursor/rules/*.mdc` file plus `AGENTS.md` for repo conventions. In Claude Code, the anchor is `CLAUDE.md`. In Codex, use `AGENTS.md` and, when needed, a temporary `AGENTS.override.md` for short-lived exceptions.
```
---
description: Agent delegation and review boundary for this repo
globs:
  - "**/*"
alwaysApply: true
---

- Prefer small delegated tasks over broad autonomous edits.
- Do not change auth, billing, or deployment files without explicit review.
- Every agent run must end with a verification step and a summary of changed files.
```
- Give each tool one concrete artifact to own.
  - Cursor: use a small `.mdc` rule for repo-specific behavior, and keep broader conventions in `AGENTS.md`. If you are using background agents or subagents, make the handoff explicit: what the child agent may inspect, and what summary it must return.
  - Claude Code: keep durable project memory in `CLAUDE.md`, then add a skill when the task is repeatable but not always-on. Use hooks for deterministic checks such as formatting, permission boundaries, or logging. If the task depends on external systems, review the MCP connector scope before enabling it.
  - Codex: use `AGENTS.md` for instruction discovery, then run a verification loop in the CLI so the model's output is checked against tests, lint, or a dry run. If a task needs temporary policy changes, prefer an override file over editing the permanent repo rules.
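These per-tool artifacts can stay very small. A minimal `AGENTS.md` sketch, where the section names and commands are illustrative rather than a required schema:

```markdown
# AGENTS.md

## Repo conventions
- Run the test suite and linter before proposing a patch.
- Keep diffs scoped to the files named in the task.

## Delegation boundary
- Child agents may read the repo; they may not edit auth, billing, or deployment files.
- Every run ends with: the list of changed files, the reason for each change,
  and the verification command that was run.
```

The point is not the exact wording but that the boundary lives in the repo, where a reviewer can diff it like any other file.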
- Make delegation narrow enough to audit.
  Recursive orchestration works best when the child agent has one job: research one file tree, draft one patch, or validate one failure mode. Broad "fix the repo" prompts are where teams lose traceability. A good test is whether a reviewer can answer three questions from the final diff: what changed, why it changed, and what verified it.
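One way to make "one job per child agent" concrete is a short delegation brief the parent fills in before spawning. The fields and paths below are illustrative, not a format any of these tools mandates:

```markdown
## Delegation brief (one child agent, one job)
- Task: reproduce and explain the failing test in `tests/checkout.test.ts` (hypothetical path)
- May read: `tests/`, `src/checkout/`
- May edit: nothing (report only)
- Must return: the cause, a minimal patch suggestion, and the command that reproduces the failure
```

A brief like this doubles as the audit trail: the reviewer can check the final diff against what the child was actually allowed to do.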
- Put MCP behind a permission review.
  MCP is useful when the task needs GitHub, Slack, docs, Jira, or a database. It is also where governance gets real. Review connector scope before rollout, and treat least privilege as the default. If a skill or subagent can reach external systems, the evaluation should include connector behavior, not just model output.
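As a sketch, a project-scoped `.mcp.json` in Claude Code can pin the agent to a single named server with an explicitly injected token. The server package is a real reference implementation, but the exact schema and environment-variable expansion behavior vary by tool and version, so verify against your tool's MCP docs before relying on this shape:

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_PERSONAL_ACCESS_TOKEN}"
      }
    }
  }
}
```

Keeping the file in the repo makes connector scope reviewable in the same PR flow as code: adding a server is a diff, not a silent toggle.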
- Keep the verification loop visible.
  The most reliable pattern across tools is still: delegate, inspect, verify, then merge. For Claude Code, that may mean a review checklist plus hooks. For Codex, it may mean a CLI run that ends in tests or a sandboxed check. For Cursor, it may mean a background-agent PR policy that requires a human-readable summary and scoped diff.
A practical team exercise is to take one bloated instruction file and split it into a small rule tree. That usually surfaces the real governance gaps faster than a slide deck. It also fits a useful methodology habit: in the Review step, check the artifact, not the agent’s confidence.
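For that splitting exercise, the target shape is a small tree where each file has one scope. The filenames and numbering below are illustrative, not a convention any tool requires:

```
.cursor/rules/
  00-conventions.mdc    # alwaysApply: short repo-wide defaults
  10-delegation.mdc     # what agents may delegate, touch, and must verify
  20-testing.mdc        # scoped via globs to tests/** only
AGENTS.md               # cross-tool conventions (Codex, subagents)
CLAUDE.md               # Claude Code project memory, kept short
```

If a rule cannot be placed in exactly one of these files, that is usually the governance gap the exercise was meant to surface.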
Tradeoffs and limits
Recursive agents can reduce token use and speed up cold starts, but they also increase coordination overhead. If the parent agent delegates poorly, you get fragmented context and summaries that omit the reason a choice was made. If the child agent has too much access, you get hidden side effects.
Skills help with repeatable work, but they are not a substitute for repo rules. Memory files help with continuity, but they can drift if they grow into a junk drawer. Hooks are deterministic, but they can become brittle if they try to enforce policy that should live in review.
MCP boundaries deserve special caution. The more external systems an agent can touch, the more your evaluation needs to include permissions, failure modes, and rollback behavior. That is true whether the surface is an IDE, a terminal CLI, or a browser-connected workflow.
The main limit of the current signal is that it is still an early product note. The performance claims are interesting, but teams should verify them against official docs, changelogs, and their own repo metrics before adopting the pattern broadly.
Further reading
- https://cursor.com/docs
- https://code.claude.com/docs/en/overview
- https://code.claude.com/docs/en/memory
- https://support.claude.com/en/articles/12512176-what-are-skills
- https://developers.openai.com/codex
- https://developers.openai.com/codex/docs
- https://cursor.com/docs/rules
- https://cursor.com/docs/mcp