AI agent boundaries that hold under pressure
A guide to setting AI agent boundaries: connector cards, scope ledgers, child receipts, and decision stubs that stop permission drift.

A governance rollback review keeps finding the same thing: a dependency connector that widened its blast radius with no named owner anywhere. That discovery is what AI agent boundaries exist to prevent, and the cost of missing them lands on whoever is on call when the agent finds the gap. AI agent boundaries are the written limits on what an agent may touch: allowed paths, forbidden paths, connector scopes, and named owners. If onboarding feels noisy, the boundary is implicit, and the agent will discover it at the worst time.
The connector nobody owned
Another connector is rarely the missing ingredient; an owner for the existing one usually is. In governance rollback reviews, the expensive bug is always the same: permission drift that nobody signed off on.
Counter-thesis: Boundaries do not slow agents down; missing boundaries do, one surprise at a time.
The wrong path: We believed local greens equaled team-wide confidence. We tested it while hooks existed but nobody read the transcripts, and the boundary lived in people's heads right up until it mattered.
Diagnosis: The principle of least privilege. Every capability defaults to denied until someone owns the grant; connectors wired for demos invert that default and call it convenience.
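The default-deny rule can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical in-memory grant table; `Grant`, `is_allowed`, and the sample connector names are invented for the example and do not belong to any tool's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """One capability grant (hypothetical shape, for illustration only)."""
    connector: str
    action: str
    owner: str  # named person who signed the grant; empty means nobody did


def is_allowed(grants: list[Grant], connector: str, action: str) -> bool:
    """Default deny: allow only when a matching grant exists and has an owner."""
    return any(
        g.connector == connector and g.action == action and bool(g.owner.strip())
        for g in grants
    )


grants = [
    Grant("issue-tracker", "read", owner="dana"),
    Grant("issue-tracker", "write", owner=""),  # demo wiring, nobody signed it
]

print(is_allowed(grants, "issue-tracker", "read"))   # True
print(is_allowed(grants, "issue-tracker", "write"))  # False: ownerless grant
print(is_allowed(grants, "billing-db", "read"))      # False: never granted
```

The design choice worth noting is that an ownerless grant is treated the same as no grant at all, which is what keeps demo wiring from quietly becoming production access.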
Thesis: Governance is velocity once surprises stop scaling.
Boundaries that hold under load
A boundary holds when it is written, owned, and checkable in review.
MCP blast radius. Wire connectors quickly and one of them will touch data nobody listed on the diagram. Capability demos default to broad access; least privilege needs explicit trust boundaries.
Named fix: Connector card. One markdown card per MCP server: allowed actions, forbidden actions, owner, rollback. Incidents shrink because operators know what "off" looks like.
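A connector card is only useful if all four fields are filled in, and that is easy to check mechanically. The sketch below assumes a hypothetical `Field: value` line format for the card; the field labels follow the four items named above, while the function name and the sample card are illustrative.

```python
REQUIRED_FIELDS = ("allowed actions", "forbidden actions", "owner", "rollback")


def missing_card_fields(card_text: str) -> list[str]:
    """Return required fields that are absent or empty in a connector card.

    Assumes a hypothetical 'Field: value' line format; adapt to your own cards.
    """
    found: dict[str, str] = {}
    for line in card_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # only lines that actually contain a colon
            found[key.strip().lower()] = value.strip()
    return [name for name in REQUIRED_FIELDS if not found.get(name)]


card = """\
Connector: issue-tracker (hypothetical MCP server)
Allowed actions: read issues, comment on issues
Forbidden actions: delete issues, change permissions
Owner:
Rollback: revoke the token, remove the server entry from the agent config
"""

print(missing_card_fields(card))  # ['owner']
```

Run against every card in the repository, a check like this turns "who owns this connector?" into a failing lint rather than an incident question.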
Recursive handoff blur. Chained agents return summaries that omit child-owned paths, a telephone game played with your diff.
Named fix: Child receipt block. Every child returns paths touched, commands run, and tests proving regression guards. The boundary travels with the work instead of dissolving at each handoff.
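One way to picture the receipt is as a small record that the parent can audit. The following is a sketch under assumptions: the `ChildReceipt` field names mirror the three items above, and the paths, commands, and helper names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ChildReceipt:
    """What one child agent hands back; field names are illustrative."""
    child: str
    paths_touched: list[str] = field(default_factory=list)
    commands_run: list[str] = field(default_factory=list)
    regression_tests: list[str] = field(default_factory=list)


def receipt_gaps(receipt: ChildReceipt) -> list[str]:
    """List the receipt sections a child left empty."""
    sections = ("paths_touched", "commands_run", "regression_tests")
    return [name for name in sections if not getattr(receipt, name)]


def paths_missing_from_summary(
    summary_paths: set[str], receipts: list[ChildReceipt]
) -> set[str]:
    """Paths that children touched but the parent summary never mentions."""
    touched = {path for r in receipts for path in r.paths_touched}
    return touched - summary_paths


receipts = [
    ChildReceipt("child-a", ["src/auth/session.py"], ["pytest tests/auth"],
                 ["tests/auth/test_session.py"]),
    ChildReceipt("child-b", ["src/billing/invoice.py"], ["pytest tests/billing"], []),
]

print(receipt_gaps(receipts[1]))  # ['regression_tests']
print(paths_missing_from_summary({"src/auth/session.py"}, receipts))
# {'src/billing/invoice.py'}
```

The second check is the telephone-game detector: any path in a child receipt that is missing from the parent summary is exactly the omission the receipt exists to catch.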
Review queue theater. CI is green and reviewers still ask "why this approach?", with no written answer on file.
Named fix: Decision stub. The PR template forces three lines: constraints considered, rejected alternatives, verification proof. The boundary between agent judgement and team judgement gets a paper trail.
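A decision stub can be enforced with a small check on the PR body. This sketch assumes the three lines appear as `Label: text` with the labels used above; the exact labels, the function name, and the sample PR body are assumptions for illustration.

```python
import re

STUB_LABELS = ("Constraints considered", "Rejected alternatives", "Verification proof")


def missing_stub_lines(pr_body: str) -> list[str]:
    """Return decision-stub labels that have no text after them in the PR body."""
    missing = []
    for label in STUB_LABELS:
        # 'Label: some text' on one line; a bare 'Label:' does not count.
        pattern = rf"^[ \t]*{re.escape(label)}[ \t]*:[ \t]*\S"
        if not re.search(pattern, pr_body, flags=re.IGNORECASE | re.MULTILINE):
            missing.append(label)
    return missing


pr_body = """\
Constraints considered: must not change the public API
Rejected alternatives:
Verification proof: pytest tests/api -q passed locally
"""

print(missing_stub_lines(pr_body))  # ['Rejected alternatives']
```

The check only proves the lines exist and are non-empty; whether the answers are any good is still the reviewer's call.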
Cursor scope fog. Language in .mdc rules sounds precise until reviewers argue about what it meant; rules compete with chat memory in a split-brain coordination failure.
Named fix: Scope ledger. Five lines in the parent chat: goal, allowed paths, forbidden paths, verification command, merge owner. Review checks ledgers against diffs instead of debating prompts.
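Checking a ledger against a diff can be sketched as follows. The `ScopeLedger` fields follow the five lines above; the glob patterns, file names, and `check_diff` helper are hypothetical, and the list of changed files is assumed to come from something like `git diff --name-only`.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class ScopeLedger:
    """The five ledger lines as a record; values below are illustrative."""
    goal: str
    allowed_paths: tuple[str, ...]
    forbidden_paths: tuple[str, ...]
    verification_command: str
    merge_owner: str


def check_diff(ledger: ScopeLedger, changed_files: list[str]) -> dict[str, list[str]]:
    """Sort changed files into forbidden hits and files outside the allowed globs.

    fnmatchcase's '*' also matches '/', so 'src/api/*' covers nested folders.
    Forbidden patterns win over allowed ones.
    """
    forbidden = [f for f in changed_files
                 if any(fnmatchcase(f, pat) for pat in ledger.forbidden_paths)]
    out_of_scope = [f for f in changed_files
                    if f not in forbidden
                    and not any(fnmatchcase(f, pat) for pat in ledger.allowed_paths)]
    return {"forbidden": forbidden, "out_of_scope": out_of_scope}


ledger = ScopeLedger(
    goal="Add rate limiting to the API",
    allowed_paths=("src/api/*", "tests/api/*"),
    forbidden_paths=("src/api/auth/*", "infra/*"),
    verification_command="pytest tests/api -q",
    merge_owner="dana",
)
changed = ["src/api/limits.py", "src/api/auth/tokens.py", "docs/changelog.md"]

print(check_diff(ledger, changed))
# {'forbidden': ['src/api/auth/tokens.py'], 'out_of_scope': ['docs/changelog.md']}
```

Forbidden patterns are evaluated first so that a broad allowed glob cannot quietly re-admit a path the ledger explicitly rules out.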
---
description: Delegation boundary snapshot (adapt globs to your repo)
globs:
- "**/*"
alwaysApply: false
---
- Cursor: keep scopes explicit in `.mdc`; forbid undeclared MCP domains.
- Claude Code: cite `CLAUDE.md` precedence before expanding bash scope.
- Codex: ensure `AGENTS.md` carries replay-friendly verification notes for CLI runs.
The same boundary files travel across tools: Claude Code reads CLAUDE.md precedence, and Codex carries verification notes in AGENTS.md. In our methodology, Test proves behavior and Review proves the team can explain it; that is the bar everything in agentic coding governance reports to. When scores enter the picture, the same boundaries carry over into eval platform governance.
What review checks when a boundary is real
A real boundary turns review from interrogation into inspection.
| Gate | Question |
|---|---|
| Risk routing | Were red folders touched, and who approved? |
| Replay proof | Which commands prove regression guards? |
| Receipt match | Does the PR body list the scopes and the verification transcript? |
| Rules precedence | Which .mdc, SKILL.md, or CLAUDE.md governed behavior? |
Reviewer handoff
- Forked agent work lists parent and child responsibilities.
- Red-folder paths received explicit human acknowledgement.
- Scopes in the PR body match folders in the diff.
- MCP connectors mentioned (if any) list owners.
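The red-folder item on this checklist lends itself to a mechanical pre-check. The sketch below assumes a hypothetical list of red folders and a mapping from each folder to the human who acknowledged the change; none of these names come from a real tool.

```python
from pathlib import PurePosixPath


def unacknowledged_red_paths(
    changed_files: list[str],
    red_folders: list[str],
    acknowledgements: dict[str, str],
) -> list[str]:
    """Return changed files under a red folder that no human has acknowledged.

    acknowledgements maps a red folder to the approver's name (hypothetical format).
    """
    flagged = []
    for changed in changed_files:
        path = PurePosixPath(changed)
        for folder in red_folders:
            if path.is_relative_to(folder) and not acknowledgements.get(folder):
                flagged.append(changed)
                break  # one unacknowledged red folder is enough to flag the file
    return flagged


changed = ["infra/prod/main.tf", "src/billing/invoice.py", "src/api/limits.py"]
red_folders = ["infra/prod", "src/billing"]
acknowledgements = {"src/billing": "dana"}

print(unacknowledged_red_paths(changed, red_folders, acknowledgements))
# ['infra/prod/main.tf']
```

`PurePosixPath.is_relative_to` needs Python 3.9 or newer; it compares whole path components, so `infra/production` would not be mistaken for `infra/prod`.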
One image: Onboarding noise is the sound of implicit boundaries being discovered; a boundary the agent cannot read does not exist.
Boundary conditions
None of this replaces architecture judgement; agents accelerate execution, not ownership. For the threat catalogue itself, the OWASP Top 10 for LLM applications and the NIST AI Risk Management Framework stay on the required-reading shelf.
Best ways to use this research
- Best for: engineering teams comparing Cursor, Claude Code, and Codex operating habits while connectors multiply faster than owners.
- Best first artifact: a connector card for the one MCP server nobody currently owns; write it before the next automated run.
- Best comparison angle: list every connector your agents can reach, then list the named owners; the gap between the two lists is your real boundary map.
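The two-list comparison in the last bullet is a set difference. As a minimal sketch, assuming you already have the reachable connectors as a list and the owners as a name-to-person mapping (both hypothetical inputs here):

```python
def ownerless_connectors(reachable: list[str], owners: dict[str, str]) -> list[str]:
    """Connectors agents can reach that have no named owner on file."""
    return sorted(name for name in reachable if not owners.get(name, "").strip())


reachable = ["issue-tracker", "billing-db", "docs-search"]
owners = {"issue-tracker": "dana", "docs-search": ""}

print(ownerless_connectors(reachable, owners))  # ['billing-db', 'docs-search']
```

An empty owner string counts as no owner, so a placeholder entry does not hide a gap.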
Common questions
- How do you set AI agent boundaries that hold?
  Write them where the agent reads: connector cards with allowed and forbidden actions, scope ledgers with allowed and forbidden paths, and a named owner for every grant. A boundary that lives in someone's head is discovered by the agent at the worst possible time.
- Who should own an MCP connector?
  One named person, written on the connector card with the allowed actions, forbidden actions, and rollback step. Ownerless connectors are how blast radius widens quietly; when nobody owns the grant, nobody notices the drift, and rollback becomes an investigation instead of a step.
- What happens when agent boundaries stay implicit?
  The agent guesses, and guessing scales poorly. Implicit boundaries surface as onboarding noise, permission drift nobody signed, and summaries that omit child-owned paths. The fix is boring on purpose: cards, ledgers, receipts, and an owner whose name appears next to every grant.
Next move
If your connector list is longer than your owner list, contact us and we will map the boundaries with your team in one session.