
Agent boundaries for teams running coding agents

How to set agent boundaries for teams: connector ownership, written scopes, and review receipts that keep agent diffs explainable after the session ends.

Rogier Muller · April 28, 2026 · 6 min read

Onboarding week is when it shows. A new engineer approves a green PR, someone asks what the agent was allowed to touch, and the room goes quiet. Agent boundaries for teams exist for exactly this moment: an agent boundary is a written limit on what a coding agent may touch, plus the receipt proving the run stayed inside it. CI greens keep masking missing human judgement until those limits live in files.

The quiet room problem

Counter-thesis: Your team does not need smarter agents to merge calmly; it needs boundaries a stranger can read.

The wrong path: We believed smaller tasks guaranteed safer autonomy. We watched that belief fail during crunch weeks, when summaries shrank to bullet vibes and nobody could say which paths a run had owned.

Diagnosis: Fred Brooks is still right about accidental complexity. We shipped branching behavior (parallel runs, forked sessions, chained handoffs) without buying down the conceptual debt of explaining any of it.

Thesis: Explainable forks beat clever forks.

The bottleneck has moved from typing speed to traceability. Another prompt template will not replace ownership.

Boundaries that hold under pressure

Cursor scope fog. Teams shipping agent work weekly watch .mdc rule language sound precise until reviewers argue about what it meant. Rules compete with chat memory: split-brain coordination without a ledger.

Named fix: Scope ledger. The parent chat carries five lines: goal, allowed paths, forbidden paths, verification command, merge owner. Review shifts from debating prompts to checking ledgers against diffs.
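Checking a ledger against a diff can be mechanical. The sketch below is one possible shape, not a prescribed tool: the `ScopeLedger` class, the `out_of_scope` helper, and every path and owner in the example are hypothetical names chosen for illustration. It assumes the ledger's paths are written as glob patterns and that the changed paths come from something like `git diff --name-only`.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class ScopeLedger:
    """The five ledger lines, as data (hypothetical structure)."""
    goal: str
    allowed_paths: tuple[str, ...]    # glob patterns the run may touch
    forbidden_paths: tuple[str, ...]  # glob patterns that always lose
    verification_command: str
    merge_owner: str


def out_of_scope(ledger: ScopeLedger, changed_paths: list[str]) -> list[str]:
    """Return changed paths that are forbidden or match no allowed pattern."""
    violations = []
    for path in changed_paths:
        # Note: fnmatch-style "*" also matches "/" characters.
        forbidden = any(fnmatchcase(path, pat) for pat in ledger.forbidden_paths)
        allowed = any(fnmatchcase(path, pat) for pat in ledger.allowed_paths)
        if forbidden or not allowed:
            violations.append(path)
    return violations


# Illustrative values only; none of these come from a real repository.
ledger = ScopeLedger(
    goal="Add retry to the billing client",
    allowed_paths=("src/billing/*", "tests/billing/*"),
    forbidden_paths=("src/billing/migrations/*",),
    verification_command="pytest tests/billing",
    merge_owner="@alice",
)
diff = [
    "src/billing/client.py",
    "src/billing/migrations/0007.sql",
    "src/auth/session.py",
]
print(out_of_scope(ledger, diff))
# ['src/billing/migrations/0007.sql', 'src/auth/session.py']
```

Forbidden patterns win over allowed ones here, so a broad allow-list cannot silently cover a protected folder.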

Claude permission creep. On shared laptops, bash approvals become muscle memory. The failure is not tool quality; it is the missing operating contract.

Named fix: CLAUDE.md supremacy clause. The top of CLAUDE.md states which hooks win, which folders require human eyes, and where temporary overrides live (Claude Code docs). Sessions stop inventing policy mid-run.

Codex replay gaps. CLI convenience hides verification theater: the commands ran, the narrative did not, and green checks get merged even though no reviewer ever saw those commands run.

Named fix: Replay sandwich. AGENTS.md mandates intent line → command transcript → diff summary before the PR (Codex quickstart). Review becomes reproducible without standing behind someone's terminal.
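A replay sandwich is easy to lint before a PR opens. The following is a minimal sketch under stated assumptions: the three section labels (`Intent:`, `Commands:`, `Diff summary:`) are hypothetical conventions a team might adopt, not something AGENTS.md or Codex defines, and the check only confirms that the sections exist in order, not that their contents are truthful.

```python
# Hypothetical labels for the three layers of the sandwich, in required order.
REQUIRED_SECTIONS = ("Intent:", "Commands:", "Diff summary:")


def missing_replay_sections(pr_body: str) -> list[str]:
    """Return the sections that are absent or appear out of order."""
    missing = []
    cursor = 0  # search position; each label must appear after the previous one
    for label in REQUIRED_SECTIONS:
        index = pr_body.find(label, cursor)
        if index == -1:
            missing.append(label)
        else:
            cursor = index + len(label)
    return missing


complete = (
    "Intent: add retry to the billing client\n"
    "Commands:\n$ pytest tests/billing\n3 passed\n"
    "Diff summary: client.py gains a retry wrapper\n"
)
no_transcript = "Intent: add retry\nDiff summary: small tweaks\n"

print(missing_replay_sections(complete))       # []
print(missing_replay_sections(no_transcript))  # ['Commands:']
```

A check like this catches the most common gap, a summary with no transcript, while leaving the judgement about whether the transcript is convincing to the reviewer.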

MCP blast radius. Wire MCP quickly and a connector ends up touching data nobody listed on the diagram. Connectors default to capability demos; least privilege needs explicit trust boundaries.

Named fix: Connector card. One markdown card per MCP server: allowed actions, forbidden actions, owner, rollback. Incidents shrink because operators know what "off" looks like.
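The four card fields translate directly into a default-deny check. This is a sketch with hypothetical names throughout: `ConnectorCard`, `is_permitted`, the server name, and the action strings are illustrative and do not correspond to any real MCP server or to types in the MCP specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectorCard:
    """One card per MCP server (hypothetical structure)."""
    server: str
    allowed_actions: frozenset[str]
    forbidden_actions: frozenset[str]
    owner: str
    rollback: str  # how an operator switches the connector off


def is_permitted(card: ConnectorCard, action: str) -> bool:
    """Default deny: an action passes only if allowed and not forbidden."""
    return action in card.allowed_actions and action not in card.forbidden_actions


# Illustrative card; the names are invented for this example.
card = ConnectorCard(
    server="issue-tracker",
    allowed_actions=frozenset({"read_issue", "add_comment"}),
    forbidden_actions=frozenset({"delete_issue"}),
    owner="@platform-team",
    rollback="Remove the server entry from the shared MCP config and restart sessions",
)

print(is_permitted(card, "read_issue"))    # True
print(is_permitted(card, "delete_issue"))  # False
print(is_permitted(card, "close_issue"))   # False (never listed, so denied)
```

Treating unlisted actions as denied is the design choice that matters: a capability the connector gains later stays off until someone adds it to the card.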

---
description: Delegation boundary snapshot (adapt globs to your repo)
globs:
  - "**/*"
alwaysApply: false
---

- Cursor: keep scopes explicit in `.mdc`; forbid undeclared MCP domains.
- Claude Code: cite `CLAUDE.md` precedence before expanding bash scope.
- Codex: ensure `AGENTS.md` carries replay-friendly verification notes for CLI runs.

The snapshot reads the same whether the harness is Cursor's agent or a CLI. We map the discipline to our methodology under Review: evidence beats narration when merges touch shared surfaces. The harness-level version of this argument is in agent harnesses need guardrails, with the wider AI coding governance cluster behind it.

The evidence pack

A boundary that cannot be checked is a vibe. Four gates make it checkable.

| Gate | Question |
| --- | --- |
| Reviewer path | Can someone unfamiliar trace intent without chat replay? |
| Risk routing | Were red folders touched, and who approved? |
| Replay proof | Which commands prove regression guards? |
| Receipt match | Does the PR body list scopes + verification transcript? |
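The risk routing gate is the one most suited to automation. The sketch below assumes red folders are expressed as glob patterns and that approvals are recorded as a mapping from pattern to approver; the function name, the patterns, and the approver handles are all hypothetical.

```python
from fnmatch import fnmatchcase


def unapproved_red_paths(
    changed_paths: list[str],
    red_patterns: list[str],
    approvals: dict[str, str],
) -> list[str]:
    """Return changed paths inside a red folder that has no recorded approver."""
    flagged = []
    for path in changed_paths:
        for pattern in red_patterns:
            # An empty or missing approver counts as no approval.
            if fnmatchcase(path, pattern) and not approvals.get(pattern):
                flagged.append(path)
                break
    return flagged


# Illustrative inputs only.
changed = ["infra/deploy.yml", "src/auth/session.py", "README.md"]
red = ["infra/*", "src/auth/*"]
approved_by = {"infra/*": "@bob"}

print(unapproved_red_paths(changed, red, approved_by))
# ['src/auth/session.py']
```

An empty result answers both halves of the gate's question: either no red folder was touched, or every touched one names its approver.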

Review strip

  • MCP connectors mentioned (if any) list owners.
  • Verification command output is pasted or linked.
  • Forked agent work lists parent + child responsibilities.
  • Primary-doc links were smoke-checked after publishing edits.

What stays human

Hard constraints still belong to humans: threat models, customer promises, and blast radius decisions stay off autopilot. OWASP's LLM Top 10 is a sober companion read here, and the MCP specification defines the connector surface you are bounding.

One image: Agents are relief crews; the blueprint still belongs to humans standing outside the trench.

Best ways to use this research

  • Best for: engineering teams comparing coding agent operating habits under delivery pressure and deciding which boundary to write down first.
  • Best first artifact: a scope ledger in the next parent chat, then a connector card per MCP server; OpenAI's skills repository shows behavior defined in files rather than vibes.
  • Best when publishing: if the boundary docs go public, hold them against Google's generative AI content guidance so the evidence stays visible.

Common questions

  • How do you set agent boundaries for teams without slowing them down?

    Write the five-line scope ledger before the run, not after: goal, allowed paths, forbidden paths, verification command, merge owner. It takes a minute and removes the slowest step in review, which is reconstructing intent. Boundaries cost less than the archaeology they prevent.

  • Where should agent boundaries live?

    In version-controlled files the agent actually reads: .mdc rules for Cursor, a supremacy clause in CLAUDE.md, replay notes in AGENTS.md, and one connector card per MCP server. Chat instructions evaporate when the session ends; files survive review, handoff, and the next hire.

  • What proves an agent stayed inside its boundary?

    The receipt pack: scopes in the PR body that match folders in the diff, pasted verification output, and parent plus child responsibilities for forked work. If any red-folder path appears, the PR shows explicit human acknowledgement. Green CI alone proves nothing about scope.

  • Do boundaries still matter when only one developer uses agents?

    Yes, because the reviewer is the second user of every agent run. Even solo, the scope ledger and replay sandwich convert a private session into work the whole team owns and can defend later. The alternative is a bus factor of one unsaved chat history.

Next step

If you want a second pair of eyes on the boundaries you are about to standardize, contact us and bring your current rule files. The review takes one conversation, not a consulting engagement.
