Codex Agent Rules for Real Repos

Set up Codex agents with scoped AGENTS.md rules, safe MCP access, verification commands, and reviewable handoffs.

Rogier Muller | July 2, 2026 | 8 min read

Use a Codex agent as a bounded repo worker: give it scoped AGENTS.md rules, read-only MCP context by default, and a verification command it must run before handing back a diff. A Codex agent is a Codex-run coding assistant that reads your repo context, follows team instructions, proposes changes, and can be held to a reviewable workflow.

Codex, OpenAI's coding agent, is most useful when the team convention is boring and explicit. For Codex CLI workflows, that means treating context like an interface, not a pile of files.

Start with scoped AGENTS.md rules

Put durable rules where the work happens. A root AGENTS.md should describe repo-wide expectations, while nested files should describe local constraints for packages, services, or apps.

This matters because a real production repo rarely behaves like one codebase. A React app, a billing worker, and a database migration folder usually have different safety rules.

The trap is turning AGENTS.md into a long onboarding doc. Keep it short enough that a reviewer can tell whether the Codex agent followed it.

A useful root file might say: use pnpm, keep diffs small, run pnpm test --filter <package>, do not edit generated files, and ask before changing public API behavior. A nested apps/web/AGENTS.md can add: keep hooks inside src/features/*/hooks, do not call server-only modules from client components, and update Storybook stories for visible UI changes.
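The root-plus-nested layout can be sketched as a lookup: given a file the agent is about to edit, walk up from its directory and collect every AGENTS.md on the way to the repo root. This is an illustration of the scoping idea, not a description of how Codex resolves instructions internally. The helper name `agents_files_for` and the nearest-first ordering are assumptions.

```python
from pathlib import Path


def agents_files_for(target: Path, repo_root: Path) -> list[Path]:
    """Return the AGENTS.md files that apply to `target`, nearest first.

    Hypothetical helper: walks from the target's directory up to repo_root.
    """
    repo_root = repo_root.resolve()
    target = target.resolve()
    # Raises ValueError if the target is not inside the repo.
    target.relative_to(repo_root)

    found: list[Path] = []
    current = target.parent
    while True:
        candidate = current / "AGENTS.md"
        if candidate.is_file():
            found.append(candidate)
        if current == repo_root:
            break
        current = current.parent
    return found
```

For a file under apps/web, this would return apps/web/AGENTS.md first and the root AGENTS.md second, which is also the order a reviewer would read them in.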

For more patterns in this lane, see the related Codex CLI workflows topic.

Make MCP useful before it is powerful

MCP is the Model Context Protocol, an open protocol for connecting agents to external tools and data sources through a consistent interface. In a Codex workflow, MCP can expose things like GitHub issues, internal docs, design files, database metadata, or a code graph.

A TypeScript compiler graph MCP server is a good example of the pattern. Instead of asking the agent to scan a whole repo, the server can answer narrower questions like “what imports this symbol?” or “which files implement this interface?” That can reduce token waste, but the exact savings will depend on the repo, server design, and task.

Start read-only. Let the Codex agent query structure, tickets, and docs before you let it mutate anything outside the working tree.

The trap is giving MCP broad write access because the demo looked smooth. A production MCP server should have a permission boundary, a logging story, and a clear failure mode when data is missing.
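A permission boundary with a logging story and a clear failure mode can be sketched as a small policy gate that sits in front of tool calls. This is not part of any MCP SDK. The tool names, the `POLICY` table, and `check_tool_call` are all hypothetical, and a real server would enforce the same rules on its own side.

```python
import logging
from enum import Enum

logger = logging.getLogger("mcp_gate")


class Access(Enum):
    READ_ONLY = "read-only"
    APPROVAL_REQUIRED = "approval-required"
    BLOCKED = "blocked"


# Hypothetical tool names; replace with whatever your MCP servers expose.
POLICY: dict[str, Access] = {
    "code_graph.find_importers": Access.READ_ONLY,
    "issues.read": Access.READ_ONLY,
    "issues.comment": Access.APPROVAL_REQUIRED,
    "prod_db.query": Access.BLOCKED,
}


class ToolDenied(Exception):
    """Raised when a tool call falls outside the permission boundary."""


def check_tool_call(tool: str, approved: frozenset[str] = frozenset()) -> Access:
    """Allow or deny a tool call, logging every decision.

    Unknown tools are denied: a missing policy entry is a failure,
    not a default allow.
    """
    access = POLICY.get(tool)
    if access is None:
        logger.warning("denied %s: no policy entry", tool)
        raise ToolDenied(f"{tool}: no policy entry")
    if access is Access.BLOCKED:
        logger.warning("denied %s: blocked", tool)
        raise ToolDenied(f"{tool}: blocked")
    if access is Access.APPROVAL_REQUIRED and tool not in approved:
        logger.warning("denied %s: needs explicit approval in the task", tool)
        raise ToolDenied(f"{tool}: needs explicit approval")
    logger.info("allowed %s (%s)", tool, access.value)
    return access
```

The design choice worth copying is the default: anything not listed is denied and logged, so a new write-capable tool cannot slip in unnoticed.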

Require a verification loop, not a confident answer

The Codex CLI becomes much easier to trust when every task has a known finish line. Ask for a diff, the command output, and any skipped checks in the final handoff.

For example, a backend task can require pnpm test --filter api, pnpm lint --filter api, and pnpm typecheck --filter api. A docs-only task can require a link check or a local build instead.

The trap is accepting “looks good” as verification. The agent should say what it ran, what passed, what failed, and what it did not run.
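The "what ran, what passed, what failed, what did not run" report can be produced mechanically. The sketch below is one way to do it with the standard library. `CheckResult`, `run_checks`, and `format_report` are illustrative names, and the commands you pass in are whatever your repo actually uses.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class CheckResult:
    command: list[str]
    status: str  # "passed", "failed", or "not run"
    detail: str = ""


def run_checks(commands: list[list[str]], cwd: str = ".") -> list[CheckResult]:
    """Run every check and record its outcome without stopping at the first failure."""
    results: list[CheckResult] = []
    for command in commands:
        try:
            proc = subprocess.run(command, cwd=cwd, capture_output=True, text=True)
        except OSError as exc:
            # The command could not be started (for example, the tool is not installed).
            results.append(CheckResult(command, "not run", str(exc)))
            continue
        status = "passed" if proc.returncode == 0 else "failed"
        results.append(CheckResult(command, status, f"exit code {proc.returncode}"))
    return results


def format_report(results: list[CheckResult]) -> str:
    """Render one line per check, suitable for pasting into a handoff."""
    return "\n".join(
        f"[{r.status}] {' '.join(r.command)} ({r.detail})" for r in results
    )
```

For the backend example above, the call might be `run_checks([["pnpm", "test", "--filter", "api"], ["pnpm", "lint", "--filter", "api"], ["pnpm", "typecheck", "--filter", "api"]])`, with the formatted report pasted into the handoff.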

This is also where Codex training pays off for engineering teams. You are not teaching people magic prompts. You are teaching the team to make every Codex workflow end in evidence.

Paste this team convention

Use this as the first operational checklist for a Codex team. Put the repo-specific parts in AGENTS.md, then keep the review rule in your pull request template or team checklist.

```markdown
# Codex agent convention

## Scope
- Work only inside the package, app, or service named in the task.
- Read the nearest AGENTS.md before editing.
- Prefer the smallest reviewable diff that solves the issue.
- Do not edit generated files, lockfiles, migrations, or public API contracts unless the task explicitly asks for it.

## MCP boundary
- Treat MCP tools as context sources first.
- Use read-only MCP access by default.
- Do not use MCP tools that write to GitHub, Jira, Slack, databases, or docs without explicit approval in the task.
- If MCP data conflicts with repo files, trust the repo files and mention the conflict in the handoff.

## Verification loop
- Before handoff, run the narrowest relevant checks:
  - unit test for the changed package
  - typecheck for the changed package
  - lint for the changed package
  - app build only when the change crosses runtime or bundling boundaries
- If a check is too expensive or unavailable, say why and name the next best check.

## Handoff receipt
- Summarize the change in 3 bullets or fewer.
- List files changed by area.
- Paste the verification commands and results.
- Call out risks, skipped checks, and follow-up work.
```

The adoption path should be simple. One engineer proposes the convention in a pull request, one service owner reviews the local rules, and the team lands it in root AGENTS.md plus any nested AGENTS.md files that need stricter scope.

The review rule keeps it alive: no Codex-authored diff is accepted without a handoff receipt that names the nearest AGENTS.md file, the MCP boundary used, and the verification commands run. If that receipt is missing, ask for it before reviewing the code.
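The review rule is simple enough to check before a human opens the diff. The sketch below assumes a receipt written as one labeled line per field (`AGENTS.md:`, `MCP boundary:`, `Verification:`). That format and the function name are hypothetical, so adapt the labels to whatever your pull request template uses.

```python
REQUIRED_LABELS = ("AGENTS.md", "MCP boundary", "Verification")


def missing_receipt_fields(receipt: str) -> list[str]:
    """Return required labels that are absent or have an empty value.

    Assumes one "Label: value" pair per line, optionally prefixed by a bullet.
    """
    values: dict[str, str] = {}
    for line in receipt.splitlines():
        label, sep, value = line.partition(":")
        if sep:
            # Drop leading bullet markers and normalize case before storing.
            values[label.strip().lstrip("-* ").lower()] = value.strip()
    return [label for label in REQUIRED_LABELS if not values.get(label.lower())]
```

An empty list means the receipt names all three items. Anything else is the list of fields to ask for before reviewing the code.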

Keep diffs reviewable under context pressure

Do not build your workflow around a remembered context limit. Engineers sometimes ask for the maximum context size that was quoted for OpenAI's codex-1 in 2025, but the safer habit is to design tasks that do not need the maximum window.

Small tasks beat heroic context. Ask the agent to change one hook boundary, one package, one migration, or one failing test cluster.
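One way to hold a task to a single package is to compare the changed paths against the scope named in the task. The sketch below is a plain path check under the assumption that scope is one directory. The function name is illustrative.

```python
from pathlib import PurePosixPath


def files_outside_scope(changed: list[str], scope: str) -> list[str]:
    """Return changed paths that are not inside the `scope` directory."""
    scope_path = PurePosixPath(scope)
    # Compare whole path components so "apps/webby" does not match "apps/web".
    return [p for p in changed if scope_path not in PurePosixPath(p).parents]
```

The list of changed paths could come from `git diff --name-only`. A non-empty result is a prompt to split the task or to justify the extra files in the handoff.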

The trap is using large context as permission to skip structure. Even with a bigger window, the agent still benefits from scoped instructions, MCP queries, and a narrow verification loop.

A good handoff sounds like this: “Changed apps/web/src/features/billing/hooks/usePlan.ts and its tests. Used read-only code graph MCP to find importers. Ran pnpm test --filter web -- usePlan and pnpm typecheck --filter web; both passed.” That gives the reviewer something to check without replaying the chat.

Common questions

  • What is a Codex agent?

    A Codex agent is a coding assistant running through OpenAI Codex that can inspect a repo, follow instructions, make changes, and return a diff for review. In team use, the important artifact is not the label “agent”; it is the operating loop: scoped rules, bounded tools, verification commands, and a handoff receipt.

  • Is “codex agents md” the same thing as AGENTS.md?

    Usually, yes: engineers typing “codex agents md” are looking for the repo instruction file named AGENTS.md. The useful convention is to keep a short root file for global rules and add nested AGENTS.md files where local architecture constraints matter, such as apps/web or packages/billing.

  • Should we size workflows around the maximum context window quoted for OpenAI's codex-1 in 2025?

    No. Do not hardcode a workflow around a remembered or looked-up context-window number. As of July 2026, the durable practice is to minimize required context with scoped tasks, repo instructions, and MCP lookups, then verify the resulting diff with commands the reviewer can rerun.

  • When should MCP be writable in a Codex workflow?

    Writable MCP should be rare at first and tied to explicit approval in the task. A practical permission table is read-only for docs, issues, code graphs, and design metadata; approval-required for GitHub comments or ticket updates; and blocked for production databases or secrets unless your platform team has built a safe wrapper.

  • How do we know the agent followed the rules?

    You know by checking the handoff receipt against the diff. The receipt should name the nearest AGENTS.md, summarize the MCP tools used, list verification commands, and call out skipped checks; reviewers should reject the workflow, not just the code, when that evidence is missing.

Best ways to use this research

  • Best for Codex teams: Use this as a first convention for production repos where engineers already use the Codex CLI but want more consistent review behavior.
  • Best first artifact: Land the fenced Codex agent convention block in AGENTS.md, then customize only the commands and protected paths.
  • Best comparison angle: Compare agent output by the handoff receipt, not by how polished the chat sounded. A smaller diff with passing checks is better than a broad patch with vague confidence.
  • Best workshop exercise: In an OpenAI Codex training session, give two engineers the same bug: one with only a prompt, one with scoped AGENTS.md plus the verification loop. Review the diffs side by side.

Next step

Pick one service and add the convention there first. If the next Codex-authored pull request has a clean receipt, expand the pattern to the rest of the repo.

One methodology lens

One useful way to read this through our methodology is the Plan step: delegate first-pass decomposition and dependency mapping to the agent, review its sequencing and assumptions yourself, and keep ownership of scope and priorities. If that split is still fuzzy, the workflow usually is too.
