Codex Conventions for Mixed-Skill Teams

A practical Codex convention for AI coding training, skill rubrics, MCP boundaries, and review guardrails.

The Mekhitarist Fathers on San Lazzaro Island. Venice, landscape painting by Ivan Aivazovsky (1843).
Rogier Muller | June 22, 2026 | 9 min read

For teams with mixed coding experience, the safest AI coding workflow is a shared skill rubric plus repo-level rules, not unrestricted agent access. OpenAI Codex, OpenAI's coding agent, works best when it knows what a junior, mid-level, or senior-approved change looks like before it edits code.

Agentic coding governance is the set of team rules, tool boundaries, and review checks that keep coding agents useful without letting them invent local standards. This is practical AI coding for teams: teach the convention, enforce it in review, then let agents move faster inside that fence.

Start with a skill rubric, not a tool rollout

A good rollout begins by defining the skills your team wants Codex to demonstrate, not the features it can use. For a mixed-skill team, that usually means three levels: safe mechanical changes, bounded feature work, and architecture-sensitive changes that need senior review before implementation.

This is where AI coding support for a team with diverse skill levels becomes concrete instead of vague. A junior developer can ask Codex to update a React component, add tests, and run the existing checks. A staff engineer can ask Codex to compare two migration paths but still require an architecture note before code changes. Same tool, different allowed move. Orchestration does not replace judgment; boundaries still decide who can touch production code, credentials, database migrations, or public APIs.
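One way to make those boundaries checkable is a small path-based floor: if a change touches a sensitive area, it is treated as architecture-sensitive no matter who asked for it. The sketch below is only an illustration. The path patterns and the function names `sensitive_paths` and `minimum_level` are hypothetical, and a path check cannot tell a mechanical change from a bounded feature, so it only ever raises the level.

```python
from fnmatch import fnmatch

# Hypothetical patterns for areas that need senior review first.
# Adapt them to your own repository layout.
SENSITIVE_PATTERNS = [
    "migrations/*",
    "infra/*",
    "packages/auth/*",
    "services/billing/*",
]


def sensitive_paths(changed_paths: list[str]) -> list[str]:
    """Return the changed paths that fall inside a sensitive area."""
    return [
        path
        for path in changed_paths
        if any(fnmatch(path, pattern) for pattern in SENSITIVE_PATTERNS)
    ]


def minimum_level(changed_paths: list[str]) -> int:
    """Return 3 if any sensitive path is touched, otherwise 1.

    This is a floor, not a verdict: the author may still declare
    a higher level for a change that touches no sensitive path.
    """
    return 3 if sensitive_paths(changed_paths) else 1
```

A reviewer or CI job could compare this floor with the level stated in the PR and ask questions when the stated level is lower.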

Put the convention where Codex will read it

For Codex users, the convention should live in AGENTS.md. Put durable repo rules there, then keep task-specific instructions in the prompt or issue.

A useful root AGENTS.md explains the project, test commands, style constraints, review expectations, and forbidden actions. A nested packages/payments/AGENTS.md can add stricter rules for money movement, idempotency, logging, and migration review.

That local scope matters. A frontend package does not need the same agent rules as a billing worker. A docs folder should not inherit production database instructions. File names differ by tool: Claude Code uses CLAUDE.md, and Codex uses AGENTS.md in its CLI workflow. Cross-tool governance means the rule is shared even when the file name changes. Keep scope tight; local files beat a sprawling root policy.
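To see which rule files cover a given change, a reviewer can walk from the repo root down to the changed file's folder. The sketch below is a review aid under stated assumptions, not a description of how Codex itself resolves instructions. The function names are hypothetical, and the set of existing files is passed in rather than read from disk so the example stays self-contained.

```python
from pathlib import PurePosixPath


def candidate_agents_files(changed_file: str) -> list[str]:
    """List possible AGENTS.md locations from the repo root
    down to the folder that contains the changed file."""
    folders = PurePosixPath(changed_file).parent.parts
    candidates = ["AGENTS.md"]
    for depth in range(1, len(folders) + 1):
        candidates.append(str(PurePosixPath(*folders[:depth], "AGENTS.md")))
    return candidates


def applicable_agents_files(changed_file: str, existing: set[str]) -> list[str]:
    """Keep only the candidates that exist, ordered root first."""
    return [p for p in candidate_agents_files(changed_file) if p in existing]
```

For a repo that has `AGENTS.md` and `packages/payments/AGENTS.md`, a change to `packages/payments/src/charge.ts` is covered by both files, while a change under `docs/` is covered by the root file only.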

For broader patterns, use the related training topic as your team map, then adapt the examples to your own repo.

Set MCP boundaries before credentials appear

Model Context Protocol, or MCP, is a standard way for coding agents to connect to external tools such as GitHub, Slack, Jira, databases, design systems, and internal knowledge bases.

MCP is powerful because it lets Codex work with real context. It can inspect an issue, read a design doc, query a test database, or open a pull request when allowed. That same reach makes MCP a governance boundary. Give Codex read-only access by default. Allow write access only for narrow systems, narrow repos, and narrow tasks.

A practical boundary note might say: Codex may read GitHub issues and pull request comments, may create branches in this repository, may not write to production databases, and may not post to customer-facing Slack channels. For data work, it may query staging only and must paste the query in the PR.

The trap is adding MCP servers during a live task because the agent asks for more context. Decide the boundary first. If a developer wants a broader permission, make that a human review item, not an in-chat click-through.
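The boundary note above can be written down as a deny-by-default allowlist, so that widening access means changing a reviewed file rather than clicking through a prompt. This is a minimal sketch: the system and action names are hypothetical labels chosen for illustration, not identifiers from MCP or from any real MCP server.

```python
# Hypothetical allowlist mirroring the boundary note.
# Anything not listed here is denied.
ALLOWED_ACTIONS: dict[str, set[str]] = {
    "github": {"read_issue", "read_pr_comment", "create_branch"},
    "database:staging": {"query"},
}


def is_allowed(system: str, action: str) -> bool:
    """Return True only when the system and action are both listed."""
    return action in ALLOWED_ACTIONS.get(system, set())
```

With this table, `is_allowed("github", "create_branch")` is true, while a production database query or a Slack post is refused because neither appears in the allowlist.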

Make Codex prove the change

Codex should not just produce a diff. It should produce evidence that the diff is safe enough to review.

For small changes, that evidence can be boring: npm test, npm run lint, and a focused unit test. For larger changes, it might include a migration dry run, an API compatibility check, a screenshot comparison, or a rollback note.

The verification loop belongs in both AGENTS.md and code review. Ask Codex to run the expected commands, summarize failures, fix only failures it understands, and stop when the next step requires product or architecture judgment.

This is also the heart of an AI coding workshop. Developers learn fastest when they see the full loop: prompt, diff, test, failure, repair, review. Engineering team training should make that loop normal before the team tries more autonomy.

The trap is accepting green-looking agent output without command output. A plausible summary is not a verification receipt. If the agent cannot run the check, the PR should say why.
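A verification receipt can be as simple as the command, its exit code, and its raw output. The sketch below shows one way to collect that with Python's standard `subprocess` module. The helper names `run_check` and `format_receipt` and the receipt layout are assumptions made for illustration, not part of Codex.

```python
import subprocess


def run_check(command: list[str]) -> dict:
    """Run one check and keep the evidence a reviewer needs."""
    label = " ".join(command)
    try:
        result = subprocess.run(command, capture_output=True, text=True)
    except FileNotFoundError:
        # The tool is not installed here; record that instead of hiding it.
        return {"command": label, "ran": False, "exit_code": None,
                "output": "command not found"}
    return {"command": label, "ran": True, "exit_code": result.returncode,
            "output": (result.stdout + result.stderr).strip()}


def format_receipt(receipts: list[dict]) -> str:
    """Render the receipts as plain text for the PR description."""
    lines = []
    for receipt in receipts:
        if not receipt["ran"]:
            status = "could not run"
        elif receipt["exit_code"] == 0:
            status = "passed"
        else:
            status = "FAILED"
        lines.append(f"$ {receipt['command']}  [{status}]")
        lines.append(receipt["output"])
    return "\n".join(lines)
```

In a repo like the one described here, the input would be commands such as `["npm", "test"]` and `["npm", "run", "lint"]`, and the formatted text would go into the PR as is, failures included.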

If you want to go deeper on review thresholds, pair this convention with Skill Rubrics for Coding Agents.

Paste this skill acceptance rubric into AGENTS.md

Use this as a starting convention. It is intentionally small enough for a real repo.

# AGENTS.md — Codex skill acceptance rubric

Purpose:
Use Codex to speed up engineering work while keeping review standards consistent across skill levels.

Skill levels:
- Level 1: Mechanical change
  - Allowed work: renames, formatting, dependency-safe refactors, docs updates, simple test additions.
  - Required proof: existing test command plus one focused check when relevant.
  - Stop condition: unclear behavior change, failing test the agent cannot explain, or touched security-sensitive code.

- Level 2: Bounded product change
  - Allowed work: small feature behind an existing pattern, bug fix with reproduction, API client update, component behavior change.
  - Required proof: failing test or reproduction before fix, passing test after fix, reviewer notes for behavior changes.
  - Stop condition: new architecture, new external dependency, data migration, auth change, payment change.

- Level 3: Architecture-sensitive change
  - Allowed work: planning, option comparison, test scaffolding, migration draft, risk checklist.
  - Required proof: design note approved by a senior reviewer before implementation.
  - Stop condition: production data access, credential handling, public API contract change, cross-service rollout.

Codex CLI verification loop:
1. Inspect the relevant AGENTS.md files before editing.
2. State the intended skill level in the PR description.
3. Make the smallest coherent change.
4. Run the repo checks listed below:
   - npm test
   - npm run lint
   - npm run typecheck
5. Paste command results or explain why a command could not run.
6. Do not hide failing checks. Summarize the failure and the next human decision needed.

MCP boundary:
- GitHub: read issues and PRs; create branches and PRs only in this repo.
- Jira or Linear: read tickets only.
- Database: staging read-only unless a maintainer grants temporary access.
- Slack: read linked engineering channels only; do not post automatically.
- Secrets: never request, print, store, or transform secrets.

Skills handoff:
When a repeated workflow becomes stable, propose a reusable skill with:
- name
- description
- inputs
- exact steps
- verification commands
- known failure modes

Review rule:
A PR with agent-authored code cannot be approved until the reviewer can find the stated skill level, changed files, verification output, and any MCP access used.

Adoption should be lightweight. A tech lead proposes the first rubric, two frequent reviewers edit it, and the team tries it on three real pull requests before calling it standard.

Put the root version in AGENTS.md. Put stricter local versions next to high-risk code, such as services/billing/AGENTS.md, infra/AGENTS.md, or packages/auth/AGENTS.md.

The review rule keeps the convention alive. If a PR does not state its skill level and verification evidence, reviewers ask for that before debating the code.
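Reviewers can apply that rule by eye, but a small check makes the missing pieces obvious before anyone reads the diff. The sketch below assumes a hypothetical PR description format with `Skill level:`, `Verification:`, and `MCP access:` lines; the labels and the function name are illustrations, not an established convention. Changed files are not checked because the PR diff already lists them.

```python
import re

# Hypothetical labels that the PR description is expected to contain.
REQUIRED_SECTIONS = {
    "skill level": re.compile(
        r"^Skill level:\s*(?:Level\s*)?[123]\b", re.MULTILINE | re.IGNORECASE
    ),
    "verification output": re.compile(
        r"^Verification:\s*\S", re.MULTILINE | re.IGNORECASE
    ),
    "MCP access": re.compile(
        r"^MCP access:\s*\S", re.MULTILINE | re.IGNORECASE
    ),
}


def missing_sections(pr_description: str) -> list[str]:
    """Return the names of required sections the description lacks."""
    return [
        name
        for name, pattern in REQUIRED_SECTIONS.items()
        if not pattern.search(pr_description)
    ]
```

An empty result means the reviewer can move on to the code. Anything else is the list of items to request first.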

Common questions

  • What are good AI coding solutions for teams with diverse skill levels?

    The best solutions combine shared repo instructions, a skill acceptance rubric, and review guardrails rather than giving every developer the same unrestricted agent access. One practical artifact is an AGENTS.md rubric with three skill levels: mechanical changes, bounded product changes, and architecture-sensitive work that requires senior approval before implementation.

  • Should junior developers use Codex directly?

    Yes, junior developers can use Codex directly when the task boundary is clear and the review bar is visible. Start them with Level 1 work, require command output in the PR, and ask reviewers to comment on the prompt-to-verification loop, not only the final diff.

  • Do we need multi-agent orchestration for this?

    No, most teams should standardize the single-agent workflow before adding multi-agent orchestration. A useful first milestone is boring: one repo convention, one verification loop, and one review checklist that works across five real pull requests without special handling.

  • Where should MCP permissions be documented?

    MCP permissions should be documented in the same place as the agent rules, usually AGENTS.md plus any local file near sensitive code. The important caveat is that write permissions need a named boundary, such as one repository, staging-only data, or temporary access for one task.

  • How strict should the rubric be?

    The rubric should be strict about risk and flexible about implementation style. Keep three levels at first, because more categories create review theater; add a fourth only when your team repeatedly sees a real class of work that does not fit.

Make the next PR your pilot

Pick one low-risk repo, paste the rubric into AGENTS.md, and require the skill level plus verification output on the next three Codex-assisted PRs. After that, tighten the parts reviewers actually used and delete anything decorative.

One methodology lens

One useful way to read this through our methodology is the Plan step: delegate first-pass decomposition and dependency mapping, review the sequencing and assumptions, and keep ownership of scope and priorities. If that split is still fuzzy, the workflow usually is too.
