
Codex agents.md vs. Claude Code CLAUDE.md — Which Project Context System Actually Works Better?

Both Codex and Claude Code use a markdown file to anchor project context. Here's how agents.md and CLAUDE.md differ and when each approach wins.

MindStudio Team

Both tools use a markdown file to anchor project context — and the choice between them has real consequences

If you’re switching between Claude Code and OpenAI Codex, you’ll hit the agents.md vs CLAUDE.md question almost immediately. Both files do the same conceptual job: give the agent a persistent briefing that survives across chat sessions. But the implementation details differ enough that copying your CLAUDE.md straight into a Codex project and renaming it will leave performance on the table.

This post is a direct comparison of Codex's agents.md and Claude Code's CLAUDE.md: what each file does, where each approach wins, and what you should actually put in each one.


Why project context files exist in the first place

Neither Claude Code nor Codex has persistent memory across sessions by default. Open a new chat and the agent starts cold. The context file is the workaround: a markdown document the agent reads at the start of every new conversation, so it knows who you are, what the project is, and what conventions to follow.

In Claude Code, that file is CLAUDE.md. In Codex, it’s agents.md. Same idea, different name, different behavior.


The reason this matters more now than six months ago is that both tools have gotten significantly more autonomous. Codex’s /goal command can run agentic loops for multiple hours without human intervention. Claude Code has cloud routines. When an agent is running unsupervised for that long, a badly written context file isn’t just an inconvenience — it’s a source of compounding errors.


Five dimensions that actually matter

Before going file-by-file, here’s the framework I’d use to evaluate any project context system:

1. Read frequency. How often does the agent actually consult the file? Is it once per session, once per task, or on-demand?

2. Token cost. A bloated context file eats your session budget on every single chat. This is especially painful in Codex, where you have a 5-hour session reset and a weekly reset — both visible in the rate limits panel under Settings.

3. Scope. Does the file apply globally, per-project, or per-chat? Can you have multiple levels?

4. Portability. Can the same file work across different agent harnesses — Codex, Claude Code, Cursor, OpenClaw?

5. Actionability. Does the file actually change agent behavior, or is it just documentation that gets ignored under pressure?


How CLAUDE.md works in Claude Code

Claude Code reads CLAUDE.md at session start and treats it as high-priority context. The file is project-local by default — it lives in your project root — but you can also place one in ~/.claude/CLAUDE.md for global defaults that apply across all projects.

The format is flexible. Most practitioners use a mix of:

  • Project description and goals
  • Tech stack and conventions
  • Recurring patterns to follow or avoid
  • Known issues and their resolutions

Claude tends to be good at using CLAUDE.md for exploratory and planning work. The model has a strong prior toward following structured instructions, and Anthropic has clearly optimized the harness to weight this file heavily. If you write “always use TypeScript strict mode” in CLAUDE.md, Claude Code will generally respect that even deep into a long session.

One structural advantage: Claude Code’s context window is large (Opus 4.7 gives you 1 million tokens), so a verbose CLAUDE.md doesn’t hurt as much. You can afford to be thorough. That said, the /compact command exists for a reason — managing context rot in Claude Code is a real concern on long sessions, and a bloated CLAUDE.md accelerates the problem.

The weakness of CLAUDE.md is that it’s a single file. There’s no native tiering — you can’t say “read this section only when working on the API layer.” Everything gets loaded every time.


How agents.md works in Codex

Codex reads agents.md at the start of every new chat in a project. The behavior is functionally identical to CLAUDE.md in that respect. But there are meaningful differences in how you should write it.

First, token efficiency matters more in Codex. GPT-5.5 is notably efficient on both input and output tokens compared to Opus — practitioners report sessions lasting significantly longer in Codex than in Claude Code for equivalent workloads. But that efficiency advantage erodes if your agents.md is 3,000 words of background context that gets re-read on every single chat. Keep it tight.

Second, Codex has a context window bar at the bottom of every chat showing fill percentage. Watch it. If you’re hitting 30% context fill before you’ve typed a single message, your agents.md is too long.
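You can sanity-check that before a session even starts. The sketch below estimates context fill from character count; the 4-characters-per-token ratio is a rough heuristic for English prose (not Codex's actual tokenizer), and the window size is a placeholder you should replace with your model's real limit:

```python
from pathlib import Path

CHARS_PER_TOKEN = 4        # rough heuristic for English prose, not a real tokenizer
CONTEXT_WINDOW = 272_000   # placeholder; substitute your model's actual window size

def estimate_context_fill(path: str, window: int = CONTEXT_WINDOW) -> float:
    """Estimate the fraction of the context window a file consumes."""
    chars = len(Path(path).read_text(encoding="utf-8"))
    return (chars / CHARS_PER_TOKEN) / window

# usage: print(f"agents.md ~ {estimate_context_fill('agents.md'):.1%} of the window")
```

If the estimate alone is a noticeable slice of the window, trim the file before the session does it for you.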

Third, agents.md interacts with the rest of Codex’s context system in ways CLAUDE.md doesn’t. Codex has a /memories slash command, a skills system (more on that below), and the ability to tag specific files with @ in chat. The agents.md should be the high-level briefing; the detailed, task-specific knowledge should live in skills files or be tagged inline.

Here’s what Nate Herk’s workflow looks like in practice: after a new project is created, he asks Codex to read several existing files (transcripts, prior work, whatever’s relevant), then asks it to generate an agents.md from that context. The file gets project description, goals, tech stack, and any known gotchas. It’s a living document — when the agent runs into a new issue (like the TLS problem with PowerShell’s web request in his YouTube API demo), he asks it to add that knowledge to agents.md so it doesn’t repeat the mistake.

That’s the right instinct, but with a caveat: as the project grows, agents.md can balloon. At some point you want to move resolved issues into a separate project-notes.md or into a skill file, and keep agents.md focused on what the agent needs to know before it starts working.


The skills layer: where Codex pulls ahead

This is the part of the comparison that doesn’t have a direct equivalent in vanilla Claude Code.

Codex has a skills system. Skills are markdown files — recipes that tell the agent how to do a specific repeatable task. They live either globally at ~/.codex/skills/ or locally in the project directory. A global skill is available in every Codex project. A local skill is project-specific.

The interesting thing is that these are just markdown files. The same skill files work across Claude Code, Codex, Cursor, and OpenClaw — any agent harness that reads from the local directory. This means your agents.md can be lean (high-level context only) because the detailed procedural knowledge lives in skills files that get invoked on demand via /skill-name slash commands.
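For a feel of what a skill file contains, here is a generic sketch. The skill name, path, and section headings are illustrative, not a documented format; check your harness's conventions:

```markdown
<!-- ~/.codex/skills/release-notes.md (hypothetical skill) -->
# release-notes

Generate release notes from merged PRs since the last tag.

## Steps
1. Run `git log $(git describe --tags --abbrev=0)..HEAD --oneline`
2. Group commits by area (ui, api, infra) based on path prefixes
3. Append the notes to `CHANGELOG.md` under a new dated heading

## Constraints
- Never rewrite existing changelog entries
- Keep each bullet under 100 characters
```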

In the YouTube analytics demo, after Codex built the comment analysis pipeline, Herk asked it to turn the entire workflow into a skill. Codex reverse-engineered the process and wrote a skill file called youtube-comment-insights. Now instead of re-explaining the workflow every time, you just type /youtube-comment-insights and the agent follows the recipe. The agents.md stays clean.

Claude Code doesn’t have this native skill invocation system, though you can approximate it by putting skill-like files in your project and referencing them explicitly. The difference is ergonomics: in Codex, /skill-name is a first-class interface. In Claude Code, you’re manually pointing at files.

For teams building self-improving AI skills with Claude Code, the portability of these markdown skill files is genuinely useful — you can author skills in Claude Code and use them in Codex, or vice versa.


Plan mode and the context file interaction

One Codex feature that changes how you should write agents.md is the Plan mode toggle. When Plan mode is on, Codex won’t execute anything — it only brainstorms. This means you can have a conversation to refine your understanding before any code gets written.


The implication for agents.md: you don’t need to front-load every possible constraint into the file. You can use Plan mode to surface constraints interactively, then ask Codex to update agents.md with what you learned. The file grows organically from real sessions rather than being written speculatively upfront.

CLAUDE.md doesn’t have an equivalent mode. Claude Code will execute unless you explicitly tell it not to in your prompt. This pushes more of the constraint-setting work into the initial CLAUDE.md authoring, which is why you see more elaborate CLAUDE.md files in the wild — people are compensating for the lack of a dedicated planning mode.


Portability: the cross-harness argument

Here’s something that gets undersold: because both agents.md and CLAUDE.md are just markdown files in a local directory, you can run multiple agent harnesses against the same project.

In practice, this means you can use Claude Code for exploratory brainstorming (where Opus tends to be stronger) and Codex for execution (where GPT-5.5’s token efficiency and instruction-following shine on longer plans). The files coexist. You might have both CLAUDE.md and agents.md in the same project root, each tuned for its respective harness.

When migrating a Claude Code project to Codex, the process is straightforward: ask Codex to read the existing CLAUDE.md and generate a compatible agents.md. It’ll rename the file and adjust any Claude-specific conventions. Takes about 30 seconds.
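If you'd rather seed the file deterministically before asking the model to adapt conventions, a tiny script can copy the briefing over as a first draft. The helper name is mine; only the two file names come from the tools themselves:

```python
from pathlib import Path

def bootstrap_agents_md(project_root: str = ".") -> bool:
    """Copy CLAUDE.md to agents.md as a first draft for Codex to refine.

    Returns True if a new agents.md was written; False if CLAUDE.md is
    missing or agents.md already exists (never overwrite a tuned file).
    """
    src = Path(project_root) / "CLAUDE.md"
    dst = Path(project_root) / "agents.md"
    if not src.exists() or dst.exists():
        return False
    dst.write_text(src.read_text(encoding="utf-8"), encoding="utf-8")
    return True
```

From there, a single in-session prompt ("rewrite agents.md for Codex; strip Claude-specific conventions") finishes the migration.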

This portability also matters when thinking about where your project context actually lives. If you're building a full-stack application, tools like Remy take a different approach entirely: you write annotated markdown, and the full-stack app (TypeScript backend, SQLite database, auth, deployment) gets compiled from it. The spec is the source of truth; the code is derived output. That's a different abstraction layer than agents.md, but the underlying instinct (markdown as the authoritative document) is the same.


The automations gotcha that affects both

One thing that affects how you write your context files: Codex automations default to GPT-5.2, not GPT-5.5. This is a documented bug. If you set up a scheduled automation and don’t manually change the model setting, it runs on a significantly weaker model — and your agents.md instructions that were tuned for GPT-5.5’s capabilities may not work as expected.

The fix is manual: go into each automation and set the model to GPT-5.5 with your preferred reasoning level (medium for most tasks, high for complex builds). This is worth calling out because a 40-minute automation stall — the kind Herk ran into in the YouTube analytics demo — can look like a context file problem when it’s actually a model version problem.

Claude Code doesn’t have this issue because you’re always running the model you selected. But it does have its own version of this: if you’re using Claude Opus 4.7 vs 4.6, the behavior differences are significant enough that a CLAUDE.md tuned for one version may need adjustment for the other.


Verdict: which file wins in which situation

Use agents.md (Codex) when:

  • You’re running long autonomous sessions via /goal and need the agent to stay oriented without human check-ins
  • You want to pair the context file with a skills system for repeatable workflows
  • Token efficiency matters — you’re watching your 5-hour session budget
  • You’re building something where Plan mode’s brainstorm-before-execute workflow fits your process
  • You want the context file to stay lean and push detailed knowledge into skills


Use CLAUDE.md (Claude Code) when:

  • You’re doing exploratory, creative, or architectural work where Claude’s reasoning style is a better fit
  • You have a large, complex context file and want the 1M token window to absorb it without penalty
  • You’re working in a team that already has Claude Code conventions established
  • You need deep integration with Anthropic’s ecosystem (MCP servers, Claude-specific tooling)

Use both when:

  • You’re running a hybrid workflow — Claude for planning, Codex for execution
  • You want the flexibility to switch harnesses without rewriting your project context

The honest answer is that neither system is strictly better. They’re tuned for different models with different strengths. GPT-5.5 vs Claude Opus 4.7 on real-world coding tasks shows this clearly — the models have different token economics and different failure modes, and the context files should reflect that.

For teams building agents at scale, platforms like MindStudio handle the orchestration layer above this: 200+ models, 1,000+ integrations, and a visual builder for chaining agents and workflows — so you’re not manually managing which context file goes with which model for which task.


The one thing both files get wrong

Neither agents.md nor CLAUDE.md has a good answer for context that’s conditionally relevant. Both files get read in full, every time, regardless of what you’re actually doing in the session.

If your project has a frontend layer, a backend layer, and a data pipeline, and you’re only touching the data pipeline today, you’re still paying the token cost for all the frontend and backend context in your project file.

The skills system in Codex partially solves this — you can put layer-specific knowledge in skills and invoke them only when needed. But the base agents.md still loads unconditionally.

The right mental model: agents.md and CLAUDE.md are for context that’s always relevant. Anything that’s only relevant sometimes belongs in a skill file, a tagged document, or an inline @reference. Keep the base file to what the agent genuinely needs to know before it types its first character.
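On disk, that separation might look like this (the file names besides agents.md and CLAUDE.md, and the skills directory location, are illustrative):

```
project/
├── agents.md           # always-relevant: project summary, stack, conventions
├── CLAUDE.md           # the same briefing, tuned for Claude Code
├── project-notes.md    # resolved issues and history; tag with @ when relevant
└── skills/
    └── deploy-staging.md   # invoked on demand, not loaded every session
```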

That discipline — knowing what’s always-relevant versus sometimes-relevant — is what separates a context file that helps from one that just burns tokens.
