How to Build a Portable AI Second Brain That Works Across Claude, Codex, and Hermes
Build your AI second brain as markdown files and folders so any agent harness can read it. Learn the routing rules, folder structure, and memory patterns.
Why Your AI Context Shouldn’t Be Trapped in One Tool
Every time you switch between Claude, OpenAI’s Codex, and a local model like Hermes, you start from zero. You re-explain your project. You re-paste your preferences. You remind the agent what you already decided last week. That’s not a second brain — that’s a leaky notebook.
The concept of a portable AI second brain flips this. Instead of storing context inside a specific tool or a proprietary memory system, you keep everything as plain markdown files in a structured folder. Any agent harness — Claude’s Projects, OpenAI Codex, a local Ollama instance running Hermes — can read those files directly. Your context travels with you.
This guide covers exactly how to build that system: the folder layout, the memory patterns, the routing rules that tell each agent where to look, and how to keep it from turning into a mess.
What Makes a Second Brain “Portable”
The word “portable” does real work here. Most AI memory systems fail portability in one of two ways:
- Vendor lock-in: Your memory lives inside a proprietary system, such as ChatGPT’s memory, Notion AI’s context, or a RAG database tied to one platform. When you switch tools, the memory doesn’t come with you.
- Format lock-in: Your notes live in a format only one tool can parse, such as a custom JSON schema, a database, or a binary blob. Plain text, by contrast, is universally readable.
A portable second brain uses plain markdown files as the storage layer. Markdown is human-readable, version-controllable with Git, and ingestible by virtually every AI system. You’re not building for one agent — you’re building an interface that any agent can consume.
The three agents this guide focuses on all handle context injection differently: Claude (via Anthropic’s API or Projects), Codex (OpenAI’s code-oriented model), and Hermes (Nous Research’s instruction-tuned model, typically run locally via Ollama or LM Studio). The routing rules below account for those differences.
The Core Folder Structure
Keep the structure shallow and explicit. Deep nesting creates routing confusion. Here’s a working layout:
/second-brain/
├── _index.md
├── /memory/
│   ├── episodic.md
│   ├── semantic.md
│   └── procedural.md
├── /projects/
│   ├── project-a.md
│   └── project-b.md
├── /context/
│   ├── preferences.md
│   ├── constraints.md
│   └── glossary.md
├── /routing/
│   ├── claude-rules.md
│   ├── codex-rules.md
│   └── hermes-rules.md
└── /log/
    └── YYYY-MM-DD.md
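One way to bootstrap this layout is a short script. The sketch below is a minimal example, assuming Python 3.9+; `scaffold` and `LAYOUT` are hypothetical names, the file names mirror the tree above, and the one-line placeholder heading written into each file is an assumption rather than part of the system.

```python
from pathlib import Path

# Folder and file names mirror the layout above.
# The empty key holds files that sit directly in the root folder.
LAYOUT = {
    "": ["_index.md"],
    "memory": ["episodic.md", "semantic.md", "procedural.md"],
    "projects": ["project-a.md", "project-b.md"],
    "context": ["preferences.md", "constraints.md", "glossary.md"],
    "routing": ["claude-rules.md", "codex-rules.md", "hermes-rules.md"],
    "log": [],  # daily files are added per session, so nothing is created here
}


def scaffold(root: Path) -> list[Path]:
    """Create missing folders and files; existing files are never overwritten."""
    created = []
    for folder, files in LAYOUT.items():
        directory = root / folder
        directory.mkdir(parents=True, exist_ok=True)
        for name in files:
            path = directory / name
            if not path.exists():
                # Placeholder heading only (an assumption); fill in real content later.
                path.write_text(f"# {path.stem}\n", encoding="utf-8")
                created.append(path)
    return created
```

Calling `scaffold(Path("second-brain"))` a second time creates nothing new, so it should be safe to re-run after adding an entry to `LAYOUT`.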
_index.md — The Entry Point
This is the file every agent reads first. It’s a short document (under 300 words) that explains the structure and points to relevant files based on task type. Think of it as a table of contents plus a routing guide.
Example content:
# Second Brain Index
## What this is
This is a persistent context system for AI agents. Read this file first, then follow the routing rules for your task type.
## Quick routing
- For coding tasks → read /memory/procedural.md + /routing/codex-rules.md
- For writing/analysis → read /memory/semantic.md + /context/preferences.md
- For project-specific work → read the relevant file in /projects/
- For any task → read /context/constraints.md
## Active projects
- Project A: [project-a.md]
- Project B: [project-b.md]
## Last updated
2025-07-01
Why a Flat-ish Structure Works Better
Deeply nested folders require agents to infer paths. A flat structure with explicit links reduces guesswork. When an agent reads _index.md, it knows exactly which file to pull next based on the task at hand — no inference required.
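If a wrapper script does the file loading instead of the agent, the same explicit links can be read mechanically. The following is a rough sketch, assuming the "Quick routing" lines keep the exact `- For <task> → read <paths>` shape shown in the example index; `parse_quick_routing` is a hypothetical helper name.

```python
import re

# Matches path-like tokens such as /memory/procedural.md or /projects/
PATH_PATTERN = re.compile(r"/[\w./-]+")


def parse_quick_routing(index_text: str) -> dict[str, list[str]]:
    """Map each task label in the quick-routing list to the paths it names."""
    routes = {}
    for line in index_text.splitlines():
        line = line.strip()
        # Only lines shaped like "- For <task> → read <paths>" are routing lines.
        if not line.startswith("- For ") or "→" not in line:
            continue
        label, _, target = line[len("- For "):].partition("→")
        routes[label.strip()] = PATH_PATTERN.findall(target)
    return routes
```

Lines that do not follow the pattern, such as the "Active projects" entries, are skipped rather than guessed at.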
The Three Memory Types (and How to Use Each)
This is the most important design decision in your second brain. Different information has different update frequencies and different retrieval patterns. Mixing them into one big notes file is what makes memory systems break down over time.
Episodic Memory — What Happened
episodic.md holds time-stamped records of decisions, conversations, and outcomes. This is the equivalent of a lab notebook.
Structure each entry like this:
## 2025-06-28
**Context**: Discussing API rate limiting strategy for Project A
**Decision**: Use exponential backoff with max 3 retries
**Reasoning**: Simpler than a queue system; acceptable for current traffic
**Status**: Implemented
Episodic memory is append-only. Never edit past entries — only add new ones. This keeps the record honest and gives agents a clear timeline.
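The append-only rule is easy to enforce if entries are only ever written through a small helper. This is a minimal sketch, assuming the four-field entry format shown above; `append_episodic_entry` is a hypothetical name, not part of any tool.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def append_episodic_entry(path: Path, context: str, decision: str,
                          reasoning: str, status: str,
                          day: Optional[date] = None) -> str:
    """Append one dated entry; earlier content in the file is never rewritten."""
    day = day or date.today()
    entry = (
        f"## {day.isoformat()}\n"
        f"**Context**: {context}\n"
        f"**Decision**: {decision}\n"
        f"**Reasoning**: {reasoning}\n"
        f"**Status**: {status}\n"
    )
    # Mode "a" can only add to the end of the file, which keeps the log append-only.
    with path.open("a", encoding="utf-8") as fp:
        fp.write("\n" + entry)
    return entry
```

If a past decision is reversed, the intent is to add a new entry that says so rather than edit the old one.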
Semantic Memory — What You Know
semantic.md is your knowledge base: concepts, frameworks, mental models, domain-specific facts that apply broadly. Update this when your understanding of something changes.
## Rate limiting
Exponential backoff is preferred for third-party API calls. Token bucket is better for internal services with predictable load. Never use fixed-delay retries in production.
## My writing style
Prefer short paragraphs. Active voice. No em-dashes for sentence connectors. Use bullet points for 3+ items.
This file tends to be the most valuable for creative and writing tasks. It’s also the file that ages best — a decision from six months ago might be outdated, but a mental model you’ve refined is still valid.
Procedural Memory — How You Work
procedural.md holds your repeatable processes: code patterns, deployment checklists, standard operating procedures, templates you use frequently.
## Code review checklist
1. Does this function do one thing?
2. Are errors handled explicitly?
3. Is the return type predictable?
4. Are there edge cases not covered by tests?
## Deployment process for Project A
1. Run test suite locally
2. Push to staging branch
3. Confirm CI passes
4. Tag release
5. Merge to main
Codex, in particular, benefits from procedural memory. When you ask it to write code, injecting your patterns and conventions up front gives it concrete rules to follow, so the output is more likely to match your codebase than generic defaults.
Routing Rules for Each Agent
This is where the system actually becomes multi-agent capable. Each routing file tells a specific agent how to interact with the second brain — what to read, what to update, and what to ignore.
Claude Routing Rules
Claude handles long context well and can synthesize information across multiple files. Your claude-rules.md should tell it to read broadly before responding.
# Claude Routing Rules
## On task start
1. Read _index.md
2. Identify task type
3. Load relevant memory files (episodic + semantic for analysis; procedural for structured tasks)
4. Load project file if task is project-specific
5. Load constraints.md always
## On task end
If a significant decision was made:
- Append to episodic.md with today's date
- Update semantic.md if a mental model changed
- Flag any updates needed in procedural.md (do not edit directly)
## Context window management
If context is getting large, prioritize: constraints.md > active project file > semantic.md > episodic.md (recent only)
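The priority order in these rules can also be enforced by whatever script assembles the prompt. The sketch below is one possible approach, assuming a crude four-characters-per-token estimate rather than a real tokenizer; `estimate_tokens` and `pack_by_priority` are hypothetical names.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic (an assumption): about four characters per token."""
    return max(1, len(text) // 4)


def pack_by_priority(ordered_files: list[tuple[str, str]], budget_tokens: int) -> list[str]:
    """Keep files in priority order until the budget is spent.

    ordered_files is a list of (name, text) pairs, highest priority first.
    A file that does not fit is skipped, and smaller lower-priority
    files may still be included after it.
    """
    kept, used = [], 0
    for name, text in ordered_files:
        cost = estimate_tokens(text)
        if used + cost <= budget_tokens:
            kept.append(name)
            used += cost
    return kept
```

Passing the files in the order constraints.md, active project file, semantic.md, recent episodic entries reproduces the rule above. Skipping an oversized file instead of stopping is a design choice; stopping at the first miss would be stricter about priority.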
Codex Routing Rules
Codex is optimized for code generation and works best with precise, structured context. Keep its routing instructions lean and code-focused.
# Codex Routing Rules
## On task start
1. Read constraints.md
2. Read procedural.md (full)
3. Read active project file — code section only
4. Do NOT load episodic.md unless asked
## Code generation defaults
- Follow patterns in procedural.md
- Match naming conventions in glossary.md
- Flag any deviation from established patterns with a comment
## On task end
- If a new pattern is established, output it formatted for procedural.md
- Do not write to memory files directly
The “do not write directly” rule matters. Codex is good at code but can produce inconsistent prose. Have it output proposed memory updates in a structured format, then review and paste them yourself — or automate that review step.
Hermes Routing Rules
Hermes (running locally via Ollama or LM Studio) typically has a smaller context window than Claude and may run on hardware with memory constraints. Its routing rules should be more conservative about file loading.
# Hermes Routing Rules
## On task start
1. Read _index.md
2. Load ONLY the single most relevant memory file based on task type
3. Load constraints.md (always, it's short)
4. Do not attempt to load multiple large files simultaneously
## Context priority order
constraints.md > procedural.md > semantic.md > episodic.md (last 30 days only)
## Memory window
If total context exceeds 4,000 tokens, truncate episodic.md to most recent 5 entries only.
## On task end
Output proposed updates as a structured list. Do not update files directly.
Local models are getting more capable, but being explicit about their limits prevents silent failures where the model simply ignores files it can’t process.
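A small model may not apply the truncation rule reliably on its own, so it can help to trim episodic.md before the text ever reaches the model. This is a minimal sketch, assuming entries start with a `## YYYY-MM-DD` heading and appear oldest first (as the append-only rule implies); the token estimate is a rough heuristic and all function names are hypothetical.

```python
import re

# An entry starts at a heading like "## 2025-06-28" on its own line.
ENTRY_HEADING = re.compile(r"^## \d{4}-\d{2}-\d{2}[ \t]*$", re.MULTILINE)


def estimate_tokens(text: str) -> int:
    """Rough heuristic (an assumption): about four characters per token."""
    return max(1, len(text) // 4)


def latest_entries(episodic_text: str, keep: int = 5) -> str:
    """Return only the last `keep` entries of an oldest-first episodic file."""
    starts = [m.start() for m in ENTRY_HEADING.finditer(episodic_text)]
    if len(starts) <= keep:
        return episodic_text
    return episodic_text[starts[-keep]:]


def trim_for_small_window(other_context: str, episodic_text: str,
                          limit_tokens: int = 4000, keep: int = 5) -> str:
    """Truncate episodic memory only when the combined context exceeds the limit."""
    total = estimate_tokens(other_context) + estimate_tokens(episodic_text)
    if total > limit_tokens:
        return latest_entries(episodic_text, keep)
    return episodic_text
```

Doing this in the wrapper means the 4,000-token rule holds even when the model ignores its routing file.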
Building the Daily Log Pattern
The /log/ folder is optional but genuinely useful. Each day gets a new file named YYYY-MM-DD.md. At the end of a session, you (or an automated script) summarize what happened, what changed, and what’s pending.
A simple template:
## Session Log — 2025-07-01
### What I worked on
- Refactored authentication module in Project A
- Reviewed rate limiting approach
### Decisions made
- Switched to token bucket for internal auth service
- Deferred database migration to next sprint
### Memory updates needed
- Update procedural.md: add token bucket pattern
- Update episodic.md: record auth refactor decision
### Open questions
- Should error logs go to Datadog or a custom endpoint?
This daily log becomes the input for keeping your other memory files fresh. Without a log, memory files drift — you stop updating them because you forget what changed.
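Because the template uses fixed section headings, the pending updates can be pulled out of a log file mechanically. The following is a rough sketch, assuming the "### Memory updates needed" heading is kept exactly as in the template; `pending_memory_updates` is a hypothetical helper name.

```python
def pending_memory_updates(log_text: str) -> list[str]:
    """Return the bullet items listed under '### Memory updates needed'."""
    items, in_section = [], False
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("###"):
            # Entering or leaving the section depends on the heading text.
            in_section = stripped.lower() == "### memory updates needed"
        elif in_section and stripped.startswith("- "):
            items.append(stripped[2:].strip())
    return items
```

The returned list can serve as a checklist for the review step, so nothing in the log is silently dropped.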
How to Inject Context Into Each Agent
Having the files is only half the problem. You also need a reliable way to get the right files into each agent’s context at the start of a session.
Claude Projects
Claude’s Projects feature lets you upload files that persist across conversations. Upload your /context/ folder and the relevant /memory/ files as project knowledge. Put the contents of _index.md and claude-rules.md in the project’s custom instructions, or paste them as the first user message.
For project-specific work, add the relevant project file to the session manually. You don’t want all project files loaded at once — only the active one.
OpenAI Codex (via API or CLI)
Codex is typically invoked programmatically. Use a simple shell script or Python wrapper that prepends the routing files to every request:
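from pathlib import Path
from typing import Optional

BRAIN_ROOT = Path("second-brain")

def build_codex_context(task: str, project: Optional[str] = None) -> str:
    """Prepend the Codex routing files (and one project file) to the task."""
    # Paths follow the folder layout: constraints live in context/, procedures in memory/.
    context_files = ["context/constraints.md", "memory/procedural.md", "routing/codex-rules.md"]
    if project:
        context_files.append(f"projects/{project}.md")
    context = ""
    for name in context_files:
        context += (BRAIN_ROOT / name).read_text(encoding="utf-8") + "\n\n---\n\n"
    return context + task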
This gives you consistent context injection without manual copy-pasting.
Local Models (Hermes via Ollama)
With Ollama, you can set a system prompt inside an interactive session with the /set system command, or build a Modelfile whose SYSTEM instruction injects your routing instructions as a persistent system prompt. For dynamic context, use the API directly:
curl http://localhost:11434/api/generate -d '{
"model": "hermes3",
"system": "<contents of constraints.md + hermes-rules.md>",
"prompt": "<contents of procedural.md>\n\nTask: <your task>"
}'
The key is automation — if injecting context requires manual steps every time, you’ll stop doing it.
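The same request can be assembled from the files on disk, so nothing has to be pasted by hand. The sketch below uses only the standard library and assumes Ollama is listening on its default local port with a model named `hermes3` already pulled; `build_payload` and `generate` are hypothetical helper names, and setting `"stream"` to false asks for a single JSON reply instead of streamed chunks.

```python
import json
import urllib.request
from pathlib import Path

OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(root: Path, task: str, model: str = "hermes3") -> dict:
    """Read the routing files from the second brain and build the request body."""
    system = "\n\n".join(
        (root / rel).read_text(encoding="utf-8")
        for rel in ("context/constraints.md", "routing/hermes-rules.md")
    )
    procedural = (root / "memory/procedural.md").read_text(encoding="utf-8")
    return {
        "model": model,
        "system": system,
        "prompt": f"{procedural}\n\nTask: {task}",
        "stream": False,  # one JSON object back instead of a stream of chunks
    }


def generate(payload: dict, url: str = OLLAMA_URL) -> str:
    """Send the payload to a running Ollama server and return the generated text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

A call such as `generate(build_payload(Path("second-brain"), "Refactor the auth module"))` would then replace the curl command, provided the server is running.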
Keeping Memory Files From Going Stale
Memory systems fail in a predictable way: they start strong, then get neglected, then become unreliable, then get abandoned. Preventing this requires a light maintenance routine.
Weekly: Review and update semantic.md. Ask yourself: has my understanding of anything changed this week? Did a mental model get refined or disproved?
Per session: Append to episodic.md if a significant decision was made. This takes 2 minutes and keeps the timeline honest.
Monthly: Review procedural.md for patterns that are no longer accurate. Technology changes; your procedures should too.
Quarterly: Archive old episodic entries (move them to an /archive/ folder) to keep the active file lean. Agents don’t need to read decisions from two years ago unless specifically asked.
The daily log pattern helps here — it creates a lightweight forcing function for reviewing what needs updating.
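The quarterly archive step is a good candidate for a script, since it is mechanical and easy to forget. This is a minimal sketch, assuming every entry begins with a valid `## YYYY-MM-DD` heading; `split_for_archive` is a hypothetical name, and writing the two results to /archive/ and back to episodic.md is left to the caller.

```python
import re
from datetime import date

ENTRY_HEADING = re.compile(r"^## (\d{4}-\d{2}-\d{2})[ \t]*$", re.MULTILINE)


def split_for_archive(episodic_text: str, cutoff: date) -> tuple[str, str]:
    """Return (archived, active): entries dated before the cutoff, and the rest.

    Any text before the first entry (a title, for example) stays in the active file.
    """
    matches = list(ENTRY_HEADING.finditer(episodic_text))
    preamble = episodic_text[: matches[0].start()] if matches else episodic_text
    archived, active = [], [preamble]
    for i, match in enumerate(matches):
        # An entry runs from its heading to the next heading (or end of file).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(episodic_text)
        entry = episodic_text[match.start():end]
        is_old = date.fromisoformat(match.group(1)) < cutoff
        (archived if is_old else active).append(entry)
    return "".join(archived), "".join(active)
```

Entries are moved whole and unedited, so the archive keeps the same honest timeline as the active file.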
Where MindStudio Fits This System
If you want to automate the maintenance layer — the logging, the memory updates, the context injection — MindStudio is a natural fit. It’s a no-code platform for building AI workflows, and the specific capability that matters here is its ability to connect AI models with file systems, APIs, and other tools without writing infrastructure code.
Here’s a concrete example: you could build a MindStudio workflow that runs at the end of each day, reads your /log/ file, sends it to a model like Claude to extract decisions and pattern changes, and writes the structured output back to your episodic.md and semantic.md files automatically. That’s the “memory maintenance” problem solved without a cron job or custom script.
MindStudio supports 200+ models out of the box — Claude, GPT-4o, Gemini, and others — so you’re not locked into one provider for this automation either. You can route maintenance tasks to the cheapest model that handles them well (often a smaller model is sufficient for summarization) and reserve more capable models for actual reasoning work.
The MindStudio workflow builder handles rate limiting, retries, and auth, so the workflow you build on day one keeps running reliably. You can try it free at mindstudio.ai.
For developers who want to go further, MindStudio’s Agent Skills Plugin lets you call MindStudio capabilities directly from Claude Code or any custom agent — including methods like agent.runWorkflow() that could trigger your memory update pipeline on demand.
Common Mistakes to Avoid
One giant notes file: The temptation is to keep everything in one place. Resist it. A single file gets unwieldy, context windows fill up with irrelevant content, and agents can’t selectively load what they need.
No routing rules: Without explicit routing files, agents make their own decisions about what to read. Those decisions are inconsistent. Routing files make behavior predictable.
Writing to memory files mid-session: Letting agents update memory files during a live session creates race conditions and inconsistencies. Collect proposed updates, review them, then apply them deliberately.
Over-elaborate structure: A second brain with 15 subfolders and 40 files becomes a project in itself. Start with the minimal structure above. Add folders only when the absence of one is causing real problems.
Ignoring context window limits: Different models have different limits. What works with Claude’s 200K context fails silently with a 4K local model. The routing rules need to account for this explicitly.
Frequently Asked Questions
What’s the difference between this and just using a notes app?
Notes apps are optimized for human retrieval. A portable AI second brain is optimized for agent retrieval. That means consistent structure, explicit routing rules, and file formats that models can parse reliably. You can still read these files yourself — markdown is human-readable — but the design decisions prioritize machine consumption.
How do I handle sensitive information in these files?
Keep sensitive data out of the second brain entirely, or store it in a separate encrypted vault and reference it by name rather than value. If your constraints.md says “API keys are in 1Password under [Project A],” an agent knows where to look without the key ever appearing in a plaintext file.
Can I use this system with local models that have small context windows?
Yes. That’s exactly what the Hermes routing rules address. The key is being explicit about which files to load and in what order. For models with small windows, load only constraints.md plus the single most relevant memory file, and truncate episodic memory to recent entries. The routing file handles this logic so you don’t have to think about it each time.
How often should I update the memory files?
Episodic memory: after any significant decision or session. Semantic memory: weekly, or whenever your understanding changes. Procedural memory: whenever a process changes. The daily log is the easiest way to make this a habit — write the log first, then use it to decide what needs updating.
What if I want to use this with a RAG system later?
The folder structure works well as a RAG source. Your markdown files become the document corpus. The routing rules become metadata filters — you can tag documents by type (episodic, semantic, procedural) and use those tags to filter retrieval at query time. Starting with flat files doesn’t close off the RAG path later.
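As an illustration of the tagging idea, the folder a file lives in can supply its metadata tag directly. The sketch below is not tied to any particular RAG library; `memory_type`, `build_corpus`, and `filter_by_type` are hypothetical names, and the dictionary shape is an assumption about what a retrieval layer might want.

```python
from pathlib import Path


def memory_type(relative_path: Path) -> str:
    """Derive a tag from a path relative to the second-brain root."""
    if relative_path.parent.name == "memory":
        return relative_path.stem  # episodic, semantic, or procedural
    # Other files are tagged by folder; files at the top level get "root".
    return relative_path.parent.name or "root"


def build_corpus(root: Path) -> list[dict]:
    """Collect every markdown file with its relative path, tag, and text."""
    docs = []
    for path in sorted(root.rglob("*.md")):
        rel = path.relative_to(root)
        docs.append({
            "path": rel.as_posix(),
            "type": memory_type(rel),
            "text": path.read_text(encoding="utf-8"),
        })
    return docs


def filter_by_type(docs: list[dict], wanted: set[str]) -> list[dict]:
    """Keep only documents whose tag is in the wanted set."""
    return [doc for doc in docs if doc["type"] in wanted]
```

A query about coding conventions could then restrict retrieval to `{"procedural"}` before any similarity search runs.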
Does this work with Claude’s Projects feature specifically?
Yes. Upload the /context/ and /memory/ folders as project knowledge. Put your claude-rules.md content into the custom instructions field. For project-specific work, add the relevant project file to the conversation manually. Claude’s 200K context window means you can load more files than with other models, but the selective loading rules still improve response quality by reducing noise.
Key Takeaways
- A portable AI second brain stores context as plain markdown files that any agent can read — no vendor lock-in, no proprietary format.
- Three memory types cover different needs: episodic (what happened), semantic (what you know), procedural (how you work).
- Routing rule files tell each agent — Claude, Codex, Hermes — exactly which files to read and how to handle updates.
- Context injection needs to be automated; manual copy-pasting breaks down quickly.
- Maintenance is the hardest part. A daily log creates the habit loop that keeps memory files accurate.
The system described here takes a few hours to set up from scratch. Once it’s running, switching between agents stops feeling like starting over — your context is always there, in a format every tool understands.
If you want to automate the maintenance layer without writing infrastructure code, MindStudio is worth exploring. Building a memory-update workflow takes under an hour on the platform, and it runs reliably in the background while you focus on actual work.
