
Build a 3-Pillar AI Second Brain in Obsidian: Wiki, CRM, and Journal That Talk to Each Other

Wiki, CRM, and journal — three Obsidian folders wired together so your AI grounds every answer in your actual saved knowledge.

MindStudio Team

Three Folders Changed How I Think About Personal Knowledge Management

Most second brain systems are glorified filing cabinets. You clip articles, dump transcripts, maybe tag things obsessively, and then never look at any of it again. The information sits there, inert, waiting for you to remember it exists. That’s not a second brain. That’s a graveyard with good metadata.

The setup I’m going to walk you through is different in one specific way: the three pillars — wiki, CRM, and journal — are wired together so that your AI actually uses what you’ve saved when you talk to it. The vault folder structure is /raw, /raw/processed, /wiki, /journal, /CRM, plus three control files: agents.md, index.md, and log.md. Seven components. That’s the whole thing. And the way they interact is what makes it worth building.
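The whole layout can be scaffolded in one shell command. A minimal sketch, assuming a vault folder named SecondBrain (the vault name is a placeholder; use whatever your Obsidian vault is called):

```shell
# Scaffold the folder structure and the three control files
# ("SecondBrain" is a placeholder vault name)
mkdir -p SecondBrain/raw/processed SecondBrain/wiki SecondBrain/journal SecondBrain/CRM
touch SecondBrain/agents.md SecondBrain/index.md SecondBrain/log.md
```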

This is based on Andrej Karpathy’s LLM Wiki architecture, extended with a journaling layer and a CRM. Credit where it’s due — the wiki bones came from Karpathy. Everything else is built on top of that foundation.


Pillar One: The Wiki (Where Raw Stuff Becomes Structured Knowledge)


The /raw folder is where everything lands first. You install the Obsidian Web Clipper Chrome extension, point it at your vault, and set the note location to /raw. From that point forward, any article or YouTube video you want to save gets clipped directly into that folder as a markdown file with front matter: source title, source URL, date saved, and a web-clip tag.
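A clipped note's front matter ends up looking roughly like this. The exact keys depend on your clipper template, and the title and URL below are illustrative, not from the article:

```markdown
---
title: "How to Build Discipline Without Willpower"
source: "https://www.youtube.com/watch?v=XXXXXXXXXXX"
date: 2025-01-15
tags:
  - web-clip
---

(article text or full YouTube transcript follows)
```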

The YouTube transcript behavior is worth calling out specifically. When you’re on a YouTube video and you click the web clipper, it pulls the entire transcript automatically. Not a summary — the full transcript, ready to be processed. That’s a meaningful detail because it means you’re not losing information at the ingestion step.

Nothing happens to those files automatically at first. They sit in /raw, immutable. The rule is that source material in /raw doesn’t get touched — it’s the original record. When the AI processes a file, it moves it to /raw/processed so you always know what’s been ingested and what hasn’t.

The processing step is where the wiki actually gets built. When Codex runs against the /raw folder, it reads each source file and generates structured markdown pages in /wiki. These aren’t just summaries. The AI extracts entities — people, companies, tools, ideas, themes — and creates individual wiki pages for each. A video about discipline without willpower becomes multiple pages: one for the concept of temptation bundling, one for identity-led goals, one for environment design. Each page cross-links back to the original source. Nothing is orphaned.
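The processing loop can be sketched in Python. This is a sketch of the mechanics, not the agent's actual code: `extract_entities` is a stand-in for the LLM call that pulls out people, tools, and concepts, and the folder names follow the layout above:

```python
from pathlib import Path
import shutil

def extract_entities(text: str) -> list[str]:
    """Stand-in for the LLM step that extracts people, companies, tools, and ideas.
    A real run would ask the model; here we just grab capitalized words."""
    return sorted({w.strip(".,") for w in text.split() if w.istitle()})

def process_raw(vault: Path) -> None:
    raw = vault / "raw"
    processed = raw / "processed"
    wiki = vault / "wiki"
    wiki.mkdir(parents=True, exist_ok=True)
    processed.mkdir(parents=True, exist_ok=True)
    # Only top-level files count as unprocessed; /raw/processed is not scanned.
    for source in raw.glob("*.md"):
        text = source.read_text()
        for entity in extract_entities(text):
            # Each wiki page cross-links back to the original source file.
            (wiki / f"{entity}.md").write_text(
                f"# {entity}\n\nSource: [[{source.stem}]]\n"
            )
        # Record the operation in log.md, then move the source out of /raw
        # so the folder only ever holds unprocessed material.
        with (vault / "log.md").open("a") as log:
            log.write(f"processed {source.name}\n")
        shutil.move(str(source), processed / source.name)
```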

The index.md file at the vault root is the catalog. Every time a new wiki page gets created, the index gets updated. The log.md file records every operation — what was processed, what queries were answered, what CRM records were updated. If something breaks or behaves unexpectedly, the log is where you look first.

This is the part that makes the wiki more than a dump: as you ask questions, the wiki grows. If you query the system about motivation and it finds relevant content, it creates a new wiki page called something like “Motivation for Hard Tasks” and links it back to the source videos. The wiki is not static. It expands in response to how you use it.

For anyone building knowledge management systems at scale, the Karpathy LLM Wiki approach cuts token use by up to 95% compared to naive RAG on small knowledge bases — which matters when you’re running hourly automations against an ever-growing vault.


Pillar Two: The CRM (Contact Records That Connect to Everything Else)

The CRM folder is simple in structure and surprisingly useful in practice. Each file is named after a person. That’s the rule baked into agents.md: “CRM files should always be a person’s name.” No clever naming schemes, no IDs. Just Matthew Berman.md.

Adding someone to the CRM is a natural language operation. You open a new chat in Codex inside the second brain project and say something like: “Add to CRM — Matthew Berman, met at Qualcomm event in 2024, hung out at CES in 2025, got lunch during TechCrunch Disrupt in 2025.” The agent creates the file, populates the front matter with whatever details you provided, and updates the CRM index with the person’s name and a short bio.
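The generated Matthew Berman.md might look something like this. The exact front-matter keys are an assumption — the agent decides the format, which is part of the point:

```markdown
---
name: Matthew Berman
met: Qualcomm event, 2024
history:
  - CES 2025, hung out
  - TechCrunch Disrupt 2025, got lunch
---

## Notes
Met at a Qualcomm event in 2024.
```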


The CRM index lists everyone alphabetically with a one-line summary. That’s your lookup layer. When you want to remember where you met someone before a conference, you ask the system directly: “Where did I meet Matthew Berman?” It checks the CRM records and tells you. No searching, no scrolling.

Where this gets interesting is the connection to the other pillars. When you journal, the system is instructed to check the CRM for relevant contacts. If you’re writing about a struggle with a business decision and you’ve had conversations with people in your CRM about that topic, the journal response can surface those connections. The CRM isn’t isolated — it’s part of the same knowledge graph.

The prompt that governs CRM behavior lives in agents.md and reads: “If I tell you I’m giving you information for the CRM, either update the person in the CRM or add the person to the CRM.” That’s it. One sentence of instruction produces a fully functional contact management system because the AI handles the rest — creating files, updating indexes, logging changes.

If you’re thinking about how this scales into something more production-grade — say, a team CRM where multiple people are adding contacts and the system needs to handle conflicts, deduplication, and structured queries — that’s where the architecture starts to strain. Platforms like MindStudio handle that kind of orchestration: 200+ models, 1,000+ integrations, and a visual builder for chaining agents and workflows without writing the glue code yourself.


Pillar Three: The Journal (Where the Other Two Pillars Pay Off)

The journal is the interface layer. It’s where you actually interact with everything you’ve built.

The trigger is a prefix. Start any chat in Codex with the word “journal” and the system switches modes. Instead of treating your input as a wiki query, it treats it as a journal entry. The full conversation gets saved as a new markdown file in /journal, named with the date and a short AI-generated title based on the content. The journal index gets updated. The log gets an entry.

But the response behavior is what makes this worth the setup. When you journal, the agent is instructed to check three places before answering: the wiki, past journal entries, and the CRM. It’s not responding from a blank slate. It’s responding from your accumulated knowledge.

Here’s a concrete example from the build. The journal entry was about the tension between clickbait titles that perform well and literal titles that feel honest. The system’s response wasn’t generic YouTube advice. It pulled from creator strategy pages already in the wiki — specifically pages about “YouTube valley of death” and “creator persistence” — and framed the response around those concepts. It identified two distinct fears in the journal entry (creative integrity and channel safety) and addressed them separately, grounded in what the wiki actually contained.

That’s the difference between a chatbot and a system that knows you. The response is shaped by what you’ve found valuable enough to save over time.

As the journal grows, the system also starts finding patterns. If you’re journaling about the same struggle repeatedly, that pattern becomes visible in the index and can inform future responses. The journal isn’t just a record — it’s a feedback loop into the wiki.


For anyone interested in how self-evolving memory systems work at a deeper level, the Claude Code hooks approach to building self-evolving memory with Obsidian covers similar territory from a different technical angle.


The Control Layer: agents.md Is the Whole System

Every behavior described above — how files get processed, how the CRM gets updated, how journal entries get routed — is governed by a single plain-text markdown file called agents.md.

This is the part that surprises people. There’s no code to modify. No config files in obscure formats. No redeployment. You open agents.md in Obsidian, edit the instructions in plain English, save the file, and the next time the agent runs, it behaves differently.

Want files moved to /raw/processed after ingestion? Add a line to the ingest operation in agents.md. Want the YouTube channel name added to the source file’s front matter instead of the generated wiki page? Edit the instruction. Want cross-links between wiki pages and their source files? Add a step. The entire system is a prompt file that you maintain like any other note.
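An ingest section in agents.md might read like this. The exact wording is illustrative, since the whole point is that you phrase the rules however you like:

```markdown
## Ingest

- Process every markdown file sitting directly in /raw.
- For each source, extract entities (people, companies, tools, concepts) and
  create or update a page in /wiki for each one, cross-linked to the source.
- Update index.md with any new wiki pages.
- Append every operation to log.md.
- Move the source file to /raw/processed when done. Never edit source files.
```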

This is also where you extend the system. The initial build handles wiki, CRM, and journal. But you could add a /workouts folder with its own rules. A /recipes folder. A /research folder with different processing logic for academic papers. Each new pillar is just a new section in agents.md with instructions for how to handle that content type.

The architecture is worth understanding in terms of what it actually is: a set of instructions that an LLM follows when it has access to your file system. The model recommendation for the automation is GPT-5.5 on high reasoning — you want the strongest model available for the processing step because it’s making judgment calls about how to structure and cross-link content. Skimping on the model here produces a worse wiki.

For teams thinking about building similar agent systems with more complex routing logic, the WAT framework — Workflows, Agents, and Tools — provides a useful mental model for how to structure these kinds of multi-behavior systems cleanly.


The Automation Layer: Set It and Forget It

The manual version of this system — clip something, then tell Codex to process it — works fine for testing. It’s not how you want to run it day-to-day.

Codex has an Automations feature. You create a new automation, set the project to your second brain folder, set the cadence to hourly, and give it a single instruction: “If there are any unprocessed files inside the raw directory, please process them.” That’s the whole automation. Every hour, it checks /raw, processes anything new, updates the wiki, and moves processed files to /raw/processed.
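Because processing moves files out of /raw, the hourly check is trivial: anything still sitting directly in /raw is unprocessed. A minimal sketch, with a defensive set difference against /raw/processed in case a move ever failed partway:

```python
from pathlib import Path

def unprocessed_files(vault: Path) -> list[Path]:
    """Files in /raw that have no counterpart in /raw/processed yet."""
    raw = vault / "raw"
    done = {p.name for p in (raw / "processed").glob("*.md")}
    return sorted(p for p in raw.glob("*.md") if p.name not in done)
```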

The second instruction you add to the automation: after processing, commit and push the current version of the directory to the main branch on GitHub. You create a private GitHub repo, connect it, and now every hourly run produces a backup. Your entire second brain — wiki, journal, CRM, all of it — is version-controlled and backed up automatically.


This is the point where the system becomes genuinely low-maintenance. You clip things when you find them. The wiki grows on its own. You journal when you want to. The CRM gets updated when you meet people. The only active work is the clipping and the journaling — the processing, organizing, cross-linking, and backing up all happen without you.

The graph view in Obsidian is a useful way to watch this compound over time. Early on, it’s sparse — a handful of nodes with a few connections. After a few weeks of consistent clipping, it starts to look like an actual knowledge graph, with clusters of related concepts and dense cross-linking between sources. That visual is a reasonable proxy for how much useful context the system has accumulated.

For anyone building AI agents for personal productivity more broadly, the six AI agents for personal productivity overview covers complementary approaches worth knowing about.


What This Is Actually Good For

The honest assessment: this system is most valuable if you consume a lot of content and have a consistent journaling practice. If you clip ten things a week and journal three times a week, the wiki compounds meaningfully within a month. If you clip things sporadically and never journal, you’ve built a slightly fancier filing cabinet.

The CRM pillar is genuinely useful for anyone who goes to events, takes meetings, or maintains a network where context matters. The ability to ask “what did I know about this person before I met them?” or “what have I discussed with people about this topic?” is something most CRM tools don’t give you in a conversational interface.

The journal pillar is the one that’s hardest to replicate with off-the-shelf tools. The grounding behavior — responses shaped by your specific saved knowledge rather than generic LLM training — produces noticeably different output. It’s not always better, but it’s more relevant to your actual situation.

One thing worth flagging: the system described here uses Codex as both the IDE and the automation runner. If you prefer Claude Code or another agentic coding environment, the same architecture works — the agents.md file and the folder structure are model-agnostic. The prompts are plain English. Swap the runner and the behavior follows.

Tools like Remy take a related but distinct approach to the “spec as source of truth” idea: you write an annotated markdown spec and it compiles into a complete TypeScript backend, database, auth, and frontend. The principle is similar — structured prose drives the system — but Remy’s output is a deployed application rather than a knowledge base.

The folder structure — /raw, /raw/processed, /wiki, /journal, /CRM, agents.md, index.md, log.md — is small enough to hold in your head and extensible enough to grow with you. That’s a rare combination in knowledge management tooling, most of which is either too rigid or too chaotic to sustain over time.

Build it once. Clip consistently. Journal honestly. The system does the rest.

Presented by MindStudio
