How to Automate Your Obsidian Second Brain with Codeex: Hourly Processing, No Manual Triggers
Set Codeex to run hourly and it will process new clips, update your wiki, and push a GitHub backup — all without touching a button.
Your Obsidian Second Brain Can Process Itself Every Hour
Most knowledge management systems require you to do the work twice. You save something, then you have to go back and actually process it. That second step is where every second brain dies. You end up with a /raw folder full of YouTube transcripts and articles you clipped three weeks ago and never touched again.
The fix takes about 20 minutes to set up. Once it’s running, Codeex Automations handles hourly processing of your /raw folder, wiki updates, and a GitHub private repo commit — zero manual intervention after the initial configuration. You clip something, walk away, and an hour later it’s been summarized, cross-linked, and backed up.
Here’s exactly how to wire that up.
What This Actually Gets You
The outcome isn’t just “automated notes.” It’s a knowledge base that compounds.
Every time you clip a YouTube video or article, the Obsidian Web Clipper drops a markdown file into /raw. The hourly Codeex automation picks it up, extracts concepts, people, tools, and themes, creates or updates wiki pages in /wiki, cross-links related content, moves the source file to /raw/processed, updates index.md and log.md, then commits the whole vault to a private GitHub repo.
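The hourly loop described above can be sketched in a few lines. This is a simplified illustration, not Codeex's actual implementation — the real concept extraction and wiki updates are done by the model, so that step is a placeholder comment here. Folder names (`raw`, `raw/processed`) match the article.

```python
# Sketch of the hourly pass over /raw: anything not yet in raw/processed
# gets handled, then moved. The model's work (wiki pages, index.md,
# log.md) is represented by the comment inside the loop.
from pathlib import Path
import shutil
import tempfile

def process_raw(vault: Path) -> list[str]:
    """Move each unprocessed clip to raw/processed and return the names handled."""
    raw = vault / "raw"
    done = raw / "processed"
    done.mkdir(parents=True, exist_ok=True)
    handled = []
    for clip in sorted(raw.glob("*.md")):
        # Real system: extract concepts, create/update /wiki pages,
        # update index.md, append to log.md, then commit to GitHub.
        shutil.move(str(clip), done / clip.name)
        handled.append(clip.name)
    return handled

vault = Path(tempfile.mkdtemp())
(vault / "raw").mkdir()
(vault / "raw" / "clip-one.md").write_text("# A saved article\n")
print(process_raw(vault))  # ['clip-one.md']
```

If `/raw` has no loose markdown files, the loop body never runs — which is exactly the "does nothing and exits" behavior the automation has on an empty hour.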
You don’t trigger any of this. It just happens.
After a few weeks, your Obsidian graph view goes from a handful of isolated nodes to a dense web of interconnected concepts. You can ask the system “what do I know about motivation when I don’t feel like doing hard things?” and it queries the wiki, finds the relevant pages, and answers from your own saved material — not from generic LLM knowledge. The difference in response quality is noticeable.
The journal layer makes this even more useful. When you prefix a chat with journal, the system grounds its response in your wiki, your past journal entries, and your CRM. It’s not ChatGPT answering a question. It’s a system that knows what you’ve been thinking about for months.
What You Need Before Starting
Tools (all free to start):
- Obsidian — free markdown vault, available at obsidian.md
- Obsidian Web Clipper — Chrome extension, linked from the Obsidian website footer; handles YouTube transcript extraction automatically
- Codeex — free tier available; you’ll get more throughput on a paid plan, especially for hourly automations running GPT-5.5 on high reasoning
- A GitHub account — for the private repo backup
Knowledge assumed:
- You know how to create a folder on your computer
- You’ve used a Chrome extension before
- You don’t need to know how to code — the only “code” you’ll write is plain English prompts
One judgment call upfront: The automation runs best on GPT-5.5 with high reasoning enabled. Weaker models will process files but produce shallower wiki pages — fewer cross-links, less useful concept extraction. The stronger model is worth the cost here because this runs in the background and you’re not waiting on it.
Setting It Up: Five Steps
Step 1: Create the Obsidian vault and connect Codeex
Create a new Obsidian vault. Call it “Second Brain” and save it somewhere you’ll remember — you need the exact folder path in the next step.
Open Codeex. Click “Add new project” → “Use an existing folder.” Navigate to the folder you just created. You now have a Codeex project pointing directly at your Obsidian vault. Every file Codeex creates or modifies will appear instantly in Obsidian.
Now you have: a Codeex project and an Obsidian vault pointing at the same folder.
Step 2: Build the vault architecture
This is the step most people overthink. The architecture is already figured out — Andrej Karpathy published his LLM wiki design on GitHub, and you can use it directly. Open a new chat in your Codeex second brain project and run this prompt:
Build out the wiki architecture based on Karpathy’s LLM wiki here [GitHub URL]. The current second brain folder is the folder that Obsidian is connected to. It is currently empty.
Codeex will scaffold the folder structure. You want exactly these: /raw (immutable source material), /wiki (AI-generated pages), agents.md (the prompt file governing all agent behavior), index.md (catalog of everything in the wiki), and log.md (change history). If it creates extra files, tell it to prune back to the minimal Karpathy game plan.
Open Obsidian. You should see those folders and files. If you do, the architecture is correct.
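For reference, the layout the prompt should produce looks like this. A minimal sketch — folder and file names come straight from the article, and the files start empty because Codeex fills them in:

```python
# Minimal Karpathy-style vault layout: two folders plus three
# root-level markdown control files.
from pathlib import Path
import tempfile

def scaffold_vault(root: Path) -> list[str]:
    """Create the expected structure and return the top-level names."""
    (root / "raw" / "processed").mkdir(parents=True, exist_ok=True)
    (root / "wiki").mkdir(exist_ok=True)
    for name in ("agents.md", "index.md", "log.md"):
        (root / name).touch()
    return sorted(p.name for p in root.iterdir())

root = Path(tempfile.mkdtemp())
print(scaffold_vault(root))  # ['agents.md', 'index.md', 'log.md', 'raw', 'wiki']
```

Anything beyond these five top-level items is a candidate for pruning.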
The agents.md file is worth understanding before you move on. It’s a plain-text markdown file that contains all the instructions governing how the agent processes files, responds to queries, handles journal entries, and manages the CRM. You can edit it directly in Obsidian — no code changes, no config files, no redeployment. If you want the agent to behave differently, you open agents.md and change the text. This is the entire control surface for the system. For anyone building more complex agent orchestration, platforms like MindStudio handle this kind of orchestration at scale — 200+ models, 1,000+ integrations, and a visual builder for chaining agents — but for a personal knowledge system, a single markdown file is genuinely sufficient.
Now you have: a properly structured vault with agents.md as your control file.
Step 3: Configure the Web Clipper and seed the vault
Install the Obsidian Web Clipper Chrome extension. Open its settings and make sure:
- The vault name matches exactly what you named your Obsidian vault (check the bottom-left corner of Obsidian for the exact name)
- The default template is set to your second brain vault
- Note location is set to /raw
The Web Clipper will pull full article content and YouTube transcripts automatically. Navigate to any YouTube video, click the Web Clipper icon, and it will load the entire transcript as a markdown file and drop it into /raw.
Add five to ten items before setting up the automation. This gives the first automated run something meaningful to process. Go through your YouTube watch history, clip a few videos you found valuable, clip a couple of articles. Each clip lands in /raw as a markdown file with source title, URL, and date in the front matter.
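A clipped file looks roughly like the example below. The exact front-matter fields depend on your Web Clipper template; the three shown (title, source URL, date) are the ones the article relies on, and the parser is a deliberately naive sketch:

```python
# Hypothetical shape of a clip in /raw, plus a minimal front-matter reader.
clip = """---
title: Some talk about motivation
source: https://www.youtube.com/watch?v=example
date: 2024-05-01
---
Transcript text goes here...
"""

def read_front_matter(text: str) -> dict[str, str]:
    """Naive parser: 'key: value' lines between the two --- fences."""
    _, block, _ = text.split("---\n", 2)
    return dict(line.split(": ", 1) for line in block.strip().splitlines())

meta = read_front_matter(clip)
print(meta["title"])  # Some talk about motivation
```

A real vault would use a YAML library for this, but the point is only that the agent has structured metadata to work from on every clip.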
Now you have: a /raw folder with unprocessed source files waiting to be ingested.
Step 4: Set up the hourly Codeex automation
This is the core of the whole system.
In Codeex, find the Automations feature. Create a new automation with these settings:
- Title: Process second brain raw files
- Work tree: Local (runs directly in the selected project)
- Project: Second Brain
- Schedule: Hourly
- Model: GPT-5.5, high reasoning
- Prompt: If there are any unprocessed files inside the raw directory, please process them. Once everything is processed, commit and push the current version of the directory to the main branch on GitHub.
That last sentence — the GitHub commit — requires one additional setup step. Go to GitHub, create a new private repository called “second brain.” Copy the repo URL. In a Codeex chat, run: Commit this current version to my private GitHub repo: [URL]. This connects the repo. After that, the automation handles commits automatically.
The automation now runs every hour. It checks /raw for files that haven’t been moved to /raw/processed. For each unprocessed file, it reads the source, creates or updates wiki pages in /wiki, updates index.md, appends to log.md, moves the source to /raw/processed, and pushes everything to GitHub. If /raw is empty, it does nothing and exits.
Now you have: a fully automated processing pipeline that runs without any manual trigger.
Step 5: Add the journal and CRM layers
These are optional but they’re what make the system useful for more than just information retrieval.
In Codeex, open a new chat in your second brain project and run this prompt to update agents.md:
Update the agents MD file to handle these items. If I start a chat with “journal,” add the text of that chat and subsequent conversation as a new MD file within the journal folder. Create an index file in the journal folder. Each new journal entry gets added to the index file. Your response to my journal entry should be grounded in content from the wiki, past journal entries, and the CRM.
For the CRM:
If I tell you I’m giving you information for the CRM, either update the person in the CRM or add the person to the CRM. CRM files should always be a person’s name.
After running these, check agents.md in Obsidian. You’ll see new sections for journal rules and CRM rules. The journal folder and CRM folder will have index files. Test the journal by starting a chat with journal followed by whatever you’re thinking about. Test the CRM by typing “add to CRM” followed by a person’s name and context.
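The "CRM files should always be a person's name" rule implies a filename convention. The lowercase-hyphenated form below is an assumption on my part, not something the article specifies — the agent may keep the name's original casing:

```python
# Hypothetical CRM filename rule: one markdown file per person,
# derived from the name. Lowercasing and hyphenation are assumptions.
def crm_filename(name: str) -> str:
    return name.strip().lower().replace(" ", "-") + ".md"

print(crm_filename("Ada Lovelace"))  # ada-lovelace.md
```

Whatever convention the agent lands on, consistency is what matters: the same person mentioned twice must resolve to the same file, or the CRM fragments.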
The journal response will explicitly tell you which wiki pages it’s drawing from. That’s the system working correctly — it’s not generating a generic response, it’s querying your vault first.
Now you have: a three-pillar system (wiki, journal, CRM) running on hourly automation.
Where This Breaks (and How to Fix It)
The automation processes files but the wiki pages are shallow. This is almost always a model issue. GPT-5.5 on high reasoning produces significantly better concept extraction than weaker models. If you’re on the free tier and using a lighter model, the wiki will still build but with fewer cross-links and less useful synthesis. Upgrade the model in the automation settings.
YouTube clips aren’t getting channel names. The Web Clipper doesn’t automatically pull channel metadata into the front matter. Fix this by editing agents.md directly in Obsidian: find the YouTube processing instruction and add a line specifying that the channel name should be added to the original source page front matter (not the generated wiki page). The next automation run will apply this to new files.
Files are piling up in /raw instead of moving to /raw/processed. Check agents.md — the move instruction needs to be explicit. Open the file in Obsidian, find the ingest operation steps, and add: Move the source file from the root raw directory to raw/processed. Save the file. The next run will handle it correctly.
The GitHub push fails. Usually a permissions issue with the GitHub plugin in Codeex. Make sure the plugin is connected and authenticated. If you haven’t gone through the GitHub plugin setup flow, do that first, then re-run the commit command manually once to verify it works before relying on the automation.
The journal response ignores the wiki. This means the journal trigger isn’t being recognized. Make sure you’re starting the chat message with journal on its own line or as the first word. The agents.md file uses this prefix as the routing signal. If the prefix isn’t there, the system treats it as a regular wiki query rather than a journal entry.
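The routing rule agents.md encodes is that simple — the first word of the message is the only signal. A sketch of the logic, assuming the prefix check is case-insensitive:

```python
# Hypothetical version of the routing rule: "journal" as the first word
# switches the agent into grounded-journal mode; anything else is a
# regular wiki query.
def route(message: str) -> str:
    words = message.strip().split(maxsplit=1)
    if words and words[0].lower() == "journal":
        return "journal"    # ground response in wiki + past entries + CRM
    return "wiki-query"     # regular retrieval

print(route("journal I keep putting off hard tasks"))  # journal
print(route("what do I know about motivation?"))       # wiki-query
```

This is why burying the word mid-sentence fails: the prefix never matches, so the message falls through to the default path.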
Codeex creates 50 files when you asked for 5. This happened during the initial architecture build in the source walkthrough. If it over-generates, just tell it: Please remove all the extra files and build only what’s explicitly called for in Karpathy’s game plan. It will prune back. The architecture should be minimal — a handful of folders and three root-level markdown files.
Where to Take This Further
The system as described is functional and useful. Here’s where the interesting extensions are.
Meeting notes ingestion. Granola can auto-inject meeting notes directly into the knowledge base. If you’re on calls regularly, this means your second brain is getting updated from meetings without any manual clipping. The meeting notes land in /raw and get processed on the next hourly run like everything else.
Tighter entity extraction. Right now the wiki builds pages for concepts, tools, people, and themes. You can dial this in further by editing agents.md to create separate folders for companies, for specific topic domains, for recurring themes across journal entries. The more specific the extraction instructions, the more useful the graph view becomes over time.
Richer CRM connections. The current CRM setup stores contact records and links them to journal entries when relevant. You can extend this by telling agents.md to cross-reference CRM entries against wiki pages — so if you’ve saved content about a topic and you’ve had conversations with someone about that same topic, those get linked. If you wanted to take this further and build a proper CRM application on top of this data, Remy is worth looking at: you write your application as an annotated markdown spec and it compiles into a complete TypeScript backend with database, auth, and deployment — the spec is the source of truth, the code is derived output.
Longer retention, better patterns. The journal layer gets more useful the longer you run it. After a few months, the system has enough past entries to surface genuine patterns — recurring struggles, topics you keep returning to, ideas that keep showing up in different contexts. This is the compounding effect that makes the system worth maintaining.
The graph view in Obsidian is a good proxy for system health. A sparse graph means you haven’t seeded enough content or the cross-linking instructions in agents.md aren’t specific enough. A dense, interconnected graph means the system is working. If you want to see what a mature vault looks like, the source walkthrough shows one — it’s a dense web that took a few weeks of regular clipping to build.
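If you want a number rather than an eyeball check, links-per-page is a crude proxy for graph density. This sketch assumes your wiki pages use standard `[[wikilink]]` syntax, which Obsidian's graph view is built on:

```python
# Crude vault-health metric: average [[wikilink]] count per page in /wiki.
# A rising average over weeks suggests the cross-linking instructions in
# agents.md are working.
import re
from pathlib import Path
import tempfile

def link_density(wiki: Path) -> float:
    pages = list(wiki.glob("*.md"))
    if not pages:
        return 0.0
    links = sum(len(re.findall(r"\[\[[^\]]+\]\]", p.read_text())) for p in pages)
    return links / len(pages)

wiki = Path(tempfile.mkdtemp())
(wiki / "motivation.md").write_text("See [[discipline]] and [[habits]].")
(wiki / "habits.md").write_text("Related: [[motivation]].")
print(link_density(wiki))  # 1.5
```

A density stuck near zero after a few weeks of clipping is the quantitative version of the sparse graph described above.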
The underlying insight from Karpathy’s original design is that the wiki shouldn’t be a static dump — it should grow and restructure itself as you add more material and ask more questions. The hourly automation is what makes that possible without requiring you to be the one doing the restructuring. You build an AI second brain that learns from every session and then you mostly just use it.
One thing worth understanding about this architecture: the entire agent behavior lives in a single plain-text file. If you want the system to do something different, you open agents.md in Obsidian and change the instructions. No deployment, no config, no code. This is the right abstraction for a personal system — the self-evolving memory system built with Obsidian hooks takes a similar approach, using hooks to capture session logs and extract lessons automatically. Both systems share the same core insight: the knowledge structure should update itself, not wait for you.
For the model selection question — GPT-5.5 is the recommendation here, but if you’re evaluating alternatives, the GPT-5.4 vs Claude Opus 4.6 comparison covers the tradeoffs on agentic tasks specifically, which is the relevant dimension for background processing work like this. And if you want to understand the Karpathy architecture more deeply before building on top of it, the deep dive on Karpathy’s LLM wiki and how to build a personal knowledge base with Claude Code covers the original design in detail.
The system is simpler than it looks. Two tools, one markdown control file, one hourly job. The complexity lives in the wiki that builds itself over time — not in the setup.