How to Auto-Process YouTube Transcripts into a Searchable Wiki with Obsidian and Codex (Hourly)
Obsidian Web Clipper pulls YouTube transcripts automatically. Codex processes them into wiki pages every hour. Here's the full setup in 20 minutes.
Your YouTube Watch History Is a Knowledge Base You’re Not Using
Most people watch a video, think “that was useful,” and never see it again. The transcript disappears. The ideas dissolve. You’re left with a vague sense that you learned something about motivation or filmmaking or compound interest, but you couldn’t cite it if someone asked.
You can fix this in about 20 minutes. The setup: Obsidian Web Clipper auto-pulls YouTube transcripts directly into a local vault, and Codex automations run hourly to process those raw clips into structured wiki pages — summaries, extracted concepts, cross-linked notes. You clip a video, walk away, and an hour later it’s a searchable wiki entry with backlinks to related content you’ve already saved.
This post walks through exactly how to build that pipeline, from a blank Obsidian vault to an hourly automation that runs without you.
What you actually end up with
Before the setup, it helps to be concrete about the output.
When you clip a YouTube video, the Obsidian Web Clipper saves the full transcript as a markdown file in your /raw folder. That file has front matter: source title, source URL, date clipped, and (after you configure it) the YouTube channel name. The transcript is the raw material — unprocessed, immutable.
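As a rough illustration, a clipped file might start like the sketch below. The title, URL, and date are placeholders, and the exact property names depend on how you configure the clipper template in Step 3:

```markdown
---
title: "How to Build Habits That Stick"
source: "https://www.youtube.com/watch?v=XXXXXXXXXXX"
created: 2024-05-14
tags: [web-clip]
channel: ""
---

Full transcript follows as the note body...
```

The empty channel property is intentional here; the agent fills it in during processing (Step 4).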
Every hour, a Codex automation checks /raw for any files that haven’t been processed yet. When it finds one, it reads the transcript, creates wiki pages for the concepts and ideas it contains, updates the vault index, appends an entry to the log, and moves the source file to /raw/processed so it doesn’t get ingested twice.
The wiki pages aren’t summaries of individual videos. They’re topic pages — “temporal discounting,” “identity-led goals,” “environment design” — that pull from everything you’ve saved. A concept page links back to every source that mentioned it. The sources link forward to every concept they generated. Over time, the graph view in Obsidian shows you a web of interconnected ideas rather than a pile of transcripts.
When you ask the wiki a question, it doesn’t just answer — it creates a new wiki page for the synthesized answer and logs the query. The wiki grows from your questions, not just from what you clip.
That’s the system. Here’s how to build it.
What you need before you start
Obsidian — free, available at obsidian.md. This is your vault: a local folder of markdown files with a nice interface on top. You don’t need to pay for sync or any plugins.
Obsidian Web Clipper — a Chrome extension. On the Obsidian website, scroll to the bottom and click the Web Clipper link, then add it to Chrome. This is what pulls YouTube transcripts automatically.
Codex — OpenAI’s coding agent. Free to download; you’ll get more out of it on a paid plan. This is the agent that processes your raw files and runs the hourly automation.
A GitHub account — optional, but useful for backup. The automation can commit and push after each processing run.
You don’t need to know how to code. You do need to be comfortable following instructions and editing a plain text file.
Building the pipeline, step by step
Step 1: Create the Obsidian vault
Open Obsidian and create a new vault. Call it “Second Brain” (or whatever you want — just remember the name and location). Save it somewhere you’ll find it, like ~/Documents/second-brain.
Delete the welcome note. You want a clean, empty vault.
Now you have: an empty Obsidian vault at a known path on your computer.
Step 2: Build the folder structure with Codex
Open Codex and create a new project. When it asks for a folder, point it at the vault folder you just created.
Now give Codex this prompt, linking to Andrej Karpathy’s LLM Wiki GitHub page as the architectural reference:
Build out the wiki architecture based on Karpathy’s LLM wiki [URL]. The current second brain folder is the folder that Obsidian is connected to. It is currently empty. Build from scratch.
Codex will generate files. It may generate too many — in the original build, it created 51 files when only a handful were needed. If that happens, follow up with: “Please remove all the extra files and just build what’s explicitly called for in Karpathy’s plan.”
The correct minimal structure is:
- /raw — where clipped content lands
- /raw/processed — where processed files get moved
- /wiki — AI-generated wiki pages
- agents.md — the plain-text file governing all agent behavior
- index.md — catalog of everything in the wiki
- log.md — running log of all ingestion and queries
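Laid out as a directory tree, the finished vault looks roughly like this:

```text
second-brain/
├── raw/                 # clipped transcripts land here
│   └── processed/       # files move here after ingestion
├── wiki/                # AI-generated topic pages
├── agents.md            # plain-text rules governing agent behavior
├── index.md             # catalog of sources and concepts
└── log.md               # running log of ingestion runs and queries
```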
The agents.md file is the most important piece. It’s a plain-text prompt file. Every instruction you want the agent to follow lives here. You’ll edit it throughout this setup.
Now you have: a vault with the correct folder structure and an agents.md file you can edit.
Step 3: Configure the Obsidian Web Clipper
Click the Web Clipper icon in Chrome and open Settings.
Under the vault list, add the exact name of your vault. Check the bottom-left corner of Obsidian if you’re not sure what it’s called — the name there must match exactly.
Under the default template, select your vault. Set the properties to pull in: source title, source URL, date created, and a “web clip” tag.
Under note location, set it to raw. This tells the clipper to drop every saved file directly into your /raw folder.
Now test it. Go to any YouTube video, click the Web Clipper icon, and watch it load the full transcript. Click “Add to Obsidian.” Check your vault — the file should appear in /raw with the transcript as the note content.
Now you have: a working clip pipeline. Any YouTube video or article you clip lands in /raw as a markdown file.
Step 4: Add the channel name instruction to agents.md
When you clip a YouTube video, the Web Clipper doesn’t automatically pull the channel name into the front matter. You need to tell the agent to do this during processing.
Open agents.md in Obsidian (or in any text editor). Find the section that describes what happens during ingestion. Add an instruction like:
For YouTube videos clipped with Obsidian Web Clipper, open or inspect the YouTube source URL and add the channel name to the original source page front matter.
While you’re in agents.md, add two more instructions:
- After processing a file, move it from /raw to /raw/processed.
- Cross-link any wiki pages generated or updated back to the original source page.
The first keeps your /raw folder clean. The second prevents orphaned wiki pages — every concept page should link back to where it came from.
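Pulled together, the ingestion section of agents.md might read something like the sketch below. The heading and exact wording are suggestions, not a required format; adjust them to your own phrasing:

```markdown
## Ingestion rules

- For YouTube videos clipped with Obsidian Web Clipper, open or inspect the
  source URL and add the channel name to the original source page's front
  matter (not to the generated wiki pages).
- After processing a file, move it from /raw to /raw/processed.
- Cross-link any wiki pages generated or updated back to the original source
  page, and link the source page forward to those wiki pages.
- Update index.md and append a dated entry to log.md after every run.
```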
Now you have: an agents.md that governs ingestion, channel name extraction, file movement, and cross-linking.
Step 5: Run a manual test
Clip two or three YouTube videos using the Web Clipper. They’ll land in /raw.
In Codex, open a new chat in your second brain project and say: “Process the files inside the raw folder.”
This will take a few minutes. When it finishes, open Obsidian. You should see:
- The original source files moved to /raw/processed
- New pages in /wiki for concepts extracted from the videos
- An updated index.md listing sources and concepts
- New entries in log.md
Check one of the wiki pages. It should have a link back to the original source. Check one of the source files — it should have the YouTube channel name in the front matter.
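For reference, a healthy concept page looks something like this. The topic comes from the examples earlier in this post; the source titles and dates are invented for illustration:

```markdown
# Environment design

Short synthesis of the concept, drawing on every saved source that mentions it.

## Sources
- [[How to Build Habits That Stick]] (clipped 2024-05-14)
- [[Atomic Habits interview]] (clipped 2024-05-16)

## Related concepts
- [[Identity-led goals]]
- [[Temporal discounting]]
```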
If either of those is missing, go back to agents.md, clarify the instruction, and re-run. The agents.md file is just a prompt. Editing it is how you fix behavior.
Now you have: a working wiki with your first few entries, properly cross-linked and organized.
Step 6: Set up the hourly automation
This is the part that makes the system actually useful. Instead of manually telling Codex to process files, you set up an automation that runs every hour.
In Codex, click the Automations tab. Create a new automation with these settings:
- Title: Process Second Brain Raw Files
- Work tree: Local (runs directly in your project folder)
- Project: Second Brain
- Schedule: Hourly
- Prompt: “If there are any unprocessed files inside the raw directory, please process them following the instructions in agents.md.”
- Model: GPT-4.5 on High reasoning (use the strongest model available to you)
Save it. The automation will now run every hour, check /raw for unprocessed files, and ingest anything it finds.
If you want to explore how similar spec-driven approaches work for full-stack app generation, Remy is worth knowing about — it’s MindStudio’s app compiler that takes a markdown spec with annotations and compiles it into a complete TypeScript app, including backend, database, auth, and deployment. The same principle applies here: a plain-text spec file (your agents.md) drives all agent behavior, and editing that file is how you change what the system does.
Now you have: a fully automated ingestion pipeline. Clip a video, walk away, and it’s in your wiki within the hour.
Step 7: Add GitHub backup (optional but recommended)
Create a new private repository on GitHub. Copy the repository URL.
In Codex, open a new chat in your second brain project and say: “Commit this current version to my private GitHub repo [paste URL].”
Then edit your automation prompt to add: “Once everything is processed, commit and push the current version of the directory to the main branch on GitHub.”
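The full automation prompt, with the backup step appended, ends up as something like:

```text
If there are any unprocessed files inside the raw directory, please process
them following the instructions in agents.md. Once everything is processed,
commit and push the current version of the directory to the main branch on
GitHub.
```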
Now every hourly run ends with a backup push. Your vault is stored locally in Obsidian, processed by Codex, and backed up to a private GitHub repo — three layers of persistence.
Now you have: an automated, backed-up second brain that grows every hour.
When things go wrong
The Web Clipper loads but the transcript is empty. Some videos have transcripts disabled or auto-captions that the clipper can’t access. This is a YouTube limitation, not a setup problem. Try a different video to confirm the clipper is working, then accept that some videos won’t clip cleanly.
Codex generates too many files during the initial build. This happened in the original setup — 51 files instead of the minimal structure. Follow up with an explicit prompt to prune back to only what Karpathy’s spec calls for. Being specific in the follow-up (“remove everything except /raw, /wiki, agents.md, index.md, log.md”) works better than asking it to “clean up.”
The channel name isn’t appearing in front matter. The agent may have misunderstood the instruction and added the channel name to the generated wiki page instead of the original source file. Go to agents.md, find the channel name instruction, and rewrite it to be explicit: “add the channel name to the original source page front matter, not to the generated wiki pages.” Then reprocess.
Files aren’t moving to /raw/processed. Check that the agents.md instruction says /raw/processed (with the subfolder), not just /processed. The path needs to be relative to the vault root.
The hourly automation runs but does nothing. If /raw is empty (because everything was already processed), the automation correctly does nothing. Drop a new file in /raw and wait for the next run to confirm it’s working.
Wiki pages aren’t cross-linking. Cross-linking is governed by the instruction in agents.md. If it’s not happening, the instruction may be missing or ambiguous. Add it explicitly: “Cross-link any wiki pages generated or updated to the original source page by adding a link in both directions.”
One thing worth knowing: the model choice matters here. The original setup recommends GPT-4.5 on High reasoning for processing. Cheaper or faster models will produce shallower wiki pages with less accurate cross-linking. The hourly automation is running unattended — this is not the place to optimize for cost.
Where to take this further
The wiki is the foundation. Once it’s running, two natural extensions are worth building.
Add a journal layer. Update agents.md with a rule: if you start a chat with the word “journal,” treat the input as a journal entry, save it as a dated markdown file in /journal, and respond based on what’s in the wiki plus past journal entries. The demo in the original build showed this working well — a journal entry about YouTube title anxiety produced a response that cited specific saved videos by date, grounded in the wiki rather than generic LLM output.
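As an agents.md rule, that might look like the sketch below; the wording and the dated filename pattern are assumptions, not a required syntax:

```markdown
## Journal rule

If a chat message starts with the word "journal":
- Treat the rest of the message as a journal entry.
- Save it as a dated markdown file in /journal (e.g. /journal/2024-05-14.md).
- Respond using the wiki pages and past journal entries, citing the relevant
  source pages by name and date.
```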
Add a CRM layer. A similar rule in agents.md: if you say “add to CRM” followed by a person’s name and details, create or update a named markdown file in /crm. The index for /crm lists contacts alphabetically with a short bio. Later, you can ask “where did I meet [name]?” and get an answer pulled from the CRM record.
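And the matching CRM rule, sketched the same way:

```markdown
## CRM rule

If a chat message starts with "add to CRM" followed by a name and details:
- Create or update a markdown file named after the person in /crm.
- Keep the /crm index sorted alphabetically, with a one-line bio per contact.
- When asked about a contact (e.g. "where did I meet [name]?"), answer from
  that contact's CRM record.
```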
Both of these follow the same pattern as the wiki: plain markdown files, governed by instructions in agents.md, queryable through Codex chat. The architecture is just markdown files, which means any agent — Claude Code, Codex, or others — can work from the same vault directory interchangeably. If you’re curious about building a self-evolving memory system with Claude Code hooks, the underlying vault structure is compatible with what you’ve built here.
The agents.md file is the real lever. Every behavior in this system — ingestion rules, file movement, cross-linking, journal responses, CRM updates — is just a prompt in that file. If something isn’t working the way you want, you edit the prompt. If you want new behavior, you add a new rule. It’s a plain text file that you can open in any editor.
For teams or more complex workflows, platforms like MindStudio handle this kind of orchestration at a different scale — 200+ models, 1,000+ integrations, and a visual builder for chaining agents and workflows — which is useful when the markdown-plus-agent approach starts to feel limiting. But for a personal knowledge base, the simplicity of the markdown approach is a feature, not a limitation.
If you want to understand the architectural foundation more deeply, the Karpathy LLM wiki setup with Claude Code post covers the original spec in detail. And if you’re thinking about what else you can automate once the pipeline is running, Claude Code skills for content repurposing shows a similar pattern applied to social media output.
The graph view in Obsidian is a good way to check your progress. When you first build the vault, it’s a handful of isolated nodes. After a few weeks of clipping and processing, it looks like a web — concepts connected to sources, sources connected to concepts, journal entries linked to wiki pages. That visual is a reasonable proxy for how much of what you’ve watched has actually been retained.
One opinion: the hourly automation is the right cadence. Daily feels too slow — you clip something, forget about it, and it’s not in the wiki when you need it. Real-time feels unnecessary and expensive. Hourly means that by the time you’d actually want to query something you just watched, it’s already been processed.
The system is simpler than it sounds. It’s a Chrome extension, a folder structure, a text file of instructions, and a scheduled job. The complexity is in the output, not the setup. And the output compounds — every video you clip makes the wiki more useful for the next question you ask it.