Build a Multi-Agent OS on Claude Code: 6 Components of a Hive Mind That Runs a Business Autonomously
Shared SQLite memory, mission control kanban, Meta Ads CLI, Telegram interface, agent suggestion system — here's the full architecture.
A solo builder just published a detailed walkthrough of a multi-agent operating system he uses daily to run his business — and the architecture is more grounded than the name suggests. There are 6 core components. The whole thing runs on a local SQLite database with zero cloud database costs. And you can replicate most of it with a Claude Code subscription you probably already have.
The centerpiece is what he calls the hive mind: a shared SQLite database storing all agent tasks, memories, conversations, and scheduled jobs locally, with a 3D graph visualization layered on top. Every agent in the system reads from and writes to this same database. That single design decision — one local database, all agents — is what makes the rest of the architecture possible.
Here’s what each of the 6 components actually does, how it’s built, and what you’d need to replicate it.
The Architecture Is Simpler Than It Looks
Mark Kashef, who runs the @Mark_Kashef channel on YouTube, published the walkthrough in a video titled “This Claude Code Agentic OS Runs My Business.” The framing he uses is deliberately deflating: “This is purely a data organization exercise with some layers on top.”
That’s not false modesty. The system is sophisticated, but the sophistication is in the data layer, not the AI layer. If your file hygiene is bad — scattered folders, inconsistent naming, no clear separation between global and project-level skills — no amount of agent orchestration fixes it.
The agents themselves are structurally minimal. Each one is just two files: a claude.md file containing its instructions and personality, and a YAML configuration file defining its settings. Global skills are available to every agent automatically. Project-level skills are scoped to specific agents. That’s the whole agent definition.
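Loading such a two-file agent is trivial, which is the point. A minimal Python sketch, assuming a directory layout of claude.md plus config.yaml per agent (the flat key-value parsing here is a stand-in for a real YAML library, and the file names follow the article's description, not published code):

```python
from pathlib import Path


def load_agent(agent_dir: str) -> dict:
    """Load a two-file agent definition:

    <agent_dir>/claude.md   - instructions and personality
    <agent_dir>/config.yaml - settings (name, model, skills, ...)
    """
    d = Path(agent_dir)
    instructions = (d / "claude.md").read_text()

    # Flat "key: value" parsing for illustration only;
    # a real build would use PyYAML or similar.
    config = {}
    for line in (d / "config.yaml").read_text().splitlines():
        if ":" in line and not line.lstrip().startswith("#"):
            key, _, value = line.partition(":")
            config[key.strip()] = value.strip()

    return {"instructions": instructions, "config": config}
```

Everything else about an agent — what it knows, what it has done — lives in the shared database, not in the agent definition.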
Component 1: The Hive Mind Database
Every agent in the system logs its completed tasks to a shared SQLite database. This is the foundation everything else builds on. The 3D graph visualization, the 2D Obsidian-style graph view, the list view, the standup reports — all of them are just different ways of reading the same underlying table.
Kashef is explicit about the build order: “If the list view is actually working and the list view is operational and all the agents are logging in real time and all of the database is set up, everything else is just additive.” The 3D graph is the cherry. The database is the cake.
The database stores agents, conversations, scheduled tasks, and memories. No Supabase. No Neon. No cloud database bill. SQLite runs locally, which means the entire system has zero recurring infrastructure costs beyond the Claude Code subscription.
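To make the "one local database, all agents" idea concrete, here is a hypothetical schema sketch in Python's built-in sqlite3 module. The table and column names are assumptions — the video doesn't publish the actual schema — but they cover the four things the article says the database stores:

```python
import sqlite3

# Illustrative schema for the shared "hive mind" database.
# Names and columns are assumptions, not the published schema.
SCHEMA = """
CREATE TABLE IF NOT EXISTS agents (
    id   INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE IF NOT EXISTS tasks (
    id           INTEGER PRIMARY KEY,
    agent_id     INTEGER REFERENCES agents(id),
    title        TEXT NOT NULL,
    status       TEXT DEFAULT 'todo',   -- todo / in_progress / done
    completed_at TEXT                   -- ISO timestamp, set on completion
);
CREATE TABLE IF NOT EXISTS memories (
    id         INTEGER PRIMARY KEY,
    agent_id   INTEGER REFERENCES agents(id),
    content    TEXT NOT NULL,
    importance REAL DEFAULT 0.5,
    salience   REAL DEFAULT 0.5,
    created_at TEXT DEFAULT (datetime('now'))
);
CREATE TABLE IF NOT EXISTS scheduled_jobs (
    id          INTEGER PRIMARY KEY,
    agent_id    INTEGER REFERENCES agents(id),
    cron_expr   TEXT NOT NULL,          -- raw cron, e.g. '30 7 * * 1-5'
    description TEXT                    -- plain-English rendering for the UI
);
"""


def open_hive(path: str = "hive.db") -> sqlite3.Connection:
    """Open (or create) the local hive mind database."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

The list view, graph views, and standup reports would all just be different SELECTs against these same tables.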
For builders who’ve been intimidated by multi-agent memory architectures, this is the most important reframe in the video: the “AI operating system” is a data engineering problem. The AI is the interface. The database is the system.
If you want to understand how memory layers work before building your own, the Claude Code source leak memory architecture post covers the three-layer memory system Anthropic uses internally — it’s a useful reference point for designing your own schema.
Component 2: Mission Control (The Kanban Board)
The mission control dashboard is a kanban board where you can see all agents, drag tasks between them, and monitor parallel execution in real time. Standard stuff. The interesting part is auto-assign.
When you create a new task and don’t want to decide which agent should handle it, you click auto-assign. Under the hood, this calls Gemini 3 Flash — “the cheapest model from Gemini, which is inconsequential from a cost perspective” — with a dynamic system prompt that describes all current agents and asks it to classify the task to the best-fit agent.
The system prompt is generated dynamically from the live agent list. So as you add new agents, the classifier automatically knows about them. Kashef’s example: a task to “send Mark a message saying hi” gets routed to the comms agent. Obvious, but the point is the routing happens without human judgment.
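The dynamic prompt generation is worth sketching, because it's the entire mechanism. A minimal version, assuming each agent record carries a name and description (field names are assumptions; the actual Gemini call is stubbed out):

```python
def build_router_prompt(agents: list[dict]) -> str:
    """Build the auto-assign system prompt from the live agent list.

    Regenerated on every call, so newly added agents are picked up
    automatically with no changes to the routing code.
    """
    roster = "\n".join(
        f"- {a['name']}: {a['description']}" for a in agents
    )
    return (
        "You are a task router. Given a task description, reply with "
        "exactly one agent name from this roster:\n" + roster
    )
```

The prompt would then be sent to a cheap classifier model (Gemini Flash, in Kashef's setup) along with the task text, and the single agent name that comes back is used as the assignment.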
The hardest engineering challenge in mission control, per Kashef, is maintaining “a perfect symbiosis” between the frontend and backend — making sure a task created in the UI propagates correctly through the Telegram API and triggers the right agent. The UI is the easy part. The plumbing is where things break.
Component 3: The Memory System
The memory architecture has five to six layers, organized around three properties: importance, salience, and recency. Salience is his word for “how significant is this memory” — distinct from recency, which is just “how recent.”
The memory tab is searchable. Type “Gmail” and you get every memory related to Gmail across all agents. You can also run an insights pass: a cheap language model scans all stored memories and surfaces patterns you might not have noticed yourself. The equivalent, he says, of running /insights in Claude Code to get a 30-day behavioral breakdown.
The design question he recommends answering before building: how do you want to handle fading memories? Do important memories stay pinned forever? Do low-salience memories decay to nothing, or get archived somewhere? These aren’t AI questions — they’re product decisions you make once and encode into the system.
The self-evolving Claude Code memory system with Obsidian and hooks post covers a complementary approach: using Claude Code hooks to automatically capture session logs and build a wiki that grows over time. Worth reading alongside this architecture if you’re designing your own memory layer.
Component 4: The Schedule Tab
The schedule tab is a cron job UI. You pick an agent, set a time, and the system creates a scheduled job that runs locally (or on a VPS if you’re hosting remotely).
The UX trick here is small but worth stealing. Raw cron syntax — 30 7 * * 1-5 — is unreadable to most people. Kashef’s instruction to Claude Code: “Instead of showing me this raw input, show me the equivalent raw input as English.” The frontend displays “Every weekday at 7:30 AM.” The backend stores the cron expression. Non-technical users interact with plain language; the system handles the translation.
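The translation layer is small enough to sketch directly. A minimal version covering only the common patterns the article mentions (this is illustrative, not a full cron parser — real builds would use a library like cronstrue or croniter):

```python
def cron_to_english(expr: str) -> str:
    """Translate common 5-field cron patterns to plain English.

    Handles only the weekday/daily cases for illustration; anything
    else falls through with the raw expression attached.
    """
    minute, hour, dom, month, dow = expr.split()
    h, m = int(hour), int(minute)
    ampm = "AM" if h < 12 else "PM"
    h12 = h % 12 or 12
    clock = f"{h12}:{m:02d} {ampm}"
    if dow == "1-5":
        return f"Every weekday at {clock}"
    if dow == "*" and dom == "*":
        return f"Every day at {clock}"
    return f"At {clock} (cron: {expr})"
```

The backend stores the cron expression; the frontend shows the translated string. Neither side needs to know about the other's format.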
The practical example he uses: a scheduled task for the Meta Ads CLI that fires every morning at 7:30 AM, queries the full campaign API, generates a daily report with hyperlinks to specific ads, and sends it to Telegram. The report includes spend, actions, blind spots, and a quick take on winners and losers — all formatted for mobile.
Component 5: The Agent Suggestion System
This one is easy to overlook but solves a real problem. As you add more tasks to a system, certain agents get overloaded. You don’t always notice until the quality of their outputs degrades.
The suggestion system runs Gemini 3 Flash against your full JSON conversation history, scans for agents that are handling too many distinct task types, and recommends new agents to split the load. In Kashef’s case, the comms agent was handling WhatsApp, school communications, Gmail, LinkedIn, and more. The system flagged it and suggested creating a dedicated email manager agent.
The reason he uses Gemini for this rather than Claude: “I don’t want to sacrifice my precious Claude Code tokens.” Gemini’s large context window and cheap inference make it practical to feed entire JSON conversation files and get useful analysis back. The cost is negligible. The signal is real.
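The detection logic underneath the suggestion system reduces to counting distinct task types per agent. A sketch, assuming labeled history records (in the real system, Gemini Flash infers the task types from raw conversation JSON rather than reading pre-labeled data):

```python
from collections import defaultdict


def suggest_splits(history: list[dict], max_task_types: int = 4) -> dict:
    """Flag agents handling too many distinct task types.

    `history` is a list of {"agent": ..., "task_type": ...} records;
    the record shape is an assumption for illustration. Returns a map
    of overloaded agent -> the task types it is juggling.
    """
    types = defaultdict(set)
    for record in history:
        types[record["agent"]].add(record["task_type"])
    return {
        agent: sorted(kinds)
        for agent, kinds in types.items()
        if len(kinds) > max_task_types
    }
```

Applied to Kashef's example, a comms agent juggling WhatsApp, school communications, Gmail, and LinkedIn would cross the threshold and trigger a suggestion to spin off a dedicated email manager.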
This is a good example of the multi-model pattern in practice — using the cheapest capable model for classification and analysis tasks, reserving the more expensive model for execution. Platforms like MindStudio handle this kind of orchestration natively: 200+ models, 1,000+ integrations, and a visual builder for chaining agents and workflows, which means you can route tasks to different models based on cost and capability without writing the routing logic yourself.
Component 6: The War Room
The war room is the synchronous communication layer — a chat room where you interact with all agents using slash commands.
Three commands matter most:
- /standup — pings every agent; each one queries its entries in the hive mind database for the last 24 hours and returns a status report. The main agent synthesizes the individual reports into a cohesive summary.
- /discuss — opens a group conversation where agents can respond to each other with full context on prior replies.
- /pin — designates one agent as the lead for the current meeting. Every other agent's reply defers to the pinned agent's framing.
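The per-agent half of /standup is a single query against the shared database. A sketch, assuming a tasks table with agent_id, status, and completed_at columns (the column names are assumptions about the hive mind schema):

```python
import sqlite3


def standup_report(conn: sqlite3.Connection, agent_id: int) -> list:
    """Fetch one agent's completed tasks from the last 24 hours.

    Each agent runs this against the shared hive mind database; the
    main agent then synthesizes all the per-agent results into one
    summary.
    """
    return conn.execute(
        """
        SELECT title, completed_at FROM tasks
        WHERE agent_id = ? AND status = 'done'
          AND completed_at >= datetime('now', '-1 day')
        ORDER BY completed_at DESC
        """,
        (agent_id,),
    ).fetchall()
```

Because every agent logs to the same tables, the standup requires no inter-agent messaging at all: each agent just reads its own slice of shared state.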
The technical challenge: making sure each agent in a war room conversation has context on all previous replies, not just its own. Kashef’s solution was to ask Claude Code to figure out how to pass the full conversation context — including each agent’s underlying claude.md and YAML files — to every participant. It worked.
You can also tag specific agents directly, the same way you’d tag a custom GPT. @comms give me advice for creating a good YouTube video routes to the comms agent specifically, bypassing the group.
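Direct tagging is a prefix match before any model gets involved. A minimal sketch of the routing rule (the handler shape is an assumption, not the actual Telegram code):

```python
import re


def route_message(text: str, agents: set[str]) -> tuple[str, str]:
    """Route '@agent message' directly to that agent.

    Anything without a recognized @-prefix goes to the group, where
    the normal war-room flow takes over.
    """
    m = re.match(r"@(\w+)\s+(.*)", text, re.DOTALL)
    if m and m.group(1) in agents:
        return m.group(1), m.group(2)
    return "group", text
```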
For context on how the Telegram connection works underneath all of this, the Claude Code Dispatch post on remote-controlling your agent from your phone covers the Anthropic SDK bridge that makes the Telegram interface possible — the same bridge Kashef built his system on top of.
What’s Actually Buried Here
The most surprising detail in the entire video is how the 3D graph visualization was built.
Kashef wanted an Obsidian-style graph view — the kind where nodes represent tasks and edges represent relationships. Rather than spec it from scratch, he screen-recorded himself using Obsidian’s graph view in Loom, downloaded the video, fed it to Gemini’s video understanding API, and then asked Claude Code to replicate what it saw and integrate it into the mission control dashboard.
He didn’t write a design document. He didn’t mock up wireframes. He recorded himself using the thing he wanted to build, fed the video to a model with video understanding capabilities, and used the output as the spec.
This is a genuinely different way to communicate intent to a coding agent. Most people write prompts. Some write specs. Kashef recorded a demo of existing software and said “build me this.” The Gemini skill — which gives Claude Code access to Gemini models including the video understanding API — is what made it possible.
The 3D version is resource-intensive, he notes. If you’re on a lower-spec machine, the 2D version is more practical. But the build method applies to either.
On the topic of building from specs rather than prompts: Remy takes a related approach at the application layer — you write an annotated markdown spec and it compiles a complete TypeScript backend, SQLite database, auth, and frontend from it. The spec is the source of truth; the code is derived output. It’s a different abstraction than what Kashef is doing, but the underlying logic is similar: precise intent expressed in a structured document produces better output than freeform prompting.
What to Build First
If you want to replicate this system, the build order matters. Kashef is direct about it: start with the boring stuff.
Get the SQLite database schema right. Define your agents as claude.md plus YAML. Decide which skills are global and which are agent-specific. Make sure the list view is populating correctly before you touch the graph visualization. The 3D graph running on a broken database is just a pretty broken database.
The Telegram connection is the second priority. The step-by-step guide to setting up Claude Code channels with Telegram covers the BotFather setup and token configuration — the same process Kashef uses for each new agent. Create a bot, copy the token, paste it into the frontend, activate. Repeat for each agent.
The mission control kanban and schedule tab come next. Both are relatively straightforward once the database and messaging layer are stable. The agent suggestion system and war room slash commands are the last layer — useful, but they depend on having enough conversation history to analyze.
The 80/20 question Kashef poses is worth sitting with: which features actually drive business value, and which are novelties? The Meta Ads CLI report at 7:30 AM every morning is business value. The 3D graph is a novelty — a useful one for understanding system state, but not the thing that makes the business run.
The system he’s built took months of iteration. But the architecture is legible, the components are separable, and the underlying technology — SQLite, Claude Code, Gemini Flash, Telegram — is all accessible today. The gap between understanding this and building it is smaller than it looks.