
How to Build a Personal AI Operating System in Claude Code: Step-by-Step Setup Guide

Context, Connections, Capabilities, Cadence — the four-step framework for turning Claude Code into a personal AI OS that runs automations while you sleep.

MindStudio Team

Your Laptop Is Closed. Your AI Is Still Working.

Most people use Claude Code the way they use a calculator — open it, type something, wait for an answer, close it. That’s fine. But it’s leaving most of the value on the table.

The difference between a calculator and an operating system is that an OS runs things for you. It has memory. It has connections to other systems. It acts on schedules. You can walk away from it.

Building a personal AI operating system in Claude Code takes a focused weekend to set up properly. After that, you have something that plans your day by reading your actual calendar, drafts responses by pulling from your actual email history, and runs automations while your laptop is off. The framework that makes this work has four steps, in order: Context → Connections → Capabilities → Cadence. The Four C’s. You cannot skip steps. Cadence without context is an automation that doesn’t know who it’s working for.

Here’s how to build it.


What You’re Actually Building (And Why It’s Different)

The obvious mental model for this is “a really good chatbot.” That’s wrong.

A chatbot answers questions. An AI OS acts on your behalf, with memory of what happened yesterday, access to the tools you actually use, and the ability to run processes while you’re asleep. The distinction matters because it changes what you build.


When you ask a fresh Claude session “what should I focus on this week?” it answers like a stranger who just met you. When you ask your AIOS — your AI operating system — the same question, it reads your priorities file, checks your ClickUp tasks, scans your calendar, and answers like a chief of staff who’s been working with you for six months.

The goal is to get to a point where, if someone on your team had a question, they’d be better off asking your AIOS than asking you — because it has better memory, faster recall, and never sleeps. If you want to go further and compile a full application from a structured spec rather than orchestrating prompts manually, Remy is worth knowing about — it’s MindStudio’s spec-driven full-stack app compiler where you write a markdown spec with annotations and it compiles into a complete TypeScript app with backend, database, auth, and deployment already wired together.


What You Need Before Starting

Accounts and tools:

  • Claude Pro ($20/mo) or Max ($200/mo). Pro gives you 5 remote routine runs per day; Max gives you 15. You need at least Pro to follow this guide fully.
  • Visual Studio Code — free, just download it.
  • The Claude Code VS Code extension — install from the Extensions panel, search “Claude Code.”
  • A GitHub account (free). You’ll need a private repo for your AIOS.
  • A Google account if you want Google Workspace integration.

Knowledge:

  • You don’t need to know how to code. You need to be comfortable pasting things into a terminal and following instructions.
  • You need to know what tools your business actually runs on. Spend 10 minutes before you start and write down where your revenue data lives, where your tasks live, where your communications live.

Step 1: Build the Foundation (Context)

The first C is Context — what the AI knows about you.

Create a new folder on your desktop called AIOS. Open VS Code, click “Open Folder,” and open that folder. Then open the Claude Code panel (the Anthropic logo in the top right).

Now clone the starter repo. Tell Claude Code: “Clone this GitHub repo into the current project folder: [your repo URL].” If you’re starting from scratch without a template, ask Claude to initialize a .claude/skills/ directory structure and create a CLAUDE.md file.

Your project should now have:

  • .claude/ — where skills live
  • context/ — where Claude stores what it knows about you
  • references/ — API docs, SOPs, reference material
  • CLAUDE.md — the master prompt for the project
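If you'd rather scaffold this layout by hand before pointing Claude at it, the four items above reduce to a few commands. This is a minimal sketch: the folder names come from this guide, but the CLAUDE.md stub text is illustrative, not the actual template contents.

```shell
# Scaffold the AIOS project skeleton described above.
mkdir -p AIOS/.claude/skills AIOS/context AIOS/references

# Illustrative stub; the onboard interview will flesh this out.
cat > AIOS/CLAUDE.md <<'EOF'
# AIOS
Master prompt for this project. Read context/ before answering,
and check .claude/skills/ for a matching skill before improvising.
EOF

ls AIOS
```

Opening the AIOS folder itself in VS Code then puts Claude Code in the right working directory.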

Run the onboard skill. Type: “I just set up this project. My name is [name]. Help me get onboarded.” Claude will invoke the onboard skill — a seven-question interview that creates three files: about_me.md, about_business.md, and priorities.md.

Don’t rush this. Give it real answers. Paste in actual writing samples when it asks — a LinkedIn post, an email to a client, whatever. This is how it learns your voice. The more specific you are here, the less you’ll have to correct it later.

Check: You now have a context/ folder with three populated markdown files. Ask Claude: “What should I focus on this week?” If it answers with specifics from your priorities file rather than generic advice, the context layer is working.


Step 2: Wire In Your Data (Connections)


The second C is Connections — what data the AI can reach.

This is where most people stall. They have context (Claude knows who they are) but no connections (Claude can’t actually see what’s happening in their business). The result is an assistant that gives good generic advice but can’t tell you what’s actually on fire right now.

Start with the tools you check most. For most people that’s some combination of: a project management tool (ClickUp, Linear, Notion), email (Gmail), and calendar (Google Calendar).

The API approach: For each tool, tell Claude: “I want to connect [tool name]. Research the API documentation, create a reference markdown file in the references/ folder with all the endpoints, and create a .env file with a placeholder for the API key.”

This does three things at once: Claude researches the API so you don’t have to, stores the documentation locally as a markdown file (so it never has to search again), and creates a secure place for your credentials. Never paste API keys into the chat — always into the .env file.
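The resulting secret-handling setup can be sketched in two lines. `CLICKUP_API_KEY` is an illustrative variable name, not something the guide prescribes; use whatever name your tool's API expects.

```shell
# Keep the key in .env, and make sure git can never commit it.
printf 'CLICKUP_API_KEY=paste-your-key-here\n' > .env   # illustrative name

# Append '.env' to .gitignore only if it isn't already listed.
grep -qxF '.env' .gitignore 2>/dev/null || echo '.env' >> .gitignore
cat .gitignore
```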

The GWS CLI for Google Workspace: If your work lives in Google, install the Google Workspace CLI — an open-source bash-command interface to Gmail, Drive, Docs, Sheets, Calendar, and Slides. Setup: Google Cloud Console → new project → APIs & Services → OAuth consent screen → Desktop App credential → download JSON to ~/.config/gws/. Then enable the specific APIs you need (Gmail API, Drive API, etc.) in the console.

The GWS CLI is worth the 20-minute setup. It gives Claude bash-level access to your entire Google environment through a single tool, which is dramatically more token-efficient than running separate MCP servers for each service.

On MCP vs. API: The default instinct is to use MCP servers for everything because they’re easy to install. The problem is that loading multiple MCP servers eats context tokens on every session, even when you don’t need those tools. The better approach: use the API directly, store the endpoint documentation as a local markdown file, and only load what the specific task requires. This is the same progressive disclosure principle that makes Claude Code skills architecture efficient.
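The load-only-what-the-task-requires idea can be sketched in a few lines of shell. The file names, endpoint lines, and task string here are all illustrative; the point is that only the relevant reference file ever enters the context window.

```shell
# Load-on-demand sketch: keep API docs local, surface only what the task needs.
mkdir -p references
printf '# ClickUp API\nGET /api/v2/task/{task_id}\n' > references/clickup-api.md
printf '# Gmail API\nGET /gmail/v1/users/me/messages\n' > references/gmail-api.md

task="list my open ClickUp tasks"
case "$task" in
  *ClickUp*) cat references/clickup-api.md ;;  # only this file enters context
  *Gmail*)   cat references/gmail-api.md ;;
esac
```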

Check: Ask Claude to pull something real from one of your connected tools — your open tasks, your unread emails from today, your next three calendar events. If it returns actual data, the connection is live.


Step 3: Teach It How to Work (Capabilities)

The third C is Capabilities — what the AI can produce.

Skills are the mechanism. A skill is a markdown file in .claude/skills/<skill-name>/skill.md with YAML front matter (name, description) followed by step-by-step SOP instructions. When you ask Claude to do something, it reads the YAML front matter of every skill — about 100 tokens per skill — to find the right one, then loads the full file only if it matches. This is progressive context loading, and it’s why you can have dozens of skills without blowing your context budget.
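A minimal skill file, and the front-matter-only scan described above, look roughly like this. The skill name, description, and SOP steps are illustrative, not a real template skill.

```shell
# Write a minimal skill file with YAML front matter plus SOP steps.
mkdir -p .claude/skills/team-pulse
cat > .claude/skills/team-pulse/skill.md <<'EOF'
---
name: team-pulse
description: Summarize open ClickUp tasks and flag overloaded teammates.
---
1. Pull open tasks for every teammate.
2. Flag anyone with more than five overdue items.
3. Draft a three-bullet status update in my voice.
EOF

# What gets read first: only the front matter between the --- markers,
# not the full SOP body.
awk '/^---$/{n++; next} n==1' .claude/skills/team-pulse/skill.md
```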

The practical rule: if you’ve done something twice, build a skill for it.


Building your first skill: Walk Claude through a process you do regularly. Say: “Every Monday I check our team’s open tasks in ClickUp, look at who’s overloaded, and write a brief status update. Let’s turn this into a skill.” Claude will ask clarifying questions, then generate the skill file. The first version will be rough. Run it, watch what it does, give feedback, and update the skill file. By the fifth run it’s usually very good.

Keep skill files under 500 lines. Move detailed reference material — API endpoint lists, brand guidelines, example outputs — to separate files and reference them from the skill. The skill.md is the brain; the supporting files are the tools it reaches for when needed.

For building skills that reference external documentation, the ClickUp API reference markdown strategy works well: scrape all endpoints once, store as a .md file in references/, and point your skills at that file. Claude reads local markdown much faster and cheaper than making live web requests.

Run the /audit skill periodically. It grades your AIOS on all four C’s, scores out of 100, and tells you where the gaps are. A fresh setup with one connection and no custom skills will score around 50 — that’s expected. The audit tells you what to build next. Pair it with /level up, which asks five diagnostic questions to surface automation opportunities you haven’t thought of yet.

If you’re building skills that need to generate visual outputs — slide decks, design assets, anything rendered — look at how Claude Design handles this: it takes screenshots of its own output and feeds them back into the vision model for critique before showing you. The same self-QA loop works in your own skills. You burn more tokens, but the quality improvement is usually worth it, and the pattern extends naturally into fully self-improving skills built around feedback loops.

Check: Run one of your custom skills end-to-end. Did it follow the SOP? Did it use the right data sources? If yes, you have a working capability. If not, update the skill file and run it again.


Step 4: Make It Run Without You (Cadence)

The fourth C is Cadence — when the AI acts on its own.

This is where the system stops being a better chatbot and starts being an actual operating system.

Remote routines: In the Claude desktop app, go to Scheduled Tasks and create a new remote task. Remote routines run on Anthropic’s cloud — 4 vCPUs, 16GB RAM, 30GB disk per run — against a cloned copy of your GitHub repo. Your laptop can be off. The environment is destroyed after each run, which means your .env file never reaches it. API keys must be set as environment variables in the Cloud Environment settings panel, not in your repo.

Limits: Pro plan gets 5 remote routine runs per day, Max ($200/mo) gets 15, Team/Enterprise gets 25. Minimum interval is 1 hour.

A routine is just a specific prompt that fires on a schedule. The prompt should be specific enough to run without you — no clarifying questions, no ambiguity. Good: “Run the team-pulse skill and post the output to the #updates ClickUp channel.” Bad: “Check on the team.”


The loop skill for in-session scheduling: For things you want to run during an active session, /loop creates recurring cron jobs using cron_create, cron_list, and cron_delete tools. These have a 3-day expiry and die when the session closes. Use loop for things like: check this deploy every 20 minutes, remind me at 3pm to review the draft, monitor this ClickUp task for changes. Use remote routines for things like: every Monday morning run the weekly briefing skill, every night sync new data to the wiki.

Push your AIOS to GitHub: Your project needs to be in a private GitHub repo for remote routines to work, and also so you can access it from any machine. This is also what makes the system durable — if you switch from Claude Code to another tool in six months, your skills, context files, and reference docs all travel with you.
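A hedged sketch of that setup, run from inside the AIOS folder. The remote URL is a placeholder you'd replace with your own private repo, and the commit identity is illustrative.

```shell
# Turn the AIOS folder into a git repo; secrets stay out of history.
git init -q
grep -qxF '.env' .gitignore 2>/dev/null || echo '.env' >> .gitignore
git add -A
git -c user.name=aios -c user.email=aios@localhost commit -qm "initial AIOS commit"

# Placeholder remote; swap in your own private GitHub repo, then push.
# git remote add origin git@github.com:YOUR-ACCOUNT/aios.git
# git push -u origin main
git log --oneline
```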

For building a knowledge layer that compounds over time, the Karpathy LLM wiki pattern pairs well with cadence: a nightly routine ingests new documents into your raw/ folder, Claude processes them into the wiki/ folder, and your AIOS gets smarter without you doing anything. One user reduced token usage by 95% versus naive RAG by switching to this approach. For a full walkthrough of building that memory layer, the self-evolving Claude Code memory system guide covers the hook-based approach in detail.

Check: Set up one remote routine. Let it run on schedule without you touching it. If it completes and produces the right output, your AIOS has cadence.


The Failure Modes Worth Knowing

The routine can’t find your API keys. This happens because the .env file isn’t pushed to GitHub (correct — don’t push it). Fix: add your keys as environment variables in the Cloud Environment settings panel in the desktop app. Explicitly tell the routine prompt to use environment variables, not the .env file.

Skills trigger too often or not at all. If a skill fires when you don’t want it to, the YAML description is too broad. Tighten it. If it never fires, the description is too narrow or the trigger phrase doesn’t match how you naturally ask for things. You can also force-trigger any skill with /skill-name directly.

Context rot in long sessions. If Claude starts forgetting things mid-session, you’ve hit context limits. The fix is to keep skills modular, use reference files instead of embedding everything in the skill.md, and start fresh sessions for distinct tasks. The /clear command clears conversation history without closing the session.

Remote routines fail silently. Check the run history in the desktop app. If a routine fails, add a step at the end: “If this run fails for any reason, post a brief error summary to [Slack channel / ClickUp task].” This gives you visibility without having to manually check.

The GWS CLI asks you to re-authenticate repeatedly. This is a known issue with the current beta. It’s annoying but not fatal — just re-run the auth command. The project is pre-1.0 and this is being actively worked on.


Where to Take This Next

Once all four C’s are working, the compounding starts. Every skill you add makes the system more capable. Every connection you wire in gives it more data to work with. Every routine you schedule gives you back more time.


The natural next step is building skill systems — chains of skills that run in sequence to produce complex outputs. A content pipeline, for example: a routine pulls trending topics every morning, passes them to a research skill, which passes output to a drafting skill, which posts to a review queue for your approval. Each skill is modular and reusable; the chain is what produces the end-to-end automation.

If you want to access your AIOS from your phone or while traveling, the Claude Code Telegram channels setup gives you a mobile interface to the same underlying system.

For teams building AI-powered applications on top of this kind of infrastructure, MindStudio offers a different path: an enterprise AI platform with 200+ models, 1,000+ integrations, and a visual builder for orchestrating agents and workflows — so you can build and deploy agents without writing the orchestration layer yourself. The underlying ideas — context grounding, structured memory, repeatable skills — are the same; the implementation layer is different.

The AIOS you build in week one will be unrecognizable by month three. That’s the point. The system learns as you use it, and the skills compound. The question isn’t whether this is worth building. The question is how long you want to wait before you start.

Presented by MindStudio
