Run an AI Engineering Team in Paperclip With Heartbeats

The Case for a Multi-Agent Team Structure

Running a software project with a single AI agent has real limits. One context window, one thread of attention, one point of failure. When you need a system that can plan, build, and verify work simultaneously — and keep doing it without manual prompts — you need a multi-agent setup with clear role separation.

That’s exactly what Paperclip combined with Claude Code makes possible. You get a CEO agent that plans and delegates, an engineer agent that writes and ships code, and a QA agent that catches problems before they compound. The whole system runs on a heartbeat schedule, ticking along autonomously while you focus on higher-level work.

This guide walks through the complete setup: installing Paperclip, configuring Claude Code, wiring up your three-agent team, and scheduling autonomous runs.

What Paperclip Does and Why It Works with Claude Code

Paperclip is a lightweight multi-agent orchestration framework built specifically for teams using Claude Code. It handles the coordination layer — routing tasks between agents, managing shared memory, and defining agent roles through a simple configuration file.

Where Claude Code handles the actual reasoning and code generation, Paperclip handles:

Agent role definitions — Each agent gets a system prompt, a set of tools, and a defined scope of work
Inter-agent messaging — Agents pass tasks and results through a shared task queue
Heartbeat scheduling — A built-in scheduler triggers the CEO agent on a defined interval, kicking off each work cycle
State persistence — Completed tasks, decisions, and code outputs are stored so each agent has context from previous runs

Think of Paperclip as the org chart layer that sits above Claude Code. Claude Code does the reasoning; Paperclip decides who reasons about what.

Why This Combination Works

Claude Code is built for agentic work. It reads files, writes code, runs terminal commands, and browses documentation — all without a human prompting each step. When you run multiple Claude Code instances under Paperclip’s orchestration, you get a system that can plan a feature, implement it, and test it in a coordinated cycle.

The CEO agent never writes a line of code. It reads the product backlog, breaks features into tasks, assigns them to the engineer, and reviews results. The QA agent runs independently — pulling completed work from a shared directory and applying test cases. Each agent has clean separation of concerns, which keeps outputs predictable and failures easy to diagnose.

Prerequisites and Project Setup

Before starting, make sure you have the following in place:

Requirements:

Node.js 18 or higher
Claude Code installed globally (npm install -g @anthropic-ai/claude-code)
An Anthropic API key with Claude access
Paperclip installed (npm install -g paperclip-agents)
A project directory where your agents will operate

Optional but recommended:

Git initialized in your project directory so the QA agent can diff changes
A .env file for managing API keys and configuration

Initialize Your Paperclip Project

Once Paperclip is installed, run the following in your project root:

paperclip init

This creates a paperclip.config.json file and an /agents directory where you’ll define each agent’s role. The init command also creates a /tasks directory — this is the shared task queue all three agents read from and write to.

Your initial project structure should look like this:

my-project/
├── agents/
│   ├── ceo.json
│   ├── engineer.json
│   └── qa.json
├── tasks/
│   ├── backlog.md
│   └── completed/
├── paperclip.config.json
└── .env

Add your Anthropic API key to .env:

ANTHROPIC_API_KEY=your_key_here
CLAUDE_MODEL=claude-opus-4-5

Building Your CEO Agent

The CEO agent is the planning layer of this system. Its job is to read the product backlog, decide what to work on next, break it into an actionable spec, and assign it to the engineer. It also reviews completed QA results and decides whether to close a task or send it back for revision.

Write the CEO System Prompt

Open agents/ceo.json and define the agent:

{
  "name": "CEO",
  "model": "claude-opus-4-5",
  "role": "product-manager",
  "system_prompt": "You are a technical product manager overseeing a software project. Read the backlog in /tasks/backlog.md, identify the highest-priority uncompleted task, break it into a clear implementation spec, and write that spec to /tasks/current_task.md. Do not write any code. Once QA has approved a task (flagged in /tasks/qa_results.md), mark the original backlog item as complete. If QA reports FAILED, write a revised spec to /tasks/current_task.md that includes the failure reasons.",
  "tools": ["read_file", "write_file", "list_directory"],
  "triggers_on": "heartbeat"
}

A few things worth noting:

The CEO only gets read/write file access — no terminal execution. This limits its scope and prevents accidental code runs.
The triggers_on: heartbeat flag means Paperclip calls this agent on each scheduled tick.
The system prompt keeps the CEO focused on one input (the backlog), one output (a task spec), and one result to check (QA output).
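A definition file like this is easy to get subtly wrong, so a pre-flight check can help before starting the system. The sketch below is not part of Paperclip: `validateAgentDefinition` and its list of required fields are assumptions based only on the fields shown in the example above.

```javascript
// Hypothetical pre-flight check for an agent definition object.
// The required-field list mirrors the example above; it is an assumption,
// not a schema published by Paperclip.
const REQUIRED_FIELDS = ["name", "model", "role", "system_prompt", "tools", "triggers_on"];

function validateAgentDefinition(def) {
  const problems = [];
  for (const field of REQUIRED_FIELDS) {
    if (def[field] === undefined || def[field] === "") {
      problems.push(`missing field: ${field}`);
    }
  }
  if (def.tools !== undefined && !Array.isArray(def.tools)) {
    problems.push("tools must be an array");
  }
  // The CEO is meant to plan, not run code, so flag terminal access.
  if (
    def.role === "product-manager" &&
    Array.isArray(def.tools) &&
    def.tools.includes("execute_command")
  ) {
    problems.push("planning agent should not have execute_command");
  }
  return problems;
}

// System prompt abbreviated here for space.
const ceo = {
  name: "CEO",
  model: "claude-opus-4-5",
  role: "product-manager",
  system_prompt: "You are a technical product manager overseeing a software project.",
  tools: ["read_file", "write_file", "list_directory"],
  triggers_on: "heartbeat",
};

console.log(validateAgentDefinition(ceo)); // prints []
```

Running a check like this over every file in agents/ catches a missing trigger or an over-permissioned planner before the first heartbeat fires.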

What a Good Backlog Looks Like

The CEO agent reads from /tasks/backlog.md. Keep this file structured but simple:

# Product Backlog

## TODO
- [ ] Build REST endpoint for /api/products
- [ ] Add rate limiting middleware

## IN PROGRESS
- [ ] Add user authentication with JWT

## DONE
- [x] Set up Express server with basic routing

The CEO reads this file on each heartbeat, identifies the top TODO item, and writes a task spec to /tasks/current_task.md for the engineer to pick up.
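To make "identifies the top TODO item" concrete, here is a minimal sketch of that selection done mechanically. In the real system the CEO agent does this by reading the file; `nextTodoItem` is a hypothetical helper that assumes the section headings and checkbox syntax shown above.

```javascript
// Hypothetical helper: return the first unchecked item under "## TODO",
// or null when the TODO section is empty or missing.
function nextTodoItem(backlogMarkdown) {
  let inTodo = false;
  for (const rawLine of backlogMarkdown.split("\n")) {
    const line = rawLine.trim();
    if (line.startsWith("## ")) {
      // Track which section we are in; only TODO items are candidates.
      inTodo = line.slice(3).trim().toUpperCase() === "TODO";
      continue;
    }
    const match = inTodo && line.match(/^- \[ \] (.+)$/);
    if (match) return match[1];
  }
  return null;
}

const backlog = [
  "# Product Backlog",
  "",
  "## TODO",
  "- [ ] Build REST endpoint for /api/products",
  "- [ ] Add rate limiting middleware",
  "",
  "## IN PROGRESS",
  "- [ ] Add user authentication with JWT",
  "",
  "## DONE",
  "- [x] Set up Express server with basic routing",
].join("\n");

console.log(nextTodoItem(backlog)); // prints: Build REST endpoint for /api/products
```

Unchecked items under IN PROGRESS are skipped on purpose, so a task that is already assigned is never picked a second time.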

Building Your Engineer and QA Agents

With the CEO defined, the next step is setting up the two execution agents.

Configure the Engineer Agent

The engineer agent reads the current task spec and writes the code. Open agents/engineer.json:

{
  "name": "Engineer",
  "model": "claude-sonnet-4-5",
  "role": "software-engineer",
  "system_prompt": "You are a senior software engineer. Read /tasks/current_task.md for your current assignment. Implement the feature as described. Write your code to the appropriate files in /src. When complete, write a summary of what you changed to /tasks/engineer_output.md. Do not modify the backlog or any task files other than engineer_output.md.",
  "tools": ["read_file", "write_file", "execute_command", "list_directory"],
  "triggers_on": "task_ready"
}

The engineer gets execute_command access — it needs to run package installs, test runners, and build steps. If you want tighter control, add a command allowlist:

"command_allowlist": ["npm install", "npm run build", "npx"]

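How Paperclip matches a command against the allowlist is not described here, so the following is only a sketch of one reasonable rule: prefix matching on whole words, with shell operators rejected outright. `isCommandAllowed` is a hypothetical function, not a Paperclip API.

```javascript
// Hypothetical allowlist check: a command passes when it equals an allowed
// entry or starts with that entry followed by a space.
function isCommandAllowed(command, allowlist) {
  // Reject shell operators and newlines so an allowed prefix cannot
  // smuggle in a second command.
  if (/[;&|`$<>\n\r]/.test(command)) return false;
  const normalized = command.trim().replace(/\s+/g, " ");
  return allowlist.some(
    (allowed) => normalized === allowed || normalized.startsWith(allowed + " ")
  );
}

const allowlist = ["npm install", "npm run build", "npx"];

console.log(isCommandAllowed("npm install express", allowlist)); // prints true
console.log(isCommandAllowed("npm run deploy", allowlist)); // prints false
console.log(isCommandAllowed("npm install left-pad && rm -rf ~", allowlist)); // prints false
console.log(isCommandAllowed("npx jest", allowlist)); // prints true
```

Under a prefix rule like this, a bare npx entry permits any package to run, so a narrower entry such as npx jest gives tighter control.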
Configure the QA Agent

The QA agent picks up completed engineer work and runs verification. Open agents/qa.json:

{
  "name": "QA",
  "model": "claude-sonnet-4-5",
  "role": "qa-engineer",
  "system_prompt": "You are a QA engineer. Read /tasks/engineer_output.md to understand what was just implemented. Review the changed files in /src. Write test cases, run them, and document results in /tasks/qa_results.md. Flag whether the task PASSED or FAILED with specific reasons. Do not modify source files — only write to qa_results.md.",
  "tools": ["read_file", "write_file", "execute_command", "list_directory"],
  "triggers_on": "engineer_done"
}

Paperclip detects when /tasks/engineer_output.md is written and triggers the QA agent automatically based on the triggers_on: engineer_done flag.
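Since the CEO acts on the PASSED or FAILED flag, it helps if the verdict is unambiguous. The sketch below shows one conservative way to read it. The exact layout of qa_results.md is not specified in this guide, so `parseVerdict` is a hypothetical helper that assumes only that one of the two keywords appears in uppercase.

```javascript
// Hypothetical verdict reader for qa_results.md.
// Returns "PASSED", "FAILED", or null when the file is ambiguous
// (neither keyword present, or both).
function parseVerdict(qaResultsMarkdown) {
  const hasPassed = /\bPASSED\b/.test(qaResultsMarkdown);
  const hasFailed = /\bFAILED\b/.test(qaResultsMarkdown);
  if (hasPassed === hasFailed) return null;
  return hasPassed ? "PASSED" : "FAILED";
}

console.log(parseVerdict("## Result: PASSED\nAll checks green.")); // prints: PASSED
console.log(parseVerdict("FAILED: missing token returned 200")); // prints: FAILED
console.log(parseVerdict("2 PASSED, 1 FAILED")); // prints: null
```

Treating a mixed report as null rather than PASSED keeps a partially failing task from being closed by accident.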

How Agent Communication Works

The three agents don’t call each other directly. They communicate through shared files in the /tasks directory:

CEO writes to current_task.md → triggers Engineer
Engineer reads current_task.md, writes to engineer_output.md → triggers QA
QA reads engineer_output.md, writes to qa_results.md → CEO reads on next heartbeat

This file-based handoff is simple and fully debuggable. At any point, you can open a file in /tasks and see exactly where a task sits in the pipeline.
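The handoff order above can be expressed as a small function over file modification times, which is handy when debugging a stalled pipeline. This is a sketch of the idea, not how Paperclip implements its triggers: `nextAgent` and the shape of its input are assumptions.

```javascript
// Hypothetical helper: given last-modified timestamps (in ms, or null when
// the file does not exist yet), work out which agent should act next.
function nextAgent(mtimes) {
  const { currentTask = null, engineerOutput = null, qaResults = null } = mtimes;
  // No spec yet: the CEO has to write one.
  if (currentTask === null) return "ceo";
  // Spec is newer than the engineer's last output: engineer's turn.
  if (engineerOutput === null || engineerOutput < currentTask) return "engineer";
  // Engineer output is newer than the last QA report: QA's turn.
  if (qaResults === null || qaResults < engineerOutput) return "qa";
  // QA has reported on the latest output: the CEO reviews on the next heartbeat.
  return "ceo";
}

console.log(nextAgent({ currentTask: 100 })); // prints: engineer
console.log(nextAgent({ currentTask: 100, engineerOutput: 200 })); // prints: qa
console.log(nextAgent({ currentTask: 100, engineerOutput: 200, qaResults: 300 })); // prints: ceo
```

In practice the timestamps could come from the filesystem, for example via Node's fs.statSync on each file in /tasks.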

Setting Up Heartbeat Scheduling for Autonomous Operation

The heartbeat is what makes this system run without you. Without it, you’d need to manually trigger the CEO agent each time you want a work cycle to start.

Configure the Heartbeat in paperclip.config.json

Open your main config file and add the scheduler block:

{
  "project_name": "my-company",
  "agents": ["ceo", "engineer", "qa"],
  "scheduler": {
    "enabled": true,
    "heartbeat_interval": "30m",
    "start_agent": "ceo",
    "max_concurrent_agents": 2,
    "idle_behavior": "wait"
  },
  "shared_memory": {
    "type": "filesystem",
    "path": "./tasks"
  }
}

Key settings explained:

heartbeat_interval — How often the CEO agent triggers. 30m works well for most projects. Set 5m for fast iteration cycles or 2h for projects that don’t need constant updates.
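max_concurrent_agents — Caps simultaneous agents. Setting this to 2 lets the engineer and QA run on different tasks in parallel when needed. The cap alone does not stop the CEO from starting a new cycle while the engineer is mid-task, because a second slot is still free; use a lock file for that (see Heartbeat Fires During Long-Running Tasks under Troubleshooting).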
idle_behavior: wait — If the backlog is empty, the CEO waits instead of spinning on nothing.
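When choosing an interval, it can help to see how many planning cycles a setting produces per day, since each cycle consumes API tokens. The sketch below assumes interval strings are a whole number followed by m or h, as in the examples above; `parseInterval` and `cyclesPerDay` are hypothetical helpers, not Paperclip functions.

```javascript
// Hypothetical parser for interval strings such as "30m" or "2h".
const UNIT_MS = { m: 60 * 1000, h: 60 * 60 * 1000 };

function parseInterval(text) {
  const match = /^(\d+)([mh])$/.exec(text.trim());
  if (!match) throw new Error(`unrecognized interval: ${text}`);
  const value = Number(match[1]);
  if (value === 0) throw new Error("interval must be greater than zero");
  return value * UNIT_MS[match[2]];
}

// Number of heartbeats in 24 hours at a given interval.
function cyclesPerDay(text) {
  return (24 * UNIT_MS.h) / parseInterval(text);
}

console.log(parseInterval("30m")); // prints 1800000
console.log(cyclesPerDay("30m")); // prints 48
console.log(cyclesPerDay("5m")); // prints 288
console.log(cyclesPerDay("2h")); // prints 12
```

Moving from 30m to 5m multiplies the number of CEO runs by six, which is worth weighing against how quickly the engineer actually finishes tasks.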

Start the System

Once everything is configured:

paperclip start

You’ll see the process launch the CEO agent on the first heartbeat. Watch the /tasks directory and you’ll see files appear as the CEO, engineer, and QA agents hand work to one another.

To see a live activity log:

paperclip logs --follow

To stop the system gracefully:

paperclip stop

Adjusting the Heartbeat Over Time

The right interval depends on your project complexity. For a simple CRUD API, 15-minute heartbeats are plenty. For systems with many interdependent tasks, hourly cycles reduce task collisions.

You can pause the heartbeat without stopping the whole system:

paperclip pause --agent ceo

This lets the engineer finish in-progress work before the next planning cycle begins.

Extending Your Agents with MindStudio

Paperclip and Claude Code handle reasoning and code generation well. But there are common tasks your agents will need that file access and terminal commands don’t cover alone — sending a Slack message when QA signs off on a feature, emailing a daily progress summary, or triggering a deployment workflow after a clean test run.

This is where MindStudio’s Agent Skills Plugin fits in cleanly. It’s an npm SDK (@mindstudio-ai/agent) that gives any Claude Code agent access to 120+ typed capabilities as simple method calls — without building API integrations from scratch. It handles rate limiting, retries, and authentication behind the scenes, so your agents only deal with the logic.

Install the Plugin

npm install @mindstudio-ai/agent

Initialize it in your agent scripts:

import { MindStudio } from '@mindstudio-ai/agent';

const agent = new MindStudio();

Give Your Agents Real-World Reach

With the plugin in place, your CEO agent can do things like:

// Post a Slack message when a task is assigned
await agent.sendSlackMessage({
  channel: '#engineering',
  message: `New task assigned: ${taskTitle}`
});

// Email a daily summary at the end of each work cycle
await agent.sendEmail({
  to: 'team@yourcompany.com',
  subject: 'Daily Agent Summary',
  body: summaryContent
});

// Search for documentation when planning a complex feature
const results = await agent.searchGoogle({
  query: 'Node.js JWT authentication best practices 2025'
});

Your QA agent can post test results to Notion. Your engineer can trigger a deployment after a successful QA pass. Your CEO can pull external data to inform its planning decisions.

This is the infrastructure layer you’d otherwise build yourself. The Agent Skills Plugin handles it so your agents stay focused on reasoning, not plumbing.

You can start using MindStudio free at mindstudio.ai.

If you’d rather build autonomous agent workflows without writing infrastructure code at all, MindStudio’s visual no-code agent builder supports scheduled background agents, 1,000+ integrations, and 200+ AI models out of the box — no Paperclip setup required.

Troubleshooting Common Issues

Even a well-configured multi-agent setup runs into edge cases. Here are the most common problems and how to fix them.

Agent Gets Stuck in a Loop

Symptom: The CEO keeps rewriting the same task spec without the engineer picking it up.

Fix: Check that the triggers_on: task_ready condition for the engineer correctly detects when current_task.md is updated. Add a timestamp and status field to the task spec so Paperclip can detect freshness:

## Task
Build JWT authentication middleware

## Assigned: 2025-01-15T14:30:00Z
## Status: PENDING

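The engineer should check the status field before starting work, and update it to IN PROGRESS as its first action. Because the engineer prompt shown earlier forbids editing task files other than engineer_output.md, extend that prompt to allow status updates in current_task.md.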
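The check-then-claim step can be sketched as two small string functions over the task spec. These are hypothetical helpers (`readStatus`, `claimTask`) that assume the "## Status:" line format shown above; the agent itself would perform the same edit through its file tools.

```javascript
// Matches the "## Status: ..." line from the task spec format above.
const STATUS_LINE = /^## Status: (.+)$/m;

// Return the current status string, or null if the spec has no status line.
function readStatus(taskMarkdown) {
  const match = STATUS_LINE.exec(taskMarkdown);
  return match ? match[1].trim() : null;
}

// Claim a PENDING task by flipping it to IN PROGRESS.
// Returns the updated spec, or null if the task is not available.
function claimTask(taskMarkdown) {
  if (readStatus(taskMarkdown) !== "PENDING") return null;
  return taskMarkdown.replace(STATUS_LINE, "## Status: IN PROGRESS");
}

const spec = [
  "## Task",
  "Build JWT authentication middleware",
  "",
  "## Assigned: 2025-01-15T14:30:00Z",
  "## Status: PENDING",
].join("\n");

const claimed = claimTask(spec);
console.log(readStatus(claimed)); // prints: IN PROGRESS
console.log(claimTask(claimed)); // prints: null
```

The second call returning null is the point: a spec that has already been claimed is never started twice, even if the CEO rewrites the surrounding text.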

QA Agent Reports False Positives

Symptom: QA flags tests as failed even when the code is correct.

Fix: The QA agent needs more context about expected behavior, not just what changed. Update your engineer’s system prompt to require an expected behavior section in its output:

## Expected Behavior
- Authenticated requests with valid JWT should return 200
- Requests without a token should return 401
- Expired tokens should return 401

Heartbeat Fires During Long-Running Tasks

Symptom: The CEO starts a new planning cycle while the engineer is still working.

Fix: Add a lock file mechanism. The engineer should write a task.lock file in /tasks when it starts and delete it when done (its system prompt must permit writing that file). Update the CEO’s configuration:

"triggers_on": "heartbeat",
"skip_if_file_exists": "./tasks/task.lock"

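The skip rule amounts to a two-line decision, sketched here for clarity. `shouldRunOnHeartbeat` is a hypothetical function, and how Paperclip evaluates skip_if_file_exists internally is an assumption. The file check is passed in as a function so the logic can be tried without touching the disk.

```javascript
// Hypothetical decision function for a heartbeat tick.
// fileExists is any function that takes a path and returns a boolean.
function shouldRunOnHeartbeat(agentConfig, fileExists) {
  if (agentConfig.triggers_on !== "heartbeat") return false;
  const lockPath = agentConfig.skip_if_file_exists;
  // Skip this tick while the lock file is present.
  if (lockPath && fileExists(lockPath)) return false;
  return true;
}

const ceoConfig = {
  triggers_on: "heartbeat",
  skip_if_file_exists: "./tasks/task.lock",
};

// Simulate the filesystem with a set of existing paths.
const whileEngineerWorks = new Set(["./tasks/task.lock"]);
const afterEngineerDone = new Set();

console.log(shouldRunOnHeartbeat(ceoConfig, (p) => whileEngineerWorks.has(p))); // prints false
console.log(shouldRunOnHeartbeat(ceoConfig, (p) => afterEngineerDone.has(p))); // prints true
```

Against a real project, Node's fs.existsSync can be passed as the fileExists argument. A crashed engineer leaves a stale lock behind, so clear task.lock by hand when resetting a failed task.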
Context Window Degradation After Many Cycles

Symptom: Agents produce lower-quality outputs after dozens of task cycles.

Fix: Implement a summarization step. After every five completed tasks, have the CEO agent summarize the completed work into a compact changelog and truncate old entries from the backlog file. This keeps each agent’s active context lean and relevant.
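The truncation half of that step is mechanical and can be sketched directly; writing the changelog summary is still left to the CEO agent. `compactDone` is a hypothetical helper that assumes the backlog format shown earlier, with the oldest completed items listed first under DONE.

```javascript
// Hypothetical helper: keep only the most recent `keep` completed items
// under "## DONE" and return the removed titles for the changelog.
function compactDone(backlogMarkdown, keep) {
  const lines = backlogMarkdown.split("\n");
  const doneIndexes = [];
  let inDone = false;
  lines.forEach((line, index) => {
    if (line.startsWith("## ")) {
      inDone = line.trim().toUpperCase() === "## DONE";
    } else if (inDone && line.startsWith("- [x] ")) {
      doneIndexes.push(index);
    }
  });
  // Drop the oldest entries, which are assumed to come first in the section.
  const dropCount = Math.max(0, doneIndexes.length - keep);
  const drop = new Set(doneIndexes.slice(0, dropCount));
  return {
    backlog: lines.filter((_, index) => !drop.has(index)).join("\n"),
    archived: [...drop].map((index) => lines[index].slice("- [x] ".length)),
  };
}

const fullBacklog = [
  "## TODO",
  "- [ ] Add rate limiting middleware",
  "",
  "## DONE",
  "- [x] Set up Express server with basic routing",
  "- [x] Add user authentication with JWT",
  "- [x] Build REST endpoint for /api/products",
].join("\n");

const result = compactDone(fullBacklog, 1);
console.log(result.archived.length); // prints 2
console.log(result.backlog.split("\n").length); // prints 5
```

The archived titles are what the CEO would condense into the changelog before the shortened backlog is written back.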

Frequently Asked Questions

What is the difference between Paperclip and LangChain for multi-agent workflows?

LangChain is a general-purpose framework for building LLM applications with broad abstractions — chains, tools, memory, and routing. It’s flexible but requires significant configuration for multi-agent coordination.

Paperclip is narrower by design: it’s built specifically for orchestrating teams of Claude Code agents with defined roles, shared file-based memory, and a heartbeat scheduler. If you’re already using Claude Code and want a minimal orchestration layer without learning a full framework, Paperclip gets you running faster. If you need complex memory architectures or want to mix models from multiple providers, LangChain gives more control.

How many agents can Paperclip manage at once?

Paperclip can manage as many agents as your system resources support, but 3–5 concurrent agents covers most use cases. Each Claude Code agent call consumes API tokens, so running too many simultaneously raises costs without proportional productivity gains. The max_concurrent_agents setting in the config lets you cap this.

Is this setup safe to run unattended overnight?

With the right guardrails, yes. The most important steps: don’t give the engineer agent access to production credentials, use a command_allowlist to prevent arbitrary shell commands, and set the heartbeat interval long enough that you can review changes between cycles. Running QA as a mandatory gate before any task is marked complete prevents bad code from accumulating across runs.

Does the CEO agent need to run on Claude Opus specifically?

No, but higher-capability models produce better planning outputs. Claude Sonnet works well for the engineer and QA roles where the task is clearly defined. The CEO role involves open-ended reasoning — reading a backlog, setting priorities, writing specs — so it benefits from a more capable model. If cost is a concern, test the CEO on Sonnet first and switch to Opus only if planning quality suffers.

Can I add more agent roles beyond the three covered here?

Yes. Define any number of agents in the /agents directory and register them in paperclip.config.json. Common additions include a DevOps agent that handles deployments after QA approval, a Documentation agent that updates README files based on engineer output, and a Security agent that scans for vulnerabilities before QA runs. Each new role just needs a JSON definition file and a clear triggers_on condition.

What happens if an agent fails mid-task?

Paperclip logs all failures to paperclip.log. If an agent crashes mid-task, the task file stays in its current state — nothing is rolled back automatically. You’ll need to inspect the /tasks directory and reset the relevant file to PENDING status to allow a retry. Adding explicit error handling to your system prompts helps: instruct each agent to write failure details to /tasks/errors.md and stop cleanly if it hits an unrecoverable problem.

Key Takeaways

A multi-agent team built with Paperclip and Claude Code gives you a CEO, an engineer, and a QA agent that operate on a heartbeat schedule without manual intervention.
Role separation is the core design principle — each agent has a defined scope, specific tools, and clear handoff points. Mixing responsibilities degrades output quality quickly.
File-based communication between agents is simple, transparent, and easy to debug. The /tasks directory is a full audit trail of every decision and handoff.
Heartbeat scheduling turns a one-shot automation into a persistent system that can work through an entire product backlog autonomously.
Extending agents with tools like MindStudio’s Agent Skills Plugin adds real-world capabilities — email, Slack, web search, workflow triggers — without building those integrations yourself.

If you want to build autonomous multi-agent workflows without managing infrastructure, MindStudio lets you create and deploy AI agents visually, with scheduling, 1,000+ integrations, and 200+ AI models built in. You can get started free at mindstudio.ai.
