
What Is the Claude Code Split-and-Merge Pattern? How Sub-Agents Run in Parallel

The split-and-merge pattern lets Claude fan out work to up to 10 sub-agents simultaneously, each with its own context window, then merge results automatically.

MindStudio Team

How Claude Breaks Big Jobs Into Parallel Pieces

Most AI coding assistants work in a straight line. You give them a task, they work through it start to finish, and if the task is bigger than their context window allows, something gets dropped or truncated. For large codebases, multi-file refactors, or sprawling research tasks, that’s a real ceiling.

Claude’s split-and-merge pattern exists specifically to break through that ceiling. It’s one of the core multi-agent techniques built into Claude Code, and it’s worth understanding clearly — because it changes what’s possible with AI-assisted development.

The idea: take one large task, split it into independent subtasks, run those subtasks in parallel using multiple Claude sub-agents, then merge all the results back into a coherent output. Each sub-agent gets its own context window. None of them step on each other. And the whole thing runs far faster than sequential processing ever could.

This article walks through exactly how the split-and-merge pattern works, when to use it, what its limits are, and how similar patterns show up in broader multi-agent workflow tools.


What the Split-and-Merge Pattern Actually Is

The split-and-merge pattern — sometimes called fan-out/fan-in — is a parallel processing strategy applied to AI agents. It has three stages:

  1. Split: A coordinating agent (often called the orchestrator) receives a complex task and breaks it into smaller, independent subtasks.
  2. Fan out: Each subtask is handed off to a separate sub-agent that works on it concurrently with the others.
  3. Merge: When the sub-agents finish, their outputs are collected and synthesized into a single result.
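The three stages map directly onto ordinary parallel code. Here is a minimal Python sketch using only the standard library, where `run_subtask` is an illustrative stand-in for a real sub-agent call:

```python
# Minimal fan-out/fan-in sketch. run_subtask is an illustrative
# stand-in for dispatching work to an isolated sub-agent.
from concurrent.futures import ThreadPoolExecutor

def run_subtask(subtask: str) -> str:
    # Stand-in: a real implementation would hand this subtask to a
    # sub-agent process and return its output.
    return f"result for {subtask}"

def split_and_merge(subtasks: list[str]) -> str:
    # Fan out: run every subtask concurrently (the worker cap mirrors
    # the ~10-agent practical ceiling discussed below).
    with ThreadPoolExecutor(max_workers=10) as pool:
        results = list(pool.map(run_subtask, subtasks))
    # Merge: synthesize the collected results (here, simple concatenation).
    return "\n".join(results)
```

Real merges are usually more than concatenation, as the merge section below discusses, but the control flow is the same: split, run concurrently, wait, combine.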

This pattern isn’t new to software engineering. Map-reduce, thread pools, and distributed computing all follow similar logic. What’s new is applying it to large language model agents that need to reason, write code, read documentation, and make judgment calls — not just crunch numbers.

In Claude Code specifically, the orchestrator is typically the top-level Claude instance running in your terminal or IDE. Sub-agents are spawned using Claude’s internal Task tool, which lets one Claude instance delegate work to another isolated Claude process. Each of those processes has a fresh, independent context window.


How Claude Code Spawns Sub-Agents

Claude Code uses a tool called Task to create sub-agents. When Claude determines that a problem can be parallelized, it calls the Task tool multiple times — once per subtask — and those tasks run concurrently rather than waiting for each other to finish.

The Task Tool

The Task tool is essentially a lightweight API call from one Claude instance to another. The orchestrator passes:

  • A description of what the sub-agent should do
  • Any relevant context (file paths, instructions, constraints)
  • Optionally, a set of tools the sub-agent is allowed to use
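A hypothetical sketch of that handoff payload follows. The field names are illustrative only, not Claude Code's actual internal schema:

```python
# Hypothetical shape of what an orchestrator hands a sub-agent.
# Field names are illustrative, not Claude Code's internal schema.
from dataclasses import dataclass, field

@dataclass
class SubAgentTask:
    description: str                                        # what the sub-agent should do
    context: list[str] = field(default_factory=list)        # file paths, instructions, constraints
    allowed_tools: list[str] = field(default_factory=list)  # optional tool whitelist

task = SubAgentTask(
    description="Add type annotations to each file in this batch",
    context=["src/api/client.py", "src/api/models.py"],
    allowed_tools=["Read", "Edit"],
)
```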

Each spawned sub-agent starts fresh. It doesn’t inherit the orchestrator’s conversation history. It receives only what the orchestrator explicitly hands it. This isolation is intentional — it prevents context bleed between subtasks and keeps each sub-agent focused.

How Parallelism Works

Claude Code can run up to 10 sub-agents simultaneously. This isn’t a strict API limit in all contexts, but it’s the practical ceiling for most Claude Code workflows. Running more than that tends to introduce coordination overhead that negates the speed benefit.

When the orchestrator fans out to, say, six sub-agents, all six start processing at roughly the same time. If each task takes 30 seconds on its own, the total wall-clock time is still around 30 seconds — not 3 minutes.

The orchestrator then waits for all sub-agents to finish before proceeding to the merge step. This is a synchronization point: the orchestrator pauses, collects results, and then synthesizes.
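The wall-clock math is easy to demonstrate with a toy example: six concurrent 0.1-second tasks finish in roughly 0.1 seconds, not 0.6. The sleep is a stand-in for a sub-agent's processing time:

```python
# Toy demonstration: wall-clock time tracks the slowest subtask,
# not the sum. time.sleep stands in for a sub-agent's work.
import time
from concurrent.futures import ThreadPoolExecutor

def slow_task(n: int) -> int:
    time.sleep(0.1)  # stand-in for a sub-agent's processing time
    return n

start = time.monotonic()
with ThreadPoolExecutor(max_workers=6) as pool:
    results = list(pool.map(slow_task, range(6)))
elapsed = time.monotonic() - start  # roughly 0.1s, not 0.6s
```

The `with` block exiting is the synchronization point: it does not return until every worker has finished, just as the orchestrator pauses before merging.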


Context Windows: Why Each Sub-Agent Gets Its Own

One of the key reasons to use the split-and-merge pattern is context window management. Claude’s context window is large — models such as Claude 3.5 Sonnet and Claude 3 Opus support 200,000 tokens — but that limit still matters when you’re asking an agent to read dozens of files, hold entire codebases in mind, or process large datasets.

When you run everything through a single Claude instance, you’re competing for space in one context window. A single agent trying to analyze 20 large files simultaneously may run out of room, force-truncate content, or start losing track of earlier information as new content pushes it out.

Sub-agents solve this because each one gets a fresh 200,000-token context window. A task split across 8 sub-agents effectively gives you access to 1.6 million tokens of working memory — spread across independent processes, each optimized for its specific subtask.

Practical Example: Codebase Refactoring

Imagine you need to refactor a legacy Python codebase — updating all API calls from v1 to v2, adding type annotations, and flagging deprecated patterns. The codebase has 40 files.

With a single agent, you’d load files in batches, hope nothing critical falls out of context, and process sequentially. That’s slow and fragile.

With the split-and-merge pattern:

  • The orchestrator scans the repository structure
  • It groups files into 8 batches of roughly 5 files each
  • It fans out to 8 sub-agents, each focused on one batch
  • Each sub-agent reads its files, applies changes, and returns diffs and a summary
  • The orchestrator merges the diffs and reconciles any conflicts

Total time: roughly the time it takes to process one batch. Context quality: high, because no single agent is overloaded.
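The split step in this example is simple batching. A sketch, with placeholder file names standing in for a real repository scan:

```python
# Split step for the refactoring example: group 40 files into
# batches of 5, one batch per sub-agent. File names are placeholders.
def make_batches(files: list[str], batch_size: int) -> list[list[str]]:
    return [files[i:i + batch_size] for i in range(0, len(files), batch_size)]

files = [f"module_{n}.py" for n in range(40)]
batches = make_batches(files, 5)  # 40 files / 5 per batch -> 8 batches
```

In practice the orchestrator might batch by directory or by module boundaries rather than by count, so that related files land with the same sub-agent.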


The Merge Step: Putting Results Back Together

Splitting is the easy part. Merging is where things get interesting.

After sub-agents finish, the orchestrator receives a set of results that need to be combined into something coherent. Depending on the task type, merging can be straightforward or complex.

Simple Merges

For tasks with clean, non-overlapping outputs — like “summarize each of these 10 documents” — merging is just concatenation or light formatting. Each sub-agent returns a summary, and the orchestrator stitches them together.

Complex Merges

For tasks where sub-agents might produce conflicting outputs — like multiple agents editing related parts of a codebase — the merge step requires actual reasoning:

  • Conflict resolution: Two sub-agents might both modify a shared utility function in incompatible ways. The orchestrator must detect the conflict and decide which version to use, or synthesize a combined version.
  • Deduplication: If sub-agents each identified issues in overlapping code regions, the orchestrator needs to deduplicate before generating a final report.
  • Cross-referencing: Sub-agents working on related modules may each produce changes that are locally correct but globally inconsistent. The orchestrator reviews for coherence.

This is why the orchestrator itself needs to be a capable model. A weak orchestrator can’t reliably merge complex multi-agent outputs. In Claude Code, the orchestrating Claude instance typically runs the same model as the sub-agents, which means it has full reasoning capability at the merge step.
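The mechanical parts of a complex merge, deduplication and conflict detection, can be sketched in code; the data shapes here are illustrative. Resolving the flagged conflicts is the part that still needs the orchestrator's reasoning:

```python
# Sketch of a merge step: deduplicate findings and flag conflicts
# where two sub-agents proposed different edits to the same file.
# Data shapes are illustrative.
def merge_results(results: list[dict]) -> tuple[list[str], dict, list[str]]:
    findings = set()   # deduplicate overlapping issue reports
    edits = {}         # file path -> proposed diff
    conflicts = []     # files edited incompatibly by multiple sub-agents
    for result in results:
        findings.update(result["findings"])
        for path, diff in result["edits"].items():
            if path in edits and edits[path] != diff:
                conflicts.append(path)  # left for the orchestrator to reason about
            else:
                edits[path] = diff
    return sorted(findings), edits, conflicts
```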


When to Use the Split-and-Merge Pattern

Not every task benefits from parallelization. The split-and-merge pattern works best when:

The Task Has Clear Independence Boundaries

If subtask A doesn’t need to know the result of subtask B to do its work, they can run in parallel. Analyzing file A and analyzing file B are independent. Writing a function that calls another function might not be — the second depends on what the first returns.

The Task Is Bottlenecked by Volume, Not Complexity

If you have 50 test files to generate, 30 API endpoints to document, or 20 UI components to audit, the problem isn’t hard — it’s large. Parallelization cuts the time proportionally.

Context Would Otherwise Get Crowded

If a single-agent approach would require fitting too much into one context window, splitting the input across sub-agents each with fresh windows is a practical necessity, not just an optimization.

When to Avoid It

  • Sequential dependencies: If step 3 depends on step 2, which depends on step 1, you can’t parallelize. Use a pipeline pattern instead.
  • Small tasks: The overhead of spawning sub-agents and merging results isn’t worth it for tasks that take 5 seconds single-threaded.
  • Shared mutable state: If all sub-agents need to write to the same resource simultaneously, you’ll hit race conditions and conflicts that are harder to resolve than just doing the work sequentially.

Real-World Use Cases

Large-Scale Code Review

A team pushing a major refactor across a monorepo can use split-and-merge to fan out code review across modules. Each sub-agent reviews one module — checking for security issues, style violations, and logic errors — then the orchestrator synthesizes a unified review report.

Documentation Generation

Documentation is naturally parallelizable. Each function, class, or endpoint can be documented independently. Ten sub-agents working simultaneously can generate docs for a whole codebase in the time it would take one agent to document a single module.

Multi-File Search and Analysis

“Find all places in the codebase where we’re using a deprecated library and suggest replacements.” This is a classic fan-out task. The orchestrator splits files across sub-agents, each sub-agent searches its batch, and the orchestrator compiles a unified list of replacement suggestions.

Test Generation

Generating unit tests for 50 functions? Fan out one function per sub-agent (or a few functions per agent), collect the test files, merge them into the test suite. Works cleanly because each test file is independent.

Research Aggregation

Outside of pure coding tasks, the pattern applies to any research or summarization task. Feed different sources to different sub-agents, have each one extract key facts, then merge into a single research brief.


Limits and Practical Considerations

The 10 Sub-Agent Cap

Claude Code practically limits concurrent sub-agent tasks to around 10. This reflects a balance between parallelism benefits and API costs, rate limits, and orchestration complexity. For most real-world tasks, 8–10 parallel agents represent the sweet spot.

Cost Scales Linearly

More sub-agents means more API calls. If running one agent on a task costs $0.10, running 10 sub-agents costs roughly $1.00 (plus the orchestrator’s own usage). The speed benefit is real, but budget accordingly for large-scale tasks.

Sub-Agents Don’t Share Memory

Each sub-agent starts fresh. It can’t see what other sub-agents are doing or have done. The only way sub-agents share information is through the orchestrator’s merge step. Design your splits accordingly — if sub-agents need to reference shared context, include that context explicitly in each agent’s prompt.
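One practical consequence: shared context has to be baked into every sub-agent's prompt at fan-out time. A sketch, with illustrative strings:

```python
# Sketch: since sub-agents share no memory, any context they all need
# must be repeated in each prompt. Strings are illustrative.
SHARED_CONTEXT = "Target API version: v2. Style guide: PEP 8."

def build_prompt(batch: list[str]) -> str:
    file_list = "\n".join(batch)
    return f"{SHARED_CONTEXT}\n\nRefactor the following files:\n{file_list}"
```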

Orchestrator Quality Matters

The orchestrator determines how well the task is split and how well results are merged. A poorly designed split gives sub-agents overlapping work or misses important dependencies. A poor merge produces incoherent output. When building workflows that use this pattern, spend as much thought on the orchestrator’s logic as on the sub-agents’ tasks.


How MindStudio Handles Multi-Agent Parallelism

The split-and-merge pattern isn’t unique to Claude Code. It’s a general architectural approach that any multi-agent system can implement — and platforms like MindStudio make it accessible without requiring custom orchestration code.

MindStudio is a no-code platform for building and deploying AI agents and automated workflows. You can use it to build agents that span multiple AI models — including Claude — and chain them together with fan-out logic, conditional routing, and result merging, all through a visual builder.

Where this connects directly to the split-and-merge pattern:

  • You can build a coordinating agent that breaks an input into subtasks and routes each to a specialized downstream agent.
  • MindStudio supports parallel workflow branches, so multiple agents can process their subtasks at the same time.
  • Results from parallel branches can be fed into a merge step — another agent or a template that synthesizes the outputs.

If you’re running Claude sub-agents through code and want to add pre-built integrations (Slack notifications when merging completes, Google Sheets logging, Salesforce updates triggered by results), MindStudio’s 1,000+ integrations handle that layer without custom infrastructure.

For teams that want the split-and-merge architecture without writing orchestration logic from scratch, it’s worth exploring. You can try MindStudio free at mindstudio.ai.

If you’re already building multi-agent systems, you might also find MindStudio’s Agent Skills Plugin useful — it’s an npm SDK that lets Claude Code and other agents call 120+ typed capabilities (email sending, web search, image generation, workflow execution) as simple method calls.


Frequently Asked Questions

What is the split-and-merge pattern in Claude Code?

The split-and-merge pattern is a multi-agent technique where a coordinating Claude instance breaks a complex task into independent subtasks, fans them out to multiple sub-agents running in parallel, then collects and merges all results. It’s designed for tasks that are too large or time-consuming for a single agent working sequentially.

How many sub-agents can Claude Code run in parallel?

Claude Code can run up to approximately 10 sub-agents simultaneously. This cap balances performance benefits against API costs and coordination overhead. For most real-world tasks — large codebase reviews, batch documentation, multi-file refactors — 8–10 parallel sub-agents is sufficient.

Does each sub-agent get its own context window?

Yes. Each sub-agent spawned by Claude Code starts with a completely fresh context window, independent of the orchestrator and of other sub-agents. This is one of the key advantages of the pattern: it multiplies available working memory by spreading context across isolated processes rather than cramming everything into one shared window.

What’s the difference between the split-and-merge pattern and a pipeline?

In a pipeline, tasks run sequentially — each step waits for the previous one to finish because it depends on that output. In the split-and-merge pattern, subtasks run in parallel because they’re independent of each other. Pipelines are right for sequential dependencies. Split-and-merge is right for tasks that can be parallelized across independent work units.

When does the split-and-merge pattern not make sense?

Avoid it when subtasks have strong dependencies on each other, when the task is simple enough that orchestration overhead outweighs time savings, or when all sub-agents would need to write to the same resource simultaneously. It also adds cost — each sub-agent is an independent API call — so it’s not always economical for small tasks.

How does the orchestrator merge results from sub-agents?

The orchestrator waits for all sub-agents to return their outputs, then synthesizes them. For simple tasks, this might just be concatenation or formatting. For complex tasks — like merging code changes from multiple sub-agents — the orchestrator applies reasoning to resolve conflicts, deduplicate findings, and ensure the combined output is coherent. This makes orchestrator quality critical: a capable model running the orchestration step is essential for reliable results.


Key Takeaways

  • The split-and-merge pattern divides complex tasks across multiple Claude sub-agents that run in parallel, each with an independent context window.
  • Claude Code uses the Task tool to spawn sub-agents, supporting up to ~10 running simultaneously.
  • Each sub-agent starts fresh — no shared memory — so subtask independence is a prerequisite for the pattern to work well.
  • The merge step requires genuine reasoning from the orchestrator, especially when sub-agents produce overlapping or conflicting outputs.
  • Best use cases: large-scale code review, batch documentation, multi-file refactoring, test generation, and research aggregation.
  • Cost scales with the number of sub-agents, so plan accordingly for high-volume tasks.

If you’re building multi-agent workflows and want to implement fan-out patterns without writing all the orchestration logic yourself, MindStudio offers a no-code visual builder that supports parallel agent branching, result merging, and 1,000+ business tool integrations — free to start.
