Cursor's Research on Running 100 Agents in Parallel: Why Flat Agent Teams Fail Without an Issue Tracker
Cursor found that flat agent organizations develop the same coordination failures as flat human orgs. Issue trackers — claiming, status, blockers — solve both.
When You Run 100 Agents in Parallel, You Rediscover Why Jira Exists
Cursor’s engineers did something most AI teams haven’t done yet: they actually ran hundreds of agents simultaneously on large coding projects and wrote down what broke. The finding, which has been circulating among people building serious multi-agent systems, is blunt — flat agent organizations develop the same coordination failures as flat human organizations. And the thing that fixes both turns out to be the same boring tool: an issue tracker.
That’s the core of Cursor’s research on running multiple agents in parallel: not a theoretical concern but an empirical one, discovered by watching agents fail in the wild.
The specific failure modes are worth naming. When you run a flat agent system — every agent with equal access, no claiming, no status, no explicit ownership — agents start holding locks too long. They forget to release them. They wait on each other without any mechanism to surface the blockage. They become risk-averse, gravitating toward small, easy tasks instead of the hard end-to-end work that actually matters. The system produces a lot of activity and not much output.
If you’ve ever worked at a company that grew too fast without adding any management structure, this will sound familiar.
The Coordination Problem Isn’t New — It Just Moved
Here’s the non-obvious part. The coordination problems Cursor found aren’t a failure of the underlying models. They’re a failure of the environment the models are operating in.
A single agent running a single task in a clean context window is a solved problem, more or less. The hard part is what happens when you have dozens or hundreds of agents working on the same codebase, the same project, the same set of interdependent tasks — and none of them have a shared map of who owns what, what state things are in, or which tasks are blocked waiting on something else.
This is exactly the problem that Bugzilla was built to solve in 1998. Terry Weissman wrote it for Mozilla to replace Netscape’s internal defect tracker. The first public deployment was April of that year. The tool was narrow on purpose — the Bugzilla team explicitly said they were not trying to build a general project management system. They were solving one specific problem: when a lot of people are building software asynchronously, how do you make sure bugs don’t disappear into the ether?
The answer they came up with has survived 27 years. A bug has durable state outside any one person’s head. It has a state machine — new, assigned, resolved, verified, closed, and the emotionally honest “won’t fix.” It has an assignee field that makes clear whose turn it is. It has dependencies. It has audit history.
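To see how little machinery that actually requires, here is a minimal sketch of the record that design implies. The state names and allowed transitions are illustrative, not Bugzilla’s actual schema; the point is that durable state, a state machine, an assignee, dependencies, and audit history are just fields plus one guarded method.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum

class State(Enum):
    NEW = "new"
    ASSIGNED = "assigned"
    RESOLVED = "resolved"   # carries a resolution like FIXED or WONTFIX
    VERIFIED = "verified"
    CLOSED = "closed"

# Which transitions the tracker allows; everything else is rejected.
TRANSITIONS = {
    State.NEW: {State.ASSIGNED, State.RESOLVED},
    State.ASSIGNED: {State.RESOLVED, State.NEW},
    State.RESOLVED: {State.VERIFIED, State.ASSIGNED},  # reopening goes back to assigned
    State.VERIFIED: {State.CLOSED},
    State.CLOSED: set(),
}

@dataclass
class Bug:
    id: int
    summary: str
    assignee: str | None = None                       # whose turn it is
    state: State = State.NEW
    depends_on: list[int] = field(default_factory=list)
    history: list[tuple[datetime, str]] = field(default_factory=list)  # audit trail

    def move(self, new_state: State, actor: str) -> None:
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.history.append((datetime.now(timezone.utc),
                             f"{actor}: {self.state.value} -> {new_state.value}"))
        self.state = new_state
```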
None of that was designed for AI. There was no AI in any relevant sense. But the constraints that drove those design decisions — humans forget context, humans need handoffs, humans need accountability across time zones and memory gaps — turn out to map almost exactly onto agent constraints.
Agents lose context. Agents need handoffs. Agents need observability. Agents need permissions. The system we built to compensate for human weakness compensates very well for agent weakness too.
What OpenAI’s Symphony Spec Actually Confirms
The most concrete recent evidence for all of this is OpenAI’s Symphony — an open-source Codex orchestration spec whose central design decision is to use a project management board, specifically a Linear board, as the control plane for autonomous coding agents.
Symphony defines polling, per-issue workspaces, active and terminal states, retries, observability, concurrency limits, and handoff states. Human review is an example handoff state. The spec is not using Linear as a UI for humans to look at. It’s using Linear as a state machine for agents to operate against.
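A hedged sketch of the loop shape that description implies: poll for ready work, claim it, run in a per-issue workspace, retry, hand off. The `board` and `agent` objects and all of their methods are hypothetical stand-ins, not Symphony’s actual interfaces; concurrency limits would come from how many of these loops the orchestrator runs.

```python
import tempfile
import time

MAX_ATTEMPTS = 3    # retries before escalating to a human
POLL_INTERVAL = 5   # seconds between polls

def run_agent_loop(board, agent) -> None:
    """Drive one agent: poll the board, claim an issue, work it through its states."""
    while True:
        issue = board.next_ready_issue()    # unblocked, unclaimed, highest priority
        if issue is None or not board.claim(issue.id, owner=agent.name):
            time.sleep(POLL_INTERVAL)       # nothing ready, or we lost the race
            continue
        with tempfile.TemporaryDirectory() as workspace:   # per-issue workspace
            for attempt in range(1, MAX_ATTEMPTS + 1):
                result = agent.attempt(issue, workspace)
                board.comment(issue.id, f"attempt {attempt}: {result.status}")  # observability
                if result.ok:
                    board.move(issue.id, "human_review")   # handoff state
                    break
            else:
                board.move(issue.id, "needs_human")        # escalate after retries
```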
The reported result: some internal OpenAI teams saw a 500% increase in landed pull requests when using this model.
That number is striking enough to sit with for a moment. The improvement didn’t come from a better model. It came from giving agents a structured substrate — a place to find work, understand who owns it, know what state it’s in, see what changed, and hand results back. The issue tracker didn’t die when agents arrived. It got promoted.
This is exactly what Cursor’s research predicts. The coordination problem in flat agent systems isn’t solved by making individual agents smarter. It’s solved by giving the system a shared map of work. Claiming, status, blockers, priority — these are coordination primitives, and issue trackers have had them for decades.
For teams building multi-agent AI workflows, this is the practical implication: before you add another model or another tool, ask whether your agents have a shared, durable, queryable map of what’s being worked on. If the answer is “it’s in a Slack thread somewhere,” you’ve already hit the ceiling.
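It’s worth spelling out what “claiming” means mechanically, since it’s the primitive flat agent systems lack. A minimal sketch, assuming a SQLite `tasks` table with `id`, `owner`, and `status` columns (the schema is illustrative, not from Cursor or OpenAI): the claim is a single conditional update, so two agents racing for the same task can never both win.

```python
import sqlite3

def claim_task(db: sqlite3.Connection, task_id: int, agent_id: str) -> bool:
    # The WHERE clause is the whole coordination mechanism: the UPDATE only
    # succeeds if the task is still unclaimed, so exactly one racer gets it.
    cur = db.execute(
        "UPDATE tasks SET owner = ?, status = 'claimed' "
        "WHERE id = ? AND owner IS NULL",
        (agent_id, task_id),
    )
    db.commit()
    return cur.rowcount == 1   # True: we own it. False: someone beat us to it.
```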
The Linear Contradiction That Resolves Itself
There’s a timing detail here that’s worth dwelling on. On March 24, 2026, Linear CEO Karri Saarinen published a piece called “Issue tracking is dead.” The argument was reasonable: issue trackers were built for a handoff model of software development, where a product manager scopes work, someone writes the ticket, someone else picks it up, and the ticket moves through statuses while people argue about priority and acceptance criteria. The ticket ends up being a translation layer between reality and the team. Agents, Saarinen argued, can read more of the underlying context directly — they don’t need a human to do as much translation first.
A little over a month later, OpenAI published Symphony, which uses a Linear board as the control plane for autonomous coding agents.
The contradiction resolves once you separate the UI from the substrate. Saarinen was right that the human ceremony around tickets is shrinking. The ritual of manually translating messy reality into well-behaved tickets — that world is under real pressure. But Symphony isn’t using Linear for that. It’s using Linear because Linear has records, a state machine, explicit ownership, structural verbs, and queryable history. The human translation step can die. The underlying substrate gets more important, not less.
Linear’s own evolution is instructive here. Linear didn’t invent a new data model — issues, states, assignees, priorities, dependencies, history. That’s the same shape as Bugzilla and Jira. What Linear did was make that model pleasant enough that people use it voluntarily and consistently. And that’s the part that matters for agents in a way most people haven’t thought through.
When people hate a tool, they work around it. They leave fields blank. They put important decisions in Slack. They use fake statuses. They create tickets after the work is done. The tracker reflects obligation, not reality. When people like the tool, more of the real work ends up in the system — cleaner state, better descriptions, current ownership, dependencies that reflect reality. Linear was a UX win. The UX win became a data win. The data win becomes an agent win, because an agent doesn’t care whether your project management tool feels elegant. It cares whether the state inside it is reliable enough to act on.
Why Atlassian Is Sitting on Something Bigger Than It Looks
If issue trackers are agent substrates, then Atlassian owns one of the largest installed bases of agent-readable work state in the world. That’s a different frame than “Jira is the thing developers complain about.”
In May 2025, Atlassian launched its Rovo MCP server in beta, with Claude as the first official partner and Cloudflare infrastructure underneath. By February 2026, the Rovo MCP server was generally available — supporting search and summarization across Jira, Confluence, and Compass; create and update operations on issues and pages; OAuth; existing permission models; admin controls and whitelisting. This is Atlassian making Jira and Confluence agent-readable and agent-writable through a controlled interface. Mechanically, it’s the same pattern Symphony assumes with Linear: take the system where work already lives, expose it through a controlled interface, let agents operate against the work graph.
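Mechanically, an MCP interaction is small. The JSON-RPC envelope below follows the Model Context Protocol’s `tools/call` method; the tool name and arguments are hypothetical stand-ins, since the Rovo server publishes its own tool catalog, which agents discover at runtime via the protocol’s `tools/list` method.

```python
import json

# The wire shape of an MCP tool invocation (JSON-RPC 2.0, method "tools/call").
# "create_issue" and its arguments are invented for illustration; real tool
# names come from the server's own catalog.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "create_issue",
        "arguments": {
            "project": "ENG",
            "summary": "Flaky retry logic in payment webhook",
            "assignee": "agent-7",
        },
    },
}
print(json.dumps(request, indent=2))
```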
There’s also a rumored Anthropic acquisition of Atlassian at roughly a $40 billion premium — unconfirmed, no SEC filing, no formal announcement. Treat it as rumor. But the logic is obvious enough that people take it seriously, which is itself interesting. A few years ago, “frontier AI lab buys the issue tracker company” would have sounded bizarre. Now it sounds like infrastructure strategy. Anthropic has a multi-year partnership with Atlassian Williams Racing, making Claude the team’s official thinking partner. The brand alignment is telling even if the acquisition never happens: an AI lab that wants to be the enterprise agent layer sitting very close to the company that owns the map of how work happens inside the enterprise.
The Substrate Problem Is Bigger Than Issue Trackers
Once you see why issue trackers work for agents, you start seeing the same pattern in a lot of other places.
CRMs are issue trackers for revenue. Salesforce and HubSpot have accounts, contacts, opportunities, owners, stages, next steps, history, permissions, integrations. A deal moves from prospecting to qualification to proposal to negotiation to closed-won or closed-lost. That’s a state machine. An agent can research an account, draft a follow-up, update fields, flag risk, prepare the next meeting, ask for human approval before sending something externally — all operating against a durable state layer that already exists.
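The approval gate is the one piece worth sketching, since it’s what separates agents mutating internal state from agents acting on customers. A minimal sketch with a hypothetical `crm` client and invented verb names; the pattern, not the names, is the point.

```python
# Internal verbs mutate durable, audited CRM state directly; anything
# customer-facing parks in a pending state until a human signs off.
INTERNAL_VERBS = {"update_field", "flag_risk", "add_note"}

def execute(crm, action: str, payload: dict) -> str:
    if action in INTERNAL_VERBS:
        crm.apply(action, payload)          # safe: internal state change
        return "applied"
    # External side effects (emails, proposals) require human approval first.
    approval_id = crm.request_approval(action, payload)
    return f"queued for human review: {approval_id}"
```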
Service desks — Zendesk, ServiceNow, Jira Service Management — are issue trackers for customer problems. Tickets, assignees, statuses, SLAs, escalation paths, customer history, audit trails, permissions. If you were designing a customer support agent from scratch, you’d rebuild most of that. So the agent won’t replace the service desk. It will operate through it.
The weaker candidates are just as instructive. Email has state and history and permissions, but the verbs are weak — reply, forward, archive, label. There’s no native way to assign, resolve, block, or approve in the general email model. Slack contains enormous amounts of context, but the structure is mostly transcript structure. A thread is a pile of messages. The state of the work is implied rather than encoded. Agents can read Slack and summarize Slack, but if Slack is the only place your work state lives, the agent has to infer too much.
For teams building AI agents for research and analysis tasks, this distinction matters practically. An agent that can query a structured work graph — who owns this, what state is it in, what’s blocking it — is doing fundamentally different work than an agent that’s reading transcripts and inferring.
What This Means If You’re Building Agent Systems Now
The Cursor finding reframes a question a lot of teams are asking wrong. The question isn’t “which model should our agents use?” or even “how many agents should we run?” The question is: what substrate are those agents operating against?
If your work data is clean — current ownership, meaningful statuses, real dependencies, actual history — your agents get a head start. If your work data is spread across Slack threads, half-filled tickets, mystery spreadsheets, and undocumented tribal knowledge, agents will struggle in exactly the places you want them to help. Messy operations used to be a human tax. People could compensate with meetings and memory and relationships. Agents are worse at those compensations. Agents need the business to be legible.
This is one of the hidden costs of messy operations that most AI readiness conversations miss entirely. Cleaning up your workflows, consolidating systems, enforcing fields, keeping ownership current, making sure status actually means something — that’s not just good hygiene. It’s the precondition for agents doing real work.
For teams building on top of these substrates, platforms like MindStudio handle the orchestration layer: 200+ models, 1,000+ integrations including Jira, Salesforce, and Slack, and a visual builder for chaining agents and workflows against the systems of record your business already runs on.
The data model is a strategic surface. If you’re building a product you want agents to use, don’t start by adding a chat interface. Start by making the underlying state clean. Expose your records. Define your verbs. Make ownership explicit. Preserve your history. Build permissions into the model. The difference between “we added AI for show” and “we became part of the agent stack” is whether agents can safely understand and change the state of work inside your product.
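Read as an interface, that checklist is surprisingly small. A hypothetical sketch of the surface, with every name invented for illustration:

```python
from typing import Any, Protocol

# The checklist above, read as an interface: records, verbs, ownership,
# history, permissions. Names are illustrative, not any product's API.
class AgentSubstrate(Protocol):
    def get_record(self, record_id: str) -> dict[str, Any]: ...               # expose your records
    def transition(self, record_id: str, verb: str, actor: str) -> None: ...  # define your verbs
    def owner(self, record_id: str) -> str | None: ...                        # make ownership explicit
    def history(self, record_id: str) -> list[dict[str, Any]]: ...            # preserve your history
    def can(self, actor: str, verb: str, record_id: str) -> bool: ...         # permissions in the model
```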
That’s a different question than “does this product have an AI chatbot.” And the teams that are asking the right question right now are the ones who will look prescient in two years.
The Boring Infrastructure Thesis
The nearly 30-year accumulation of human coordination infrastructure — Bugzilla to Jira to Linear — accidentally encoded exactly what AI agents need. None of it was designed for AI. It was designed for humans working asynchronously across time zones, across teams, across memory gaps. But those human constraints and agent constraints overlap more than anyone expected.
Cursor found this empirically. OpenAI built Symphony around it. Atlassian is betting on it with the Rovo MCP server. The companies that own systems of record — Atlassian, Salesforce, ServiceNow, Workday — may not have the most impressive demos, but they own the substrate agents build on. And the substrate is hard to displace.
For builders working on AI agents for product management workflows, the practical takeaway from Cursor’s research isn’t complicated: before you scale your agent system, give it a coordination layer. The issue tracker you already have — if your team has been using it cleanly — is probably the right starting point. Not because it was designed for this. Because it wasn’t, and it works anyway.
There’s something worth noting about where this is all heading. Tools like Remy represent one direction: you write a spec — annotated markdown — and the full-stack app gets compiled from it, backend, database, auth, deployment, all of it. The spec becomes the source of truth; the code is derived output. That’s a different abstraction layer than the issue tracker, but the underlying logic is similar — structured, durable, queryable state as the foundation that everything else builds on.
The boring tools are winning. The question now is which boring tools win next, and whether the teams building on top of them understand why they’re winning in the first place.