AI Agents Don't Save Time — They Create an Infinite Backlog: 7 New Organizational Roles Emerging Right Now
Agents expose everything you could be doing, not just what you are doing. Seven new roles — from context librarian to eval engineer — are emerging.
Seven new organizational roles are emerging inside enterprise companies right now, and most of them don’t have job descriptions yet.
That’s the practical implication of a framework that’s been crystallizing across AI Twitter over the past few weeks — what you might call the infinite backlog problem. The core insight is this: AI agents don’t save you time by finishing your work. They expose how much work was always possible but never attempted. Five constraints now govern how much of that backlog any organization can actually capture: judgment, planning, coordination, evaluation, and absorption. Understanding those five constraints tells you exactly which new roles are being created and why.
This is not a theoretical argument. Aaron Levie, CEO of Box, is actively hiring for some of these roles today.
The Lump of Labor Fallacy, Reloaded
You’ve probably heard of the lump of labor fallacy — the mistaken belief that there’s a fixed amount of work to be done, and that if a machine does some of it, a human loses a job. Economists have been debunking this for 150 years. Every productivity-enhancing technology in history has generated more work, not less, because human wants are not finite.
But here’s what’s different about agents: the infinite backlog isn’t theoretical anymore. It’s immediate.
Before agents, the backlog was always there — the campaigns you never ran, the analyses you never did, the integrations you never built. But it stayed comfortably in the future because you simply couldn’t get to it. Time was the hard constraint. You finished your week, looked at what you’d accomplished, and called it enough.
Agents break that psychological contract. When multiple instances of an agent can work in parallel, 24/7, on tasks you define, the backlog stops being a future problem and becomes a present failure. Everything you’re not doing right now is something you could theoretically be doing right now. That’s a fundamentally different cognitive experience than “I’ll get to that next quarter.”
Aaron Levie captured it plainly: “Sorry to anyone who thought AI would mean we’d work less, at least for now. AI makes it easy to explore more than you did before, and so you start doing far more as a result.”
Sam Altman’s tweet last week made the same point more obliquely. He juxtaposed two quotes: “post-AGI, no one is going to work and the economy is going to collapse” against “I’m switching to polyphasic sleep because GPT-5.5 and Codex is so good that I can’t afford to be sleeping for such long stretches and miss out on working.” The CEO of the company building AGI is switching to polyphasic sleep to maximize Codex usage. That’s a revealed preference, not a press release. As Cheyen Jiao put it: “The constraint isn’t model quality anymore. It’s how many hours per day you can feed it work.”
Why Agents Feel Like Running a Startup
The closest analogy to what knowledge workers are experiencing right now is early-stage entrepreneurship.
A founder’s job is to assemble limited resources against an infinite set of possible things to build. The options are genuinely unlimited; the time and money are not. That gap — between what’s possible and what’s achievable — creates a specific kind of cognitive pressure. It’s exhilarating when you’re making progress. It’s destabilizing when you’re not sure you’re working on the right things.
That’s exactly what’s happening to people deploying agent fleets. The first wave of agentic work is the known backlog — the things you always wanted to do but couldn’t. Automate the content pipeline. Build the analytics dashboard. Wire up the CRM. That part feels incredible because you’re finally doing things that were clearly valuable but perpetually deferred.
Then you hit the uncharted territory. The things you didn’t even know were possible. And suddenly you’re a founder again: building without blueprints, choosing between options that are all plausibly good, unsure whether the thing you’re working on is the most important thing you could be working on.
Tang Yan described the burnout pattern that follows: “Instead of 8 to 10 normal productive hours, you might get 4 or 5 extremely intense hours before your brain is fully cooked and you feel numb until you sleep properly and reset.” The work no longer drains you through typing. It drains you through judgment — more context switching, more verification, more decisions per hour. The agent keeps running. The human hits a wall.
Bryan Johnson — whose entire public identity is optimized health habits — described his experience with Claude: “I got hooked. Suffered sleep consequences. I busted my screens-off rule. Turned down socializing. Fell behind on work. AI is preposterous.” The person most committed to sustainable rhythms couldn’t stop.
The Five Constraints That Actually Govern Agent Output
Here’s the thing about the infinite backlog: it’s not actually infinite. Agents don’t eliminate constraints. They relocate them.
The old constraint was time — how many hours a human could work. The new constraints are different in kind, not just degree. There are five of them.
Judgment is the first and most fundamental. When everything is possible, deciding what to work on becomes the job. Agents can execute; they can’t prioritize. Someone has to decide which part of the infinite backlog matters most, and that decision requires context, strategy, and accountability that agents don’t carry.
Planning is the sequencing problem. Even if you know what to work on, agents running in parallel can create dependencies and conflicts if they’re not sequenced correctly. What starts when? What blocks what? This is harder than it looks when you have dozens of concurrent workstreams.
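One way to make the sequencing problem concrete is to treat the backlog as a dependency graph and compute a valid execution order before any agent starts. A minimal sketch in Python using the standard library's graphlib — the task names and dependencies here are hypothetical, not from any real deployment:

```python
from graphlib import TopologicalSorter

# Hypothetical agent workstreams; each maps to the set of tasks
# that must finish before it can start.
dependencies = {
    "wire_up_crm": set(),
    "analytics_dashboard": {"wire_up_crm"},
    "content_pipeline": set(),
    "campaign_reports": {"analytics_dashboard", "content_pipeline"},
}

# static_order() yields tasks with every dependency ahead of its
# dependents, so parallel fleets never start a task whose inputs
# don't exist yet.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

With dozens of concurrent workstreams, the graph itself becomes the planning artifact: a cycle or a surprise dependency shows up as an error here, not as two agents deadlocked in production.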
Coordination is the organizational version of planning. When multiple people in multiple departments are each running their own agent fleets against their own slices of the infinite backlog, the outputs can conflict, duplicate, or undermine each other. Someone has to maintain legibility across the whole system.
Evaluation is the quality gate. Agents produce output. That output is not always correct. Someone — or some system — has to assess whether the work is actually good before it ships, gets sent to a customer, or gets built on top of. The naive assumption is that agents are reliable enough to skip this step. They’re not.
Absorption is the constraint nobody talks about. Even if you produce more, markets and customers can only consume so much. There’s a ceiling on how much content a market will read, how many sales touches a prospect will tolerate, how many product changes users can absorb. Infinite production doesn’t mean infinite consumption.
These five constraints are the shape of the new work. And they map almost directly onto the new roles being created.
The Roles Being Invented Right Now
Aaron Levie’s tweet on Thursday was the clearest public signal yet that these roles are moving from theory to org chart. He wrote that Box is “starting to hire and retrain for new agent engineering roles for internal functions to help get more powerful agents working on critical business processes.”
The description he gave is specific: an internal FTE whose job is to wire up systems like Box, Salesforce, and Workday; build secure, governed agents for internal workflows; and codify business processes. Critically, he noted that this person often won’t understand the business process well enough to work alone — they’ll need to work embedded with business teams. Which, he observed, may introduce yet another new role: “agent product management for internal processes.”
His framing: “It’s not about bringing automation to a job, but about bringing automation to a process.”
That distinction matters. Job-level automation is what everyone’s been worried about. Process-level automation is what’s actually happening, and it requires people who can think across functions, not just within them.
From the five constraints above, you can derive the full taxonomy of emerging roles:
Agent Ops Engineers own the infrastructure — keeping fleets running, managing token allocations, maintaining sandboxes and permissioned context. This is the constraint of evaluation and planning at the technical layer. If you’re building multi-agent workflows, you already know how much operational overhead this creates. MindStudio handles some of this orchestration — with 200+ models, 1,000+ integrations, and a visual builder for chaining agents across workflows — but someone still has to own the architecture decisions and monitor what’s running in production.
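At its smallest, the "managing token allocations" part of that job is a per-fleet ledger that refuses new work once a fleet exhausts its budget. A rough sketch — the fleet names and limits are invented for illustration, and this is not any product's API:

```python
from dataclasses import dataclass

@dataclass
class TokenBudget:
    """Per-fleet token ledger: record usage, refuse work past the cap."""
    limit: int
    used: int = 0

    def can_run(self, estimated_tokens: int) -> bool:
        return self.used + estimated_tokens <= self.limit

    def record(self, tokens: int) -> None:
        self.used += tokens

# Illustrative fleets with invented daily caps.
budgets = {
    "marketing_fleet": TokenBudget(limit=2_000_000),
    "sales_fleet": TokenBudget(limit=500_000),
}

def dispatch(fleet: str, estimated_tokens: int) -> bool:
    """Run the task only if the fleet's budget allows it."""
    budget = budgets[fleet]
    if not budget.can_run(estimated_tokens):
        return False  # over budget: escalate to the agent ops engineer
    budget.record(estimated_tokens)
    return True
```

The real version adds persistence, per-model pricing, and alerting, but the design decision is the same: make the budget an explicit object that someone owns, rather than a surprise on the invoice.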
Context Librarians address the judgment and evaluation constraints at the knowledge layer. Agents are only as good as the context they have access to. Someone has to curate what agents know, keep it current, and manage the complex permissioning that determines which agent can see which data. This is not a trivial job. In a large enterprise, the context management problem is as hard as the agent-building problem. For teams building more complex agent pipelines, the three-layer memory architecture described in the Claude Code source leak is a useful model for thinking about how agents should access and update context at different timescales.
Eval Engineers create quality gates at scale. Rather than assuming every person deploying agents will be equally good at assessing output quality, eval engineers build systematic evaluation pipelines — the infrastructure that catches errors before they propagate. This maps directly to the evaluation constraint. As multi-agent system architectures become more complex, the eval layer becomes load-bearing. The WAT framework — workflows, agents, tools — is a useful mental model here: eval engineers are essentially building the quality infrastructure that sits above all three layers.
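At its smallest, an eval pipeline is a set of named checks that every agent output must pass before it ships. A sketch of that quality gate, with placeholder checks standing in for real, domain-specific evals:

```python
from typing import Callable

# Named quality gates; each takes the agent's output, returns True on pass.
CHECKS: dict[str, Callable[[str], bool]] = {
    "non_empty": lambda out: bool(out.strip()),
    "no_placeholders": lambda out: "TODO" not in out
        and "lorem ipsum" not in out.lower(),
    "within_length_budget": lambda out: len(out) <= 5_000,
}

def evaluate(output: str) -> dict[str, bool]:
    """Run every gate and report which passed."""
    return {name: check(output) for name, check in CHECKS.items()}

def ships(output: str) -> bool:
    """The output leaves the building only if every gate passes."""
    return all(evaluate(output).values())
```

Production evals replace these string checks with model-graded rubrics and regression suites, but the architecture holds: named checks, run on everything, with per-check results you can trend over time.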
Coordination Architects design how everything stays legible across the organization. When marketing, product, sales, and engineering are all running agent fleets, someone has to maintain a map of what’s being built, where it overlaps, and where it conflicts. This is the coordination constraint made into a role. Teams that have experimented with running AI engineering teams in multi-agent configurations know how quickly coordination overhead compounds — the coordination architect role is the organizational answer to that technical problem.
Information Pipeline Owners route signals to the right places. In a world where agents are generating outputs continuously, the question of who sees what, when, and in what form becomes a full-time job. This is partly technical (data pipelines) and partly organizational (knowing which team needs which signal).
Experiment Portfolio Managers operate at the strategic layer — funding, scaling, merging, and killing agentic initiatives. This is the judgment constraint at the organizational level. Not every agent project that gets started should continue. Someone has to make portfolio-level decisions about which bets to press and which to cut.
Entrepreneur Coaches are the most unusual entry on this list, and possibly the most important. If agents make every job a startup, then the people doing that work need founder support — help with judgment, pacing, and the psychological experience of operating with too many options and too little certainty. This is new territory for HR and management alike.
What’s Actually Buried Here
The non-obvious implication of all this is that the new roles aren’t primarily technical.
The agent ops engineer and eval engineer are technical, yes. But context librarians, coordination architects, information pipeline owners, and entrepreneur coaches are organizational and strategic roles. They require deep knowledge of how the business works, not just how the technology works. That’s why Levie’s framing — “technical plus process people that can span multiple teams or functions” — is the right frame. The hard part isn’t building the agent. It’s knowing what the agent should be doing and whether it’s doing it well.
This also means the organizations best positioned to capture the infinite backlog are not necessarily the ones with the most technical talent. They’re the ones with the clearest processes, the best institutional knowledge, and the management structures capable of making fast portfolio-level decisions about emerging work.
There’s a parallel here in how production software gets built. When you’re building the internal tools and workflows that agents will operate against, the specification problem is as hard as the implementation problem. Remy takes this seriously: you write an annotated markdown spec as the source of truth, and the full-stack application — TypeScript backend, database, auth, deployment — gets compiled from it. The spec is what you own; the code is derived output. The same logic applies to agent workflows: the clearer your process documentation, the better your agents perform. Organizations that invest in writing rigorous specs before deploying agents will outperform those that treat spec-writing as optional overhead.
The organizations that will struggle are the ones that treat agent deployment as a purely technical project and skip the process work. You can’t wire up Salesforce and Workday effectively if you don’t understand the business processes those systems support. That’s exactly what Levie is saying when he notes that the agent engineer will often need to work embedded with business teams.
What to Do With This Before Next Quarter
If you’re building or managing AI systems inside an organization, the five constraints give you a diagnostic framework.
Start by asking which constraint is actually binding for you right now. Most teams that are struggling with agents aren’t struggling because the models are bad. They’re struggling because of judgment (no clear prioritization process), evaluation (no quality gates), or coordination (no visibility into what other teams are building). Identifying the binding constraint tells you which role to hire or develop first.
Second, audit your context infrastructure. The context librarian role sounds abstract until you realize that most agent failures in production trace back to agents operating on stale, incomplete, or incorrectly permissioned information. Before you scale your agent fleet, ask whether the knowledge those agents depend on is actually being maintained.
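That audit can start as a script that flags any knowledge source not updated within its freshness window. A sketch — the sources, timestamps, and windows are made up for illustration:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical knowledge sources: last-updated timestamp, plus how stale
# each is allowed to get before agents shouldn't trust it.
now = datetime(2026, 2, 1, tzinfo=timezone.utc)
sources = {
    "pricing_sheet": (datetime(2026, 1, 30, tzinfo=timezone.utc), timedelta(days=7)),
    "org_chart": (datetime(2025, 10, 1, tzinfo=timezone.utc), timedelta(days=30)),
    "api_docs": (datetime(2026, 1, 5, tzinfo=timezone.utc), timedelta(days=90)),
}

def stale_sources(sources, now):
    """Return the names of sources older than their freshness window."""
    return sorted(
        name for name, (updated, window) in sources.items()
        if now - updated > window
    )

print(stale_sources(sources, now))  # flags anything past its window
```

Run on a schedule and routed to an owner, this is the seed of the context librarian function: someone is now accountable for every stale entry the script surfaces.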
Third, take the coordination problem seriously before it becomes a crisis. The time to design cross-team visibility into agent deployments is before three different teams have built three overlapping versions of the same workflow. Coordination architects don’t need to be a formal hire immediately — but someone needs to own that function, even informally.
Finally, think about pacing infrastructure. Tang Yan’s observation about 4-5 usable hours versus 8-10 normal hours isn’t a complaint about agents — it’s a constraint to design around. If your organization is rewarding whoever stays up latest rather than whoever makes the best decisions, you’re optimizing for the wrong thing. The agent can keep running. The human judgment that directs it cannot.
The roles being invented right now are the organizational response to a genuine structural change in how work gets done. Most of them will look different in three years than they do today. But the five constraints — judgment, planning, coordination, evaluation, absorption — are durable. Whatever the roles end up being called, they’ll be solving for those five things.
Aaron Levie is right that this is the early innings. The teams that start designing for it now will be significantly better positioned than the ones that wait for the org chart to catch up with the technology.