
Claude Code Rate Limits Just Doubled: Every New API Limit After the Colossus 1 Deal

Tier 1 input tokens jumped from 30K to 500K/min. Here's every updated Claude Code and API rate limit after the Colossus 1 takeover.

MindStudio Team

The Numbers Changed Overnight — Here’s What You Actually Get Now

Claude Code’s 5-hour rate limit just doubled. API Tier 1 input tokens per minute jumped from 30,000 to 500,000. If you’ve been hitting walls mid-session and watching the “rate limit exceeded” message appear at the worst possible moment, those two numbers are the ones that matter most.

This isn’t a gradual rollout or a beta. Anthropic made these changes effective immediately, tied to a compute deal that gave them full use of xAI’s Colossus 1 data center — 220,000 Nvidia GPUs, mostly H100s, running at 300 megawatts of capacity in Memphis, Tennessee. The entire data center. Not a partial lease.

The practical question is: what does this mean for your actual Claude Code sessions, your API integrations, and your workflow? This post walks through every specific number that changed, which plans are affected, and what to watch for as the new limits roll out.


What Changed and Why It Matters to You

Before getting into the table of numbers, it helps to understand what was actually broken.

Anthropic had been quietly throttling Claude Code usage for months. There were peak-hour reductions — meaning your rate limit was lower during busy times of day, even if you were on a paid plan. There was a 5-hour rolling window that capped how many tokens you could burn in any given session. And the API tier limits were so conservative that serious production workloads kept hitting ceilings.


Dario Amodei explained the root cause at Anthropic’s developer event: “We planned for a world of 10x growth per year. In Q1 2026, we saw 80x annualized growth per year in revenue and usage.” That’s not a rounding error. The infrastructure simply wasn’t built for that trajectory.

The xAI deal changes the supply side of that equation immediately. Colossus 1 is online and running. The inference capacity is available now, not in six months when some other deal comes online.

Here’s what actually changed:

Claude Code subscription plans (Pro, Max, Team, seat-based Enterprise):

  • 5-hour rate limit: doubled
  • Peak-hour limit reduction: eliminated for Pro and Max accounts

API rate limits for Opus models:

| Tier | Input tokens/min (before) | Input tokens/min (after) |
|---|---|---|
| Tier 1 | 30,000 | 500,000 |
| Tier 2 | 450,000 | 2,000,000 |
| Tier 3 | 800,000 | 5,000,000 |
| Tier 4 | 2,000,000 | 10,000,000 |

Tier 1 went from 30K to 500K — that’s a 16x increase. Tier 4 went from 2M to 10M. Output token throughput increased between 2x and 10x depending on your tier.
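As a quick sanity check on those multipliers, the before/after figures from the table work out as follows (a throwaway sketch, using only the numbers quoted above):

```python
# Opus input tokens/min before and after, per the table above.
before = {"Tier 1": 30_000, "Tier 2": 450_000, "Tier 3": 800_000, "Tier 4": 2_000_000}
after = {"Tier 1": 500_000, "Tier 2": 2_000_000, "Tier 3": 5_000_000, "Tier 4": 10_000_000}

for tier in before:
    # Tier 1 comes out to roughly 16.7x; the higher tiers gain 4x-6x.
    print(f"{tier}: {after[tier] / before[tier]:.1f}x")
```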

These are not small adjustments. For Tier 1 users especially, this is the difference between a tool you can actually use for extended agentic tasks and one you’re constantly babysitting.


What You Need to Get the Full Benefit

You don’t need to do anything to receive these changes. They’re applied automatically based on your plan. But there are a few things worth knowing before you assume you’re getting the maximum benefit.

For Claude Code rate limit increases:

  • You need to be on Pro, Max, Team, or a seat-based Enterprise plan
  • Free plan users are not included in these changes
  • The 5-hour window is still a rolling window — it doubled in size, not eliminated

For API rate limit increases:

  • These apply specifically to Opus models
  • Your tier is determined by your API account history and spend — you don’t manually select it
  • If you’re not sure which tier you’re on, check the Anthropic API documentation for the current tier thresholds
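Rather than guessing, you can also read your effective limits off live API responses. A hedged sketch that pulls input-token limit info out of a response-headers mapping; the `anthropic-ratelimit-*` header names here are an assumption to verify against the current API documentation:

```python
def parse_rate_limit(headers: dict[str, str]) -> dict[str, int]:
    """Extract input-token rate-limit info from API response headers.

    The header names below are assumed, not guaranteed; check Anthropic's
    rate-limit documentation for the authoritative list.
    """
    keys = {
        "limit": "anthropic-ratelimit-input-tokens-limit",
        "remaining": "anthropic-ratelimit-input-tokens-remaining",
    }
    out = {}
    for name, header in keys.items():
        value = headers.get(header)
        if value is not None:
            out[name] = int(value)
    return out
```

Usage would look like `info = parse_rate_limit(response.headers)`, then throttling your own requests when `info.get("remaining", 0)` drops below a comfortable floor.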

One thing that did NOT change (yet): Weekly limits. Anthropic’s head of growth noted that only a small percentage of users hit weekly limits, while a much larger portion hit the 5-hour limit. So they fixed the 5-hour limit first. Weekly limits are still in place and will be revisited as more compute from the Colossus deal comes online.

If you’ve been managing your token usage carefully to avoid weekly limits, those strategies still apply. The 18 Claude Code token management hacks post covers techniques that remain relevant even with the doubled 5-hour window — especially for long-running agentic tasks where you’re burning tokens fast.


How to Actually Use the New Headroom

Doubling the 5-hour limit is only useful if you’re running tasks that actually need it. Here’s how to think about what’s now possible that wasn’t before.

Longer uninterrupted agentic sessions. The old 5-hour limit was tight enough that complex multi-file refactors or extended research tasks would hit the ceiling before completion. With the doubled window, you can now run tasks that genuinely take a few hours without planning around interruptions.


Less aggressive compaction. If you’ve been running /compact early and often to preserve context budget, you have more room to breathe now. That said, compacting at the right time still matters — running it too late causes context rot. The guide on using /compact to prevent context rot explains the 60% rule, which is still good practice even with more headroom.
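If you automate around context budgets, the 60% rule is easy to encode. A sketch, assuming you can estimate tokens used and know the model's context window (the numbers in the test of this idea are placeholders, not published figures):

```python
def should_compact(tokens_used: int, context_window: int,
                   threshold: float = 0.60) -> bool:
    """Return True once context usage crosses the compaction threshold.

    Compacting around 60% of the window (the rule of thumb cited above)
    leaves headroom for the summary itself; compacting much later risks
    context rot.
    """
    return tokens_used / context_window >= threshold
```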

API integrations that were previously throttled. If you’re building on top of the Claude API and hitting Tier 1 limits, the jump from 30K to 500K input tokens per minute is substantial. Workflows that required rate-limit-aware retry logic and backoff may now run cleanly without it.
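The kind of backoff wrapper Tier 1 workloads used to need looks roughly like this. `call` stands in for whatever request function you use, and `RateLimitError` for whatever your client raises on an HTTP 429; both names are hypothetical:

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for the error your API client raises on HTTP 429."""


def with_backoff(call, max_retries: int = 5, base_delay: float = 1.0):
    """Retry `call` on rate-limit errors with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            # Full jitter: sleep anywhere from 0 to base * 2^attempt seconds,
            # so retries from many clients don't synchronize.
            time.sleep(random.uniform(0, base_delay * 2 ** attempt))
```

Even with the new headroom, keeping a wrapper like this in production code is cheap insurance; it simply stops firing when the limits are no longer being hit.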

Parallel workloads. With the higher throughput, running multiple Claude Code instances on separate branches becomes more practical. If you’ve been using Git worktrees to parallelize feature development, the new limits reduce the chance that parallel sessions compete for the same rate limit budget. The Claude Code Git worktrees guide covers the setup if you haven’t tried this pattern yet.

One thing worth being explicit about: the peak-hour reduction being eliminated for Pro and Max accounts is actually significant. Previously, your effective rate limit varied depending on when you were working. If you were in a timezone where your working hours overlapped with peak US usage, you were getting a worse deal than someone working off-hours. That asymmetry is gone now for Pro and Max.


The Compute Stack Behind This (and Why It’s Stable)

It’s reasonable to wonder whether these limits will hold or whether Anthropic will quietly walk them back as demand increases.

The Colossus 1 deal gives Anthropic over 300 megawatts of additional capacity to deploy within the month. That’s not a future commitment — the data center is already running. xAI had moved its own training to Colossus 2 (which contains around 550,000 Blackwell GPUs), so Colossus 1 was sitting largely idle. Anthropic is using 100% of it.

Beyond Colossus 1, Anthropic has several other compute deals in various stages:

  • Up to 5 gigawatts from Amazon AWS, with nearly 1 gigawatt of new capacity expected by end of 2026
  • A 5 gigawatt agreement with Google and Broadcom, coming online in 2027
  • A $30 billion Azure capacity deal with Microsoft and Nvidia

The Colossus 1 deal is the one that’s live right now and the reason these rate limit increases are happening today rather than in 2027. The other deals are why Anthropic’s head of growth said there’s more to come on limits — the weekly caps being the obvious next target.

For builders who’ve been routing around Anthropic’s limits using OpenRouter or local models, the calculus may shift. If you’ve been using OpenRouter free models with Claude Code to cut costs, that’s still a valid strategy for cost optimization — but the “I can’t get enough tokens from Anthropic” reason for doing it is less compelling now.


What Might Still Break

Even with the increased limits, there are a few real failure modes to watch for.

You’re on the wrong plan. The 5-hour limit doubling applies to Pro, Max, Team, and seat-based Enterprise. If you’re on a usage-based API plan without a subscription, you’re not getting the subscription-side changes — only the API tier increases.


You’re hitting weekly limits, not 5-hour limits. If your sessions were getting cut off and you assumed it was the 5-hour window, it might actually be the weekly cap. Those haven’t changed. Check your usage dashboard to see which limit you’re actually hitting.

Your API tier is lower than you think. Tier assignment is based on account history. New API accounts start at Tier 1. If you recently created a new API key or organization, you’re at Tier 1 — which is still much better than before (30K → 500K), but not at the Tier 4 ceiling.

Third-party harnesses may not benefit yet. Anthropic’s announcement specifically called out that these improvements are aimed at “our most dedicated customers” using Claude Code and the direct API. If you’re using Claude through a third-party integration, the limits you experience depend on how that integration is structured. Some harnesses pool API capacity across users, which means the per-user effective limit may not reflect the new tier maximums.

The Opus-specific caveat. The API rate limit increases are specifically for Opus models. If your workflow uses Sonnet or Haiku, check the current rate limit documentation — those may have different limits that weren’t part of this announcement.

For teams building agent workflows on top of Claude, platforms like MindStudio handle the orchestration layer — 200+ models, 1,000+ integrations, and a visual builder for chaining agents — which means you can route to different models or tiers based on task complexity without rewriting your integration logic.


What to Do With the Extra Capacity

If you’ve been artificially constraining your Claude Code usage to avoid hitting limits, now is a good time to revisit those constraints.

A few specific things worth trying with the new headroom:

Run the effort level higher. If you’ve been defaulting to lower effort levels to conserve tokens, the Claude Code effort levels guide explains when max effort actually pays off. With more room in your 5-hour window, using high or max effort on complex tasks is less likely to burn your budget before the task completes.

Try longer planning sessions with Opus. The Opus plan mode — where you plan with Opus and execute with Sonnet — becomes more useful when the Opus rate limits aren’t a bottleneck. The Opus plan mode token-saving guide covers how to set this up if you haven’t already.

Build the thing you kept deferring. If you had a project idea that required extended autonomous work and you kept putting it off because you’d hit limits before finishing, the doubled 5-hour window is the practical unlock. For teams thinking about going from spec to deployed application, Remy takes a different approach to that problem: you write an annotated markdown spec, and it compiles a complete full-stack application — TypeScript backend, SQLite database, auth, deployment — treating the spec as the source of truth rather than the generated code.

The rate limit increases are real and they’re live. The compute deal that made them possible is also real — 220,000 H100s don’t lie. What happens next depends on whether Anthropic can keep pace with demand as the other compute deals come online. But for right now, the ceiling is higher than it was last week, and that’s worth knowing precisely.

Presented by MindStudio
