How Anthropic's Harness Detection Actually Works — and Why It Triggered a $200 Overcharge

The String That Cost $200: How Anthropic’s Harness Detection Actually Works

Anthropic pulls git status into Claude Code’s system prompt and scans for strings like hermes.md and OpenClaw in recent commit messages, then routes you to API billing instead of your subscription. That’s the mechanism. One commit message. One string match. $200.98 in unexpected charges.

You probably heard about this incident. On Patel was on the Claude Max 20x plan — the $200/month tier — and got hit with extra usage charges out of nowhere. His dashboard showed 13% weekly usage consumed, 0% current session, 86% of his plan sitting untouched. The charges came anyway. He did what any good engineer would do: binary searched his repos and commits until he found the trigger. The culprit was the string hermes.md in a recent git commit message. Not an active Hermes session. Not API calls routed through a third-party harness. A string. In a commit message.

Theo Brown ran a cleaner experiment. Empty repo. Just OpenClaw in a JSON blob. Claude Code either refused requests or billed extra. That’s it. That’s the whole test.

This post is about the mechanism itself — what Anthropic is actually doing, why they built it this way, and what it means for you if you’re building anything that touches Claude Code.

What Anthropic Is Actually Doing Under the Hood

The core behavior, confirmed by Anthropic’s own Tariq in a public response, is: “a bug with the third-party harness detection and how we pull git status into the system prompt.”

That sentence is doing a lot of work. Unpack it.

Claude Code, when it starts a session, pulls environmental context into the system prompt. This is standard practice for coding agents — you want the model to know what repo it’s in, what branch, recent commits, maybe the current diff. It’s useful context. The Claude Code source code leak revealed a three-layer memory architecture that includes exactly this kind of environmental ingestion, where files like memory.md act as pointer indexes into the broader context.

The problem is what Anthropic layered on top of that context ingestion: a detection pass that looks for keywords associated with third-party harnesses. Hermes. OpenClaw. Probably others we don’t know about yet. If those strings appear in the git status output — which includes recent commit messages — the system routes the session to API billing rather than treating it as a subscription session.

The intent is clear enough. Boris Cherny from Anthropic said it directly: “subscriptions weren’t built for the usage patterns of these third-party tools.” Third-party harnesses like Hermes and OpenClaw can run Claude Code in loops, burning through tokens at rates that make the flat-fee subscription economics completely untenable. Anthropic is trying to protect their margins. That’s a legitimate business problem.

The implementation, though, is string matching on commit messages. That’s where it falls apart.

Why String Matching on Git History Is the Wrong Layer

Here’s the thing about this class of bug. When you’re trying to detect whether a user is running a third-party harness, the right signal is the actual API call pattern — request frequency, token consumption rate, the structure of the system prompt being sent. Those signals are hard to spoof and directly measure the thing you care about.

Commit message scanning is none of those things. It’s a proxy for a proxy. The reasoning chain is: “if someone has hermes.md in their commit history, they probably use Hermes, and Hermes users probably consume more tokens.” Every step of that chain is lossy. You’re punishing people for having worked on a project that touched Hermes at some point, or for naming a file something that pattern-matches, or for committing a config blob that mentions a competitor tool.

Theo Brown’s empty repo test makes this concrete. He wasn’t using OpenClaw. He wasn’t even running a real project. He just had the string in a JSON blob in a commit. The detection fired anyway.

The deeper issue is what Theo said in his reply to On Patel’s post: “there’s a certain class of bugs that suggests the thing you’re trying to do is a bad idea.” He’s right. The bug isn’t just the false positive rate. The bug is that Anthropic wrote code to scan your commit history for competitor tool names and change your billing based on what it finds. That’s the thing people are actually upset about. The refund is nice. The architecture is the problem.

The Business Pressure Behind This

✗ VIBE-CODED APP

Tangled. Half-built. Brittle.

✓ AN APP, MANAGED BY REMY

UIReact + Tailwind✓

APIValidated routes✓

DBPostgres + auth✓

DEPLOYProduction-ready✓

Architected. End to end.

Built like a system. Not vibe-coded.

Remy manages the project — every layer architected, not stitched together at the last second.

To be fair to Anthropic, the pressure they’re under is real. The agentic era has completely broken the economics of flat-fee subscriptions. When Claude Code is running multi-hour autonomous coding sessions, the token consumption is orders of magnitude higher than what the subscription pricing assumed. GitHub Copilot just announced a shift to consumption-based fees with roughly a 6x price hike for frontier models. Replit made this move earlier and took significant backlash for it. The whole industry is repricing.

Anthropic’s specific problem is that they’re also dealing with a compute shortage. As Anthropic’s compute constraints have tightened, they’ve been metering supply during peak hours, and users have been hitting limits faster than expected. When you combine constrained compute with token-hungry third-party harnesses running on flat-fee subscriptions, you get a situation where a small number of power users can genuinely degrade service for everyone else.

So the motivation to detect and route third-party harness usage is understandable. The execution is what failed.

What the System Prompt Ingestion Actually Looks Like

When Claude Code initializes a session, it builds a system prompt that includes environmental context. Based on what we know from the source code and from Tariq’s confirmation, this includes git status output — which means recent commit messages, branch names, and potentially staged changes.

The detection logic then runs over this ingested context. It’s looking for strings that indicate you’re running inside a third-party harness rather than native Claude Code. The specific strings we know about: hermes.md (associated with the Hermes agent framework) and OpenClaw (a competing agent harness that Jensen Huang has talked about publicly as a major use case for Nvidia’s models).

When a match fires, the session gets routed differently. Instead of drawing from your subscription quota, it bills against the API. On Patel’s case shows what that looks like in practice: his dashboard showed 86% of his plan unused, but he still got charged $200.98 in extra usage because the session was routed to API billing, not subscription billing.

Anthropic support’s initial response made it worse: “we are unable to issue compensation for degraded service or technical errors that result in incorrect billing routing.” They acknowledged the bug three times, called it an “authentication routing issue,” thanked On Patel for finding it, and then declined to refund. That response only changed after On Patel’s post hit 1.44 million views and Theo Brown’s post hit 1 million views.

What This Means If You’re Building on Claude Code

If you’re building anything that uses Claude Code — or if you’re building a harness that wraps Claude Code — there are a few practical things to understand.

First, your git history is context. Anything in recent commit messages is potentially ingested into the system prompt. This isn’t just about harness detection; it’s about what information you’re inadvertently feeding the model. If you’re working in a repo that has touched Hermes, OpenClaw, or any other tool that Anthropic might be scanning for, you now know that history can affect your billing routing.

Second, the detection is keyword-based, not behavioral. This means it can fire on false positives — and it means it can potentially be gamed. Neither of those is a good property for a billing system.

Day one: idea. Day one: app.

DAY

DELIVERED

Not a sprint plan. Not a quarterly OKR. A finished product by end of day.

Third, if you’re building a legitimate third-party integration that wraps Claude Code, you’re in a genuinely ambiguous position. Anthropic’s stated position is that subscriptions weren’t built for third-party harness usage patterns. But the detection mechanism doesn’t distinguish between “running a token-hungry autonomous agent loop” and “worked on a project that mentioned a harness name six commits ago.” If you’re building agent infrastructure on top of Claude Code, you should be routing through the API directly anyway — the subscription tier isn’t the right product for that use case, and the detection system will eventually find you.

For teams building multi-model agent workflows, platforms like MindStudio handle this routing explicitly: you specify which models and tiers your agents use, and the billing is tied to actual API consumption rather than subscription heuristics. That’s a cleaner model than trying to infer usage patterns from commit history.

The Refund and What It Signals

Tariq’s public response was: “Sorry, this was a bug with the third-party harness detection and how we pull git status into the system prompt. We’re reaching out to affected users and giving them a refund plus another month’s worth of credit.”

That’s the right resolution. But the sequence matters: the bug existed, support refused to refund, the posts went viral (1.44M and 1M views respectively), and then Anthropic reversed course. That’s not a great look for a company that positions itself as the ethical AI lab.

The more interesting signal is in Tariq’s framing: “bug with the third-party harness detection.” He’s not saying the detection itself is wrong. He’s saying this particular implementation had a bug. Which means the detection is staying. They’re going to fix the false positive rate and keep scanning.

That’s worth knowing if you’re building on this platform. Anthropic is actively trying to identify and route third-party harness usage. The mechanism is git history ingestion plus keyword matching. The fix will presumably make the keyword matching more precise, or add additional signals, or both. But the intent — to route third-party harness sessions to API billing — is the product direction, not an accident.

The Deeper Architecture Question

There’s a more interesting question underneath all of this, which is: what’s the right way to build a billing system for an AI coding agent?

The current approach — flat subscription with a usage cap, plus detection logic to route heavy users to API billing — is clearly not it. The detection logic is fragile, the false positive rate is nonzero, and the user experience when it fires is terrible (unexpected charges, no notification, support initially refusing refunds).

The honest answer is probably consumption-based billing with a subscription credit, which is where GitHub Copilot just moved and where most of the industry is heading. You get a monthly allotment of credits. Heavy usage draws down credits faster. When you run out, you buy more. The economics are transparent and the billing is tied to actual consumption, not inferred usage patterns.

The challenge for Anthropic is that this transition is hard to make without alienating the power users who are currently getting a good deal on the subscription tier. Those power users are also the ones most likely to be building on Claude Code, writing about it publicly, and influencing what the broader developer community uses. On Patel and Theo Brown’s posts getting 1M+ views each is a direct consequence of that dynamic.

REMY IS NOT

✕a coding agent
✕no-code
✕vibe coding
✕a faster Cursor

IT IS

✓a general contractor for software

The one that tells the coding agents what to build.

If you’re thinking about building a production application that depends on Claude Code’s capabilities, the Claude Code source code leak’s hidden features are worth understanding — particularly the memory and context management patterns, since those are the same mechanisms that are now being used for billing detection. Understanding how the system prompt gets constructed tells you a lot about what signals Anthropic has access to.

For spec-driven development workflows where you’re generating code from structured requirements, tools like Remy take a different approach to the whole stack: you write an annotated markdown spec, and it compiles into a complete TypeScript application — backend, database, auth, deployment. The source of truth is the spec, not the generated code. That’s a different abstraction layer than Claude Code, and it sidesteps the harness detection problem entirely because you’re not running an agent loop against a subscription.

What to Watch For

Anthropic will fix the false positive rate. The detection will get more sophisticated — probably moving from pure string matching toward behavioral signals like request frequency and token consumption patterns. That’s the right direction technically.

But the underlying tension doesn’t go away. Subscriptions aren’t built for agentic usage patterns. The economics of flat-fee AI subscriptions are breaking down across the whole industry. Anthropic is just further along in that reckoning than most, partly because Claude Code became the inflection point for agentic usage faster than anyone expected.

The commit message scanning is a symptom of a company trying to solve a billing problem with a detection heuristic when what they actually need is a different pricing model. The heuristic will get better. The pricing model will eventually change. In the meantime, if you have hermes.md anywhere in your recent git history, you now know what to expect.

The Claude Code buddy feature and other easter eggs suggest a team that genuinely cares about the developer experience. The billing detection incident suggests a team under real resource pressure making pragmatic decisions that didn’t hold up to scrutiny. Both things can be true. The question is which one shapes the product direction going forward.