Skip to main content
MindStudio
Pricing
Blog About
My Workspace

Anthropic's OpenClaw and Hermes Detection Controversy: 4 Things Every Claude Max User Needs to Know

A $200 overcharge. A 1.44M-view viral post. An empty repo test. Four things Claude Max users need to know about Anthropic's harness detection policy.

MindStudio Team RSS
Anthropic's OpenClaw and Hermes Detection Controversy: 4 Things Every Claude Max User Needs to Know

A $200 Charge, an Empty Repo, and 2.4 Million Views Later

Anthropic quietly built a system to detect when Claude Code users were running third-party agent harnesses — and charged them API rates instead of subscription rates when it found them. You probably didn’t know that until last week, when two posts went viral and forced the company to respond.

On Patel posted to Reddit about being charged $200.98 in extra usage fees on his Claude Max 20x plan — the $200-per-month tier — despite his dashboard showing only 13% of his weekly usage consumed. Theo Brown ran a different kind of test: an empty repository with the string “OpenClaw” buried in a JSON blob. Claude Code refused his requests or billed extra. Both posts exploded. Patel’s hit 1.44 million views. Brown’s crossed 1 million before Anthropic said anything publicly.

Here are four things every Claude Max user needs to understand about what happened.


The Trigger Was a String in a Git Commit Message

Patel didn’t figure out what caused the overcharge immediately. He binary-searched his repos and commits until he isolated the culprit: the string hermes.md in a recent git commit message. He wasn’t using Hermes — the third-party Claude Code framework — at all. The text just happened to appear in his commit history.

REMY IS NOT
  • a coding agent
  • no-code
  • vibe coding
  • a faster Cursor
IT IS
a general contractor for software

The one that tells the coding agents what to build.

Anthropic, it turns out, was pulling git status into Claude Code’s system prompt and scanning it for keywords associated with third-party harnesses like Hermes and OpenClaw. When those strings appeared, the system routed users from their subscription plan to API billing — without notification, without confirmation, and apparently without checking whether they were actually using those tools.

The detection logic was blunt. A commit message referencing a file named hermes.md was enough. Theo Brown’s test made this concrete: he created an empty repo, added “OpenClaw” to a JSON blob, and watched Claude Code either refuse requests or charge extra. No actual harness usage required.

This is the part that made people angry. Not just that there was a bug, but that the underlying behavior — scanning your code for keywords and penalizing you for what it finds — was apparently intentional policy, with the bug being only in how broadly the detection cast its net.


Anthropic Support Refused to Refund the Money

When Patel reported the issue, Anthropic support acknowledged the bug three times. They called it an “authentication routing issue.” They thanked him for finding it. Then they declined to issue a refund.

The exact language from their response: “I sincerely apologize for the disruption you experienced with the billing routing issue. We take service reliability very seriously. However, I need to let you know that we are unable to issue compensation for degraded service or technical errors that result in incorrect billing routing.”

Read that again. Anthropic’s support team confirmed it was a technical error, confirmed it caused incorrect billing, and then said that incorrect billing caused by technical errors is not eligible for compensation. That’s a policy position, not a one-off mistake by a support agent.

This is where the story shifted from “billing bug” to something more pointed. The question stopped being whether Anthropic had a bug and started being whether Anthropic’s actual policy — detecting harness usage and routing to API billing — was something users had ever agreed to or been told about. The answer, based on everything that came out, appears to be no.


The Public Response Only Came After 2.4 Million Views

Tariq from Anthropic eventually posted publicly: “Sorry, this was a bug with the third-party harness detection and how we pull git status into the system prompt. We’re reaching out to affected users and giving them a refund plus another month’s worth of credit.”

That’s a reasonable resolution. Refund plus a month of credit is more than the minimum. But the timeline matters here. Patel’s post hit 1.44 million views. Brown’s hit 1 million. The combined reach was roughly 2.4 million impressions before Anthropic’s public acknowledgment appeared. The support team had already told Patel no refund was coming.

It’s hard to read that sequence as anything other than: the policy held until the optics became untenable. Theo Brown put it plainly in a reply to Patel’s post: “There’s a certain class of bugs that suggests the thing you’re trying to do is a bad idea.” His point wasn’t subtle. If your code contains logic that scans for keywords like hermes.md and OpenClaw and routes users to different billing tiers based on what it finds, the bug isn’t just in the detection — it’s in the design.

Day one: idea. Day one: app.

DAY
1
DELIVERED

Not a sprint plan. Not a quarterly OKR. A finished product by end of day.

Boris Cherny, an Anthropic researcher, had offered some context a few weeks earlier when the company made related changes to how Claude Code handled third-party tools: “We’ve been working hard to meet the increase in demand for Claude, and our subscriptions weren’t built for the usage patterns of these third-party tools.” That framing is honest. It also confirms that the harness detection wasn’t incidental — it was a response to a real business problem.


The Broader Context: Anthropic Is Under Compute Pressure

This incident didn’t happen in a vacuum. Anthropic has been visibly straining under the weight of agentic usage for months. Claude Code, in particular, drove a spike in token consumption that the company’s infrastructure wasn’t fully prepared for. Anthropic’s compute shortage has been well-documented — the company has been metering supply during peak hours, and users have reported hitting limits far faster than their dashboards suggest they should.

The harness detection policy fits into that context. Third-party tools like Hermes and OpenClaw tend to run longer, more token-intensive sessions than Claude Code’s native interface. From Anthropic’s perspective, a $200/month subscriber running multi-hour autonomous coding sessions through a third-party harness is consuming resources that the subscription pricing never accounted for. Boris Cherny said as much: subscriptions weren’t built for those usage patterns.

That’s a legitimate problem. The solution Anthropic chose — scanning git commit messages for harness-related strings and silently rerouting billing — is where things went wrong. The detection was opaque, the rerouting was silent, and when a user got caught in a false positive, the support response was to deny the refund.

For anyone building Claude Code-based agentic workflows, this is a real operational risk. Your commit history, your file names, your JSON blobs — any of these could contain strings that trigger detection logic you didn’t know existed. If you’re using open-source routing approaches to reduce Claude Code costs, you’re probably already thinking about this. But even users who aren’t routing around Anthropic’s infrastructure can get caught if their repos happen to reference the wrong tools.

Platforms like MindStudio handle model orchestration across 200+ models and 1,000+ integrations through a visual builder — which means the underlying routing decisions are explicit and auditable rather than embedded in opaque system prompt logic that users can’t inspect.


What This Actually Revealed About the Subscription Model

The deeper issue here isn’t the bug. It’s that Anthropic’s subscription tiers were priced for a usage pattern that no longer describes how power users actually use Claude.

The Claude Max 20x plan at $200/month was designed around a certain volume of usage. Agentic coding sessions — especially through third-party harnesses that run long, multi-step tasks across entire repositories — can consume tokens at a rate that blows past those assumptions in a single session. Patel’s case is illustrative: his dashboard showed 13% weekly usage consumed, 0% current session usage, and 86% of his plan untouched. Then he got a $200.98 overage charge. The dashboard and the billing engine were apparently measuring different things.

VIBE-CODED APP
Tangled. Half-built. Brittle.
AN APP, MANAGED BY REMY
UIReact + Tailwind
APIValidated routes
DBPostgres + auth
DEPLOYProduction-ready
Architected. End to end.

Built like a system. Not vibe-coded.

Remy manages the project — every layer architected, not stitched together at the last second.

This is the same dynamic playing out across the industry right now. GitHub Copilot just announced a shift to consumption-based fees, with Claude Opus 4.7’s multiplier jumping from 7.5x to 27x. The flat-fee model that made sense for chat-style AI usage doesn’t hold up when agents are running for hours at a time. Anthropic’s attempt to manage this by detecting and rerouting harness users was a workaround for a pricing model that needed to change.

If you’re building production systems that depend on Claude, the Claude Code source code leak from earlier this year offered some insight into how the system prompt is constructed — including how external context like git status gets pulled in. Understanding that architecture matters now more than it did before, because it turns out that context can affect your bill.

There’s a parallel here for anyone building spec-driven applications. Tools like Remy take a different approach to the source-of-truth problem: you write an annotated markdown spec, and the full-stack app — TypeScript backend, SQLite database, auth, deployment — gets compiled from it. The spec is explicit and inspectable; nothing is hidden in system prompt logic you can’t see. That kind of transparency is increasingly valuable when the alternative involves billing surprises triggered by strings in your commit history.


The One Thing Anthropic Got Right

The refund-plus-credit resolution was the correct call. It came late, and it came because of public pressure, but it came. Tariq’s public acknowledgment was specific — he named the bug, named the mechanism (git status in the system prompt), and described the remediation. That’s more transparency than most companies offer when a billing bug goes viral.

What Anthropic hasn’t done is explain what the harness detection policy looks like going forward. The bug was in how broadly the detection triggered. The underlying policy — routing third-party harness users to API billing — hasn’t been publicly walked back. Users who are legitimately running Hermes or OpenClaw presumably still get routed to API rates. The question is whether they’re told that’s happening, and whether the detection is accurate enough that false positives like Patel’s don’t recur.

Theo Brown’s framing deserves to be the lasting takeaway from this episode. When a bug reveals that your system is scanning user code for keywords and making billing decisions based on what it finds, the fix isn’t just better keyword matching. It’s reconsidering whether that’s the right architecture at all. The fact that a string in a commit message — hermes.md, in a file that wasn’t even being used — could silently double a user’s monthly bill suggests the detection logic was never robust enough to be trusted with billing consequences.

For Claude Max users: check your commit history, check your JSON blobs, and check your dashboard against your actual charges. The gap between what the dashboard shows and what you’re billed can apparently be significant, and the support team’s first instinct, at least in this case, was to say that’s not their problem.

Presented by MindStudio

No spam. Unsubscribe anytime.