
Anthropic Confirms: Claude Code Was Scanning Git Commits for OpenClaw and Hermes Keywords

Anthropic's Tariq confirmed the billing bug was 'third-party harness detection' pulling git status into the system prompt.

MindStudio Team

What Anthropic’s Git Commit Scanning Actually Confirmed

Anthropic’s own engineer Tariq publicly confirmed it: the Claude Code billing incident was “a bug with the third-party harness detection and how we pull git status into the system prompt.” That sentence is worth reading twice. It confirms that Claude Code was actively reading your git history, scanning it for competitor harness names, and using that signal to change how it billed or served you.

This post is about what that confirmation actually means technically — what was happening under the hood, why it’s architecturally significant, and what you should know if you’re building workflows on top of Claude Code or any agentic coding tool.

The incident itself is documented: a user on the Claude Max $200/month plan was charged an additional $200.98 in overage despite their dashboard showing 86% of their plan untouched. After binary-searching their repos and commits, they traced the trigger to the string hermes.md in a recent git commit message. Anthropic support acknowledged the bug three times, called it an “authentication routing issue,” and then refused to issue a refund — until the story went viral with 1.4 million views from On Patel and 1 million views from Theo Brown.

The refund came. The explanation came. But the explanation is the interesting part.


What “Pulling Git Status Into the System Prompt” Actually Means


If you’ve used Claude Code, you know it operates with a lot of context about your project. It reads your file tree, your open files, your recent changes. That’s expected and useful — it’s how it gives you relevant suggestions.

What’s less obvious is that Claude Code was also running something like git status or git log and injecting that output directly into the system prompt. The system prompt is the hidden preamble that shapes how the model behaves before you type a single word. It’s not the conversation — it’s the instructions the model receives before the conversation starts.

So when Tariq says “how we pull git status into the system prompt,” he’s describing a pipeline that looks roughly like this:

  1. You open Claude Code in a project directory
  2. Claude Code runs git commands to gather context about your repo state
  3. That output — including commit messages — gets serialized into the system prompt
  4. The model (or some pre-model routing layer) reads that system prompt
  5. If the system prompt contains certain strings, behavior changes
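
Under those assumptions, the pipeline might look something like this minimal Python sketch. Every function name here is invented, the exact git commands are guesses, and the real detection may live in a pre-model routing layer rather than in the prompt text itself:

```python
import subprocess

HARNESS_KEYWORDS = ("openclaw", "hermes")  # strings reported to trigger detection

def gather_git_context(repo_dir: str) -> str:
    """Step 2: run git to collect repo state, as Claude Code reportedly does."""
    out = []
    for cmd in (["git", "status", "--short"], ["git", "log", "--oneline", "-5"]):
        try:
            out.append(subprocess.run(cmd, cwd=repo_dir, capture_output=True,
                                      text=True, timeout=5).stdout)
        except OSError:
            pass
    return "\n".join(out)

def build_system_prompt(git_context: str) -> str:
    """Step 3: serialize git output into the hidden preamble."""
    return f"<project_context>\n{git_context}\n</project_context>"

def detect_third_party_harness(system_prompt: str) -> bool:
    """Step 5: naive substring match -- the fragile part."""
    lowered = system_prompt.lower()
    return any(kw in lowered for kw in HARNESS_KEYWORDS)
```

A commit line like "abc123 added hermes.md to project docs" flows through steps 2 and 3 untouched, and step 5 fires on it exactly as it would on a real harness declaration.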

The strings that triggered the behavior change were competitor harness names: OpenClaw and Hermes. Theo Brown’s tweet made this concrete: “if you have a recent commit that mentions OpenClaw in a JSON blob, Claude Code will either refuse your request or bill you extra money.”

This is keyword scanning. It’s not semantic understanding. It’s string matching on git commit messages that happened to end up in a context window.


Why Third-Party Harness Detection Was There in the First Place

The “third-party harness detection” Tariq mentioned isn’t an accident. It’s a deliberate feature that went wrong.

Claude Code has a legitimate reason to know whether it’s being called directly by a human or being orchestrated by another tool. When you’re running Claude Code inside an agent harness like Hermes, or another framework that drives Claude Code programmatically, the usage patterns are different. Automated agents can generate far more API calls than a human typing at a terminal. Pricing tiers and rate limits that make sense for human-paced usage don’t necessarily make sense for machine-paced usage.

So Anthropic built detection logic to identify when Claude Code was being used inside a third-party harness. The intent was probably to route those sessions to different billing or rate-limit buckets. That’s a reasonable product decision.

The bug was in the detection method. Instead of detecting actual runtime orchestration signals (like call patterns, headers, or explicit harness declarations), the detection was reading git commit messages for harness names. That’s a fragile proxy. A commit message that says “added hermes.md to project docs” is not evidence that you’re running Claude Code inside Hermes. It’s just a string.
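
To see why the proxy is fragile, compare it with an explicit runtime declaration. This is a hypothetical sketch, not Anthropic’s actual detection code; CLAUDE_HARNESS is an invented variable name standing in for any explicit orchestration signal:

```python
def harness_via_commit_grep(git_log: str) -> bool:
    """Fragile proxy: substring match on human-written commit text.
    'added hermes.md to project docs' fires even though no harness is running."""
    lowered = git_log.lower()
    return "hermes" in lowered or "openclaw" in lowered

def harness_via_explicit_signal(env: dict) -> bool:
    """Sturdier signal: the orchestrator declares itself at runtime.
    CLAUDE_HARNESS is a hypothetical variable name, not a documented one."""
    return bool(env.get("CLAUDE_HARNESS"))
```

The first function can be triggered by documentation habits; the second only fires when an orchestrator actively sets the signal, which is the behavior the detection was supposed to find.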

The Claude Code source code leak that surfaced earlier this year gave researchers a window into how Claude Code constructs its context. The system prompt architecture is more complex than most users realize — it layers project context, memory files, tool outputs, and environmental signals into a single large prompt before any user message arrives. Git status was one of those environmental signals. The problem is that environmental signals pulled from developer artifacts (commit messages, file names, directory structures) are full of incidental strings that have nothing to do with runtime behavior.



The Specific Failure Chain

Here’s the sequence that produced the $200.98 charge:

  1. User commits a file called hermes.md — a documentation file, not a harness configuration
  2. Claude Code runs git log or git status as part of its context-gathering step
  3. The string hermes.md appears in the git output
  4. The harness detection logic matches on hermes as a keyword
  5. The session gets routed to a different billing tier — one that charges overage on top of the $200/month plan
  6. The user’s dashboard still shows 86% of the plan remaining because the overage is charged to a separate bucket
  7. Anthropic support acknowledges the bug but initially refuses to refund, citing a policy of not compensating for “degraded service or technical errors that result in incorrect billing routing”
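
Steps 5 and 6 explain the confusing dashboard. A toy model of two separate billing buckets, using the figures from the incident ($200/month plan, 14% used, $200.98 overage), shows how both readings can be true at once; the accounting structure here is assumed, not documented:

```python
from dataclasses import dataclass

@dataclass
class Account:
    plan_allowance: float = 100.0   # plan usage tracked as a percentage
    plan_used_pct: float = 0.0
    overage_bucket: float = 0.0     # dollars, billed separately from the plan

    def record_usage(self, pct: float, misrouted: bool = False, cost: float = 0.0):
        if misrouted:
            # Harness detection fired: the session bypasses the plan meter entirely
            self.overage_bucket += cost
        else:
            self.plan_used_pct += pct

acct = Account()
acct.record_usage(pct=14.0)                               # normal sessions
acct.record_usage(pct=0.0, misrouted=True, cost=200.98)   # the 'hermes.md' session
# Dashboard reads 86% remaining while $200.98 accrues in a separate bucket
```

Nothing in the plan meter ever reflects the misrouted session, which is why the dashboard looked healthy right up until the overage charge landed.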

That last point is worth sitting with. The support response wasn’t “we don’t see a bug.” It was “we see the bug, we acknowledge the bug, and our policy is that bugs don’t entitle you to a refund.” That policy only changed after the posts went viral.


What This Tells You About Agentic Context Architecture

The deeper issue here isn’t billing. It’s what this incident reveals about how agentic tools construct their operating context.

Claude Code, like most agentic coding tools, builds a rich context window before it does anything. That context includes things you’d expect (your code, your instructions) and things you might not expect (git history, environment variables, file metadata). The three-layer memory architecture revealed in the Claude Code source leak shows how layered this context actually is — there’s project-level memory, session-level memory, and tool-generated context all getting merged together.
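
A rough sketch of that layering, with invented section names, would look like this: separate memory layers and tool outputs get concatenated into one prompt, and any incidental string in a tool’s output rides along:

```python
def assemble_system_prompt(project_memory: str, session_memory: str,
                           tool_context: dict) -> str:
    """Hypothetical sketch of layered context assembly: project-level memory,
    session-level memory, and tool-generated context (including git output)
    merged into one prompt before the user's first message arrives."""
    sections = [
        ("project_memory", project_memory),   # e.g. CLAUDE.md-style notes
        ("session_memory", session_memory),   # running summary of this session
    ]
    sections += [(f"tool:{name}", out) for name, out in tool_context.items()]
    return "\n\n".join(f"## {name}\n{body}" for name, body in sections)

prompt = assemble_system_prompt(
    project_memory="Monorepo, TypeScript, strict mode.",
    session_memory="User is refactoring auth.",
    tool_context={"git_status": "M hermes.md"},  # incidental string rides along
)
```

Once merged, there is no marker distinguishing "string the user wrote in a commit" from "signal about runtime behavior", which is exactly the ambiguity the detection logic stumbled on.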

When you’re building on top of these tools, you’re implicitly trusting that the context-gathering layer is doing something reasonable with your project artifacts. This incident is a case where it wasn’t. A documentation file name in a commit message became a billing trigger.

If you’re building agent workflows that orchestrate Claude Code programmatically — say, running it as part of a CI pipeline or a multi-agent system — this matters a lot. Your commit messages, your file names, your directory structure are all potentially being read and interpreted. That’s not inherently bad, but you should know it’s happening.

Platforms like MindStudio handle this kind of orchestration at a higher level of abstraction — 200+ models, 1,000+ integrations, and a visual builder for chaining agents and workflows — which means the context-construction layer is managed for you rather than being something you have to audit in your own git history.


What You Should Actually Check Right Now

If you’re a Claude Code user, especially on a Max plan, here are the concrete things worth verifying.

Check your recent commit messages for these strings:

  • openclaw (case-insensitive)
  • hermes (case-insensitive)
  • Any other agent harness names you’ve referenced in docs or configs

You can run this locally:

git log --oneline --all | grep -i "openclaw\|hermes"


If you get hits, that doesn’t mean you’ve been overbilled — Anthropic says they’ve reached out to affected users and issued refunds plus one month’s credit. But it’s worth knowing whether your commit history contains these strings.

Check your billing dashboard carefully. The affected user’s dashboard showed 86% plan usage remaining while a separate overage charge of $200.98 had accumulated. These can appear in different places in the billing UI. Look for line items that say “extra usage” or “overage” separately from your plan usage percentage.

Check file names in your project. The trigger wasn’t just commit messages — it was the string appearing in git output, which can include file names tracked by git. If you have a file called hermes.md, openclaw-config.json, or similar, that could appear in git status output even if it’s never been committed.
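
The one-liner above only covers commit messages. A slightly broader local audit, assuming a git checkout and using only standard git subcommands, can scan both commit messages and the file names git reports:

```python
import subprocess

WATCHLIST = ("openclaw", "hermes")  # harness names reported in the incident

def matches_watchlist(line: str) -> bool:
    """True if a line of git output contains a watched harness name."""
    lowered = line.lower()
    return any(word in lowered for word in WATCHLIST)

def git_lines(*args: str) -> list[str]:
    """Run a git command in the current directory and return stdout lines."""
    res = subprocess.run(["git", *args], capture_output=True, text=True)
    return res.stdout.splitlines()

def find_watchlist_hits() -> list[str]:
    """Scan commit messages (all branches) plus tracked and untracked file names."""
    hits = [f"commit: {line}" for line in git_lines("log", "--oneline", "--all")
            if matches_watchlist(line)]
    hits += [f"file: {line}" for line in git_lines("ls-files", "--cached", "--others")
             if matches_watchlist(line)]
    return hits
```

Run find_watchlist_hits() from the repository root; an empty list means neither string appears in your commit history or in the file names git sees.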


The Broader Question About Keyword-Based Routing

There’s a pattern here that goes beyond this specific bug. Keyword scanning on developer artifacts is a fragile way to make routing decisions.

Commit messages are written by humans for humans. They’re informal, they reference external tools, they contain jokes, they include documentation about things the project doesn’t use. Using them as a signal for billing or access control creates a system where the behavior of your AI tool depends on your writing style and your documentation habits.

This is different from, say, detecting API call patterns or explicit configuration flags. Those are signals that actually correlate with the behavior you’re trying to detect. A commit message that mentions a competitor’s tool name is noise.

The fact that this detection logic existed at all — and was connected to billing routing — suggests Anthropic was trying to solve a real problem (differentiating human usage from automated harness usage) with a method that was too blunt. The “how to run Claude Code for free using Ollama and Open Router” post touches on the routing layer from a different angle: when you’re proxying Claude Code through alternative backends, the system prompt and context construction behave differently than in the standard setup.

Theo Brown’s observation after the incident was pointed: “there’s a certain class of bugs that suggests the thing you’re trying to do is a bad idea.” The implication is that if your bug is “we accidentally charged users extra because their commit messages mentioned a competitor,” the underlying design decision — scanning commit messages for competitor names — was the real problem.


What Anthropic’s Response Tells You About Incident Handling

The resolution timeline is instructive. The sequence was:

  • User discovers bug, reports to Anthropic support
  • Support acknowledges the bug three times, calls it an “authentication routing issue”
  • Support refuses refund, citing policy against compensating for billing routing errors
  • User posts to Reddit
  • On Patel amplifies: 1.4 million views
  • Theo Brown amplifies: 1 million views
  • Tariq (Anthropic) posts public acknowledgment, confirms the technical cause, announces refunds plus one month’s credit

The refund happened. That’s good. But the path to the refund required viral distribution. The initial support response — “we are unable to issue compensation for degraded service or technical errors that result in incorrect billing routing” — was a policy response to a situation that warranted an exception.

For builders who are running production workloads on Claude Code, this is a data point about what the support escalation path looks like when something goes wrong with billing. Document your usage, keep screenshots of your dashboard, and if you hit an anomalous charge, the support ticket alone may not be sufficient.


Where This Fits in the Larger Picture of Agentic Tool Transparency

The git commit scanning story is one instance of a broader question: what are agentic coding tools doing with your project context, and how transparent are they about it?

The “Claude Code agentic workflow patterns” post covers how Claude Code structures multi-step tasks. What’s less documented — and what this incident surfaced — is the environmental context layer that runs before any of those workflow patterns kick in. The git status pull, the file tree scan, the environment variable inspection — these happen before you type anything, and they shape what the model sees.

For most users, this is fine. The context-gathering is what makes Claude Code useful. But “fine for most users” and “transparent to all users” are different things. The harness detection logic existed, was connected to billing, and wasn’t documented anywhere users could find it.

Tools like Remy take a different approach to the source-of-truth problem: you write an annotated markdown spec, and the full-stack application — TypeScript backend, database, auth, deployment — gets compiled from it. The spec is explicit and readable; the derived outputs are generated from it. That’s a different model of transparency than “we read your git history and make decisions based on it.”

The Anthropic incident doesn’t mean you should stop using Claude Code. It means you should understand what context it’s gathering and how that context affects your experience. Tariq’s confirmation that git status gets pulled into the system prompt is the most specific piece of documentation we have on that question, and it came from a bug report, not from Anthropic’s docs.

That’s worth knowing.

Presented by MindStudio
