
How to Use OpenRouter with Claude Code: Run Cheaper Models as a Backend

Use OpenRouter to swap Claude's backend for DeepSeek or other models at 2–5% of the cost. A step-by-step guide to setting up the free-claude-code proxy.

MindStudio Team

Why Claude Code Gets Expensive Fast

Claude Code is one of the best AI coding assistants available. It reasons well, handles complex refactors, and integrates tightly into your terminal workflow. But the costs add up quickly — especially if you’re using it for long sessions or running it against large codebases.

Claude Sonnet 4 runs around $3 per million input tokens and $15 per million output tokens. Opus is significantly more. For developers using Claude Code as a daily driver, a busy week can mean real money. The good news: you don’t have to pay Claude prices for every task.

Using OpenRouter with Claude Code, you can swap the backend model to something like DeepSeek V3 — which costs roughly $0.14 per million input tokens. That’s 2–5% of what you’d pay for Claude Sonnet. For routine tasks like writing boilerplate, generating tests, or explaining code, cheaper models perform nearly as well.

This guide walks through how to set that up using a local proxy, including which models to use and when.


What Is OpenRouter?

OpenRouter is a unified API gateway that routes requests to dozens of AI models — including Claude, GPT-4o, Gemini, DeepSeek, Mistral, Llama, and many others — through a single endpoint.

Instead of managing multiple API keys and billing accounts, you fund one OpenRouter account and access all supported models. Pricing is pay-per-token, and many models are significantly cheaper than Anthropic’s own API. Some are even free with rate limits.

OpenRouter exposes both an OpenAI-compatible API (/api/v1) and an Anthropic-compatible API. This is what makes it possible to point Claude Code at OpenRouter — Claude Code speaks Anthropic’s API format, and OpenRouter can receive it.
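You can sanity-check this compatibility claim from the terminal before wiring anything into Claude Code. The sketch below hits OpenRouter's OpenAI-compatible chat endpoint directly; the model id and prompt are illustrative, and it only fires the request if your key is in the environment:

```shell
# Sanity check against OpenRouter's OpenAI-compatible chat endpoint.
# Prints a reminder instead of failing when no key is set.
or_check() {
  if [ -z "${OPENROUTER_API_KEY:-}" ]; then
    echo "Set OPENROUTER_API_KEY first (openrouter.ai -> API Keys)"
    return 0
  fi
  curl -s https://openrouter.ai/api/v1/chat/completions \
    -H "Authorization: Bearer ${OPENROUTER_API_KEY}" \
    -H "Content-Type: application/json" \
    -d '{"model":"deepseek/deepseek-chat","messages":[{"role":"user","content":"Say hi"}]}'
}
or_check
```

If the JSON response comes back with a `choices` array, your key and the endpoint are working, and everything after this point is just routing.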


How the Proxy Setup Works

Claude Code isn’t simply a chat interface. It’s an agentic system that sends structured API requests — including tool use, multi-turn context, and system prompts — to Anthropic’s backend.

You can redirect those requests to OpenRouter in two ways:

  1. Direct environment variable override — Set ANTHROPIC_BASE_URL and ANTHROPIC_API_KEY to point at OpenRouter. Simple, but can break on specific Anthropic API features that OpenRouter doesn’t fully replicate.

  2. A local proxy (recommended) — Run a lightweight proxy server on your machine that sits between Claude Code and OpenRouter. It translates requests, handles authentication, and lets you configure model substitution rules. This is what tools like free-claude-code do.

The proxy approach is more reliable because it gives you control over model routing, handles edge cases in API compatibility, and lets you swap models without touching your Claude Code config repeatedly.


Prerequisites

Before starting, make sure you have:

  • Claude Code installed — If you haven’t set it up yet, install it via npm: npm install -g @anthropic-ai/claude-code
  • An OpenRouter account — Sign up at openrouter.ai and add credit to your account (even $5 gets you a long way with cheaper models)
  • Your OpenRouter API key — Found in your OpenRouter account settings under API Keys
  • Node.js 18+ — Required to run the proxy
  • Git — For cloning the proxy repository
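A quick way to confirm the checklist above before cloning anything (the `claude` binary name comes from the npm install command in the list; everything else is standard):

```shell
# check_prereqs: report whether each required tool is on PATH,
# and its version if found.
check_prereqs() {
  for tool in node git claude; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: $("$tool" --version 2>/dev/null | head -n 1)"
    else
      echo "$tool: missing"
    fi
  done
}
check_prereqs
```

Make sure the `node` line reports v18 or higher before continuing.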

Step-by-Step: Setting Up the free-claude-code Proxy

Step 1: Clone and Install the Proxy

The free-claude-code proxy is an open-source tool that handles the translation layer between Claude Code and OpenRouter. Clone it and install dependencies:

git clone https://github.com/ruvnet/free-claude-code.git
cd free-claude-code
npm install

Step 2: Configure Your Environment

Inside the project directory, create a .env file (or copy from .env.example if one exists):

OPENROUTER_API_KEY=your_openrouter_api_key_here
PROXY_PORT=3010
DEFAULT_MODEL=deepseek/deepseek-chat

Key configuration options:

  • OPENROUTER_API_KEY — Your OpenRouter API key
  • PROXY_PORT — The local port the proxy will listen on (3010 is fine; just avoid conflicts with other local services)
  • DEFAULT_MODEL — The OpenRouter model identifier you want Claude Code to use

Model identifiers follow the format provider/model-name. For example:

  • deepseek/deepseek-chat — DeepSeek V3
  • deepseek/deepseek-r1 — DeepSeek R1 (stronger reasoning)
  • google/gemini-flash-1.5 — Gemini Flash (very fast, cheap)
  • anthropic/claude-3.5-sonnet — Still Claude, but routed through OpenRouter pricing
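OpenRouter publishes the full catalog of valid identifiers at its public `/api/v1/models` endpoint, which returns JSON of the form `{"data":[{"id":"provider/model", ...}, ...]}`. A small helper to pull out just the ids (grep/sed keeps it dependency-free; `jq '.data[].id'` works too if you have jq):

```shell
# list_models: extract model ids from an OpenRouter /api/v1/models response
# read from stdin. Usage:
#   curl -s https://openrouter.ai/api/v1/models | list_models | grep deepseek
list_models() {
  grep -o '"id":"[^"]*"' | sed 's/"id":"\(.*\)"/\1/'
}
```

Handy for double-checking the exact string to put in DEFAULT_MODEL, since a typo there just produces opaque API errors.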

Step 3: Start the Proxy

npm start

You should see output confirming the proxy is running on your specified port:

Proxy server running on http://localhost:3010
Routing requests to OpenRouter → deepseek/deepseek-chat

Leave this terminal window open. The proxy needs to keep running while you use Claude Code.

Step 4: Configure Claude Code to Use the Proxy

Now tell Claude Code to send requests to your local proxy instead of Anthropic’s servers. Set these environment variables in your shell:

export ANTHROPIC_BASE_URL=http://localhost:3010
export ANTHROPIC_API_KEY=any-string-works-here

The ANTHROPIC_API_KEY value doesn’t matter here — authentication happens at the OpenRouter level, handled by the proxy. You just need something non-empty to satisfy Claude Code’s startup check.

To make this persistent, add both lines to your shell config file (~/.bashrc, ~/.zshrc, etc.), then reload:

source ~/.zshrc
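If you plan to hop between the proxy and Anthropic's real API, a pair of small shell functions in your config file beats re-exporting variables by hand. These helper names (and the `ANTHROPIC_REAL_KEY` variable) are hypothetical, not part of Claude Code:

```shell
# Hypothetical ~/.zshrc helpers: flip Claude Code between the local
# proxy and Anthropic's API without editing files each time.
use_proxy() {
  export ANTHROPIC_BASE_URL="http://localhost:${PROXY_PORT:-3010}"
  export ANTHROPIC_API_KEY="placeholder"   # real auth happens at the proxy
  echo "Claude Code -> $ANTHROPIC_BASE_URL"
}
use_anthropic() {
  export ANTHROPIC_BASE_URL="https://api.anthropic.com"
  export ANTHROPIC_API_KEY="$ANTHROPIC_REAL_KEY"  # your real Anthropic key
  echo "Claude Code -> $ANTHROPIC_BASE_URL"
}
```

Run `use_proxy` before cheap sessions and `use_anthropic` when you want the real thing; each new `claude` session picks up whichever was set last.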

Step 5: Launch Claude Code

claude

Claude Code will start normally, but all requests will route through your proxy to OpenRouter, using whichever model you configured. You can verify this by watching the proxy’s terminal output — you’ll see requests coming in and being forwarded.


Direct Environment Variable Method (Simpler, Less Reliable)

If you’d prefer to skip the proxy and connect Claude Code directly to OpenRouter, you can point ANTHROPIC_BASE_URL at OpenRouter’s Anthropic-compatible endpoint:

export ANTHROPIC_BASE_URL=https://openrouter.ai/api/v1
export ANTHROPIC_API_KEY=your_openrouter_api_key

Then set the model using Claude Code’s --model flag or the ANTHROPIC_MODEL environment variable:

export ANTHROPIC_MODEL=deepseek/deepseek-chat

This works for basic use cases. However, you may hit errors with tool use, certain system prompts, or Claude Code’s more advanced agentic behaviors — OpenRouter’s Anthropic compatibility layer isn’t always a perfect replica. If you see API errors or unexpected behavior, switch to the proxy approach.


Choosing the Right Model

Not all models on OpenRouter are equal. Here’s a practical breakdown of which to use when:

DeepSeek V3 (deepseek/deepseek-chat)

Cost: ~$0.14/M input, $0.28/M output
Best for: General coding tasks, boilerplate generation, documentation, refactoring, explaining code, writing tests

DeepSeek V3 has become the default choice for budget-conscious developers. Its coding ability is competitive with Claude Sonnet 3.5 on most everyday tasks. It handles large contexts well and responds quickly.

DeepSeek R1 (deepseek/deepseek-r1)

Cost: ~$0.55/M input, $2.19/M output
Best for: Complex reasoning, architecture decisions, debugging tricky issues, algorithm design

R1 is a reasoning model — it thinks through problems step by step before answering. Slower than V3 but significantly better on problems that require multi-step logic. Still much cheaper than Claude Opus.

Gemini Flash 1.5 (google/gemini-flash-1.5)

Cost: Very low — often near-free for smaller contexts
Best for: Quick questions, short code completions, rapid iteration

Flash is extremely fast and cheap. It’s not the strongest model, but for lightweight tasks it’s hard to beat on cost-per-interaction.

Llama 3.1 405B (meta-llama/llama-3.1-405b-instruct)

Cost: Varies by provider on OpenRouter
Best for: When you want a capable open-weight model with no usage restrictions

The 405B variant is Meta’s most capable Llama model and performs well on coding tasks. Multiple providers on OpenRouter offer it, so you can shop for the cheapest hosting.

When to Keep Using Claude

For genuinely hard tasks — understanding a complex unfamiliar codebase, multi-file refactors with subtle side effects, debugging race conditions — Claude Sonnet or Opus still has an edge. You don’t have to pick one model permanently. You can switch the proxy config or use Claude Code’s native model flag for high-stakes sessions.


Cost Comparison in Practice

To make this concrete, here’s what a typical heavy Claude Code session might cost across different backends:

Session Type              Claude Sonnet 3.5   DeepSeek V3    Savings
1M tokens (light day)     ~$9–18              ~$0.21–0.42    ~97%
5M tokens (heavy day)     ~$45–90             ~$1.05–2.10    ~97%
Monthly (20 heavy days)   ~$900–1,800         ~$21–42        ~97%

These are rough estimates and depend heavily on your input/output ratio, but the order of magnitude is accurate. For most developers, routing 80% of sessions through DeepSeek and keeping Claude for hard problems cuts the monthly bill significantly.
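The table's lower bounds correspond to an even input/output token split at the per-million rates quoted earlier. A one-function sketch you can adapt to your own usage ratio:

```shell
# Back-of-envelope session cost, assuming a 50:50 input:output token split.
# Rates are dollars per million tokens; adjust to current pricing.
session_cost() {  # usage: session_cost TOTAL_TOKENS IN_RATE OUT_RATE
  awk -v total="$1" -v in_rate="$2" -v out_rate="$3" \
    'BEGIN { printf "%.2f\n", (total*0.5/1e6)*in_rate + (total*0.5/1e6)*out_rate }'
}
session_cost 5000000 3 15       # Claude Sonnet rates, heavy day -> 45.00
session_cost 5000000 0.14 0.28  # DeepSeek V3 rates, same day   -> 1.05
```

Shift the 0.5 weights toward output-heavy usage and the Claude figure climbs quickly, which is where the upper ends of the table's ranges come from.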


Troubleshooting Common Issues

Claude Code throws authentication errors

Make sure ANTHROPIC_API_KEY is set to something — even "placeholder" works. Claude Code checks for a non-empty key on startup, but the actual authentication against OpenRouter is handled by the proxy.

Tool use fails or returns malformed responses

This is the most common issue with the direct (no-proxy) method. Switch to the proxy. If you’re already using the proxy, check whether the model you’ve chosen has strong tool-use support. DeepSeek V3 handles tool calls well; smaller models sometimes don’t.

The proxy starts but Claude Code can’t connect

Check that ANTHROPIC_BASE_URL points to the right port. If you changed PROXY_PORT in .env, update the URL accordingly. Also verify no firewall rules are blocking localhost connections.
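A quick check that something is actually listening on the expected port, before digging into Claude Code's config (a minimal sketch; `curl --connect-timeout` just probes whether the TCP connection opens):

```shell
# proxy_reachable: confirm the proxy is listening before blaming Claude Code.
# Usage: proxy_reachable [port]   (defaults to PROXY_PORT, then 3010)
proxy_reachable() {
  local port="${1:-${PROXY_PORT:-3010}}"
  if curl -s -o /dev/null --connect-timeout 2 "http://localhost:${port}"; then
    echo "proxy reachable on port ${port}"
  else
    echo "nothing listening on port ${port} - is the proxy running?"
  fi
}
proxy_reachable
```

If this reports nothing listening, restart the proxy and re-check before touching ANTHROPIC_BASE_URL.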

Responses feel slower than normal Claude

Latency depends on the model provider and your network connection to OpenRouter. DeepSeek can sometimes be slow during peak hours. If speed matters, try Gemini Flash or switch to a different DeepSeek provider available through OpenRouter’s routing.

Model outputs look wrong or the context seems truncated

Different models have different context window sizes. DeepSeek V3 supports up to 64K tokens by default. If your session involves very large files, verify the model’s context limit isn’t causing silent truncation.


Where MindStudio Fits Into AI Model Management

If you’re building tools and workflows on top of AI models — not just using Claude Code interactively — there’s a broader infrastructure question worth thinking about: how do you manage model selection, cost, and routing at the application level?

MindStudio gives you access to 200+ AI models in a single no-code builder, including all the models available on OpenRouter (DeepSeek, Gemini, Llama, Claude, GPT, and more). You can wire up AI agents that automatically select different models based on task complexity — for example, routing quick summarization tasks to a cheap model while escalating reasoning-heavy steps to a stronger one.

This is especially useful if you’re building products or internal tools where you want to control model spend without hardcoding model choices into your app. You can switch the backend model for any workflow in seconds without touching any infrastructure.

If you’re already experimenting with OpenRouter to reduce costs in Claude Code, MindStudio applies the same philosophy to full workflow and agent development — minus the manual proxy setup. You can try MindStudio free at mindstudio.ai.


Frequently Asked Questions

Is it safe to route Claude Code requests through a local proxy?

Yes. The proxy runs entirely on your machine — requests go from Claude Code → localhost → OpenRouter. Your code never passes through a third-party server beyond OpenRouter itself (and Anthropic’s servers if you use Claude models on OpenRouter). That said, OpenRouter does see your prompts, so don’t send highly sensitive credentials or proprietary code if that’s a concern for your organization.

Can I switch models mid-session?

Not mid-session without restarting Claude Code. You’d need to update the DEFAULT_MODEL in your proxy config, restart the proxy, and start a new Claude Code session. Some developers keep multiple proxy configs and switch between them depending on the task.
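One way to keep those multiple configs is a small env file per model, chosen at launch time. This assumes the proxy also honors DEFAULT_MODEL from the process environment, not only from .env — check the project's README before relying on it:

```shell
# One env file per model; pick the one you want when (re)starting the proxy.
printf 'DEFAULT_MODEL=deepseek/deepseek-chat\n' > .env.everyday
printf 'DEFAULT_MODEL=deepseek/deepseek-r1\n'  > .env.reasoning

# To switch: stop the proxy, then relaunch with the chosen config, e.g.:
#   env $(cat .env.reasoning) npm start
cat .env.reasoning
```

Then start a fresh Claude Code session so it picks up the new backend.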

Does this work with Claude Code’s agentic features like computer use?

Computer use and certain advanced agentic capabilities depend on specific Anthropic API features that not all models support. Basic coding tasks, file editing, terminal commands, and tool use generally work well with DeepSeek V3 through the proxy. More specialized Anthropic-specific features may not.

Will I lose Claude Code’s quality on hard problems?

On routine coding tasks — writing functions, generating tests, explaining unfamiliar code — most users find DeepSeek V3 nearly indistinguishable from Claude Sonnet. On harder problems (complex multi-file refactors, nuanced debugging, architecture design), Claude still has an edge. The practical approach most developers use is DeepSeek for the bulk of work, Claude for the hard 20%.

Does OpenRouter have rate limits?

Yes, but they’re generous for most individual developers. Rate limits vary by model and your OpenRouter account tier. If you’re running Claude Code in a team setting or CI pipeline with high volume, check OpenRouter’s rate limit documentation for the specific models you’re using.

What happens if OpenRouter is down?

Your Claude Code session will fail — requests can’t reach the model. You can configure a fallback by updating ANTHROPIC_BASE_URL back to Anthropic’s endpoint if you need continuity. Some proxy implementations support fallback routing, but free-claude-code doesn’t do this by default.


Key Takeaways

  • Claude Code supports ANTHROPIC_BASE_URL overrides, which makes it possible to route requests to OpenRouter instead of Anthropic’s API
  • A local proxy (like free-claude-code) handles the translation layer more reliably than direct environment variable overrides
  • DeepSeek V3 is the go-to cheap model — it handles most everyday coding tasks at about 2–5% of Claude Sonnet’s cost
  • Use DeepSeek R1 when you need strong reasoning; keep Claude for genuinely hard problems
  • The setup takes about 10 minutes and can cut AI coding costs significantly for heavy users

If you want to apply the same cost-aware model selection logic to building agents and automated workflows — not just interactive coding — MindStudio is worth a look. It puts 200+ models in one place with no API key management, so you can build with the right model for each task without the infrastructure overhead.

Presented by MindStudio
