How to Use OpenRouter to Run GLM 5.2 in Claude Code for Cheaper Agentic Workflows

Agentic AI Is Expensive — Unless You’re Strategic About Models

Running agentic workflows with Claude Code is powerful, but costs can spiral quickly. Every tool call, every context window refill, every multi-step reasoning chain adds up. If you’re using a premium model like Claude Fable at $10 per million input tokens for every single step of your pipeline, you’ll notice it on your bill fast.

GLM 5.2, available through OpenRouter, costs $1.40 per million input tokens. That’s an 86% reduction. For long-running agents that process thousands of tokens per task, across dozens of tasks per day, the difference is significant.
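
The 86% figure follows directly from the two list prices quoted above. As a quick sanity check that you can rerun with other prices, here is a small shell sketch. The helper name pct_reduction is hypothetical, and it assumes awk is available on your system:

```shell
# pct_reduction OLD_PRICE NEW_PRICE
# Prints the percentage saved when moving from OLD_PRICE to NEW_PRICE
# (both in dollars per million tokens), rounded to a whole number.
pct_reduction() {
  LC_ALL=C awk -v was="$1" -v now="$2" \
    'BEGIN { printf "%.0f\n", (was - now) / was * 100 }'
}

pct_reduction 10 1.40   # prints 86
```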

This guide shows you exactly how to connect OpenRouter to Claude Code so you can run GLM 5.2 — or any other cost-efficient model — as your backend, without losing the Claude Code interface you’re already using.

What GLM 5.2 Actually Is

GLM 5.2 comes from Zhipu AI, a Chinese AI research company backed by major institutional investment. The GLM (General Language Model) series has been a serious contender in the open-weights and API-accessible model space for a while.

GLM 5.2 is built for instruction-following, multi-turn conversation, and reasoning tasks — which maps well to what agentic coding workflows actually need. It supports a 128K context window, handles tool use and function calling, and performs competitively on coding benchmarks relative to its price tier.

It’s not going to beat Claude Opus or GPT-4o on complex reasoning tasks. But for the bulk of what an agent actually does — reading files, executing planned steps, running tool calls, writing boilerplate code — it handles the load well at a fraction of the cost.

The practical pattern most engineers land on: use a cheaper model like GLM 5.2 for the high-frequency, lower-complexity steps, and route the genuinely hard reasoning to a premium model only when needed. OpenRouter makes this kind of model mixing straightforward.

What OpenRouter Does

OpenRouter is a unified API gateway that gives you access to dozens of AI models through a single endpoint and billing account. Instead of managing API keys and accounts for Anthropic, OpenAI, Google, Mistral, Zhipu AI, and others separately, you authenticate once with OpenRouter and route requests to any model on their platform.

From a technical standpoint, OpenRouter’s primary API follows the OpenAI API format, and it also accepts requests in the Anthropic format. The second point is what matters here: Claude Code supports an ANTHROPIC_BASE_URL override, which means you can point Claude Code’s underlying API calls at OpenRouter instead of at Anthropic’s servers.

The pricing model on OpenRouter is pay-per-token with no monthly minimums. You load credits into your account and spend them across models. This is ideal for experimentation and for setups where you’re mixing models based on task complexity.

Prerequisites

Before starting, make sure you have the following in place:

Claude Code installed — Available as a CLI tool from Anthropic. Install it via npm install -g @anthropic-ai/claude-code if you haven’t already.
An OpenRouter account — Sign up at openrouter.ai. Free to create; you’ll add credits when you’re ready to run live requests.
An OpenRouter API key — Found in your OpenRouter dashboard under Keys. It starts with sk-or-v1-.
Basic terminal access — You’ll be setting environment variables and running Claude Code from the command line.

That’s it. You don’t need to touch any config files in Claude Code itself — the override happens at the environment level.

Step-by-Step Setup: OpenRouter + GLM 5.2 in Claude Code

Step 1: Create Your OpenRouter API Key

Log into your OpenRouter account and navigate to the Keys section. Click Create Key, give it a label (e.g., “claude-code-glm”), and copy the key immediately — it won’t be shown in full again.

Add some credits to your account if you haven’t. Even $5 is enough to run a lot of requests at GLM 5.2’s pricing.

Step 2: Set the Environment Variables

Claude Code respects two environment variables that let you override its default behavior: ANTHROPIC_BASE_URL and ANTHROPIC_API_KEY.

Set them in your terminal session:

export ANTHROPIC_BASE_URL="https://openrouter.ai/api/v1"
export ANTHROPIC_API_KEY="sk-or-v1-your-key-here"

To make this persistent across sessions, add those two lines to your ~/.zshrc or ~/.bashrc file, then reload it with source ~/.zshrc (or source ~/.bashrc).

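What this does: Claude Code uses the Anthropic SDK under the hood. When you set ANTHROPIC_BASE_URL, the SDK sends its API calls to that URL instead of https://api.anthropic.com. OpenRouter accepts calls in the Anthropic format and handles the translation to whatever model you specify. Note that the SDK appends its own request path (/v1/messages) to the base URL, so if requests fail with a “not found” error, compare the base URL above against OpenRouter’s current Claude Code setup instructions.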
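
If you would rather not change the environment of your whole shell, the same override can be scoped to a single command. The sketch below is one way to do that, not an official pattern. The function name with_openrouter and the variables OPENROUTER_API_KEY and OPENROUTER_BASE_URL are assumptions introduced for illustration, and the default URL is simply the one used in Step 2:

```shell
# Assumed default: the base URL from Step 2 of this guide. Override it by
# setting OPENROUTER_BASE_URL before sourcing this snippet.
OPENROUTER_BASE_URL="${OPENROUTER_BASE_URL:-https://openrouter.ai/api/v1}"

# with_openrouter COMMAND [ARGS...]
# Runs COMMAND with the two overrides applied to that command only.
# OPENROUTER_API_KEY (hypothetical name) must hold your sk-or-v1- key.
with_openrouter() {
  if [ -z "${OPENROUTER_API_KEY:-}" ]; then
    echo "OPENROUTER_API_KEY is not set" >&2
    return 1
  fi
  ANTHROPIC_BASE_URL="$OPENROUTER_BASE_URL" \
  ANTHROPIC_API_KEY="$OPENROUTER_API_KEY" \
  "$@"
}
```

With this in place, a call such as with_openrouter claude --model zhipuai/glm-4-5 goes through OpenRouter, while a plain claude in the same terminal keeps whatever configuration it had before.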

Step 3: Specify GLM 5.2 as Your Model

When you launch Claude Code, pass the model flag pointing to GLM 5.2 on OpenRouter. The model ID uses OpenRouter’s routing format:

claude --model zhipuai/glm-4-5

OpenRouter’s model identifiers follow the provider/model-name convention. The slug in the command above reads as a GLM 4.5 identifier, so treat it as a placeholder: check the OpenRouter model library for the exact current slug for GLM 5.2, since model IDs change as providers release new versions. Look under the Zhipu AI provider section for the GLM 5.x series.

Step 4: Run a Test to Confirm the Connection

Before running any real workflows, verify everything is wired up correctly:

claude --model zhipuai/glm-4-5 "What model are you running on? Respond in one sentence."

If the connection is working, you’ll get a response. If you see an authentication error, double-check that your ANTHROPIC_API_KEY is set to your OpenRouter key (not an Anthropic key) and that ANTHROPIC_BASE_URL is pointing to https://openrouter.ai/api/v1.
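
Both checks in the paragraph above can be scripted as a preflight step. This is a minimal sketch: check_openrouter_env is a hypothetical helper name, and it only inspects the shape of the two variables. It does not contact OpenRouter.

```shell
# check_openrouter_env
# Returns 0 when both overrides look right, 1 otherwise.
# Diagnostics go to stderr so the function can be used in scripts.
check_openrouter_env() {
  problems=0
  case "${ANTHROPIC_API_KEY:-}" in
    sk-or-v1-*) ;;  # has the OpenRouter key prefix
    *) echo "ANTHROPIC_API_KEY does not start with sk-or-v1-" >&2
       problems=1 ;;
  esac
  case "${ANTHROPIC_BASE_URL:-}" in
    https://openrouter.ai/*) ;;  # points at OpenRouter
    *) echo "ANTHROPIC_BASE_URL does not point at https://openrouter.ai/" >&2
       problems=1 ;;
  esac
  return "$problems"
}
```

Running check_openrouter_env before the test command separates configuration mistakes from genuine API errors.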

Step 5: Verify Billing Through OpenRouter

After your first successful request, check your OpenRouter dashboard under Activity or Usage. You should see the request logged there with the token count and cost. This confirms you’re being billed through OpenRouter and not hitting Anthropic’s servers.

Testing Your Setup on a Real Task

Once you’ve confirmed the connection, run a simple but representative task — something that reflects what you’ll actually use this for. A good test:

claude --model zhipuai/glm-4-5 "Read the files in the current directory and write a brief README.md summarizing what this project does."

This exercises file reading, basic reasoning, and code generation — the core loop of most agentic coding tasks. Pay attention to:

Response quality — Is the output coherent and accurate?
Latency — How long does it take compared to your baseline?
Token usage — Check the OpenRouter dashboard for the cost breakdown.

For most standard coding assistant tasks, GLM 5.2 performs well. Where you’ll notice the ceiling is on highly complex multi-file refactors or tasks requiring deep architectural reasoning — those are the cases where routing to a premium model makes sense.

The Real Cost Difference: Running the Numbers

Here’s a concrete comparison to illustrate the savings at scale.

Assume a moderately active agentic workflow:

50 tasks per day
5,000 input tokens average per task (context + instructions)
1,000 output tokens average per task

With Claude Fable at $10/M input + $30/M output:

Input: 50 × 5,000 = 250,000 tokens → $2.50/day
Output: 50 × 1,000 = 50,000 tokens → $1.50/day
Total: ~$4.00/day → ~$120/month

With GLM 5.2 at $1.40/M input and roughly $1.40/M output (the output price is an approximation):

Input: 250,000 tokens → $0.35/day
Output: 50,000 tokens → $0.07/day
Total: ~$0.42/day → ~$12.60/month

That’s roughly $107/month saved on a moderate-use case. For teams running agents at higher volumes or across multiple developers, the math gets more compelling fast.
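
The comparison above is simple enough to reproduce with your own volumes. The sketch below uses a hypothetical helper, monthly_cost, which assumes a 30-day month (as the figures above do) and that awk is available. The prices passed in are the ones quoted in this section, including the approximate GLM 5.2 output price:

```shell
# monthly_cost TASKS_PER_DAY IN_TOKENS OUT_TOKENS IN_PRICE OUT_PRICE
# Token counts are per task; prices are dollars per million tokens.
# Prints the cost of a 30-day month with two decimal places.
monthly_cost() {
  LC_ALL=C awk -v n="$1" -v tin="$2" -v tout="$3" -v pin="$4" -v pout="$5" '
    BEGIN {
      daily = n * (tin * pin + tout * pout) / 1000000
      printf "%.2f\n", daily * 30
    }'
}

monthly_cost 50 5000 1000 10 30       # premium pricing: prints 120.00
monthly_cost 50 5000 1000 1.40 1.40   # GLM 5.2 pricing: prints 12.60
```

The difference between the two results, $107.40, is the “roughly $107/month” quoted above.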

The hybrid approach — GLM 5.2 for the routine steps, Claude or GPT-4o for the hard reasoning steps — gets you close to premium model output quality at closer to GLM pricing overall.
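
To put a number on the hybrid approach, you can blend the two monthly totals by the share of tasks you expect to route to the premium model. The 10% share below is purely an assumption for illustration, not a measured figure, and the sketch assumes premium and routine tasks have the same token profile. blended_cost is a hypothetical helper name:

```shell
# blended_cost PREMIUM_SHARE PREMIUM_MONTHLY CHEAP_MONTHLY
# PREMIUM_SHARE is the fraction of tasks (0 to 1) sent to the premium model.
# The two monthly figures are what each model would cost at 100% usage.
blended_cost() {
  LC_ALL=C awk -v f="$1" -v premium="$2" -v cheap="$3" \
    'BEGIN { printf "%.2f\n", f * premium + (1 - f) * cheap }'
}

# Assumed split: 10% of tasks on the premium model, 90% on GLM 5.2.
blended_cost 0.10 120 12.60   # prints 23.34
```

Under that assumed split, the monthly bill lands much closer to the GLM-only figure than to the premium-only one.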

Common Mistakes and How to Fix Them

“Authentication failed” or “Invalid API key”

This usually means your ANTHROPIC_API_KEY is still set to an Anthropic key rather than an OpenRouter key. OpenRouter keys start with sk-or-v1-. Run echo $ANTHROPIC_API_KEY to check what’s actually set.
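
Echoing the full key leaves the secret in your terminal scrollback. A gentler variant, sketched here with a hypothetical helper name, prints only the first nine characters, which is exactly the length of the sk-or-v1- prefix:

```shell
# key_prefix
# Prints at most the first 9 characters of ANTHROPIC_API_KEY, enough to
# tell an OpenRouter key from any other key without revealing the secret.
key_prefix() {
  printf '%.9s\n' "${ANTHROPIC_API_KEY:-}"
}
```

If key_prefix prints anything other than sk-or-v1-, the variable holds a different kind of key or is empty.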

“Model not found” error

The model slug is case-sensitive and must match exactly what OpenRouter uses. Visit the OpenRouter models page, search for GLM, and copy the model ID directly rather than typing it manually.
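
If you prefer the terminal to the web page, the slugs can also be pulled out of a model listing. The sketch below makes two assumptions you should verify: that OpenRouter’s public listing at https://openrouter.ai/api/v1/models returns JSON, and that each model’s slug appears in an "id" field. filter_model_ids is a hypothetical helper that works on any such JSON passed on stdin, using a plain text match rather than a real JSON parser:

```shell
# filter_model_ids PATTERN
# Reads JSON on stdin, extracts every "id" string value, and prints the
# ones that contain PATTERN (case-insensitive).
filter_model_ids() {
  grep -o '"id"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | sed 's/.*"\([^"]*\)"$/\1/' \
    | grep -i -- "$1"
}

# Usage (requires network access and curl):
#   curl -s https://openrouter.ai/api/v1/models | filter_model_ids glm
```

Copying a slug from this output avoids the typing and capitalization mistakes that trigger the error.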

Responses look different than expected

GLM 5.2 has different strengths and weaknesses than Claude. If you’re migrating an existing prompt-heavy workflow, you may need to adjust your system prompts. GLM models tend to be more literal in their instruction-following — verbose chain-of-thought prompts sometimes work better than tersely structured ones.

Environment variables not persisting

If you set the variables in your terminal and they work, but a new session doesn’t pick them up, they haven’t been added to your shell config. Open ~/.zshrc (or ~/.bashrc if your shell is bash) and add the export lines there permanently.
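
If you script this step, it helps to make it safe to run more than once so the same export line does not pile up in the file. A minimal sketch, with add_line_once as a hypothetical helper name:

```shell
# add_line_once FILE LINE
# Appends LINE to FILE unless an identical line is already present.
add_line_once() {
  touch "$1"
  grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Usage:
#   add_line_once ~/.zshrc 'export ANTHROPIC_BASE_URL="https://openrouter.ai/api/v1"'
#   add_line_once ~/.zshrc 'export ANTHROPIC_API_KEY="sk-or-v1-your-key-here"'
```

Keep in mind that this stores the key in plain text in your shell config, which is the same trade-off as adding the line by hand.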

Latency is higher than expected

OpenRouter adds a routing layer, so there’s some baseline latency overhead. For highly latency-sensitive workflows, test whether this matters for your use case. For most agentic workflows where tasks run asynchronously, it’s not a practical issue.

Where MindStudio Fits into This Picture

If you’re using Claude Code to build and test AI workflows, you’re probably thinking about how to scale or productionize those workflows. That’s where MindStudio is worth knowing about.

MindStudio is a no-code builder for AI agents and automated workflows. It gives you access to 200+ models — including GLM models, Claude, GPT-4o, Gemini, and others — with no API keys or separate accounts required. You pick models at the step level, which means you can build the same kind of model-mixing strategy described above (cheaper model for routine steps, premium model for hard reasoning) directly in a visual builder, without environment variable gymnastics.

For teams that have validated a workflow in Claude Code and want to deploy it as a background agent, a webhook endpoint, or a scheduled automation, MindStudio’s agent builder lets you do that without rewriting everything. The average build takes under an hour.

You can also use MindStudio to expose agents as MCP servers — making them callable from Claude Code and other AI systems. So rather than rebuilding infrastructure inside your agent, you can offload tasks like sending emails, querying databases, or running media generation to MindStudio-hosted agents and call them as tools.

If you want to experiment with cost optimization across models in a more visual environment, try MindStudio free at mindstudio.ai.

FAQ

Can Claude Code actually use non-Anthropic models?

Yes. Claude Code uses the Anthropic SDK under the hood, but it supports an ANTHROPIC_BASE_URL override. When you point this at OpenRouter (which accepts Anthropic-format requests), Claude Code doesn’t know or care that the underlying model has changed. It sends requests in the same format, and OpenRouter handles the translation.

Is GLM 5.2 good enough for coding tasks?

For the majority of coding tasks — generating boilerplate, reading and explaining code, making targeted edits, writing tests — GLM 5.2 performs well. It struggles more on complex multi-file refactors or tasks requiring nuanced architectural judgment. The practical answer: use it for the high-frequency routine steps and route harder tasks to a premium model when needed.

Does using OpenRouter violate Anthropic’s terms of service?

Using OpenRouter itself doesn’t involve Anthropic’s API at all. You’re not proxying Claude through OpenRouter — you’re replacing Claude with a different model (GLM 5.2) on a different provider (Zhipu AI via OpenRouter). The Claude Code interface is a separate product from the Claude API. Using the CLI tool with a different backend is functionally just using the Claude Code UI with a different model provider.

What happens to context and memory when I switch models?

Claude Code manages conversation context at the session level. When you switch models, the interface behavior stays the same — Claude Code still maintains the session context within a conversation. The difference is in model capability and how each model uses that context. GLM 5.2’s 128K context window is comparable to Claude’s, so large-context tasks should work fine.

Can I switch between models within the same workflow?

Not natively within a single Claude Code session. However, you can start new sessions pointing to different models for different tasks. For more sophisticated model routing within a single workflow — where step A uses one model and step B uses another — a workflow builder like MindStudio or a custom orchestration layer is better suited.
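
The one-session-per-task approach can be scripted by choosing the model before each launch. The sketch below is illustrative only: pick_model, the task labels, and the CHEAP_MODEL and PREMIUM_MODEL variables are all hypothetical, and you would fill the two variables with slugs copied from OpenRouter’s model library.

```shell
# pick_model TASK_KIND
# Prints the model slug to use for a task. Task kinds listed in the first
# branch go to the premium model; everything else goes to the cheap one.
# CHEAP_MODEL and PREMIUM_MODEL must be set by you beforehand.
pick_model() {
  case "$1" in
    refactor|architecture) printf '%s\n' "$PREMIUM_MODEL" ;;
    *)                     printf '%s\n' "$CHEAP_MODEL" ;;
  esac
}

# Usage (each call starts a new Claude Code session):
#   claude --model "$(pick_model boilerplate)" "Write unit tests for utils.py"
#   claude --model "$(pick_model refactor)" "Split the billing module in two"
```

This keeps the routing decision in one place, so changing which tasks count as hard means editing a single case branch.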

Are there other cheap models worth considering on OpenRouter?

Yes. A few worth testing alongside GLM 5.2:

Mistral Small — Strong on instruction-following, competitive pricing
DeepSeek Coder — Specifically optimized for code, very cost-efficient
Qwen 2.5 — Alibaba’s model, strong multilingual and coding performance
Gemini Flash — Google’s speed-optimized tier, low per-token cost

OpenRouter’s model library shows current pricing for all of them. The right choice depends on your specific task mix — run some head-to-head tests on representative tasks before committing to a single model for production use.

Key Takeaways

Setting up OpenRouter with GLM 5.2 in Claude Code takes under 5 minutes and requires only two environment variables.
GLM 5.2 costs $1.40/M input tokens compared to $10/M for Claude Fable — an 86% reduction.
The setup works by overriding Claude Code’s ANTHROPIC_BASE_URL to point at OpenRouter’s API endpoint.
For production agentic workflows, the hybrid approach — cheap model for routine steps, premium model for hard reasoning — gets the best cost-quality balance.
If you want visual model mixing and production-ready agent deployment without managing environment variables, MindStudio’s no-code builder gives you 200+ models with per-step model selection out of the box.

Start with a small test on a representative task, verify billing through the OpenRouter dashboard, and scale from there. The setup is low-risk and the cost savings at any meaningful volume are worth the 5 minutes it takes to configure.