GLM 4.7
A powerful 358B-parameter open-source model built for agentic coding, complex reasoning, and reliable tool use across multiple languages.
GLM-4.7 is a 358-billion-parameter large language model developed by Z.ai (formerly Zhipu AI/THUDM) and released in December 2025. It is designed specifically for agentic workflows, multi-step coding tasks, terminal automation, and complex mathematical and scientific reasoning. The model is available under an MIT license, making it usable for both commercial and non-commercial applications. It supports a 131,072-token context window, allowing it to handle long documents and extended coding sessions.
What distinguishes GLM-4.7 from earlier GLM releases is a set of three reasoning mechanisms: Interleaved Thinking, which applies reasoning before every response and tool call; Preserved Thinking, which retains reasoning context across conversation turns to maintain consistency; and Turn-level Thinking, which lets developers toggle reasoning depth on or off per turn. On benchmarks, the model scores 73.8% on SWE-bench Verified, 95.7% on AIME 2025, and 87.4% on τ²-Bench. It is best suited for developers and researchers building agent pipelines, automated coding tools, or applications requiring reliable multi-step planning.
What GLM 4.7 supports
Agentic Coding
Handles multi-step coding agent loops while retaining context across turns, scoring 73.8% on SWE-bench Verified and 66.7% on SWE-bench Multilingual.
Terminal Automation
Executes command sequencing, error recovery, and multi-step shell automation, achieving 41.0% on Terminal Bench.
Mathematical Reasoning
Solves advanced math and science problems, scoring 95.7% on AIME 2025 and 97.1% on HMMT Feb. 2025.
Multi-Step Planning
Sequences actions across complex tasks using structured reasoning, scoring 87.4% on τ²-Bench.
Tool Use
Calls external tools reliably within agent loops using Interleaved Thinking, which applies reasoning before each tool invocation.
Long Context Processing
Processes inputs up to 131,072 tokens, supporting extended documents, codebases, and multi-turn conversations.
Multilingual Support
Handles coding and reasoning tasks across multiple languages, as reflected by its dedicated SWE-bench Multilingual score of 66.7%.
Configurable Reasoning
Turn-level Thinking lets developers enable or disable deep reasoning per conversation turn, trading response depth for speed as needed.
Science Reasoning
Addresses graduate-level science questions, achieving 85.7% on GPQA-Diamond.
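To make the Configurable Reasoning feature concrete, here is a minimal sketch of how a client might toggle reasoning depth per turn. This assumes an OpenAI-compatible chat-completions payload; the `thinking` field name, its schema, and the `glm-4.7` model identifier are illustrative assumptions, not confirmed API details.

```python
import json

def build_request(messages, deep_reasoning=True):
    """Build a hypothetical chat request with per-turn reasoning control."""
    return {
        "model": "glm-4.7",  # placeholder model id; check the provider's docs
        "messages": messages,
        # Assumed field for Turn-level Thinking: enable deep reasoning for
        # hard turns, disable it for quick, latency-sensitive ones.
        "thinking": {"type": "enabled" if deep_reasoning else "disabled"},
    }

# A quick edit can skip deep reasoning; a planning task keeps it on.
fast_turn = build_request(
    [{"role": "user", "content": "Rename this variable."}], deep_reasoning=False
)
deep_turn = build_request(
    [{"role": "user", "content": "Plan a multi-file refactor."}], deep_reasoning=True
)
print(json.dumps(fast_turn["thinking"]))
```

The design choice here is simply that reasoning depth becomes a per-request knob rather than a model-wide setting, which is what "per conversation turn" implies.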
Ready to build with GLM 4.7?
Get Started Free
Benchmark scores
Scores represent accuracy — the percentage of questions answered correctly on each test.
| Benchmark | What it tests | Score |
|---|---|---|
| MMLU-Pro | Expert knowledge across 14 academic disciplines | 85.6% |
| GPQA Diamond | PhD-level science questions (biology, physics, chemistry) | 85.9% |
| LiveCodeBench | Real-world coding tasks from recent competitions | 89.4% |
| HLE (Humanity's Last Exam) | Questions that challenge frontier models across many domains | 25.1% |
| SciCode | Scientific research coding and numerical methods | 45.1% |
| AIME 2025 | American math olympiad problems (2025) | 95.7% |
| SWE-bench Verified | Real GitHub issues requiring multi-file code fixes | 73.8% |
Common questions about GLM 4.7
What is the context window size for GLM-4.7?
GLM-4.7 supports a context window of 131,072 tokens, which allows it to process long documents, extended codebases, and lengthy multi-turn conversations in a single session.
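As a rough client-side budgeting sketch: the 4-characters-per-token heuristic below is an approximation for English text, not the model's actual tokenizer, and the reserved output size is an arbitrary example value.

```python
CONTEXT_WINDOW = 131_072  # GLM-4.7 context size in tokens

def estimate_tokens(text: str) -> int:
    # Crude heuristic: ~4 characters per token for English text.
    # Use the model's real tokenizer for exact counts.
    return max(1, len(text) // 4)

def fits_in_context(prompt: str, reserved_for_output: int = 4_096) -> bool:
    """Check whether a prompt plus an output budget fits the window."""
    return estimate_tokens(prompt) + reserved_for_output <= CONTEXT_WINDOW

print(fits_in_context("hello world"))
```

Reserving headroom for the model's response up front avoids truncated completions on long-document inputs.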
What license does GLM-4.7 use?
GLM-4.7 is released under the MIT license, which permits both commercial and non-commercial use without royalty restrictions.
What is the knowledge cutoff for GLM-4.7?
Z.ai has not published an official knowledge cutoff date for GLM-4.7. The model was released in December 2025, so its training data predates that release.
How many parameters does GLM-4.7 have?
GLM-4.7 has 358 billion parameters, making it a large-scale model intended for demanding tasks such as agentic coding, complex reasoning, and terminal automation.
What are the three thinking mechanisms introduced in GLM-4.7?
GLM-4.7 introduces Interleaved Thinking (reasoning before every response and tool call), Preserved Thinking (retaining reasoning context across conversation turns), and Turn-level Thinking (allowing developers to toggle reasoning on or off per turn).
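In an agent loop, Interleaved Thinking means the model reasons before each tool invocation; the client's job is to define tools and dispatch the calls the model emits. The sketch below uses the widely adopted OpenAI-style function-calling schema as an assumption; GLM-4.7's exact tool format may differ, and `run_shell` is a hypothetical tool name.

```python
import json

# Hypothetical tool definition in the common OpenAI-style schema.
shell_tool = {
    "type": "function",
    "function": {
        "name": "run_shell",
        "description": "Execute a shell command and return its output.",
        "parameters": {
            "type": "object",
            "properties": {
                "command": {"type": "string", "description": "Command to run"},
            },
            "required": ["command"],
        },
    },
}

def handle_tool_call(call):
    """Dispatch a model-emitted tool call to a local handler (stubbed here)."""
    args = json.loads(call["arguments"])
    if call["name"] == "run_shell":
        # A real agent would execute the command and return stdout/stderr.
        return f"(would run) {args['command']}"
    raise ValueError(f"unknown tool: {call['name']}")

result = handle_tool_call({"name": "run_shell", "arguments": '{"command": "ls -la"}'})
print(result)
```

The handler's return value would be sent back to the model as a tool result, closing one iteration of the agent loop.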
Who developed GLM-4.7 and where can I access it?
GLM-4.7 was developed by Z.ai, formerly known as Zhipu AI/THUDM. It is available on Hugging Face, via NVIDIA NIM, and through the Z.ai API. The model weights and related code are also accessible on GitHub.
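For API access, a request typically follows the OpenAI-compatible chat-completions shape. The base URL below is a placeholder, not Z.ai's real endpoint, and the model id is assumed; consult the official API documentation before use.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com/v1"  # placeholder endpoint, not Z.ai's real one
API_KEY = "YOUR_API_KEY"

def make_chat_request(prompt: str) -> urllib.request.Request:
    """Assemble (but do not send) a chat-completions request."""
    body = json.dumps({
        "model": "glm-4.7",  # assumed model identifier
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = make_chat_request("Summarize this repo's README.")
print(req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` (or any HTTP client) would return the model's JSON response.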
What people think about GLM 4.7
Community reception on r/LocalLLaMA has been generally positive, with the GLM-4.7 Flash variant thread drawing 755 upvotes and 232 comments, indicating strong interest in running the model locally. Users have highlighted its benchmark performance on coding and reasoning tasks as notable for an open-source release.
Some discussion has shifted toward the subsequent GLM-5 release, which garnered even more engagement, suggesting that a portion of the community views GLM-4.7 as a stepping stone rather than a long-term target. Common use cases mentioned include local deployment for agentic coding workflows and multi-step automation tasks.
Start building with GLM 4.7
No API keys required. Create AI-powered workflows with GLM 4.7 in minutes — free.