GLM 4.7
A powerful 358B-parameter open-source model built for agentic coding, complex reasoning, and reliable tool use across multiple languages.
GLM-4.7 is a 358-billion-parameter large language model developed by Z.ai (formerly Zhipu AI/THUDM) and released in December 2025. It is designed specifically for agentic workflows, multi-step coding tasks, terminal automation, and complex mathematical and scientific reasoning. The model is available under an MIT license, making it usable for both commercial and non-commercial applications. It supports a 131,072-token context window, allowing it to handle long documents and extended coding sessions.
What distinguishes GLM-4.7 from earlier GLM releases is a set of three reasoning mechanisms: Interleaved Thinking, which applies reasoning before every response and tool call; Preserved Thinking, which retains reasoning context across conversation turns to maintain consistency; and Turn-level Thinking, which lets developers toggle reasoning depth on or off per turn. On benchmarks, the model scores 73.8% on SWE-bench Verified, 95.7% on AIME 2025, and 87.4% on τ²-Bench. It is best suited for developers and researchers building agent pipelines, automated coding tools, or applications requiring reliable multi-step planning.
What GLM 4.7 supports
Agentic Coding
Handles multi-step coding agent loops while retaining context across turns, scoring 73.8% on SWE-bench Verified and 66.7% on SWE-bench Multilingual.
Terminal Automation
Executes command sequencing, error recovery, and multi-step shell automation, achieving 41.0% on Terminal Bench.
Mathematical Reasoning
Solves advanced math and science problems, scoring 95.7% on AIME 2025 and 97.1% on HMMT Feb. 2025.
Multi-Step Planning
Sequences actions across complex tasks using structured reasoning, scoring 87.4% on τ²-Bench.
Tool Use
Calls external tools reliably within agent loops using Interleaved Thinking, which applies reasoning before each tool invocation.
Long Context Processing
Processes inputs up to 131,072 tokens, supporting extended documents, codebases, and multi-turn conversations.
Multilingual Support
Handles coding and reasoning tasks across multiple languages, as reflected by its dedicated SWE-bench Multilingual score of 66.7%.
Configurable Reasoning
Turn-level Thinking lets developers enable or disable deep reasoning per conversation turn, trading response depth for speed as needed.
Science Reasoning
Addresses graduate-level science questions, achieving 85.7% on GPQA-Diamond.
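To make the Configurable Reasoning feature concrete, here is a minimal sketch of how a client might toggle reasoning depth per turn. This assumes an OpenAI-compatible chat-completions payload; the `thinking` field name, its schema, and the `glm-4.7` model identifier are illustrative assumptions, not confirmed API details.

```python
import json

def build_request(messages, deep_reasoning=True):
    """Build a hypothetical chat request with per-turn reasoning control."""
    return {
        "model": "glm-4.7",  # placeholder model id; check the provider's docs
        "messages": messages,
        # Assumed field for Turn-level Thinking: enable deep reasoning for
        # hard turns, disable it for quick, latency-sensitive ones.
        "thinking": {"type": "enabled" if deep_reasoning else "disabled"},
    }

# A quick edit can skip deep reasoning; a planning task keeps it on.
fast_turn = build_request(
    [{"role": "user", "content": "Rename this variable."}], deep_reasoning=False
)
deep_turn = build_request(
    [{"role": "user", "content": "Plan a multi-file refactor."}], deep_reasoning=True
)
print(json.dumps(fast_turn["thinking"]))
```

The design choice here is simply that reasoning depth becomes a per-request knob rather than a model-wide setting, which is what "per conversation turn" implies.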
Ready to build with GLM 4.7?
Get Started Free
Benchmark scores
Scores represent accuracy — the percentage of questions answered correctly on each test.
| Benchmark | What it tests | Score |
|---|---|---|
| MMLU-Pro | Expert knowledge across 14 academic disciplines | 85.6% |
| GPQA Diamond | PhD-level science questions (biology, physics, chemistry) | 85.9% |
| LiveCodeBench | Real-world coding tasks from recent competitions | 89.4% |
| HLE (Humanity's Last Exam) | Questions that challenge frontier models across many domains | 25.1% |
| SciCode | Scientific research coding and numerical methods | 45.1% |
| AIME 2025 | American math olympiad problems (2025) | 95.7% |
| SWE-bench Verified | Real GitHub issues requiring multi-file code fixes | 73.8% |
Common questions about GLM 4.7
What is the context window size for GLM-4.7?
GLM-4.7 supports a context window of 131,072 tokens, which allows it to process long documents, extended codebases, and lengthy multi-turn conversations in a single session.
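As a rough client-side budgeting sketch: the 4-characters-per-token heuristic below is an approximation for English text, not the model's actual tokenizer, and the reserved output size is an arbitrary example value.

```python
CONTEXT_WINDOW = 131_072  # GLM-4.7 context size in tokens

def estimate_tokens(text: str) -> int:
    # Crude heuristic: ~4 characters per token for English text.
    # Use the model's real tokenizer for exact counts.
    return max(1, len(text) // 4)

def fits_in_context(prompt: str, reserved_for_output: int = 4_096) -> bool:
    """Check whether a prompt plus an output budget fits the window."""
    return estimate_tokens(prompt) + reserved_for_output <= CONTEXT_WINDOW

print(fits_in_context("hello world"))
```

Reserving headroom for the model's response up front avoids truncated completions on long-document inputs.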
What license does GLM-4.7 use?
GLM-4.7 is released under the MIT license, which permits both commercial and non-commercial use without royalty restrictions.
What is the knowledge cutoff for GLM-4.7?
Z.ai has not published an official knowledge cutoff date for GLM-4.7. The model was released in December 2025, so its training data predates that release.
How many parameters does GLM-4.7 have?
GLM-4.7 has 358 billion parameters, making it a large-scale model intended for demanding tasks such as agentic coding, complex reasoning, and terminal automation.
What are the three thinking mechanisms introduced in GLM-4.7?
GLM-4.7 introduces Interleaved Thinking (reasoning before every response and tool call), Preserved Thinking (retaining reasoning context across conversation turns), and Turn-level Thinking (allowing developers to toggle reasoning on or off per turn).
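In an agent loop, Interleaved Thinking means the model reasons before each tool invocation; the client's job is to define tools and dispatch the calls the model emits. The sketch below uses the widely adopted OpenAI-style function-calling schema as an assumption; GLM-4.7's exact tool format may differ, and `run_shell` is a hypothetical tool name.

```python
import json

# Hypothetical tool definition in the common OpenAI-style schema.
shell_tool = {
    "type": "function",
    "function": {
        "name": "run_shell",
        "description": "Execute a shell command and return its output.",
        "parameters": {
            "type": "object",
            "properties": {
                "command": {"type": "string", "description": "Command to run"},
            },
            "required": ["command"],
        },
    },
}

def handle_tool_call(call):
    """Dispatch a model-emitted tool call to a local handler (stubbed here)."""
    args = json.loads(call["arguments"])
    if call["name"] == "run_shell":
        # A real agent would execute the command and return stdout/stderr.
        return f"(would run) {args['command']}"
    raise ValueError(f"unknown tool: {call['name']}")

result = handle_tool_call({"name": "run_shell", "arguments": '{"command": "ls -la"}'})
print(result)
```

The handler's return value would be sent back to the model as a tool result, closing one iteration of the agent loop.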
Who developed GLM-4.7 and where can I access it?
GLM-4.7 was developed by Z.ai, formerly known as Zhipu AI/THUDM. It is available on Hugging Face, via NVIDIA NIM, and through the Z.ai API. The model weights and related code are also accessible on GitHub.
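For API access, a request typically follows the OpenAI-compatible chat-completions shape. The base URL below is a placeholder, not Z.ai's real endpoint, and the model id is assumed; consult the official API documentation before use.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com/v1"  # placeholder endpoint, not Z.ai's real one
API_KEY = "YOUR_API_KEY"

def make_chat_request(prompt: str) -> urllib.request.Request:
    """Assemble (but do not send) a chat-completions request."""
    body = json.dumps({
        "model": "glm-4.7",  # assumed model identifier
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = make_chat_request("Summarize this repo's README.")
print(req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` (or any HTTP client) would return the model's JSON response.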
What people think about GLM 4.7
Community reception on r/LocalLLaMA has been generally positive, with the GLM-4.7 Flash variant thread drawing 755 upvotes and 232 comments, indicating strong interest in running the model locally. Users have highlighted its benchmark performance on coding and reasoning tasks as notable for an open-source release.
Some discussion has shifted toward the subsequent GLM-5 release, which garnered even more engagement, suggesting that a portion of the community views GLM-4.7 as a stepping stone rather than a long-term target. Common use cases mentioned include local deployment for agentic coding workflows and multi-step automation tasks.
Start building with GLM 4.7
No API keys required. Create AI-powered workflows with GLM 4.7 in minutes — free.