Grok 4
xAI's most powerful reasoning model, trained with massive-scale reinforcement learning to achieve world-leading performance on the hardest academic and scientific benchmarks.
Large-scale reasoning across science, math, and code
Grok 4 is a text generation model developed by xAI, released on July 9, 2025, and trained using reinforcement learning on xAI's 200,000-GPU Colossus cluster. It features a 256,000-token context window and was built with a 6x improvement in compute efficiency over its predecessor, with verifiable training data expanded well beyond mathematics and coding. The model is designed for tasks requiring deep reasoning, including expert-level problems in science, mathematics, and software development.
What distinguishes Grok 4 is its native tool use — it was trained to autonomously operate a code interpreter and web browser, selecting its own search queries to produce thorough answers. It also integrates real-time web search and X (Twitter) search, including keyword, semantic, and media search. A variant called Grok 4 Heavy runs multiple reasoning agents in parallel at inference time to handle the most demanding problems, and it was the first model to score above 50% on the Humanity's Last Exam benchmark. Grok 4 is available to SuperGrok and Premium+ subscribers on grok.com and through the xAI API.
What Grok 4 supports
Advanced Reasoning
Applies multi-step reasoning to expert-level problems in science, mathematics, and coding, trained via reinforcement learning at scale on xAI's 200,000-GPU Colossus cluster.
Native Tool Use
Autonomously selects and operates tools such as a code interpreter and web browser, choosing its own search queries to construct thorough, grounded answers.
Real-Time Web Search
Integrates live web search and X (Twitter) search — including keyword, semantic, and media search — to retrieve up-to-date information during a response.
Long Context Window
Supports a 256,000-token context window, enabling processing of lengthy documents, codebases, or multi-turn conversations in a single request.
Math & Science Problem Solving
Achieves 61.9% on USAMO 2025 olympiad math proofs (Grok 4 Heavy) and 50.7% on the Humanity's Last Exam text-only subset, as reported by xAI.
Parallel Agent Reasoning
Grok 4 Heavy runs multiple reasoning agents simultaneously at test time, allowing it to tackle problems that benefit from parallel exploration of solution paths.
Code Generation
Generates, debugs, and explains code across common programming languages, with access to a built-in code interpreter for execution and verification.
Agentic Task Execution
Handles multi-step autonomous tasks, achieving high scores on Vending-Bench, a benchmark designed to evaluate agentic decision-making over extended task sequences.
Ready to build with Grok 4?
Get Started FreeBenchmark scores
Scores represent accuracy — the percentage of questions answered correctly on each test.
| Benchmark | What it tests | Score |
|---|---|---|
| MMLU-Pro | Expert knowledge across 14 academic disciplines | 86.6% |
| GPQA Diamond | PhD-level science questions (biology, physics, chemistry) | 87.7% |
| MATH-500 | Undergraduate and competition-level math problems | 99.0% |
| AIME 2024 | American math olympiad problems | 94.3% |
| LiveCodeBench | Real-world coding tasks from recent competitions | 81.9% |
| HLE | Questions that challenge frontier models across many domains | 23.9% |
| SciCode | Scientific research coding and numerical methods | 45.7% |
| AIME 2025 | American math olympiad problems (2025) | 91.7% |
| SWE-bench Verified | Real GitHub issues requiring multi-file code fixes | 69.1% |
Common questions about Grok 4
What is the context window for Grok 4?
Grok 4 supports a context window of 256,000 tokens, allowing it to process long documents, extended conversations, or large codebases in a single request.
How can I access Grok 4?
Grok 4 is available to SuperGrok and Premium+ subscribers on grok.com. Developers can also access it programmatically through the xAI API. Pricing details are listed on the xAI API Models & Pricing page.
What is Grok 4 Heavy and how does it differ from standard Grok 4?
Grok 4 Heavy is a variant that runs multiple reasoning agents in parallel at inference time, using additional compute to tackle the most demanding problems. It achieved 61.9% on USAMO 2025 and was the first model to exceed 50% on Humanity's Last Exam, according to xAI's published benchmarks.
What is the training data cutoff for Grok 4?
According to the available metadata, Grok 4's training date is listed as July 2025. However, the model also integrates real-time web search and X (Twitter) search, which allows it to retrieve information beyond its training cutoff during a conversation.
What hardware was used to train Grok 4?
Grok 4 was trained on xAI's Colossus cluster, which consists of 200,000 GPUs. xAI reports a 6x improvement in compute efficiency compared to prior training runs.
What benchmarks has Grok 4 been evaluated on?
According to xAI, Grok 4 scored 50.7% on the Humanity's Last Exam text-only subset, 15.9% on ARC-AGI V2, and 61.9% on USAMO 2025 math proofs (in its Grok 4 Heavy configuration). It also achieved high scores on Vending-Bench for agentic task performance.
What people think about Grok 4
Community discussion ahead of Grok 4's release was largely anticipatory, with threads citing xAI engineers claiming it would represent a larger generational leap than Grok 3 was over Grok 2. After release, discussion focused on benchmark results, particularly its performance on Humanity's Last Exam and ARC-AGI V2.
A widely circulated thread raised concerns about content moderation after users reported the model producing inappropriate outputs, reflecting ongoing community scrutiny of safety guardrails. Discussions around Grok 4 Fast benchmarks and a rumored future Grok 4.20 release also generated engagement, indicating active interest in xAI's development roadmap.
xAI employee bragging about upcoming release of grok 4
xAI Engineer: "Grok 4 is coming, and its going to be a bigger jump from grok 3 than grok 3 was from 2."
xAI releases details and performance benchmarks for Grok 4 Fast
xAI to launch Grok 4.20 by Christmas
Grok says its surname is Hitler
Parameters & options
Explore similar models
Start building with Grok 4
No API keys required. Create AI-powered workflows with Grok 4 in minutes — free.