Grok 3
xAI's flagship large language model, delivering state-of-the-art reasoning, coding, and creative writing capabilities on par with the best models in the world.
xAI's flagship model for reasoning and code
Grok 3 is the flagship large language model from xAI, developed and released in February 2025. It was built from the ground up in approximately one year and is designed to handle demanding tasks including advanced reasoning, coding, and creative writing. The model is available via API under the identifier grok-3-latest and supports a context window of 131,072 tokens. It includes a dedicated Thinking mode that enables multi-step reasoning on complex problems.
Grok 3 is well-suited for tasks that require structured, multi-step problem solving, such as scientific research, advanced mathematics, and complex software development. It scored 96% on AIME, a challenging mathematics competition benchmark, and 85% on GPQA, a graduate-level science reasoning benchmark. The model also supports image understanding, function calling, and structured output generation, making it usable across a range of developer and research workflows. It ranked first in creative writing evaluations at the time of its release.
What Grok 3 supports
Extended Thinking
A dedicated Thinking mode lets the model reason through multi-step problems before producing a final answer, rather than responding immediately. This is particularly useful for mathematics and logic-heavy tasks.
Advanced Reasoning
Grok 3 scored 96% on AIME and 85% on GPQA, demonstrating strong performance on graduate-level science and competitive mathematics benchmarks.
Code Generation
Supports a wide range of programming tasks including code writing, debugging, and explanation. Ranked highly in coding benchmarks at the time of its launch.
Creative Writing
Ranked first in creative writing evaluations at launch, handling tasks such as narrative generation, dialogue, and stylistic composition.
Image Understanding
Can analyze image inputs and respond to questions about visual content, supporting multimodal workflows alongside text-based tasks.
Function Calling
Supports structured function calling, allowing developers to define callable tools that the model can invoke as part of a response.
Structured Outputs
Can generate outputs in structured formats, making it easier to parse model responses programmatically in application pipelines.
Large Context Window
Supports a context window of 131,072 tokens, enabling processing of long documents, codebases, or extended conversation histories in a single request.
Ready to build with Grok 3?
Get Started FreeBenchmark scores
Scores represent accuracy — the percentage of questions answered correctly on each test.
| Benchmark | What it tests | Score |
|---|---|---|
| MMLU-Pro | Expert knowledge across 14 academic disciplines | 79.9% |
| GPQA Diamond | PhD-level science questions (biology, physics, chemistry) | 69.3% |
| MATH-500 | Undergraduate and competition-level math problems | 87.0% |
| AIME 2024 | American math olympiad problems | 33.0% |
| LiveCodeBench | Real-world coding tasks from recent competitions | 42.5% |
| HLE | Questions that challenge frontier models across many domains | 5.1% |
| SciCode | Scientific research coding and numerical methods | 36.8% |
Common questions about Grok 3
What is the context window size for Grok 3?
Grok 3 supports a context window of 131,072 tokens, which allows it to process long documents, extended conversations, or large codebases within a single request.
What is the knowledge cutoff date for Grok 3?
Based on the available metadata, Grok 3's training data has a cutoff of February 2025.
How do I access Grok 3 via API?
Grok 3 is available through the xAI API using the model identifier grok-3-latest. Pricing and endpoint details are documented at docs.x.ai/developers/models.
Does Grok 3 support image inputs?
Yes, Grok 3 supports image understanding, meaning it can accept image inputs and respond to questions about visual content alongside text.
What benchmarks has Grok 3 been evaluated on?
Grok 3 has been evaluated on AIME, where it scored 96%, and GPQA, where it scored 85%. It also ranked first in creative writing evaluations at the time of its release.
What people think about Grok 3
Community discussion around Grok 3 on Reddit reflects interest in xAI's rapid development pace, with users noting that Grok 3 arrived roughly five months after Grok 2 and that a Grok 4 is already anticipated. Some threads highlight the speed of xAI's release cadence as a notable characteristic of the company.
A significant portion of community discussion centers on concerns about potential changes to future model versions, including debates about training data manipulation and ideological alignment. Discussions about Grok 3 itself are often framed in the context of what comes next, with an xAI engineer's comments about Grok 4 generating considerable engagement.
Elon is working on Grok 3.5 and will push xAI towards removing "leftist indoctrination" from the model. This can be accomplished by either significantly manipulating the training data and messing with Grok's ontology (the exact things AI doomers were/are worried about)
xAI Engineer: "Grok 4 is coming, and its going to be a bigger jump from grok 3 than grok 3 was from 2."
Grok 2 launched ~11 months ago (Aug 14, 2024), Grok 3 ~5 months ago (Feb 17, 2025) and now Grok 4, xAI does makes everyone else seem very slow
Parameters & options
Explore similar models
Start building with Grok 3
No API keys required. Create AI-powered workflows with Grok 3 in minutes — free.