GPT OSS 120B
OpenAI's flagship open-weight reasoning model with 117B parameters, built for powerful coding, math, and agentic tasks under the Apache 2.0 license.
OpenAI's open-weight reasoning model for code and math
GPT OSS 120B is OpenAI's largest open-weight model, released in August 2025 under the Apache 2.0 license. It has approximately 116.8 billion total parameters and uses a Mixture-of-Experts (MoE) architecture that activates only around 5.1 billion parameters per token, enabling efficient inference on a single H100 GPU. The model is part of the GPT OSS family and is designed for commercial and private deployments without licensing restrictions.
The model is built for coding, mathematical reasoning, scientific analysis, and agentic workflows. It supports a 128,000-token context window, adjustable reasoning levels (low, medium, and high), and native tool use, including web browsing, Python code execution, and custom developer-defined functions. Architecturally, it uses 36 transformer layers with 128 experts per MoE layer (top 4 active per token), Grouped Query Attention, Rotary Position Embeddings, and an alternating local/dense attention pattern. The model is available for local inference via Hugging Face Transformers, llama.cpp, and vLLM.
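The Transformers path can be sketched as below. This is a minimal example, assuming `transformers` is installed and H100-class hardware is available; the model id comes from the Hugging Face listing, and the heavy download only happens when `generate` is actually called:

```python
# Local-inference sketch via Hugging Face Transformers.
# MODEL_ID is taken from the Hugging Face listing; hardware requirements
# (a single H100-class GPU) are per the model card.
MODEL_ID = "openai/gpt-oss-120b"

messages = [
    {"role": "user", "content": "Explain Rotary Position Embeddings in two sentences."},
]

def generate(max_new_tokens=256):
    # Deferred import: transformers is a heavy dependency and the model
    # weights are ~120B parameters, so nothing is loaded at import time.
    from transformers import pipeline
    pipe = pipeline(
        "text-generation",
        model=MODEL_ID,
        torch_dtype="auto",
        device_map="auto",
    )
    out = pipe(messages, max_new_tokens=max_new_tokens)
    # The pipeline returns the chat history with the assistant turn appended.
    return out[0]["generated_text"][-1]["content"]
```

The same model id can be served through vLLM or llama.cpp instead; only the launcher changes, not the prompt format.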
What GPT OSS 120B supports
Mixture-of-Experts Architecture
Uses a MoE design with 128 experts per layer, activating only ~5.1 billion of 116.8 billion total parameters per token for efficient inference.
Adjustable Reasoning
Supports low, medium, and high reasoning levels, allowing developers to tune the trade-off between response speed and reasoning depth.
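In practice the reasoning level is selected in the system prompt. A minimal sketch, assuming the `Reasoning: low|medium|high` line described in the model card's prompt format (treat the exact wording as an assumption):

```python
# Sketch: choosing a reasoning level for gpt-oss via the system prompt.
# The "Reasoning: <level>" convention follows the model card's prompt
# format; the surrounding wording is illustrative.
VALID_LEVELS = {"low", "medium", "high"}

def build_system_prompt(level: str) -> str:
    if level not in VALID_LEVELS:
        raise ValueError(f"level must be one of {sorted(VALID_LEVELS)}")
    return f"You are a helpful assistant.\nReasoning: {level}"

print(build_system_prompt("high"))
```

Lower levels trade reasoning depth for latency, so "low" suits chat-style lookups while "high" suits multi-step math or code.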
Long Context Window
Handles up to 128,000 tokens per request, equivalent to roughly 100,000 words of text in a single prompt.
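The tokens-to-words figure follows from the common rule of thumb of roughly 0.75 English words per token (the exact ratio varies by tokenizer and text):

```python
# Back-of-envelope conversion from tokens to English words.
# 0.75 words/token is a rough heuristic, not a tokenizer-specific value.
def tokens_to_words(tokens: int, words_per_token: float = 0.75) -> int:
    return int(tokens * words_per_token)

print(tokens_to_words(128_000))  # → 96000, i.e. roughly 100,000 words
```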
Coding and Math
Designed for software development, mathematical reasoning, and scientific analysis tasks requiring multi-step problem solving.
Tool Use
Natively supports web browsing, Python code execution, and custom developer-defined functions as callable tools.
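A developer-defined function is typically exposed to the model as a JSON schema. The sketch below uses the OpenAI-style function-calling schema that OpenAI-compatible gpt-oss runtimes (such as vLLM's server) generally accept; `get_weather` and its parameters are hypothetical:

```python
import json

# Hypothetical developer-defined tool, described in the OpenAI-style
# function-calling schema. The model sees this schema and can emit a
# structured call to it instead of plain text.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical function name
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

print(json.dumps(get_weather_tool, indent=2))
```

The schema is passed alongside the chat messages; when the model decides the tool is needed, it returns the function name and arguments rather than a final answer.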
Agentic Workflows
Built for multi-step agentic tasks and integrates with agent frameworks, supporting complex sequences of tool calls and decisions.
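The core of such a workflow is a dispatch loop: the model either requests a tool call or produces a final answer, and the runtime executes tools until it finishes. A minimal sketch with a stub model; real runtimes return structured tool-call objects rather than this simplified dict format:

```python
# Minimal agent-loop sketch. `model_step` stands in for a call to the
# model; the dict-based action format here is illustrative only.
def run_agent(model_step, tools, max_steps=8):
    history = []
    for _ in range(max_steps):
        action = model_step(history)          # model decides the next action
        if action["type"] == "final":
            return action["content"]          # done: return the answer
        result = tools[action["name"]](**action["arguments"])
        history.append({"tool": action["name"], "result": result})
    raise RuntimeError("agent did not finish within max_steps")

# Stub model: call one tool, then answer using its result.
def fake_model(history):
    if not history:
        return {"type": "tool", "name": "add", "arguments": {"a": 2, "b": 3}}
    return {"type": "final", "content": f"sum is {history[-1]['result']}"}

print(run_agent(fake_model, {"add": lambda a, b: a + b}))  # → sum is 5
```

The `max_steps` cap is the usual safeguard against a model that keeps requesting tools without converging.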
Open Source License
Released under the Apache 2.0 license, permitting commercial use, fine-tuning, and private deployment without royalty obligations.
Fast Inference
The MoE architecture keeps the active parameter count low, enabling fast inference, and the model fits on a single H100 GPU for local deployment.

Fine-Tuning Support
Supports fine-tuning workflows, allowing developers to adapt the base model to domain-specific tasks using standard training pipelines.
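One common adaptation path for a model this size is parameter-efficient fine-tuning such as LoRA. A sketch assuming the `peft` library; the hyperparameters and target module names are illustrative starting points, not an official recipe:

```python
# LoRA fine-tuning sketch. Values below are illustrative; tune them for
# your task. Target module names assume standard attention projections.
LORA_HYPERPARAMS = {
    "r": 16,                                  # adapter rank
    "lora_alpha": 32,                         # scaling factor
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "v_proj"],   # assumed projection names
}

def build_lora_model(base_model):
    # Deferred import: peft (and a loaded base model) are heavy dependencies.
    from peft import LoraConfig, get_peft_model
    config = LoraConfig(task_type="CAUSAL_LM", **LORA_HYPERPARAMS)
    return get_peft_model(base_model, config)
```

LoRA trains only the small adapter matrices, which matters here: full fine-tuning of ~117B parameters is far beyond a single-GPU budget.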
Ready to build with GPT OSS 120B?
Get Started Free

Benchmark scores
Scores represent accuracy — the percentage of questions answered correctly on each test.
| Benchmark | What it tests | Score |
|---|---|---|
| MMLU-Pro | Expert knowledge across 14 academic disciplines | 80.8% |
| GPQA Diamond | PhD-level science questions (biology, physics, chemistry) | 78.2% |
| LiveCodeBench | Real-world coding tasks from recent competitions | 87.8% |
| HLE | Questions that challenge frontier models across many domains | 18.5% |
| SciCode | Scientific research coding and numerical methods | 38.9% |
Common questions about GPT OSS 120B
What is the context window for GPT OSS 120B?
GPT OSS 120B supports a 128,000-token context window, which is roughly equivalent to 100,000 words of text in a single request.
What license does GPT OSS 120B use?
The model is released under the Apache 2.0 license, which permits commercial use, modification, fine-tuning, and private deployment.
What is the training data cutoff for GPT OSS 120B?
The model was released in August 2025. A specific training data cutoff date is not stated in the available metadata.
How many parameters does GPT OSS 120B have, and how does the MoE architecture affect inference?
The model has approximately 116.8 billion total parameters, but its Mixture-of-Experts architecture activates only around 5.1 billion parameters per token during inference, reducing compute requirements compared to a dense model of the same total size.
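The saving is easy to quantify from the figures above:

```python
# Back-of-envelope: share of parameters active per token in the MoE,
# using the totals stated for GPT OSS 120B.
TOTAL_B = 116.8   # total parameters, in billions
ACTIVE_B = 5.1    # parameters active per token, in billions

fraction = ACTIVE_B / TOTAL_B
print(f"{fraction:.1%} of parameters active per token")  # → 4.4%
```

So per-token compute is closer to that of a ~5B dense model, while the full 116.8B parameters still provide the model's capacity.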
Where can GPT OSS 120B be deployed?
The model is available on AWS via Amazon Bedrock and SageMaker JumpStart, on NVIDIA NIM, and locally through Hugging Face Transformers, llama.cpp, and vLLM. It fits on a single H100 GPU for local inference.
Does GPT OSS 120B support tool use and agentic tasks?
Yes. The model natively supports web browsing, Python code execution, and custom developer-defined functions, and it is designed for multi-step agentic workflows and integration with agent frameworks.
What people think about GPT OSS 120B
Community reception on r/LocalLLaMA has been largely positive, with the release announcement drawing over 2,000 upvotes and 551 comments, making it one of the more discussed open-weight model launches in the community. Users have praised the model's coding and reasoning capabilities, with one thread titled "OpenAI GPT-OSS-120b is an excellent model" accumulating 202 upvotes and 146 comments.
Some community members have raised concerns about benchmark performance, particularly on Simple-Bench, where results were described as disappointing in a dedicated thread. Creative writing and EQ-Bench results were also discussed separately, suggesting the community is actively evaluating the model across a range of tasks beyond coding and math.
- openai/gpt-oss-120b · Hugging Face
- OpenAI GPT-OSS-120b is an excellent model
- OpenAI gpt-oss-120b & 20b EQ-Bench & creative writing results
- GPT-OSS 120B Simple-Bench is not looking great either. What is going on Openai?
- 🚀 OpenAI released their open-weight models!!!
Start building with GPT OSS 120B
No API keys required. Create AI-powered workflows with GPT OSS 120B in minutes — free.