GPT-4o-mini TTS
A lightweight, cost-efficient text-to-speech model from OpenAI that converts written text into natural-sounding audio.
GPT-4o-mini TTS is a text-to-speech model developed by OpenAI that converts written text into natural-sounding spoken audio. It belongs to the GPT-4o mini family, which is designed to deliver capable output at a smaller computational footprint than full-scale variants. The model is accessible to developers through the OpenAI API and is intended for programmatic speech generation across a range of applications. It accepts a text input of up to 2,000 characters and returns audio output in a synthesized voice.
GPT-4o-mini TTS is part of OpenAI's broader suite of audio models, which also includes transcription and speech-to-speech capabilities. Its focus is specifically the text-to-speech task: producing clear, expressive spoken output from plain text. The model is well suited to voice-enabled applications, accessibility tools, content narration, and any product that needs reliable, scalable audio generation without the cost of a larger model. Developers can configure voice selection and other parameters through the API.
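As an illustration of the API access described above, here is a minimal Python sketch assuming the current openai SDK's audio.speech interface; the voice name "alloy" and the output filename are placeholder assumptions, not values taken from this page.

```python
# A minimal sketch of calling the speech endpoint with the openai Python SDK.
# The voice name "alloy" and output filename are illustrative assumptions.

def build_tts_request(text: str, voice: str = "alloy") -> dict:
    """Build keyword arguments for a gpt-4o-mini-tts request, enforcing
    the 2,000-character input limit described above."""
    if len(text) > 2000:
        raise ValueError("input exceeds the 2,000-character limit")
    return {"model": "gpt-4o-mini-tts", "voice": voice, "input": text}

def synthesize(text: str, out_path: str = "speech.mp3") -> None:
    """Send the request and stream the audio to a file.
    Requires the openai package and OPENAI_API_KEY in the environment."""
    from openai import OpenAI
    client = OpenAI()
    with client.audio.speech.with_streaming_response.create(
        **build_tts_request(text)
    ) as response:
        response.stream_to_file(out_path)

# synthesize("Hello from GPT-4o-mini TTS!")  # uncomment with a valid API key
```

Keeping the request construction separate from the network call makes the length check easy to reuse and test without an API key.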
What GPT-4o-mini TTS supports
Text to Speech
Converts written text input into synthesized spoken audio output. Accepts up to 2,000 characters of text per request.
Voice Selection
Allows developers to choose from available voice options via a select input parameter in the API request.
Expressive Audio Output
Produces natural-sounding speech with expressive qualities, suitable for narration and voice-enabled interfaces.
API Integration
Available through the OpenAI API, enabling developers to integrate speech generation directly into applications via standard API calls.
Configurable Parameters
Supports numeric and select input types, allowing developers to adjust settings such as speed or format alongside voice and prompt inputs.
Common questions about GPT-4o-mini TTS
What is the input limit for GPT-4o-mini TTS?
The model accepts up to 2,000 characters of input text, which is the maximum amount of text that can be submitted in a single request.
How is GPT-4o-mini TTS accessed?
The model is available through the OpenAI API using the model ID gpt-4o-mini-tts. Developers can integrate it into applications by making API calls with a text prompt and optional configuration parameters.
What input types does GPT-4o-mini TTS accept?
The model accepts a text prompt, a select input for options such as voice choice, and a numeric input for parameters such as speed.
Does GPT-4o-mini TTS have a training data cutoff date?
A specific training date is not provided for this model. As a text-to-speech model, it does not rely on a knowledge cutoff in the same way that language models do, since it converts text to audio rather than generating factual content.
What types of applications is GPT-4o-mini TTS suited for?
The model is designed for use cases that require programmatic speech generation, including voice-enabled applications, accessibility tools, content narration systems, and other products that need scalable audio output from text input.
What people think about GPT-4o-mini TTS
Community discussion that addresses GPT-4o-mini TTS directly is limited in the threads surveyed, though one Reddit post notes that OpenAI quietly released updated versions of its TTS and related audio models in the API in December 2025, suggesting ongoing development. General sentiment around OpenAI's audio model updates tends to be positive, with developers noting the availability of new versioned releases.
The threads found largely focus on other TTS models or broad AI release roundups rather than on GPT-4o-mini TTS specifically, so detailed community feedback on its limitations or use cases is not well represented in this sample. Developers interested in community discussion may find more targeted feedback in the OpenAI developer forums or the OpenAI API changelog.
Higgs Audio V2: A New Open-Source TTS Model with Voice Cloning and SOTA Expressiveness
All Major LLM Releases from 2025 - Today (Source: Lex Fridman State of AI in 2026 Video)
OpenAI just stealth-dropped new "2025-12-15" versions of their Realtime, TTS and Transcribe models in the API.
Parameters & options
Voice: the voice to use for the generated audio.
Instructions: control the voice of the generated audio with additional instructions.
Speed: the speed of the generated audio (default is 1.0).
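Assuming these options map to the voice, instructions, and speed parameters of the OpenAI speech endpoint, a small helper for assembling them might look like the sketch below; the 0.25 to 4.0 speed range is an assumption based on the API reference, not a value stated on this page.

```python
from typing import Optional

def speech_options(voice: str = "alloy",
                   instructions: Optional[str] = None,
                   speed: float = 1.0) -> dict:
    """Collect optional synthesis settings. Speed defaults to 1.0 as noted
    above; the accepted range used here (0.25 to 4.0) is an assumption."""
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    options = {"voice": voice, "speed": speed}
    if instructions is not None:
        options["instructions"] = instructions
    return options

# Example: a slightly faster, calmer delivery
# opts = speech_options(instructions="Speak in a calm tone.", speed=1.2)
```

Validating the options client-side gives a clearer error than a rejected API request would.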
Explore similar models
Start building with GPT-4o-mini TTS
No API keys required. Create AI-powered workflows with GPT-4o-mini TTS in minutes — free.