MindStudio
Text to Speech Model

Gemini 3.1 Flash TTS

The Gemini 3.1 Flash TTS Preview model provides powerful, low-latency speech generation with natural outputs, steerable prompts, and new expressive audio tags for precise narration control.

Publisher: Google
Type: Text to Speech
Context Window: 16,384 tokens
Training Data: April 2026
Input: $1.00/MTok
Output: $20.00/MTok
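
As a rough illustration of the per-token pricing above, the following sketch estimates the cost of a single request. The function name and the example token counts are hypothetical, and this page does not say how audio output is converted to tokens, so treat the result as back-of-the-envelope arithmetic rather than a billing formula.

```python
# Listed prices, in US dollars per million tokens (MTok).
INPUT_USD_PER_MTOK = 1.00
OUTPUT_USD_PER_MTOK = 20.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost from token counts (hypothetical helper)."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Assumed example: 2,000 input tokens and a full 16,384-token response.
print(f"${estimate_cost_usd(2_000, 16_384):.4f}")  # prints "$0.3297"
```

At these rates the output side dominates: the 16,384 output tokens in the example account for almost all of the estimate.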

Parameters & options

Max Response Size: 16,384 tokens
Voice (Select)

Prebuilt voice preset to use.

Default: Kore
Zephyr (bright)
Puck (upbeat)
Charon (informative)
Kore (firm)
Fenrir (excitable)
Leda (youthful)
Orus (firm)
Aoede (breezy)
Callirhoe (easy-going)
Autonoe (bright)
Enceladus (breathy)
Iapetus (clear)
Umbriel (easy-going)
Algieba (smooth)
Despina (smooth)
Erinome (clear)
Algenib (gravelly)
Rasalgethi (informative)
Laomedeia (upbeat)
Achernar (soft)
Alnilam (firm)
Schedar (even)
Gacrux (mature)
Pulcherrima (forward)
Achird (friendly)
Zubenelgenubi (casual)
Vindemiatrix (gentle)
Sadachbia (lively)
Sadaltager (knowledgeable)
Sulafat (warm)
Style Instruction (Prompt)

Optional natural-language direction for delivery (e.g. "Say cheerfully:", "Whisper softly:", "Narrate dramatically:"). The instruction is prepended to the input text before synthesis. Leave it blank for a neutral read. You can also embed expressive audio tags, such as [happy], [whisper], or [laughing], directly in your input text.
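
To make the prepending behavior concrete, here is a minimal sketch of how the final synthesis input might be assembled from a style instruction and the input text. The helper name is hypothetical, and joining the two parts with a single space is an assumption; the page says only that the instruction is prepended.

```python
def build_tts_input(text: str, style_instruction: str = "") -> str:
    """Prepend an optional style instruction to the text (hypothetical helper).

    Assumption: the instruction and the text are joined by one space.
    """
    style = style_instruction.strip()
    if not style:
        # A blank instruction leaves the text unchanged, for a neutral read.
        return text
    return f"{style} {text}"


# Audio tags such as [whisper] are written inline in the input text itself.
line = "[whisper] The door was already open."
print(build_tts_input(line, "Narrate dramatically:"))
# prints "Narrate dramatically: [whisper] The door was already open."
```

The style instruction sets the overall delivery, while inline tags can adjust individual phrases within the same input.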

Start building with Gemini 3.1 Flash TTS

No API keys required. Create AI-powered workflows with Gemini 3.1 Flash TTS in minutes, for free.