Text to Speech
Convert text into a generated audio file
The Text to Speech block takes a string of text and generates an audio file from it using a speech model. The primary input is the text field, which contains the exact words to be spoken. You can optionally name a destination variable (destinationVar) to store the resulting audio URL, and you can set the intermediateAsset flag to hide the generated asset from the gallery. The block also accepts an optional speech model override (speechModelOverride); if none is provided, it falls back to the workflow's default speech model settings.
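For readers who want to see the inputs side by side, here is a minimal sketch of the block's configuration as a Python data structure. The field names come from this page; the types, the empty-text check, and the helper name build_payload are assumptions made for illustration, not the platform's actual API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TextToSpeechInput:
    # Field names follow this page; the types are assumptions.
    text: str                                  # required: exact words to be spoken
    destinationVar: Optional[str] = None       # optional: variable that receives the audio URL
    intermediateAsset: Optional[bool] = None   # optional: hide the asset from the gallery
    speechModelOverride: Optional[str] = None  # optional: hypothetical model identifier


def build_payload(cfg: TextToSpeechInput) -> dict:
    """Check the one required field and drop optional fields left unset."""
    # Assumption: an empty or whitespace-only string counts as missing text.
    if not isinstance(cfg.text, str) or not cfg.text.strip():
        raise ValueError("text is required and must be a non-empty string")
    payload = {"text": cfg.text}
    for name in ("destinationVar", "intermediateAsset", "speechModelOverride"):
        value = getattr(cfg, name)
        if value is not None:
            payload[name] = value
    return payload
```

Calling build_payload(TextToSpeechInput(text="Hello")) yields a dictionary containing only the text field, which mirrors the fact that every other input is optional.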
What you can build
Real-world workflows powered by the Text to Speech block.
Narrated Article Delivery
Convert a written article or blog post into spoken audio so users can listen to content instead of reading it.
Voice-Enabled Chatbot Responses
Take text responses generated by an AI model and convert them to audio for a voice-based conversational interface.
Language Learning Pronunciation
Generate spoken audio from vocabulary words or phrases to help users hear correct pronunciation in a language learning workflow.
Accessibility Audio Output
Produce audio versions of text-based content to make workflows accessible to users who prefer or require audio output.
Automated Podcast Production
Feed a script into the block to generate a spoken audio file as part of an automated podcast or audio content pipeline.
Notification Audio Alerts
Convert short notification or alert messages into audio files that can be played back to users within an application workflow.
Common questions about Text to Speech
What are the required parameters for this block?
The only required parameter is the text field, which contains the exact string of words to be spoken. All other fields — destinationVar, intermediateAsset, and speechModelOverride — are optional.
What does the block return?
The block outputs an audioUrl field, which is a string containing the URL of the generated audio file.
How do I access the generated audio URL in later steps of my workflow?
You can specify a variable name in the destinationVar field. The block will save the audio URL to that variable, making it available for use in subsequent blocks.
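The hand-off described above can be pictured with a small sketch. It models workflow variables as a plain dictionary, which is an assumption; the function name store_audio_url and the example URL are hypothetical, and only the audioUrl and destinationVar names come from this page.

```python
from typing import Optional


def store_audio_url(variables: dict, block_output: dict,
                    destination_var: Optional[str] = None) -> dict:
    """Return a copy of the workflow variables with the audio URL saved
    under destination_var, if one was given."""
    updated = dict(variables)  # leave the caller's dictionary untouched
    if destination_var:
        updated[destination_var] = block_output["audioUrl"]
    return updated


# Hypothetical block output; the URL is a placeholder.
output = {"audioUrl": "https://example.com/audio/clip.mp3"}
variables = store_audio_url({"articleTitle": "Weekly Update"}, output, "narrationUrl")
print(variables["narrationUrl"])
```

A later block would then read narrationUrl like any other workflow variable. When no destination variable is named, the sketch leaves the variables unchanged.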
Which speech model does the block use?
By default, the block uses the speech model configured at the workflow level. You can override this on a per-block basis using the optional speechModelOverride parameter.
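As a rough illustration of that fallback, the sketch below picks the per-block override when one is set and otherwise returns the workflow-level default. The function name and the model identifiers are hypothetical placeholders; the platform's real settings may be structured differently.

```python
from typing import Optional


def resolve_speech_model(workflow_default: str,
                         speech_model_override: Optional[str] = None) -> str:
    """Prefer the block-level override; fall back to the workflow default."""
    if speech_model_override is not None:
        return speech_model_override
    return workflow_default


# "default-voice" and "narrator-voice" are made-up identifiers.
print(resolve_speech_model("default-voice"))
print(resolve_speech_model("default-voice", "narrator-voice"))
```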
What kinds of workflows commonly use this block?
This block fits into workflows that produce audio output for users, such as voice assistants, narrated content delivery, language learning tools, and accessibility-focused applications. It can also be used in background processing pipelines where the audio URL is stored in a variable for later use.
Related capabilities
User Message
Send a message to an AI model and return the response, or echo a system message.
Generate Image
Generate an image from a text prompt using an AI model.
Generate Video
Generate a video from a text prompt using an AI model.
Generate Chart
Create a chart image using QuickChart (Chart.js) and return the URL.
Generate Music
Generate an audio file from provided instructions (text) using a music model.
Generate LipSync
Generate a lip sync video from provided audio and image.