Z Image Turbo Controlnet
Alibaba's Tongyi Lab Z-Image is a 6-billion-parameter image generation foundation model that delivers state-of-the-art visual quality and prompt coherence using an efficient single-stream diffusion transformer architecture.
Fast image generation with ControlNet guidance
Z Image Turbo Controlnet is an image generation model developed by Alibaba's Tongyi-MAI lab, built on a single-stream diffusion transformer architecture with 6 billion parameters. It uses a few-step distillation approach (the Turbo variant) to accelerate inference while preserving output quality, and incorporates ControlNet to allow structural guidance from a source image. The model was trained with a multi-level captioning system and a data infrastructure that includes a Cross-modal Vector Engine and World Knowledge Topological Graph to improve semantic alignment between prompts and outputs.
This model is well-suited for workflows that require both speed and structural control over generated images, such as guided creative generation, image editing pipelines, and rapid prototyping. It accepts image URLs as source inputs alongside configurable parameters including seed values for reproducibility. An RLHF alignment pipeline using DPO and GRPO stages was applied to bring outputs closer to human aesthetic preferences, and a built-in prompt enhancer with reasoning chain helps produce better results from short or underspecified prompts.
What Z Image Turbo Controlnet supports
ControlNet Guidance
Accepts a source image URL to provide structural or compositional control over the generated output, enabling guided image generation from a reference.
Turbo Inference
Uses few-step distillation to reduce the number of diffusion steps required at inference time, producing results faster without significant quality degradation.
Text-to-Image Generation
Generates images from text prompts using a 6-billion-parameter single-stream diffusion transformer, with a built-in prompt enhancer that applies a reasoning chain to improve results from short inputs.
Seed-Based Reproducibility
Accepts a numeric seed input so that generation results can be reproduced exactly across multiple runs with the same parameters.
RLHF Alignment
Trained with a reinforcement learning from human feedback pipeline using DPO and GRPO stages to align generated images with human aesthetic preferences.
Configurable Generation Parameters
Exposes multiple select-type inputs allowing users to configure generation options such as style or quality mode directly within the request.
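Seed-based reproducibility works because identical parameters plus an identical seed fully determine the generation. The sketch below illustrates the idea with a hypothetical request payload; field names such as "prompt", "image_url", and "seed" are assumptions for illustration, not the model's documented schema.

```python
# Sketch: same seed + same parameters should reproduce the same image.
# All field names here are illustrative assumptions.

def build_request(prompt, image_url, seed):
    """Assemble a hypothetical generation request payload."""
    return {
        "prompt": prompt,
        "image_url": image_url,  # ControlNet source image
        "seed": seed,            # fixed seed -> reproducible output
    }

# Two requests with identical parameters and the same seed are identical,
# so the model should return the same image for both.
a = build_request("a watercolor street scene", "https://example.com/ref.png", 42)
b = build_request("a watercolor street scene", "https://example.com/ref.png", 42)
print(a == b)  # True
```

Changing only the seed (or omitting it) yields a different sample from the same prompt, which is the usual way to explore variations while keeping the ControlNet guidance fixed.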
Common questions about Z Image Turbo Controlnet
What is the context window for this model?
The model has a context window of 10,000 tokens, as specified in its metadata.
Who developed Z Image Turbo Controlnet?
It was developed by Alibaba's Tongyi-MAI lab and is published under the Qwen publisher on MindStudio.
What inputs does this model accept?
The model accepts an image URL (for ControlNet source guidance), two select-type configuration inputs, a numeric parameter, and a seed value for reproducibility.
What is the training cutoff date for this model?
According to the metadata, the model's training date is November 2024.
How does the Turbo variant differ from the base Z-Image model?
The Turbo variant applies few-step distillation to the base 6-billion-parameter Z-Image model, reducing the number of diffusion steps needed at inference time for faster generation while aiming to preserve output quality.
Do I need to provide an API key to use this model on MindStudio?
No API key is required. You can use Z Image Turbo Controlnet directly through MindStudio without managing separate API credentials.
Documentation & links
Parameters & options
Source image: Reference image URL for ControlNet to extract structural guidance from.
Control mode: 'depth' for depth map guidance, 'canny' for edge detection, 'pose' for human pose estimation, 'none' for no control.
Size: Output image size in pixels (width*height).
Strength: Controls how strongly the ControlNet guidance affects the output. Higher values follow the control signal more strictly.
Seed: Random seed for reproducible generation. Use -1 for a random seed.
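The parameters above can be assembled into a request like the following minimal sketch. The field names ("image_url", "control_mode", "size", "strength", "seed") and the 0-to-1 strength range are assumptions for illustration; consult the MindStudio model page for the actual schema.

```python
# Sketch of a request covering the documented parameters. All field names
# and the strength range are hypothetical, not a confirmed API schema.

VALID_MODES = {"depth", "canny", "pose", "none"}

def make_payload(image_url, mode="canny", size="1024*1024",
                 strength=0.8, seed=-1):
    """Validate and assemble the generation parameters."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    if not 0.0 <= strength <= 1.0:  # assumed range; higher = stricter guidance
        raise ValueError("strength must be in [0, 1]")
    return {
        "image_url": image_url,   # ControlNet reference image
        "control_mode": mode,     # depth / canny / pose / none
        "size": size,             # width*height in pixels
        "strength": strength,     # how strictly to follow the control signal
        "seed": seed,             # -1 requests a random seed
    }

payload = make_payload("https://example.com/ref.png", mode="depth", seed=7)
print(payload["control_mode"])  # depth
```

Validating the mode and strength client-side, as sketched here, surfaces configuration mistakes before a generation request is sent.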
Start building with Z Image Turbo Controlnet
No API keys required. Create AI-powered workflows with Z Image Turbo Controlnet in minutes — free.