Video Generation Model

Wan 2.7

Alibaba's unified video generation model supporting text-to-video, image-to-video, reference-to-video, and natural-language video editing — with multi-shot storytelling, native audio, and up to 15 seconds of 1080p output.

Publisher Wan
Type Video
Context Window 2,000 tokens
Price $0.05-$0.15/second
Provider Alibaba Cloud
Source Image, Reference Images, Video Editing, Audio
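As a rough illustration of what the per-second price means for a single clip, the sketch below multiplies the listed price range by the three durations the model offers (5, 10, and 15 seconds). The function name `estimate_cost_range` is hypothetical, and the page does not say what determines where a clip falls inside the range, so the sketch reports both bounds rather than a single figure.

```python
from typing import Tuple

# Figures taken from this page. How the rate varies inside the range
# (for example by resolution or audio) is not stated, so both bounds are kept.
PRICE_PER_SECOND_USD = (0.05, 0.15)
DURATIONS_S = (5, 10, 15)


def estimate_cost_range(duration_s: int) -> Tuple[float, float]:
    """Return the (low, high) cost in USD for one clip of the given length."""
    if duration_s not in DURATIONS_S:
        raise ValueError(f"duration must be one of {DURATIONS_S}, got {duration_s}")
    low, high = PRICE_PER_SECOND_USD
    # Round to cents to hide floating-point noise in the products.
    return (round(low * duration_s, 2), round(high * duration_s, 2))


if __name__ == "__main__":
    for seconds in DURATIONS_S:
        low, high = estimate_cost_range(seconds)
        print(f"{seconds:>2}s clip: ${low:.2f} to ${high:.2f}")
```

Treat the result as a planning estimate only. Actual billing depends on the provider's terms.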


**Wan 2.7** is Alibaba's latest video generation model family, available through the DashScope API. It unifies four video creation workflows in a single system: text-to-video, image-to-video, reference-to-video, and natural-language video editing. The system also provides intelligent shot scheduling, native audio generation, and highly realistic dynamic rendering.

### Generation Modes

- **Text to Video**: Generates videos directly from text prompts. Intelligent shot scheduling supports multi-shot narratives with consistent subjects, scenes, and atmosphere, up to 15 seconds in length, with strong instruction following and improved visual fidelity.
- **Image to Video**: Animates a provided first-frame image guided by a text prompt. Supports stable multi-speaker dialogue with natural, realistic vocal timbres and multi-camera storytelling.
- **Reference to Video**: Uses a specified person or object as a reference and keeps its appearance and voice consistent. Multiple character references can be combined for joint performances.
- **Video Edit**: Performs localized or global edits to existing videos using natural-language prompts. Replaces elements using image references and replicates complex dynamic processes, including motion, special effects, and camera movements.

### Key Capabilities

- **Native audio generation**: Synchronized dialogue, vocals, and sound effects generated alongside the video, including lip-synced speech and singing
- **Multi-shot storytelling**: A single prompt can produce multi-scene narratives with automatic camera transitions and consistent characters across shots
- **Output quality**: Generates video at **720p or 1080p** resolution in 16:9 or 9:16 aspect ratios, with clips up to **15 seconds** long
- **Prompt enhancement**: Optional AI-powered prompt expansion for improved output quality

Wan 2.7 is well suited for content creators, marketers, filmmakers, and developers who need a single flexible model for generating new videos with audio from scratch, animating still images, maintaining consistent characters across shots, or making precise edits to existing footage.


Parameters & options

Mode Select
Default: text-to-video
Options: Text to Video, Image to Video, Reference to Video, Video Edit
First Frame Image Image URL

Image used as the first frame of the generated video.

Source Video Video URL

The video to edit. Describe your edits in the prompt using natural language instructions. Supports both localized and global edits.

Reference Images (Reference to Video) Image URL Array

Provide reference images of people or objects to keep their appearance (and voice) consistent in the generated video. Reference them as "Character 1", "Character 2", etc. in the prompt. Supports multi-character joint performances.

Reference Images (Video Edit) Image URL Array

Optionally provide reference images to seamlessly replace elements in the video.

Resolution Toggle Group
Default: 720P
Aspect Ratio Toggle Group
Default: 16:9
Duration Select
Default: 5
Options: 5s, 10s, 15s
Generate Audio Toggle Group

Generate synchronized audio including dialogue, vocals, and sound effects alongside the video.

Shot Type Toggle Group

Multi-shot uses intelligent shot scheduling to generate multi-shot narrative videos with consistent subjects, scenes, and atmosphere.

Default: single
Auto-Enhance Prompt Toggle Group

Automatically rewrite and expand the prompt for improved output quality.

Default: true
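To show how the options above fit together, here is a minimal sketch of a settings object with the listed defaults and a few consistency checks. It is not the MindStudio or DashScope API. The class `WanSettings`, the functions `validate` and `character_labels`, every field name, the mode slugs other than `text-to-video`, and the `"multi"` shot-type value are all assumptions made for illustration. The rule that each mode needs its matching input (first frame, source video, or reference images) is inferred from the parameter descriptions rather than stated outright.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Only "text-to-video" appears verbatim as a default; the other slugs are assumed.
MODES = ("text-to-video", "image-to-video", "reference-to-video", "video-edit")
RESOLUTIONS = ("720P", "1080P")
ASPECT_RATIOS = ("16:9", "9:16")
DURATIONS_S = (5, 10, 15)
SHOT_TYPES = ("single", "multi")  # "multi" is an assumed value name


@dataclass
class WanSettings:
    """Hypothetical container mirroring the options listed on this page."""
    prompt: str
    mode: str = "text-to-video"
    first_frame_image: Optional[str] = None       # image-to-video input
    source_video: Optional[str] = None            # video-edit input
    reference_images: List[str] = field(default_factory=list)
    resolution: str = "720P"
    aspect_ratio: str = "16:9"
    duration_s: int = 5
    generate_audio: Optional[bool] = None         # no default is listed
    shot_type: str = "single"
    enhance_prompt: bool = True


def validate(s: WanSettings) -> List[str]:
    """Return a list of problems; an empty list means the settings look consistent."""
    errors = []
    if s.mode not in MODES:
        errors.append(f"unknown mode: {s.mode}")
    if s.resolution not in RESOLUTIONS:
        errors.append(f"unknown resolution: {s.resolution}")
    if s.aspect_ratio not in ASPECT_RATIOS:
        errors.append(f"unknown aspect ratio: {s.aspect_ratio}")
    if s.duration_s not in DURATIONS_S:
        errors.append(f"unsupported duration: {s.duration_s}")
    if s.shot_type not in SHOT_TYPES:
        errors.append(f"unknown shot type: {s.shot_type}")
    # Mode-specific inputs (assumed to be required for their mode).
    if s.mode == "image-to-video" and not s.first_frame_image:
        errors.append("image-to-video needs first_frame_image")
    if s.mode == "video-edit" and not s.source_video:
        errors.append("video-edit needs source_video")
    if s.mode == "reference-to-video" and not s.reference_images:
        errors.append("reference-to-video needs reference_images")
    return errors


def character_labels(s: WanSettings) -> List[str]:
    """Labels for referring to each reference image in the prompt."""
    return [f"Character {i}" for i in range(1, len(s.reference_images) + 1)]
```

A short usage example, with placeholder URLs:

```python
settings = WanSettings(
    prompt="Character 1 hands a book to Character 2 in a sunlit library.",
    mode="reference-to-video",
    reference_images=["https://example.com/a.png", "https://example.com/b.png"],
    duration_s=10,
)
print(validate(settings))          # list of problems, empty if none
print(character_labels(settings))  # labels to use inside the prompt
```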
