Video Generation Model

Wan 2.6

Alibaba's open-source video generation model that creates cinematic 1080p videos up to 15 seconds with native audio and lip sync from a single prompt.


Publisher: Wan

Type: Video

Context Window: 1,000 tokens

Training Data: December 2025

Price: $0.10-$0.15/second

Provider: WaveSpeed

Source Image, Source Audio, Multi-Shot


About Wan 2.6

Text-to-video with native audio and lip sync

Wan 2.6 is a video generation model developed by Alibaba that produces 1080p video at 24 frames per second for clips up to 15 seconds long. It accepts text, image, or video as input and generates the finished clip in a single generation pass: video with synchronized audio (dialogue and sound effects) and matching lip movements, with no separate audio pipeline required. The model was trained with a cutoff of December 2025 and is available as an open-source release.

Wan 2.6 is designed for creators, marketers, and developers who need publish-ready video content without extensive post-production work. Its distinguishing features include multi-shot narrative handling across a single clip, character consistency when using reference figures, physics simulation for realistic motion, and style transfer from reference videos. These capabilities make it suited for use cases such as social media content, product demonstrations, commercials, and short narrative sequences.

Capabilities

What Wan 2.6 supports

Native Audio Generation

Generates synchronized audio (dialogue and sound effects) with matching lip movements alongside the video in a single pass, eliminating the need for separate dubbing tools.

Image-to-Video

Accepts a source image as input and animates it into a 1080p video clip up to 15 seconds long.

Multi-Shot Narratives

Handles camera transitions and scene segmentation across a single 15-second clip based on a text description of the full scene.

Character Consistency

Places reference figures into generated scenes while maintaining consistent appearance, voice, and interaction throughout the clip.

Physics Simulation

Renders gravity, fluid dynamics, and complex object interactions to produce realistic motion in action and product shots.

Video Style Transfer

Locks onto motion from a reference video so the performance is preserved while the visual environment is replaced.

Seed Control

Accepts a seed value as input to enable reproducible generation outputs across multiple runs.

Text Rendering

Supports rendering legible text within generated video frames, useful for graphics, titles, and on-screen labels.


FAQ

Common questions about Wan 2.6

What is the context window for Wan 2.6?

Wan 2.6 has a context window of 1,000 tokens, which applies to the text prompt input used to guide video generation.

What is the maximum video length and resolution Wan 2.6 can produce?

Wan 2.6 generates video at up to 1080p resolution and 24 frames per second, with a maximum clip length of 15 seconds.
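As a worked example of what these limits imply, the sketch below computes the frame count and the listed price range for a clip of a given length. It uses only the figures stated on this page (24 fps, 15-second maximum, $0.10-$0.15 per second). The function name `clip_budget` is illustrative, and it assumes billing scales linearly with clip length; the page does not say what determines where in the price range a given clip falls.

```python
FPS = 24            # frame rate stated on this page
MAX_SECONDS = 15    # maximum clip length stated on this page
PRICE_LOW = 0.10    # USD per second, lower end of the listed price
PRICE_HIGH = 0.15   # USD per second, upper end of the listed price


def clip_budget(seconds: float) -> dict:
    """Return the frame count and listed price range for one clip.

    Assumes cost is proportional to duration (an assumption, not a
    documented billing rule).
    """
    if not 0 < seconds <= MAX_SECONDS:
        raise ValueError(f"clip length must be in (0, {MAX_SECONDS}] seconds")
    return {
        "frames": round(seconds * FPS),
        "cost_low_usd": round(seconds * PRICE_LOW, 2),
        "cost_high_usd": round(seconds * PRICE_HIGH, 2),
    }


print(clip_budget(15))
```

Under these assumptions, a maximum-length 15-second clip contains 360 frames and costs between $1.50 and $2.25.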

Does Wan 2.6 require a separate tool to add audio to generated videos?

No. Wan 2.6 generates native audio, including synchronized dialogue and sound effects with matching lip movements, in the same generation pass that produces the video.

What input types does Wan 2.6 accept?

Wan 2.6 accepts text prompts and image URLs, along with generation options (such as style or mode selections and on/off toggles) and a seed value for reproducibility.
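To make these inputs concrete, here is a minimal sketch of how a client might assemble and validate a generation request before sending it. This page does not document a request schema, so every field name (`prompt`, `image_url`, `duration`, `seed`), the default duration of 5 seconds, and the helper `build_request` are hypothetical. Only the 15-second and 1,000-token limits come from this page, and the token check is a rough word-count proxy because the model's tokenizer is not described here.

```python
import json

MAX_SECONDS = 15           # maximum clip length stated on this page
MAX_PROMPT_TOKENS = 1000   # context window stated on this page


def build_request(prompt, image_url=None, duration=5, seed=None):
    """Assemble a JSON-serializable payload. All field names are hypothetical."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    # Rough proxy only: whitespace-separated words stand in for tokens,
    # since the real tokenizer is not documented on this page.
    if len(prompt.split()) > MAX_PROMPT_TOKENS:
        raise ValueError("prompt likely exceeds the 1,000-token context window")
    if not 0 < duration <= MAX_SECONDS:
        raise ValueError(f"duration must be in (0, {MAX_SECONDS}] seconds")

    payload = {"prompt": prompt, "duration": duration}
    if image_url is not None:
        payload["image_url"] = image_url  # optional image-to-video source
    if seed is not None:
        payload["seed"] = int(seed)       # fixed seed for reproducible runs
    return payload


request = build_request("A barista pours latte art, soft cafe ambience", seed=42)
print(json.dumps(request, indent=2))
```

Validating locally before submitting avoids paying for a request that would be rejected; the actual parameter names and accepted values should be taken from the provider's API reference.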

What is the training data cutoff for Wan 2.6?

The model's training data has a cutoff of December 2025.

Is Wan 2.6 open source?

Yes. Wan 2.6 is released as an open-source model by Alibaba.

Community Discussion

What people think about Wan 2.6

Community discussion around Wan 2.6 on Reddit was generally positive, with users highlighting its native audio synchronization and 1080p output as notable features for a single-pass video generation model. The thread, which received 226 upvotes and 78 comments, noted that the model appeared on API platforms ahead of Alibaba's official announcement event.

Some commenters framed the release in the context of the broader competitive landscape for video generation models, while others focused on practical use cases such as short-form content and automated video production. No widespread technical limitations were documented in the available thread data.

r/singularity 226 pts 78 comments

Alibaba just dropped "Wan 2.6" (Sora Rival) on API platforms ahead of tomorrow's official event. Features 1080p, Native Audio Sync and 15s clips.


Resources