AI Models
Explore 245+ AI models available in MindStudio. From large language models to image generators — no API keys required.
Anthropic 15
Claude 4.6 Sonnet
Anthropic's most capable Sonnet model yet, delivering near-Opus-level intelligence for coding, agents, and computer use with a 1M token context window.
Text: Claude 4.6 Opus
Anthropic's most intelligent model, setting new standards in agentic coding, complex reasoning, and professional knowledge work.
Text: Claude 4.5 Opus
Anthropic's most powerful flagship model, setting the state of the art in coding, agentic workflows, and real-world computer use.
Text: Claude 4.5 Haiku
Anthropic's fastest and most efficient model, delivering frontier-level coding performance at a fraction of the cost and more than twice the speed of its predecessors.
Text: Claude 4.5 Sonnet
Anthropic's most intelligent model, leading the world in coding, computer use, and complex agentic tasks.
Text: Claude 4.1 Opus
Anthropic's most advanced model, purpose-built for complex coding, agentic tasks, and deep research requiring sustained multi-step reasoning.
Text: Claude 4 Opus
Anthropic's most advanced flagship model at launch, excelling at complex coding, deep reasoning, and long-horizon agentic tasks.
Text: Claude 4 Sonnet
Claude Sonnet 4 is Anthropic's efficient and capable mid-tier model, delivering state-of-the-art coding and reasoning performance at scale.
Text: Claude 3 Sonnet
Ideal balance of intelligence and speed for enterprise workloads, offering cost-effective performance.
Text: Claude 3 Haiku
Fast, affordable model with strong vision capabilities and performance for diverse enterprise applications.
Text: Claude 3.5 Sonnet
Performance boost, combined with cost-effective pricing, makes Claude 3.5 Sonnet ideal for complex tasks.
Text: Claude 3.5 Sonnet v2
Sets new industry benchmarks in reasoning, knowledge, and coding with improved grasp of nuance and humor.
Text: Claude 3.5 Haiku
Faster than Claude 3 Haiku with improved performance across multiple intelligence benchmarks.
Text: Claude 3 Haiku
Fast, affordable model with strong vision capabilities and performance for diverse enterprise applications.
Text: Claude Instant (Deprecated)
OpenAI 45
GPT 5.4 Pro
OpenAI's most capable and efficient frontier model for professional work, combining powerful reasoning with reliable agentic execution at scale.
Text: GPT 5.4
OpenAI's most capable and efficient frontier model for professional work, combining powerful reasoning with reliable agentic execution at scale.
Text: GPT-5.2 Pro
The most capable model series yet for professional knowledge work.
Text: GPT-5.1
The best model for coding and agentic tasks across domains.
Text: o3-pro
OpenAI's most powerful reasoning model, designed to tackle the hardest problems with deeper thinking, multimodal understanding, and autonomous tool use.
Text: GPT-5 Chat
GPT-5 model used in ChatGPT.
Text: GPT-5 nano
Fastest, most cost-efficient version of GPT-5.
Text: GPT-5 mini
A faster, more cost-efficient version of GPT-5 for well-defined tasks.
Text: GPT-5
The best model for coding and agentic tasks across domains.
Text: GPT OSS 120B
OpenAI's flagship open-weight reasoning model with 117B parameters, built for powerful coding, math, and agentic tasks under the Apache 2.0 license.
Text: GPT OSS 20B
OpenAI's compact open-weight reasoning model that delivers powerful AI capabilities on consumer hardware, running within just 16GB of memory.
Text: GPT-4.1 Nano
GPT-4.1 nano is the fastest, most cost-effective GPT-4.1 model.
Text: GPT-4.1 Mini
GPT-4.1 mini provides a balance between intelligence, speed, and cost that makes it an attractive model for many use cases.
Text: GPT-4.1
GPT-4.1 is OpenAI's flagship model for complex tasks. It is well suited for problem solving across domains.
Text: o4-mini
A fast, efficient reasoning model from OpenAI that delivers exceptional performance in math, coding, and visual tasks at high throughput.
Text: o3
OpenAI's most powerful reasoning model, built to tackle complex problems in coding, math, science, and visual analysis with exceptional depth and accuracy.
Text: o1-pro
OpenAI's most powerful reasoning model, built to tackle the hardest problems with extended thinking time and enhanced compute.
Text: o3-mini
OpenAI's compact reasoning model optimized for fast, cost-efficient problem-solving with exceptional performance in math, coding, and science.
Text: o1
Small cost-efficient reasoning model that’s optimized for coding, math, and science, and supports tools and Structured Outputs.
Text: GPT-4o Mini
Low-cost, low-latency model surpassing GPT-3.5 Turbo in textual intelligence and multimodal reasoning.
Text: GPT-4o
Accepting any input and generating any output combination of text, audio, and image for more natural interaction.
Text: GPT-4 Turbo
Combines GPT-4's sophisticated language processing with faster response times for interactive applications.
Text: GPT-4
Increased capacity and nuance compared to predecessors, offering more accurate text generation.
Vision: GPT-4o Mini Vision
Low-cost, fast model surpassing GPT-3.5 Turbo in textual intelligence and multimodal reasoning.
Vision: GPT-4o Vision
A GPT-4o variant with vision capabilities, processing both text and image inputs.
Vision: GPT-4 Turbo Vision
A variant of GPT-4 with vision capabilities, processing both text and image inputs.
Image: GPT Image Latest
OpenAI's latest image generation and editing model, offering state-of-the-art visual quality, precise instruction following, and support for large-scale batch processing.
Image: GPT Image 1.5
OpenAI's flagship image generation model featuring precise editing, face preservation, and 4x faster generation speeds.
Image: GPT Image 1
OpenAI's flagship image generation model that creates and edits images with precise instruction-following, fast generation, and detail-preserving edits.
Image: DALL-E 3
OpenAI's most advanced AI image generator, capable of producing highly detailed, photorealistic images from natural language prompts with unprecedented accuracy.
Image: DALL-E 2
DALL·E 2 is OpenAI's AI image generation system that creates realistic images and art from natural language descriptions, with 4x greater resolution than its predecessor.
Video: Sora 2 Pro
OpenAI's most powerful text-to-video model, delivering up to 25-second cinematic clips with synchronized audio, lifelike physics, and unprecedented creative control.
Video: Sora 2
OpenAI's Sora 2 is a groundbreaking AI video generation model that creates realistic, physics-aware videos with synchronized audio from text prompts.
Transcription: GPT-4o mini Transcribe
Speech-to-text model powered by GPT-4o mini.
Transcription: GPT-4o Transcribe
Speech-to-text model powered by GPT-4o.
Transcription: Whisper
General-purpose speech recognition model.
Text to Speech: GPT-4o-mini TTS
A lightweight, cost-efficient text-to-speech model from OpenAI that converts written text into natural-sounding audio.
Text to Speech: TTS HD
AI model converting text to high-quality, natural-sounding speech with TTS-1-HD optimized for quality.
Text to Speech: TTS
OpenAI's fast text-to-speech model, optimized for low-latency voice generation across a wide range of applications.
Text: GPT-4.5 (Deprecated)
Increased capacity and nuance compared to predecessors, offering more accurate text generation.
Text: o1-preview (Deprecated)
Early preview model using broad general knowledge to reason about hard problems.
Text: o1-mini (Deprecated)
Faster, cheaper version of o1 adept at coding, math, and science tasks without extensive general knowledge.
Text: GPT-3.5 (Deprecated)
A versatile model excelling in coherent text generation, conversation, and problem-solving.
Instruct: GPT-3 (Deprecated)
Enhanced language understanding and generation for detailed, context-relevant responses.
Instruct: GPT-3.5 Instruct (Deprecated)
Variant tuned to better understand and adhere to user prompts for precise, instruction-following responses.
Google 34
Gemini 3.1 Pro
Google's frontier reasoning model delivering enhanced software engineering, agentic reliability, and multimodal intelligence across a 1M-token context window.
Text: Gemini 3 Flash
A fast, capable thinking model from Google designed for agentic workflows, coding, and multi-turn chat with near Pro-level reasoning at lower latency.
Text: Gemini 3
The best model in the world for multimodal understanding, and Google's most powerful agentic and vibe-coding model yet, delivering richer visuals and deeper interactivity, all built on a foundation of state-of-the-art reasoning.
Text: Gemini 2.5 Flash Lite
Google's fastest and most efficient Gemini 2.5 model, delivering high-quality AI performance at scale with optional reasoning capabilities.
Text: Gemini 2.5 Flash
Gemini 2.5 Flash is a thinking model that offers great, well-rounded capabilities. It is designed to offer a balance between price and performance.
Text: Gemini 2.5 Pro
State-of-the-art thinking model, capable of reasoning over complex problems in code, math, and STEM, as well as analyzing large datasets, codebases, and documents using long context.
Text: Gemma 3.2
Google's most capable open multimodal model, combining image and text understanding with support for 140+ languages and a massive 128K context window.
Text: Gemini 2.0 Flash Lite
Speedy, cost-effective multimodal model for high-volume applications without compromising quality.
Text: Gemini 2.0 Flash
Workhorse model for all daily tasks with enhanced performance. 2.0 Flash is an upgrade path for 1.5 Flash users who want a slightly slower model with significantly better quality, or 1.5 Pro users who want slightly better quality and real-time latency for less.
Vision: Gemini 2.5 Pro Vision
Gemini 2.5 Pro is Google's most advanced reasoning Gemini model, capable of solving complex problems.
Vision: Gemini 2.5 Flash Vision
Gemini 2.5 Flash is a thinking model that offers great, well-rounded capabilities. It is designed to offer a balance between price and performance.
Vision: Gemini 2.0 Flash-Lite Vision
Gemini 2.0 Flash-Lite is Google's fastest and most cost-efficient Flash model. It's an upgrade path for 1.5 Flash users who want better quality for the same price and speed.
Vision: Gemini 2.0 Flash Vision
Gemini 2.0 Flash delivers next-gen features and improved capabilities, including superior speed, built-in tool use, multimodal generation, and a 1M token context window.
Image: Gemini 3.1 Flash Image
Google's top-ranked image generation and editing model, delivering world-class visual quality with strong text rendering, multi-subject consistency, and search-grounded creativity.
Image: Gemini 3 Pro Image
Google's most advanced image generation and editing model, built on Gemini 3 Pro, delivering professional-grade visual synthesis with multimodal reasoning.
Image: Gemini 2.5 Flash Image
Google's state-of-the-art image generation and editing model that combines powerful creative control with Gemini's world knowledge.
Image: Imagen 3
Google's Imagen 3 is a premium text-to-image model delivering photorealistic quality with exceptional text rendering precision and natural language understanding.
Image: Imagen 3 Fast
A fast variant of Google's Imagen 3 that generates high-quality, realistic images from text prompts with lower latency.
Image: Imagen 4 Ultra
Google's highest quality text-to-image generation model, delivering exceptional fidelity and detail for professional-grade visuals.
Image: Imagen 4 Fast
Google's highest quality image generation model, optimized for speed without sacrificing detail or visual fidelity.
Video: Veo 3.1 Fast
A fast video generation model from Google designed for quick, high-quality AI video creation via Vertex AI and the Gemini API.
Video: Veo 3.1
Google's Veo 3.1 video generation model, the stable production release succeeding the preview version for high-quality AI video creation.
Video: Veo 3 Fast
Google's fast variant of the Veo 3 video generation model, optimized for speed while delivering high-quality AI-generated video from text prompts.
Video: Veo 3
Google's stable, production-ready video generation model that succeeds the Veo 3.0 preview, offering high-quality AI video synthesis via the Vertex AI platform.
Video: Veo 2
Google's generally available text- and image-to-video model capable of generating detailed, physically realistic, and artistically nuanced videos.
Text: Gemini 2.0 Flash Thinking (Deprecated)
Combining speed and performance, 2.0 Flash Thinking Experimental excels in science and math, showing its thinking to solve complex problems.
Text: Gemini 2.0 Pro (Deprecated)
An experimental update to Gemini 2.0 for coding and complex prompts.
Text: Gemini 1.5 Flash (Deprecated)
Speedy, cost-effective multimodal model for high-volume applications without compromising quality.
Text: Gemini 1.5 Pro (Deprecated)
Proficient at multimodal tasks and content creation from image, audio, and video inputs.
Text: Gemini 1.0 Pro (Deprecated)
Handles both text and image inputs for content generation and problem-solving.
Text: PaLM 2 (Deprecated)
Advanced language model with high efficiency and accuracy for complex language tasks and creative content generation.
Vision: Gemini 1.5 Flash Vision (Deprecated)
Fast, cost-effective multimodal model for quality applications at high volume.
Vision: Gemini 1.5 Pro Vision (Deprecated)
Adept at processing visual and text inputs for multimodal tasks and content creation.
Vision: Gemini 1.0 Pro Vision (Deprecated)
Handles both text and image inputs for content generation and problem-solving.
Amazon 4
Amazon Nova Pro
Highly capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks.
Text: Amazon Nova Lite
Very low cost multimodal model that is lightning fast for processing image, video, and text inputs.
Text: Amazon Nova Micro
Text-only model that delivers the lowest latency responses at very low cost.
Image: Amazon Nova Canvas
Amazon Nova Canvas is a state-of-the-art image generation model that creates professional-grade images from text or images provided in prompts.
Blackforestlabs 8
FLUX.1 [dev] LoRA
FLUX.1 [dev] with LoRA support for fast, high-quality, personalized image generation using pre-trained style adapters.
Image: FLUX.2 [max]
Black Forest Labs' most powerful image generation and editing model, delivering professional-grade output with real-time web context and unmatched visual consistency.
Image: FLUX.2 [dev] LoRA
FLUX.2 [dev] with LoRA support brings personalized, studio-quality text-to-image generation by letting you inject custom styles, characters, and brand identities through stackable adapter modules.
Image: FLUX.2 [pro]
FLUX.2 [pro] is Black Forest Labs' production-grade image generation model delivering state-of-the-art photorealism, multi-reference consistency, and precise brand control at scale.
Image: FLUX.1 Kontext [max]
FLUX.1 Kontext [max] is a frontier image generation and editing model with advanced text rendering and contextual understanding.
Image: FLUX.1 Kontext [pro]
FLUX.1 Kontext [pro] is a powerful image generation and editing model that enables precise, context-aware modifications to existing images using natural language prompts.
Image: FLUX 1.1 [pro] Ultra
A state-of-the-art AI image generation model that produces stunning high-resolution images from text descriptions with exceptional speed and creative control.
Image: FLUX 1.1 [pro]
FLUX 1.1 [pro] is Black Forest Labs' latest AI image generation model, delivering six times faster generation speeds and superior image quality with enhanced prompt accuracy.
ByteDance 6
Seedream 5.0 Lite
ByteDance's lightweight Seedream 5.0 image editing model, delivering fast, high-fidelity transformations with exceptional face preservation and multi-reference support.
Image: Seedream 4.5
ByteDance's unified AI model that seamlessly combines high-fidelity image generation and intelligent image editing in a single architecture, with standout text rendering and multi-image composition capabilities.
Video: DreamActor V2
ByteDance DreamActor V2 is a motion transfer model that animates any character — human, cartoon, or creature — by applying movements from a reference video to a static image.
Video: Seedance 1.5 Pro
ByteDance's cinematic image-to-video AI model that generates high-quality 1080p videos with natively synchronized audio from static images in a single pass.
Lip Sync: LatentSync
LatentSync is an end-to-end audio-conditioned lip-sync framework that generates perfectly synchronized talking-head videos from any source video and target audio.
Lip Sync: Omni Human 1.5
ByteDance's OmniHuman 1.5 transforms static images into emotionally expressive, audio-driven digital humans using cognitive simulation and multimodal AI.
Cohere 2
DeepSeek 7
DeepSeek V3.2
DeepSeek-V3.2 is a reasoning-first open large language model that combines efficient sparse attention with advanced agentic capabilities, delivering GPT-5-level performance.
Text: Kimi K2.5
Moonshot AI's most powerful open-source model — a 1T-parameter native multimodal agent capable of visual understanding, advanced coding, and coordinated multi-agent task execution.
Text: DeepSeek V3.1
DeepSeek-V3.1 is a powerful 671B parameter hybrid AI model that seamlessly switches between fast conversational responses and deep step-by-step reasoning, with significantly improved tool use and agent capabilities.
Text: DeepSeek R1 Turbo
DeepSeek-R1-Turbo is a fast, efficient variant of DeepSeek's powerful R1 reasoning model, optimized for speed without sacrificing strong analytical performance.
Text: DeepSeek-R1
Reasoning LLM from Chinese AI company DeepSeek utilizing Chain of Thought.
Text: DeepSeek-V3
General-purpose LLM from DeepSeek.
Text: DeepSeek-V3 (Deprecated)
General-purpose LLM from Chinese AI company DeepSeek.
ElevenLabs 4
Scribe v2
ElevenLabs' state-of-the-art speech recognition model, delivering highly accurate transcription across 90+ languages with advanced features like speaker diarization, entity detection, and precise word-level timestamps.
Transcription: Scribe v1
ElevenLabs' first-generation speech-to-text transcription model, offering accurate audio transcription across multiple languages.
Text to Speech: ElevenLabs TTS
ElevenLabs is a leading voice AI platform that generates emotionally rich, human-like speech across 70+ languages with advanced voice cloning and real-time conversational AI capabilities.
Music: ElevenLabs Music
music_v1 is an AI model designed for music-related generation or analysis tasks.
Ideogram 8
Ideogram Vision
Ideogram Vision is a multimodal AI model capable of understanding and analyzing images alongside text prompts.
Image: Ideogram V3 Remix
Ideogram V3 Remix is an AI-powered image editing tool that lets you transform and create variations of existing images while preserving the elements that matter most.
Image: Ideogram V3
Ideogram's most advanced text-to-image model, delivering ultra-realistic imagery with exceptional in-image text rendering and style reference control.
Image: Ideogram Upscale
Ideogram's AI-powered image upscaler that enhances photos and generated images to stunning 8K resolution with sharp, production-ready detail.
Image: Ideogram V2 Remix
An AI image remixing model that transforms existing images into creative variations while preserving their core elements through text prompt guidance.
Image: Ideogram V1 Remix
Ideogram v1 Remix is an AI image generation model designed to creatively reimagine and transform existing images with new styles, concepts, and artistic directions.
Image: Ideogram V2
Ideogram's second-generation text-to-image model, renowned for best-in-class text rendering, typography, and graphic design capabilities.
Image: Ideogram V1
Ideogram v1 is an AI image generation model recognized for its exceptional ability to render accurate, legible text within images.
Kling 7
Kling Image O3
Kling AI's first image generation model, delivering high-quality text-to-image and image-to-image results with exceptional text rendering and up to 4K resolution support.
Image: Kling Image O1
Kling Omni Image O1 is Kuaishou's multi-reference image generation model that maintains stunning visual consistency across characters, styles, and scenes using up to 10 reference images at once.
Video: Kling 3.0 Motion Control
Kling 3.0 Standard Motion Control transfers motion from reference videos to animate still images.
Video: Kling O3
Kling Video O3 is Kuaishou's most advanced omni-video model, built for reference-driven, multi-shot cinematic storytelling with consistent characters, native audio, and precise creative control.
Video: Kling 2.6 Pro Motion Control
Kling V2.6 Pro Motion Control transforms static character images into dynamically animated videos by precisely extracting and transferring motion from real reference video clips.
Video: Kling O1
Kling Video O1 is a unified multimodal AI video model that seamlessly combines text, images, and video to generate, edit, and extend footage with remarkable consistency.
Lip Sync: AI Avatar Standard
Kling AI Avatar transforms a single portrait photo into a natural talking-head video driven by any audio track, with precise lip-sync and stable identity preservation.
Lightricks 3
LTX-2.3 LoRA
LTX-2.3 LoRA is a fine-tuning extension for Lightricks' LTX-2.3 video generation model, enabling custom character, style, and camera control in AI-generated videos.
Video: LTX-2.3
LTX-2.3 is Lightricks' open-source 22-billion-parameter multimodal model that generates synchronized audio and video in a single pass at up to 4K resolution.
Video: LTX-2 19b
LTX-2 19B is Lightricks' open-source AI model that simultaneously generates cinematic 4K video and synchronized audio from text or image inputs in a single unified workflow.
Luma Labs 4
Photon 1 Flash
Photon 1 Flash is a fast, efficient AI model designed for high-speed inference and responsive applications.
Image: Photon 1
Luma Labs' photorealistic AI image generator built on a proprietary Universal Transformer architecture, delivering studio-quality 1080p images with exceptional prompt understanding and character consistency.
Video: Ray 2
Luma AI's flagship video generation model that creates strikingly realistic, physically accurate video clips from text or image inputs.
Video: Ray Flash 2
Ray Flash 2 is Luma AI's fast video generation model designed for efficient, high-quality text-to-video and image-to-video creation.
Meta 24
Llama 4 Scout
Llama 4 Scout is a powerful multimodal model with 17 billion active parameters, offering state-of-the-art performance in its class.
Text: Llama 4 Scout
Llama 4 Scout is a powerful multimodal model with 17 billion active parameters, offering state-of-the-art performance in its class.
Text: Llama 4 Maverick
Llama 4 Maverick is a high-performance multimodal model with 17 billion active parameters, supporting 12 languages and offering industry-leading image and text understanding.
Text: Llama 3.1 8B Instant
Text: Llama 3.3 70B Versatile
Text: Llama 3.2 90B Instruct
Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.
Text: Llama 3.2 11B Instruct
Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.
Text: Llama 3.2 3B Instruct
Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.
Text: Llama 3.2 1B Instruct
Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.
Text: Llama 3.1 405B Instruct
Optimized for multilingual dialogue, outperforming open-source and closed chat models on industry benchmarks.
Text: Llama 3.1 70B Instruct
Optimized for multilingual dialogue, outperforming open-source and closed chat models on industry benchmarks.
Text: Llama 3.1 8B Instruct
Optimized for multilingual dialogue, outperforming open-source and closed chat models on industry benchmarks.
Text: Llama 3 70B Instruct
Best-in-class 70B model with enhanced reasoning, code generation, and instruction following.
Text: Llama 3 8B Instruct
Best-in-class 8B model with improved reasoning, code generation, and instruction following.
Text: Llama 3.1 405B Instruct
Optimized for multilingual dialogue, outperforming open-source and closed chat models on industry benchmarks.
Text: Llama 3.1 70B Instruct
Optimized for multilingual dialogue, outperforming open-source and closed chat models on industry benchmarks.
Text: Llama 3.1 8B Instruct
Optimized for multilingual dialogue, outperforming open-source and closed chat models on industry benchmarks.
Text: Llama 3 8B Instruct
Best-in-class 8B model with improved reasoning, code generation, and instruction following.
Text: Llama 3 70B Instruct
Best-in-class 70B model with enhanced reasoning, code generation, and instruction following.
Text: Llama 3 8B (Deprecated)
Text: Llama 3 70B (Deprecated)
Text: Code Llama (Deprecated)
Tailored for code comprehension, generation, and debugging with an instructive design.
Text: Llama-2 70B Chat (Deprecated)
Provides depth and complexity in language understanding for sophisticated content creation.
Text: Llama-2 13B Chat (Deprecated)
Balanced model for detailed language processing, offering advanced understanding and generation.
MiniMax 3
Hailuo 2.3 Pro
MiniMax Hailuo 2.3 Pro is a video generation model for ultra-clear 1080P output and physics-aware scenes with responsive rendering.
Text to Speech: Minimax Speech 2.8 HD
MiniMax Speech 2.8 HD is a studio-quality text-to-speech model that delivers broadcast-ready, emotionally expressive audio rivaling professional voice actors.
Music: Minimax Music 2.5
MiniMax Music 2.5 is an AI music generation model that delivers professional-grade, structurally precise songs with strikingly realistic sound — no recording studio required.
Mistral 18
Ministral 3 3B
Designed for edge deployment, it delivers high performance across diverse hardware, including local setups.
Text: Ministral 3 8B
Built for edge deployment, it delivers high performance across diverse hardware, including local setups.
Text: Ministral 3 14B
Optimized for local deployment, it delivers high performance across diverse hardware, including local setups.
Text: Mistral Large 3
Mistral Large 3 is Mistral’s first mixture-of-experts model since the seminal Mixtral series, and represents a substantial step forward in pretraining at Mistral.
Text: Mistral Medium 3
A versatile AI model designed for professional applications with particular strengths in coding and multimodal understanding. The model can be deployed across various environments, making it ideal for enhancing customer service, personalizing business processes, and analyzing complex datasets across industries.
Text: Mistral Small 3.1 (25.03)
Single-node inference model with 128k context window supporting dozens of languages and 80+ coding languages.
Text: Mixtral 8x7B Instruct
High-quality, efficient sparse model outperforming larger models in speed and benchmarks.
Text: Mistral Small 24.02
Single-node inference model with 128k context window supporting dozens of languages and 80+ coding languages.
Text: Mistral Large 24.07
Single-node inference model with 128k context window supporting dozens of languages and 80+ coding languages.
Text: Mistral Large 24.02
Single-node inference model with 128k context window supporting dozens of languages and 80+ coding languages.
Text: Mistral 7B Instruct
Focused on instruction-based tasks, providing clear, concise responses adhering to user instructions.
Text: Mistral Codestral
Open-weight code generation model assisting developers in 80+ programming languages.
Text: Mistral Nemo
State-of-the-art reasoning, world knowledge, and coding accuracy model designed for global, multilingual applications.
Text: Mistral Large 2
Single-node inference model with 128k context window supporting dozens of languages and 80+ coding languages.
Text: Mistral 8x7b (Deprecated)
Mixtral 8x7B is a high-performance mixture-of-experts language model from Mistral AI, offering a 32K token context window with efficient, fast inference.
Text: Mixtral 8x22B Instruct (Deprecated)
High-performance, cost-efficient sparse model using 39B active parameters out of 141B.
Text: Mixtral 8x7B Instruct (Deprecated)
High-quality, efficient sparse model outperforming larger models in speed and benchmarks.
Text: Mistral 7B Instruct (Deprecated)
Focused on instruction-based tasks, providing clear, concise responses adhering to user instructions.
Nvidia 2
Nemotron 3 Nano 30B
NVIDIA's compact open-weight hybrid MoE model with 30B total / 3.5B active parameters, delivering strong reasoning and coding performance up to 1M context.
Text: Nemotron 3 Super 120B
NVIDIA's open-weight hybrid Mamba-Transformer MoE with 120B total / 12B active parameters, excelling in agentic reasoning, coding, and long-context tasks up to 1M tokens.
Perplexity 9
Sonar Deep Research
Perplexity's exhaustive deep research model that autonomously searches hundreds of sources to deliver expert-level analysis and comprehensive reports.
Text: Sonar Reasoning Pro
Perplexity's premier reasoning model combining DeepSeek R1-powered Chain-of-Thought reasoning with real-time web search for comprehensive, citation-rich answers.
Text: Sonar
Perplexity's lightweight, real-time AI search model that delivers fast, citation-backed answers by connecting directly to the live web.
Text: Sonar Pro
Perplexity's advanced search model delivering deeper reasoning, richer citations, and multi-step query handling for complex research tasks.
Text: Sonar Reasoning (Deprecated)
Lightweight reasoning offering powered by reasoning models trained with DeepSeek R1.
Text: Sonar Large Online (Deprecated)
Perplexity's latest model family surpassing earlier versions in cost-efficiency, speed, and performance.
Text: Sonar Large Chat (Deprecated)
Perplexity's latest model family surpassing earlier versions in cost-efficiency, speed, and performance.
Text: Sonar Small Online (Deprecated)
Perplexity's latest model family surpassing earlier versions in cost-efficiency, speed, and performance.
Text: Sonar Small Chat (Deprecated)
Perplexity's latest model family surpassing earlier versions in cost-efficiency, speed, and performance.
Qwen 4
Qwen3 235B
Alibaba's updated 235B Mixture-of-Experts model delivering top-tier reasoning, coding, and long-context understanding while only activating 22B parameters at a time.
Image: Qwen 2 Pro
Alibaba's unified image generation and editing model that excels at accurate text rendering and native 2K resolution output, and ranks #1 on AI Arena's blind human evaluation leaderboard.
Image: Qwen Image
Qwen-Image is a state-of-the-art image generation and editing model with exceptional text rendering capabilities, including complex Chinese character generation.
Image: Z Image Turbo
Z-Image, from Alibaba's Tongyi Lab, is a 6-billion-parameter image generation foundation model that delivers state-of-the-art visual quality and prompt coherence using an efficient single-stream diffusion transformer architecture.
Reka 3
Reka Edge (Deprecated)
Lightweight 7B model outperforming larger models while delivering outsized value.
Text: Reka Flash (Deprecated)
Fast and capable 21B model outperforming larger models while delivering outsized value.
Text: Reka Core (Deprecated)
Largest, most capable model for complex tasks, performing competitively on benchmarks and human evaluations.
Stability 5
SDXL LoRA
Stability AI's SDXL LoRA is a powerful text-to-image model combining a 3.5 billion parameter architecture with flexible LoRA customization for professional-grade, high-resolution image generation.
Image: SDXL
Stability AI's flagship text-to-image model, delivering dramatically improved image quality, composition, and detail over previous Stable Diffusion versions.
Image: Stable Diffusion 3
Stable Diffusion 3 is Stability AI's most advanced text-to-image model, featuring a new Multimodal Diffusion Transformer architecture that delivers superior prompt adherence, typography, and visual quality.
Image: Stable Image Ultra
Stability AI's most advanced text-to-image model, delivering exceptional photorealism and creative detail from short text prompts.
Image: Stable Image Core
Stable Image Core is Stability AI's text-to-image generation model designed to produce high-quality visuals from descriptive text prompts.
Wan 6
Wan 2.6
Alibaba's powerful multimodal AI model that generates cinematic 1080p video with native audio synchronization, multi-shot storytelling, and advanced image creation.
[Image] Wan 2.5
Alibaba's open-source AI video model that generates cinematic 1080p video clips with fully synchronized audio — including dialogue, ambient sound, and music — all in a single step.
[Video] Wan 2.6
Alibaba's open-source video generation model that creates cinematic 1080p videos up to 15 seconds with native audio and lip sync from a single prompt.
[Video] Wan 2.5
Alibaba's open-source AI video model that generates cinematic 1080p clips with fully synchronized audio — dialogue, ambient sound, and music — all in a single step.
[Video] Wan 2.2
Wan 2.2 is an open-source video generation model from Alibaba's Tongyi Lab that uses a pioneering Mixture-of-Experts architecture to deliver cinematic-quality text-to-video and image-to-video results.
[Image] Wan 2.2 [Deprecated]
X.ai 15
Grok 4.20 Reasoning
A beta reasoning variant of xAI's Grok 4.20, designed for enhanced step-by-step thinking and complex problem-solving.
[Text] Grok 4.20
A fast, tool-focused variant of Grok 4.20 Beta with reasoning disabled for lower-latency agentic workloads and a massive 2M-token context window.
[Text] Grok 4.1 Fast Reasoning
A fast, reasoning-capable AI model built for agentic tool-calling workflows with a massive 2-million-token context window.
[Text] Grok 4.1 Fast
xAI's ultra-fast, non-reasoning variant of Grok 4.1 Fast, built for real-time agentic tasks with a massive 2-million-token context window.
[Text] Grok 4 Fast Reasoning
A fast, cost-efficient reasoning model from xAI that delivers near-flagship performance with dramatically fewer thinking tokens and a massive 2-million-token context window.
[Text] Grok 4 Fast
xAI's cost-efficient reasoning model that delivers frontier-level intelligence with exceptional speed and token efficiency.
[Text] Grok 4
xAI's most powerful reasoning model, trained with massive-scale reinforcement learning to achieve world-leading performance on the hardest academic and scientific benchmarks.
[Text] Grok 3 Mini Fast
A lightweight, high-speed reasoning model from xAI that delivers fast, intelligent responses with function calling, web search, and extended thinking capabilities.
[Text] Grok 3 Mini
A lightweight, cost-efficient reasoning model from xAI that thinks before it responds, excelling at math, logic puzzles, and quantitative problem-solving.
[Text] Grok 3 Fast
A high-speed variant of xAI's Grok 3, optimized for rapid responses while retaining strong language understanding and tool-use capabilities.
[Text] Grok 3
xAI's flagship large language model, delivering state-of-the-art reasoning, coding, and creative writing capabilities on par with the best models in the world.
[Vision] Grok 2 Vision
A multimodal vision model from xAI that combines strong image understanding with text reasoning, multilingual support, and enhanced instruction-following.
[Image] Grok Imagine Pro
xAI's premium image generation model, delivering higher-quality visuals than the standard grok-imagine-image.
[Image] Grok Imagine
xAI's fast and accessible text-to-image generation model, designed to turn written prompts into high-quality images at scale.
[Video] Grok Imagine
xAI's fast, native text-to-video and image-to-video generation model with built-in audio, multiple aspect ratios, and flexible creative modes.
Z.ai 4
GLM 5
GLM-5 is a 744B-parameter open-weight frontier model from Z.ai, built for complex reasoning, coding, and long-horizon agentic tasks — and trained entirely on domestic Chinese hardware.
[Text] GLM 4.7
A powerful 358B-parameter open-source model built for agentic coding, complex reasoning, and reliable tool use across multiple languages.
[Text] GLM 4.6V
A 106B multimodal foundation model with native function calling, 128K context, and state-of-the-art visual understanding for real-world AI agent applications.
[Text] GLM 4.6
GLM-4.6 is a powerful 357B Mixture-of-Experts language model from Zhipu AI featuring a 200K context window, advanced reasoning, and top-tier coding and agentic capabilities.