AI Models

Explore 245+ AI models available in MindStudio. From large language models to image generators — no API keys required.

245 models

Anthropic 15

Text

Claude 4.6 Sonnet

Anthropic's most capable Sonnet model yet, delivering near-Opus-level intelligence for coding, agents, and computer use with a 1M token context window.

Text

Claude 4.6 Opus

Anthropic's most intelligent model, setting new standards in agentic coding, complex reasoning, and professional knowledge work.

Text

Claude 4.5 Opus

Anthropic's most powerful flagship model, setting the state of the art in coding, agentic workflows, and real-world computer use.

Text

Claude 4.5 Haiku

Anthropic's fastest and most efficient model, delivering frontier-level coding performance at a fraction of the cost and more than twice the speed of its predecessors.

Text

Claude 4.5 Sonnet

Anthropic's most intelligent model, leading the world in coding, computer use, and complex agentic tasks.

Text

Claude 4.1 Opus

Anthropic's most advanced model, purpose-built for complex coding, agentic tasks, and deep research requiring sustained multi-step reasoning.

Text

Claude 4 Opus

Anthropic's most advanced flagship model at launch, excelling at complex coding, deep reasoning, and long-horizon agentic tasks.

Text

Claude 4 Sonnet

Claude Sonnet 4 is Anthropic's efficient and capable mid-tier model, delivering state-of-the-art coding and reasoning performance at scale.

Text

Claude 3 Sonnet

Ideal balance of intelligence and speed for enterprise workloads, offering cost-effective performance.

Text

Claude 3 Haiku

Fast, affordable model with strong vision capabilities and performance for diverse enterprise applications.

Text

Claude 3.5 Sonnet

Performance boost, combined with cost-effective pricing, makes Claude 3.5 Sonnet ideal for complex tasks.

Text

Claude 3.5 Sonnet v2

Sets new industry benchmarks in reasoning, knowledge, and coding with improved grasp of nuance and humor.

Text

Claude 3.5 Haiku

Faster than Claude 3 Haiku with improved performance across multiple intelligence benchmarks.

Text

Claude 3 Haiku

Fast, affordable model with strong vision capabilities and performance for diverse enterprise applications.

Text

Claude Instant (Deprecated)

OpenAI 45

Text

GPT 5.4 Pro

OpenAI's most capable and efficient frontier model for professional work, combining powerful reasoning with reliable agentic execution at scale.

Text

GPT 5.4

OpenAI's most capable and efficient frontier model for professional work, combining powerful reasoning with reliable agentic execution at scale.

Text

GPT-5.2 Pro

The most capable model series yet for professional knowledge work.

Text

GPT-5.1

The best model for coding and agentic tasks across domains.

Text

o3-pro

OpenAI's most powerful reasoning model, designed to tackle the hardest problems with deeper thinking, multimodal understanding, and autonomous tool use.

Text

GPT-5 Chat

The GPT-5 model used in ChatGPT.

Text

GPT-5 nano

The fastest, most cost-efficient version of GPT-5.

Text

GPT-5 mini

A faster, more cost-efficient version of GPT-5 for well-defined tasks.

Text

GPT-5

The best model for coding and agentic tasks across domains.

Text

GPT OSS 120B

OpenAI's flagship open-weight reasoning model with 117B parameters, built for powerful coding, math, and agentic tasks under the Apache 2.0 license.

Text

GPT OSS 20B

OpenAI's compact open-weight reasoning model that delivers powerful AI capabilities on consumer hardware, running within just 16GB of memory.

Text

GPT-4.1 Nano

GPT-4.1 nano is the fastest, most cost-effective GPT-4.1 model.

Text

GPT-4.1 Mini

GPT-4.1 mini provides a balance between intelligence, speed, and cost that makes it an attractive model for many use cases.

Text

GPT-4.1

GPT-4.1 is OpenAI's flagship model for complex tasks. It is well suited for problem solving across domains.

Text

o4-mini

A fast, efficient reasoning model from OpenAI that delivers exceptional performance in math, coding, and visual tasks at high throughput.

Text

o3

OpenAI's most powerful reasoning model, built to tackle complex problems in coding, math, science, and visual analysis with exceptional depth and accuracy.

Text

o1-pro

OpenAI's most powerful reasoning model, built to tackle the hardest problems with extended thinking time and enhanced compute.

Text

o3-mini

OpenAI's compact reasoning model optimized for fast, cost-efficient problem-solving with exceptional performance in math, coding, and science.

Text

o1

Small cost-efficient reasoning model that’s optimized for coding, math, and science, and supports tools and Structured Outputs.

Text

GPT-4o Mini

Low-cost, low-latency model surpassing GPT-3.5 Turbo in textual intelligence and multimodal reasoning.

Text

GPT-4o

Accepts any combination of text, audio, and image input and generates any combination of text, audio, and image output for more natural interaction.

Text

GPT-4 Turbo

Combines GPT-4's sophisticated language processing with faster response times for interactive applications.

Text

GPT-4

Increased capacity and nuance compared to predecessors, offering more accurate text generation.

Vision

GPT-4o Mini Vision

Low-cost, fast model surpassing GPT-3.5 Turbo in textual intelligence and multimodal reasoning.

Vision

GPT-4o Vision

A GPT-4o variant with vision capabilities, processing both text and image inputs.

Vision

GPT-4 Turbo Vision

A variant of GPT-4 with vision capabilities, processing both text and image inputs.

Image

GPT Image Latest

OpenAI's latest image generation and editing model, offering state-of-the-art visual quality, precise instruction following, and support for large-scale batch processing.

Image

GPT Image 1.5

OpenAI's flagship image generation model featuring precise editing, face preservation, and 4x faster generation speeds.

Image

GPT Image 1

OpenAI's flagship image generation model that creates and edits images with precise instruction-following, fast generation, and detail-preserving edits.

Image

DALL-E 3

OpenAI's most advanced AI image generator, capable of producing highly detailed, photorealistic images from natural language prompts with unprecedented accuracy.

Image

DALL-E 2

DALL·E 2 is OpenAI's AI image generation system that creates realistic images and art from natural language descriptions, with 4x greater resolution than its predecessor.

Video

Sora 2 Pro

OpenAI's most powerful text-to-video model, delivering up to 25-second cinematic clips with synchronized audio, lifelike physics, and unprecedented creative control.

Video

Sora 2

OpenAI's Sora 2 is a groundbreaking AI video generation model that creates realistic, physics-aware videos with synchronized audio from text prompts.

Transcription

GPT-4o mini Transcribe

Speech-to-text model powered by GPT-4o mini.

Transcription

GPT-4o Transcribe

Speech-to-text model powered by GPT-4o.

Transcription

Whisper

General-purpose speech recognition model.

Text to Speech

GPT-4o-mini TTS

A lightweight, cost-efficient text-to-speech model from OpenAI that converts written text into natural-sounding audio.

Text to Speech

TTS HD

OpenAI's TTS-1-HD text-to-speech model, optimized for quality, which converts text to natural-sounding speech.

Text to Speech

TTS

OpenAI's fast text-to-speech model, optimized for low-latency voice generation across a wide range of applications.

Text

GPT-4.5 (Deprecated)

Increased capacity and nuance compared to predecessors, offering more accurate text generation.

Text

o1-preview (Deprecated)

Early preview model using broad general knowledge to reason about hard problems.

Text

o1-mini (Deprecated)

Faster, cheaper version of o1 adept at coding, math, and science tasks without extensive general knowledge.

Text

GPT-3.5 (Deprecated)

A versatile model excelling in coherent text generation, conversation, and problem-solving.

Instruct

GPT-3 (Deprecated)

Enhanced language understanding and generation for detailed, context-relevant responses.

Instruct

GPT-3.5 Instruct (Deprecated)

Variant tuned to better understand and adhere to user prompts for precise, instruction-following responses.

Google 34

Text

Gemini 3.1 Pro

Google's frontier reasoning model delivering enhanced software engineering, agentic reliability, and multimodal intelligence across a 1M-token context window.

Text

Gemini 3 Flash

A fast, capable thinking model from Google designed for agentic workflows, coding, and multi-turn chat with near Pro-level reasoning at lower latency.

Text

Gemini 3

Google's best model for multimodal understanding and its most powerful agentic and vibe-coding model yet, delivering richer visuals and deeper interactivity on a foundation of state-of-the-art reasoning.

Text

Gemini 2.5 Flash Lite

Google's fastest and most efficient Gemini 2.5 model, delivering high-quality AI performance at scale with optional reasoning capabilities.

Text

Gemini 2.5 Flash

Gemini 2.5 Flash is a thinking model that offers great, well-rounded capabilities. It is designed to offer a balance between price and performance.

Text

Gemini 2.5 Pro

State-of-the-art thinking model, capable of reasoning over complex problems in code, math, and STEM, as well as analyzing large datasets, codebases, and documents using long context.

Text

Gemma 3.2

Google's most capable open multimodal model, combining image and text understanding with support for 140+ languages and a massive 128K context window.

Text

Gemini 2.0 Flash Lite

Speedy, cost-effective multimodal model for high-volume applications without compromising quality.

Text

Gemini 2.0 Flash

Workhorse model for all daily tasks with enhanced performance. 2.0 Flash is an upgrade path for 1.5 Flash users who want a slightly slower model with significantly better quality, or 1.5 Pro users who want slightly better quality and real-time latency for less.

Vision

Gemini 2.5 Pro Vision

Gemini 2.5 Pro is Google's most advanced reasoning Gemini model, capable of solving complex problems.

Vision

Gemini 2.5 Flash Vision

Gemini 2.5 Flash is a thinking model that offers great, well-rounded capabilities. It is designed to offer a balance between price and performance.

Vision

Gemini 2.0 Flash-Lite Vision

Gemini 2.0 Flash-Lite is Google's fastest and most cost-efficient Flash model. It is an upgrade path for 1.5 Flash users who want better quality for the same price and speed.

Vision

Gemini 2.0 Flash Vision

Gemini 2.0 Flash delivers next-gen features and improved capabilities, including superior speed, built-in tool use, multimodal generation, and a 1M token context window.

Image

Gemini 3.1 Flash Image

Google's top-ranked image generation and editing model, delivering world-class visual quality with strong text rendering, multi-subject consistency, and search-grounded creativity.

Image

Gemini 3 Pro Image

Google's most advanced image generation and editing model, built on Gemini 3 Pro, delivering professional-grade visual synthesis with multimodal reasoning.

Image

Gemini 2.5 Flash Image

Google's state-of-the-art image generation and editing model that combines powerful creative control with Gemini's world knowledge.

Image

Imagen 3

Google's Imagen 3 is a premium text-to-image model delivering photorealistic quality with exceptional text rendering precision and natural language understanding.

Image

Imagen 3 Fast

A fast variant of Google's Imagen 3 that generates high-quality, realistic images from text prompts with lower latency.

Image

Imagen 4 Ultra

Google's highest quality text-to-image generation model, delivering exceptional fidelity and detail for professional-grade visuals.

Image

Imagen 4 Fast

Google's highest quality image generation model, optimized for speed without sacrificing detail or visual fidelity.

Video

Veo 3.1 Fast

A fast video generation model from Google designed for quick, high-quality AI video creation via Vertex AI and the Gemini API.

Video

Veo 3.1

Google's Veo 3.1 video generation model, the stable production release succeeding the preview version for high-quality AI video creation.

Video

Veo 3 Fast

Google's fast variant of the Veo 3 video generation model, optimized for speed while delivering high-quality AI-generated video from text prompts.

Video

Veo 3

Google's stable, production-ready video generation model that succeeds the Veo 3.0 preview, offering high-quality AI video synthesis via the Vertex AI platform.

Video

Veo 2

Google's generally available text- and image-to-video model capable of generating detailed, physically realistic, and artistically nuanced videos.

Text

Gemini 2.0 Flash Thinking (Deprecated)

Combining speed and performance, 2.0 Flash Thinking Experimental excels in science and math, showing its thinking to solve complex problems.

Text

Gemini 2.0 Pro (Deprecated)

An experimental update to Gemini 2.0 for coding and complex prompts.

Text

Gemini 1.5 Flash (Deprecated)

Speedy, cost-effective multimodal model for high-volume applications without compromising quality.

Text

Gemini 1.5 Pro (Deprecated)

Proficient at multimodal tasks and content creation from image, audio, and video inputs.

Text

Gemini 1.0 Pro (Deprecated)

Handles both text and image inputs for content generation and problem-solving.

Text

PaLM 2 (Deprecated)

Advanced language model with high efficiency and accuracy for complex language tasks and creative content generation.

Vision

Gemini 1.5 Flash Vision (Deprecated)

Fast, cost-effective multimodal model for quality applications at high volume.

Vision

Gemini 1.5 Pro Vision (Deprecated)

Adept at processing visual and text inputs for multimodal tasks and content creation.

Vision

Gemini 1.0 Pro Vision (Deprecated)

Handles both text and image inputs for content generation and problem-solving.

Black Forest Labs 8

Image

FLUX.1 [dev] LoRA

FLUX.1 [dev] with LoRA support for fast, high-quality, personalized image generation using pre-trained style adapters.

Image

FLUX.2 [max]

Black Forest Labs' most powerful image generation and editing model, delivering professional-grade output with real-time web context and unmatched visual consistency.

Image

FLUX.2 [dev] LoRA

FLUX.2 [dev] with LoRA support brings personalized, studio-quality text-to-image generation by letting you inject custom styles, characters, and brand identities through stackable adapter modules.

Image

FLUX.2 [pro]

FLUX.2 [pro] is Black Forest Labs' production-grade image generation model delivering state-of-the-art photorealism, multi-reference consistency, and precise brand control at scale.

Image

FLUX.1 Kontext [max]

FLUX.1 Kontext [Max] is a frontier image generation and editing model with advanced text rendering and contextual understanding.

Image

FLUX.1 Kontext [pro]

FLUX.1 Kontext Pro is a powerful image generation and editing model that enables precise, context-aware modifications to existing images using natural language prompts.

Image

FLUX 1.1 [pro] Ultra

A state-of-the-art AI image generation model that produces stunning high-resolution images from text descriptions with exceptional speed and creative control.

Image

FLUX 1.1 [pro]

FLUX 1.1 Pro is Black Forest Labs' latest AI image generation model, delivering six times faster generation speeds and superior image quality with enhanced prompt accuracy.

Meta 24

Text

Llama 4 Scout

Llama 4 Scout is a powerful multimodal model with 17 billion active parameters, offering state-of-the-art performance in its class.

Text

Llama 4 Scout

Llama 4 Scout is a powerful multimodal model with 17 billion active parameters, offering state-of-the-art performance in its class.

Text

Llama 4 Maverick

Llama 4 Maverick is a high-performance multimodal model with 17 billion active parameters, supporting 12 languages and offering industry-leading image and text understanding.

Text

Llama 3.1 8B Instant

Text

Llama 3.3 70B Versatile

Text

Llama 3.2 90B Instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

Text

Llama 3.2 11B Instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

Text

Llama 3.2 3B Instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

Text

Llama 3.2 1B Instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

Text

Llama 3.1 405B Instruct

Optimized for multilingual dialogue, outperforming open-source and closed chat models on industry benchmarks.

Text

Llama 3.1 70B Instruct

Optimized for multilingual dialogue, outperforming open-source and closed chat models on industry benchmarks.

Text

Llama 3.1 8B Instruct

Optimized for multilingual dialogue, outperforming open-source and closed chat models on industry benchmarks.

Text

Llama 3 70B Instruct

Best-in-class 70B model with enhanced reasoning, code generation, and instruction following.

Text

Llama 3 8B Instruct

Best-in-class 8B model with improved reasoning, code generation, and instruction following.

Text

Llama 3.1 405B Instruct

Optimized for multilingual dialogue, outperforming open-source and closed chat models on industry benchmarks.

Text

Llama 3.1 70B Instruct

Optimized for multilingual dialogue, outperforming open-source and closed chat models on industry benchmarks.

Text

Llama 3.1 8B Instruct

Optimized for multilingual dialogue, outperforming open-source and closed chat models on industry benchmarks.

Text

Llama 3 8B Instruct

Best-in-class 8B model with improved reasoning, code generation, and instruction following.

Text

Llama 3 70B Instruct

Best-in-class 70B model with enhanced reasoning, code generation, and instruction following.

Text

Llama 3 8B (Deprecated)

Text

Llama 3 70B (Deprecated)

Text

Code Llama (Deprecated)

Tailored for code comprehension, generation, and debugging with an instructive design.

Text

Llama-2 70B Chat (Deprecated)

Provides depth and complexity in language understanding for sophisticated content creation.

Text

Llama-2 13B Chat (Deprecated)

Balanced model for detailed language processing, offering advanced understanding and generation.

Mistral 18

Text

Ministral 3 3B

Designed for edge deployment, it delivers high performance across diverse hardware, including local setups.

Text

Ministral 3 8B

Built for edge deployment, it delivers high performance across diverse hardware, including local setups.

Text

Ministral 3 14B

Optimized for local deployment, it delivers high performance across diverse hardware, including local setups.

Text

Mistral Large 3

Mistral Large 3 is Mistral’s first mixture-of-experts model since the seminal Mixtral series, and represents a substantial step forward in pretraining at Mistral.

Text

Mistral Medium 3

A versatile AI model designed for professional applications with particular strengths in coding and multimodal understanding. The model can be deployed across various environments, making it ideal for enhancing customer service, personalizing business processes, and analyzing complex datasets across industries.

Text

Mistral Small 3.1 (25.03)

Single-node inference model with 128k context window supporting dozens of languages and 80+ coding languages.

Text

Mixtral 8x7B Instruct

High-quality, efficient sparse model outperforming larger models in speed and benchmarks.

Text

Mistral Small 24.02

Single-node inference model with 128k context window supporting dozens of languages and 80+ coding languages.

Text

Mistral Large 24.07

Single-node inference model with 128k context window supporting dozens of languages and 80+ coding languages.

Text

Mistral Large 24.02

Single-node inference model with 128k context window supporting dozens of languages and 80+ coding languages.

Text

Mistral 7B Instruct

Focused on instruction-based tasks, providing clear, concise responses adhering to user instructions.

Text

Mistral Codestral

Open-weight code generation model assisting developers in 80+ programming languages.

Text

Mistral Nemo

State-of-the-art reasoning, world knowledge, and coding accuracy model designed for global, multilingual applications.

Text

Mistral Large 2

Single-node inference model with 128k context window supporting dozens of languages and 80+ coding languages.

Text

Mistral 8x7b (Deprecated)

Mixtral 8x7B is a high-performance mixture-of-experts language model from Mistral AI, offering a 32K token context window with efficient, fast inference.

Text

Mixtral 8x22B Instruct (Deprecated)

High-performance, cost-efficient sparse model using 39B active parameters out of 141B.

Text

Mixtral 8x7B Instruct (Deprecated)

High-quality, efficient sparse model outperforming larger models in speed and benchmarks.

Text

Mistral 7B Instruct (Deprecated)

Focused on instruction-based tasks, providing clear, concise responses adhering to user instructions.

Perplexity 9

Text

Sonar Deep Research

Perplexity's exhaustive deep research model that autonomously searches hundreds of sources to deliver expert-level analysis and comprehensive reports.

Text

Sonar Reasoning Pro

Perplexity's premier reasoning model combining DeepSeek R1-powered Chain-of-Thought reasoning with real-time web search for comprehensive, citation-rich answers.

Text

Sonar

Perplexity's lightweight, real-time AI search model that delivers fast, citation-backed answers by connecting directly to the live web.

Text

Sonar Pro

Perplexity's advanced search model delivering deeper reasoning, richer citations, and multi-step query handling for complex research tasks.

Text

Sonar Reasoning (Deprecated)

Lightweight reasoning model powered by reasoning models trained with DeepSeek R1.

Text

Sonar Large Online (Deprecated)

Perplexity's latest model family surpassing earlier versions in cost-efficiency, speed, and performance.

Text

Sonar Large Chat (Deprecated)

Perplexity's latest model family surpassing earlier versions in cost-efficiency, speed, and performance.

Text

Sonar Small Online (Deprecated)

Perplexity's latest model family surpassing earlier versions in cost-efficiency, speed, and performance.

Text

Sonar Small Chat (Deprecated)

Perplexity's latest model family surpassing earlier versions in cost-efficiency, speed, and performance.

X.ai 15

Text

Grok 4.20 Reasoning

A beta reasoning variant of xAI's Grok 4.20, designed for enhanced step-by-step thinking and complex problem-solving.

Text

Grok 4.20

A fast, tool-focused variant of Grok 4.20 Beta with reasoning disabled for lower-latency agentic workloads and a massive 2M-token context window.

Text

Grok 4.1 Fast Reasoning

A fast, reasoning-capable AI model built for agentic tool-calling workflows with a massive 2-million-token context window.

Text

Grok 4.1 Fast

xAI's ultra-fast, non-reasoning variant of Grok 4.1 Fast, built for real-time agentic tasks with a massive 2 million token context window.

Text

Grok 4 Fast Reasoning

A fast, cost-efficient reasoning model from xAI that delivers near-flagship performance with dramatically fewer thinking tokens and a massive 2 million token context window.

Text

Grok 4 Fast

xAI's cost-efficient reasoning model that delivers frontier-level intelligence with exceptional speed and token efficiency.

Text

Grok 4

xAI's most powerful reasoning model, trained with massive-scale reinforcement learning to achieve world-leading performance on the hardest academic and scientific benchmarks.

Text

Grok 3 Mini Fast

A lightweight, high-speed reasoning model from xAI that delivers fast, intelligent responses with function calling, web search, and extended thinking capabilities.

Text

Grok 3 Mini

A lightweight, cost-efficient reasoning model from xAI that thinks before it responds, excelling at math, logic puzzles, and quantitative problem-solving.

Text

Grok 3 Fast

A high-speed variant of xAI's Grok 3, optimized for rapid responses while retaining strong language understanding and tool-use capabilities.

Text

Grok 3

xAI's flagship large language model, delivering state-of-the-art reasoning, coding, and creative writing capabilities on par with the best models in the world.

Vision

Grok 2 Vision

A multimodal vision model from xAI that combines strong image understanding with text reasoning, multilingual support, and enhanced instruction-following.

Image

Grok Imagine Pro

xAI's premium image generation model, delivering higher-quality visuals than the standard grok-imagine-image.

Image

Grok Imagine

xAI's fast and accessible text-to-image generation model, designed to turn written prompts into high-quality images at scale.

Video

Grok Imagine

xAI's fast, native text-to-video and image-to-video generation model with built-in audio, multiple aspect ratios, and flexible creative modes.
