Vision Model Deprecated
Gemini 1.5 Pro Vision
Adept at processing visual and text inputs for multimodal tasks and content creation.
Publisher: Google
Type: Vision
Context Window: 1,000,000 tokens
Training Data: November 2023
Input: $0.63/MTok
Output: $2.50/MTok
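As a rough illustration of how the listed prices translate into a per-request cost, the sketch below multiplies token counts by the per-million-token (MTok) rates above. The helper name `estimate_cost_usd` is hypothetical, and the calculation assumes simple linear billing with no tiers, minimums, or separate charges for image inputs, none of which this page confirms.

```python
# Prices as listed on this page, in US dollars per million tokens (MTok).
INPUT_USD_PER_MTOK = 0.63
OUTPUT_USD_PER_MTOK = 2.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: estimate the cost of one request in US dollars.

    Assumes linear per-token pricing, which is an assumption of this sketch.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a 2,000-token response.
    print(f"${estimate_cost_usd(100_000, 2_000):.3f}")
```

Under these assumptions, a request with 100,000 input tokens and 2,000 output tokens comes to about $0.068, almost all of it from the input side.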
Overview
Gemini 1.5 Pro is a foundation model that performs well at a variety of multimodal tasks such as visual understanding, classification, summarization, and creating content from images, audio, and video. It is adept at processing visual and text inputs such as photographs, documents, infographics, and screenshots.
Configuration
Parameters & options
Max Temperature: 1
Max Response Size: 8,192 tokens
Temperature: Number
Max Response Tokens: Number
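The limits above can be checked on the client side before a request is sent. The following is a minimal sketch, not the platform's actual API: the class name `GenerationSettings` and its field names are hypothetical, and the lower bounds (temperature of at least 0, at least 1 response token) are assumptions, since this page lists only the maximums.

```python
from dataclasses import dataclass

# Upper limits as listed on this page.
MAX_TEMPERATURE = 1.0
MAX_RESPONSE_TOKENS = 8192


@dataclass(frozen=True)
class GenerationSettings:
    """Hypothetical container for the two numeric parameters listed above."""

    temperature: float
    max_response_tokens: int

    def __post_init__(self) -> None:
        # Lower bound of 0 is an assumption; the page states only the maximum.
        if not 0.0 <= self.temperature <= MAX_TEMPERATURE:
            raise ValueError(
                f"temperature must be between 0 and {MAX_TEMPERATURE}"
            )
        # Lower bound of 1 is an assumption; the page states only the maximum.
        if not 1 <= self.max_response_tokens <= MAX_RESPONSE_TOKENS:
            raise ValueError(
                f"max_response_tokens must be between 1 and {MAX_RESPONSE_TOKENS}"
            )


if __name__ == "__main__":
    settings = GenerationSettings(temperature=0.7, max_response_tokens=1024)
    print(settings)
```

Validating at construction time means an out-of-range value fails early with a clear message rather than surfacing later as a rejected request.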