Vision Model (Deprecated)

Gemini 1.5 Pro Vision

Adept at processing visual and text inputs for multimodal tasks and content creation.

Publisher: Google

Type: Vision

Context Window: 1,000,000 tokens

Training Data: November 2023

Overview

Gemini 1.5 Pro is a foundation model that performs well on a variety of multimodal tasks, such as visual understanding, classification, summarization, and content creation from images, audio, and video. It is adept at processing visual and text inputs such as photographs, documents, infographics, and screenshots.
