PaliGemma – Making Gemma 2 see by adding a vision encoder

Published Apr 2, 2025

Explore how PaliGemma adds a SigLIP vision encoder to Gemma 2. The model is pre-trained on captioning, question answering, object detection, and even segmentation. Varying the image resolution and model size lets you scale compute by a factor of 155. If you have data available for your task, fine-tuning PaliGemma will result in great performance, especially on text-related tasks.

Subscribe to Google for Developers → https://goo.gle/developers

#Gemma #GemmaDeveloperDay

Speaker: Andreas Steiner
Products Mentioned: Gemma

Category: Project
Tags: Google, developers, Gemma
