Gemma 3
A family of lightweight models with multimodal understanding and unparalleled multilingual capabilities for more intelligent applications
Gemma 3 is the most capable model that can run on a single GPU or TPU. It runs efficiently on workstations, laptops, and even smartphones, so developers can build responsible AI applications at scale.
Capabilities
- Handle complex tasks: Gemma 3's 128K-token context window lets your applications process and understand vast amounts of information, enabling more sophisticated AI features.
- Multilingual communication: Unparalleled multilingual capabilities let you communicate effortlessly across countries and cultures. Develop applications that reach a global audience, with support for over 140 languages.
- Multimodal understanding: Easily build applications that analyze images, text, and video, opening up new possibilities for interactive and intelligent applications.
Model sizes
Gemma 3 is available in five sizes to meet different development and deployment needs.
- 270M: Compact model designed for both task-specific fine-tuning and strong instruction-following.
- 1B: Lightweight text model, ideal for small applications.
- 4B: Balanced for performance and flexibility, with multimodal support.
- 12B: Strong language capabilities, designed for complex tasks.
- 27B: Enhanced understanding, great for sophisticated applications.
Download open model weights
Get access and start building using your preferred frameworks and tools.
Performance and benchmarks
MMLU-Pro
MMLU-Pro is a more challenging extension of the MMLU benchmark, a test that measures the breadth of knowledge and problem-solving ability acquired by large language models during pretraining.
LiveCodeBench
Assesses code generation capabilities on real-world coding problems from platforms like LeetCode and Codeforces.
Bird-SQL
Tests a model's ability to translate natural language questions into complex SQL queries across various domains.
GPQA Diamond
Challenges models with difficult questions written by Ph.D. holders in biology, physics, and chemistry.
SimpleQA
Evaluates a model's ability to answer simple, factual questions with short phrases.
FACTS Grounding
Evaluates whether LLM responses are factually accurate and sufficiently detailed, based on the given input documents.
MATH
MATH evaluates a language model's ability to solve complex mathematical word problems, requiring reasoning, multi-step problem-solving, and the understanding of mathematical concepts.
HiddenMath
An internal holdout set of competition math problems.
MMMU
Evaluates multimodal understanding and reasoning across various disciplines requiring college-level knowledge.
Gemma Quantization-Aware Training (QAT)
Gemma QAT dramatically reduces memory requirements while maintaining high quality. This lets you run powerful models such as Gemma 3 27B locally on consumer-grade GPUs such as the NVIDIA RTX 3090.
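To see why quantization matters for a consumer GPU, here is a rough back-of-the-envelope sketch of the memory needed to hold model weights at different precisions. It is an illustration under stated assumptions, not official sizing guidance: the parameter counts are the nominal figures from the model names, the 16-bit baseline and the 8-bit and 4-bit widths are assumed for illustration, and the helper names (`weight_memory_gib`, `memory_table`) are hypothetical. The estimate covers weights only and ignores the KV cache, activations, and runtime overhead.

```python
# Rough weight-only memory estimate for different weight precisions.
# Assumptions: nominal parameter counts, weights only (no KV cache,
# activations, or framework overhead), illustrative bit widths.

GIB = 1024 ** 3  # bytes per GiB

# Nominal parameter counts taken from the model size names (approximate).
MODEL_PARAMS = {
    "270M": 270e6,
    "1B": 1e9,
    "4B": 4e9,
    "12B": 12e9,
    "27B": 27e9,
}


def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed to store the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / GIB


def memory_table(bits_options=(16, 8, 4)):
    """Map each model size to its estimated weight memory per bit width."""
    return {
        name: {bits: round(weight_memory_gib(n, bits), 2) for bits in bits_options}
        for name, n in MODEL_PARAMS.items()
    }


if __name__ == "__main__":
    for name, row in memory_table().items():
        cells = ", ".join(f"{bits}-bit: {gib} GiB" for bits, gib in row.items())
        print(f"{name:>5}  {cells}")
```

Under these assumptions, the 27B weights alone exceed the 24 GB of VRAM on an RTX 3090 at 16 bits per weight but fit at 4 bits per weight, which leaves headroom for the context cache and activations.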