Gemini 2.5 Pro is our most advanced model yet, excelling at coding and complex prompts.
Pro performance
-
Enhanced reasoning
State-of-the-art in key math and science benchmarks.
-
Advanced coding
Easily generate code for web development tasks.
-
Natively multimodal
Understands input across text, audio, images and video.
-
Long context
Explore vast datasets with a 1-million token context window.
Hands-on with 2.5 Pro
See how Gemini 2.5 Pro uses its reasoning capabilities to create interactive simulations and do advanced coding.
Benchmarks
Gemini 2.5 Pro leads common benchmarks by meaningful margins.
| Benchmark |
Gemini 2.5 Pro
Preview (05-06)
|
OpenAI o3
|
OpenAI GPT-4.1
|
Claude 3.7 Sonnet
64k Extended thinking
|
Grok 3 Beta
Extended thinking
|
DeepSeek R1
|
|
|---|---|---|---|---|---|---|---|
|
Input price
|
$/1M tokens |
$2.50 $1.25 <= 200k tokens |
$10.00 | $2.00 | $3.00 | $3.00 | $0.55 |
|
Output price
|
$/1M tokens |
$15.00 $10.00 <= 200k tokens |
$40.00 | $8.00 | $15.00 | $15.00 | $2.19 |
|
Reasoning & knowledge
Humanity's Last Exam (no tools)
|
17.8% | 20.3% | 5.4% | 8.9% | — | 8.6%* | |
|
Science
GPQA diamond
|
single attempt (pass@1) | 83.0% | 83.3% | 66.3% | 78.2% | 80.2% | 71.5% |
|
|
multiple attempts | — | — | — | 84.8% | 84.6% | — |
|
Mathematics
AIME 2025
|
single attempt (pass@1) | 83.0% | 88.9% | — | 49.5% | 77.3% | 70.0% |
|
|
multiple attempts | — | — | — | — | 93.3% | — |
|
Code generation
LiveCodeBench v5
|
single attempt (pass@1) | 75.6% | — | — | — | 70.6% | 64.3% |
|
|
multiple attempts | — | — | — | — | 79.4% | — |
|
Code editing
Aider Polyglot
|
76.5% / 72.7%
whole / diff
|
81.3% / 79.6%
whole / diff
|
51.6% / 52.9%
whole / diff
|
64.9%
diff
|
— |
56.9%
diff
|
|
|
Agentic coding
SWE-bench Verified
|
63.2% | 69.1% | 54.6% | 70.3% | — | 49.2% | |
|
Factuality
SimpleQA
|
50.8% | 49.4% | 41.6% | — | 43.6% | 30.1% | |
|
Visual reasoning
MMMU
|
single attempt (pass@1) | 79.6% | 82.9% | 75.0% | 75.0% | 76.0% | no MM support |
|
|
multiple attempts | — | — | — | — | 78.0% | no MM support |
|
Image understanding
Vibe-Eval (Reka)
|
65.6% | — | — | — | — | no MM support | |
|
Video
Video-MME
|
84.8% | — | — | — | — | no MM support | |
|
Long context
MRCR
|
128k (average) | 93.0% | — | — | — | — | — |
|
|
1M (pointwise) | 82.9% | — | — | — | — | — |
|
Multilingual performance
Global MMLU (Lite)
|
88.6% | — | — | — | — | — |
| Model deployment status | Experimental, Preview |
| Supported data types for input | Text, Image, Video, Audio |
| Supported data types for output | Text |
| Supported # tokens for input | 1M |
| Supported # tokens for output | 64k |
| Knowledge cutoff | January 2025 |
| Tool use |
Function calling Structured output Search as a tool Code execution |
| Best for |
Reasoning Coding Complex prompts |
| Availability |
Google AI Studio Gemini API Gemini App |