Recipe · Gemini 3 Pro
Gemini 3 Pro is an advanced multimodal AI model from Google DeepMind for text, code, image, and video understanding.
Ingredient list
- “Gemini 1.5 uses a modified RoPE for long context, similar to YaRN.”
  architecture · high · Invented by Nous Research · 2023-08 · Velocity 2y
- “Gemini 1.5 uses grouped-query attention (GQA) for efficient inference.”
  architecture · high · Invented by Google · 2023-05 · Velocity 3y
- “Gemini models use Flash-Decoding for efficient attention, a variant of FlashAttention.”
  inference · high · Invented by Stanford · 2022-05 · Velocity 4y
- “Gemini 1.5 Pro can reason across text, images, audio, and video in a chain-of-thought style.”
  reasoning · high · Invented by Google · 2022-01 · Velocity 4y
- “Gemini models are instruction-tuned, building on the FLAN line of work.”
  training · high · Invented by Google · 2021-09 · Velocity 4y
- “Gemini models use rotary position embeddings (RoPE).”
  architecture · high · Invented by Zhuiyi Technology · 2021-04 · Velocity 5y
- “Gemini uses a Vision Transformer (ViT) to encode image patches.”
  multimodal · high · Invented by Google · 2020-10 · Velocity 5y
- “Gemini is a Transformer-based decoder-only model.”
  architecture · high · Invented by Google · 2017-06 · Velocity 9y
- “Gemini 1.5 Pro is a Mixture-of-Experts (MoE) model with 8 experts and 2 active per token.”
  architecture · high · Invented by Google · 2017-01 · Velocity 9y
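The GQA ingredient above groups query heads so that several of them share one key/value head, shrinking the KV cache at inference time. A minimal NumPy sketch of the idea (not Gemini's implementation; shapes and the function name are illustrative assumptions):

```python
import numpy as np

def gqa_attention(q, k, v, n_kv_heads):
    """Grouped-query attention sketch.

    q: (n_heads, seq, d); k, v: (n_kv_heads, seq, d).
    Each group of n_heads // n_kv_heads query heads shares one KV head,
    so only n_kv_heads key/value heads need to be cached.
    """
    n_heads, seq, d = q.shape
    group = n_heads // n_kv_heads
    # Broadcast each KV head across its group of query heads.
    k = np.repeat(k, group, axis=0)
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    # Row-wise softmax over key positions.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

With `n_kv_heads == n_heads` this reduces to standard multi-head attention; with `n_kv_heads == 1` it is multi-query attention, and GQA sits in between.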
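The RoPE ingredient encodes position by rotating consecutive feature pairs of queries and keys by position-dependent angles, so attention scores depend only on relative offsets. A small sketch of the rotation itself (illustrative, assuming an even feature dimension and the standard base of 10000; the YaRN-style long-context variants rescale these frequencies):

```python
import numpy as np

def rope(x, positions, base=10000.0):
    """Rotary position embedding sketch.

    x: (seq_len, dim) with even dim; positions: (seq_len,) integers.
    Rotates each consecutive feature pair (x[2i], x[2i+1]) by an angle
    position * base**(-2i/dim), a geometric range of frequencies.
    """
    d = x.shape[-1]
    inv_freq = base ** (-np.arange(0, d, 2) / d)       # (d/2,) frequencies
    angles = positions[:, None] * inv_freq[None, :]    # (seq_len, d/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin                 # 2D rotation per pair
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```

Because rotation preserves norms, `rope` changes the direction of query/key vectors but not their length, and the dot product of two rotated vectors depends only on their position difference.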
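The MoE ingredient states 8 experts with 2 active per token: a gate scores all experts, and each token is processed only by its top-2, with outputs mixed by renormalized gate weights. A toy sketch of that routing (not Gemini's implementation; the gate, expert callables, and function name are assumptions for illustration):

```python
import numpy as np

def top2_moe(x, w_gate, experts):
    """Sparse MoE layer with top-2 routing, sketched per token.

    x: (tokens, d); w_gate: (d, n_experts); experts: list of callables.
    Each token is dispatched to its 2 highest-scoring experts; their
    outputs are mixed by a softmax over those two gate logits.
    """
    logits = x @ w_gate                            # (tokens, n_experts)
    top2 = np.argsort(logits, axis=-1)[:, -2:]     # indices of the best 2
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = logits[t, top2[t]]
        w = np.exp(sel - sel.max())
        w /= w.sum()                               # renormalize over the 2
        for weight, e in zip(w, top2[t]):
            out[t] += weight * experts[e](x[t])
    return out
```

Only 2 of the 8 expert networks run per token, which is how an MoE model keeps per-token compute far below its total parameter count.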
This recipe is part of the gentic.news Deployment Atlas. Every ingredient has an origin paper + evidence. Methodology is public. Dataset is CC BY 4.0.