Technique · architecture
YaRN RoPE Context Extension
A method that extends RoPE-based models to much longer context windows by interpolating rotary position frequencies in a frequency-dependent way, requiring only minimal fine-tuning data.
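The frequency-dependent interpolation can be sketched as follows. This is a minimal illustration of YaRN's "NTK-by-parts" scheme, not any model's actual implementation: high-frequency RoPE dimensions (many rotations within the original context) are left untouched, low-frequency dimensions are fully interpolated by the scale factor, and a linear ramp blends the regimes in between. The defaults (`alpha=1`, `beta=32`, attention-scale formula) follow the YaRN paper's suggested values; `orig_max_pos` and `scale` are illustrative.

```python
import math

def yarn_rope_frequencies(dim, base=10000.0, scale=8.0,
                          orig_max_pos=4096, alpha=1.0, beta=32.0):
    """Sketch of YaRN's frequency-dependent RoPE interpolation.

    Each RoPE dimension pair i has frequency theta_i = base**(-2i/dim).
    Dimensions whose wavelength fits many times into the original context
    (high frequency) keep their frequency; dimensions whose wavelength is
    comparable to or longer than the original context are interpolated
    (divided) by `scale`; a linear ramp blends the two regimes.
    """
    freqs = []
    for i in range(dim // 2):
        theta = base ** (-2.0 * i / dim)
        # full rotations this dimension completes over the original context
        rotations = orig_max_pos * theta / (2.0 * math.pi)
        if rotations <= alpha:       # low frequency: interpolate fully
            ramp = 1.0
        elif rotations >= beta:      # high frequency: no interpolation
            ramp = 0.0
        else:                        # blend linearly between the regimes
            ramp = (beta - rotations) / (beta - alpha)
        freqs.append(theta * ((1.0 - ramp) + ramp / scale))
    return freqs

def yarn_attn_scale(scale):
    # YaRN also rescales attention logits as context grows;
    # this is the paper's suggested temperature factor.
    return 0.1 * math.log(scale) + 1.0
```

For example, with `dim=128` and `orig_max_pos=4096`, the first dimension pair (frequency 1.0, hundreds of rotations per context) is returned unchanged, while the slowest dimension pair is divided by the full scale factor.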
Deployment timeline
- Llama 4 Maverick · medium
Deployed 2025-04-05 · Velocity 1.6y
“Llama 4 Maverick supports 1M context. Meta's previous long-context models (Llama 3.1) used YaRN.”
- Gemini 3 Pro · high
Deployed 2026-02-19 · Velocity 2y
“Gemini 1.5 uses a modified RoPE for long context, similar to YaRN.”
- Gemini 3.1 · medium
Deployed 2026-02-20 · Velocity 2y
“The 10M token context window suggests use of advanced RoPE extension techniques like YaRN.”
- Kimi K2.5 · medium
Deployed 2026-03-04 · Velocity 3y
“To achieve long context windows, models often use YaRN or similar RoPE extension techniques.”
- Gemini 3.1 Flash-Lite · medium
Deployed 2026-03-05 · Velocity 3y
“Gemini 1.5 models feature a 1 million token context window, achieved via novel research on efficient attention and positional encoding.”
- Mistral Small 4 · high
Deployed 2026-03-16 · Velocity 3y
“Mistral Small 4 uses YaRN for 128K context length.”
- GLM-5.1 · high
Deployed 2026-03-21 · Velocity 3y
“GLM-5.1 extends context length to 1M tokens using YaRN (Yet another RoPE extensioN) method.”
- Qwen 3.6 · medium
Deployed 2026-03-31 · Velocity 3y
“Qwen 3.6 supports a 128K context length, likely using RoPE extension techniques like YaRN.”