Technique · architecture
Rotary Position Embedding (RoPE)
A relative-position encoding that rotates query/key vector pairs by position-dependent angles (equivalently, a rotation in complex space), so attention scores depend only on the offset between tokens; this gives transformers better length extrapolation than absolute sinusoidal embeddings.
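A minimal NumPy sketch of the idea, under illustrative assumptions (the toy head dimension, the base of 10000, and the example positions are placeholders, not taken from any model listed below). It rotates consecutive dimension pairs of a query and a key by position-dependent angles and checks that their dot product depends only on the relative offset:

```python
# Minimal RoPE sketch (NumPy). Head dimension, base, and positions are
# illustrative assumptions, not details of any specific model below.
import numpy as np

def rope(x, pos, base=10000.0):
    """Rotate consecutive dimension pairs of x by position-dependent angles."""
    d = x.shape[-1]                       # head dimension, must be even
    half = d // 2
    freqs = base ** (-np.arange(half) / half)  # one frequency per pair
    angles = pos * freqs
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[..., 0::2], x[..., 1::2]   # split into (even, odd) pairs
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin  # 2-D rotation of each pair
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

rng = np.random.default_rng(0)
q, k = rng.normal(size=8), rng.normal(size=8)

# Attention score depends only on the relative offset between positions:
s1 = rope(q, pos=7) @ rope(k, pos=3)      # offset 4
s2 = rope(q, pos=104) @ rope(k, pos=100)  # offset 4, shifted positions
print(np.isclose(s1, s2))                 # True
```

Production implementations differ in pairing convention (interleaved pairs vs. split halves) and in how the rotation interacts with the KV cache, but the relative-offset property shown above is what the technique relies on for its length-extrapolation behaviour.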
Deployment timeline
- Llama 4 Scout · high
Deployed 2025-04-05 · Velocity 4y
“Based on Llama 3 architecture which uses RoPE; Llama 4 Scout is a direct evolution.”
- Llama 4 Maverick · high
Deployed 2025-04-05 · Velocity 4y
“Llama family models consistently use RoPE. Llama 4 is a direct successor.”
- Claude Opus 4.6 · medium
Deployed 2026-02-16 · Velocity 5y
“Anthropic's research mentions using rotary position embeddings (RoPE) in transformer architectures.”
- GPT-4o · high
Deployed 2026-02-16 · Velocity 5y
“GPT models since GPT-3 use rotary position embeddings (RoPE). GPT-4o's architecture is a direct evolution.”
- GPT-5 · high
Deployed 2026-02-16 · Velocity 5y
“RoPE is a standard positional encoding used in modern Transformer LLMs, including GPT series.”
- Claude 3 · high
Deployed 2026-02-18 · Velocity 5y
“Claude 3 uses Rotary Position Embeddings (RoPE) for positional encoding, per technical details.”
- Gemini 3.1 · medium
Deployed 2026-02-20 · Velocity 5y
“Gemini models use Rotary Position Embeddings (RoPE) for position encoding.”
- Claude 3.5 Sonnet · high
Deployed 2026-02-23 · Velocity 5y
“Anthropic's Claude models use rotary position embeddings (RoPE) for position encoding.”
- Claude Haiku 4.5 · high
Deployed 2026-02-25 · Velocity 5y
“Claude models use rotary position embeddings (RoPE) for positional encoding.”
- GPT-5.3 · medium
Deployed 2026-02-26 · Velocity 5y
“RoPE is a standard position encoding in modern LLMs; GPT-5.3 likely uses it for better length extrapolation.”
- Claude 4.5 · medium
Deployed 2026-02-26 · Velocity 5y
“Anthropic's Claude models use rotary position embeddings (RoPE) for positional encoding.”
- Gemini 3 Flash · high
Deployed 2026-02-27 · Velocity 5y
“Gemini models use rotary position embeddings (RoPE), as confirmed in the Gemini 1.5 technical report.”
- GPT-OSS-120B · medium
Deployed 2026-03-02 · Velocity 5y
“As a large language model in the GPT lineage, it almost certainly uses Rotary Position Embedding (RoPE), which is standard in modern transformer architectures.”
- Grok 4.20 · medium
Deployed 2026-03-02 · Velocity 5y
“Grok models are based on the Transformer architecture, which commonly uses RoPE for position encoding.”
- Kimi K2.5 · medium
Deployed 2026-03-04 · Velocity 5y
“Most modern LLMs use RoPE for position encoding; Kimi K2.5's long-context capability aligns with this.”
- Gemini 3.1 Flash-Lite · high
Deployed 2026-03-05 · Velocity 5y
“Gemini models use rotary position embeddings (RoPE) for positional encoding.”
- Mistral Small 4 · high
Deployed 2026-03-16 · Velocity 5y
“Mistral models use Rotary Position Embeddings (RoPE).”
- GLM-5.1 · high
Deployed 2026-03-21 · Velocity 5y
“GLM-5.1 uses Rotary Position Embedding (RoPE) for positional encoding.”
- Qwen 3.6 · high
Deployed 2026-03-31 · Velocity 5y
“Qwen models use Rotary Position Embedding (RoPE) for positional encoding.”
- GPT-5.4-Cyber · medium
Deployed 2026-04-16 · Velocity 5y
“GPT models use Rotary Position Embeddings (RoPE) for positional encoding.”
- Claude Opus 4.7 · high
Deployed 2026-04-16 · Velocity 5y
“Anthropic's Claude 3 model card mentions using rotary position embeddings (RoPE). This is a standard architectural component for their models.”