Recipe

Qwen 3.6

Qwen 3.6 is Alibaba's next-generation open-source large language model, offering improved speed and multimodal capabilities as a successor to the Qwen 3.5 series.

Techniques inside

Median research → prod

Fastest adoption

Slowest adoption
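
The summary figures above can be recomputed from the Velocity values printed in the ingredient list. The sketch below assumes the median is taken over the whole-year velocities exactly as listed; the Atlas may work from exact dates, so its own figures could differ slightly. The variable names are illustrative.

```python
from statistics import median

# Velocity in years for ingredients #1..#13, copied from the ingredient list.
velocity_years = {
    "YaRN RoPE Context Extension": 3,
    "Grouped-Query Attention (GQA)": 3,
    "LLaVA (Visual Instruction Tuning)": 3,
    "FlashAttention": 4,
    "Zero-Shot Chain-of-Thought": 4,
    "Self-Consistency": 4,
    "Chain-of-Thought Prompting": 4,
    "Instruction Tuning (FLAN)": 5,
    "LoRA (Low-Rank Adaptation)": 5,
    "Rotary Position Embedding (RoPE)": 5,
    "Vision Transformer (ViT)": 5,
    "Transformer Self-Attention": 9,
    "Mixture of Experts (Sparse MoE for LLMs)": 9,
}

techniques_inside = len(velocity_years)           # number of ingredients
median_velocity = median(velocity_years.values()) # median research-to-prod gap
fastest = min(velocity_years.values())            # shortest gap in years
slowest = max(velocity_years.values())            # longest gap in years

print(techniques_inside, median_velocity, fastest, slowest)
```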

Ingredient list

#1 YaRN RoPE Context Extension
Invented by Nous Research · 2023-08 · Velocity 3y
“Qwen 3.6 supports a 128K context length, likely using RoPE extension techniques like YaRN.”
architecture · medium
#2 Grouped-Query Attention (GQA)
Invented by Google · 2023-05 · Velocity 3y
“Qwen 3.6 uses GQA to reduce memory usage and improve inference speed.”
architecture · high
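
The memory saving from GQA comes mostly from the key/value cache: several query heads share one key/value head, so fewer keys and values are stored per token. A rough back-of-the-envelope sketch follows. The layer count, head counts, head size, and sequence length are hypothetical and are not Qwen 3.6's published configuration.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Key/value cache size for one sequence.

    The leading factor 2 counts both keys and values;
    bytes_per_value=2 assumes 16-bit storage.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value

# Hypothetical model shape, chosen only to make the arithmetic easy to follow.
LAYERS, QUERY_HEADS, HEAD_DIM, SEQ_LEN = 32, 32, 128, 8192

# Standard multi-head attention: one KV head per query head.
mha = kv_cache_bytes(LAYERS, kv_heads=QUERY_HEADS, head_dim=HEAD_DIM, seq_len=SEQ_LEN)
# GQA with 8 KV heads: each KV head is shared by 4 query heads.
gqa = kv_cache_bytes(LAYERS, kv_heads=8, head_dim=HEAD_DIM, seq_len=SEQ_LEN)

print(mha / 2**30, gqa / 2**30)  # cache sizes in GiB
```

Under these assumptions the cache shrinks by the ratio of query heads to KV heads, here a factor of 4.
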
#3 LLaVA (Visual Instruction Tuning)
Invented by University of Wisconsin · 2023-04 · Velocity 3y
“Qwen 3.6 includes a multimodal version (Qwen-VL) that uses a vision encoder and projector.”
multimodal · high
#4 FlashAttention
Invented by Stanford · 2022-05 · Velocity 4y
“Qwen models utilize FlashAttention for efficient training and inference.”
inference · high
#5 Zero-Shot Chain-of-Thought
Invented by University of Tokyo · 2022-05 · Velocity 4y
“Qwen 3.6 exhibits strong zero-shot reasoning capabilities.”
reasoning · medium
#6 Self-Consistency
Invented by Google · 2022-03 · Velocity 4y
“Qwen 3.6 demonstrates improved reasoning, which can be enhanced with self-consistency.”
reasoning · medium
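
Self-consistency samples several reasoning paths for the same question and keeps the final answer that appears most often. The sketch below shows only the voting step. It assumes the final answers have already been extracted from each sampled completion; the sample strings are made up for illustration.

```python
from collections import Counter

def self_consistent_answer(final_answers):
    """Return the most frequent final answer and its share of the votes."""
    votes = Counter(a.strip() for a in final_answers)
    answer, count = votes.most_common(1)[0]
    return answer, count / len(final_answers)

# Hypothetical final answers from five sampled reasoning paths.
samples = ["18", "18", "26", "18 ", "17"]
answer, share = self_consistent_answer(samples)

print(answer, share)
```

The vote share can serve as a crude confidence signal: a low share suggests the sampled paths disagree.
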
#7 Chain-of-Thought Prompting
Invented by Google · 2022-01 · Velocity 4y
“Qwen 3.6 demonstrates strong reasoning capabilities, including step-by-step reasoning.”
reasoning · high
#8 Instruction Tuning (FLAN)
Invented by Google · 2021-09 · Velocity 5y
“Qwen models are instruction-tuned on a large collection of datasets.”
training · high
#9 LoRA (Low-Rank Adaptation)
Invented by Microsoft · 2021-06 · Velocity 5y
“Qwen models support LoRA for efficient fine-tuning.”
training · high
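
LoRA keeps the pretrained weight matrix W frozen and trains two small matrices B and A whose product is a low-rank update, giving an effective weight of W + (alpha / r) · B·A. The sketch below is a toy illustration in plain Python, not a fine-tuning recipe. The matrices, the rank, and the 4096 × 4096 layer size are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_merge(w, b, a, alpha, r):
    """Return W + (alpha / r) * (B @ A), the merged weight matrix."""
    delta = matmul(b, a)
    scale = alpha / r
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

def trainable_params(d_out, d_in, r):
    """Parameters in B (d_out x r) plus A (r x d_in)."""
    return r * (d_out + d_in)

# Toy 2x2 frozen weight with a rank-1 update (all values hypothetical).
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]   # d_out x r
A = [[0.5, -0.5]]    # r x d_in
merged = lora_merge(W, B, A, alpha=2, r=1)

full = 4096 * 4096                        # full fine-tuning of one layer
lora = trainable_params(4096, 4096, r=8)  # LoRA adapter for the same layer
print(merged, full // lora)
```

For the hypothetical 4096 × 4096 layer at rank 8, the adapter trains 256 times fewer parameters than the full matrix.
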
#10 Rotary Position Embedding (RoPE)
Invented by Zhuiyi Technology · 2021-04 · Velocity 5y
“Qwen models use Rotary Position Embedding (RoPE) for positional encoding.”
architecture · high
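
RoPE encodes position by rotating each pair of query and key dimensions through an angle proportional to the token position, so the attention score depends only on the relative offset between the two tokens. A minimal sketch follows. It assumes the adjacent-pair layout and base 10000 from the original RoPE formulation; production implementations often pair dimensions differently, and the vectors here are arbitrary.

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate consecutive pairs of `vec` by position-dependent angles."""
    dim = len(vec)
    out = []
    for i in range(0, dim, 2):
        theta = base ** (-i / dim)   # rotation frequency for this pair
        angle = pos * theta
        c, s = math.cos(angle), math.sin(angle)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out

def dot(a, b):
    """Plain dot product, standing in for one attention logit."""
    return sum(x * y for x, y in zip(a, b))

# Arbitrary query and key vectors (even dimension required).
q = [1.0, 0.5, -0.3, 2.0]
k = [0.2, -1.0, 0.7, 0.4]

# Same relative offset (3) at two different absolute positions.
score_a = dot(rope(q, 5), rope(k, 2))
score_b = dot(rope(q, 105), rope(k, 102))
print(abs(score_a - score_b) < 1e-9)
```

The two scores agree up to floating-point error, which is the relative-position property that context-extension methods such as YaRN build on.
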
#11 Vision Transformer (ViT)
Invented by Google · 2020-10 · Velocity 5y
“Qwen 3.6's multimodal version uses a Vision Transformer (ViT) as its vision encoder.”
multimodal · high
#12 Transformer Self-Attention
Invented by Google · 2017-06 · Velocity 9y
“Qwen 3.6 is a Transformer-based large language model.”
architecture · high
#13 Mixture of Experts (Sparse MoE for LLMs)
Invented by Google · 2017-01 · Velocity 9y
“Qwen 3.6 includes a MoE version (Qwen 3.6 MoE) with 14B active parameters.”
architecture · high
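
"Active parameters" in a sparse MoE model refers to the experts a token is actually routed to: a router scores every expert, only the top few are run, and their outputs are mixed by the renormalised router weights. The sketch below illustrates top-k routing for a single token. The expert functions, router logits, and top_k=2 are hypothetical and say nothing about how Qwen 3.6 MoE is configured.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, router_logits, experts, top_k=2):
    """Route one token to its top_k experts and mix their outputs."""
    # Indices of the top_k experts by router logit (ties keep index order).
    chosen = sorted(range(len(experts)),
                    key=lambda i: router_logits[i], reverse=True)[:top_k]
    # Renormalise gate weights over the chosen experts only.
    gates = softmax([router_logits[i] for i in chosen])
    out = [0.0] * len(x)
    for gate, idx in zip(gates, chosen):
        y = experts[idx](x)  # only the chosen experts are evaluated
        out = [o + gate * v for o, v in zip(out, y)]
    return out, chosen

# Four hypothetical "experts"; each simply scales its input.
experts = [lambda v, s=s: [s * t for t in v] for s in (1.0, 2.0, 3.0, 4.0)]
out, chosen = moe_forward([1.0, -1.0],
                          router_logits=[0.1, 2.0, -1.0, 2.0],
                          experts=experts)
print(chosen, out)
```

Here only two of the four experts run for the token, which is why a MoE model's active parameter count is smaller than its total.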

This recipe is part of the gentic.news Deployment Atlas. Every ingredient has an origin paper + evidence. Methodology is public. Dataset is CC BY 4.0.