Recipe · Claude 3.5 Sonnet
Ingredient list
Invented by Stanford · 2022-05 · Velocity 4y
“Anthropic's research mentions using FlashAttention for efficient training of their transformer models.”
inference · medium
Invented by University of Tokyo · 2022-05 · Velocity 4y
“Claude exhibits zero-shot chain-of-thought reasoning when prompted appropriately.”
reasoning · high
Invented by Google · 2022-01 · Velocity 4y
“Claude 3.5 Sonnet demonstrates chain-of-thought reasoning in its outputs when solving complex problems.”
reasoning · high
Invented by Google · 2021-09 · Velocity 4y
“Claude models are instruction-tuned on diverse tasks to follow instructions and generalize to new tasks.”
training · high
Invented by Zhuiyi Technology · 2021-04 · Velocity 5y
“Anthropic's Claude models use rotary position embeddings (RoPE) for position encoding.”
architecture · high
Invented by Google · 2017-06 · Velocity 9y
“Claude 3.5 Sonnet is built on transformer architecture with self-attention mechanisms.”
architecture · high
This recipe is part of the gentic.news Deployment Atlas. Every ingredient has an origin paper and supporting evidence. The methodology is public, and the dataset is licensed under CC BY 4.0.