Technique · training
Chinchilla Scaling Laws
Scaling law showing that compute-optimal models use ~20 training tokens per parameter, correcting the over-parameterization of GPT-3-era models, which were trained on too few tokens for their size.
Origin: DeepMind, 2022-03
Also known as: Compute-optimal scaling, Chinchilla
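A minimal sketch of the arithmetic behind the rule of thumb, assuming the standard training-cost approximation C ≈ 6·N·D FLOPs (N parameters, D tokens); the function name and budget value are illustrative, not from the paper:

```python
import math

TOKENS_PER_PARAM = 20.0    # Chinchilla rule of thumb: D/N ≈ 20
FLOPS_PER_PARAM_TOKEN = 6  # standard estimate: ~6 FLOPs per parameter per training token

def compute_optimal_allocation(compute_flops: float) -> tuple[float, float]:
    """Split a training-compute budget into (parameters, tokens).

    With C = 6*N*D and D = 20*N, we get C = 120*N^2,
    so N = sqrt(C / 120) and D = 20*N.
    (Illustrative helper, not an API from the paper.)
    """
    n_params = math.sqrt(compute_flops / (FLOPS_PER_PARAM_TOKEN * TOKENS_PER_PARAM))
    n_tokens = TOKENS_PER_PARAM * n_params
    return n_params, n_tokens

if __name__ == "__main__":
    # A budget of ~5.8e23 FLOPs recovers roughly Chinchilla itself:
    # ~70B parameters trained on ~1.4T tokens.
    n, d = compute_optimal_allocation(5.8e23)
    print(f"params ≈ {n / 1e9:.0f}B, tokens ≈ {d / 1e12:.1f}T")
```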
Products deploying: 0
Avg research → prod: —
First commercial deploy: —
Deployment timeline
No verified deployments yet in our tracked product set.