gentic.news — AI News Intelligence Platform

Technique · inference

INT8 Weight Quantization for LLMs

Row-wise and vector-wise INT8 quantization with outlier detection, enabling 8-bit matrix multiplication for LLM inference with no measurable degradation in model quality.

Origin: University of Washington, 2022-08 · Read origin paper → · Also known as: LLM.int8()
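The core idea can be sketched in a few lines of NumPy: split the hidden dimensions into a small set of outlier columns (kept in floating point) and the rest (quantized to INT8 with vector-wise scales, per token row of the activations and per output column of the weights), then combine the two matmuls. This is a minimal illustration, not the paper's implementation; the function name, the 6.0 threshold default, and the shapes are assumptions chosen for clarity.

```python
import numpy as np

def int8_matmul_with_outliers(X, W, threshold=6.0):
    """Hypothetical sketch of LLM.int8()-style mixed-precision matmul.

    X: (tokens, d) activations, W: (d, out) weights, both float.
    """
    # Outlier decomposition: any hidden dim where some activation
    # magnitude reaches the threshold stays in floating point.
    outlier_mask = (np.abs(X) >= threshold).any(axis=0)
    X_out, W_out = X[:, outlier_mask], W[outlier_mask, :]
    X_reg, W_reg = X[:, ~outlier_mask], W[~outlier_mask, :]

    # Vector-wise scales: one per activation row, one per weight column.
    sx = np.abs(X_reg).max(axis=1, keepdims=True) / 127.0
    sw = np.abs(W_reg).max(axis=0, keepdims=True) / 127.0
    sx[sx == 0] = 1.0
    sw[sw == 0] = 1.0

    # Quantize to INT8, multiply with INT32 accumulation, dequantize.
    Xq = np.round(X_reg / sx).astype(np.int8)
    Wq = np.round(W_reg / sw).astype(np.int8)
    Y = (Xq.astype(np.int32) @ Wq.astype(np.int32)) * sx * sw

    # Add the floating-point matmul over the outlier dimensions.
    return Y + X_out @ W_out
```

Keeping the handful of outlier columns in floating point is what makes this work at scale: those dimensions would otherwise dominate the per-row scale and crush the resolution available to all other values.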
Products deploying: 0
Avg research → prod: —
First commercial deploy: —

Deployment timeline

No verified deployments yet in our tracked product set.

Techniques built on this