Technique · inference
INT8 Weight Quantization for LLMs
Row-wise and vector-wise INT8 quantization with outlier detection, enabling 8-bit LLM inference without accuracy degradation.
Products deploying: 0
Avg research → prod: —
First commercial deploy: —
Deployment timeline
No verified deployments yet in our tracked product set.
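The scheme described above can be sketched in a few lines: quantize each weight row to INT8 with a per-row absmax scale, while columns containing large-magnitude outliers are split off and kept in full precision. This is a minimal NumPy sketch, not a production implementation; the function names and the magnitude threshold (6.0) are illustrative assumptions.

```python
import numpy as np

def quantize_rowwise_int8(x, outlier_threshold=6.0):
    """Per-row absmax INT8 quantization with outlier column extraction.

    Columns whose maximum absolute value exceeds `outlier_threshold`
    are kept in full precision (they would otherwise dominate the
    scale and destroy resolution for the remaining values). All other
    columns are mapped to the INT8 range [-127, 127].
    Threshold and function names are illustrative, not a library API.
    """
    # Identify outlier columns by their column-wise absolute maximum.
    outlier_cols = np.where(np.abs(x).max(axis=0) > outlier_threshold)[0]
    regular_cols = np.setdiff1d(np.arange(x.shape[1]), outlier_cols)

    x_reg = x[:, regular_cols]
    # One absmax scale per row; guard against all-zero rows.
    scale = np.abs(x_reg).max(axis=1, keepdims=True)
    scale[scale == 0] = 1.0
    x_int8 = np.round(x_reg / scale * 127).astype(np.int8)

    # Outlier columns stay in the original floating-point precision.
    return x_int8, scale, x[:, outlier_cols], outlier_cols, regular_cols

def dequantize_rowwise_int8(x_int8, scale):
    """Recover approximate float values from INT8 codes and row scales."""
    return x_int8.astype(np.float32) / 127.0 * scale
```

In use, the INT8 part of a matrix multiply runs in integer arithmetic while the extracted outlier columns are multiplied in FP16/FP32 and the two partial results are summed; the worst-case rounding error per element is bounded by `scale / 127` for its row.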