gentic.news — AI News Intelligence Platform

Technique · inference

GPTQ Quantization

Post-training quantization of LLM weights to 3-4 bits using second-order (Hessian) information, enabling models at the 175B-parameter scale to run inference on a single GPU.
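The core idea can be sketched in a few lines: quantize the weight matrix one column at a time, and after each column use the inverse-Hessian (via its Cholesky factor, as in the paper's formulation) to push the quantization error onto the not-yet-quantized columns. This is a minimal NumPy sketch under simplifying assumptions (symmetric per-row grid, no grouping, no activation ordering); the function name and parameters are illustrative, not from any library.

```python
import numpy as np

def gptq_quantize(W, H, bits=4, damp=0.01):
    """GPTQ-style one-shot quantization sketch.

    W: (rows, cols) weight matrix, one row per output channel.
    H: (cols, cols) layer Hessian, ~ 2 * X @ X.T from calibration inputs.
    Returns the quantized-dequantized weights and the per-row scales.
    """
    W = W.astype(np.float64).copy()
    cols = W.shape[1]
    maxq = 2 ** (bits - 1) - 1                    # 7 for symmetric 4-bit
    scale = np.max(np.abs(W), axis=1, keepdims=True) / maxq
    # Dampen the Hessian diagonal for numerical stability, then take the
    # upper-triangular Cholesky factor of its inverse.
    H = H + damp * np.mean(np.diag(H)) * np.eye(cols)
    U = np.linalg.cholesky(np.linalg.inv(H)).T
    Q = np.zeros_like(W)
    for j in range(cols):
        w = W[:, j]
        # Round-to-nearest onto the per-row grid.
        q = np.clip(np.round(w / scale[:, 0]), -maxq - 1, maxq) * scale[:, 0]
        Q[:, j] = q
        # Propagate the quantization error to the remaining columns,
        # weighted by second-order information.
        err = (w - q) / U[j, j]
        W[:, j + 1:] -= np.outer(err, U[j, j + 1:])
    return Q, scale
```

The sequential error compensation is what distinguishes GPTQ from plain round-to-nearest: each column's rounding error is absorbed by later columns in proportion to their correlation under the Hessian, which is why accuracy holds up even at 3-4 bits.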

Origin: ISTA, 2022-10 · Read origin paper →
Also known as: GPTQ, 4-bit GPTQ
Products deploying: 0
Avg research → prod: —
First commercial deploy: —

Deployment timeline

No verified deployments yet in our tracked product set.