Technique · inference
StreamingLLM (Attention Sinks)
A sliding-window attention pattern with preserved initial tokens ("sinks") that enables indefinite streaming generation without quality collapse.
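The cache policy implied by the description above can be sketched as follows. This is an illustrative reconstruction, not the paper's or any library's actual API: the function name and the default of 4 sink tokens are assumptions (4 sinks matches the common configuration reported for StreamingLLM, but verify against the implementation you use).

```python
def streaming_kv_keep(seq_len: int, n_sinks: int = 4, window: int = 1020) -> list[int]:
    """Indices of KV-cache entries retained under a StreamingLLM-style policy:
    the first `n_sinks` tokens (attention sinks) plus a sliding window of the
    most recent `window` tokens. Everything in between is evicted, so the
    cache stays at most `n_sinks + window` entries regardless of stream length.
    """
    if seq_len <= n_sinks + window:
        # Cache not full yet: keep everything.
        return list(range(seq_len))
    recent_start = seq_len - window
    return list(range(n_sinks)) + list(range(recent_start, seq_len))
```

For example, with 2 sinks and a window of 4, a 10-token stream keeps indices `[0, 1, 6, 7, 8, 9]`: the sinks anchor the softmax distribution while the window holds recent context, which is what lets generation continue indefinitely without the quality collapse seen with a plain sliding window.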
Products deploying: 0
Avg research → prod: —
First commercial deploy: —
Deployment timeline
No verified deployments yet in our tracked product set.