SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference • Paper • 2502.18137 • Published Feb 25
XAttention: Block Sparse Attention with Antidiagonal Scoring • Paper • 2503.16428 • Published Mar 20
Towards Understanding the Nature of Attention with Low-Rank Sparse Decomposition • Paper • 2504.20938 • Published Apr 29
Learning to Compress: Local Rank and Information Compression in Deep Neural Networks • Paper • 2410.07687 • Published Oct 10, 2024