Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective • Paper • 2505.15045 • Published May 21, 2025
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention • Paper • 2506.13585 • Published Jun 16, 2025
Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression • Paper • 2506.09482 • Published Jun 11, 2025