MM-PoisonRAG: Disrupting Multimodal RAG with Local and Global Poisoning Attacks Paper • 2502.17832 • Published Feb 25 • 6
EBT-Policy: Energy Unlocks Emergent Physical Reasoning Capabilities Paper • 2510.27545 • Published Oct 31 • 48
Multimodal Policy Internalization for Conversational Agents Paper • 2510.09474 • Published Oct 10 • 4
Where LLM Agents Fail and How They can Learn From Failures Paper • 2509.25370 • Published Sep 29 • 11
FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games Paper • 2509.01052 • Published Sep 1 • 21
Perception-Aware Policy Optimization for Multimodal Reasoning Paper • 2507.06448 • Published Jul 8 • 47
Energy-Based Transformers are Scalable Learners and Thinkers Paper • 2507.02092 • Published Jul 2 • 69
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset Paper • 2505.09568 • Published May 14 • 97
DyMU: Dynamic Merging and Virtual Unmerging for Efficient VLMs Paper • 2504.17040 • Published Apr 23 • 13
MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents Paper • 2503.01935 • Published Mar 3 • 29