HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning Paper • 2603.17024 • Published 6 days ago • 93
Stroke of Surprise: Progressive Semantic Illusions in Vector Sketching Paper • 2602.12280 • Published Feb 12 • 34
Thinking in Uncertainty: Mitigating Hallucinations in MLRMs with Latent Entropy-Aware Decoding Paper • 2603.13366 • Published 14 days ago • 93
Beyond Language Modeling: An Exploration of Multimodal Pretraining Paper • 2603.03276 • Published 20 days ago • 100
MOSSBench: Is Your Multimodal Language Model Oversensitive to Safe Queries? Paper • 2406.17806 • Published Jun 22, 2024 • 2
MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification Paper • 2603.15726 • Published 7 days ago • 176
MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data Paper • 2603.09206 • Published 14 days ago • 52
MoltBook - AI agent-only Society Collection • MoltBook datasets and papers • 2 items • Updated 18 days ago • 1
From Blind Spots to Gains: Diagnostic-Driven Iterative Training for Large Multimodal Models Paper • 2602.22859 • Published 25 days ago • 151
Compositional Generalization Requires Linear, Orthogonal Representations in Vision Embedding Models Paper • 2602.24264 • Published 24 days ago • 14
Overconfident Errors Need Stronger Correction: Asymmetric Confidence Penalties for Reinforcement Learning Paper • 2602.21420 • Published 27 days ago • 6
Solaris: Building a Multiplayer Video World Model in Minecraft Paper • 2602.22208 • Published 26 days ago • 28
SkillOrchestra: Learning to Route Agents via Skill Transfer Paper • 2602.19672 • Published 28 days ago • 56
From Perception to Action: An Interactive Benchmark for Vision Reasoning Paper • 2602.21015 • Published 27 days ago • 23
Humanual Datasets Collection • Benchmarking LLM-based user simulators • 7 items • Updated 21 days ago • 2
The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning Paper • 2601.06002 • Published Jan 9 • 57