-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 29 -
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 14 -
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 44 -
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 23
Collections
Discover the best community collections!
Collections including paper arxiv:2503.16219
-
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model
Paper • 2503.24290 • Published • 62 -
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders
Paper • 2503.18878 • Published • 119 -
START: Self-taught Reasoner with Tools
Paper • 2503.04625 • Published • 113 -
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
Paper • 2503.14476 • Published • 144
-
LADDER: Self-Improving LLMs Through Recursive Problem Decomposition
Paper • 2503.00735 • Published • 23 -
START: Self-taught Reasoner with Tools
Paper • 2503.04625 • Published • 113 -
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
Paper • 2503.05592 • Published • 27 -
R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcing Learning
Paper • 2503.05379 • Published • 38
-
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Paper • 2504.20571 • Published • 98 -
One RL to See Them All: Visual Triple Unified Reinforcement Learning
Paper • 2505.18129 • Published • 60 -
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't
Paper • 2503.16219 • Published • 52 -
Performance Trade-offs of Optimizing Small Language Models for E-Commerce
Paper • 2510.21970 • Published • 2
-
Skywork Open Reasoner 1 Technical Report
Paper • 2505.22312 • Published • 54 -
Unveiling Instruction-Specific Neurons & Experts: An Analytical Framework for LLM's Instruction-Following Capabilities
Paper • 2505.21191 • Published • 3 -
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Paper • 2505.03335 • Published • 188 -
Qwen3 Technical Report
Paper • 2505.09388 • Published • 317
-
Natural Language Reinforcement Learning
Paper • 2411.14251 • Published • 31 -
Towards General-Purpose Model-Free Reinforcement Learning
Paper • 2501.16142 • Published • 30 -
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't
Paper • 2503.16219 • Published • 52 -
Teaching Large Language Models to Reason with Reinforcement Learning
Paper • 2403.04642 • Published • 50
-
RuCCoD: Towards Automated ICD Coding in Russian
Paper • 2502.21263 • Published • 133 -
Unified Reward Model for Multimodal Understanding and Generation
Paper • 2503.05236 • Published • 122 -
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
Paper • 2503.05179 • Published • 46 -
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
Paper • 2503.05592 • Published • 27
-
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
Paper • 2412.18319 • Published • 39 -
Token-Budget-Aware LLM Reasoning
Paper • 2412.18547 • Published • 46 -
Efficiently Serving LLM Reasoning Programs with Certaindex
Paper • 2412.20993 • Published • 37 -
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
Paper • 2412.17256 • Published • 47
-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 29 -
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 14 -
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 44 -
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 23
-
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Paper • 2504.20571 • Published • 98 -
One RL to See Them All: Visual Triple Unified Reinforcement Learning
Paper • 2505.18129 • Published • 60 -
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't
Paper • 2503.16219 • Published • 52 -
Performance Trade-offs of Optimizing Small Language Models for E-Commerce
Paper • 2510.21970 • Published • 2
-
Skywork Open Reasoner 1 Technical Report
Paper • 2505.22312 • Published • 54 -
Unveiling Instruction-Specific Neurons & Experts: An Analytical Framework for LLM's Instruction-Following Capabilities
Paper • 2505.21191 • Published • 3 -
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Paper • 2505.03335 • Published • 188 -
Qwen3 Technical Report
Paper • 2505.09388 • Published • 317
-
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model
Paper • 2503.24290 • Published • 62 -
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders
Paper • 2503.18878 • Published • 119 -
START: Self-taught Reasoner with Tools
Paper • 2503.04625 • Published • 113 -
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
Paper • 2503.14476 • Published • 144
-
Natural Language Reinforcement Learning
Paper • 2411.14251 • Published • 31 -
Towards General-Purpose Model-Free Reinforcement Learning
Paper • 2501.16142 • Published • 30 -
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't
Paper • 2503.16219 • Published • 52 -
Teaching Large Language Models to Reason with Reinforcement Learning
Paper • 2403.04642 • Published • 50
-
RuCCoD: Towards Automated ICD Coding in Russian
Paper • 2502.21263 • Published • 133 -
Unified Reward Model for Multimodal Understanding and Generation
Paper • 2503.05236 • Published • 122 -
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
Paper • 2503.05179 • Published • 46 -
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
Paper • 2503.05592 • Published • 27
-
LADDER: Self-Improving LLMs Through Recursive Problem Decomposition
Paper • 2503.00735 • Published • 23 -
START: Self-taught Reasoner with Tools
Paper • 2503.04625 • Published • 113 -
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
Paper • 2503.05592 • Published • 27 -
R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcing Learning
Paper • 2503.05379 • Published • 38
-
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
Paper • 2412.18319 • Published • 39 -
Token-Budget-Aware LLM Reasoning
Paper • 2412.18547 • Published • 46 -
Efficiently Serving LLM Reasoning Programs with Certaindex
Paper • 2412.20993 • Published • 37 -
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
Paper • 2412.17256 • Published • 47