Collections
Collections including paper arxiv:2512.03442
-
PretrainZero: Reinforcement Active Pretraining
Paper • 2512.03442 • Published • 41
UniQL: Unified Quantization and Low-rank Compression for Adaptive Edge LLMs
Paper • 2512.03383 • Published • 3
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration
Paper • 2511.21689 • Published • 98
Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models
Paper • 2511.18890 • Published • 29
-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 526 • 98
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 36
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 88
-
Scaling Latent Reasoning via Looped Language Models
Paper • 2510.25741 • Published • 219
Every Token Counts: Generalizing 16M Ultra-Long Context in Large Language Models
Paper • 2511.23319 • Published • 21
Focused Chain-of-Thought: Efficient LLM Reasoning via Structured Input Information
Paper • 2511.22176 • Published • 4
FedRE: A Representation Entanglement Framework for Model-Heterogeneous Federated Learning
Paper • 2511.22265 • Published • 1
-
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
Paper • 2506.01939 • Published • 187
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration
Paper • 2511.21689 • Published • 98
PretrainZero: Reinforcement Active Pretraining
Paper • 2512.03442 • Published • 41
-
Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning
Paper • 2407.20798 • Published • 24
Offline Reinforcement Learning for LLM Multi-Step Reasoning
Paper • 2412.16145 • Published • 38
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
Paper • 2501.03262 • Published • 103
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
Paper • 2502.18449 • Published • 75