Collections

Discover the best community collections!

Collections including paper arxiv:2512.03442

PretrainZero: Reinforcement Active Pretraining

Paper • 2512.03442 • Published 7 days ago • 41

PretrainZero: Reinforcement Active Pretraining

Paper • 2512.03442 • Published 7 days ago • 41
UniQL: Unified Quantization and Low-rank Compression for Adaptive Edge LLMs

Paper • 2512.03383 • Published 7 days ago • 3
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Paper • 2511.21689 • Published 13 days ago • 98
Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models

Paper • 2511.18890 • Published 15 days ago • 29

Salesforce/blip2-opt-2.7b

Image-Text-to-Text • 4B • Updated Feb 3 • 756k • 424
timbrooks/instruct-pix2pix

Image-to-Image • Updated Jul 5, 2023 • 61k • 1.16k
huggan/pix2pix-edge2shoes

Updated Apr 15, 2022 • 3
KaiChen1998/geodiffusion-nuimages-time-weather-512x512

Text-to-Image • Updated Dec 5, 2024 • 10

about 3 hours ago

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8 • 526 • 98
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12 • 36
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23 • 88

Papers-LLM-Training

PretrainZero: Reinforcement Active Pretraining

Paper • 2512.03442 • Published 7 days ago • 41

about 10 hours ago

Scaling Latent Reasoning via Looped Language Models

Paper • 2510.25741 • Published Oct 29 • 219
Every Token Counts: Generalizing 16M Ultra-Long Context in Large Language Models

Paper • 2511.23319 • Published 11 days ago • 21
Focused Chain-of-Thought: Efficient LLM Reasoning via Structured Input Information

Paper • 2511.22176 • Published 13 days ago • 4
FedRE: A Representation Entanglement Framework for Model-Heterogeneous Federated Learning

Paper • 2511.22265 • Published 12 days ago • 1

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2 • 187
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Paper • 2511.21689 • Published 13 days ago • 98
PretrainZero: Reinforcement Active Pretraining

Paper • 2512.03442 • Published 7 days ago • 41

Reinforcement learning

about 9 hours ago

Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning

Paper • 2407.20798 • Published Jul 30, 2024 • 24
Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published Dec 20, 2024 • 38
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4 • 103
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Paper • 2502.18449 • Published Feb 25 • 75
