arxiv:2407.15762
Kaiwen Wang
kaiwenw
AI & ML interests
Reinforcement Learning
models
36
kaiwenw/single_node_run2-step-12170 • 2B • 6
kaiwenw/single_node_run2-step-12150 • 2B • 5
kaiwenw/single_node_run2-step-11664 • 2B • 5
kaiwenw/single_node_run2-step-11178 • 2B • 5
kaiwenw/single_node_run2-step-10692 • 2B • 7
kaiwenw/single_node_run2-step-10206 • 2B • 5
kaiwenw/single_node_run2-step-9720 • 2B • 5
kaiwenw/single_node_run2-step-9234 • 2B • 6
kaiwenw/single_node_run2-step-8748 • 2B • 6
kaiwenw/single_node_run2-step-8262 • 2B • 6
datasets
220
kaiwenw/distill-r1-qwen-1.5b-hmmt-feb-25-4096-with-bt-model-with-sigmoid • 123k • 105
kaiwenw/distill-r1-qwen-1.5b-hmmt-feb-24-4096-with-bt-model-with-sigmoid • 123k • 82
kaiwenw/distill-r1-qwen-1.5b-aime-25-4096-with-bt-model-with-sigmoid • 123k • 2.33k
kaiwenw/distill-r1-qwen-1.5b-aime-24-4096-with-bt-model-with-sigmoid • 123k • 2.64k
kaiwenw/distill-r1-qwen-1.5b-hmmt-feb-25-4096-with-bt-model-wout-sigmoid • 123k • 44
kaiwenw/distill-r1-qwen-1.5b-hmmt-feb-24-4096-with-bt-model-wout-sigmoid • 123k • 52
kaiwenw/distill-r1-qwen-1.5b-aime-25-4096-with-bt-model-wout-sigmoid • 123k • 2.22k
kaiwenw/distill-r1-qwen-1.5b-aime-24-4096-with-bt-model-wout-sigmoid • 123k • 2.49k
kaiwenw/distill-r1-qwen-1.5b-hmmt-feb-25-4096-with-old-prm-indices_61440_69120 • 7.68k • 12
kaiwenw/distill-r1-qwen-1.5b-hmmt-feb-25-4096-with-old-prm-indices_76800_84480 • 7.68k • 19