8 1

liu zh

morphism42

AI & ML interests

None yet

Organizations

None yet

upvoted a paper 2 months ago

On Predictability of Reinforcement Learning Dynamics for Large Language Models

Paper • 2510.00553 • Published Oct 1 • 8

upvoted a paper 4 months ago

Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference

Paper • 2508.02193 • Published Aug 4 • 132

upvoted a paper 10 months ago

Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search

Paper • 2502.02508 • Published Feb 4 • 23

upvoted an article about 1 year ago

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Article • Sep 18, 2024 • 273

upvoted 4 articles over 1 year ago

How NuminaMath Won the 1st AIMO Progress Prize

Article • Jul 11, 2024 • 124

Illustrating Reinforcement Learning from Human Feedback (RLHF)

Article • Dec 9, 2022 • 380

Fine-tune Llama 3 with ORPO

Article • Apr 22, 2024 • 241

Personal Copilot: Train Your Own Coding Assistant

Article • Oct 27, 2023 • 75