Shenao Zhang

ZhangShenao
https://shenao-zhang.github.io/
  • ShenaoZhang

Organizations

Georgia Institute of Technology

authored a paper 4 months ago

Learning to Reason as Action Abstractions with Scalable Mid-Training RL

Paper • 2509.25810 • Published Sep 30, 2025 • 6
authored a paper 8 months ago

Beyond Markovian: Reflective Exploration via Bayes-Adaptive RL for LLM Reasoning

Paper • 2505.20561 • Published May 26, 2025 • 7
authored 4 papers about 1 year ago

Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer

Paper • 2405.16436 • Published May 26, 2024 • 1

Reward-Augmented Data Enhances Direct Preference Alignment of LLMs

Paper • 2410.08067 • Published Oct 10, 2024 • 2

DSTC: Direct Preference Learning with Only Self-Generated Tests and Code to Improve Code LMs

Paper • 2411.13611 • Published Nov 20, 2024

Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published Dec 20, 2024 • 38
authored a paper over 1 year ago

Self-Exploring Language Models: Active Preference Elicitation for Online Alignment

Paper • 2405.19332 • Published May 29, 2024 • 22