aloobun/gpt2-hh-rlhf

Text Generation

reward modeling

text-generation-inference


A reward modeling experiment using the Anthropic/hh-rlhf dataset.
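The card does not say how the reward model was trained. As a rough illustration only: the Anthropic/hh-rlhf dataset pairs a "chosen" response with a "rejected" one, and a common objective for such pairs is the pairwise logistic loss, -log(sigmoid(r_chosen - r_rejected)). The sketch below assumes that objective; the function name `pairwise_reward_loss` and the plain-float scores are hypothetical and not taken from this repository.

```python
import math


def pairwise_reward_loss(chosen_scores, rejected_scores):
    """Mean pairwise logistic loss over a batch of (chosen, rejected) reward scores.

    Assumption: this is a commonly used reward modeling objective, not
    necessarily the one used to train this particular model.
    """
    if len(chosen_scores) != len(rejected_scores):
        raise ValueError("chosen and rejected score lists must have equal length")
    if not chosen_scores:
        raise ValueError("at least one pair is required")

    total = 0.0
    for r_chosen, r_rejected in zip(chosen_scores, rejected_scores):
        margin = r_chosen - r_rejected
        # -log(sigmoid(m)) == log(1 + exp(-m)); written in a numerically stable form.
        total += max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
    return total / len(chosen_scores)


if __name__ == "__main__":
    # Hypothetical scalar rewards for two preference pairs.
    print(pairwise_reward_loss([1.5, 0.2], [0.3, 0.9]))
```

The loss shrinks as the model scores the chosen response above the rejected one, and equals log(2) when both receive the same score.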

Downloads last month: 6