Reward Reasoning Model
Published in NeurIPS, Poster, 2025
Reward Reasoning Models (RRMs) perform chain-of-thought reasoning before predicting a reward, with the aim of improving reward accuracy.
Recommended citation: Jiaxin Guo*, Zewen Chi*, Li Dong*, Qingxiu Dong, Xun Wu, Shaohan Huang, Furu Wei. (2025). "Reward Reasoning Model." NeurIPS 2025, Poster.