Posts by Collection

publications

Reward Reasoning Model

Published in NeurIPS, Poster, 2025

Reward Reasoning Models (RRMs) perform chain-of-thought reasoning before predicting rewards to improve reward accuracy.

Recommended citation: Jiaxin Guo*, Zewen Chi*, Li Dong*, Qingxiu Dong, Xun Wu, Shaohan Huang, Furu Wei. (2025). "Reward Reasoning Model." NeurIPS 2025, Poster.

AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders

Published in NeurIPS, Spotlight (top 3.5%), 2025

AdaSPEC selectively distills knowledge to improve draft–target alignment for speculative decoding.

Recommended citation: Jiaxin Guo*, Yuezhou Hu*, Xinyu Feng, Tuo Zhao. (2025). "AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders." NeurIPS 2025, Spotlight.

talks

Reward Reasoning Model

Published: December 18, 2025