Reward Modeling with Ordinal Feedback: Wisdom of the Crowd

4 citations · ranked #1154 of 3340 papers in ICML 2025 · 4 top authors · 4 data points

Abstract

The canonical setup of learning a reward model (RM) from human preferences with binary feedback discards potentially useful samples (such as "tied" between the two responses) and loses fine-grained information (such as "slightly better"). This paper proposes a framework for learning RMs under ordinal feedback, generalizing binary feedback to arbitrary granularity. We first identify a marginal unbiasedness condition, which generalizes the existing assumption for binary feedback. The condition is motivated by the sociological concept of the "wisdom of the crowd". Under this condition, we develop a natural probability model and prove the benefits of fine-grained feedback in terms of reducing the Rademacher complexity, which may be of independent interest to another problem: the bias-variance trade-off in knowledge distillation. The framework also sheds light on guidelines for human annotators. Our numerical experiments validate that: (1) fine-grained feedback leads to better RM learning in both in- and out-of-distribution settings; (2) incorporating a certain proportion of tied samples boosts RM learning.
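To make the idea concrete, here is a minimal sketch of how ordinal feedback could enter an RM training objective: ordinal labels are mapped to soft preference probabilities, and the usual binary Bradley-Terry cross-entropy is applied to those soft targets. The label mapping, function names, and loss choice below are illustrative assumptions for this sketch, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

# Illustrative mapping from ordinal annotations to a soft preference
# probability p = P(response A preferred over response B).
# These specific values are assumptions for this sketch only.
ORDINAL_TO_PROB = {
    "A much better": 1.0,
    "A slightly better": 0.75,
    "tie": 0.5,
    "B slightly better": 0.25,
    "B much better": 0.0,
}

def ordinal_rm_loss(reward_a: torch.Tensor,
                    reward_b: torch.Tensor,
                    soft_label: torch.Tensor) -> torch.Tensor:
    """Soft-label Bradley-Terry style loss.

    With soft_label in {0, 1} this reduces to the standard
    binary-feedback RM objective; intermediate values encode
    "slightly better" or "tied" annotations.
    """
    logits = reward_a - reward_b  # pairwise reward difference
    return F.binary_cross_entropy_with_logits(logits, soft_label,
                                               reduction="mean")

# Toy usage: scalar rewards for three comparison pairs and their labels.
ra = torch.tensor([1.2, 0.3, -0.5])
rb = torch.tensor([0.4, 0.3, 0.8])
labels = torch.tensor([ORDINAL_TO_PROB["A slightly better"],
                       ORDINAL_TO_PROB["tie"],
                       ORDINAL_TO_PROB["B much better"]])
loss = ordinal_rm_loss(ra, rb, labels)
```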

Citation History

Jan 28, 2026: 0 citations · Feb 13, 2026: 4 citations