2203.07472
Cited By
Uncertainty Estimation for Language Reward Models
14 March 2022
Adam Gleave
G. Irving
UQLM
Papers citing
"Uncertainty Estimation for Language Reward Models"
9 / 9 papers shown
On the Robustness of Reward Models for Language Model Alignment
Jiwoo Hong
Noah Lee
Eunki Kim
Guijin Son
Woojin Chung
Aman Gupta
Shao Tang
James Thorne
29
0
0
12 May 2025
Routing to the Expert: Efficient Reward-guided Ensemble of Large Language Models
Keming Lu
Hongyi Yuan
Runji Lin
Junyang Lin
Zheng Yuan
Chang Zhou
Jingren Zhou
MoE
LRM
42
52
0
15 Nov 2023
Scaling Laws for Reward Model Overoptimization
Leo Gao
John Schulman
Jacob Hilton
ALM
41
481
0
19 Oct 2022
To Softmax, or not to Softmax: that is the question when applying Active Learning for Transformer Models
Julius Gonsior
C. Falkenberg
Silvio Magino
Anja Reusch
Maik Thiele
Wolfgang Lehner
UQCV
36
7
0
06 Oct 2022
Calibration of Pre-trained Transformers
Shrey Desai
Greg Durrett
UQLM
243
290
0
17 Mar 2020
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
298
1,610
0
18 Sep 2019
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
Balaji Lakshminarayanan
Alexander Pritzel
Charles Blundell
UQCV
BDL
276
5,675
0
05 Dec 2016
Teaching Machines to Read and Comprehend
Karl Moritz Hermann
Tomáš Kočiský
Edward Grefenstette
L. Espeholt
W. Kay
Mustafa Suleyman
Phil Blunsom
184
3,513
0
10 Jun 2015
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning
Y. Gal
Zoubin Ghahramani
UQCV
BDL
285
9,145
0
06 Jun 2015