On the Calibration of Large Language Models and Alignment

22 November 2023

Benfeng Xu

Papers citing "On the Calibration of Large Language Models and Alignment"

Title
What do Language Model Probabilities Represent? From Distribution Estimation to Response Prediction Eitan Wagner Omri Abend 39 0 0 04 May 2025
Bi-directional Model Cascading with Proxy Confidence David Warren Mark Dras 44 0 0 27 Apr 2025
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review Toghrul Abbasli Kentaroh Toyoda Yuan Wang Leon Witt Muhammad Asif Ali Yukai Miao Dan Li Qingsong Wei UQCV 92 0 0 25 Apr 2025
Mind the Confidence Gap: Overconfidence, Calibration, and Distractor Effects in Large Language Models Prateek Chhikara 41 1 0 16 Feb 2025
Understanding the Capabilities and Limitations of Weak-to-Strong Generalization Wei Yao Wenkai Yang Zhongqi Wang Yankai Lin Yong Liu ELM 105 1 0 03 Feb 2025
On Calibration of LLM-based Guard Models for Reliable Content Moderation Hongfu Liu Hengguan Huang Hao Wang Xiangming Gu Ye Wang 55 2 0 14 Oct 2024
Does Alignment Tuning Really Break LLMs' Internal Confidence? Hongseok Oh Wonseok Hwang 47 0 0 31 Aug 2024
LoRA Dropout as a Sparsity Regularizer for Overfitting Control Yang Lin Xinyu Ma Xu Chu Yujie Jin Zhibang Yang Yasha Wang Hong-yan Mei 49 19 0 15 Apr 2024
On the Challenges and Opportunities in Generative AI Laura Manduchi Kushagra Pandey Robert Bamler Ryan Cotterell Sina Daubener ... F. Wenzel Frank Wood Stephan Mandt Vincent Fortuin 56 17 0 28 Feb 2024
Temperature-scaling surprisal estimates improve fit to human reading times -- but does it do so for the "right reasons"? Tong Liu Iza Škrjanec Vera Demberg 40 5 0 15 Nov 2023
$k$NN Prompting: Beyond-Context Learning with Calibration-Free Nearest Neighbor Inference Benfeng Xu Quan Wang Zhendong Mao Yajuan Lyu Qiaoqiao She Yongdong Zhang 101 52 0 24 Mar 2023
Self-Consistency Improves Chain of Thought Reasoning in Language Models Xuezhi Wang Jason W. Wei Dale Schuurmans Quoc Le Ed H. Chi Sharan Narang Aakanksha Chowdhery Denny Zhou ReLM BDL LRM AI4CE 314 3,248 0 21 Mar 2022
Training language models to follow instructions with human feedback Long Ouyang Jeff Wu Xu Jiang Diogo Almeida Carroll L. Wainwright ... Amanda Askell Peter Welinder Paul Christiano Jan Leike Ryan J. Lowe OSLM ALM 313 11,953 0 04 Mar 2022
The Pile: An 800GB Dataset of Diverse Text for Language Modeling Leo Gao Stella Biderman Sid Black Laurence Golding Travis Hoppe ... Horace He Anish Thite Noa Nabeshima Shawn Presser Connor Leahy AIMat 253 1,989 0 31 Dec 2020
Calibration of Pre-trained Transformers Shrey Desai Greg Durrett UQLM 243 289 0 17 Mar 2020
Scaling Laws for Neural Language Models Jared Kaplan Sam McCandlish T. Henighan Tom B. Brown B. Chess R. Child Scott Gray Alec Radford Jeff Wu Dario Amodei 240 4,469 0 23 Jan 2020