Look Before You Leap: An Exploratory Study of Uncertainty Measurement for Large Language Models

16 July 2023

Papers citing "Look Before You Leap: An Exploratory Study of Uncertainty Measurement for Large Language Models"

Learning on LLM Output Signatures for gray-box Behavior Analysis Guy Bar-Shalom Fabrizio Frasca Derek Lim Yoav Gelberg Yftah Ziser Ran El-Yaniv Gal Chechik Haggai Maron 93 0 0 18 Mar 2025
Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation Yiming Wang Pei Zhang Baosong Yang Derek F. Wong Rui Wang LRM 74 9 0 17 Oct 2024
Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs Ruijia Niu D. Wu Rose Yu Yi-An Ma 50 1 0 09 Oct 2024
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations Hadas Orgad Michael Toker Zorik Gekhman Roi Reichart Idan Szpektor Hadas Kotek Yonatan Belinkov HILM AIFin 85 37 0 03 Oct 2024
Context-aware LLM-based Safe Control Against Latent Risks Quang Khanh Luu Xiyu Deng Anh Van Ho Yorie Nakahira 87 4 0 18 Mar 2024
Uncertainty quantification in fine-tuned LLMs using LoRA ensembles Oleksandr Balabanov Hampus Linander UQCV 80 16 0 19 Feb 2024
Frontier AI Regulation: Managing Emerging Risks to Public Safety Markus Anderljung Joslyn Barnhart Anton Korinek Jade Leung Cullen O'Keefe ... Jonas Schuett Yonadav Shavit Divya Siddarth Robert F. Trager Kevin J. Wolf SILM 66 121 0 06 Jul 2023
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation Xiaowei Huang Wenjie Ruan Wei Huang Gao Jin Yizhen Dong ... Sihao Wu Peipei Xu Dengyu Wu André Freitas Mustafa A. Mustafa ALM 82 87 0 19 May 2023
Capabilities of GPT-4 on Medical Challenge Problems Harsha Nori Nicholas King S. McKinney Dean Carignan Eric Horvitz LM&MA ELM AI4MH 77 793 0 20 Mar 2023
How Robust is GPT-3.5 to Predecessors? A Comprehensive Study on Language Understanding Tasks Xuanting Chen Junjie Ye Can Zu Nuo Xu Rui Zheng Minlong Peng Jie Zhou Tao Gui Qi Zhang Xuanjing Huang AI4MH ELM 52 82 0 01 Mar 2023
An Exploratory Study of AI System Risk Assessment from the Lens of Data Distribution and Uncertainty Zhijie Wang Yuheng Huang Lei Ma Haruki Yokoyama Susumu Tokumoto Kazuki Munakata 49 4 0 13 Dec 2022
A Holistic Approach to Undesired Content Detection in the Real World Todor Markov Chong Zhang Sandhini Agarwal Tyna Eloundou Teddy Lee Steven Adler Angela Jiang L. Weng 34 228 0 05 Aug 2022
Locally Typical Sampling Clara Meister Tiago Pimentel Gian Wiher Ryan Cotterell 168 88 0 01 Feb 2022
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation Yue Wang Weishi Wang Shafiq Joty Guosheng Lin 273 1,532 0 02 Sep 2021
Program Synthesis with Large Language Models Jacob Austin Augustus Odena Maxwell Nye Maarten Bosma Henryk Michalewski ... Ellen Jiang Carrie J. Cai Michael Terry Quoc V. Le Charles Sutton ELM AIMat ReCod ALM 97 1,893 0 16 Aug 2021
A Survey of Uncertainty in Deep Neural Networks J. Gawlikowski Cedrique Rovile Njieutcheu Tassi Mohsin Ali Jongseo Lee Matthias Humt ... R. Roscher Muhammad Shahzad Wen Yang R. Bamler Xiaoxiang Zhu BDL UQCV OOD 173 1,136 0 07 Jul 2021
Persistent Anti-Muslim Bias in Large Language Models Abubakar Abid Maheen Farooqi James Zou AILaw 78 545 0 14 Jan 2021
CodeBLEU: a Method for Automatic Evaluation of Code Synthesis Shuo Ren Daya Guo Shuai Lu Long Zhou Shujie Liu Duyu Tang Neel Sundaresan M. Zhou Ambrosio Blanco Shuai Ma ELM 86 517 0 22 Sep 2020
GraphCodeBERT: Pre-training Code Representations with Data Flow Daya Guo Shuo Ren Shuai Lu Zhangyin Feng Duyu Tang ... Dawn Drain Neel Sundaresan Jian Yin Daxin Jiang M. Zhou 128 1,111 0 17 Sep 2020
Pitfalls of In-Domain Uncertainty Estimation and Ensembling in Deep Learning Arsenii Ashukha Alexander Lyzhov Dmitry Molchanov Dmitry Vetrov UQCV FedML 64 315 0 15 Feb 2020
Towards Making the Most of BERT in Neural Machine Translation Jiacheng Yang Mingxuan Wang Hao Zhou Chengqi Zhao Yong Yu Weinan Zhang Lei Li CLL 46 158 0 15 Aug 2019
Optimization under Uncertainty in the Era of Big Data and Deep Learning: When Machine Learning Meets Mathematical Programming C. Ning Fengqi You OOD 25 279 0 03 Apr 2019
Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift Stephan Rabanser Stephan Günnemann Zachary Chase Lipton 52 363 0 29 Oct 2018
Classification Uncertainty of Deep Neural Networks Based on Gradient Information Philipp Oberdiek Matthias Rottmann Hanno Gottschalk UQCV 54 64 0 22 May 2018
On the Safety of Machine Learning: Cyber-Physical Systems, Decision Sciences, and Data Products Kush R. Varshney H. Alemzadeh 54 223 0 05 Oct 2016