Calibrated Language Models Must Hallucinate

24 November 2023
Adam Tauman Kalai, Santosh Vempala
HILM
arXiv:2311.14648

Papers citing "Calibrated Language Models Must Hallucinate" (49 papers shown)
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review
Toghrul Abbasli, Kentaroh Toyoda, Yuan Wang, Leon Witt, Muhammad Asif Ali, Yukai Miao, Dan Li, Qingsong Wei
UQCV
25 Apr 2025

Three Types of Calibration with Properties and their Semantic and Formal Relationships
Rabanus Derr, Jessie Finocchiaro, Robert C. Williamson
25 Apr 2025

Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction
Vaishnavh Nagarajan, Chen Henry Wu, Charles Ding, Aditi Raghunathan
21 Apr 2025

PROMPTEVALS: A Dataset of Assertions and Guardrails for Custom Production Large Language Model Pipelines
Reya Vir, Shreya Shankar, Harrison Chase, Will Fu-Hinthorn, Aditya G. Parameswaran
AI4TS
20 Apr 2025

Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations
Yiyou Sun, Y. Gai, Lijie Chen, Abhilasha Ravichander, Yejin Choi, D. Song
HILM
17 Apr 2025

High dimensional online calibration in polynomial time
Binghui Peng
12 Apr 2025

Hallucination, reliability, and the role of generative AI in science
Charles Rathkopf
HILM
11 Apr 2025

Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback
Wei Shen, Guanlin Liu, Zheng Wu, Ruofei Zhu, Qingping Yang, Chao Xin, Yu Yue, Lin Yan
28 Mar 2025

Estimating stationary mass, frequency by frequency
Milind Nakul, Vidya Muthukumar, A. Pananjady
17 Mar 2025

Aligning Vision to Language: Text-Free Multimodal Knowledge Graph Construction for Enhanced LLMs Reasoning
Junming Liu, Siyuan Meng, Yanting Gao, Song Mao, Pinlong Cai, Guohang Yan, Yirong Chen, Zilin Bian, Botian Shi, Ding Wang
17 Mar 2025

Verify when Uncertain: Beyond Self-Consistency in Black Box Hallucination Detection
Yihao Xue, Kristjan Greenewald, Youssef Mroueh, Baharan Mirzasoleiman
HILM
20 Feb 2025

Hallucinations are inevitable but statistically negligible
Atsushi Suzuki, Yulan He, Feng Tian, Zhongyuan Wang
HILM
15 Feb 2025

Hallucination, Monofacts, and Miscalibration: An Empirical Investigation
Muqing Miao, Michael Kearns
11 Feb 2025

Selective Response Strategies for GenAI
Boaz Taitler, Omer Ben-Porat
02 Feb 2025

Dialogue Benchmark Generation from Knowledge Graphs with Cost-Effective Retrieval-Augmented LLMs
Reham Omar, Omij Mangukiya, Essam Mansour
20 Jan 2025

The FACTS Grounding Leaderboard: Benchmarking LLMs' Ability to Ground Responses to Long-Form Input
Alon Jacovi, Andrew Wang, Chris Alberti, Connie Tao, Jon Lipovetz, ..., Rachana Fellinger, Rui Wang, Zizhao Zhang, Sasha Goldshtein, Dipanjan Das
HILM, ALM
06 Jan 2025

Exploring Facets of Language Generation in the Limit
Moses Charikar, Chirag Pabbaraju
LRM
22 Nov 2024

Distinguishing Ignorance from Error in LLM Hallucinations
Adi Simhi, Jonathan Herzig, Idan Szpektor, Yonatan Belinkov
HILM
29 Oct 2024

No Free Lunch: Fundamental Limits of Learning Non-Hallucinating Generative Models
Changlong Wu, A. Grama, Wojciech Szpankowski
24 Oct 2024

DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing
Shreya Shankar, Tristan Chambers, Eugene Wu, Aditya G. Parameswaran
LLMAG
16 Oct 2024

On Classification with Large Language Models in Cultural Analytics
David Bamman, Kent K. Chang, L. Lucy, Naitian Zhou
15 Oct 2024

An X-Ray Is Worth 15 Features: Sparse Autoencoders for Interpretable Radiology Report Generation
Ahmed Abdulaal, Hugo Fry, Nina Montaña-Brown, Ayodeji Ijishakin, Jack Gao, Stephanie L. Hyland, Daniel C. Alexander, Daniel Coelho De Castro
MedIm
04 Oct 2024

CreDes: Causal Reasoning Enhancement and Dual-End Searching for Solving Long-Range Reasoning Problems using LLMs
Kangsheng Wang, Xiao Zhang, Hao Liu, Songde Han, Huimin Ma, Tianyu Hu
LRM
02 Oct 2024

State space models, emergence, and ergodicity: How many parameters are needed for stable predictions?
Ingvar M. Ziemann, Nikolai Matni, George J. Pappas
20 Sep 2024

Policy Filtration in RLHF to Fine-Tune LLM for Code Generation
Wei Shen, Chuheng Zhang
OffRL
11 Sep 2024

DiPT: Enhancing LLM reasoning through diversified perspective-taking
H. Just, Mahavir Dabas, Lifu Huang, Ming Jin, Ruoxi Jia
LRM
10 Sep 2024

ContextCite: Attributing Model Generation to Context
Benjamin Cohen-Wang, Harshay Shah, Kristian Georgiev, Aleksander Madry
LRM
01 Sep 2024

Understanding Generative AI Content with Embedding Models
Max Vargas, Reilly Cannon, A. Engel, Anand D. Sarwate, Tony Chiang
19 Aug 2024

Mission Impossible: A Statistical Perspective on Jailbreaking LLMs
Jingtong Su, Mingyu Lee, SangKeun Lee
02 Aug 2024

Automated Review Generation Method Based on Large Language Models
Shican Wu, Xiao Ma, Dehui Luo, Lulu Li, Xiangcheng Shi, ..., Ran Luo, Chunlei Pei, Zhijian Zhao, Zhi-Jian Zhao, Jinlong Gong
30 Jul 2024

Building Machines that Learn and Think with People
Katherine M. Collins, Ilia Sucholutsky, Umang Bhatt, Kartik Chandra, Lionel Wong, ..., Mark K. Ho, Vikash K. Mansinghka, Adrian Weller, Joshua B. Tenenbaum, Thomas L. Griffiths
22 Jul 2024

Towards a Science Exocortex
Kevin G. Yager
24 Jun 2024

On Subjective Uncertainty Quantification and Calibration in Natural Language Generation
Ziyu Wang, Chris Holmes
UQLM
07 Jun 2024

On the Intrinsic Self-Correction Capability of LLMs: Uncertainty and Latent Concept
Guangliang Liu, Haitao Mao, Bochuan Cao, Zhiyu Xue, K. Johnson, Jiliang Tang, Rongrong Wang
LRM
04 Jun 2024

Is In-Context Learning in Large Language Models Bayesian? A Martingale Perspective
Fabian Falck, Ziyu Wang, Chris Holmes
02 Jun 2024

Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools
Varun Magesh, Faiz Surani, Matthew Dahl, Mirac Suzgun, Christopher D. Manning, Daniel E. Ho
HILM, ELM, AILaw
30 May 2024

Constructing Benchmarks and Interventions for Combating Hallucinations in LLMs
Adi Simhi, Jonathan Herzig, Idan Szpektor, Yonatan Belinkov
HILM
15 Apr 2024

RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs
Shreyas Chaudhari, Pranjal Aggarwal, Vishvak Murahari, Tanmay Rajpurohit, A. Kalyan, Karthik Narasimhan, A. Deshpande, Bruno Castro da Silva
12 Apr 2024

Language Generation in the Limit
Jon M. Kleinberg, S. Mullainathan
LRM
10 Apr 2024

Automating Research Synthesis with Domain-Specific Large Language Model Fine-Tuning
Teo Susnjak, Peter Hwang, N. Reyes, A. Barczak, Timothy R. McIntosh, Surangika Ranathunga
08 Apr 2024

Multicalibration for Confidence Scoring in LLMs
Gianluca Detommaso, Martín Bertrán, Riccardo Fogliato, Aaron Roth
06 Apr 2024

Unfamiliar Finetuning Examples Control How Language Models Hallucinate
Katie Kang, Eric Wallace, Claire Tomlin, Aviral Kumar, Sergey Levine
HILM, LRM
08 Mar 2024

Guardrail Baselines for Unlearning in LLMs
Pratiksha Thaker, Yash Maurya, Shengyuan Hu, Zhiwei Steven Wu, Virginia Smith
MU
05 Mar 2024

On the Challenges and Opportunities in Generative AI
Laura Manduchi, Kushagra Pandey, Robert Bamler, Ryan Cotterell, Sina Daubener, ..., F. Wenzel, Frank Wood, Stephan Mandt, Vincent Fortuin
28 Feb 2024

On Limitations of the Transformer Architecture
Binghui Peng, Srini Narayanan, Christos H. Papadimitriou
13 Feb 2024

Large Legal Fictions: Profiling Legal Hallucinations in Large Language Models
Matthew Dahl, Varun Magesh, Mirac Suzgun, Daniel E. Ho
HILM, AILaw
02 Jan 2024

How Language Model Hallucinations Can Snowball
Muru Zhang, Ofir Press, William Merrill, Alisa Liu, Noah A. Smith
HILM, LRM
22 May 2023

The Internal State of an LLM Knows When It's Lying
A. Azaria, Tom Michael Mitchell
HILM
26 Apr 2023

Truthful AI: Developing and governing AI that does not lie
Owain Evans, Owen Cotton-Barratt, Lukas Finnveden, Adam Bales, Avital Balwit, Peter Wills, Luca Righetti, William Saunders
HILM
13 Oct 2021