ResearchTrend.AI
Do Large Language Models Know What They Don't Know?
29 May 2023
Zhangyue Yin, Qiushi Sun, Qipeng Guo, Jiawen Wu, Xipeng Qiu, Xuanjing Huang
Topics: ELM, AI4MH

Papers citing "Do Large Language Models Know What They Don't Know?"

50 / 119 papers shown
Uncertainty-Aware Large Language Models for Explainable Disease Diagnosis
Shuang Zhou, Jiashuo Wang, Zidu Xu, Song Wang, David Brauer, ..., Zaifu Zhan, Yu Hou, Mingquan Lin, Genevieve B. Melton, Rui Zhang
06 May 2025
AI Awareness
X. Li, Haoyuan Shi, Rongwu Xu, Wei Xu
25 Apr 2025
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review
Toghrul Abbasli, Kentaroh Toyoda, Yuan Wang, Leon Witt, Muhammad Asif Ali, Yukai Miao, Dan Li, Qingsong Wei
25 Apr 2025 · UQCV
HalluLens: LLM Hallucination Benchmark
Yejin Bang, Ziwei Ji, Alan Schelten, Anthony Hartshorn, Tara Fowler, Cheng Zhang, Nicola Cancedda, Pascale Fung
24 Apr 2025 · HILM
Leveraging LLMs as Meta-Judges: A Multi-Agent Framework for Evaluating LLM Judgments
Y. Li, Jama Hussein Mohamud, Chongren Sun, Di Wu, Benoit Boulet
23 Apr 2025 · LLMAG, ELM
From predictions to confidence intervals: an empirical study of conformal prediction methods for in-context learning
Zhe Huang, Simone Rossi, Rui Yuan, T. Hannagan
22 Apr 2025
Meta-Thinking in LLMs via Multi-Agent Reinforcement Learning: A Survey
Ahsan Bilal, Muhammad Ahmed Mohsin, Muhammad Umer, Muhammad Awais Khan Bangash, Muhammad Ali Jamshed
20 Apr 2025 · LLMAG, LRM, AI4CE
Metacognition and Uncertainty Communication in Humans and Large Language Models
Mark Steyvers, Megan A.K. Peters
18 Apr 2025
Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations
Yiyou Sun, Y. Gai, Lijie Chen, Abhilasha Ravichander, Yejin Choi, D. Song
17 Apr 2025 · HILM
A Desideratum for Conversational Agents: Capabilities, Challenges, and Future Directions
Emre Can Acikgoz, Cheng Qian, Hongru Wang, Vardhan Dongre, X. Chen, Heng Ji, Dilek Hakkani-Tür, Gökhan Tür
07 Apr 2025 · LM&Ro, ELM
FindTheFlaws: Annotated Errors for Detecting Flawed Reasoning and Scalable Oversight Research
Gabriel Recchia, Chatrik Singh Mangat, Issac Li, Gayatri Krishnakumar
29 Mar 2025 · ALM
The MASK Benchmark: Disentangling Honesty From Accuracy in AI Systems
Richard Ren, Arunim Agarwal, Mantas Mazeika, Cristina Menghini, Robert Vacareanu, ..., Matias Geralnik, Adam Khoja, Dean Lee, Summer Yue, Dan Hendrycks
05 Mar 2025 · HILM, ALM
Text2Scenario: Text-Driven Scenario Generation for Autonomous Driving Test
Xuan Cai, Xuesong Bai, Zhiyong Cui, Danmu Xie, Daocheng Fu, Haiyang Yu, Yilong Ren
04 Mar 2025
Semantic Volume: Quantifying and Detecting both External and Internal Uncertainty in LLMs
Xiaomin Li, Zhou Yu, Ziji Zhang, Yingying Zhuang, S. Narayanan Sadagopan, Anurag Beniwal
28 Feb 2025 · HILM
END: Early Noise Dropping for Efficient and Effective Context Denoising
Hongye Jin, Pei Chen, Jingfeng Yang, Z. Wang, Meng-Long Jiang, ..., X. Zhang, Zheng Li, Tianyi Liu, Huasheng Li, Bing Yin
26 Feb 2025
Theoretical Physics Benchmark (TPBench) -- a Dataset and Study of AI Reasoning Capabilities in Theoretical Physics
Daniel J.H. Chung, Zhiqi Gao, Yurii Kvasiuk, Tianyi Li, Moritz Münchmeyer, Maja Rudolph, Frederic Sala, Sai Chaitanya Tadepalli
19 Feb 2025 · AIMat
Language Models Can Predict Their Own Behavior
Dhananjay Ashok, Jonathan May
18 Feb 2025 · ReLM, AI4TS, LRM
Refine Knowledge of Large Language Models via Adaptive Contrastive Learning
Yinghui Li, Haojing Huang, Jiayi Kuang, Yangning Li, Shu Guo, C. Qu, Xiaoyu Tan, Hai-Tao Zheng, Ying Shen, Philip S. Yu
11 Feb 2025 · CLL
TableMaster: A Recipe to Advance Table Understanding with Language Models
Lang Cao, Hanbing Liu
31 Jan 2025 · LMTD, RALM
BLoB: Bayesian Low-Rank Adaptation by Backpropagation for Large Language Models
Yibin Wang, H. Shi, Ligong Han, Dimitris N. Metaxas, Hao Wang
28 Jan 2025 · BDL, UQLM
Decoding Knowledge in Large Language Models: A Framework for Categorization and Comprehension
Yanbo Fang, Ruixiang Tang
03 Jan 2025 · ELM
LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs
LLM-jp, Akiko Aizawa, Eiji Aramaki, Bowen Chen, Fei Cheng, ..., Yuya Yamamoto, Yusuke Yamauchi, Hitomi Yanaka, Rio Yokota, Koichiro Yoshino
31 Dec 2024
Investigating Factuality in Long-Form Text Generation: The Roles of Self-Known and Self-Unknown
Lifu Tu, Rui Meng, Shafiq R. Joty, Yingbo Zhou, Semih Yavuz
24 Nov 2024 · HILM
Exploring Knowledge Boundaries in Large Language Models for Retrieval Judgment
Zhen Zhang, Xinyu Wang, Yong-feng Jiang, Zhuo Chen, Feiteng Mu, Mengting Hu, Pengjun Xie, Fei Huang
09 Nov 2024 · KELM
Right this way: Can VLMs Guide Us to See More to Answer Questions?
Li Liu, Diji Yang, Sijia Zhong, Kalyana Suma Sree Tholeti, Lei Ding, Yi Zhang, Leilani H. Gilpin
01 Nov 2024
Rethinking the Uncertainty: A Critical Review and Analysis in the Era of Large Language Models
Mohammad Beigi, Sijia Wang, Ying Shen, Zihao Lin, Adithya Kulkarni, ..., Ming Jin, Jin-Hee Cho, Dawei Zhou, Chang-Tien Lu, Lifu Huang
26 Oct 2024
From Imitation to Introspection: Probing Self-Consciousness in Language Models
Sirui Chen, Shu Yu, Shengjie Zhao, Chaochao Lu
24 Oct 2024 · MILM, LRM
Decoding on Graphs: Faithful and Sound Reasoning on Knowledge Graphs through Generation of Well-Formed Chains
K. Li, Tianhua Zhang, Xixin Wu, Hongyin Luo, James Glass, H. Meng
24 Oct 2024
Tell me what I need to know: Exploring LLM-based (Personalized) Abstractive Multi-Source Meeting Summarization
Frederic Kirstein, Terry Ruas, Robert Kratel, Bela Gipp
18 Oct 2024
"Let's Argue Both Sides": Argument Generation Can Force Small Models to
  Utilize Previously Inaccessible Reasoning Capabilities
"Let's Argue Both Sides": Argument Generation Can Force Small Models to Utilize Previously Inaccessible Reasoning Capabilities
Kaveh Eskandari Miandoab
Vasanth Sarathy
LRM
ReLM
28
0
0
16 Oct 2024
Probing Language Models on Their Knowledge Source
Zineddine Tighidet, Andrea Mogini, Jiali Mei, Benjamin Piwowarski, Patrick Gallinari
08 Oct 2024 · KELM
Integrative Decoding: Improve Factuality via Implicit Self-consistency
Yi Cheng, Xiao Liang, Yeyun Gong, Wen Xiao, Song Wang, ..., Wenjie Li, Jian Jiao, Qi Chen, Peng Cheng, Wayne Xiong
02 Oct 2024 · HILM
FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows"
Yifei Ming, Senthil Purushwalkam, Shrey Pandit, Zixuan Ke, Xuan-Phi Nguyen, Caiming Xiong, Shafiq R. Joty
30 Sep 2024 · HILM
A Survey on the Honesty of Large Language Models
Siheng Li, Cheng Yang, Taiqiang Wu, Chufan Shi, Yuji Zhang, ..., Jie Zhou, Yujiu Yang, Ngai Wong, Xixin Wu, Wai Lam
27 Sep 2024 · HILM
Are Large Language Models More Honest in Their Probabilistic or Verbalized Confidence?
Shiyu Ni, Keping Bi, Lulu Yu, Jiafeng Guo
19 Aug 2024 · HILM
Defining and Evaluating Decision and Composite Risk in Language Models Applied to Natural Language Inference
Ke Shen, M. Kejriwal
04 Aug 2024
Internal Consistency and Self-Feedback in Large Language Models: A Survey
Xun Liang, Shichao Song, Zifan Zheng, Hanyu Wang, Qingchen Yu, ..., Rong-Hua Li, Peng Cheng, Zhonghao Wang, Feiyu Xiong, Zhiyu Li
19 Jul 2024 · HILM, LRM
Evaluating Human-AI Collaboration: A Review and Methodological Framework
George Fragiadakis, Christos Diou, George Kousiouris, Mara Nikolaidou
09 Jul 2024
Factual Confidence of LLMs: on Reliability and Robustness of Current Estimators
Matéo Mahaut, Laura Aina, Paula Czarnowska, Momchil Hardalov, Thomas Müller, Lluís Marquez
19 Jun 2024 · HILM
BeHonest: Benchmarking Honesty in Large Language Models
Steffi Chern, Zhulin Hu, Yuqing Yang, Ethan Chern, Yuan Guo, Jiahe Jin, Binjie Wang, Pengfei Liu
19 Jun 2024 · HILM, ALM
Unified Active Retrieval for Retrieval Augmented Generation
Qinyuan Cheng, Xiaonan Li, Shimin Li, Qin Zhu, Zhangyue Yin, Yunfan Shao, Linyang Li, Tianxiang Sun, Hang Yan, Xipeng Qiu
18 Jun 2024
Interactive Evolution: A Neural-Symbolic Self-Training Framework For Large Language Models
Fangzhi Xu, Qiushi Sun, Kanzhi Cheng, J. Liu, Yu Qiao, Zhiyong Wu
17 Jun 2024 · LLMAG
Teaching Large Language Models to Express Knowledge Boundary from Their Own Signals
Lida Chen, Zujie Liang, Xintao Wang, Jiaqing Liang, Yanghua Xiao, Feng Wei, Jinglei Chen, Zhenghong Hao, Bing Han, Wei Wang
16 Jun 2024
Large Language Models Must Be Taught to Know What They Don't Know
Sanyam Kapoor, Nate Gruver, Manley Roberts, Katherine Collins, Arka Pal, Umang Bhatt, Adrian Weller, Samuel Dooley, Micah Goldblum, Andrew Gordon Wilson
12 Jun 2024
HalluDial: A Large-Scale Benchmark for Automatic Dialogue-Level Hallucination Evaluation
Wen Luo, Tianshu Shen, Wei Li, Guangyue Peng, Richeng Xuan, Houfeng Wang, Xi Yang
11 Jun 2024 · HILM
Cycles of Thought: Measuring LLM Confidence through Stable Explanations
Evan Becker, Stefano Soatto
05 Jun 2024
ANAH: Analytical Annotation of Hallucinations in Large Language Models
Ziwei Ji, Yuzhe Gu, Wenwei Zhang, Chengqi Lyu, Dahua Lin, Kai-xiang Chen
30 May 2024 · HILM
Detecting Hallucinations in Large Language Model Generation: A Token Probability Approach
Ernesto Quevedo, Jorge Yero, Rachel Koerner, Pablo Rivas, Tomas Cerny
30 May 2024 · HILM
Evaluating the External and Parametric Knowledge Fusion of Large Language Models
Hao Zhang, Yuyang Zhang, Xiaoguang Li, Wenxuan Shi, Haonan Xu, ..., Yasheng Wang, Lifeng Shang, Qun Liu, Yong-jin Liu, Ruiming Tang
29 May 2024 · KELM
CtrlA: Adaptive Retrieval-Augmented Generation via Probe-Guided Control
Huanshuo Liu, Hao Zhang, Zhijiang Guo, Kuicai Dong, Xiangyang Li, Yi Quan Lee, Cong Zhang, Yong-jin Liu
29 May 2024 · 3DV