Language Models (Mostly) Know What They Know

11 July 2022
Saurav Kadavath
Tom Conerly
Amanda Askell
T. Henighan
Dawn Drain
Ethan Perez
Nicholas Schiefer
Zac Hatfield-Dodds
Nova DasSarma
Eli Tran-Johnson
Scott R. Johnston
S. E. Showk
Andy Jones
Nelson Elhage
Tristan Hume
Anna Chen
Yuntao Bai
Sam Bowman
Stanislav Fort
Deep Ganguli
Danny Hernandez
Josh Jacobson
John Kernion
Shauna Kravec
Liane Lovitt
Kamal Ndousse
Catherine Olsson
Sam Ringer
Dario Amodei
Tom B. Brown
Jack Clark
Nicholas Joseph
Benjamin Mann
Sam McCandlish
C. Olah
Jared Kaplan
    ELM

Papers citing "Language Models (Mostly) Know What They Know"

50 / 161 papers shown
Title
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
Zhiyuan Zhao
Bin Wang
Linke Ouyang
Xiao-wen Dong
Jiaqi Wang
Conghui He
MLLM
VLM
32
106
0
28 Nov 2023
Calibrated Language Models Must Hallucinate
Adam Tauman Kalai
Santosh Vempala
HILM
30
76
0
24 Nov 2023
Probabilistic Tree-of-thought Reasoning for Answering Knowledge-intensive Complex Questions
S. Cao
Jiajie Zhang
Jiaxin Shi
Xin Lv
Zijun Yao
Qingwen Tian
Juanzi Li
Lei Hou
LRM
29
14
0
23 Nov 2023
Transfer Attacks and Defenses for Large Language Models on Coding Tasks
Chi Zhang
Zifan Wang
Ravi Mangal
Matt Fredrikson
Limin Jia
Corina S. Pasareanu
AAML
SILM
29
1
0
22 Nov 2023
R-Tuning: Instructing Large Language Models to Say 'I Don't Know'
Hanning Zhang
Shizhe Diao
Yong Lin
Yi R. Fung
Qing Lian
Xingyao Wang
Yangyi Chen
Heng Ji
Tong Zhang
UQLM
42
38
0
16 Nov 2023
Ever: Mitigating Hallucination in Large Language Models through Real-Time Verification and Rectification
Haoqiang Kang
Juntong Ni
Huaxiu Yao
HILM
LRM
32
34
0
15 Nov 2023
ADaPT: As-Needed Decomposition and Planning with Language Models
Archiki Prasad
Alexander Koller
Mareike Hartmann
Peter Clark
Ashish Sabharwal
Mohit Bansal
Tushar Khot
LM&Ro
31
76
0
08 Nov 2023
Knowing What LLMs DO NOT Know: A Simple Yet Effective Self-Detection Method
Yukun Zhao
Lingyong Yan
Weiwei Sun
Guoliang Xing
Chong Meng
Shuaiqiang Wang
Zhicong Cheng
Zhaochun Ren
Dawei Yin
31
37
0
27 Oct 2023
Factored Verification: Detecting and Reducing Hallucination in Summaries of Academic Papers
Charlie George
Andreas Stuhlmuller
HILM
25
5
0
16 Oct 2023
KGQuiz: Evaluating the Generalization of Encoded Knowledge in Large Language Models
Yuyang Bai
Shangbin Feng
Vidhisha Balachandran
Zhaoxuan Tan
Shiqi Lou
Tianxing He
Yulia Tsvetkov
ELM
40
2
0
15 Oct 2023
SALMON: Self-Alignment with Instructable Reward Models
Zhiqing Sun
Songlin Yang
Hongxin Zhang
Qinhong Zhou
Zhenfang Chen
David D. Cox
Yiming Yang
Chuang Gan
ALM
SyDa
41
35
0
09 Oct 2023
Ragas: Automated Evaluation of Retrieval Augmented Generation
ES Shahul
Jithin James
Luis Espinosa-Anke
Steven Schockaert
91
177
0
26 Sep 2023
Cognitive Mirage: A Review of Hallucinations in Large Language Models
Hongbin Ye
Tong Liu
Aijia Zhang
Wei Hua
Weiqiang Jia
HILM
48
77
0
13 Sep 2023
FLM-101B: An Open LLM and How to Train It with $100K Budget
Xiang Li
Yiqun Yao
Xin Jiang
Xuezhi Fang
Xuying Meng
...
Li Du
Bowen Qin
Zheng-Wei Zhang
Aixin Sun
Yequan Wang
60
21
0
07 Sep 2023
Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models
Yue Zhang
Yafu Li
Leyang Cui
Deng Cai
Lemao Liu
...
Longyue Wang
A. Luu
Wei Bi
Freda Shi
Shuming Shi
RALM
LRM
HILM
48
522
0
03 Sep 2023
Large Language Models Sensitivity to The Order of Options in Multiple-Choice Questions
Pouya Pezeshkpour
Estevam R. Hruschka
LRM
20
131
0
22 Aug 2023
Evaluating the Ripple Effects of Knowledge Editing in Language Models
Roi Cohen
Eden Biran
Ori Yoran
Amir Globerson
Mor Geva
KELM
42
157
0
24 Jul 2023
In-Context Learning Learns Label Relationships but Is Not Conventional Learning
Jannik Kossen
Y. Gal
Tom Rainforth
40
28
0
23 Jul 2023
An Empirical Study of Pre-trained Model Selection for Out-of-Distribution Generalization and Calibration
Hiroki Naganuma
Ryuichiro Hataya
Kotaro Yoshida
Ioannis Mitliagkas
OODD
95
1
0
17 Jul 2023
Comparing Traditional and LLM-based Search for Consumer Choice: A Randomized Experiment
S. Spatharioti
David M. Rothschild
D. Goldstein
Jake M. Hofman
33
46
0
07 Jul 2023
Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners
Allen Z. Ren
Anushri Dixit
Alexandra Bodrova
Sumeet Singh
Stephen Tu
...
Jacob Varley
Zhenjia Xu
Dorsa Sadigh
Andy Zeng
Anirudha Majumdar
LM&Ro
64
220
0
04 Jul 2023
AI Transparency in the Age of LLMs: A Human-Centered Research Roadmap
Q. V. Liao
J. Vaughan
44
159
0
02 Jun 2023
Do Large Language Models Know What They Don't Know?
Zhangyue Yin
Qiushi Sun
Qipeng Guo
Jiawen Wu
Xipeng Qiu
Xuanjing Huang
ELM
AI4MH
41
150
0
29 May 2023
Taming AI Bots: Controllability of Neural States in Large Language Models
Stefano Soatto
Paulo Tabuada
Pratik Chaudhari
Tianwei Liu
LLMAG
LM&Ro
18
13
0
29 May 2023
Reward Collapse in Aligning Large Language Models
Ziang Song
Tianle Cai
Jason D. Lee
Weijie J. Su
ALM
33
22
0
28 May 2023
Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback
Katherine Tian
E. Mitchell
Allan Zhou
Archit Sharma
Rafael Rafailov
Huaxiu Yao
Chelsea Finn
Christopher D. Manning
63
289
0
24 May 2023
Mitigating Temporal Misalignment by Discarding Outdated Facts
Michael J.Q. Zhang
Eunsol Choi
KELM
HILM
27
17
0
24 May 2023
Estimating Large Language Model Capabilities without Labeled Test Data
Harvey Yiyun Fu
Qinyuan Ye
Albert Xu
Xiang Ren
Robin Jia
21
8
0
24 May 2023
Improving Factuality and Reasoning in Language Models through Multiagent Debate
Yilun Du
Shuang Li
Antonio Torralba
J. Tenenbaum
Igor Mordatch
LLMAG
LRM
79
614
0
23 May 2023
FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation
Sewon Min
Kalpesh Krishna
Xinxi Lyu
M. Lewis
Wen-tau Yih
Pang Wei Koh
Mohit Iyyer
Luke Zettlemoyer
Hannaneh Hajishirzi
HILM
ALM
86
607
0
23 May 2023
Language Models with Rationality
Nora Kassner
Oyvind Tafjord
Ashish Sabharwal
Kyle Richardson
Hinrich Schütze
Peter Clark
ReLM
KELM
LRM
20
15
0
23 May 2023
Knowledge of Knowledge: Exploring Known-Unknowns Uncertainty with Large Language Models
Alfonso Amayuelas
Kyle Wong
Liangming Pan
Wenhu Chen
Wei Wang
42
26
0
23 May 2023
LM vs LM: Detecting Factual Errors via Cross Examination
Roi Cohen
May Hamri
Mor Geva
Amir Globerson
HILM
41
120
0
22 May 2023
Can ChatGPT Defend its Belief in Truth? Evaluating LLM Reasoning via Debate
Boshi Wang
Xiang Yue
Huan Sun
ELM
LRM
46
60
0
22 May 2023
Active Retrieval Augmented Generation
Zhengbao Jiang
Frank F. Xu
Luyu Gao
Zhiqing Sun
Qian Liu
Jane Dwivedi-Yu
Yiming Yang
Jamie Callan
Graham Neubig
RALM
25
255
0
11 May 2023
Taking Advice from ChatGPT
Peter Zhang
40
5
0
11 May 2023
MoT: Memory-of-Thought Enables ChatGPT to Self-Improve
Xiaonan Li
Xipeng Qiu
ReLM
KELM
LRM
AI4MH
29
32
0
09 May 2023
The Current State of Summarization
Fabian Retkowski
23
6
0
08 May 2023
ReCEval: Evaluating Reasoning Chains via Correctness and Informativeness
Archiki Prasad
Swarnadeep Saha
Xiang Zhou
Joey Tianyi Zhou
LRM
32
45
0
21 Apr 2023
CHiLS: Zero-Shot Image Classification with Hierarchical Label Sets
Zachary Novack
Julian McAuley
Zachary Chase Lipton
Saurabh Garg
VLM
35
79
0
06 Feb 2023
Discovering Language Model Behaviors with Model-Written Evaluations
Ethan Perez
Sam Ringer
Kamilė Lukošiūtė
Karina Nguyen
Edwin Chen
...
Danny Hernandez
Deep Ganguli
Evan Hubinger
Nicholas Schiefer
Jared Kaplan
ALM
22
367
0
19 Dec 2022
Constitutional AI: Harmlessness from AI Feedback
Yuntao Bai
Saurav Kadavath
Sandipan Kundu
Amanda Askell
John Kernion
...
Dario Amodei
Nicholas Joseph
Sam McCandlish
Tom B. Brown
Jared Kaplan
SyDa
MoMe
106
1,487
0
15 Dec 2022
Calibrated Interpretation: Confidence Estimation in Semantic Parsing
Elias Stengel-Eskin
Benjamin Van Durme
UQLM
41
24
0
14 Nov 2022
Measuring Progress on Scalable Oversight for Large Language Models
Sam Bowman
Jeeyoon Hyun
Ethan Perez
Edwin Chen
Craig Pettit
...
Tristan Hume
Yuntao Bai
Zac Hatfield-Dodds
Benjamin Mann
Jared Kaplan
ALM
ELM
28
122
0
04 Nov 2022
Broken Neural Scaling Laws
Ethan Caballero
Kshitij Gupta
Irina Rish
David M. Krueger
30
74
0
26 Oct 2022
Large Language Models Can Self-Improve
Jiaxin Huang
S. Gu
Le Hou
Yuexin Wu
Xuezhi Wang
Hongkun Yu
Jiawei Han
ReLM
AI4MH
LRM
47
566
0
20 Oct 2022
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them
Mirac Suzgun
Nathan Scales
Nathanael Scharli
Sebastian Gehrmann
Yi Tay
...
Aakanksha Chowdhery
Quoc V. Le
Ed H. Chi
Denny Zhou
Jason W. Wei
ALM
ELM
LRM
ReLM
119
1,011
0
17 Oct 2022
The Open-World Lottery Ticket Hypothesis for OOD Intent Classification
Yunhua Zhou
Pengyu Wang
Peiju Liu
Yuxin Wang
Xipeng Qiu
25
2
0
13 Oct 2022
Toward Trustworthy Neural Program Synthesis
Darren Key
Wen-Ding Li
Kevin Ellis
NAI
83
5
0
29 Sep 2022
Faithful Reasoning Using Large Language Models
Antonia Creswell
Murray Shanahan
ReLM
LRM
24
122
0
30 Aug 2022