2104.08315
Surface Form Competition: Why the Highest Probability Answer Isn't Always Right
16 April 2021
Ari Holtzman
Peter West
Vered Shwartz
Yejin Choi
Luke Zettlemoyer
LRM
Papers citing
"Surface Form Competition: Why the Highest Probability Answer Isn't Always Right"
50 / 76 papers shown
Title
GraphPrompter: Multi-stage Adaptive Prompt Optimization for Graph In-Context Learning
Rui Lv
Zhenru Zhang
Kai Zhang
Qi Liu
Weibo Gao
Jing Liu
Jiaxia Yan
Linan Yue
Fangzhou Yao
211
0
0
04 May 2025
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review
Toghrul Abbasli
Kentaroh Toyoda
Yuan Wang
Leon Witt
Muhammad Asif Ali
Yukai Miao
Dan Li
Qingsong Wei
UQCV
94
0
0
25 Apr 2025
SuperBPE: Space Travel for Language Models
Alisa Liu
J. Hayase
Valentin Hofmann
Sewoong Oh
Noah A. Smith
Yejin Choi
53
3
0
17 Mar 2025
Reversal Blessing: Thinking Backward May Outpace Thinking Forward in Multi-choice Questions
Yizhe Zhang
Richard He Bai
Zijin Gu
Ruixiang Zhang
Jiatao Gu
Emmanuel Abbe
Samy Bengio
Navdeep Jaitly
LRM
BDL
72
1
0
25 Feb 2025
On The Truthfulness of 'Surprisingly Likely' Responses of Large Language Models
Naman Goel
HILM
57
0
0
28 Jan 2025
Option-ID Based Elimination For Multiple Choice Questions
Zhenhao Zhu
Bulou Liu
Qingyao Ai
Yong-Jin Liu
54
0
0
25 Jan 2025
Out-of-distribution generalization via composition: a lens through induction heads in Transformers
Jiajun Song
Zhuoyan Xu
Yiqiao Zhong
88
4
0
31 Dec 2024
Task Calibration: Calibrating Large Language Models on Inference Tasks
Yingjie Li
Yun Luo
Xiaotian Xie
Yue Zhang
LRM
21
0
0
24 Oct 2024
Scaling up Masked Diffusion Models on Text
Shen Nie
Fengqi Zhu
Chao Du
Tianyu Pang
Qian Liu
Guangtao Zeng
Min Lin
Chongxuan Li
AI4CE
63
14
0
24 Oct 2024
In-Context Learning Enables Robot Action Prediction in LLMs
Yida Yin
Zekai Wang
Yuvan Sharma
Dantong Niu
Trevor Darrell
Roei Herzig
LM&Ro
120
2
0
16 Oct 2024
Token-based Decision Criteria Are Suboptimal in In-context Learning
Hakaze Cho
Yoshihiro Sakai
Mariko Kato
Kenshiro Tanaka
Akira Ishii
Naoya Inoue
46
3
0
24 Jun 2024
People will agree what I think: Investigating LLM's False Consensus Effect
Junhyuk Choi
Yeseon Hong
Bugeun Kim
54
0
0
16 Jun 2024
Enhancing Domain Adaptation through Prompt Gradient Alignment
Hoang Phan
Lam C. Tran
Quyen Tran
Trung Le
52
0
0
13 Jun 2024
OLMES: A Standard for Language Model Evaluations
Yuling Gu
Oyvind Tafjord
Bailey Kuehl
Dany Haddad
Jesse Dodge
Hannaneh Hajishirzi
ELM
45
14
0
12 Jun 2024
Language Model Cascades: Token-level uncertainty and beyond
Neha Gupta
Harikrishna Narasimhan
Wittawat Jitkrittum
A. S. Rawat
A. Menon
Sanjiv Kumar
UQLM
53
42
0
15 Apr 2024
Behavior Trees Enable Structured Programming of Language Model Agents
Richard Kelley
AI4CE
LM&Ro
LLMAG
40
0
0
11 Apr 2024
Rectifying Demonstration Shortcut in In-Context Learning
Joonwon Jang
Sanghwan Jang
Wonbin Kweon
Minjin Jeon
Hwanjo Yu
40
1
0
14 Mar 2024
Prompt-Based Bias Calibration for Better Zero/Few-Shot Learning of Language Models
Kang He
Yinghan Long
Kaushik Roy
28
2
0
15 Feb 2024
Leveraging Biases in Large Language Models: "bias-kNN" for Effective Few-Shot Learning
Yong Zhang
Hanzhang Li
Zhitao Li
Ning Cheng
Ming Li
Jing Xiao
Jianzong Wang
28
3
0
18 Jan 2024
Evaluating and Mitigating Discrimination in Language Model Decisions
Alex Tamkin
Amanda Askell
Liane Lovitt
Esin Durmus
Nicholas Joseph
Shauna Kravec
Karina Nguyen
Jared Kaplan
Deep Ganguli
38
68
0
06 Dec 2023
Bring Your Own KG: Self-Supervised Program Synthesis for Zero-Shot KGQA
Dhruv Agarwal
Rajarshi Das
Sopan Khosla
Rashmi Gangadharaiah
OffRL
23
7
0
14 Nov 2023
Towards Concept-Aware Large Language Models
Chen Shani
Jilles Vreeken
Dafna Shahaf
LRM
30
6
0
03 Nov 2023
Information Value: Measuring Utterance Predictability as Distance from Plausible Alternatives
Mario Giulianelli
Sarenne Wallbridge
Raquel Fernández
34
13
0
20 Oct 2023
Sweeping Heterogeneity with Smart MoPs: Mixture of Prompts for LLM Task Adaptation
Chen Dun
Mirian Hipolito Garcia
Guoqing Zheng
Ahmed Hassan Awadallah
Anastasios Kyrillidis
Robert Sim
90
6
0
04 Oct 2023
Zero-Shot Robustification of Zero-Shot Models
Dyah Adila
Changho Shin
Lin Cai
Frederic Sala
48
19
0
08 Sep 2023
SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore
Sewon Min
Suchin Gururangan
Eric Wallace
Hannaneh Hajishirzi
Noah A. Smith
Luke Zettlemoyer
AILaw
28
63
0
08 Aug 2023
A Simple and Effective Framework for Strict Zero-Shot Hierarchical Classification
R. Bhambhoria
Lei Chen
Xiao-Dan Zhu
24
3
0
24 May 2023
Editing Common Sense in Transformers
Anshita Gupta
Debanjan Mondal
Akshay Krishna Sheshadri
Wenlong Zhao
Xiang Lorraine Li
Sarah Wiegreffe
Niket Tandon
KELM
47
22
0
24 May 2023
Flatness-Aware Prompt Selection Improves Accuracy and Sample Efficiency
Lingfeng Shen
Weiting Tan
Boyuan Zheng
Daniel Khashabi
VLM
45
6
0
18 May 2023
A Better Way to Do Masked Language Model Scoring
Carina Kauf
Anna A. Ivanova
50
22
0
17 May 2023
Rethinking Visual Prompt Learning as Masked Visual Token Modeling
Ning Liao
Bowen Shi
Xiaopeng Zhang
Min Cao
Junchi Yan
Qi Tian
VLM
34
7
0
09 Mar 2023
In-context Example Selection with Influences
Nguyen Tai
Eric Wong
16
48
0
21 Feb 2023
Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits
Ruibo Liu
Chenyan Jia
Ge Zhang
Ziyu Zhuang
Tony X. Liu
Soroush Vosoughi
99
35
0
01 Jan 2023
In-context Learning Distillation: Transferring Few-shot Learning Ability of Pre-trained Language Models
Yukun Huang
Yanda Chen
Zhou Yu
Kathleen McKeown
27
30
0
20 Dec 2022
Neural Story Planning
Anbang Ye
Christopher Cui
Taiwei Shi
Mark O. Riedl
21
8
0
16 Dec 2022
Demystifying Prompts in Language Models via Perplexity Estimation
Hila Gonen
Srini Iyer
Terra Blevins
Noah A. Smith
Luke Zettlemoyer
LRM
46
196
0
08 Dec 2022
Event knowledge in large language models: the gap between the impossible and the unlikely
Carina Kauf
Anna A. Ivanova
Giulia Rambelli
Emmanuele Chersoni
Jingyuan Selena She
Zawad Chowdhury
Evelina Fedorenko
Alessandro Lenci
37
67
0
02 Dec 2022
Nonparametric Masked Language Modeling
Sewon Min
Weijia Shi
M. Lewis
Xilun Chen
Wen-tau Yih
Hannaneh Hajishirzi
Luke Zettlemoyer
RALM
50
48
0
02 Dec 2022
Active Example Selection for In-Context Learning
Yiming Zhang
Shi Feng
Chenhao Tan
SILM
LRM
32
187
0
08 Nov 2022
Mutual Information Alleviates Hallucinations in Abstractive Summarization
Liam van der Poel
Ryan Cotterell
Clara Meister
HILM
18
58
0
24 Oct 2022
The Better Your Syntax, the Better Your Semantics? Probing Pretrained Language Models for the English Comparative Correlative
Leonie Weissweiler
Valentin Hofmann
Abdullatif Köksal
Hinrich Schütze
37
33
0
24 Oct 2022
ProGen: Progressive Zero-shot Dataset Generation via In-context Feedback
Jiacheng Ye
Jiahui Gao
Jiangtao Feng
Zhiyong Wu
Tao Yu
Lingpeng Kong
SyDa
VLM
81
72
0
22 Oct 2022
SLING: Sino Linguistic Evaluation of Large Language Models
Yixiao Song
Kalpesh Krishna
R. Bhatt
Mohit Iyyer
24
8
0
21 Oct 2022
Automatic Chain of Thought Prompting in Large Language Models
ZhuoSheng Zhang
Aston Zhang
Mu Li
Alexander J. Smola
ReLM
LRM
67
584
0
07 Oct 2022
Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
Seonghyeon Ye
Doyoung Kim
Joel Jang
Joongbo Shin
Minjoon Seo
FedML
VLM
UQCV
LRM
24
25
0
06 Oct 2022
Ask Me Anything: A simple strategy for prompting language models
Simran Arora
A. Narayan
Mayee F. Chen
Laurel J. Orr
Neel Guha
Kush S. Bhatia
Ines Chami
Frederic Sala
Christopher Ré
ReLM
LRM
235
208
0
05 Oct 2022
Language models show human-like content effects on reasoning tasks
Ishita Dasgupta
Andrew Kyle Lampinen
Stephanie C. Y. Chan
Hannah R. Sheahan
Antonia Creswell
D. Kumaran
James L. McClelland
Felix Hill
ReLM
LRM
35
181
0
14 Jul 2022
Emergent Abilities of Large Language Models
Jason W. Wei
Yi Tay
Rishi Bommasani
Colin Raffel
Barret Zoph
...
Tatsunori Hashimoto
Oriol Vinyals
Percy Liang
J. Dean
W. Fedus
ELM
ReLM
LRM
90
2,364
0
15 Jun 2022
kNN-Prompt: Nearest Neighbor Zero-Shot Inference
Weijia Shi
Julian Michael
Suchin Gururangan
Luke Zettlemoyer
RALM
VLM
29
32
0
27 May 2022
Non-Programmers Can Label Programs Indirectly via Active Examples: A Case Study with Text-to-SQL
Ruiqi Zhong
Charles Burton Snell
Dan Klein
Jason Eisner
24
8
0
25 May 2022