2304.14399
Cited By
We're Afraid Language Models Aren't Modeling Ambiguity
27 April 2023
Alisa Liu
Zhaofeng Wu
Julian Michael
Alane Suhr
Peter West
Alexander Koller
Swabha Swayamdipta
Noah A. Smith
Yejin Choi
Papers citing
"We're Afraid Language Models Aren't Modeling Ambiguity"
22 / 22 papers shown
Title
REI-Bench: Can Embodied Agents Understand Vague Human Instructions in Task Planning?
Chenxi Jiang
Chuhao Zhou
Jianfei Yang
9
0
0
16 May 2025
FalseReject: A Resource for Improving Contextual Safety and Mitigating Over-Refusals in LLMs via Structured Reasoning
Zhehao Zhang
Weijie Xu
Fanyou Wu
Chandan K. Reddy
29
0
0
12 May 2025
LLMs Get Lost In Multi-Turn Conversation
Philippe Laban
Hiroaki Hayashi
Yingbo Zhou
Jennifer Neville
50
1
0
09 May 2025
Enhancing LLM Reliability via Explicit Knowledge Boundary Modeling
Hang Zheng
Hongshen Xu
Yuncong Liu
Lu Chen
Pascale Fung
Kai Yu
109
2
0
04 Mar 2025
A linguistically-motivated evaluation methodology for unraveling model's abilities in reading comprehension tasks
Elie Antoine
Frédéric Béchet
Géraldine Damnati
Philippe Langlais
56
1
0
29 Jan 2025
Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions
Michael J.Q. Zhang
W. Bradley Knox
Eunsol Choi
50
4
0
17 Oct 2024
AMBROSIA: A Benchmark for Parsing Ambiguous Questions into Database Queries
Irina Saparina
Mirella Lapata
57
11
0
27 Jun 2024
Evaluating the External and Parametric Knowledge Fusion of Large Language Models
Hao Zhang
Yuyang Zhang
Xiaoguang Li
Wenxuan Shi
Haonan Xu
...
Yasheng Wang
Lifeng Shang
Qun Liu
Yong-jin Liu
Ruiming Tang
KELM
45
4
0
29 May 2024
Probing Large Language Models for Scalar Adjective Lexical Semantics and Scalar Diversity Pragmatics
Fangru Lin
Daniel Altshuler
J. Pierrehumbert
38
1
0
04 Apr 2024
Evaluating the Factuality of Large Language Models using Large-Scale Knowledge Graphs
Xiaoze Liu
Feijie Wu
Tianyang Xu
Zhuo Chen
Yichi Zhang
Xiaoqian Wang
Jing Gao
HILM
45
8
0
01 Apr 2024
Towards a Psychology of Machines: Large Language Models Predict Human Memory
Markus Huff
Elanur Ulakçi
40
4
0
08 Mar 2024
AmbigNLG: Addressing Task Ambiguity in Instruction for NLG
Ayana Niwa
Hayate Iso
36
4
0
27 Feb 2024
Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMs
Simone Balloccu
Patrícia Schmidtová
Mateusz Lango
Ondrej Dusek
SILM
ELM
PILM
26
158
0
06 Feb 2024
Interpretation modeling: Social grounding of sentences by reasoning over their implicit moral judgments
Liesbeth Allein
Maria Mihaela Trușcă
Marie-Francine Moens
45
1
0
27 Nov 2023
MacGyver: Are Large Language Models Creative Problem Solvers?
Yufei Tian
Abhilasha Ravichander
Lianhui Qin
Ronan Le Bras
Raja Marjieh
Nanyun Peng
Yejin Choi
Thomas Griffiths
Faeze Brahman
AI4CE
LLMAG
15
11
0
16 Nov 2023
How Trustworthy are Open-Source LLMs? An Assessment under Malicious Demonstrations Shows their Vulnerabilities
Lingbo Mo
Boshi Wang
Muhao Chen
Huan Sun
29
27
0
15 Nov 2023
DiPlomat: A Dialogue Dataset for Situated Pragmatic Reasoning
Hengli Li
Songchun Zhu
Zilong Zheng
11
8
0
15 Jun 2023
Large Language Models as Tax Attorneys: A Case Study in Legal Capabilities Emergence
John J. Nay
David Karamardian
Sarah Lawsky
Wenting Tao
Meghana Moorthy Bhat
Raghav Jain
Aaron Travis Lee
Jonathan H. Choi
Jungo Kasai
ELM
AILaw
24
57
0
12 Jun 2023
SCITAB: A Challenging Benchmark for Compositional Reasoning and Claim Verification on Scientific Tables
Xinyuan Lu
Liangming Pan
Qian Liu
Preslav Nakov
Min-Yen Kan
LMTD
44
24
0
22 May 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
366
12,003
0
04 Mar 2022
The Woman Worked as a Babysitter: On Biases in Language Generation
Emily Sheng
Kai-Wei Chang
Premkumar Natarajan
Nanyun Peng
223
618
0
03 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
299
6,984
0
20 Apr 2018