CREPE: Open-Domain Question Answering with False Presuppositions

30 November 2022
Xinyan Velocity Yu
Sewon Min
Luke Zettlemoyer
Hannaneh Hajishirzi

Papers citing "CREPE: Open-Domain Question Answering with False Presuppositions"

32 / 32 papers shown
Cancer-Myth: Evaluating AI Chatbot on Patient Questions with False Presuppositions
Wang Zhu
Tianqi Chen
Ching Ying Lin
Jade Law
Mazen Jizzini
Jorge J. Nieva
Ruishan Liu
Robin Jia
39
0
0
15 Apr 2025
Don't Let It Hallucinate: Premise Verification via Retrieval-Augmented Logical Reasoning
Yuehan Qin
Shawn Li
Yi Nian
Xinyan Velocity Yu
Yue Zhao
Xuezhe Ma
HILM
LRM
40
0
0
08 Apr 2025
Skip-Vision: Efficient and Scalable Acceleration of Vision-Language Models via Adaptive Token Skipping
Weili Zeng
Ziyuan Huang
Kaixiang Ji
Yichao Yan
VLM
47
1
0
26 Mar 2025
How to Protect Yourself from 5G Radiation? Investigating LLM Responses to Implicit Misinformation
Ruohao Guo
Wei-ping Xu
Alan Ritter
44
1
0
12 Mar 2025
What makes a good metric? Evaluating automatic metrics for text-to-image consistency
Candace Ross
Melissa Hall
Adriana Romero Soriano
Adina Williams
95
3
0
18 Dec 2024
DRS: Deep Question Reformulation With Structured Output
Zhecheng Li
Yijiao Wang
Bryan Hooi
Yujun Cai
Nanyun Peng
Kai-Wei Chang
KELM
76
0
0
27 Nov 2024
Do great minds think alike? Investigating Human-AI Complementarity in Question Answering with CAIMIRA
Maharshi Gor
Hal Daumé III
Dinesh Manocha
Jordan Boyd-Graber
ELM
AI4MH
LRM
23
1
0
09 Oct 2024
Adaptive Question Answering: Enhancing Language Model Proficiency for Addressing Knowledge Conflicts with Source Citations
Sagi Shaier
Ari Kobren
Philip Ogren
HILM
31
6
0
05 Oct 2024
I Could've Asked That: Reformulating Unanswerable Questions
Wenting Zhao
Ge Gao
Claire Cardie
Alexander M. Rush
ELM
34
1
0
24 Jul 2024
KG-FPQ: Evaluating Factuality Hallucination in LLMs with Knowledge Graph-based False Premise Questions
Yanxu Zhu
Jinlin Xiao
Yuhang Wang
Jitao Sang
HILM
36
4
0
08 Jul 2024
The Art of Saying No: Contextual Noncompliance in Language Models
Faeze Brahman
Sachin Kumar
Vidhisha Balachandran
Pradeep Dasigi
Valentina Pyatkin
...
Jack Hessel
Yulia Tsvetkov
Noah A. Smith
Yejin Choi
Hannaneh Hajishirzi
75
21
0
02 Jul 2024
Towards Robust Evaluation: A Comprehensive Taxonomy of Datasets and Metrics for Open Domain Question Answering in the Era of Large Language Models
Akchay Srivastava
Atif Memon
ELM
48
1
0
19 Jun 2024
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models
Seungone Kim
Juyoung Suk
Ji Yong Cho
Shayne Longpre
Chaeeun Kim
...
Sean Welleck
Graham Neubig
Moontae Lee
Kyungjae Lee
Minjoon Seo
ELM
ALM
LM&MA
105
31
0
09 Jun 2024
Towards Unbiased Evaluation of Detecting Unanswerable Questions in EHRSQL
Yongjin Yang
Sihyeon Kim
Sangmook Kim
Gyubok Lee
Se-Young Yun
Edward Choi
38
2
0
29 Apr 2024
Interpreting Answers to Yes-No Questions in Dialogues from Multiple Domains
Zijie Wang
Farzana Rashid
Eduardo Blanco
30
0
0
25 Apr 2024
Syn-QA2: Evaluating False Assumptions in Long-tail Questions with Synthetic QA Datasets
Ashwin Daswani
Rohan Sawant
Najoung Kim
19
0
0
18 Mar 2024
Whispers that Shake Foundations: Analyzing and Mitigating False Premise Hallucinations in Large Language Models
Hongbang Yuan
Pengfei Cao
Zhuoran Jin
Yubo Chen
Daojian Zeng
Kang Liu
Jun Zhao
HILM
37
3
0
29 Feb 2024
How the Advent of Ubiquitous Large Language Models both Stymie and Turbocharge Dynamic Adversarial Question Generation
Yoo Yeon Sung
Ishani Mondal
Jordan L. Boyd-Graber
30
0
0
20 Jan 2024
Evaluating Large Language Models for Health-related Queries with Presuppositions
Navreet Kaur
Monojit Choudhury
Danish Pruthi
HILM
ELM
38
2
0
14 Dec 2023
Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge
Genglin Liu
Xingyao Wang
Lifan Yuan
Yangyi Chen
Hao Peng
29
16
0
16 Nov 2023
Pregnant Questions: The Importance of Pragmatic Awareness in Maternal Health Question Answering
Neha Srikanth
Rupak Sarkar
Heran Mane
Elizabeth M. Aparicio
Quynh C. Nguyen
Rachel Rudinger
Jordan Lee Boyd-Graber
11
2
0
16 Nov 2023
PreWoMe: Exploiting Presuppositions as Working Memory for Long Form Question Answering
Wookje Han
Jinsol Park
Kyungjae Lee
36
3
0
24 Oct 2023
FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation
Tu Vu
Mohit Iyyer
Xuezhi Wang
Noah Constant
Jerry W. Wei
...
Chris Tar
Yun-hsuan Sung
Denny Zhou
Quoc Le
Thang Luong
KELM
HILM
LRM
22
186
0
05 Oct 2023
FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets
Seonghyeon Ye
Doyoung Kim
Sungdong Kim
Hyeonbin Hwang
Seungone Kim
Yongrae Jo
James Thorne
Juho Kim
Minjoon Seo
ALM
46
98
0
20 Jul 2023
Won't Get Fooled Again: Answering Questions with False Premises
Shengding Hu
Yi-Xiao Luo
Huadong Wang
Xingyi Cheng
Zhiyuan Liu
Maosong Sun
29
22
0
05 Jul 2023
BoardgameQA: A Dataset for Natural Language Reasoning with Contradictory Information
Mehran Kazemi
Quan Yuan
Deepti Bhatia
Najoung Kim
Xin Xu
Vaiva Imbrasaite
Deepak Ramachandran
LRM
29
38
0
13 Jun 2023
HOP, UNION, GENERATE: Explainable Multi-hop Reasoning without Rationale Supervision
Wenting Zhao
Justin T. Chiu
Claire Cardie
Alexander M. Rush
LRM
19
4
0
23 May 2023
IfQA: A Dataset for Open-domain Question Answering under Counterfactual Presuppositions
W. Yu
Meng Jiang
Peter Clark
Ashish Sabharwal
15
21
0
23 May 2023
Enhancing Large Language Models Against Inductive Instructions with Dual-critique Prompting
Rui Wang
Hongru Wang
Fei Mi
Yi Chen
Boyang Xue
Kam-Fai Wong
Rui-Lan Xu
31
13
0
23 May 2023
Can Language Models Solve Graph Problems in Natural Language?
Heng Wang
Shangbin Feng
Tianxing He
Zhaoxuan Tan
Xiaochuang Han
Yulia Tsvetkov
ReLM
LRM
26
181
0
17 May 2023
(QA)$^2$: Question Answering with Questionable Assumptions
Najoung Kim
Phu Mon Htut
Sam Bowman
Jackson Petty
29
33
0
20 Dec 2022
Which Linguist Invented the Lightbulb? Presupposition Verification for Question-Answering
Najoung Kim
Ellie Pavlick
Burcu Karagol Ayan
Deepak Ramachandran
70
43
0
02 Jan 2021