Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2311.09694
Cited By
v1
v2 (latest)
Whispers of Doubt Amidst Echoes of Triumph in NLP Robustness
16 November 2023
Ashim Gupta
Rishanth Rajendhran
Nathan Stringham
Vivek Srikumar
Ana Marasović
AAML
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Whispers of Doubt Amidst Echoes of Triumph in NLP Robustness"
36 / 36 papers shown
Title
Evaluating Concurrent Robustness of Language Models Across Diverse Challenge Sets
Vatsal Gupta
Pranshu Pandya
Tushar Kataria
Vivek Gupta
Dan Roth
AAML
112
1
0
03 Jan 2025
Revisiting Out-of-distribution Robustness in NLP: Benchmark, Analysis, and LLMs Evaluations
Lifan Yuan
Yangyi Chen
Ganqu Cui
Hongcheng Gao
Fangyuan Zou
Xingyi Cheng
Heng Ji
Zhiyuan Liu
Maosong Sun
98
80
0
07 Jun 2023
WiCE: Real-World Entailment for Claims in Wikipedia
Ryo Kamoi
Tanya Goyal
Juan Diego Rodriguez
Greg Durrett
59
89
0
02 Mar 2023
GLUE-X: Evaluating Natural Language Understanding Models from an Out-of-distribution Generalization Perspective
Linyi Yang
Shuibai Zhang
Libo Qin
Yafu Li
Yidong Wang
Hanmeng Liu
Jindong Wang
Xingxu Xie
Yue Zhang
ELM
102
81
0
15 Nov 2022
CONDAQA: A Contrastive Reading Comprehension Dataset for Reasoning about Negation
Abhilasha Ravichander
Matt Gardner
Ana Marasović
103
35
0
01 Nov 2022
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them
Mirac Suzgun
Nathan Scales
Nathanael Scharli
Sebastian Gehrmann
Yi Tay
...
Aakanksha Chowdhery
Quoc V. Le
Ed H. Chi
Denny Zhou
Jason W. Wei
ALM
ELM
LRM
ReLM
261
1,110
0
17 Oct 2022
State-of-the-art generalisation research in NLP: A taxonomy and review
Dieuwke Hupkes
Mario Giulianelli
Verna Dankers
Mikel Artetxe
Yanai Elazar
...
Leila Khalatbari
Maria Ryskina
Rita Frieske
Ryan Cotterell
Zhijing Jin
260
97
0
06 Oct 2022
Residue-Based Natural Language Adversarial Attack Detection
Vyas Raina
Mark Gales
AAML
45
12
0
17 Apr 2022
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
Yizhong Wang
Swaroop Mishra
Pegah Alipoormolabashi
Yeganeh Kordi
Amirreza Mirzaei
...
Chitta Baral
Yejin Choi
Noah A. Smith
Hannaneh Hajishirzi
Daniel Khashabi
ELM
123
846
0
16 Apr 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
516
3,646
0
21 Mar 2022
WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation
Alisa Liu
Swabha Swayamdipta
Noah A. Smith
Yejin Choi
129
219
0
16 Jan 2022
Measure and Improve Robustness in NLP Models: A Survey
Xuezhi Wang
Haohan Wang
Diyi Yang
235
139
0
15 Dec 2021
The Dangers of Underclaiming: Reasons for Caution When Reporting How NLP Systems Fail
Sam Bowman
OffRL
76
45
0
15 Oct 2021
CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP
Qinyuan Ye
Bill Yuchen Lin
Xiang Ren
286
184
0
18 Apr 2021
Cross-Task Generalization via Natural Language Crowdsourcing Instructions
Swaroop Mishra
Daniel Khashabi
Chitta Baral
Hannaneh Hajishirzi
LRM
148
747
0
18 Apr 2021
What Will it Take to Fix Benchmarking in Natural Language Understanding?
Samuel R. Bowman
George E. Dahl
ELM
ALM
55
160
0
05 Apr 2021
Robustness Gym: Unifying the NLP Evaluation Landscape
Karan Goel
Nazneen Rajani
Jesse Vig
Samson Tan
Jason M. Wu
Stephan Zheng
Caiming Xiong
Joey Tianyi Zhou
Christopher Ré
AAML
OffRL
OOD
190
140
0
13 Jan 2021
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics
Swabha Swayamdipta
Roy Schwartz
Nicholas Lourie
Yizhong Wang
Hannaneh Hajishirzi
Noah A. Smith
Yejin Choi
111
448
0
22 Sep 2020
Measuring Massive Multitask Language Understanding
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELM
RALM
176
4,434
0
07 Sep 2020
Measuring Robustness to Natural Distribution Shifts in Image Classification
Rohan Taori
Achal Dave
Vaishaal Shankar
Nicholas Carlini
Benjamin Recht
Ludwig Schmidt
OOD
117
546
0
01 Jul 2020
Beyond Accuracy: Behavioral Testing of NLP models with CheckList
Marco Tulio Ribeiro
Tongshuang Wu
Carlos Guestrin
Sameer Singh
ELM
208
1,104
0
08 May 2020
Reevaluating Adversarial Examples in Natural Language
John X. Morris
Eli Lifland
Jack Lanchantin
Yangfeng Ji
Yanjun Qi
SILM
AAML
173
114
0
25 Apr 2020
What do Models Learn from Question Answering Datasets?
Priyanka Sen
Amir Saffari
RALM
ELM
54
75
0
07 Apr 2020
BAE: BERT-based Adversarial Examples for Text Classification
Siddhant Garg
Goutham Ramakrishnan
AAML
SILM
198
556
0
04 Apr 2020
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
511
42,449
0
03 Dec 2019
Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment
Di Jin
Zhijing Jin
Qiufeng Wang
Peter Szolovits
SILM
AAML
179
1,078
0
27 Jul 2019
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
Christopher Clark
Kenton Lee
Ming-Wei Chang
Tom Kwiatkowski
Michael Collins
Kristina Toutanova
224
1,527
0
24 May 2019
TextBugger: Generating Adversarial Text Against Real-world Applications
Jinfeng Li
S. Ji
Tianyu Du
Bo Li
Ting Wang
SILM
AAML
211
738
0
13 Dec 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
94,891
0
11 Oct 2018
QuAC : Question Answering in Context
Eunsol Choi
He He
Mohit Iyyer
Mark Yatskar
Wen-tau Yih
Yejin Choi
Percy Liang
Luke Zettlemoyer
122
826
0
21 Aug 2018
Stress Test Evaluation for Natural Language Inference
Aakanksha Naik
Abhilasha Ravichander
Norman M. Sadeh
Carolyn Rose
Graham Neubig
ELM
75
377
0
02 Jun 2018
Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers
Ji Gao
Jack Lanchantin
M. Soffa
Yanjun Qi
AAML
137
721
0
13 Jan 2018
Adversarial Examples for Evaluating Reading Comprehension Systems
Robin Jia
Percy Liang
AAML
ELM
199
1,605
0
23 Jul 2017
NewsQA: A Machine Comprehension Dataset
Adam Trischler
Tong Wang
Xingdi Yuan
Justin Harris
Alessandro Sordoni
Philip Bachman
Kaheer Suleman
103
893
0
29 Nov 2016
SQuAD: 100,000+ Questions for Machine Comprehension of Text
Pranav Rajpurkar
Jian Zhang
Konstantin Lopyrev
Percy Liang
RALM
283
8,134
0
16 Jun 2016
Explaining and Harnessing Adversarial Examples
Ian Goodfellow
Jonathon Shlens
Christian Szegedy
AAML
GAN
277
19,066
0
20 Dec 2014
1