ResearchTrend.AI
The Dangers of Underclaiming: Reasons for Caution When Reporting How NLP Systems Fail

15 October 2021
Sam Bowman
    OffRL
arXiv: 2110.08300

Papers citing "The Dangers of Underclaiming: Reasons for Caution When Reporting How NLP Systems Fail"

28 / 28 papers shown
A General Language Assistant as a Laboratory for Alignment
Amanda Askell
Yuntao Bai
Anna Chen
Dawn Drain
Deep Ganguli
...
Tom B. Brown
Jack Clark
Sam McCandlish
C. Olah
Jared Kaplan
ALM
114
775
0
01 Dec 2021
AI and the Everything in the Whole Wide World Benchmark
Inioluwa Deborah Raji
Emily M. Bender
Amandalynne Paullada
Emily L. Denton
A. Hanna
68
305
0
26 Nov 2021
Recursively Summarizing Books with Human Feedback
Jeff Wu
Long Ouyang
Daniel M. Ziegler
Nisan Stiennon
Ryan J. Lowe
Jan Leike
Paul Christiano
ALM
139
302
0
22 Sep 2021
The Benchmark Lottery
Mostafa Dehghani
Yi Tay
A. Gritsenko
Zhe Zhao
N. Houlsby
Fernando Diaz
Donald Metzler
Oriol Vinyals
76
90
0
14 Jul 2021
How Good Is NLP? A Sober Look at NLP Tasks through the Lens of Social Impact
Zhijing Jin
Geeticka Chauhan
Brian Tse
Mrinmaya Sachan
Rada Mihalcea
59
26
0
04 Jun 2021
A Non-Linear Structural Probe
Jennifer C. White
Tiago Pimentel
Naomi Saphra
Ryan Cotterell
29
25
0
21 May 2021
Dynaboard: An Evaluation-As-A-Service Platform for Holistic Next-Generation Benchmarking
Zhiyi Ma
Kawin Ethayarajh
Tristan Thrush
Somya Jain
Ledell Yu Wu
Robin Jia
Christopher Potts
Adina Williams
Douwe Kiela
ELM
72
57
0
21 May 2021
Provable Limitations of Acquiring Meaning from Ungrounded Form: What Will Future Language Models Understand?
William Merrill
Yoav Goldberg
Roy Schwartz
Noah A. Smith
65
68
0
22 Apr 2021
Low-Complexity Probing via Finding Subnetworks
Steven Cao
Victor Sanh
Alexander M. Rush
40
53
0
08 Apr 2021
Dynabench: Rethinking Benchmarking in NLP
Douwe Kiela
Max Bartolo
Yixin Nie
Divyansh Kaushik
Atticus Geiger
...
Pontus Stenetorp
Robin Jia
Mohit Bansal
Christopher Potts
Adina Williams
184
405
0
07 Apr 2021
Preregistering NLP Research
Emiel van Miltenburg
Chris van der Lee
E. Krahmer
AI4CE
52
23
0
11 Mar 2021
ANLIzing the Adversarial Natural Language Inference Dataset
Adina Williams
Tristan Thrush
Douwe Kiela
AAML
214
47
0
24 Oct 2020
Aligning AI With Shared Human Values
Dan Hendrycks
Collin Burns
Steven Basart
Andrew Critch
Jingkai Li
D. Song
Jacob Steinhardt
137
548
0
05 Aug 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
132
2,724
0
05 Jun 2020
AI Research Considerations for Human Existential Safety (ARCHES)
Andrew Critch
David M. Krueger
86
52
0
30 May 2020
Beyond Accuracy: Behavioral Testing of NLP models with CheckList
Marco Tulio Ribeiro
Tongshuang Wu
Carlos Guestrin
Sameer Singh
ELM
194
1,100
0
08 May 2020
The State and Fate of Linguistic Diversity and Inclusion in the NLP World
Pratik M. Joshi
Sebastin Santy
A. Budhiraja
Kalika Bali
Monojit Choudhury
LMTD
107
842
0
20 Apr 2020
Adversarial Filters of Dataset Biases
Ronan Le Bras
Swabha Swayamdipta
Chandra Bhagavatula
Rowan Zellers
Matthew E. Peters
Ashish Sabharwal
Yejin Choi
92
222
0
10 Feb 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
532
4,773
0
23 Jan 2020
Algorithmic Fairness from a Non-ideal Perspective
S. Fazelpour
Zachary Chase Lipton
FaML
36
101
0
08 Jan 2020
oLMpics -- On what Language Model Pre-training Captures
Alon Talmor
Yanai Elazar
Yoav Goldberg
Jonathan Berant
LRM
96
303
0
31 Dec 2019
Adversarial NLI: A New Benchmark for Natural Language Understanding
Yixin Nie
Adina Williams
Emily Dinan
Mohit Bansal
Jason Weston
Douwe Kiela
115
1,003
0
31 Oct 2019
Risks from Learned Optimization in Advanced Machine Learning Systems
Evan Hubinger
Chris van Merwijk
Vladimir Mikulik
Joar Skalse
Scott Garrabrant
76
150
0
05 Jun 2019
SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems
Alex Wang
Yada Pruksachatkun
Nikita Nangia
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
240
2,307
0
02 May 2019
Attaining the Unattainable? Reassessing Claims of Human Parity in Neural Machine Translation
Antonio Toral
Sheila Castilho
Ke Hu
Andy Way
48
190
0
30 Aug 2018
Has Machine Translation Achieved Human Parity? A Case for Document-level Evaluation
Samuel Läubli
Rico Sennrich
M. Volk
37
258
0
21 Aug 2018
Deep contextualized word representations
Matthew E. Peters
Mark Neumann
Mohit Iyyer
Matt Gardner
Christopher Clark
Kenton Lee
Luke Zettlemoyer
NAI
190
11,542
0
15 Feb 2018
SQuAD: 100,000+ Questions for Machine Comprehension of Text
Pranav Rajpurkar
Jian Zhang
Konstantin Lopyrev
Percy Liang
RALM
239
8,113
0
16 Jun 2016