A Survey of Parameters Associated with the Quality of Benchmarks in NLP

14 October 2022

Papers citing "A Survey of Parameters Associated with the Quality of Benchmarks in NLP"

8 / 8 papers shown

Title
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering Pan Lu Swaroop Mishra Tony Xia Liang Qiu Kai-Wei Chang Song-Chun Zhu Oyvind Tafjord Peter Clark A. Kalyan ELM ReLM LRM 211 1,106 0 20 Sep 2022
Competency Problems: On Finding and Removing Artifacts in Language Data Matt Gardner William Merrill Jesse Dodge Matthew E. Peters Alexis Ross Sameer Singh Noah A. Smith 168 107 0 17 Apr 2021
Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets Mor Geva Yoav Goldberg Jonathan Berant 242 320 0 21 Aug 2019
Language GANs Falling Short Massimo Caccia Lucas Caccia W. Fedus Hugo Larochelle Joelle Pineau Laurent Charlin 124 215 0 06 Nov 2018
Hypothesis Only Baselines in Natural Language Inference Adam Poliak Jason Naradowsky Aparajita Haldar Rachel Rudinger Benjamin Van Durme 190 576 0 02 May 2018
Split and Rephrase: Better Evaluation and a Stronger Baseline Roee Aharoni Yoav Goldberg MoE 226 45 0 02 May 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding Alex Jinpeng Wang Amanpreet Singh Julian Michael Felix Hill Omer Levy Samuel R. Bowman ELM 297 6,959 0 20 Apr 2018
Teaching Machines to Read and Comprehend Karl Moritz Hermann Tomás Kociský Edward Grefenstette L. Espeholt W. Kay Mustafa Suleyman Phil Blunsom 175 3,510 0 10 Jun 2015