Annotation Artifacts in Natural Language Inference Data

6 March 2018

Papers citing "Annotation Artifacts in Natural Language Inference Data"

50 / 796 papers shown

Improving Question Answering Model Robustness with Synthetic Adversarial Data Generation Max Bartolo Tristan Thrush Robin Jia Sebastian Riedel Pontus Stenetorp Douwe Kiela AAML 93 106 0 18 Apr 2021
Competency Problems: On Finding and Removing Artifacts in Language Data Matt Gardner William Merrill Jesse Dodge Matthew E. Peters Alexis Ross Sameer Singh Noah A. Smith 248 111 0 17 Apr 2021
AM2iCo: Evaluating Word Meaning in Context across Low-Resource Languages with Adversarial Examples Qianchu Liu Edoardo Ponti Diana McCarthy Ivan Vulić Anna Korhonen 118 19 0 17 Apr 2021
Back to Square One: Artifact Detection, Training and Commonsense Disentanglement in the Winograd Schema Yanai Elazar Hongming Zhang Yoav Goldberg Dan Roth ReLM LRM 126 44 0 16 Apr 2021
Supervising Model Attention with Human Explanations for Robust Natural Language Inference Joe Stacey Yonatan Belinkov Marek Rei 75 47 0 16 Apr 2021
ExplaGraphs: An Explanation Graph Generation Task for Structured Commonsense Reasoning Swarnadeep Saha Prateek Yadav Lisa Bauer Joey Tianyi Zhou LRM 90 59 0 15 Apr 2021
Regularization for Long Named Entity Recognition Minbyul Jeong Jaewoo Kang 89 4 0 15 Apr 2021
Does Putting a Linguist in the Loop Improve NLU Data Collection? Alicia Parrish William Huang Omar Agha Soo-hwan Lee Nikita Nangia Alex Warstadt Karmanya Aggarwal Emily Allaway Tal Linzen Samuel R. Bowman 117 40 0 15 Apr 2021
Natural-Language Multi-Agent Simulations of Argumentative Opinion Dynamics Gregor Betz LLMAG AI4CE 37 11 0 14 Apr 2021
Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little Koustuv Sinha Robin Jia Dieuwke Hupkes J. Pineau Adina Williams Douwe Kiela 134 249 0 14 Apr 2021
Semantic maps and metrics for science using deep transformer encoders Brendan Chambers James A. Evans MedIm 46 0 0 13 Apr 2021
NLI Data Sanity Check: Assessing the Effect of Data Corruption on Model Performance Aarne Talman Marianna Apidianaki S. Chatzikyriakidis Jörg Tiedemann 43 10 0 10 Apr 2021
Fool Me Twice: Entailment from Wikipedia Gamification Julian Martin Eisenschlos Bhuwan Dhingra Jannis Bulian Benjamin Borschinger Jordan L. Boyd-Graber 93 48 0 10 Apr 2021
GrASP: A Library for Extracting and Exploring Human-Interpretable Textual Patterns Piyawat Lertvittayakumjorn Leshem Choshen Eyal Shnarch Francesca Toni 96 7 0 08 Apr 2021
Dynabench: Rethinking Benchmarking in NLP Douwe Kiela Max Bartolo Yixin Nie Divyansh Kaushik Atticus Geiger ... Pontus Stenetorp Robin Jia Joey Tianyi Zhou Christopher Potts Adina Williams 218 410 0 07 Apr 2021
Paired Examples as Indirect Supervision in Latent Decision Models Nitish Gupta Sameer Singh Matt Gardner Dan Roth 85 7 0 05 Apr 2021
UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark Nicholas Lourie Ronan Le Bras Chandra Bhagavatula Yejin Choi LRM 105 140 0 24 Mar 2021
SelfExplain: A Self-Explaining Architecture for Neural Text Classifiers Dheeraj Rajagopal Vidhisha Balachandran Eduard H. Hovy Yulia Tsvetkov MILM SSL FAtt AI4TS 86 67 0 23 Mar 2021
Automatic Generation of Contrast Sets from Scene Graphs: Probing the Compositional Consistency of GQA Yonatan Bitton Gabriel Stanovsky Roy Schwartz Michael Elhadad CoGe 95 33 0 17 Mar 2021
Get Your Vitamin C! Robust Fact Verification with Contrastive Evidence Tal Schuster Adam Fisch Regina Barzilay 114 239 0 15 Mar 2021
Are NLP Models really able to Solve Simple Math Word Problems? Arkil Patel S. Bhattamishra Navin Goyal ReLM LRM 141 851 0 12 Mar 2021
Towards Interpreting and Mitigating Shortcut Learning Behavior of NLU Models Mengnan Du Varun Manjunatha R. Jain Ruchi Deshpande Franck Dernoncourt Jiuxiang Gu Tong Sun Helen Zhou 108 107 0 11 Mar 2021
Hurdles to Progress in Long-form Question Answering Kalpesh Krishna Aurko Roy Mohit Iyyer 72 200 0 10 Mar 2021
Rissanen Data Analysis: Examining Dataset Characteristics via Description Length Ethan Perez Douwe Kiela Kyunghyun Cho 82 24 0 05 Mar 2021
Contrastive Explanations for Model Interpretability Alon Jacovi Swabha Swayamdipta Shauli Ravfogel Yanai Elazar Yejin Choi Yoav Goldberg 163 98 0 02 Mar 2021
Cryptonite: A Cryptic Crossword Benchmark for Extreme Ambiguity in Language Avia Efrat Uri Shaham D. Kilman Omer Levy ELM 65 18 0 01 Mar 2021
Long Document Summarization in a Low Resource Setting using Pretrained Language Models Ahsaas Bajaj Pavitra Dangati Kalpesh Krishna Pradhiksha Ashok Kumar Rheeya Uppaal Bradford T. Windsor Eliot Brenner Dominic Dotterrer Rajarshi Das Andrew McCallum AILaw RALM 92 52 0 01 Mar 2021
Teach Me to Explain: A Review of Datasets for Explainable Natural Language Processing Sarah Wiegreffe Ana Marasović XAI 93 146 0 24 Feb 2021
Statistically Profiling Biases in Natural Language Reasoning Datasets and Models Shanshan Huang Kenny Q. Zhu 32 1 0 09 Feb 2021
Think you have Solved Direct-Answer Question Answering? Try ARC-DA, the Direct-Answer AI2 Reasoning Challenge Sumithra Bhakthavatsalam Daniel Khashabi Tushar Khot Bhavana Dalvi Kyle Richardson Ashish Sabharwal Carissa Schoenick Oyvind Tafjord Peter Clark RALM AI4CE 70 66 0 05 Feb 2021
Challenges in Automated Debiasing for Toxic Language Detection Xuhui Zhou Maarten Sap Swabha Swayamdipta Noah A. Smith Yejin Choi 78 142 0 29 Jan 2021
Exploring Transitivity in Neural NLI Models through Veridicality Hitomi Yanaka K. Mineshima Kentaro Inui 82 23 0 26 Jan 2021
Mitigating the Position Bias of Transformer Models in Passage Re-Ranking Sebastian Hofstatter Aldo Lipani Sophia Althammer Markus Zlabinger Allan Hanbury 128 17 0 18 Jan 2021
Robustness Gym: Unifying the NLP Evaluation Landscape Karan Goel Nazneen Rajani Jesse Vig Samson Tan Jason M. Wu Stephan Zheng Caiming Xiong Joey Tianyi Zhou Christopher Ré AAML OffRL OOD 199 140 0 13 Jan 2021
BERT & Family Eat Word Salad: Experiments with Text Understanding Ashim Gupta Giorgi Kvernadze Vivek Srikumar 258 73 0 10 Jan 2021
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies Mor Geva Daniel Khashabi Elad Segal Tushar Khot Dan Roth Jonathan Berant RALM 363 742 0 06 Jan 2021
Polyjuice: Generating Counterfactuals for Explaining, Evaluating, and Improving Models Tongshuang Wu Marco Tulio Ribeiro Jeffrey Heer Daniel S. Weld 130 250 0 01 Jan 2021
Promoting Graph Awareness in Linearized Graph-to-Text Generation Alexander Miserlis Hoyle Ana Marasović Noah A. Smith AI4CE 75 31 0 31 Dec 2020
DynaSent: A Dynamic Benchmark for Sentiment Analysis Christopher Potts Zhengxuan Wu Atticus Geiger Douwe Kiela 299 80 0 30 Dec 2020
To what extent do human explanations of model behavior align with actual model behavior? Grusha Prasad Yixin Nie Joey Tianyi Zhou Robin Jia Douwe Kiela Adina Williams 73 28 0 24 Dec 2020
R$^2$-Net: Relation of Relation Learning Network for Sentence Semantic Matching Kun Zhang Le Wu Guangyi Lv Meng Wang Enhong Chen Shulan Ruan 71 21 0 16 Dec 2020
Learning to Rationalize for Nonmonotonic Reasoning with Distant Supervision Faeze Brahman Vered Shwartz Rachel Rudinger Yejin Choi LRM 98 42 0 14 Dec 2020
Data and its (dis)contents: A survey of dataset development and use in machine learning research Amandalynne Paullada Inioluwa Deborah Raji Emily M. Bender Emily L. Denton A. Hanna 133 532 0 09 Dec 2020
Semantics Altering Modifications for Evaluating Comprehension in Machine Reading Viktor Schlegel Goran Nenadic Riza Batista-Navarro 79 18 0 07 Dec 2020
WeaQA: Weak Supervision via Captions for Visual Question Answering Pratyay Banerjee Tejas Gokhale Yezhou Yang Chitta Baral 110 36 0 04 Dec 2020
Learning from others' mistakes: Avoiding dataset biases without modeling them Victor Sanh Thomas Wolf Yonatan Belinkov Alexander M. Rush 96 116 0 02 Dec 2020
Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified Framework of Vision-and-Language BERTs Emanuele Bugliarello Ryan Cotterell Naoaki Okazaki Desmond Elliott 102 120 0 30 Nov 2020
The Geometry of Distributed Representations for Better Alignment, Attenuated Bias, and Improved Interpretability Sunipa Dev 80 1 0 25 Nov 2020
What do we expect from Multiple-choice QA Systems? Krunal Shah Nitish Gupta Dan Roth AAML 40 14 0 20 Nov 2020
Gradient Starvation: A Learning Proclivity in Neural Networks Mohammad Pezeshki Sekouba Kaba Yoshua Bengio Aaron Courville Doina Precup Guillaume Lajoie MLT 158 269 0 18 Nov 2020