Beat the AI: Investigating Adversarial Human Annotation for Reading Comprehension

2 February 2020

Papers citing "Beat the AI: Investigating Adversarial Human Annotation for Reading Comprehension"

50 / 51 papers shown

Title
Green AI: Exploring Carbon Footprints, Mitigation Strategies, and Trade Offs in Large Language Model Training Vivian Liu Yiqiao Yin 40 11 0 01 Apr 2024
Adversarial Nibbler: An Open Red-Teaming Method for Identifying Diverse Harms in Text-to-Image Generation Jessica Quaye Alicia Parrish Oana Inel Charvi Rastogi Hannah Rose Kirk ... Nathan Clement Rafael Mosquera Juan Ciro Vijay Janapa Reddi Lora Aroyo 45 7 0 14 Feb 2024
Desiderata for the Context Use of Question Answering Systems Sagi Shaier Lawrence E Hunter K. Wense 28 4 0 31 Jan 2024
How the Advent of Ubiquitous Large Language Models both Stymie and Turbocharge Dynamic Adversarial Question Generation Yoo Yeon Sung Ishani Mondal Jordan L. Boyd-Graber 30 0 0 20 Jan 2024
Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models Ruida Wang Wangchunshu Zhou Mrinmaya Sachan 27 32 0 20 Oct 2023
Mind the instructions: a holistic evaluation of consistency and interactions in prompt-based learning Lucas Weber Elia Bruni Dieuwke Hupkes 32 25 0 20 Oct 2023
No Offense Taken: Eliciting Offensiveness from Language Models Anugya Srivastava Rahul Ahuja Rohith Mukku 16 3 0 02 Oct 2023
Teaching Smaller Language Models To Generalise To Unseen Compositional Questions Tim Hartill N. Tan Michael Witbrock Patricia J. Riddle ReLM KELM LRM 34 2 0 02 Aug 2023
Revisiting Out-of-distribution Robustness in NLP: Benchmark, Analysis, and LLMs Evaluations Lifan Yuan Yangyi Chen Ganqu Cui Hongcheng Gao Fangyuan Zou Xingyi Cheng Heng Ji Zhiyuan Liu Maosong Sun 44 75 0 07 Jun 2023
Entailment as Robust Self-Learner Jiaxin Ge Hongyin Luo Yoon Kim James R. Glass 42 3 0 26 May 2023
Out-of-Distribution Generalization in Text Classification: Past, Present, and Future Linyi Yang Yangqiu Song Xuan Ren Chenyang Lyu Yidong Wang Lingqiao Liu Jindong Wang Jennifer Foster Yue Zhang OOD 37 2 0 23 May 2023
On the Limitations of Simulating Active Learning Katerina Margatina Nikolaos Aletras 31 11 0 21 May 2023
Can NLP Models Correctly Reason Over Contexts that Break the Common Assumptions? Neeraj Varshney Mihir Parmar Nisarg Patel Divij Handa Sayantan Sarkar Man Luo Chitta Baral LRM 34 4 0 20 May 2023
A Matter of Annotation: An Empirical Study on In Situ and Self-Recall Activity Annotations from Wearable Sensors Alexander Hoelzemann Kristof Van Laerhoven 19 6 0 15 May 2023
Think Twice: Measuring the Efficiency of Eliminating Prediction Shortcuts of Question Answering Models Lukávs Mikula Michal vStefánik Marek Petrovivc Petr Sojka 41 3 0 11 May 2023
Assessing Language Model Deployment with Risk Cards Leon Derczynski Hannah Rose Kirk Vidhisha Balachandran Sachin Kumar Yulia Tsvetkov M. Leiser Saif Mohammad 30 42 0 31 Mar 2023
Revealing Weaknesses of Vietnamese Language Models Through Unanswerable Questions in Machine Reading Comprehension Son Quoc Tran Phong Nguyen-Thuan Do Kiet Van Nguyen Ngan Luu-Thuy Nguyen 39 0 0 16 Mar 2023
Learning to Initialize: Can Meta Learning Improve Cross-task Generalization in Prompt Tuning? Chengwei Qin Q. Li Ruochen Zhao Chenyu You VLM LRM 23 15 0 16 Feb 2023
Evaluating Human-Language Model Interaction Mina Lee Megha Srivastava Amelia Hardy John Thickstun Esin Durmus ... Hancheng Cao Tony Lee Rishi Bommasani Michael S. Bernstein Percy Liang LM&MA ALM 58 100 0 19 Dec 2022
Discovering Language Model Behaviors with Model-Written Evaluations Ethan Perez Sam Ringer Kamilė Lukošiūtė Karina Nguyen Edwin Chen ... Danny Hernandez Deep Ganguli Evan Hubinger Nicholas Schiefer Jared Kaplan ALM 22 367 0 19 Dec 2022
RQUGE: Reference-Free Metric for Evaluating Question Generation by Answering the Question Alireza Mohammadshahi Thomas Scialom Majid Yazdani Pouya Yanki Angela Fan James Henderson Marzieh Saeidi 31 20 0 02 Nov 2022
IDK-MRC: Unanswerable Questions for Indonesian Machine Reading Comprehension Rifki Afina Putri Alice Oh 33 9 0 25 Oct 2022
Rich Knowledge Sources Bring Complex Knowledge Conflicts: Recalibrating Models to Reflect Conflicting Evidence Hung-Ting Chen Michael J.Q. Zhang Eunsol Choi RALM HILM 47 92 0 25 Oct 2022
CORE: A Retrieve-then-Edit Framework for Counterfactual Data Generation Tanay Dixit Bhargavi Paranjape Hannaneh Hajishirzi Luke Zettlemoyer SyDa 146 24 0 10 Oct 2022
State-of-the-art generalisation research in NLP: A taxonomy and review Dieuwke Hupkes Mario Giulianelli Verna Dankers Mikel Artetxe Yanai Elazar ... Leila Khalatbari Maria Ryskina Rita Frieske Ryan Cotterell Zhijing Jin 129 95 0 06 Oct 2022
Efficiently Enhancing Zero-Shot Performance of Instruction Following Model via Retrieval of Soft Prompt Seonghyeon Ye Joel Jang Doyoung Kim Yongrae Jo Minjoon Seo VLM 39 2 0 06 Oct 2022
Possible Stories: Evaluating Situated Commonsense Reasoning under Multiple Possible Scenarios Mana Ashida Saku Sugawara 70 6 0 16 Sep 2022
A Survey on Measuring and Mitigating Reasoning Shortcuts in Machine Reading Comprehension Xanh Ho Johannes Mario Meissner Saku Sugawara Akiko Aizawa OffRL 35 4 0 05 Sep 2022
Collecting high-quality adversarial data for machine reading comprehension tasks with humans and models in the loop Damian Y. Romero Diaz M. Aniol John M. Culnan 32 0 0 28 Jun 2022
Eliciting and Understanding Cross-Task Skills with Task-Level Mixture-of-Experts Qinyuan Ye Juan Zha Xiang Ren MoE 18 12 0 25 May 2022
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks Yizhong Wang Swaroop Mishra Pegah Alipoormolabashi Yeganeh Kordi Amirreza Mirzaei ... Chitta Baral Yejin Choi Noah A. Smith Hannaneh Hajishirzi Daniel Khashabi ELM 64 790 0 16 Apr 2022
Evaluating Prompts Across Multiple Choice Tasks In a Zero-Shot Setting Gabriel Orlanski LRM 27 2 0 29 Mar 2022
What Makes Reading Comprehension Questions Difficult? Saku Sugawara Nikita Nangia Alex Warstadt Sam Bowman ELM RALM 20 13 0 12 Mar 2022
ZeroGen: Efficient Zero-shot Learning via Dataset Generation Jiacheng Ye Jiahui Gao Qintong Li Hang Xu Jiangtao Feng Zhiyong Wu Tao Yu Lingpeng Kong SyDa 47 212 0 16 Feb 2022
WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation Alisa Liu Swabha Swayamdipta Noah A. Smith Yejin Choi 82 212 0 16 Jan 2022
CommonsenseQA 2.0: Exposing the Limits of AI through Gamification Alon Talmor Ori Yoran Ronan Le Bras Chandrasekhar Bhagavatula Yoav Goldberg Yejin Choi Jonathan Berant ELM 33 141 0 14 Jan 2022
Models in the Loop: Aiding Crowdworkers with Generative Annotation Assistants Max Bartolo Tristan Thrush Sebastian Riedel Pontus Stenetorp Robin Jia Douwe Kiela 24 33 0 16 Dec 2021
QuALITY: Question Answering with Long Input Texts, Yes! Richard Yuanzhe Pang Alicia Parrish Nitish Joshi Nikita Nangia Jason Phang ... Vishakh Padmakumar Johnny Ma Jana Thompson He He Sam Bowman RALM 30 141 0 16 Dec 2021
Measure and Improve Robustness in NLP Models: A Survey Xuezhi Wang Haohan Wang Diyi Yang 139 130 0 15 Dec 2021
Adversarially Constructed Evaluation Sets Are More Challenging, but May Not Be Fair Jason Phang Angelica Chen William Huang Samuel R. Bowman AAML 28 13 0 16 Nov 2021
Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models Wei Ping Chejian Xu Shuohang Wang Zhe Gan Yu Cheng Jianfeng Gao Ahmed Hassan Awadallah Yangqiu Song VLM ELM AAML 33 216 0 04 Nov 2021
Retrieval-guided Counterfactual Generation for QA Bhargavi Paranjape Matthew Lamm Ian Tenney 33 31 0 14 Oct 2021
Distantly-Supervised Evidence Retrieval Enables Question Answering without Evidence Annotation Chen Zhao Chenyan Xiong Jordan L. Boyd-Graber Hal Daumé RALM 29 9 0 10 Oct 2021
On the Efficacy of Adversarial Data Collection for Question Answering: Results from a Large-Scale Randomized Study Divyansh Kaushik Douwe Kiela Zachary Chase Lipton Wen-tau Yih AAML 11 36 0 02 Jun 2021
CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP Qinyuan Ye Bill Yuchen Lin Xiang Ren 223 180 0 18 Apr 2021
Improving Question Answering Model Robustness with Synthetic Adversarial Data Generation Max Bartolo Tristan Thrush Robin Jia Sebastian Riedel Pontus Stenetorp Douwe Kiela AAML 28 103 0 18 Apr 2021
Cooperative Self-training of Machine Reading Comprehension Hongyin Luo Shang-Wen Li Ming Gao Seunghak Yu James R. Glass SyDa RALM 20 11 0 12 Mar 2021
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies Mor Geva Daniel Khashabi Elad Segal Tushar Khot Dan Roth Jonathan Berant RALM 259 678 0 06 Jan 2021
DynaSent: A Dynamic Benchmark for Sentiment Analysis Christopher Potts Zhengxuan Wu Atticus Geiger Douwe Kiela 230 77 0 30 Dec 2020
Elastic weight consolidation for better bias inoculation James Thorne Andreas Vlachos 22 11 0 29 Apr 2020