Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference

4 February 2019

Papers citing "Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference"

50 / 291 papers shown

Title
Assessing Robustness to Spurious Correlations in Post-Training Language Models Julia Shuieh Prasann Singhal Apaar Shanker John Heyer George Pu Samuel Denton LRM 29 0 0 09 May 2025
Pushing the boundary on Natural Language Inference Pablo Miralles-González Javier Huertas-Tato Alejandro Martín David Camacho LRM 44 0 0 25 Apr 2025
FLUKE: A Linguistically-Driven and Task-Agnostic Framework for Robustness Evaluation Yulia Otmakhova Hung Thinh Truong Rahmad Mahendra Zenan Zhai Rongxin Zhu Daniel Beck Jey Han Lau ELM 70 0 0 24 Apr 2025
Do Large Language Models know who did what to whom? Joseph M. Denning Xiaohan Bryor Snefjella Idan A. Blank 62 1 0 23 Apr 2025
Continuum-Interaction-Driven Intelligence: Human-Aligned Neural Architecture via Crystallized Reasoning and Fluid Generation Pengcheng Zhou Zhiqiang Nie Haochen Li 53 0 0 12 Apr 2025
Re-evaluating Theory of Mind evaluation in large language models Jennifer Hu Felix Sosa T. Ullman 45 0 0 28 Feb 2025
Neuro-Symbolic Contrastive Learning for Cross-domain Inference Mingyue Liu Ryo Ueda Zhen Wan Katsumi Inoue Chris G. Willcocks NAI 72 0 0 13 Feb 2025
Does Training on Synthetic Data Make Models Less Robust? Lingze Zhang Ellie Pavlick SyDa 89 0 0 11 Feb 2025
Self-Rationalization in the Wild: A Large Scale Out-of-Distribution Evaluation on NLI-related tasks Jing Yang Max Glockner Anderson de Rezende Rocha Iryna Gurevych LRM 73 1 0 07 Feb 2025
A linguistically-motivated evaluation methodology for unraveling model's abilities in reading comprehension tasks Elie Antoine Frédéric Béchet Géraldine Damnati Philippe Langlais 56 1 0 29 Jan 2025
Evaluating Concurrent Robustness of Language Models Across Diverse Challenge Sets Vatsal Gupta Pranshu Pandya Tushar Kataria Vivek Gupta Dan Roth AAML 55 1 0 03 Jan 2025
Sneaking Syntax into Transformer Language Models with Tree Regularization Ananjan Nandi Christopher D. Manning Shikhar Murty 74 0 0 28 Nov 2024
Focus On This, Not That! Steering LLMs With Adaptive Feature Specification Tom A. Lamb Adam Davies Alasdair Paren Philip H. S. Torr Francesco Pinto 47 0 0 30 Oct 2024
Fact Recall, Heuristics or Pure Guesswork? Precise Interpretations of Language Models for Fact Completion Denitsa Saynova Lovisa Hagström Moa Johansson Richard Johansson Marco Kuhlmann HILM 43 0 0 18 Oct 2024
Inference and Verbalization Functions During In-Context Learning Junyi Tao Xiaoyin Chen Nelson F. Liu LRM ReLM 26 0 0 12 Oct 2024
In-context Learning in Presence of Spurious Correlations Hrayr Harutyunyan R. Darbinyan Samvel Karapetyan Hrant Khachatrian LRM 46 1 0 04 Oct 2024
Efficient LLM Context Distillation Rajesh Upadhayayaya Zachary Smith Chritopher Kottmyer Manish Raj Osti 42 1 0 03 Sep 2024
CoverBench: A Challenging Benchmark for Complex Claim Verification Alon Jacovi Moran Ambar Eyal Ben-David Uri Shaham Amir Feder Mor Geva Dror Marcus Avi Caciularu LMTD 49 3 0 06 Aug 2024
Consistent Document-Level Relation Extraction via Counterfactuals Ali Modarressi Abdullatif Köksal Hinrich Schutze 48 1 0 09 Jul 2024
The Factorization Curse: Which Tokens You Predict Underlie the Reversal Curse and More O. Kitouni Niklas Nolte Diane Bouchacourt Adina Williams Mike Rabbat Mark Ibrahim LRM CLL 46 12 0 07 Jun 2024
ACCORD: Closing the Commonsense Measurability Gap François Roewer-Després Jinyue Feng Zining Zhu Frank Rudzicz LRM 48 0 0 04 Jun 2024
Filtered Corpus Training (FiCT) Shows that Language Models can Generalize from Indirect Evidence Abhinav Patil Jaap Jumelet Yu Ying Chiu Andy Lapastora Peter Shen Lexie Wang Clevis Willrich Shane Steinert-Threlkeld 32 13 0 24 May 2024
CHARP: Conversation History AwaReness Probing for Knowledge-grounded Dialogue Systems Abbas Ghaddar David Alfonso-Hermelo Philippe Langlais Mehdi Rezagholizadeh Boxing Chen Prasanna Parthasarathi 39 0 0 24 May 2024
What is it for a Machine Learning Model to Have a Capability? Jacqueline Harding Nathaniel Sharadin ELM 38 3 0 14 May 2024
Learned feature representations are biased by complexity, learning order, position, and more Andrew Kyle Lampinen Stephanie C. Y. Chan Katherine Hermann AI4CE FaML SSL OOD 34 6 0 09 May 2024
$How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance$ How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance Hongkang Li Shuai Zhang Yihua Zhang Meng Wang Sijia Liu Pin-Yu Chen 41 4 0 12 Mar 2024
Complexity Matters: Dynamics of Feature Learning in the Presence of Spurious Correlations GuanWen Qiu Da Kuang Surbhi Goel 27 8 0 05 Mar 2024
Best of Both Worlds: A Pliable and Generalizable Neuro-Symbolic Approach for Relation Classification Robert Vacareanu F. Alam M. Islam Haris Riaz Mihai Surdeanu NAI 29 2 0 05 Mar 2024
Should We Fear Large Language Models? A Structural Analysis of the Human Reasoning System for Elucidating LLM Capabilities and Risks Through the Lens of Heidegger's Philosophy Jianqiiu Zhang ELM 40 1 0 05 Mar 2024
On the Challenges and Opportunities in Generative AI Laura Manduchi Kushagra Pandey Robert Bamler Ryan Cotterell Sina Daubener ... F. Wenzel Frank Wood Stephan Mandt Vincent Fortuin Vincent Fortuin 56 17 0 28 Feb 2024
InfFeed: Influence Functions as a Feedback to Improve the Performance of Subjective Tasks Somnath Banerjee Maulindu Sarkar Punyajoy Saha Binny Mathew Animesh Mukherjee TDI 34 0 0 22 Feb 2024
Punctuation Restoration Improves Structure Understanding Without Supervision Junghyun Min Minho Lee Woochul Lee Yeonsoo Lee 59 1 0 13 Feb 2024
LEVI: Generalizable Fine-tuning via Layer-wise Ensemble of Different Views Yuji Roh Qingyun Liu Huan Gui Zhe Yuan Yujin Tang ... Liang Liu Shuchao Bi Lichan Hong Ed H. Chi Zhe Zhao 43 1 0 07 Feb 2024
Semantic Sensitivities and Inconsistent Predictions: Measuring the Fragility of NLI Models Erik Arakelyan Zhaoqi Liu Isabelle Augenstein AAML 45 9 0 25 Jan 2024
Learning Shortcuts: On the Misleading Promise of NLU in Language Models Geetanjali Bihani Julia Taylor Rayz 33 3 0 17 Jan 2024
Self-Supervised Position Debiasing for Large Language Models Zhongkun Liu Zheng Chen Mengqi Zhang Zhaochun Ren Pengjie Ren Zhumin Chen 36 1 0 02 Jan 2024
The Earth is Flat? Unveiling Factual Errors in Large Language Models Wenxuan Wang Juluan Shi Zhaopeng Tu Youliang Yuan Jen-tse Huang Wenxiang Jiao Michael R. Lyu KELM HILM SyDa 47 1 0 01 Jan 2024
Analyzing the Inherent Response Tendency of LLMs: Real-World Instructions-Driven Jailbreak Yanrui Du Sendong Zhao Ming Ma Yuhan Chen Bing Qin 26 15 0 07 Dec 2023
Latent Feature-based Data Splits to Improve Generalisation Evaluation: A Hate Speech Detection Case Study Maike Zufle Verna Dankers Ivan Titov 42 0 0 16 Nov 2023
Measuring and Improving Attentiveness to Partial Inputs with Counterfactuals Yanai Elazar Bhargavi Paranjape Hao Peng Sarah Wiegreffe Khyathi Raghavi Vivek Srikumar Sameer Singh Noah A. Smith AAML OOD 28 0 0 16 Nov 2023
Self-Contradictory Reasoning Evaluation and Detection Ziyi Liu Isabelle G. Lee Yongkang Du Soumya Sanyal Jieyu Zhao LRM 30 2 0 16 Nov 2023
How Well Do Text Embedding Models Understand Syntax? Yan Zhang Zhaopeng Feng Zhiyang Teng Zuozhu Liu Haizhou Li 40 3 0 14 Nov 2023
Syntax-Guided Transformers: Elevating Compositional Generalization and Grounding in Multimodal Environments Danial Kamali Parisa Kordjamshidi 36 1 0 07 Nov 2023
Implications of Annotation Artifacts in Edge Probing Test Datasets Sagnik Ray Choudhury Jushaan Kalra 16 0 0 20 Oct 2023
Mind the instructions: a holistic evaluation of consistency and interactions in prompt-based learning Lucas Weber Elia Bruni Dieuwke Hupkes 30 24 0 20 Oct 2023
Data Augmentations for Improved (Large) Language Model Generalization Amir Feder Yoav Wald Claudia Shi S. Saria David M. Blei OOD CML 32 7 0 19 Oct 2023
Beyond Testers' Biases: Guiding Model Testing with Knowledge Bases using LLMs Chenyang Yang Rishabh Rustogi Rachel A. Brower-Sinning Grace A. Lewis Christian Kastner Tongshuang Wu KELM 32 11 0 14 Oct 2023
GLS-CSC: A Simple but Effective Strategy to Mitigate Chinese STM Models' Over-Reliance on Superficial Clue Yanrui Du Sendong Zhao Yuhan Chen Rai Bai Jing Liu Huaqin Wu Haifeng Wang Bing Qin 42 2 0 08 Sep 2023
SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding Tianyu Yu Chengyue Jiang Chao Lou Shen Huang Xiaobin Wang ... Haitao Zheng Ningyu Zhang Pengjun Xie Fei Huang Yong-jia Jiang LRM 57 17 0 21 Aug 2023
Using Artificial Populations to Study Psychological Phenomena in Neural Models Jesse Roberts Kyle Moore Drew Wilenzick Doug Fisher 19 6 0 15 Aug 2023