Cited By

Language models are not naysayers: An analysis of language models on negation benchmarks
Thinh Hung Truong, Timothy Baldwin, Karin Verspoor, Trevor Cohn. arXiv 2306.08189, 14 June 2023.

Papers citing "Language models are not naysayers: An analysis of language models on negation benchmarks" (50 of 51 papers shown):
Reasoning Capabilities and Invariability of Large Language Models
Alessandro Raganato, Rafael Peñaloza, Marco Viviani, G. Pasi [ReLM, LRM]. 01 May 2025.

Don't Deceive Me: Mitigating Gaslighting through Attention Reallocation in LMMs
Pengkun Jiao, Bin Zhu, Jingjing Chen, Chong-Wah Ngo, Yu Jiang. 13 Apr 2025.

Negation: A Pink Elephant in the Large Language Models' Room?
Tereza Vrabcová, Marek Kadlcík, Petr Sojka, Michal Štefánik, Michal Spiegel. 28 Mar 2025.

From No to Know: Taxonomy, Challenges, and Opportunities for Negation Understanding in Multimodal Foundation Models
Mayank Vatsa, Aparna Bharati, S. Mittal, Richa Singh. 10 Feb 2025.

Calling a Spade a Heart: Gaslighting Multimodal Large Language Models via Negation
Bin Zhu, Hui yan Qi, Yinxuan Gui, Jingjing Chen, Chong-Wah Ngo, Ee-Peng Lim. 31 Jan 2025.
A linguistically-motivated evaluation methodology for unraveling model's abilities in reading comprehension tasks
Elie Antoine, Frédéric Béchet, Géraldine Damnati, Philippe Langlais. 29 Jan 2025.

Generating Diverse Negations from Affirmative Sentences
Darian Rodriguez Vasquez, Afroditi Papadaki. 30 Oct 2024.

Is artificial intelligence still intelligence? LLMs generalize to novel adjective-noun pairs, but don't mimic the full human distribution
Hayley Ross, Kathryn Davidson, Najoung Kim. 23 Oct 2024.

Are LLMs Models of Distributional Semantics? A Case Study on Quantifiers
Zhang Enyan, Zewei Wang, Michael A. Lepori, Ellie Pavlick, Helena Aparicio. 17 Oct 2024.

On A Scale From 1 to 5: Quantifying Hallucination in Faithfulness Evaluation
Xiaonan Jing, Srinivas Billa, Danny Godbout [HILM]. 16 Oct 2024.
LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints
Thomas Palmeira Ferraz, Kartik Mehta, Yu-Hsiang Lin, Haw-Shiuan Chang, Shereen Oraby, Sijia Liu, Vivek Subramanian, Tagyoung Chung, Mohit Bansal, Nanyun Peng. 09 Oct 2024.

Enhancing Post-Hoc Attributions in Long Document Comprehension via Coarse Grained Answer Decomposition
Pritika Ramu, Koustava Goswami, Apoorv Saxena, Balaji Vasan Srinivasan. 25 Sep 2024.

Controlled LLM-based Reasoning for Clinical Trial Retrieval
Mael Jullien, Alex Bogatu, Harriet Unsworth, André Freitas [LRM]. 19 Sep 2024.

NeIn: Telling What You Don't Want
Nhat-Tan Bui, Dinh-Hieu Hoang, Quoc-Huy Trinh, Minh-Triet Tran, Truong Nguyen, Susan Gauch. 09 Sep 2024.
Animate, or Inanimate, That is the Question for Large Language Models
Leonardo Ranaldi, Giulia Pucci, Fabio Massimo Zanzotto. 12 Aug 2024.

Can LLMs Replace Manual Annotation of Software Engineering Artifacts?
Toufique Ahmed, Premkumar Devanbu, Christoph Treude, Michael Pradel. 10 Aug 2024.

How and where does CLIP process negation?
Vincent Quantmeyer, Pablo Mosteiro, Albert Gatt [CoGe]. 15 Jul 2024.

Learning to Ask Informative Questions: Enhancing LLMs with Preference Optimization and Expected Information Gain
Davide Mazzaccara, A. Testoni, Raffaella Bernardi. 25 Jun 2024.

Is this a bad table? A Closer Look at the Evaluation of Table Generation from Text
Pritika Ramu, Aparna Garimella, Sambaran Bandyopadhyay [LMTD]. 21 Jun 2024.
Large Language Models are Skeptics: False Negative Problem of Input-conflicting Hallucination
Jongyoon Song, Sangwon Yu, Sungroh Yoon [HILM]. 20 Jun 2024.

Bag of Lies: Robustness in Continuous Pre-training BERT
I. Gevers, Walter Daelemans. 14 Jun 2024.

Paraphrasing in Affirmative Terms Improves Negation Understanding
MohammadHossein Rezaei, Eduardo Blanco. 11 Jun 2024.

Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation
Neeraj Varshney, Satyam Raj, Venkatesh Mishra, Agneet Chatterjee, Ritika Sarkar, Amir Saeidi, Chitta Baral [LRM]. 08 Jun 2024.

Large Language Models Lack Understanding of Character Composition of Words
Andrew Shin, Kunitake Kaneko. 18 May 2024.
Challenges and Opportunities in Text Generation Explainability
Kenza Amara, Rita Sevastjanova, Mennatallah El-Assady [SILM]. 14 May 2024.

Experimental Pragmatics with Machines: Testing LLM Predictions for the Inferences of Plain and Embedded Disjunctions
Polina Tsvilodub, Paul Marty, Sonia Ramotowska, Jacopo Romoli, Michael Franke. 09 May 2024.

Language in Vivo vs. in Silico: Size Matters but Larger Language Models Still Do Not Comprehend Language on a Par with Humans
Vittoria Dentella, Fritz Guenther, Evelina Leivada [ELM]. 23 Apr 2024.

Revisiting subword tokenization: A case study on affixal negation in large language models
Thinh Hung Truong, Yulia Otmakhova, Karin Verspoor, Trevor Cohn, Timothy Baldwin. 03 Apr 2024.

Beyond Accuracy: Evaluating the Reasoning Behavior of Large Language Models -- A Survey
Philipp Mondorf, Barbara Plank [ELM, LRM, LM&MA]. 02 Apr 2024.
Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning
Philipp Mondorf, Barbara Plank [LRM]. 20 Feb 2024.

ArabicMMLU: Assessing Massive Multitask Language Understanding in Arabic
Fajri Koto, Haonan Li, Sara Shatnawi, Jad Doughman, Abdelrahman Boda Sadallah, ..., Neha Sengupta, Shady Shehata, Nizar Habash, Preslav Nakov, Timothy Baldwin [ELM, LRM]. 20 Feb 2024.

Strong hallucinations from negation and how to fix them
Nicholas Asher, Swarnadeep Bhar [ReLM, LRM]. 16 Feb 2024.

SyntaxShap: Syntax-aware Explainability Method for Text Generation
Kenza Amara, Rita Sevastjanova, Mennatallah El-Assady. 14 Feb 2024.
Exploring Group and Symmetry Principles in Large Language Models
Shima Imani, Hamid Palangi [LRM]. 09 Feb 2024.

Can Language Models Be Tricked by Language Illusions? Easier with Syntax, Harder with Semantics
Yuhan Zhang, Edward Gibson, Forrest Davis. 02 Nov 2023.

Assessing Step-by-Step Reasoning against Lexical Negation: A Case Study on Syllogism
Mengyu Ye, Tatsuki Kuribayashi, Jun Suzuki, Goro Kobayashi, Hiroaki Funayama [LRM]. 23 Oct 2023.

Revealing the Unwritten: Visual Investigation of Beam Search Trees to Address Language Model Prompting Challenges
Thilo Spinner, Rebecca Kehlbeck, Rita Sevastjanova, Tobias Stähle, Daniel A. Keim, Oliver Deussen, Andreas Spitz, Mennatallah El-Assady. 17 Oct 2023.

Trustworthy Formal Natural Language Specifications
Colin S. Gordon, Sergey Matskevich [HILM]. 05 Oct 2023.
Are Multilingual LLMs Culturally-Diverse Reasoners? An Investigation into Multicultural Proverbs and Sayings
Chen Cecilia Liu, Fajri Koto, Timothy Baldwin, Iryna Gurevych [LRM]. 15 Sep 2023.

Representation Synthesis by Probabilistic Many-Valued Logic Operation in Self-Supervised Learning
Hiroki Nakamura, Masashi Okada, T. Taniguchi [SSL, NAI]. 08 Sep 2023.

Not wacky vs. definitely wacky: A study of scalar adverbs in pretrained language models
Isabelle Lorge, J. Pierrehumbert. 25 May 2023.

Simple Linguistic Inferences of Large Language Models (LLMs): Blind Spots and Blinds
Victoria Basmov, Yoav Goldberg, Reut Tsarfaty [ReLM, LRM]. 24 May 2023.

Leveraging Large Language Models for Multiple Choice Question Answering
Joshua Robinson, Christopher Rytting, David Wingate [ELM]. 22 Oct 2022.
Not another Negation Benchmark: The NaN-NLI Test Suite for Sub-clausal Negation
Thinh Hung Truong, Yulia Otmakhova, Tim Baldwin, Trevor Cohn, Jey Han Lau, Karin Verspoor. 06 Oct 2022.

Can Large Language Models Truly Understand Prompts? A Case Study with Negated Prompts
Joel Jang, Seonghyeon Ye, Minjoon Seo [ELM, LRM]. 26 Sep 2022.

Life after BERT: What do Other Muppets Understand about Language?
Vladislav Lialin, Kevin Zhao, Namrata Shivagunde, Anna Rumshisky. 21 May 2022.

Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe [OSLM, ALM]. 04 Mar 2022.
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester, Rami Al-Rfou, Noah Constant [VPVLM]. 18 Apr 2021.

The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, ..., Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy [AIMat]. 31 Dec 2020.

Language Models as Knowledge Bases?
Fabio Petroni, Tim Rocktaschel, Patrick Lewis, A. Bakhtin, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel [KELM, AI4MH]. 03 Sep 2019.