Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference

4 February 2019
R. Thomas McCoy
Ellie Pavlick
Tal Linzen

Papers citing "Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference"

50 / 291 papers shown
Assessing Robustness to Spurious Correlations in Post-Training Language Models
Julia Shuieh
Prasann Singhal
Apaar Shanker
John Heyer
George Pu
Samuel Denton
LRM
29
0
0
09 May 2025
Pushing the boundary on Natural Language Inference
Pablo Miralles-González
Javier Huertas-Tato
Alejandro Martín
David Camacho
LRM
44
0
0
25 Apr 2025
FLUKE: A Linguistically-Driven and Task-Agnostic Framework for Robustness Evaluation
Yulia Otmakhova
Hung Thinh Truong
Rahmad Mahendra
Zenan Zhai
Rongxin Zhu
Daniel Beck
Jey Han Lau
ELM
70
0
0
24 Apr 2025
Do Large Language Models know who did what to whom?
Joseph M. Denning
Xiaohan
Bryor Snefjella
Idan A. Blank
62
1
0
23 Apr 2025
Continuum-Interaction-Driven Intelligence: Human-Aligned Neural Architecture via Crystallized Reasoning and Fluid Generation
Pengcheng Zhou
Zhiqiang Nie
Haochen Li
53
0
0
12 Apr 2025
Re-evaluating Theory of Mind evaluation in large language models
Jennifer Hu
Felix Sosa
T. Ullman
45
0
0
28 Feb 2025
Neuro-Symbolic Contrastive Learning for Cross-domain Inference
Mingyue Liu
Ryo Ueda
Zhen Wan
Katsumi Inoue
Chris G. Willcocks
NAI
72
0
0
13 Feb 2025
Does Training on Synthetic Data Make Models Less Robust?
Lingze Zhang
Ellie Pavlick
SyDa
89
0
0
11 Feb 2025
Self-Rationalization in the Wild: A Large Scale Out-of-Distribution Evaluation on NLI-related tasks
Jing Yang
Max Glockner
Anderson de Rezende Rocha
Iryna Gurevych
LRM
73
1
0
07 Feb 2025
A linguistically-motivated evaluation methodology for unraveling model's abilities in reading comprehension tasks
Elie Antoine
Frédéric Béchet
Géraldine Damnati
Philippe Langlais
56
1
0
29 Jan 2025
Evaluating Concurrent Robustness of Language Models Across Diverse Challenge Sets
Vatsal Gupta
Pranshu Pandya
Tushar Kataria
Vivek Gupta
Dan Roth
AAML
55
1
0
03 Jan 2025
Sneaking Syntax into Transformer Language Models with Tree Regularization
Ananjan Nandi
Christopher D. Manning
Shikhar Murty
74
0
0
28 Nov 2024
Focus On This, Not That! Steering LLMs With Adaptive Feature Specification
Tom A. Lamb
Adam Davies
Alasdair Paren
Philip H. S. Torr
Francesco Pinto
47
0
0
30 Oct 2024
Fact Recall, Heuristics or Pure Guesswork? Precise Interpretations of Language Models for Fact Completion
Denitsa Saynova
Lovisa Hagström
Moa Johansson
Richard Johansson
Marco Kuhlmann
HILM
43
0
0
18 Oct 2024
Inference and Verbalization Functions During In-Context Learning
Junyi Tao
Xiaoyin Chen
Nelson F. Liu
LRM
ReLM
26
0
0
12 Oct 2024
In-context Learning in Presence of Spurious Correlations
Hrayr Harutyunyan
R. Darbinyan
Samvel Karapetyan
Hrant Khachatrian
LRM
46
1
0
04 Oct 2024
Efficient LLM Context Distillation
Rajesh Upadhayayaya
Zachary Smith
Christopher Kottmyer
Manish Raj Osti
42
1
0
03 Sep 2024
CoverBench: A Challenging Benchmark for Complex Claim Verification
Alon Jacovi
Moran Ambar
Eyal Ben-David
Uri Shaham
Amir Feder
Mor Geva
Dror Marcus
Avi Caciularu
LMTD
49
3
0
06 Aug 2024
Consistent Document-Level Relation Extraction via Counterfactuals
Ali Modarressi
Abdullatif Köksal
Hinrich Schütze
48
1
0
09 Jul 2024
The Factorization Curse: Which Tokens You Predict Underlie the Reversal Curse and More
O. Kitouni
Niklas Nolte
Diane Bouchacourt
Adina Williams
Mike Rabbat
Mark Ibrahim
LRM
CLL
46
12
0
07 Jun 2024
ACCORD: Closing the Commonsense Measurability Gap
François Roewer-Després
Jinyue Feng
Zining Zhu
Frank Rudzicz
LRM
48
0
0
04 Jun 2024
Filtered Corpus Training (FiCT) Shows that Language Models can Generalize from Indirect Evidence
Abhinav Patil
Jaap Jumelet
Yu Ying Chiu
Andy Lapastora
Peter Shen
Lexie Wang
Clevis Willrich
Shane Steinert-Threlkeld
32
13
0
24 May 2024
CHARP: Conversation History AwaReness Probing for Knowledge-grounded Dialogue Systems
Abbas Ghaddar
David Alfonso-Hermelo
Philippe Langlais
Mehdi Rezagholizadeh
Boxing Chen
Prasanna Parthasarathi
39
0
0
24 May 2024
What is it for a Machine Learning Model to Have a Capability?
Jacqueline Harding
Nathaniel Sharadin
ELM
38
3
0
14 May 2024
Learned feature representations are biased by complexity, learning order, position, and more
Andrew Kyle Lampinen
Stephanie C. Y. Chan
Katherine Hermann
AI4CE
FaML
SSL
OOD
34
6
0
09 May 2024
How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance
Hongkang Li
Shuai Zhang
Yihua Zhang
Meng Wang
Sijia Liu
Pin-Yu Chen
41
4
0
12 Mar 2024
Complexity Matters: Dynamics of Feature Learning in the Presence of Spurious Correlations
GuanWen Qiu
Da Kuang
Surbhi Goel
27
8
0
05 Mar 2024
Best of Both Worlds: A Pliable and Generalizable Neuro-Symbolic Approach for Relation Classification
Robert Vacareanu
F. Alam
M. Islam
Haris Riaz
Mihai Surdeanu
NAI
29
2
0
05 Mar 2024
Should We Fear Large Language Models? A Structural Analysis of the Human Reasoning System for Elucidating LLM Capabilities and Risks Through the Lens of Heidegger's Philosophy
Jianqiu Zhang
ELM
40
1
0
05 Mar 2024
On the Challenges and Opportunities in Generative AI
Laura Manduchi
Kushagra Pandey
Robert Bamler
Ryan Cotterell
Sina Daubener
...
F. Wenzel
Frank Wood
Stephan Mandt
Vincent Fortuin
56
17
0
28 Feb 2024
InfFeed: Influence Functions as a Feedback to Improve the Performance of Subjective Tasks
Somnath Banerjee
Maulindu Sarkar
Punyajoy Saha
Binny Mathew
Animesh Mukherjee
TDI
34
0
0
22 Feb 2024
Punctuation Restoration Improves Structure Understanding Without Supervision
Junghyun Min
Minho Lee
Woochul Lee
Yeonsoo Lee
59
1
0
13 Feb 2024
LEVI: Generalizable Fine-tuning via Layer-wise Ensemble of Different Views
Yuji Roh
Qingyun Liu
Huan Gui
Zhe Yuan
Yujin Tang
...
Liang Liu
Shuchao Bi
Lichan Hong
Ed H. Chi
Zhe Zhao
43
1
0
07 Feb 2024
Semantic Sensitivities and Inconsistent Predictions: Measuring the Fragility of NLI Models
Erik Arakelyan
Zhaoqi Liu
Isabelle Augenstein
AAML
45
9
0
25 Jan 2024
Learning Shortcuts: On the Misleading Promise of NLU in Language Models
Geetanjali Bihani
Julia Taylor Rayz
33
3
0
17 Jan 2024
Self-Supervised Position Debiasing for Large Language Models
Zhongkun Liu
Zheng Chen
Mengqi Zhang
Zhaochun Ren
Pengjie Ren
Zhumin Chen
36
1
0
02 Jan 2024
The Earth is Flat? Unveiling Factual Errors in Large Language Models
Wenxuan Wang
Juluan Shi
Zhaopeng Tu
Youliang Yuan
Jen-tse Huang
Wenxiang Jiao
Michael R. Lyu
KELM
HILM
SyDa
47
1
0
01 Jan 2024
Analyzing the Inherent Response Tendency of LLMs: Real-World Instructions-Driven Jailbreak
Yanrui Du
Sendong Zhao
Ming Ma
Yuhan Chen
Bing Qin
26
15
0
07 Dec 2023
Latent Feature-based Data Splits to Improve Generalisation Evaluation: A Hate Speech Detection Case Study
Maike Züfle
Verna Dankers
Ivan Titov
42
0
0
16 Nov 2023
Measuring and Improving Attentiveness to Partial Inputs with Counterfactuals
Yanai Elazar
Bhargavi Paranjape
Hao Peng
Sarah Wiegreffe
Khyathi Raghavi
Vivek Srikumar
Sameer Singh
Noah A. Smith
AAML
OOD
28
0
0
16 Nov 2023
Self-Contradictory Reasoning Evaluation and Detection
Ziyi Liu
Isabelle G. Lee
Yongkang Du
Soumya Sanyal
Jieyu Zhao
LRM
30
2
0
16 Nov 2023
How Well Do Text Embedding Models Understand Syntax?
Yan Zhang
Zhaopeng Feng
Zhiyang Teng
Zuozhu Liu
Haizhou Li
40
3
0
14 Nov 2023
Syntax-Guided Transformers: Elevating Compositional Generalization and Grounding in Multimodal Environments
Danial Kamali
Parisa Kordjamshidi
36
1
0
07 Nov 2023
Implications of Annotation Artifacts in Edge Probing Test Datasets
Sagnik Ray Choudhury
Jushaan Kalra
16
0
0
20 Oct 2023
Mind the instructions: a holistic evaluation of consistency and interactions in prompt-based learning
Lucas Weber
Elia Bruni
Dieuwke Hupkes
30
24
0
20 Oct 2023
Data Augmentations for Improved (Large) Language Model Generalization
Amir Feder
Yoav Wald
Claudia Shi
S. Saria
David M. Blei
OOD
CML
32
7
0
19 Oct 2023
Beyond Testers' Biases: Guiding Model Testing with Knowledge Bases using LLMs
Chenyang Yang
Rishabh Rustogi
Rachel A. Brower-Sinning
Grace A. Lewis
Christian Kästner
Tongshuang Wu
KELM
32
11
0
14 Oct 2023
GLS-CSC: A Simple but Effective Strategy to Mitigate Chinese STM Models' Over-Reliance on Superficial Clue
Yanrui Du
Sendong Zhao
Yuhan Chen
Rai Bai
Jing Liu
Huaqin Wu
Haifeng Wang
Bing Qin
42
2
0
08 Sep 2023
SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding
Tianyu Yu
Chengyue Jiang
Chao Lou
Shen Huang
Xiaobin Wang
...
Haitao Zheng
Ningyu Zhang
Pengjun Xie
Fei Huang
Yong-jia Jiang
LRM
57
17
0
21 Aug 2023
Using Artificial Populations to Study Psychological Phenomena in Neural Models
Jesse Roberts
Kyle Moore
Drew Wilenzick
Doug Fisher
19
6
0
15 Aug 2023