ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2309.12288
  4. Cited By
The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"
v1v2v3v4 (latest)

The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"

21 September 2023
Lukas Berglund
Meg Tong
Max Kaufmann
Mikita Balesni
Asa Cooper Stickland
Tomasz Korbak
Owain Evans
    LRM
ArXiv (abs)PDFHTMLGithub (288★)

Papers citing "The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A""

50 / 62 papers shown
Title
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs
Xiaoran Liu
Zhigeng Liu
Zengfeng Huang
Qipeng Guo
Ziwei He
Xipeng Qiu
36
0
0
17 Jun 2025
Just Go Parallel: Improving the Multilingual Capabilities of Large Language Models
Just Go Parallel: Improving the Multilingual Capabilities of Large Language Models
Muhammad Reza Qorib
Junyi Li
Hwee Tou Ng
LRM
16
0
0
16 Jun 2025
A Study on Individual Spatiotemporal Activity Generation Method Using MCP-Enhanced Chain-of-Thought Large Language Models
A Study on Individual Spatiotemporal Activity Generation Method Using MCP-Enhanced Chain-of-Thought Large Language Models
Yu Zhang
Yang Hu
D. B. Wang
LRM
97
0
0
12 Jun 2025
BF-Max: an Efficient Bit Flipping Decoder with Predictable Decoding Failure Rate
Alessio Baldelli
Marco Baldi
F. Chiaraluce
Paolo Santini
115
0
0
11 Jun 2025
PropMEND: Hypernetworks for Knowledge Propagation in LLMs
Zeyu Leo Liu
Greg Durrett
Eunsol Choi
KELM
20
0
0
10 Jun 2025
C-PATH: Conversational Patient Assistance and Triage in Healthcare System
C-PATH: Conversational Patient Assistance and Triage in Healthcare System
Qi Shi
Qiwei Han
Cláudia Soares
LM&MA
13
0
0
07 Jun 2025
Multidimensional Analysis of Specific Language Impairment Using Unsupervised Learning Through PCA and Clustering
Multidimensional Analysis of Specific Language Impairment Using Unsupervised Learning Through PCA and Clustering
Niruthiha Selvanayagam
27
0
0
05 Jun 2025
Quantifying Cross-Modality Memorization in Vision-Language Models
Yuxin Wen
Yangsibo Huang
Tom Goldstein
Ravi Kumar
Badih Ghazi
Chiyuan Zhang
108
0
0
05 Jun 2025
DLM-One: Diffusion Language Models for One-Step Sequence Generation
DLM-One: Diffusion Language Models for One-Step Sequence Generation
Tianqi Chen
Shujian Zhang
Mingyuan Zhou
28
0
0
30 May 2025
How Is LLM Reasoning Distracted by Irrelevant Context? An Analysis Using a Controlled Benchmark
How Is LLM Reasoning Distracted by Irrelevant Context? An Analysis Using a Controlled Benchmark
Minglai Yang
Ethan Huang
Liang Zhang
Mihai Surdeanu
William Yang Wang
Liangming Pan
LRM
38
0
0
24 May 2025
When Do LLMs Admit Their Mistakes? Understanding the Role of Model Belief in Retraction
When Do LLMs Admit Their Mistakes? Understanding the Role of Model Belief in Retraction
Yuqing Yang
Robin Jia
KELMLRM
122
1
0
22 May 2025
Thought-Augmented Policy Optimization: Bridging External Guidance and Internal Capabilities
Thought-Augmented Policy Optimization: Bridging External Guidance and Internal Capabilities
Jinyang Wu
Chonghua Liao
Mingkuan Feng
Shuai Zhang
Zhengqi Wen
Pengpeng Shao
Huazhe Xu
Jianhua Tao
LRMOffRL
137
3
0
21 May 2025
Bidirectional LMs are Better Knowledge Memorizers? A Benchmark for Real-world Knowledge Injection
Bidirectional LMs are Better Knowledge Memorizers? A Benchmark for Real-world Knowledge Injection
Yuwei Zhang
Wenhao Yu
Shangbin Feng
Yifan Zhu
Letian Peng
Jayanth Srinivasa
Gaowen Liu
Jingbo Shang
KELM
71
2
0
18 May 2025
Llama See, Llama Do: A Mechanistic Perspective on Contextual Entrainment and Distraction in LLMs
Llama See, Llama Do: A Mechanistic Perspective on Contextual Entrainment and Distraction in LLMs
Jingcheng Niu
Xingdi Yuan
Tong Wang
Hamidreza Saghir
Amir H. Abdi
73
0
0
14 May 2025
Memorization-Compression Cycles Improve Generalization
Memorization-Compression Cycles Improve Generalization
Fangyuan Yu
71
0
0
13 May 2025
On the generalization of language models from in-context learning and finetuning: a controlled study
On the generalization of language models from in-context learning and finetuning: a controlled study
Andrew Kyle Lampinen
Arslan Chaudhry
Stephanie Chan
Cody Wild
Diane Wan
Alex Ku
Jorg Bornschein
Razvan Pascanu
Murray Shanahan
James L. McClelland
167
5
0
01 May 2025
Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction
Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction
Vaishnavh Nagarajan
Chen Henry Wu
Charles Ding
Aditi Raghunathan
119
0
0
21 Apr 2025
Large Language and Reasoning Models are Shallow Disjunctive Reasoners
Large Language and Reasoning Models are Shallow Disjunctive Reasoners
Irtaza Khalid
Amir Masoud Nourollah
Steven Schockaert
LRM
177
1
0
30 Mar 2025
Language Models, Graph Searching, and Supervision Adulteration: When More Supervision is Less and How to Make More More
Language Models, Graph Searching, and Supervision Adulteration: When More Supervision is Less and How to Make More More
Arvid Frydenlund
LRM
173
0
0
13 Mar 2025
Understanding the Logical Capabilities of Large Language Models via Out-of-Context Representation Learning
Jonathan Shaki
Emanuele La Malfa
Michael Wooldridge
Sarit Kraus
LRMReLM
141
0
0
13 Mar 2025
Training Plug-n-Play Knowledge Modules with Deep Context Distillation
Training Plug-n-Play Knowledge Modules with Deep Context Distillation
Lucas Caccia
Alan Ansell
Edoardo Ponti
Ivan Vulić
Alessandro Sordoni
SyDa
527
2
0
11 Mar 2025
Enhancing LLM Knowledge Learning through Generalization
Enhancing LLM Knowledge Learning through Generalization
Mingkang Zhu
Xi Chen
Ziyi Wang
Bei Yu
Hengshuang Zhao
Jiaya Jia
109
0
0
05 Mar 2025
What Makes the Preferred Thinking Direction for LLMs in Multiple-choice Questions?
What Makes the Preferred Thinking Direction for LLMs in Multiple-choice Questions?
Yizhe Zhang
Richard He Bai
Zijin Gu
Ruixiang Zhang
Jiatao Gu
Emmanuel Abbe
Samy Bengio
Navdeep Jaitly
BDLLRM
148
1
0
25 Feb 2025
Paradigms of AI Evaluation: Mapping Goals, Methodologies and Culture
Paradigms of AI Evaluation: Mapping Goals, Methodologies and Culture
John Burden
Marko Tesic
Lorenzo Pacchiardi
José Hernández-Orallo
75
1
0
21 Feb 2025
Large Language Diffusion Models
Large Language Diffusion Models
Shen Nie
Fengqi Zhu
Zebin You
Xiaolu Zhang
Jingyang Ou
Jun Hu
Jun Zhou
Yankai Lin
Ji-Rong Wen
Chongxuan Li
263
55
0
14 Feb 2025
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding
Mo Yu
Lemao Liu
J. Wu
Tsz Ting Chung
Shunchi Zhang
JiangNan Li
Dit-Yan Yeung
Jie Zhou
216
2
0
13 Feb 2025
Time-Reversal Provides Unsupervised Feedback to LLMs
Time-Reversal Provides Unsupervised Feedback to LLMs
Yerram Varun
Rahul Madhavan
Sravanti Addepalli
A. Suggala
Karthikeyan Shanmugam
Prateek Jain
LRMSyDa
90
0
0
03 Dec 2024
Sneaking Syntax into Transformer Language Models with Tree Regularization
Sneaking Syntax into Transformer Language Models with Tree Regularization
Ananjan Nandi
Christopher D. Manning
Shikhar Murty
148
0
0
28 Nov 2024
Enhancing Character-Level Understanding in LLMs through Token Internal Structure Learning
Enhancing Character-Level Understanding in LLMs through Token Internal Structure Learning
Zhu Xu
Zhiqiang Zhao
Zihan Zhang
Yuchi Liu
Quanwei Shen
Fei Liu
Yu Kuang
Jian He
Conglin Liu
178
2
0
26 Nov 2024
Constraint Back-translation Improves Complex Instruction Following of Large Language Models
Constraint Back-translation Improves Complex Instruction Following of Large Language Models
Yunjia Qi
Hao Peng
Xinyu Wang
Bin Xu
Lei Hou
Juanzi Li
106
4
0
31 Oct 2024
Next-Token Prediction Task Assumes Optimal Data Ordering for LLM Training in Proof Generation
Next-Token Prediction Task Assumes Optimal Data Ordering for LLM Training in Proof Generation
Chenyang An
Shima Imani
Feng Yao
Chengyu Dong
Ali Abbasi
...
Samuel Buss
Jingbo Shang
Gayathri Mahalingam
Pramod Sharma
Maurice Diesendruck
LRM
64
1
0
30 Oct 2024
Take Caution in Using LLMs as Human Surrogates: Scylla Ex Machina
Take Caution in Using LLMs as Human Surrogates: Scylla Ex Machina
Yuan Gao
Dokyun Lee
Gordon Burtch
Sina Fazelpour
LRM
184
14
0
25 Oct 2024
Scaling up Masked Diffusion Models on Text
Scaling up Masked Diffusion Models on Text
Shen Nie
Fengqi Zhu
Chao Du
Tianyu Pang
Qian Liu
Guangtao Zeng
Min Lin
Chongxuan Li
AI4CE
206
30
0
24 Oct 2024
LLM The Genius Paradox: A Linguistic and Math Expert's Struggle with Simple Word-based Counting Problems
LLM The Genius Paradox: A Linguistic and Math Expert's Struggle with Simple Word-based Counting Problems
Nan Xu
Xuezhe Ma
LRM
153
5
0
18 Oct 2024
The Mystery of the Pathological Path-star Task for Language Models
The Mystery of the Pathological Path-star Task for Language Models
Arvid Frydenlund
LRM
125
4
0
17 Oct 2024
Reverse Modeling in Large Language Models
Reverse Modeling in Large Language Models
S. Yu
Yuanchen Xu
Cunxiao Du
Yanying Zhou
Minghui Qiu
Q. Sun
Hao Zhang
Jiawei Wu
157
2
0
13 Oct 2024
Retrieval Augmented Generation (RAG) and Beyond: A Comprehensive Survey
  on How to Make your LLMs use External Data More Wisely
Retrieval Augmented Generation (RAG) and Beyond: A Comprehensive Survey on How to Make your LLMs use External Data More Wisely
Siyun Zhao
Yuqing Yang
Zilong Wang
Zhiyuan He
Luna Qiu
Lili Qiu
SyDaRALM3DV
113
42
0
23 Sep 2024
Co-occurrence is not Factual Association in Language Models
Co-occurrence is not Factual Association in Language Models
Xiao Zhang
Miao Li
Ji Wu
KELM
172
4
0
21 Sep 2024
Revisiting the Graph Reasoning Ability of Large Language Models: Case Studies in Translation, Connectivity and Shortest Path
Revisiting the Graph Reasoning Ability of Large Language Models: Case Studies in Translation, Connectivity and Shortest Path
Xinnan Dai
Qihao Wen
Yifei Shen
Hongzhi Wen
Dongsheng Li
Jiliang Tang
Caihua Shan
LRM
117
4
0
18 Aug 2024
Does Refusal Training in LLMs Generalize to the Past Tense?
Does Refusal Training in LLMs Generalize to the Past Tense?
Maksym Andriushchenko
Nicolas Flammarion
132
36
0
16 Jul 2024
Teaching Transformers Causal Reasoning through Axiomatic Training
Teaching Transformers Causal Reasoning through Axiomatic Training
Aniket Vashishtha
Abhinav Kumar
Abbavaram Gowtham Reddy
Vineeth N. Balasubramanian
Amit Sharma
Vineeth N Balasubramanian
Amit Sharma
140
4
0
10 Jul 2024
ConCodeEval: Evaluating Large Language Models for Code Constraints in Domain-Specific Languages
ConCodeEval: Evaluating Large Language Models for Code Constraints in Domain-Specific Languages
Mehant Kammakomati
Sameer Pimparkhede
Srikanth G. Tamilselvam
Praveen Venkateswaran
Pushpak Bhattacharyya
ALM
133
0
0
03 Jul 2024
Can Small Language Models Learn, Unlearn, and Retain Noise Patterns?
Can Small Language Models Learn, Unlearn, and Retain Noise Patterns?
Nicy Scaria
Silvester John Joseph Kennedy
Deepak N. Subramani
MU
115
2
0
01 Jul 2024
From Distributional to Overton Pluralism: Investigating Large Language Model Alignment
From Distributional to Overton Pluralism: Investigating Large Language Model Alignment
Thom Lake
Eunsol Choi
Greg Durrett
102
14
0
25 Jun 2024
Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models
Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models
Lynn Chua
Badih Ghazi
Yangsibo Huang
Pritish Kamath
Ravi Kumar
Pasin Manurangsi
Amer Sinha
Chulin Xie
Chiyuan Zhang
149
2
0
23 Jun 2024
Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data
Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data
Jingyang Ou
Shen Nie
Kaiwen Xue
Fengqi Zhu
Jiacheng Sun
Zhenguo Li
Chongxuan Li
DiffM
154
54
0
06 Jun 2024
Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models
Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models
Marianna Nezhurina
Lucia Cipolina-Kun
Mehdi Cherti
J. Jitsev
LLMAGLRMELMReLM
185
37
0
04 Jun 2024
Knowledge Circuits in Pretrained Transformers
Knowledge Circuits in Pretrained Transformers
Yunzhi Yao
Ningyu Zhang
Zekun Xi
Meng Wang
Ziwen Xu
Shumin Deng
Huajun Chen
KELM
180
25
0
28 May 2024
ReactXT: Understanding Molecular "Reaction-ship" via
  Reaction-Contextualized Molecule-Text Pretraining
ReactXT: Understanding Molecular "Reaction-ship" via Reaction-Contextualized Molecule-Text Pretraining
Zhiyuan Liu
Yaorui Shi
An Zhang
Changhao Nai
Enzhi Zhang
Xiang Wang
Kenji Kawaguchi
Tat-Seng Chua
84
10
0
23 May 2024
Can LLMs Solve longer Math Word Problems Better?
Can LLMs Solve longer Math Word Problems Better?
Xin Xu
Tong Xiao
Zitong Chao
Zhenya Huang
Can Yang
Yang Wang
166
14
0
23 May 2024
12
Next