Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2403.13799
Cited By
Reverse Training to Nurse the Reversal Curse
20 March 2024
O. Yu. Golovneva
Zeyuan Allen-Zhu
Jason Weston
Sainbayar Sukhbaatar
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Reverse Training to Nurse the Reversal Curse"
33 / 33 papers shown
Title
On the generalization of language models from in-context learning and finetuning: a controlled study
Andrew Kyle Lampinen
Arslan Chaudhry
Stephanie Chan
Cody Wild
Diane Wan
Alex Ku
Jorg Bornschein
Razvan Pascanu
Murray Shanahan
James L. McClelland
144
5
0
01 May 2025
Reversal Blessing: Thinking Backward May Outpace Thinking Forward in Multi-choice Questions
Yizhe Zhang
Richard He Bai
Zijin Gu
Ruixiang Zhang
Jiatao Gu
Emmanuel Abbe
Samy Bengio
Navdeep Jaitly
LRM
BDL
130
1
0
25 Feb 2025
Constraint Back-translation Improves Complex Instruction Following of Large Language Models
Yunjia Qi
Hao Peng
Xinyu Wang
Bin Xu
Lei Hou
Juanzi Li
96
3
0
31 Oct 2024
Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning
Jiacheng Ye
Jiahui Gao
Shansan Gong
Lin Zheng
Xin Jiang
Zhiyu Li
Dianbo Sui
DiffM
LRM
146
23
0
18 Oct 2024
The Mystery of the Pathological Path-star Task for Language Models
Arvid Frydenlund
LRM
90
4
0
17 Oct 2024
Reverse Modeling in Large Language Models
S. Yu
Yuanchen Xu
Cunxiao Du
Yanying Zhou
Minghui Qiu
Q. Sun
Hao Zhang
Jiawei Wu
135
2
0
13 Oct 2024
Enhancing elusive clues in knowledge learning by contrasting attention of language models
Jian Gao
Xiao Zhang
Ji Wu
Miao Li
98
0
0
26 Sep 2024
Repetition Improves Language Model Embeddings
Jacob Mitchell Springer
Suhas Kotha
Daniel Fried
Graham Neubig
Aditi Raghunathan
73
32
0
23 Feb 2024
Punctuation Restoration Improves Structure Understanding Without Supervision
Junghyun Min
Minho Lee
Woochul Lee
Yeonsoo Lee
99
1
0
13 Feb 2024
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Zhihong Shao
Peiyi Wang
Qihao Zhu
Runxin Xu
Jun-Mei Song
...
Haowei Zhang
Mingchuan Zhang
Yiming Li
Yu-Huan Wu
Daya Guo
ReLM
LRM
125
1,119
0
05 Feb 2024
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling
Pratyush Maini
Skyler Seto
Richard He Bai
David Grangier
Yizhe Zhang
Navdeep Jaitly
SyDa
76
65
0
29 Jan 2024
Physics of Language Models: Part 3.2, Knowledge Manipulation
Zeyuan Allen-Zhu
Yuanzhi Li
KELM
54
98
0
25 Sep 2023
Physics of Language Models: Part 3.1, Knowledge Storage and Extraction
Zeyuan Allen-Zhu
Yuanzhi Li
KELM
108
152
0
25 Sep 2023
The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"
Lukas Berglund
Meg Tong
Max Kaufmann
Mikita Balesni
Asa Cooper Stickland
Tomasz Korbak
Owain Evans
LRM
110
273
0
21 Sep 2023
Language Modeling Is Compression
Grégoire Delétang
Anian Ruoss
Paul-Ambroise Duquenne
Elliot Catt
Tim Genewein
...
Wenliang Kevin Li
Matthew Aitchison
Laurent Orseau
Marcus Hutter
J. Veness
AI4CE
93
144
0
19 Sep 2023
Taken out of context: On measuring situational awareness in LLMs
Lukas Berglund
Asa Cooper Stickland
Mikita Balesni
Max Kaufmann
Meg Tong
Tomasz Korbak
Daniel Kokotajlo
Owain Evans
LLMAG
LRM
78
67
0
01 Sep 2023
Self-Alignment with Instruction Backtranslation
Xian Li
Ping Yu
Chunting Zhou
Timo Schick
Omer Levy
Luke Zettlemoyer
Jason Weston
M. Lewis
SyDa
63
133
0
11 Aug 2023
Meet in the Middle: A New Pre-training Paradigm
A. Nguyen
Nikos Karampatziakis
Weizhu Chen
38
21
0
13 Mar 2023
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALM
PILM
1.5K
13,247
0
27 Feb 2023
Efficient Training of Language Models to Fill in the Middle
Mohammad Bavarian
Heewoo Jun
Nikolas Tezak
John Schulman
C. McLeavey
Jerry Tworek
Mark Chen
70
195
0
28 Jul 2022
Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little
Koustuv Sinha
Robin Jia
Dieuwke Hupkes
J. Pineau
Adina Williams
Douwe Kiela
83
246
0
14 Apr 2021
FLERT: Document-Level Features for Named Entity Recognition
Stefan Schweter
Alan Akbik
61
111
0
13 Nov 2020
Measuring Massive Multitask Language Understanding
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
D. Song
Jacob Steinhardt
ELM
RALM
176
4,434
0
07 Sep 2020
Pre-training via Paraphrasing
M. Lewis
Marjan Ghazvininejad
Gargi Ghosh
Armen Aghajanyan
Sida I. Wang
Luke Zettlemoyer
AIMat
87
160
0
26 Jun 2020
PIQA: Reasoning about Physical Commonsense in Natural Language
Yonatan Bisk
Rowan Zellers
Ronan Le Bras
Jianfeng Gao
Yejin Choi
OOD
LRM
146
1,806
0
26 Nov 2019
XLNet: Generalized Autoregressive Pretraining for Language Understanding
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
AI4CE
230
8,433
0
19 Jun 2019
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
Christopher Clark
Kenton Lee
Ming-Wei Chang
Tom Kwiatkowski
Michael Collins
Kristina Toutanova
224
1,527
0
24 May 2019
Cross-lingual Language Model Pretraining
Guillaume Lample
Alexis Conneau
75
2,747
0
22 Jan 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
94,891
0
11 Oct 2018
Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering
Todor Mihaylov
Peter Clark
Tushar Khot
Ashish Sabharwal
113
1,537
0
08 Sep 2018
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge
Peter Clark
Isaac Cowhey
Oren Etzioni
Tushar Khot
Ashish Sabharwal
Carissa Schoenick
Oyvind Tafjord
ELM
RALM
LRM
158
2,610
0
14 Mar 2018
Neural Machine Translation of Rare Words with Subword Units
Rico Sennrich
Barry Haddow
Alexandra Birch
221
7,745
0
31 Aug 2015
Natural Language Processing (almost) from Scratch
R. Collobert
Jason Weston
Léon Bottou
Michael Karlen
Koray Kavukcuoglu
Pavel P. Kuksa
188
7,726
0
02 Mar 2011
1