Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2309.14402
Cited By
Physics of Language Models: Part 3.2, Knowledge Manipulation
25 September 2023
Zeyuan Allen-Zhu
Yuanzhi Li
KELM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Physics of Language Models: Part 3.2, Knowledge Manipulation"
50 / 70 papers shown
Title
Boosting Performance on ARC is a Matter of Perspective
Daniel Franzen
Jan Disselhoff
David Hartmann
RALM
LRM
49
0
0
08 May 2025
On the generalization of language models from in-context learning and finetuning: a controlled study
Andrew Kyle Lampinen
Arslan Chaudhry
Stephanie Chan
Cody Wild
Diane Wan
Alex Ku
Jorg Bornschein
Razvan Pascanu
Murray Shanahan
James L. McClelland
46
0
0
01 May 2025
Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction
Vaishnavh Nagarajan
Chen Henry Wu
Charles Ding
Aditi Raghunathan
36
0
0
21 Apr 2025
Is the Reversal Curse a Binding Problem? Uncovering Limitations of Transformers from a Basic Generalization Failure
Boshi Wang
Huan Sun
36
2
0
02 Apr 2025
CASCADE Your Datasets for Cross-Mode Knowledge Retrieval of Language Models
Runlong Zhou
Yi Zhang
RALM
61
0
0
02 Apr 2025
Self-Routing RAG: Binding Selective Retrieval with Knowledge Verbalization
Di Wu
Jia-Chen Gu
Kai-Wei Chang
Nanyun Peng
34
0
0
01 Apr 2025
Understanding the Logical Capabilities of Large Language Models via Out-of-Context Representation Learning
Jonathan Shaki
Emanuele La Malfa
Michael Wooldridge
Sarit Kraus
LRM
ReLM
69
0
0
13 Mar 2025
Implicit Reasoning in Transformers is Reasoning through Shortcuts
Tianhe Lin
Jian Xie
Siyu Yuan
Deqing Yang
ReLM
LRM
75
2
0
10 Mar 2025
Reversal Blessing: Thinking Backward May Outpace Thinking Forward in Multi-choice Questions
Yizhe Zhang
Richard He Bai
Zijin Gu
Ruixiang Zhang
Jiatao Gu
Emmanuel Abbe
Samy Bengio
Navdeep Jaitly
LRM
BDL
70
1
0
25 Feb 2025
How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training
Yixin Ou
Yunzhi Yao
N. Zhang
Hui Jin
Jiacheng Sun
Shumin Deng
Zechao Li
H. Chen
KELM
CLL
54
0
0
16 Feb 2025
Large Language Diffusion Models
Shen Nie
Fengqi Zhu
Zebin You
Xiaolu Zhang
Jingyang Ou
Jun Hu
Jun Zhou
Yankai Lin
Zhicheng Dou
Chongxuan Li
112
17
0
14 Feb 2025
Hallucination, Monofacts, and Miscalibration: An Empirical Investigation
Miranda Muqing Miao
Michael Kearns
67
0
0
11 Feb 2025
Investigating Compositional Reasoning in Time Series Foundation Models
Willa Potosnak
Cristian Challu
Mononito Goswami
Kin G. Olivares
Michał Wiliński
Nina Żukowska
Artur Dubrawski
ReLM
AI4TS
LRM
53
0
0
09 Feb 2025
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Tianzhe Chu
Yuexiang Zhai
Jihan Yang
Shengbang Tong
Saining Xie
Dale Schuurmans
Quoc V. Le
Sergey Levine
Yi Ma
OffRL
70
60
0
28 Jan 2025
Are Transformers Able to Reason by Connecting Separated Knowledge in Training Data?
Yutong Yin
Zhaoran Wang
LRM
ReLM
158
0
0
27 Jan 2025
Do Large Language Models Perform Latent Multi-Hop Reasoning without Exploiting Shortcuts?
Sohee Yang
Nora Kassner
E. Gribovskaya
Sebastian Riedel
Mor Geva
KELM
LRM
ReLM
78
4
0
25 Nov 2024
The Two-Hop Curse: LLMs trained on A
→
\rightarrow
→
B, B
→
\rightarrow
→
C fail to learn A
→
\rightarrow
→
C
Mikita Balesni
Tomek Korbak
Owain Evans
ReLM
LRM
81
0
0
25 Nov 2024
Continual Memorization of Factoids in Language Models
Howard Chen
Jiayi Geng
Adithya Bhaskar
Dan Friedman
Danqi Chen
KELM
56
0
0
11 Nov 2024
OGBench: Benchmarking Offline Goal-Conditioned RL
Seohong Park
Kevin Frans
Benjamin Eysenbach
Sergey Levine
OffRL
50
9
0
26 Oct 2024
Looking Inward: Language Models Can Learn About Themselves by Introspection
Felix J Binder
James Chua
Tomek Korbak
Henry Sleight
John Hughes
Robert Long
Ethan Perez
Miles Turpin
Owain Evans
KELM
AIFin
LRM
35
12
0
17 Oct 2024
Aligning with Logic: Measuring, Evaluating and Improving Logical Preference Consistency in Large Language Models
Yinhong Liu
Zhijiang Guo
Tianya Liang
Ehsan Shareghi
Ivan Vulić
Nigel Collier
128
0
0
03 Oct 2024
Wait, but Tylenol is Acetaminophen... Investigating and Improving Language Models' Ability to Resist Requests for Misinformation
Shan Chen
Mingye Gao
Kuleen Sasse
Thomas Hartvigsen
Brian Anthony
Lizhou Fan
Hugo J. W. L. Aerts
Jack Gallifant
Danielle S. Bitterman
LM&MA
33
0
0
30 Sep 2024
VLM's Eye Examination: Instruct and Inspect Visual Competency of Vision Language Models
Nam Hyeon-Woo
Moon Ye-Bin
Wonseok Choi
Lee Hyun
Tae-Hyun Oh
CoGe
28
3
0
23 Sep 2024
Co-occurrence is not Factual Association in Language Models
Xiao Zhang
Miao Li
Ji Wu
KELM
68
2
0
21 Sep 2024
Implicit Reasoning in Deep Time Series Forecasting
Willa Potosnak
Cristian Challu
Mononito Goswami
Michał Wiliński
Nina Żukowska
Artur Dubrawski
ReLM
AI4TS
LRM
40
2
0
17 Sep 2024
Synthetic continued pretraining
Zitong Yang
Neil Band
Shuangping Li
Emmanuel Candès
Tatsunori Hashimoto
CLL
SyDa
38
11
0
11 Sep 2024
Recent Advances in Attack and Defense Approaches of Large Language Models
Jing Cui
Yishi Xu
Zhewei Huang
Shuchang Zhou
Jianbin Jiao
Junge Zhang
PILM
AAML
57
1
0
05 Sep 2024
How Transformers Utilize Multi-Head Attention in In-Context Learning? A Case Study on Sparse Linear Regression
Xingwu Chen
Lei Zhao
Difan Zou
49
6
0
08 Aug 2024
Structure-aware Domain Knowledge Injection for Large Language Models
Kai-Chun Liu
Ze Chen
Zhihang Fu
Rongxin Jiang
Fan Zhou
Yao-Shen Chen
Yue-bo Wu
Yue Wu
Jieping Ye
52
1
0
23 Jul 2024
Empirical Capacity Model for Self-Attention Neural Networks
Aki Härmä
M. Pietrasik
Anna Wilbik
42
1
0
22 Jul 2024
Knowledge Mechanisms in Large Language Models: A Survey and Perspective
Meng Wang
Yunzhi Yao
Ziwen Xu
Shuofei Qiao
Shumin Deng
...
Yong-jia Jiang
Pengjun Xie
Fei Huang
Huajun Chen
Ningyu Zhang
55
28
0
22 Jul 2024
Representing Rule-based Chatbots with Transformers
Dan Friedman
Abhishek Panigrahi
Danqi Chen
66
1
0
15 Jul 2024
Changing Answer Order Can Decrease MMLU Accuracy
Vipul Gupta
David Pantoja
Candace Ross
Adina Williams
Megan Ung
64
22
0
27 Jun 2024
Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers
Yibo Jiang
Goutham Rajendran
Pradeep Ravikumar
Bryon Aragam
CLL
KELM
37
6
0
26 Jun 2024
Scaling Laws for Fact Memorization of Large Language Models
Xingyu Lu
Xiaonan Li
Qinyuan Cheng
Kai Ding
Xuanjing Huang
Xipeng Qiu
34
11
0
22 Jun 2024
How Do Large Language Models Acquire Factual Knowledge During Pretraining?
Hoyeon Chang
Jinho Park
Seonghyeon Ye
Sohee Yang
Youngkyung Seo
Du-Seong Chang
Minjoon Seo
KELM
37
32
0
17 Jun 2024
RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models
Zhuoran Jin
Pengfei Cao
Chenhao Wang
Zhitao He
Hongbang Yuan
Jiachun Li
Yubo Chen
Kang Liu
Jun Zhao
KELM
MU
42
12
0
16 Jun 2024
Limited Out-of-Context Knowledge Reasoning in Large Language Models
Peng Hu
Changjiang Gao
Ruiqi Gao
Jiajun Chen
Shujian Huang
LRM
37
3
0
11 Jun 2024
The Factorization Curse: Which Tokens You Predict Underlie the Reversal Curse and More
O. Kitouni
Niklas Nolte
Diane Bouchacourt
Adina Williams
Mike Rabbat
Mark Ibrahim
LRM
CLL
51
12
0
07 Jun 2024
Evaluating the External and Parametric Knowledge Fusion of Large Language Models
Hao Zhang
Yuyang Zhang
Xiaoguang Li
Wenxuan Shi
Haonan Xu
...
Yasheng Wang
Lifeng Shang
Qun Liu
Yong-jin Liu
Ruiming Tang
KELM
41
4
0
29 May 2024
CtrlA: Adaptive Retrieval-Augmented Generation via Probe-Guided Control
Huanshuo Liu
Hao Zhang
Zhijiang Guo
Kuicai Dong
Xiangyang Li
Yi Quan Lee
Cong Zhang
Yong-jin Liu
3DV
41
6
0
29 May 2024
Semantic are Beacons: A Semantic Perspective for Unveiling Parameter-Efficient Fine-Tuning in Knowledge Learning
Renzhi Wang
Piji Li
37
4
0
28 May 2024
Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization
Boshi Wang
Xiang Yue
Yu-Chuan Su
Huan Sun
LRM
29
41
0
23 May 2024
ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models
Siwei Wang
Yifei Shen
Shi Feng
Haoran Sun
Shang-Hua Teng
Wei Chen
35
4
0
15 May 2024
Towards a Theoretical Understanding of the 'Reversal Curse' via Training Dynamics
Hanlin Zhu
Baihe Huang
Shaolun Zhang
Michael I. Jordan
Jiantao Jiao
Yuandong Tian
Stuart Russell
LRM
AI4CE
52
13
0
07 May 2024
Are large language models superhuman chemists?
Adrian Mirza
Nawaf Alampara
Sreekanth Kunchapu
Benedict Emoekabu
Aswanth Krishnan
...
Leanne M. Stafast
Dinga Wonanke
Michael Pieler
P. Schwaller
Kevin Maik Jablonka
ELM
AI4MH
LRM
LM&MA
31
5
0
01 Apr 2024
Source-Aware Training Enables Knowledge Attribution in Language Models
Muhammad Khalifa
David Wadden
Emma Strubell
Honglak Lee
Lu Wang
Iz Beltagy
Hao Peng
HILM
42
14
0
01 Apr 2024
Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations
Jiaxing Sun
Weiquan Huang
Jiang Wu
Chenya Gu
Wei Li
Songyang Zhang
Hang Yan
Conghui He
LRM
52
5
0
21 Mar 2024
Reverse Training to Nurse the Reversal Curse
O. Yu. Golovneva
Zeyuan Allen-Zhu
Jason Weston
Sainbayar Sukhbaatar
33
33
0
20 Mar 2024
Beyond Memorization: The Challenge of Random Memory Access in Language Models
Tongyao Zhu
Qian Liu
Liang Pang
Zhengbao Jiang
Min-Yen Kan
Min-Bin Lin
KELM
37
6
0
12 Mar 2024
1
2
Next