arXiv: 2310.05452 (v2, latest)
Parrot Mind: Towards Explaining the Complex Task Reasoning of Pretrained Large Language Models with Template-Content Structure
9 October 2023
Haotong Yang, Fanxu Meng, Zhouchen Lin, Muhan Zhang
Topics: LRM
Papers citing
"Parrot Mind: Towards Explaining the Complex Task Reasoning of Pretrained Large Language Models with Template-Content Structure"
30 / 30 papers shown
Title | Authors | Topics | Metrics | Date

Number Cookbook: Number Understanding of Language Models and How to Improve It | Haotong Yang, Yi Hu, Shijia Kang, Zhouchen Lin, Muhan Zhang | LRM | 93 / 8 / 0 | 06 Nov 2024
Understanding Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation | Xinyi Wang, Alfonso Amayuelas, Kexun Zhang, Liangming Pan, Wenhu Chen, Wenjie Wang | LRM | 82 / 15 / 0 | 05 Feb 2024
Skill-Mix: a Flexible and Expandable Family of Evaluations for AI models | Dingli Yu, Simran Kaur, Arushi Gupta, Jonah Brown-Cohen, Anirudh Goyal, Sanjeev Arora | ALM, LLMAG | 74 / 47 / 0 | 26 Oct 2023
Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models | Buse Giledereli, Jiaoda Li, Yu Fei, Alessandro Stolfo, Wangchunshu Zhou, Guangtao Zeng, Antoine Bosselut, Mrinmaya Sachan | LRM | 116 / 47 / 0 | 23 Oct 2023
The Expressive Power of Transformers with Chain of Thought | William Merrill, Ashish Sabharwal | LRM, AI4CE, ReLM | 85 / 41 / 0 | 11 Oct 2023
Causal Parrots: Large Language Models May Talk Causality But Are Not Causal | Matej Zečević, Moritz Willig, Devendra Singh Dhami, Kristian Kersting | LRM | 72 / 120 / 0 | 24 Aug 2023
A Theory for Emergence of Complex Skills in Language Models | Sanjeev Arora, Anirudh Goyal | LRM | 89 / 87 / 0 | 29 Jul 2023
Evaluating the Logical Reasoning Ability of ChatGPT and GPT-4 | Hanmeng Liu, Ruoxi Ning, Zhiyang Teng, Jian Liu, Qiji Zhou, Yuexin Zhang | ELM, ReLM, LRM | 106 / 258 / 0 | 07 Apr 2023
A Brief Survey on the Approximation Theory for Sequence Modelling | Hao Jiang, Qianxiao Li, Zhong Li, Shida Wang | AI4TS | 77 / 12 / 0 | 27 Feb 2023
What learning algorithm is in-context learning? Investigations with linear models | Ekin Akyürek, Dale Schuurmans, Jacob Andreas, Tengyu Ma, Denny Zhou | | 119 / 493 / 0 | 28 Nov 2022
Text and Patterns: For Effective Chain of Thought, It Takes Two to Tango | Aman Madaan, Amir Yazdanbakhsh | LRM | 220 / 121 / 0 | 16 Sep 2022
Your Transformer May Not be as Powerful as You Expect | Shengjie Luo, Shanda Li, Shuxin Zheng, Tie-Yan Liu, Liwei Wang, Di He | | 127 / 54 / 0 | 26 May 2022
OPT: Open Pre-trained Transformer Language Models | Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, ..., Daniel Simig, Punit Singh Koura, Anjali Sridhar, Tianlu Wang, Luke Zettlemoyer | VLM, OSLM, AI4CE | 373 / 3,700 / 0 | 02 May 2022
Training language models to follow instructions with human feedback | Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe | OSLM, ALM | 897 / 13,228 / 0 | 04 Mar 2022
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? | Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, M. Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer | LLMAG, LRM | 193 / 1,501 / 0 | 25 Feb 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | Jason W. Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, F. Xia, Ed H. Chi, Quoc Le, Denny Zhou | LM&Ro, LRM, AI4CE, ReLM | 856 / 9,714 / 0 | 28 Jan 2022
An Explanation of In-context Learning as Implicit Bayesian Inference | Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma | ReLM, BDL, VPVLM, LRM | 227 / 764 / 0 | 03 Nov 2021
Training Verifiers to Solve Math Word Problems | K. Cobbe, V. Kosaraju, Mohammad Bavarian, Mark Chen, Heewoo Jun, ..., Jerry Tworek, Jacob Hilton, Reiichiro Nakano, Christopher Hesse, John Schulman | ReLM, OffRL, LRM | 367 / 4,598 / 0 | 27 Oct 2021
An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA | Zhengyuan Yang, Zhe Gan, Jianfeng Wang, Xiaowei Hu, Yumao Lu, Zicheng Liu, Lijuan Wang | | 256 / 422 / 0 | 10 Sep 2021
Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning | Armen Aghajanyan, Luke Zettlemoyer, Sonal Gupta | | 110 / 571 / 1 | 22 Dec 2020
Language Models are Few-Shot Learners | Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, ..., Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei | BDL | 905 / 42,520 / 0 | 28 May 2020
Are Transformers universal approximators of sequence-to-sequence functions? | Chulhee Yun, Srinadh Bhojanapalli, A. S. Rawat, Sashank J. Reddi, Sanjiv Kumar | | 128 / 359 / 0 | 20 Dec 2019
A Unified MRC Framework for Named Entity Recognition | Xiaoya Li, Jingrong Feng, Yuxian Meng, Qinghong Han, Leilei Gan, Jiwei Li | | 112 / 637 / 0 | 25 Oct 2019
Theoretical Limitations of Self-Attention in Neural Sequence Models | Michael Hahn | | 74 / 275 / 0 | 16 Jun 2019
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned | Elena Voita, David Talbot, F. Moiseev, Rico Sennrich, Ivan Titov | | 119 / 1,149 / 0 | 23 May 2019
EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks | Jason W. Wei, Kai Zou | | 119 / 1,964 / 0 | 31 Jan 2019
The Importance of Generation Order in Language Modeling | Nic Ford, Daniel Duckworth, Mohammad Norouzi, George E. Dahl | | 75 / 31 / 0 | 23 Aug 2018
Measuring the Intrinsic Dimension of Objective Landscapes | Chunyuan Li, Heerad Farkhoor, Rosanne Liu, J. Yosinski | | 89 / 416 / 0 | 24 Apr 2018
Deep contextualized word representations | Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer | NAI | 235 / 11,569 / 0 | 15 Feb 2018
Named Entity Recognition with Bidirectional LSTM-CNNs | Jason P. C. Chiu, Eric Nichols | | 91 / 1,901 / 0 | 26 Nov 2015