2505.23013
Cited By
Scalable Complexity Control Facilitates Reasoning Ability of LLMs
29 May 2025
Liangkai Hang
Junjie Yao
Zhiwei Bai
Tianyi Chen
Yang Chen
R. Diao
Hezhou Li
Pengxiao Lin
Zhiwei Wang
Cheng Xu
Zhongwang Zhang
Zhangchen Zhou
Zhiyu Li
Zehao Lin
Kai Chen
Feiyu Xiong
Y. Zhang
Weinan E
Hongkang Yang
Zhi-Qin John Xu
Author Contacts:
lizy@iaar.ac.cn
xiongfy@iaar.ac.cn
hongkang@alumni.princeton.edu
xuzhiqin@sjtu.edu.cn
LRM
Papers citing
"Scalable Complexity Control Facilitates Reasoning Ability of LLMs"
32 / 32 papers shown
Title
An overview of condensation phenomenon in deep learning
Zhi-Qin John Xu
Yaoyu Zhang
Zhangchen Zhou
AI4CE
66
2
0
13 Apr 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
...
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
ReLM
VLM
OffRL
AI4TS
LRM
373
1,692
0
22 Jan 2025
Complexity Control Facilitates Reasoning-Based Compositional Generalization in Transformers
Zhongwang Zhang
Pengxiao Lin
Zhiwei Wang
Yaoyu Zhang
Zhi-Qin John Xu
52
5
0
15 Jan 2025
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
E. Zelikman
Georges Harik
Yijia Shao
Varuna Jayasiri
Nick Haber
Noah D. Goodman
LLMAG
ReLM
LRM
108
142
0
14 Mar 2024
Do Large Language Models Latently Perform Multi-Hop Reasoning?
Sohee Yang
E. Gribovskaya
Nora Kassner
Mor Geva
Sebastian Riedel
ReLM
LRM
115
109
0
26 Feb 2024
Understanding Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation
Xinyi Wang
Alfonso Amayuelas
Kexun Zhang
Liangming Pan
Wenhu Chen
Wenjie Wang
LRM
78
15
0
05 Feb 2024
Faith and Fate: Limits of Transformers on Compositionality
Nouha Dziri
Ximing Lu
Melanie Sclar
Xiang Lorraine Li
Liwei Jiang
...
Sean Welleck
Xiang Ren
Allyson Ettinger
Zaïd Harchaoui
Yejin Choi
ReLM
LRM
138
377
0
29 May 2023
Stochastic Modified Equations and Dynamics of Dropout Algorithm
Zhongwang Zhang
Yuqing Li
Yaoyu Zhang
Z. Xu
41
9
0
25 May 2023
Towards Revealing the Mystery behind Chain of Thought: A Theoretical Perspective
Guhao Feng
Bohang Zhang
Yuntian Gu
Haotian Ye
Di He
Liwei Wang
LRM
100
248
0
24 May 2023
Loss Spike in Training Neural Networks
Zhongwang Zhang
Z. Xu
63
7
0
20 May 2023
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
1.4K
14,359
0
15 Mar 2023
Phase Diagram of Initial Condensation for Two-layer Neural Networks
Zheng Chen
Yuqing Li
Yaoyu Zhang
Zhaoguang Zhou
Z. Xu
MLT
AI4CE
58
10
0
12 Mar 2023
Measuring and Narrowing the Compositionality Gap in Language Models
Ofir Press
Muru Zhang
Sewon Min
Ludwig Schmidt
Noah A. Smith
M. Lewis
ReLM
KELM
LRM
188
626
0
07 Oct 2022
Emergent Abilities of Large Language Models
Jason W. Wei
Yi Tay
Rishi Bommasani
Colin Raffel
Barret Zoph
...
Tatsunori Hashimoto
Oriol Vinyals
Percy Liang
J. Dean
W. Fedus
ELM
ReLM
LRM
279
2,480
0
15 Jun 2022
STaR: Bootstrapping Reasoning With Reasoning
E. Zelikman
Yuhuai Wu
Jesse Mu
Noah D. Goodman
ReLM
LRM
140
488
0
28 Mar 2022
DeepNet: Scaling Transformers to 1,000 Layers
Hongyu Wang
Shuming Ma
Li Dong
Shaohan Huang
Dongdong Zhang
Furu Wei
MoE
AI4CE
126
162
0
01 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
817
9,576
0
28 Jan 2022
Training Verifiers to Solve Math Word Problems
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
...
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
ReLM
OffRL
LRM
285
4,408
0
27 Oct 2021
Program Synthesis with Large Language Models
Jacob Austin
Augustus Odena
Maxwell Nye
Maarten Bosma
Henryk Michalewski
...
Ellen Jiang
Carrie J. Cai
Michael Terry
Quoc V. Le
Charles Sutton
ELM
AIMat
ReCod
ALM
195
1,986
0
16 Aug 2021
PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
Wei Zeng
Xiaozhe Ren
Teng Su
Hui Wang
Yi-Lun Liao
...
Gaojun Fan
Yaowei Wang
Xuefeng Jin
Qun Liu
Yonghong Tian
ALM
MoE
AI4CE
74
213
0
26 Apr 2021
oLMpics -- On what Language Model Pre-training Captures
Alon Talmor
Yanai Elazar
Yoav Goldberg
Jonathan Berant
LRM
103
304
0
31 Dec 2019
Machine Learning from a Continuous Viewpoint
Weinan E
Chao Ma
Lei Wu
109
104
0
30 Dec 2019
Gradient Dynamics of Shallow Univariate ReLU Networks
Francis Williams
Matthew Trager
Claudio Silva
Daniele Panozzo
Denis Zorin
Joan Bruna
62
80
0
18 Jun 2019
A Comparative Analysis of the Optimization and Generalization Property of Two-layer Neural Network and Random Feature Models Under Gradient Descent Dynamics
Weinan E
Chao Ma
Lei Wu
MLT
57
123
0
08 Apr 2019
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge
Alon Talmor
Jonathan Herzig
Nicholas Lourie
Jonathan Berant
RALM
140
1,733
0
02 Nov 2018
Mean Field Analysis of Neural Networks: A Central Limit Theorem
Justin A. Sirignano
K. Spiliopoulos
MLT
75
194
0
28 Aug 2018
A Mean Field View of the Landscape of Two-Layers Neural Networks
Song Mei
Andrea Montanari
Phan-Minh Nguyen
MLT
95
858
0
18 Apr 2018
Stronger generalization bounds for deep nets via a compression approach
Sanjeev Arora
Rong Ge
Behnam Neyshabur
Yi Zhang
MLT
AI4CE
86
642
0
14 Feb 2018
Size-Independent Sample Complexity of Neural Networks
Noah Golowich
Alexander Rakhlin
Ohad Shamir
154
547
0
18 Dec 2017
Spectrally-normalized margin bounds for neural networks
Peter L. Bartlett
Dylan J. Foster
Matus Telgarsky
ODL
205
1,220
0
26 Jun 2017
Norm-Based Capacity Control in Neural Networks
Behnam Neyshabur
Ryota Tomioka
Nathan Srebro
290
588
0
27 Feb 2015
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
VLM
326
18,625
0
06 Feb 2015