Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent

15 October 2024
Bo Chen, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao-quan Song

Papers citing "Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent"

14 papers shown

Provable Failure of Language Models in Learning Majority Boolean Logic via Gradient Descent
Bo Chen, Zhenmei Shi, Zhao-quan Song, Jiahao Zhang
07 Apr 2025 · NAI, LRM, AI4CE

Theoretical Guarantees for High Order Trajectory Refinement in Generative Flows
Chengyue Gong, Xiaoyu Li, Yingyu Liang, Jiangxuan Long, Zhenmei Shi, Zhao-quan Song, Yu Tian
12 Mar 2025

Scaling Law Phenomena Across Regression Paradigms: Multiple and Kernel Approaches
Yifang Chen, Xuyang Guo, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao-quan Song
03 Mar 2025

When Can We Solve the Weighted Low Rank Approximation Problem in Truly Subquadratic Time?
Chenyang Li, Yingyu Liang, Zhenmei Shi, Zhao-quan Song
24 Feb 2025

Looped ReLU MLPs May Be All You Need as Practical Programmable Computers
Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao-quan Song, Yufa Zhou
21 Feb 2025

Enhancing Auto-regressive Chain-of-Thought through Loop-Aligned Reasoning
Qifan Yu, Zhenyu He, Sijie Li, Xun Zhou, Jun Zhang, Jingjing Xu, Di He
12 Feb 2025 · OffRL, LRM

Fast Gradient Computation for RoPE Attention in Almost Linear Time
Yifang Chen, Jiayan Huo, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao-quan Song
03 Jan 2025

Theoretical Constraints on the Expressive Power of $\mathsf{RoPE}$-based Tensor Attention Transformers
Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao-quan Song, Mingda Wan
23 Dec 2024

Advancing the Understanding of Fixed Point Iterations in Deep Neural Networks: A Detailed Analytical Study
Yekun Ke, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao-quan Song
15 Oct 2024

Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix
Yingyu Liang, Jiangxuan Long, Zhenmei Shi, Zhao-quan Song, Yufa Zhou
15 Oct 2024

HSR-Enhanced Sparse Attention Acceleration
Bo Chen, Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao-quan Song
14 Oct 2024

Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning?
Khashayar Gatmiry, Nikunj Saunshi, Sashank J. Reddi, Stefanie Jegelka, Sanjiv Kumar
10 Oct 2024

A Tighter Complexity Analysis of SparseGPT
Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao-quan Song
22 Aug 2024

The Closeness of In-Context Learning and Weight Shifting for Softmax Regression
Shuai Li, Zhao-quan Song, Yu Xia, Tong Yu, Tianyi Zhou
26 Apr 2023