1810.12065
Cited By
On the Convergence Rate of Training Recurrent Neural Networks
29 October 2018
Zeyuan Allen-Zhu
Yuanzhi Li
Zhao Song
Papers citing
"On the Convergence Rate of Training Recurrent Neural Networks"
50 / 128 papers shown
Title
Scaling Law Phenomena Across Regression Paradigms: Multiple and Kernel Approaches
Yifang Chen
Xuyang Guo
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao Song
73
3
0
03 Mar 2025
LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation
Can Jin
Ying Li
Mingyu Zhao
Shiyu Zhao
Zhenting Wang
Xiaoxiao He
Ligong Han
Tong Che
Dimitris N. Metaxas
VPVLM
VLM
129
1
0
02 Feb 2025
Generalization and Risk Bounds for Recurrent Neural Networks
Xuewei Cheng
Ke Huang
Shujie Ma
29
1
0
05 Nov 2024
SHAP values via sparse Fourier representation
Ali Gorji
Andisheh Amrollahi
A. Krause
FAtt
38
0
0
08 Oct 2024
Stochastic Gradient Descent for Two-layer Neural Networks
Dinghao Cao
Zheng-Chu Guo
Lei Shi
MLT
24
0
0
10 Jul 2024
Evaluating the design space of diffusion-based generative models
Yuqing Wang
Ye He
Molei Tao
DiffM
38
5
0
18 Jun 2024
Recurrent Natural Policy Gradient for POMDPs
Semih Cayci
A. Eryilmaz
32
0
0
28 May 2024
HiLo: Detailed and Robust 3D Clothed Human Reconstruction with High-and Low-Frequency Information of Parametric Models
Yifan Yang
Dong Liu
Shuhai Zhang
Zeshuai Deng
Zixiong Huang
Mingkui Tan
3DH
29
8
0
07 Apr 2024
CAM-Based Methods Can See through Walls
Magamed Taimeskhanov
R. Sicre
Damien Garreau
23
1
0
02 Apr 2024
Convergence of Gradient Descent for Recurrent Neural Networks: A Nonasymptotic Analysis
Semih Cayci
A. Eryilmaz
26
3
0
19 Feb 2024
LoRA Training in the NTK Regime has No Spurious Local Minima
Uijeong Jang
Jason D. Lee
Ernest K. Ryu
44
14
0
19 Feb 2024
Convergence Analysis for Learning Orthonormal Deep Linear Neural Networks
Zhen Qin
Xuwei Tan
Zhihui Zhu
34
0
0
24 Nov 2023
On the Convergence of Encoder-only Shallow Transformers
Yongtao Wu
Fanghui Liu
Grigorios G. Chrysos
V. Cevher
50
5
0
02 Nov 2023
An Automatic Learning Rate Schedule Algorithm for Achieving Faster Convergence and Steeper Descent
Zhao Song
Chiwun Yang
29
9
0
17 Oct 2023
How many Neurons do we need? A refined Analysis for Shallow Networks trained with Gradient Descent
Mike Nguyen
Nicole Mücke
MLT
27
5
0
14 Sep 2023
Is Solving Graph Neural Tangent Kernel Equivalent to Training Graph Neural Network?
Lianke Qin
Zhao Song
Baocheng Sun
25
7
0
14 Sep 2023
A Fast Optimization View: Reformulating Single Layer Attention in LLM Based on Tensor and SVM Trick, and Solving It in Matrix Multiplication Time
Yeqi Gao
Zhao Song
Weixin Wang
Junze Yin
24
25
0
14 Sep 2023
Six Lectures on Linearized Neural Networks
Theodor Misiakiewicz
Andrea Montanari
44
12
0
25 Aug 2023
How to Protect Copyright Data in Optimization of Large Language Models?
T. Chu
Zhao Song
Chiwun Yang
40
29
0
23 Aug 2023
Convergence of Two-Layer Regression with Nonlinear Units
Yichuan Deng
Zhao Song
Shenghao Xie
29
7
0
16 Aug 2023
Dynamic Analysis and an Eigen Initializer for Recurrent Neural Networks
Ran Dou
José C. Príncipe
33
2
0
28 Jul 2023
Equitable Time-Varying Pricing Tariff Design: A Joint Learning and Optimization Approach
Liudong Chen
Bolun Xu
18
0
0
26 Jul 2023
Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification
Lianke Qin
Zhao Song
Yuanyuan Yang
25
9
0
13 Jul 2023
Efficient Uncertainty Quantification and Reduction for Over-Parameterized Neural Networks
Ziyi Huang
H. Lam
Haofeng Zhang
UQCV
26
4
0
09 Jun 2023
InfoPrompt: Information-Theoretic Soft Prompt Tuning for Natural Language Understanding
Junda Wu
Tong Yu
Rui Wang
Zhao Song
Ruiyi Zhang
Handong Zhao
Chaochao Lu
Shuai Li
Ricardo Henao
VLM
39
23
0
08 Jun 2023
Query Complexity of Active Learning for Function Family With Nearly Orthogonal Basis
Xiangyi Chen
Zhao Song
Baochen Sun
Junze Yin
Danyang Zhuo
42
3
0
06 Jun 2023
A Scalable Walsh-Hadamard Regularizer to Overcome the Low-degree Spectral Bias of Neural Networks
Ali Gorji
Andisheh Amrollahi
A. Krause
19
4
0
16 May 2023
Efficient Asynchronize Stochastic Gradient Algorithm with Structured Data
Zhao Song
Mingquan Ye
27
4
0
13 May 2023
On the Eigenvalue Decay Rates of a Class of Neural-Network Related Kernel Functions Defined on General Domains
Yicheng Li
Zixiong Yu
Y. Cotronis
Qian Lin
55
13
0
04 May 2023
An Iterative Algorithm for Rescaled Hyperbolic Functions Regression
Yeqi Gao
Zhao Song
Junze Yin
31
33
0
01 May 2023
Attention Scheme Inspired Softmax Regression
Yichuan Deng
Zhihang Li
Zhao Song
44
42
0
20 Apr 2023
An Over-parameterized Exponential Regression
Yeqi Gao
Sridhar Mahadevan
Zhao Song
16
36
0
29 Mar 2023
A Brief Survey on the Approximation Theory for Sequence Modelling
Hao Jiang
Qianxiao Li
Zhong Li
Shida Wang
AI4TS
30
12
0
27 Feb 2023
An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models
Yufeng Zhang
Boyi Liu
Qi Cai
Lingxiao Wang
Zhaoran Wang
53
11
0
30 Dec 2022
Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing
Josh Alman
Jiehao Liang
Zhao Song
Ruizhe Zhang
Danyang Zhuo
77
31
0
25 Nov 2022
Linear RNNs Provably Learn Linear Dynamic Systems
Lifu Wang
Tianyu Wang
Shengwei Yi
Bo Shen
Bo Hu
Xing Cao
22
0
0
19 Nov 2022
Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets
Edo Cohen-Karlik
Itamar Menuhin-Gruman
Raja Giryes
Nadav Cohen
Amir Globerson
27
4
0
25 Oct 2022
Global Convergence of SGD On Two Layer Neural Nets
Pulkit Gopalani
Anirbit Mukherjee
26
5
0
20 Oct 2022
On Scrambling Phenomena for Randomly Initialized Recurrent Networks
Vaggos Chatziafratis
Ioannis Panageas
Clayton Sanford
S. Stavroulakis
30
2
0
11 Oct 2022
A Sublinear Adversarial Training Algorithm
Yeqi Gao
Lianke Qin
Zhao Song
Yitan Wang
GAN
36
25
0
10 Aug 2022
Training Overparametrized Neural Networks in Sublinear Time
Yichuan Deng
Han Hu
Zhao Song
Omri Weinstein
Danyang Zhuo
BDL
30
28
0
09 Aug 2022
Federated Adversarial Learning: A Framework with Convergence Analysis
Xiaoxiao Li
Zhao Song
Jiaming Yang
FedML
27
19
0
07 Aug 2022
Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis
Alexander Munteanu
Simon Omlor
Zhao Song
David P. Woodruff
33
15
0
26 Jun 2022
Global Convergence of Over-parameterized Deep Equilibrium Models
Zenan Ling
Xingyu Xie
Qiuhao Wang
Zongpeng Zhang
Zhouchen Lin
32
12
0
27 May 2022
The Mechanism of Prediction Head in Non-contrastive Self-supervised Learning
Zixin Wen
Yuanzhi Li
SSL
32
34
0
12 May 2022
Real-time Forecasting of Time Series in Financial Markets Using Sequentially Trained Many-to-one LSTMs
Kelum Gajamannage
Yonggi Park
AI4TS
AIFin
11
4
0
10 May 2022
Spectrum of inner-product kernel matrices in the polynomial regime and multiple descent phenomenon in kernel ridge regression
Theodor Misiakiewicz
21
39
0
21 Apr 2022
Implicit Bias of MSE Gradient Optimization in Underparameterized Neural Networks
Benjamin Bowman
Guido Montúfar
28
11
0
12 Jan 2022
Training Multi-Layer Over-Parametrized Neural Network in Subquadratic Time
Zhao Song
Licheng Zhang
Ruizhe Zhang
32
64
0
14 Dec 2021
Fast Graph Neural Tangent Kernel via Kronecker Sketching
Shunhua Jiang
Yunze Man
Zhao Song
Zheng Yu
Danyang Zhuo
29
6
0
04 Dec 2021