2410.05807
Extended convexity and smoothness and their applications in deep learning
8 October 2024
Binchuan Qi
Wei Gong
Li Li
Papers citing "Extended convexity and smoothness and their applications in deep learning"
32 / 32 papers shown
Title
Convex and Non-convex Optimization Under Generalized Smoothness
Haochuan Li
Jian Qian
Yi Tian
Alexander Rakhlin
Ali Jadbabaie
94
44
0
02 Jun 2023
Variance-reduced Clipping for Non-convex Optimization
Amirhossein Reisizadeh
Haochuan Li
Subhro Das
Ali Jadbabaie
96
29
0
02 Mar 2023
Robustness to Unbounded Smoothness of Generalized SignSGD
M. Crawshaw
Mingrui Liu
Francesco Orabona
Wei Zhang
Zhenxun Zhuang
AAML
110
74
0
23 Aug 2022
Improved Analysis of Clipping Algorithms for Non-convex Optimization
Bohang Zhang
Jikai Jin
Cong Fang
Liwei Wang
126
92
0
05 Oct 2020
Understanding Notions of Stationarity in Non-Smooth Optimization
Jiajin Li
Anthony Man-Cho So
Wing-Kin Ma
60
47
0
26 Jun 2020
Lower Bounds for Non-Convex Stochastic Optimization
Yossi Arjevani
Y. Carmon
John C. Duchi
Dylan J. Foster
Nathan Srebro
Blake E. Woodworth
124
362
0
05 Dec 2019
CARS: Continuous Evolution for Efficient Neural Architecture Search
Zhaohui Yang
Yunhe Wang
Xinghao Chen
Boxin Shi
Chao Xu
Chunjing Xu
Qi Tian
Chang Xu
122
231
0
11 Sep 2019
Why gradient clipping accelerates training: A theoretical justification for adaptivity
J.N. Zhang
Tianxing He
S. Sra
Ali Jadbabaie
88
469
0
28 May 2019
Momentum-Based Variance Reduction in Non-Convex SGD
Ashok Cutkosky
Francesco Orabona
ODL
110
410
0
24 May 2019
Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks
Sanjeev Arora
S. Du
Wei Hu
Zhiyuan Li
Ruosong Wang
MLT
234
974
0
24 Jan 2019
Reconciling modern machine learning practice and the bias-variance trade-off
M. Belkin
Daniel J. Hsu
Siyuan Ma
Soumik Mandal
305
1,665
0
28 Dec 2018
Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks
Difan Zou
Yuan Cao
Dongruo Zhou
Quanquan Gu
ODL
254
448
0
21 Nov 2018
Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers
Zeyuan Allen-Zhu
Yuanzhi Li
Yingyu Liang
MLT
235
775
0
12 Nov 2018
A Convergence Theory for Deep Learning via Over-Parameterization
Zeyuan Allen-Zhu
Yuanzhi Li
Zhao Song
AI4CE
ODL
306
1,470
0
09 Nov 2018
Gradient Descent Finds Global Minima of Deep Neural Networks
S. Du
Jason D. Lee
Haochuan Li
Liwei Wang
Xiyu Zhai
ODL
342
1,136
0
09 Nov 2018
Gradient Descent Provably Optimizes Over-parameterized Neural Networks
S. Du
Xiyu Zhai
Barnabás Póczós
Aarti Singh
MLT
ODL
284
1,276
0
04 Oct 2018
Mean Field Analysis of Neural Networks: A Central Limit Theorem
Justin A. Sirignano
K. Spiliopoulos
MLT
110
194
0
28 Aug 2018
Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data
Yuanzhi Li
Yingyu Liang
MLT
226
653
0
03 Aug 2018
SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path Integrated Differential Estimator
Cong Fang
C. J. Li
Zhouchen Lin
Tong Zhang
133
580
0
04 Jul 2018
Stochastic Nested Variance Reduction for Nonconvex Optimization
Dongruo Zhou
Pan Xu
Quanquan Gu
85
147
0
20 Jun 2018
Neural Tangent Kernel: Convergence and Generalization in Neural Networks
Arthur Jacot
Franck Gabriel
Clément Hongler
346
3,226
0
20 Jun 2018
On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport
Lénaïc Chizat
Francis R. Bach
OT
230
738
0
24 May 2018
A Mean Field View of the Landscape of Two-Layers Neural Networks
Song Mei
Andrea Montanari
Phan-Minh Nguyen
MLT
113
863
0
18 Apr 2018
Deep Neural Networks as Gaussian Processes
Jaehoon Lee
Yasaman Bahri
Roman Novak
S. Schoenholz
Jeffrey Pennington
Jascha Narain Sohl-Dickstein
UQCV
BDL
160
1,100
0
01 Nov 2017
Stochastic gradient descent performs variational inference, converges to limit cycles for deep networks
Pratik Chaudhari
Stefano Soatto
MLT
104
304
0
30 Oct 2017
SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient
Lam M. Nguyen
Jie Liu
K. Scheinberg
Martin Takáč
ODL
177
608
0
01 Mar 2017
Understanding deep learning requires rethinking generalization
Chiyuan Zhang
Samy Bengio
Moritz Hardt
Benjamin Recht
Oriol Vinyals
HAI
372
4,639
0
10 Nov 2016
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
559
2,947
0
15 Sep 2016
Optimization Methods for Large-Scale Machine Learning
Léon Bottou
Frank E. Curtis
J. Nocedal
297
3,229
0
15 Jun 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.4K
195,053
0
10 Dec 2015
Identifying and attacking the saddle point problem in high-dimensional non-convex optimization
Yann N. Dauphin
Razvan Pascanu
Çağlar Gülçehre
Kyunghyun Cho
Surya Ganguli
Yoshua Bengio
ODL
166
1,394
0
10 Jun 2014
Stochastic First- and Zeroth-order Methods for Nonconvex Stochastic Programming
Saeed Ghadimi
Guanghui Lan
ODL
149
1,562
0
22 Sep 2013