ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2410.05807
  4. Cited By
Extended convexity and smoothness and their applications in deep learning
v1v2v3 (latest)

Extended convexity and smoothness and their applications in deep learning

8 October 2024
Binchuan Qi
Wei Gong
Li Li
ArXiv (abs)PDFHTML

Papers citing "Extended convexity and smoothness and their applications in deep learning"

32 / 32 papers shown
Title
Convex and Non-convex Optimization Under Generalized Smoothness
Convex and Non-convex Optimization Under Generalized Smoothness
Haochuan Li
Jian Qian
Yi Tian
Alexander Rakhlin
Ali Jadbabaie
94
44
0
02 Jun 2023
Variance-reduced Clipping for Non-convex Optimization
Variance-reduced Clipping for Non-convex Optimization
Amirhossein Reisizadeh
Haochuan Li
Subhro Das
Ali Jadbabaie
96
29
0
02 Mar 2023
Robustness to Unbounded Smoothness of Generalized SignSGD
Robustness to Unbounded Smoothness of Generalized SignSGD
M. Crawshaw
Mingrui Liu
Francesco Orabona
Wei Zhang
Zhenxun Zhuang
AAML
110
74
0
23 Aug 2022
Improved Analysis of Clipping Algorithms for Non-convex Optimization
Improved Analysis of Clipping Algorithms for Non-convex Optimization
Bohang Zhang
Jikai Jin
Cong Fang
Liwei Wang
126
92
0
05 Oct 2020
Understanding Notions of Stationarity in Non-Smooth Optimization
Understanding Notions of Stationarity in Non-Smooth Optimization
Jiajin Li
Anthony Man-Cho So
Wing-Kin Ma
60
47
0
26 Jun 2020
Lower Bounds for Non-Convex Stochastic Optimization
Lower Bounds for Non-Convex Stochastic Optimization
Yossi Arjevani
Y. Carmon
John C. Duchi
Dylan J. Foster
Nathan Srebro
Blake E. Woodworth
124
362
0
05 Dec 2019
CARS: Continuous Evolution for Efficient Neural Architecture Search
CARS: Continuous Evolution for Efficient Neural Architecture Search
Zhaohui Yang
Yunhe Wang
Xinghao Chen
Boxin Shi
Chao Xu
Chunjing Xu
Qi Tian
Chang Xu
122
231
0
11 Sep 2019
Why gradient clipping accelerates training: A theoretical justification
  for adaptivity
Why gradient clipping accelerates training: A theoretical justification for adaptivity
J.N. Zhang
Tianxing He
S. Sra
Ali Jadbabaie
88
469
0
28 May 2019
Momentum-Based Variance Reduction in Non-Convex SGD
Momentum-Based Variance Reduction in Non-Convex SGD
Ashok Cutkosky
Francesco Orabona
ODL
110
410
0
24 May 2019
Fine-Grained Analysis of Optimization and Generalization for
  Overparameterized Two-Layer Neural Networks
Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks
Sanjeev Arora
S. Du
Wei Hu
Zhiyuan Li
Ruosong Wang
MLT
234
974
0
24 Jan 2019
Reconciling modern machine learning practice and the bias-variance
  trade-off
Reconciling modern machine learning practice and the bias-variance trade-off
M. Belkin
Daniel J. Hsu
Siyuan Ma
Soumik Mandal
305
1,665
0
28 Dec 2018
Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU
  Networks
Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks
Difan Zou
Yuan Cao
Dongruo Zhou
Quanquan Gu
ODL
254
448
0
21 Nov 2018
Learning and Generalization in Overparameterized Neural Networks, Going
  Beyond Two Layers
Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers
Zeyuan Allen-Zhu
Yuanzhi Li
Yingyu Liang
MLT
235
775
0
12 Nov 2018
A Convergence Theory for Deep Learning via Over-Parameterization
A Convergence Theory for Deep Learning via Over-Parameterization
Zeyuan Allen-Zhu
Yuanzhi Li
Zhao Song
AI4CEODL
306
1,470
0
09 Nov 2018
Gradient Descent Finds Global Minima of Deep Neural Networks
Gradient Descent Finds Global Minima of Deep Neural Networks
S. Du
Jason D. Lee
Haochuan Li
Liwei Wang
Masayoshi Tomizuka
ODL
342
1,136
0
09 Nov 2018
Gradient Descent Provably Optimizes Over-parameterized Neural Networks
Gradient Descent Provably Optimizes Over-parameterized Neural Networks
S. Du
Xiyu Zhai
Barnabás Póczós
Aarti Singh
MLTODL
284
1,276
0
04 Oct 2018
Mean Field Analysis of Neural Networks: A Central Limit Theorem
Mean Field Analysis of Neural Networks: A Central Limit Theorem
Justin A. Sirignano
K. Spiliopoulos
MLT
110
194
0
28 Aug 2018
Learning Overparameterized Neural Networks via Stochastic Gradient
  Descent on Structured Data
Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data
Yuanzhi Li
Yingyu Liang
MLT
226
653
0
03 Aug 2018
SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path
  Integrated Differential Estimator
SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path Integrated Differential Estimator
Cong Fang
C. J. Li
Zhouchen Lin
Tong Zhang
133
580
0
04 Jul 2018
Stochastic Nested Variance Reduction for Nonconvex Optimization
Stochastic Nested Variance Reduction for Nonconvex Optimization
Dongruo Zhou
Pan Xu
Quanquan Gu
85
147
0
20 Jun 2018
Neural Tangent Kernel: Convergence and Generalization in Neural Networks
Neural Tangent Kernel: Convergence and Generalization in Neural Networks
Arthur Jacot
Franck Gabriel
Clément Hongler
346
3,226
0
20 Jun 2018
On the Global Convergence of Gradient Descent for Over-parameterized
  Models using Optimal Transport
On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport
Lénaïc Chizat
Francis R. Bach
OT
230
738
0
24 May 2018
A Mean Field View of the Landscape of Two-Layers Neural Networks
A Mean Field View of the Landscape of Two-Layers Neural Networks
Song Mei
Andrea Montanari
Phan-Minh Nguyen
MLT
113
863
0
18 Apr 2018
Deep Neural Networks as Gaussian Processes
Deep Neural Networks as Gaussian Processes
Jaehoon Lee
Yasaman Bahri
Roman Novak
S. Schoenholz
Jeffrey Pennington
Jascha Narain Sohl-Dickstein
UQCVBDL
160
1,100
0
01 Nov 2017
Stochastic gradient descent performs variational inference, converges to
  limit cycles for deep networks
Stochastic gradient descent performs variational inference, converges to limit cycles for deep networks
Pratik Chaudhari
Stefano Soatto
MLT
104
304
0
30 Oct 2017
SARAH: A Novel Method for Machine Learning Problems Using Stochastic
  Recursive Gradient
SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient
Lam M. Nguyen
Jie Liu
K. Scheinberg
Martin Takáč
ODL
177
608
0
01 Mar 2017
Understanding deep learning requires rethinking generalization
Understanding deep learning requires rethinking generalization
Chiyuan Zhang
Samy Bengio
Moritz Hardt
Benjamin Recht
Oriol Vinyals
HAI
372
4,639
0
10 Nov 2016
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp
  Minima
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
559
2,947
0
15 Sep 2016
Optimization Methods for Large-Scale Machine Learning
Optimization Methods for Large-Scale Machine Learning
Léon Bottou
Frank E. Curtis
J. Nocedal
297
3,229
0
15 Jun 2016
Deep Residual Learning for Image Recognition
Deep Residual Learning for Image Recognition
Kaiming He
Xinming Zhang
Shaoqing Ren
Jian Sun
MedIm
2.4K
195,053
0
10 Dec 2015
Identifying and attacking the saddle point problem in high-dimensional
  non-convex optimization
Identifying and attacking the saddle point problem in high-dimensional non-convex optimization
Yann N. Dauphin
Razvan Pascanu
Çağlar Gülçehre
Kyunghyun Cho
Surya Ganguli
Yoshua Bengio
ODL
166
1,394
0
10 Jun 2014
Stochastic First- and Zeroth-order Methods for Nonconvex Stochastic
  Programming
Stochastic First- and Zeroth-order Methods for Nonconvex Stochastic Programming
Saeed Ghadimi
Guanghui Lan
ODL
149
1,562
0
22 Sep 2013
1