2311.08745
Using Stochastic Gradient Descent to Smooth Nonconvex Functions: Analysis of Implicit Graduated Optimization with Optimal Noise Scheduling
15 November 2023
Naoki Sato
Hideaki Iiduka
Papers citing
"Using Stochastic Gradient Descent to Smooth Nonconvex Functions: Analysis of Implicit Graduated Optimization with Optimal Noise Scheduling"
36 / 36 papers shown
Title
On the Importance of Noise Scheduling for Diffusion Models
Ting Chen
DiffM
84
157
0
26 Jan 2023
A Generalist Framework for Panoptic Segmentation of Images and Videos
Ting Chen
Lala Li
Saurabh Saxena
Geoffrey E. Hinton
David J. Fleet
VGen
MLLM
62
103
0
12 Oct 2022
How to decay your learning rate
Aitor Lewkowycz
100
24
0
23 Mar 2021
ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks
Jungmin Kwon
Jeongseop Kim
Hyunseong Park
I. Choi
100
290
0
23 Feb 2021
Score-Based Generative Modeling through Stochastic Differential Equations
Yang Song
Jascha Narain Sohl-Dickstein
Diederik P. Kingma
Abhishek Kumar
Stefano Ermon
Ben Poole
DiffM
SyDa
353
6,566
0
26 Nov 2020
Denoising Diffusion Implicit Models
Jiaming Song
Chenlin Meng
Stefano Ermon
VLM
DiffM
289
7,469
0
06 Oct 2020
Outlier-Robust Estimation: Hardness, Minimally Tuned Algorithms, and Applications
Pasquale Antonante
Vasileios Tzoumas
Heng Yang
Luca Carlone
66
55
0
29 Jul 2020
Improved Techniques for Training Score-Based Generative Models
Yang Song
Stefano Ermon
DiffM
260
1,163
0
16 Jun 2020
Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence
Nicolas Loizou
Sharan Vaswani
I. Laradji
Simon Lacoste-Julien
69
187
0
24 Feb 2020
Relative Flatness and Generalization
Henning Petzka
Michael Kamp
Linara Adilova
C. Sminchisescu
Mario Boley
78
78
0
03 Jan 2020
Graduated Non-Convexity for Robust Spatial Perception: From Non-Minimal Solvers to Global Outlier Rejection
Heng Yang
Pasquale Antonante
Vasileios Tzoumas
Luca Carlone
228
230
0
18 Sep 2019
Generative Modeling by Estimating Gradients of the Data Distribution
Yang Song
Stefano Ermon
SyDa
DiffM
258
3,956
0
12 Jul 2019
Convergence rates for the stochastic gradient descent method for non-convex objective functions
Benjamin J. Fehrman
Benjamin Gess
Arnulf Jentzen
85
101
0
02 Apr 2019
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
Yang You
Jing Li
Sashank J. Reddi
Jonathan Hseu
Sanjiv Kumar
Srinadh Bhojanapalli
Xiaodan Song
J. Demmel
Kurt Keutzer
Cho-Jui Hsieh
ODL
261
999
0
01 Apr 2019
sharpDARTS: Faster and More Accurate Differentiable Architecture Search
Andrew Hundt
Varun Jain
Gregory Hager
OOD
67
66
0
23 Mar 2019
A Sufficient Condition for Convergences of Adam and RMSProp
Fangyu Zou
Li Shen
Zequn Jie
Weizhong Zhang
Wei Liu
61
372
0
23 Nov 2018
On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization
Dongruo Zhou
Yiqi Tang
Yuan Cao
Ziyan Yang
Quanquan Gu
74
151
0
16 Aug 2018
On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization
Xiangyi Chen
Sijia Liu
Ruoyu Sun
Mingyi Hong
65
324
0
08 Aug 2018
Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks
Jinghui Chen
Dongruo Zhou
Yiqi Tang
Ziyan Yang
Yuan Cao
Quanquan Gu
ODL
82
193
0
18 Jun 2018
Visualizing the Loss Landscape of Neural Nets
Hao Li
Zheng Xu
Gavin Taylor
Christoph Studer
Tom Goldstein
258
1,898
0
28 Dec 2017
Receptive Field Block Net for Accurate and Fast Object Detection
Songtao Liu
Di Huang
Yunhong Wang
ObjD
75
1,267
0
21 Nov 2017
Don't Decay the Learning Rate, Increase the Batch Size
Samuel L. Smith
Pieter-Jan Kindermans
Chris Ying
Quoc V. Le
ODL
103
996
0
01 Nov 2017
Rethinking Atrous Convolution for Semantic Image Segmentation
Liang-Chieh Chen
George Papandreou
Florian Schroff
Hartwig Adam
SSeg
232
8,488
0
17 Jun 2017
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
3DH
128
3,685
0
08 Jun 2017
Coupling Adaptive Batch Sizes with Learning Rates
Lukas Balles
Javier Romero
Philipp Hennig
ODL
130
110
0
15 Dec 2016
Pyramid Scene Parsing Network
Hengshuang Zhao
Jianping Shi
Xiaojuan Qi
Xiaogang Wang
Jiaya Jia
VOS
SSeg
665
12,033
0
04 Dec 2016
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
429
2,945
0
15 Sep 2016
SGDR: Stochastic Gradient Descent with Warm Restarts
I. Loshchilov
Frank Hutter
ODL
350
8,174
0
13 Aug 2016
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
Liang-Chieh Chen
George Papandreou
Iasonas Kokkinos
Kevin Patrick Murphy
Alan Yuille
SSeg
267
18,267
0
02 Jun 2016
Wide Residual Networks
Sergey Zagoruyko
N. Komodakis
353
8,000
0
23 May 2016
On Graduated Optimization for Stochastic Non-Convex Problems
Elad Hazan
Kfir Y. Levy
Shai Shalev-Shwartz
79
117
0
12 Mar 2015
Deep Unsupervised Learning using Nonequilibrium Thermodynamics
Jascha Narain Sohl-Dickstein
Eric A. Weiss
Niru Maheswaranathan
Surya Ganguli
SyDa
DiffM
312
7,016
0
12 Mar 2015
Escaping From Saddle Points --- Online Stochastic Gradient for Tensor Decomposition
Rong Ge
Furong Huang
Chi Jin
Yang Yuan
143
1,059
0
06 Mar 2015
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe
Christian Szegedy
OOD
465
43,341
0
11 Feb 2015
Hybrid Deterministic-Stochastic Methods for Data Fitting
M. Friedlander
Mark Schmidt
199
388
0
13 Apr 2011
Randomized Smoothing for Stochastic Optimization
John C. Duchi
Peter L. Bartlett
Martin J. Wainwright
106
288
0
22 Mar 2011