Using Stochastic Gradient Descent to Smooth Nonconvex Functions: Analysis of Implicit Graduated Optimization with Optimal Noise Scheduling

15 November 2023

Papers citing "Using Stochastic Gradient Descent to Smooth Nonconvex Functions: Analysis of Implicit Graduated Optimization with Optimal Noise Scheduling"

36 / 36 papers shown

Title
On the Importance of Noise Scheduling for Diffusion Models Ting Chen DiffM 84 157 0 26 Jan 2023
A Generalist Framework for Panoptic Segmentation of Images and Videos Ting Chen Lala Li Saurabh Saxena Geoffrey E. Hinton David J. Fleet VGen MLLM 62 103 0 12 Oct 2022
How to decay your learning rate Aitor Lewkowycz 100 24 0 23 Mar 2021
ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks Jungmin Kwon Jeongseop Kim Hyunseong Park I. Choi 100 290 0 23 Feb 2021
Score-Based Generative Modeling through Stochastic Differential Equations Yang Song Jascha Narain Sohl-Dickstein Diederik P. Kingma Abhishek Kumar Stefano Ermon Ben Poole DiffM SyDa 353 6,566 0 26 Nov 2020
Denoising Diffusion Implicit Models Jiaming Song Chenlin Meng Stefano Ermon VLM DiffM 289 7,469 0 06 Oct 2020
Outlier-Robust Estimation: Hardness, Minimally Tuned Algorithms, and Applications Pasquale Antonante Vasileios Tzoumas Heng Yang Luca Carlone 66 55 0 29 Jul 2020
Improved Techniques for Training Score-Based Generative Models Yang Song Stefano Ermon DiffM 260 1,163 0 16 Jun 2020
Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence Nicolas Loizou Sharan Vaswani I. Laradji Simon Lacoste-Julien 69 187 0 24 Feb 2020
Relative Flatness and Generalization Henning Petzka Michael Kamp Linara Adilova C. Sminchisescu Mario Boley 78 78 0 03 Jan 2020
Graduated Non-Convexity for Robust Spatial Perception: From Non-Minimal Solvers to Global Outlier Rejection Heng Yang Pasquale Antonante Vasileios Tzoumas Luca Carlone 228 230 0 18 Sep 2019
Generative Modeling by Estimating Gradients of the Data Distribution Yang Song Stefano Ermon SyDa DiffM 258 3,956 0 12 Jul 2019
Convergence rates for the stochastic gradient descent method for non-convex objective functions Benjamin J. Fehrman Benjamin Gess Arnulf Jentzen 85 101 0 02 Apr 2019
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes Yang You Jing Li Sashank J. Reddi Jonathan Hseu Sanjiv Kumar Srinadh Bhojanapalli Xiaodan Song J. Demmel Kurt Keutzer Cho-Jui Hsieh ODL 261 999 0 01 Apr 2019
sharpDARTS: Faster and More Accurate Differentiable Architecture Search Andrew Hundt Varun Jain Gregory Hager OOD 67 66 0 23 Mar 2019
A Sufficient Condition for Convergences of Adam and RMSProp Fangyu Zou Li Shen Zequn Jie Weizhong Zhang Wei Liu 61 372 0 23 Nov 2018
On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization Dongruo Zhou Yiqi Tang Yuan Cao Ziyan Yang Quanquan Gu 74 151 0 16 Aug 2018
On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization Xiangyi Chen Sijia Liu Ruoyu Sun Mingyi Hong 65 324 0 08 Aug 2018
Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks Jinghui Chen Dongruo Zhou Yiqi Tang Ziyan Yang Yuan Cao Quanquan Gu ODL 82 193 0 18 Jun 2018
Visualizing the Loss Landscape of Neural Nets Hao Li Zheng Xu Gavin Taylor Christoph Studer Tom Goldstein 258 1,898 0 28 Dec 2017
Receptive Field Block Net for Accurate and Fast Object Detection Songtao Liu Di Huang Yunhong Wang ObjD 75 1,267 0 21 Nov 2017
Don't Decay the Learning Rate, Increase the Batch Size Samuel L. Smith Pieter-Jan Kindermans Chris Ying Quoc V. Le ODL 103 996 0 01 Nov 2017
Rethinking Atrous Convolution for Semantic Image Segmentation Liang-Chieh Chen George Papandreou Florian Schroff Hartwig Adam SSeg 232 8,488 0 17 Jun 2017
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour Priya Goyal Piotr Dollár Ross B. Girshick P. Noordhuis Lukasz Wesolowski Aapo Kyrola Andrew Tulloch Yangqing Jia Kaiming He 3DH 128 3,685 0 08 Jun 2017
Coupling Adaptive Batch Sizes with Learning Rates Lukas Balles Javier Romero Philipp Hennig ODL 130 110 0 15 Dec 2016
Pyramid Scene Parsing Network Hengshuang Zhao Jianping Shi Xiaojuan Qi Xiaogang Wang Jiaya Jia VOS SSeg 665 12,033 0 04 Dec 2016
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima N. Keskar Dheevatsa Mudigere J. Nocedal M. Smelyanskiy P. T. P. Tang ODL 429 2,945 0 15 Sep 2016
SGDR: Stochastic Gradient Descent with Warm Restarts I. Loshchilov Frank Hutter ODL 350 8,174 0 13 Aug 2016
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs Liang-Chieh Chen George Papandreou Iasonas Kokkinos Kevin Patrick Murphy Alan Yuille SSeg 267 18,267 0 02 Jun 2016
Wide Residual Networks Sergey Zagoruyko N. Komodakis 353 8,000 0 23 May 2016
On Graduated Optimization for Stochastic Non-Convex Problems Elad Hazan Kfir Y. Levy Shai Shalev-Shwartz 79 117 0 12 Mar 2015
Deep Unsupervised Learning using Nonequilibrium Thermodynamics Jascha Narain Sohl-Dickstein Eric A. Weiss Niru Maheswaranathan Surya Ganguli SyDa DiffM 312 7,016 0 12 Mar 2015
Escaping From Saddle Points --- Online Stochastic Gradient for Tensor Decomposition Rong Ge Furong Huang Chi Jin Yang Yuan 143 1,059 0 06 Mar 2015
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift Sergey Ioffe Christian Szegedy OOD 465 43,341 0 11 Feb 2015
Hybrid Deterministic-Stochastic Methods for Data Fitting M. Friedlander Mark Schmidt 199 388 0 13 Apr 2011
Randomized Smoothing for Stochastic Optimization John C. Duchi Peter L. Bartlett Martin J. Wainwright 106 288 0 22 Mar 2011