Accelerating Minibatch Stochastic Gradient Descent using Typicality Sampling

11 March 2019

Papers citing "Accelerating Minibatch Stochastic Gradient Descent using Typicality Sampling"

Title | Authors | Tags | Counts | Date
Convergence, Sticking and Escape: Stochastic Dynamics Near Critical Points in SGD | Dmitry Dudukalov, Artem Logachov, Vladimir Lotov, Timofei Prasolov, Evgeny Prokopenko, Anton Tarasenko | - | 42 0 0 | 24 May 2025
On the Convergence of Adam and Beyond | Sashank J. Reddi, Satyen Kale, Sanjiv Kumar | - | 93 2,499 0 | 19 Apr 2019
Adaptive Gradient Methods with Dynamic Bound of Learning Rate | Liangchen Luo, Yuanhao Xiong, Yan Liu, Xu Sun | ODL | 77 602 0 | 26 Feb 2019
MentorNet: Learning Data-Driven Curriculum for Very Deep Neural Networks on Corrupted Labels | Lu Jiang, Zhengyuan Zhou, Thomas Leung, Li-Jia Li, Li Fei-Fei | NoLa | 98 1,453 0 | 14 Dec 2017
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour | Priya Goyal, Piotr Dollár, Ross B. Girshick, P. Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, Kaiming He | 3DH | 126 3,681 0 | 08 Jun 2017
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications | Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, M. Andreetto, Hartwig Adam | 3DH | 1.1K 20,837 0 | 17 Apr 2017
Learning What Data to Learn | Yang Fan, Fei Tian, Tao Qin, Jiang Bian, Tie-Yan Liu | - | 61 79 0 | 28 Feb 2017
Big Batch SGD: Automated Inference using Adaptive Batch Sizes | Soham De, A. Yadav, David Jacobs, Tom Goldstein | ODL | 128 62 0 | 18 Oct 2016
WaveNet: A Generative Model for Raw Audio | Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, A. Senior, Koray Kavukcuoglu | DiffM | 406 7,399 0 | 12 Sep 2016
Densely Connected Convolutional Networks | Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger | PINN, 3DV | 772 36,813 0 | 25 Aug 2016
Importance Sampling for Minibatches | Dominik Csiba, Peter Richtárik | - | 101 115 0 | 06 Feb 2016
Variance Reduction in SGD by Distributed Importance Sampling | Guillaume Alain, Alex Lamb, Chinnadhurai Sankar, Aaron Courville, Yoshua Bengio | FedML | 79 199 0 | 20 Nov 2015
Online Batch Selection for Faster Training of Neural Networks | I. Loshchilov, Frank Hutter | ODL | 89 301 0 | 19 Nov 2015
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks | Shaoqing Ren, Kaiming He, Ross B. Girshick, Jian Sun | AIMat, ObjD | 502 62,294 0 | 04 Jun 2015
Adam: A Method for Stochastic Optimization | Diederik P. Kingma, Jimmy Ba | ODL | 1.8K 150,115 0 | 22 Dec 2014
Accelerating Minibatch Stochastic Gradient Descent using Stratified Sampling | P. Zhao, Tong Zhang | - | 76 91 0 | 13 May 2014
Visualizing and Understanding Convolutional Networks | Matthew D. Zeiler, Rob Fergus | FAtt, SSL | 595 15,882 0 | 12 Nov 2013
ADADELTA: An Adaptive Learning Rate Method | Matthew D. Zeiler | ODL | 152 6,625 0 | 22 Dec 2012
Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization | Shai Shalev-Shwartz, Tong Zhang | - | 181 1,033 0 | 10 Sep 2012
Natural Language Processing (almost) from Scratch | R. Collobert, Jason Weston, Léon Bottou, Michael Karlen, Koray Kavukcuoglu, Pavel P. Kuksa | - | 188 7,726 0 | 02 Mar 2011