Revisiting Small Batch Training for Deep Neural Networks

20 April 2018

Carlo Luschi

Papers citing "Revisiting Small Batch Training for Deep Neural Networks"

50 / 167 papers shown

Dynamic Gradient Sparse Update for Edge Training I-Hsuan Li Tian-Sheuan Chang 66 1 0 23 Mar 2025
Using machine learning to measure evidence of students' sensemaking in physics courses Kaitlin Gili Kyle Heuton Astha Shah Michael C. Hughes 48 0 0 19 Mar 2025
Gate-Shift-Pose: Enhancing Action Recognition in Sports with Skeleton Information Edoardo Bianchi Oswald Lanz 3DH 68 1 0 06 Mar 2025
CESAR: A Convolutional Echo State AutoencodeR for High-Resolution Wind Forecasting Matthew Bonas Paolo Giani Paola Crippa Stefano Castruccio 73 0 0 13 Dec 2024
Leveraging free energy in pretraining model selection for improved fine-tuning Michael Munn Susan Wei 32 0 0 08 Oct 2024
Forecasting Smog Clouds With Deep Learning Valentijn Oldenburg Juan Cardenas-Cartagena Matias Valdenegro-Toro AI4TS 21 1 0 03 Oct 2024
Fine-tuning LLMs for Autonomous Spacecraft Control: A Case Study Using Kerbal Space Program Alejandro Carrasco Victor Rodriguez-Fernandez Richard Linares 33 1 0 16 Aug 2024
Maxwell's Demon at Work: Efficient Pruning by Leveraging Saturation of Neurons Simon Dufort-Labbé P. D'Oro Evgenii Nikishin Razvan Pascanu Pierre-Luc Bacon A. Baratin 45 1 0 12 Mar 2024
Training Neural Networks from Scratch with Parallel Low-Rank Adapters Minyoung Huh Brian Cheung Jeremy Bernstein Phillip Isola Pulkit Agrawal 35 10 0 26 Feb 2024
NeuroFlux: Memory-Efficient CNN Training Using Adaptive Local Learning Dhananjay Saikumar Blesson Varghese 21 1 0 21 Feb 2024
Training Artificial Neural Networks by Coordinate Search Algorithm Ehsan Rokhsatyazdi Shahryar Rahnamayan Sevil Zanjani Miyandoab Azam Asilian Bidgoli H. R. Tizhoosh ODL 19 1 0 20 Feb 2024
Offline Training of Language Model Agents with Functions as Learnable Weights Shaokun Zhang Jieyu Zhang Jiale Liu Linxin Song Chi Wang Ranjay Krishna Qingyun Wu LLMAG LM&Ro AIFin 40 12 0 17 Feb 2024
Implicit Bias in Noisy-SGD: With Applications to Differentially Private Training Tom Sander Maxime Sylvestre Alain Durmus 31 1 0 13 Feb 2024
Flora: Low-Rank Adapters Are Secretly Gradient Compressors Yongchang Hao Yanshuai Cao Lili Mou 16 39 0 05 Feb 2024
Enhancing Contrastive Learning with Efficient Combinatorial Positive Pairing Jaeill Kim Duhun Hwang Eunjung Lee Jangwon Suh Jimyeong Kim Wonjong Rhee 33 0 0 11 Jan 2024
Learning from One Continuous Video Stream João Carreira Michael King Viorica Patraucean Dilara Gokay Catalin Ionescu ... Joseph Heyward Carl Doersch Y. Aytar Dima Damen Andrew Zisserman CLL 32 4 0 01 Dec 2023
An Empirical Investigation into Benchmarking Model Multiplicity for Trustworthy Machine Learning: A Case Study on Image Classification Prakhar Ganesh 41 5 0 24 Nov 2023
Learning spatio-temporal patterns with Neural Cellular Automata Alex D. Richardson Tibor Antal Richard A. Blythe Linus J. Schumacher AI4CE 9 2 0 23 Oct 2023
A Convolutional Network Adaptation for Cortical Classification During Mobile Brain Imaging B. Cichy J. Lukos Mohammad Alam J. C. Bradford Nicholas Wymbs 13 0 0 11 Oct 2023
Small batch deep reinforcement learning J. Obando-Ceron Marc G. Bellemare Pablo Samuel Castro VLM 34 14 0 05 Oct 2023
Stochastic Gradient Descent-like relaxation is equivalent to Metropolis dynamics in discrete optimization and inference problems Maria Chiara Angelini A. Cavaliere Raffaele Marino F. Ricci-Tersenghi 55 5 0 11 Sep 2023
Enhancing Generalization of Universal Adversarial Perturbation through Gradient Aggregation Xuantong Liu Yaoyao Zhong Yuhang Zhang Lixiong Qin Weihong Deng AAML 30 25 0 11 Aug 2023
Graph Neural Networks for Forecasting Multivariate Realized Volatility with Spillover Effects Chao Zhang Xingyue Pu Mihai Cucuringu Xiaowen Dong 17 4 0 01 Aug 2023
Beyond Implicit Bias: The Insignificance of SGD Noise in Online Learning Nikhil Vyas Depen Morwani Rosie Zhao Gal Kaplun Sham Kakade Boaz Barak MLT 13 4 0 14 Jun 2023
Batches Stabilize the Minimum Norm Risk in High Dimensional Overparameterized Linear Regression Shahar Stein Ioushua Inbar Hasidim O. Shayevitz M. Feder 19 0 0 14 Jun 2023
Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning Libin Zhu Chaoyue Liu Adityanarayanan Radhakrishnan M. Belkin 30 13 0 07 Jun 2023
Physical Layer Authentication and Security Design in the Machine Learning Era T. M. Hoang Alireza Vahid H. Tuan L. Hanzo 30 18 0 16 May 2023
GeNAS: Neural Architecture Search with Better Generalization Joonhyun Jeong Joonsang Yu Geondo Park Dongyoon Han Y. Yoo 30 4 0 15 May 2023
Phase transitions in the mini-batch size for sparse and dense two-layer neural networks Raffaele Marino F. Ricci-Tersenghi 30 14 0 10 May 2023
Predictive Coding as a Neuromorphic Alternative to Backpropagation: A Critical Evaluation Umais Zahid Qinghai Guo Z. Fountas BDL 26 7 0 05 Apr 2023
Physics-informed PointNet: On how many irregular geometries can it solve an inverse problem simultaneously? Application to linear elasticity Ali Kashefi Leonidas J. Guibas T. Mukerji PINN 3DPC AI4CE 32 9 0 22 Mar 2023
(S)GD over Diagonal Linear Networks: Implicit Regularisation, Large Stepsizes and Edge of Stability Mathieu Even Scott Pesme Suriya Gunasekar Nicolas Flammarion 28 16 0 17 Feb 2023
Disentangling the Mechanisms Behind Implicit Regularization in SGD Zachary Novack Simran Kaur Tanya Marwah Saurabh Garg Zachary Chase Lipton FedML 27 2 0 29 Nov 2022
DGD-cGAN: A Dual Generator for Image Dewatering and Restoration Salma Gonzalez-Sabbagh A. Robles-Kelly Shang Gao GAN 29 9 0 18 Nov 2022
Accelerating Parallel Stochastic Gradient Descent via Non-blocking Mini-batches Haoze He Parijat Dube 6 3 0 02 Nov 2022
A New Perspective for Understanding Generalization Gap of Deep Neural Networks Trained with Large Batch Sizes O. Oyedotun Konstantinos Papadopoulos Djamila Aouada AI4CE 32 11 0 21 Oct 2022
Large-batch Optimization for Dense Visual Predictions Zeyue Xue Jianming Liang Guanglu Song Zhuofan Zong Liang Chen Yu Liu Ping Luo VLM 39 9 0 20 Oct 2022
TAN Without a Burn: Scaling Laws of DP-SGD Tom Sander Pierre Stock Alexandre Sablayrolles FedML 32 42 0 07 Oct 2022
TripleE: Easy Domain Generalization via Episodic Replay Xiaomeng Li Hongyu Ren Huifeng Yao Ziwei Liu 11 0 0 04 Oct 2022
Self-Supervised Learning with an Information Maximization Criterion Serdar Ozsoy Shadi S. Hamdan Sercan Ö. Arik Deniz Yuret A. Erdogan SSL 21 35 0 16 Sep 2022
Top-Tuning: a study on transfer learning for an efficient alternative to fine tuning for image classification with fast kernel methods P. D. Alfano Vito Paolo Pastore Lorenzo Rosasco Francesca Odone 21 6 0 16 Sep 2022
Learning Deep Optimal Embeddings with Sinkhorn Divergences S. Roy Yan Han Mehrtash Harandi L. Petersson 20 0 0 14 Sep 2022
Detection and Mitigation of Byzantine Attacks in Distributed Training Konstantinos Konstantinidis Namrata Vaswani Aditya Ramamoorthy AAML 24 0 0 17 Aug 2022
How Well Do Vision Transformers (VTs) Transfer To The Non-Natural Image Domain? An Empirical Study Involving Art Classification Vincent Tonkes M. Sabatelli ViT 25 6 0 09 Aug 2022
ILASR: Privacy-Preserving Incremental Learning for Automatic Speech Recognition at Production Scale Gopinath Chennupati Milind Rao Gurpreet Chadha Aaron Eakin A. Raju ... Andrew Oberlin Buddha Nandanoor Prahalad Venkataramanan Zheng Wu Pankaj Sitpure CLL 27 8 0 19 Jul 2022
Modeling the Machine Learning Multiverse Samuel J. Bell Onno P. Kampman Jesse Dodge Neil D. Lawrence 21 17 0 13 Jun 2022
MLLess: Achieving Cost Efficiency in Serverless Machine Learning Training Pablo Gimeno Sarroca Marc Sánchez Artigas 24 14 0 12 Jun 2022
Two Decades of Bengali Handwritten Digit Recognition: A Survey A. A. Ashikur Rahman Md. Bakhtiar Hasan Sabbir Ahmed Tasnim Ahmed Md. Hamjajul Ashmafee Mohammad Ridwan Kabir M. H. Kabir 30 24 0 05 Jun 2022
Metrizing Fairness Yves Rychener Bahar Taşkesen Daniel Kuhn FaML 36 4 0 30 May 2022
Efficient Learning of Interpretable Classification Rules Bishwamittra Ghosh Dmitry Malioutov Kuldeep S. Meel 24 7 0 14 May 2022