GradMax: Growing Neural Networks using Gradient Information
Utku Evci, Bas van Merriënboer, Thomas Unterthiner, Max Vladymyrov, Fabian Pedregosa
arXiv:2201.05125 · 13 January 2022
Papers citing "GradMax: Growing Neural Networks using Gradient Information"
16 papers shown
Growth strategies for arbitrary DAG neural architectures
Stella Douka, Manon Verbockhaven, Théo Rudkiewicz, Stéphane Rivaud, François P. Landes, Sylvain Chevallier, Guillaume Charpiat
AI4CE · 17 Feb 2025
The Cake that is Intelligence and Who Gets to Bake it: An AI Analogy and its Implications for Participation
Martin Mundt, Anaelia Ovalle, Felix Friedrich, A Pranav, Subarnaduti Paul, Manuel Brack, Kristian Kersting, William Agnew
05 Feb 2025
Level Set Teleportation: An Optimization Perspective
Aaron Mishkin, A. Bietti, Robert Mansel Gower
05 Mar 2024
Beyond Uniform Scaling: Exploring Depth Heterogeneity in Neural Architectures
Akash Guna R.T, Arnav Chavan, Deepak Gupta
MDE · 19 Feb 2024
Composable Function-preserving Expansions for Transformer Architectures
Andrea Gesmundo, Kaitlin Maile
AI4CE · 11 Aug 2023
Accelerated Training via Incrementally Growing Neural Networks using Variance Transfer and Learning Rate Adaptation
Xin Yuan, Pedro H. P. Savarese, Michael Maire
22 Jun 2023
STen: Productive and Efficient Sparsity in PyTorch
Andrei Ivanov, Nikoli Dryden, Tal Ben-Nun, Saleh Ashkboos, Torsten Hoefler
15 Apr 2023
Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?
Boris Knyazev, Doha Hwang, Simon Lacoste-Julien
AI4CE · 07 Mar 2023
The Dormant Neuron Phenomenon in Deep Reinforcement Learning
Ghada Sokar, Rishabh Agarwal, Pablo Samuel Castro, Utku Evci
CLL · 24 Feb 2023
Adaptive Neural Networks Using Residual Fitting
N. Ford, J. Winder, Josh Mcclellan
13 Jan 2023
Exploiting the Partly Scratch-off Lottery Ticket for Quantization-Aware Training
Mingliang Xu, Gongrui Nan, Yuxin Zhang, Rongrong Ji
MQ · 12 Nov 2022
Streamable Neural Fields
Junwoo Cho, Seungtae Nam, Daniel Rho, J. Ko, Eunbyung Park
AI4TS · 20 Jul 2022
Firefly Neural Architecture Descent: a General Approach for Growing Neural Networks
Lemeng Wu, Bo Liu, Peter Stone, Qiang Liu
17 Feb 2021
Towards Learning Convolutions from Scratch
Behnam Neyshabur
SSL · 27 Jul 2020
The large learning rate phase of deep learning: the catapult mechanism
Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Narain Sohl-Dickstein, Guy Gur-Ari
ODL · 04 Mar 2020
Neural Architecture Search with Reinforcement Learning
Barret Zoph, Quoc V. Le
05 Nov 2016