Essentially No Barriers in Neural Network Energy Landscape

2 March 2018

Papers citing "Essentially No Barriers in Neural Network Energy Landscape"

50 / 295 papers shown

Optimal Sets and Solution Paths of ReLU Networks Aaron Mishkin Mert Pilanci 129 4 0 31 May 2023
A Rainbow in Deep Network Black Boxes Florentin Guth Brice Ménard G. Rochette S. Mallat 114 12 0 29 May 2023
A Three-regime Model of Network Pruning Yefan Zhou Yaoqing Yang Arin Chang Michael W. Mahoney 96 11 0 28 May 2023
How to escape sharp minima with random perturbations Kwangjun Ahn Ali Jadbabaie S. Sra ODL 123 8 0 25 May 2023
Sparse Weight Averaging with Multiple Particles for Iterative Magnitude Pruning Moonseok Choi Hyungi Lee G. Nam Juho Lee 78 2 0 24 May 2023
Transferring Learning Trajectories of Neural Networks Daiki Chijiwa 60 3 0 23 May 2023
Mode Connectivity in Auction Design Christoph Hertrich Yixin Tao László A. Végh 78 1 0 18 May 2023
Recyclable Tuning for Continual Pre-training Yujia Qin Cheng Qian Xu Han Yankai Lin Huadong Wang Ruobing Xie Zhiyuan Liu Maosong Sun Jie Zhou CLL 66 13 0 15 May 2023
Understanding and Improving Model Averaging in Federated Learning on Heterogeneous Data Tailin Zhou Zehong Lin Jinchao Zhang Danny H. K. Tsang MoMe FedML 100 13 0 13 May 2023
ZipIt! Merging Models from Different Tasks without Training George Stoica Daniel Bolya J. Bjorner Pratik Ramesh Taylor N. Hearn Judy Hoffman VLM MoMe 141 125 0 04 May 2023
$π$-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation Chengyue Wu Teng Wang Yixiao Ge Zeyu Lu Rui-Zhi Zhou Ying Shan Ping Luo MoMe 145 37 0 27 Apr 2023
Typical and atypical solutions in non-convex neural networks with discrete and continuous weights Carlo Baldassi Enrico M. Malatesta Gabriele Perugini R. Zecchina MQ 92 13 0 26 Apr 2023
Hierarchical Weight Averaging for Deep Neural Networks Xiaozhe Gu Zixun Zhang Yuncheng Jiang Yaoyu Zhang Ruimao Zhang Shuguang Cui Zhuguo Li 62 5 0 23 Apr 2023
PopulAtion Parameter Averaging (PAPA) Alexia Jolicoeur-Martineau Emy Gervais Kilian Fatras Yan Zhang Simon Lacoste-Julien MoMe 110 21 0 06 Apr 2023
Towards Efficient MCMC Sampling in Bayesian Neural Networks by Exploiting Symmetry J. G. Wiese Lisa Wimmer Theodore Papamarkou Bernd Bischl Stephan Günnemann David Rügamer 86 15 0 06 Apr 2023
On the Variance of Neural Network Training with respect to Test Sets and Distributions Keller Jordan OOD 48 11 0 04 Apr 2023
Achieving a Better Stability-Plasticity Trade-off via Auxiliary Networks in Continual Learning Sang-Ho Kim Lorenzo Noci Antonio Orvieto Thomas Hofmann CLL 101 41 0 16 Mar 2023
Revisiting the Noise Model of Stochastic Gradient Descent Barak Battash Ofir Lindenbaum 56 11 0 05 Mar 2023
DART: Diversify-Aggregate-Repeat Training Improves Generalization of Neural Networks Samyak Jain Sravanti Addepalli P. Sahu Priyam Dey R. Venkatesh Babu MoMe OOD 118 20 0 28 Feb 2023
Random Teachers are Good Teachers Felix Sarnthein Gregor Bachmann Sotiris Anagnostidis Thomas Hofmann 85 5 0 23 Feb 2023
Modular Deep Learning Jonas Pfeiffer Sebastian Ruder Ivan Vulić Edoardo Ponti MoMe OOD 159 80 0 22 Feb 2023
Revisiting Weighted Aggregation in Federated Learning with Neural Networks Zexi Li Tao R. Lin Xinyi Shang Chao-Xiang Wu FedML 102 65 0 14 Feb 2023
Data efficiency and extrapolation trends in neural network interatomic potentials Joshua A Vita Daniel Schwalbe-Koda 73 17 0 12 Feb 2023
Gradient Descent in Neural Networks as Sequential Learning in RKBS A. Shilton Sunil R. Gupta Santu Rana Svetha Venkatesh MLT 130 1 0 01 Feb 2023
Re-basin via implicit Sinkhorn differentiation F. Guerrero-Peña H. R. Medeiros Thomas Dubail Masih Aminbeidokhti Eric Granger M. Pedersoli MoMe 104 50 0 22 Dec 2022
Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization Alexandre Ramé Kartik Ahuja Jianyu Zhang Matthieu Cord Léon Bottou David Lopez-Paz MoMe OODD 130 86 0 20 Dec 2022
Likelihood-based generalization of Markov parameter estimation and multiple shooting objectives in system identification Nicholas Galioto Alex Arkady Gorodetsky 152 1 0 20 Dec 2022
Dataless Knowledge Fusion by Merging Weights of Language Models Xisen Jin Xiang Ren Daniel Preoţiuc-Pietro Pengxiang Cheng FedML MoMe 103 250 0 19 Dec 2022
Editing Models with Task Arithmetic Gabriel Ilharco Marco Tulio Ribeiro Mitchell Wortsman Suchin Gururangan Ludwig Schmidt Hannaneh Hajishirzi Ali Farhadi KELM MoMe MU 217 523 0 08 Dec 2022
A survey of deep learning optimizers -- first and second order methods Rohan Kashyap ODL 108 7 0 28 Nov 2022
Building a Subspace of Policies for Scalable Continual Learning Jean-Baptiste Gaya T. Doan Lucas Caccia Laure Soulier Ludovic Denoyer Roberta Raileanu CLL 121 31 0 18 Nov 2022
Mechanistic Mode Connectivity Ekdeep Singh Lubana Eric J. Bigelow Robert P. Dick David M. Krueger Hidenori Tanaka 122 49 0 15 Nov 2022
REPAIR: REnormalizing Permuted Activations for Interpolation Repair Keller Jordan Hanie Sedghi O. Saukh R. Entezari Behnam Neyshabur MoMe 142 101 0 15 Nov 2022
Robust Federated Learning against both Data Heterogeneity and Poisoning Attack via Aggregation Optimization Yueqi Xie Weizhong Zhang Renjie Pi Fangzhao Wu Qifeng Chen Xing Xie Sunghun Kim FedML 77 7 0 10 Nov 2022
Class Interference of Deep Neural Networks Dongcui Diao Hengshuai Yao Bei Jiang 49 1 0 31 Oct 2022
Symmetries, flat minima, and the conserved quantities of gradient flow Bo Zhao I. Ganev Robin Walters Rose Yu Nima Dehmamy 109 20 0 31 Oct 2022
Flatter, faster: scaling momentum for optimal speedup of SGD Aditya Cowsik T. Can Paolo Glorioso 109 5 0 28 Oct 2022
Exploring Mode Connectivity for Pre-trained Language Models Yujia Qin Cheng Qian Jing Yi Weize Chen Yankai Lin Xu Han Zhiyuan Liu Maosong Sun Jie Zhou 99 21 0 25 Oct 2022
lo-fi: distributed fine-tuning without communication Mitchell Wortsman Suchin Gururangan Shen Li Ali Farhadi Ludwig Schmidt Michael G. Rabbat Ari S. Morcos 108 24 0 19 Oct 2022
Pareto Manifold Learning: Tackling multiple tasks via ensembles of single-task models Nikolaos Dimitriadis P. Frossard François Fleuret 101 25 0 18 Oct 2022
Mean-field analysis for heavy ball methods: Dropout-stability, connectivity, and global convergence Diyuan Wu Vyacheslav Kungurtsev Marco Mondelli 62 3 0 13 Oct 2022
Wasserstein Barycenter-based Model Fusion and Linear Mode Connectivity of Neural Networks A. K. Akash Sixu Li Nicolas García Trillos 80 13 0 13 Oct 2022
Plateau in Monotonic Linear Interpolation -- A "Biased" View of Loss Landscape for Deep Networks Xiang Wang Annie Wang Mo Zhou Rong Ge MoMe 231 10 0 03 Oct 2022
Multiple Modes for Continual Learning Siddhartha Datta N. Shadbolt CLL MoMe 109 2 0 29 Sep 2022
On Quantum Speedups for Nonconvex Optimization via Quantum Tunneling Walks Yizhou Liu Weijie J. Su Tongyang Li 90 18 0 29 Sep 2022
Random initialisations performing above chance and how to find them Frederik Benzing Simon Schug Robert Meier J. Oswald Yassir Akram Nicolas Zucchet Laurence Aitchison Angelika Steger ODL 119 26 0 15 Sep 2022
Git Re-Basin: Merging Models modulo Permutation Symmetries Samuel K. Ainsworth J. Hayase S. Srinivasa MoMe 339 344 0 11 Sep 2022
Lottery Tickets on a Data Diet: Finding Initializations with Sparse Trainable Networks Mansheej Paul Brett W. Larsen Surya Ganguli Jonathan Frankle Gintare Karolina Dziugaite 59 24 0 02 Jun 2022
Superposing Many Tickets into One: A Performance Booster for Sparse Neural Network Training Lu Yin Vlado Menkovski Meng Fang Tianjin Huang Yulong Pei Mykola Pechenizkiy Decebal Constantin Mocanu Shiwei Liu 117 8 0 30 May 2022
The Missing Invariance Principle Found -- the Reciprocal Twin of Invariant Risk Minimization Dongsung Huh A. Baidya OOD 65 8 0 29 May 2022