ResearchTrend.AI
Cited By · arXiv:1706.03175
Recovery Guarantees for One-hidden-layer Neural Networks
Kai Zhong, Zhao Song, Prateek Jain, Peter L. Bartlett, Inderjit S. Dhillon · MLT · 10 June 2017

Papers citing "Recovery Guarantees for One-hidden-layer Neural Networks" (50 / 223 papers shown)
Learning Neural Networks with Distribution Shift: Efficiently Certifiable Guarantees
Gautam Chandrasekaran, Adam R. Klivans, Lin Lin Lee, Konstantinos Stavropoulos · OOD · 22 Feb 2025

Event Stream-based Visual Object Tracking: HDETrack V2 and A High-Definition Benchmark
Shiao Wang, Xinyu Wang, Chao Wang, Liye Jin, Lin Zhu, Bo Jiang, Yonghong Tian, Jin Tang · 08 Feb 2025

Theoretical Constraints on the Expressive Power of $\mathsf{RoPE}$-based Tensor Attention Transformers
Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song, Mingda Wan · 23 Dec 2024

On the Hardness of Learning One Hidden Layer Neural Networks
Shuchen Li, Ilias Zadik, Manolis Zampetakis · 04 Oct 2024

Local Linear Recovery Guarantee of Deep Neural Networks at Overparameterization
Yaoyu Zhang, Leyang Zhang, Zhongwang Zhang, Zhiwei Bai · 26 Jun 2024

What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding
Hongkang Li, Meng Wang, Tengfei Ma, Sijia Liu, Zaixi Zhang, Pin-Yu Chen · MLT, AI4CE · 04 Jun 2024

A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts
Mohammed Nowaz Rabbani Chowdhury, Meng Wang, Kaoutar El Maghraoui, Naigang Wang, Pin-Yu Chen, Christopher Carothers · MoE · 26 May 2024

SF-DQN: Provable Knowledge Transfer using Successor Feature for Deep Reinforcement Learning
Shuai Zhang, Heshan Devaka Fernando, Miao Liu, K. Murugesan, Songtao Lu, Pin-Yu Chen, Tianyi Chen, Meng Wang · 24 May 2024

How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance
Hongkang Li, Shuai Zhang, Yihua Zhang, Meng Wang, Sijia Liu, Pin-Yu Chen · 12 Mar 2024

Early Directional Convergence in Deep Homogeneous Neural Networks for Small Initializations
Akshay Kumar, Jarvis Haupt · ODL · 12 Mar 2024

Towards Robust Out-of-Distribution Generalization Bounds via Sharpness
Yingtian Zou, Kenji Kawaguchi, Yingnan Liu, Jiashuo Liu, Mong Li Lee, Wynne Hsu · 11 Mar 2024

How Do Nonlinear Transformers Learn and Generalize in In-Context Learning?
Hongkang Li, Meng Wang, Songtao Lu, Xiaodong Cui, Pin-Yu Chen · MLT · 23 Feb 2024

Data Reconstruction Attacks and Defenses: A Systematic Evaluation
Sheng Liu, Zihan Wang, Yuxiao Chen, Qi Lei · AAMLMIACV · 13 Feb 2024

Provably learning a multi-head attention layer
Sitan Chen, Yuanzhi Li · MLT · 06 Feb 2024

Hidden Minima in Two-Layer ReLU Networks
Yossi Arjevani · 28 Dec 2023

On the Convergence and Sample Complexity Analysis of Deep Q-Networks with $ε$-Greedy Exploration
Shuai Zhang, Hongkang Li, Meng Wang, Miao Liu, Pin-Yu Chen, Songtao Lu, Sijia Liu, K. Murugesan, Subhajit Chaudhury · 24 Oct 2023

An Automatic Learning Rate Schedule Algorithm for Achieving Faster Convergence and Steeper Descent
Zhao Song, Chiwun Yang · 17 Oct 2023

Global Convergence of SGD For Logistic Loss on Two Layer Neural Nets
Pulkit Gopalani, Samyak Jha, Anirbit Mukherjee · 17 Sep 2023

Max-affine regression via first-order methods
Seonho Kim, Kiryung Lee · 15 Aug 2023

Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials
Ilias Diakonikolas, D. Kane · 24 Jul 2023

A faster and simpler algorithm for learning shallow networks
Sitan Chen, Shyam Narayanan · 24 Jul 2023

Modular Neural Network Approaches for Surgical Image Recognition
Nosseiba Ben Salem, Younès Bennani, Joseph Karkazan, Abir Barbara, Charles Dacheux, Thomas Gregory · 17 Jul 2023

Test-Time Training on Video Streams
Renhao Wang, Yu Sun, Yossi Gandelsman, Xinlei Chen, Alexei A. Efros, Xiaolong Wang · TTA, ViT, 3DGS · 11 Jul 2023

Fast, Distribution-free Predictive Inference for Neural Networks with Coverage Guarantees
Yue Gao, Garvesh Raskutti, Rebecca Willett · 11 Jun 2023

InfoPrompt: Information-Theoretic Soft Prompt Tuning for Natural Language Understanding
Junda Wu, Tong Yu, Rui Wang, Zhao Song, Ruiyi Zhang, Handong Zhao, Chaochao Lu, Shuai Li, Ricardo Henao · VLM · 08 Jun 2023

Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks
Mohammed Nowaz Rabbani Chowdhury, Shuai Zhang, Ming Wang, Sijia Liu, Pin-Yu Chen · MoE · 07 Jun 2023

Most Neural Networks Are Almost Learnable
Amit Daniely, Nathan Srebro, Gal Vardi · 25 May 2023

Toward $L_\infty$-recovery of Nonlinear Functions: A Polynomial Sample Complexity Bound for Gaussian Random Fields
Kefan Dong, Tengyu Ma · 29 Apr 2023

Expand-and-Cluster: Parameter Recovery of Neural Networks
Flavio Martinelli, Berfin Simsek, W. Gerstner, Johanni Brea · 25 Apr 2023

Learning Narrow One-Hidden-Layer ReLU Networks
Sitan Chen, Zehao Dou, Surbhi Goel, Adam R. Klivans, Raghu Meka · MLT · 20 Apr 2023

DiffFit: Unlocking Transferability of Large Diffusion Models via Simple Parameter-Efficient Fine-Tuning
Enze Xie, Lewei Yao, Han Shi, Zhili Liu, Daquan Zhou, Zhaoqiang Liu, Jiawei Li, Zhenguo Li · 13 Apr 2023

Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron
Weihang Xu, S. Du · 20 Feb 2023

Computational Complexity of Learning Neural Networks: Smoothness and Degeneracy
Amit Daniely, Nathan Srebro, Gal Vardi · 15 Feb 2023

A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity
Hongkang Li, Ming Wang, Sijia Liu, Pin-Yu Chen · ViT, MLT · 12 Feb 2023

Generalization Ability of Wide Neural Networks on $\mathbb{R}$
Jianfa Lai, Manyun Xu, Rui Chen, Qi-Rong Lin · 12 Feb 2023

Joint Edge-Model Sparse Learning is Provably Efficient for Graph Neural Networks
Shuai Zhang, Ming Wang, Pin-Yu Chen, Sijia Liu, Songtao Lu, Miaoyuan Liu · MLT · 06 Feb 2023

Reconstructing Training Data from Model Gradient, Provably
Zihan Wang, Jason D. Lee, Qi Lei · FedML · 07 Dec 2022

Finite Sample Identification of Wide Shallow Neural Networks with Biases
M. Fornasier, T. Klock, Marco Mondelli, Michael Rauchensteiner · 08 Nov 2022

Learning Single-Index Models with Shallow Neural Networks
A. Bietti, Joan Bruna, Clayton Sanford, M. Song · 27 Oct 2022

Bures-Wasserstein Barycenters and Low-Rank Matrix Recovery
Tyler Maunu, Thibaut Le Gouic, Philippe Rigollet · 26 Oct 2022

Global Convergence of SGD On Two Layer Neural Nets
Pulkit Gopalani, Anirbit Mukherjee · 20 Oct 2022

Annihilation of Spurious Minima in Two-Layer ReLU Networks
Yossi Arjevani, M. Field · 12 Oct 2022

Neural Networks Efficiently Learn Low-Dimensional Representations with SGD
Alireza Mousavi-Hosseini, Sejun Park, M. Girotti, Ioannis Mitliagkas, Murat A. Erdogdu · MLT · 29 Sep 2022

Is Stochastic Gradient Descent Near Optimal?
Yifan Zhu, Hong Jun Jeon, Benjamin Van Roy · 18 Sep 2022

Agnostic Learning of General ReLU Activation Using Gradient Descent
Pranjal Awasthi, Alex K. Tang, Aravindan Vijayaraghavan · MLT · 04 Aug 2022

Learning and generalization of one-hidden-layer neural networks, going beyond standard Gaussian data
Hongkang Li, Shuai Zhang, Ming Wang · MLT · 07 Jul 2022

Generalization Guarantee of Training Graph Convolutional Networks with Graph Topology Sampling
Hongkang Li, Ming Wang, Sijia Liu, Pin-Yu Chen, Jinjun Xiong · GNN · 07 Jul 2022

Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis
Alexander Munteanu, Simon Omlor, Zhao Song, David P. Woodruff · 26 Jun 2022

Local Identifiability of Deep ReLU Neural Networks: the Theory
Joachim Bona-Pellissier, François Malgouyres, François Bachoc · FAtt · 15 Jun 2022

Excess Risk of Two-Layer ReLU Neural Networks in Teacher-Student Settings and its Superiority to Kernel Methods
Shunta Akiyama, Taiji Suzuki · 30 May 2022