
Should Under-parameterized Student Networks Copy or Average Teacher Weights?

3 November 2023

Papers citing "Should Under-parameterized Student Networks Copy or Average Teacher Weights?"


Title | Authors | Tag | Counts | Date
Understanding Learning Invariance in Deep Linear Networks | Hao Duan, Guido Montúfar | | 20 0 0 | 16 Jun 2025
Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence | Berfin Simsek, Amire Bendjeddou, Daniel Hsu | | 183 3 0 | 13 Nov 2024
Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding | Zhengqing Wu, Berfin Simsek, Francois Ged | ODL | 121 0 0 | 08 Feb 2024
Learning time-scales in two-layers neural networks | Raphael Berthier, Andrea Montanari, Kangjie Zhou | | 196 38 0 | 28 Feb 2023