Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation


Papers citing "Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation"

Layer Normalization (21 Jul 2016)
