ResearchTrend.AI

The Neural Race Reduction: Dynamics of Abstraction in Gated Networks
arXiv:2207.10430 · 21 July 2022
Andrew M. Saxe, Shagun Sodhani, Sam Lewallen
AI4CE

Papers citing "The Neural Race Reduction: Dynamics of Abstraction in Gated Networks"

50 / 74 papers shown
Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)
Yoonsoo Nam
Seok Hyeong Lee
Clementine Domine
Yea Chan Park
Charles London
Wonyl Choi
Niclas Goring
Seungjai Lee
AI4CE
188
1
0
28 Feb 2025
Flexible task abstractions emerge in linear networks with fast and bounded units
Kai Sandbrink
Jan P. Bauer
A. Proca
Andrew M. Saxe
Christopher Summerfield
Ali Hummos
109
2
0
17 Jan 2025
When does compositional structure yield compositional generalization? A kernel theory
Samuel Lippl
Kim Stachenfeld
NAI, CoGe
212
10
0
26 May 2024
Multimodal Deep Learning
Cem Akkus
Jiquan Ngiam
Vladana Djakovic
Steffen Jauch-Walser
A. Khosla
...
Jann Goschenhofer
Honglak Lee
A. Ng
Daniel Schalk
Matthias Aßenmacher
120
3,176
0
12 Jan 2023
Omnivore: A Single Model for Many Visual Modalities
Rohit Girdhar
Mannat Singh
Nikhil Ravi
Laurens van der Maaten
Armand Joulin
Ishan Misra
268
236
0
20 Jan 2022
Characterizing Learning Dynamics of Deep Neural Networks via Complex Networks
Emanuele La Malfa
G. Malfa
Giuseppe Nicosia
Vito Latora
53
10
0
06 Oct 2021
Structure and Performance of Fully Connected Neural Networks: Emerging Complex Network Properties
Leonardo F. S. Scabini
Odemir M. Bruno
GNN
38
54
0
29 Jul 2021
The Principles of Deep Learning Theory
Daniel A. Roberts
Sho Yaida
Boris Hanin
FaML, PINN, GNN
68
246
0
18 Jun 2021
BASE Layers: Simplifying Training of Large, Sparse Models
M. Lewis
Shruti Bhosale
Tim Dettmers
Naman Goyal
Luke Zettlemoyer
MoE
201
281
0
30 Mar 2021
Multi-Task Reinforcement Learning with Context-based Representations
Shagun Sodhani
Amy Zhang
Joelle Pineau
87
191
0
11 Feb 2021
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
W. Fedus
Barret Zoph
Noam M. Shazeer
MoE
88
2,226
0
11 Jan 2021
Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth
Thao Nguyen
M. Raghu
Simon Kornblith
OOD
65
282
0
29 Oct 2020
Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel
Stanislav Fort
Gintare Karolina Dziugaite
Mansheej Paul
Sepideh Kharaghani
Daniel M. Roy
Surya Ganguli
109
193
0
28 Oct 2020
Measuring Systematic Generalization in Neural Proof Generation with Transformers
Nicolas Angelard-Gontier
Koustuv Sinha
Siva Reddy
C. Pal
LRM
91
64
0
30 Sep 2020
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
Dmitry Lepikhin
HyoukJoong Lee
Yuanzhong Xu
Dehao Chen
Orhan Firat
Yanping Huang
M. Krikun
Noam M. Shazeer
Zhiwen Chen
MoE
124
1,191
0
30 Jun 2020
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
Alexei Baevski
Henry Zhou
Abdel-rahman Mohamed
Michael Auli
SSL
299
5,849
0
20 Jun 2020
Array Programming with NumPy
Charles R. Harris
K. Millman
S. Walt
R. Gommers
Pauli Virtanen
...
Tyler Reddy
Warren Weckesser
Hameer Abbasi
C. Gohlke
T. Oliphant
156
15,026
0
18 Jun 2020
Gaussian Gated Linear Networks
David Budden
Adam H. Marblestone
Eren Sezener
Tor Lattimore
Greg Wayne
J. Veness
BDL, AI4CE
55
12
0
10 Jun 2020
Multi-Task Reinforcement Learning with Soft Modularization
Ruihan Yang
Huazhe Xu
Yi Wu
Xiaolong Wang
65
185
0
30 Mar 2020
Evaluating Logical Generalization in Graph Neural Networks
Koustuv Sinha
Shagun Sodhani
Joelle Pineau
William L. Hamilton
NAI, AI4CE
88
23
0
14 Mar 2020
A Benchmark for Systematic Generalization in Grounded Language Understanding
Laura Ruis
Jacob Andreas
Marco Baroni
Diane Bouchacourt
Brenden M. Lake
69
145
0
11 Mar 2020
Emergence of Network Motifs in Deep Neural Networks
Matteo Zambra
Alberto Testolin
A. Maritan
GNN
59
13
0
27 Dec 2019
CLOSURE: Assessing Systematic Generalization of CLEVR Models
Dzmitry Bahdanau
H. D. Vries
Timothy J. O'Donnell
Shikhar Murty
Philippe Beaudoin
Yoshua Bengio
Aaron Courville
NAI
62
90
0
12 Dec 2019
Deep Ensembles: A Loss Landscape Perspective
Stanislav Fort
Huiyi Hu
Balaji Lakshminarayanan
OOD, UQCV
125
629
0
05 Dec 2019
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
547
42,639
0
03 Dec 2019
Gated Linear Networks
William H. Guss
Tor Lattimore
David Budden
Avishkar Bhoopchand
Christopher Mattern
...
Ruslan Salakhutdinov
Jianan Wang
Peter Toth
Simon Schmitt
Marcus Hutter
AI4CE
104
41
0
30 Sep 2019
Recurrent Independent Mechanisms
Anirudh Goyal
Alex Lamb
Jordan Hoffmann
Shagun Sodhani
Sergey Levine
Yoshua Bengio
Bernhard Schölkopf
94
338
0
24 Sep 2019
CLUTRR: A Diagnostic Benchmark for Inductive Reasoning from Text
Koustuv Sinha
Shagun Sodhani
Jin Dong
Joelle Pineau
William L. Hamilton
67
211
0
16 Aug 2019
Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives
Anirudh Goyal
Shagun Sodhani
Jonathan Binas
Xue Bin Peng
Sergey Levine
Yoshua Bengio
83
49
0
25 Jun 2019
Compositional generalization through meta sequence-to-sequence learning
Brenden M. Lake
CoGe
83
199
0
12 Jun 2019
Towards Interpretable Reinforcement Learning Using Attention Augmented Agents
Alex Mott
Daniel Zoran
Mike Chrzanowski
Daan Wierstra
Danilo Jimenez Rezende
66
191
0
06 Jun 2019
Implicit Regularization in Deep Matrix Factorization
Sanjeev Arora
Nadav Cohen
Wei Hu
Yuping Luo
AI4CE
91
509
0
31 May 2019
Luck Matters: Understanding Training Dynamics of Deep ReLU Networks
Yuandong Tian
Tina Jiang
Qucheng Gong
Ari S. Morcos
138
25
0
31 May 2019
Which Tasks Should Be Learned Together in Multi-task Learning?
Trevor Scott Standley
Amir Zamir
Dawn Chen
Leonidas Guibas
Jitendra Malik
Silvio Savarese
113
517
0
18 May 2019
Routing Networks and the Challenges of Modular and Compositional Computation
Clemens Rosenbaum
Ignacio Cases
Matthew D Riemer
Tim Klinger
67
83
0
29 Apr 2019
On Exact Computation with an Infinitely Wide Neural Net
Sanjeev Arora
S. Du
Wei Hu
Zhiyuan Li
Ruslan Salakhutdinov
Ruosong Wang
238
928
0
26 Apr 2019
CondConv: Conditionally Parameterized Convolutions for Efficient Inference
Brandon Yang
Gabriel Bender
Quoc V. Le
Jiquan Ngiam
MedIm, 3DV
82
637
0
10 Apr 2019
On-line learning dynamics of ReLU neural networks using statistical physics techniques
Michiel Straat
Michael Biehl
AI4CE
26
8
0
18 Mar 2019
Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent
Jaehoon Lee
Lechao Xiao
S. Schoenholz
Yasaman Bahri
Roman Novak
Jascha Narain Sohl-Dickstein
Jeffrey Pennington
213
1,108
0
18 Feb 2019
Multi-Task Deep Neural Networks for Natural Language Understanding
Xiaodong Liu
Pengcheng He
Weizhu Chen
Jianfeng Gao
AI4CE
145
1,273
0
31 Jan 2019
Width Provably Matters in Optimization for Deep Linear Neural Networks
S. Du
Wei Hu
75
95
0
24 Jan 2019
On Lazy Training in Differentiable Programming
Lénaïc Chizat
Edouard Oyallon
Francis R. Bach
111
840
0
19 Dec 2018
A mathematical theory of semantic development in deep neural networks
Andrew M. Saxe
James L. McClelland
Surya Ganguli
73
271
0
23 Oct 2018
A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks
Sanjeev Arora
Nadav Cohen
Noah Golowich
Wei Hu
135
293
0
04 Oct 2018
Gradient descent aligns the layers of deep linear networks
Ziwei Ji
Matus Telgarsky
123
257
0
04 Oct 2018
Deep learning systems as complex networks
Alberto Testolin
Michele Piccolini
S. Suweis
AI4CE, BDL, GNN
38
27
0
28 Sep 2018
An analytic theory of generalization dynamics and transfer learning in deep linear networks
Andrew Kyle Lampinen
Surya Ganguli
OOD
87
131
0
27 Sep 2018
On the Learning Dynamics of Deep Neural Networks
Rémi Tachet des Combes
Mohammad Pezeshki
Samira Shabanian
Aaron Courville
Yoshua Bengio
62
38
0
18 Sep 2018
Mean Field Analysis of Neural Networks: A Central Limit Theorem
Justin A. Sirignano
K. Spiliopoulos
MLT
77
194
0
28 Aug 2018
Theory IIIb: Generalization in Deep Networks
T. Poggio
Q. Liao
Brando Miranda
Andrzej Banburski
Xavier Boix
Jack Hidary
ODL, AI4CE
75
26
0
29 Jun 2018