Compositional Attention: Disentangling Search and Retrieval

18 October 2021
Sarthak Mittal
Sharath Chandra Raparthy
Irina Rish
Yoshua Bengio
Guillaume Lajoie
arXiv: 2110.09419

Papers citing "Compositional Attention: Disentangling Search and Retrieval"

45 / 45 papers shown
A Complexity-Based Theory of Compositionality
Eric Elmoznino
Thomas Jiralerspong
Yoshua Bengio
Guillaume Lajoie
CoGe
82
8
0
18 Oct 2024
Breaking Neural Network Scaling Laws with Modularity
Akhilan Boopathy
Sunshine Jiang
William Yue
Jaedong Hwang
Abhiram Iyer
Ila Fiete
OOD
98
2
0
09 Sep 2024
Associative Transformer
Yuwei Sun
H. Ochiai
Zhirong Wu
Stephen Lin
Ryota Kanai
ViT
86
0
0
22 Sep 2023
The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers
Róbert Csordás
Kazuki Irie
Jürgen Schmidhuber
ViT
48
131
0
26 Aug 2021
Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning
Nan Rosemary Ke
Aniket Didolkar
Sarthak Mittal
Anirudh Goyal
Guillaume Lajoie
Stefan Bauer
Danilo Jimenez Rezende
Yoshua Bengio
Michael C. Mozer
C. Pal
CML
51
55
0
02 Jul 2021
Fast and Slow Learning of Recurrent Independent Mechanisms
Kanika Madan
Rosemary Nan Ke
Anirudh Goyal
Bernhard Schölkopf
Yoshua Bengio
OCL
50
40
0
18 May 2021
Neural Production Systems: Learning Rule-Governed Visual Dynamics
Anirudh Goyal
Aniket Didolkar
Nan Rosemary Ke
Charles Blundell
Philippe Beaudoin
N. Heess
Michael C. Mozer
Yoshua Bengio
OCL
72
82
0
02 Mar 2021
Coordination Among Neural Modules Through a Shared Global Workspace
Anirudh Goyal
Aniket Didolkar
Alex Lamb
Kartikeya Badola
Nan Rosemary Ke
Nasim Rahaman
Jonathan Binas
Charles Blundell
Michael C. Mozer
Yoshua Bengio
167
98
0
01 Mar 2021
Transformers with Competitive Ensembles of Independent Mechanisms
Alex Lamb
Di He
Anirudh Goyal
Guolin Ke
Chien-Feng Liao
Mirco Ravanelli
Yoshua Bengio
MoE
39
23
0
27 Feb 2021
Investigating the Limitations of Transformers with Simple Arithmetic Tasks
Rodrigo Nogueira
Zhiying Jiang
Jimmy J. Li
LRM
55
124
0
25 Feb 2021
Emergent Symbols through Binding in External Memory
Taylor Webb
I. Sinha
Jonathan Cohen
104
65
0
29 Dec 2020
Inductive Biases for Deep Learning of Higher-Level Cognition
Anirudh Goyal
Yoshua Bengio
AI4CE
37
355
0
30 Nov 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
307
40,217
0
22 Oct 2020
The EOS Decision and Length Extrapolation
Benjamin Newman
John Hewitt
Percy Liang
Christopher D. Manning
43
46
0
14 Oct 2020
Rethinking Attention with Performers
K. Choromanski
Valerii Likhosherstov
David Dohan
Xingyou Song
Andreea Gane
...
Afroz Mohiuddin
Lukasz Kaiser
David Belanger
Lucy J. Colwell
Adrian Weller
133
1,548
0
30 Sep 2020
Efficient Transformers: A Survey
Yi Tay
Mostafa Dehghani
Dara Bahri
Donald Metzler
VLM
138
1,111
0
14 Sep 2020
Compositional Generalization via Neural-Symbolic Stack Machines
Xinyun Chen
Chen Liang
Adams Wei Yu
D. Song
Denny Zhou
BDL
24
100
0
15 Aug 2020
Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules
Sarthak Mittal
Alex Lamb
Anirudh Goyal
Vikram S. Voleti
Murray Shanahan
Guillaume Lajoie
Michael C. Mozer
Yoshua Bengio
27
64
0
30 Jun 2020
Object Files and Schemata: Factorizing Declarative and Procedural Knowledge in Dynamical Systems
Anirudh Goyal
Alex Lamb
Phanideep Gampa
Philippe Beaudoin
Sergey Levine
Charles Blundell
Yoshua Bengio
Michael C. Mozer
OCL
34
71
0
29 Jun 2020
Object-Centric Learning with Slot Attention
Francesco Locatello
Dirk Weissenborn
Thomas Unterthiner
Aravindh Mahendran
G. Heigold
Jakob Uszkoreit
Alexey Dosovitskiy
Thomas Kipf
OCL
158
832
0
26 Jun 2020
Linformer: Self-Attention with Linear Complexity
Sinong Wang
Belinda Z. Li
Madian Khabsa
Han Fang
Hao Ma
154
1,678
0
08 Jun 2020
Reformer: The Efficient Transformer
Nikita Kitaev
Lukasz Kaiser
Anselm Levskaya
VLM
143
2,279
0
13 Jan 2020
Measuring Compositional Generalization: A Comprehensive Method on Realistic Data
Daniel Keysers
Nathanael Scharli
Nathan Scales
Hylke Buisman
Daniel Furrer
...
Tibor Tihon
Dmitry Tsarkov
Tianlin Li
Marc van Zee
Olivier Bousquet
CoGe
50
350
0
20 Dec 2019
Compositional Generalization for Primitive Substitutions
Yuanpeng Li
Liang Zhao
Jianyu Wang
Joel Hestness
34
86
0
07 Oct 2019
Recurrent Independent Mechanisms
Anirudh Goyal
Alex Lamb
Jordan Hoffmann
Shagun Sodhani
Sergey Levine
Yoshua Bengio
Bernhard Schölkopf
54
336
0
24 Sep 2019
Deep Equilibrium Models
Shaojie Bai
J. Zico Kolter
V. Koltun
51
663
0
03 Sep 2019
Compositionality decomposed: how do neural networks generalise?
Dieuwke Hupkes
Verna Dankers
Mathijs Mul
Elia Bruni
CoGe
90
330
0
22 Aug 2019
Are Sixteen Heads Really Better than One?
Paul Michel
Omer Levy
Graham Neubig
MoE
64
1,049
0
25 May 2019
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
Elena Voita
David Talbot
F. Moiseev
Rico Sennrich
Ivan Titov
76
1,120
0
23 May 2019
Generating Long Sequences with Sparse Transformers
R. Child
Scott Gray
Alec Radford
Ilya Sutskever
62
1,880
0
23 Apr 2019
fairseq: A Fast, Extensible Toolkit for Sequence Modeling
Myle Ott
Sergey Edunov
Alexei Baevski
Angela Fan
Sam Gross
Nathan Ng
David Grangier
Michael Auli
VLM
FaML
76
3,141
0
01 Apr 2019
A Meta-Transfer Objective for Learning to Disentangle Causal Mechanisms
Yoshua Bengio
T. Deleu
Nasim Rahaman
Nan Rosemary Ke
Sébastien Lachapelle
O. Bilaniuk
Anirudh Goyal
C. Pal
CML
OOD
83
334
0
30 Jan 2019
Compositional Attention Networks for Interpretability in Natural Language Question Answering
Selvakumar Murugan
Suriyadeepan Ramamoorthy
Vaidheeswaran Archana
Malaikannan Sankarasubbu
32
3
0
30 Oct 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
870
93,936
0
11 Oct 2018
Universal Transformers
Mostafa Dehghani
Stephan Gouws
Oriol Vinyals
Jakob Uszkoreit
Lukasz Kaiser
62
752
0
10 Jul 2018
Compositional Attention Networks for Machine Reasoning
Drew A. Hudson
Christopher D. Manning
BDL
OOD
LRM
90
574
0
08 Mar 2018
Memorize or generalize? Searching for a compositional RNN in a haystack
Adam Liska
Germán Kruszewski
Marco Baroni
51
79
0
18 Feb 2018
The Consciousness Prior
Yoshua Bengio
DRL
AI4CE
34
229
0
25 Sep 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
422
129,831
0
12 Jun 2017
A simple neural network module for relational reasoning
Adam Santoro
David Raposo
David Barrett
Mateusz Malinowski
Razvan Pascanu
Peter W. Battaglia
Timothy Lillicrap
GNN
NAI
101
1,610
0
05 Jun 2017
Pointer Sentinel Mixture Models
Stephen Merity
Caiming Xiong
James Bradbury
R. Socher
RALM
147
2,783
0
26 Sep 2016
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong
Hieu H. Pham
Christopher D. Manning
297
7,942
0
17 Aug 2015
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Ke Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
277
10,034
0
10 Feb 2015
Neural Turing Machines
Alex Graves
Greg Wayne
Ivo Danihelka
61
2,318
0
20 Oct 2014
Neural Machine Translation by Jointly Learning to Align and Translate
Dzmitry Bahdanau
Kyunghyun Cho
Yoshua Bengio
AIMat
371
27,205
0
01 Sep 2014