Attention over learned object embeddings enables complex visual reasoning

15 December 2020
David Ding
Felix Hill
Adam Santoro
Malcolm Reynolds
M. Botvinick
    OCL

Papers citing "Attention over learned object embeddings enables complex visual reasoning"

40 / 40 papers shown
Understanding the Limits of Vision Language Models Through the Lens of the Binding Problem
Declan Campbell
Sunayana Rane
Tyler Giallanza
Nicolò De Sabbata
Kia Ghods
...
Alexander Ku
Steven M. Frankland
Thomas Griffiths
Jonathan D. Cohen
Taylor W. Webb
63
15
0
31 Oct 2024
Compositional Physical Reasoning of Objects and Events from Videos
Zhenfang Chen
Shilong Dong
Kexin Yi
Yunzhu Li
Mingyu Ding
Antonio Torralba
Joshua B. Tenenbaum
Chuang Gan
OCL
71
1
0
02 Aug 2024
Exploring the Effectiveness of Object-Centric Representations in Visual Question Answering: Comparative Insights with Foundation Models
Amir Mohammad Karimi Mamaghan
Samuele Papa
Karl Henrik Johansson
Stefan Bauer
Andrea Dittadi
OCL
66
7
0
22 Jul 2024
CLEVRER-Humans: Describing Physical and Causal Events the Human Way
Jiayuan Mao
Xuelin Yang
Xikun Zhang
Noah D. Goodman
Jiajun Wu
NAI
39
22
0
05 Oct 2023
SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events
Li Xu
He Huang
Jun Liu
ViT
LRM
56
83
0
29 Mar 2021
ACRE: Abstract Causal REasoning Beyond Covariation
Chi Zhang
Baoxiong Jia
Mark Edmonds
Song-Chun Zhu
Yixin Zhu
CML
79
48
0
26 Mar 2021
Hopper: Multi-hop Transformer for Spatiotemporal Reasoning
Honglu Zhou
Asim Kadav
Farley Lai
Alexandru Niculescu-Mizil
Martin Renqiang Min
Mubbasir Kapadia
H. Graf
LRM
51
18
0
19 Mar 2021
Coordination Among Neural Modules Through a Shared Global Workspace
Anirudh Goyal
Aniket Didolkar
Alex Lamb
Kartikeya Badola
Nan Rosemary Ke
Nasim Rahaman
Jonathan Binas
Charles Blundell
Michael C. Mozer
Yoshua Bengio
167
98
0
01 Mar 2021
Unsupervised Discovery of 3D Physical Objects from Video
Yilun Du
Kevin A. Smith
Tomer Ullman
J. Tenenbaum
Jiajun Wu
OCL
135
38
0
24 Jul 2020
Object-Centric Learning with Slot Attention
Francesco Locatello
Dirk Weissenborn
Thomas Unterthiner
Aravindh Mahendran
G. Heigold
Jakob Uszkoreit
Alexey Dosovitskiy
Thomas Kipf
OCL
128
832
0
26 Jun 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
321
41,106
0
28 May 2020
End-to-End Object Detection with Transformers
Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
ViT
3DV
PINN
221
12,847
0
26 May 2020
Learning Object Permanence from Video
Aviv Shamsian
Ofri Kleinfeld
Amir Globerson
Gal Chechik
SSL
57
31
0
23 Mar 2020
A Simple Framework for Contrastive Learning of Visual Representations
Ting Chen
Simon Kornblith
Mohammad Norouzi
Geoffrey E. Hinton
SSL
119
18,523
0
13 Feb 2020
SPACE: Unsupervised Object-Oriented Scene Representation via Spatial Attention and Decomposition
Zhixuan Lin
Yi-Fu Wu
Skand Peri
Weihao Sun
Gautam Singh
Fei Deng
Jindong Jiang
Sungjin Ahn
BDL
OCL
3DPC
102
247
0
08 Jan 2020
Deep Learning for Symbolic Mathematics
Guillaume Lample
François Charton
3DGS
46
406
0
02 Dec 2019
CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning
Rohit Girdhar
Deva Ramanan
36
177
0
10 Oct 2019
CLEVRER: CoLlision Events for Video REpresentation and Reasoning
Kexin Yi
Chuang Gan
Yunzhu Li
Pushmeet Kohli
Jiajun Wu
Antonio Torralba
J. Tenenbaum
NAI
60
461
0
03 Oct 2019
Video Representation Learning by Dense Predictive Coding
Tengda Han
Weidi Xie
Andrew Zisserman
SSL
53
359
0
10 Sep 2019
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
Weijie Su
Xizhou Zhu
Yue Cao
Bin Li
Lewei Lu
Furu Wei
Jifeng Dai
VLM
MLLM
SSL
97
1,657
0
22 Aug 2019
LXMERT: Learning Cross-Modality Encoder Representations from Transformers
Hao Tan
Mohit Bansal
VLM
MLLM
177
2,467
0
20 Aug 2019
VisualBERT: A Simple and Performant Baseline for Vision and Language
Liunian Harold Li
Mark Yatskar
Da Yin
Cho-Jui Hsieh
Kai-Wei Chang
VLM
92
1,939
0
09 Aug 2019
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
Jiasen Lu
Dhruv Batra
Devi Parikh
Stefan Lee
SSL
VLM
160
3,659
0
06 Aug 2019
Shaping Belief States with Generative Environment Models for RL
Karol Gregor
Danilo Jimenez Rezende
F. Besse
Yan Wu
Hamza Merzic
Aaron van den Oord
OffRL
AI4CE
55
118
0
21 Jun 2019
Learning Video Representations using Contrastive Bidirectional Transformer
Chen Sun
Fabien Baradel
Kevin Patrick Murphy
Cordelia Schmid
SSL
ViT
88
133
0
13 Jun 2019
VideoBERT: A Joint Model for Video and Language Representation Learning
Chen Sun
Austin Myers
Carl Vondrick
Kevin Patrick Murphy
Cordelia Schmid
VLM
SSL
26
1,238
0
03 Apr 2019
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
Yang You
Jing Li
Sashank J. Reddi
Jonathan Hseu
Sanjiv Kumar
Srinadh Bhojanapalli
Xiaodan Song
J. Demmel
Kurt Keutzer
Cho-Jui Hsieh
ODL
94
991
0
01 Apr 2019
Multi-Object Representation Learning with Iterative Variational Inference
Klaus Greff
Raphael Lopez Kaufman
Rishabh Kabra
Nicholas Watters
Christopher P. Burgess
Daniel Zoran
Loic Matthey
M. Botvinick
Alexander Lerchner
OCL
SSL
49
505
0
01 Mar 2019
MONet: Unsupervised Scene Decomposition and Representation
Christopher P. Burgess
Loic Matthey
Nicholas Watters
Rishabh Kabra
I. Higgins
M. Botvinick
Alexander Lerchner
OCL
44
519
0
22 Jan 2019
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
VLM
75
3,707
0
09 Jan 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
603
93,936
0
11 Oct 2018
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding
Kexin Yi
Jiajun Wu
Chuang Gan
Antonio Torralba
Pushmeet Kohli
J. Tenenbaum
NAI
57
606
0
04 Oct 2018
Compositional Attention Networks for Machine Reasoning
Drew A. Hudson
Christopher D. Manning
BDL
OOD
LRM
56
574
0
08 Mar 2018
Object-based reasoning in VQA
Mikyas T. Desta
Larry Chen
Tomasz Kornuta
45
33
0
29 Jan 2018
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
230
129,831
0
12 Jun 2017
Mask R-CNN
Kaiming He
Georgia Gkioxari
Piotr Dollár
Ross B. Girshick
ObjD
224
27,018
0
20 Mar 2017
Discovering objects and their relations from entangled scene representations
David Raposo
Adam Santoro
David Barrett
Razvan Pascanu
Timothy Lillicrap
Peter W. Battaglia
GNN
OCL
41
71
0
16 Feb 2017
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
Justin Johnson
B. Hariharan
Laurens van der Maaten
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
CoGe
248
2,346
0
20 Dec 2016
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Shaoqing Ren
Kaiming He
Ross B. Girshick
Jian Sun
AIMat
ObjD
321
61,900
0
04 Jun 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
262
149,474
0
22 Dec 2014