1908.04950
VideoNavQA: Bridging the Gap between Visual and Embodied Question Answering
14 August 2019
Cătălina Cangea
Eugene Belilovsky
Pietro Liò
Aaron Courville
Papers citing "VideoNavQA: Bridging the Gap between Visual and Embodied Question Answering"
35 / 35 papers shown
Title
Multi-Target Embodied Question Answering
Licheng Yu
Xinlei Chen
Georgia Gkioxari
Mohit Bansal
Tamara L. Berg
Dhruv Batra
63
104
0
09 Apr 2019
Embodied Question Answering in Photorealistic Environments with Point Cloud Perception
Erik Wijmans
Samyak Datta
Oleksandr Maksymets
Abhishek Das
Georgia Gkioxari
Stefan Lee
Irfan Essa
Devi Parikh
Dhruv Batra
3DPC
LM&Ro
75
168
0
06 Apr 2019
Habitat: A Platform for Embodied AI Research
Manolis Savva
Abhishek Kadian
Oleksandr Maksymets
Yili Zhao
Erik Wijmans
...
Jia Liu
V. Koltun
Jitendra Malik
Devi Parikh
Dhruv Batra
LM&Ro
115
1,407
0
02 Apr 2019
GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering
Drew A. Hudson
Christopher D. Manning
CoGe
NAI
59
137
0
25 Feb 2019
MUREL: Multimodal Relational Reasoning for Visual Question Answering
Rémi Cadène
H. Ben-younes
Matthieu Cord
Nicolas Thome
LRM
62
276
0
25 Feb 2019
Benchmarking Classic and Learned Navigation in Complex 3D Environments
Dmytro Mishkin
Alexey Dosovitskiy
V. Koltun
113
75
0
30 Jan 2019
From FiLM to Video: Multi-turn Question Answering with Multi-modal Context
T. Nguyen
Shikhar Sharma
Hannes Schulz
Layla El Asri
40
33
0
17 Dec 2018
Blindfold Baselines for Embodied QA
Ankesh Anand
Eugene Belilovsky
Kyle Kastner
Hugo Larochelle
Aaron Courville
77
45
0
12 Nov 2018
Neural Modular Control for Embodied Question Answering
Abhishek Das
Georgia Gkioxari
Stefan Lee
Devi Parikh
Dhruv Batra
LM&Ro
177
130
0
26 Oct 2018
TVQA: Localized, Compositional Video Question Answering
Jie Lei
Licheng Yu
Mohit Bansal
Tamara L. Berg
90
640
0
05 Sep 2018
Visual Reasoning with Multi-hop Feature Modulation
Florian Strub
Mathieu Seurin
Ethan Perez
H. de Vries
Jérémie Mary
Philippe Preux
Aaron Courville
Olivier Pietquin
63
26
0
03 Aug 2018
Learning Visual Question Answering by Bootstrapping Hard Attention
Mateusz Malinowski
Carl Doersch
Adam Santoro
Peter W. Battaglia
OOD
53
96
0
01 Aug 2018
Learning Conditioned Graph Structures for Interpretable Visual Question Answering
Will Norcliffe-Brown
Efstathios Vafeias
Sarah Parisot
GNN
66
237
0
19 Jun 2018
Compositional Attention Networks for Machine Reasoning
Drew A. Hudson
Christopher D. Manning
BDL
OOD
LRM
193
577
0
08 Mar 2018
Building Generalizable Agents with a Realistic and Rich 3D Environment
Yi Wu
Yuxin Wu
Georgia Gkioxari
Yuandong Tian
3DV
128
338
0
07 Jan 2018
AI2-THOR: An Interactive 3D Environment for Visual AI
Eric Kolve
Roozbeh Mottaghi
Winson Han
Eli VanderBilt
Luca Weihs
...
Daniel Gordon
Yuke Zhu
Aniruddha Kembhavi
Abhinav Gupta
Ali Farhadi
LM&Ro
79
1,101
0
14 Dec 2017
IQA: Visual Question Answering in Interactive Environments
Daniel Gordon
Aniruddha Kembhavi
Mohammad Rastegari
Joseph Redmon
Dieter Fox
Ali Farhadi
LM&Ro
88
390
0
09 Dec 2017
Embodied Question Answering
Abhishek Das
Samyak Datta
Georgia Gkioxari
Stefan Lee
Devi Parikh
Dhruv Batra
LM&Ro
93
646
0
30 Nov 2017
FiLM: Visual Reasoning with a General Conditioning Layer
Ethan Perez
Florian Strub
H. de Vries
Vincent Dumoulin
Aaron Courville
FAtt
AIMat
OffRL
AI4CE
352
2,212
0
22 Sep 2017
Learning to Reason: End-to-End Module Networks for Visual Question Answering
Ronghang Hu
Jacob Andreas
Marcus Rohrbach
Trevor Darrell
Kate Saenko
KELM
GNN
ReLM
LRM
129
578
0
18 Apr 2017
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
Justin Johnson
B. Hariharan
Laurens van der Maaten
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
CoGe
301
2,378
0
20 Dec 2016
MarioQA: Answering Questions by Watching Gameplay Videos
Jonghwan Mun
Paul Hongsuck Seo
Ilchae Jung
Bohyung Han
83
109
0
06 Dec 2016
Semantic Scene Completion from a Single Depth Image
Shuran Song
Feng Yu
Andy Zeng
Angel X. Chang
Manolis Savva
Thomas Funkhouser
3DV
84
1,243
0
28 Nov 2016
Graph-Structured Representations for Visual Question Answering
Damien Teney
Lingqiao Liu
Anton Van Den Hengel
GNN
NAI
99
420
0
19 Sep 2016
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
299
1,465
0
06 Jun 2016
MovieQA: Understanding Stories in Movies through Question-Answering
Makarand Tapaswi
Yukun Zhu
Rainer Stiefelhagen
Antonio Torralba
R. Urtasun
Sanja Fidler
115
749
0
09 Dec 2015
Stacked Attention Networks for Image Question Answering
Zichao Yang
Xiaodong He
Jianfeng Gao
Li Deng
Alex Smola
BDL
107
1,882
0
07 Nov 2015
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Shaoqing Ren
Kaiming He
Ross B. Girshick
Jian Sun
AIMat
ObjD
520
62,294
0
04 Jun 2015
Exploring Models and Data for Image Question Answering
Mengye Ren
Ryan Kiros
R. Zemel
80
715
0
08 May 2015
VQA: Visual Question Answering
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
202
5,478
0
03 May 2015
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe
Christian Szegedy
OOD
463
43,305
0
11 Feb 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.8K
150,115
0
22 Dec 2014
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan
Andrew Zisserman
FAtt
MDE
1.6K
100,386
0
04 Sep 2014
Neural Machine Translation by Jointly Learning to Align and Translate
Dzmitry Bahdanau
Kyunghyun Cho
Yoshua Bengio
AIMat
558
27,311
0
01 Sep 2014
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
Kyunghyun Cho
B. van Merriënboer
Çağlar Gülçehre
Dzmitry Bahdanau
Fethi Bougares
Holger Schwenk
Yoshua Bengio
AIMat
1.0K
23,354
0
03 Jun 2014