Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2103.12944
Cited By
Scene-Intuitive Agent for Remote Embodied Visual Grounding
24 March 2021
Xiangru Lin
Guanbin Li
Yizhou Yu
LM&Ro
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Scene-Intuitive Agent for Remote Embodied Visual Grounding"
38 / 38 papers shown
Title
Do RNN and LSTM have Long Memory?
Jingyu Zhao
Feiqing Huang
Jia Lv
Yanjie Duan
Zhen Qin
Guodong Li
Guangjian Tian
92
142
0
06 Jun 2020
Improving Vision-and-Language Navigation with Image-Text Pairs from the Web
Arjun Majumdar
Ayush Shrivastava
Stefan Lee
Peter Anderson
Devi Parikh
Dhruv Batra
LM&Ro
115
231
0
30 Apr 2020
Graph-Structured Referring Expression Reasoning in The Wild
Sibei Yang
Guanbin Li
Yizhou Yu
NAI
43
92
0
19 Apr 2020
Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training
Weituo Hao
Chunyuan Li
Xiujun Li
Lawrence Carin
Jianfeng Gao
LM&Ro
50
276
0
25 Feb 2020
12-in-1: Multi-Task Vision and Language Representation Learning
Jiasen Lu
Vedanuj Goswami
Marcus Rohrbach
Devi Parikh
Stefan Lee
VLM
ObjD
66
476
0
05 Dec 2019
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
261
42,038
0
03 Dec 2019
Vision-Language Navigation with Self-Supervised Auxiliary Reasoning Tasks
Fengda Zhu
Yi Zhu
Xiaojun Chang
Xiaodan Liang
LRM
46
239
0
18 Nov 2019
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
Weijie Su
Xizhou Zhu
Yue Cao
Bin Li
Lewei Lu
Furu Wei
Jifeng Dai
VLM
MLLM
SSL
124
1,657
0
22 Aug 2019
LXMERT: Learning Cross-Modality Encoder Representations from Transformers
Hao Hao Tan
Joey Tianyi Zhou
VLM
MLLM
205
2,467
0
20 Aug 2019
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
Jiasen Lu
Dhruv Batra
Devi Parikh
Stefan Lee
SSL
VLM
197
3,659
0
06 Aug 2019
Chasing Ghosts: Instruction Following as Bayesian State Tracking
Peter Anderson
Ayush Shrivastava
Devi Parikh
Dhruv Batra
Stefan Lee
42
73
0
03 Jul 2019
Relationship-Embedded Representation Learning for Grounding Referring Expressions
Sibei Yang
Guanbin Li
Yizhou Yu
ObjD
44
53
0
11 Jun 2019
REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments
Yuankai Qi
Qi Wu
Peter Anderson
Xinze Wang
Wenjie Wang
Chunhua Shen
Anton Van Den Hengel
LM&Ro
76
320
0
23 Apr 2019
Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout
Hao Tan
Licheng Yu
Joey Tianyi Zhou
SSL
56
317
0
08 Apr 2019
Scene Memory Transformer for Embodied Agents in Long-Horizon Tasks
Kuan Fang
Alexander Toshev
Li Fei-Fei
Silvio Savarese
OffRL
103
200
0
09 Mar 2019
Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation
Liyiming Ke
Xiujun Li
Yonatan Bisk
Ari Holtzman
Zhe Gan
Jingjing Liu
Jianfeng Gao
Yejin Choi
S. Srinivasa
67
168
0
06 Mar 2019
Improving Referring Expression Grounding with Cross-modal Attention-guided Erasing
Xihui Liu
Zihao Wang
Jing Shao
Xiaogang Wang
Hongsheng Li
ObjD
58
181
0
03 Mar 2019
Self-Monitoring Navigation Agent via Auxiliary Progress Estimation
Chih-Yao Ma
Jiasen Lu
Zuxuan Wu
G. Al-Regib
Z. Kira
R. Socher
Caiming Xiong
LM&Ro
53
275
0
10 Jan 2019
Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments
Howard Chen
Alane Suhr
Dipendra Kumar Misra
Noah Snavely
Yoav Artzi
71
384
0
29 Nov 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
953
93,936
0
11 Oct 2018
Speaker-Follower Models for Vision-and-Language Navigation
Daniel Fried
Ronghang Hu
Volkan Cirik
Anna Rohrbach
Jacob Andreas
Louis-Philippe Morency
Taylor Berg-Kirkpatrick
Kate Saenko
Dan Klein
Trevor Darrell
LM&Ro
LRM
307
499
0
07 Jun 2018
Visual Question Reasoning on General Dependency Tree
Qingxing Cao
Xiaodan Liang
Bailin Li
Guanbin Li
Liang Lin
CoGe
39
37
0
31 Mar 2018
Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation
Xin Eric Wang
Wenhan Xiong
Hongmin Wang
William Yang Wang
56
200
0
21 Mar 2018
Semi-parametric Topological Memory for Navigation
Nikolay Savinov
Alexey Dosovitskiy
V. Koltun
47
379
0
01 Mar 2018
MAttNet: Modular Attention Network for Referring Expression Comprehension
Licheng Yu
Zhe Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Joey Tianyi Zhou
Tamara L. Berg
ObjD
94
822
0
24 Jan 2018
Building Generalizable Agents with a Realistic and Rich 3D Environment
Yi Wu
Yuxin Wu
Georgia Gkioxari
Yuandong Tian
3DV
115
338
0
07 Jan 2018
Embodied Question Answering
Abhishek Das
Samyak Datta
Georgia Gkioxari
Stefan Lee
Devi Parikh
Dhruv Batra
LM&Ro
71
644
0
30 Nov 2017
Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments
Peter Anderson
Qi Wu
Damien Teney
Jake Bruce
Mark Johnson
Niko Sünderhauf
Ian Reid
Stephen Gould
Anton Van Den Hengel
LM&Ro
79
1,299
0
20 Nov 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
443
129,831
0
12 Jun 2017
Mask R-CNN
Kaiming He
Georgia Gkioxari
Piotr Dollár
Ross B. Girshick
ObjD
300
27,018
0
20 Mar 2017
Neural Map: Structured Memory for Deep Reinforcement Learning
Emilio Parisotto
Ruslan Salakhutdinov
52
258
0
27 Feb 2017
Cognitive Mapping and Planning for Visual Navigation
Saurabh Gupta
Varun Tolani
James Davidson
Sergey Levine
Rahul Sukthankar
Jitendra Malik
54
713
0
13 Feb 2017
Reinforcement Learning with Unsupervised Auxiliary Tasks
Max Jaderberg
Volodymyr Mnih
Wojciech M. Czarnecki
Tom Schaul
Joel Z Leibo
David Silver
Koray Kavukcuoglu
SSL
43
1,225
0
16 Nov 2016
Learning to Navigate in Complex Environments
Piotr Wojciech Mirowski
Razvan Pascanu
Fabio Viola
Hubert Soyer
Andy Ballard
...
Ross Goroshin
Laurent Sifre
Koray Kavukcuoglu
D. Kumaran
R. Hadsell
67
876
0
11 Nov 2016
Predicting Personal Traits from Facial Images using Convolutional Neural Networks Augmented with Facial Landmark Information
Yoad Lewenberg
Valliappa Chockalingam
Satinder Singh
Honglak Lee
CVBM
43
304
0
29 May 2016
Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books
Yukun Zhu
Ryan Kiros
R. Zemel
Ruslan Salakhutdinov
R. Urtasun
Antonio Torralba
Sanja Fidler
99
2,529
0
22 Jun 2015
VQA: Visual Question Answering
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
145
5,421
0
03 May 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
813
149,474
0
22 Dec 2014
1