2109.03413
YouRefIt: Embodied Reference Understanding with Language and Gesture
8 September 2021
Yixin Chen
Qing Li
Deqian Kong
Yik Lun Kei
Song-Chun Zhu
Tao Gao
Yixin Zhu
Siyuan Huang
LM&Ro
Papers citing
"YouRefIt: Embodied Reference Understanding with Language and Gesture"
24 / 24 papers shown
Title
Communicative Learning with Natural Gestures for Embodied Navigation Agents with Human-in-the-Scene
Qi Wu
Cheng-Ju Wu
Yixin Zhu
Jungseock Joo
70
14
0
05 Aug 2021
Individual vs. Joint Perception: a Pragmatic Model of Pointing as Communicative Smithian Helping
Kaiwen Jiang
Stephanie Stacy
Chuyu Wei
Adelpha Chan
Federico Rossano
Yixin Zhu
Tao Gao
16
5
0
03 Jun 2021
Learning Triadic Belief Dynamics in Nonverbal Communication from Videos
Lifeng Fan
Shuwen Qiu
Zilong Zheng
Tao Gao
Song-Chun Zhu
Yixin Zhu
18
24
0
07 Apr 2021
LEMMA: A Multi-view Dataset for Learning Multi-agent Multi-task Activities
Baoxiong Jia
Yixin Chen
Siyuan Huang
Yixin Zhu
Song-Chun Zhu
24
51
0
31 Jul 2020
Human-Robot Interaction in a Shared Augmented Reality Workspace
Shuwen Qiu
Hangxin Liu
Zeyu Zhang
Yixin Zhu
Song-Chun Zhu
92
23
0
24 Jul 2020
Joint Inference of States, Robot Knowledge, and Human (False-)Beliefs
Tao Yuan
Hangxin Liu
Lifeng Fan
Zilong Zheng
Tao Gao
Yixin Zhu
Song-Chun Zhu
31
21
0
25 Apr 2020
Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike Common Sense
Yixin Zhu
Tao Gao
Lifeng Fan
Siyuan Huang
Mark Edmonds
...
Chi Zhang
Siyuan Qi
Ying Nian Wu
J. Tenenbaum
Song-Chun Zhu
78
130
0
20 Apr 2020
Graph-Structured Referring Expression Reasoning in The Wild
Sibei Yang
Guanbin Li
Yizhou Yu
NAI
43
92
0
19 Apr 2020
Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation
Gen Luo
Yiyi Zhou
Xiaoshuai Sun
Liujuan Cao
Chenglin Wu
Cheng Deng
Rongrong Ji
ObjD
225
288
0
19 Mar 2020
Cops-Ref: A new Dataset and Task on Compositional Referring Expression Comprehension
Zhenfang Chen
Peng Wang
Lin Ma
Kwan-Yee K. Wong
Qi Wu
ObjD
60
68
0
01 Mar 2020
A Fast and Accurate One-Stage Approach to Visual Grounding
Zhengyuan Yang
Boqing Gong
Liwei Wang
Wenbing Huang
Dong Yu
Jiebo Luo
ObjD
43
361
0
18 Aug 2019
Cross-Modal Self-Attention Network for Referring Image Segmentation
Linwei Ye
Mrigank Rochan
Zhi Liu
Yang Wang
EgoV
30
472
0
09 Apr 2019
Improving Referring Expression Grounding with Cross-modal Attention-guided Erasing
Xihui Liu
Zihao Wang
Jing Shao
Xiaogang Wang
Hongsheng Li
ObjD
58
181
0
03 Mar 2019
Contextual Encoder-Decoder Network for Visual Saliency Prediction
Alexander Kroner
M. Senden
K. Driessens
R. Goebel
33
189
0
18 Feb 2019
CLEVR-Ref+: Diagnosing Visual Reasoning with Referring Expressions
Runtao Liu
Chenxi Liu
Yutong Bai
Alan Yuille
NAI
ObjD
52
123
0
03 Jan 2019
OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
Zhe Cao
Gines Hidalgo
Tomas Simon
S. Wei
Yaser Sheikh
3DH
CVBM
106
4,554
0
18 Dec 2018
Free-Form Image Inpainting with Gated Convolution
Jiahui Yu
Zhe Lin
Jimei Yang
Xiaohui Shen
Xin Lu
Thomas Huang
DRL
45
1,707
0
10 Jun 2018
YOLOv3: An Incremental Improvement
Joseph Redmon
Ali Farhadi
ObjD
93
21,306
0
08 Apr 2018
MAttNet: Modular Attention Network for Referring Expression Comprehension
Licheng Yu
Zhe Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Joey Tianyi Zhou
Tamara L. Berg
ObjD
94
822
0
24 Jan 2018
Generative Image Inpainting with Contextual Attention
Jiahui Yu
Zhe Lin
Jimei Yang
Xiaohui Shen
Xin Lu
Thomas S. Huang
GAN
DiffM
71
2,255
0
24 Jan 2018
GuessWhat?! Visual object discovery through multi-modal dialogue
H. D. Vries
Florian Strub
A. Chandar
Olivier Pietquin
Hugo Larochelle
Aaron Courville
VLM
80
428
0
23 Nov 2016
Grounding of Textual Phrases in Images by Reconstruction
Anna Rohrbach
Marcus Rohrbach
Ronghang Hu
Trevor Darrell
Bernt Schiele
60
497
0
12 Nov 2015
Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting
Xingjian Shi
Zhourong Chen
Hao Wang
Dit-Yan Yeung
W. Wong
W. Woo
471
7,952
0
13 Jun 2015
Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models
Bryan A. Plummer
Liwei Wang
Christopher M. Cervantes
Juan C. Caicedo
Julia Hockenmaier
Svetlana Lazebnik
177
2,033
0
19 May 2015