ResearchTrend.AI
YouRefIt: Embodied Reference Understanding with Language and Gesture


8 September 2021
Yixin Chen
Qing Li
Deqian Kong
Yik Lun Kei
Song-Chun Zhu
Tao Gao
Yixin Zhu
Siyuan Huang
    LM&Ro

Papers citing "YouRefIt: Embodied Reference Understanding with Language and Gesture"

24 / 24 papers shown
Communicative Learning with Natural Gestures for Embodied Navigation Agents with Human-in-the-Scene
Qi Wu
Cheng-Ju Wu
Yixin Zhu
Jungseock Joo
70
14
0
05 Aug 2021
Individual vs. Joint Perception: a Pragmatic Model of Pointing as Communicative Smithian Helping
Kaiwen Jiang
Stephanie Stacy
Chuyu Wei
Adelpha Chan
Federico Rossano
Yixin Zhu
Tao Gao
16
5
0
03 Jun 2021
Learning Triadic Belief Dynamics in Nonverbal Communication from Videos
Lifeng Fan
Shuwen Qiu
Zilong Zheng
Tao Gao
Song-Chun Zhu
Yixin Zhu
18
24
0
07 Apr 2021
LEMMA: A Multi-view Dataset for Learning Multi-agent Multi-task Activities
Baoxiong Jia
Yixin Chen
Siyuan Huang
Yixin Zhu
Song-Chun Zhu
24
51
0
31 Jul 2020
Human-Robot Interaction in a Shared Augmented Reality Workspace
Shuwen Qiu
Hangxin Liu
Zeyu Zhang
Yixin Zhu
Song-Chun Zhu
92
23
0
24 Jul 2020
Joint Inference of States, Robot Knowledge, and Human (False-)Beliefs
Tao Yuan
Hangxin Liu
Lifeng Fan
Zilong Zheng
Tao Gao
Yixin Zhu
Song-Chun Zhu
31
21
0
25 Apr 2020
Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike Common Sense
Yixin Zhu
Tao Gao
Lifeng Fan
Siyuan Huang
Mark Edmonds
...
Chi Zhang
Siyuan Qi
Ying Nian Wu
J. Tenenbaum
Song-Chun Zhu
78
130
0
20 Apr 2020
Graph-Structured Referring Expression Reasoning in The Wild
Sibei Yang
Guanbin Li
Yizhou Yu
NAI
43
92
0
19 Apr 2020
Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation
Gen Luo
Yiyi Zhou
Xiaoshuai Sun
Liujuan Cao
Chenglin Wu
Cheng Deng
Rongrong Ji
ObjD
225
288
0
19 Mar 2020
Cops-Ref: A new Dataset and Task on Compositional Referring Expression Comprehension
Zhenfang Chen
Peng Wang
Lin Ma
Kwan-Yee K. Wong
Qi Wu
ObjD
60
68
0
01 Mar 2020
A Fast and Accurate One-Stage Approach to Visual Grounding
Zhengyuan Yang
Boqing Gong
Liwei Wang
Wenbing Huang
Dong Yu
Jiebo Luo
ObjD
43
361
0
18 Aug 2019
Cross-Modal Self-Attention Network for Referring Image Segmentation
Linwei Ye
Mrigank Rochan
Zhi Liu
Yang Wang
EgoV
30
472
0
09 Apr 2019
Improving Referring Expression Grounding with Cross-modal Attention-guided Erasing
Xihui Liu
Zihao Wang
Jing Shao
Xiaogang Wang
Hongsheng Li
ObjD
58
181
0
03 Mar 2019
Contextual Encoder-Decoder Network for Visual Saliency Prediction
Alexander Kroner
M. Senden
K. Driessens
R. Goebel
33
189
0
18 Feb 2019
CLEVR-Ref+: Diagnosing Visual Reasoning with Referring Expressions
Runtao Liu
Chenxi Liu
Yutong Bai
Alan Yuille
NAI
ObjD
52
123
0
03 Jan 2019
OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
Zhe Cao
Gines Hidalgo
Tomas Simon
S. Wei
Yaser Sheikh
3DH
CVBM
106
4,554
0
18 Dec 2018
Free-Form Image Inpainting with Gated Convolution
Jiahui Yu
Zhe Lin
Jimei Yang
Xiaohui Shen
Xin Lu
Thomas Huang
DRL
45
1,707
0
10 Jun 2018
YOLOv3: An Incremental Improvement
Joseph Redmon
Ali Farhadi
ObjD
93
21,306
0
08 Apr 2018
MAttNet: Modular Attention Network for Referring Expression Comprehension
Licheng Yu
Zhe Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Joey Tianyi Zhou
Tamara L. Berg
ObjD
94
822
0
24 Jan 2018
Generative Image Inpainting with Contextual Attention
Jiahui Yu
Zhe Lin
Jimei Yang
Xiaohui Shen
Xin Lu
Thomas S. Huang
GAN
DiffM
71
2,255
0
24 Jan 2018
GuessWhat?! Visual object discovery through multi-modal dialogue
H. D. Vries
Florian Strub
A. Chandar
Olivier Pietquin
Hugo Larochelle
Aaron Courville
VLM
80
428
0
23 Nov 2016
Grounding of Textual Phrases in Images by Reconstruction
Anna Rohrbach
Marcus Rohrbach
Ronghang Hu
Trevor Darrell
Bernt Schiele
60
497
0
12 Nov 2015
Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting
Xingjian Shi
Zhourong Chen
Hao Wang
Dit-Yan Yeung
W. Wong
W. Woo
471
7,952
0
13 Jun 2015
Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models
Bryan A. Plummer
Liwei Wang
Christopher M. Cervantes
Juan C. Caicedo
Julia Hockenmaier
Svetlana Lazebnik
177
2,033
0
19 May 2015