1806.03831
Cited By
Interactive Visual Grounding of Referring Expressions for Human-Robot Interaction
11 June 2018
Mohit Shridhar
David Hsu
Papers citing "Interactive Visual Grounding of Referring Expressions for Human-Robot Interaction"
15 / 15 papers shown
Title
Deformable Attentive Visual Enhancement for Referring Segmentation Using Vision-Language Model
Alaa Dalaq
Muzammil Behzad
VLM
163
0
0
25 May 2025
RAIDER: Tool-Equipped Large Language Model Agent for Robotic Action Issue Detection, Explanation and Recovery
Silvia Izquierdo-Badiola
Carlos Rizzo
Guillem Alenyà
LLMAG
LM&Ro
141
0
0
22 Mar 2025
HiFi-CS: Towards Open Vocabulary Visual Grounding For Robotic Grasping Using Vision-Language Models
V. Bhat
Prashanth Krishnamurthy
Ramesh Karri
Farshad Khorrami
97
5
0
16 Sep 2024
General-purpose Clothes Manipulation with Semantic Keypoints
Yuhong Deng
David Hsu
100
2
0
15 Aug 2024
Interactively Picking Real-World Objects with Unconstrained Spoken Language Instructions
Jun Hatori
Yuta Kikuchi
Sosuke Kobayashi
K. Takahashi
Yuta Tsuboi
Y. Unno
W. Ko
Jethro Tan
59
161
0
17 Oct 2017
A simple neural network module for relational reasoning
Adam Santoro
David Raposo
David Barrett
Mateusz Malinowski
Razvan Pascanu
Peter W. Battaglia
Timothy Lillicrap
GNN
NAI
177
1,614
0
05 Jun 2017
Dex-Net 2.0: Deep Learning to Plan Robust Grasps with Synthetic Point Clouds and Analytic Grasp Metrics
Jeffrey Mahler
Jacky Liang
Sherdil Niyaz
Michael Laskey
R. Doan
Xinyu Liu
J. A. Ojea
Ken Goldberg
3DPC
3DV
100
1,267
0
27 Mar 2017
A Joint Speaker-Listener-Reinforcer Model for Referring Expressions
Licheng Yu
Hao Tan
Joey Tianyi Zhou
Tamara L. Berg
ObjD
94
275
0
30 Dec 2016
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
Justin Johnson
B. Hariharan
Laurens van der Maaten
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
CoGe
295
2,375
0
20 Dec 2016
Modeling Relationships in Referential Expressions with Compositional Modular Networks
Ronghang Hu
Marcus Rohrbach
Jacob Andreas
Trevor Darrell
Kate Saenko
75
406
0
30 Nov 2016
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
...
Yannis Kalantidis
Li Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li
215
5,743
0
23 Feb 2016
DenseCap: Fully Convolutional Localization Networks for Dense Captioning
Justin Johnson
A. Karpathy
Li Fei-Fei
VLM
129
1,169
0
24 Nov 2015
Natural Language Object Retrieval
Ronghang Hu
Huazhe Xu
Marcus Rohrbach
Jiashi Feng
Kate Saenko
Trevor Darrell
ObjD
94
553
0
13 Nov 2015
Generation and Comprehension of Unambiguous Object Descriptions
Junhua Mao
Jonathan Huang
Alexander Toshev
Oana-Maria Camburu
Alan Yuille
Kevin Patrick Murphy
ObjD
118
1,345
0
07 Nov 2015
A Joint Model of Language and Perception for Grounded Attribute Learning
Cynthia Matuszek
Nicholas FitzGerald
Luke Zettlemoyer
Liefeng Bo
Dieter Fox
LM&Ro
82
316
0
27 Jun 2012