HiFi-CS: Towards Open Vocabulary Visual Grounding For Robotic Grasping Using Vision-Language Models

16 September 2024
V. Bhat
Prashanth Krishnamurthy
Ramesh Karri
Farshad Khorrami
arXiv: 2409.10419

Papers citing "HiFi-CS: Towards Open Vocabulary Visual Grounding For Robotic Grasping Using Vision-Language Models"

20 / 70 papers shown
TransVG: End-to-End Visual Grounding with Transformers
Jiajun Deng
Zhengyuan Yang
Tianlang Chen
Wen-gang Zhou
Houqiang Li
ViT
74
345
0
17 Apr 2021
A Joint Network for Grasp Detection Conditioned on Natural Language Commands
Yiye Chen
Ruinian Xu
Yunzhi Lin
Patricio A. Vela
91
46
0
01 Apr 2021
Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD Images
Haolin Liu
Anran Lin
Xiaoguang Han
Lei Yang
Yizhou Yu
Shuguang Cui
69
40
0
14 Mar 2021
RGB Matters: Learning 7-DoF Grasp Poses on Monocular RGBD Images
Minghao Gou
Haoshu Fang
Zhanda Zhu
Shengwei Xu
Chenxi Wang
Cewu Lu
65
101
0
03 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP, VLM
978
29,871
0
26 Feb 2021
A Recurrent Vision-and-Language BERT for Navigation
Yicong Hong
Qi Wu
Yuankai Qi
Cristian Rodriguez-Opazo
Stephen Gould
LM&Ro
104
302
0
26 Nov 2020
ACRONYM: A Large-Scale Grasp Dataset Based on Simulation
Clemens Eppner
Arsalan Mousavian
Dieter Fox
105
210
0
18 Nov 2020
Real-Time Deep Learning Approach to Visual Servo Control and Grasp Detection for Autonomous Robotic Manipulation
E. G. Ribeiro
R. Q. Mendes
V. Grassi
59
60
0
13 Oct 2020
GSNet: Joint Vehicle Pose and Shape Reconstruction with Geometrical and Scene-aware Supervision
Lei Ke
Shichao Li
Yanan Sun
Yu-Wing Tai
Chi-Keung Tang
3DPC
48
48
0
26 Jul 2020
ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
Dave Zhenyu Chen
Angel X. Chang
Matthias Nießner
3DPC
91
378
0
18 Dec 2019
Grasping in the Wild: Learning 6DoF Closed-Loop Grasping from Low-Cost Demonstrations
Shuran Song
Andy Zeng
Johnny Lee
Thomas Funkhouser
70
228
0
09 Dec 2019
Interactive Visual Grounding of Referring Expressions for Human-Robot Interaction
Mohit Shridhar
David Hsu
64
144
0
11 Jun 2018
Jacquard: A Large Scale Dataset for Robotic Grasp Detection
Amaury Depierre
Emmanuel Dellandrea
Liming Chen
101
319
0
30 Mar 2018
MAttNet: Modular Attention Network for Referring Expression Comprehension
Licheng Yu
Zhe Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Joey Tianyi Zhou
Tamara L. Berg
ObjD
117
831
0
24 Jan 2018
FiLM: Visual Reasoning with a General Conditioning Layer
Ethan Perez
Florian Strub
H. D. Vries
Vincent Dumoulin
Aaron Courville
FAtt, AIMat, OffRL, AI4CE
372
2,236
0
22 Sep 2017
Modulating early visual processing by language
H. D. Vries
Florian Strub
Jérémie Mary
Hugo Larochelle
Olivier Pietquin
Aaron Courville
135
489
0
02 Jul 2017
Modeling Context Between Objects for Referring Expression Understanding
Varun K. Nagaraja
Vlad I. Morariu
Larry S. Davis
74
154
0
01 Aug 2016
Modeling Context in Referring Expressions
Licheng Yu
Patrick Poirson
Shan Yang
Alexander C. Berg
Tamara L. Berg
131
1,277
0
31 Jul 2016
Natural Language Object Retrieval
Ronghang Hu
Huazhe Xu
Marcus Rohrbach
Jiashi Feng
Kate Saenko
Trevor Darrell
ObjD
101
554
0
13 Nov 2015
Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models
Bryan A. Plummer
Liwei Wang
Christopher M. Cervantes
Juan C. Caicedo
Julia Hockenmaier
Svetlana Lazebnik
208
2,074
0
19 May 2015