Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2406.10721
Cited By
RoboPoint: A Vision-Language Model for Spatial Affordance Prediction for Robotics
15 June 2024
Wentao Yuan
Jiafei Duan
Valts Blukis
Wilbert Pumacay
Ranjay Krishna
Adithyavairavan Murali
Arsalan Mousavian
Dieter Fox
LM&Ro
Re-assign community
ArXiv
PDF
HTML
Papers citing
"RoboPoint: A Vision-Language Model for Spatial Affordance Prediction for Robotics"
28 / 28 papers shown
Title
From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation
Yifu Yuan
Haiqin Cui
Yibin Chen
Zibin Dong
Fei Ni
Longxin Kou
Jinyi Liu
Pengyi Li
Yan Zheng
Jianye Hao
28
0
0
13 May 2025
Pixel Motion as Universal Representation for Robot Control
Kanchana Ranasinghe
Xiang Li
Cristina Mata
J. Park
Michael S. Ryoo
VGen
32
0
0
12 May 2025
Mapping User Trust in Vision Language Models: Research Landscape, Challenges, and Prospects
Agnese Chiatti
Sara Bernardini
Lara Shibelski Godoy Piccolo
Viola Schiaffonati
Matteo Matteucci
62
0
0
08 May 2025
PlaceIt3D: Language-Guided Object Placement in Real 3D Scenes
Ahmed Abdelreheem
Filippo Aleotti
Jamie Watson
Z. Qureshi
Abdelrahman Eldesokey
Peter Wonka
Gabriel J. Brostow
Sara Vicente
Guillermo Garcia-Hernando
DiffM
59
0
0
08 May 2025
RoboOS: A Hierarchical Embodied Framework for Cross-Embodiment and Multi-Agent Collaboration
Huajie Tan
Xiaoshuai Hao
Minglan Lin
Pengwei Wang
Yaoxu Lyu
Mingyu Cao
Zhongyuan Wang
S. Zhang
LM&Ro
48
0
0
06 May 2025
CrayonRobo: Object-Centric Prompt-Driven Vision-Language-Action Model for Robotic Manipulation
Xiaoqi Li
Lingyun Xu
M. Zhang
Jiaming Liu
Yan Shen
...
Jiahui Xu
Liang Heng
Siyuan Huang
S. Zhang
Hao Dong
LM&Ro
51
0
0
04 May 2025
ReLI: A Language-Agnostic Approach to Human-Robot Interaction
Linus Nwankwo
Bjoern Ellensohn
Ozan Özdenizci
Elmar Rueckert
LM&Ro
58
0
0
03 May 2025
A0: An Affordance-Aware Hierarchical Model for General Robotic Manipulation
Rongtao Xu
J. Zhang
Minghao Guo
Youpeng Wen
H. Yang
...
Liqiong Wang
Yuxuan Kuang
Meng Cao
Feng Zheng
Xiaodan Liang
47
3
0
17 Apr 2025
GAT-Grasp: Gesture-Driven Affordance Transfer for Task-Aware Robotic Grasping
Ruixiang Wang
Huayi Zhou
Xinyue Yao
Guiliang Liu
Kui Jia
39
0
0
08 Mar 2025
Stealthy Backdoor Attack in Self-Supervised Learning Vision Encoders for Large Vision Language Models
Zhaoyi Liu
Huan Zhang
AAML
83
0
0
25 Feb 2025
A Real-to-Sim-to-Real Approach to Robotic Manipulation with VLM-Generated Iterative Keypoint Rewards
Shivansh Patel
Xinchen Yin
Wenlong Huang
Shubham Garg
H. Nayyeri
Li Fei-Fei
Svetlana Lazebnik
Yongqian Li
92
0
0
12 Feb 2025
HAMSTER: Hierarchical Action Models For Open-World Robot Manipulation
Yi Li
Yuquan Deng
Jingyang Zhang
Joel Jang
Marius Memme
...
Fabio Ramos
Dieter Fox
Anqi Li
Abhishek Gupta
Ankit Goyal
LM&Ro
99
9
0
08 Feb 2025
RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics
Chan Hee Song
Valts Blukis
Jonathan Tremblay
Stephen Tyree
Yu-Chuan Su
Stan Birchfield
96
5
0
25 Nov 2024
Open-World Task and Motion Planning via Vision-Language Model Inferred Constraints
Nishanth Kumar
F. Ramos
Dieter Fox
Caelan Reed Garrett
Tomás Lozano-Pérez
Leslie Pack Kaelbling
Caelan Reed Garrett
LRM
LM&Ro
68
3
0
13 Nov 2024
Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference Under Ambiguities
Zheyuan Zhang
Fengyuan Hu
Jayjun Lee
Freda Shi
Parisa Kordjamshidi
Joyce Chai
Ziqiao Ma
56
11
0
22 Oct 2024
Semantically Safe Robot Manipulation: From Semantic Scene Understanding to Motion Safeguards
Lukas Brunke
Yanni Zhang
Ralf Romer
Jack Naimer
Nikola Staykov
Siqi Zhou
Angela P. Schoellig
59
4
0
19 Oct 2024
MotIF: Motion Instruction Fine-tuning
Minyoung Hwang
Joey Hejna
Dorsa Sadigh
Yonatan Bisk
49
1
0
16 Sep 2024
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
Xiang Li
Cristina Mata
J. Park
Kumara Kahatapitiya
Yoo Sung Jang
...
Kanchana Ranasinghe
R. Burgert
Mu Cai
Yong Jae Lee
Michael S. Ryoo
LM&Ro
72
25
0
28 Jun 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
82
42
0
23 May 2024
MOKA: Open-Vocabulary Robotic Manipulation through Mark-Based Visual Prompting
Fangchen Liu
Kuan Fang
Pieter Abbeel
Sergey Levine
LM&Ro
40
33
0
05 Mar 2024
M2T2: Multi-Task Masked Transformer for Object-centric Pick and Place
Wentao Yuan
Adithyavairavan Murali
Arsalan Mousavian
Dieter Fox
50
19
0
02 Nov 2023
Motion Policy Networks
Adam Fishman
Adithya Murali
Clemens Eppner
Bryan N. Peele
Byron Boots
D. Fox
48
55
0
21 Oct 2022
ProgPrompt: Generating Situated Robot Task Plans using Large Language Models
Ishika Singh
Valts Blukis
Arsalan Mousavian
Ankit Goyal
Danfei Xu
Jonathan Tremblay
D. Fox
Jesse Thomason
Animesh Garg
LM&Ro
LLMAG
120
624
0
22 Sep 2022
Perceiver-Actor: A Multi-Task Transformer for Robotic Manipulation
Mohit Shridhar
Lucas Manuelli
D. Fox
LM&Ro
161
457
0
12 Sep 2022
SORNet: Spatial Object-Centric Representations for Sequential Manipulation
Wentao Yuan
Chris Paxton
Karthik Desingh
D. Fox
3DPC
147
72
0
08 Sep 2021
ManipulaTHOR: A Framework for Visual Object Manipulation
Kiana Ehsani
Winson Han
Alvaro Herrasti
Eli VanderBilt
Luca Weihs
Eric Kolve
Aniruddha Kembhavi
Roozbeh Mottaghi
LM&Ro
171
124
0
22 Apr 2021
Where2Act: From Pixels to Actions for Articulated 3D Objects
Kaichun Mo
Leonidas J. Guibas
Mustafa Mukadam
Abhinav Gupta
Shubham Tulsiani
162
176
0
07 Jan 2021
SAPIEN: A SimulAted Part-based Interactive ENvironment
Fanbo Xiang
Yuzhe Qin
Kaichun Mo
Yikuan Xia
Hao Zhu
...
He-Nan Wang
Li Yi
Angel X. Chang
Leonidas J. Guibas
Hao Su
218
487
0
19 Mar 2020
1