2305.11176
Cited By
Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model
18 May 2023
Siyuan Huang
Zhengkai Jiang
Hao Dong
Yu Qiao
Peng Gao
Hongsheng Li
LM&Ro
Papers citing
"Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model"
50 / 75 papers shown
Title
PartDexTOG: Generating Dexterous Task-Oriented Grasping via Language-driven Part Analysis
Weishang Wu
Yifei Shi
Zhizhong Chen
Zhipong Cai
2
0
0
18 May 2025
Semantic Intelligence: Integrating GPT-4 with A* Planning in Low-Cost Robotics
Jesse Barkley
A. George
A. Farimani
154
0
0
03 May 2025
CoordField: Coordination Field for Agentic UAV Task Allocation In Low-altitude Urban Scenarios
Tengchao Zhang
Yonglin Tian
Fei Lin
Jun Huang
Patrik P. Süli
Rui Qin
Fei-Yue Wang
73
0
0
30 Apr 2025
Few-Shot Vision-Language Action-Incremental Policy Learning
Mingchen Song
Xiang Deng
Guoqiang Zhong
Qi Lv
Jia Wan
Yinchuan Li
Haifeng Zhang
Weili Guan
41
0
0
22 Apr 2025
RAIDER: Tool-Equipped Large Language Model Agent for Robotic Action Issue Detection, Explanation and Recovery
Silvia Izquierdo-Badiola
Carlos Rizzo
Guillem Alenyà
LLMAG
LM&Ro
84
0
0
22 Mar 2025
HybridGen: VLM-Guided Hybrid Planning for Scalable Data Generation of Imitation Learning
Wensheng Wang
Ning Tan
LM&Ro
OffRL
60
0
0
17 Mar 2025
Efficient Alignment of Unconditioned Action Prior for Language-conditioned Pick and Place in Clutter
Kechun Xu
Xunlong Xia
Kaixuan Wang
Yifei Yang
Yunxuan Mao
Bing Deng
R. Xiong
Yansen Wang
OffRL
72
0
0
12 Mar 2025
LTLCodeGen: Code Generation of Syntactically Correct Temporal Logic for Robot Task Planning
Behrad Rabiei
Mahesh Kumar A.R.
Zhirui Dai
Surya L.S.R. Pilla
Qiyue Dong
Nikolay Atanasov
LM&Ro
61
0
0
10 Mar 2025
Generative Artificial Intelligence in Robotic Manipulation: A Survey
Kun Zhang
Peng Yun
Jun Cen
Junhao Cai
DiDi Zhu
...
Qifeng Chen
Jia Pan
Wei Zhang
Bo Yang
Hua Chen
59
1
0
05 Mar 2025
USPilot: An Embodied Robotic Assistant Ultrasound System with Large Language Model Enhanced Graph Planner
Mingcong Chen
Siqi Fan
Guanglin Cao
Hongbin Liu
55
0
0
18 Feb 2025
Video2Policy: Scaling up Manipulation Tasks in Simulation through Internet Videos
Weirui Ye
Fangchen Liu
Z. Ding
Yang Gao
Oleh Rybkin
Pieter Abbeel
VGen
OffRL
88
3
0
14 Feb 2025
Self-Consistency of the Internal Reward Models Improves Self-Rewarding Language Models
Xin Zhou
Yiwen Guo
Ruotian Ma
Tao Gui
Qi Zhang
Xuanjing Huang
LRM
92
2
0
13 Feb 2025
A Real-to-Sim-to-Real Approach to Robotic Manipulation with VLM-Generated Iterative Keypoint Rewards
Shivansh Patel
Xinchen Yin
Wenlong Huang
Shubham Garg
H. Nayyeri
Li Fei-Fei
Svetlana Lazebnik
Yong Li
92
0
0
12 Feb 2025
Visual Large Language Models for Generalized and Specialized Applications
Yifan Li
Zhixin Lai
Wentao Bao
Zhen Tan
Anh Dao
Kewei Sui
Jiayi Shen
Dong Liu
Huan Liu
Yu Kong
VLM
88
12
0
06 Jan 2025
Generative Timelines for Instructed Visual Assembly
Alejandro Pardo
Jui-hsien Wang
Guohao Li
Josef Sivic
Bryan C. Russell
Fabian Caba Heilbron
VGen
72
0
0
19 Nov 2024
Vision Language Models are In-Context Value Learners
Yecheng Jason Ma
Joey Hejna
Ayzaan Wahid
Chuyuan Fu
Dhruv Shah
...
Dinesh Jayaraman
Wenhao Yu
Tingnan Zhang
Dorsa Sadigh
Fei Xia
57
5
0
07 Nov 2024
Eurekaverse: Environment Curriculum Generation via Large Language Models
William Liang
Sam Wang
Hung-Ju Wang
Osbert Bastani
Dinesh Jayaraman
Yecheng Jason Ma
SyDa
38
2
0
04 Nov 2024
A Large Language Model-Driven Reward Design Framework via Dynamic Feedback for Reinforcement Learning
Shengjie Sun
Runze Liu
Jiafei Lyu
J. Yang
L. Zhang
Xiu Li
LRM
22
7
0
18 Oct 2024
ImagineNav: Prompting Vision-Language Models as Embodied Navigator through Scene Imagination
Xinxin Zhao
Wenzhe Cai
Likun Tang
Teng Wang
LM&Ro
40
3
0
13 Oct 2024
SG-Nav: Online 3D Scene Graph Prompting for LLM-based Zero-shot Object Navigation
Hang Yin
Xiuwei Xu
Zhenyu Wu
Jie Zhou
Jiwen Lu
42
13
0
10 Oct 2024
Discovering Object Attributes by Prompting Large Language Models with Perception-Action APIs
A. Mavrogiannis
Dehao Yuan
Yiannis Aloimonos
LM&Ro
43
0
0
23 Sep 2024
TrustNavGPT: Modeling Uncertainty to Improve Trustworthiness of Audio-Guided LLM-Based Robot Navigation
Xingpeng Sun
Yiran Zhang
Xindi Tang
Amrit Singh Bedi
Aniket Bera
50
4
0
03 Aug 2024
MaxMI: A Maximal Mutual Information Criterion for Manipulation Concept Discovery
Pei Zhou
Yanchao Yang
43
1
0
21 Jul 2024
Lifelong Robot Library Learning: Bootstrapping Composable and Generalizable Skills for Embodied Control with Language Models
Georgios Tziafas
H. Kasaei
KELM
LM&Ro
47
8
0
26 Jun 2024
Towards Open-World Grasping with Large Vision-Language Models
Georgios Tziafas
H. Kasaei
LM&Ro
LRM
37
12
0
26 Jun 2024
PARSE-Ego4D: Personal Action Recommendation Suggestions for Egocentric Videos
Steven Abreu
Tiffany D. Do
Ruofei Du
Eric J. Gonzalez
Lee Payne
Daniel J. McDuff
Mar Gonzalez-Franco
45
2
0
14 Jun 2024
A3VLM: Actionable Articulation-Aware Vision Language Model
Siyuan Huang
Haonan Chang
Yuhan Liu
Yimeng Zhu
Hao Dong
Peng Gao
Abdeslam Boularias
Hongsheng Li
38
10
0
11 Jun 2024
DrEureka: Language Model Guided Sim-To-Real Transfer
Yecheng Jason Ma
William Liang
Hung-Ju Wang
Sam Wang
Yuke Zhu
Linxi Fan
Osbert Bastani
Dinesh Jayaraman
77
43
0
04 Jun 2024
Luban: Building Open-Ended Creative Agents via Autonomous Embodied Verification
Yuxuan Guo
Shaohui Peng
Jiaming Guo
Di Huang
Xishan Zhang
...
Zihao Zhang
Zidong Du
Qi Guo
Xingui Hu
Yunji Chen
29
4
0
24 May 2024
WorldAfford: Affordance Grounding based on Natural Language Instructions
Changmao Chen
Yuren Cong
Zhen Kan
24
4
0
21 May 2024
Meta-Control: Automatic Model-based Control Synthesis for Heterogeneous Robot Skills
Tianhao Wei
Liqian Ma
Rui Chen
Weiye Zhao
Changliu Liu
48
3
0
18 May 2024
Bi-VLA: Vision-Language-Action Model-Based System for Bimanual Robotic Dexterous Manipulations
Koffivi Fidele Gbagbe
Miguel Altamirano Cabrera
Ali Alabbas
Oussama Alyunes
Artem Lykov
Dzmitry Tsetserukou
LM&Ro
43
18
0
09 May 2024
Integrating Disambiguation and User Preferences into Large Language Models for Robot Motion Planning
Mohammed Abugurain
Shinkyu Park
37
1
0
22 Apr 2024
RoboMP$^2$: A Robotic Multimodal Perception-Planning Framework with Multimodal Large Language Models
Qi Lv
Haochuan Li
Xiang Deng
Rui Shao
Michael Yu Wang
Liqiang Nie
LRM
LM&Ro
42
1
0
07 Apr 2024
Natural Language as Policies: Reasoning for Coordinate-Level Embodied Control with LLMs
Yusuke Mikami
Andrew Melnik
Jun Miura
Ville Hautamaki
LM&Ro
LRM
66
4
0
20 Mar 2024
Embedding Pose Graph, Enabling 3D Foundation Model Capabilities with a Compact Representation
Hugues Thomas
Jian Zhang
43
1
0
20 Mar 2024
ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models
Siyuan Huang
Iaroslav Ponomarenko
Zhengkai Jiang
Xiaoqi Li
Xiaobin Hu
Peng Gao
Hongsheng Li
Hao Dong
LM&Ro
37
16
0
17 Mar 2024
Segment Any Object Model (SAOM): Real-to-Simulation Fine-Tuning Strategy for Multi-Class Multi-Instance Segmentation
Mariia Khan
Yue Qiu
Yuren Cong
Jumana Abu-Khalaf
David Suter
Bodo Rosenhahn
40
4
0
16 Mar 2024
NaturalVLM: Leveraging Fine-grained Natural Language for Affordance-Guided Visual Manipulation
Ran Xu
Yan Shen
Xiaoqi Li
Ruihai Wu
Hao Dong
LM&Ro
30
9
0
13 Mar 2024
TINA: Think, Interaction, and Action Framework for Zero-Shot Vision Language Navigation
Dingbang Li
Wenzhou Chen
Xin Lin
LLMAG
LM&Ro
47
4
0
13 Mar 2024
ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning
Renqiu Xia
Bo-Wen Zhang
Hancheng Ye
Xiangchao Yan
Qi Liu
...
Min Dou
Botian Shi
Junchi Yan
Yu Qiao
LRM
63
55
0
19 Feb 2024
RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language Model
Jianhao Yuan
Shuyang Sun
Daniel Omeiza
Bo Zhao
Paul Newman
Lars Kunze
Matthew Gadd
LRM
36
49
0
16 Feb 2024
BBSEA: An Exploration of Brain-Body Synchronization for Embodied Agents
Sizhe Yang
Qian Luo
Anumpam Pani
Yanchao Yang
37
2
0
13 Feb 2024
Large Language Models for Robotics: Opportunities, Challenges, and Perspectives
Jiaqi Wang
Zihao Wu
Yiwei Li
Hanqi Jiang
Peng Shu
...
Lin Zhao
Bao Ge
Xiang Li
Tianming Liu
Shu Zhang
LM&Ro
45
61
0
09 Jan 2024
ConfusionPrompt: Practical Private Inference for Online Large Language Models
Peihua Mai
Ran Yan
Rui Ye
Youjia Yang
Yinchuan Li
Yan Pang
20
1
0
30 Dec 2023
LangSplat: 3D Language Gaussian Splatting
Minghan Qin
Wanhua Li
Jiawei Zhou
Haoqian Wang
Hanspeter Pfister
VLM
3DGS
26
180
0
26 Dec 2023
Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis
Yafei Hu
Quanting Xie
Vidhi Jain
Jonathan M Francis
Jay Patrikar
...
Xiaolong Wang
Sebastian A. Scherer
Z. Kira
Fei Xia
Yonatan Bisk
LM&Ro
AI4CE
37
63
0
14 Dec 2023
RoboGPT: an intelligent agent of making embodied long-term decisions for daily instruction tasks
Yaran Chen
Wenbo Cui
Yuanwen Chen
Mining Tan
Xinyao Zhang
Dong Zhao
He Wang
LM&Ro
LLMAG
38
0
0
27 Nov 2023
Robot Learning in the Era of Foundation Models: A Survey
Xuan Xiao
Jiahang Liu
Zhipeng Wang
Yanmin Zhou
Yong Qi
Qian Cheng
Bin He
Shuo Jiang
AI4CE
LM&Ro
29
27
0
24 Nov 2023
REAL: Resilience and Adaptation using Large Language Models on Autonomous Aerial Robots
Andrea Tagliabue
Kota Kondo
Tong Zhao
Mason B. Peterson
Claudius T. Tewari
Jonathan P. How
48
10
0
02 Nov 2023