Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2406.16093
Cited By
Towards Natural Language-Driven Assembly Using Foundation Models
23 June 2024
O. Joglekar
Tal Lancewicki
Shir Kozlovsky
Vladimir Tchuiev
Zohar Feldman
Dotan Di Castro
LM&Ro
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Towards Natural Language-Driven Assembly Using Foundation Models"
9 / 9 papers shown
Title
Look Before You Leap: Unveiling the Power of GPT-4V in Robotic Vision-Language Planning
Yingdong Hu
Fanqi Lin
Tong Zhang
Li Yi
Yang Gao
LM&Ro
85
101
0
29 Nov 2023
Visual Language Maps for Robot Navigation
Chen Huang
Oier Mees
Andy Zeng
Wolfram Burgard
LM&Ro
150
343
0
11 Oct 2022
Safe Self-Supervised Learning in Real of Visuo-Tactile Feedback Policies for Industrial Insertion
Letian Fu
Huang Huang
Lars Berscheid
Hui Li
Ken Goldberg
Sachin Chitta
33
18
0
04 Oct 2022
ProgPrompt: Generating Situated Robot Task Plans using Large Language Models
Ishika Singh
Valts Blukis
Arsalan Mousavian
Ankit Goyal
Danfei Xu
Jonathan Tremblay
D. Fox
Jesse Thomason
Animesh Garg
LM&Ro
LLMAG
120
622
0
22 Sep 2022
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action
Dhruv Shah
B. Osinski
Brian Ichter
Sergey Levine
LM&Ro
147
436
0
10 Jul 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
392
4,125
0
28 Jan 2022
Pix2seq: A Language Modeling Framework for Object Detection
Ting-Li Chen
Saurabh Saxena
Lala Li
David J. Fleet
Geoffrey E. Hinton
MLLM
ViT
VLM
238
344
0
22 Sep 2021
TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models
Minghao Li
Tengchao Lv
Jingye Chen
Lei Cui
Yijuan Lu
D. Florêncio
Cha Zhang
Zhoujun Li
Furu Wei
ViT
98
340
0
21 Sep 2021
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
252
927
0
24 Sep 2019
1