2310.02071
Towards End-to-End Embodied Decision Making via Multi-modal Large Language Model: Explorations with GPT4-Vision and Beyond
3 October 2023
Liang Chen
Yichi Zhang
Shuhuai Ren
Haozhe Zhao
Zefan Cai
Yuchi Wang
Peiyi Wang
Tianyu Liu
Baobao Chang
LM&Ro
LLMAG
Papers citing "Towards End-to-End Embodied Decision Making via Multi-modal Large Language Model: Explorations with GPT4-Vision and Beyond"
2 / 52 papers shown
Title
AI2-THOR: An Interactive 3D Environment for Visual AI
Eric Kolve
Roozbeh Mottaghi
Winson Han
Eli VanderBilt
Luca Weihs
...
Daniel Gordon
Yuke Zhu
Aniruddha Kembhavi
Abhinav Gupta
Ali Farhadi
LM&Ro
84
1,110
0
14 Dec 2017
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
345
3,270
0
02 Dec 2016