Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language

1 April 2022
Andy Zeng
Maria Attarian
Brian Ichter
K. Choromanski
Adrian S. Wong
Stefan Welker
F. Tombari
Aveek Purohit
Michael S. Ryoo
Vikas Sindhwani
Johnny Lee
Vincent Vanhoucke
Peter R. Florence
ReLM, LRM
arXiv: 2204.00598

Papers citing "Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language"

50 / 438 papers shown
Title
Tree-Planner: Efficient Close-loop Task Planning with Large Language Models
Mengkang Hu
Yao Mu
Xinmiao Yu
Mingyu Ding
Shiguang Wu
Wenqi Shao
Qiguang Chen
Bin Wang
Yu Qiao
Ping Luo
LLMAG
115
39
0
12 Oct 2023
Jigsaw: Supporting Designers to Prototype Multimodal Applications by Chaining AI Foundation Models
David Chuan-En Lin
Nikolas Martelaro
66
19
0
12 Oct 2023
Open-Set Knowledge-Based Visual Question Answering with Inference Paths
Jingru Gan
Xinzhe Han
Shuhui Wang
Qingming Huang
81
0
0
12 Oct 2023
A Closer Look into Automatic Evaluation Using Large Language Models
Cheng-Han Chiang
Hunghuei Lee
ELM, ALM, LM&MA
90
13
0
09 Oct 2023
Compositional Semantics for Open Vocabulary Spatio-semantic Representations
Robin Karlsson
Francisco Lepe-Salazar
K. Takeda
VLM
87
1
0
08 Oct 2023
GRID: A Platform for General Robot Intelligence Development
Sai H. Vemprala
Shuhang Chen
Abhinav Shukla
Dinesh Narayanan
Ashish Kapoor
97
10
0
02 Oct 2023
Cook2LTL: Translating Cooking Recipes to LTL Formulae using Large Language Models
A. Mavrogiannis
Christoforos Mavrogiannis
Yiannis Aloimonos
LM&Ro
87
12
0
29 Sep 2023
OceanChat: Piloting Autonomous Underwater Vehicles in Natural Language
Jia Huang
Mengxue Hou
Junkai Wang
Fumin Zhang
89
5
0
27 Sep 2023
Lifelong Robot Learning with Human Assisted Language Planners
Meenal Parakh
Alisha Fong
Anthony Simeonov
Tao Chen
Abhishek Gupta
Pulkit Agrawal
LM&Ro
125
18
0
25 Sep 2023
ReConcile: Round-Table Conference Improves Reasoning via Consensus among Diverse LLMs
Justin Chih-Yao Chen
Swarnadeep Saha
Joey Tianyi Zhou
LLMAG, LRM
138
143
0
22 Sep 2023
LMC: Large Model Collaboration with Cross-assessment for Training-Free Open-Set Object Recognition
Haoxuan Qu
Xiaofei Hui
Yujun Cai
Jun Liu
145
11
0
22 Sep 2023
LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent
Jianing Yang
Xuweiyi Chen
Shengyi Qian
Nikhil Madaan
Madhavan Iyengar
David Fouhey
Joyce Chai
LM&Ro, LLMAG
152
101
0
21 Sep 2023
SG-Bot: Object Rearrangement via Coarse-to-Fine Robotic Imagination on Scene Graphs
Guangyao Zhai
Xiaoni Cai
Dianye Huang
Yan Di
Fabian Manhardt
Federico Tombari
Nassir Navab
Benjamin Busam
LM&Ro
74
27
0
21 Sep 2023
BELT:Bootstrapping Electroencephalography-to-Language Decoding and Zero-Shot Sentiment Classification by Natural Language Supervision
Jinzhao Zhou
Yiqun Duan
Yu-Cheng Chang
Yu-Kai Wang
Chin-Teng Lin
80
6
0
21 Sep 2023
Text2Reward: Reward Shaping with Language Models for Reinforcement Learning
Tianbao Xie
Siheng Zhao
Chen Henry Wu
Yitao Liu
Qian Luo
Victor Zhong
Yanchao Yang
Tao Yu
LM&Ro
141
65
0
20 Sep 2023
Guide Your Agent with Adaptive Multimodal Rewards
Changyeon Kim
Younggyo Seo
Hao Liu
Lisa Lee
Jinwoo Shin
Honglak Lee
Kimin Lee
90
9
0
19 Sep 2023
Conformal Temporal Logic Planning using Large Language Models
Jun Wang
J. Tong
Kai Liang Tan
Yevgeniy Vorobeychik
Y. Kantaros
LM&Ro
267
23
0
18 Sep 2023
From Cooking Recipes to Robot Task Trees -- Improving Planning Correctness and Task Efficiency by Leveraging LLMs with a Knowledge Network
Md. Sadman Sakib
Yu Sun
56
11
0
17 Sep 2023
Language Models as Black-Box Optimizers for Vision-Language Models
Shihong Liu
Zhiqiu Lin
Samuel Yu
Ryan Lee
Tiffany Ling
Deepak Pathak
Deva Ramanan
VLM
128
30
0
12 Sep 2023
Incremental Learning of Humanoid Robot Behavior from Natural Interaction and Large Language Models
Leonard Barmann
Rainer Kartmann
Fabian Peller-Konrad
Jan Niehues
Alexander H. Waibel
Tamim Asfour
LM&Ro
123
26
0
08 Sep 2023
Physically Grounded Vision-Language Models for Robotic Manipulation
Jensen Gao
Bidipta Sarkar
F. Xia
Ted Xiao
Jiajun Wu
Brian Ichter
Anirudha Majumdar
Dorsa Sadigh
LM&Ro
147
134
0
05 Sep 2023
Cognitive Architectures for Language Agents
T. Sumers
Shunyu Yao
Karthik Narasimhan
Thomas Griffiths
LLMAG, LM&Ro
170
182
0
05 Sep 2023
ISR-LLM: Iterative Self-Refined Large Language Model for Long-Horizon Sequential Task Planning
Zhehua Zhou
Yuheng Huang
Kunpeng Yao
Zhan Shu
Lei Ma
86
67
0
26 Aug 2023
MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
Bang-ju Yang
Fenglin Liu
X. Wu
Yaowei Wang
Xu Sun
Yuexian Zou
VLM, CLIP
88
13
0
25 Aug 2023
VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use
Yonatan Bitton
Hritik Bansal
Jack Hessel
Rulin Shao
Wanrong Zhu
Anas Awadalla
Josh Gardner
Rohan Taori
L. Schmidt
VLM
152
82
0
12 Aug 2023
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities
Weihao Yu
Zhengyuan Yang
Linjie Li
Jianfeng Wang
Kevin Qinghong Lin
Zicheng Liu
Xinchao Wang
Lijuan Wang
MLLM
187
725
0
04 Aug 2023
LEMMA: Learning Language-Conditioned Multi-Robot Manipulation
Ran Gong
Xiaofeng Gao
Qiaozi Gao
Suhaila Shakiah
Govind Thattai
Gaurav Sukhatme
LM&Ro
152
9
0
02 Aug 2023
Transferable Decoding with Visual Entities for Zero-Shot Image Captioning
Junjie Fei
Teng Wang
Jinrui Zhang
Zhenyu He
Chengjie Wang
Feng Zheng
VLM
89
37
0
31 Jul 2023
AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos?
Qi Zhao
Shijie Wang
Ce Zhang
Changcheng Fu
Minh Quan Do
Nakul Agarwal
Kwonjoon Lee
Chen Sun
LM&Ro
137
52
0
31 Jul 2023
GraspGPT: Leveraging Semantic Knowledge from a Large Language Model for Task-Oriented Grasping
Chao Tang
Dehao Huang
Wenqiang Ge
Weiyu Liu
Kuanqi Cai
118
75
0
25 Jul 2023
A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis
Izzeddin Gur
Hiroki Furuta
Austin Huang
Mustafa Safdari
Yutaka Matsuo
Douglas Eck
Aleksandra Faust
LM&Ro, LLMAG
231
227
0
24 Jul 2023
OBJECT 3DIT: Language-guided 3D-aware Image Editing
Oscar Michel
Anand Bhattad
Eli VanderBilt
Ranjay Krishna
Aniruddha Kembhavi
Tanmay Gupta
DiffM
104
38
0
20 Jul 2023
Towards A Unified Agent with Foundation Models
Norman Di Palo
Arunkumar Byravan
Leonard Hasenclever
Markus Wulfmeier
N. Heess
Martin Riedmiller
LM&Ro, LLMAG, OffRL
90
60
0
18 Jul 2023
Coupling Large Language Models with Logic Programming for Robust and General Reasoning from Text
Zhun Yang
Adam Ishay
Joohyung Lee
LRM, ELM
84
59
0
15 Jul 2023
SayPlan: Grounding Large Language Models using 3D Scene Graphs for Scalable Robot Task Planning
Krishan Rana
Jesse Haviland
Sourav Garg
Jad Abou-Chakra
Ian Reid
Niko Sünderhauf
LM&Ro
122
240
0
12 Jul 2023
VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models
Wenlong Huang
Chen Wang
Ruohan Zhang
Yunzhu Li
Jiajun Wu
Li Fei-Fei
LM&Ro
134
523
0
12 Jul 2023
RoCo: Dialectic Multi-Robot Collaboration with Large Language Models
Zhao Mandi
Shreeya Jain
Shuran Song
LM&Ro, LLMAG
77
142
0
10 Jul 2023
Large Language Models as General Pattern Machines
Suvir Mirchandani
F. Xia
Peter R. Florence
Brian Ichter
Danny Driess
Montse Gonzalez Arenas
Kanishka Rao
Dorsa Sadigh
Andy Zeng
LLMAG
140
205
0
10 Jul 2023
Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners
Allen Z. Ren
Anushri Dixit
Alexandra Bodrova
Sumeet Singh
Stephen Tu
...
Jacob Varley
Zhenjia Xu
Dorsa Sadigh
Andy Zeng
Anirudha Majumdar
LM&Ro
314
240
0
04 Jul 2023
Visual Instruction Tuning with Polite Flamingo
Delong Chen
Jianfeng Liu
Wenliang Dai
Baoyuan Wang
MLLM
126
48
0
03 Jul 2023
Conformer LLMs -- Convolution Augmented Large Language Models
Prateek Verma
76
1
0
02 Jul 2023
DoReMi: Grounding Language Model by Detecting and Recovering from Plan-Execution Misalignment
Yanjiang Guo
Yen-Jen Wang
Lihan Zha
Zheyuan Jiang
Jianyu Chen
LM&Ro
152
41
0
01 Jul 2023
Statler: State-Maintaining Language Models for Embodied Reasoning
Takuma Yoneda
Jiading Fang
Peng Li
Huanyu Zhang
Tianchong Jiang
Shengjie Lin
Ben Picker
David Yunis
Hongyuan Mei
Matthew R. Walter
LM&Ro
94
35
0
30 Jun 2023
Towards Language Models That Can See: Computer Vision Through the LENS of Natural Language
William Berrios
Gautam Mittal
Tristan Thrush
Douwe Kiela
Amanpreet Singh
MLLM, VLM
77
61
0
28 Jun 2023
Next Steps for Human-Centered Generative AI: A Technical Perspective
Xiang 'Anthony' Chen
Jeff Burke
Andrea Colaço
Matthew K. Hong
Jennifer Jacobs
...
Dingzeyu Li
Nanyun Peng
Karl D. D. Willis
Chien-Sheng Wu
Bolei Zhou
LLMAG
108
35
0
27 Jun 2023
REFLECT: Summarizing Robot Experiences for Failure Explanation and Correction
Zeyi Liu
Arpit Bahety
Shuran Song
LRM
118
127
0
27 Jun 2023
A Survey on Multimodal Large Language Models
Shukang Yin
Chaoyou Fu
Sirui Zhao
Ke Li
Xing Sun
Tong Xu
Enhong Chen
MLLM, LRM
166
618
0
23 Jun 2023
Retrieving-to-Answer: Zero-Shot Video Question Answering with Frozen Large Language Models
Junting Pan
Ziyi Lin
Yuying Ge
Xiatian Zhu
Renrui Zhang
Yi Wang
Yu Qiao
Hongsheng Li
MLLM
97
27
0
15 Jun 2023
Language-Guided Music Recommendation for Video via Prompt Analogies
Daniel McKee
Justin Salamon
Josef Sivic
Bryan C. Russell
VGen
90
27
0
15 Jun 2023
Semantic HELM: A Human-Readable Memory for Reinforcement Learning
Fabian Paischer
Thomas Adler
M. Hofmarcher
Sepp Hochreiter
75
12
0
15 Jun 2023