Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2310.13473
Cited By
Benchmarking Sequential Visual Input Reasoning and Prediction in Multimodal Large Language Models
20 October 2023
Mingwei Zhu
Leigang Sha
Yu Shu
Kangjia Zhao
Tiancheng Zhao
Jianwei Yin
LRM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Benchmarking Sequential Visual Input Reasoning and Prediction in Multimodal Large Language Models"
4 / 4 papers shown
Title
Plan, Eliminate, and Track -- Language Models are Good Teachers for Embodied Agents
Yu-Chih Chen
So Yeon Min
Chase Davis
Ruslan Salakhutdinov
A. Azaria
Yuan-Fang Li
Tom Michael Mitchell
A. Bovik
LM&Ro
LLMAG
78
33
0
03 May 2023
mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality
Qinghao Ye
Haiyang Xu
Guohai Xu
Jiabo Ye
Ming Yan
...
Junfeng Tian
Qiang Qi
Ji Zhang
Feiyan Huang
Jingren Zhou
VLM
MLLM
208
900
0
27 Apr 2023
Instruction Tuning with GPT-4
Baolin Peng
Chunyuan Li
Pengcheng He
Michel Galley
Jianfeng Gao
SyDa
ALM
LM&MA
162
579
0
06 Apr 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
281
4,244
0
30 Jan 2023
1