Benchmarking Sequential Visual Input Reasoning and Prediction in Multimodal Large Language Models

20 October 2023
Mingwei Zhu
Leigang Sha
Yu Shu
Kangjia Zhao
Tiancheng Zhao
Jianwei Yin
    LRM

Papers citing "Benchmarking Sequential Visual Input Reasoning and Prediction in Multimodal Large Language Models"

4 / 4 papers shown
Plan, Eliminate, and Track -- Language Models are Good Teachers for Embodied Agents
Yu-Chih Chen
So Yeon Min
Chase Davis
Ruslan Salakhutdinov
A. Azaria
Yuan-Fang Li
Tom Michael Mitchell
A. Bovik
LM&Ro
LLMAG
78
33
0
03 May 2023
mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality
Qinghao Ye
Haiyang Xu
Guohai Xu
Jiabo Ye
Ming Yan
...
Junfeng Tian
Qiang Qi
Ji Zhang
Feiyan Huang
Jingren Zhou
VLM
MLLM
208
900
0
27 Apr 2023
Instruction Tuning with GPT-4
Baolin Peng
Chunyuan Li
Pengcheng He
Michel Galley
Jianfeng Gao
SyDa
ALM
LM&MA
162
579
0
06 Apr 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
281
4,244
0
30 Jan 2023