2408.01319
A Comprehensive Review of Multimodal Large Language Models: Performance and Challenges Across Different Tasks
2 August 2024
Jiaqi Wang
Hanqi Jiang
Yi-Hsueh Liu
Chong Ma
Xu-Yao Zhang
Yi Pan
Mengyuan Liu
Peiran Gu
Sichen Xia
Wenjun Li
Yutong Zhang
Zihao Wu
Zheng Liu
Tianyang Zhong
Bao Ge
Tuo Zhang
Ning Qiang
Xintao Hu
Xi Jiang
Xin Zhang
Wei Zhang
Dinggang Shen
Tianming Liu
Shu Zhang
VLM
AI4TS
Papers citing
"A Comprehensive Review of Multimodal Large Language Models: Performance and Challenges Across Different Tasks"
9 / 9 papers shown
Title
Position: Foundation Models Need Digital Twin Representations
Yiqing Shen
Hao Ding
Lalithkumar Seenivasan
Tianmin Shu
Mathias Unberath
AI4CE
40
0
0
01 May 2025
Video-Bench: Human-Aligned Video Generation Benchmark
Hui Han
Siyuan Li
Jiaqi Chen
Yiwen Yuan
Yuling Wu
...
Y. Li
Jingyang Zhang
Chi Zhang
Li Li
Yongxin Ni
EGVM
VGen
73
0
0
07 Apr 2025
Do Language Models Understand Time?
Xi Ding
Lei Wang
178
0
0
18 Dec 2024
Recent advances in deep learning and language models for studying the microbiome
Binghao Yan
Yunbi Nam
Lingyao Li
Rebecca A Deek
Hongzhe Li
Siyuan Ma
18
1
0
15 Sep 2024
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
Jun Chen
Deyao Zhu
Xiaoqian Shen
Xiang Li
Zechun Liu
Pengchuan Zhang
Raghuraman Krishnamoorthi
Vikas Chandra
Yunyang Xiong
Mohamed Elhoseiny
MLLM
160
441
0
14 Oct 2023
Multimodal Foundation Models: From Specialists to General-Purpose Assistants
Chunyuan Li
Zhe Gan
Zhengyuan Yang
Jianwei Yang
Linjie Li
Lijuan Wang
Jianfeng Gao
MLLM
115
228
0
18 Sep 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
270
4,244
0
30 Jan 2023
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
278
1,082
0
17 Feb 2021
Video Transformer Network
Daniel Neimark
Omri Bar
Maya Zohar
Dotan Asselmann
ViT
198
422
0
01 Feb 2021