2408.16986
Cited By
AdaptVision: Dynamic Input Scaling in MLLMs for Versatile Scene Understanding
30 August 2024
Yonghui Wang, Wengang Zhou, Hao Feng, Houqiang Li
VLM
Papers citing
"AdaptVision: Dynamic Input Scaling in MLLMs for Versatile Scene Understanding"
2 / 2 papers shown
Title
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration
Qinghao Ye, Haiyang Xu, Jiabo Ye, Mingshi Yan, Anwen Hu, Haowei Liu, Qi Qian, Ji Zhang, Fei Huang, Jingren Zhou
MLLM, VLM
126
375
0
07 Nov 2023
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu, Swaroop Mishra, Tony Xia, Liang Qiu, Kai-Wei Chang, Song-Chun Zhu, Oyvind Tafjord, Peter Clark, A. Kalyan
ELM, ReLM, LRM
211
1,106
0
20 Sep 2022