Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2403.01487
Cited By
InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding
3 March 2024
Haogeng Liu
Quanzeng You
Xiaotian Han
Yiqi Wang
Bohan Zhai
Yongfei Liu
Yunzhe Tao
Huaibo Huang
Ran He
Hongxia Yang
MLLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding"
5 / 5 papers shown
Title
Law of Vision Representation in MLLMs
Shijia Yang
Bohan Zhai
Quanzeng You
Jianbo Yuan
Hongxia Yang
Chenfeng Xu
40
9
0
29 Aug 2024
LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images
Ruyi Xu
Yuan Yao
Zonghao Guo
Junbo Cui
Zanlin Ni
Chunjiang Ge
Tat-Seng Chua
Zhiyuan Liu
Maosong Sun
Gao Huang
VLM
MLLM
37
102
0
18 Mar 2024
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
270
4,244
0
30 Jan 2023
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
A. Kalyan
ELM
ReLM
LRM
211
1,106
0
20 Sep 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
305
7,443
0
11 Nov 2021
1