A Token-level Text Image Foundation Model for Document Understanding

    VLM

Papers citing "A Token-level Text Image Foundation Model for Document Understanding"

50 / 94 papers shown
LLaVA-OneVision: Easy Visual Task Transfer
Bo Li, Yuanhan Zhang, Dong Guo, Renrui Zhang, Feng Li, Hao Zhang, Kaichen Zhang, Yanwei Li, Ziwei Liu, Chunyuan Li
06 Aug 2024