
ViGoR: Improving Visual Grounding of Large Vision Language Models with Fine-Grained Reward Modeling
Papers citing "ViGoR: Improving Visual Grounding of Large Vision Language Models with Fine-Grained Reward Modeling"
Title |
---|
Grounded Language-Image Pre-training. Liunian Harold Li, Pengchuan Zhang, Haotian Zhang, Jianwei Yang, Chunyuan Li, ..., Lu Yuan, Lei Zhang, Lei Li, Kai-Wei Chang, Jianfeng Gao |