Title |
---|
![]() DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception Run Luo Yunshui Li Longze Chen Wanwei He Ting-En Lin ...Zikai Song Xiaobo Xia Tongliang Liu Min Yang Binyuan Hui |
![]() PitVQA: Image-grounded Text Embedding LLM for Visual Question Answering
in Pituitary Surgery Runlong He Mengya Xu Adrito Das Danyal Z. Khan Sophia Bano Hani J. Marcus Danail Stoyanov Matthew J. Clarkson Mobarakol Islam |