PitVQA: Image-grounded Text Embedding LLM for Visual Question Answering in Pituitary Surgery

22 May 2024

Adrito Das

Danyal Z. Khan

Hani J. Marcus

Danail Stoyanov

Matthew J. Clarkson

Mobarakol Islam

Papers citing "PitVQA: Image-grounded Text Embedding LLM for Visual Question Answering in Pituitary Surgery"

ORQA: A Benchmark and Foundation Model for Holistic Operating Room Modeling
Ege Özsoy, Chantal Pellegrini, D. Bani-Harouni, Kun Yuan, Matthias Keicher, Nassir Navab
4 0 0, 19 May 2025

AGIR: Assessing 3D Gait Impairment with Reasoning based on LLMs
Diwei Wang, Cédric Bobenrieth, Hyewon Seo
LRM, 47 0 0, 23 Mar 2025

SurgicalVLM-Agent: Towards an Interactive AI Co-Pilot for Pituitary Surgery
Jiayuan Huang, Runlong He, Danyal Z. Khan, E. Mazomenos, Danail Stoyanov, Hani J. Marcus, Matthew J. Clarkson, Mobarakol Islam
LM&Ro, 59 0 0, 12 Mar 2025

FunBench: Benchmarking Fundus Reading Skills of MLLMs
Qijie Wei, Kaiheng Qian, Xirong Li
39 1 0, 02 Mar 2025

VideoICL: Confidence-based Iterative In-context Learning for Out-of-Distribution Video Understanding
Kangsan Kim, G. Park, Youngwan Lee, Woongyeong Yeo, Sung Ju Hwang
94 3 0, 03 Dec 2024

PitVis-2023 Challenge: Workflow Recognition in videos of Endoscopic Pituitary Surgery
Adrito Das, Danyal Z. Khan, Dimitrios Psychogyios, Yitong Zhang, John G. Hanrahan, ..., Santiago Rodriguez, Pablo Arbelaez, Danail Stoyanov, Hani J. Marcus, Sophia Bano
44 5 0, 02 Sep 2024

BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li, Dongxu Li, Caiming Xiong, Guosheng Lin
MLLM, BDL, VLM, CLIP, 392 4,154 0, 28 Jan 2022