2306.00890
LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day
1 June 2023
Chunyuan Li
Cliff Wong
Sheng Zhang
Naoto Usuyama
Haotian Liu
Jianwei Yang
Tristan Naumann
Hoifung Poon
Jianfeng Gao
LM&MA
MedIm
Papers citing
"LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day"
50 / 145 papers shown
Title
Patho-R1: A Multimodal Reinforcement Learning-Based Pathology Expert Reasoner
Wenchuan Zhang
Penghao Zhang
Jingru Guo
Tao Cheng
Jie Chen
Shuwan Zhang
Zhang Zhang
Yuhao Yi
Hong Bu
AI4TS
LRM
22
0
0
16 May 2025
A Multimodal Multi-Agent Framework for Radiology Report Generation
Ziruo Yi
Ting Xiao
Mark V. Albert
MedIm
29
0
0
14 May 2025
Advancing Food Nutrition Estimation via Visual-Ingredient Feature Fusion
Huiyan Qi
B. Zhu
Chong-Wah Ngo
Jingjing Chen
Ee-Peng Lim
31
0
0
13 May 2025
Ultrasound Report Generation with Multimodal Large Language Models for Standardized Texts
Peixuan Ge
Tongkun Su
Faqin Lv
Baoliang Zhao
Peng Zhang
...
Liang Yao
Yu Sun
Zenan Wang
Pak Kin Wong
Ying Hu
MedIm
26
0
0
13 May 2025
DocVXQA: Context-Aware Visual Explanations for Document Question Answering
Mohamed Ali Souibgui
Changkyu Choi
Andrey Barsky
Kangsoo Jung
Ernest Valveny
Dimosthenis Karatzas
28
0
0
12 May 2025
Multi-Modal Explainable Medical AI Assistant for Trustworthy Human-AI Collaboration
Honglong Yang
Shanshan Song
Yi Qin
Lehan Wang
Haonan Wang
Xinpeng Ding
Qixiang Zhang
Bodong Du
Xuelong Li
LM&MA
34
0
0
11 May 2025
MM-Skin: Enhancing Dermatology Vision-Language Model with an Image-Text Dataset Derived from Textbooks
Wenqi Zeng
Yuqi Sun
Chenxi Ma
Weimin Tan
Bo Yan
LM&MA
VLM
55
0
0
09 May 2025
GeomHair: Reconstruction of Hair Strands from Colorless 3D Scans
Rachmadio Noval Lazuardi
Artem Sevastopolsky
Egor Zakharov
Matthias Niessner
V. Sklyarova
3DH
54
0
0
08 May 2025
The Eye as a Window to Systemic Health: A Survey of Retinal Imaging from Classical Techniques to Oculomics
Inamullah
Imran Razzak
Shoaib Jameel
38
0
0
06 May 2025
Task-Oriented Semantic Communication in Large Multimodal Models-based Vehicle Networks
Baoxia Du
H. Du
Dusit Niyato
Ruidong Li
60
0
0
05 May 2025
Structure Causal Models and LLMs Integration in Medical Visual Question Answering
Zibo Xu
Qiang Li
Weizhi Nie
Weijie Wang
Anan Liu
CML
MedIm
47
0
0
05 May 2025
Evaluating Vision Language Model Adaptations for Radiology Report Generation in Low-Resource Languages
Marco Salmè
R. Sicilia
Paolo Soda
V. Guarrasi
195
0
0
02 May 2025
Calibrating Uncertainty Quantification of Multi-Modal LLMs using Grounding
Trilok Padhi
R. Kaur
Adam D. Cobb
Manoj Acharya
Anirban Roy
Colin Samplawski
Brian Matejek
Alexander M. Berenbeim
Nathaniel D. Bastian
Susmit Jha
28
0
0
30 Apr 2025
Detecting and Mitigating Hateful Content in Multimodal Memes with Vision-Language Models
Minh-Hao Van
Xintao Wu
VLM
88
0
0
30 Apr 2025
Multimodal Large Language Models for Medicine: A Comprehensive Survey
Jiarui Ye
Hao Tang
LM&MA
91
0
0
29 Apr 2025
Keep the General, Inject the Specific: Structured Dialogue Fine-Tuning for Knowledge Injection without Catastrophic Forgetting
Y. Hong
Xiaofei Yin
Xinzhong Wang
Yi Tu
Ya Guo
Sufeng Duan
Weiqiang Wang
Lingyong Fang
Depeng Wang
Huijia Zhu
CLL
96
0
0
27 Apr 2025
Anyprefer: An Agentic Framework for Preference Data Synthesis
Yiyang Zhou
Zhaoxiang Wang
Tianle Wang
Shangyu Xing
Peng Xia
...
Chetan Bansal
Weitong Zhang
Ying Wei
Joey Tianyi Zhou
Huaxiu Yao
71
1
0
27 Apr 2025
Hallucinations and Key Information Extraction in Medical Texts: A Comprehensive Assessment of Open-Source Large Language Models
Anindya Bijoy Das
Shibbir Ahmed
Shahnewaz Karim Sakib
HILM
LM&MA
57
0
0
27 Apr 2025
Revisiting Data Auditing in Large Vision-Language Models
Hongyu Zhu
Sichu Liang
Wei Wang
Boheng Li
Tongxin Yuan
Fangqi Li
Shilin Wang
Zhuosheng Zhang
VLM
236
0
0
25 Apr 2025
Reason Like a Radiologist: Chain-of-Thought and Reinforcement Learning for Verifiable Report Generation
Peiyuan Jing
Kinhei Lee
Zhenxuan Zhang
Huichi Zhou
Zhengqing Yuan
Zhifan Gao
Lei Zhu
G. Papanastasiou
Yingying Fang
Guang Yang
MedIm
OffRL
LRM
68
0
0
25 Apr 2025
TimeSoccer: An End-to-End Multimodal Large Language Model for Soccer Commentary Generation
Ling You
Wenxuan Huang
Xinni Xie
Xiangyi Wei
Bangyan Li
Shaohui Lin
Yang Li
Changbo Wang
VGen
205
1
0
24 Apr 2025
How Well Can General Vision-Language Models Learn Medicine By Watching Public Educational Videos?
Rahul Thapa
Andrew Li
Qingyang Wu
B. He
Yuki Sahashi
...
Angela Zhang
Ben Athiwaratkun
Shuaiwen Leon Song
David Ouyang
James Zou
LM&MA
49
0
0
19 Apr 2025
Learning Joint ID-Textual Representation for ID-Preserving Image Synthesis
Zichuan Liu
Liming Jiang
Qing Yan
Yumin Jia
Hao Kang
Xin Lu
DiffM
33
0
0
19 Apr 2025
Evaluating Menu OCR and Translation: A Benchmark for Aligning Human and Automated Evaluations in Large Vision-Language Models
Zhanglin Wu
Tengfei Song
Ning Xie
Mengli Zhu
Weidong Zhang
...
Pengfei Li
Chong Li
Junhao Zhu
Hao Yang
Shiliang Sun
55
2
0
16 Apr 2025
MediSee: Reasoning-based Pixel-level Perception in Medical Images
Qinyue Tong
Ziqian Lu
Jun Liu
Yangming Zheng
Zheming Lu
LRM
43
0
0
15 Apr 2025
PATFinger: Prompt-Adapted Transferable Fingerprinting against Unauthorized Multimodal Dataset Usage
Wenbo Zhang
Ju Jia
Xiaojun Jia
Yihao Huang
Xuzhao Li
Cong Wu
Lina Wang
AAML
42
0
0
15 Apr 2025
Enhancing Multi-task Learning Capability of Medical Generalist Foundation Model via Image-centric Multi-annotation Data
Xun Zhu
Fanbin Mo
Zheng Zhang
J. Wang
Yiming Shi
Ming Wu
Chuang Zhang
Miao Li
Ji Wu
32
0
0
14 Apr 2025
PathVLM-R1: A Reinforcement Learning-Driven Reasoning Model for Pathology Visual-Language Tasks
Junfei Wu
Hao Yang
Xinhua Zeng
Guibing He
Zhengzhang Chen
Zhu Li
Xinming Zhang
Yangyang Ma
Run Fang
Yang Liu
LRM
157
0
0
12 Apr 2025
PaMi-VDPO: Mitigating Video Hallucinations by Prompt-Aware Multi-Instance Video Preference Learning
Xinpeng Ding
Kaipeng Zhang
Jinahua Han
Lanqing Hong
Hang Xu
Xuelong Li
MLLM
VLM
239
0
0
08 Apr 2025
SCAM: A Real-World Typographic Robustness Evaluation for Multimodal Foundation Models
Justus Westerhoff
Erblina Purellku
Jakob Hackstein
Jonas Loos
Leo Pinetzki
Lorenz Hufe
AAML
28
0
0
07 Apr 2025
MedM-VL: What Makes a Good Medical LVLM?
Yiming Shi
Shaoshuai Yang
Xun Zhu
Haoyu Wang
Miao Li
Ji Wu
VLM
40
1
0
06 Apr 2025
MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs
Juncheng Wu
Wenlong Deng
X. Li
Sheng Liu
Taomian Mi
...
Yihan Cao
Hui Ren
Xuzhao Li
Xiaoxiao Li
Yuyin Zhou
AI4MH
LRM
61
4
0
01 Apr 2025
Towards Understanding How Knowledge Evolves in Large Vision-Language Models
Sudong Wang
Yujie Zhang
Yao Zhu
Jianing Li
Zizhe Wang
Yi Liu
Xiangyang Ji
169
0
0
31 Mar 2025
Communication-Efficient and Personalized Federated Foundation Model Fine-Tuning via Tri-Matrix Adaptation
Yong Li
Bo Liu
Sheng Huang
Zhe Zhang
Xiaotong Yuan
Richang Hong
46
0
0
31 Mar 2025
A Large-Scale Vision-Language Dataset Derived from Open Scientific Literature to Advance Biomedical Generalist AI
Alejandro Lozano
Min Woo Sun
James Burgess
Jeffrey Nirschl
Christopher Polzak
...
Xiaohan Wang
Alfred Seunghoon Song
Chiang Chia-Chun
Robert Tibshirani
Serena Yeung-Levy
LM&MA
102
1
0
26 Mar 2025
MLLM-Selector: Necessity and Diversity-driven High-Value Data Selection for Enhanced Visual Instruction Tuning
Yiwei Ma
Guohai Xu
Xiaoshuai Sun
Jiayi Ji
Jie Lou
Debing Zhang
Rongrong Ji
95
0
0
26 Mar 2025
Lie Detector: Unified Backdoor Detection via Cross-Examination Framework
Xiaobei Wang
Siyuan Liang
Dongping Liao
Han Fang
Aishan Liu
Xiaochun Cao
Yu-liang Lu
E. Chang
X. Gao
AAML
50
1
0
21 Mar 2025
Med-R1: Reinforcement Learning for Generalizable Medical Reasoning in Vision-Language Models
Yuxiang Lai
Shitian Zhao
Ming Li
Jike Zhong
Xiaofeng Yang
OffRL
LRM
LM&MA
VLM
81
11
0
18 Mar 2025
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Yansen Wang
Shengqiong Wu
Yujie Zhang
William Yang Wang
Ziwei Liu
Jiebo Luo
Hao Fei
LRM
95
11
0
16 Mar 2025
A Multimodal Benchmark Dataset and Model for Crop Disease Diagnosis
Xiang Liu
Zhaoxiang Liu
Huan Hu
Zezhou Chen
Kohou Wang
Ning Wang
Kai Wang
43
1
0
10 Mar 2025
CLIMB: Data Foundations for Large Scale Multimodal Clinical Foundation Models
Wei Dai
Peilin Chen
Malinda Lu
Daniel Li
Haowen Wei
Hejie Cui
Paul Pu Liang
LM&MA
56
1
0
09 Mar 2025
Distilled Prompt Learning for Incomplete Multimodal Survival Prediction
Yingxue Xu
Fengtao Zhou
Chenyu Zhao
Yihui Wang
Can Yang
Hao Chen
VLM
OffRL
57
0
0
03 Mar 2025
MIRROR: Multi-Modal Pathological Self-Supervised Representation Learning via Modality Alignment and Retention
Tianyi Wang
Jianan Fan
Dingxin Zhang
Dongnan Liu
Yong-quan Xia
Heng Huang
Weidong Cai
39
0
0
01 Mar 2025
PaliGemma-CXR: A Multi-task Multimodal Model for TB Chest X-ray Interpretation
Denis Musinguzi
Andrew Katumba
Sudi Murindanyi
36
0
0
28 Feb 2025
FedMentalCare: Towards Privacy-Preserving Fine-Tuned LLMs to Analyze Mental Health Status Using Federated Learning Framework
S M Sarwar
AI4MH
51
0
0
27 Feb 2025
Repurposing the scientific literature with vision-language models
Anton Alyakin
Jaden Stryker
Daniel Alber
Karl L. Sangwon
Brandon Duderstadt
...
Laura Snyder
Eric Leuthardt
Douglas Kondziolka
E. Oermann
Eric Karl Oermann
103
0
0
26 Feb 2025
MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning
Jiazhen Pan
Che Liu
Junde Wu
Fenglin Liu
Jiayuan Zhu
Hongwei Bran Li
Chen Chen
Cheng Ouyang
Daniel Rueckert
LRM
LM&MA
VLM
73
15
0
26 Feb 2025
MOVE: A Mixture-of-Vision-Encoders Approach for Domain-Focused Vision-Language Processing
Matvey Skripkin
Elizaveta Goncharova
Dmitrii Tarasov
Andrey Kuznetsov
67
0
0
24 Feb 2025
Disentangling Visual Transformers: Patch-level Interpretability for Image Classification
Guillaume Jeanneret
Loïc Simon
F. Jurie
ViT
61
0
0
24 Feb 2025
Tracking the Copyright of Large Vision-Language Models through Parameter Learning Adversarial Images
Yubo Wang
Jianting Tang
Chaohu Liu
Linli Xu
AAML
63
1
0
23 Feb 2025