arXiv:2312.06224
Medical Vision Language Pretraining: A survey
11 December 2023
Prashant Shrestha
Sanskar Amgain
Bidur Khanal
Cristian A. Linte
Binod Bhattarai
VLM
Papers citing
"Medical Vision Language Pretraining: A survey"
50 / 63 papers shown
BiomedCLIP: a multimodal biomedical foundation model pretrained from fifteen million scientific image-text pairs
Sheng Zhang
Yanbo Xu
Naoto Usuyama
Hanwen Xu
J. Bagga
...
Carlo Bifulco
M. Lungren
Tristan Naumann
Sheng Wang
Hoifung Poon
LM&MA
MedIm
227
233
0
10 Jan 2025
CXR-CLIP: Toward Large Scale Chest X-ray Language-Image Pre-training
Kihyun You
Jawook Gu
Jiyeon Ham
Beomhee Park
Jiho Kim
Eun K. Hong
Woonhyuk Baek
Byungseok Roh
CLIP
VLM
65
63
0
20 Oct 2023
Utilizing Synthetic Data for Medical Vision-Language Pre-training: Bypassing the Need for Real Images
Che Liu
Anand Shah
Wenjia Bai
Rossella Arcucci
MedIm
98
14
0
10 Oct 2023
An Empirical Analysis for Zero-Shot Multi-Label Classification on COVID-19 CT Scans and Uncurated Reports
Ethan Dack
Lorenzo Brigato
Matthew McMurray
Matthias Fontanellaz
Thomas Frauenfelder
...
Thomas Geiser
M. Funke-Chambour
Andreas Christe
L. Ebner
Stavroula Mougiakakou
65
2
0
04 Sep 2023
A Foundation Language-Image Model of the Retina (FLAIR): Encoding Expert Knowledge in Text Supervision
Julio Silva-Rodríguez
H. Chakor
Riadh Kobbi
Jose Dolz
Ismail Ben Ayed
VLM
MedIm
231
44
0
15 Aug 2023
Learning Multi-modal Representations by Watching Hundreds of Surgical Video Lectures
Kun Yuan
V. Srivastav
Tong Yu
Joël L. Lavanchy
J. Marescaux
Pietro Mascagni
Nassir Navab
N. Padoy
142
23
0
27 Jul 2023
Knowledge Boosting: Rethinking Medical Contrastive Vision-Language Pre-Training
Xiaofei Chen
Yuting He
Cheng Xue
Rongjun Ge
Shuo Li
Guanyu Yang
VLM
MedIm
71
13
0
14 Jul 2023
Masked Vision and Language Pre-training with Unimodal and Multimodal Contrastive Losses for Medical Visual Question Answering
Pengfei Li
Gang Liu
Jinlong He
Zixu Zhao
Shenjun Zhong
42
37
0
11 Jul 2023
Quilt-1M: One Million Image-Text Pairs for Histopathology
Wisdom O. Ikezogwo
M. S. Seyfioglu
Fatemeh Ghezloo
Dylan Stefan Chan Geva
Fatwir Sheikh Mohammed
Pavan Kumar Anand
Ranjay Krishna
Linda G. Shapiro
CLIP
VLM
301
125
0
20 Jun 2023
Visual Language Pretrained Multiple Instance Zero-Shot Transfer for Histopathology Images
Ming Y. Lu
Bowen Chen
Andrew Zhang
Drew F. K. Williamson
Richard J. Chen
Tong Ding
L. Le
Yung-Sung Chuang
Faisal Mahmood
VLM
MedIm
189
102
0
13 Jun 2023
Local Contrastive Learning for Medical Image Recognition
S. A. Rizvi
Ruixiang Tang
X. Jiang
X. Ma
X. Hu
76
6
0
24 Mar 2023
MEDIMP: 3D Medical Images with clinical Prompts from limited tabular data for renal transplantation
Léo Milecki
Vicky Kalogeiton
Sylvain Bodard
D. Anglicheau
J. Correas
M. Timsit
Maria Vakalopoulou
MedIm
67
4
0
22 Mar 2023
LIMITR: Leveraging Local Information for Medical Image-Text Representation
Gefen Dawidowicz
Elad Hirsch
A. Tal
69
15
0
21 Mar 2023
Towards Unifying Medical Vision-and-Language Pre-training via Soft Prompts
Zhihong Chen
Shizhe Diao
Benyou Wang
Guanbin Li
Xiang Wan
MedIm
113
33
0
17 Feb 2023
Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing
Shruthi Bannur
Stephanie L. Hyland
Qianchu Liu
Fernando Pérez-García
Maximilian Ilse
...
Maria T. A. Wetscherek
M. Lungren
A. Nori
Javier Alvarez-Valle
Ozan Oktay
87
126
0
11 Jan 2023
MedKLIP: Medical Knowledge Enhanced Language-Image Pre-Training in Radiology
Chaoyi Wu
Xiaoman Zhang
Ya Zhang
Yanfeng Wang
Weidi Xie
LM&MA
VLM
106
120
0
05 Jan 2023
Learning Multimodal Data Augmentation in Feature Space
Zichang Liu
Zhiqiang Tang
Xingjian Shi
Aston Zhang
Mu Li
Anshumali Shrivastava
A. Wilson
89
23
0
29 Dec 2022
UnICLAM: Contrastive Representation Learning with Adversarial Masking for Unified and Interpretable Medical Vision Question Answering
Chenlu Zhan
Peng Peng
Hongsen Wang
Tao Chen
Hongwei Wang
MedIm
68
4
0
21 Dec 2022
Significantly Improving Zero-Shot X-ray Pathology Classification via Fine-tuning Pre-trained Image-Text Encoders
Jongseong Jang
Daeun Kyung
Seunghyeon Kim
Honglak Lee
Kyunghoon Bae
Edward Choi
LM&MA
MedIm
62
11
0
14 Dec 2022
RoentGen: Vision-Language Foundation Model for Chest X-ray Generation
Pierre J. Chambon
Christian Blüthgen
Jean-Benoit Delbrouck
Rogier van der Sluijs
M. Polacin
Juan Manuel Zambrano Chaves
Tanishq Mathew Abraham
Shivanshu Purohit
C. Langlotz
Akshay S. Chaudhari
LM&MA
DiffM
MedIm
68
102
0
23 Nov 2022
Multi-Granularity Cross-modal Alignment for Generalized Medical Visual Representation Learning
Fuying Wang
Yuyin Zhou
Shujun Wang
V. Vardhanabhuti
Lequan Yu
105
146
0
12 Oct 2022
Multi-Modal Masked Autoencoders for Medical Vision-and-Language Pre-Training
Zhihong Chen
Yu Du
Jinpeng Hu
Yang Liu
Guanbin Li
Xiang Wan
Tsung-Hui Chang
146
118
0
15 Sep 2022
Multimodal Masked Autoencoders Learn Transferable Representations
Xinyang Geng
Hao Liu
Lisa Lee
Dale Schuurmans
Sergey Levine
Pieter Abbeel
79
118
0
27 May 2022
Breaking with Fixed Set Pathology Recognition through Report-Guided Contrastive Training
C. Seibold
Simon Reiß
M. Sarfraz
Rainer Stiefelhagen
Jens Kleesiek
51
31
0
14 May 2022
Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing
Benedikt Boecking
Naoto Usuyama
Shruthi Bannur
Daniel Coelho De Castro
Anton Schwaighofer
...
Tristan Naumann
A. Nori
Javier Alvarez-Valle
Hoifung Poon
Ozan Oktay
71
245
0
21 Apr 2022
Visual Prompt Tuning
Menglin Jia
Luming Tang
Bor-Chun Chen
Claire Cardie
Serge Belongie
Bharath Hariharan
Ser-Nam Lim
VLM
VPVLM
158
1,645
0
23 Mar 2022
A Survey on Model Compression and Acceleration for Pretrained Language Models
Canwen Xu
Julian McAuley
86
60
0
15 Feb 2022
Inference of captions from histopathological patches
M. Tsuneki
F. Kanavati
81
32
0
07 Feb 2022
Grounded Language-Image Pre-training
Liunian Harold Li
Pengchuan Zhang
Haotian Zhang
Jianwei Yang
Chunyuan Li
...
Lu Yuan
Lei Zhang
Lei Li
Kai-Wei Chang
Jianfeng Gao
ObjD
VLM
136
1,067
0
07 Dec 2021
Joint Learning of Localized Representations from Medical Images and Reports
Philip Müller
Georgios Kaissis
Congyu Zou
Daniel Rueckert
200
85
0
06 Dec 2021
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
477
7,827
0
11 Nov 2021
LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs
Christoph Schuhmann
Richard Vencu
Romain Beaumont
R. Kaczmarczyk
Clayton Mullis
Aarush Katta
Theo Coombes
J. Jitsev
Aran Komatsuzaki
VLM
MLLM
CLIP
243
1,444
0
03 Nov 2021
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm
Yangguang Li
Feng Liang
Lichen Zhao
Yufeng Cui
Wanli Ouyang
Jing Shao
F. Yu
Junjie Yan
VLM
CLIP
154
458
0
11 Oct 2021
ResViT: Residual vision transformers for multi-modal medical image synthesis
Onat Dalmaz
Mahmut Yurt
Tolga Çukur
ViT
MedIm
94
351
0
30 Jun 2021
RadGraph: Extracting Clinical Entities and Relations from Radiology Reports
Saahil Jain
Ashwin Agrawal
A. Saporta
Steven QH Truong
D. Duong
...
Yuhao Zhang
M. Lungren
A. Ng
C. Langlotz
Pranav Rajpurkar
MedIm
96
213
0
28 Jun 2021
Multi-modal Understanding and Generation for Medical Images and Text via Vision-Language Pre-Training
Jong Hak Moon
HyunGyung Lee
W. Shin
Young-Hak Kim
Edward Choi
MedIm
79
160
0
24 May 2021
COVID-Net CXR-2: An Enhanced Deep Convolutional Neural Network Design for Detection of COVID-19 Cases from Chest X-ray Images
Maya Pavlova
Naomi Terhljan
A. Chung
Andy Zhao
Siddharth Surana
Hossein Aboutalebi
Hayden Gunraj
A. Sabri
Amer Alaref
Alexander Wong
46
47
0
14 May 2021
MMBERT: Multimodal BERT Pretraining for Improved Medical VQA
Yash Khare
Viraj Bagal
Minesh Mathew
Adithi Devi
U. Priyakumar
C. V. Jawahar
MedIm
75
136
0
03 Apr 2021
Self-supervised Image-text Pre-training With Mixed Data In Chest X-rays
Xiaosong Wang
Ziyue Xu
Leo K. Tam
Dong Yang
Daguang Xu
ViT
MedIm
57
24
0
30 Mar 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
467
21,603
0
25 Mar 2021
Multimodal Representation Learning via Maximization of Local Mutual Information
Ruizhi Liao
Daniel Moyer
Miriam Cha
Keegan Quigley
Seth Berkowitz
Steven Horng
Polina Golland
W. Wells
SSL
81
42
0
08 Mar 2021
TransBTS: Multimodal Brain Tumor Segmentation Using Transformer
Wenxuan Wang
Chen Chen
Meng Ding
Jiangyun Li
Hong Yu
Sen Zha
ViT
MedIm
94
729
0
07 Mar 2021
SLAKE: A Semantically-Labeled Knowledge-Enhanced Dataset for Medical Visual Question Answering
Bo Liu
Li-Ming Zhan
Li Xu
Lin Ma
Y. Yang
Xiao-Ming Wu
80
272
0
18 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
463
3,901
0
11 Feb 2021
Intriguing Properties of Contrastive Losses
Ting Chen
Calvin Luo
Lala Li
72
177
0
05 Nov 2020
MedICaT: A Dataset of Medical Images, Captions, and Textual References
Sanjay Subramanian
Lucy Lu Wang
Sachin Mehta
Ben Bogin
Madeleine van Zuylen
Sravanthi Parasa
Sameer Singh
Matt Gardner
Hannaneh Hajishirzi
MedIm
51
74
0
12 Oct 2020
Contrastive Learning of Medical Visual Representations from Paired Images and Text
Yuhao Zhang
Hang Jiang
Yasuhide Miura
Christopher D. Manning
C. Langlotz
MedIm
147
767
0
02 Oct 2020
Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing
Yu Gu
Robert Tinn
Hao Cheng
Michael R. Lucas
Naoto Usuyama
Xiaodong Liu
Tristan Naumann
Jianfeng Gao
Hoifung Poon
LM&MA
AI4CE
121
1,783
0
31 Jul 2020
ChestX-Det10: Chest X-ray Dataset on Detection of Thoracic Abnormalities
Jingyun Liu
Jie Lian
Yizhou Yu
ViT
OOD
52
35
0
17 Jun 2020
Auxiliary Signal-Guided Knowledge Encoder-Decoder for Medical Report Generation
Mingjie Li
Fuyu Wang
Xiaojun Chang
Xiaodan Liang
MedIm
68
103
0
06 Jun 2020