2404.06057
Cited By
Unified Multi-modal Diagnostic Framework with Reconstruction Pre-training and Heterogeneity-combat Tuning
9 April 2024
Yupei Zhang
Li Pan
Qiushi Yang
Tan Li
Zhen Chen
Papers citing
"Unified Multi-modal Diagnostic Framework with Reconstruction Pre-training and Heterogeneity-combat Tuning"
29 / 29 papers shown
Long-tailed Medical Diagnosis with Relation-aware Representation Learning and Iterative Classifier Calibration
Li Pan
Yupei Zhang
Qiushi Yang
Tan Li
Zhen Chen
95
0
0
05 Feb 2025
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
429
4,641
0
30 Jan 2023
Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information
Weijie Su
Xizhou Zhu
Chenxin Tao
Lewei Lu
Bin Li
Gao Huang
Yu Qiao
Xiaogang Wang
Jie Zhou
Jifeng Dai
73
41
0
17 Nov 2022
Test-Time Training with Masked Autoencoders
Yossi Gandelsman
Yu Sun
Xinlei Chen
Alexei A. Efros
OOD
101
177
0
15 Sep 2022
Multi-Modal Masked Autoencoders for Medical Vision-and-Language Pre-Training
Zhihong Chen
Yu Du
Jinpeng Hu
Yang Liu
Guanbin Li
Xiang Wan
Tsung-Hui Chang
134
118
0
15 Sep 2022
GaitForeMer: Self-Supervised Pre-Training of Transformers via Human Motion Forecasting for Few-Shot Gait Impairment Severity Estimation
Mark Endo
K. Poston
E. Sullivan
L. Fei-Fei
K. Pohl
Ehsan Adeli
71
19
0
30 Jun 2022
Balanced Multimodal Learning via On-the-fly Gradient Modulation
Xiaokang Peng
Yake Wei
Andong Deng
Dong Wang
Di Hu
69
213
0
29 Mar 2022
Vision-Language Pre-Training with Triple Contrastive Learning
Jinyu Yang
Jiali Duan
Son N. Tran
Yi Xu
Sampath Chanda
Liqun Chen
Belinda Zeng
Trishul Chilimbi
Junzhou Huang
VLM
108
297
0
21 Feb 2022
Fine-Tuning can Distort Pretrained Features and Underperform Out-of-Distribution
Ananya Kumar
Aditi Raghunathan
Robbie Jones
Tengyu Ma
Percy Liang
OODD
124
681
0
21 Feb 2022
Masked Feature Prediction for Self-Supervised Visual Pre-Training
Chen Wei
Haoqi Fan
Saining Xie
Chao-Yuan Wu
Alan Yuille
Christoph Feichtenhofer
ViT
149
670
0
16 Dec 2021
FLAVA: A Foundational Language And Vision Alignment Model
Amanpreet Singh
Ronghang Hu
Vedanuj Goswami
Guillaume Couairon
Wojciech Galuba
Marcus Rohrbach
Douwe Kiela
CLIP
VLM
99
715
0
08 Dec 2021
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
467
7,814
0
11 Nov 2021
Recent advances and clinical applications of deep learning in medical image analysis
Xuxin Chen
Ximing Wang
Kecheng Zhang
K. Fung
T. Thai
K. Moore
Robert S. Mannel
Hong Liu
B. Zheng
Y. Qiu
OOD
76
599
0
27 May 2021
Multi-modal Understanding and Generation for Medical Images and Text via Vision-Language Pre-Training
Jong Hak Moon
HyunGyung Lee
W. Shin
Young-Hak Kim
Edward Choi
MedIm
73
160
0
24 May 2021
MMBERT: Multimodal BERT Pretraining for Improved Medical VQA
Yash Khare
Viraj Bagal
Minesh Mathew
Adithi Devi
U. Priyakumar
C. V. Jawahar
MedIm
66
136
0
03 Apr 2021
SLAKE: A Semantically-Labeled Knowledge-Enhanced Dataset for Medical Visual Question Answering
Bo Liu
Li-Ming Zhan
Li Xu
Lin Ma
Y. Yang
Xiao-Ming Wu
78
270
0
18 Feb 2021
ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision
Wonjae Kim
Bokyung Son
Ildoo Kim
VLM
CLIP
128
1,749
0
05 Feb 2021
Annotation-efficient deep learning for automatic medical image segmentation
Shanshan Wang
Cheng Li
Rongpin Wang
Zaiyi Liu
Meiyun Wang
...
Xin Liu
Jie Chen
Hui-Chong Zhou
Ismail Ben Ayed
Hairong Zheng
VLM
MedIm
81
186
0
09 Dec 2020
MedICaT: A Dataset of Medical Images, Captions, and Textual References
Sanjay Subramanian
Lucy Lu Wang
Sachin Mehta
Ben Bogin
Madeleine van Zuylen
Sravanthi Parasa
Sameer Singh
Matt Gardner
Hannaneh Hajishirzi
MedIm
48
74
0
12 Oct 2020
Contrastive Learning of Medical Visual Representations from Paired Images and Text
Yuhao Zhang
Hang Jiang
Yasuhide Miura
Christopher D. Manning
C. Langlotz
MedIm
137
766
0
02 Oct 2020
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
VLM
AI4CE
CLL
155
2,434
0
23 Apr 2020
A Simple Framework for Contrastive Learning of Visual Representations
Ting Chen
Simon Kornblith
Mohammad Norouzi
Geoffrey E. Hinton
SSL
375
18,859
0
13 Feb 2020
Overcoming Data Limitation in Medical Visual Question Answering
Binh Duc Nguyen
Thanh-Toan Do
Binh X. Nguyen
Tuong Khanh Long Do
Erman Tjiputra
Quang-Dieu Tran
MedIm
62
151
0
26 Sep 2019
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
Jiasen Lu
Dhruv Batra
Devi Parikh
Stefan Lee
SSL
VLM
234
3,693
0
06 Aug 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
671
24,528
0
26 Jul 2019
SciBERT: A Pretrained Language Model for Scientific Text
Iz Beltagy
Kyle Lo
Arman Cohan
157
2,983
0
26 Mar 2019
Layer Normalization
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
416
10,526
0
21 Jul 2016
Gaussian Error Linear Units (GELUs)
Dan Hendrycks
Kevin Gimpel
172
5,037
0
27 Jun 2016
Stacked Attention Networks for Image Question Answering
Zichao Yang
Xiaodong He
Jianfeng Gao
Li Deng
Alex Smola
BDL
109
1,883
0
07 Nov 2015