ResearchTrend.AI
Masked Vision and Language Pre-training with Unimodal and Multimodal Contrastive Losses for Medical Visual Question Answering

11 July 2023
Pengfei Li
Gang Liu
Jinlong He
Zixu Zhao
Shenjun Zhong

Papers citing "Masked Vision and Language Pre-training with Unimodal and Multimodal Contrastive Losses for Medical Visual Question Answering"

20 / 20 papers shown
A Lightweight Large Vision-language Model for Multimodal Medical Images
Belal Alsinglawi
Chris McCarthy
Sara Webb
Christopher Fluke
Navid Toosy Saidy
LM&MA
47
0
0
08 Apr 2025
Uni-Mlip: Unified Self-supervision for Medical Vision Language Pre-training
Ameera Bawazir
Kebin Wu
Wenbin Li
CLIP
77
1
0
20 Nov 2024
Memory-Augmented Multimodal LLMs for Surgical VQA via Self-Contained Inquiry
Wenjun Hou
Yi Cheng
Kaishuai Xu
Yan Hu
Wenjie Li
Jiang-Dong Liu
40
0
0
17 Nov 2024
Parameter-Efficient Fine-Tuning Medical Multimodal Large Language Models for Medical Visual Grounding
Jinlong He
Pengfei Li
Gang Liu
Shenjun Zhong
41
3
0
31 Oct 2024
ZALM3: Zero-Shot Enhancement of Vision-Language Alignment via In-Context Information in Multi-Turn Multimodal Medical Dialogue
Zhangpu Li
Changhong Zou
Suxue Ma
Zhicheng Yang
Chen Du
...
Xingzhi Sun
Jing Xiao
Kai Zhang
Mei Han
LM&MA
51
1
0
26 Sep 2024
PA-LLaVA: A Large Language-Vision Assistant for Human Pathology Image Understanding
Dawei Dai
Yuanhui Zhang
Long Xu
Qianlan Yang
Xiaojing Shen
Shuyin Xia
Guoyin Wang
LM&MA
VLM
36
9
0
18 Aug 2024
Tri-VQA: Triangular Reasoning Medical Visual Question Answering for Multi-Attribute Analysis
Lin Fan
Xun Gong
Cenyang Zheng
Yafei Ou
30
0
0
21 Jun 2024
Biomedical Visual Instruction Tuning with Clinician Preference Alignment
Hejie Cui
Lingjun Mao
Xin Liang
Jieyu Zhang
Hui Ren
Quanzheng Li
Xiang Li
Carl Yang
LM&MA
66
6
0
19 Jun 2024
LaPA: Latent Prompt Assist Model For Medical Visual Question Answering
Tiancheng Gu
Kaicheng Yang
Dongnan Liu
Weidong Cai
MedIm
46
2
0
19 Apr 2024
Multi-Image Visual Question Answering for Unsupervised Anomaly Detection
Jun Li
Cosmin I. Bercea
Philipp Muller
Lina Felsner
Suhwan Kim
Daniel Rueckert
Benedikt Wiestler
Julia A. Schnabel
34
3
0
11 Apr 2024
Foundation Model for Advancing Healthcare: Challenges, Opportunities, and Future Directions
Yuting He
Fuxiang Huang
Xinrui Jiang
Yuxiang Nie
Minghao Wang
Jiguang Wang
Hao Chen
LM&MA
AI4CE
84
30
0
04 Apr 2024
Can LLMs' Tuning Methods Work in Medical Multimodal Domain?
Jiawei Chen
Yue Jiang
Dingkang Yang
Mingcheng Li
Jinjie Wei
Ziyun Qian
Lihua Zhang
LM&MA
27
9
0
11 Mar 2024
Vision-Language Models for Medical Report Generation and Visual Question Answering: A Review
Iryna Hartsock
Ghulam Rasool
54
64
0
04 Mar 2024
MISS: A Generative Pretraining and Finetuning Approach for Med-VQA
Jiawei Chen
Dingkang Yang
Yue Jiang
Yuxuan Lei
Lihua Zhang
LM&MA
MedIm
21
13
0
10 Jan 2024
Medical Vision Language Pretraining: A survey
Prashant Shrestha
Sanskar Amgain
Bidur Khanal
Cristian A. Linte
Binod Bhattarai
VLM
46
14
0
11 Dec 2023
C3Net: Compound Conditioned ControlNet for Multimodal Content Generation
Juntao Zhang
Yuehuai Liu
Yu-Wing Tai
Chi-Keung Tang
DiffM
38
5
0
29 Nov 2023
A Survey on Image-text Multimodal Models
Ruifeng Guo
Jingxuan Wei
Linzhuang Sun
Khai Le-Duc
Guiyong Chang
Dawei Liu
Sibo Zhang
Zhengbing Yao
Mingjun Xu
Liping Bu
VLM
36
5
0
23 Sep 2023
Multi-Modal Masked Autoencoders for Medical Vision-and-Language Pre-Training
Zhihong Chen
Yu Du
Jinpeng Hu
Yang Liu
Guanbin Li
Xiang Wan
Tsung-Hui Chang
91
111
0
15 Sep 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM
BDL
VLM
CLIP
392
4,185
0
28 Jan 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
322
7,503
0
11 Nov 2021