2402.08966
Pretraining Vision-Language Model for Difference Visual Question Answering in Longitudinal Chest X-rays
14 February 2024
Yeongjae Cho
Taehee Kim
Heejun Shin
Sungzoon Cho
Dongmyung Shin
Papers citing "Pretraining Vision-Language Model for Difference Visual Question Answering in Longitudinal Chest X-rays"
13 / 13 papers shown
Title
PaLI: A Jointly-Scaled Multilingual Language-Image Model
Xi Chen
Xiao Wang
Soravit Changpinyo
A. Piergiovanni
Piotr Padlewski
...
Andreas Steiner
A. Angelova
Xiaohua Zhai
N. Houlsby
Radu Soricut
MLLM
VLM
116
732
0
14 Sep 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
418
3,602
0
29 Apr 2022
Image Difference Captioning with Pre-training and Contrastive Learning
Linli Yao
Weiying Wang
Qin Jin
SSL
VLM
69
42
0
09 Feb 2022
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Peng Wang
An Yang
Rui Men
Junyang Lin
Shuai Bai
Zhikang Li
Jianxin Ma
Chang Zhou
Jingren Zhou
Hongxia Yang
MLLM
ObjD
154
880
0
07 Feb 2022
Medical Visual Question Answering: A Survey
Zhihong Lin
Donghao Zhang
Qingyi Tao
Danli Shi
Gholamreza Haffari
Qi Wu
M. He
Z. Ge
83
119
0
19 Nov 2021
VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts
Hangbo Bao
Wenhui Wang
Li Dong
Qiang Liu
Owais Khan Mohammed
Kriti Aggarwal
Subhojit Som
Furu Wei
VLM
MLLM
MoE
89
558
0
03 Nov 2021
Describing and Localizing Multiple Changes with Transformers
Yue Qiu
Shintaro Yamamoto
Kodai Nakashima
Ryota Suzuki
K. Iwata
Hirokatsu Kataoka
Y. Satoh
64
59
0
25 Mar 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
447
1,141
0
17 Feb 2021
Generating Radiology Reports via Memory-driven Transformer
Zhihong Chen
Yan Song
Tsung-Hui Chang
Xiang Wan
MedIm
72
481
0
30 Oct 2020
MIMIC-CXR-JPG, a large publicly available database of labeled chest radiographs
Alistair E. W. Johnson
Tom Pollard
Nathaniel R. Greenbaum
M. Lungren
Chih-ying Deng
Yifan Peng
Zhiyong Lu
R. Mark
Seth Berkowitz
Steven Horng
MedIm
98
820
0
21 Jan 2019
Neural Machine Translation of Rare Words with Subword Units
Rico Sennrich
Barry Haddow
Alexandra Birch
228
7,757
0
31 Aug 2015
CIDEr: Consensus-based Image Description Evaluation
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh
297
4,508
0
20 Nov 2014
Microsoft COCO: Common Objects in Context
Tsung-Yi Lin
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
432
43,814
0
01 May 2014