2301.04558
Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing
11 January 2023
Shruthi Bannur
Stephanie L. Hyland
Qianchu Liu
Fernando Pérez-García
Maximilian Ilse
Daniel Coelho De Castro
Benedikt Boecking
H. Sharma
Kenza Bouzid
Anja Thieme
Anton Schwaighofer
Maria T. A. Wetscherek
M. Lungren
A. Nori
Javier Alvarez-Valle
Ozan Oktay
Papers citing
"Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing"
50 / 57 papers shown
Title
RADAR: Enhancing Radiology Report Generation with Supplementary Knowledge Injection
Wenjun Hou
Yi Cheng
Kaishuai Xu
Heng Li
Yan Hu
Wenjie Li
Jiang Liu
98
0
0
20 May 2025
CheXLearner: Text-Guided Fine-Grained Representation Learning for Progression Detection
Yanjie Wang
Junwen Duan
Xinyu Li
Jianxin Wang
MedIm
76
0
0
11 May 2025
MedUnifier: Unifying Vision-and-Language Pre-training on Medical Data with Vision Generation Task using Discrete Visual Representations
Ziyang Zhang
Yang Yu
Yucheng Chen
Xulei Yang
S. Yeo
MedIm
143
2
0
02 Mar 2025
Libra: Leveraging Temporal Images for Biomedical Radiology Analysis
Xi Zhang
Zaiqiao Meng
Jake Lever
Edmond S. L. Ho
MedIm
160
1
0
28 Nov 2024
Can Medical Vision-Language Pre-training Succeed with Purely Synthetic Data?
Che Liu
Zhongwei Wan
Haozhe Wang
Yinda Chen
T. Qaiser
Chen Jin
Fariba Yousefi
Nikolay Burlutskiy
Rossella Arcucci
VLM
SyDa
LM&MA
MedIm
150
2
0
17 Oct 2024
Zero-Shot Medical Phrase Grounding with Off-the-shelf Diffusion Models
Konstantinos Vilouras
Pedro Sanchez
Alison Q. O'Neil
Sotirios A. Tsaftaris
MedIm
140
3
0
19 Apr 2024
MIML: Multiplex Image Machine Learning for High Precision Cell Classification via Mechanical Traits within Microfluidic Systems
Khayrul Islam
Ratul Paul
Shen Wang
Yaling Liu
Partho Adhikary
Qiying Li
Xiaochen Qin
Yaling Liu
85
0
0
15 Sep 2023
Improving Radiology Report Generation Systems by Removing Hallucinated References to Non-existent Priors
Vignav Ramesh
Nathan Chi
Pranav Rajpurkar
MedIm
78
50
0
27 Sep 2022
CheXRelNet: An Anatomy-Aware Model for Tracking Longitudinal Relationships between Chest X-Rays
Gaurang Karwande
Amarachi Mbakawe
Joy T. Wu
Leo Anthony Celi
Mehdi Moradi
Ismini Lourentzou
47
16
0
08 Aug 2022
Time Is MattEr: Temporal Self-supervision for Video Transformers
Sukmin Yun
Jaehyung Kim
Dongyoon Han
Hwanjun Song
Jung-Woo Ha
Jinwoo Shin
ViT
64
12
0
19 Jul 2022
CoCa: Contrastive Captioners are Image-Text Foundation Models
Jiahui Yu
Zirui Wang
Vijay Vasudevan
Legg Yeung
Mojtaba Seyedhosseini
Yonghui Wu
VLM
CLIP
OffRL
169
1,307
0
04 May 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
418
3,607
0
29 Apr 2022
Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing
Benedikt Boecking
Naoto Usuyama
Shruthi Bannur
Daniel Coelho De Castro
Anton Schwaighofer
...
Tristan Naumann
A. Nori
Javier Alvarez-Valle
Hoifung Poon
Ozan Oktay
71
245
0
21 Apr 2022
How Do Vision Transformers Work?
Namuk Park
Songkuk Kim
ViT
90
484
0
14 Feb 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM
BDL
VLM
CLIP
555
4,413
0
28 Jan 2022
Robust Contrastive Learning against Noisy Views
Ching-Yao Chuang
R. Devon Hjelm
Xin Eric Wang
Vibhav Vineet
Neel Joshi
Antonio Torralba
Stefanie Jegelka
Ya-heng Song
NoLa
55
72
0
12 Jan 2022
FLAVA: A Foundational Language And Vision Alignment Model
Amanpreet Singh
Ronghang Hu
Vedanuj Goswami
Guillaume Couairon
Wojciech Galuba
Marcus Rohrbach
Douwe Kiela
CLIP
VLM
104
715
0
08 Dec 2021
FILIP: Fine-grained Interactive Language-Image Pre-Training
Lewei Yao
Runhu Huang
Lu Hou
Guansong Lu
Minzhe Niu
Hang Xu
Xiaodan Liang
Zhenguo Li
Xin Jiang
Chunjing Xu
VLM
CLIP
108
643
0
09 Nov 2021
Leveraging Time Irreversibility with Order-Contrastive Pre-training
Monica Agrawal
Hunter Lang
M. Offin
L. Gazit
David Sontag
46
7
0
04 Nov 2021
An Empirical Study of Training End-to-End Vision-and-Language Transformers
Zi-Yi Dou
Yichong Xu
Zhe Gan
Jianfeng Wang
Shuohang Wang
...
Pengchuan Zhang
Lu Yuan
Nanyun Peng
Zicheng Liu
Michael Zeng
VLM
74
380
0
03 Nov 2021
VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts
Hangbo Bao
Wenhui Wang
Li Dong
Qiang Liu
Owais Khan Mohammed
Kriti Aggarwal
Subhojit Som
Furu Wei
VLM
MLLM
MoE
89
559
0
03 Nov 2021
Long Short View Feature Decomposition via Contrastive Video Representation Learning
Nadine Behrmann
Mohsen Fayyaz
Juergen Gall
M. Noroozi
55
36
0
23 Sep 2021
Data Efficient Masked Language Modeling for Vision and Language
Yonatan Bitton
Gabriel Stanovsky
Michael Elhadad
Roy Schwartz
VLM
72
20
0
05 Sep 2021
Finetuned Language Models Are Zero-Shot Learners
Jason W. Wei
Maarten Bosma
Vincent Zhao
Kelvin Guu
Adams Wei Yu
Brian Lester
Nan Du
Andrew M. Dai
Quoc V. Le
ALM
UQCV
233
3,782
0
03 Sep 2021
Rethinking and Improving Relative Position Encoding for Vision Transformer
Kan Wu
Houwen Peng
Minghao Chen
Jianlong Fu
Hongyang Chao
ViT
88
338
0
29 Jul 2021
Align before Fuse: Vision and Language Representation Learning with Momentum Distillation
Junnan Li
Ramprasaath R. Selvaraju
Akhilesh Deepak Gotmare
Shafiq Joty
Caiming Xiong
Guosheng Lin
FaML
221
1,975
0
16 Jul 2021
RadGraph: Extracting Clinical Entities and Relations from Radiology Reports
Saahil Jain
Ashwin Agrawal
A. Saporta
Steven QH Truong
D. Duong
...
Yuhao Zhang
M. Lungren
A. Ng
C. Langlotz
Pranav Rajpurkar
MedIm
96
213
0
28 Jun 2021
Exploring the Limits of Out-of-Distribution Detection
Stanislav Fort
Jie Jessie Ren
Balaji Lakshminarayanan
77
338
0
06 Jun 2021
Multi-modal Understanding and Generation for Medical Images and Text via Vision-Language Pre-Training
Jong Hak Moon
HyunGyung Lee
W. Shin
Young-Hak Kim
Edward Choi
MedIm
76
160
0
24 May 2021
Broaden Your Views for Self-Supervised Video Learning
Adrià Recasens
Pauline Luc
Jean-Baptiste Alayrac
Luyu Wang
Ross Hemsley
...
Florent Altché
M. Valko
Jean-Bastien Grill
Aaron van den Oord
Andrew Zisserman
SSL
AI4TS
101
128
0
30 Mar 2021
ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases
Stéphane d'Ascoli
Hugo Touvron
Matthew L. Leavitt
Ari S. Morcos
Giulio Biroli
Levent Sagun
ViT
136
834
0
19 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
975
29,871
0
26 Feb 2021
MedAug: Contrastive learning leveraging patient metadata improves representations for chest X-ray interpretation
Yen Nhi Truong Vu
Richard Wang
N. Balachandar
Can Liu
A. Ng
Pranav Rajpurkar
72
83
0
21 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
459
3,901
0
11 Feb 2021
COVID-19 Prognosis via Self-Supervised Representation Learning and Multi-Image Prediction
Anuroop Sriram
Matthew Muckley
Koustuv Sinha
Farah E. Shamout
Joelle Pineau
Krzysztof J. Geras
Lea Azour
Y. Aphinyanaphongs
N. Yakubova
W. Moore
MedIm
179
65
0
13 Jan 2021
Generating Radiology Reports via Memory-driven Transformer
Zhihong Chen
Yan Song
Tsung-Hui Chang
Xiang Wan
MedIm
72
481
0
30 Oct 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
676
41,483
0
22 Oct 2020
Improving Factual Completeness and Consistency of Image-to-Text Radiology Report Generation
Yasuhide Miura
Yuhao Zhang
Emily Bao Tsai
C. Langlotz
Dan Jurafsky
MedIm
219
158
0
20 Oct 2020
Self-supervised Co-training for Video Representation Learning
Tengda Han
Weidi Xie
Andrew Zisserman
SSL
242
320
0
19 Oct 2020
Contrastive Learning of Medical Visual Representations from Paired Images and Text
Yuhao Zhang
Hang Jiang
Yasuhide Miura
Christopher D. Manning
C. Langlotz
MedIm
147
767
0
02 Oct 2020
Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing
Yu Gu
Robert Tinn
Hao Cheng
Michael R. Lucas
Naoto Usuyama
Xiaodong Liu
Tristan Naumann
Jianfeng Gao
Hoifung Poon
LM&MA
AI4CE
116
1,783
0
31 Jul 2020
Debiased Contrastive Learning
Ching-Yao Chuang
Joshua Robinson
Yen-Chen Lin
Antonio Torralba
Stefanie Jegelka
SSL
81
566
0
01 Jul 2020
VirTex: Learning Visual Representations from Textual Annotations
Karan Desai
Justin Johnson
SSL
VLM
162
436
0
11 Jun 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
880
42,379
0
28 May 2020
End-to-End Object Detection with Transformers
Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
ViT
3DV
PINN
437
13,108
0
26 May 2020
Quantifying Attention Flow in Transformers
Samira Abnar
Willem H. Zuidema
169
802
0
02 May 2020
CheXbert: Combining Automatic Labelers and Expert Annotations for Accurate Radiology Report Labeling Using BERT
Akshay Smit
Saahil Jain
Pranav Rajpurkar
Anuj Pareek
A. Ng
M. Lungren
MedIm
52
332
0
20 Apr 2020
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers
Iryna Gurevych
1.3K
12,316
0
27 Aug 2019
CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison
Jeremy Irvin
Pranav Rajpurkar
M. Ko
Yifan Yu
Silviana Ciurea-Ilcus
...
D. Larson
C. Langlotz
Bhavik Patel
M. Lungren
A. Ng
118
2,604
0
21 Jan 2019
Fully Convolutional Siamese Networks for Change Detection
Rodrigo Caye Daudt
Bertrand Le Saux
Alexandre Boulch
59
1,121
0
19 Oct 2018