2307.05435
One-Versus-Others Attention: Scalable Multimodal Integration for Clinical Data
11 July 2023
Michal Golovanevsky
Eva Schiller
Akira Nair
Ritambhara Singh
Carsten Eickhoff
Papers citing
"One-Versus-Others Attention: Scalable Multimodal Integration for Clinical Data"
36 / 36 papers shown
Title
Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa
VLM
MLLM
529
4,725
0
17 Apr 2023
NarrowBERT: Accelerating Masked Language Model Pretraining and Inference
Haoxin Li
Phillip Keung
Daniel Cheng
Jungo Kasai
Noah A. Smith
42
4
0
11 Jan 2023
Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing
Shruthi Bannur
Stephanie L. Hyland
Qianchu Liu
Fernando Pérez-García
Maximilian Ilse
...
Maria T. A. Wetscherek
M. Lungren
A. Nori
Javier Alvarez-Valle
Ozan Oktay
59
124
0
11 Jan 2023
Cross-Domain Consumer Review Analysis
Aditya Pandey
Kunal Joshi
18
1
0
23 Dec 2022
SANCL: Multimodal Review Helpfulness Prediction with Selective Attention and Natural Contrastive Learning
Wei Han
Hui Chen
Zhen Hai
Soujanya Poria
Lidong Bing
75
16
0
12 Sep 2022
MedFuse: Multi-modal fusion with clinical time-series data and chest X-ray images
Nasir Hayat
Krzysztof J. Geras
Farah E. Shamout
MedIm
50
41
0
14 Jul 2022
Multimodal Attention-based Deep Learning for Alzheimer's Disease Diagnosis
Michal Golovanevsky
Carsten Eickhoff
Ritambhara Singh
52
65
0
17 Jun 2022
DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection
Yingwei Li
Adams Wei Yu
Tianjian Meng
Benjamin Caine
Jiquan Ngiam
...
Yifeng Lu
Denny Zhou
Quoc V. Le
Alan Yuille
Mingxing Tan
3DPC
92
336
0
15 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Steven Hoi
MLLM
BDL
VLM
CLIP
530
4,343
0
28 Jan 2022
End-to-end Generative Pretraining for Multimodal Video Captioning
Paul Hongsuck Seo
Arsha Nagrani
Anurag Arnab
Cordelia Schmid
68
168
0
20 Jan 2022
TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation
Tanzila Rahman
Mengyu Yang
Leonid Sigal
ViT
60
8
0
26 Oct 2021
Deep Orthogonal Fusion: Multimodal Prognostic Biomarker Discovery Integrating Radiology, Pathology, Genomic, and Clinical Data
Nathaniel Braman
Jacob Gordon
Emery T. Goossens
Caleb Willis
Martin C. Stumpe
Jagadish Venkataraman
46
63
0
01 Jul 2021
Attention Bottlenecks for Multimodal Fusion
Arsha Nagrani
Shan Yang
Anurag Arnab
A. Jansen
Cordelia Schmid
Chen Sun
98
565
0
30 Jun 2021
Multi-modal Understanding and Generation for Medical Images and Text via Vision-Language Pre-Training
Jong Hak Moon
HyunGyung Lee
W. Shin
Young-Hak Kim
Edward Choi
MedIm
56
159
0
24 May 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Yin Cui
Boqing Gong
ViT
314
588
0
22 Apr 2021
MMBERT: Multimodal BERT Pretraining for Improved Medical VQA
Yash Khare
Viraj Bagal
Minesh Mathew
Adithi Devi
U. Priyakumar
C. V. Jawahar
MedIm
63
135
0
03 Apr 2021
Perceiver: General Perception with Iterative Attention
Andrew Jaegle
Felix Gimeno
Andrew Brock
Andrew Zisserman
Oriol Vinyals
João Carreira
VLM
ViT
MDE
185
1,014
0
04 Mar 2021
ActBERT: Learning Global-Local Video-Text Representations
Linchao Zhu
Yi Yang
ViT
117
421
0
14 Nov 2020
Cross-Domain Sentiment Classification with Contrastive Learning and Mutual Information Maximization
Tian Li
Xiang Chen
Shanghang Zhang
Zhen Dong
Kurt Keutzer
125
36
0
30 Oct 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
637
41,003
0
22 Oct 2020
The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes
Douwe Kiela
Hamed Firooz
Aravind Mohan
Vedanuj Goswami
Amanpreet Singh
Pratik Ringshia
Davide Testuggine
87
604
0
10 May 2020
Pixel-BERT: Aligning Image Pixels with Text by Deep Multi-Modal Transformers
Zhicheng Huang
Zhaoyang Zeng
Bei Liu
Dongmei Fu
Jianlong Fu
ViT
130
438
0
02 Apr 2020
ImageBERT: Cross-modal Pre-training with Large-scale Weak-supervised Image-Text Data
Di Qi
Lin Su
Jianwei Song
Edward Cui
Taroon Bharti
Arun Sacheti
VLM
78
261
0
22 Jan 2020
Benchmarking machine learning models on multi-centre eICU critical care dataset
Seyedmostafa Sheikhalishahi
Vevake Balaraman
V. Osmani
OOD
31
72
0
02 Oct 2019
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
Weijie Su
Xizhou Zhu
Yue Cao
Bin Li
Lewei Lu
Furu Wei
Jifeng Dai
VLM
MLLM
SSL
153
1,663
0
22 Aug 2019
LXMERT: Learning Cross-Modality Encoder Representations from Transformers
Hao Tan
Mohit Bansal
VLM
MLLM
237
2,479
0
20 Aug 2019
VisualBERT: A Simple and Performant Baseline for Vision and Language
Liunian Harold Li
Mark Yatskar
Da Yin
Cho-Jui Hsieh
Kai-Wei Chang
VLM
138
1,951
0
09 Aug 2019
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
Jiasen Lu
Dhruv Batra
Devi Parikh
Stefan Lee
SSL
VLM
226
3,678
0
06 Aug 2019
Multimodal Transformer for Unaligned Multimodal Language Sequences
Yao-Hung Hubert Tsai
Shaojie Bai
Paul Pu Liang
J. Zico Kolter
Louis-Philippe Morency
Ruslan Salakhutdinov
78
1,301
0
01 Jun 2019
Multimodal Transformer with Multi-View Visual Representation for Image Captioning
Jun-chen Yu
Jing Li
Zhou Yu
Qingming Huang
ViT
61
384
0
20 May 2019
VideoBERT: A Joint Model for Video and Language Representation Learning
Chen Sun
Austin Myers
Carl Vondrick
Kevin Patrick Murphy
Cordelia Schmid
VLM
SSL
77
1,246
0
03 Apr 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.7K
94,770
0
11 Oct 2018
Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition
George Sterpu
Christian Saam
N. Harte
65
65
0
05 Sep 2018
Multimodal Sentiment Analysis: Addressing Key Issues and Setting up the Baselines
Soujanya Poria
Navonil Majumder
Devamanyu Hazarika
Erik Cambria
Alexander Gelbukh
Amir Hussain
31
173
0
19 Mar 2018
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
692
131,526
0
12 Jun 2017
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong
Hieu H. Pham
Christopher D. Manning
380
7,962
0
17 Aug 2015