ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2307.05435
  4. Cited By
One-Versus-Others Attention: Scalable Multimodal Integration for
  Clinical Data

One-Versus-Others Attention: Scalable Multimodal Integration for Clinical Data

11 July 2023
Michal Golovanevsky
Eva Schiller
Akira Nair
Ritambhara Singh
Carsten Eickhoff
ArXivPDFHTML

Papers citing "One-Versus-Others Attention: Scalable Multimodal Integration for Clinical Data"

36 / 36 papers shown
Title
Visual Instruction Tuning
Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa
VLM
MLLM
529
4,740
0
17 Apr 2023
NarrowBERT: Accelerating Masked Language Model Pretraining and Inference
NarrowBERT: Accelerating Masked Language Model Pretraining and Inference
Haoxin Li
Phillip Keung
Daniel Cheng
Jungo Kasai
Noah A. Smith
44
4
0
11 Jan 2023
Learning to Exploit Temporal Structure for Biomedical Vision-Language
  Processing
Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing
Shruthi Bannur
Stephanie L. Hyland
Qianchu Liu
Fernando Pérez-García
Maximilian Ilse
...
Maria T. A. Wetscherek
M. Lungren
A. Nori
Javier Alvarez-Valle
Ozan Oktay
61
124
0
11 Jan 2023
Cross-Domain Consumer Review Analysis
Cross-Domain Consumer Review Analysis
Aditya Pandey
Kunal Joshi
20
1
0
23 Dec 2022
SANCL: Multimodal Review Helpfulness Prediction with Selective Attention
  and Natural Contrastive Learning
SANCL: Multimodal Review Helpfulness Prediction with Selective Attention and Natural Contrastive Learning
Wei Han
Hui Chen
Zhen Hai
Soujanya Poria
Lidong Bing
75
16
0
12 Sep 2022
MedFuse: Multi-modal fusion with clinical time-series data and chest
  X-ray images
MedFuse: Multi-modal fusion with clinical time-series data and chest X-ray images
Nasir Hayat
Krzysztof J. Geras
Farah E. Shamout
MedIm
50
42
0
14 Jul 2022
Multimodal Attention-based Deep Learning for Alzheimer's Disease
  Diagnosis
Multimodal Attention-based Deep Learning for Alzheimer's Disease Diagnosis
Michal Golovanevsky
Carsten Eickhoff
Ritambhara Singh
52
65
0
17 Jun 2022
DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection
DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection
Yingwei Li
Adams Wei Yu
Tianjian Meng
Benjamin Caine
Jiquan Ngiam
...
Yifeng Lu
Denny Zhou
Quoc V. Le
Alan Yuille
Mingxing Tan
3DPC
92
336
0
15 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified
  Vision-Language Understanding and Generation
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM
BDL
VLM
CLIP
530
4,343
0
28 Jan 2022
End-to-end Generative Pretraining for Multimodal Video Captioning
End-to-end Generative Pretraining for Multimodal Video Captioning
Paul Hongsuck Seo
Arsha Nagrani
Anurag Arnab
Cordelia Schmid
68
168
0
20 Jan 2022
TriBERT: Full-body Human-centric Audio-visual Representation Learning
  for Visual Sound Separation
TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation
Tanzila Rahman
Mengyu Yang
Leonid Sigal
ViT
62
8
0
26 Oct 2021
Deep Orthogonal Fusion: Multimodal Prognostic Biomarker Discovery
  Integrating Radiology, Pathology, Genomic, and Clinical Data
Deep Orthogonal Fusion: Multimodal Prognostic Biomarker Discovery Integrating Radiology, Pathology, Genomic, and Clinical Data
Nathaniel Braman
Jacob Gordon
Emery T. Goossens
Caleb Willis
Martin C. Stumpe
Jagadish Venkataraman
46
63
0
01 Jul 2021
Attention Bottlenecks for Multimodal Fusion
Attention Bottlenecks for Multimodal Fusion
Arsha Nagrani
Shan Yang
Anurag Arnab
A. Jansen
Cordelia Schmid
Chen Sun
98
565
0
30 Jun 2021
Multi-modal Understanding and Generation for Medical Images and Text via
  Vision-Language Pre-Training
Multi-modal Understanding and Generation for Medical Images and Text via Vision-Language Pre-Training
Jong Hak Moon
HyunGyung Lee
W. Shin
Young-Hak Kim
Edward Choi
MedIm
56
159
0
24 May 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw
  Video, Audio and Text
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Huayu Chen
Boqing Gong
ViT
314
588
0
22 Apr 2021
MMBERT: Multimodal BERT Pretraining for Improved Medical VQA
MMBERT: Multimodal BERT Pretraining for Improved Medical VQA
Yash Khare
Viraj Bagal
Minesh Mathew
Adithi Devi
U. Priyakumar
C. V. Jawahar
MedIm
63
135
0
03 Apr 2021
Perceiver: General Perception with Iterative Attention
Perceiver: General Perception with Iterative Attention
Andrew Jaegle
Felix Gimeno
Andrew Brock
Andrew Zisserman
Oriol Vinyals
João Carreira
VLM
ViT
MDE
188
1,015
0
04 Mar 2021
ActBERT: Learning Global-Local Video-Text Representations
ActBERT: Learning Global-Local Video-Text Representations
Linchao Zhu
Yi Yang
ViT
117
421
0
14 Nov 2020
Cross-Domain Sentiment Classification with Contrastive Learning and
  Mutual Information Maximization
Cross-Domain Sentiment Classification with Contrastive Learning and Mutual Information Maximization
Tian Li
Xiang Chen
Shanghang Zhang
Zhen Dong
Kurt Keutzer
125
36
0
30 Oct 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at
  Scale
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
640
41,003
0
22 Oct 2020
The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes
The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes
Douwe Kiela
Hamed Firooz
Aravind Mohan
Vedanuj Goswami
Amanpreet Singh
Pratik Ringshia
Davide Testuggine
87
604
0
10 May 2020
Pixel-BERT: Aligning Image Pixels with Text by Deep Multi-Modal
  Transformers
Pixel-BERT: Aligning Image Pixels with Text by Deep Multi-Modal Transformers
Zhicheng Huang
Zhaoyang Zeng
Bei Liu
Dongmei Fu
Jianlong Fu
ViT
130
438
0
02 Apr 2020
ImageBERT: Cross-modal Pre-training with Large-scale Weak-supervised
  Image-Text Data
ImageBERT: Cross-modal Pre-training with Large-scale Weak-supervised Image-Text Data
Di Qi
Lin Su
Jianwei Song
Edward Cui
Taroon Bharti
Arun Sacheti
VLM
78
261
0
22 Jan 2020
Benchmarking machine learning models on multi-centre eICU critical care
  dataset
Benchmarking machine learning models on multi-centre eICU critical care dataset
Seyedmostafa Sheikhalishahi
Vevake Balaraman
V. Osmani
OOD
31
72
0
02 Oct 2019
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
Weijie Su
Xizhou Zhu
Yue Cao
Bin Li
Lewei Lu
Furu Wei
Jifeng Dai
VLM
MLLM
SSL
153
1,663
0
22 Aug 2019
LXMERT: Learning Cross-Modality Encoder Representations from
  Transformers
LXMERT: Learning Cross-Modality Encoder Representations from Transformers
Hao Hao Tan
Joey Tianyi Zhou
VLM
MLLM
237
2,479
0
20 Aug 2019
VisualBERT: A Simple and Performant Baseline for Vision and Language
VisualBERT: A Simple and Performant Baseline for Vision and Language
Liunian Harold Li
Mark Yatskar
Da Yin
Cho-Jui Hsieh
Kai-Wei Chang
VLM
138
1,951
0
09 Aug 2019
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for
  Vision-and-Language Tasks
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
Jiasen Lu
Dhruv Batra
Devi Parikh
Stefan Lee
SSL
VLM
226
3,678
0
06 Aug 2019
Multimodal Transformer for Unaligned Multimodal Language Sequences
Multimodal Transformer for Unaligned Multimodal Language Sequences
Yao-Hung Hubert Tsai
Shaojie Bai
Paul Pu Liang
J. Zico Kolter
Louis-Philippe Morency
Ruslan Salakhutdinov
78
1,301
0
01 Jun 2019
Multimodal Transformer with Multi-View Visual Representation for Image
  Captioning
Multimodal Transformer with Multi-View Visual Representation for Image Captioning
Jun-chen Yu
Jing Li
Zhou Yu
Qingming Huang
ViT
61
384
0
20 May 2019
VideoBERT: A Joint Model for Video and Language Representation Learning
VideoBERT: A Joint Model for Video and Language Representation Learning
Chen Sun
Austin Myers
Carl Vondrick
Kevin Patrick Murphy
Cordelia Schmid
VLM
SSL
79
1,246
0
03 Apr 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language
  Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
94,770
0
11 Oct 2018
Attention-based Audio-Visual Fusion for Robust Automatic Speech
  Recognition
Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition
George Sterpu
Christian Saam
N. Harte
67
65
0
05 Sep 2018
Multimodal Sentiment Analysis: Addressing Key Issues and Setting up the
  Baselines
Multimodal Sentiment Analysis: Addressing Key Issues and Setting up the Baselines
Soujanya Poria
Navonil Majumder
Devamanyu Hazarika
Min Zhang
Alexander Gelbukh
Amir Hussain
31
173
0
19 Mar 2018
Attention Is All You Need
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
698
131,526
0
12 Jun 2017
Effective Approaches to Attention-based Neural Machine Translation
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong
Hieu H. Pham
Christopher D. Manning
380
7,962
0
17 Aug 2015
1