arXiv:2107.00135
Attention Bottlenecks for Multimodal Fusion
30 June 2021
Arsha Nagrani
Shan Yang
Anurag Arnab
A. Jansen
Cordelia Schmid
Chen Sun
Papers citing
"Attention Bottlenecks for Multimodal Fusion"
50 / 285 papers shown
Title
ADAPT: Multimodal Learning for Detecting Physiological Changes under Missing Modalities
Julie Mordacq
Léo Milecki
Maria Vakalopoulou
Steve Oudot
Vicky Kalogeiton
OffRL
MedIm
37
3
0
04 Jul 2024
Adaptive Modality Balanced Online Knowledge Distillation for Brain-Eye-Computer based Dim Object Detection
Zixing Li
Chao Yan
Zhen Lan
Xiaojia Xiang
Han Zhou
Jun Lai
Dengqing Tang
46
0
0
02 Jul 2024
How Intermodal Interaction Affects the Performance of Deep Multimodal Fusion for Mixed-Type Time Series
Simon Dietz
Thomas Altstidl
Dario Zanca
Björn Eskofier
An Nguyen
AI4TS
28
0
0
21 Jun 2024
MoME: Mixture of Multimodal Experts for Cancer Survival Prediction
Conghao Xiong
Hao Chen
Hao Zheng
Dong Wei
Yefeng Zheng
Joseph J. Y. Sung
Irwin King
MoE
34
10
0
14 Jun 2024
MA-AVT: Modality Alignment for Parameter-Efficient Audio-Visual Transformers
Tanvir Mahmud
Shentong Mo
Yapeng Tian
Diana Marculescu
34
4
0
07 Jun 2024
Audio Mamba: Bidirectional State Space Model for Audio Representation Learning
Mehmet Hamza Erol
Arda Senocak
Jiu Feng
Joon Son Chung
Mamba
73
19
0
05 Jun 2024
Progressive Confident Masking Attention Network for Audio-Visual Segmentation
Yuxuan Wang
Feng Dong
Jinchao Zhu
Shuyue Zhu
VOS
56
0
0
04 Jun 2024
GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer
Ding Jia
Jianyuan Guo
Kai Han
Han Wu
Chao Zhang
Chang Xu
Xinghao Chen
ViT
48
16
0
03 Jun 2024
Multimodal Cross-Domain Few-Shot Learning for Egocentric Action Recognition
Masashi Hatano
Ryo Hachiuma
Ryoske Fujii
Hideo Saito
EgoV
42
4
0
30 May 2024
Coupled Mamba: Enhanced Multi-modal Fusion with Coupled State Space Model
Wenbing Li
Hang Zhou
Junqing Yu
Zikai Song
Wei Yang
Mamba
54
3
0
28 May 2024
MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance
Yake Wei
Di Hu
34
14
0
28 May 2024
SoK: Leveraging Transformers for Malware Analysis
Pradip Kunwar
Kshitiz Aryal
Maanak Gupta
Mahmoud Abdelsalam
Elisa Bertino
90
0
0
27 May 2024
Planted: a dataset for planted forest identification from multi-satellite time series
L. M. Pazos-Outón
Cristina Nader Vasconcelos
Anton Raichuk
Anurag Arnab
Dan Morris
Maxim Neumann
47
4
0
24 May 2024
A Multimodal Learning-based Approach for Autonomous Landing of UAV
Francisco Neves
Luís Branco
Maria Pereira
R. Claro
Andry Pinto
24
1
0
21 May 2024
ReconBoost: Boosting Can Achieve Modality Reconcilement
Cong Hua
Qianqian Xu
Shilong Bao
Zhiyong Yang
Qingming Huang
46
10
0
15 May 2024
A Semantic and Motion-Aware Spatiotemporal Transformer Network for Action Detection
Matthew Korban
Peter Youngs
Scott T. Acton
ViT
29
6
0
13 May 2024
Demystifying the Hypercomplex: Inductive Biases in Hypercomplex Deep Learning
Danilo Comminiello
Eleonora Grassucci
Danilo P. Mandic
A. Uncini
HAI
AI4CE
36
2
0
11 May 2024
Multi-scale Bottleneck Transformer for Weakly Supervised Multimodal Violence Detection
Shengyang Sun
Xiaojin Gong
18
4
0
08 May 2024
Audio-Visual Traffic Light State Detection for Urban Robots
Sagar Gupta
Akansel Cosgun
21
0
0
30 Apr 2024
Feature importance to explain multimodal prediction models. A clinical use case
Jorn-Jan van de Beld
Shreyasi Pathak
J. Geerdink
J. H. Hegeman
Christin Seifert
FAtt
22
0
0
29 Apr 2024
Multimodal Fusion on Low-quality Data: A Comprehensive Survey
Qingyang Zhang
Yake Wei
Zongbo Han
Huazhu Fu
Xi Peng
...
Qinghua Hu
Cai Xu
Jie Wen
Di Hu
Changqing Zhang
57
26
0
27 Apr 2024
A review of deep learning-based information fusion techniques for multimodal medical image classification
Yi-Hsuan Li
Mostafa EL HABIB DAHO
Pierre-Henri Conze
Rachid Zeghlache
Hugo Le Boité
R. Tadayoni
B. Cochener
M. Lamard
G. Quellec
38
31
0
23 Apr 2024
Learning to Rebalance Multi-Modal Optimization by Adaptively Masking Subnetworks
Yang Yang
Hongpeng Pan
Qingjun Jiang
Yi Tian Xu
Jinghui Tang
29
4
0
12 Apr 2024
Audio-Visual Generalized Zero-Shot Learning using Pre-Trained Large Multi-Modal Models
David Kurzendörfer
Otniel-Bogdan Mercea
A. Sophia Koepke
Zeynep Akata
VLM
CLIP
33
2
0
09 Apr 2024
TIM: A Time Interval Machine for Audio-Visual Action Recognition
Jacob Chalk
Jaesung Huh
Evangelos Kazakos
Andrew Zisserman
Dima Damen
46
9
0
08 Apr 2024
SoundingActions: Learning How Actions Sound from Narrated Egocentric Videos
Changan Chen
Kumar Ashutosh
Rohit Girdhar
David Harwath
Kristen Grauman
EgoV
SSL
28
7
0
08 Apr 2024
Continual Learning for Smart City: A Survey
Li Yang
Zhipeng Luo
Shi-sheng Zhang
Fei Teng
Tian-Jie Li
HAI
43
8
0
01 Apr 2024
Siamese Vision Transformers are Scalable Audio-visual Learners
Yan-Bo Lin
Gedas Bertasius
37
5
0
28 Mar 2024
Unsupervised Audio-Visual Segmentation with Modality Alignment
Swapnil Bhosale
Haosen Yang
Diptesh Kanojia
Jiankang Deng
Xiatian Zhu
VOS
43
5
0
21 Mar 2024
Multimodal Fusion Method with Spatiotemporal Sequences and Relationship Learning for Valence-Arousal Estimation
Jun-chen Yu
Gongpeng Zhao
Yongqi Wang
Zhihong Wei
Yang Zheng
Zerui Zhang
Zhongpeng Cai
Guochen Xie
Jichao Zhu
Wangyuan Zhu
44
8
0
19 Mar 2024
MIntRec2.0: A Large-scale Benchmark Dataset for Multimodal Intent Recognition and Out-of-scope Detection in Conversations
Hanlei Zhang
Xin Wang
Hua Xu
Qianrui Zhou
Kai Gao
Jianhua Su
Jinyue Zhao
Wenrui Li
Yanting Chen
45
2
0
16 Mar 2024
EquiAV: Leveraging Equivariance for Audio-Visual Contrastive Learning
Jongsuk Kim
Hyeongkeun Lee
Kyeongha Rho
Junmo Kim
Joon Son Chung
34
4
0
14 Mar 2024
Gradient-Guided Modality Decoupling for Missing-Modality Robustness
Hao Wang
Shengda Luo
Guosheng Hu
Jianguo Zhang
24
3
0
26 Feb 2024
Multimodal Transformer With a Low-Computational-Cost Guarantee
Sungjin Park
Edward Choi
52
1
0
23 Feb 2024
The Landscape and Challenges of HPC Research and LLMs
Le Chen
Nesreen K. Ahmed
Akashnil Dutta
Arijit Bhattacharjee
Sixing Yu
...
Vy A. Vo
J. P. Muñoz
Ted Willke
Tim Mattson
Ali Jannesari
AI4CE
48
20
0
03 Feb 2024
Distilling Privileged Multimodal Information for Expression Recognition using Optimal Transport
Haseeb Aslam
Muhammad Osama Zeeshan
Soufiane Belharbi
M. Pedersoli
A. L. Koerich
Simon L Bacon
Eric Granger
28
9
0
27 Jan 2024
Memory-Inspired Temporal Prompt Interaction for Text-Image Classification
Xinyao Yu
Hao Sun
Ziwei Niu
Rui Qin
Zhenjia Bai
Yen-Wei Chen
Lanfen Lin
VLM
39
2
0
26 Jan 2024
CoAVT: A Cognition-Inspired Unified Audio-Visual-Text Pre-Training Model for Multimodal Processing
Xianghu Yue
Xiaohai Tian
Lu Lu
Malu Zhang
Zhizheng Wu
Haizhou Li
39
0
0
22 Jan 2024
Exploring Missing Modality in Multimodal Egocentric Datasets
Merey Ramazanova
Alejandro Pardo
Humam Alwassel
Guohao Li
EgoV
41
4
0
21 Jan 2024
Fusing Echocardiography Images and Medical Records for Continuous Patient Stratification
Nathan Painchaud
P. Courand
Pierre-Marc Jodoin
Nicolas Duchateau
Olivier Bernard
39
2
0
15 Jan 2024
FunnyNet-W: Multimodal Learning of Funny Moments in Videos in the Wild
Zhi-Song Liu
Robin Courant
Vicky Kalogeiton
42
6
0
08 Jan 2024
Efficient Multiscale Multimodal Bottleneck Transformer for Audio-Video Classification
Wentao Zhu
41
5
0
08 Jan 2024
Efficient Selective Audio Masked Multimodal Bottleneck Transformer for Audio-Video Classification
Wentao Zhu
40
4
0
08 Jan 2024
EAT: Self-Supervised Pre-Training with Efficient Audio Transformer
Wenxi Chen
Yuzhe Liang
Ziyang Ma
Zhisheng Zheng
Xie Chen
ViT
54
18
0
07 Jan 2024
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action
Jiasen Lu
Christopher Clark
Sangho Lee
Zichen Zhang
Savya Khosla
Ryan Marten
Derek Hoiem
Aniruddha Kembhavi
VLM
MLLM
40
147
0
28 Dec 2023
Deformable Audio Transformer for Audio Event Detection
Wentao Zhu
28
0
0
24 Dec 2023
Unleashing the Power of CNN and Transformer for Balanced RGB-Event Video Recognition
Tianlin Li
Yao Rong
Shiao Wang
Yuan Chen
Zhe Wu
Bowei Jiang
Yonghong Tian
Jin Tang
ViT
86
3
0
18 Dec 2023
On Robustness to Missing Video for Audiovisual Speech Recognition
Oscar Chang
Otavio Braga
H. Liao
Dmitriy Serdyuk
Olivier Siohan
45
11
0
13 Dec 2023
Modality Plug-and-Play: Elastic Modality Adaptation in Multimodal LLMs for Embodied AI
Kai Huang
Boyuan Yang
Wei Gao
37
1
0
13 Dec 2023
More than Vanilla Fusion: a Simple, Decoupling-free, Attention Module for Multimodal Fusion Based on Signal Theory
Peiwen Sun
Yifan Zhang
Zishan Liu
Donghao Chen
Honggang Zhang
24
0
0
12 Dec 2023