Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2107.00135
Cited By
Attention Bottlenecks for Multimodal Fusion
30 June 2021
Arsha Nagrani
Shan Yang
Anurag Arnab
A. Jansen
Cordelia Schmid
Chen Sun
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Attention Bottlenecks for Multimodal Fusion"
50 / 285 papers shown
Title
4M: Massively Multimodal Masked Modeling
David Mizrahi
Roman Bachmann
Ouguzhan Fatih Kar
Teresa Yeo
Mingfei Gao
Afshin Dehghan
Amir Zamir
MLLM
50
64
0
11 Dec 2023
Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling
Shentong Mo
Pedro Morgado
27
13
0
02 Dec 2023
CAST: Cross-Attention in Space and Time for Video Action Recognition
Dongho Lee
Jongseo Lee
Jinwoo Choi
EgoV
35
12
0
30 Nov 2023
A-JEPA: Joint-Embedding Predictive Architecture Can Listen
Zhengcong Fei
Mingyuan Fan
Junshi Huang
25
17
0
27 Nov 2023
Joyful: Joint Modality Fusion and Graph Contrastive Learning for Multimodal Emotion Recognition
Dongyuan Li
Yusong Wang
Kotaro Funakoshi
Manabu Okumura
26
12
0
18 Nov 2023
Accommodating Missing Modalities in Time-Continuous Multimodal Emotion Recognition
Juan Vazquez-Rodriguez
G. Lefebvre
Julien Cumin
James L. Crowley
34
2
0
16 Nov 2023
Rethink Cross-Modal Fusion in Weakly-Supervised Audio-Visual Video Parsing
Yating Xu
Conghui Hu
Gim Hee Lee
22
2
0
14 Nov 2023
Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities
A. Piergiovanni
Isaac Noble
Dahun Kim
Michael S. Ryoo
Victor Gomes
A. Angelova
43
19
0
09 Nov 2023
MOSEL: Inference Serving Using Dynamic Modality Selection
Bodun Hu
Le Xu
Jeongyoon Moon
N. Yadwadkar
Aditya Akella
13
4
0
27 Oct 2023
SynergyNet: Bridging the Gap between Discrete and Continuous Representations for Precise Medical Image Segmentation
Vandan Gorade
Sparsh Mittal
Debesh Jha
Ulas Bagci
33
11
0
26 Oct 2023
CAD -- Contextual Multi-modal Alignment for Dynamic AVQA
Asmar Nadeem
Adrian Hilton
R. Dawes
Graham A. Thomas
A. Mustafa
33
9
0
25 Oct 2023
GraFT: Gradual Fusion Transformer for Multimodal Re-Identification
Haoli Yin
Jiayao Li
Eva Schiller
Luke McDermott
Daniel Cummings
29
6
0
25 Oct 2023
Multimodal Transformer Using Cross-Channel attention for Object Detection in Remote Sensing Images
Bissmella Bahaduri
Zuheng Ming
Fangchen Feng
Anissa Mokraou
32
1
0
21 Oct 2023
Advancing Perception in Artificial Intelligence through Principles of Cognitive Science
Palaash Agrawal
Cheston Tan
Heena Rathore
54
1
0
13 Oct 2023
Multimodal Variational Auto-encoder based Audio-Visual Segmentation
Yuxin Mao
Jing Zhang
Mochu Xiang
Yiran Zhong
Yuchao Dai
40
34
0
12 Oct 2023
STELLA: Continual Audio-Video Pre-training with Spatio-Temporal Localized Alignment
Jaewoo Lee
Jaehong Yoon
Wonjae Kim
Yunji Kim
Sung Ju Hwang
CLL
19
1
0
12 Oct 2023
Improving Discriminative Multi-Modal Learning with Large-Scale Pre-Trained Models
Chenzhuang Du
Yue Zhao
Chonghua Liao
Jiacheng You
Jie Fu
Hang Zhao
47
2
0
08 Oct 2023
Multi-Resolution Audio-Visual Feature Fusion for Temporal Action Localization
Edward Fish
Jon Weinbren
Andrew Gilbert
36
0
0
05 Oct 2023
RegBN: Batch Normalization of Multimodal Data with Regularization
Morteza Ghahremani
Christian Wachinger
30
6
0
01 Oct 2023
Audio-Visual Speaker Verification via Joint Cross-Attention
R Gnana Praveen
Jahangir Alam
34
6
0
28 Sep 2023
M
3
^{3}
3
3D: Learning 3D priors using Multi-Modal Masked Autoencoders for 2D image and video understanding
Muhammad Abdullah Jamal
Omid Mohareri
3DPC
24
1
0
26 Sep 2023
Mitigating Adversarial Attacks in Federated Learning with Trusted Execution Environments
Simon Queyrut
V. Schiavoni
Pascal Felber
AAML
FedML
32
6
0
13 Sep 2023
Harmonic-NAS: Hardware-Aware Multimodal Neural Architecture Search on Resource-constrained Devices
Mohamed Imed Eddine Ghebriout
Halima Bouzidi
Smail Niar
Hamza Ouarnoughi
19
3
0
12 Sep 2023
Enhancing multimodal cooperation via sample-level modality valuation
Yake Wei
Ruoxuan Feng
Zihe Wang
Di Hu
38
11
0
12 Sep 2023
Multimodal Fish Feeding Intensity Assessment in Aquaculture
Meng Cui
Xubo Liu
Haohe Liu
Zhuangzhuang Du
Tao Chen
Guoping Lian
Daoliang Li
Wenwu Wang
34
5
0
10 Sep 2023
Multi-modal Extreme Classification
Anshul Mittal
Kunal Dahiya
Shreya Malani
Janani Ramaswamy
Seba Kuruvilla
Jitendra Ajmera
Keng-hao Chang
Sumeet Agarwal
Purushottam Kar
Manik Varma
34
8
0
10 Sep 2023
Text-to-feature diffusion for audio-visual few-shot learning
Otniel-Bogdan Mercea
Thomas Hummel
A. Sophia Koepke
Zeynep Akata
VLM
27
2
0
07 Sep 2023
Exchanging-based Multimodal Fusion with Transformer
Renyu Zhu
Chengcheng Han
Yong Qian
Qiushi Sun
Xiang Li
Ming Gao
Xuezhi Cao
Yunsen Xian
40
2
0
05 Sep 2023
Extract-and-Adaptation Network for 3D Interacting Hand Mesh Recovery
J. Park
Daniel Sungho Jung
Gyeongsik Moon
Kyoung Mu Lee
30
6
0
05 Sep 2023
LoRA-like Calibration for Multimodal Deception Detection using ATSFace Data
Shun-Wen Hsiao
Chengbin Sun
CVBM
18
1
0
04 Sep 2023
MM-AU:Towards Multimodal Understanding of Advertisement Videos
Digbalay Bose
Rajat Hebbar
Tiantian Feng
Krishna Somandepalli
Anfeng Xu
Shrikanth Narayanan
32
5
0
27 Aug 2023
MOFO: MOtion FOcused Self-Supervision for Video Understanding
Mona Ahmadian
Frank Guerin
Andrew Gilbert
44
2
0
23 Aug 2023
Learning Bottleneck Transformer for Event Image-Voxel Feature Fusion based Classification
Chengguo Yuan
Yu Jin
Zong-Yao Wu
Fanting Wei
Yangzirui Wang
Langlang Chen
Tianlin Li
ViT
100
7
0
23 Aug 2023
MEGA: Multimodal Alignment Aggregation and Distillation For Cinematic Video Segmentation
Najmeh Sadoughi
Xinyu Li
Avijit Vajpayee
D. Fan
Bing Shuai
H. Santos-Villalobos
Vimal Bhat
M. Rohith
31
4
0
22 Aug 2023
Diffusion Models for Image Restoration and Enhancement -- A Comprehensive Survey
Xin Li
Yulin Ren
Xin Jin
Cuiling Lan
X. Wang
Wenjun Zeng
Xinchao Wang
Zhibo Chen
43
86
0
18 Aug 2023
AudioFormer: Audio Transformer learns audio feature representations from discrete acoustic codes
Zhaohui Li
Haitao Wang
Xinghua Jiang
40
1
0
14 Aug 2023
Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder
Yusheng Dai
Hang Chen
Jun Du
xiao-ying Ding
Ning Ding
Feijun Jiang
Chin-Hui Lee
32
7
0
14 Aug 2023
Pelta: Shielding Transformers to Mitigate Evasion Attacks in Federated Learning
Simon Queyrut
Yérom-David Bromberg
V. Schiavoni
FedML
AAML
11
1
0
08 Aug 2023
SSTFormer: Bridging Spiking Neural Network and Memory Support Transformer for Frame-Event based Recognition
Tianlin Li
Zong-Yao Wu
Yao Rong
Lin Zhu
Bowei Jiang
Jin Tang
Yonghong Tian
ViT
77
18
0
08 Aug 2023
DCTM: Dilated Convolutional Transformer Model for Multimodal Engagement Estimation in Conversation
Vu Ngoc Tu
V. Huynh
Hyung-Jeong Yang
M. Zaheer
Shah Nawaz
Karthik Nandakumar
Soo-Hyung Kim
22
4
0
31 Jul 2023
Visual Prompt Flexible-Modal Face Anti-Spoofing
Zitong Yu
Rizhao Cai
Yawen Cui
Ajian Liu
Changsheng Chen
38
6
0
26 Jul 2023
FedMEKT: Distillation-based Embedding Knowledge Transfer for Multimodal Federated Learning
Huy Q. Le
Minh N. H. Nguyen
Chu Myaet Thwal
Yu Qiao
Chao Zhang
Choong Seon Hong
16
13
0
25 Jul 2023
Towards a performance analysis on pre-trained Visual Question Answering models for autonomous driving
Kaavya Rekanar
Ciarán Eising
Ganesh Sistu
Martin Hayes
8
3
0
18 Jul 2023
Multi-Modal Discussion Transformer: Integrating Text, Images and Graph Transformers to Detect Hate Speech on Social Media
Liam Hebert
Gaurav Sahu
Yuxuan Guo
Nanda Kishore Sreenivas
Lukasz Golab
Robin Cohen
23
10
0
18 Jul 2023
FlexiAST: Flexibility is What AST Needs
Jiu Feng
Mehmet Hamza Erol
Joon Son Chung
Arda Senocak
23
3
0
18 Jul 2023
CoTracker: It is Better to Track Together
Nikita Karaev
Ignacio Rocco
Benjamin Graham
Natalia Neverova
Andrea Vedaldi
Christian Rupprecht
VOT
ViT
51
246
0
14 Jul 2023
Multimodal Distillation for Egocentric Action Recognition
Gorjan Radevski
Dusan Grujicic
Marie-Francine Moens
Matthew Blaschko
Tinne Tuytelaars
EgoV
30
23
0
14 Jul 2023
One-Versus-Others Attention: Scalable Multimodal Integration for Clinical Data
Michal Golovanevsky
Eva Schiller
Akira Nair
Ritambhara Singh
Carsten Eickhoff
23
2
0
11 Jul 2023
Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers
Yuan Gong
Sameer Khurana
Leonid Karlinsky
James R. Glass
27
68
0
06 Jul 2023
Deep Equilibrium Multimodal Fusion
Jinhong Ni
Yalong Bai
Wei Zhang
Ting Yao
Tao Mei
28
1
0
29 Jun 2023
Previous
1
2
3
4
5
6
Next