2107.00135
Cited By
Attention Bottlenecks for Multimodal Fusion
30 June 2021
Arsha Nagrani
Shan Yang
Anurag Arnab
A. Jansen
Cordelia Schmid
Chen Sun
Papers citing
"Attention Bottlenecks for Multimodal Fusion"
50 / 285 papers shown
Title
Learning Unseen Modality Interaction
Yunhua Zhang
Hazel Doughty
Cees G. M. Snoek
29
3
0
22 Jun 2023
Exploring the Role of Audio in Video Captioning
Yuhan Shen
Linjie Yang
Longyin Wen
Haichao Yu
Ehsan Elhamifar
Heng Wang
31
2
0
21 Jun 2023
Utilizing Longitudinal Chest X-Rays and Reports to Pre-Fill Radiology Reports
Qingqing Zhu
T. Mathai
P. Mukherjee
Yifan Peng
Ronald M. Summers
Zhiyong Lu
27
17
0
14 Jun 2023
Exploring Attention Mechanisms for Multimodal Emotion Recognition in an Emergency Call Center Corpus
Théo Deschamps-Berger
L. Lamel
Laurence Devillers
42
8
0
12 Jun 2023
A Comprehensive Survey on Applications of Transformers for Deep Learning Tasks
Saidul Islam
Hanae Elmekki
Ahmed Elsebai
Jamal Bentahar
Najat Drawel
Gaith Rjoub
Witold Pedrycz
ViT
MedIm
24
174
0
11 Jun 2023
Transformer-based Multi-Modal Learning for Multi Label Remote Sensing Image Classification
David Hoffmann
Kai Norman Clasen
Begüm Demir
21
8
0
02 Jun 2023
Encoder-decoder multimodal speaker change detection
Jee-weon Jung
Soonshin Seo
Hee-Soo Heo
Geon-min Kim
You Jin Kim
Youngki Kwon
Min-Ji Lee
Bong-Jin Lee
45
2
0
01 Jun 2023
Denoising Bottleneck with Mutual Information Maximization for Video Multimodal Fusion
Shao-Yu Wu
Damai Dai
Ziwei Qin
Tianyu Liu
Binghuai Lin
Yunbo Cao
Zhifang Sui
31
6
0
24 May 2023
CongFu: Conditional Graph Fusion for Drug Synergy Prediction
Oleksii Tsepa
Bohdan Naida
Anna Goldenberg
Bo Wang
15
1
0
23 May 2023
Efficient Multimodal Neural Networks for Trigger-less Voice Assistants
Sai Srujana Buddi
U. Sarawgi
Tashweena Heeramun
Karan Sawnhey
Ed Yanosik
Saravana Rathinam
Saurabh N. Adya
33
5
0
20 May 2023
Object Re-Identification from Point Clouds
Benjamin Thérien
Chengjie Huang
Adrian Chow
Krzysztof Czarnecki
3DPC
36
2
0
17 May 2023
UniS-MMC: Multimodal Classification via Unimodality-supervised Multimodal Contrastive Learning
Heqing Zou
Meng Shen
Chen Chen
Yuchen Hu
D. Rajan
Chng Eng Siong
SSL
40
15
0
16 May 2023
MMG-Ego4D: Multi-Modal Generalization in Egocentric Action Recognition
Xinyu Gong
S. Mohan
Naina Dhingra
Jean-Charles Bazin
Yilei Li
Zhangyang Wang
Rakesh Ranjan
EgoV
56
18
0
12 May 2023
ChatGPT-Like Large-Scale Foundation Models for Prognostics and Health Management: A Survey and Roadmaps
Yanfang Li
Huan Wang
Muxia Sun
LM&MA
AI4TS
AI4CE
29
46
0
10 May 2023
ImageBind: One Embedding Space To Bind Them All
Rohit Girdhar
Alaaeldin El-Nouby
Zhuang Liu
Mannat Singh
Kalyan Vasudev Alwala
Armand Joulin
Ishan Misra
VLM
56
855
0
09 May 2023
UIT-OpenViIC: A Novel Benchmark for Evaluating Image Captioning in Vietnamese
Doanh C. Bui
Nghia Hieu Nguyen
Khang Phuoc-Quy Nguyen
VLM
27
3
0
07 May 2023
Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation
Bolin Lai
Fiona Ryan
Wenqi Jia
Miao Liu
James M. Rehg
EgoV
32
8
0
06 May 2023
Distilled Mid-Fusion Transformer Networks for Multi-Modal Human Activity Recognition
Jingcheng Li
L. Yao
Binghao Li
Claude Sammut
31
2
0
05 May 2023
Learning Missing Modal Electronic Health Records with Unified Multi-modal Data Embedding and Modality-Aware Attention
Kwanhyung Lee
Soojeong Lee
Sangchul Hahn
Heejung Hyun
Edward Choi
Byungeun Ahn
Joohyung Lee
51
16
0
04 May 2023
On Uni-Modal Feature Learning in Supervised Multi-Modal Learning
Chenzhuang Du
Jiaye Teng
Tingle Li
Yichen Liu
Tianyuan Yuan
Yue Wang
Yang Yuan
Hang Zhao
89
40
0
02 May 2023
Incomplete Multimodal Learning for Remote Sensing Data Fusion
Yuxing Chen
Maofan Zhao
Lorenzo Bruzzone
37
3
0
22 Apr 2023
Modeling Dense Multimodal Interactions Between Biological Pathways and Histology for Survival Prediction
Guillaume Jaume
Anurag J. Vaidya
Richard J. Chen
Drew F. K. Williamson
Paul Pu Liang
Faisal Mahmood
41
44
0
13 Apr 2023
Efficient Multimodal Fusion via Interactive Prompting
Yaowei Li
Ruijie Quan
Linchao Zhu
Yezhou Yang
35
44
0
13 Apr 2023
Looking Similar, Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning
Nikhil Singh
Chih-Wei Wu
Iroro Orife
Mahdi M. Kalayeh
25
2
0
12 Apr 2023
WEAR: An Outdoor Sports Dataset for Wearable and Egocentric Activity Recognition
Marius Bock
Hilde Kuehne
Kristof Van Laerhoven
Michael Moeller
EgoV
42
24
0
11 Apr 2023
On Robustness in Multimodal Learning
Brandon McKinzie
Joseph Cheng
Vaishaal Shankar
Yinfei Yang
Jonathon Shlens
Alexander Toshev
40
2
0
10 Apr 2023
Unraveling Instance Associations: A Closer Look for Audio-Visual Segmentation
Yuanhong Chen
Yuyuan Liu
Hu Wang
Fengbei Liu
Chong Wang
Helen Frazer
G. Carneiro
VOS
27
15
0
06 Apr 2023
VicTR: Video-conditioned Text Representations for Activity Recognition
Kumara Kahatapitiya
Anurag Arnab
Arsha Nagrani
Michael S. Ryoo
42
20
0
05 Apr 2023
Scalable and Accurate Self-supervised Multimodal Representation Learning without Aligned Video and Text Data
Vladislav Lialin
Stephen Rawls
David M. Chan
Shalini Ghosh
Anna Rumshisky
Wael Hamza
VLM
AI4TS
28
6
0
04 Apr 2023
Egocentric Auditory Attention Localization in Conversations
Fiona Ryan
Hao Jiang
Abhinav Shukla
James M. Rehg
V. Ithapu
EgoV
29
16
0
28 Mar 2023
CoRe-Sleep: A Multimodal Fusion Framework for Time Series Robust to Imperfect Modalities
Konstantinos Kontras
Christos Chatzichristos
Huy P Phan
Johan A. K. Suykens
Marina De Vos
AI4TS
24
11
0
27 Mar 2023
Unimodal Training-Multimodal Prediction: Cross-modal Federated Learning with Hierarchical Aggregation
Rongyu Zhang
Xiaowei Chi
Guiliang Liu
Wenyi Zhang
Yuan Du
Fang Wang
38
12
0
27 Mar 2023
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline
Tiantian Geng
Teng Wang
Jinming Duan
Runmin Cong
Feng Zheng
30
28
0
22 Mar 2023
Machine Learning for Brain Disorders: Transformers and Visual Transformers
Robin Courant
Maika Edberg
Nicolas Dufour
Vicky Kalogeiton
MedIm
ViT
40
1
0
21 Mar 2023
Transformers in Speech Processing: A Survey
S. Latif
Aun Zaidi
Heriberto Cuayáhuitl
Fahad Shamshad
Moazzam Shoukat
Junaid Qadir
46
47
0
21 Mar 2023
Scene Graph Based Fusion Network For Image-Text Retrieval
Guoliang Wang
Yanlei Shang
Yongzhe Chen
28
1
0
20 Mar 2023
Retrieving Multimodal Information for Augmented Generation: A Survey
Ruochen Zhao
Hailin Chen
Weishi Wang
Fangkai Jiao
Do Xuan Long
...
Bosheng Ding
Xiaobao Guo
Minzhi Li
Xingxuan Li
Chenyu You
31
82
0
20 Mar 2023
Multiscale Audio Spectrogram Transformer for Efficient Audio Classification
Wenjie Zhu
M. Omar
37
22
0
19 Mar 2023
Facial Affect Recognition based on Transformer Encoder and Audiovisual Fusion for the ABAW5 Challenge
Ziyang Zhang
Liuwei An
Zishun Cui
Ao Xu
Tengteng Dong
Yueqi Jiang
Jingyi Shi
Xin Liu
Xiao Sun
Meng Wang
CVBM
38
20
0
16 Mar 2023
EgoViT: Pyramid Video Transformer for Egocentric Action Recognition
Chen-Ming Pan
Zhiqi Zhang
Senem Velipasalar
Yi Tian Xu
ViT
20
1
0
15 Mar 2023
CAT: Causal Audio Transformer for Audio Classification
Xiaoyu Liu
Hanlin Lu
Jianbo Yuan
Xinyu Li
ViT
28
22
0
14 Mar 2023
MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning
Ruize Xu
Ruoxuan Feng
Shi-Xiong Zhang
Di Hu
39
21
0
09 Mar 2023
Audio-Visual Deception Detection: DOLOS Dataset and Parameter-Efficient Crossmodal Learning
Xiaobao Guo
Nithish Muthuchamy Selvaraj
Zitong Yu
A. Kong
Bingquan Shen
Alex C. Kot
38
8
0
09 Mar 2023
Transformadores: Fundamentos teoricos y Aplicaciones
J. D. L. Torre
78
0
0
18 Feb 2023
A dataset for Audio-Visual Sound Event Detection in Movies
Rajat Hebbar
Digbalay Bose
Krishna Somandepalli
Veena Vijai
Shrikanth Narayanan
8
8
0
14 Feb 2023
CoMAE: Single Model Hybrid Pre-training on Small-Scale RGB-D Datasets
Jiang Yang
Sheng Guo
Gangshan Wu
Limin Wang
VLM
31
7
0
13 Feb 2023
SemanticAC: Semantics-Assisted Framework for Audio Classification
Yicheng Xiao
Yue Ma
Shuyan Li
Hantao Zhou
Ran Liao
Xiu Li
13
8
0
12 Feb 2023
Revisiting Pre-training in Audio-Visual Learning
Ruoxuan Feng
Wenke Xia
Di Hu
39
1
0
07 Feb 2023
Epic-Sounds: A Large-scale Dataset of Actions That Sound
Jaesung Huh
Jacob Chalk
Evangelos Kazakos
Dima Damen
Andrew Zisserman
EgoV
21
41
0
01 Feb 2023
Neural Additive Models for Location Scale and Shape: A Framework for Interpretable Neural Regression Beyond the Mean
Anton Thielmann
René-Marcel Kruse
Thomas Kneib
Benjamin Säfken
32
12
0
27 Jan 2023