ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2403.04245
  4. Cited By
A Study of Dropout-Induced Modality Bias on Robustness to Missing Video
  Frames for Audio-Visual Speech Recognition

A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition

7 March 2024
Yusheng Dai
Hang Chen
Jun Du
Ruoyu Wang
Shihao Chen
Jie Ma
Haotian Wang
Chin-Hui Lee
ArXiv (abs)PDFHTMLGithub (9★)

Papers citing "A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition"

32 / 32 papers shown
Title
Towards Open-Vocabulary Audio-Visual Event Localization
Jinxing Zhou
Dan Guo
Ruohao Guo
Yuxin Mao
Jingjing Hu
Yiran Zhong
Xiaojun Chang
Ming Wang
VLM
117
5
0
18 Nov 2024
Enhancing Multimodal Sentiment Analysis for Missing Modality through Self-Distillation and Unified Modality Cross-Attention
Enhancing Multimodal Sentiment Analysis for Missing Modality through Self-Distillation and Unified Modality Cross-Attention
Yuzhe Weng
Haotian Wang
Tian Gao
Kewei Li
Shutong Niu
Jun Du
96
0
0
19 Oct 2024
Improving Audio-Visual Speech Recognition by Lip-Subword Correlation
  Based Visual Pre-training and Cross-Modal Fusion Encoder
Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder
Yusheng Dai
Hang Chen
Jun Du
xiao-ying Ding
Ning Ding
Feijun Jiang
Chin-Hui Lee
89
8
0
14 Aug 2023
Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels
Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels
Pingchuan Ma
A. Haliassos
Adriana Fernandez-Lopez
Honglie Chen
Stavros Petridis
Maja Pantic
72
114
0
25 Mar 2023
MSeg3D: Multi-modal 3D Semantic Segmentation for Autonomous Driving
MSeg3D: Multi-modal 3D Semantic Segmentation for Autonomous Driving
Jiale Li
Hang Dai
Hao Han
Yong Ding
3DPC
95
72
0
15 Mar 2023
Leveraging Modality-specific Representations for Audio-visual Speech
  Recognition via Reinforcement Learning
Leveraging Modality-specific Representations for Audio-visual Speech Recognition via Reinforcement Learning
Chen Chen
Yuchen Hu
Qiang Zhang
Heqing Zou
Beier Zhu
Eng Siong Chng
74
28
0
10 Dec 2022
GPU-accelerated Guided Source Separation for Meeting Transcription
GPU-accelerated Guided Source Separation for Meeting Transcription
Desh Raj
Daniel Povey
Sanjeev Khudanpur
62
40
0
10 Dec 2022
Analyzing Modality Robustness in Multimodal Sentiment Analysis
Analyzing Modality Robustness in Multimodal Sentiment Analysis
Devamanyu Hazarika
Yingting Li
Bo Cheng
Shuai Zhao
Roger Zimmermann
Soujanya Poria
87
33
0
30 May 2022
Are Multimodal Transformers Robust to Missing Modality?
Are Multimodal Transformers Robust to Missing Modality?
Mengmeng Ma
Jian Ren
Long Zhao
Davide Testuggine
Xi Peng
ViT
102
154
0
12 Apr 2022
On Modality Bias Recognition and Reduction
On Modality Bias Recognition and Reduction
Yangyang Guo
Liqiang Nie
Harry Cheng
Zhiyong Cheng
Mohan S. Kankanhalli
A. Bimbo
70
28
0
25 Feb 2022
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster
  Prediction
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction
Bowen Shi
Wei-Ning Hsu
Kushal Lakhotia
Abdel-rahman Mohamed
SSL
113
321
0
05 Jan 2022
Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language
  Modeling
Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling
Renrui Zhang
Rongyao Fang
Wei Zhang
Peng Gao
Kunchang Li
Jifeng Dai
Yu Qiao
Hongsheng Li
VLM
288
402
0
06 Nov 2021
CLIP-Adapter: Better Vision-Language Models with Feature Adapters
CLIP-Adapter: Better Vision-Language Models with Feature Adapters
Peng Gao
Shijie Geng
Renrui Zhang
Teli Ma
Rongyao Fang
Yongfeng Zhang
Hongsheng Li
Yu Qiao
VLMCLIP
335
1,053
0
09 Oct 2021
LoRA: Low-Rank Adaptation of Large Language Models
LoRA: Low-Rank Adaptation of Large Language Models
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRLAI4TSAI4CEALMAIMat
502
10,526
0
17 Jun 2021
Multimodal Knowledge Expansion
Multimodal Knowledge Expansion
Zihui Xue
Sucheng Ren
Zhengqi Gao
Hang Zhao
141
30
0
26 Mar 2021
SMIL: Multimodal Learning with Severely Missing Modality
SMIL: Multimodal Learning with Severely Missing Modality
Mengmeng Ma
Jian Ren
Long Zhao
Sergey Tulyakov
Cathy H. Wu
Xi Peng
103
263
0
09 Mar 2021
There is More than Meets the Eye: Self-Supervised Multi-Object Detection
  and Tracking with Sound by Distilling Multimodal Knowledge
There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge
Francisco Rivera Valverde
Juana Valeria Hurtado
Abhinav Valada
85
73
0
01 Mar 2021
End-to-end Audio-visual Speech Recognition with Conformers
End-to-end Audio-visual Speech Recognition with Conformers
Pingchuan Ma
Stavros Petridis
Maja Pantic
145
234
0
12 Feb 2021
Intermediate Loss Regularization for CTC-based Speech Recognition
Intermediate Loss Regularization for CTC-based Speech Recognition
Jaesong Lee
Shinji Watanabe
138
139
0
05 Feb 2021
Removing Bias in Multi-modal Classifiers: Regularization by Maximizing
  Functional Entropies
Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies
Itai Gat
Idan Schwartz
Alex Schwing
Tamir Hazan
101
92
0
21 Oct 2020
Correlating Subword Articulation with Lip Shapes for Embedding Aware
  Audio-Visual Speech Enhancement
Correlating Subword Articulation with Lip Shapes for Embedding Aware Audio-Visual Speech Enhancement
Hang Chen
Jun Du
Yu Hu
Lirong Dai
Baocai Yin
Chin-Hui Lee
82
20
0
21 Sep 2020
CHiME-6 Challenge:Tackling Multispeaker Speech Recognition for
  Unsegmented Recordings
CHiME-6 Challenge:Tackling Multispeaker Speech Recognition for Unsegmented Recordings
Shinji Watanabe
Michael I. Mandel
Jon Barker
Emmanuel Vincent
Ashish Arora
...
Emmanuel Vincent
Shota Horiguchi
Naoyuki Kanda
Takuya Yoshioka
Neville Ryant
72
308
0
20 Apr 2020
How to Teach DNNs to Pay Attention to the Visual Modality in Speech
  Recognition
How to Teach DNNs to Pay Attention to the Visual Modality in Speech Recognition
George Sterpu
Christian Saam
N. Harte
69
29
0
17 Apr 2020
Audio-visual Recognition of Overlapped speech for the LRS2 dataset
Audio-visual Recognition of Overlapped speech for the LRS2 dataset
Jianwei Yu
Shi-Xiong Zhang
Jian Wu
Shahram Ghorbani
Bo Wu
Shiyin Kang
Shansong Liu
Xunying Liu
Helen Meng
Dong Yu
85
73
0
06 Jan 2020
Recurrent Neural Network Transducer for Audio-Visual Speech Recognition
Recurrent Neural Network Transducer for Audio-Visual Speech Recognition
Takaki Makino
H. Liao
Yannis Assael
Brendan Shillingford
Basi García
Otavio Braga
Olivier Siohan
87
130
0
08 Nov 2019
Correlation Congruence for Knowledge Distillation
Correlation Congruence for Knowledge Distillation
Baoyun Peng
Xiao Jin
Jiaheng Liu
Shunfeng Zhou
Yichao Wu
Yu Liu
Dongsheng Li
Zhaoning Zhang
94
513
0
03 Apr 2019
Deep Audio-Visual Speech Recognition
Deep Audio-Visual Speech Recognition
Triantafyllos Afouras
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
100
710
0
06 Sep 2018
Attention-based Audio-Visual Fusion for Robust Automatic Speech
  Recognition
Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition
George Sterpu
Christian Saam
N. Harte
98
65
0
05 Sep 2018
LRS3-TED: a large-scale dataset for visual speech recognition
LRS3-TED: a large-scale dataset for visual speech recognition
Triantafyllos Afouras
Joon Son Chung
Andrew Zisserman
67
445
0
03 Sep 2018
Learning multiple visual domains with residual adapters
Learning multiple visual domains with residual adapters
Sylvestre-Alvise Rebuffi
Hakan Bilen
Andrea Vedaldi
OOD
185
940
0
22 May 2017
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary
  Visual Reasoning
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
Justin Johnson
B. Hariharan
Laurens van der Maaten
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
CoGe
319
2,392
0
20 Dec 2016
Lip Reading Sentences in the Wild
Lip Reading Sentences in the Wild
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
264
793
0
16 Nov 2016
1