ResearchTrend.AI
Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition

16 May 2023
Yuchen Hu
Ruizhe Li
Chen Chen
Heqing Zou
Qiu-shi Zhu
E. Chng

Papers citing "Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition"

10 / 10 papers shown
Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation
Sungnyun Kim
Sungwoo Cho
Sangmin Bae
Kangwook Jang
Se-Young Yun
SSL
73
1
0
23 Jan 2025
Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition
Sungnyun Kim
Kangwook Jang
Sangmin Bae
Hoirin Kim
Se-Young Yun
50
3
0
04 Jul 2024
A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition
Yusheng Dai
Hang Chen
Jun Du
Ruoyu Wang
Shihao Chen
Jie Ma
Haotian Wang
Chin-Hui Lee
45
4
0
07 Mar 2024
Wav2code: Restore Clean Speech Representations via Codebook Lookup for Noise-Robust ASR
Yuchen Hu
Chen Chen
Qiu-shi Zhu
E. Chng
22
15
0
11 Apr 2023
Unifying Speech Enhancement and Separation with Gradient Modulation for End-to-End Noise-Robust Speech Separation
Yuchen Hu
Chen Chen
Heqing Zou
Xionghu Zhong
Chng Eng Siong
47
16
0
22 Feb 2023
Robust Data2vec: Noise-robust Speech Representation Learning for ASR by Combining Regression and Improved Contrastive Learning
Qiu-shi Zhu
Long Zhou
Jie Zhang
Shujie Liu
Yu-Chen Hu
Lirong Dai
VLM
SSL
60
37
0
27 Oct 2022
A Noise-Robust Self-supervised Pre-training Model Based Speech Representation Learning for Automatic Speech Recognition
Qiu-shi Zhu
Jie Zhang
Zi-qiang Zhang
Ming Wu
Xin Fang
Lirong Dai
123
40
0
22 Jan 2022
End-to-end Audio-visual Speech Recognition with Conformers
Pingchuan Ma
Stavros Petridis
M. Pantic
84
225
0
12 Feb 2021
Improved Baselines with Momentum Contrastive Learning
Xinlei Chen
Haoqi Fan
Ross B. Girshick
Kaiming He
SSL
273
3,375
0
09 Mar 2020
Lip Reading Sentences in the Wild
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
167
784
0
16 Nov 2016