Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2203.07996
Cited By
Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition
24 February 2022
Xichen Pan
Peiyu Chen
Yichen Gong
Helong Zhou
Xinbing Wang
Zhouhan Lin
SSL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition"
9 / 9 papers shown
Title
CoGenAV: Versatile Audio-Visual Representation Learning via Contrastive-Generative Synchronization
Detao Bai
Zhiheng Ma
Xihan Wei
Liefeng Bo
171
0
0
06 May 2025
Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation
Sungnyun Kim
Sungwoo Cho
Sangmin Bae
Kangwook Jang
Se-Young Yun
SSL
79
1
0
23 Jan 2025
UniBriVL: Robust Universal Representation and Generation of Audio Driven Diffusion Models
Sen Fang
Bowen Gao
Yangjian Wu
T. Teoh
DiffM
34
1
0
29 Jul 2023
Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition
Yuchen Hu
Ruizhe Li
Cheng Chen
Chengwei Qin
Qiu-shi Zhu
Eng Siong Chng
39
5
0
18 Jun 2023
End-to-end Audio-visual Speech Recognition with Conformers
Pingchuan Ma
Stavros Petridis
M. Pantic
86
226
0
12 Feb 2021
Improved Baselines with Momentum Contrastive Learning
Xinlei Chen
Haoqi Fan
Ross B. Girshick
Kaiming He
SSL
281
3,378
0
09 Mar 2020
Lipreading using Temporal Convolutional Networks
Brais Martínez
Pingchuan Ma
Stavros Petridis
M. Pantic
168
239
0
23 Jan 2020
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
266
2,238
0
14 Jun 2018
Lip Reading Sentences in the Wild
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
185
784
0
16 Nov 2016
1