2403.16078
Cited By
Target Speech Extraction with Pre-trained AV-HuBERT and Mask-And-Recover Strategy
24 March 2024
Wenxuan Wu
Xueyuan Chen
Xixin Wu
Haizhou Li
Helen M. Meng
Papers citing
"Target Speech Extraction with Pre-trained AV-HuBERT and Mask-And-Recover Strategy"
7 / 7 papers shown
Title
Exploiting Audio-Visual Features with Pretrained AV-HuBERT for Multi-Modal Dysarthric Speech Reconstruction
Xueyuan Chen
Yuejiao Wang
Xixin Wu
Disong Wang
Zhiyong Wu
Xunying Liu
Helen M. Meng
42
6
0
31 Jan 2024
StyleSpeech: Self-supervised Style Enhancing with VQ-VAE-based Pre-training for Expressive Audiobook Speech Synthesis
Xueyuan Chen
Xi Wang
Shaofei Zhang
Lei He
Zhiyong Wu
Xixin Wu
Helen M. Meng
41
7
0
19 Dec 2023
token2vec: A Joint Self-Supervised Pre-training Framework Using Unpaired Speech and Text
Xianghu Yue
Junyi Ao
Xiaoxue Gao
Haizhou Li
SSL
26
8
0
30 Oct 2022
Speaker recognition with two-step multi-modal deep cleansing
Ruijie Tao
Kong Aik Lee
Zhan Shi
Haizhou Li
NoLa
45
13
0
28 Oct 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
128
349
0
21 May 2022
VisualVoice: Audio-Visual Speech Separation with Cross-Modal Consistency
Ruohan Gao
Kristen Grauman
CVBM
190
198
0
08 Jan 2021
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
224
2,234
0
14 Jun 2018