ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers

arXiv: 2204.09224

20 April 2022
Kaizhi Qian
Yang Zhang
Heting Gao
Junrui Ni
Cheng-I Jeff Lai
David D. Cox
M. Hasegawa-Johnson
Shiyu Chang
    DRL

Papers citing "ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers"

31 / 81 papers shown
Leveraging Diverse Semantic-based Audio Pretrained Models for Singing Voice Conversion
Xueyao Zhang
Yicheng Gu
Haopeng Chen
Zihao Fang
Lexiao Zou
Junan Zhang
Liumeng Xue
Jinchao Zhang
Jie Zhou
Zhizheng Wu
DiffM
32
1
0
17 Oct 2023
SelfVC: Voice Conversion With Iterative Refinement using Self Transformations
Paarth Neekhara
Shehzeen Samarah Hussain
Rafael Valle
Boris Ginsburg
Rishabh Ranjan
Shlomo Dubnov
F. Koushanfar
Julian McAuley
18
3
0
14 Oct 2023
A Comparative Study of Voice Conversion Models with Large-Scale Speech and Singing Data: The T13 Systems for the Singing Voice Conversion Challenge 2023
Ryuichi Yamamoto
Reo Yoneyama
Lester Phillip Violeta
Wen-Chin Huang
T. Toda
19
7
0
08 Oct 2023
VITS-based Singing Voice Conversion System with DSPGAN post-processing for SVCC2023
Yi-Hua Zhou
Meng Chen
Yi Lei
Jihua Zhu
Weifeng Zhao
16
5
0
08 Oct 2023
HuBERTopic: Enhancing Semantic Representation of HuBERT through Self-supervision Utilizing Topic Model
Takashi Maekaku
Jiatong Shi
Xuankai Chang
Yuya Fujita
Shinji Watanabe
34
1
0
06 Oct 2023
Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction
Jiatong Shi
H. Inaguma
Xutai Ma
Ilia Kulikov
Anna Y. Sun
48
24
0
04 Oct 2023
Disentangling Voice and Content with Self-Supervision for Speaker Recognition
Tianchi Liu
Kong Aik Lee
Qiongqiong Wang
Haizhou Li
BDL
DRL
32
30
0
02 Oct 2023
Unsupervised Accent Adaptation Through Masked Language Model Correction Of Discrete Self-Supervised Speech Units
Jakob Poncelet
Hugo Van hamme
23
3
0
25 Sep 2023
FSD: An Initial Chinese Dataset for Fake Song Detection
Yuankun Xie
Jingjing Zhou
Xiaolin Lu
Zhenghao Jiang
Yuxin Yang
Haonan Cheng
Long Ye
24
14
0
05 Sep 2023
SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs
Yinghao Aaron Li
Cong Han
N. Mesgarani
28
5
0
18 Jul 2023
The Singing Voice Conversion Challenge 2023
Wen-Chin Huang
Lester Phillip Violeta
Songxiang Liu
Jiatong Shi
T. Toda
16
46
0
26 Jun 2023
HiddenSinger: High-Quality Singing Voice Synthesis via Neural Audio Codec and Latent Diffusion Models
Ji-Sang Hwang
Sang-Hoon Lee
Seong-Whan Lee
DiffM
38
8
0
12 Jun 2023
What Can an Accent Identifier Learn? Probing Phonetic and Prosodic Information in a Wav2vec2-based Accent Identification Model
Mu Yang
R. Shekar
Okim Kang
John H. L. Hansen
20
10
0
10 Jun 2023
Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering
Heng-Jui Chang
Alexander H. Liu
James R. Glass
SSL
22
20
0
18 May 2023
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning
Alexander H. Liu
Heng-Jui Chang
Michael Auli
Wei-Ning Hsu
James R. Glass
27
25
0
17 May 2023
Adversarial Speaker Disentanglement Using Unannotated External Data for Self-supervised Representation Based Voice Conversion
Xintao Zhao
Shuai Wang
Yang Chao
Zhiyong Wu
Helen Meng
32
3
0
16 May 2023
Self-supervised Neural Factor Analysis for Disentangling Utterance-level Speech Representations
Wei-wei Lin
Chenhang He
Man-Wai Mak
Youzhi Tu
27
5
0
14 May 2023
Who is Speaking Actually? Robust and Versatile Speaker Traceability for Voice Conversion
Yanzhen Ren
Hongcheng Zhu
Liming Zhai
Zongkun Sun
Rubing Shen
Lina Wang
28
6
0
09 May 2023
SPADE: Self-supervised Pretraining for Acoustic DisEntanglement
John Harvill
Jarred Barber
Arun Nair
Ramin Pishehvar
SSL
DRL
24
0
0
03 Feb 2023
NANSY++: Unified Voice Synthesis with Neural Analysis and Synthesis
Hyeong-Seok Choi
Jinhyeok Yang
Juheon Lee
Hyeongju Kim
20
46
0
17 Nov 2022
Introducing Semantics into Speech Encoders
Derek Xu
Shuyan Dong
Changhan Wang
Suyoun Kim
Zhaojiang Lin
...
Alexei Baevski
Guan-Ting Lin
Hung-yi Lee
Yizhou Sun
Wei Wang
SSL
33
3
0
15 Nov 2022
Distinguishable Speaker Anonymization based on Formant and Fundamental Frequency Scaling
Jixun Yao
Qing Wang
Yi Lei
Pengcheng Guo
Linfu Xie
Namin Wang
Jie Liu
32
13
0
06 Nov 2022
More Speaking or More Speakers?
Dan Berrebbi
R. Collobert
Navdeep Jaitly
Tatiana Likhomanenko
13
5
0
02 Nov 2022
Self-supervised language learning from raw audio: Lessons from the Zero Resource Speech Challenge
Ewan Dunbar
Nicolas Hamilakis
Emmanuel Dupoux
SSL
32
30
0
27 Oct 2022
CCC-wav2vec 2.0: Clustering aided Cross Contrastive Self-supervised learning of speech representations
Vasista Sai Lodagala
Sreyan Ghosh
S. Umesh
SSL
46
18
0
05 Oct 2022
Augmentation Invariant Discrete Representation for Generative Spoken Language Modeling
Itai Gat
Felix Kreuk
Tu Nguyen
Ann Lee
Jade Copet
Gabriel Synnaeve
Emmanuel Dupoux
Yossi Adi
30
11
0
30 Sep 2022
I2CR: Improving Noise Robustness on Keyword Spotting Using Inter-Intra Contrastive Regularization
Dianwen Ng
J. Yip
Tanmay Surana
Zhao Yang
Chong Zhang
Yukun Ma
Chongjia Ni
Chng Eng Siong
B. Ma
35
6
0
14 Sep 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
131
349
0
21 May 2022
Generative Spoken Language Modeling from Raw Audio
Kushal Lakhotia
Evgeny Kharitonov
Wei-Ning Hsu
Yossi Adi
Adam Polyak
...
Tu Nguyen
Jade Copet
Alexei Baevski
A. Mohamed
Emmanuel Dupoux
AuLLM
191
337
0
01 Feb 2021
Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition
Yu Zhang
James Qin
Daniel S. Park
Wei Han
Chung-Cheng Chiu
Ruoming Pang
Quoc V. Le
Yonghui Wu
VLM
SSL
146
308
0
20 Oct 2020
Multi-task self-supervised learning for Robust Speech Recognition
Mirco Ravanelli
Jianyuan Zhong
Santiago Pascual
P. Swietojanski
João Monteiro
J. Trmal
Yoshua Bengio
SSL
189
288
0
25 Jan 2020