Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2312.08676
Cited By
v1
v2 (latest)
SEF-VC: Speaker Embedding Free Zero-Shot Voice Conversion with Cross Attention
14 December 2023
Junjie Li
Yiwei Guo
Xie Chen
Kai Yu
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"SEF-VC: Speaker Embedding Free Zero-Shot Voice Conversion with Cross Attention"
28 / 28 papers shown
Title
EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion
Advait Joglekar
Divyanshu Singh
Rooshil Rohit Bhatia
S. Umesh
80
0
0
22 May 2025
ZSVC: Zero-shot Style Voice Conversion with Disentangled Latent Diffusion Models and Adversarial Training
Xinfa Zhu
Lei He
Yujia Xiao
Xi Wang
Xu Tan
Sheng Zhao
Lei Xie
DiffM
92
2
0
08 Jan 2025
LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec
Yiwei Guo
Zhihan Li
Chenpeng Du
Hankun Wang
Xie Chen
Kai Yu
89
3
0
21 Oct 2024
LM-VC: Zero-shot Voice Conversion via Speech Generation based on Language Models
Zhichao Wang
Yuan-Jui Chen
Linfu Xie
Qiao Tian
Yuping Wang
154
32
0
18 Jun 2023
UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding
Chenpeng Du
Yiwei Guo
Feiyu Shen
Zhijun Liu
Zheng Liang
Xie Chen
Shuai Wang
Hui Zhang
K. Yu
DiffM
94
44
0
13 Jun 2023
Make-A-Voice: Unified Voice Synthesis With Discrete Representation
Rongjie Huang
Chunlei Zhang
Yongqiang Wang
Dongchao Yang
Lu Liu
Zhenhui Ye
Ziyue Jiang
Chao Weng
Zhou Zhao
Dong Yu
DiffM
80
27
0
30 May 2023
ACE-VC: Adaptive and Controllable Voice Conversion using Explicitly Disentangled Self-supervised Speech Representations
Shehzeen Samarah Hussain
Paarth Neekhara
Jocelyn Huang
Jason Chun Lok Li
Boris Ginsburg
56
25
0
16 Feb 2023
AudioLM: a Language Modeling Approach to Audio Generation
Zalan Borsos
Raphaël Marinier
Damien Vincent
Eugene Kharitonov
Olivier Pietquin
...
Dominik Roblek
O. Teboul
David Grangier
Marco Tagliasacchi
Neil Zeghidour
AuLLM
161
616
0
07 Sep 2022
Towards Improved Zero-shot Voice Conversion with Conditional DSVAE
Jiachen Lian
Chunlei Zhang
Gopala Krishna Anumanchipalli
Dong Yu
53
23
0
11 May 2022
Robust Disentangled Variational Speech Representation Learning for Zero-shot Voice Conversion
Jiachen Lian
Chunlei Zhang
Dong Yu
DRL
65
52
0
30 Mar 2022
SpeechSplit 2.0: Unsupervised speech disentanglement for voice conversion Without tuning autoencoder Bottlenecks
Chak Ho Chan
Kaizhi Qian
Yang Zhang
M. Hasegawa-Johnson
DRL
45
48
0
26 Mar 2022
DGC-vector: A new speaker embedding for zero-shot voice conversion
Ruitong Xiao
Haitong Zhang
Yue Lin
52
12
0
18 Mar 2022
DRVC: A Framework of Any-to-Any Voice Conversion with Self-Supervised Learning
Qiqi Wang
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
DRL
109
23
0
22 Feb 2022
Training Robust Zero-Shot Voice Conversion Models with Self-supervised Features
Trung D. Q. Dang
Dung T. Tran
Peter Chin
K. Koishida
SSL
69
15
0
08 Dec 2021
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
Edresson Casanova
Julian Weber
C. Shulby
Arnaldo Cândido Júnior
Eren Golge
M. Ponti
238
415
0
04 Dec 2021
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
Wei-Ning Hsu
Benjamin Bolte
Yao-Hung Hubert Tsai
Kushal Lakhotia
Ruslan Salakhutdinov
Abdel-rahman Mohamed
SSL
188
3,004
0
14 Jun 2021
NoiseVC: Towards High Quality Zero-Shot Voice Conversion
Shijun Wang
Damian Borth
DRL
64
6
0
13 Apr 2021
S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations
Jheng-hao Lin
Yist Y. Lin
C. Chien
Hung-yi Lee
138
56
0
07 Apr 2021
SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model
Edresson Casanova
C. Shulby
Eren Golge
Nicolas Müller
F. S. Oliveira
Arnaldo Cândido Júnior
A. S. Soares
S. Aluísio
M. Ponti
58
100
0
02 Apr 2021
Speech Resynthesis from Discrete Disentangled Self-Supervised Representations
Adam Polyak
Yossi Adi
Jade Copet
Eugene Kharitonov
Kushal Lakhotia
Wei-Ning Hsu
Abdel-rahman Mohamed
Emmanuel Dupoux
115
318
0
01 Apr 2021
Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised Discrete Speech Representations
Wen-Chin Huang
Yi-Chiao Wu
Tomoki Hayashi
Tomoki Toda
BDL
98
38
0
23 Oct 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
179
1,952
0
12 Oct 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
229
3,164
0
16 May 2020
Unsupervised Speech Decomposition via Triple Information Bottleneck
Kaizhi Qian
Yang Zhang
Shiyu Chang
David D. Cox
M. Hasegawa-Johnson
82
185
0
23 Apr 2020
vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations
Alexei Baevski
Steffen Schneider
Michael Auli
SSL
170
667
0
12 Oct 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
Heiga Zen
Viet Dang
R. Clark
Yu Zhang
Ron J. Weiss
Ye Jia
Zhiwen Chen
Yonghui Wu
104
959
0
05 Apr 2019
Glow: Generative Flow with Invertible 1x1 Convolutions
Diederik P. Kingma
Prafulla Dhariwal
BDL
DRL
308
3,144
0
09 Jul 2018
Voice Conversion from Non-parallel Corpora Using Variational Auto-encoder
Chin-Cheng Hsu
Hsin-Te Hwang
Yi-Chiao Wu
Yu Tsao
H. Wang
106
304
0
13 Oct 2016
1