2403.00529
VoxGenesis: Unsupervised Discovery of Latent Speaker Manifold for Speech Synthesis
1 March 2024
Wei-wei Lin
Chenhang He
Man-Wai Mak
Jiachen Lian
Kong Aik Lee
DiffM
Papers citing
"VoxGenesis: Unsupervised Discovery of Latent Speaker Manifold for Speech Synthesis"
21 / 21 papers shown
Title
High-Fidelity Audio Compression with Improved RVQGAN
Rithesh Kumar
Prem Seetharaman
Alejandro Luebs
I. Kumar
Kundan Kumar
77
326
0
11 Jun 2023
DSVAE: Interpretable Disentangled Representation for Synthetic Speech Detection
Amit Kumar Singh Yadav
Kratika Bhagtani
Ziyue Xiang
Paolo Bestagini
Stefano Tubaro
Edward J. Delp
DRL
51
6
0
06 Apr 2023
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
Edresson Casanova
Julian Weber
C. Shulby
Arnaldo Cândido Júnior
Eren Golge
M. Ponti
217
403
0
04 Dec 2021
Speaker Generation
Daisy Stanton
Matt Shannon
Soroosh Mariooryad
RJ Skerry-Ryan
Eric Battenberg
Tom Bagby
David Kao
38
28
0
07 Nov 2021
Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations
Hyeong-Seok Choi
Juheon Lee
W. Kim
Jie Hwan Lee
Hoon Heo
Kyogu Lee
54
156
0
27 Oct 2021
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
Wei-Ning Hsu
Benjamin Bolte
Yao-Hung Hubert Tsai
Kushal Lakhotia
Ruslan Salakhutdinov
Abdel-rahman Mohamed
SSL
145
2,937
0
14 Jun 2021
Diffusion Models Beat GANs on Image Synthesis
Prafulla Dhariwal
Alex Nichol
171
7,763
0
11 May 2021
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
162
1,923
0
12 Oct 2020
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
Alexei Baevski
Henry Zhou
Abdel-rahman Mohamed
Michael Auli
SSL
217
5,767
0
20 Jun 2020
Unsupervised Speech Decomposition via Triple Information Bottleneck
Kaizhi Qian
Yang Zhang
Shiyu Chang
David D. Cox
M. Hasegawa-Johnson
62
183
0
23 Apr 2020
GANSpace: Discovering Interpretable GAN Controls
Erik Härkönen
Aaron Hertzmann
J. Lehtinen
Sylvain Paris
109
902
0
06 Apr 2020
Unsupervised Discovery of Interpretable Directions in the GAN Latent Space
A. Voynov
Artem Babenko
119
418
0
10 Feb 2020
AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
Kaizhi Qian
Yang Zhang
Shiyu Chang
Xuesong Yang
M. Hasegawa-Johnson
70
462
0
14 May 2019
CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Nobukatsu Hojo
54
258
0
09 Apr 2019
A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras
S. Laine
Timo Aila
524
10,527
0
12 Dec 2018
WaveGlow: A Flow-based Generative Network for Speech Synthesis
R. Prenger
Rafael Valle
Bryan Catanzaro
151
1,029
0
31 Oct 2018
Glow: Generative Flow with Invertible 1x1 Convolutions
Diederik P. Kingma
Prafulla Dhariwal
BDL
DRL
254
3,123
0
09 Jul 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Ye Jia
Yu Zhang
Ron J. Weiss
Quan Wang
Jonathan Shen
...
Zhiwen Chen
Patrick Nguyen
Ruoming Pang
Ignacio López Moreno
Yonghui Wu
251
828
0
12 Jun 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Jonathan Shen
Ruoming Pang
Ron J. Weiss
M. Schuster
Navdeep Jaitly
...
Yuxuan Wang
RJ Skerry-Ryan
Rif A. Saurous
Yannis Agiomyrgiannakis
Yonghui Wu
77
2,693
0
16 Dec 2017
Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data
Wei-Ning Hsu
Yu Zhang
James R. Glass
BDL
SSL
76
351
0
22 Sep 2017
WaveNet: A Generative Model for Raw Audio
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
350
7,381
0
12 Sep 2016