VoxGenesis: Unsupervised Discovery of Latent Speaker Manifold for Speech Synthesis

1 March 2024

Kong Aik Lee

Papers cited by "VoxGenesis: Unsupervised Discovery of Latent Speaker Manifold for Speech Synthesis"

Title | Authors | Tags | Metrics | Date
High-Fidelity Audio Compression with Improved RVQGAN | Rithesh Kumar, Prem Seetharaman, Alejandro Luebs, I. Kumar, Kundan Kumar | - | 77 326 0 | 11 Jun 2023
DSVAE: Interpretable Disentangled Representation for Synthetic Speech Detection | Amit Kumar Singh Yadav, Kratika Bhagtani, Ziyue Xiang, Paolo Bestagini, Stefano Tubaro, Edward J. Delp | DRL | 51 6 0 | 06 Apr 2023
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone | Edresson Casanova, Julian Weber, C. Shulby, Arnaldo Cândido Júnior, Eren Golge, M. Ponti | - | 217 403 0 | 04 Dec 2021
Speaker Generation | Daisy Stanton, Matt Shannon, Soroosh Mariooryad, RJ Skerry-Ryan, Eric Battenberg, Tom Bagby, David Kao | - | 38 28 0 | 07 Nov 2021
Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations | Hyeong-Seok Choi, Juheon Lee, W. Kim, Jie Hwan Lee, Hoon Heo, Kyogu Lee | - | 54 156 0 | 27 Oct 2021
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units | Wei-Ning Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, Abdel-rahman Mohamed | SSL | 145 2,937 0 | 14 Jun 2021
Diffusion Models Beat GANs on Image Synthesis | Prafulla Dhariwal, Alex Nichol | - | 171 7,763 0 | 11 May 2021
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis | Jungil Kong, Jaehyeon Kim, Jaekyoung Bae | - | 162 1,923 0 | 12 Oct 2020
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations | Alexei Baevski, Henry Zhou, Abdel-rahman Mohamed, Michael Auli | SSL | 217 5,767 0 | 20 Jun 2020
Unsupervised Speech Decomposition via Triple Information Bottleneck | Kaizhi Qian, Yang Zhang, Shiyu Chang, David D. Cox, M. Hasegawa-Johnson | - | 62 183 0 | 23 Apr 2020
GANSpace: Discovering Interpretable GAN Controls | Erik Härkönen, Aaron Hertzmann, J. Lehtinen, Sylvain Paris | - | 109 902 0 | 06 Apr 2020
Unsupervised Discovery of Interpretable Directions in the GAN Latent Space | A. Voynov, Artem Babenko | - | 119 418 0 | 10 Feb 2020
AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss | Kaizhi Qian, Yang Zhang, Shiyu Chang, Xuesong Yang, M. Hasegawa-Johnson | - | 70 462 0 | 14 May 2019
CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion | Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo | - | 54 258 0 | 09 Apr 2019
A Style-Based Generator Architecture for Generative Adversarial Networks | Tero Karras, S. Laine, Timo Aila | - | 524 10,527 0 | 12 Dec 2018
WaveGlow: A Flow-based Generative Network for Speech Synthesis | R. Prenger, Rafael Valle, Bryan Catanzaro | - | 151 1,029 0 | 31 Oct 2018
Glow: Generative Flow with Invertible 1x1 Convolutions | Diederik P. Kingma, Prafulla Dhariwal | BDL, DRL | 254 3,123 0 | 09 Jul 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis | Ye Jia, Yu Zhang, Ron J. Weiss, Quan Wang, Jonathan Shen, ..., Zhiwen Chen, Patrick Nguyen, Ruoming Pang, Ignacio López Moreno, Yonghui Wu | - | 251 828 0 | 12 Jun 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions | Jonathan Shen, Ruoming Pang, Ron J. Weiss, M. Schuster, Navdeep Jaitly, ..., Yuxuan Wang, RJ Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis, Yonghui Wu | - | 77 2,693 0 | 16 Dec 2017
Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data | Wei-Ning Hsu, Yu Zhang, James R. Glass | BDL, SSL | 76 351 0 | 22 Sep 2017
WaveNet: A Generative Model for Raw Audio | Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, A. Senior, Koray Kavukcuoglu | DiffM | 350 7,381 0 | 12 Sep 2016