High Fidelity Speech Synthesis with Adversarial Networks

25 September 2019

Papers citing "High Fidelity Speech Synthesis with Adversarial Networks"

49 / 149 papers shown

Title
Sprachsynthese -- State-of-the-Art in englischer und deutscher Sprache René Peinl 21 0 0 11 Jun 2021
Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech Jaehyeon Kim Jungil Kong Juhee Son DRL 80 842 0 11 Jun 2021
Fre-GAN: Adversarial Frequency-consistent Audio Synthesis Ji-Hoon Kim Sang-Hoon Lee Ji-Hyun Lee Seong-Whan Lee 16 53 0 04 Jun 2021
NVC-Net: End-to-End Adversarial Voice Conversion Bac Nguyen Cong Fabien Cardinaux AAML 37 41 0 02 Jun 2021
ItôTTS and ItôWave: Linear Stochastic Differential Equation Is All You Need For Audio Generation Shoule Wu Ziqiang Shi DiffM 19 11 0 17 May 2021
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech Vadim Popov Ivan Vovk Vladimir Gogoryan Tasnima Sadekova Mikhail Kudinov DiffM 44 514 0 13 May 2021
VQCPC-GAN: Variable-Length Adversarial Audio Synthesis Using Vector-Quantized Contrastive Predictive Coding J. Nistal Cyran Aouameur Stefan Lattner G. Richard 17 7 0 04 May 2021
VideoGPT: Video Generation using VQ-VAE and Transformers Wilson Yan Yunzhi Zhang Pieter Abbeel A. Srinivas ViT VGen 245 484 0 20 Apr 2021
Noise Estimation for Generative Diffusion Models Robin San-Roman Eliya Nachmani Lior Wolf DiffM 28 105 0 06 Apr 2021
Deepfakes Generation and Detection: State-of-the-art, open challenges, countermeasures, and way forward Momina Masood M. Nawaz K. Malik A. Javed Aun Irtaza AAML 123 297 0 25 Feb 2021
MaskCycleGAN-VC: Learning Non-parallel Voice Conversion with Filling in Frames Takuhiro Kaneko Hirokazu Kameoka Kou Tanaka Nobukatsu Hojo 33 57 0 25 Feb 2021
AudioVisual Speech Synthesis: A brief literature review Efthymios Georgiou Athanasios Katsamanis 19 0 0 18 Feb 2021
High Fidelity Speech Regeneration with Application to Speech Enhancement Adam Polyak Lior Wolf Yossi Adi Ori Kabeli Yaniv Taigman 12 18 0 31 Jan 2021
Fully Non-autoregressive Neural Machine Translation: Tricks of the Trade Jiatao Gu X. Kong 28 135 0 31 Dec 2020
MelGlow: Efficient Waveform Generative Network Based on Location-Variable Convolution Zhen Zeng Jianzong Wang Ning Cheng Jing Xiao 6 8 0 03 Dec 2020
A Comprehensive Survey on Deep Music Generation: Multi-level Representations, Algorithms, Evaluations, and Future Directions Shulei Ji Jing Luo Xinyu Yang MGen 13 125 0 13 Nov 2020
Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis Ron J. Weiss RJ Skerry-Ryan Eric Battenberg Soroosh Mariooryad Diederik P. Kingma 21 97 0 06 Nov 2020
StyleMelGAN: An Efficient High-Fidelity Adversarial Vocoder with Temporal Adaptive Normalization Ahmed Mustafa N. Pia Guillaume Fuchs 14 71 0 03 Nov 2020
Speech Synthesis and Control Using Differentiable DSP Giorgio Fabbro Vladimir Golkov Thomas Kemp Daniel Cremers 13 12 0 28 Oct 2020
Upsampling artifacts in neural audio synthesis Jordi Pons Santiago Pascual Giulio Cengarle Joan Serrà 31 62 0 27 Oct 2020
Parallel waveform synthesis based on generative adversarial networks with voicing-aware conditional discriminators Ryuichi Yamamoto Eunwoo Song Min-Jae Hwang Jae-Min Kim 22 18 0 27 Oct 2020
CLAR: Contrastive Learning of Auditory Representations Haider Al-Tahan Y. Mohsenzadeh SSL 118 56 0 19 Oct 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis Jungil Kong Jaehyeon Kim Jaekyoung Bae 12 1,850 0 12 Oct 2020
The NU Voice Conversion System for the Voice Conversion Challenge 2020: On the Effectiveness of Sequence-to-sequence Models and Autoregressive Neural Vocoders Wen-Chin Huang Patrick Lumban Tobing Yi-Chiao Wu Kazuhiro Kobayashi T. Toda 19 8 0 09 Oct 2020
DiffWave: A Versatile Diffusion Model for Audio Synthesis Zhifeng Kong Ming-Yu Liu Jiaji Huang Kexin Zhao Bryan Catanzaro DiffM BDL 34 1,392 0 21 Sep 2020
HiFiSinger: Towards High-Fidelity Neural Singing Voice Synthesis Jiawei Chen Xu Tan Jian Luan Tao Qin Tie-Yan Liu VLM 19 92 0 03 Sep 2020
WaveGrad: Estimating Gradients for Waveform Generation Nanxin Chen Yu Zhang Heiga Zen Ron J. Weiss Mohammad Norouzi William Chan DiffM BDL 14 771 0 02 Sep 2020
Prosody Learning Mechanism for Speech Synthesis System Without Text Length Limit Zhen Zeng Jianzong Wang Ning Cheng Jing Xiao 11 8 0 13 Aug 2020
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning Berrak Sisman Junichi Yamagishi Simon King Haizhou Li BDL 32 317 0 09 Aug 2020
A Spectral Energy Distance for Parallel Speech Synthesis A. Gritsenko Tim Salimans Rianne van den Berg Jasper Snoek Nal Kalchbrenner 6 69 0 03 Aug 2020
Adversarially Trained Multi-Singer Sequence-To-Sequence Singing Synthesizer Jie Wu Jian Luan 25 26 0 18 Jun 2020
FastPitch: Parallel Text-to-speech with Pitch Prediction Adrian Lañcucki 27 332 0 11 Jun 2020
HiFi-GAN: High-Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks Jiaqi Su Zeyu Jin Adam Finkelstein 23 136 0 10 Jun 2020
End-to-End Adversarial Text-to-Speech Jeff Donahue Sander Dieleman Mikolaj Binkowski Erich Elsen Karen Simonyan 17 185 0 05 Jun 2020
Speech-to-Singing Conversion based on Boundary Equilibrium GAN Da-Yi Wu Yi-Hsuan Yang GAN 6 8 0 28 May 2020
Quasi-Periodic Parallel WaveGAN Vocoder: A Non-autoregressive Pitch-dependent Dilated Convolution Model for Parametric Speech Generation Yi-Chiao Wu Tomoki Hayashi T. Okamoto Hisashi Kawai T. Toda 23 4 0 18 May 2020
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis Rafael Valle Kevin J. Shih R. Prenger Bryan Catanzaro 21 119 0 12 May 2020
Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech Geng Yang Shan Yang Kai-Chun Liu Peng Fang Wei Chen Lei Xie 64 198 0 11 May 2020
GACELA -- A generative adversarial context encoder for long audio inpainting Andrés Marafioti P. Majdak Nicki Holighaus Nathanael Perraudin 33 43 0 11 May 2020
Cotatron: Transcription-Guided Speech Encoder for Any-to-Many Voice Conversion without Parallel Data Seung-won Park Doo-young Kim Myun-chul Joe 8 40 0 07 May 2020
Conditional Spoken Digit Generation with StyleGAN Kasperi Palkama Lauri Juvela Alexander Ilin GAN 19 10 0 28 Apr 2020
Transformation-based Adversarial Video Prediction on Large-Scale Data Pauline Luc Aidan Clark Sander Dieleman Diego de Las Casas Yotam Doron Albin Cassirer Karen Simonyan VGen 231 86 0 09 Mar 2020
A Limited-Capacity Minimax Theorem for Non-Convex Games or: How I Learned to Stop Worrying about Mixed-Nash and Love Neural Nets Gauthier Gidel David Balduzzi Wojciech M. Czarnecki M. Garnelo Yoram Bachrach 11 7 0 14 Feb 2020
Score and Lyrics-Free Singing Voice Generation Jen-Yu Liu Yu-Hua Chen Yin-Cheng Yeh Yi-Hsuan Yang 19 22 0 26 Dec 2019
WaveFlow: A Compact Flow-based Model for Raw Audio Ming-Yu Liu Kainan Peng Kexin Zhao Z. Song 15 116 0 03 Dec 2019
Change your singer: a transfer learning generative adversarial framework for song to song conversion Rema Daher Mohammad Kassem Zein Julia El Zini M. Awad Daniel C. Asmar 19 1 0 07 Nov 2019
CorrGAN: Sampling Realistic Financial Correlation Matrices Using Generative Adversarial Networks Gautier Marti GAN 24 44 0 21 Oct 2019
A Style-Based Generator Architecture for Generative Adversarial Networks Tero Karras S. Laine Timo Aila 294 10,368 0 12 Dec 2018
A Learned Representation For Artistic Style Vincent Dumoulin Jonathon Shlens M. Kudlur GAN 214 1,156 0 24 Oct 2016