End-to-End Adversarial Text-to-Speech

5 June 2020

Papers citing "End-to-End Adversarial Text-to-Speech"

50 / 55 papers shown

Generative Adversarial Networks Gilad Cohen Raja Giryes GAN 106 30,021 0 01 Mar 2022
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search Jaehyeon Kim Sungwon Kim Jungil Kong Sungroh Yoon 66 482 0 22 May 2020
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis Rafael Valle Kevin J. Shih R. Prenger Bryan Catanzaro 51 120 0 12 May 2020
Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech Geng Yang Shan Yang Kai-Chun Liu Peng Fang Wei Chen Lei Xie 85 198 0 11 May 2020
DDSP: Differentiable Digital Signal Processing Jesse Engel Lamtharn Hantrakul Chenjie Gu Adam Roberts DiffM 136 375 0 14 Jan 2020
Normalizing Flows for Probabilistic Modeling and Inference George Papamakarios Eric T. Nalisnick Danilo Jimenez Rezende S. Mohamed Balaji Lakshminarayanan TPM AI4CE 120 1,662 0 05 Dec 2019
WaveFlow: A Compact Flow-based Model for Raw Audio Wei Ping Kainan Peng Kexin Zhao Z. Song 55 117 0 03 Dec 2019
Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram Ryuichi Yamamoto Eunwoo Song Jae-Min Kim 32 817 0 25 Oct 2019
Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis Eric Battenberg RJ Skerry-Ryan Soroosh Mariooryad Daisy Stanton David Kao Matt Shannon Tom Bagby 50 114 0 23 Oct 2019
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis Kundan Kumar Rithesh Kumar T. Boissière L. Gestin Wei Zhen Teoh Jose M. R. Sotelo A. D. Brébisson Yoshua Bengio Aaron Courville GAN 59 945 0 08 Oct 2019
High Fidelity Speech Synthesis with Adversarial Networks Mikolaj Binkowski Jeff Donahue Sander Dieleman Aidan Clark Erich Elsen Norman Casagrande Luis C. Cobo Karen Simonyan 262 240 0 25 Sep 2019
MelNet: A Generative Model for Audio in the Frequency Domain Sean Vasquez M. Lewis DiffM 33 131 0 04 Jun 2019
Robust Sequence-to-Sequence Acoustic Modeling with Stepwise Monotonic Attention for Neural TTS Mutian He Yan Deng Lei He 27 81 0 03 Jun 2019
Non-Autoregressive Neural Text-to-Speech Kainan Peng Wei Ping Z. Song Kexin Zhao 38 40 0 21 May 2019
Expediting TTS Synthesis with Adversarial Vocoding Paarth Neekhara Chris Donahue M. Puckette Shlomo Dubnov Julian McAuley 16 20 0 16 Apr 2019
A New GAN-based End-to-End TTS Training Algorithm Haohan Guo Frank Soong Lei He Lei Xie 45 47 0 09 Apr 2019
Probability density distillation with generative adversarial networks for high-quality parallel waveform generation Ryuichi Yamamoto Eunwoo Song Jae-Min Kim 28 55 0 09 Apr 2019
FloWaveNet : A Generative Flow for Raw Audio Sungwon Kim Sang-gil Lee Jongyoon Song Jaehyeon Kim Sungroh Yoon 38 169 0 06 Nov 2018
WaveGlow: A Flow-based Generative Network for Speech Synthesis R. Prenger Rafael Valle Bryan Catanzaro 115 1,024 0 31 Oct 2018
Neural source-filter-based waveform model for statistical parametric speech synthesis Xin Wang Shinji Takaki Junichi Yamagishi 32 125 0 29 Oct 2018
LPCNet: Improving Neural Speech Synthesis Through Linear Prediction J. Valin Jan Skoglund 28 450 0 28 Oct 2018
Large Scale GAN Training for High Fidelity Natural Image Synthesis Andrew Brock Jeff Donahue Karen Simonyan 200 5,363 0 28 Sep 2018
Fast Spectrogram Inversion using Multi-head Convolutional Neural Networks Sercan O. Arik Heewoo Jun G. Diamos 29 108 0 20 Aug 2018
ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech Wei Ping Kainan Peng Jitong Chen 34 344 0 19 Jul 2018
Forward Attention in Sequence-to-sequence Acoustic Modelling for Speech Synthesis Jing-Xuan Zhang Zhenhua Ling Lirong Dai 22 83 0 18 Jul 2018
Efficient Neural Audio Synthesis Nal Kalchbrenner Erich Elsen Karen Simonyan Seb Noury Norman Casagrande Edward Lockhart Florian Stimberg Aaron van den Oord Sander Dieleman Koray Kavukcuoglu 63 866 0 23 Feb 2018
Spectral Normalization for Generative Adversarial Networks Takeru Miyato Toshiki Kataoka Masanori Koyama Yuichi Yoshida ODL 127 4,421 0 16 Feb 2018
cGANs with Projection Discriminator Takeru Miyato Masanori Koyama GAN 42 768 0 15 Feb 2018
Adversarial Audio Synthesis Chris Donahue Julian McAuley M. Puckette GAN 114 609 0 12 Feb 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions Jonathan Shen Ruoming Pang Ron J. Weiss M. Schuster Navdeep Jaitly ... Yuxuan Wang RJ Skerry-Ryan Rif A. Saurous Yannis Agiomyrgiannakis Yonghui Wu 59 2,684 0 16 Dec 2017
Monotonic Chunkwise Attention Chung-Cheng Chiu Colin Raffel 50 255 0 14 Dec 2017
Parallel WaveNet: Fast High-Fidelity Speech Synthesis Aaron van den Oord Yazhe Li Igor Babuschkin Karen Simonyan Oriol Vinyals ... Alex Graves Helen King T. Walters Dan Belov Demis Hassabis 118 857 0 28 Nov 2017
Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning Wei Ping Kainan Peng Andrew Gibiansky Sercan O. Arik Ajay Kannan Sharan Narang Jonathan Raiman John Miller 54 304 0 20 Oct 2017
Modulating early visual processing by language H. D. Vries Florian Strub Jérémie Mary Hugo Larochelle Olivier Pietquin Aaron Courville 101 484 0 02 Jul 2017
Attention Is All You Need Ashish Vaswani Noam M. Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan Gomez Lukasz Kaiser Illia Polosukhin 3DV 341 129,831 0 12 Jun 2017
Deep Voice 2: Multi-Speaker Neural Text-to-Speech Sercan O. Arik G. Diamos Andrew Gibiansky John Miller Kainan Peng Wei Ping Jonathan Raiman Yanqi Zhou 51 495 0 24 May 2017
Convolutional Sequence to Sequence Learning Jonas Gehring Michael Auli David Grangier Denis Yarats Yann N. Dauphin AIMat 115 3,279 0 08 May 2017
Geometric GAN Jae Hyun Lim J. C. Ye GAN 39 516 0 08 May 2017
Online and Linear-Time Attention by Enforcing Monotonic Alignments Colin Raffel Minh-Thang Luong Peter J. Liu Ron J. Weiss Douglas Eck 47 258 0 03 Apr 2017
Tacotron: Towards End-to-End Speech Synthesis Yuxuan Wang RJ Skerry-Ryan Daisy Stanton Yonghui Wu Ron J. Weiss ... Samy Bengio Quoc V. Le Yannis Agiomyrgiannakis R. Clark Rif A. Saurous 125 1,817 0 29 Mar 2017
Soft-DTW: a Differentiable Loss Function for Time-Series Marco Cuturi Mathieu Blondel AI4TS 159 620 0 05 Mar 2017
Deep Voice: Real-time Neural Text-to-Speech Sercan O. Arik Mike Chrzanowski Adam Coates G. Diamos Andrew Gibiansky ... John Miller Andrew Ng Jonathan Raiman Shubho Sengupta Mohammad Shoeybi 52 613 0 25 Feb 2017
SampleRNN: An Unconditional End-to-End Neural Audio Generation Model Soroush Mehri Kundan Kumar Ishaan Gulrajani Rithesh Kumar Shubham Jain Jose M. R. Sotelo Aaron Courville Yoshua Bengio 59 597 0 22 Dec 2016
A Learned Representation For Artistic Style Vincent Dumoulin Jonathon Shlens M. Kudlur GAN 276 1,160 0 24 Oct 2016
WaveNet: A Generative Model for Raw Audio Aaron van den Oord Sander Dieleman Heiga Zen Karen Simonyan Oriol Vinyals Alex Graves Nal Kalchbrenner A. Senior Koray Kavukcuoglu DiffM 251 7,361 0 12 Sep 2016
SGDR: Stochastic Gradient Descent with Warm Restarts I. Loshchilov Frank Hutter ODL 210 8,030 0 13 Aug 2016
Layer Normalization Jimmy Lei Ba J. Kiros Geoffrey E. Hinton 219 10,412 0 21 Jul 2016
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems Martín Abadi Ashish Agarwal P. Barham E. Brevdo Zhiwen Chen ... Pete Warden Martin Wattenberg Martin Wicke Yuan Yu Xiaoqiang Zheng 160 11,135 0 14 Mar 2016
Sequence Level Training with Recurrent Neural Networks Marc'Aurelio Ranzato S. Chopra Michael Auli Wojciech Zaremba 65 1,610 0 20 Nov 2015
Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks Samy Bengio Oriol Vinyals Navdeep Jaitly Noam M. Shazeer 103 2,024 0 09 Jun 2015