WaveGlow: A Flow-based Generative Network for Speech Synthesis

31 October 2018

Papers citing "WaveGlow: A Flow-based Generative Network for Speech Synthesis"

50 / 525 papers shown

FeatherWave: An efficient high-fidelity neural vocoder with multi-band linear prediction Qiao Tian Zewang Zhang Heng Lu Linghui Chen Shan Liu 16 22 0 12 May 2020
DiscreTalk: Text-to-Speech as a Machine Translation Problem Tomoki Hayashi Shinji Watanabe 27 32 0 12 May 2020
Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech Geng Yang Shan Yang Kai-Chun Liu Peng Fang Wei Chen Lei Xie 64 198 0 11 May 2020
GACELA -- A generative adversarial context encoder for long audio inpainting Andrés Marafioti P. Majdak Nicki Holighaus Nathanael Perraudin 35 43 0 11 May 2020
Jukebox: A Generative Model for Music Prafulla Dhariwal Heewoo Jun Christine Payne Jong Wook Kim Alec Radford Ilya Sutskever VLM 52 722 0 30 Apr 2020
Adversarial Feature Learning and Unsupervised Clustering based Speech Synthesis for Found Data with Acoustic and Textual Noise Shan Yang Yuxuan Wang Lei Xie 14 9 0 28 Apr 2020
ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders Yu Gu Xiang Yin Yonghui Rao Yuan Wan Benlai Tang Yang Zhang Jitong Chen Yuxuan Wang Zejun Ma 17 70 0 23 Apr 2020
A Study of Non-autoregressive Model for Sequence Generation Yi Ren Jinglin Liu Xu Tan Zhou Zhao Sheng Zhao Tie-Yan Liu 15 60 0 22 Apr 2020
Data Processing for Optimizing Naturalness of Vietnamese Text-to-speech System V. Phung Phan Huy Kinh Anh-Tuan Dinh Quoc Bao Nguyen 25 5 0 20 Apr 2020
ViSQOL v3: An Open Source Production Ready Objective Speech and Audio Metric Michael Chinen Felicia S. C. Lim Jan Skoglund Nikita Gureev F. O'Gorman Andrew Hines 8 132 0 20 Apr 2020
Knowledge-and-Data-Driven Amplitude Spectrum Prediction for Hierarchical Neural Vocoders Yang Ai Zhenhua Ling 11 8 0 16 Apr 2020
Speech Quality Factors for Traditional and Neural-Based Low Bit Rate Vocoders Wissam A. Jassim Jan Skoglund Michael Chinen Andrew Hines 14 8 0 26 Mar 2020
Unsupervised Style and Content Separation by Minimizing Mutual Information for Speech Synthesis Ting-Yao Hu A. Shrivastava Oncel Tuzel C. Dhir 11 30 0 09 Mar 2020
AlignTTS: Efficient Feed-Forward Text-to-Speech System without Explicit Alignment Zhen Zeng Jianzong Wang Ning Cheng Tian Xia Jing Xiao VLM 30 56 0 04 Mar 2020
Gradient Boosted Normalizing Flows Robert Giaquinto A. Banerjee BDL DRL 4 1 0 27 Feb 2020
VFlow: More Expressive Generative Flows with Variational Data Augmentation Jianfei Chen Cheng Lu Biqi Chenli Jun Zhu Tian Tian DRL 16 63 0 22 Feb 2020
Vocoder-free End-to-End Voice Conversion with Transformer Network June-Woo Kim H. Jung Minho Lee 30 4 0 05 Feb 2020
SqueezeWave: Extremely Lightweight Vocoders for On-device Speech Synthesis Bohan Zhai Tianren Gao Flora Xue D. Rothchild Bichen Wu Joseph E. Gonzalez Kurt Keutzer 21 27 0 16 Jan 2020
Neural ODEs for Image Segmentation with Level Sets Rafael Valle F. Reda M. Shoeybi P. LeGresley Andrew Tao Bryan Catanzaro 17 8 0 25 Dec 2019
Probing the phonetic and phonological knowledge of tones in Mandarin TTS models Jian Zhu 18 8 0 23 Dec 2019
C-Flow: Conditional Generative Flow Models for Images and 3D Point Clouds Albert Pumarola S. Popov Francesc Moreno-Noguer V. Ferrari 3DPC AI4CE 31 80 0 15 Dec 2019
Normalizing Flows for Probabilistic Modeling and Inference George Papamakarios Eric T. Nalisnick Danilo Jimenez Rezende S. Mohamed Balaji Lakshminarayanan TPM AI4CE 57 1,631 0 05 Dec 2019
Towards Robust Neural Vocoding for Speech Generation: A Survey Po-Chun Hsu Chun-hsuan Wang Andy T. Liu Hung-yi Lee OOD 15 24 0 05 Dec 2019
WaveFlow: A Compact Flow-based Model for Raw Audio Wei Ping Kainan Peng Kexin Zhao Z. Song 17 116 0 03 Dec 2019
High-quality Speech Synthesis Using Super-resolution Mel-Spectrogram Leyuan Sheng Dong-Yan Huang Evgeny Nikolaevich Pavlovskiy 14 15 0 03 Dec 2019
Dynamic Prosody Generation for Speech Synthesis using Linguistics-Driven Acoustic Embedding Selection Shubhi Tyagi M. Nicolis Jonas Rohnke Thomas Drugman Jaime Lorenzo-Trueba 32 32 0 02 Dec 2019
SchrödingeRNN: Generative Modeling of Raw Audio as a Continuously Observed Quantum State Beñat Mencia Uranga A. Lamacraft 28 3 0 26 Nov 2019
Invertible DNN-based nonlinear time-frequency transform for speech enhancement Daiki Takeuchi Kohei Yatabe Yuma Koizumi Yasuhiro Oikawa N. Harada 30 10 0 25 Nov 2019
Deep Long Audio Inpainting Ya-Liang Chang Kuan-Ying Lee Po-Yu Wu Hung-yi Lee Winston H. Hsu 30 33 0 15 Nov 2019
Feedback Recurrent AutoEncoder Yang Yang Guillaume Sautière J. Jon Ryu Taco S. Cohen 43 21 0 11 Nov 2019
Incremental Text-to-Speech Synthesis with Prefix-to-Prefix Framework Mingbo Ma Baigong Zheng Kaibo Liu Renjie Zheng Hairong Liu Kainan Peng Kenneth Church Liang Huang 17 29 0 07 Nov 2019
On Investigation of Unsupervised Speech Factorization Based on Normalization Flow Haoran Sun Yunqi Cai Lantian Li Dong Wang 21 1 0 29 Oct 2019
Neural Density Estimation and Likelihood-free Inference George Papamakarios BDL DRL 24 44 0 29 Oct 2019
Transferring neural speech waveform synthesizers to musical instrument sounds generation Yi Zhao Xin Wang Lauri Juvela Junichi Yamagishi 24 16 0 27 Oct 2019
Mellotron: Multispeaker expressive voice synthesis by conditioning on rhythm, pitch and global style tokens Rafael Valle Jason Chun Lok Li R. Prenger Bryan Catanzaro 16 148 0 26 Oct 2019
Learning audio representations via phase prediction Félix de Chaumont Quitry Marco Tagliasacchi Dominik Roblek SSL AI4TS 11 10 0 25 Oct 2019
Fast and High-Quality Singing Voice Synthesis System based on Convolutional Neural Networks Kazuhiro Nakamura Shinji Takaki Kei Hashimoto Keiichiro Oura Yoshihiko Nankaku K. Tokuda 16 19 0 24 Oct 2019
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit Tomoki Hayashi Ryuichi Yamamoto Katsuki Inoue Takenori Yoshimura Shinji Watanabe T. Toda K. Takeda Yu Zhang Xu Tan VLM 29 202 0 24 Oct 2019
Sequence-to-sequence Singing Synthesis Using the Feed-forward Transformer Merlijn Blaauw J. Bonada 27 55 0 22 Oct 2019
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis Kundan Kumar Rithesh Kumar T. Boissière L. Gestin Wei Zhen Teoh Jose M. R. Sotelo A. D. Brébisson Yoshua Bengio Aaron Courville GAN 13 938 0 08 Oct 2019
High Fidelity Speech Synthesis with Adversarial Networks Mikolaj Binkowski Jeff Donahue Sander Dieleman Aidan Clark Erich Elsen Norman Casagrande Luis C. Cobo Karen Simonyan 241 239 0 25 Sep 2019
FlowSeq: Non-Autoregressive Conditional Sequence Generation with Generative Flow Xuezhe Ma Chunting Zhou Xian Li Graham Neubig Eduard H. Hovy AI4TS BDL 8 189 0 05 Sep 2019
Neural Harmonic-plus-Noise Waveform Model with Trainable Maximum Voice Frequency for Text-to-Speech Synthesis Xin Wang Junichi Yamagishi 14 31 0 27 Aug 2019
Normalizing Flows: An Introduction and Review of Current Methods I. Kobyzev S. Prince Marcus A. Brubaker TPM MedIm 19 57 0 25 Aug 2019
Survey on Deep Neural Networks in Speech and Vision Systems M. Alam Manar D. Samad Lasitha Vidyaratne Alexander M. Glandon Khan M. Iftekharuddin 3DV VLM AI4TS 34 205 0 16 Aug 2019
Hierarchical Sequence to Sequence Voice Conversion with Limited Data P. Narayanan Punarjay Chakravarty F. Charette G. Puskorius 23 3 0 15 Jul 2019
Speech bandwidth extension with WaveNet Archit Gupta Brendan Shillingford Yannis Assael Thomas C. Walters 21 28 0 05 Jul 2019
Neural Drum Machine : An Interactive System for Real-time Synthesis of Drum Sounds Cyran Aouameur P. Esling Gaëtan Hadjeres 16 21 0 04 Jul 2019
PointFlow: 3D Point Cloud Generation with Continuous Normalizing Flows Guandao Yang Xun Huang Zekun Hao Ming-Yu Liu Serge J. Belongie Bharath Hariharan 3DPC 40 658 0 28 Jun 2019
A Neural Vocoder with Hierarchical Generation of Amplitude and Phase Spectra for Statistical Parametric Speech Synthesis Yang Ai Zhenhua Ling 21 29 0 23 Jun 2019