WaveGlow: A Flow-based Generative Network for Speech Synthesis


31 October 2018
R. Prenger
Rafael Valle
Bryan Catanzaro

Papers citing "WaveGlow: A Flow-based Generative Network for Speech Synthesis"

50 / 525 papers shown
FeatherWave: An efficient high-fidelity neural vocoder with multi-band linear prediction
Qiao Tian
Zewang Zhang
Heng Lu
Linghui Chen
Shan Liu
16
22
0
12 May 2020
DiscreTalk: Text-to-Speech as a Machine Translation Problem
Tomoki Hayashi
Shinji Watanabe
27
32
0
12 May 2020
Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech
Geng Yang
Shan Yang
Kai-Chun Liu
Peng Fang
Wei Chen
Lei Xie
64
198
0
11 May 2020
GACELA -- A generative adversarial context encoder for long audio inpainting
Andrés Marafioti
P. Majdak
Nicki Holighaus
Nathanael Perraudin
35
43
0
11 May 2020
Jukebox: A Generative Model for Music
Prafulla Dhariwal
Heewoo Jun
Christine Payne
Jong Wook Kim
Alec Radford
Ilya Sutskever
VLM
52
722
0
30 Apr 2020
Adversarial Feature Learning and Unsupervised Clustering based Speech Synthesis for Found Data with Acoustic and Textual Noise
Shan Yang
Yuxuan Wang
Lei Xie
14
9
0
28 Apr 2020
ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders
Yu Gu
Xiang Yin
Yonghui Rao
Yuan Wan
Benlai Tang
Yang Zhang
Jitong Chen
Yuxuan Wang
Zejun Ma
17
70
0
23 Apr 2020
A Study of Non-autoregressive Model for Sequence Generation
Yi Ren
Jinglin Liu
Xu Tan
Zhou Zhao
Sheng Zhao
Tie-Yan Liu
15
60
0
22 Apr 2020
Data Processing for Optimizing Naturalness of Vietnamese Text-to-speech System
V. Phung
Phan Huy Kinh
Anh-Tuan Dinh
Quoc Bao Nguyen
25
5
0
20 Apr 2020
ViSQOL v3: An Open Source Production Ready Objective Speech and Audio Metric
Michael Chinen
Felicia S. C. Lim
Jan Skoglund
Nikita Gureev
F. O'Gorman
Andrew Hines
8
132
0
20 Apr 2020
Knowledge-and-Data-Driven Amplitude Spectrum Prediction for Hierarchical Neural Vocoders
Yang Ai
Zhenhua Ling
11
8
0
16 Apr 2020
Speech Quality Factors for Traditional and Neural-Based Low Bit Rate Vocoders
Wissam A. Jassim
Jan Skoglund
Michael Chinen
Andrew Hines
14
8
0
26 Mar 2020
Unsupervised Style and Content Separation by Minimizing Mutual Information for Speech Synthesis
Ting-Yao Hu
A. Shrivastava
Oncel Tuzel
C. Dhir
11
30
0
09 Mar 2020
AlignTTS: Efficient Feed-Forward Text-to-Speech System without Explicit Alignment
Zhen Zeng
Jianzong Wang
Ning Cheng
Tian Xia
Jing Xiao
VLM
30
56
0
04 Mar 2020
Gradient Boosted Normalizing Flows
Robert Giaquinto
A. Banerjee
BDL
DRL
4
1
0
27 Feb 2020
VFlow: More Expressive Generative Flows with Variational Data Augmentation
Jianfei Chen
Cheng Lu
Biqi Chenli
Jun Zhu
Tian Tian
DRL
16
63
0
22 Feb 2020
Vocoder-free End-to-End Voice Conversion with Transformer Network
June-Woo Kim
H. Jung
Minho Lee
30
4
0
05 Feb 2020
SqueezeWave: Extremely Lightweight Vocoders for On-device Speech Synthesis
Bohan Zhai
Tianren Gao
Flora Xue
D. Rothchild
Bichen Wu
Joseph E. Gonzalez
Kurt Keutzer
21
27
0
16 Jan 2020
Neural ODEs for Image Segmentation with Level Sets
Rafael Valle
F. Reda
M. Shoeybi
P. LeGresley
Andrew Tao
Bryan Catanzaro
17
8
0
25 Dec 2019
Probing the phonetic and phonological knowledge of tones in Mandarin TTS models
Jian Zhu
18
8
0
23 Dec 2019
C-Flow: Conditional Generative Flow Models for Images and 3D Point Clouds
Albert Pumarola
S. Popov
Francesc Moreno-Noguer
V. Ferrari
3DPC
AI4CE
31
80
0
15 Dec 2019
Normalizing Flows for Probabilistic Modeling and Inference
George Papamakarios
Eric T. Nalisnick
Danilo Jimenez Rezende
S. Mohamed
Balaji Lakshminarayanan
TPM
AI4CE
57
1,631
0
05 Dec 2019
Towards Robust Neural Vocoding for Speech Generation: A Survey
Po-Chun Hsu
Chun-hsuan Wang
Andy T. Liu
Hung-yi Lee
OOD
15
24
0
05 Dec 2019
WaveFlow: A Compact Flow-based Model for Raw Audio
Wei Ping
Kainan Peng
Kexin Zhao
Z. Song
17
116
0
03 Dec 2019
High-quality Speech Synthesis Using Super-resolution Mel-Spectrogram
Leyuan Sheng
Dong-Yan Huang
Evgeny Nikolaevich Pavlovskiy
14
15
0
03 Dec 2019
Dynamic Prosody Generation for Speech Synthesis using Linguistics-Driven Acoustic Embedding Selection
Shubhi Tyagi
M. Nicolis
Jonas Rohnke
Thomas Drugman
Jaime Lorenzo-Trueba
32
32
0
02 Dec 2019
SchrödingeRNN: Generative Modeling of Raw Audio as a Continuously Observed Quantum State
Beñat Mencia Uranga
A. Lamacraft
28
3
0
26 Nov 2019
Invertible DNN-based nonlinear time-frequency transform for speech enhancement
Daiki Takeuchi
Kohei Yatabe
Yuma Koizumi
Yasuhiro Oikawa
N. Harada
30
10
0
25 Nov 2019
Deep Long Audio Inpainting
Ya-Liang Chang
Kuan-Ying Lee
Po-Yu Wu
Hung-yi Lee
Winston H. Hsu
30
33
0
15 Nov 2019
Feedback Recurrent AutoEncoder
Yang Yang
Guillaume Sautière
J. Jon Ryu
Taco S. Cohen
43
21
0
11 Nov 2019
Incremental Text-to-Speech Synthesis with Prefix-to-Prefix Framework
Mingbo Ma
Baigong Zheng
Kaibo Liu
Renjie Zheng
Hairong Liu
Kainan Peng
Kenneth Church
Liang Huang
17
29
0
07 Nov 2019
On Investigation of Unsupervised Speech Factorization Based on Normalization Flow
Haoran Sun
Yunqi Cai
Lantian Li
Dong Wang
21
1
0
29 Oct 2019
Neural Density Estimation and Likelihood-free Inference
George Papamakarios
BDL
DRL
24
44
0
29 Oct 2019
Transferring neural speech waveform synthesizers to musical instrument sounds generation
Yi Zhao
Xin Wang
Lauri Juvela
Junichi Yamagishi
24
16
0
27 Oct 2019
Mellotron: Multispeaker expressive voice synthesis by conditioning on rhythm, pitch and global style tokens
Rafael Valle
Jason Chun Lok Li
R. Prenger
Bryan Catanzaro
16
148
0
26 Oct 2019
Learning audio representations via phase prediction
Félix de Chaumont Quitry
Marco Tagliasacchi
Dominik Roblek
SSL
AI4TS
11
10
0
25 Oct 2019
Fast and High-Quality Singing Voice Synthesis System based on Convolutional Neural Networks
Kazuhiro Nakamura
Shinji Takaki
Kei Hashimoto
Keiichiro Oura
Yoshihiko Nankaku
K. Tokuda
16
19
0
24 Oct 2019
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit
Tomoki Hayashi
Ryuichi Yamamoto
Katsuki Inoue
Takenori Yoshimura
Shinji Watanabe
T. Toda
K. Takeda
Yu Zhang
Xu Tan
VLM
29
202
0
24 Oct 2019
Sequence-to-sequence Singing Synthesis Using the Feed-forward Transformer
Merlijn Blaauw
J. Bonada
27
55
0
22 Oct 2019
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
Kundan Kumar
Rithesh Kumar
T. Boissière
L. Gestin
Wei Zhen Teoh
Jose M. R. Sotelo
A. D. Brébisson
Yoshua Bengio
Aaron Courville
GAN
13
938
0
08 Oct 2019
High Fidelity Speech Synthesis with Adversarial Networks
Mikolaj Binkowski
Jeff Donahue
Sander Dieleman
Aidan Clark
Erich Elsen
Norman Casagrande
Luis C. Cobo
Karen Simonyan
241
239
0
25 Sep 2019
FlowSeq: Non-Autoregressive Conditional Sequence Generation with Generative Flow
Xuezhe Ma
Chunting Zhou
Xian Li
Graham Neubig
Eduard H. Hovy
AI4TS
BDL
8
189
0
05 Sep 2019
Neural Harmonic-plus-Noise Waveform Model with Trainable Maximum Voice Frequency for Text-to-Speech Synthesis
Xin Wang
Junichi Yamagishi
14
31
0
27 Aug 2019
Normalizing Flows: An Introduction and Review of Current Methods
I. Kobyzev
S. Prince
Marcus A. Brubaker
TPM
MedIm
19
57
0
25 Aug 2019
Survey on Deep Neural Networks in Speech and Vision Systems
M. Alam
Manar D. Samad
Lasitha Vidyaratne
Alexander M. Glandon
Khan M. Iftekharuddin
3DV
VLM
AI4TS
34
205
0
16 Aug 2019
Hierarchical Sequence to Sequence Voice Conversion with Limited Data
P. Narayanan
Punarjay Chakravarty
F. Charette
G. Puskorius
23
3
0
15 Jul 2019
Speech bandwidth extension with WaveNet
Archit Gupta
Brendan Shillingford
Yannis Assael
Thomas C. Walters
21
28
0
05 Jul 2019
Neural Drum Machine : An Interactive System for Real-time Synthesis of Drum Sounds
Cyran Aouameur
P. Esling
Gaëtan Hadjeres
16
21
0
04 Jul 2019
PointFlow: 3D Point Cloud Generation with Continuous Normalizing Flows
Guandao Yang
Xun Huang
Jinwei Gu
Ming Liu
Serge J. Belongie
Bharath Hariharan
3DPC
40
658
0
28 Jun 2019
A Neural Vocoder with Hierarchical Generation of Amplitude and Phase Spectra for Statistical Parametric Speech Synthesis
Yang Ai
Zhenhua Ling
21
29
0
23 Jun 2019