WaveGlow: A Flow-based Generative Network for Speech Synthesis

31 October 2018
R. Prenger
Rafael Valle
Bryan Catanzaro

Papers citing "WaveGlow: A Flow-based Generative Network for Speech Synthesis"

50 / 525 papers shown
iFlow: Numerically Invertible Flows for Efficient Lossless Compression via a Uniform Coder
Shifeng Zhang
Ning Kang
Tom Ryder
Zhenguo Li
27
30
0
01 Nov 2021
RefineGAN: Universally Generating Waveform Better than Ground Truth with Highly Accurate Pitch and Intensity Responses
Shengyuan Xu
Wenxiao Zhao
Jing Guo
24
12
0
01 Nov 2021
Uncertainty quantification for ptychography using normalizing flows
Agnimitra Dasgupta
Z. Di
AI4CE
36
5
0
01 Nov 2021
TorchAudio: Building Blocks for Audio and Speech Processing
Yao-Yuan Yang
Moto Hira
Zhaoheng Ni
Anjali Chourdia
Artyom Astafurov
...
Sean Narenthiran
Shinji Watanabe
Soumith Chintala
Vincent Quenneville-Bélair
Yangyang Shi
31
165
0
28 Oct 2021
Chunked Autoregressive GAN for Conditional Waveform Synthesis
Max Morrison
Rithesh Kumar
Kundan Kumar
Prem Seetharaman
Aaron Courville
Yoshua Bengio
GAN
41
69
0
19 Oct 2021
KaraTuner: Towards end to end natural pitch correction for singing voice in karaoke
Xiaobin Zhuang
Huiran Yu
Weifeng Zhao
Tao Jiang
Peng Hu
32
5
0
18 Oct 2021
Neural Dubber: Dubbing for Videos According to Scripts
Chenxu Hu
Qiao Tian
Tingle Li
Yuping Wang
Yuxuan Wang
Hang Zhao
DiffM
VGen
36
39
0
15 Oct 2021
SingGAN: Generative Adversarial Network For High-Fidelity Singing Voice Generation
Rongjie Huang
Chenye Cui
Feiyang Chen
Yi Ren
Jinglin Liu
Zhou Zhao
Baoxing Huai
N. Yuan
GAN
107
62
0
14 Oct 2021
A Melody-Unsupervision Model for Singing Voice Synthesis
Soonbeom Choi
Juhan Nam
29
14
0
13 Oct 2021
DeepA: A Deep Neural Analyzer For Speech And Singing Vocoding
Sergey Nikonorov
Berrak Sisman
Mingyang Zhang
Haizhou Li
23
2
0
13 Oct 2021
Fine-grained style control in Transformer-based Text-to-speech Synthesis
Li-Wei Chen
Alexander I. Rudnicky
88
29
0
12 Oct 2021
Adapting TTS models For New Speakers using Transfer Learning
Paarth Neekhara
Jason Chun Lok Li
Boris Ginsburg
38
15
0
12 Oct 2021
Voice Reenactment with F0 and timing constraints and adversarial learning of conversions
F. Bous
L. Benaroya
Nicolas Obin
Axel Roebel
14
2
0
07 Oct 2021
Towards Universal Neural Vocoding with a Multi-band Excited WaveNet
Axel Roebel
F. Bous
29
2
0
07 Oct 2021
Automated Testing of AI Models
Swagatam Haldar
Deepak Vijaykeerthy
Diptikalyan Saha
VLM
21
0
0
07 Oct 2021
Style Equalization: Unsupervised Learning of Controllable Generative Sequence Models
Jen-Hao Rick Chang
A. Shrivastava
H. Koppula
Xiaoshuai Zhang
Oncel Tuzel
DiffM
51
16
0
06 Oct 2021
GANtron: Emotional Speech Synthesis with Generative Adversarial Networks
E. Hortal
Rodrigo Brechard Alarcia
GAN
26
2
0
06 Oct 2021
Neural Pitch-Shifting and Time-Stretching with Controllable LPCNet
Max Morrison
Zeyu Jin
Nicholas J. Bryan
Juan-Pablo Caceres
Bryan Pardo
30
14
0
05 Oct 2021
On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis
Cheng-I Jeff Lai
Erica Cooper
Yang Zhang
Shiyu Chang
Kaizhi Qian
...
Yung-Sung Chuang
Alexander H. Liu
Junichi Yamagishi
David D. Cox
James R. Glass
26
6
0
04 Oct 2021
PortaSpeech: Portable and High-Quality Generative Text-to-Speech
Yi Ren
Jinglin Liu
Zhou Zhao
47
78
0
30 Sep 2021
VoiceFixer: Toward General Speech Restoration with Neural Vocoder
Haohe Liu
Qiuqiang Kong
Qiao Tian
Yan Zhao
DeLiang Wang
Chuanzeng Huang
Yuxuan Wang
33
57
0
28 Sep 2021
MSR-NV: Neural Vocoder Using Multiple Sampling Rates
Kentaro Mitsui
Kei Sawada
20
0
0
28 Sep 2021
FlowVocoder: A small Footprint Neural Vocoder based Normalizing flow for Speech Synthesis
Manh Luong
Viet-Anh Tran
11
2
0
27 Sep 2021
Low-Latency Incremental Text-to-Speech Synthesis with Distilled Context Prediction Network
Takaaki Saeki
Shinnosuke Takamichi
Hiroshi Saruwatari
34
3
0
22 Sep 2021
On-device neural speech synthesis
Sivanand Achanta
Albert Antony
L. Golipour
Jiangchuan Li
T. Raitio
...
Francesco Rossi
Jennifer Shi
Jaimin Upadhyay
David Winarsky
Hepeng Zhang
35
17
0
17 Sep 2021
fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit
Changhan Wang
Wei-Ning Hsu
Yossi Adi
Adam Polyak
Ann Lee
Peng-Jen Chen
Jiatao Gu
J. Pino
VLM
69
32
0
14 Sep 2021
Neural HMMs are all you need (for high-quality attention-free TTS)
Shivam Mehta
Éva Székely
Jonas Beskow
G. Henter
40
18
0
30 Aug 2021
Integrated Speech and Gesture Synthesis
Siyang Wang
Simon Alexanderson
Joakim Gustafson
Jonas Beskow
G. Henter
Éva Székely
37
19
0
25 Aug 2021
Multimodal analysis of the predictability of hand-gesture properties
Taras Kucherenko
Rajmund Nagy
Michael Neff
Hedvig Kjellström
G. Henter
34
22
0
12 Aug 2021
RW-Resnet: A Novel Speech Anti-Spoofing Model Using Raw Waveform
Youxuan Ma
Zongze Ren
Shugong Xu
38
39
0
12 Aug 2021
StarGAN-VC+ASR: StarGAN-based Non-Parallel Voice Conversion Regularized by Automatic Speech Recognition
Shoki Sakamoto
Akira Taniguchi
T. Taniguchi
Hirokazu Kameoka
BDL
31
5
0
10 Aug 2021
AnyoneNet: Synchronized Speech and Talking Head Generation for Arbitrary Person
Xinsheng Wang
Qicong Xie
Jihua Zhu
Lei Xie
O. Scharenborg
31
16
0
09 Aug 2021
A Streamwise GAN Vocoder for Wideband Speech Coding at Very Low Bit Rate
Ahmed Mustafa
Jan Büthe
Srikanth Korse
Kishan Gupta
Guillaume Fuchs
N. Pia
21
18
0
09 Aug 2021
An Empirical Study on End-to-End Singing Voice Synthesis with Encoder-Decoder Architectures
Dengfeng Ke
Yuxing Lu
Xudong Liu
Yanyan Xu
Jing Sun
Cheng-Hao Cai
30
0
0
06 Aug 2021
A Benchmarking Initiative for Audio-Domain Music Generation Using the Freesound Loop Dataset
Tun-Min Hung
Bo-Yu Chen
Yen-Tung Yeh
Yi-Hsuan Yang
18
12
0
03 Aug 2021
Creation and Detection of German Voice Deepfakes
Vanessa Barnekow
Dominik Binder
Niclas Kromrey
Pascal Munaretto
A. Schaad
Felix Schmieder
21
2
0
02 Aug 2021
A Survey on Audio Synthesis and Audio-Visual Multimodal Processing
Zhaofeng Shi
26
7
0
01 Aug 2021
Sequence-to-Sequence Voice Reconstruction for Silent Speech in a Tonal Language
Huiyan Li
Haohong Lin
You Wang
Hengyang Wang
Ming Zhang
Han Gao
Qing Ai
Zhiyuan Luo
Guang Li
31
12
0
31 Jul 2021
Beyond Voice Identity Conversion: Manipulating Voice Attributes by Adversarial Learning of Structured Disentangled Representations
L. Benaroya
Nicolas Obin
Axel Roebel
16
5
0
26 Jul 2021
Adaptation of Tacotron2-based Text-To-Speech for Articulatory-to-Acoustic Mapping using Ultrasound Tongue Imaging
Csaba Zainkó
L. Tóth
Amin Honarmandi Shandiz
G. Gosztolya
Alexandra Markó
Géza Németh
Tamás Gábor Csapó
39
4
0
26 Jul 2021
Approximation Theory of Convolutional Architectures for Time Series Modelling
Haotian Jiang
Zhong Li
Qianxiao Li
AI4TS
19
11
0
20 Jul 2021
PU-Flow: a Point Cloud Upsampling Network with Normalizing Flows
Aihua Mao
Zihui Du
Junhui Hou
Yaqi Duan
Yong-jin Liu
Ying He
3DPC
37
35
0
13 Jul 2021
Extending Text-to-Speech Synthesis with Articulatory Movement Prediction using Ultrasound Tongue Imaging
Tamás Gábor Csapó
16
2
0
12 Jul 2021
Neural Waveshaping Synthesis
B. Hayes
C. Saitis
Gyorgy Fazekas
36
28
0
11 Jul 2021
EditSpeech: A Text Based Speech Editing System Using Partial Inference and Bidirectional Fusion
Daxin Tan
Liqun Deng
Y. Yeung
Xin Jiang
Xiao Chen
Tan Lee
29
38
0
04 Jul 2021
Supervised Contrastive Learning for Accented Speech Recognition
Tao Han
Hantao Huang
Ziang Yang
Wei Han
49
15
0
02 Jul 2021
Normalizing Flow based Hidden Markov Models for Classification of Speech Phones with Explainability
Anubhab Ghosh
Antoine Honoré
Dong Liu
G. Henter
S. Chatterjee
16
5
0
01 Jul 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
18
352
0
29 Jun 2021
Transflower: probabilistic autoregressive dance generation with multimodal attention
Guillermo Valle Pérez
G. Henter
Jonas Beskow
A. Holzapfel
Pierre-Yves Oudeyer
Simon Alexanderson
30
42
0
25 Jun 2021
Basis-MelGAN: Efficient Neural Vocoder Based on Audio Decomposition
Zhengxi Liu
Y. Qian
DRL
19
10
0
25 Jun 2021