WaveGlow: A Flow-based Generative Network for Speech Synthesis

31 October 2018

Papers citing "WaveGlow: A Flow-based Generative Network for Speech Synthesis"

50 / 525 papers shown

Title
Audio Dequantization for High Fidelity Audio Generation in Flow-based Neural Vocoder Hyun-Wook Yoon Sang-Hoon Lee Hyeong-Rae Noh Seong-Whan Lee 20 11 0 16 Aug 2020
Adaptation Algorithms for Neural Network-Based Speech Recognition: An Overview P. Bell Joachim Fainberg Ondřej Klejch Jinyu Li Steve Renals P. Swietojanski 46 74 0 14 Aug 2020
Prosody Learning Mechanism for Speech Synthesis System Without Text Length Limit Zhen Zeng Jianzong Wang Ning Cheng Jing Xiao 13 8 0 13 Aug 2020
Bunched LPCNet : Vocoder for Low-cost Neural Text-To-Speech Systems Ravichander Vipperla Sangjun Park Kihyun Choo Samin S. Ishtiaq Kyoungbo Min S. Bhattacharya Abhinav Mehrotra Alberto Gil C. P. Ramos Nicholas D. Lane 26 26 0 11 Aug 2020
Speaker Conditional WaveRNN: Towards Universal Neural Vocoder for Unseen Speaker and Recording Conditions D. Paul Yannis Pantazis Y. Stylianou DRL 13 29 0 09 Aug 2020
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning Berrak Sisman Junichi Yamagishi Simon King Haizhou Li BDL 41 318 0 09 Aug 2020
Unsupervised Cross-Domain Singing Voice Conversion Adam Polyak Lior Wolf Yossi Adi Yaniv Taigman 20 44 0 06 Aug 2020
HooliGAN: Robust, High Quality Neural Vocoding Ollie McCarthy Zo Ahmed 8 14 0 06 Aug 2020
PPSpeech: Phrase based Parallel End-to-End TTS System Yahuan Cong Ran Zhang Jian Luan 24 3 0 06 Aug 2020
Ultrasound-based Articulatory-to-Acoustic Mapping with WaveGlow Speech Synthesis Tamás Gábor Csapó Csaba Zainkó L. Tóth G. Gosztolya Alexandra Markó 6 31 0 06 Aug 2020
Recognition-Synthesis Based Non-Parallel Voice Conversion with Adversarial Learning Jing-Xuan Zhang Zhenhua Ling Lirong Dai 15 6 0 05 Aug 2020
FRMDN: Flow-based Recurrent Mixture Density Network S. Razavi Reshad Hosseini Tina Behzad BDL 16 0 0 05 Aug 2020
A Spectral Energy Distance for Parallel Speech Synthesis A. Gritsenko Tim Salimans Rianne van den Berg Jasper Snoek Nal Kalchbrenner 8 70 0 03 Aug 2020
Exploiting Deep Sentential Context for Expressive End-to-End Speech Synthesis Fengyu Yang Shan Yang Qinghua Wu Yujun Wang Lei Xie 39 5 0 03 Aug 2020
Solving inverse problems using conditional invertible neural networks G. A. Padmanabha N. Zabaras AI4CE 6 63 0 31 Jul 2020
Speaking Speed Control of End-to-End Speech Synthesis using Sentence-Level Conditioning Jaesung Bae Hanbin Bae Young-Sun Joo Junmo Lee Gyeong-Hoon Lee Hoon-Young Cho 4 17 0 30 Jul 2020
VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network Jinhyeok Yang Junmo Lee Young-Ik Kim Hoonyoung Cho Injung Kim 9 72 0 30 Jul 2020
Robust Front-End for Multi-Channel ASR using Flow-Based Density Estimation Xiaoyuan Yi Hyeonseung Lee Wenhao Li Hyung Yong Kim Nam Soo Kim 25 22 0 25 Jul 2020
A Transfer Learning End-to-End Arabic Text-To-Speech (TTS) Deep Architecture Fady K. Fahmy M. Khalil Hazem M. Abbas 41 20 0 22 Jul 2020
Quasi-Periodic WaveNet: An Autoregressive Raw Waveform Generative Model with Pitch-dependent Dilated Convolution Neural Network Yi-Chiao Wu Tomoki Hayashi Patrick Lumban Tobing Kazuhiro Kobayashi T. Toda 27 18 0 11 Jul 2020
Invertible Zero-Shot Recognition Flows Yuming Shen Jie Qin Lei Huang BDL AI4CE 19 100 0 09 Jul 2020
Articulatory-WaveNet: Autoregressive Model For Acoustic-to-Articulatory Inversion Narjes Bozorg Michael T. Johnson 11 1 0 22 Jun 2020
Denoising Diffusion Probabilistic Models Jonathan Ho Ajay Jain Pieter Abbeel DiffM 118 17,042 0 19 Jun 2020
Categorical Normalizing Flows via Continuous Transformations Phillip Lippe E. Gavves BDL 21 43 0 17 Jun 2020
Generative Modelling for Controllable Audio Synthesis of Expressive Piano Performance Hao Hao Tan Yin-Jyun Luo Dorien Herremans 12 7 0 16 Jun 2020
Why Normalizing Flows Fail to Detect Out-of-Distribution Data Polina Kirichenko Pavel Izmailov A. Wilson OODD 22 271 0 15 Jun 2020
SE-MelGAN -- Speaker Agnostic Rapid Speech Enhancement Luka Chkhetiani Levan Bejanidze 25 1 0 13 Jun 2020
FastPitch: Parallel Text-to-speech with Pitch Prediction Adrian Łańcucki 42 332 0 11 Jun 2020
NanoFlow: Scalable Normalizing Flows with Sublinear Parameter Complexity Sang-gil Lee Sungwon Kim Sungroh Yoon 19 17 0 11 Jun 2020
Deep generative models for musical audio synthesis M. Huzaifah L. Wyse 27 20 0 10 Jun 2020
SoftFlow: Probabilistic Framework for Normalizing Flow on Manifolds Hyeongju Kim Hyeonseung Lee Woohyun Kang Joun Yeop Lee N. Kim 3DPC 25 114 0 08 Jun 2020
WaveNODE: A Continuous Normalizing Flow for Speech Synthesis Hyeongju Kim Hyeongseung Lee Woohyun Kang Sung Jun Cheon Byoung Jin Choi N. Kim 14 12 0 08 Jun 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech Yi Ren Chenxu Hu Xu Tan Tao Qin Sheng Zhao Zhou Zhao Tie-Yan Liu 60 1,360 0 08 Jun 2020
End-to-End Adversarial Text-to-Speech Jeff Donahue Sander Dieleman Mikolaj Binkowski Erich Elsen Karen Simonyan 17 185 0 05 Jun 2020
Auto-decoding Graphs Sohil Shah V. Koltun GNN 34 4 0 04 Jun 2020
CSTNet: Contrastive Speech Translation Network for Self-Supervised Speech Representation Learning Sameer Khurana Antoine Laurent James R. Glass SSL 19 12 0 04 Jun 2020
Graphical Normalizing Flows Antoine Wehenkel Gilles Louppe TPM BDL 12 37 0 03 Jun 2020
A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning Sameer Khurana Antoine Laurent Wei-Ning Hsu J. Chorowski A. Lancucki R. Marxer James R. Glass SSL BDL 30 29 0 03 Jun 2020
A comparison of Vietnamese Statistical Parametric Speech Synthesis Systems Phan Huy Kinh V. Phung Anh-Tuan Dinh Quoc Bao Nguyen 17 1 0 26 May 2020
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search Jaehyeon Kim Sungwon Kim Jungil Kong Sungroh Yoon 54 475 0 22 May 2020
NAUTILUS: a Versatile Voice Cloning System Hieu-Thi Luong Junichi Yamagishi 28 51 0 22 May 2020
Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis Yusuke Yasuda Xin Wang Junichi Yamagishi AI4TS 19 31 0 20 May 2020
Quasi-Periodic Parallel WaveGAN Vocoder: A Non-autoregressive Pitch-dependent Dilated Convolution Model for Parametric Speech Generation Yi-Chiao Wu Tomoki Hayashi T. Okamoto Hisashi Kawai T. Toda 29 4 0 18 May 2020
MoBoAligner: a Neural Alignment Model for Non-autoregressive TTS with Monotonic Boundary Search Naihan Li Shujie Liu Yanqing Liu Sheng Zhao Ming Liu Ming Zhou 6 6 0 18 May 2020
Many-to-Many Voice Transformer Network Hirokazu Kameoka Wen-Chin Huang Kou Tanaka Takuhiro Kaneko Nobukatsu Hojo T. Toda ViT 30 30 0 18 May 2020
ConVoice: Real-Time Zero-Shot Voice Style Transfer with Convolutional Network Yurii Rebryk Stanislav Beliaev 11 8 0 15 May 2020
JDI-T: Jointly trained Duration Informed Transformer for Text-To-Speech without Explicit Alignment D. Lim Won Jang Gyeonghwan O Heayoung Park Bongwan Kim Jaesam Yoon 16 36 0 15 May 2020
WG-WaveNet: Real-Time High-Fidelity Speech Synthesis without GPU Po-Chun Hsu Hung-yi Lee 9 16 0 15 May 2020
Reverberation Modeling for Source-Filter-based Neural Vocoder Yang Ai Xin Wang Junichi Yamagishi Zhenhua Ling 20 3 0 15 May 2020
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis Rafael Valle Kevin J. Shih R. Prenger Bryan Catanzaro 21 119 0 12 May 2020