WaveGlow: A Flow-based Generative Network for Speech Synthesis

31 October 2018

Papers citing "WaveGlow: A Flow-based Generative Network for Speech Synthesis"

50 / 525 papers shown

Title
Distilling the Knowledge from Conditional Normalizing Flows Dmitry Baranchuk Vladimir Aliev Artem Babenko BDL 36 2 0 24 Jun 2021
UniTTS: Residual Learning of Unified Embedding Space for Speech Style Control M. Kang Sungjae Kim Injung Kim 26 3 0 21 Jun 2021
Deep Generative Learning via Schrödinger Bridge Gefei Wang Yuling Jiao Qiang Xu Yang Wang Can Yang DiffM OT 23 92 0 19 Jun 2021
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis Nanxin Chen Yu Zhang Heiga Zen Ron J. Weiss Mohammad Norouzi Najim Dehak William Chan DiffM 23 88 0 17 Jun 2021
EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model Chenye Cui Yi Ren Jinglin Liu Feiyang Chen Rongjie Huang Ming Lei Zhou Zhao 24 35 0 17 Jun 2021
A Flow-Based Neural Network for Time Domain Speech Enhancement Martin Strauss B. Edler 23 33 0 16 Jun 2021
Improving the expressiveness of neural vocoding with non-affine Normalizing Flows Adam Gabryś Yunlong Jiao V. Klimkov Daniel Korzekwa Roberto Barra-Chicote 13 1 0 16 Jun 2021
WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution Kexun Zhang Yi Ren Changliang Xu Zhou Zhao 48 29 0 16 Jun 2021
UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation Won Jang D. Lim Jaesam Yoon Bongwan Kim Juntae Kim 38 125 0 15 Jun 2021
CRASH: Raw Audio Score-based Generative Modeling for Controllable High-resolution Drum Sound Synthesis Simon Rouard Gaëtan Hadjeres DiffM 27 42 0 14 Jun 2021
Continuous Wavelet Vocoder-based Decomposition of Parametric Speech Waveform Synthesis M. S. Al-Radhi Tamás Gábor Csapó Csaba Zainkó Géza Németh 11 3 0 12 Jun 2021
Catch-A-Waveform: Learning to Generate Audio from a Single Short Example Gal Greshler Tamar Rott Shaham T. Michaeli 18 25 0 11 Jun 2021
PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Dependent Adaptive Prior Sang-gil Lee Heeseung Kim Chaehun Shin Xu Tan Chang-Shu Liu Qi Meng Tao Qin Wei Chen Sung-Hoon Yoon Tie-Yan Liu DiffM 29 81 0 11 Jun 2021
Sprachsynthese -- State-of-the-Art in englischer und deutscher Sprache René Peinl 24 0 0 11 Jun 2021
Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech Jaehyeon Kim Jungil Kong Juhee Son DRL 89 847 0 11 Jun 2021
Neural Speaker Embeddings for Ultrasound-based Silent Speech Interfaces Amin Honarmandi Shandiz L. Tóth G. Gosztolya Alexandra Markó Tamás Gábor Csapó 26 6 0 08 Jun 2021
Fre-GAN: Adversarial Frequency-consistent Audio Synthesis Ji-Hoon Kim Sang-Hoon Lee Ji-Hyun Lee Seong-Whan Lee 24 53 0 04 Jun 2021
An objective evaluation of the effects of recording conditions and speaker characteristics in multi-speaker deep neural speech synthesis Beáta Lőrincz Adriana Stan M. Giurgiu 13 2 0 03 Jun 2021
NVC-Net: End-to-End Adversarial Voice Conversion Bac Nguyen Cong Fabien Cardinaux AAML 37 41 0 02 Jun 2021
StarGAN-ZSVC: Towards Zero-Shot Voice Conversion in Low-Resource Contexts Matthew Baas Herman Kamper 30 6 0 31 May 2021
Voice Activity Detection for Ultrasound-based Silent Speech Interfaces using Convolutional Neural Networks Amin Honarmandi Shandiz L. Tóth 25 6 0 28 May 2021
High-Fidelity and Low-Latency Universal Neural Vocoder based on Multiband WaveRNN with Data-Driven Linear Prediction for Discrete Waveform Modeling Patrick Lumban Tobing T. Toda 49 8 0 20 May 2021
Parallel and Flexible Sampling from Autoregressive Models via Langevin Dynamics V. Jayaram John Thickstun DiffM 28 23 0 17 May 2021
ItôTTS and ItôWave: Linear Stochastic Differential Equation Is All You Need For Audio Generation Shoule Wu Ziqiang Shi DiffM 24 11 0 17 May 2021
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech Vadim Popov Ivan Vovk Vladimir Gogoryan Tasnima Sadekova Mikhail Kudinov DiffM 61 515 0 13 May 2021
MetaKernel: Learning Variational Random Features with Limited Labels Yingjun Du Haoliang Sun Xiantong Zhen Jun Xu Yilong Yin Ling Shao Cees G. M. Snoek VLM BDL 15 5 0 08 May 2021
Conditional Invertible Neural Networks for Diverse Image-to-Image Translation Lynton Ardizzone Jakob Kruse Carsten T. Lüth Niels Bracher Carsten Rother Ullrich Köthe 21 31 0 05 May 2021
Text-to-Speech Synthesis Techniques for MIDI-to-Audio Synthesis Erica Cooper Xin Wang Junichi Yamagishi 33 6 0 25 Apr 2021
Improving Neural Silent Speech Interface Models by Adversarial Training Amin Honarmandi Shandiz L. Tóth G. Gosztolya Alexandra Markó Tamás Gábor Csapó AAML GAN 24 7 0 23 Apr 2021
Reconstructing Speech from Real-Time Articulatory MRI Using Neural Vocoders Yicong Yu Amin Honarmandi Shandiz L. Tóth 22 18 0 23 Apr 2021
VideoGPT: Video Generation using VQ-VAE and Transformers Wilson Yan Yunzhi Zhang Pieter Abbeel A. Srinivas ViT VGen 245 484 0 20 Apr 2021
Review of end-to-end speech synthesis technology based on deep learning Zhaoxi Mu Xinyu Yang Yizhuo Dong AuLLM ALM 26 24 0 20 Apr 2021
TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction Stanislav Beliaev Boris Ginsburg 27 8 0 16 Apr 2021
Unified Source-Filter GAN: Unified Source-filter Network Based On Factorization of Quasi-Periodic Parallel WaveGAN Reo Yoneyama Yi-Chiao Wu T. Toda 14 12 0 10 Apr 2021
The AS-NU System for the M2VoC Challenge Cheng-Hung Hu Yi-Chiao Wu Wen-Chin Huang Yu-Huai Peng Yu-Wen Chen Pin-Jui Ku T. Toda Yu Tsao Hsin-Min Wang 19 1 0 07 Apr 2021
NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling Junhyeok Lee Seungu Han DiffM 29 67 0 06 Apr 2021
Diff-TTS: A Denoising Diffusion Model for Text-to-Speech Myeonghun Jeong Hyeongju Kim Sung Jun Cheon Byoung Jin Choi N. Kim DiffM 25 191 0 03 Apr 2021
Multi-rate attention architecture for fast streamable Text-to-speech spectrum modeling Qing He Zhiping Xiu T. Koehler Jilong Wu 8 7 0 01 Apr 2021
Speech Resynthesis from Discrete Disentangled Self-Supervised Representations Adam Polyak Yossi Adi Jade Copet Eugene Kharitonov Kushal Lakhotia Wei-Ning Hsu Abdel-rahman Mohamed Emmanuel Dupoux 29 306 0 01 Apr 2021
CycleDRUMS: Automatic Drum Arrangement For Bass Lines Using CycleGAN Giorgio Barnabò Giovanni Trappolini L. Lastilla Cesare Campagnano Angela Fan Fabio Petroni Fabrizio Silvestri 18 4 0 01 Apr 2021
ArtFlow: Unbiased Image Style Transfer via Reversible Neural Flows Jie An Siyu Huang Yibing Song Dejing Dou Wei Liu Jiebo Luo 30 190 0 31 Mar 2021
iVPF: Numerical Invertible Volume Preserving Flow for Efficient Lossless Compression Shifeng Zhang Chen Zhang Ning Kang Zhenguo Li 33 37 0 30 Mar 2021
Improve GAN-based Neural Vocoder using Pointwise Relativistic LeastSquare GAN Cong Wang Yu Chen Bin Wang Yi Shi 35 1 0 26 Mar 2021
Out-of-Distribution Detection of Melanoma using Normalizing Flows M. Valiuddin C.G.A. Viviers OODD 24 0 0 23 Mar 2021
STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech Keon Lee Kyumin Park Daeyoung Kim 24 30 0 17 Mar 2021
GAN Vocoder: Multi-Resolution Discriminator Is All You Need J. You Dalhyun Kim Gyuhyeon Nam Geumbyeol Hwang Gyeongsu Chae 21 27 0 09 Mar 2021
Deep Generative Modelling: A Comparative Review of VAEs, GANs, Normalizing Flows, Energy-Based and Autoregressive Models Sam Bond-Taylor Adam Leach Yang Long Chris G. Willcocks VLM TPM 41 483 0 08 Mar 2021
CUHK-EE Voice Cloning System for ICASSP 2021 M2VoC Challenge Daxin Tan Hingpang Huang Guangyan Zhang Tan Lee 19 6 0 08 Mar 2021
A Spectral Enabled GAN for Time Series Data Generation Kaleb E. Smith Anthony O. Smith GAN 30 12 0 02 Mar 2021
MaskCycleGAN-VC: Learning Non-parallel Voice Conversion with Filling in Frames Takuhiro Kaneko Hirokazu Kameoka Kou Tanaka Nobukatsu Hojo 38 57 0 25 Feb 2021