WaveGlow: A Flow-based Generative Network for Speech Synthesis

31 October 2018

Papers citing "WaveGlow: A Flow-based Generative Network for Speech Synthesis"

50 / 525 papers shown

iFlow: Numerically Invertible Flows for Efficient Lossless Compression via a Uniform Coder Shifeng Zhang Ning Kang Tom Ryder Zhenguo Li 27 30 0 01 Nov 2021
RefineGAN: Universally Generating Waveform Better than Ground Truth with Highly Accurate Pitch and Intensity Responses Shengyuan Xu Wenxiao Zhao Jing Guo 24 12 0 01 Nov 2021
Uncertainty quantification for ptychography using normalizing flows Agnimitra Dasgupta Z. Di AI4CE 36 5 0 01 Nov 2021
TorchAudio: Building Blocks for Audio and Speech Processing Yao-Yuan Yang Moto Hira Zhaoheng Ni Anjali Chourdia Artyom Astafurov ... Sean Narenthiran Shinji Watanabe Soumith Chintala Vincent Quenneville-Bélair Yangyang Shi 31 165 0 28 Oct 2021
Chunked Autoregressive GAN for Conditional Waveform Synthesis Max Morrison Rithesh Kumar Kundan Kumar Prem Seetharaman Aaron Courville Yoshua Bengio GAN 41 69 0 19 Oct 2021
KaraTuner: Towards end to end natural pitch correction for singing voice in karaoke Xiaobin Zhuang Huiran Yu Weifeng Zhao Tao Jiang Peng Hu 32 5 0 18 Oct 2021
Neural Dubber: Dubbing for Videos According to Scripts Chenxu Hu Qiao Tian Tingle Li Yuping Wang Yuxuan Wang Hang Zhao DiffM VGen 36 39 0 15 Oct 2021
SingGAN: Generative Adversarial Network For High-Fidelity Singing Voice Generation Rongjie Huang Chenye Cui Feiyang Chen Yi Ren Jinglin Liu Zhou Zhao Baoxing Huai N. Yuan GAN 107 62 0 14 Oct 2021
A Melody-Unsupervision Model for Singing Voice Synthesis Soonbeom Choi Juhan Nam 29 14 0 13 Oct 2021
DeepA: A Deep Neural Analyzer For Speech And Singing Vocoding Sergey Nikonorov Berrak Sisman Mingyang Zhang Haizhou Li 23 2 0 13 Oct 2021
Fine-grained style control in Transformer-based Text-to-speech Synthesis Li-Wei Chen Alexander I. Rudnicky 88 29 0 12 Oct 2021
Adapting TTS models For New Speakers using Transfer Learning Paarth Neekhara Jason Chun Lok Li Boris Ginsburg 38 15 0 12 Oct 2021
Voice Reenactment with F0 and timing constraints and adversarial learning of conversions F. Bous L. Benaroya Nicolas Obin Axel Roebel 14 2 0 07 Oct 2021
Towards Universal Neural Vocoding with a Multi-band Excited WaveNet Axel Roebel F. Bous 29 2 0 07 Oct 2021
Automated Testing of AI Models Swagatam Haldar Deepak Vijaykeerthy Diptikalyan Saha VLM 21 0 0 07 Oct 2021
Style Equalization: Unsupervised Learning of Controllable Generative Sequence Models Jen-Hao Rick Chang A. Shrivastava H. Koppula Xiaoshuai Zhang Oncel Tuzel DiffM 51 16 0 06 Oct 2021
GANtron: Emotional Speech Synthesis with Generative Adversarial Networks E. Hortal Rodrigo Brechard Alarcia GAN 26 2 0 06 Oct 2021
Neural Pitch-Shifting and Time-Stretching with Controllable LPCNet Max Morrison Zeyu Jin Nicholas J. Bryan Juan-Pablo Caceres Bryan Pardo 30 14 0 05 Oct 2021
On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis Cheng-I Jeff Lai Erica Cooper Yang Zhang Shiyu Chang Kaizhi Qian ... Yung-Sung Chuang Alexander H. Liu Junichi Yamagishi David D. Cox James R. Glass 26 6 0 04 Oct 2021
PortaSpeech: Portable and High-Quality Generative Text-to-Speech Yi Ren Jinglin Liu Zhou Zhao 47 78 0 30 Sep 2021
VoiceFixer: Toward General Speech Restoration with Neural Vocoder Haohe Liu Qiuqiang Kong Qiao Tian Yan Zhao DeLiang Wang Chuanzeng Huang Yuxuan Wang 33 57 0 28 Sep 2021
MSR-NV: Neural Vocoder Using Multiple Sampling Rates Kentaro Mitsui Kei Sawada 20 0 0 28 Sep 2021
FlowVocoder: A small Footprint Neural Vocoder based Normalizing flow for Speech Synthesis Manh Luong Viet-Anh Tran 11 2 0 27 Sep 2021
Low-Latency Incremental Text-to-Speech Synthesis with Distilled Context Prediction Network Takaaki Saeki Shinnosuke Takamichi Hiroshi Saruwatari 34 3 0 22 Sep 2021
On-device neural speech synthesis Sivanand Achanta Albert Antony L. Golipour Jiangchuan Li T. Raitio ... Francesco Rossi Jennifer Shi Jaimin Upadhyay David Winarsky Hepeng Zhang 35 17 0 17 Sep 2021
fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit Changhan Wang Wei-Ning Hsu Yossi Adi Adam Polyak Ann Lee Peng-Jen Chen Jiatao Gu J. Pino VLM 69 32 0 14 Sep 2021
Neural HMMs are all you need (for high-quality attention-free TTS) Shivam Mehta Éva Székely Jonas Beskow G. Henter 40 18 0 30 Aug 2021
Integrated Speech and Gesture Synthesis Siyang Wang Simon Alexanderson Joakim Gustafson Jonas Beskow G. Henter Éva Székely 37 19 0 25 Aug 2021
Multimodal analysis of the predictability of hand-gesture properties Taras Kucherenko Rajmund Nagy Michael Neff Hedvig Kjellström G. Henter 34 22 0 12 Aug 2021
RW-Resnet: A Novel Speech Anti-Spoofing Model Using Raw Waveform Youxuan Ma Zongze Ren Shugong Xu 38 39 0 12 Aug 2021
StarGAN-VC+ASR: StarGAN-based Non-Parallel Voice Conversion Regularized by Automatic Speech Recognition Shoki Sakamoto Akira Taniguchi T. Taniguchi Hirokazu Kameoka BDL 31 5 0 10 Aug 2021
AnyoneNet: Synchronized Speech and Talking Head Generation for Arbitrary Person Xinsheng Wang Qicong Xie Jihua Zhu Lei Xie O. Scharenborg 31 16 0 09 Aug 2021
A Streamwise GAN Vocoder for Wideband Speech Coding at Very Low Bit Rate Ahmed Mustafa Jan Büthe Srikanth Korse Kishan Gupta Guillaume Fuchs N. Pia 21 18 0 09 Aug 2021
An Empirical Study on End-to-End Singing Voice Synthesis with Encoder-Decoder Architectures Dengfeng Ke Yuxing Lu Xudong Liu Yanyan Xu Jing Sun Cheng-Hao Cai 30 0 0 06 Aug 2021
A Benchmarking Initiative for Audio-Domain Music Generation Using the Freesound Loop Dataset Tun-Min Hung Bo-Yu Chen Yen-Tung Yeh Yi-Hsuan Yang 18 12 0 03 Aug 2021
Creation and Detection of German Voice Deepfakes Vanessa Barnekow Dominik Binder Niclas Kromrey Pascal Munaretto A. Schaad Felix Schmieder 21 2 0 02 Aug 2021
A Survey on Audio Synthesis and Audio-Visual Multimodal Processing Zhaofeng Shi 26 7 0 01 Aug 2021
Sequence-to-Sequence Voice Reconstruction for Silent Speech in a Tonal Language Huiyan Li Haohong Lin You Wang Hengyang Wang Ming Zhang Han Gao Qing Ai Zhiyuan Luo Guang Li 31 12 0 31 Jul 2021
Beyond Voice Identity Conversion: Manipulating Voice Attributes by Adversarial Learning of Structured Disentangled Representations L. Benaroya Nicolas Obin Axel Roebel 16 5 0 26 Jul 2021
Adaptation of Tacotron2-based Text-To-Speech for Articulatory-to-Acoustic Mapping using Ultrasound Tongue Imaging Csaba Zainkó L. Tóth Amin Honarmandi Shandiz G. Gosztolya Alexandra Markó Géza Németh Tamás Gábor Csapó 39 4 0 26 Jul 2021
Approximation Theory of Convolutional Architectures for Time Series Modelling Haotian Jiang Zhong Li Qianxiao Li AI4TS 19 11 0 20 Jul 2021
PU-Flow: a Point Cloud Upsampling Network with Normalizing Flows Aihua Mao Zihui Du Junhui Hou Yaqi Duan Yong-jin Liu Ying He 3DPC 37 35 0 13 Jul 2021
Extending Text-to-Speech Synthesis with Articulatory Movement Prediction using Ultrasound Tongue Imaging Tamás Gábor Csapó 16 2 0 12 Jul 2021
Neural Waveshaping Synthesis B. Hayes C. Saitis Gyorgy Fazekas 36 28 0 11 Jul 2021
EditSpeech: A Text Based Speech Editing System Using Partial Inference and Bidirectional Fusion Daxin Tan Liqun Deng Y. Yeung Xin Jiang Xiao Chen Tan Lee 29 38 0 04 Jul 2021
Supervised Contrastive Learning for Accented Speech Recognition Tao Han Hantao Huang Ziang Yang Wei Han 49 15 0 02 Jul 2021
Normalizing Flow based Hidden Markov Models for Classification of Speech Phones with Explainability Anubhab Ghosh Antoine Honoré Dong Liu G. Henter S. Chatterjee 16 5 0 01 Jul 2021
A Survey on Neural Speech Synthesis Xu Tan Tao Qin Frank Soong Tie-Yan Liu AI4TS 18 352 0 29 Jun 2021
Transflower: probabilistic autoregressive dance generation with multimodal attention Guillermo Valle Pérez G. Henter Jonas Beskow A. Holzapfel Pierre-Yves Oudeyer Simon Alexanderson 30 42 0 25 Jun 2021
Basis-MelGAN: Efficient Neural Vocoder Based on Audio Decomposition Zhengxi Liu Y. Qian DRL 19 10 0 25 Jun 2021