WaveGlow: A Flow-based Generative Network for Speech Synthesis

31 October 2018

Papers citing "WaveGlow: A Flow-based Generative Network for Speech Synthesis"

50 / 204 papers shown

Title
DPN-GAN: Inducing Periodic Activations in Generative Adversarial Networks for High-Fidelity Audio Synthesis Zeeshan Ahmad Shudi Bao Meng Chen 20 0 0 14 May 2025
SingNet: Towards a Large-Scale, Diverse, and In-the-Wild Singing Voice Dataset Yicheng Gu Chaoren Wang Junge Zhang Xueyao Zhang Zihao Fang Haorui He Zhizheng Wu 32 2 0 14 May 2025
Provably Secure Public-Key Steganography Based on Admissible Encoding Xinsong Zhang Kejiang Chen Na Zhao Weixi Zhang N. Yu 29 0 0 28 Apr 2025
Mitigating Timbre Leakage with Universal Semantic Mapping Residual Block for Voice Conversion Na Li Chuke Wang Yu Gu Zhifeng Li 59 0 0 11 Apr 2025
P2Mark: Plug-and-play Parameter-level Watermarking for Neural Speech Generation Yong Ren Jiangyan Yi Tao Wang J. Tao Zhengqi Wen Chenxing Li Zheng Lian Ruibo Fu Ye Bai Xiaohui Zhang 60 0 0 07 Apr 2025
Everyday Speech in the Indian Subcontinent Utkarsh Pathak 56 1 0 24 Feb 2025
Less is More for Synthetic Speech Detection in the Wild Ashi Garg Zexin Cai Henry Li Xinyuan Leibny Paola García-Perera Kevin Duh Sanjeev Khudanpur Matthew Wiesner Nicholas Andrews 74 0 0 17 Feb 2025
Recent Advances in Speech Language Models: A Survey Wenqian Cui Dianzhi Yu Xiaoqi Jiao Ziqiao Meng Guangyan Zhang Qichao Wang Yiwen Guo Irwin King AuLLM 61 14 0 01 Oct 2024
SpoofCeleb: Speech Deepfake Detection and SASV In The Wild Jee-weon Jung Yihan Wu Xin Wang Ji-Hoon Kim Soumi Maiti ... Joon Son Chung Wangyou Zhang Seyun Um Shinnosuke Takamichi Shinji Watanabe 68 1 0 18 Sep 2024
Stutter-Solver: End-to-end Multi-lingual Dysfluency Detection Xuanru Zhou Cheol Jun Cho Ayati Sharma Brittany Morin D. Baquirin ... Zachary Miller B. Tee M. G. Tempini Jiachen Lian Gopala Anumanchipalli 34 3 0 15 Sep 2024
INN-PAR: Invertible Neural Network for PPG to ABP Reconstruction Soumitra Kundu Gargi Panda Saumik Bhattacharya Aurobinda Routray Rajlakshmi Guha 47 0 0 13 Sep 2024
How Should We Extract Discrete Audio Tokens from Self-Supervised Models? Pooneh Mousavi J. Duret Salah Zaiem Luca Della Libera Artem Ploujnikov Cem Subakan Mirco Ravanelli 42 9 0 15 Jun 2024
Variational Bayesian Optimal Experimental Design with Normalizing Flows Jiayuan Dong Christian L. Jacobsen Mehdi Khalloufi Maryam Akram Wanjiao Liu Karthik Duraisamy Xun Huan BDL 54 5 0 08 Apr 2024
PeriodGrad: Towards Pitch-Controllable Neural Vocoder Based on a Diffusion Probabilistic Model Yukiya Hono Kei Hashimoto Yoshihiko Nankaku Keiichi Tokuda DiffM 35 2 0 22 Feb 2024
Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model Xiangyu Zhang Daijiao Liu Hexin Liu Qiquan Zhang Hanyu Meng Leibny Paola García Chng Eng Siong Lina Yao DiffM 25 3 0 16 Feb 2024
Combined Generative and Predictive Modeling for Speech Super-resolution Heming Wang Eric W. Healy DeLiang Wang DiffM 33 0 0 25 Jan 2024
Creating New Voices using Normalizing Flows Piotr Bilinski Thomas Merritt Abdelhamid Ezzerg Kamil Pokora Sebastian Cygert K. Yanagisawa Roberto Barra-Chicote Daniel Korzekwa 26 17 0 22 Dec 2023
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit Xueyao Zhang Liumeng Xue Yicheng Gu Yuancheng Wang Haorui He ... Mingxuan Wang Jun Han Kai Chen Haizhou Li Zhizheng Wu 29 28 0 15 Dec 2023
VoiceExtender: Short-utterance Text-independent Speaker Verification with Guided Diffusion Model Yayun He Zuheng Kang Jianzong Wang Junqing Peng Jing Xiao DiffM 16 2 0 07 Oct 2023
Matcha-TTS: A fast TTS architecture with conditional flow matching Shivam Mehta Ruibo Tu Jonas Beskow Éva Székely G. Henter 24 72 0 06 Sep 2023
Image Synthesis under Limited Data: A Survey and Taxonomy Mengping Yang Zhe Wang 28 8 0 31 Jul 2023
VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design Jungil Kong Jihoon Park Beomjeong Kim Jeongmin Kim Dohee Kong Sangjin Kim 37 36 0 31 Jul 2023
Textually Pretrained Speech Language Models Michael Hassid Tal Remez Tu Nguyen Itai Gat Alexis Conneau ... Alexandre Défossez Gabriel Synnaeve Emmanuel Dupoux Roy Schwartz Yossi Adi VLM SyDa 44 53 0 22 May 2023
APNet: An All-Frame-Level Neural Vocoder Incorporating Direct Prediction of Amplitude and Phase Spectra Yang Ai Zhenhua Ling 34 13 0 13 May 2023
Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis Ye-Xin Lu Yang Ai Zhenhua Ling 24 1 0 26 Apr 2023
Affective social anthropomorphic intelligent system Md. Adyelullahil Mamun Hasnat Md. Abdullah Md. Golam Rabiul Alam Muhammad Mehedi Hassan Md. Zia Uddin 17 1 0 19 Apr 2023
Neural Diffeomorphic Non-uniform B-spline Flows S. Hong S. Chun 35 1 0 07 Apr 2023
Transformers in Speech Processing: A Survey S. Latif Aun Zaidi Heriberto Cuayáhuitl Fahad Shamshad Moazzam Shoukat Junaid Qadir 42 47 0 21 Mar 2023
UniFLG: Unified Facial Landmark Generator from Text or Speech Kentaro Mitsui Yukiya Hono Kei Sawada CVBM 16 6 0 28 Feb 2023
Fast and small footprint Hybrid HMM-HiFiGAN based system for speech synthesis in Indian languages Sudhanshu Srivastava Ishika Gupta Anusha Prakash Jom Kuriakose H. Murthy VLM 21 1 0 13 Feb 2023
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers Chengyi Wang Sanyuan Chen Yu-Huan Wu Zi-Hua Zhang Long Zhou ... Huaming Wang Jinyu Li Lei He Sheng Zhao Furu Wei 48 644 0 05 Jan 2023
On the Robustness of Normalizing Flows for Inverse Problems in Imaging Seongmin Hong I. Park S. Chun 33 7 0 08 Dec 2022
Low-Resource End-to-end Sanskrit TTS using Tacotron2, WaveGlow and Transfer Learning Ankur Debnath Shridevi S Patil Gangotri Nadiger R. Ganesan 24 20 0 07 Dec 2022
Generative Models for Improved Naturalness, Intelligibility, and Voicing of Whispered Speech Dominik Wagner Sebastian P. Bayerl H. A. C. Maruri Tobias Bocklet 24 7 0 04 Dec 2022
Puffin: pitch-synchronous neural waveform generation for fullband speech on modest devices O. Watts Lovisa Wihlborg Cassia Valentini-Botinhao 27 3 0 25 Nov 2022
Efficient Incremental Text-to-Speech on GPUs Muyang Du Chuan Liu Jiaxing Qi Junjie Lai 24 1 0 25 Nov 2022
Towards Building Text-To-Speech Systems for the Next Billion Users Gokul Karthik Kumar V. PraveenS. Pratyush Kumar Mitesh M. Khapra Karthik Nandakumar 38 18 0 17 Nov 2022
OverFlow: Putting flows on top of neural transducers for better TTS Shivam Mehta Ambika Kirkland Harm Lameris Jonas Beskow Éva Székely G. Henter AI4TS 39 12 0 13 Nov 2022
Online Phase Reconstruction via DNN-based Phase Differences Estimation Yoshiki Masuyama Kohei Yatabe Kento Nagatomo Yasuhiro Oikawa 3DV 16 7 0 12 Nov 2022
Robust MelGAN: A robust universal neural vocoder for high-fidelity TTS Kun Song Jian Cong Xinsheng Wang Yongmao Zhang Linfu Xie Ning Jiang Haiying Wu 27 0 0 31 Oct 2022
Conditioning and Sampling in Variational Diffusion Models for Speech Super-Resolution Chin-Yun Yu Sung-Lin Yeh Gyorgy Fazekas Hao Tang DiffM 40 20 0 27 Oct 2022
Cover Reproducible Steganography via Deep Generative Models Kejiang Chen Hang Zhou Yaofei Wang Meng Li Weiming Zhang Neng H. Yu DiffM 31 9 0 26 Oct 2022
Improved Normalizing Flow-Based Speech Enhancement using an All-pole Gammatone Filterbank for Conditional Input Representation Martin Strauss Matteo Torcoli B. Edler 21 4 0 21 Oct 2022
Robust One-Shot Singing Voice Conversion Naoya Takahashi M. Singh Yuki Mitsufuji DiffM 25 8 0 20 Oct 2022
Hierarchical Diffusion Models for Singing Voice Neural Vocoder Naoya Takahashi Mayank Kumar Singh Yuki Mitsufuji DiffM 21 16 0 14 Oct 2022
SpecRNet: Towards Faster and More Accessible Audio DeepFake Detection Piotr Kawa Marcin Plata P. Syga 37 14 0 12 Oct 2022
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era Andreas Triantafyllopoulos Björn W. Schuller Gokcce .Iymen M. Sezgin Xiangheng He ... Shuo Liu Silvan Mertes Elisabeth André Ruibo Fu Jianhua Tao 20 53 0 06 Oct 2022
WaveFit: An Iterative and Non-autoregressive Neural Vocoder based on Fixed-Point Iteration Yuma Koizumi Kohei Yatabe Heiga Zen M. Bacchiani DiffM 42 29 0 03 Oct 2022
AutoLV: Automatic Lecture Video Generator Wen Wang Yang Song Sanjay Jha VGen 18 3 0 19 Sep 2022
Maximum Likelihood on the Joint (Data, Condition) Distribution for Solving Ill-Posed Problems with Conditional Flow Models John Shelton Hyatt 12 1 0 24 Aug 2022