ResearchTrend.AI
WaveGlow: A Flow-based Generative Network for Speech Synthesis

31 October 2018
R. Prenger
Rafael Valle
Bryan Catanzaro

Papers citing "WaveGlow: A Flow-based Generative Network for Speech Synthesis"

50 / 525 papers shown
Boosting Fast and High-Quality Speech Synthesis with Linear Diffusion
Hao Liu
Tao Wang
Jie Cao
Ran He
J. Tao
DiffM
11
3
0
09 Jun 2023
Towards Robust FastSpeech 2 by Modelling Residual Multimodality
Fabian Kögel
Bac Nguyen
Fabien Cardinaux
14
2
0
02 Jun 2023
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
Hubert Siuzdak
32
79
0
01 Jun 2023
UnDiff: Unsupervised Voice Restoration with Unconditional Diffusion Model
A. Iashchenko
Pavel Andreev
Ivan Shchekotov
Nicholas Babaev
Dmitry Vetrov
DiffM
21
1
0
01 Jun 2023
Adaptation of Tongue Ultrasound-Based Silent Speech Interfaces Using Spatial Transformer Networks
L. Tóth
Amin Honarmandi Shandiz
G. Gosztolya
T. Csapó
24
3
0
30 May 2023
Towards single integrated spoofing-aware speaker verification embeddings
Sung Hwan Mun
Hye-jin Shim
Hemlata Tak
Xin Wang
Xuechen Liu
...
Junichi Yamagishi
Nicholas W. D. Evans
Tomi Kinnunen
N. Kim
Jee-weon Jung
46
11
0
30 May 2023
Towards generalizing deep-audio fake detection networks
Konstantin Gasenzer
Moritz Wolter
36
4
0
22 May 2023
Textually Pretrained Speech Language Models
Michael Hassid
Tal Remez
Tu Nguyen
Itai Gat
Alexis Conneau
...
Alexandre Défossez
Gabriel Synnaeve
Emmanuel Dupoux
Roy Schwartz
Yossi Adi
VLM
SyDa
44
53
0
22 May 2023
NAS-FM: Neural Architecture Search for Tunable and Interpretable Sound Synthesis based on Frequency Modulation
Zhe Ye
Wei Xue
Xuejiao Tan
Qi-fei Liu
Yi-Ting Guo
26
2
0
22 May 2023
APNet: An All-Frame-Level Neural Vocoder Incorporating Direct Prediction of Amplitude and Phase Spectra
Yang Ai
Zhenhua Ling
34
13
0
13 May 2023
Who is Speaking Actually? Robust and Versatile Speaker Traceability for Voice Conversion
Yanzhen Ren
Hongcheng Zhu
Liming Zhai
Zongkun Sun
Rubing Shen
Lina Wang
33
6
0
09 May 2023
Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis
Ye-Xin Lu
Yang Ai
Zhenhua Ling
24
1
0
26 Apr 2023
SAR: Self-Supervised Anti-Distortion Representation for End-To-End Speech Model
Jianzong Wang
Xulong Zhang
Haobin Tang
Aolan Sun
Ning Cheng
Jing Xiao
26
1
0
23 Apr 2023
Affective social anthropomorphic intelligent system
Md. Adyelullahil Mamun
Hasnat Md. Abdullah
Md. Golam Rabiul Alam
Muhammad Mehedi Hassan
Md. Zia Uddin
17
1
0
19 Apr 2023
Neural Diffeomorphic Non-uniform B-spline Flows
S. Hong
S. Chun
37
1
0
07 Apr 2023
AraSpot: Arabic Spoken Command Spotting
Mahmoud Salhab
H. Harmanani
28
0
0
29 Mar 2023
Wave-U-Net Discriminator: Fast and Lightweight Discriminator for Generative Adversarial Network-Based Speech Synthesis
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Shogo Seki
34
9
0
24 Mar 2023
Transformers in Speech Processing: A Survey
S. Latif
Aun Zaidi
Heriberto Cuayáhuitl
Fahad Shamshad
Moazzam Shoukat
Junaid Qadir
42
47
0
21 Mar 2023
Configurable EBEN: Extreme Bandwidth Extension Network to enhance body-conducted speech capture
Hauret Julien
Joubaud Thomas
V. Zimpfer
Bavu Éric
21
6
0
17 Mar 2023
TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets
Weixin Chen
D. Song
Bo-wen Li
DiffM
34
74
0
10 Mar 2023
UniFLG: Unified Facial Landmark Generator from Text or Speech
Kentaro Mitsui
Yukiya Hono
Kei Sawada
CVBM
16
6
0
28 Feb 2023
Conditional deep generative models as surrogates for spatial field solution reconstruction with quantified uncertainty in Structural Health Monitoring applications
Nicholas E. Silionis
Theodora Liangou
K. Anyfantis
AI4CE
26
0
0
14 Feb 2023
Fast and small footprint Hybrid HMM-HiFiGAN based system for speech synthesis in Indian languages
Sudhanshu Srivastava
Ishika Gupta
Anusha Prakash
Jom Kuriakose
H. Murthy
VLM
21
1
0
13 Feb 2023
Multilingual Multiaccented Multispeaker TTS with RADTTS
Rohan Badlani
Rafael Valle
Kevin J. Shih
J. F. Santos
Francesco Ferroni
Bryan Catanzaro
16
6
0
24 Jan 2023
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
Chengyi Wang
Sanyuan Chen
Yu-Huan Wu
Zi-Hua Zhang
Long Zhou
...
Huaming Wang
Jinyu Li
Lei He
Sheng Zhao
Furu Wei
48
644
0
05 Jan 2023
Analysing Discrete Self Supervised Speech Representation for Spoken Language Modeling
Amitay Sicherman
Yossi Adi
20
32
0
02 Jan 2023
Framewise WaveGAN: High Speed Adversarial Vocoder in Time Domain with Very Low Computational Complexity
Ahmed Mustafa
J. Valin
Jan Büthe
Paris Smaragdis
Mike Goodwin
30
4
0
08 Dec 2022
On the Robustness of Normalizing Flows for Inverse Problems in Imaging
Seongmin Hong
I. Park
S. Chun
33
7
0
08 Dec 2022
Low-Resource End-to-end Sanskrit TTS using Tacotron2, WaveGlow and Transfer Learning
Ankur Debnath
Shridevi S Patil
Gangotri Nadiger
R. Ganesan
26
20
0
07 Dec 2022
Generative Models for Improved Naturalness, Intelligibility, and Voicing of Whispered Speech
Dominik Wagner
Sebastian P. Bayerl
H. A. C. Maruri
Tobias Bocklet
24
7
0
04 Dec 2022
Puffin: pitch-synchronous neural waveform generation for fullband speech on modest devices
O. Watts
Lovisa Wihlborg
Cassia Valentini-Botinhao
33
3
0
25 Nov 2022
Efficient Incremental Text-to-Speech on GPUs
Muyang Du
Chuan Liu
Jiaxing Qi
Junjie Lai
24
1
0
25 Nov 2022
STGlow: A Flow-based Generative Framework with Dual Graphormer for Pedestrian Trajectory Prediction
Rongqin Liang
Yuanman Li
Jiantao Zhou
Xia Li
39
12
0
21 Nov 2022
Towards Building Text-To-Speech Systems for the Next Billion Users
Gokul Karthik Kumar
Praveen S. V.
Pratyush Kumar
Mitesh M. Khapra
Karthik Nandakumar
38
18
0
17 Nov 2022
Challenges in creative generative models for music: a divergence maximization perspective
Axel Chemla-Romeu-Santos
P. Esling
18
4
0
16 Nov 2022
OverFlow: Putting flows on top of neural transducers for better TTS
Shivam Mehta
Ambika Kirkland
Harm Lameris
Jonas Beskow
Éva Székely
G. Henter
AI4TS
39
12
0
13 Nov 2022
Online Phase Reconstruction via DNN-based Phase Differences Estimation
Yoshiki Masuyama
Kohei Yatabe
Kento Nagatomo
Yasuhiro Oikawa
3DV
16
7
0
12 Nov 2022
DSPGAN: a GAN-based universal vocoder for high-fidelity TTS by time-frequency domain supervision from DSP
Kun Song
Yongmao Zhang
Yinjiao Lei
Jian Cong
Hanzhao Li
Linfu Xie
Gang He
Jinfeng Bai
61
15
0
02 Nov 2022
SIMD-size aware weight regularization for fast neural vocoding on CPU
Hiroki Kanagawa
Yusuke Ijima
16
0
0
02 Nov 2022
Robust MelGAN: A robust universal neural vocoder for high-fidelity TTS
Kun Song
Jian Cong
Xinsheng Wang
Yongmao Zhang
Linfu Xie
Ning Jiang
Haiying Wu
27
0
0
31 Oct 2022
Audio Time-Scale Modification with Temporal Compressing Networks
Ernie Chu
Ju-Ting Chen
Chia-Ping Chen
25
0
0
31 Oct 2022
Conditioning and Sampling in Variational Diffusion Models for Speech Super-Resolution
Chin-Yun Yu
Sung-Lin Yeh
Gyorgy Fazekas
Hao Tang
DiffM
40
20
0
27 Oct 2022
Cover Reproducible Steganography via Deep Generative Models
Kejiang Chen
Hang Zhou
Yaofei Wang
Meng Li
Weiming Zhang
Neng H. Yu
DiffM
31
9
0
26 Oct 2022
Efficiently Trained Low-Resource Mongolian Text-to-Speech System Based On FullConv-TTS
Ziqi Liang
36
0
0
24 Oct 2022
Improved Normalizing Flow-Based Speech Enhancement using an All-pole Gammatone Filterbank for Conditional Input Representation
Martin Strauss
Matteo Torcoli
B. Edler
21
4
0
21 Oct 2022
Robust One-Shot Singing Voice Conversion
Naoya Takahashi
M. Singh
Yuki Mitsufuji
DiffM
25
8
0
20 Oct 2022
Spoofed training data for speech spoofing countermeasure can be efficiently created using neural vocoders
Xin Wang
Junichi Yamagishi
26
36
0
19 Oct 2022
Invertible Monotone Operators for Normalizing Flows
Byeongkeun Ahn
Chiyoon Kim
Youngjoon Hong
Hyunwoo J. Kim
TPM
43
8
0
15 Oct 2022
Hierarchical Diffusion Models for Singing Voice Neural Vocoder
Naoya Takahashi
Mayank Kumar Singh
Yuki Mitsufuji
DiffM
21
16
0
14 Oct 2022
SpecRNet: Towards Faster and More Accessible Audio DeepFake Detection
Piotr Kawa
Marcin Plata
P. Syga
37
14
0
12 Oct 2022