arXiv:1811.00002
WaveGlow: A Flow-based Generative Network for Speech Synthesis
31 October 2018
R. Prenger
Rafael Valle
Bryan Catanzaro
Papers citing
"WaveGlow: A Flow-based Generative Network for Speech Synthesis"
50 / 525 papers shown
Title
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era
Andreas Triantafyllopoulos
Björn W. Schuller
Gökçe İymen
M. Sezgin
Xiangheng He
...
Shuo Liu
Silvan Mertes
Elisabeth André
Ruibo Fu
Jianhua Tao
20
53
0
06 Oct 2022
How Image Generation Helps Visible-to-Infrared Person Re-Identification?
Honghu Pan
Yongyong Chen
Yunqing He
Xin Li
Zhenyu He
15
2
0
04 Oct 2022
WaveFit: An Iterative and Non-autoregressive Neural Vocoder based on Fixed-Point Iteration
Yuma Koizumi
Kohei Yatabe
Heiga Zen
M. Bacchiani
DiffM
42
29
0
03 Oct 2022
Augmentation Invariant Discrete Representation for Generative Spoken Language Modeling
Itai Gat
Felix Kreuk
Tu Nguyen
Ann Lee
Jade Copet
Gabriel Synnaeve
Emmanuel Dupoux
Yossi Adi
51
11
0
30 Sep 2022
AutoLV: Automatic Lecture Video Generator
Wen Wang
Yang Song
Sanjay Jha
VGen
21
3
0
19 Sep 2022
Open Challenges in Synthetic Speech Detection
Luca Cuccovillo
Christoforos Papastergiopoulos
Anastasios Vafeiadis
Artem Yaroshchuk
P. Aichroth
K. Votis
Dimitrios Tzovaras
46
27
0
15 Sep 2022
ConvNeXt Based Neural Network for Audio Anti-Spoofing
Qiaowei Ma
J. Zhong
Yitao Yang
Weiheng Liu
Yingbo Gao
W. W. Ng
AAML
44
6
0
14 Sep 2022
Deep Speech Synthesis from Articulatory Representations
Peter Wu
Shinji Watanabe
Louis Goldstein
A. Black
Gopala K. Anumanchipalli
39
24
0
13 Sep 2022
Evaluating generative audio systems and their metrics
Ashvala Vinay
Alexander Lerch
24
19
0
31 Aug 2022
Maximum Likelihood on the Joint (Data, Condition) Distribution for Solving Ill-Posed Problems with Conditional Flow Models
John Shelton Hyatt
12
1
0
24 Aug 2022
Pathway to Future Symbiotic Creativity
Yi-Ting Guo
Qi-fei Liu
Jie Chen
Wei Xue
Jie Fu
...
Fernando Rosas
Jeffrey Shaw
Xing Wu
Jiji Zhang
Jianliang Xu
34
0
0
18 Aug 2022
Towards Parametric Speech Synthesis Using Gaussian-Markov Model of Spectral Envelope and Wavelet-Based Decomposition of F0
M. S. Al-Radhi
Tamás Gábor Csapó
Csaba Zainkó
Géza Németh
14
1
0
15 Aug 2022
Controllable and Lossless Non-Autoregressive End-to-End Text-to-Speech
Zhengxi Liu
Qiao Tian
Chenxu Hu
Xudong Liu
Meng-Che Wu
Yuping Wang
Hang Zhao
Yuxuan Wang
36
10
0
13 Jul 2022
SATTS: Speaker Attractor Text to Speech, Learning to Speak by Learning to Separate
Nabarun Goswami
Tatsuya Harada
26
5
0
13 Jul 2022
A Cyclical Approach to Synthetic and Natural Speech Mismatch Refinement of Neural Post-filter for Low-cost Text-to-speech System
Yi-Chiao Wu
Patrick Lumban Tobing
Kazuki Yasuhara
Noriyuki Matsunaga
Yamato Ohtani
T. Toda
42
0
0
13 Jul 2022
End-to-end speech recognition modeling from de-identified data
M. Flechl
Shou-Chun Yin
Junho Park
Peter Skala
17
4
0
12 Jul 2022
Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion
Yinjiao Lei
Shan Yang
Jian Cong
Linfu Xie
Dan Su
DiffM
52
12
0
05 Jul 2022
Avocodo: Generative Adversarial Network for Artifact-free Vocoder
Taejun Bak
Junmo Lee
Hanbin Bae
Jinhyeok Yang
Jaesung Bae
Young-Sun Joo
25
28
0
27 Jun 2022
Attack Agnostic Dataset: Towards Generalization and Stabilization of Audio DeepFake Detection
Piotr Kawa
Marcin Plata
P. Syga
AAML
49
23
0
27 Jun 2022
Improved Processing of Ultrasound Tongue Videos by Combining ConvLSTM and 3D Convolutional Networks
Amin Honarmandi Shandiz
L. Tóth
16
4
0
26 Jun 2022
Generating Diverse Vocal Bursts with StyleGAN2 and MEL-Spectrograms
Marco Jiralerspong
Gauthier Gidel
VLM
27
3
0
25 Jun 2022
WOLONet: Wave Outlooker for Efficient and High Fidelity Speech Synthesis
Yi Wang
Yi Si
25
0
0
20 Jun 2022
NatiQ: An End-to-end Text-to-Speech System for Arabic
Ahmed Abdelali
Nadir Durrani
C. Demiroğlu
Fahim Dalvi
Hamdy Mubarak
Kareem Darwish
18
14
0
15 Jun 2022
RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks
Shanghua Gao
Zhong-Yu Li
Qi Han
Ming-Ming Cheng
Liang Wang
32
34
0
14 Jun 2022
Multi-instrument Music Synthesis with Spectrogram Diffusion
Curtis Hawthorne
Ian Simon
Adam Roberts
Neil Zeghidour
Josh Gardner
Ethan Manilow
Jesse Engel
DiffM
21
49
0
11 Jun 2022
BigVGAN: A Universal Neural Vocoder with Large-Scale Training
Sang-gil Lee
Ming-Yu Liu
Boris Ginsburg
Bryan Catanzaro
Sung-Hoon Yoon
22
228
0
09 Jun 2022
Patch-based Object-centric Transformers for Efficient Video Generation
Wilson Yan
Ryogo Okumura
Stephen James
Pieter Abbeel
DiffM
ViT
31
6
0
08 Jun 2022
FlexLip: A Controllable Text-to-Lip System
Dan Oneaţă
Beáta Lőrincz
Adriana Stan
H. Cucu
26
3
0
07 Jun 2022
Preparing an Endangered Language for the Digital Age: The Case of Judeo-Spanish
A. Oktem
Rodolfo Zevallos
Yasmin Moslem
Günes Öztürk
Karen Sarhon
26
0
0
31 May 2022
Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data
Sungwon Kim
Heeseung Kim
Sung-Hoon Yoon
DiffM
204
52
0
30 May 2022
A Tale of Two Flows: Cooperative Learning of Langevin Flow and Normalizing Flow Toward Energy-Based Model
Jianwen Xie
Y. Zhu
Juntao Li
Ping Li
24
50
0
13 May 2022
Talking Face Generation with Multilingual TTS
Hyoung-Kyu Song
Sanghyun Woo
Junhyeok Lee
S. Yang
Hyunjae Cho
Youseong Lee
Dongho Choi
Kang-Wook Kim
CVBM
40
21
0
13 May 2022
Real-Time Packet Loss Concealment With Mixed Generative and Predictive Model
J. Valin
Ahmed Mustafa
Christopher Montgomery
Timothy B. Terriberry
Michael Klingbeil
Paris Smaragdis
A. Krishnaswamy
24
18
0
11 May 2022
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality
Xu Tan
Jiawei Chen
Haohe Liu
Jian Cong
Chen Zhang
...
Lei He
Frank Soong
Tao Qin
Sheng Zhao
Tie-Yan Liu
44
213
0
09 May 2022
ReCAB-VAE: Gumbel-Softmax Variational Inference Based on Analytic Divergence
Sangshin Oh
Seyun Um
Hong-Goo Kang
BDL
DRL
16
2
0
09 May 2022
Regotron: Regularizing the Tacotron2 architecture via monotonic alignment loss
Efthymios Georgiou
Kosmas Kritsis
Georgios Paraskevopoulos
Athanasios Katsamanis
Vassilis Katsouros
Alexandros Potamianos
23
3
0
28 Apr 2022
Parallel Synthesis for Autoregressive Speech Generation
Po-Chun Hsu
Da-Rong Liu
Andy T. Liu
Hung-yi Lee
42
5
0
25 Apr 2022
Speaking-Rate-Controllable HiFi-GAN Using Feature Interpolation
Detai Xin
Shinnosuke Takamichi
T. Okamoto
Hisashi Kawai
Hiroshi Saruwatari
24
0
0
22 Apr 2022
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis
Rongjie Huang
Max W. Y. Lam
Jun Wang
Dan Su
Dong Yu
Yi Ren
Zhou Zhao
DiffM
28
166
0
21 Apr 2022
Music Source Separation with Generative Flow
Ge Zhu
Jordan Darefsky
Fei Jiang
A. Selitskiy
Z. Duan
25
6
0
19 Apr 2022
Learning and controlling the source-filter representation of speech with a variational autoencoder
Samir Sadok
Simon Leglaive
Laurent Girin
Xavier Alameda-Pineda
Renaud Séguier
SSL
DRL
BDL
32
14
0
14 Apr 2022
A Post Auto-regressive GAN Vocoder Focused on Spectrum Fracture
Zhe-ming Lu
Mengnan He
Ruixiong Zhang
Caixia Gong
GAN
14
2
0
12 Apr 2022
Fine-grained Noise Control for Multispeaker Speech Synthesis
Karolos Nikitaras
G. Vamvoukakis
Nikolaos Ellinas
Konstantinos Klapsas
K. Markopoulos
S. Raptis
June Sig Sung
Gunu Jho
Aimilios Chalamandaris
Pirros Tsiakoulis
29
4
0
11 Apr 2022
Karaoker: Alignment-free singing voice synthesis with speech training data
Panos Kakoulidis
Nikolaos Ellinas
G. Vamvoukakis
K. Markopoulos
June Sig Sung
Gunu Jho
Pirros Tsiakoulis
Aimilios Chalamandaris
12
3
0
08 Apr 2022
Correcting Mispronunciations in Speech using Spectrogram Inpainting
Talia Ben Simon
Felix Kreuk
Faten Awwad
Jacob T. Cohen
Joseph Keshet
12
2
0
07 Apr 2022
Lip to Speech Synthesis with Visual Context Attentional GAN
Minsu Kim
Joanna Hong
Y. Ro
28
51
0
04 Apr 2022
Universal Adaptor: Converting Mel-Spectrograms Between Different Configurations for Speech Synthesis
Fan Wang
Po-Chun Hsu
Da-Rong Liu
Hung-yi Lee
13
0
0
01 Apr 2022
Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis
Karren D. Yang
Dejan Marković
Steven Krenn
Vasu Agrawal
Alexander Richard
VGen
16
32
0
31 Mar 2022
HiFi-VC: High Quality ASR-Based Voice Conversion
A. Kashkin
I. Karpukhin
S. Shishkin
29
5
0
31 Mar 2022
WavThruVec: Latent speech representation as intermediate features for neural speech synthesis
Hubert Siuzdak
Piotr Dura
Pol van Rijn
Nori Jacoby
AI4TS
18
30
0
31 Mar 2022