WaveFlow: A Compact Flow-based Model for Raw Audio


3 December 2019
Wei Ping
Kainan Peng
Kexin Zhao
Z. Song
arXiv:1912.01219

Papers citing "WaveFlow: A Compact Flow-based Model for Raw Audio"

32 / 32 papers shown
Title
PeriodGrad: Towards Pitch-Controllable Neural Vocoder Based on a Diffusion Probabilistic Model
Yukiya Hono
Kei Hashimoto
Yoshihiko Nankaku
Keiichi Tokuda
DiffM
43
3
0
22 Feb 2024
Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model
Xiangyu Zhang
Daijiao Liu
Hexin Liu
Qiquan Zhang
Hanyu Meng
Leibny Paola García
Chng Eng Siong
Lina Yao
DiffM
25
3
0
16 Feb 2024
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
Xueyao Zhang
Liumeng Xue
Yicheng Gu
Yuancheng Wang
Haorui He
...
Mingxuan Wang
Jun Han
Kai Chen
Haizhou Li
Zhizheng Wu
31
29
0
15 Dec 2023
AI-Generated Content (AIGC) for Various Data Modalities: A Survey
Lin Geng Foo
Hossein Rahmani
Jun Liu
78
31
0
27 Aug 2023
APNet: An All-Frame-Level Neural Vocoder Incorporating Direct Prediction of Amplitude and Phase Spectra
Yang Ai
Zhenhua Ling
39
13
0
13 May 2023
Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis
Ye-Xin Lu
Yang Ai
Zhenhua Ling
24
1
0
26 Apr 2023
A Tale of Two Latent Flows: Learning Latent Space Normalizing Flow with Short-run Langevin Flow for Approximate Inference
Jianwen Xie
Y. Zhu
Yifei Xu
Dingcheng Li
Ping Li
BDL
DRL
24
7
0
23 Jan 2023
Enabling AI-Generated Content (AIGC) Services in Wireless Edge Networks
Hongyang Du
Zonghang Li
Dusit Niyato
Jiawen Kang
Zehui Xiong
Xuemin (Sherman) Shen
Dong In Kim
24
68
0
09 Jan 2023
Robust MelGAN: A robust universal neural vocoder for high-fidelity TTS
Kun Song
Jian Cong
Xinsheng Wang
Yongmao Zhang
Linfu Xie
Ning Jiang
Haiying Wu
35
0
0
31 Oct 2022
Robust One-Shot Singing Voice Conversion
Naoya Takahashi
M. Singh
Yuki Mitsufuji
DiffM
30
8
0
20 Oct 2022
Hierarchical Diffusion Models for Singing Voice Neural Vocoder
Naoya Takahashi
Mayank Kumar Singh
Yuki Mitsufuji
DiffM
29
16
0
14 Oct 2022
WaveFit: An Iterative and Non-autoregressive Neural Vocoder based on Fixed-Point Iteration
Yuma Koizumi
Kohei Yatabe
Heiga Zen
M. Bacchiani
DiffM
49
29
0
03 Oct 2022
Controllable Accented Text-to-Speech Synthesis
Rui Liu
Berrak Sisman
Guanglai Gao
Haizhou Li
42
6
0
22 Sep 2022
RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks
Shanghua Gao
Zhong-Yu Li
Qi Han
Ming-Ming Cheng
Liang Wang
37
34
0
14 Jun 2022
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit
Hui Zhang
Tian Yuan
Junkun Chen
Xintong Li
Renjie Zheng
...
Zeyu Chen
Xiaoguang Hu
Dianhai Yu
Yanjun Ma
Liang Huang
AuLLM
38
24
0
20 May 2022
SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping
Yuma Koizumi
Heiga Zen
Kohei Yatabe
Nanxin Chen
M. Bacchiani
DiffM
38
45
0
31 Mar 2022
WaveFake: A Data Set to Facilitate Audio Deepfake Detection
Joel Frank
Lea Schönherr
DiffM
132
125
0
04 Nov 2021
Likelihood Training of Schrödinger Bridge using Forward-Backward SDEs Theory
T. Chen
Guan-Horng Liu
Evangelos A. Theodorou
DiffM
OT
174
165
0
21 Oct 2021
On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis
Cheng-I Jeff Lai
Erica Cooper
Yang Zhang
Shiyu Chang
Kaizhi Qian
...
Yung-Sung Chuang
Alexander H. Liu
Junichi Yamagishi
David D. Cox
James R. Glass
26
6
0
04 Oct 2021
Integrated Speech and Gesture Synthesis
Siyang Wang
Simon Alexanderson
Joakim Gustafson
Jonas Beskow
G. Henter
Éva Székely
37
19
0
25 Aug 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
18
352
0
29 Jun 2021
UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation
Won Jang
D. Lim
Jaesam Yoon
Bongwan Kim
Juntae Kim
38
125
0
15 Jun 2021
D2C: Diffusion-Denoising Models for Few-shot Conditional Generation
Abhishek Sinha
Jiaming Song
Chenlin Meng
Stefano Ermon
VLM
DiffM
30
118
0
12 Jun 2021
Catch-A-Waveform: Learning to Generate Audio from a Single Short Example
Gal Greshler
Tamar Rott Shaham
T. Michaeli
18
25
0
11 Jun 2021
PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Dependent Adaptive Prior
Sang-gil Lee
Heeseung Kim
Chaehun Shin
Xu Tan
Chang Liu
Qi Meng
Tao Qin
Wei Chen
Sungroh Yoon
Tie-Yan Liu
DiffM
29
81
0
11 Jun 2021
Restoring degraded speech via a modified diffusion model
Jianwei Zhang
Suren Jayasuriya
Visar Berisha
DiffM
20
19
0
22 Apr 2021
Review of end-to-end speech synthesis technology based on deep learning
Zhaoxi Mu
Xinyu Yang
Yizhuo Dong
AuLLM
ALM
26
24
0
20 Apr 2021
NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling
Junhyeok Lee
Seungu Han
DiffM
29
67
0
06 Apr 2021
Improve GAN-based Neural Vocoder using Pointwise Relativistic LeastSquare GAN
Cong Wang
Yu Chen
Bin Wang
Yi Shi
35
1
0
26 Mar 2021
Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis
Ron J. Weiss
RJ Skerry-Ryan
Eric Battenberg
Soroosh Mariooryad
Diederik P. Kingma
24
98
0
06 Nov 2020
DiffWave: A Versatile Diffusion Model for Audio Synthesis
Zhifeng Kong
Wei Ping
Jiaji Huang
Kexin Zhao
Bryan Catanzaro
DiffM
BDL
34
1,396
0
21 Sep 2020
High Fidelity Speech Synthesis with Adversarial Networks
Mikolaj Binkowski
Jeff Donahue
Sander Dieleman
Aidan Clark
Erich Elsen
Norman Casagrande
Luis C. Cobo
Karen Simonyan
243
239
0
25 Sep 2019