2009.00713
Cited By
WaveGrad: Estimating Gradients for Waveform Generation
2 September 2020
Nanxin Chen
Yu Zhang
Heiga Zen
Ron J. Weiss
Mohammad Norouzi
William Chan
DiffM
BDL
Papers citing
"WaveGrad: Estimating Gradients for Waveform Generation"
50 / 76 papers shown
STOPA: A Database of Systematic VariaTion Of DeePfake Audio for Open-Set Source Tracing and Attribution
Anton Firc
Manasi Chibber
Jagabandhu Mishra
Vishwanath Pratap Singh
Tomi Kinnunen
K. Malinka
122
0
0
26 May 2025
Miipher-2: A Universal Speech Restoration Model for Million-Hour Scale Data Restoration
Shigeki Karita
Yuma Koizumi
Heiga Zen
Haruko Ishikawa
Robin Scheibler
M. Bacchiani
VLM
409
1
0
07 May 2025
Dual Audio-Centric Modality Coupling for Talking Head Generation
Ao Fu
Ziqi Ni
Yi Zhou
121
1
0
26 Mar 2025
Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration
Kang Liao
Zongsheng Yue
Zhouxia Wang
Chen Change Loy
174
4
0
20 Feb 2025
RestoreGrad: Signal Restoration Using Conditional Denoising Diffusion Models with Jointly Learned Prior
Ching Hua Lee
Chouchang Yang
Jaejin Cho
Yashas Malur Saidutta
R. S. Srinivasa
Yilin Shen
Hongxia Jin
DiffM
139
0
0
19 Feb 2025
Towards Kriging-informed Conditional Diffusion for Regional Sea-Level Data Downscaling
Subhankar Ghosh
Arun Sharma
Jayant Gupta
Aneesh Subramanian
Shashi Shekhar
DiffM
143
6
0
28 Jan 2025
Simplified and Generalized Masked Diffusion for Discrete Data
Jiaxin Shi
Kehang Han
Zehao Wang
Arnaud Doucet
Michalis K. Titsias
DiffM
170
105
0
17 Jan 2025
SF-Speech: Straightened Flow for Zero-Shot Voice Clone
Xuyuan Li
Zengqiang Shang
Hua Hua
Peiyang Shi
Chen Yang
Li Wang
Pengyuan Zhang
129
3
0
16 Oct 2024
Distillation of Discrete Diffusion through Dimensional Correlations
Satoshi Hayakawa
Yuhta Takida
Masaaki Imaizumi
Hiromi Wakaki
Yuki Mitsufuji
DiffM
106
4
0
11 Oct 2024
Recent Advances in Speech Language Models: A Survey
Wenqian Cui
Dianzhi Yu
Xiaoqi Jiao
Ziqiao Meng
Guangyan Zhang
Qichao Wang
Yiwen Guo
Irwin King
AuLLM
159
25
0
01 Oct 2024
A Simple but Strong Baseline for Sounding Video Generation: Effective Adaptation of Audio and Video Diffusion Models for Joint Generation
Masato Ishii
Akio Hayakawa
Takashi Shibuya
Yuki Mitsufuji
VGen
DiffM
120
4
0
26 Sep 2024
A Comprehensive Survey with Critical Analysis for Deepfake Speech Detection
Lam Pham
Phat Lam
Dat Tran
Hieu Tang
Tin Nguyen
Alexander Schindler
Canh Vu
Alexander Polonsky
97
5
0
23 Sep 2024
Improving Robustness of Diffusion-Based Zero-Shot Speech Synthesis via Stable Formant Generation
C. Han
Seokgi Lee
Gyuhyeon Nam
Gyeongsu Chae
DiffM
449
0
0
14 Sep 2024
Convergence of the denoising diffusion probabilistic models for general noise schedules
Yumiharu Nakano
DiffM
118
1
0
03 Jun 2024
Adversarial Attacks on Both Face Recognition and Face Anti-spoofing Models
Fengfan Zhou
Qianyu Zhou
Hefei Ling
Xuequan Lu
AAML
113
3
0
27 May 2024
Diffusion Bridge Implicit Models
Kaiwen Zheng
Guande He
Jianfei Chen
Fan Bao
Jun Zhu
DiffM
144
18
0
24 May 2024
LiteVAE: Lightweight and Efficient Variational Autoencoders for Latent Diffusion Models
Seyedmorteza Sadat
Jakob Buhmann
Derek Bradley
Otmar Hilliges
Romann M. Weber
112
9
0
23 May 2024
TerDiT: Ternary Diffusion Models with Transformers
Xudong Lu
Aojun Zhou
Ziyi Lin
Qi Liu
Yuhui Xu
Renrui Zhang
Yafei Wen
Shuai Ren
Peng Gao
Junchi Yan
MQ
92
3
0
23 May 2024
Image Super-Resolution with Text Prompt Diffusion
Zheng Chen
Yulun Zhang
Jinjin Gu
Xin Yuan
Linghe Kong
Guihai Chen
Xiaokang Yang
DiffM
113
20
0
24 Nov 2023
Diffusion Models with Deterministic Normalizing Flow Priors
Mohsen Zand
Ali Etemad
Michael A. Greenspan
DiffM
109
3
0
03 Sep 2023
AI-Generated Content (AIGC) for Various Data Modalities: A Survey
Lin Geng Foo
Hossein Rahmani
Jing Liu
190
31
0
27 Aug 2023
Preconditioned Score-based Generative Models
He Ma
Xiatian Zhu
Jianfeng Feng
DiffM
87
6
0
13 Feb 2023
DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models
Cheng Lu
Yuhao Zhou
Fan Bao
Jianfei Chen
Chongxuan Li
Jun Zhu
DiffM
171
609
0
02 Nov 2022
DiffWave: A Versatile Diffusion Model for Audio Synthesis
Zhifeng Kong
Ming-Yu Liu
Jiaji Huang
Kexin Zhao
Bryan Catanzaro
DiffM
BDL
158
1,468
0
21 Sep 2020
Learning Gradient Fields for Shape Generation
Ruojin Cai
Guandao Yang
Hadar Averbuch-Elor
Jinwei Gu
Serge J. Belongie
Noah Snavely
B. Hariharan
3DPC
112
286
0
14 Aug 2020
HooliGAN: Robust, High Quality Neural Vocoding
Ollie McCarthy
Zo Ahmed
72
14
0
06 Aug 2020
A Spectral Energy Distance for Parallel Speech Synthesis
A. Gritsenko
Tim Salimans
Rianne van den Berg
Jasper Snoek
Nal Kalchbrenner
45
70
0
03 Aug 2020
VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network
Jinhyeok Yang
Junmo Lee
Young-Ik Kim
Hoonyoung Cho
Injung Kim
62
73
0
30 Jul 2020
Denoising Diffusion Probabilistic Models
Jonathan Ho
Ajay Jain
Pieter Abbeel
DiffM
672
18,276
0
19 Jun 2020
Improved Techniques for Training Score-Based Generative Models
Yang Song
Stefano Ermon
DiffM
258
1,163
0
16 Jun 2020
WaveNODE: A Continuous Normalizing Flow for Speech Synthesis
Hyeongju Kim
Hyeongseung Lee
Woohyun Kang
Sung Jun Cheon
Byoung Jin Choi
N. Kim
49
12
0
08 Jun 2020
Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech
Geng Yang
Shan Yang
Kai-Chun Liu
Peng Fang
Wei Chen
Lei Xie
129
199
0
11 May 2020
Knowledge-and-Data-Driven Amplitude Spectrum Prediction for Hierarchical Neural Vocoders
Yang Ai
Zhenhua Ling
47
8
0
16 Apr 2020
Non-Autoregressive Machine Translation with Latent Alignments
Chitwan Saharia
William Chan
Saurabh Saxena
Mohammad Norouzi
61
159
0
16 Apr 2020
Imputer: Sequence Modelling via Imputation and Dynamic Programming
William Chan
Chitwan Saharia
Geoffrey E. Hinton
Mohammad Norouzi
Navdeep Jaitly
BDL
AI4TS
70
116
0
20 Feb 2020
Insertion-Deletion Transformer
Laura Ruis
Mitchell Stern
Julia Proskurnia
William Chan
55
9
0
15 Jan 2020
DDSP: Differentiable Digital Signal Processing
Jesse Engel
Lamtharn Hantrakul
Chenjie Gu
Adam Roberts
DiffM
172
381
0
14 Jan 2020
An Empirical Study of Generation Order for Machine Translation
William Chan
Mitchell Stern
J. Kiros
Jakob Uszkoreit
39
10
0
29 Oct 2019
Big Bidirectional Insertion Representations for Documents
Lala Li
William Chan
33
4
0
29 Oct 2019
Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim
60
818
0
25 Oct 2019
Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis
Eric Battenberg
RJ Skerry-Ryan
Soroosh Mariooryad
Daisy Stanton
David Kao
Matt Shannon
Tom Bagby
77
114
0
23 Oct 2019
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
Kundan Kumar
Rithesh Kumar
T. Boissière
L. Gestin
Wei Zhen Teoh
Jose M. R. Sotelo
A. D. Brébisson
Yoshua Bengio
Aaron Courville
GAN
165
956
0
08 Oct 2019
High Fidelity Speech Synthesis with Adversarial Networks
Mikolaj Binkowski
Jeff Donahue
Sander Dieleman
Aidan Clark
Erich Elsen
Norman Casagrande
Luis C. Cobo
Karen Simonyan
283
240
0
25 Sep 2019
Generative Modeling by Estimating Gradients of the Data Distribution
Yang Song
Stefano Ermon
SyDa
DiffM
258
3,954
0
12 Jul 2019
KERMIT: Generative Insertion-Based Modeling for Sequences
William Chan
Nikita Kitaev
Kelvin Guu
Mitchell Stern
Jakob Uszkoreit
VLM
62
65
0
04 Jun 2019
MelNet: A Generative Model for Audio in the Frequency Domain
Sean Vasquez
M. Lewis
DiffM
71
131
0
04 Jun 2019
Levenshtein Transformer
Jiatao Gu
Changhan Wang
Jake Zhao
121
359
0
27 May 2019
Sliced Score Matching: A Scalable Approach to Density and Score Estimation
Yang Song
Sahaj Garg
Jiaxin Shi
Stefano Ermon
115
418
0
17 May 2019
Neural source-filter waveform models for statistical parametric speech synthesis
Xin Wang
Shinji Takaki
Junichi Yamagishi
77
118
0
27 Apr 2019
Direct speech-to-speech translation with a sequence-to-sequence model
Ye Jia
Ron J. Weiss
Fadi Biadsy
Wolfgang Macherey
Melvin Johnson
Zhiwen Chen
Yonghui Wu
85
229
0
12 Apr 2019