2105.06337
Cited By
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech
13 May 2021
Vadim Popov
Ivan Vovk
Vladimir Gogoryan
Tasnima Sadekova
Mikhail Kudinov
DiffM
Papers citing
"Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech"
50 / 352 papers shown
Title
DiffAugment: Diffusion based Long-Tailed Visual Relationship Recognition
Parul Gupta
Tuan Nguyen
Abhinav Dhall
Munawar Hayat
Trung Le
Thanh-Toan Do
36
0
0
01 Jan 2024
Adapt & Align: Continual Learning with Generative Models Latent Space Alignment
Kamil Deja
Bartosz Cywiński
Jan Rybarczyk
Tomasz Trzciński
CLL
DRL
23
0
0
21 Dec 2023
Diffusion Models With Learned Adaptive Noise
S. Sahoo
Aaron Gokaslan
Christopher De Sa
Volodymyr Kuleshov
DiffM
34
8
0
20 Dec 2023
Optimizing Diffusion Noise Can Serve As Universal Motion Priors
Korrawe Karunratanakul
Konpat Preechakul
Emre Aksan
Thabo Beeler
Supasorn Suwajanakorn
Siyu Tang
DiffM
31
37
0
19 Dec 2023
Towards Detailed Text-to-Motion Synthesis via Basic-to-Advanced Hierarchical Diffusion Model
Zhenyu Xie
Yang Wu
Xuehao Gao
Zhongqian Sun
Wei Yang
Xiaodan Liang
DiffM
29
11
0
18 Dec 2023
A Note on the Convergence of Denoising Diffusion Probabilistic Models
S. Mbacke
Omar Rivasplata
DiffM
21
5
0
10 Dec 2023
Investigating the Design Space of Diffusion Models for Speech Enhancement
Philippe Gonzalez
Zheng-Hua Tan
Jan Østergaard
Jesper Jensen
T. S. Alstrøm
Tobias May
DiffM
30
6
0
07 Dec 2023
DiffusionSat: A Generative Foundation Model for Satellite Imagery
Samar Khanna
Patrick Liu
Linqi Zhou
Chenlin Meng
Robin Rombach
Marshall Burke
David B. Lobell
Stefano Ermon
26
57
0
06 Dec 2023
Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis
Zehua Chen
Guande He
Kaiwen Zheng
Xu Tan
Jun Zhu
DiffM
56
21
0
06 Dec 2023
Analyzing and Improving the Training Dynamics of Diffusion Models
Tero Karras
M. Aittala
J. Lehtinen
Janne Hellsten
Timo Aila
S. Laine
42
155
0
05 Dec 2023
Diffusion-Based Speech Enhancement in Matched and Mismatched Conditions Using a Heun-Based Sampler
Philippe Gonzalez
Zheng-Hua Tan
Jan Østergaard
Jesper Jensen
T. S. Alstrøm
Tobias May
DiffM
24
4
0
05 Dec 2023
DeepCache: Accelerating Diffusion Models for Free
Xinyin Ma
Gongfan Fang
Xinchao Wang
24
122
0
01 Dec 2023
Elijah: Eliminating Backdoors Injected in Diffusion Models via Distribution Shift
Shengwei An
Sheng-Yen Chou
Kaiyuan Zhang
Qiuling Xu
Guanhong Tao
...
Shuyang Cheng
Shiqing Ma
Pin-Yu Chen
Tsung-Yi Ho
Xiangyu Zhang
DiffM
AAML
33
28
0
27 Nov 2023
ACT-Diffusion: Efficient Adversarial Consistency Training for One-step Diffusion Models
Fei Kong
Jinhao Duan
Lichao Sun
Hao-Ran Cheng
Renjing Xu
Hengtao Shen
Xiao-lan Zhu
Xiaoshuang Shi
Kaidi Xu
44
3
0
23 Nov 2023
HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis
Sang-Hoon Lee
Haram Choi
Seung-Bin Kim
Seong-Whan Lee
BDL
32
31
0
21 Nov 2023
Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation
Haram Choi
Sang-Hoon Lee
Seong-Whan Lee
DiffM
21
24
0
08 Nov 2023
Transduce and Speak: Neural Transducer for Text-to-Speech with Semantic Token Prediction
Minchan Kim
Myeonghun Jeong
Byoung Jin Choi
Dongjune Lee
N. Kim
AI4TS
30
10
0
06 Nov 2023
E3 TTS: Easy End-to-End Diffusion-based Text to Speech
Yuan Gao
Nobuyuki Morioka
Yu Zhang
Nanxin Chen
DiffM
26
27
0
02 Nov 2023
Gaussian Mixture Solvers for Diffusion Models
Hanzhong Guo
Cheng Lu
Fan Bao
Tianyu Pang
Shuicheng Yan
Chao Du
Chongxuan Li
30
9
0
02 Nov 2023
Seeing Through the Conversation: Audio-Visual Speech Separation based on Diffusion Model
Suyeon Lee
Chaeyoung Jung
Youngjoon Jang
Jaehun Kim
Joon Son Chung
33
7
0
30 Oct 2023
Controllable Group Choreography using Contrastive Diffusion
Nhat Le
Tuong Khanh Long Do
Khoa Do
Hien Nguyen
Erman Tjiputra
Quang-Dieu Tran
Anh Nguyen
45
10
0
29 Oct 2023
Successfully Applying Lottery Ticket Hypothesis to Diffusion Model
Chao Jiang
Bo Hui
Bohan Liu
Da Yan
DiffM
40
14
0
28 Oct 2023
DiffS2UT: A Semantic Preserving Diffusion Model for Textless Direct Speech-to-Speech Translation
Yongxin Zhu
Zhujin Gao
Xinyuan Zhou
Zhongyi Ye
Linli Xu
26
2
0
26 Oct 2023
Energy-Based Models For Speech Synthesis
Wanli Sun
Zehai Tu
Anton Ragni
DiffM
24
0
0
19 Oct 2023
Generation or Replication: Auscultating Audio Latent Diffusion Models
Dimitrios Bralios
G. Wichern
François Germain
Zexu Pan
Sameer Khurana
Chiori Hori
Jonathan Le Roux
DiffM
27
6
0
16 Oct 2023
Neural Diffusion Models
Grigory Bartosh
Dmitry Vetrov
C. A. Naesseth
DiffM
24
6
0
12 Oct 2023
DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation
Qingkai Fang
Yan Zhou
Yangzhou Feng
40
6
0
11 Oct 2023
Improving End-to-End Speech Processing by Efficient Text Data Utilization with Latent Synthesis
Jianqiao Lu
Wenyong Huang
Nianzu Zheng
Xingshan Zeng
Y. Yeung
Xiao Chen
SyDa
24
1
0
09 Oct 2023
Unified speech and gesture synthesis using flow matching
Shivam Mehta
Ruibo Tu
Simon Alexanderson
Jonas Beskow
Éva Székely
G. Henter
24
3
0
08 Oct 2023
U-Style: Cascading U-nets with Multi-level Speaker and Style Modeling for Zero-Shot Voice Cloning
Tao Li
Zhichao Wang
Xinfa Zhu
Jian Cong
Qiao Tian
Yuping Wang
Lei Xie
DiffM
33
3
0
06 Oct 2023
MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation
Yuan Zhong
Suhan Cui
Jiaqi Wang
Xiaochen Wang
Ziyi Yin
Yaqing Wang
Houping Xiao
Mengdi Huai
Ting Wang
Fenglong Ma
MedIm
30
3
0
04 Oct 2023
DiffAR: Denoising Diffusion Autoregressive Model for Raw Speech Waveform Generation
Roi Benita
Michael Elad
Joseph Keshet
DiffM
25
7
0
02 Oct 2023
Navigating the Design Space of Equivariant Diffusion-Based Generative Models for De Novo 3D Molecule Generation
Tuan Le
Julian Cremer
Frank Noé
Djork-Arné Clevert
Kristof T. Schütt
DiffM
29
25
0
29 Sep 2023
Advances in Kidney Biopsy Lesion Assessment through Dense Instance Segmentation
Zhan Xiong
Junling He
Pieter Valkema
Tri Q. Nguyen
M. Naesens
J. Kers
F. Verbeek
MedIm
22
0
0
29 Sep 2023
ReFlow-TTS: A Rectified Flow Model for High-fidelity Text-to-Speech
Wenhao Guan
Qi Su
Haodong Zhou
Shiyu Miao
Xingjia Xie
Lin Li
Q. Hong
DiffM
20
13
0
29 Sep 2023
Deep Networks as Denoising Algorithms: Sample-Efficient Learning of Diffusion Models in High-Dimensional Graphical Models
Song Mei
Yuchen Wu
DiffM
31
26
0
20 Sep 2023
EMOCONV-DIFF: Diffusion-based Speech Emotion Conversion for Non-parallel and In-the-wild Data
N. Prabhu
Bunlong Lay
Simon Welker
N. Lehmann-Willenbrock
Timo Gerkmann
DiffM
21
3
0
14 Sep 2023
Diff-SV: A Unified Hierarchical Framework for Noise-Robust Speaker Verification Using Score-Based Diffusion Probabilistic Models
Ju-ho Kim
Ju-Sung Heo
Hyun-Seo Shin
Chanmann Lim
Ha-Jin Yu
DiffM
19
2
0
14 Sep 2023
DCTTS: Discrete Diffusion Model with Contrastive Learning for Text-to-speech Generation
Zhichao Wu
Qiulin Li
Sixing Liu
Qun Yang
24
3
0
13 Sep 2023
Distinguishing Neural Speech Synthesis Models Through Fingerprints in Speech Waveforms
Chu Yuan Zhang
Jiangyan Yi
Jianhua Tao
Chenglong Wang
Xinrui Yan
15
2
0
13 Sep 2023
VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching
Yiwei Guo
Chenpeng Du
Ziyang Ma
Xie Chen
K. Yu
DiffM
30
36
0
10 Sep 2023
Cross-Utterance Conditioned VAE for Speech Generation
Yongqian Li
Cheng Yu
Guangzhi Sun
Weiqin Zu
Zheng Tian
...
Wei Pan
Chao Zhang
Jun Wang
Yang Yang
Fanglei Sun
18
2
0
08 Sep 2023
Highly Controllable Diffusion-based Any-to-Any Voice Conversion Model with Frame-level Prosody Feature
Kyungguen Byun
Sunkuk Moon
Erik Visser
DiffM
32
0
0
06 Sep 2023
Matcha-TTS: A fast TTS architecture with conditional flow matching
Shivam Mehta
Ruibo Tu
Jonas Beskow
Éva Székely
G. Henter
24
69
0
06 Sep 2023
PromptTTS 2: Describing and Generating Voices with Text Prompt
Yichong Leng
Zhifang Guo
Kai Shen
Xu Tan
Zeqian Ju
...
Lei He
Xiang-Yang Li
Sheng Zhao
Tao Qin
Jiang Bian
VLM
DiffM
44
40
0
05 Sep 2023
Bridge Diffusion Model: bridge non-English language-native text-to-image diffusion model with English communities
Shanyuan Liu
Dawei Leng
Yuhui Yin
DiffM
24
7
0
02 Sep 2023
DiCLET-TTS: Diffusion Model based Cross-lingual Emotion Transfer for Text-to-Speech -- A Study between English and Mandarin
Tao Li
Chenxu Hu
Jian Cong
Xinfa Zhu
Jingbei Li
Qiao Tian
Yuping Wang
Linfu Xie
DiffM
38
8
0
02 Sep 2023
LightGrad: Lightweight Diffusion Probabilistic Model for Text-to-Speech
Jing Chen
Xingcheng Song
Zhendong Peng
Binbin Zhang
Fuping Pan
Zhiyong Wu
DiffM
19
16
0
31 Aug 2023
Let There Be Sound: Reconstructing High Quality Speech from Silent Videos
Ji-Hoon Kim
Jaehun Kim
Joon Son Chung
30
5
0
29 Aug 2023
DiffusionVMR: Diffusion Model for Joint Video Moment Retrieval and Highlight Detection
Henghao Zhao
Kevin Qinghong Lin
Rui Yan
Zechao Li
VGen
DiffM
37
1
0
29 Aug 2023