Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech

13 May 2021
Vadim Popov
Ivan Vovk
Vladimir Gogoryan
Tasnima Sadekova
Mikhail Kudinov
    DiffM

Papers citing "Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech"

50 / 352 papers shown
DiffAugment: Diffusion based Long-Tailed Visual Relationship Recognition
Parul Gupta
Tuan Nguyen
Abhinav Dhall
Munawar Hayat
Trung Le
Thanh-Toan Do
36
0
0
01 Jan 2024
Adapt & Align: Continual Learning with Generative Models Latent Space Alignment
Kamil Deja
Bartosz Cywiński
Jan Rybarczyk
Tomasz Trzciński
CLL
DRL
23
0
0
21 Dec 2023
Diffusion Models With Learned Adaptive Noise
S. Sahoo
Aaron Gokaslan
Christopher De Sa
Volodymyr Kuleshov
DiffM
34
8
0
20 Dec 2023
Optimizing Diffusion Noise Can Serve As Universal Motion Priors
Korrawe Karunratanakul
Konpat Preechakul
Emre Aksan
Thabo Beeler
Supasorn Suwajanakorn
Siyu Tang
DiffM
31
37
0
19 Dec 2023
Towards Detailed Text-to-Motion Synthesis via Basic-to-Advanced Hierarchical Diffusion Model
Zhenyu Xie
Yang Wu
Xuehao Gao
Zhongqian Sun
Wei Yang
Xiaodan Liang
DiffM
29
11
0
18 Dec 2023
A Note on the Convergence of Denoising Diffusion Probabilistic Models
S. Mbacke
Omar Rivasplata
DiffM
21
5
0
10 Dec 2023
Investigating the Design Space of Diffusion Models for Speech Enhancement
Philippe Gonzalez
Zheng-Hua Tan
Jan Østergaard
Jesper Jensen
T. S. Alstrøm
Tobias May
DiffM
30
6
0
07 Dec 2023
DiffusionSat: A Generative Foundation Model for Satellite Imagery
Samar Khanna
Patrick Liu
Linqi Zhou
Chenlin Meng
Robin Rombach
Marshall Burke
David B. Lobell
Stefano Ermon
26
57
0
06 Dec 2023
Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis
Zehua Chen
Guande He
Kaiwen Zheng
Xu Tan
Jun Zhu
DiffM
56
21
0
06 Dec 2023
Analyzing and Improving the Training Dynamics of Diffusion Models
Tero Karras
M. Aittala
J. Lehtinen
Janne Hellsten
Timo Aila
S. Laine
42
155
0
05 Dec 2023
Diffusion-Based Speech Enhancement in Matched and Mismatched Conditions Using a Heun-Based Sampler
Philippe Gonzalez
Zheng-Hua Tan
Jan Østergaard
Jesper Jensen
T. S. Alstrøm
Tobias May
DiffM
24
4
0
05 Dec 2023
DeepCache: Accelerating Diffusion Models for Free
Xinyin Ma
Gongfan Fang
Xinchao Wang
24
122
0
01 Dec 2023
Elijah: Eliminating Backdoors Injected in Diffusion Models via Distribution Shift
Shengwei An
Sheng-Yen Chou
Kaiyuan Zhang
Qiuling Xu
Guanhong Tao
...
Shuyang Cheng
Shiqing Ma
Pin-Yu Chen
Tsung-Yi Ho
Xiangyu Zhang
DiffM
AAML
33
28
0
27 Nov 2023
ACT-Diffusion: Efficient Adversarial Consistency Training for One-step Diffusion Models
Fei Kong
Jinhao Duan
Lichao Sun
Hao-Ran Cheng
Renjing Xu
Hengtao Shen
Xiao-lan Zhu
Xiaoshuang Shi
Kaidi Xu
44
3
0
23 Nov 2023
HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis
Sang-Hoon Lee
Haram Choi
Seung-Bin Kim
Seong-Whan Lee
BDL
32
31
0
21 Nov 2023
Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation
Haram Choi
Sang-Hoon Lee
Seong-Whan Lee
DiffM
21
24
0
08 Nov 2023
Transduce and Speak: Neural Transducer for Text-to-Speech with Semantic Token Prediction
Minchan Kim
Myeonghun Jeong
Byoung Jin Choi
Dongjune Lee
N. Kim
AI4TS
30
10
0
06 Nov 2023
E3 TTS: Easy End-to-End Diffusion-based Text to Speech
Yuan Gao
Nobuyuki Morioka
Yu Zhang
Nanxin Chen
DiffM
26
27
0
02 Nov 2023
Gaussian Mixture Solvers for Diffusion Models
Hanzhong Guo
Cheng Lu
Fan Bao
Tianyu Pang
Shuicheng Yan
Chao Du
Chongxuan Li
30
9
0
02 Nov 2023
Seeing Through the Conversation: Audio-Visual Speech Separation based on Diffusion Model
Suyeon Lee
Chaeyoung Jung
Youngjoon Jang
Jaehun Kim
Joon Son Chung
33
7
0
30 Oct 2023
Controllable Group Choreography using Contrastive Diffusion
Nhat Le
Tuong Khanh Long Do
Khoa Do
Hien Nguyen
Erman Tjiputra
Quang-Dieu Tran
Anh Nguyen
45
10
0
29 Oct 2023
Successfully Applying Lottery Ticket Hypothesis to Diffusion Model
Chao Jiang
Bo Hui
Bohan Liu
Da Yan
DiffM
40
14
0
28 Oct 2023
DiffS2UT: A Semantic Preserving Diffusion Model for Textless Direct Speech-to-Speech Translation
Yongxin Zhu
Zhujin Gao
Xinyuan Zhou
Zhongyi Ye
Linli Xu
26
2
0
26 Oct 2023
Energy-Based Models For Speech Synthesis
Wanli Sun
Zehai Tu
Anton Ragni
DiffM
24
0
0
19 Oct 2023
Generation or Replication: Auscultating Audio Latent Diffusion Models
Dimitrios Bralios
G. Wichern
François Germain
Zexu Pan
Sameer Khurana
Chiori Hori
Jonathan Le Roux
DiffM
27
6
0
16 Oct 2023
Neural Diffusion Models
Grigory Bartosh
Dmitry Vetrov
C. A. Naesseth
DiffM
24
6
0
12 Oct 2023
DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation
Qingkai Fang
Yan Zhou
Yangzhou Feng
40
6
0
11 Oct 2023
Improving End-to-End Speech Processing by Efficient Text Data Utilization with Latent Synthesis
Jianqiao Lu
Wenyong Huang
Nianzu Zheng
Xingshan Zeng
Y. Yeung
Xiao Chen
SyDa
24
1
0
09 Oct 2023
Unified speech and gesture synthesis using flow matching
Shivam Mehta
Ruibo Tu
Simon Alexanderson
Jonas Beskow
Éva Székely
G. Henter
24
3
0
08 Oct 2023
U-Style: Cascading U-nets with Multi-level Speaker and Style Modeling for Zero-Shot Voice Cloning
Tao Li
Zhichao Wang
Xinfa Zhu
Jian Cong
Qiao Tian
Yuping Wang
Lei Xie
DiffM
33
3
0
06 Oct 2023
MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation
Yuan Zhong
Suhan Cui
Jiaqi Wang
Xiaochen Wang
Ziyi Yin
Yaqing Wang
Houping Xiao
Mengdi Huai
Ting Wang
Fenglong Ma
MedIm
30
3
0
04 Oct 2023
DiffAR: Denoising Diffusion Autoregressive Model for Raw Speech Waveform Generation
Roi Benita
Michael Elad
Joseph Keshet
DiffM
25
7
0
02 Oct 2023
Navigating the Design Space of Equivariant Diffusion-Based Generative Models for De Novo 3D Molecule Generation
Tuan Le
Julian Cremer
Frank Noé
Djork-Arné Clevert
Kristof T. Schütt
DiffM
29
25
0
29 Sep 2023
Advances in Kidney Biopsy Lesion Assessment through Dense Instance Segmentation
Zhan Xiong
Junling He
Pieter Valkema
Tri Q. Nguyen
M. Naesens
J. Kers
F. Verbeek
MedIm
22
0
0
29 Sep 2023
ReFlow-TTS: A Rectified Flow Model for High-fidelity Text-to-Speech
Wenhao Guan
Qi Su
Haodong Zhou
Shiyu Miao
Xingjia Xie
Lin Li
Q. Hong
DiffM
20
13
0
29 Sep 2023
Deep Networks as Denoising Algorithms: Sample-Efficient Learning of Diffusion Models in High-Dimensional Graphical Models
Song Mei
Yuchen Wu
DiffM
31
26
0
20 Sep 2023
EMOCONV-DIFF: Diffusion-based Speech Emotion Conversion for Non-parallel and In-the-wild Data
N. Prabhu
Bunlong Lay
Simon Welker
N. Lehmann-Willenbrock
Timo Gerkmann
DiffM
21
3
0
14 Sep 2023
Diff-SV: A Unified Hierarchical Framework for Noise-Robust Speaker Verification Using Score-Based Diffusion Probabilistic Models
Ju-ho Kim
Ju-Sung Heo
Hyun-Seo Shin
Chanmann Lim
Ha-Jin Yu
DiffM
19
2
0
14 Sep 2023
DCTTS: Discrete Diffusion Model with Contrastive Learning for Text-to-speech Generation
Zhichao Wu
Qiulin Li
Sixing Liu
Qun Yang
24
3
0
13 Sep 2023
Distinguishing Neural Speech Synthesis Models Through Fingerprints in Speech Waveforms
Chu Yuan Zhang
Jiangyan Yi
Jianhua Tao
Chenglong Wang
Xinrui Yan
15
2
0
13 Sep 2023
VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching
Yiwei Guo
Chenpeng Du
Ziyang Ma
Xie Chen
K. Yu
DiffM
30
36
0
10 Sep 2023
Cross-Utterance Conditioned VAE for Speech Generation
Yongqian Li
Cheng Yu
Guangzhi Sun
Weiqin Zu
Zheng Tian
...
Wei Pan
Chao Zhang
Jun Wang
Yang Yang
Fanglei Sun
18
2
0
08 Sep 2023
Highly Controllable Diffusion-based Any-to-Any Voice Conversion Model with Frame-level Prosody Feature
Kyungguen Byun
Sunkuk Moon
Erik Visser
DiffM
32
0
0
06 Sep 2023
Matcha-TTS: A fast TTS architecture with conditional flow matching
Shivam Mehta
Ruibo Tu
Jonas Beskow
Éva Székely
G. Henter
24
69
0
06 Sep 2023
PromptTTS 2: Describing and Generating Voices with Text Prompt
Yichong Leng
Zhifang Guo
Kai Shen
Xu Tan
Zeqian Ju
...
Lei He
Xiang-Yang Li
Sheng Zhao
Tao Qin
Jiang Bian
VLM
DiffM
44
40
0
05 Sep 2023
Bridge Diffusion Model: bridge non-English language-native text-to-image diffusion model with English communities
Shanyuan Liu
Dawei Leng
Yuhui Yin
DiffM
24
7
0
02 Sep 2023
DiCLET-TTS: Diffusion Model based Cross-lingual Emotion Transfer for Text-to-Speech -- A Study between English and Mandarin
Tao Li
Chenxu Hu
Jian Cong
Xinfa Zhu
Jingbei Li
Qiao Tian
Yuping Wang
Linfu Xie
DiffM
38
8
0
02 Sep 2023
LightGrad: Lightweight Diffusion Probabilistic Model for Text-to-Speech
Jing Chen
Xingcheng Song
Zhendong Peng
Binbin Zhang
Fuping Pan
Zhiyong Wu
DiffM
19
16
0
31 Aug 2023
Let There Be Sound: Reconstructing High Quality Speech from Silent Videos
Ji-Hoon Kim
Jaehun Kim
Joon Son Chung
30
5
0
29 Aug 2023
DiffusionVMR: Diffusion Model for Joint Video Moment Retrieval and Highlight Detection
Henghao Zhao
Kevin Qinghong Lin
Rui Yan
Zechao Li
VGen
DiffM
37
1
0
29 Aug 2023