ResearchTrend.AI

Zero-Shot Voice Conditioning for Denoising Diffusion TTS Models
arXiv: 2206.02246

5 June 2022
Alon Levkovitch
Eliya Nachmani
Lior Wolf
    DiffM

Papers citing "Zero-Shot Voice Conditioning for Denoising Diffusion TTS Models"

22 / 22 papers shown
VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing
Chunyu Qiang
Wang Geng
Yi Zhao
Ruibo Fu
Tao Wang
...
Chen Zhang
Hao Che
L. Wang
Jianwu Dang
J. Tao
AI4TS
62
0
0
11 Aug 2024
CHAMP: Conformalized 3D Human Multi-Hypothesis Pose Estimators
Harry Zhang
Luca Carlone
3DH
167
1
0
27 May 2024
TGAVC: Improving Autoencoder Voice Conversion with Text-Guided and Adversarial Training
Huaizhen Tang
Xulong Zhang
Jianzong Wang
Ning Cheng
Zhen Zeng
Edward Xiao
Jing Xiao
45
20
0
08 Aug 2022
DGC-vector: A new speaker embedding for zero-shot voice conversion
Ruitong Xiao
Haitong Zhang
Yue Lin
42
12
0
18 Mar 2022
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
Edresson Casanova
Julian Weber
C. Shulby
Arnaldo Cândido Júnior
Eren Golge
M. Ponti
217
391
0
04 Dec 2021
ILVR: Conditioning Method for Denoising Diffusion Probabilistic Models
Jooyoung Choi
Sungwon Kim
Yonghyun Jeong
Youngjune Gwon
Sungroh Yoon
DiffM
114
706
0
06 Aug 2021
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
Yinghao Aaron Li
A. Zare
N. Mesgarani
68
100
0
21 Jul 2021
StarGAN-ZSVC: Towards Zero-Shot Voice Conversion in Low-Resource Contexts
Matthew Baas
Herman Kamper
46
6
0
31 May 2021
Cascaded Diffusion Models for High Fidelity Image Generation
Jonathan Ho
Chitwan Saharia
William Chan
David J. Fleet
Mohammad Norouzi
Tim Salimans
132
1,196
0
30 May 2021
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech
Vadim Popov
Ivan Vovk
Vladimir Gogoryan
Tasnima Sadekova
Mikhail Kudinov
DiffM
88
526
0
13 May 2021
Diffusion Models Beat GANs on Image Synthesis
Prafulla Dhariwal
Alex Nichol
150
7,639
0
11 May 2021
Improving Zero-shot Voice Style Transfer via Disentangled Representation Learning
Siyang Yuan
Pengyu Cheng
Ruiyi Zhang
Weituo Hao
Zhe Gan
Lawrence Carin
DRL
42
61
0
17 Mar 2021
Score-Based Generative Modeling through Stochastic Differential Equations
Yang Song
Jascha Narain Sohl-Dickstein
Diederik P. Kingma
Abhishek Kumar
Stefano Ermon
Ben Poole
DiffM
SyDa
279
6,293
0
26 Nov 2020
VoiceGrad: Non-Parallel Any-to-Many Voice Conversion with Annealed Langevin Dynamics
Hirokazu Kameoka
Takuhiro Kaneko
Kou Tanaka
Nobukatsu Hojo
Shogo Seki
DiffM
48
21
0
06 Oct 2020
DiffWave: A Versatile Diffusion Model for Audio Synthesis
Zhifeng Kong
Ming-Yu Liu
Jiaji Huang
Kexin Zhao
Bryan Catanzaro
DiffM
BDL
97
1,429
0
21 Sep 2020
WaveGrad: Estimating Gradients for Waveform Generation
Nanxin Chen
Yu Zhang
Heiga Zen
Ron J. Weiss
Mohammad Norouzi
William Chan
DiffM
BDL
55
787
0
02 Sep 2020
Voice Conversion Challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion
Yi Zhao
Wen-Chin Huang
Xiaohai Tian
Junichi Yamagishi
Rohan Kumar Das
Tomi Kinnunen
Zhenhua Ling
Tomoki Toda
58
206
0
28 Aug 2020
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
Jaehyeon Kim
Sungwon Kim
Jungil Kong
Sungroh Yoon
79
489
0
22 May 2020
StarGAN-VC2: Rethinking Conditional Methods for StarGAN-Based Voice Conversion
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Nobukatsu Hojo
61
140
0
29 Jul 2019
AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
Kaizhi Qian
Yang Zhang
Shiyu Chang
Xuesong Yang
M. Hasegawa-Johnson
64
461
0
14 May 2019
TTS Skins: Speaker Conversion via ASR
Adam Polyak
Lior Wolf
Yaniv Taigman
35
27
0
18 Apr 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
Heiga Zen
Viet Dang
R. Clark
Yu Zhang
Ron J. Weiss
Ye Jia
Zhiwen Chen
Yonghui Wu
80
933
0
05 Apr 2019