2305.15816
Cited By
DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Robust Voice Conversion
25 May 2023
Haram Choi
Sang-Hoon Lee
Seong-Whan Lee
DiffM
Papers citing
"DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Robust Voice Conversion"
26 / 26 papers shown
Title
Mitigating Timbre Leakage with Universal Semantic Mapping Residual Block for Voice Conversion
Na Li
Chuke Wang
Yu Gu
Zhifeng Li
54
0
0
11 Apr 2025
MoEdit: On Learning Quantity Perception for Multi-object Image Editing
Yanfeng Li
Kahou Chan
Yue Sun
C. Lam
Tong Tong
Zitong Yu
Keren Fu
Xiaohong Liu
Tao Tan
DiffM
41
0
0
13 Mar 2025
Less is More for Synthetic Speech Detection in the Wild
Ashi Garg
Zexin Cai
Henry Li Xinyuan
Leibny Paola García-Perera
Kevin Duh
Sanjeev Khudanpur
Matthew Wiesner
Nicholas Andrews
74
0
0
17 Feb 2025
Emotion Recognition and Generation: A Comprehensive Review of Face, Speech, and Text Modalities
Rebecca Mobbs
Dimitrios Makris
Vasileios Argyriou
43
0
0
02 Feb 2025
VoicePrompter: Robust Zero-Shot Voice Conversion with Voice Prompt and Conditional Flow Matching
Ha-Yeong Choi
Jaehan Park
39
0
0
29 Jan 2025
Zero-shot Voice Conversion with Diffusion Transformers
Songting Liu
37
2
0
15 Nov 2024
VoiceWukong: Benchmarking Deepfake Voice Detection
Ziwei Yan
Yanjie Zhao
Haoyu Wang
34
1
0
10 Sep 2024
vec2wav 2.0: Advancing Voice Conversion via Discrete Token Vocoders
Yiwei Guo
Zhihan Li
Junjie Li
Chenpeng Du
Hankun Wang
Shuai Wang
Xie Chen
Kai Yu
35
0
0
03 Sep 2024
Seeing Your Speech Style: A Novel Zero-Shot Identity-Disentanglement Face-based Voice Conversion
Yan Rong
Li Liu
26
3
0
01 Sep 2024
Accelerating High-Fidelity Waveform Generation via Adversarial Flow Matching Optimization
Sang-Hoon Lee
Ha-Yeong Choi
Seong-Whan Lee
AI4TS
29
1
0
15 Aug 2024
Bailing-TTS: Chinese Dialectal Speech Synthesis Towards Human-like Spontaneous Representation
Xinhan Di
Jiahao Lu
Yunming Liang
Junjie Zheng
Yihua Wang
Chaofan Ding
ALM
35
1
0
01 Aug 2024
Vec-Tok-VC+: Residual-enhanced Robust Zero-shot Voice Conversion with Progressive Constraints in a Dual-mode Training Strategy
Linhan Ma
Xinfa Zhu
Yuanjun Lv
Zhichao Wang
Ziqian Wang
Wendi He
Hongbin Zhou
Lei Xie
42
2
0
14 Jun 2024
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models
Philip Anastassiou
Jiawei Chen
J. Chen
Yuanzhe Chen
Zhuo Chen
...
Wenjie Zhang
Yuhang Zhang
Zilin Zhao
Dejian Zhong
Xiaobin Zhuang
49
77
0
04 Jun 2024
FlashSpeech: Efficient Zero-Shot Speech Synthesis
Zhen Ye
Zeqian Ju
Haohe Liu
Xu Tan
Jianyi Chen
...
Weizhen Bian
Shulin He
Qi-fei Liu
Yi-Ting Guo
Wei Xue
38
16
0
23 Apr 2024
VoiceShop: A Unified Speech-to-Speech Framework for Identity-Preserving Zero-Shot Voice Editing
Philip Anastassiou
Zhenyu Tang
Kainan Peng
Dongya Jia
Jiaxin Li
Ming Tu
Yuping Wang
Yuxuan Wang
Mingbo Ma
42
4
0
10 Apr 2024
M³AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset
Zhe Chen
Heyang Liu
Wenyi Yu
Guangzhi Sun
Hongcheng Liu
Ji Wu
Chao Zhang
Yu Wang
Yanfeng Wang
VGen
49
1
0
21 Mar 2024
HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis
Sang-Hoon Lee
Haram Choi
Seung-Bin Kim
Seong-Whan Lee
BDL
27
31
0
21 Nov 2023
Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation
Haram Choi
Sang-Hoon Lee
Seong-Whan Lee
DiffM
21
24
0
08 Nov 2023
HierVST: Hierarchical Adaptive Zero-shot Voice Style Transfer
Sang-Hoon Lee
Haram Choi
H. Oh
Seong-Whan Lee
BDL
28
9
0
30 Jul 2023
HiddenSinger: High-Quality Singing Voice Synthesis via Neural Audio Codec and Latent Diffusion Models
Ji-Sang Hwang
Sang-Hoon Lee
Seong-Whan Lee
DiffM
35
8
0
12 Jun 2023
On the Design Fundamentals of Diffusion Models: A Survey
Ziyi Chang
G. Koulieris
Hubert P. H. Shum
DiffM
29
53
0
07 Jun 2023
Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data
Sungwon Kim
Heeseung Kim
Sung-Hoon Yoon
DiffM
196
52
0
30 May 2022
RePaint: Inpainting using Denoising Diffusion Probabilistic Models
Andreas Lugmayr
Martin Danelljan
Andrés Romero
F. I. F. Richard Yu
Radu Timofte
Luc Van Gool
DiffM
218
1,355
0
24 Jan 2022
Palette: Image-to-Image Diffusion Models
Chitwan Saharia
William Chan
Huiwen Chang
Chris A. Lee
Jonathan Ho
Tim Salimans
David J. Fleet
Mohammad Norouzi
DiffM
VLM
342
1,591
0
10 Nov 2021
Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised Discrete Speech Representations
Wen-Chin Huang
Yi-Chiao Wu
Tomoki Hayashi
T. Toda
BDL
41
37
0
23 Oct 2020
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
224
2,234
0
14 Jun 2018