ResearchTrend.AI

DelightfulTTS 2: End-to-End Speech Synthesis with Adversarial Vector-Quantized Auto-Encoders

11 July 2022
Yanqing Liu
Rui Xue
Lei He
Xu Tan
Sheng Zhao

Papers citing "DelightfulTTS 2: End-to-End Speech Synthesis with Adversarial Vector-Quantized Auto-Encoders"

18 / 18 papers shown
Zero-Shot Text-to-Speech for Vietnamese
Thi Vu
L. T. Nguyen
Dat Quoc Nguyen
67
0
0
02 Jun 2025
Continuous Speech Tokens Makes LLMs Robust Multi-Modality Learners
Ze Yuan
Yanqing Liu
Shujie Liu
Sheng Zhao
AuLLM
141
2
0
06 Dec 2024
A Survey of Deep Learning Audio Generation Methods
Matej Bozic
Marko Horvat
VLM MedIm
104
2
0
31 May 2024
Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data
Takaaki Saeki
Gary Wang
Nobuyuki Morioka
Isaac Elias
Kyle Kastner
...
Andrew Rosenberg
Bhuvana Ramabhadran
Heiga Zen
Francoise Beaufays
Hadar Shemtov
104
14
0
29 Feb 2024
Generative Adversarial Training for Text-to-Speech Synthesis Based on Raw Phonetic Input and Explicit Prosody Modelling
Tiberiu Boros
Stefan Daniel Dumitrescu
Ionut Mironica
Radu Chivereanu
GAN
38
1
0
14 Oct 2023
Speech Synthesis By Unrolling Diffusion Process using Neural Network Layers
Peter Ochieng
DiffM
56
0
0
18 Sep 2023
Timbre-reserved Adversarial Attack in Speaker Identification
Qing Wang
Jixun Yao
Li Zhang
Pengcheng Guo
Linfu Xie
AAML
79
4
0
02 Sep 2023
A Systematic Exploration of Joint-training for Singing Voice Synthesis
Yuning Wu
Yifeng Yu
Jiatong Shi
Tao Qian
Qin Jin
112
6
0
05 Aug 2023
HierVST: Hierarchical Adaptive Zero-shot Voice Style Transfer
Sang-Hoon Lee
Haram Choi
H. Oh
Seong-Whan Lee
BDL
95
12
0
30 Jul 2023
Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias
Ziyue Jiang
Yi Ren
Zhe Ye
Jinglin Liu
Chen Zhang
...
Rongjie Huang
Chunfeng Wang
Xiang Yin
Zejun Ma
Zhou Zhao
DiffM
105
80
0
06 Jun 2023
DC CoMix TTS: An End-to-End Expressive TTS with Discrete Code Collaborated with Mixer
Yerin Choi
M. Koo
77
0
0
31 May 2023
Pseudo-Siamese Network based Timbre-reserved Black-box Adversarial Attack in Speaker Identification
Qing Wang
Jixun Yao
Ziqian Wang
Pengcheng Guo
Linfu Xie
AAML
64
1
0
30 May 2023
NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers
Kai Shen
Zeqian Ju
Xu Tan
Yanqing Liu
Yichong Leng
Lei He
Tao Qin
Sheng Zhao
Jiang Bian
DiffM
117
247
0
18 Apr 2023
FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model
Rui Xue
Yanqing Liu
Lei He
Xuejiao Tan
Linquan Liu
Ed Lin
Sheng Zhao
118
7
0
06 Mar 2023
Speech Enhancement with Multi-granularity Vector Quantization
Xiaokang Zhao
Qiu-shi Zhu
Jie Zhang
67
0
0
16 Feb 2023
Regeneration Learning: A Learning Paradigm for Data Generation
Xu Tan
Tao Qin
Jiang Bian
Tie-Yan Liu
Yoshua Bengio
GAN
64
15
0
21 Jan 2023
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
Chengyi Wang
Sanyuan Chen
Yu-Huan Wu
Zi-Hua Zhang
Long Zhou
...
Huaming Wang
Jinyu Li
Lei He
Sheng Zhao
Furu Wei
193
727
0
05 Jan 2023
Source Tracing: Detecting Voice Spoofing
Tinglong Zhu
Xingming Wang
Xiaoyi Qin
Ming Li
65
18
0
16 Dec 2022