ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2005.08484
  4. Cited By
Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based
  Variable-Length Embedding

Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based Variable-Length Embedding

18 May 2020
Seungwoo Choi
Seungju Han
Dongyoung Kim
S. Ha
ArXivPDFHTML

Papers citing "Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based Variable-Length Embedding"

18 / 18 papers shown
Title
ISDrama: Immersive Spatial Drama Generation through Multimodal Prompting
ISDrama: Immersive Spatial Drama Generation through Multimodal Prompting
Wenjie Qu
Wenxiang Guo
Changhao Pan
Zehan Zhu
Tao Jin
Zhou Zhao
VGen
54
0
0
29 Apr 2025
Towards Zero-Shot Text-To-Speech for Arabic Dialects
Towards Zero-Shot Text-To-Speech for Arabic Dialects
Khai Duy Doan
Abdul Waheed
Muhammad Abdul-Mageed
42
0
0
24 Jun 2024
Unmasking Illusions: Understanding Human Perception of Audiovisual
  Deepfakes
Unmasking Illusions: Understanding Human Perception of Audiovisual Deepfakes
Ammarah Hashmi
Sahibzada Adil Shahzad
Chia-Wen Lin
Yu Tsao
Hsin-Min Wang
46
3
0
07 May 2024
StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis
StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis
Yu Zhang
Rongjie Huang
Ruiqi Li
Jinzheng He
Yan Xia
Feiyang Chen
Xinyu Duan
Baoxing Huai
Zhou Zhao
VLM
26
17
0
17 Dec 2023
Stylebook: Content-Dependent Speaking Style Modeling for Any-to-Any
  Voice Conversion using Only Speech Data
Stylebook: Content-Dependent Speaking Style Modeling for Any-to-Any Voice Conversion using Only Speech Data
Hyungseob Lim
Kyungguen Byun
Sunkuk Moon
Erik Visser
DiffM
28
2
0
06 Sep 2023
Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in
  End-to-End Zero-Shot Speech Synthesis
Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis
Seong-Hyun Park
Bohyung Kim
Tae-Hyun Oh
39
1
0
26 May 2023
Personalized Lightweight Text-to-Speech: Voice Cloning with Adaptive
  Structured Pruning
Personalized Lightweight Text-to-Speech: Voice Cloning with Adaptive Structured Pruning
Sung-Feng Huang
Chia-Ping Chen
Zhi-Sheng Chen
Yu-Pao Tsai
Hung-yi Lee
33
3
0
21 Mar 2023
Warning: Humans Cannot Reliably Detect Speech Deepfakes
Warning: Humans Cannot Reliably Detect Speech Deepfakes
Kimberly T. Mai
Sergi D. Bray
Toby O. Davies
Lewis D. Griffin
41
40
0
19 Jan 2023
Towards zero-shot Text-based voice editing using acoustic context
  conditioning, utterance embeddings, and reference encoders
Towards zero-shot Text-based voice editing using acoustic context conditioning, utterance embeddings, and reference encoders
Jason Fong
Yun Wang
Prabhav Agrawal
Vimal Manohar
Jilong Wu
Thilo Kohler
Qing He
15
0
0
28 Oct 2022
Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and
  Any-to-any Voice Conversion
Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion
Yinjiao Lei
Shan Yang
Jian Cong
Linfu Xie
Dan Su
DiffM
52
12
0
05 Jul 2022
Fine-grained Noise Control for Multispeaker Speech Synthesis
Fine-grained Noise Control for Multispeaker Speech Synthesis
Karolos Nikitaras
G. Vamvoukakis
Nikolaos Ellinas
Konstantinos Klapsas
K. Markopoulos
S. Raptis
June Sig Sung
Gunu Jho
Aimilios Chalamandaris
Pirros Tsiakoulis
29
4
0
11 Apr 2022
Self-supervised learning for robust voice cloning
Self-supervised learning for robust voice cloning
Konstantinos Klapsas
Nikolaos Ellinas
Karolos Nikitaras
G. Vamvoukakis
Panos Kakoulidis
...
S. Raptis
June Sig Sung
Gunu Jho
Aimilios Chalamandaris
Pirros Tsiakoulis
SSL
27
6
0
07 Apr 2022
Speaker Adaption with Intuitive Prosodic Features for Statistical
  Parametric Speech Synthesis
Speaker Adaption with Intuitive Prosodic Features for Statistical Parametric Speech Synthesis
Pengyu Cheng
Zhenhua Ling
28
3
0
02 Mar 2022
Voice Filter: Few-shot text-to-speech speaker adaptation using voice
  conversion as a post-processing module
Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module
Adam Gabry's
Goeric Huybrechts
M. Ribeiro
C. Chien
Julian Roth
Giulia Comini
Roberto Barra-Chicote
Bartek Perz
Jaime Lorenzo-Trueba
36
21
0
16 Feb 2022
Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech
Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech
Sung-Feng Huang
Chyi-Jiunn Lin
Da-Rong Liu
Yi-Chen Chen
Hung-yi Lee
18
56
0
07 Nov 2021
A Survey on Neural Speech Synthesis
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
18
352
0
29 Jun 2021
Modeling Multi-speaker Latent Space to Improve Neural TTS: Quick
  Enrolling New Speaker and Enhancing Premium Voice
Modeling Multi-speaker Latent Space to Improve Neural TTS: Quick Enrolling New Speaker and Enhancing Premium Voice
Yan Deng
Lei He
Frank Soong
63
29
0
13 Dec 2018
Transfer Learning from Speaker Verification to Multispeaker
  Text-To-Speech Synthesis
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Ye Jia
Yu Zhang
Ron J. Weiss
Quan Wang
Jonathan Shen
...
Z. Chen
Patrick Nguyen
Ruoming Pang
Ignacio López Moreno
Yonghui Wu
207
820
0
12 Jun 2018
1