Sample Efficient Adaptive Text-to-Speech

27 September 2018
Yutian Chen
Yannis Assael
Brendan Shillingford
David Budden
Scott E. Reed
Heiga Zen
Quan Wang
Luis C. Cobo
Andrew Trask
Ben Laurie
Çağlar Gülçehre
Aaron van den Oord
Oriol Vinyals
Nando de Freitas

Papers citing "Sample Efficient Adaptive Text-to-Speech"

50 / 94 papers shown
ReverbMiipher: Generative Speech Restoration meets Reverberation Characteristics Controllability
Wataru Nakata
Yuma Koizumi
Shigeki Karita
Robin Scheibler
Haruko Ishikawa
Adriana Guevara-Rukoz
Heiga Zen
M. Bacchiani
102
0
0
08 May 2025
Miipher-2: A Universal Speech Restoration Model for Million-Hour Scale Data Restoration
Shigeki Karita
Yuma Koizumi
Heiga Zen
Haruko Ishikawa
Robin Scheibler
M. Bacchiani
VLM
439
1
0
07 May 2025
Voice Cloning: Comprehensive Survey
Hussam Azzuni
Abdulmotaleb El Saddik
VLM
112
0
0
01 May 2025
HALL-E: Hierarchical Neural Codec Language Model for Minute-Long Zero-Shot Text-to-Speech Synthesis
Yuto Nishimura
Takumi Hirose
Masanari Ohi
Hideki Nakayama
Nakamasa Inoue
VLM
112
2
0
06 Oct 2024
Advancing Voice Cloning for Nepali: Leveraging Transfer Learning in a Low-Resource Language
Manjil Karki
Pratik Shakya
Sandesh Acharya
Ravi Pandit
Dinesh Gothe
122
0
0
19 Aug 2024
ASRRL-TTS: Agile Speaker Representation Reinforcement Learning for Text-to-Speech Speaker Adaptation
Ruibo Fu
Xin Qi
Zhengqi Wen
Jianhua Tao
Tao Wang
...
Xiaopeng Wang
Shuchen Shi
Yukun Liu
Xuefei Liu
Shuai Zhang
92
0
0
07 Jul 2024
VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment
Bing Han
Long Zhou
Shujie Liu
Sanyuan Chen
Lingwei Meng
Yanming Qian
Yanqing Liu
Sheng Zhao
Jinyu Li
Furu Wei
109
24
0
12 Jun 2024
Learning Fine-Grained Controllability on Speech Generation via Efficient Fine-Tuning
Chung-Ming Chien
Andros Tjandra
Apoorv Vyas
Matt Le
Bowen Shi
Wei-Ning Hsu
75
0
0
10 Jun 2024
VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers
Sanyuan Chen
Shujie Liu
Long Zhou
Yanqing Liu
Xu Tan
Jinyu Li
Sheng Zhao
Yao Qian
Furu Wei
VLM
118
83
0
08 Jun 2024
Enhancing Zero-Shot Multi-Speaker TTS with Negated Speaker Representations
Yejin Jeon
Yunsu Kim
Gary Geunbae Lee
68
2
0
04 Jan 2024
ELF: Encoding Speaker-Specific Latent Speech Feature for Speech Synthesis
Jungil Kong
Junmo Lee
Jeongmin Kim
Beomjeong Kim
Jihoon Park
Dohee Kong
Changheon Lee
Sangjin Kim
84
1
0
20 Nov 2023
Improving Language Model-Based Zero-Shot Text-to-Speech Synthesis with Multi-Scale Acoustic Prompts
Shunwei Lei
Yixuan Zhou
Liyang Chen
Dan Luo
Zhiyong Wu
...
Shiyin Kang
Tao Jiang
Yahui Zhou
Yuxing Han
Helen M. Meng
VLM
90
2
0
21 Sep 2023
Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis
Ziyue Jiang
Jinglin Liu
Yi Ren
Jinzheng He
Zhe Ye
...
Pengfei Wei
Chunfeng Wang
Xiang Yin
Zejun Ma
Zhou Zhao
120
52
0
14 Jul 2023
The NPU-MSXF Speech-to-Speech Translation System for IWSLT 2023 Speech-to-Speech Translation Task
Kun Song
Yinjiao Lei
Pei-Ning Chen
Yiqing Cao
Kun Wei
Yongmao Zhang
Linfu Xie
Ning Jiang
Guoqing Zhao
112
1
0
10 Jul 2023
Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias
Ziyue Jiang
Yi Ren
Zhe Ye
Jinglin Liu
Chen Zhang
...
Rongjie Huang
Chunfeng Wang
Xiang Yin
Zejun Ma
Zhou Zhao
DiffM
105
80
0
06 Jun 2023
LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus
Yuma Koizumi
Heiga Zen
Shigeki Karita
Yifan Ding
Kohei Yatabe
Nobuyuki Morioka
M. Bacchiani
Yu Zhang
Wei Han
Ankur Bapna
114
80
0
30 May 2023
ADAPTERMIX: Exploring the Efficacy of Mixture of Adapters for Low-Resource TTS Adaptation
Ambuj Mehrish
Abhinav Ramesh Kashyap
Yingting Li
Navonil Majumder
Soujanya Poria
70
7
0
29 May 2023
EE-TTS: Emphatic Expressive TTS with Linguistic Information
Yifan Zhong
Chen Zhang
Xule Liu
Chenxi Sun
Weishan Deng
Haifeng Hu
Zhongqian Sun
38
3
0
20 May 2023
Unsupervised Pre-Training For Data-Efficient Text-to-Speech On Low Resource Languages
Seong-Hyun Park
Myungseo Song
Bohyung Kim
Tae-Hyun Oh
37
1
0
28 Mar 2023
Personalized Lightweight Text-to-Speech: Voice Cloning with Adaptive Structured Pruning
Sung-Feng Huang
Chia-Ping Chen
Zhi-Sheng Chen
Yu-Pao Tsai
Hung-yi Lee
78
3
0
21 Mar 2023
Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations
Yuma Koizumi
Heiga Zen
Shigeki Karita
Yifan Ding
Kohei Yatabe
Nobuyuki Morioka
Yu Zhang
Wei Han
Ankur Bapna
M. Bacchiani
94
29
0
03 Mar 2023
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
Chengyi Wang
Sanyuan Chen
Yu-Huan Wu
Zi-Hua Zhang
Long Zhou
...
Huaming Wang
Jinyu Li
Lei He
Sheng Zhao
Furu Wei
193
727
0
05 Jan 2023
Grad-StyleSpeech: Any-speaker Adaptive Text-to-Speech Synthesis with Diffusion Models
Minki Kang
Dong Min
Sung Ju Hwang
DiffM
105
50
0
17 Nov 2022
Towards zero-shot Text-based voice editing using acoustic context conditioning, utterance embeddings, and reference encoders
Jason Fong
Yun Wang
Prabhav Agrawal
Vimal Manohar
Jilong Wu
Thilo Kohler
Qing He
50
0
0
28 Oct 2022
Residual Adapters for Few-Shot Text-to-Speech Speaker Adaptation
Nobuyuki Morioka
Heiga Zen
Nanxin Chen
Yu Zhang
Yifan Ding
98
16
0
28 Oct 2022
Semi-Supervised Learning Based on Reference Model for Low-resource TTS
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
AI4TS
53
5
0
25 Oct 2022
AutoLV: Automatic Lecture Video Generator
Wen Wang
Yang Song
Sanjay Jha
VGen
133
3
0
19 Sep 2022
AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation
Kun Song
Heyang Xue
Xinsheng Wang
Jian Cong
Yongmao Zhang
Linfu Xie
Bing Yang
Xiong Zhang
Dan Su
95
5
0
01 Jun 2022
Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data
Sungwon Kim
Heeseung Kim
Sung-Hoon Yoon
DiffM
249
53
0
30 May 2022
Meta Learning for Natural Language Processing: A Survey
Hung-yi Lee
Shang-Wen Li
Ngoc Thang Vu
95
45
0
03 May 2022
Self-supervised learning for robust voice cloning
Konstantinos Klapsas
Nikolaos Ellinas
Karolos Nikitaras
G. Vamvoukakis
Panos Kakoulidis
...
S. Raptis
June Sig Sung
Gunu Jho
Aimilios Chalamandaris
Pirros Tsiakoulis
SSL
77
6
0
07 Apr 2022
Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis
Karren D. Yang
Dejan Marković
Steven Krenn
Vasu Agrawal
Alexander Richard
VGen
79
33
0
31 Mar 2022
AdaVocoder: Adaptive Vocoder for Custom Voice
Xin Yuan
Yongbin Feng
Mingming Ye
Cheng Tuo
Minghang Zhang
117
3
0
18 Mar 2022
Improve few-shot voice cloning using multi-modal learning
Haitong Zhang
Yue Lin
46
8
0
18 Mar 2022
Speaker Adaption with Intuitive Prosodic Features for Statistical Parametric Speech Synthesis
Pengyu Cheng
Zhenhua Ling
67
3
0
02 Mar 2022
Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module
Adam Gabryś
Goeric Huybrechts
M. Ribeiro
C. Chien
Julian Roth
Giulia Comini
Roberto Barra-Chicote
Bartek Perz
Jaime Lorenzo-Trueba
80
21
0
16 Feb 2022
MR-SVS: Singing Voice Synthesis with Multi-Reference Encoder
Shoutong Wang
Jinglin Liu
Yi Ren
Zhen Wang
Changliang Xu
Zhou Zhao
40
7
0
11 Jan 2022
V2C: Visual Voice Cloning
Qi Chen
Yuanqing Li
Yuankai Qi
Jiaqiu Zhou
Mingkui Tan
Qi Wu
VGen
72
27
0
25 Nov 2021
Meta-Voice: Fast few-shot style transfer for expressive voice cloning using meta learning
Songxiang Liu
Dan Su
Dong Yu
56
10
0
14 Nov 2021
Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech
Sung-Feng Huang
Chyi-Jiunn Lin
Da-Rong Liu
Yi-Chen Chen
Hung-yi Lee
126
57
0
07 Nov 2021
A study on the efficacy of model pre-training in developing neural text-to-speech system
Guangyan Zhang
Yichong Leng
Daxin Tan
Ying Qin
Kaitao Song
Xu Tan
Sheng Zhao
Tan Lee
56
2
0
08 Oct 2021
GC-TTS: Few-shot Speaker Adaptation with Geometric Constraints
Ji-Hoon Kim
Sang-Hoon Lee
Ji-Hyun Lee
Hong G Jung
Seong-Whan Lee
159
6
0
16 Aug 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
133
359
0
29 Jun 2021
GANSpeech: Adversarial Training for High-Fidelity Multi-Speaker Speech Synthesis
Jinhyeok Yang
Jaesung Bae
Taejun Bak
Young-Ik Kim
Hoon-Young Cho
134
37
0
29 Jun 2021
AI based Presentation Creator With Customized Audio Content Delivery
Muvazima Mansoor
Srikanth Chandar
Ramamoorthy Srinath
113
0
0
27 Jun 2021
Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation
Dong Min
Dong Bok Lee
Eunho Yang
Sung Ju Hwang
134
175
0
06 Jun 2021
An objective evaluation of the effects of recording conditions and speaker characteristics in multi-speaker deep neural speech synthesis
Beáta Lőrincz
Adriana Stan
M. Giurgiu
33
2
0
03 Jun 2021
AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data
Yuzi Yan
Xu Tan
Bohan Li
Tao Qin
Sheng Zhao
Yuan-Chung Shen
Tie-Yan Liu
45
46
0
20 Apr 2021
The AS-NU System for the M2VoC Challenge
Cheng-Hung Hu
Yi-Chiao Wu
Wen-Chin Huang
Yu-Huai Peng
Yu-Wen Chen
Pin-Jui Ku
Tomoki Toda
Yu Tsao
Hsin-Min Wang
54
1
0
07 Apr 2021
Continual Speaker Adaptation for Text-to-Speech Synthesis
Hamed Hemati
Damian Borth
CLL
77
9
0
26 Mar 2021