Deep Voice: Real-time Neural Text-to-Speech

25 February 2017
Sercan Ö. Arik
Mike Chrzanowski
Adam Coates
G. Diamos
Andrew Gibiansky
Yongguo Kang
Xian Li
John Miller
Andrew Ng
Jonathan Raiman
Shubho Sengupta
M. Shoeybi

Papers citing "Deep Voice: Real-time Neural Text-to-Speech"

44 / 94 papers shown
Neural voice cloning with a few low-quality samples
Sunghee Jung
Hoi-Rim Kim
33
2
0
12 Jun 2020
Deep generative models for musical audio synthesis
M. Huzaifah
L. Wyse
27
20
0
10 Jun 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren
Chenxu Hu
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
60
1,357
0
08 Jun 2020
Many-to-Many Voice Transformer Network
Hirokazu Kameoka
Wen-Chin Huang
Kou Tanaka
Takuhiro Kaneko
Nobukatsu Hojo
T. Toda
ViT
30
30
0
18 May 2020
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis
Rafael Valle
Kevin J. Shih
R. Prenger
Bryan Catanzaro
21
119
0
12 May 2020
Jukebox: A Generative Model for Music
Prafulla Dhariwal
Heewoo Jun
Christine Payne
Jong Wook Kim
Alec Radford
Ilya Sutskever
VLM
28
722
0
30 Apr 2020
Direct Speech-to-image Translation
Jiguo Li
Xinfeng Zhang
Chuanmin Jia
Jizheng Xu
Li Zhang
Y. Wang
Siwei Ma
Wen Gao
36
29
0
07 Apr 2020
Unsupervised Style and Content Separation by Minimizing Mutual Information for Speech Synthesis
Ting-Yao Hu
A. Shrivastava
Oncel Tuzel
C. Dhir
11
30
0
09 Mar 2020
AlignTTS: Efficient Feed-Forward Text-to-Speech System without Explicit Alignment
Zhen Zeng
Jianzong Wang
Ning Cheng
Tian Xia
Jing Xiao
VLM
25
56
0
04 Mar 2020
Semi-Supervised Neural Architecture Search
Renqian Luo
Xu Tan
Rui Wang
Tao Qin
Enhong Chen
Tie-Yan Liu
13
88
0
24 Feb 2020
Vision-Infused Deep Audio Inpainting
Hang Zhou
Ziwei Liu
Lingfeng Guo
Ping Luo
Dahua Lin
35
88
0
24 Oct 2019
High Fidelity Speech Synthesis with Adversarial Networks
Mikolaj Binkowski
Jeff Donahue
Sander Dieleman
Aidan Clark
Erich Elsen
Norman Casagrande
Luis C. Cobo
Karen Simonyan
235
239
0
25 Sep 2019
Unpaired Image-to-Speech Synthesis with Multimodal Information Bottleneck
Shuang Ma
Daniel J. McDuff
Yale Song
25
22
0
19 Aug 2019
A Methodology for Controlling the Emotional Expressiveness in Synthetic Speech -- a Deep Learning approach
Noé Tits
16
10
0
05 Jul 2019
Polyphone Disambiguation for Mandarin Chinese Using Conditional Neural Network with Multi-level Embedding Features
Zexin Cai
Yaogen Yang
Chuxiong Zhang
Xiaoyi Qin
Ming Li
27
26
0
03 Jul 2019
Towards Transfer Learning for End-to-End Speech Synthesis from Deep Pre-Trained Language Models
Wei Fang
Yu-An Chung
James R. Glass
13
27
0
17 Jun 2019
Non-Autoregressive Neural Text-to-Speech
Kainan Peng
Ming-Yu Liu
Z. Song
Kexin Zhao
29
39
0
21 May 2019
Almost Unsupervised Text to Speech and Automatic Speech Recognition
Yi Ren
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
44
101
0
13 May 2019
A Light Dual-Task Neural Network for Haze Removal
Yu Zhang
Xinchao Wang
Xiaojun Bi
Dacheng Tao
31
13
0
12 Apr 2019
Probability density distillation with generative adversarial networks for high-quality parallel waveform generation
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim
19
55
0
09 Apr 2019
A Real-Time Wideband Neural Vocoder at 1.6 kb/s Using LPCNet
J. Valin
Jan Skoglund
24
78
0
28 Mar 2019
Securing Voice-driven Interfaces against Fake (Cloned) Audio Attacks
Hafiz Malik
13
26
0
18 Feb 2019
CROSSBOW: Scaling Deep Learning with Small Batch Sizes on Multi-GPU Servers
A. Koliousis
Pijika Watcharapichat
Matthias Weidlich
Luo Mai
Paolo Costa
Peter R. Pietzuch
16
69
0
08 Jan 2019
Learning pronunciation from a foreign language in speech synthesis networks
Younggun Lee
Suwon Shon
Taesu Kim
20
26
0
23 Nov 2018
Speaking style adaptation in Text-To-Speech synthesis using Sequence-to-sequence models with attention
Bajibabu Bollepalli
Lauri Juvela
P. Alku
15
4
0
29 Oct 2018
Sample Efficient Adaptive Text-to-Speech
Yutian Chen
Yannis Assael
Brendan Shillingford
David Budden
Scott E. Reed
...
Ben Laurie
Çağlar Gülçehre
Aaron van den Oord
Oriol Vinyals
Nando de Freitas
35
149
0
27 Sep 2018
Fast Spectrogram Inversion using Multi-head Convolutional Neural Networks
Sercan Ö. Arik
Heewoo Jun
G. Diamos
14
106
0
20 Aug 2018
Multi-task WaveNet: A Multi-task Generative Model for Statistical Parametric Speech Synthesis without Fundamental Frequency Conditions
Yu Gu
Yongguo Kang
10
17
0
22 Jun 2018
Voice Imitating Text-to-Speech Neural Networks
Younggun Lee
Taesu Kim
Soo-Young Lee
26
11
0
04 Jun 2018
Collapsed speech segment detection and suppression for WaveNet vocoder
Yi-Chiao Wu
Kazuhiro Kobayashi
Tomoki Hayashi
Patrick Lumban Tobing
T. Toda
7
25
0
30 Apr 2018
Speaker-independent raw waveform model for glottal excitation
Lauri Juvela
Vassilis Tsiaras
Bajibabu Bollepalli
Manu Airaksinen
Junichi Yamagishi
P. Alku
13
39
0
25 Apr 2018
Conditional End-to-End Audio Transforms
Albert Haque
Michelle Guo
Prateek Verma
33
41
0
30 Mar 2018
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
Yuxuan Wang
Daisy Stanton
Yu Zhang
RJ Skerry-Ryan
Eric Battenberg
Joel Shor
Y. Xiao
Fei Ren
Ye Jia
Rif A. Saurous
21
815
0
23 Mar 2018
Efficient Neural Audio Synthesis
Nal Kalchbrenner
Erich Elsen
Karen Simonyan
Seb Noury
Norman Casagrande
Edward Lockhart
Florian Stimberg
Aaron van den Oord
Sander Dieleman
Koray Kavukcuoglu
23
863
0
23 Feb 2018
Do WaveNets Dream of Acoustic Waves?
Kanru Hua
24
1
0
23 Feb 2018
Fitting New Speakers Based on a Short Untranscribed Sample
Eliya Nachmani
Adam Polyak
Yaniv Taigman
Lior Wolf
21
84
0
20 Feb 2018
Waveform Modeling and Generation Using Hierarchical Recurrent Neural Networks for Speech Bandwidth Extension
Zhenhua Ling
Yang Ai
Yu Gu
Lirong Dai
16
61
0
24 Jan 2018
Denoising Gravitational Waves using Deep Learning with Recurrent Denoising Autoencoders
Hongyu Shen
D. George
Eliu A. Huerta
Zhizhen Zhao
32
66
0
27 Nov 2017
Listening while Speaking: Speech Chain by Deep Learning
Andros Tjandra
S. Sakti
Satoshi Nakamura
AuLLM
126
165
0
16 Jul 2017
Device Placement Optimization with Reinforcement Learning
Azalia Mirhoseini
Hieu H. Pham
Quoc V. Le
Benoit Steiner
Rasmus Larsen
Yuefeng Zhou
Naveen Kumar
Mohammad Norouzi
Samy Bengio
J. Dean
27
436
0
13 Jun 2017
Deep Voice 2: Multi-Speaker Neural Text-to-Speech
Sercan Ö. Arik
G. Diamos
Andrew Gibiansky
John Miller
Kainan Peng
Ming-Yu Liu
Jonathan Raiman
Yanqi Zhou
22
494
0
24 May 2017
Tacotron: Towards End-to-End Speech Synthesis
Yuxuan Wang
RJ Skerry-Ryan
Daisy Stanton
Yonghui Wu
Ron J. Weiss
...
Samy Bengio
Quoc V. Le
Yannis Agiomyrgiannakis
R. Clark
Rif A. Saurous
47
1,804
0
29 Mar 2017
Pixel Recurrent Neural Networks
Aaron van den Oord
Nal Kalchbrenner
Koray Kavukcuoglu
SSeg
GAN
269
2,552
0
25 Jan 2016
Sequence-to-Sequence Neural Net Models for Grapheme-to-Phoneme Conversion
Kaisheng Yao
Geoffrey Zweig
48
163
0
31 May 2015