Tacotron: Towards End-to-End Speech Synthesis


29 March 2017
Yuxuan Wang
RJ Skerry-Ryan
Daisy Stanton
Yonghui Wu
Ron J. Weiss
Navdeep Jaitly
Zongheng Yang
Y. Xiao
Z. Chen
Samy Bengio
Quoc V. Le
Yannis Agiomyrgiannakis
R. Clark
Rif A. Saurous

Papers citing "Tacotron: Towards End-to-End Speech Synthesis"

50 / 817 papers shown
Audio-Linguistic Embeddings for Spoken Sentences
Albert Haque
Michelle Guo
Prateek Verma
Li Fei-Fei
28
51
0
20 Feb 2019
Insertion Transformer: Flexible Sequence Generation via Insertion Operations
Mitchell Stern
William Chan
J. Kiros
Jakob Uszkoreit
KELM
31
247
0
08 Feb 2019
Exploring Transfer Learning for Low Resource Emotional TTS
Noé Tits
Kevin El Haddad
Thierry Dutoit
17
61
0
14 Jan 2019
Efficient Convolutional Neural Network Training with Direct Feedback Alignment
Donghyeon Han
H. Yoo
3DV
16
17
0
06 Jan 2019
Introduction to Voice Presentation Attack Detection and Recent Advances
Md. Sahidullah
Héctor Delgado
Massimiliano Todisco
Tomi Kinnunen
Nicholas W. D. Evans
Junichi Yamagishi
Kong-Aik Lee
AAML
13
75
0
04 Jan 2019
Feature reinforcement with word embedding and parsing information in neural TTS
Huaiping Ming
Lei He
Haohan Guo
Frank Soong
74
15
0
03 Jan 2019
Modeling Multi-speaker Latent Space to Improve Neural TTS: Quick Enrolling New Speaker and Enhancing Premium Voice
Yan Deng
Lei He
Frank Soong
63
29
0
13 Dec 2018
Learning pronunciation from a foreign language in speech synthesis networks
Younggun Lee
Suwon Shon
Taesu Kim
22
26
0
23 Nov 2018
Improving Sequence-to-Sequence Acoustic Modeling by Adding Text-Supervision
Jing-Xuan Zhang
Zhenhua Ling
Yuan Jiang
Li-Juan Liu
Chen Liang
Lirong Dai
17
29
0
20 Nov 2018
Effect of data reduction on sequence-to-sequence neural TTS
Javier Latorre
Jakub Lachowicz
Jaime Lorenzo-Trueba
Thomas Merritt
Thomas Drugman
S. Ronanki
Viacheslav Klimkov
38
59
0
15 Nov 2018
Comprehensive evaluation of statistical speech waveform synthesis
Thomas Merritt
Bartosz Putrycz
Adam Nadolski
Tianjun Ye
Daniel Korzekwa
...
Alexis Moinet
A. Breen
Rafal Kuklinski
N. Strom
Roberto Barra-Chicote
19
17
0
15 Nov 2018
AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms
Kou Tanaka
Hirokazu Kameoka
Takuhiro Kaneko
Nobukatsu Hojo
17
111
0
09 Nov 2018
Robust and fine-grained prosody control of end-to-end speech synthesis
Younggun Lee
Jonathan Le Roux
7
147
0
06 Nov 2018
Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation
Ye Jia
Melvin Johnson
Wolfgang Macherey
Ron J. Weiss
Yuan Cao
Chung-Cheng Chiu
Naveen Ari
Stella Laurenzo
Yonghui Wu
31
159
0
05 Nov 2018
ConvS2S-VC: Fully convolutional sequence-to-sequence voice conversion
Hirokazu Kameoka
Kou Tanaka
Damian Kwaśny
Takuhiro Kaneko
Nobukatsu Hojo
23
62
0
05 Nov 2018
Investigating context features hidden in End-to-End TTS
Kohki Mametani
T. Kato
Seiichi Yamamoto
12
9
0
04 Nov 2018
End-to-End Feedback Loss in Speech Chain Framework via Straight-Through Estimator
Andros Tjandra
S. Sakti
Satoshi Nakamura
13
44
0
31 Oct 2018
WaveGlow: A Flow-based Generative Network for Speech Synthesis
R. Prenger
Rafael Valle
Bryan Catanzaro
37
1,023
0
31 Oct 2018
Speaking style adaptation in Text-To-Speech synthesis using Sequence-to-sequence models with attention
Bajibabu Bollepalli
Lauri Juvela
P. Alku
17
4
0
29 Oct 2018
Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language
Yusuke Yasuda
Xin Wang
Shinji Takaki
Junichi Yamagishi
22
86
0
29 Oct 2018
Reducing over-smoothness in speech synthesis using Generative Adversarial Networks
Leyuan Sheng
Evgeny Nikolaevich Pavlovskiy
GAN
17
8
0
25 Oct 2018
SING: Symbol-to-Instrument Neural Generator
Alexandre Défossez
Neil Zeghidour
Nicolas Usunier
Léon Bottou
Francis R. Bach
18
59
0
23 Oct 2018
Hierarchical Generative Modeling for Controllable Speech Synthesis
Wei-Ning Hsu
Yu Zhang
Ron J. Weiss
Heiga Zen
Yonghui Wu
...
Ye Jia
Z. Chen
Jonathan Shen
Patrick Nguyen
Ruoming Pang
BDL
12
274
0
16 Oct 2018
Sequence-to-Sequence Acoustic Modeling for Voice Conversion
Jing-Xuan Zhang
Zhenhua Ling
Li-Juan Liu
Yuan Jiang
Lirong Dai
11
129
0
16 Oct 2018
Conditional WaveGAN
Chae Young Lee
Anoop Toffy
G. Jung
W. Han
DiffM
21
21
0
27 Sep 2018
Sample Efficient Adaptive Text-to-Speech
Yutian Chen
Yannis Assael
Brendan Shillingford
David Budden
Scott E. Reed
...
Ben Laurie
Çağlar Gülçehre
Aaron van den Oord
Oriol Vinyals
Nando de Freitas
35
149
0
27 Sep 2018
Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis
Yu-An Chung
Yuxuan Wang
Wei-Ning Hsu
Yu Zhang
RJ Skerry-Ryan
22
117
0
30 Aug 2018
Rhythm-Flexible Voice Conversion without Parallel Data Using Cycle-GAN over Phoneme Posteriorgram Sequences
Cheng-chieh Yeh
Po-Chun Hsu
Ju-Chieh Chou
Hung-yi Lee
Lin-Shan Lee
30
23
0
09 Aug 2018
Predicting Expressive Speaking Style From Text In End-To-End Speech Synthesis
Daisy Stanton
Yuxuan Wang
RJ Skerry-Ryan
13
122
0
04 Aug 2018
Multi-scale Alignment and Contextual History for Attention Mechanism in Sequence-to-sequence Model
Andros Tjandra
S. Sakti
Satoshi Nakamura
8
12
0
22 Jul 2018
Noise Adaptive Speech Enhancement using Domain Adversarial Training
Chien-Feng Liao
Yu Tsao
Hung-yi Lee
H. Wang
17
51
0
19 Jul 2018
ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech
Wei Ping
Kainan Peng
Jitong Chen
12
342
0
19 Jul 2018
Forward Attention in Sequence-to-sequence Acoustic Modelling for Speech Synthesis
Jing-Xuan Zhang
Zhenhua Ling
Lirong Dai
13
83
0
18 Jul 2018
Phase reconstruction from amplitude spectrograms based on von-Mises-distribution deep neural network
Shinnosuke Takamichi
Yuki Saito
Norihiro Takamune
Daichi Kitamura
Hiroshi Saruwatari
13
42
0
10 Jul 2018
The Emotional Voices Database: Towards Controlling the Emotion Dimension in Voice Generation Systems
Adaeze Adigwe
Noé Tits
Kevin El Haddad
Sarah Ostadabbas
Thierry Dutoit
6
79
0
25 Jun 2018
A Variational Prosody Model for Mapping the Context-Sensitive Variation of Functional Prosodic Prototypes
B. Gerazov
Gérard Bailly
Omar Mohammed
Yi Xu
Philip N. Garner
11
7
0
22 Jun 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Ye Jia
Yu Zhang
Ron J. Weiss
Quan Wang
Jonathan Shen
...
Z. Chen
Patrick Nguyen
Ruoming Pang
Ignacio López Moreno
Yonghui Wu
207
820
0
12 Jun 2018
Voice Imitating Text-to-Speech Neural Networks
Younggun Lee
Taesu Kim
Soo-Young Lee
26
11
0
04 Jun 2018
Collapsed speech segment detection and suppression for WaveNet vocoder
Yi-Chiao Wu
Kazuhiro Kobayashi
Tomoki Hayashi
Patrick Lumban Tobing
T. Toda
9
25
0
30 Apr 2018
Automatic Documentation of ICD Codes with Far-Field Speech Recognition
Albert Haque
Corinna Fukushima
11
0
0
30 Apr 2018
Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations
Ju-Chieh Chou
Cheng-chieh Yeh
Hung-yi Lee
Lin-Shan Lee
6
132
0
09 Apr 2018
A comparison of recent waveform generation and acoustic modeling methods for neural-network-based speech synthesis
Xin Wang
Jaime Lorenzo-Trueba
Shinji Takaki
Lauri Juvela
Junichi Yamagishi
20
67
0
07 Apr 2018
Expressive Speech Synthesis via Modeling Expressions with Variational Autoencoder
K. Akuzawa
Yusuke Iwasawa
Y. Matsuo
8
138
0
06 Apr 2018
Conditional End-to-End Audio Transforms
Albert Haque
Michelle Guo
Prateek Verma
33
41
0
30 Mar 2018
Machine Speech Chain with One-shot Speaker Adaptation
Andros Tjandra
S. Sakti
Satoshi Nakamura
25
55
0
28 Mar 2018
Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron
RJ Skerry-Ryan
Eric Battenberg
Y. Xiao
Yuxuan Wang
Daisy Stanton
Joel Shor
Ron J. Weiss
R. Clark
Rif A. Saurous
16
547
0
24 Mar 2018
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
Yuxuan Wang
Daisy Stanton
Yu Zhang
RJ Skerry-Ryan
Eric Battenberg
Joel Shor
Y. Xiao
Fei Ren
Ye Jia
Rif A. Saurous
21
815
0
23 Mar 2018
Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama's voice using GAN, WaveNet and low-quality found data
Jaime Lorenzo-Trueba
Fuming Fang
Xin Wang
Isao Echizen
Junichi Yamagishi
Tomi Kinnunen
6
73
0
02 Mar 2018
Fitting New Speakers Based on a Short Untranscribed Sample
Eliya Nachmani
Adam Polyak
Yaniv Taigman
Lior Wolf
24
84
0
20 Feb 2018
Adversarial Audio Synthesis
Chris Donahue
Julian McAuley
M. Puckette
GAN
45
602
0
12 Feb 2018