Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions

16 December 2017
Jonathan Shen
Ruoming Pang
Ron J. Weiss
M. Schuster
Navdeep Jaitly
Zongheng Yang
Zhiwen Chen
Yu Zhang
Yuxuan Wang
RJ Skerry-Ryan
Rif A. Saurous
Yannis Agiomyrgiannakis
Yonghui Wu

Papers citing "Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions"

50 / 1,276 papers shown
Introduction to Voice Presentation Attack Detection and Recent Advances
Md. Sahidullah
Héctor Delgado
Massimiliano Todisco
Tomi Kinnunen
Nicholas W. D. Evans
Junichi Yamagishi
Kong-Aik Lee
AAML
83
75
0
04 Jan 2019
Feature reinforcement with word embedding and parsing information in neural TTS
Huaiping Ming
Lei He
Haohan Guo
Frank Soong
167
15
0
03 Jan 2019
Modeling Multi-speaker Latent Space to Improve Neural TTS: Quick Enrolling New Speaker and Enhancing Premium Voice
Yan Deng
Lei He
Frank Soong
114
29
0
13 Dec 2018
FPETS : Fully Parallel End-to-End Text-to-Speech System
Dabiao Ma
Zhiba Su
Wenxuan Wang
Yuhao Lu
58
6
0
12 Dec 2018
Learning latent representations for style control and transfer in end-to-end speech synthesis
Ya-Jie Zhang
Shifeng Pan
Lei He
Zhenhua Ling
BDL, SSL, DRL
109
229
0
11 Dec 2018
Generative Adversarial Network based Speaker Adaptation for High Fidelity WaveNet Vocoder
Qiao Tian
Bing Yang
Shan Liu
GAN
55
9
0
06 Dec 2018
LP-WaveNet: Linear Prediction-based WaveNet Speech Synthesis
Min-Jae Hwang
Frank Soong
Fenglong Xie
Xi Wang
Hyeonjoo Kang
Hong-Goo Kang
70
21
0
29 Nov 2018
Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion
Wen-Chin Huang
Yi-Chiao Wu
Hsin-Te Hwang
Patrick Lumban Tobing
Tomoki Hayashi
Kazuhiro Kobayashi
Tomoki Toda
Yu Tsao
H. Wang
75
20
0
27 Nov 2018
Learning pronunciation from a foreign language in speech synthesis networks
Younggun Lee
Suwon Shon
Taesu Kim
58
28
0
23 Nov 2018
TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer
Sicong Huang
Qiyang Li
Cem Anil
Xuchan Bao
Sageev Oore
Roger C. Grosse
95
98
0
22 Nov 2018
Bytes are All You Need: End-to-End Multilingual Speech Recognition and Synthesis with Bytes
Yue Liu
Yu Zhang
Tara N. Sainath
Yonghui Wu
William Chan
AuLLM
79
131
0
22 Nov 2018
The Effect of Explicit Structure Encoding of Deep Neural Networks for Symbolic Music Generation
Kai Chen
Weilin Zhang
Shlomo Dubnov
Gus Xia
Wei Li
MGen
62
5
0
20 Nov 2018
Improving Sequence-to-Sequence Acoustic Modeling by Adding Text-Supervision
Jing-Xuan Zhang
Zhenhua Ling
Yuan Jiang
Li-Juan Liu
Chen Liang
Lirong Dai
80
30
0
20 Nov 2018
Representation Mixing for TTS Synthesis
Kyle Kastner
J. F. Santos
Yoshua Bengio
Aaron Courville
64
43
0
17 Nov 2018
Generating Albums with SampleRNN to Imitate Metal, Rock, and Punk Bands
CJ Carr
Zack Zukowski
MGen
35
20
0
16 Nov 2018
Effect of data reduction on sequence-to-sequence neural TTS
Javier Latorre
Jakub Lachowicz
Jaime Lorenzo-Trueba
Thomas Merritt
Thomas Drugman
S. Ronanki
Klimkov Viacheslav
109
59
0
15 Nov 2018
Comprehensive evaluation of statistical speech waveform synthesis
Thomas Merritt
Bartosz Putrycz
Adam Nadolski
Tianjun Ye
Daniel Korzekwa
...
Alexis Moinet
A. Breen
Rafal Kuklinski
N. Strom
Roberto Barra-Chicote
51
18
0
15 Nov 2018
Towards achieving robust universal neural vocoding
Jaime Lorenzo-Trueba
Thomas Drugman
Javier Latorre
Thomas Merritt
Bartosz Putrycz
Roberto Barra-Chicote
Alexis Moinet
Vatsal Aggarwal
DRL
152
19
0
15 Nov 2018
PerformanceNet: Score-to-Audio Music Generation with Multi-Band Convolutional Residual Network
Bryan Wang
Yi-Hsuan Yang
71
38
0
11 Nov 2018
ExcitNet vocoder: A neural excitation model for parametric speech synthesis systems
Eunwoo Song
Kyungguen Byun
Hong-Goo Kang
75
29
0
09 Nov 2018
AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms
Kou Tanaka
Hirokazu Kameoka
Takuhiro Kaneko
Nobukatsu Hojo
79
112
0
09 Nov 2018
Speaker-adaptive neural vocoders for parametric speech synthesis systems
Eunwoo Song
Xiang Yu
Erik Cambria
Jagath Rajapakse
49
3
0
08 Nov 2018
Reconstructing Speech Stimuli From Human Auditory Cortex Activity Using a WaveNet Approach
Ran Wang
Yao Wang
A. Flinker
39
7
0
06 Nov 2018
FloWaveNet : A Generative Flow for Raw Audio
Sungwon Kim
Sang-gil Lee
Jongyoon Song
Jaehyeon Kim
Sungroh Yoon
148
169
0
06 Nov 2018
Robust and fine-grained prosody control of end-to-end speech synthesis
Younggun Lee
Jonathan Le Roux
91
147
0
06 Nov 2018
Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation
Ye Jia
Melvin Johnson
Wolfgang Macherey
Ron J. Weiss
Yuan Cao
Chung-Cheng Chiu
Naveen Ari
Stella Laurenzo
Yonghui Wu
98
163
0
05 Nov 2018
ConvS2S-VC: Fully convolutional sequence-to-sequence voice conversion
Hirokazu Kameoka
Kou Tanaka
Damian Kwaśny
Takuhiro Kaneko
Nobukatsu Hojo
113
64
0
05 Nov 2018
Investigating context features hidden in End-to-End TTS
Kohki Mametani
T. Kato
Seiichi Yamamoto
52
9
0
04 Nov 2018
Cycle-consistency training for end-to-end speech recognition
Takaaki Hori
Ramón Fernández Astudillo
Tomoki Hayashi
Yu Zhang
Shinji Watanabe
Jonathan Le Roux
97
87
0
02 Nov 2018
Training Neural Speech Recognition Systems with Synthetic Speech Augmentation
Jason Chun Lok Li
R. Gadde
Boris Ginsburg
Vitaly Lavrukhin
63
55
0
02 Nov 2018
Neural Music Synthesis for Flexible Timbre Control
Jong Wook Kim
Rachel M. Bittner
Aparna Kumar
J. P. Bello
73
39
0
01 Nov 2018
WaveGlow: A Flow-based Generative Network for Speech Synthesis
R. Prenger
Rafael Valle
Bryan Catanzaro
192
1,036
0
31 Oct 2018
Waveform generation for text-to-speech synthesis using pitch-synchronous multi-scale generative adversarial networks
Lauri Juvela
Bajibabu Bollepalli
Junichi Yamagishi
P. Alku
77
23
0
30 Oct 2018
End-to-end music source separation: is it possible in the waveform domain?
Francesc Lluís
Jordi Pons
Xavier Serra
100
73
0
29 Oct 2018
Audio inpainting of music by means of neural networks
Andrés Marafioti
Nicki Holighaus
P. Majdak
Nathanael Perraudin
97
18
0
29 Oct 2018
Speaking style adaptation in Text-To-Speech synthesis using Sequence-to-sequence models with attention
Bajibabu Bollepalli
Lauri Juvela
P. Alku
51
4
0
29 Oct 2018
Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language
Yusuke Yasuda
Xin Wang
Shinji Takaki
Junichi Yamagishi
63
87
0
29 Oct 2018
Neural source-filter-based waveform model for statistical parametric speech synthesis
Xin Wang
Shinji Takaki
Junichi Yamagishi
136
125
0
29 Oct 2018
STFT spectral loss for training a neural speech waveform model
Shinji Takaki
Toru Nakashika
Xin Wang
Junichi Yamagishi
75
21
0
29 Oct 2018
LPCNet: Improving Neural Speech Synthesis Through Linear Prediction
J. Valin
Jan Skoglund
86
451
0
28 Oct 2018
Reducing over-smoothness in speech synthesis using Generative Adversarial Networks
Leyuan Sheng
Evgeny Nikolaevich Pavlovskiy
GAN
69
9
0
25 Oct 2018
Hierarchical Generative Modeling for Controllable Speech Synthesis
Wei-Ning Hsu
Yu Zhang
Ron J. Weiss
Heiga Zen
Yonghui Wu
...
Ye Jia
Zhiwen Chen
Jonathan Shen
Patrick Nguyen
Ruoming Pang
BDL
109
276
0
16 Oct 2018
Sequence-to-Sequence Acoustic Modeling for Voice Conversion
Jing-Xuan Zhang
Zhenhua Ling
Li-Juan Liu
Yuan Jiang
Lirong Dai
85
130
0
16 Oct 2018
A Fully Time-domain Neural Model for Subband-based Speech Synthesizer
Azam Rabiee
Geonmin Kim
Tae-Ho Kim
Soo-Young Lee
27
1
0
12 Oct 2018
Conditional WaveGAN
Chae Young Lee
Anoop Toffy
G. Jung
W. Han
DiffM
46
21
0
27 Sep 2018
Neural Speech Synthesis with Transformer Network
Naihan Li
Shujie Liu
Yanqing Liu
Sheng Zhao
Ming-Yuan Liu
M. Zhou
95
102
0
19 Sep 2018
Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis
Yu-An Chung
Yuxuan Wang
Wei-Ning Hsu
Yu Zhang
RJ Skerry-Ryan
87
117
0
30 Aug 2018
Fast Spectrogram Inversion using Multi-head Convolutional Neural Networks
Sercan O. Arik
Heewoo Jun
G. Diamos
110
108
0
20 Aug 2018
Multimodal speech synthesis architecture for unsupervised speaker adaptation
Hieu-Thi Luong
Junichi Yamagishi
75
10
0
20 Aug 2018
Investigating accuracy of pitch-accent annotations in neural network-based speech synthesis and denoising effects
Hieu-Thi Luong
Xin Wang
Junichi Yamagishi
Nobuyuki Nishizawa
60
16
0
02 Aug 2018