Title
Introduction to Voice Presentation Attack Detection and Recent Advances Md. Sahidullah Héctor Delgado Massimiliano Todisco Tomi Kinnunen Nicholas W. D. Evans Junichi Yamagishi Kong-Aik Lee AAML 83 75 0 04 Jan 2019
Feature reinforcement with word embedding and parsing information in neural TTS Huaiping Ming Lei He Haohan Guo Frank Soong 167 15 0 03 Jan 2019
Modeling Multi-speaker Latent Space to Improve Neural TTS: Quick Enrolling New Speaker and Enhancing Premium Voice Yan Deng Lei He Frank Soong 114 29 0 13 Dec 2018
FPETS : Fully Parallel End-to-End Text-to-Speech System Dabiao Ma Zhiba Su Wenxuan Wang Yuhao Lu 58 6 0 12 Dec 2018
Learning latent representations for style control and transfer in end-to-end speech synthesis Ya-Jie Zhang Shifeng Pan Lei He Zhenhua Ling BDL SSL DRL 109 229 0 11 Dec 2018
Generative Adversarial Network based Speaker Adaptation for High Fidelity WaveNet Vocoder Qiao Tian Bing Yang Shan Liu GAN 55 9 0 06 Dec 2018
LP-WaveNet: Linear Prediction-based WaveNet Speech Synthesis Min-Jae Hwang Frank Soong Fenglong Xie Xi Wang Hyeonjoo Kang Hong-Goo Kang 70 21 0 29 Nov 2018
Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion Wen-Chin Huang Yi-Chiao Wu Hsin-Te Hwang Patrick Lumban Tobing Tomoki Hayashi Kazuhiro Kobayashi Tomoki Toda Yu Tsao H. Wang 75 20 0 27 Nov 2018
Learning pronunciation from a foreign language in speech synthesis networks Younggun Lee Suwon Shon Taesu Kim 58 28 0 23 Nov 2018
TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer Sicong Huang Qiyang Li Cem Anil Xuchan Bao Sageev Oore Roger C. Grosse 95 98 0 22 Nov 2018
Bytes are All You Need: End-to-End Multilingual Speech Recognition and Synthesis with Bytes Yue Liu Yu Zhang Tara N. Sainath Yonghui Wu William Chan AuLLM 79 131 0 22 Nov 2018
The Effect of Explicit Structure Encoding of Deep Neural Networks for Symbolic Music Generation Kai Chen Weilin Zhang Shlomo Dubnov Gus Xia Wei Li MGen 62 5 0 20 Nov 2018
Improving Sequence-to-Sequence Acoustic Modeling by Adding Text-Supervision Jing-Xuan Zhang Zhenhua Ling Yuan Jiang Li-Juan Liu Chen Liang Lirong Dai 80 30 0 20 Nov 2018
Representation Mixing for TTS Synthesis Kyle Kastner J. F. Santos Yoshua Bengio Aaron Courville 64 43 0 17 Nov 2018
Generating Albums with SampleRNN to Imitate Metal, Rock, and Punk Bands CJ Carr Zack Zukowski MGen 35 20 0 16 Nov 2018
Effect of data reduction on sequence-to-sequence neural TTS Javier Latorre Jakub Lachowicz Jaime Lorenzo-Trueba Thomas Merritt Thomas Drugman S. Ronanki Viacheslav Klimkov 109 59 0 15 Nov 2018
Comprehensive evaluation of statistical speech waveform synthesis Thomas Merritt Bartosz Putrycz Adam Nadolski Tianjun Ye Daniel Korzekwa ... Alexis Moinet A. Breen Rafal Kuklinski N. Strom Roberto Barra-Chicote 51 18 0 15 Nov 2018
Towards achieving robust universal neural vocoding Jaime Lorenzo-Trueba Thomas Drugman Javier Latorre Thomas Merritt Bartosz Putrycz Roberto Barra-Chicote Alexis Moinet Vatsal Aggarwal DRL 152 19 0 15 Nov 2018
PerformanceNet: Score-to-Audio Music Generation with Multi-Band Convolutional Residual Network Bryan Wang Yi-Hsuan Yang 71 38 0 11 Nov 2018
ExcitNet vocoder: A neural excitation model for parametric speech synthesis systems Eunwoo Song Kyungguen Byun Hong-Goo Kang 75 29 0 09 Nov 2018
AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms Kou Tanaka Hirokazu Kameoka Takuhiro Kaneko Nobukatsu Hojo 79 112 0 09 Nov 2018
Speaker-adaptive neural vocoders for parametric speech synthesis systems Eunwoo Song Xiang Yu Erik Cambria Jagath Rajapakse 49 3 0 08 Nov 2018
Reconstructing Speech Stimuli From Human Auditory Cortex Activity Using a WaveNet Approach Ran Wang Yao Wang A. Flinker 39 7 0 06 Nov 2018
FloWaveNet : A Generative Flow for Raw Audio Sungwon Kim Sang-gil Lee Jongyoon Song Jaehyeon Kim Sungroh Yoon 148 169 0 06 Nov 2018
Robust and fine-grained prosody control of end-to-end speech synthesis Younggun Lee Jonathan Le Roux 91 147 0 06 Nov 2018
Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation Ye Jia Melvin Johnson Wolfgang Macherey Ron J. Weiss Yuan Cao Chung-Cheng Chiu Naveen Ari Stella Laurenzo Yonghui Wu 98 163 0 05 Nov 2018
ConvS2S-VC: Fully convolutional sequence-to-sequence voice conversion Hirokazu Kameoka Kou Tanaka Damian Kwaśny Takuhiro Kaneko Nobukatsu Hojo 113 64 0 05 Nov 2018
Investigating context features hidden in End-to-End TTS Kohki Mametani T. Kato Seiichi Yamamoto 52 9 0 04 Nov 2018
Cycle-consistency training for end-to-end speech recognition Takaaki Hori Ramón Fernández Astudillo Tomoki Hayashi Yu Zhang Shinji Watanabe Jonathan Le Roux 97 87 0 02 Nov 2018
Training Neural Speech Recognition Systems with Synthetic Speech Augmentation Jason Chun Lok Li R. Gadde Boris Ginsburg Vitaly Lavrukhin 63 55 0 02 Nov 2018
Neural Music Synthesis for Flexible Timbre Control Jong Wook Kim Rachel M. Bittner Aparna Kumar J. P. Bello 73 39 0 01 Nov 2018
WaveGlow: A Flow-based Generative Network for Speech Synthesis R. Prenger Rafael Valle Bryan Catanzaro 192 1,036 0 31 Oct 2018
Waveform generation for text-to-speech synthesis using pitch-synchronous multi-scale generative adversarial networks Lauri Juvela Bajibabu Bollepalli Junichi Yamagishi P. Alku 77 23 0 30 Oct 2018
End-to-end music source separation: is it possible in the waveform domain? Francesc Lluís Jordi Pons Xavier Serra 100 73 0 29 Oct 2018
Audio inpainting of music by means of neural networks Andrés Marafioti Nicki Holighaus P. Majdak Nathanael Perraudin 97 18 0 29 Oct 2018
Speaking style adaptation in Text-To-Speech synthesis using Sequence-to-sequence models with attention Bajibabu Bollepalli Lauri Juvela P. Alku 51 4 0 29 Oct 2018
Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language Yusuke Yasuda Xin Wang Shinji Takaki Junichi Yamagishi 63 87 0 29 Oct 2018
Neural source-filter-based waveform model for statistical parametric speech synthesis Xin Wang Shinji Takaki Junichi Yamagishi 136 125 0 29 Oct 2018
STFT spectral loss for training a neural speech waveform model Shinji Takaki Toru Nakashika Xin Wang Junichi Yamagishi 75 21 0 29 Oct 2018
LPCNet: Improving Neural Speech Synthesis Through Linear Prediction J. Valin Jan Skoglund 86 451 0 28 Oct 2018
Reducing over-smoothness in speech synthesis using Generative Adversarial Networks Leyuan Sheng Evgeny Nikolaevich Pavlovskiy GAN 69 9 0 25 Oct 2018
Hierarchical Generative Modeling for Controllable Speech Synthesis Wei-Ning Hsu Yu Zhang Ron J. Weiss Heiga Zen Yonghui Wu ... Ye Jia Zhiwen Chen Jonathan Shen Patrick Nguyen Ruoming Pang BDL 109 276 0 16 Oct 2018
Sequence-to-Sequence Acoustic Modeling for Voice Conversion Jing-Xuan Zhang Zhenhua Ling Li-Juan Liu Yuan Jiang Lirong Dai 85 130 0 16 Oct 2018
A Fully Time-domain Neural Model for Subband-based Speech Synthesizer Azam Rabiee Geonmin Kim Tae-Ho Kim Soo-Young Lee 27 1 0 12 Oct 2018
Conditional WaveGAN Chae Young Lee Anoop Toffy G. Jung W. Han DiffM 46 21 0 27 Sep 2018
Neural Speech Synthesis with Transformer Network Naihan Li Shujie Liu Yanqing Liu Sheng Zhao Ming-Yuan Liu M. Zhou 95 102 0 19 Sep 2018
Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis Yu-An Chung Yuxuan Wang Wei-Ning Hsu Yu Zhang RJ Skerry-Ryan 87 117 0 30 Aug 2018
Fast Spectrogram Inversion using Multi-head Convolutional Neural Networks Sercan O. Arik Heewoo Jun G. Diamos 110 108 0 20 Aug 2018
Multimodal speech synthesis architecture for unsupervised speaker adaptation Hieu-Thi Luong Junichi Yamagishi 75 10 0 20 Aug 2018
Investigating accuracy of pitch-accent annotations in neural network-based speech synthesis and denoising effects Hieu-Thi Luong Xin Wang Junichi Yamagishi Nobuyuki Nishizawa 60 16 0 02 Aug 2018