A comparison of recent waveform generation and acoustic modeling methods
for neural-network-based speech synthesis

7 April 2018

Xin Wang

Jaime Lorenzo-Trueba

Junichi Yamagishi

Papers citing "A comparison of recent waveform generation and acoustic modeling methods for neural-network-based speech synthesis"

Title | Authors | Date
Partial Rank Similarity Minimization Method for Quality MOS Prediction of Unseen Speech Synthesis Systems in Zero-Shot and Semi-supervised setting | Hemant Yadav, Erica Cooper, Junichi Yamagishi, Sunayana Sitaram, R. Shah | 08 Oct 2023
VisualTTS: TTS with Accurate Lip-Speech Synchronization for Automatic Voice Over | Junchen Lu, Berrak Sisman, Rui Liu, Mingyang Zhang, Haizhou Li | 07 Oct 2021
A Survey on Neural Speech Synthesis | Xu Tan, Tao Qin, Frank Soong, Tie-Yan Liu | 29 Jun 2021
Universal Neural Vocoding with Parallel WaveNet | Yunlong Jiao, Adam Gabryś, Georgi Tinchev, Bartosz Putrycz, Daniel Korzekwa, V. Klimkov | 01 Feb 2021
VAW-GAN for Disentanglement and Recomposition of Emotional Elements in Speech | Kun Zhou, Berrak Sisman, Haizhou Li | 03 Nov 2020
Parallel waveform synthesis based on generative adversarial networks with voicing-aware conditional discriminators | Ryuichi Yamamoto, Eunwoo Song, Min-Jae Hwang, Jae-Min Kim | 27 Oct 2020
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning | Berrak Sisman, Junichi Yamagishi, Simon King, Haizhou Li | 09 Aug 2020
Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis | Yusuke Yasuda, Xin Wang, Junichi Yamagishi | 20 May 2020
Transferring neural speech waveform synthesizers to musical instrument sounds generation | Yi Zhao, Xin Wang, Lauri Juvela, Junichi Yamagishi | 27 Oct 2019
Neural source-filter waveform models for statistical parametric speech synthesis | Xin Wang, Shinji Takaki, Junichi Yamagishi | 27 Apr 2019
Probability density distillation with generative adversarial networks for high-quality parallel waveform generation | Ryuichi Yamamoto, Eunwoo Song, Jae-Min Kim | 09 Apr 2019
Towards achieving robust universal neural vocoding | Jaime Lorenzo-Trueba, Thomas Drugman, Javier Latorre, Thomas Merritt, Bartosz Putrycz, Roberto Barra-Chicote, Alexis Moinet, Vatsal Aggarwal | 15 Nov 2018
ExcitNet vocoder: A neural excitation model for parametric speech synthesis systems | Eunwoo Song, Kyungguen Byun, Hong-Goo Kang | 09 Nov 2018
Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language | Yusuke Yasuda, Xin Wang, Shinji Takaki, Junichi Yamagishi | 29 Oct 2018
STFT spectral loss for training a neural speech waveform model | Shinji Takaki, Toru Nakashika, Xin Wang, Junichi Yamagishi | 29 Oct 2018
Sequence-to-Sequence Acoustic Modeling for Voice Conversion | Jing-Xuan Zhang, Zhenhua Ling, Li-Juan Liu, Yuan Jiang, Lirong Dai | 16 Oct 2018
Speaker-independent raw waveform model for glottal excitation | Lauri Juvela, Vassilis Tsiaras, Bajibabu Bollepalli, Manu Airaksinen, Junichi Yamagishi, P. Alku | 25 Apr 2018
Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama's voice using GAN, WaveNet and low-quality found data | Jaime Lorenzo-Trueba, Fuming Fang, Xin Wang, Isao Echizen, Junichi Yamagishi, Tomi Kinnunen | 02 Mar 2018