Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram

25 October 2019

Papers citing "Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram"

50 / 464 papers shown

VocBench: A Neural Vocoder Benchmark for Speech Synthesis Ehab A. AlBadawy Andrew Gibiansky Qing He Jilong Wu Ming-Ching Chang Siwei Lyu 58 12 0 06 Dec 2021
Steerable discovery of neural audio effects C. Steinmetz Joshua D. Reiss 52 6 0 06 Dec 2021
High Quality Streaming Speech Synthesis with Low, Sentence-Length-Independent Latency Nikolaos Ellinas G. Vamvoukakis K. Markopoulos Aimilios Chalamandaris Georgia Maniati Panos Kakoulidis S. Raptis June Sig Sung Hyoungmin Park Pirros Tsiakoulis 139 37 0 17 Nov 2021
AC-VC: Non-parallel Low Latency Phonetic Posteriorgrams Based Voice Conversion Damien Ronssin Milos Cernak 78 11 0 12 Nov 2021
RAVE: A variational autoencoder for fast and high-quality neural audio synthesis Antoine Caillon P. Esling DRL 68 112 0 09 Nov 2021
Speaker Generation Daisy Stanton Matt Shannon Soroosh Mariooryad RJ Skerry-Ryan Eric Battenberg Tom Bagby David Kao 77 30 0 07 Nov 2021
WaveFake: A Data Set to Facilitate Audio Deepfake Detection Joel Frank Lea Schonherr DiffM 204 131 0 04 Nov 2021
RefineGAN: Universally Generating Waveform Better than Ground Truth with Highly Accurate Pitch and Intensity Responses Shengyuan Xu Wenxiao Zhao Jing Guo 63 12 0 01 Nov 2021
Learning Continuous Representation of Audio for Arbitrary Scale Super Resolution Jaechang Kim Yunjoo Lee Seunghoon Hong Jungseul Ok SupR CLL 70 13 0 30 Oct 2021
Zero-shot Voice Conversion via Self-supervised Prosody Representation Learning Shijun Wang Dimche Kostadinov Damian Borth 83 11 0 27 Oct 2021
TUNet: A Block-online Bandwidth Extension Model based on Transformers and Self-supervised Pretraining Viet-Anh Nguyen Anh H. T. Nguyen Andy W. H. Khong 60 22 0 26 Oct 2021
Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conversion Zongyang Du Berrak Sisman Kun Zhou Haizhou Li 91 24 0 20 Oct 2021
Speech Enhancement-assisted Voice Conversion in Noisy Environments Yun-Ju Chan Chiang-Jen Peng Syu-Siang Wang Hsin-Min Wang Yu Tsao T. Chi 46 2 0 19 Oct 2021
Neural Synthesis of Footsteps Sound Effects with Generative Adversarial Networks Marco Comunità Huy Phan Joshua D. Reiss GAN 49 11 0 18 Oct 2021
KaraTuner: Towards end to end natural pitch correction for singing voice in karaoke Xiaobin Zhuang Huiran Yu Weifeng Zhao Tao Jiang Peng Hu 90 6 0 18 Oct 2021
Neural Dubber: Dubbing for Videos According to Scripts Chenxu Hu Qiao Tian Tingle Li Yuping Wang Yuxuan Wang Hang Zhao DiffM VGen 99 43 0 15 Oct 2021
Towards Identity Preserving Normal to Dysarthric Voice Conversion Wen-Chin Huang B. Halpern Lester Phillip Violeta O. Scharenborg Tomoki Toda 106 23 0 15 Oct 2021
ESPnet2-TTS: Extending the Edge of TTS Research Tomoki Hayashi Ryuichi Yamamoto Takenori Yoshimura Peter Wu Jiatong Shi Takaaki Saeki Yooncheol Ju Yusuke Yasuda Shinnosuke Takamichi Shinji Watanabe VLM 85 63 0 15 Oct 2021
SingGAN: Generative Adversarial Network For High-Fidelity Singing Voice Generation Rongjie Huang Chenye Cui Feiyang Chen Yi Ren Jinglin Liu Zhou Zhao Baoxing Huai N. Yuan GAN 203 63 0 14 Oct 2021
FedSpeech: Federated Text-to-Speech with Continual Learning Ziyue Jiang Yi Ren Ming Lei Zhou Zhao FedML 166 28 0 14 Oct 2021
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing Junyi Ao Rui Wang Long Zhou Chengyi Wang Shuo Ren ... Yu Zhang Zhihua Wei Yao Qian Jinyu Li Furu Wei 162 202 0 14 Oct 2021
A Melody-Unsupervision Model for Singing Voice Synthesis Soonbeom Choi Juhan Nam 67 14 0 13 Oct 2021
Source Mixing and Separation Robust Audio Steganography Naoya Takahashi M. Singh Yuki Mitsufuji 60 6 0 11 Oct 2021
Towards Universal Neural Vocoding with a Multi-band Excited WaveNet Axel Roebel F. Bous 56 2 0 07 Oct 2021
GANtron: Emotional Speech Synthesis with Generative Adversarial Networks E. Hortal Rodrigo Brechard Alarcia GAN 46 2 0 06 Oct 2021
Neural Pitch-Shifting and Time-Stretching with Controllable LPCNet Max Morrison Zeyu Jin Nicholas J. Bryan Juan-Pablo Caceres Bryan Pardo 73 14 0 05 Oct 2021
On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis Cheng-I Jeff Lai Erica Cooper Yang Zhang Shiyu Chang Kaizhi Qian ... Yung-Sung Chuang Alexander H. Liu Junichi Yamagishi David D. Cox James R. Glass 69 6 0 04 Oct 2021
PortaSpeech: Portable and High-Quality Generative Text-to-Speech Yi Ren Jinglin Liu Zhou Zhao 122 79 0 30 Sep 2021
VoiceFixer: Toward General Speech Restoration with Neural Vocoder Haohe Liu Qiuqiang Kong Qiao Tian Yan Zhao DeLiang Wang Chuanzeng Huang Yuxuan Wang 87 58 0 28 Sep 2021
MSR-NV: Neural Vocoder Using Multiple Sampling Rates Kentaro Mitsui Kei Sawada 109 0 0 28 Sep 2021
FlowVocoder: A small Footprint Neural Vocoder based Normalizing flow for Speech Synthesis Manh Luong Viet-Anh Tran 24 2 0 27 Sep 2021
Time Alignment using Lip Images for Frame-based Electrolaryngeal Voice Conversion Yi-Syuan Liou Wen-Chin Huang Ming-Chi Yen S. Tsai Yu-Huai Peng Tomoki Toda Yu Tsao Hsin-Min Wang 66 1 0 08 Sep 2021
Bilateral Denoising Diffusion Models Max W. Y. Lam Jun Wang Rongjie Huang Jane Polak Scowcroft Dong Yu DiffM 83 43 0 26 Aug 2021
GC-TTS: Few-shot Speaker Adaptation with Geometric Constraints Ji-Hoon Kim Sang-Hoon Lee Ji-Hyun Lee Hong G Jung Seong-Whan Lee 162 6 0 16 Aug 2021
Masked Acoustic Unit for Mispronunciation Detection and Correction Zhan Zhang Yuehai Wang Jianyi Yang 116 3 0 12 Aug 2021
A Streamwise GAN Vocoder for Wideband Speech Coding at Very Low Bit Rate Ahmed Mustafa Jan Büthe Srikanth Korse Kishan Gupta Guillaume Fuchs N. Pia 129 19 0 09 Aug 2021
Applying the Information Bottleneck Principle to Prosodic Representation Learning Guangyan Zhang Ying Qin Daxin Tan Tan Lee 77 4 0 05 Aug 2021
DarkGAN: Exploiting Knowledge Distillation for Comprehensible Audio Synthesis with GANs J. Nistal Stefan Lattner G. Richard 74 9 0 03 Aug 2021
Creation and Detection of German Voice Deepfakes Vanessa Barnekow Dominik Binder Niclas Kromrey Pascal Munaretto A. Schaad Felix Schmieder 23 3 0 02 Aug 2021
A Survey on Audio Synthesis and Audio-Visual Multimodal Processing Zhaofeng Shi 57 7 0 01 Aug 2021
Sequence-to-Sequence Voice Reconstruction for Silent Speech in a Tonal Language Huiyan Li Haohong Lin You Wang Hengyang Wang Ming Zhang Han Gao Qing Ai Zhiyuan Luo Guang Li 63 14 0 31 Jul 2021
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion Yinghao Aaron Li A. Zare N. Mesgarani 97 101 0 21 Jul 2021
Digital Einstein Experience: Fast Text-to-Speech for Conversational AI Joanna Rownicka Kilian Sprenkamp A. Tripiana Volodymyr Gromoglasov Timo P. Kunz 26 0 0 21 Jul 2021
On Prosody Modeling for ASR+TTS based Voice Conversion Wen-Chin Huang Tomoki Hayashi Xinjian Li Shinji Watanabe Tomoki Toda 73 9 0 20 Jul 2021
SVSNet: An End-to-end Speaker Voice Similarity Assessment Model Cheng-Hung Hu Yu-Huai Peng Junichi Yamagishi Yu Tsao Hsin-Min Wang 48 5 0 20 Jul 2021
Filtered Noise Shaping for Time Domain Room Impulse Response Estimation From Reverberant Speech C. Steinmetz V. Ithapu P. Calamia 83 40 0 15 Jul 2021
Neural Waveshaping Synthesis B. Hayes C. Saitis Gyorgy Fazekas 85 28 0 11 Jul 2021
EditSpeech: A Text Based Speech Editing System Using Partial Inference and Bidirectional Fusion Daxin Tan Liqun Deng Y. Yeung Xin Jiang Xiao Chen Tan Lee 91 41 0 04 Jul 2021
Adversarial Sample Detection for Speaker Verification by Neural Vocoders Haibin Wu Po-Chun Hsu Ji Gao Shanshan Zhang Shen Huang Jian Kang Zhiyong Wu Helen Meng Hung-yi Lee AAML 93 21 0 01 Jul 2021
A Survey on Neural Speech Synthesis Xu Tan Tao Qin Frank Soong Tie-Yan Liu AI4TS 133 359 0 29 Jun 2021