Efficient Neural Audio Synthesis

23 February 2018

Papers citing "Efficient Neural Audio Synthesis"

50 / 469 papers shown

Title
Fre-GAN: Adversarial Frequency-consistent Audio Synthesis Ji-Hoon Kim Sang-Hoon Lee Ji-Hyun Lee Seong-Whan Lee 104 54 0 04 Jun 2021
NVC-Net: End-to-End Adversarial Voice Conversion Bac Nguyen Cong Fabien Cardinaux AAML 126 42 0 02 Jun 2021
1xN Pattern for Pruning Convolutional Neural Networks Mingbao Lin Yu-xin Zhang Yuchao Li Bohong Chen Yong Li Mengdi Wang Shen Li Yonghong Tian Rongrong Ji 3DPC 134 44 0 31 May 2021
DiffSVC: A Diffusion Probabilistic Model for Singing Voice Conversion Songxiang Liu Yuewen Cao Jane Polak Scowcroft Helen Meng DiffM 86 59 0 28 May 2021
High-Fidelity and Low-Latency Universal Neural Vocoder based on Multiband WaveRNN with Data-Driven Linear Prediction for Discrete Waveform Modeling Patrick Lumban Tobing Tomoki Toda 62 8 0 20 May 2021
Dual-side Sparse Tensor Core Yang-Feng Wang Chen Zhang Zhiqiang Xie Cong Guo Yunxin Liu Jingwen Leng 91 76 0 20 May 2021
MASS: Multi-task Anthropomorphic Speech Synthesis Framework Jinyin Chen Linhui Ye Zhaoyan Ming 65 7 0 10 May 2021
Protecting gender and identity with disentangled speech representations Dimitrios Stoidis Andrea Cavallaro 66 10 0 22 Apr 2021
Compact CNN Structure Learning by Knowledge Distillation Waqar Ahmed Andrea Zunino Pietro Morerio Vittorio Murino 115 5 0 19 Apr 2021
Accelerating Sparse Deep Neural Networks Asit K. Mishra J. Latorre Jeff Pool Darko Stosic Dusan Stosic Ganesh Venkatesh Chong Yu Paulius Micikevicius 167 237 0 16 Apr 2021
Unified Source-Filter GAN: Unified Source-filter Network Based On Factorization of Quasi-Periodic Parallel WaveGAN Reo Yoneyama Yi-Chiao Wu Tomoki Toda 73 12 0 10 Apr 2021
Noise Estimation for Generative Diffusion Models Robin San-Roman Eliya Nachmani Lior Wolf DiffM 133 107 0 06 Apr 2021
NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling Junhyeok Lee Seungu Han DiffM 76 70 0 06 Apr 2021
Multi-rate attention architecture for fast streamable Text-to-speech spectrum modeling Qing He Zhiping Xiu T. Koehler Jilong Wu 75 7 0 01 Apr 2021
Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-stage Sequence-to-Sequence Training Kun Zhou Berrak Sisman Haizhou Li 78 29 0 31 Mar 2021
Training Sparse Neural Network by Constraining Synaptic Weight on Unit Lp Sphere Weipeng Li Xiaogang Yang Chuanxiang Li Ruitao Lu Xueli Xie 27 0 0 30 Mar 2021
PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS Ye Jia Heiga Zen Jonathan Shen Yu Zhang Yonghui Wu SSL 103 84 0 28 Mar 2021
Scalable and Efficient Neural Speech Coding: A Hybrid Design Kai Zhen Jongmo Sung Mi Suk Lee Seung-Wha Beack Minje Kim 95 14 0 27 Mar 2021
Continual Speaker Adaptation for Text-to-Speech Synthesis Hamed Hemati Damian Borth CLL 77 9 0 26 Mar 2021
Improve GAN-based Neural Vocoder using Pointwise Relativistic LeastSquare GAN Cong Wang Yu Chen Bin Wang Yi Shi 146 1 0 26 Mar 2021
Latent Space Explorations of Singing Voice Synthesis using DDSP J. Alonso Cumhur Erkut 145 12 0 12 Mar 2021
GAN Vocoder: Multi-Resolution Discriminator Is All You Need J. You Dalhyun Kim Gyuhyeon Nam Geumbyeol Hwang Gyeongsu Chae 68 27 0 09 Mar 2021
Generating Images with Sparse Representations C. Nash Jacob Menick Sander Dieleman Peter W. Battaglia 93 211 0 05 Mar 2021
Compute and memory efficient universal sound source separation Efthymios Tzinis Zhepei Wang Xilin Jiang Paris Smaragdis 90 40 0 03 Mar 2021
Handling Background Noise in Neural Speech Generation Tom Denton Alejandro Luebs Felicia S. C. Lim Andrew Storus Hengchin Yeh W. Kleijn Jan Skoglund 52 2 0 23 Feb 2021
Generative Speech Coding with Predictive Variance Regularization W. Kleijn Andrew Storus Michael Chinen Tom Denton Felicia S. C. Lim Alejandro Luebs Jan Skoglund Hengchin Yeh 68 68 0 18 Feb 2021
PeriodNet: A non-autoregressive waveform generation model with a structure separating periodic and aperiodic components Yukiya Hono Shinji Takaki Kei Hashimoto Keiichiro Oura Yoshihiko Nankaku K. Tokuda 69 16 0 15 Feb 2021
VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention Peng Liu Yuewen Cao Songxiang Liu Na Hu Guangzhi Li Chao Weng Jane Polak Scowcroft 95 22 0 12 Feb 2021
Voice Cloning: a Multi-Speaker Text-to-Speech Synthesis Approach based on Transfer Learning Giuseppe Ruggiero Enrico Zovato Luigi Di Caro V. Pollet DiffM 63 10 0 10 Feb 2021
Universal Neural Vocoding with Parallel WaveNet Yunlong Jiao Adam Gabryś Georgi Tinchev Bartosz Putrycz Daniel Korzekwa V. Klimkov 81 42 0 01 Feb 2021
Triple M: A Practical Text-to-speech Synthesis System With Multi-guidance Attention And Multi-band Multi-time LPCNet Shilu Lin Fenglong Xie Li Meng Xinhui Li Li Lu 72 0 0 30 Jan 2021
Whispered and Lombard Neural Speech Synthesis Qiong Hu T. Bleisch Petko N. Petkov T. Raitio Erik Marchi V. Lakshminarasimhan 63 14 0 13 Jan 2021
Parallel WaveNet conditioned on VAE latent vectors Jonas Rohnke Thomas Merritt Jaime Lorenzo-Trueba Adam Gabryś Vatsal Aggarwal Alexis Moinet Roberto Barra-Chicote 74 3 0 17 Dec 2020
DeepTalk: Vocal Style Encoding for Speaker Recognition and Speech Synthesis Anurag Chowdhury Arun Ross Prabu David 38 5 0 09 Dec 2020
I'm Sorry for Your Loss: Spectrally-Based Audio Distances Are Bad at Pitch Joseph P. Turian Max Henry 49 31 0 08 Dec 2020
Text-to-speech for the hearing impaired Josef Schlittenlacher T. Baer 32 0 0 03 Dec 2020
Phonetic Posteriorgrams based Many-to-Many Singing Voice Conversion via Adversarial Training Haohan Guo Heng Lu Na Hu Chunlei Zhang Shan Yang Lei Xie Jane Polak Scowcroft Dong Yu AAML 68 12 0 03 Dec 2020
MelGlow: Efficient Waveform Generative Network Based on Location-Variable Convolution Zhen Zeng Jianzong Wang Ning Cheng Jing Xiao 44 8 0 03 Dec 2020
FBWave: Efficient and Scalable Neural Vocoders for Streaming Text-To-Speech on the Edge Bichen Wu Qing He Peizhao Zhang T. Koehler Kurt Keutzer Peter Vajda 47 6 0 25 Nov 2020
Synth2Aug: Cross-domain speaker recognition with TTS synthesized speech Yiling Huang Yutian Chen Jason W. Pelecanos Quan Wang 100 12 0 24 Nov 2020
Empirical Evaluation of Deep Learning Model Compression Techniques on the WaveNet Vocoder Sam Davis Giuseppe Coccia Sam Gooch Julian Mack 38 0 0 20 Nov 2020
Universal MelGAN: A Robust Neural Vocoder for High-Fidelity Waveform Generation in Multiple Domains Won Jang D. Lim Jaesam Yoon 60 34 0 19 Nov 2020
Towards transformation-resilient provenance detection of digital media Jamie Hayes Krishnamurthy Dvijotham Yutian Chen Sander Dieleman Pushmeet Kohli Norman Casagrande 30 3 0 14 Nov 2020
A Comprehensive Survey on Deep Music Generation: Multi-level Representations, Algorithms, Evaluations, and Future Directions Shulei Ji Jing Luo Xinyu Yang MGen 61 126 0 13 Nov 2020
Low-resource expressive text-to-speech using data augmentation Goeric Huybrechts Thomas Merritt Giulia Comini Bartek Perz Raahil Shah Jaime Lorenzo-Trueba 68 53 0 11 Nov 2020
Enhancing Low-Quality Voice Recordings Using Disentangled Channel Factor and Neural Waveform Model Haoyu Li Yang Ai Junichi Yamagishi 76 2 0 10 Nov 2020
Pretraining Strategies, Waveform Model Choice, and Acoustic Configurations for Multi-Speaker End-to-End Speech Synthesis Erica Cooper Xin Wang Yi Zhao Yusuke Yasuda Junichi Yamagishi SyDa 50 3 0 10 Nov 2020
Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis Ron J. Weiss RJ Skerry-Ryan Eric Battenberg Soroosh Mariooryad Diederik P. Kingma 99 101 0 06 Nov 2020
Improving Prosody Modelling with Cross-Utterance BERT Embeddings for End-to-end Speech Synthesis Guanghui Xu Wei Song Zhengchen Zhang Chao Zhang Xiaodong He Bowen Zhou 62 50 0 06 Nov 2020
Paralinguistic Privacy Protection at the Edge Ranya Aloufi Hamed Haddadi David E. Boyle 66 14 0 04 Nov 2020