WaveNet: A Generative Model for Raw Audio

12 September 2016

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,082 papers shown

Title
Analysis and Assessment of Controllability of an Expressive Deep Learning-based TTS system Noé Tits Kevin El Haddad Thierry Dutoit 69 5 0 06 Mar 2021
Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech C. Chien Jheng-hao Lin Chien-yu Huang Po-Chun Hsu Hung-yi Lee 127 70 0 06 Mar 2021
Enhanced 3D Human Pose Estimation from Videos by using Attention-Based Neural Network with Dilated Convolutions Ruixu Liu Ju Shen He Wang Chong Chen S. Cheung V. Asari 3DH 72 31 0 04 Mar 2021
crank: An Open-Source Software for Nonparallel Voice Conversion Based on Vector-Quantized Variational Autoencoder Kazuhiro Kobayashi Wen-Chin Huang Yi-Chiao Wu Patrick Lumban Tobing Tomoki Hayashi Tomoki Toda BDL DRL 68 19 0 04 Mar 2021
Predicting Video with VQVAE Jacob Walker Ali Razavi Aaron van den Oord DRL 131 69 0 02 Mar 2021
A Spectral Enabled GAN for Time Series Data Generation Kaleb E. Smith Anthony O. Smith GAN 45 12 0 02 Mar 2021
Experiments with Rich Regime Training for Deep Learning Xinyan Li A. Banerjee 73 2 0 26 Feb 2021
Deepfakes Generation and Detection: State-of-the-art, open challenges, countermeasures, and way forward Momina Masood M. Nawaz K. Malik A. Javed Aun Irtaza AAML 208 323 0 25 Feb 2021
Automatic Feature Extraction for Heartbeat Anomaly Detection Robert-George Colt Csongor-Huba Várady Riccardo Volpi Luigi Malagò 24 4 0 24 Feb 2021
Multi-Task Temporal Convolutional Networks for Joint Recognition of Surgical Phases and Steps in Gastric Bypass Procedures Sanat Ramesh Diego Dall'Alba Cristians Gonzalez Tong Yu Pietro Mascagni Didier Mutter J. Marescaux Paolo Fiorini N. Padoy 83 69 0 24 Feb 2021
Speech Enhancement Using Multi-Stage Self-Attentive Temporal Convolutional Networks Ju Lin A. Wijngaarden Kuang-Ching Wang M. C. Smith 78 51 0 24 Feb 2021
Handling Background Noise in Neural Speech Generation Tom Denton Alejandro Luebs Felicia S. C. Lim Andrew Storus Hengchin Yeh W. Kleijn Jan Skoglund 52 2 0 23 Feb 2021
Anytime Sampling for Autoregressive Models via Ordered Autoencoding Yilun Xu Yang Song Sahaj Garg Linyuan Gong Rui Shu Aditya Grover Stefano Ermon DiffM 93 11 0 23 Feb 2021
Investigating Deep Neural Structures and their Interpretability in the Domain of Voice Conversion Samuel J. Broughton Md. Asif Jalal Roger K. Moore 39 0 0 22 Feb 2021
Improving Lossless Compression Rates via Monte Carlo Bits-Back Coding Yangjun Ruan Karen Ullrich Daniel de Souza Severo James Townsend Ashish Khisti Arnaud Doucet Alireza Makhzani Chris J. Maddison 112 25 0 22 Feb 2021
Anyone GAN Sing Shreeviknesh Sankaran Sukavanan Nanjundan G. Anand GAN 54 2 0 22 Feb 2021
Introducing an experimental distortion-tolerant speech encryption scheme for secure voice communication Piotr Krasnowski J. Lebrun Bruno Martin 22 2 0 19 Feb 2021
Hierarchical Recurrent Neural Networks for Conditional Melody Generation with Long-term Structure Zixun Guo D. Makris Dorien Herremans 77 24 0 19 Feb 2021
Generative Speech Coding with Predictive Variance Regularization W. Kleijn Andrew Storus Michael Chinen Tom Denton Felicia S. C. Lim Alejandro Luebs Jan Skoglund Hengchin Yeh 68 68 0 18 Feb 2021
AudioVisual Speech Synthesis: A brief literature review Efthymios Georgiou Athanasios Katsamanis 27 0 0 18 Feb 2021
Deep Learning Approaches for Forecasting Strawberry Yields and Prices Using Satellite Images and Station-Based Soil Parameters Mohita Chaudhary Mohamed Sadok Gastli Lobna Nassar Fakhri Karray 18 7 0 17 Feb 2021
One-shot action recognition in challenging therapy scenarios Alberto Sabater Laura Santos J. Santos-Victor Alexandre Bernardino Luis Montesano Ana C. Murillo 129 39 0 17 Feb 2021
Hierarchical VAEs Know What They Don't Know Jakob Drachmann Havtorn J. Frellsen Søren Hauberg Lars Maaløe DRL 131 74 0 16 Feb 2021
PeriodNet: A non-autoregressive waveform generation model with a structure separating periodic and aperiodic components Yukiya Hono Shinji Takaki Kei Hashimoto Keiichiro Oura Yoshihiko Nankaku K. Tokuda 69 16 0 15 Feb 2021
Leveraging Acoustic and Linguistic Embeddings from Pretrained speech and language Models for Intent Classification Bidisha Sharma Maulik C. Madhavi Haizhou Li 51 20 0 15 Feb 2021
Deep Convolutional and Recurrent Networks for Polyphonic Instrument Classification from Monophonic Raw Audio Waveforms Kleanthis Avramidis Agelos Kratimenos C. Garoufis Athanasia Zlatintsi Petros Maragos 43 8 0 13 Feb 2021
Enhancing into the codec: Noise Robust Speech Coding with Vector-Quantized Autoencoders Jonah Casebeer Vinjai Vale Umut Isik J. Valin Ritwik Giri A. Krishnaswamy 100 20 0 12 Feb 2021
VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention Peng Liu Yuewen Cao Songxiang Liu Na Hu Guangzhi Li Chao Weng Jane Polak Scowcroft 95 22 0 12 Feb 2021
DEEPF0: End-To-End Fundamental Frequency Estimation for Music and Speech Signals Satwinder Singh Ruili Wang Yuanhang Qiu 45 26 0 11 Feb 2021
ASVspoof 2019: spoofing countermeasures for the detection of synthesized, converted and replayed speech A. Nautsch Xin Wang Nicholas W. D. Evans Tomi Kinnunen Ville Vestman Massimiliano Todisco Héctor Delgado Md. Sahidullah Junichi Yamagishi Kong Aik Lee 197 155 0 11 Feb 2021
Causal Inference for Time series Analysis: Problems, Methods and Evaluation Raha Moraffah Paras Sheth Mansooreh Karami Anchit Bhattacharya Qianru Wang Anique Tahir A. Raglin Huan Liu CML AI4TS 113 111 0 11 Feb 2021
Self-Supervised VQ-VAE for One-Shot Music Style Transfer Ondřej Cífka A. Ozerov Umut Simsekli G. Richard 79 28 0 10 Feb 2021
Voice Cloning: a Multi-Speaker Text-to-Speech Synthesis Approach based on Transfer Learning Giuseppe Ruggiero Enrico Zovato Luigi Di Caro V. Pollet DiffM 63 10 0 10 Feb 2021
Conditional Loss and Deep Euler Scheme for Time Series Generation Carl Remlinger Joseph Mikael Romuald Elie DiffM 111 12 0 10 Feb 2021
EMA2S: An End-to-End Multimodal Articulatory-to-Speech System Yu-Wen Chen Kuo-Hsuan Hung Shang-Yi Chuang Jonathan Sherman Wen-Chin Huang Xugang Lu Yu Tsao 86 16 0 07 Feb 2021
Multi-Task Self-Supervised Pre-Training for Music Classification Ho-Hsiang Wu Chieh-Chi Kao Qingming Tang Ming Sun Brian McFee J. P. Bello Chao Wang SSL 414 37 0 05 Feb 2021
Co-Mixup: Saliency Guided Joint Mixup with Supermodular Diversity Jang-Hyun Kim Wonho Choo Hosan Jeong Hyun Oh Song 275 184 0 05 Feb 2021
Invertible DenseNets with Concatenated LipSwish Yura Perugachi-Diaz Jakub M. Tomczak Sandjai Bhulai 139 20 0 04 Feb 2021
CKConv: Continuous Kernel Convolution For Sequential Data David W. Romero Anna Kuzina Erik J. Bekkers Jakub M. Tomczak Mark Hoogendoorn 77 126 0 04 Feb 2021
Generative Spoken Language Modeling from Raw Audio Kushal Lakhotia Evgeny Kharitonov Wei-Ning Hsu Yossi Adi Adam Polyak ... Tu Nguyen Jade Copet Alexei Baevski A. Mohamed Emmanuel Dupoux AuLLM 295 366 0 01 Feb 2021
Universal Neural Vocoding with Parallel WaveNet Yunlong Jiao Adam Gabryś Georgi Tinchev Bartosz Putrycz Daniel Korzekwa V. Klimkov 85 42 0 01 Feb 2021
Triple M: A Practical Text-to-speech Synthesis System With Multi-guidance Attention And Multi-band Multi-time LPCNet Shilu Lin Fenglong Xie Li Meng Xinhui Li Li Lu 83 0 0 30 Jan 2021
Time Series (re)sampling using Generative Adversarial Networks Christian Moller Dahl Emil N. Sørensen TTA AI4TS 66 6 0 30 Jan 2021
Expressive Neural Voice Cloning Paarth Neekhara Shehzeen Samarah Hussain Shlomo Dubnov F. Koushanfar Julian McAuley DiffM 59 30 0 30 Jan 2021
A causal convolutional neural network for multi-subject motion modeling and generation Shuaiying Hou Congyi Wang Wenlin Zhuang Yu Chen Yangang Wang Hujun Bao Jinxiang Chai Weiwei Xu 78 4 0 28 Jan 2021
Autoregressive Denoising Diffusion Models for Multivariate Probabilistic Time Series Forecasting Kashif Rasul Calvin Seward Ingmar Schuster Roland Vollgraf DiffM 194 320 0 28 Jan 2021
Semi-supervised source localization in reverberant environments with deep generative modeling Michael J. Bianco Sharon Gannot Efren Fernandez-Grande Peter Gerstoft 66 21 0 26 Jan 2021
High-Quality Vocoding Design with Signal Processing for Speech Synthesis and Voice Conversion M. S. Al-Radhi 34 1 0 25 Jan 2021
Multi-Task Time Series Forecasting With Shared Attention Zekai Chen Jiaze E Xiao Zhang Hao Sheng Xiuzhen Cheng AI4TS 86 20 0 24 Jan 2021
Generating a Doppelganger Graph: Resembling but Distinct Yuliang Ji Ru Huang Jie Chen Yuanzhe Xi 55 2 0 23 Jan 2021