WaveNet: A Generative Model for Raw Audio

12 September 2016

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,082 papers shown

The Effectiveness of Discretization in Forecasting: An Empirical Study on Neural Time Series Models Stephan Rabanser Tim Januschowski Valentin Flunkert David Salinas Jan Gasthaus BDL AI4TS 80 20 0 20 May 2020
Deep learning approaches for neural decoding: from CNNs to LSTMs and spikes to fMRI J. Livezey Joshua I. Glaser AI4CE 100 9 0 19 May 2020
Toward Automated Classroom Observation: Multimodal Machine Learning to Estimate CLASS Positive Climate and Negative Climate Anand Ramakrishnan Brian Zylich Erin Ottmar Jennifer LoCasale-Crouch Jacob Whitehill 36 27 0 19 May 2020
Vector-quantized neural networks for acoustic unit discovery in the ZeroSpeech 2020 challenge Benjamin van Niekerk Leanne Nortje Herman Kamper 120 117 0 19 May 2020
Improving Accent Conversion with Reference Encoder and End-To-End Text-To-Speech Wenjie Li Benlai Tang Xiang Yin Yushi Zhao Wei Li Kang Wang Hao Huang Yuxuan Wang Zejun Ma 70 13 0 19 May 2020
Defending Your Voice: Adversarial Attack on Voice Conversion Chien-yu Huang Yist Y. Lin Hung-yi Lee Lin-Shan Lee AAML 87 52 0 18 May 2020
A Cyclical Post-filtering Approach to Mismatch Refinement of Neural Vocoder for Text-to-speech Systems Yi-Chiao Wu Patrick Lumban Tobing Kazuki Yasuhara Noriyuki Matsunaga Yamato Ohtani Tomoki Toda 50 5 0 18 May 2020
Quasi-Periodic Parallel WaveGAN Vocoder: A Non-autoregressive Pitch-dependent Dilated Convolution Model for Parametric Speech Generation Yi-Chiao Wu Tomoki Hayashi T. Okamoto Hisashi Kawai Tomoki Toda 73 4 0 18 May 2020
MoBoAligner: a Neural Alignment Model for Non-autoregressive TTS with Monotonic Boundary Search Naihan Li Shujie Liu Yanqing Liu Sheng Zhao Ming-Yuan Liu Ming Zhou 50 6 0 18 May 2020
Many-to-Many Voice Transformer Network Hirokazu Kameoka Wen-Chin Huang Kou Tanaka Takuhiro Kaneko Nobukatsu Hojo Tomoki Toda ViT 94 30 0 18 May 2020
Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis Prajwal K R Rudrabha Mukhopadhyay Vinay P. Namboodiri C. V. Jawahar 71 113 0 17 May 2020
Universal Adversarial Perturbations: A Survey Ashutosh Chaubey Nikhil Agrawal Kavya Barnwal K. K. Guliani Pramod Mehta OOD AAML 110 47 0 16 May 2020
Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation Tao Tu Yuan-Jui Chen Alexander H. Liu Hung-yi Lee 54 7 0 16 May 2020
DAMIA: Leveraging Domain Adaptation as a Defense against Membership Inference Attacks Hongwei Huang Weiqi Luo Guoqiang Zeng J. Weng Yue Zhang Anjia Yang AAML 35 26 0 16 May 2020
Improved Prosody from Learned F0 Codebook Representations for VQ-VAE Speech Waveform Reconstruction Yi Zhao Haoyu Li Cheng-I Jeff Lai Jennifer Williams Erica Cooper Junichi Yamagishi 84 18 0 16 May 2020
Unsupervised Cross-Domain Speech-to-Speech Conversion with Time-Frequency Consistency M. A. Khan Fabien Cardinaux Stefan Uhlich Marc Ferras Asja Fischer 26 0 0 15 May 2020
JDI-T: Jointly trained Duration Informed Transformer for Text-To-Speech without Explicit Alignment D. Lim Won Jang Gyeonghwan O Heayoung Park Bongwan Kim Jaesam Yoon 71 37 0 15 May 2020
WG-WaveNet: Real-Time High-Fidelity Speech Synthesis without GPU Po-Chun Hsu Hung-yi Lee 44 16 0 15 May 2020
Reverberation Modeling for Source-Filter-based Neural Vocoder Yang Ai Xin Wang Junichi Yamagishi Zhenhua Ling 59 3 0 15 May 2020
OctSqueeze: Octree-Structured Entropy Model for LiDAR Compression Lila Huang Shenlong Wang K. Wong Jerry Liu R. Urtasun 3DPC 68 146 0 14 May 2020
Neural Networks Versus Conventional Filters for Inertial-Sensor-based Attitude Estimation Daniel Weber C. Gühmann Thomas Seel 37 35 0 14 May 2020
Foundations and modelling of dynamic networks using Dynamic Graph Neural Networks: A survey Joakim Skarding Bogdan Gabrys Katarzyna Musial AI4CE 122 240 0 13 May 2020
AdaDurIAN: Few-shot Adaptation for Neural Text-to-Speech with DurIAN Zewang Zhang Qiao Tian Heng Lu Ling-Hao Chen Shan Liu 62 27 0 12 May 2020
FeatherWave: An efficient high-fidelity neural vocoder with multi-band linear prediction Qiao Tian Zewang Zhang Heng Lu Linghui Chen Shan Liu 69 22 0 12 May 2020
DiscreTalk: Text-to-Speech as a Machine Translation Problem Tomoki Hayashi Shinji Watanabe 70 32 0 12 May 2020
TTS-Portuguese Corpus: a corpus for speech synthesis in Brazilian Portuguese Edresson Casanova A. Júnior C. Shulby F. S. Oliveira João Paulo Teixeira M. Ponti S. Aluísio 75 24 0 11 May 2020
Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech Geng Yang Shan Yang Kai-Chun Liu Peng Fang Wei Chen Lei Xie 153 200 0 11 May 2020
GACELA -- A generative adversarial context encoder for long audio inpainting Andrés Marafioti P. Majdak Nicki Holighaus Nathanael Perraudin 100 46 0 11 May 2020
A review of radar-based nowcasting of precipitation and applicable machine learning techniques R. Prudden Samantha V. Adams D. Kangin Nial H. Robinson Suman V. Ravuri S. Mohamed A. Arribas AI4Cl OffRL 94 45 0 11 May 2020
From Speaker Verification to Multispeaker Speech Synthesis, Deep Transfer with Feedback Constraint Zexin Cai Chuxiong Zhang Ming Li 73 42 0 10 May 2020
Temporal-Framing Adaptive Network for Heart Sound Segmentation without Prior Knowledge of State Duration Xingyao Wang Chengyu Liu Yuwen Li Xianghong Cheng Jianqing Li Gari D. Clifford MedIm 32 24 0 09 May 2020
Learning to Understand Child-directed and Adult-directed Speech Lieke Gelderloos Grzegorz Chrupała Afra Alishahi 61 6 0 06 May 2020
Neural Networks and Value at Risk Alexander Arimond Damian Borth Andreas G. F. Hoepner M. Klawunn S. Weisheit 37 8 0 04 May 2020
Hard-Coded Gaussian Attention for Neural Machine Translation Weiqiu You Simeng Sun Mohit Iyyer 103 67 0 02 May 2020
Generative Adversarial Networks (GANs Survey): Challenges, Solutions, and Future Directions Divya Saxena Jiannong Cao AAML AI4CE 158 307 0 30 Apr 2020
Jukebox: A Generative Model for Music Prafulla Dhariwal Heewoo Jun Christine Payne Jong Wook Kim Alec Radford Ilya Sutskever VLM 214 758 0 30 Apr 2020
CopyCat: Many-to-Many Fine-Grained Prosody Transfer for Neural Text-to-Speech S. Karlapati Alexis Moinet Arnaud Joly V. Klimkov Daniel Sáez-Trigueros Thomas Drugman 52 67 0 30 Apr 2020
Detecting Deep-Fake Videos from Appearance and Behavior S. Agarwal Tarek El-Gaaly Hany Farid Ser-Nam Lim PICV 64 169 0 29 Apr 2020
Conditional Spoken Digit Generation with StyleGAN Kasperi Palkama Lauri Juvela Alexander Ilin GAN 61 10 0 28 Apr 2020
Time Series Forecasting With Deep Learning: A Survey Bryan Lim S. Zohren AI4TS AI4CE 128 1,257 0 28 Apr 2020
A Summary of the First Workshop on Language Technology for Language Documentation and Revitalization Graham Neubig Shruti Rijhwani Alexis Palmer Jordan MacKenzie Hilaria Cruz ... Yiyuan Li S. Zink Mengzhou Xia Roshan S. Sharma Patrick Littell 30 8 0 27 Apr 2020
Autoencoding Neural Networks as Musical Audio Synthesizers Joseph T Colonel C. Curro S. Keene MGen 16 2 0 27 Apr 2020
Interpretation of Deep Temporal Representations by Selective Visualization of Internally Activated Nodes Sohee Cho Ginkyeng Lee Wonjoon Chang Jaesik Choi 73 16 0 27 Apr 2020
Low-latency hand gesture recognition with a low resolution thermal imager Maarten Vandersteegen Wouter Reusen Kristof Van Beeck 38 17 0 24 Apr 2020
ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders Yu Gu Xiang Yin Yonghui Rao Yuan Wan Benlai Tang Yang Zhang Jitong Chen Yuxuan Wang Zejun Ma 91 70 0 23 Apr 2020
Group Activity Detection from Trajectory and Video Data in Soccer Ryan Sanford Siavash Gorji L. G. Hafemann B. Pourbabaee Mehrsan Javan 61 34 0 21 Apr 2020
Deep Learning for Time Series Forecasting: Tutorial and Literature Survey Konstantinos Benidis Syama Sundar Rangapuram Valentin Flunkert Bernie Wang Danielle C. Maddix ... David Salinas Lorenzo Stella François-Xavier Aubet Laurent Callot Tim Januschowski AI4TS 99 202 0 21 Apr 2020
ESPnet-ST: All-in-One Speech Translation Toolkit Hirofumi Inaguma Shun Kiyono Kevin Duh Shigeki Karita Nelson Yalta Tomoki Hayashi Shinji Watanabe 118 166 0 21 Apr 2020
Data Processing for Optimizing Naturalness of Vietnamese Text-to-speech System V. Phung Phan Huy Kinh Anh-Tuan Dinh Quoc Bao Nguyen 35 5 0 20 Apr 2020
Speech Paralinguistic Approach for Detecting Dementia Using Gated Convolutional Neural Network M. R. Makiuchi Tifani Warnita Nakamasa Inoue Koichi Shinoda M. Yoshimura Momoko Kitazawa K. Funaki Yoko Eguchi T. Kishimoto 56 11 0 16 Apr 2020