WaveNet: A Generative Model for Raw Audio

12 September 2016

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,082 papers shown

Title
The Zero Resource Speech Challenge 2019: TTS without T Ewan Dunbar Robin Algayres Julien Karadayi Mathieu Bernard Juan Benjumea ... Lucas Ondel A. Black Laurent Besacier S. Sakti Emmanuel Dupoux 100 117 0 25 Apr 2019
Generating Long Sequences with Sparse Transformers R. Child Scott Gray Alec Radford Ilya Sutskever 142 1,925 0 23 Apr 2019
End-to-End Spoken Language Translation Michelle Guo Albert Haque Prateek Verma 58 8 0 23 Apr 2019
Analyzing the benefits of communication channels between deep learning models Philippe Lacaille 34 0 0 19 Apr 2019
TTS Skins: Speaker Conversion via ASR Adam Polyak Lior Wolf Yaniv Taigman 76 28 0 18 Apr 2019
Convolutional neural networks: a magic bullet for gravitational-wave detection? Timothy D. Gebhard Niki Kilbertus I. Harry Bernhard Schölkopf 65 92 0 18 Apr 2019
Explaining Deep Classification of Time-Series Data with Learned Prototypes Alan H. Gee Diego Garcia-Olano Joydeep Ghosh D. Paydarfar AI4TS 105 67 0 18 Apr 2019
Expediting TTS Synthesis with Adversarial Vocoding Paarth Neekhara Chris Donahue M. Puckette Shlomo Dubnov Julian McAuley 66 20 0 16 Apr 2019
Improved Speech Separation with Time-and-Frequency Cross-domain Joint Embedding and Clustering Gene-Ping Yang Chao-I Tuan Hung-yi Lee Lin-Shan Lee 61 25 0 16 Apr 2019
Speech Denoising by Accumulating Per-Frequency Modeling Fluctuations Michael Michelashvili Lior Wolf 70 16 0 16 Apr 2019
Unsupervised acoustic unit discovery for speech synthesis using discrete latent-variable neural networks Ryan Eloff A. Nortje Benjamin van Niekerk Avashna Govender Leanne Nortje Arnu Pretorius Elan Van Biljon Ewald van der Westhuizen Lisa van Staden Herman Kamper DRL 79 57 0 16 Apr 2019
RHR-Net: A Residual Hourglass Recurrent Neural Network for Speech Enhancement Jalal Abdulbaqi Yue Gu I. Marsic 27 9 0 15 Apr 2019
Singing voice synthesis based on convolutional neural networks Kazuhiro Nakamura Kei Hashimoto Keiichiro Oura Yoshihiko Nankaku K. Tokuda 86 33 0 15 Apr 2019
Unsupervised Singing Voice Conversion Eliya Nachmani Lior Wolf 82 56 0 13 Apr 2019
End-to-end Text-to-speech for Low-resource Languages by Cross-Lingual Transfer Learning Tao Tu Yuan-Jui Chen Cheng-chieh Yeh Hung-yi Lee 96 88 0 13 Apr 2019
Low-Latency Speaker-Independent Continuous Speech Separation Takuya Yoshioka Zhuo Chen Changliang Liu Xiong Xiao Hakan Erdogan Dimitrios Dimitriadis BDL VLM 46 28 0 13 Apr 2019
Assisted Sound Sample Generation with Musical Conditioning in Adversarial Auto-Encoders Adrien Bitton P. Esling Antoine Caillon Martin Fouilleul 73 10 0 12 Apr 2019
RNN-based speech synthesis using a continuous sinusoidal model M. S. Al-Radhi T. Csapó Géza Németh 24 4 0 12 Apr 2019
Supervised Anomaly Detection based on Deep Autoregressive Density Estimators Tomoharu Iwata Yuki Yamanaka 51 13 0 12 Apr 2019
Autoregressive Energy Machines C. Nash Conor Durkan 78 55 0 11 Apr 2019
RawNet: Fast End-to-End Neural Vocoder Yunchao He Yujun Wang 30 2 0 10 Apr 2019
Neuralogram: A Deep Neural Network Based Representation for Audio Signals Prateek Verma C. Chafe J. Berger AI4TS 13 9 0 10 Apr 2019
Enhancing Time Series Momentum Strategies Using Deep Neural Networks Bryan Lim S. Zohren Stephen J. Roberts AIFin AI4TS 72 90 0 09 Apr 2019
A New GAN-based End-to-End TTS Training Algorithm Haohan Guo Frank Soong Lei He Lei Xie 101 47 0 09 Apr 2019
Exploiting Syntactic Features in a Parsed Tree to Improve End-to-End TTS Haohan Guo Frank Soong Lei He Lei Xie 74 30 0 09 Apr 2019
Software and application patterns for explanation methods Maximilian Alber 80 11 0 09 Apr 2019
ASVspoof 2019: Future Horizons in Spoofed and Fake Audio Detection Massimiliano Todisco Xin Wang Ville Vestman Md. Sahidullah Héctor Delgado A. Nautsch Junichi Yamagishi Nicholas W. D. Evans Tomi Kinnunen Kong Aik Lee 99 618 0 09 Apr 2019
CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion Takuhiro Kaneko Hirokazu Kameoka Kou Tanaka Nobukatsu Hojo 72 261 0 09 Apr 2019
Probability density distillation with generative adversarial networks for high-quality parallel waveform generation Ryuichi Yamamoto Eunwoo Song Jae-Min Kim 70 55 0 09 Apr 2019
Hierarchical Temporal Convolutional Networks for Dynamic Recommender Systems Jiaxuan You Yichen Wang Aditya Pal Pong Eksombatchai C. Rosenberg J. Leskovec 73 125 0 08 Apr 2019
Audio Source Separation via Multi-Scale Learning with Dilated Dense U-Nets V. Narayanaswamy Sameeksha Katoch Jayaraman J. Thiagarajan Huan Song A. Spanias 56 7 0 08 Apr 2019
GELP: GAN-Excited Linear Prediction for Speech Synthesis from Mel-spectrogram Lauri Juvela Bajibabu Bollepalli Junichi Yamagishi P. Alku 76 18 0 08 Apr 2019
Improving Image Classification Robustness through Selective CNN-Filters Fine-Tuning Alessandro Bianchi Moreno Raimondo Vendra P. Protopapas Marco Brambilla 20 8 0 08 Apr 2019
Taco-VC: A Single Speaker Tacotron based Voice Conversion with Limited Data Roee Levy Leshem Raja Giryes 105 8 0 06 Apr 2019
Feature-Based Interpolation and Geodesics in the Latent Spaces of Generative Models Lukasz Struski M. Sadowski Tomasz Danel Jacek Tabor Igor T. Podolak DiffM 94 7 0 06 Apr 2019
HOList: An Environment for Machine Learning of Higher-Order Theorem Proving Kshitij Bansal Sarah M. Loos M. Rabe Christian Szegedy S. Wilcox AIMat 90 51 0 05 Apr 2019
An Unsupervised Autoregressive Model for Speech Representation Learning Yu-An Chung Wei-Ning Hsu Hao Tang James R. Glass SSL 114 409 0 05 Apr 2019
Fast Weakly Supervised Action Segmentation Using Mutual Consistency Yaser Souri Mohsen Fayyaz Luca Minciullo Gianpiero Francesca Juergen Gall 91 52 0 05 Apr 2019
WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform Generation Kou Tanaka Hirokazu Kameoka Takuhiro Kaneko Nobukatsu Hojo 80 19 0 05 Apr 2019
In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data N. Prateek Mateusz Lajszczak Roberto Barra-Chicote Thomas Drugman Jaime Lorenzo-Trueba Thomas Merritt S. Ronanki Trevor Wood 87 30 0 04 Apr 2019
A Learned Representation for Scalable Vector Graphics Raphael Gontijo-Lopes David R Ha Douglas Eck Jonathon Shlens GAN OCL 76 118 0 04 Apr 2019
Multi-reference Tacotron by Intercross Training for Style Disentangling, Transfer and Control in Speech Synthesis Yanyao Bian Changbin Chen Yongguo Kang Zhenglin Pan 77 46 0 04 Apr 2019
End-to-end Binaural Sound Localisation from the Raw Waveform Paolo Vecchiotti Ning Ma S. Squartini Guy J. Brown 45 59 0 03 Apr 2019
Speech denoising by parametric resynthesis Soumi Maiti Michael I. Mandel DiffM 16 10 0 02 Apr 2019
Semantics-Guided Neural Networks for Efficient Skeleton-Based Human Action Recognition Pengfei Zhang Cuiling Lan Wenjun Zeng Junliang Xing Jianru Xue Nanning Zheng 3DH 104 446 0 02 Apr 2019
Generative predecessor models for sample-efficient imitation learning Yannick Schroecker Mel Vecerík Jonathan Scholz VLM 60 31 0 01 Apr 2019
Training Multi-Speaker Neural Text-to-Speech Systems using Speaker-Imbalanced Speech Corpora Hieu-Thi Luong Xin Wang Junichi Yamagishi Nobuyuki Nishizawa 89 23 0 01 Apr 2019
Training a Neural Speech Waveform Model using Spectral Losses of Short-Time Fourier Transform and Continuous Wavelet Transform Shinji Takaki Hirokazu Kameoka Junichi Yamagishi 25 2 0 29 Mar 2019
Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet Mingyang Zhang Xin Wang Fuming Fang Haizhou Li Junichi Yamagishi 72 50 0 29 Mar 2019
Bit-Flip Attack: Crushing Neural Network with Progressive Bit Search Adnan Siraj Rakin Zhezhi He Deliang Fan AAML 106 227 0 28 Mar 2019