WaveNet: A Generative Model for Raw Audio

12 September 2016

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,082 papers shown

GANs & Reels: Creating Irish Music using a Generative Adversarial Network A. Kolokolova M. Billard Robert Bishop Moustafa Elsisy Zachary Northcott Laura Graves Vineel Nagisetty Heather Patey GAN 34 8 0 29 Oct 2020
The IQIYI System for Voice Conversion Challenge 2020 Wendong Gan Haitao Chen Yin Yan Jianwei Li Bolong Wen Xueping Xu Hai Li 26 0 0 29 Oct 2020
Speech Synthesis and Control Using Differentiable DSP Giorgio Fabbro Vladimir Golkov Thomas Kemp Zorah Lähner 78 12 0 28 Oct 2020
PPG-based singing voice conversion with adversarial representation learning Zhonghao Li Benlai Tang Xiang Yin Yuan Wan Linjia Xu Chen Shen Zejun Ma 59 37 0 28 Oct 2020
Upsampling artifacts in neural audio synthesis Jordi Pons Santiago Pascual Giulio Cengarle Joan Serrà 95 64 0 27 Oct 2020
Parallel waveform synthesis based on generative adversarial networks with voicing-aware conditional discriminators Ryuichi Yamamoto Eunwoo Song Min-Jae Hwang Jae-Min Kim 76 18 0 27 Oct 2020
Benchmarking Deep Learning Interpretability in Time Series Predictions Aya Abdelsalam Ismail Mohamed K. Gunady H. C. Bravo Soheil Feizi XAI AI4TS FAtt 72 174 0 26 Oct 2020
Shimon the Robot Film Composer and DeepScore: An LSTM for Generation of Film Scores based on Visual Analysis Richard J. Savery Gil Weinberg 29 7 0 26 Oct 2020
TTS-by-TTS: TTS-driven Data Augmentation for Fast and High-Quality Speech Synthesis Min-Jae Hwang Ryuichi Yamamoto Eunwoo Song Jae-Min Kim 44 32 0 26 Oct 2020
LagNetViP: A Lagrangian Neural Network for Video Prediction Christine Allen-Blanchette Sushant Veer Anirudha Majumdar Naomi Ehrich Leonard 112 31 0 24 Oct 2020
Autoregressive Score Matching Chenlin Meng Lantao Yu Yang Song Jiaming Song Stefano Ermon DiffM 241 14 0 24 Oct 2020
A Comparison of Discrete Latent Variable Models for Speech Representation Learning Henry Zhou Alexei Baevski Michael Auli DRL 67 10 0 24 Oct 2020
Show and Speak: Directly Synthesize Spoken Description of Images Xinsheng Wang Siyuan Feng Jihua Zhu M. Hasegawa-Johnson O. Scharenborg 154 4 0 23 Oct 2020
Listening to Sounds of Silence for Speech Denoising Ruilin Xu Rundi Wu Y. Ishiwaka Carl Vondrick Changxi Zheng 66 33 0 22 Oct 2020
Limitations of Autoregressive Models and Their Alternatives Chu-cheng Lin Aaron Jaech Xin Li Matthew R. Gormley Jason Eisner 89 64 0 22 Oct 2020
CryptoGRU: Low Latency Privacy-Preserving Text Analysis With GRU Bo Feng Qian Lou Lei Jiang Geoffrey C. Fox 68 15 0 22 Oct 2020
AISHELL-3: A Multi-speaker Mandarin TTS Corpus and the Baselines Yao Shi Hui Bu Xin Xu Shaojing Zhang Ming Li 119 223 0 22 Oct 2020
How Similar or Different Is Rakugo Speech Synthesizer to Professional Performers? Shuhei Kato Yusuke Yasuda Xin Wang Erica Cooper Junichi Yamagishi 21 0 0 22 Oct 2020
Convolutional Autoencoders for Human Motion Infilling Manuel Kaufmann Emre Aksan Mingli Song Fabrizio Pece R. Ziegler Otmar Hilliges 3DH 54 102 0 22 Oct 2020
The NTU-AISG Text-to-speech System for Blizzard Challenge 2020 Haobo Zhang Tingzhi Mao Haihua Xu Hao-Ming Huang 92 1 0 22 Oct 2020
Parallel Tacotron: Non-Autoregressive and Controllable TTS Isaac Elias Heiga Zen Jonathan Shen Yu Zhang Ye Jia Ron J. Weiss Yonghui Wu DRL 78 103 0 22 Oct 2020
NU-GAN: High resolution neural upsampling with GAN Rithesh Kumar Kundan Kumar Vicki Anand Yoshua Bengio Aaron Courville 65 26 0 22 Oct 2020
Learning to Summarize Long Texts with Memory Compression and Transfer Jaehong Park Jonathan Pilault C. Pal 44 0 0 21 Oct 2020
Transferable Graph Optimizers for ML Compilers Yanqi Zhou Sudip Roy AmirAli Abdolrashidi Daniel Wong Peter C. Ma ... Mangpo Phitchaya Phothilimtha Shen Wang Anna Goldie Azalia Mirhoseini James Laudon GNN 73 55 0 21 Oct 2020
Improving Audio Anomalies Recognition Using Temporal Convolutional Attention Network Qiang Huang Thomas Hain 42 10 0 21 Oct 2020
WaveTransformer: A Novel Architecture for Audio Captioning Based on Learning Temporal and Time-Frequency Information An Tran Konstantinos Drossos Tuomas Virtanen 106 19 0 21 Oct 2020
End-to-End Text-to-Speech using Latent Duration based on VQ-VAE Yusuke Yasuda Xin Wang Junichi Yamagishi 68 17 0 19 Oct 2020
A combined full-reference image quality assessment approach based on convolutional activation maps D. Varga 57 7 0 19 Oct 2020
Melody Classifier with Stacked-LSTM You Li Zhuowen Lin 15 1 0 16 Oct 2020
Sobolev training of thermodynamic-informed neural networks for smoothed elasto-plasticity models with level set hardening Nikolaos N. Vlassis WaiChing Sun AI4CE 41 2 0 15 Oct 2020
The NeteaseGames System for Voice Conversion Challenge 2020 with Vector-quantization Variational Autoencoder and WaveNet Haitong Zhang DRL 31 4 0 15 Oct 2020
Unsupervised Video Anomaly Detection via Normalizing Flows with Implicit Latent Features Myeongah Cho Taeoh Kim Woojin Kim Suhwan Cho Sangyoun Lee 98 95 0 15 Oct 2020
Medical Code Assignment with Gated Convolution and Note-Code Interaction Shaoxiong Ji Shirui Pan Pekka Marttinen MedIm 103 18 0 14 Oct 2020
Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling Jiahui Yu Wei Han Anmol Gulati Chung-Cheng Chiu Yue Liu Tara N. Sainath Yonghui Wu Ruoming Pang 125 19 0 12 Oct 2020
The Cone of Silence: Speech Separation by Localization Teerapat Jenrungrot V. Jayaram S. M. Seitz Ira Kemelmacher-Shlizerman 83 56 0 12 Oct 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis Jungil Kong Jaehyeon Kim Jaekyoung Bae 183 1,958 0 12 Oct 2020
Enhancement Of Coded Speech Using a Mask-Based Post-Filter Srikanth Korse Kishan Gupta Guillaume Fuchs 36 14 0 12 Oct 2020
AI Song Contest: Human-AI Co-Creation in Songwriting Cheng-Zhi Anna Huang Hendrik Vincent Koops Ed Newton-Rex Monica Dinculescu Carrie J. Cai 57 92 0 12 Oct 2020
SJTU-NICT's Supervised and Unsupervised Neural Machine Translation Systems for the WMT20 News Translation Task Z. Li Hai Zhao Rui Wang Kehai Chen Masao Utiyama Eiichiro Sumita 66 15 0 11 Oct 2020
The NU Voice Conversion System for the Voice Conversion Challenge 2020: On the Effectiveness of Sequence-to-sequence Models and Autoregressive Neural Vocoders Wen-Chin Huang Patrick Lumban Tobing Yi-Chiao Wu Kazuhiro Kobayashi Tomoki Toda 86 8 0 09 Oct 2020
Baseline System of Voice Conversion Challenge 2020 with Cyclic Variational Autoencoder and Parallel WaveGAN Patrick Lumban Tobing Yi-Chiao Wu Tomoki Toda DRL 60 14 0 09 Oct 2020
Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling Jonathan Shen Ye Jia Mike Chrzanowski Yu Zhang Isaac Elias Heiga Zen Yonghui Wu 106 112 0 08 Oct 2020
Randomized Overdrive Neural Networks C. Steinmetz Joshua D. Reiss 50 4 0 08 Oct 2020
FastVC: Fast Voice Conversion with non-parallel data Oriol Barbany Milos Cernak 43 7 0 08 Oct 2020
Automating Inference of Binary Microlensing Events with Neural Density Estimation Keming Zhang (张可名) J. Bloom B. Gaudi F. Lanusse C. Lam Jessica R. Lu 21 1 0 08 Oct 2020
A Survey of Deep Meta-Learning Mike Huisman Jan N. van Rijn Aske Plaat 201 335 0 07 Oct 2020
Improving Sequential Latent Variable Models with Autoregressive Flows Joseph Marino Lei Chen Jiawei He Stephan Mandt BDL AI4TS 127 12 0 07 Oct 2020
VoiceGrad: Non-Parallel Any-to-Many Voice Conversion with Annealed Langevin Dynamics Hirokazu Kameoka Takuhiro Kaneko Kou Tanaka Nobukatsu Hojo Shogo Seki DiffM 124 21 0 06 Oct 2020
Digital Voicing of Silent Speech David Gaddy Dana Klein 64 56 0 06 Oct 2020
A Contrastive Learning Approach for Training Variational Autoencoder Priors J. Aneja Alex Schwing Jan Kautz Arash Vahdat DRL 126 83 0 06 Oct 2020