WaveNet: A Generative Model for Raw Audio

12 September 2016

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,082 papers shown

Title
Fine-Grained Classroom Activity Detection from Audio with Neural Networks Eric Slyman Chris Daw Morgan Skrabut A. Usenko Brian Hutchinson HAI 59 5 0 29 Jul 2021
Predictive Coding: a Theoretical and Experimental Review Beren Millidge A. Seth Christopher L. Buckley AI4CE 101 134 0 27 Jul 2021
Improving ClusterGAN Using Self-Augmented Information Maximization of Disentangling Latent Spaces T. Dam S. Anavatti H. Abbass 72 6 0 27 Jul 2021
Adaptation of Tacotron2-based Text-To-Speech for Articulatory-to-Acoustic Mapping using Ultrasound Tongue Imaging Csaba Zainkó L. Tóth Amin Honarmandi Shandiz G. Gosztolya Alexandra Markó Géza Németh Tamás Gábor Csapó 66 4 0 26 Jul 2021
A Study on Speech Enhancement Based on Diffusion Probabilistic Model Yen-Ju Lu Yu Tsao Shinji Watanabe DiffM 78 74 0 25 Jul 2021
Use of speaker recognition approaches for learning and evaluating embedding representations of musical instrument sounds Xuan Shi Erica Cooper Junichi Yamagishi 100 7 0 24 Jul 2021
Using Deep Learning Techniques and Inferential Speech Statistics for AI Synthesised Speech Recognition A. Singh Priyanka Singh K. Nathwani 22 2 0 23 Jul 2021
HARP-Net: Hyper-Autoencoded Reconstruction Propagation for Scalable Neural Audio Coding Darius Petermann Seungkwon Beack Minje Kim 79 15 0 22 Jul 2021
Generative Models for Security: Attacks, Defenses, and Opportunities L. A. Bauer Vincent Bindschaedler 114 4 0 21 Jul 2021
A Tandem Framework Balancing Privacy and Security for Voice User Interfaces Ranya Aloufi Hamed Haddadi David E. Boyle 90 3 0 21 Jul 2021
WikiGraphs: A Wikipedia Text - Knowledge Graph Paired Dataset Luyu Wang Yujia Li Ozlem Aslan Oriol Vinyals 63 26 0 20 Jul 2021
Interactive Storytelling for Children: A Case-study of Design and Development Considerations for Ethical Conversational AI J. Chubb S. Missaoui S. Concannon Liam Maloney James Alfred Walker 54 35 0 20 Jul 2021
SVSNet: An End-to-end Speaker Voice Similarity Assessment Model Cheng-Hung Hu Yu-Huai Peng Junichi Yamagishi Yu Tsao Hsin-Min Wang 63 5 0 20 Jul 2021
Human Perception of Audio Deepfakes Nicolas Müller Karla Markert Konstantin Böttinger 121 50 0 20 Jul 2021
Approximation Theory of Convolutional Architectures for Time Series Modelling Haotian Jiang Zhong Li Qianxiao Li AI4TS 83 12 0 20 Jul 2021
Multi-Modal Temporal Convolutional Network for Anticipating Actions in Egocentric Videos Olga Zatsarynna Yazan Abu Farha Juergen Gall EgoV 89 27 0 18 Jul 2021
Learning De-identified Representations of Prosody from Raw Audio J. Weston R. Lenain U. Meepegama E. Fristed SSL 68 17 0 17 Jul 2021
Hierarchical Reinforcement Learning with Optimal Level Synchronization based on a Deep Generative Model JaeYoon Kim Junyu Xuan Christy Jie Liang F. Hussain 29 0 0 17 Jul 2021
Continual Learning for Automated Audio Captioning Using The Learning Without Forgetting Approach Jan van den Berg Konstantinos Drossos CLL 73 11 0 16 Jul 2021
Neural Contextual Anomaly Detection for Time Series Chris U. Carmona François-Xavier Aubet Valentin Flunkert Jan Gasthaus BDL AI4TS 103 66 0 16 Jul 2021
Beyond In-Place Corruption: Insertion and Deletion In Denoising Probabilistic Models Daniel D. Johnson Jacob Austin Rianne van den Berg Daniel Tarlow DiffM 244 19 0 16 Jul 2021
MultiBench: Multiscale Benchmarks for Multimodal Representation Learning Paul Pu Liang Yiwei Lyu Xiang Fan Zetian Wu Yun Cheng ... Peter Wu Michelle A. Lee Yuke Zhu Ruslan Salakhutdinov Louis-Philippe Morency VLM 111 172 0 15 Jul 2021
Serialized Multi-Layer Multi-Head Attention for Neural Speaker Embedding Hongning Zhu Kong Aik Lee Haizhou Li 78 15 0 14 Jul 2021
Codified audio language modeling learns useful representations for music information retrieval Rodrigo Castellon Chris Donahue Percy Liang 146 91 0 12 Jul 2021
Extending Text-to-Speech Synthesis with Articulatory Movement Prediction using Ultrasound Tongue Imaging Tamás Gábor Csapó 43 2 0 12 Jul 2021
Neural Waveshaping Synthesis B. Hayes C. Saitis Gyorgy Fazekas 85 28 0 11 Jul 2021
Many-to-Many Voice Conversion based Feature Disentanglement using Variational Autoencoder Manh Luong Viet-Anh Tran DRL 54 16 0 11 Jul 2021
A Deep-Bayesian Framework for Adaptive Speech Duration Modification Ravi Shankar A. Venkataraman 55 0 0 11 Jul 2021
Causal affect prediction model using a facial image sequence Geesung Oh Euiseok Jeong Sejoon Lim CVBM 69 12 0 08 Jul 2021
Pragmatic Image Compression for Human-in-the-Loop Decision-Making S. Reddy Anca Dragan Sergey Levine OffRL 86 13 0 07 Jul 2021
Evaluating Large Language Models Trained on Code Mark Chen Jerry Tworek Heewoo Jun Qiming Yuan Henrique Pondé ... Bob McGrew Dario Amodei Sam McCandlish Ilya Sutskever Wojciech Zaremba ELM ALM 302 5,702 0 07 Jul 2021
SoundStream: An End-to-End Neural Audio Codec Neil Zeghidour Alejandro Luebs Ahmed Omran Jan Skoglund Marco Tagliasacchi AI4TS 120 806 0 07 Jul 2021
Adversarial Auto-Encoding for Packet Loss Concealment Santiago Pascual Joan Serrà Jordi Pons 73 29 0 07 Jul 2021
Structured Denoising Diffusion Models in Discrete State-Spaces Jacob Austin Daniel D. Johnson Jonathan Ho Daniel Tarlow Rianne van den Berg DiffM 293 952 0 07 Jul 2021
Energy Consumption of Deep Generative Audio Models Constance Douwes P. Esling Jean-Pierre Briot MedIm 75 13 0 06 Jul 2021
Agents that Listen: High-Throughput Reinforcement Learning with Multiple Sensory Systems Shashank Hegde Anssi Kanervisto Aleksei Petrenko VLM 70 9 0 05 Jul 2021
Single Model for Influenza Forecasting of Multiple Countries by Multi-task Learning Taichi Murayama Shoko Wakamiya Eiji Aramaki AI4TS 40 0 0 05 Jul 2021
Comparison of end-to-end neural network architectures and data augmentation methods for automatic infant motility assessment using wearable sensors Manu Airaksinen S. Vanhatalo Okko Räsänen 54 4 0 02 Jul 2021
Inverse-Dirichlet Weighting Enables Reliable Training of Physics Informed Neural Networks Suryanarayana Maddu D. Sturm Christian L. Müller I. Sbalzarini AI4CE 94 84 0 02 Jul 2021
Variational Diffusion Models Diederik P. Kingma Tim Salimans Ben Poole Jonathan Ho DiffM 234 1,146 0 01 Jul 2021
Exploring Context Generalizability in Citywide Crowd Mobility Prediction: An Analytic Framework and Benchmark Liyue Chen Xiaoxiang Wang Leye Wang 67 1 0 30 Jun 2021
A Generative Model for Raw Audio Using Transformer Architectures Prateek Verma C. Chafe 84 29 0 30 Jun 2021
Multi-Scale Spectrogram Modelling for Neural Text-to-Speech Ammar Abbas Bajibabu Bollepalli Alexis Moinet Arnaud Joly Penny Karanasou Peter Makarov Simon Slangens S. Karlapati Thomas Drugman 69 0 0 29 Jun 2021
A Survey on Neural Speech Synthesis Xu Tan Tao Qin Frank Soong Tie-Yan Liu AI4TS 139 359 0 29 Jun 2021
GANSpeech: Adversarial Training for High-Fidelity Multi-Speaker Speech Synthesis Jinhyeok Yang Jaesung Bae Taejun Bak Young-Ik Kim Hoon-Young Cho 142 37 0 29 Jun 2021
HDMapGen: A Hierarchical Graph Generative Model of High Definition Maps Lu Mi Hang Zhao C. Nash Xiaohan Jin Jiyang Gao Chen Sun Cordelia Schmid Nir Shavit Yuning Chai Drago Anguelov 57 54 0 28 Jun 2021
Short-Term Load Forecasting for Smart Home Appliances with Sequence to Sequence Learning Mina Razghandi Hao Zhou Melike Erol-Kantarci D. Turgut AI4TS 21 20 0 26 Jun 2021
Transflower: probabilistic autoregressive dance generation with multimodal attention Guillermo Valle Pérez G. Henter Jonas Beskow A. Holzapfel Pierre-Yves Oudeyer Simon Alexanderson 130 43 0 25 Jun 2021
Basis-MelGAN: Efficient Neural Vocoder Based on Audio Decomposition Zhengxi Liu Y. Qian DRL 49 10 0 25 Jun 2021
Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting Haixu Wu Jiehui Xu Jianmin Wang Mingsheng Long AI4TS 146 2,384 0 24 Jun 2021