WaveNet: A Generative Model for Raw Audio

12 September 2016

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,082 papers shown

Title
Multi-Scale Sub-Band Constant-Q Transform Discriminator for High-Fidelity Vocoder Yicheng Gu Xueyao Zhang Liumeng Xue Zhizheng Wu 72 12 0 25 Nov 2023
An NMF-Based Building Block for Interpretable Neural Networks With Continual Learning Brian K. Vogel 43 0 0 20 Nov 2023
Advancements in Generative AI: A Comprehensive Review of GANs, GPT, Autoencoders, Diffusion Model, and Transformers Staphord Bengesi Hoda El-Sayed Md Kamruzzaman Sarker Yao Houkpati John Irungu T. Oladunni 128 93 0 17 Nov 2023
Formal Verification of Long Short-Term Memory based Audio Classifiers: A Star based Approach Neelanjana Pal Taylor T. Johnson 55 0 0 16 Nov 2023
CLN-VC: Text-Free Voice Conversion Based on Fine-Grained Style Control and Contrastive Learning with Negative Samples Augmentation Yimin Deng Xulong Zhang Jianzong Wang Ning Cheng Jing Xiao 77 3 0 15 Nov 2023
EDMSound: Spectrogram Based Diffusion Models for Efficient and High-Quality Audio Synthesis Ge Zhu Yutong Wen M. Carbonneau Zhiyao Duan DiffM 76 8 0 15 Nov 2023
Reimagining Speech: A Scoping Review of Deep Learning-Powered Voice Conversion A. R. Bargum Stefania Serafin Cumhur Erkut 70 4 0 14 Nov 2023
DQR-TTS: Semi-supervised Text-to-speech Synthesis with Dynamic Quantized Representation Jiangzong Wang Pengcheng Li Xulong Zhang Ning Cheng Jing Xiao 81 0 0 14 Nov 2023
CSLP-AE: A Contrastive Split-Latent Permutation Autoencoder Framework for Zero-Shot Electroencephalography Signal Conversion Anders Vestergaard Norskov Alexander Neergaard Zahid Morten Morup 65 3 0 13 Nov 2023
Efficient bandwidth extension of musical signals using a differentiable harmonic plus noise model Pierre-Amaury Grumiaux Mathieu Lagrange 68 3 0 13 Nov 2023
SponTTS: modeling and transferring spontaneous style for TTS Hanzhao Li Xinfa Zhu Liumeng Xue Yang Song Yunlin Chen Lei Xie 89 7 0 13 Nov 2023
Music ControlNet: Multiple Time-varying Controls for Music Generation Shih-Lun Wu Chris Donahue Shinji Watanabe Nicholas J. Bryan DiffM MGen 111 61 0 13 Nov 2023
Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model Jiahao Li Hao Tan Kai Zhang Zexiang Xu Fujun Luan Yinghao Xu Yicong Hong Kalyan Sunkavalli Greg Shakhnarovich Sai Bi 131 275 0 10 Nov 2023
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores Daniel Y. Fu Hermann Kumbong Eric N. D. Nguyen Christopher Ré VLM 100 30 0 10 Nov 2023
Semantic Map Guided Synthesis of Wireless Capsule Endoscopy Images using Diffusion Models Haejin Lee Jeongwoo Ju Jonghyuck Lee Yeoun Joo Lee Heechul Jung DiffM MedIm 65 0 0 10 Nov 2023
Synthetic Speaking Children -- Why We Need Them and How to Make Them Muhammad Ali Farooq Dan Bigioi Rishabh Jain Wang Yao Mariam Yiwere Peter Corcoran 86 0 0 08 Nov 2023
Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation Haram Choi Sang-Hoon Lee Seong-Whan Lee DiffM 72 30 0 08 Nov 2023
Improved DDIM Sampling with Moment Matching Gaussian Mixtures Prasad Gabbur DiffM 50 1 0 08 Nov 2023
Impact of HPO on AutoML Forecasting Ensembles David Hoffmann 46 0 0 07 Nov 2023
TS-Diffusion: Generating Highly Complex Time Series with Diffusion Models Yangming Li DiffM AI4TS 97 5 0 06 Nov 2023
AV-Lip-Sync+: Leveraging AV-HuBERT to Exploit Multimodal Inconsistency for Video Deepfake Detection Sahibzada Adil Shahzad Ammarah Hashmi Yan-Tsung Peng Yu Tsao Hsin-Min Wang 96 7 0 05 Nov 2023
Sounding Bodies: Modeling 3D Spatial Sound of Humans Using Body Pose and Audio Xudong Xu Dejan Marković Jacob Sandakly Todd Keebler Steven Krenn Alexander Richard 44 5 0 01 Nov 2023
REBAR: Retrieval-Based Reconstruction for Time-series Contrastive Learning Maxwell A. Xu Alexander Moreno Hui Wei Benjamin M. Marlin James M. Rehg AI4TS SSL 105 13 0 01 Nov 2023
Semantic Hearing: Programming Acoustic Scenes with Binaural Hearables Bandhav Veluri Malek Itani Justin Chan Takuya Yoshioka Shyamnath Gollakota 67 18 0 01 Nov 2023
Deepfake detection by exploiting surface anomalies: the SurFake approach Andrea Ciamarra R. Caldelli Federico Becattini Lorenzo Seidenari A. Bimbo 86 14 0 31 Oct 2023
Enabling Acoustic Audience Feedback in Large Virtual Events Tamay Aykut M. Hofbauer Christopher B. Kuhn Eckehard Steinbach Bernd Girod 77 0 0 27 Oct 2023
Learning an Inventory Control Policy with General Inventory Arrival Dynamics Sohrab Andaz Carson Eisenach Dhruv Madeka Kari Torkkola Randy Jia Dean Phillips Foster Sham Kakade 59 2 0 26 Oct 2023
Real-time Neonatal Chest Sound Separation using Deep Learning Yang Yi Poh Ethan Grooby Kenneth Tan Lindsay Zhou Arrabella King Ashwin Ramanathan Atul Malhotra Mehrtash Harandi F. Marzbanrad 57 1 0 26 Oct 2023
Subtle Signals: Video-based Detection of Infant Non-nutritive Sucking as a Neurodevelopmental Cue Shaotong Zhu Michael Wan Sai Kumar Reddy Manne Emily B. Zimmerman Sarah Ostadabbas 31 2 0 24 Oct 2023
Synthetic Data as Validation Qixing Hu Alan Yuille Zongwei Zhou SyDa OOD 73 8 0 24 Oct 2023
LC-TTFS: Towards Lossless Network Conversion for Spiking Neural Networks with TTFS Coding Qu Yang Malu Zhang Jibin Wu Kay Chen Tan Haizhou Li 63 10 0 23 Oct 2023
Mid-Long Term Daily Electricity Consumption Forecasting Based on Piecewise Linear Regression and Dilated Causal CNN Zhou Lan Ben Liu Yi Feng Danhuang Dong Peng Zhang AI4TS 30 1 0 23 Oct 2023
An overview of text-to-speech systems and media applications Mohammad Reza Hasanabadi 28 3 0 22 Oct 2023
MFCC-GAN Codec: A New AI-based Audio Coding Mohammad Reza Hasanabadi 40 0 0 22 Oct 2023
Neural Likelihood Approximation for Integer Valued Time Series Data Luke O'Loughlin John Maclean Andrew Black AI4TS 53 0 0 19 Oct 2023
Physics-informed neural network for acoustic resonance analysis in a one-dimensional acoustic tube Kazuya Yokota Takahiko Kurahashi Masajiro Abe 28 5 0 18 Oct 2023
Leveraging Diverse Semantic-based Audio Pretrained Models for Singing Voice Conversion Xueyao Zhang Yicheng Gu Haopeng Chen Zihao Fang Lexiao Zou Junan Zhang Liumeng Xue Jinchao Zhang Jie Zhou Zhizheng Wu DiffM 64 2 0 17 Oct 2023
BeatDance: A Beat-Based Model-Agnostic Contrastive Learning Framework for Music-Dance Retrieval Kaixing Yang Xukun Zhou Xulong Tang Ran Diao Hongyan Liu Jun He Zhaoxin Fan 71 3 0 16 Oct 2023
MoConVQ: Unified Physics-Based Motion Control via Scalable Discrete Representations Heyuan Yao Zhenhua Song Yuyang Zhou Tenglong Ao Baoquan Chen Libin Liu 135 44 0 16 Oct 2023
Generative Adversarial Training for Text-to-Speech Synthesis Based on Raw Phonetic Input and Explicit Prosody Modelling Tiberiu Boros Stefan Daniel Dumitrescu Ionut Mironica Radu Chivereanu GAN 38 1 0 14 Oct 2023
Machine Learning for Urban Air Quality Analytics: A Survey Jindong Han Weijiao Zhang Hao Liu Hui Xiong AI4CE 114 12 0 14 Oct 2023
A decoder-only foundation model for time-series forecasting Abhimanyu Das Weihao Kong Rajat Sen Yichen Zhou AI4TS AI4CE 137 243 0 14 Oct 2023
ARM: Refining Multivariate Forecasting with Adaptive Temporal-Contextual Learning Jiecheng Lu Xu Han Shihao Yang AI4TS 49 4 0 14 Oct 2023
LL-VQ-VAE: Learnable Lattice Vector-Quantization For Efficient Representations Ahmed Khalil Robert Piechocki Raúl Santos-Rodríguez 54 2 0 13 Oct 2023
Large Language Models Are Zero-Shot Time Series Forecasters Nate Gruver Marc Finzi Shikai Qiu Andrew Gordon Wilson AI4TS 97 375 0 11 Oct 2023
Prosody Analysis of Audiobooks Charuta Pethe Yunting Yin Felix D Childress Yunting Yin Steven Skiena 89 1 0 10 Oct 2023
Generative Spoken Language Model based on continuous word-sized audio tokens Robin Algayres Yossi Adi Tu Nguyen Jade Copet Gabriel Synnaeve Benoît Sagot Emmanuel Dupoux AuLLM 119 16 0 08 Oct 2023
Comparative Analysis of Transfer Learning in Deep Learning Text-to-Speech Models on a Few-Shot, Low-Resource, Customized Dataset Ze Liu 53 1 0 08 Oct 2023
FM Tone Transfer with Envelope Learning Franco Caspe Andrew Mcpherson Mark Sandler 57 2 0 07 Oct 2023
Hate Speech Detection in Limited Data Contexts using Synthetic Data Generation Aman Khullar Daniel K. Nkemelu Cuong V. Nguyen Michael L. Best 80 5 0 04 Oct 2023