
WaveNet: A Generative Model for Raw Audio

12 September 2016

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,082 papers shown

Title
HEAR: Holistic Evaluation of Audio Representations Joseph P. Turian Jordie Shier H. Khan Bhiksha Raj Björn W. Schuller ... P. Esling Pranay Manocha Shinji Watanabe Zeyu Jin Yonatan Bisk 137 108 0 06 Mar 2022
Variational Auto-Encoder based Mandarin Speech Cloning Qingyu Xing Xiaohan Ma 133 0 0 06 Mar 2022
NeuralDPS: Neural Deterministic Plus Stochastic Model with Multiband Excitation for Noise-Controllable Waveform Generation Tao Wang Ruibo Fu Jiangyan Yi J. Tao Zhengqi Wen 25 2 0 05 Mar 2022
iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform Takuhiro Kaneko Kou Tanaka Hirokazu Kameoka Shogo Seki 89 62 0 04 Mar 2022
Look&Listen: Multi-Modal Correlation Learning for Active Speaker Detection and Speech Enhancement Jun Xiong Yu Zhou Peng Zhang Lei Xie Wei Huang Yufei Zha 72 22 0 04 Mar 2022
Real time spectrogram inversion on mobile phone Oleg Rybakov Marco Tagliasacchi Yunpeng Li Liyang Jiang Xia Zhang Fadi Biadsy 131 4 0 01 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning Lasse Borgholt Jakob Drachmann Havtorn Joakim Edin Lars Maaløe Christian Igel BDL AI4TS SSL 96 11 0 01 Mar 2022
Explainable deepfake and spoofing detection: an attack analysis using SHapley Additive exPlanations W. Ge Massimiliano Todisco Nicholas W. D. Evans AAML 54 9 0 28 Feb 2022
Concept Graph Neural Networks for Surgical Video Understanding Yutong Ban J. Eckhoff Thomas M. Ward Daniel A. Hashimoto O. Meireles Daniela Rus Guy Rosman NAI 86 18 0 27 Feb 2022
Learning the Beauty in Songs: Neural Singing Voice Beautifier Jinglin Liu Chengxi Li Yi Ren Zhiying Zhu Zhou Zhao DiffM 94 17 0 27 Feb 2022
Continuous Human Action Recognition for Human-Machine Interaction: A Review Harshala Gammulle David Ahmedt-Aristizabal Simon Denman Lachlan Tychsen-Smith L. Petersson Clinton Fookes 124 28 0 26 Feb 2022
Revisiting Over-Smoothness in Text to Speech Yi Ren Xu Tan Tao Qin Zhou Zhao Tie-Yan Liu 148 64 0 26 Feb 2022
Spatio-Temporal Latent Graph Structure Learning for Traffic Forecasting Jiabin Tang Tang Qian Shikun Liu Shengdong Du Jie Hu Tianrui Li AI4TS 58 23 0 25 Feb 2022
Preformer: Predictive Transformer with Multi-Scale Segment-wise Correlations for Long-Term Time Series Forecasting Dazhao Du Fuchun Sun Zhewei Wei AI4TS 87 51 0 23 Feb 2022
End-to-end LPCNet: A Neural Vocoder With Fully-Differentiable LPC Estimation Krishna Subramani J. Valin Umut Isik Paris Smaragdis A. Krishnaswamy 70 11 0 23 Feb 2022
Neural Speech Synthesis on a Shoestring: Improving the Efficiency of LPCNet J. Valin Umut Isik Paris Smaragdis A. Krishnaswamy 62 4 0 22 Feb 2022
Wavebender GAN: An architecture for phonetically meaningful speech manipulation Gustavo Teodoro Döhler Beck Ulme Wennberg Zofia Malisz G. Henter AI4CE 88 8 0 22 Feb 2022
Benchmarking Generative Latent Variable Models for Speech Jakob Drachmann Havtorn Lasse Borgholt Søren Hauberg J. Frellsen Lars Maaløe 80 3 0 22 Feb 2022
CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing Tao Wang Jiangyan Yi Ruibo Fu J. Tao Zhengqi Wen KELM 69 20 0 21 Feb 2022
It's Raw! Audio Generation with State-Space Models Karan Goel Albert Gu Chris Donahue Christopher Ré 110 195 0 20 Feb 2022
Learning to Detect Slip with Barometric Tactile Sensors and a Temporal Convolutional Neural Network Abhinav Grover Philippe Nadeau C. Grebe Jonathan Kelly 69 10 0 19 Feb 2022
Rethinking Pareto Frontier for Performance Evaluation of Deep Neural Networks V. Nia Alireza Ghaffari Mahdi Zolnouri Yvon Savaria 58 5 0 18 Feb 2022
Dynamic Relation Discovery and Utilization in Multi-Entity Time Series Forecasting Lin Huang Lijun Wu Jia Zhang Jiang Bian Tie-Yan Liu AI4TS 48 2 0 18 Feb 2022
PGCN: Progressive Graph Convolutional Networks for Spatial-Temporal Traffic Forecasting Y. Shin Yoonjin Yoon GNN AI4TS 74 47 0 18 Feb 2022
Speech Denoising in the Waveform Domain with Self-Attention Zhifeng Kong Ming-Yu Liu Ambrish Dantrey Bryan Catanzaro 89 63 0 15 Feb 2022
General-purpose, long-context autoregressive modeling with Perceiver AR Curtis Hawthorne Andrew Jaegle Cătălina Cangea Sebastian Borgeaud C. Nash ... Hannah R. Sheahan Neil Zeghidour Jean-Baptiste Alayrac João Carreira Jesse Engel 118 66 0 15 Feb 2022
Interpreting a Machine Learning Model for Detecting Gravitational Waves M. Safarzadeh Asad Khan Eliu A. Huerta Martin Wattenberg 108 2 0 15 Feb 2022
NewsPod: Automatic and Interactive News Podcasts Philippe Laban Elicia Ye Srujay Korlakunta John F. Canny Marti A. Hearst 54 22 0 15 Feb 2022
Visual Acoustic Matching Changan Chen Ruohan Gao P. Calamia Kristen Grauman 79 58 0 14 Feb 2022
An Introduction to Neural Data Compression Yibo Yang Stephan Mandt Lucas Theis 149 125 0 14 Feb 2022
Distribution augmentation for low-resource expressive text-to-speech Mateusz Lajszczak Animesh Prasad Arent van Korlaar Bajibabu Bollepalli Antonio Bonafonte ... M. Nicolis Alexis Moinet Thomas Drugman Trevor Wood Elena Sokolova 61 7 0 13 Feb 2022
SleepPPG-Net: a deep learning algorithm for robust sleep staging from continuous photoplethysmography Kevin Kotzen Peter H. Charlton Sharon Salabi Lea Amar A. Landesberg Joachim A. Behar 67 33 0 11 Feb 2022
Bernstein Flows for Flexible Posteriors in Variational Bayes Oliver Durr Stephan Hörling Daniel Dold Ivonne Kovylov Beate Sick BDL 102 4 0 11 Feb 2022
A Graph-based U-Net Model for Predicting Traffic in unseen Cities L. Hermes Barbara Hammer Andrew Melnik Riza Velioglu Markus Vieth M. Schilling GNN AI4TS AI4CE 77 6 0 11 Feb 2022
Conditional Diffusion Probabilistic Model for Speech Enhancement Yen-Ju Lu Zhongqiu Wang Shinji Watanabe Alexander Richard Cheng Yu Yu Tsao DiffM 84 191 0 10 Feb 2022
Diffusion bridges vector quantized Variational AutoEncoders Max H. Cohen Guillaume Quispe Sylvain Le Corff Charles Ollion Eric Moulines DiffM 90 15 0 10 Feb 2022
Deconstructing the Inductive Biases of Hamiltonian Neural Networks Nate Gruver Marc Finzi Samuel Stanton A. Wilson AI4CE 69 42 0 10 Feb 2022
InferGrad: Improving Diffusion Models for Vocoder by Considering Inference in Training Zehua Chen Xu Tan Ke Wang Shifeng Pan Danilo Mandic Lei He Sheng Zhao DiffM 71 31 0 08 Feb 2022
TACTiS: Transformer-Attentional Copulas for Time Series Alexandre Drouin Étienne Marcotte Nicolas Chapados AI4TS 283 39 0 07 Feb 2022
Deep Impulse Responses: Estimating and Parameterizing Filters with Deep Networks Alexander Richard Peter Dodds V. Ithapu 71 37 0 07 Feb 2022
Building Synthetic Speaker Profiles in Text-to-Speech Systems Jie Pu Yi Meng Oguz H. Elibol 48 2 0 07 Feb 2022
Tubes Among Us: Analog Attack on Automatic Speaker Identification Shimaa Ahmed Yash R. Wani Ali Shahin Shamsabadi Mohammad Yaghini Ilia Shumailov Nicolas Papernot Kassem Fawaz AAML 62 4 0 06 Feb 2022
GhostTalk: Interactive Attack on Smartphone Voice System Through Power Line Yuanda Wang Hanqing Guo Qiben Yan AAML 77 41 0 05 Feb 2022
EcoFlow: Efficient Convolutional Dataflows for Low-Power Neural Network Accelerators Lois Orosa Skanda Koppula Yaman Umuroglu Konstantinos Kanellopoulos Juan Gómez Luna Michaela Blott K. Vissers O. Mutlu 82 4 0 04 Feb 2022
A Survey on Safety-Critical Driving Scenario Generation -- A Methodological Perspective Wenhao Ding Chejian Xu Mansur Arief Hao-ming Lin Yue Liu Ding Zhao 119 165 0 04 Feb 2022
Deep Learning for Epidemiologists: An Introduction to Neural Networks S. Serghiou K. Rough FedML 54 14 0 02 Feb 2022
The HCCL-DKU system for fake audio generation task of the 2022 ICASSP ADD Challenge Ziyi Chen Hua Hua Yuxiang Zhang Ming Li Pengyuan Zhang 102 0 0 29 Jan 2022
ItôWave: Itô Stochastic Differential Equation Is All You Need For Wave Generation Shoule Wu Ziqiang Shi DiffM 456 9 0 29 Jan 2022
Electra: Conditional Generative Model based Predicate-Aware Query Approximation Nikhil Sheoran Subrata Mitra Vibhor Porwal Siddharth Ghetia Jatin Varshney Tung Mai Anup B. Rao Vikas Maddukuri 91 13 0 28 Jan 2022
DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs Songxiang Liu Jane Polak Scowcroft Dong Yu DiffM 150 67 0 28 Jan 2022