WaveNet: A Generative Model for Raw Audio

12 September 2016

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,082 papers shown

Singer Identification for Metaverse with Timbral and Middle-Level Perceptual Features Xulong Zhang Jianzong Wang Ning Cheng Jing Xiao 65 16 0 24 May 2022
HiPAL: A Deep Framework for Physician Burnout Prediction Using Activity Logs in Electronic Health Records Hanyang Liu Sunny S. Lou Benjamin C. Warner Derek Harford Thomas Kannampallil Chenyang Lu LM&MA HAI 72 11 0 24 May 2022
Deep Representations for Time-varying Brain Datasets Sikun Lin Shuyun Tang Scott T. Grafton Ambuj K. Singh AI4CE 62 6 0 23 May 2022
Self-Supervised Speech Representation Learning: A Review Abdel-rahman Mohamed Hung-yi Lee Lasse Borgholt Jakob Drachmann Havtorn Joakim Edin ... Shang-Wen Li Karen Livescu Lars Maaløe Tara N. Sainath Shinji Watanabe SSL AI4TS 293 368 0 21 May 2022
Self-Supervised Time Series Representation Learning via Cross Reconstruction Transformer Wen-Rang Zhang Ling Yang Shijia Geng Shenda Hong ViT AI4TS 94 45 0 20 May 2022
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions Wonjune Kang M. Hasegawa-Johnson D. Roy 86 8 0 19 May 2022
Improving Robustness against Real-World and Worst-Case Distribution Shifts through Decision Region Quantification Leo Schwinn Leon Bungert A. Nguyen René Raab Falk Pulsmeyer Doina Precup Björn Eskofier Dario Zanca OOD 93 15 0 19 May 2022
Cross-Enhancement Transformer for Action Segmentation Jiahui Wang Zhenyou Wang Shanna Zhuang Hui Wang ViT 93 23 0 19 May 2022
Macedonian Speech Synthesis for Assistive Technology Applications B. Sofronievski Elena Velovska Martin Velichkovski Violeta Argirova Tea Veljkovikj ... Kristijan Lazarev Toni Bachvarovski Z. Ivanovski Dimitar Tashkovski B. Gerazov 20 0 0 18 May 2022
Spatial-Temporal Interactive Dynamic Graph Convolution Network for Traffic Forecasting Aoyun Liu Yaying Zhang GNN AI4TS 80 33 0 18 May 2022
HARNet: A Convolutional Neural Network for Realized Volatility Forecasting Rafael Reisenhofer Xandro Bayer N. Hautsch 59 8 0 16 May 2022
cMelGAN: An Efficient Conditional Generative Model Based on Mel Spectrograms Tracy Qian Jackson Kaunismaa Tony Chung MGen GAN MedIm 40 6 0 15 May 2022
GAN-Aimbots: Using Machine Learning for Cheating in First Person Shooters Anssi Kanervisto Tomi Kinnunen Ville Hautamaki 36 13 0 14 May 2022
A Generalist Agent Scott E. Reed Konrad Zolna Emilio Parisotto Sergio Gomez Colmenarejo Alexander Novikov ... Yutian Chen R. Hadsell Oriol Vinyals Mahyar Bordbar Nando de Freitas LM&Ro LLMAG AI4CE 217 827 0 12 May 2022
Unified Source-Filter GAN with Harmonic-plus-Noise Source Excitation Generation Reo Yoneyama Yi-Chiao Wu Tomoki Toda 70 14 0 12 May 2022
Robot Cooking with Stir-fry: Bimanual Non-prehensile Manipulation of Semi-fluid Objects Junjia Liu Yiting Chen Zhipeng Dong Shixiong Wang Sylvain Calinon Miao Li Fei Chen 90 62 0 12 May 2022
Real-Time Packet Loss Concealment With Mixed Generative and Predictive Model J. Valin Ahmed Mustafa Christopher Montgomery Timothy B. Terriberry Michael Klingbeil Paris Smaragdis A. Krishnaswamy 61 18 0 11 May 2022
Efficient Automated Deep Learning for Time Series Forecasting Difan Deng Florian Karl Frank Hutter Bernd Bischl Marius Lindauer AI4TS 139 16 0 11 May 2022
Towards Improved Zero-shot Voice Conversion with Conditional DSVAE Jiachen Lian Chunlei Zhang Gopala Krishna Anumanchipalli Dong Yu 53 23 0 11 May 2022
Deep Learning Enabled Semantic Communications with Speech Recognition and Synthesis Zhenzi Weng Zhijin Qin Xiaoming Tao Chengkang Pan Guangyi Liu Geoffrey Ye Li 87 144 0 09 May 2022
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality Xu Tan Jiawei Chen Haohe Liu Jian Cong Chen Zhang ... Lei He Frank Soong Tao Qin Sheng Zhao Tie-Yan Liu 144 221 0 09 May 2022
Cross-Utterance Conditioned VAE for Non-Autoregressive Text-to-Speech Yongqian Li Cheng Yu Guangzhi Sun Hua Jiang Fanglei Sun Weiqin Zu Ying Wen Yang Yang Jun Wang 60 7 0 09 May 2022
Synthetic Data -- what, why and how? James Jordon Lukasz Szpruch F. Houssiau M. Bottarelli Giovanni Cherubin Carsten Maple Samuel N. Cohen Adrian Weller 96 120 0 06 May 2022
GANimator: Neural Motion Synthesis from a Single Sequence Peizhuo Li Kfir Aberman Zihan Zhang Rana Hanocka O. Sorkine-Hornung GAN 58 35 0 05 May 2022
MAD: Self-Supervised Masked Anomaly Detection Task for Multivariate Time Series Yiwei Fu Feng Xue AI4TS 36 15 0 04 May 2022
SVTS: Scalable Video-to-Speech Synthesis Rodrigo Mira A. Haliassos Stavros Petridis Björn W. Schuller Maja Pantic 71 35 0 04 May 2022
TartanDrive: A Large-Scale Dataset for Learning Off-Road Dynamics Models S. Triest Matthew Sivaprakasam Sean J. Wang Wenshan Wang Aaron M. Johnson Sebastian Scherer 137 57 0 03 May 2022
The ICML 2022 Expressive Vocalizations Workshop and Competition: Recognizing, Generating, and Personalizing Vocal Bursts Alice Baird Panagiotis Tzirakis Gauthier Gidel Marco Jiralerspong Eilif B. Muller Kory W. Mathewson Björn Schuller Min Zhang D. Keltner Alan S. Cowen VLM 84 30 0 03 May 2022
HarmoF0: Logarithmic Scale Dilated Convolution For Pitch Estimation Weixing Wei P. Li Yi Yu Wei Li 51 17 0 02 May 2022
A Novel Speech-Driven Lip-Sync Model with CNN and LSTM Xiaohong Li Xiang Wang Kai Wang Kai Wang 41 4 0 02 May 2022
Short-Term Density Forecasting of Low-Voltage Load using Bernstein-Polynomial Normalizing Flows M. Arpogaus Marcus Voss Beate Sick Mark Nigge-Uricher Oliver Durr 69 18 0 29 Apr 2022
Regotron: Regularizing the Tacotron2 architecture via monotonic alignment loss Efthymios Georgiou Kosmas Kritsis Georgios Paraskevopoulos Athanasios Katsamanis Vassilis Katsouros Alexandros Potamianos 133 3 0 28 Apr 2022
Parallel Synthesis for Autoregressive Speech Generation Po-Chun Hsu Da-Rong Liu Andy T. Liu Hung-yi Lee 80 5 0 25 Apr 2022
SyntaSpeech: Syntax-Aware Generative Adversarial Text-to-Speech Zhenhui Ye Zhou Zhao Yi Ren Leilei Gan 88 28 0 25 Apr 2022
PhysioGAN: Training High Fidelity Generative Model for Physiological Sensor Readings M. Alzantot L. Garcia Mani B. Srivastava 31 1 0 25 Apr 2022
Improving Self-Supervised Learning-based MOS Prediction Networks Bálint Gyires-Tóth Csaba Zainkó SSL 38 1 0 23 Apr 2022
Sequence-Based Target Coin Prediction for Cryptocurrency Pump-and-Dump Sihao Hu Zhen Zhang Shengliang Lu Bingsheng He Zhao Li AI4TS 65 16 0 21 Apr 2022
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis Rongjie Huang Max W. Y. Lam Jun Wang Jane Polak Scowcroft Dong Yu Yi Ren Zhou Zhao DiffM 76 172 0 21 Apr 2022
STFT-Domain Neural Speech Enhancement with Very Low Algorithmic Latency Zhong-Qiu Wang Gordon Wichern Shinji Watanabe Jonathan Le Roux 87 36 0 21 Apr 2022
Scale Dependencies and Self-Similar Models with Wavelet Scattering Spectra Rudy Morel G. Rochette R. Leonarduzzi J. Bouchaud S. Mallat 59 14 0 19 Apr 2022
Approaching sales forecasting using recurrent neural networks and transformers Iván Vallés-Pérez E. Soria-Olivas M. Martínez-Sober Antonio J. Serrano J. Gómez-Sanchís Fernando Mateo AI4TS 57 37 0 16 Apr 2022
Efficient Architecture Search for Diverse Tasks Jun Shen M. Khodak Ameet Talwalkar 64 34 0 15 Apr 2022
Diagnosing and Fixing Manifold Overfitting in Deep Generative Models Gabriel Loaiza-Ganem Brendan Leigh Ross Jesse C. Cresswell Anthony L. Caterini GAN DRL 113 31 0 14 Apr 2022
Do You Really Mean That? Content Driven Audio-Visual Deepfake Dataset and Multimodal Method for Temporal Forgery Localization Zhixi Cai Kalin Stefanov Abhinav Dhall Munawar Hayat 72 3 0 13 Apr 2022
A Post Auto-regressive GAN Vocoder Focused on Spectrum Fracture Zhe-ming Lu Mengnan He Ruixiong Zhang Caixia Gong GAN 28 2 0 12 Apr 2022
Fine-grained Noise Control for Multispeaker Speech Synthesis Karolos Nikitaras G. Vamvoukakis Nikolaos Ellinas Konstantinos Klapsas K. Markopoulos S. Raptis June Sig Sung Gunu Jho Aimilios Chalamandaris Pirros Tsiakoulis 69 5 0 11 Apr 2022
Model-free optimization of power/efficiency tradeoffs in quantum thermal machines using reinforcement learning P. A. Erdman Frank Noé 42 9 0 10 Apr 2022
On Principal Curve-Based Classifiers and Similarity-Based Selective Sampling in Time-Series Aref Hakimzadeh K. Ziarati M. Taheri AI4TS 52 0 0 10 Apr 2022
Super-Resolved Microbubble Localization in Single-Channel Ultrasound RF Signals Using Deep Learning N. Blanken J. Wolterink H. Delingette Christoph Brune M. Versluis G. Lajoinie 43 15 0 09 Apr 2022
Enhanced exemplar autoencoder with cycle consistency loss in any-to-one voice conversion Weida Liang Lantian Li Wenqiang Du Dong Wang 126 0 0 08 Apr 2022