WaveNet: A Generative Model for Raw Audio

12 September 2016

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,082 papers shown

Title
Stochastic Reconstruction of Gappy Lagrangian Turbulent Signals by Conditional Diffusion Models Tianyi Li Luca Biferale F. Bonaccorso M. Buzzicotti Luca Centurioni DiffM 76 4 0 31 Oct 2024
RDSinger: Reference-based Diffusion Network for Singing Voice Synthesis Kehan Sui Jinxu Xiang Fang Jin DiffM 45 0 0 29 Oct 2024
Scaling-based Data Augmentation for Generative Models and its Theoretical Extension Yoshitaka Koike Takumi Nakagawa Hiroki Waida Takafumi Kanamori DiffM 121 0 0 28 Oct 2024
David and Goliath: Small One-step Model Beats Large Diffusion with Score Post-training Weijian Luo C. Zhang Debing Zhang Zhengyang Geng 96 4 0 28 Oct 2024
Meta-Learning Approaches for Improving Detection of Unseen Speech Deepfakes Ivan Kukanov Janne Laakkonen Tomi Kinnunen Ville Hautamaki AAML 133 1 0 27 Oct 2024
Utilizing Image Transforms and Diffusion Models for Generative Modeling of Short and Long Time Series Ilan Naiman Nimrod Berman Itai Pemper Idan Arbiv Gal Fadlon Omri Azencot 87 15 0 25 Oct 2024
Flow Generator Matching Zemin Huang Zhengyang Geng Weijian Luo Guo-Jun Qi 123 11 0 25 Oct 2024
Making Social Platforms Accessible: Emotion-Aware Speech Generation with Integrated Text Analysis Suparna De Ionut Bostan Nishanth Sastry 122 0 0 24 Oct 2024
TEAM: Topological Evolution-aware Framework for Traffic Forecasting--Extended Version Duc Kieu Tung Kieu Peng Han Bin Yang Christian S. Jensen Bac Le AI4TS 54 2 0 24 Oct 2024
Diff-Instruct++: Training One-step Text-to-image Generator Model to Align with Human Preferences Weijian Luo EGVM 114 9 0 24 Oct 2024
ELAICHI: Enhancing Low-resource TTS by Addressing Infrequent and Low-frequency Character Bigrams Srija Anand Praveen Srinivasa Varadhan Mehak Singal Mitesh M. Khapra 40 0 0 23 Oct 2024
Regularized autoregressive modeling and its application to audio signal declipping Ondřej Mokrý P. Rajmic 82 0 0 23 Oct 2024
Deep Generative Models for 3D Medical Image Synthesis Paul Friedrich Yannik Frisch P. Cattin 3DV MedIm 92 4 0 23 Oct 2024
One-Step Diffusion Distillation through Score Implicit Matching Weijian Luo Zemin Huang Zhengyang Geng J. Zico Kolter Guo-Jun Qi DiffM 92 21 0 22 Oct 2024
Real-time Sub-milliwatt Epilepsy Detection Implemented on a Spiking Neural Network Edge Inference Processor Ruixin Li Guoxu Zhao Dylan Richard Muir Yuya Ling Karla Burelo Mina Khoei Dong Wang Y. Xing Ning Qiao 42 2 0 22 Oct 2024
Acoustic Model Optimization over Multiple Data Sources: Merging and Valuation Victor Junqiu Wei Weicheng Wang Di Jiang Conghui Tan Rongzhong Lian MoMe 96 0 0 21 Oct 2024
ConSinger: Efficient High-Fidelity Singing Voice Generation with Minimal Steps Yulin Song Guorui Sang Jing Yu Chuangbai Xiao DiffM 66 1 0 20 Oct 2024
Optimal Transport Maps are Good Voice Converters Arip Asadulaev Rostislav Korst V. Shutov Alexander Korotin Yaroslav Grebnyak Vahe Egiazarian Evgeny Burnaev OT 57 2 0 17 Oct 2024
Accelerating Codec-based Speech Synthesis with Multi-Token Prediction and Speculative Decoding Tan Dat Nguyen Ji-Hoon Kim Jeongsoo Choi Shukjae Choi Jinseok Park Younglo Lee Joon Son Chung 86 3 0 17 Oct 2024
DART: Disentanglement of Accent and Speaker Representation in Multispeaker Text-to-Speech J. Melechovský Ambuj Mehrish Berrak Sisman Dorien Herremans 57 2 0 17 Oct 2024
DurIAN-E 2: Duration Informed Attention Network with Adaptive Variational Autoencoder and Adversarial Learning for Expressive Text-to-Speech Synthesis Yu Gu Qiushi Zhu Guangzhi Lei Chao Weng Jane Polak Scowcroft DiffM 67 0 0 17 Oct 2024
TabSeq: A Framework for Deep Learning on Tabular Data via Sequential Ordering A. Habib Kesheng Wang Mary-Anne Hartley Gianfranco Doretto Donald Adjeroh LMTD 132 1 0 17 Oct 2024
Irregularity-Informed Time Series Analysis: Adaptive Modelling of Spatial and Temporal Dynamics Liangwei Nathan Zheng Zhengyang Li Chang George Dong Wei Emma Zhang Lin Yue Miao Xu Olaf Maennel Weitong Chen AI4TS 74 2 0 16 Oct 2024
Transfer Learning with Foundational Models for Time Series Forecasting using Low-Rank Adaptations M. Germán-Morales A. J. Rivera-Rivas M. J. del Jesus Díaz C. J. Carmona AI4TS AI4CE 286 1 0 15 Oct 2024
Generative Deep Learning and Signal Processing for Data Augmentation of Cardiac Auscultation Signals: Improving Model Robustness Using Synthetic Audio Leigh Abbott Milan Marocchi Matthew Fynn Yue Rong Sven Nordholm MedIm 66 0 0 14 Oct 2024
Bahasa Harmony: A Comprehensive Dataset for Bahasa Text-to-Speech Synthesis with Discrete Codec Modeling of EnGen-TTS Onkar Kishor Susladkar Vishesh Tripathi Biddwan Ahmed 42 0 0 09 Oct 2024
Evaluating the Generalization Ability of Spatiotemporal Model in Urban Scenario Hongjun Wang Jiyuan Chen Tong Pan Zheng Dong Lingyu Zhang Renhe Jiang Xuan Song OOD 169 3 0 07 Oct 2024
Neural Fourier Modelling: A Highly Compact Approach to Time-Series Analysis Minjung Kim Yusuke Hioka Michael Witbrock AI4TS 93 0 0 07 Oct 2024
EmoGene: Audio-Driven Emotional 3D Talking-Head Generation Wenqing Wang Yun Fu VGen 141 0 0 07 Oct 2024
GAS-Norm: Score-Driven Adaptive Normalization for Non-Stationary Time Series Forecasting in Deep Learning Edoardo Urettini Daniele Atzeni Reshawn J. Ramjattan Antonio Carta AI4TS 46 0 0 04 Oct 2024
S7: Selective and Simplified State Space Layers for Sequence Modeling Taylan Soydan Nikola Zubić Nico Messikommer Siddhartha Mishra Davide Scaramuzza 86 7 0 04 Oct 2024
Generative Semantic Communication for Text-to-Speech Synthesis Jiahao Zheng Jinke Ren Peng Xu Zhihao Yuan Jie Xu Fangxin Wang Gui Gui Shuguang Cui 55 2 0 04 Oct 2024
Convolutional Variational Autoencoders for Spectrogram Compression in Automatic Speech Recognition Olga Iakovenko Ivan Bondarenko 41 0 0 03 Oct 2024
Flow Matching with Gaussian Process Priors for Probabilistic Time Series Forecasting Marcel Kollovieh Marten Lienen David Lüdke Leo Schwinn Stephan Günnemann AI4TS BDL DiffM 152 8 0 03 Oct 2024
On the Geometry and Optimization of Polynomial Convolutional Networks Vahid Shahverdi Giovanni Luca Marchetti Kathlén Kohn 55 4 0 01 Oct 2024
Recent Advances in Speech Language Models: A Survey Wenqian Cui Dianzhi Yu Xiaoqi Jiao Ziqiao Meng Guangyan Zhang Qichao Wang Yiwen Guo Irwin King AuLLM 203 26 0 01 Oct 2024
TSI: A Multi-View Representation Learning Approach for Time Series Forecasting Wentao Gao Ziqi Xu Jiuyong Li Lin Liu Jixue Liu T. Le Debo Cheng Yanchang Zhao Yun Chen AI4TS 68 1 0 30 Sep 2024
A method for identifying causality in the response of nonlinear dynamical systems Joseph Massingham Ole Nielsen Tore Butlin CML 38 0 0 26 Sep 2024
A Survey of Spatio-Temporal EEG data Analysis: from Models to Applications Pengfei Wang Huanran Zheng Silong Dai Yiqiao Wang Xiaotian Gu Yuanbin Wu Xiaoling Wang SyDa AI4TS 152 4 0 26 Sep 2024
Neural Coordination and Capacity Control for Inventory Management Carson Eisenach Udaya Ghai Dhruv Madeka Kari Torkkola Dean Phillips Foster Sham Kakade 59 0 0 24 Sep 2024
HiFi-Glot: Neural Formant Synthesis with Differentiable Resonant Filters Lauri Juvela Pablo Pérez Zarazaga G. Henter Zofia Malisz 56 0 0 23 Sep 2024
A Comprehensive Survey with Critical Analysis for Deepfake Speech Detection Lam Pham Phat Lam Dat Tran Hieu Tang Tin Nguyen Alexander Schindler Canh Vu Alexander Polonsky Canh Vu 129 5 0 23 Sep 2024
ReFine: Boosting Time Series Prediction of Extreme Events by Reweighting and Fine-tuning Jimeng Shi Azam Shirali Giri Narasimhan AI4TS 78 0 0 21 Sep 2024
Sketching With Your Voice: "Non-Phonorealistic" Rendering of Sounds via Vocal Imitation Matthew Caren Kartik Chandra J. Tenenbaum Jonathan Ragan-Kelley Karima Ma 77 0 0 20 Sep 2024
DiffSSD: A Diffusion-Based Dataset For Speech Forensics Kratika Bhagtani Amit Kumar Singh Yadav Paolo Bestagini Edward J. Delp DiffM 55 2 0 19 Sep 2024
A quest through interconnected datasets: lessons from highly-cited ICASSP papers Cynthia C. S. Liem Doğa Taşcılar Andrew M. Demetriou 54 0 0 19 Sep 2024
ViolinDiff: Enhancing Expressive Violin Synthesis with Pitch Bend Conditioning Daewoong Kim Hao-Wen Dong Dasaem Jeong 68 0 0 19 Sep 2024
SpoofCeleb: Speech Deepfake Detection and SASV In The Wild Jee-weon Jung Yihan Wu Xin Wang Ji-Hoon Kim Soumi Maiti ... Joon Son Chung Wangyou Zhang Seyun Um Shinnosuke Takamichi Shinji Watanabe 155 4 0 18 Sep 2024
BAD: Bidirectional Auto-regressive Diffusion for Text-to-Motion Generation Seyed Rohollah Hosseyni Ali Ahmad Rahmani S. J. Seyedmohammadi Sanaz Seyedin Arash Mohammadi DiffM 93 7 0 17 Sep 2024
Implicit Reasoning in Deep Time Series Forecasting Willa Potosnak Cristian Challu Mononito Goswami Michał Wiliński Nina Żukowska Artur Dubrawski ReLM AI4TS LRM 98 4 0 17 Sep 2024