1609.03499
Cited By
WaveNet: A Generative Model for Raw Audio
12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
Papers citing
"WaveNet: A Generative Model for Raw Audio"
50 / 3,082 papers shown
Title
Breaking Quadratic Barriers: A Non-Attention LLM for Ultra-Long Context Horizons
Andrew Kiruluta
Preethi Raju
Priscilla Burity
14
0
0
09 May 2025
Recognizing Ornaments in Vocal Indian Art Music with Active Annotation
Sumit Kumar
Parampreet Singh
Vipul Arora
60
0
0
07 May 2025
Aliasing Reduction in Neural Amp Modeling by Smoothing Activations
Ryota Sato
Julius O. Smith III
91
0
0
07 May 2025
On the retraining frequency of global forecasting models
Marco Zanotti
78
2
0
01 May 2025
Voice Cloning: Comprehensive Survey
Hussam Azzuni
Abdulmotaleb El Saddik
VLM
114
0
0
01 May 2025
Temporal Attention Evolutional Graph Convolutional Network for Multivariate Time Series Forecasting
Xinlong Zhao
Lingling Zhang
Tianbo Zou
Yan Zhang
AI4TS
158
0
0
01 May 2025
Versatile Framework for Song Generation with Prompt-based Control
Yanzhe Zhang
Wenxiang Guo
Changhao Pan
Zehan Zhu
Ruiqi Li
...
Rongjie Huang
Ruiyuan Zhang
Zhiqing Hong
Ziyue Jiang
Zhou Zhao
214
2
0
27 Apr 2025
Auto-FEDUS: Autoregressive Generative Modeling of Doppler Ultrasound Signals from Fetal Electrocardiograms
Alireza Rafiei
Gari D. Clifford
N. Katebi
82
0
0
17 Apr 2025
Generation of Musical Timbres using a Text-Guided Diffusion Model
Weixuan Yuan
Qadeer Khan
Vladimir Golkov
DiffM
112
0
0
12 Apr 2025
AMNet: An Acoustic Model Network for Enhanced Mandarin Speech Synthesis
Yubing Cao
Yinfeng Yu
Yongming Li
Liejun Wang
67
0
0
12 Apr 2025
Forecasting Cryptocurrency Prices using Contextual ES-adRNN with Exogenous Variables
Slawek Smyl
Grzegorz Dudek
Paweł Pełka
AI4TS
74
1
0
11 Apr 2025
TAPNext: Tracking Any Point (TAP) as Next Token Prediction
Artem Zholus
Carl Doersch
Yi Yang
Skanda Koppula
Viorica Patraucean
Xu He
Ignacio Rocco
Mehdi S. M. Sajjadi
Sarath Chandar
Ross Goroshin
89
0
0
08 Apr 2025
SoundVista: Novel-View Ambient Sound Synthesis via Visual-Acoustic Binding
Mingfei Chen
I. D. Gebru
Ishwarya Ananthabhotla
Christian Richardt
Dejan Marković
Jake Sandakly
Steven Krenn
Todd Keebler
Eli Shlizerman
Alexander Richard
83
0
0
08 Apr 2025
P2Mark: Plug-and-play Parameter-level Watermarking for Neural Speech Generation
Yong Ren
Jiangyan Yi
Tao Wang
J. Tao
Zhengqi Wen
Chenxing Li
Zheng Lian
Ruibo Fu
Ye Bai
Xiaohui Zhang
102
0
0
07 Apr 2025
SpeakEasy: Enhancing Text-to-Speech Interactions for Expressive Content Creation
Stephen Brade
Sam Anderson
Rithesh Kumar
Zeyu Jin
Anh Truong
89
0
0
07 Apr 2025
Electromyography-Based Gesture Recognition: Hierarchical Feature Extraction for Enhanced Spatial-Temporal Dynamics
Jungpil Shin
Abu Saleh Musa Miah
Sota Konnai
Shu Hoshitaka
Pankoo Kim
67
0
0
04 Apr 2025
LiDAR-based Object Detection with Real-time Voice Specifications
Anurag Kulkarni
55
0
0
03 Apr 2025
A Survey on Music Generation from Single-Modal, Cross-Modal, and Multi-Modal Perspectives
Shuyu Li
Shulei Ji
Zihao Wang
Songruoyao Wu
Jiaxing Yu
Kai Zhang
MGen
VGen
297
1
0
01 Apr 2025
HDVIO2.0: Wind and Disturbance Estimation with Hybrid Dynamics VIO
Giovanni Cioffi
L. Bauersfeld
Davide Scaramuzza
94
0
0
01 Apr 2025
Style Quantization for Data-Efficient GAN Training
Jian Wang
Xin Lan
Jizhe Zhou
Yuxin Tian
Jiancheng Lv
94
0
0
31 Mar 2025
Make Some Noise: Towards LLM audio reasoning and generation using sound tokens
Shivam Mehta
Nebojsa Jojic
Hannes Gamper
73
0
0
28 Mar 2025
From Deep Learning to LLMs: A survey of AI in Quantitative Investment
Bokai Cao
Saizhuo Wang
Xinyi Lin
Xiaojun Wu
Haohan Zhang
L. Ni
Jian Guo
AIFin
112
1
0
27 Mar 2025
Tune It Up: Music Genre Transfer and Prediction
Fidan Samet
Oguz Bakir
Adnan Fidan
59
0
0
27 Mar 2025
ReverBERT: A State Space Model for Efficient Text-Driven Speech Style Transfer
Michael Brown
Sofia Martinez
Priya Singh
72
0
0
26 Mar 2025
Debiasing Kernel-Based Generative Models
Tian Qin
Wei-Min Huang
194
0
0
26 Mar 2025
An Empirical Study of the Impact of Federated Learning on Machine Learning Model Accuracy
Haotian Yang
Ziyi Wang
Benson Chou
Sophie Xu
Hao Wang
Jingxian Wang
Qizhen Zhang
FedML
121
0
0
26 Mar 2025
BADGR: Bundle Adjustment Diffusion Conditioned by GRadients for Wide-Baseline Floor Plan Reconstruction
Yuguang Li
Ivaylo Boyadzhiev
Zixuan Liu
Linda Shapiro
Alex Colburn
DiffM
3DV
98
0
0
25 Mar 2025
SparSamp: Efficient Provably Secure Steganography Based on Sparse Sampling
Yaofei Wang
Gang Pei
Kejiang Chen
Jinyang Ding
Chao Pan
Weilong Pang
Donghui Hu
Weinan Zhang
79
2
0
25 Mar 2025
WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching
Tianze Luo
Xingchen Miao
Wenbo Duan
DiffM
91
0
0
20 Mar 2025
Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie Dubbing
Zhedong Zhang
Liang-Sheng Li
C. Yan
Chunshan Liu
Anton Van Den Hengel
Yuankai Qi
142
2
0
15 Mar 2025
Designing Neural Synthesizers for Low-Latency Interaction
Franco Caspe
Jordie Shier
Mark Sandler
C. Saitis
Andrew Mcpherson
437
0
0
14 Mar 2025
Exploring Performance-Complexity Trade-Offs in Sound Event Detection Models
T. Morocutti
Florian Schmid
Jonathan Greif
Francesco Foscarin
Gerhard Widmer
74
0
0
14 Mar 2025
Chat-TS: Enhancing Multi-Modal Reasoning Over Time-Series and Natural Language Data
Paul Quinlan
Qingguo Li
Xiaodan Zhu
AI4TS
LRM
92
0
0
13 Mar 2025
Mamba-VA: A Mamba-based Approach for Continuous Emotion Recognition in Valence-Arousal Space
Yuheng Liang
Ziyi Wang
Feng Liu
Mingzhou Liu
Yu Yao
Mamba
114
1
0
13 Mar 2025
Probabilistic Forecasting via Autoregressive Flow Matching
Ahmed El-Gazzar
Marcel van Gerven
AI4TS
95
0
0
13 Mar 2025
Learning Control of Neural Sound Effects Synthesis from Physically Inspired Models
Yisu Zong
Joshua Reiss
79
1
0
13 Mar 2025
An Exhaustive Evaluation of TTS- and VC-based Data Augmentation for ASR
Sewade Ogun
Vincent Colotte
Emmanuel Vincent
108
0
0
11 Mar 2025
Multilevel Generative Samplers for Investigating Critical Phenomena
Ankur Singha
E. Cellini
K. Nicoli
K. Jansen
Stefan Kühn
Shinichi Nakajima
108
1
0
11 Mar 2025
Generalized Interpolating Discrete Diffusion
Dimitri von Rutte
J. Fluri
Yuhui Ding
Antonio Orvieto
Bernhard Scholkopf
Thomas Hofmann
DiffM
133
2
0
06 Mar 2025
An Optimization Algorithm for Multimodal Data Alignment
Wei Zhang
Xinyu Wang
Lan Yu
S. Li
69
0
0
05 Mar 2025
FlowDec: A flow-based full-band general audio codec with high perceptual quality
Simon Welker
Matthew Le
Ricky T. Q. Chen
Wei-Ning Hsu
Timo Gerkmann
Alexander Richard
Yi-Chiao Wu
98
1
0
03 Mar 2025
Self-attention-based Diffusion Model for Time-series Imputation in Partial Blackout Scenarios
Mohammad Rafid Ul Islam
Prasad Tadepalli
Alan Fern
60
0
0
03 Mar 2025
HOP: Heterogeneous Topology-based Multimodal Entanglement for Co-Speech Gesture Generation
Hongye Cheng
Tianyu Wang
Guangsi Shi
Zexing Zhao
Yanwei Fu
SLR
84
1
0
03 Mar 2025
Language-agnostic, automated assessment of listeners' speech recall using large language models
Björn Herrmann
42
0
0
02 Mar 2025
Clip-TTS: Contrastive Text-content and Mel-spectrogram, A High-Quality Text-to-Speech Method based on Contextual Semantic Understanding
Tianyun Liu
CLIP
VLM
105
0
0
26 Feb 2025
SMT(LIA) Sampling with High Diversity
Yong Lai
Junjie Li
Chuan Luo
93
0
0
25 Feb 2025
PuzzleFusion++: Auto-agglomerative 3D Fracture Assembly by Denoise and Verify
Zhengqing Wang
Jiacheng Chen
Yasutaka Furukawa
149
8
0
24 Feb 2025
Everyday Speech in the Indian Subcontinent
Utkarsh Pathak
105
1
0
24 Feb 2025
An End-to-End Homomorphically Encrypted Neural Network
Marcos Florencio
Luiz Alencar
Bianca Lima
SyDa
139
0
0
22 Feb 2025
Beyond Fixed Variables: Expanding-variate Time Series Forecasting via Flat Scheme and Spatio-temporal Focal Learning
Minbo Ma
Kai Tang
Huan Li
Fei Teng
Dalin Zhang
Tianrui Li
AI4TS
101
0
0
21 Feb 2025