ResearchTrend.AI
WaveNet: A Generative Model for Raw Audio

12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
    DiffM

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,039 papers shown
Diffusion Model in Hyperspectral Image Processing and Analysis: A Review
Xing Hu
Xiangcheng Liu
Qianqian Duan
Danfeng Hong
Dawei Zhang
DiffM
24
0
0
16 May 2025
TransDiffuser: End-to-end Trajectory Generation with Decorrelated Multi-modal Representation for Autonomous Driving
Xuefeng Jiang
Yuan Ma
Pengxiang Li
Leimeng Xu
Xin Wen
Kun Zhan
Zhongpu Xia
Peng Jia
Xianpeng Lang
Sheng Sun
DiffM
18
0
0
14 May 2025
SingNet: Towards a Large-Scale, Diverse, and In-the-Wild Singing Voice Dataset
Yicheng Gu
Chaoren Wang
Jingyang Zhang
Xueyao Zhang
Zihao Fang
Haorui He
Zhizheng Wu
32
2
0
14 May 2025
MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder
Bowen Zhang
Congchao Guo
Geng Yang
Hang Yu
Haozhe Zhang
...
Yichen Xiao
Yiying Zhou
Yujie Zhang
Yuan Lu
Yucen He
26
0
0
12 May 2025
Physics-informed Multiple-Input Operators for efficient dynamic response prediction of structures
Bilal Ahmed
Yuqing Qiu
Diab W. Abueidda
Waleed El-Sekelly
Tarek Abdoun
M. Mobasher
AI4CE
39
0
0
11 May 2025
Beyond Identity: A Generalizable Approach for Deepfake Audio Detection
Yasaman Ahmadiadli
Xiao-Ping Zhang
Naimul Khan
31
0
0
10 May 2025
Accurate and Efficient Multivariate Time Series Forecasting via Offline Clustering
Yiming Niu
Jinliang Deng
L. Zhang
Zimu Zhou
Yongxin Tong
AI4TS
31
0
0
09 May 2025
Aliasing Reduction in Neural Amp Modeling by Smoothing Activations
Ryota Sato
Julius O. Smith III
48
0
0
07 May 2025
Recognizing Ornaments in Vocal Indian Art Music with Active Annotation
Sumit Kumar
Parampreet Singh
Vipul Arora
34
0
0
07 May 2025
Do global forecasting models require frequent retraining?
Marco Zanotti
39
0
0
01 May 2025
Voice Cloning: Comprehensive Survey
Hussam Azzuni
Abdulmotaleb El Saddik
VLM
44
0
0
01 May 2025
Temporal Attention Evolutional Graph Convolutional Network for Multivariate Time Series Forecasting
Xinlong Zhao
L. Zhang
Tianbo Zou
Yan Zhang
AI4TS
26
0
0
01 May 2025
Versatile Framework for Song Generation with Prompt-based Control
Wenjie Qu
Wenxiang Guo
Changhao Pan
Zehan Zhu
Ruiqi Li
...
Rongjie Huang
Ruiyuan Zhang
Zhiqing Hong
Ziyue Jiang
Zhou Zhao
77
2
0
27 Apr 2025
Auto-FEDUS: Autoregressive Generative Modeling of Doppler Ultrasound Signals from Fetal Electrocardiograms
Alireza Rafiei
Gari D. Clifford
N. Katebi
36
0
0
17 Apr 2025
Generation of Musical Timbres using a Text-Guided Diffusion Model
Weixuan Yuan
Qadeer Khan
Vladimir Golkov
DiffM
31
0
0
12 Apr 2025
AMNet: An Acoustic Model Network for Enhanced Mandarin Speech Synthesis
Yubing Cao
Yinfeng Yu
Yongming Li
Liejun Wang
29
0
0
12 Apr 2025
Forecasting Cryptocurrency Prices using Contextual ES-adRNN with Exogenous Variables
Slawek Smyl
Grzegorz Dudek
Paweł Pełka
AI4TS
21
1
0
11 Apr 2025
TAPNext: Tracking Any Point (TAP) as Next Token Prediction
Artem Zholus
Carl Doersch
Yi Yang
Skanda Koppula
Viorica Patraucean
Xu He
Ignacio Rocco
Mehdi S. M. Sajjadi
Sarath Chandar
Ross Goroshin
32
0
0
08 Apr 2025
SoundVista: Novel-View Ambient Sound Synthesis via Visual-Acoustic Binding
Mingfei Chen
I. D. Gebru
Ishwarya Ananthabhotla
Christian Richardt
Dejan Marković
Jake Sandakly
Steven Krenn
Todd Keebler
Eli Shlizerman
Alexander Richard
24
0
0
08 Apr 2025
SpeakEasy: Enhancing Text-to-Speech Interactions for Expressive Content Creation
Stephen Brade
Sam Anderson
Rithesh Kumar
Zeyu Jin
Anh Truong
41
0
0
07 Apr 2025
P2Mark: Plug-and-play Parameter-level Watermarking for Neural Speech Generation
Yong Ren
Jiangyan Yi
Tao Wang
J. Tao
Zhengqi Wen
Chenxing Li
Zheng Lian
Ruibo Fu
Ye Bai
Xiaohui Zhang
62
0
0
07 Apr 2025
Solid State Bus-Comp: A Large-Scale and Diverse Dataset for Dynamic Range Compressor Virtual Analog Modeling
Yicheng Gu
Runsong Zhang
Lauri Juvela
Zhikai Wu
DiffM
189
0
0
06 Apr 2025
Electromyography-Based Gesture Recognition: Hierarchical Feature Extraction for Enhanced Spatial-Temporal Dynamics
Jungpil Shin
Abu Saleh Musa Miah
Sota Konnai
Shu Hoshitaka
Pankoo Kim
36
0
0
04 Apr 2025
LiDAR-based Object Detection with Real-time Voice Specifications
Anurag Kulkarni
24
0
0
03 Apr 2025
HDVIO2.0: Wind and Disturbance Estimation with Hybrid Dynamics VIO
Giovanni Cioffi
L. Bauersfeld
Davide Scaramuzza
51
0
0
01 Apr 2025
A Survey on Music Generation from Single-Modal, Cross-Modal, and Multi-Modal Perspectives
Shuyu Li
Shulei Ji
Zihao Wang
Songruoyao Wu
Jiaxing Yu
Kaipeng Zhang
MGen
VGen
73
1
0
01 Apr 2025
Style Quantization for Data-Efficient GAN Training
Jian Wang
Xin Lan
Jizhe Zhou
Yuxin Tian
Jiancheng Lv
51
0
0
31 Mar 2025
Make Some Noise: Towards LLM audio reasoning and generation using sound tokens
Shivam Mehta
Nebojsa Jojic
Hannes Gamper
31
0
0
28 Mar 2025
Tune It Up: Music Genre Transfer and Prediction
Fidan Samet
Oguz Bakir
Adnan Fidan
29
0
0
27 Mar 2025
From Deep Learning to LLMs: A survey of AI in Quantitative Investment
Bokai Cao
Saizhuo Wang
Xinyi Lin
Xiaojun Wu
Haohan Zhang
L. Ni
Jian Guo
AIFin
57
1
0
27 Mar 2025
Debiasing Kernel-Based Generative Models
Tian Qin
Wei-Min Huang
55
0
0
26 Mar 2025
An Empirical Study of the Impact of Federated Learning on Machine Learning Model Accuracy
Haotian Yang
Zihan Wang
Benson Chou
Sophie Xu
Hao Wang
Jingxian Wang
Qizhen Zhang
FedML
93
0
0
26 Mar 2025
ReverBERT: A State Space Model for Efficient Text-Driven Speech Style Transfer
Michael Brown
Sofia Martinez
Priya Singh
53
0
0
26 Mar 2025
BADGR: Bundle Adjustment Diffusion Conditioned by GRadients for Wide-Baseline Floor Plan Reconstruction
Yuguang Li
Ivaylo Boyadzhiev
Zixuan Liu
Linda Shapiro
Alex Colburn
DiffM
3DV
70
0
0
25 Mar 2025
SparSamp: Efficient Provably Secure Steganography Based on Sparse Sampling
Yaofei Wang
Gang Pei
Kejiang Chen
Jinyang Ding
Chao Pan
Weilong Pang
Donghui Hu
Wenbo Zhang
51
1
0
25 Mar 2025
WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching
Tianze Luo
Xingchen Miao
Wenbo Duan
DiffM
42
0
0
20 Mar 2025
Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie Dubbing
Zhedong Zhang
Liang-Sheng Li
C. Yan
Chunshan Liu
Anton Van Den Hengel
Yuankai Qi
91
2
0
15 Mar 2025
Exploring Performance-Complexity Trade-Offs in Sound Event Detection
T. Morocutti
Florian Schmid
Jonathan Greif
Francesco Foscarin
Gerhard Widmer
43
0
0
14 Mar 2025
Designing Neural Synthesizers for Low-Latency Interaction
Franco Caspe
Jordie Shier
Mark Sandler
C. Saitis
Andrew Mcpherson
204
0
0
14 Mar 2025
Chat-TS: Enhancing Multi-Modal Reasoning Over Time-Series and Natural Language Data
Paul Quinlan
Qingguo Li
Xiaodan Zhu
AI4TS
LRM
64
0
0
13 Mar 2025
Probabilistic Forecasting via Autoregressive Flow Matching
Ahmed El-Gazzar
Marcel van Gerven
AI4TS
57
0
0
13 Mar 2025
Learning Control of Neural Sound Effects Synthesis from Physically Inspired Models
Yisu Zong
Joshua Reiss
56
0
0
13 Mar 2025
Mamba-VA: A Mamba-based Approach for Continuous Emotion Recognition in Valence-Arousal Space
Yuheng Liang
Zihan Wang
Feng Liu
Mingzhou Liu
Yu Yao
Mamba
65
1
0
13 Mar 2025
Multilevel Generative Samplers for Investigating Critical Phenomena
Ankur Singha
E. Cellini
K. Nicoli
K. Jansen
Stefan Kühn
Shinichi Nakajima
64
1
0
11 Mar 2025
An Exhaustive Evaluation of TTS- and VC-based Data Augmentation for ASR
Sewade Ogun
Vincent Colotte
Emmanuel Vincent
64
0
0
11 Mar 2025
Generalized Interpolating Discrete Diffusion
Dimitri von Rutte
J. Fluri
Yuhui Ding
Antonio Orvieto
Bernhard Scholkopf
Thomas Hofmann
DiffM
67
0
0
06 Mar 2025
An Optimization Algorithm for Multimodal Data Alignment
Wei Zhang
Xueliang Wang
Lan Yu
S. Li
54
0
0
05 Mar 2025
HOP: Heterogeneous Topology-based Multimodal Entanglement for Co-Speech Gesture Generation
Hongye Cheng
Tianyu Wang
Guangsi Shi
Zexing Zhao
Yanwei Fu
SLR
50
1
0
03 Mar 2025
FlowDec: A flow-based full-band general audio codec with high perceptual quality
Simon Welker
Matthew Le
Ricky T. Q. Chen
Wei-Ning Hsu
Timo Gerkmann
Alexander Richard
Yi-Chiao Wu
63
0
0
03 Mar 2025
Self-attention-based Diffusion Model for Time-series Imputation in Partial Blackout Scenarios
Mohammad Rafid Ul Islam
Prasad Tadepalli
Alan Fern
43
0
0
03 Mar 2025