ResearchTrend.AI
WaveNet: A Generative Model for Raw Audio

12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
    DiffM

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,082 papers shown
Title
SpecDiff-GAN: A Spectrally-Shaped Noise Diffusion GAN for Speech and Music Synthesis
Teysir Baoueb
Haocheng Liu
Mathieu Fontaine
Jonathan Le Roux
Gaël Richard
DiffM
80
5
0
30 Jan 2024
Provably Robust Multi-bit Watermarking for AI-generated Text
Wenjie Qu
Dong Yin
Zixin He
Wei Zou
Tianyang Tao
Jinyuan Jia
Jiaheng Zhang
WaLM
248
2
0
30 Jan 2024
MunTTS: A Text-to-Speech System for Mundari
Varun Gumma
Rishav Hada
Aditya Yadavalli
Pamir Gogoi
Ishani Mondal
Vivek Seshadri
Kalika Bali
59
1
0
28 Jan 2024
A real-time rendering method for high albedo anisotropic materials with multiple scattering
Shun Fang
Xing Feng
Ming Cui
14
0
0
25 Jan 2024
RefreshNet: Learning Multiscale Dynamics through Hierarchical Refreshing
Junaid Farooq
Danish Rafiq
Pantelis R. Vlachas
M. A. Bazaz
59
0
0
24 Jan 2024
Truck Parking Usage Prediction with Decomposed Graph Neural Networks
Rei Tamaru
Yang Cheng
Steven T. Parker
Ernie Perry
Bin Ran
S. Ahn
124
0
0
23 Jan 2024
Detecting Multimedia Generated by Large AI Models: A Survey
Li Lin
Neeraj Gupta
Yue Zhang
Hainan Ren
Chun-Hao Liu
Feng Ding
Xin Eric Wang
Xin Li
Luisa Verdoliva
Shu Hu
223
64
0
22 Jan 2024
Data-driven grapheme-to-phoneme representations for a lexicon-free text-to-speech
Abhinav Garg
Jiyeon Kim
Sushil Khyalia
Chanwoo Kim
Dhananjaya N. Gowda
74
3
0
19 Jan 2024
Ultra-lightweight Neural Differential DSP Vocoder For High Quality Speech Synthesis
Prabhav Agrawal
Thilo Köhler
Zhiping Xiu
Prashant Serai
Qing He
37
1
0
19 Jan 2024
FreGrad: Lightweight and Fast Frequency-aware Diffusion Vocoder
Tan Dat Nguyen
Ji-Hoon Kim
Youngjoon Jang
Jaehun Kim
Joon Son Chung
DiffM
116
6
0
18 Jan 2024
Scalable Pre-training of Large Autoregressive Image Models
Alaaeldin El-Nouby
Michal Klein
Shuangfei Zhai
Miguel Angel Bautista
Alexander Toshev
Vaishaal Shankar
J. Susskind
Armand Joulin
VLM
105
80
0
16 Jan 2024
SpecSTG: A Fast Spectral Diffusion Framework for Probabilistic Spatio-Temporal Traffic Forecasting
Lequan Lin
Dai Shi
Andi Han
Junbin Gao
DiffM
AI4TS
76
5
0
16 Jan 2024
DIFFRENT: A Diffusion Model for Recording Environment Transfer of Speech
Jae-Yeol Im
Juhan Nam
DiffM
56
3
0
16 Jan 2024
CarSpeedNet: A Deep Neural Network-based Car Speed Estimation from Smartphone Accelerometer
Barak Or
62
2
0
15 Jan 2024
ELLA-V: Stable Neural Codec Language Modeling with Alignment-guided Sequence Reordering
Ya-Zhen Song
Zhuo Chen
Xiaofei Wang
Ziyang Ma
Xie Chen
AuLLM
113
42
0
14 Jan 2024
BioDiffusion: A Versatile Diffusion Model for Biomedical Signal Synthesis
Xiaomin Li
Mykhailo Sakevych
G. Atkinson
V. Metsis
MedIm
73
9
0
12 Jan 2024
HyperGANStrument: Instrument Sound Synthesis and Editing with Pitch-Invariant Hypernetworks
Zhe Zhang
Taketo Akama
GAN
63
1
0
09 Jan 2024
A Primer on Temporal Graph Learning
Aniq Ur Rahman
J. Coon
AI4CE
67
1
0
08 Jan 2024
Reversing the Irreversible: A Survey on Inverse Biometrics
M. Gomez-Barrero
Javier Galbally
82
69
0
05 Jan 2024
The Rise of Diffusion Models in Time-Series Forecasting
Caspar Meijer
Lydia Y. Chen
DiffM
AI4TS
91
10
0
05 Jan 2024
Bring Metric Functions into Diffusion Models
Jie An
Zhengyuan Yang
Jianfeng Wang
Linjie Li
Zicheng Liu
Lijuan Wang
Jiebo Luo
DiffM
72
6
0
04 Jan 2024
Incremental FastPitch: Chunk-based High Quality Text to Speech
Muyang Du
Chuan Liu
Junjie Lai
53
0
0
03 Jan 2024
S$^{2}$-DMs: Skip-Step Diffusion Models
Yixuan Wang
Shuangyin Li
82
0
0
03 Jan 2024
Deep autoregressive modeling for land use land cover
C. Krapu
Mark Borsuk
Ryan Calder
BDL
51
1
0
02 Jan 2024
Efficient Parallel Audio Generation using Group Masked Language Modeling
Myeonghun Jeong
Minchan Kim
Joun Yeop Lee
Nam Soo Kim
54
6
0
02 Jan 2024
Brain-Conditional Multimodal Synthesis: A Survey and Taxonomy
Weijian Mai
Jian Zhang
Pengfei Fang
Zhijun Zhang
177
11
0
31 Dec 2023
AI and Tempo Estimation: A Review
Geoff Luck
24
0
0
30 Dec 2023
TransFace: Unit-Based Audio-Visual Speech Synthesizer for Talking Head Translation
Xize Cheng
Rongjie Huang
Linjun Li
Tao Jin
Zehan Wang
Aoxiong Yin
Minglei Li
Xinyu Duan
Changpeng Yang
Zhou Zhao
81
6
0
23 Dec 2023
The Effects of Signal-to-Noise Ratio on Generative Adversarial Networks Applied to Marine Bioacoustic Data
Georgia Atkinson
Nick Wright
A. Mcgough
Per Berggren
GAN
55
0
0
22 Dec 2023
ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations
Cheng Gong
Xin Wang
Erica Cooper
Dan Wells
Longbiao Wang
Jianwu Dang
Korin Richmond
Junichi Yamagishi
116
25
0
22 Dec 2023
Adapt & Align: Continual Learning with Generative Models Latent Space Alignment
Kamil Deja
Bartosz Cywiński
Jan Rybarczyk
Tomasz Trzciński
CLL
DRL
53
0
0
21 Dec 2023
BrainTalker: Low-Resource Brain-to-Speech Synthesis with Transfer Learning using Wav2Vec 2.0
Miseul Kim
Zhenyu Piao
Jihyun Lee
Hong-Goo Kang
129
3
0
21 Dec 2023
AutoXPCR: Automated Multi-Objective Model Selection for Time Series Forecasting
Raphael Fischer
Amal Saadallah
67
0
0
20 Dec 2023
Time-Transformer: Integrating Local and Global Features for Better Time Series Generation
Yuansan Liu
S. Wijewickrema
Ang Li
C. Bester
Stephen O'Leary
James Bailey
ViT
AI4TS
120
9
0
18 Dec 2023
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
Xueyao Zhang
Liumeng Xue
Yicheng Gu
Yuancheng Wang
Haorui He
...
Mingxuan Wang
Jun Han
Kai Chen
Haizhou Li
Zhizheng Wu
91
35
0
15 Dec 2023
StemGen: A music generation model that listens
Julian Parker
Janne Spijkervet
Katerina Kosta
Furkan Yesiler
Boris Kuznetsov
Ju-Chiang Wang
Matt Avent
Jitong Chen
Duc Le
MGen
103
31
0
14 Dec 2023
NAC-TCN: Temporal Convolutional Networks with Causal Dilated Neighborhood Attention for Emotion Understanding
Alexander Mehta
William Yang
ViT
90
2
0
12 Dec 2023
SCCA: Shifted Cross Chunk Attention for long contextual semantic expansion
Yuxiang Guo
110
1
0
12 Dec 2023
Computational Copyright: Towards A Royalty Model for Music Generative AI
Junwei Deng
Shiyuan Zhang
Jiaqi Ma
86
4
0
11 Dec 2023
CLeaRForecast: Contrastive Learning of High-Purity Representations for Time Series Forecasting
Jiaxin Gao
Yuxiao Hu
Qinglong Cao
Siqi Dai
Yuntian Chen
AI4TS
34
0
0
10 Dec 2023
A Cascaded Neural Network System For Rating Student Performance In Surgical Knot Tying Simulation
Yunzhe Xue
Olanrewaju A Eletta
J. Ady
Nell M. Patel
Advaith Bongu
Usman Roshan
133
2
0
09 Dec 2023
TCNCA: Temporal Convolution Network with Chunked Attention for Scalable Sequence Processing
Aleksandar Terzić
Michael Hersche
G. Karunaratne
Zixiao Huang
Abu Sebastian
Abbas Rahimi
AI4TS
62
1
0
09 Dec 2023
Trajeglish: Traffic Modeling as Next-Token Prediction
Jonah Philion
Xue Bin Peng
Sanja Fidler
39
25
0
07 Dec 2023
Detecting Voice Cloning Attacks via Timbre Watermarking
Chang-rui Liu
Jie Zhang
Tianwei Zhang
Xi Yang
Weiming Zhang
Neng H. Yu
105
38
0
06 Dec 2023
Rapid Speaker Adaptation in Low Resource Text to Speech Systems using Synthetic Data and Transfer learning
Raviraj Joshi
Nikesh Garera
73
2
0
02 Dec 2023
Context Retrieval via Normalized Contextual Latent Interaction for Conversational Agent
Junfeng Liu
Zhuocheng Mei
Kewen Peng
R. Vatsavai
71
1
0
01 Dec 2023
Spatial-Temporal-Decoupled Masked Pre-training for Spatiotemporal Forecasting
Haotian Gao
Renhe Jiang
Zheng Dong
Jinliang Deng
Yuxin Ma
Xuan Song
AI4TS
101
21
0
01 Dec 2023
DREAM: Diffusion Rectification and Estimation-Adaptive Models
Jinxin Zhou
Tianyu Ding
Tianyi Chen
Jiachen Jiang
Ilya Zharkov
Zhihui Zhu
Luming Liang
90
7
0
30 Nov 2023
DiffusionTalker: Personalization and Acceleration for Speech-Driven 3D Face Diffuser
Peng Chen
Xiaobao Wei
Ming Lu
Yitong Zhu
Nai-Ming Yao
Xingyu Xiao
Hui Chen
81
13
0
28 Nov 2023
Stability-Informed Initialization of Neural Ordinary Differential Equations
Theodor Westny
Arman Mohammadi
Daniel Jung
Erik Frisk
122
1
0
27 Nov 2023