ResearchTrend.AI
arXiv: 1609.03499
WaveNet: A Generative Model for Raw Audio

12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
    DiffM

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,082 papers shown
Personalized Neural Speech Codec
Inseon Jang
Haici Yang
Wootaek Lim
Seung-Wha Beack
Minje Kim
73
1
0
31 Mar 2024
PyTorch Frame: A Modular Framework for Multi-Modal Tabular Learning
Weihua Hu
Yiwen Yuan
Zecheng Zhang
Akihiro Nitta
Kaidi Cao
Vid Kocijan
J. Leskovec
Matthias Fey
LMTD
76
16
0
31 Mar 2024
A Review of Modern Recommender Systems Using Generative Models (Gen-RecSys)
Yashar Deldjoo
Zhankui He
Julian McAuley
Anton Korikov
Scott Sanner
Arnau Ramisa
René Vidal
M. Sathiamoorthy
Atoosa Kasirzadeh
Silvia Milano
VLM
152
61
0
31 Mar 2024
CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers and Consistency Models
Xiang Li
Fan Bu
Ambuj Mehrish
Yingting Li
Jiale Han
Bo Cheng
Soujanya Poria
DiffM
59
6
0
31 Mar 2024
Generative weather for improved crop model simulations
Yuji Saikai
107
1
0
31 Mar 2024
DeepHeteroIoT: Deep Local and Global Learning over Heterogeneous IoT Sensor Data
Muhammad Sakib Khan Inan
Kewen Liao
Haifeng Shen
Prem Prakash Jayaraman
Dimitrios Georgakopoulos
Ming Jian Tang
66
0
0
29 Mar 2024
FastPerson: Enhancing Video Learning through Effective Video Summarization that Preserves Linguistic and Visual Contexts
Kazuki Kawamura
Jun Rekimoto
54
3
0
26 Mar 2024
Correlation of Fréchet Audio Distance With Human Perception of Environmental Audio Is Embedding Dependant
Modan Tailleur
Junwon Lee
Mathieu Lagrange
Keunwoo Choi
Laurie M. Heller
Keisuke Imoto
Yuki Okamoto
100
10
0
26 Mar 2024
Training Generative Adversarial Network-Based Vocoder with Limited Data Using Augmentation-Conditional Discriminator
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
57
0
0
25 Mar 2024
Building speech corpus with diverse voice characteristics for its prompt-based representation
Aya Watanabe
Shinnosuke Takamichi
Yuki Saito
Wataru Nakata
Detai Xin
Hiroshi Saruwatari
65
1
0
20 Mar 2024
Castor: Competing shapelets for fast and accurate time series classification
Isak Samsten
Zed Lee
AI4TS
53
0
0
19 Mar 2024
A Practical Guide to Statistical Distances for Evaluating Generative Models in Science
Sebastian Bischoff
Alana Darcher
Michael Deistler
Richard Gao
Franziska Gerken
...
Auguste Schulz
Zinovia Stefanidi
Shoji Toyota
Linda Ulmer
Julius Vetter
SyDa
63
14
0
19 Mar 2024
CoPlay: Audio-agnostic Cognitive Scaling for Acoustic Sensing
Yin Li
Rajalakshmi Nandakumar
60
0
0
16 Mar 2024
EM-TTS: Efficiently Trained Low-Resource Mongolian Lightweight Text-to-Speech
Ziqi Liang
Haoxiang Shi
Jiawei Wang
Keda Lu
77
0
0
13 Mar 2024
Motifs, Phrases, and Beyond: The Modelling of Structure in Symbolic Music Generation
Keshav Bhandari
Simon Colton
68
9
0
12 Mar 2024
Memory-based Adapters for Online 3D Scene Perception
Xiuwei Xu
Chong Xia
Ziwei Wang
Linqing Zhao
Yueqi Duan
Jie Zhou
Jiwen Lu
3DPC
66
5
0
11 Mar 2024
HAM-TTS: Hierarchical Acoustic Modeling for Token-Based Zero-Shot Text-to-Speech with Model and Data Scaling
Chunhui Wang
Chang Zeng
Bowen Zhang
Ziyang Ma
Yefan Zhu
Zifeng Cai
Jian Zhao
Zhonglin Jiang
Yong Chen
SyDa
64
5
0
09 Mar 2024
RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction
Peng Liu
Dongyang Dai
Zhiyong Wu
144
3
0
08 Mar 2024
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models
Zeqian Ju
Yuancheng Wang
Kai Shen
Xu Tan
Detai Xin
...
Shikun Zhang
Jiang Bian
Lei He
Jinyu Li
Sheng Zhao
DiffM
159
180
0
05 Mar 2024
AIx Speed: Playback Speed Optimization Using Listening Comprehension of Speech Recognition Models
Kazuki Kawamura
Jun Rekimoto
30
0
0
05 Mar 2024
CATS: Enhancing Multivariate Time Series Forecasting by Constructing Auxiliary Time Series as Exogenous Variables
Jiecheng Lu
Xu Han
Yan Sun
Shihao Yang
AI4TS
86
19
0
04 Mar 2024
Day-ahead regional solar power forecasting with hierarchical temporal convolutional neural networks using historical power generation and weather data
M. Perera
J. Hoog
Kasun Bandara
Damith A. Senanayake
Saman K. Halgamuge
AI4TS
48
20
0
04 Mar 2024
VoxGenesis: Unsupervised Discovery of Latent Speaker Manifold for Speech Synthesis
Wei-wei Lin
Chenhang He
Man-Wai Mak
Jiachen Lian
Kong Aik Lee
DiffM
67
0
0
01 Mar 2024
Humanoid Locomotion as Next Token Prediction
Ilija Radosavovic
Bike Zhang
Baifeng Shi
Jathushan Rajasegaran
Sarthak Kamat
Trevor Darrell
Koushil Sreenath
Jitendra Malik
LM&Ro
96
67
0
29 Feb 2024
Beyond Language Models: Byte Models are Digital World Simulators
Shangda Wu
Xu Tan
Zili Wang
Rui Wang
Xiaobing Li
Maosong Sun
65
13
0
29 Feb 2024
Physics Sensor Based Deep Learning Fall Detection System
Zeyuan Qu
Tiange Huang
Yuxin Ji
Yongjun Li
32
0
0
29 Feb 2024
Parallelized Spatiotemporal Binding
Gautam Singh
Yue Wang
Jiawei Yang
Boris Ivanovic
Sungjin Ahn
Marco Pavone
Tong Che
86
1
0
26 Feb 2024
An Automated End-to-End Open-Source Software for High-Quality Text-to-Speech Dataset Generation
Ahmet Gunduz
K. Yuksel
Kareem Darwish
Golara Javadi
Fabio Minazzi
Nicola Sobieski
Sebastien Bratieres
57
0
0
26 Feb 2024
Generative AI in Vision: A Survey on Models, Metrics and Applications
Gaurav Raut
Apoorv Singh
VLM, MedIm
112
7
0
26 Feb 2024
A Survey of Music Generation in the Context of Interaction
Ismael Agchar
Ilja Baumann
Franziska Braun
Paula Andrea Pérez-Toro
Korbinian Riedhammer
Sebastian Trump
Martin Ullrich
MGen
92
0
0
23 Feb 2024
PeriodGrad: Towards Pitch-Controllable Neural Vocoder Based on a Diffusion Probabilistic Model
Yukiya Hono
Kei Hashimoto
Yoshihiko Nankaku
Keiichi Tokuda
DiffM
67
3
0
22 Feb 2024
Generative Probabilistic Time Series Forecasting and Applications in Grid Operations
Xinyi Wang
Lang Tong
Qing Zhao
AI4TS
75
3
0
21 Feb 2024
Structural Knowledge Informed Continual Multivariate Time Series Forecasting
Zijie Pan
Yushan Jiang
Dongjin Song
Sahil Garg
Kashif Rasul
Anderson Schneider
Yuriy Nevmyvaka
CLL, AI4TS
92
3
0
20 Feb 2024
SingVisio: Visual Analytics of Diffusion Model for Singing Voice Conversion
Liumeng Xue
Chaoren Wang
Mingxuan Wang
Xueyao Zhang
Jun Han
Zhizheng Wu
DiffM
71
6
0
20 Feb 2024
Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model
Xiangyu Zhang
Daijiao Liu
Hexin Liu
Qiquan Zhang
Hanyu Meng
Leibny Paola García
Chng Eng Siong
Lina Yao
DiffM
78
4
0
16 Feb 2024
Graph-based Forecasting with Missing Data through Spatiotemporal Downsampling
Ivan Marisca
Cesare Alippi
F. Bianchi
AI4TS
82
11
0
16 Feb 2024
Can Transformers Predict Vibrations?
Fusataka Kuniyoshi
Yoshihide Sawada
52
0
0
16 Feb 2024
BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data
Mateusz Lajszczak
Guillermo Cámbara
Yang Li
Fatih Beyhan
Arent van Korlaar
...
Bartosz Putrycz
Soledad López Gambino
Kayeon Yoo
Elena Sokolova
Thomas Drugman
LM&MA
113
88
0
12 Feb 2024
Forecasting Events in Soccer Matches Through Language
Tiago Mendes-Neves
Luís Meireles
João Mendes-Moreira
82
5
0
09 Feb 2024
GLA-Grad: A Griffin-Lim Extended Waveform Generation Diffusion Model
Haocheng Liu
Teysir Baoueb
Mathieu Fontaine
Jonathan Le Roux
Gaël Richard
63
4
0
09 Feb 2024
Fast Timing-Conditioned Latent Audio Diffusion
Zach Evans
CJ Carr
Josiah Taylor
Scott H. Hawley
Jordi Pons
DiffM
142
117
0
07 Feb 2024
Analysis of Linear Mode Connectivity via Permutation-Based Weight Matching: With Insights into Other Permutation Search Methods
Akira Ito
Masanori Yamada
Atsutoshi Kumagai
MoMe
170
6
0
06 Feb 2024
An Inpainting-Infused Pipeline for Attire and Background Replacement
F. Mahlow
A. F. Zanella
William Alberto Cruz-Castaneda
Marcellus Amadeus
78
0
0
05 Feb 2024
Privacy-Preserving Distributed Learning for Residential Short-Term Load Forecasting
Yizhen Dong
Yingjie Wang
Mariana Gama
Mustafa A. Mustafa
Geert Deconinck
Xiaowei Huang
36
2
0
02 Feb 2024
Bass Accompaniment Generation via Latent Diffusion
Marco Pasini
M. Grachten
Stefan Lattner
103
12
0
02 Feb 2024
STAA-Net: A Sparse and Transferable Adversarial Attack for Speech Emotion Recognition
Yi Chang
Zhao Ren
Zixing Zhang
Xin Jing
Kun Qian
Xi Shao
Bin Hu
Tanja Schultz
Björn W. Schuller
AAML
75
4
0
02 Feb 2024
SymbolicAI: A framework for logic-based approaches combining generative models and solvers
Marius-Constantin Dinu
Claudiu Leoveanu-Condrei
Markus Holzleitner
Werner Zellinger
Sepp Hochreiter
88
11
0
01 Feb 2024
EVA-GAN: Enhanced Various Audio Generation via Scalable Generative Adversarial Networks
Shijia Liao
Shiyi Lan
Arun George Zachariah
43
1
0
31 Jan 2024
Proactive Detection of Voice Cloning with Localized Watermarking
Robin San Roman
Pierre Fernandez
Alexandre Défossez
Teddy Furon
Tuan Tran
Hady ElSahar
146
54
0
30 Jan 2024
Forecasting VIX using Bayesian Deep Learning
Héctor J. Hortúa
Andrés Mora-Valencia
BDL, OOD
96
4
0
30 Jan 2024