ResearchTrend.AI
WaveNet: A Generative Model for Raw Audio

12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,039 papers shown
Conformer-1: Robust ASR via Large-Scale Semisupervised Bootstrapping
Kevin Zhang
Luka Chkhetiani
Francis McCann Ramirez
Yash Khare
Andrea Vanzo
...
Ruben Bousbib
Taufiquzzaman Peyash
Michael Nguyen
Dillon Pulliam
Domenic Donato
40
2
0
10 Apr 2024
Adapting LLaMA Decoder to Vision Transformer
Jiahao Wang
Wenqi Shao
Yonghong Tian
Chengyue Wu
Yong Liu
Taiqiang Wu
Kaipeng Zhang
Songyang Zhang
Kai-xiang Chen
Ping Luo
MLLM
40
4
0
10 Apr 2024
VoiceShop: A Unified Speech-to-Speech Framework for Identity-Preserving Zero-Shot Voice Editing
Philip Anastassiou
Zhenyu Tang
Kainan Peng
Dongya Jia
Jiaxin Li
Ming Tu
Yuping Wang
Yuxuan Wang
Mingbo Ma
42
4
0
10 Apr 2024
TimeCSL: Unsupervised Contrastive Learning of General Shapelets for Explorable Time Series Analysis
Zhiyu Liang
Cheng Liang
Zheng Liang
Hongzhi Wang
Bo Zheng
21
1
0
07 Apr 2024
A Novel Bi-LSTM And Transformer Architecture For Generating Tabla Music
Roopa Mayya
Vivekanand Venkataraman
A. Paduri
Narayana Darapaneni
23
0
0
06 Apr 2024
PromptCodec: High-Fidelity Neural Speech Codec using Disentangled Representation Learning based Adaptive Feature-aware Prompt Encoders
Yu Pan
Lei Ma
Jianjun Zhao
37
4
0
03 Apr 2024
A Novel Audio Representation for Music Genre Identification in MIR
Navin Kamuni
Mayank Jindal
Arpita Soni
Sukender Reddy Mallreddy
Sharath Chandra Macha
VLM
37
6
0
01 Apr 2024
Personalized Neural Speech Codec
Inseon Jang
Haici Yang
Wootaek Lim
Seung-Wha Beack
Minje Kim
58
1
0
31 Mar 2024
PyTorch Frame: A Modular Framework for Multi-Modal Tabular Learning
Weihua Hu
Yiwen Yuan
Zecheng Zhang
Akihiro Nitta
Kaidi Cao
Vid Kocijan
J. Leskovec
Matthias Fey
LMTD
47
11
0
31 Mar 2024
A Review of Modern Recommender Systems Using Generative Models (Gen-RecSys)
Yashar Deldjoo
Zhankui He
Julian McAuley
Anton Korikov
Scott Sanner
Arnau Ramisa
René Vidal
M. Sathiamoorthy
Atoosa Kasirzadeh
Silvia Milano
VLM
33
41
0
31 Mar 2024
CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers and Consistency Models
Xiang Li
Fan Bu
Ambuj Mehrish
Yingting Li
Jiale Han
Bo Cheng
Soujanya Poria
DiffM
40
6
0
31 Mar 2024
Generative weather for improved crop model simulations
Yuji Saikai
27
1
0
31 Mar 2024
DeepHeteroIoT: Deep Local and Global Learning over Heterogeneous IoT Sensor Data
Muhammad Sakib Khan Inan
Kewen Liao
Haifeng Shen
Prem Prakash Jayaraman
Dimitrios Georgakopoulos
Ming Jian Tang
43
0
0
29 Mar 2024
FastPerson: Enhancing Video Learning through Effective Video Summarization that Preserves Linguistic and Visual Contexts
Kazuki Kawamura
Jun Rekimoto
31
3
0
26 Mar 2024
Correlation of Fréchet Audio Distance With Human Perception of Environmental Audio Is Embedding Dependant
Modan Tailleur
Junwon Lee
Mathieu Lagrange
Keunwoo Choi
Laurie M. Heller
Keisuke Imoto
Yuki Okamoto
30
10
0
26 Mar 2024
Training Generative Adversarial Network-Based Vocoder with Limited Data Using Augmentation-Conditional Discriminator
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
29
0
0
25 Mar 2024
Building speech corpus with diverse voice characteristics for its prompt-based representation
Aya Watanabe
Shinnosuke Takamichi
Yuki Saito
Wataru Nakata
Detai Xin
Hiroshi Saruwatari
40
0
0
20 Mar 2024
Castor: Competing shapelets for fast and accurate time series classification
Isak Samsten
Zed Lee
AI4TS
24
0
0
19 Mar 2024
CoPlay: Audio-agnostic Cognitive Scaling for Acoustic Sensing
Yin Li
Rajalakshmi Nandakumar
35
0
0
16 Mar 2024
EM-TTS: Efficiently Trained Low-Resource Mongolian Lightweight Text-to-Speech
Ziqi Liang
Haoxiang Shi
Jiawei Wang
Keda Lu
43
0
0
13 Mar 2024
Motifs, Phrases, and Beyond: The Modelling of Structure in Symbolic Music Generation
Keshav Bhandari
Simon Colton
47
8
0
12 Mar 2024
Memory-based Adapters for Online 3D Scene Perception
Xiuwei Xu
Chong Xia
Ziwei Wang
Linqing Zhao
Yueqi Duan
Jie Zhou
Jiwen Lu
3DPC
35
4
0
11 Mar 2024
HAM-TTS: Hierarchical Acoustic Modeling for Token-Based Zero-Shot Text-to-Speech with Model and Data Scaling
Chunhui Wang
Chang Zeng
Bowen Zhang
Ziyang Ma
Yefan Zhu
Zifeng Cai
Jian Zhao
Zhonglin Jiang
Yong Chen
SyDa
44
5
0
09 Mar 2024
RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction
Peng Liu
Dongyang Dai
Zhiyong Wu
46
2
0
08 Mar 2024
Probing the Robustness of Time-series Forecasting Models with CounterfacTS
Haakon Hanisch Kjaernli
Lluis Mas-Ribas
Aida Ashrafi
Gleb Sizov
Helge Langseth
Odd Erik Gundersen
AI4TS
36
0
0
06 Mar 2024
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models
Zeqian Ju
Yuancheng Wang
Kai Shen
Xu Tan
Detai Xin
...
Shikun Zhang
Jiang Bian
Lei He
Jinyu Li
Sheng Zhao
DiffM
49
147
0
05 Mar 2024
AIx Speed: Playback Speed Optimization Using Listening Comprehension of Speech Recognition Models
Kazuki Kawamura
Jun Rekimoto
20
0
0
05 Mar 2024
CATS: Enhancing Multivariate Time Series Forecasting by Constructing Auxiliary Time Series as Exogenous Variables
Jiecheng Lu
Xu Han
Yan Sun
Shihao Yang
AI4TS
49
16
0
04 Mar 2024
Day-ahead regional solar power forecasting with hierarchical temporal convolutional neural networks using historical power generation and weather data
M. Perera
J. Hoog
Kasun Bandara
Damith A. Senanayake
Saman K. Halgamuge
AI4TS
27
16
0
04 Mar 2024
VoxGenesis: Unsupervised Discovery of Latent Speaker Manifold for Speech Synthesis
Wei-wei Lin
Chenhang He
Man-Wai Mak
Jiachen Lian
Kong Aik Lee
DiffM
46
0
0
01 Mar 2024
Humanoid Locomotion as Next Token Prediction
Ilija Radosavovic
Bike Zhang
Baifeng Shi
Jathushan Rajasegaran
Sarthak Kamat
Trevor Darrell
Koushil Sreenath
Jitendra Malik
LM&Ro
28
60
0
29 Feb 2024
Beyond Language Models: Byte Models are Digital World Simulators
Shangda Wu
Xu Tan
Zili Wang
Rui Wang
Xiaobing Li
Maosong Sun
35
12
0
29 Feb 2024
Physics Sensor Based Deep Learning Fall Detection System
Zeyuan Qu
Tiange Huang
Yuxin Ji
Yongjun Li
21
0
0
29 Feb 2024
Parallelized Spatiotemporal Binding
Gautam Singh
Yue Wang
Jiawei Yang
Boris Ivanovic
Sungjin Ahn
Marco Pavone
Tong Che
48
1
0
26 Feb 2024
An Automated End-to-End Open-Source Software for High-Quality Text-to-Speech Dataset Generation
Ahmet Gunduz
K. Yuksel
Kareem Darwish
Golara Javadi
Fabio Minazzi
Nicola Sobieski
Sebastien Bratieres
25
0
0
26 Feb 2024
Generative AI in Vision: A Survey on Models, Metrics and Applications
Gaurav Raut
Apoorv Singh
VLM
MedIm
43
6
0
26 Feb 2024
A Survey of Music Generation in the Context of Interaction
Ismael Agchar
Ilja Baumann
Franziska Braun
Paula Andrea Pérez-Toro
Korbinian Riedhammer
Sebastian Trump
Martin Ullrich
MGen
42
0
0
23 Feb 2024
PeriodGrad: Towards Pitch-Controllable Neural Vocoder Based on a Diffusion Probabilistic Model
Yukiya Hono
Kei Hashimoto
Yoshihiko Nankaku
Keiichi Tokuda
DiffM
43
3
0
22 Feb 2024
Generative Probabilistic Time Series Forecasting and Applications in Grid Operations
Xinyi Wang
Lang Tong
Qing Zhao
AI4TS
36
3
0
21 Feb 2024
Structural Knowledge Informed Continual Multivariate Time Series Forecasting
Zijie Pan
Yushan Jiang
Dongjin Song
Sahil Garg
Kashif Rasul
Anderson Schneider
Yuriy Nevmyvaka
CLL
AI4TS
42
3
0
20 Feb 2024
SingVisio: Visual Analytics of Diffusion Model for Singing Voice Conversion
Liumeng Xue
Chaoren Wang
Mingxuan Wang
Xueyao Zhang
Jun Han
Zhizheng Wu
DiffM
32
5
0
20 Feb 2024
Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model
Xiangyu Zhang
Daijiao Liu
Hexin Liu
Qiquan Zhang
Hanyu Meng
Leibny Paola García
Chng Eng Siong
Lina Yao
DiffM
25
3
0
16 Feb 2024
Graph-based Forecasting with Missing Data through Spatiotemporal Downsampling
Ivan Marisca
Cesare Alippi
F. Bianchi
AI4TS
40
8
0
16 Feb 2024
Can Transformers Predict Vibrations?
Fusataka Kuniyoshi
Yoshihide Sawada
27
0
0
16 Feb 2024
BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data
Mateusz Lajszczak
Guillermo Cámbara
Yang Li
Fatih Beyhan
Arent van Korlaar
...
Bartosz Putrycz
Soledad López Gambino
Kayeon Yoo
Elena Sokolova
Thomas Drugman
LM&MA
38
75
0
12 Feb 2024
Forecasting Events in Soccer Matches Through Language
Tiago Mendes-Neves
Luís Meireles
João Mendes-Moreira
18
5
0
09 Feb 2024
GLA-Grad: A Griffin-Lim Extended Waveform Generation Diffusion Model
Haocheng Liu
Teysir Baoueb
Mathieu Fontaine
Jonathan Le Roux
Gaël Richard
37
4
0
09 Feb 2024
Fast Timing-Conditioned Latent Audio Diffusion
Zach Evans
CJ Carr
Josiah Taylor
Scott H. Hawley
Jordi Pons
DiffM
82
102
0
07 Feb 2024
Analysis of Linear Mode Connectivity via Permutation-Based Weight Matching: With Insights into Other Permutation Search Methods
Akira Ito
Masanori Yamada
Atsutoshi Kumagai
MoMe
64
5
0
06 Feb 2024
An Inpainting-Infused Pipeline for Attire and Background Replacement
F. Mahlow
A. F. Zanella
William Alberto Cruz-Castaneda
Marcellus Amadeus
41
0
0
05 Feb 2024