1609.03499
WaveNet: A Generative Model for Raw Audio
12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
Papers citing
"WaveNet: A Generative Model for Raw Audio"
50 / 3,039 papers shown
Title
Conformer-1: Robust ASR via Large-Scale Semisupervised Bootstrapping
Kevin Zhang
Luka Chkhetiani
Francis McCann Ramirez
Yash Khare
Andrea Vanzo
...
Ruben Bousbib
Taufiquzzaman Peyash
Michael Nguyen
Dillon Pulliam
Domenic Donato
40
2
0
10 Apr 2024
Adapting LLaMA Decoder to Vision Transformer
Jiahao Wang
Wenqi Shao
Yonghong Tian
Chengyue Wu
Yong Liu
Taiqiang Wu
Kaipeng Zhang
Songyang Zhang
Kai-xiang Chen
Ping Luo
MLLM
40
4
0
10 Apr 2024
VoiceShop: A Unified Speech-to-Speech Framework for Identity-Preserving Zero-Shot Voice Editing
Philip Anastassiou
Zhenyu Tang
Kainan Peng
Dongya Jia
Jiaxin Li
Ming Tu
Yuping Wang
Yuxuan Wang
Mingbo Ma
42
4
0
10 Apr 2024
TimeCSL: Unsupervised Contrastive Learning of General Shapelets for Explorable Time Series Analysis
Zhiyu Liang
Cheng Liang
Zheng Liang
Hongzhi Wang
Bo Zheng
21
1
0
07 Apr 2024
A Novel Bi-LSTM And Transformer Architecture For Generating Tabla Music
Roopa Mayya
Vivekanand Venkataraman
A. Paduri
Narayana Darapaneni
23
0
0
06 Apr 2024
PromptCodec: High-Fidelity Neural Speech Codec using Disentangled Representation Learning based Adaptive Feature-aware Prompt Encoders
Yu Pan
Lei Ma
Jianjun Zhao
37
4
0
03 Apr 2024
A Novel Audio Representation for Music Genre Identification in MIR
Navin Kamuni
Mayank Jindal
Arpita Soni
Sukender Reddy Mallreddy
Sharath Chandra Macha
VLM
37
6
0
01 Apr 2024
Personalized Neural Speech Codec
Inseon Jang
Haici Yang
Wootaek Lim
Seung-Wha Beack
Minje Kim
58
1
0
31 Mar 2024
PyTorch Frame: A Modular Framework for Multi-Modal Tabular Learning
Weihua Hu
Yiwen Yuan
Zecheng Zhang
Akihiro Nitta
Kaidi Cao
Vid Kocijan
J. Leskovec
Matthias Fey
LMTD
47
11
0
31 Mar 2024
A Review of Modern Recommender Systems Using Generative Models (Gen-RecSys)
Yashar Deldjoo
Zhankui He
Julian McAuley
Anton Korikov
Scott Sanner
Arnau Ramisa
René Vidal
M. Sathiamoorthy
Atoosa Kasirzadeh
Silvia Milano
VLM
33
41
0
31 Mar 2024
CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers and Consistency Models
Xiang Li
Fan Bu
Ambuj Mehrish
Yingting Li
Jiale Han
Bo Cheng
Soujanya Poria
DiffM
40
6
0
31 Mar 2024
Generative weather for improved crop model simulations
Yuji Saikai
27
1
0
31 Mar 2024
DeepHeteroIoT: Deep Local and Global Learning over Heterogeneous IoT Sensor Data
Muhammad Sakib Khan Inan
Kewen Liao
Haifeng Shen
Prem Prakash Jayaraman
Dimitrios Georgakopoulos
Ming Jian Tang
43
0
0
29 Mar 2024
FastPerson: Enhancing Video Learning through Effective Video Summarization that Preserves Linguistic and Visual Contexts
Kazuki Kawamura
Jun Rekimoto
31
3
0
26 Mar 2024
Correlation of Fréchet Audio Distance With Human Perception of Environmental Audio Is Embedding Dependant
Modan Tailleur
Junwon Lee
Mathieu Lagrange
Keunwoo Choi
Laurie M. Heller
Keisuke Imoto
Yuki Okamoto
30
10
0
26 Mar 2024
Training Generative Adversarial Network-Based Vocoder with Limited Data Using Augmentation-Conditional Discriminator
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
29
0
0
25 Mar 2024
Building speech corpus with diverse voice characteristics for its prompt-based representation
Aya Watanabe
Shinnosuke Takamichi
Yuki Saito
Wataru Nakata
Detai Xin
Hiroshi Saruwatari
40
0
0
20 Mar 2024
Castor: Competing shapelets for fast and accurate time series classification
Isak Samsten
Zed Lee
AI4TS
24
0
0
19 Mar 2024
CoPlay: Audio-agnostic Cognitive Scaling for Acoustic Sensing
Yin Li
Rajalakshmi Nandakumar
35
0
0
16 Mar 2024
EM-TTS: Efficiently Trained Low-Resource Mongolian Lightweight Text-to-Speech
Ziqi Liang
Haoxiang Shi
Jiawei Wang
Keda Lu
43
0
0
13 Mar 2024
Motifs, Phrases, and Beyond: The Modelling of Structure in Symbolic Music Generation
Keshav Bhandari
Simon Colton
47
8
0
12 Mar 2024
Memory-based Adapters for Online 3D Scene Perception
Xiuwei Xu
Chong Xia
Ziwei Wang
Linqing Zhao
Yueqi Duan
Jie Zhou
Jiwen Lu
3DPC
35
4
0
11 Mar 2024
HAM-TTS: Hierarchical Acoustic Modeling for Token-Based Zero-Shot Text-to-Speech with Model and Data Scaling
Chunhui Wang
Chang Zeng
Bowen Zhang
Ziyang Ma
Yefan Zhu
Zifeng Cai
Jian Zhao
Zhonglin Jiang
Yong Chen
SyDa
44
5
0
09 Mar 2024
RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction
Peng Liu
Dongyang Dai
Zhiyong Wu
46
2
0
08 Mar 2024
Probing the Robustness of Time-series Forecasting Models with CounterfacTS
Haakon Hanisch Kjaernli
Lluis Mas-Ribas
Aida Ashrafi
Gleb Sizov
Helge Langseth
Odd Erik Gundersen
AI4TS
36
0
0
06 Mar 2024
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models
Zeqian Ju
Yuancheng Wang
Kai Shen
Xu Tan
Detai Xin
...
Shikun Zhang
Jiang Bian
Lei He
Jinyu Li
Sheng Zhao
DiffM
49
147
0
05 Mar 2024
AIx Speed: Playback Speed Optimization Using Listening Comprehension of Speech Recognition Models
Kazuki Kawamura
Jun Rekimoto
20
0
0
05 Mar 2024
CATS: Enhancing Multivariate Time Series Forecasting by Constructing Auxiliary Time Series as Exogenous Variables
Jiecheng Lu
Xu Han
Yan Sun
Shihao Yang
AI4TS
49
16
0
04 Mar 2024
Day-ahead regional solar power forecasting with hierarchical temporal convolutional neural networks using historical power generation and weather data
M. Perera
J. Hoog
Kasun Bandara
Damith A. Senanayake
Saman K. Halgamuge
AI4TS
27
16
0
04 Mar 2024
VoxGenesis: Unsupervised Discovery of Latent Speaker Manifold for Speech Synthesis
Wei-wei Lin
Chenhang He
Man-Wai Mak
Jiachen Lian
Kong Aik Lee
DiffM
46
0
0
01 Mar 2024
Humanoid Locomotion as Next Token Prediction
Ilija Radosavovic
Bike Zhang
Baifeng Shi
Jathushan Rajasegaran
Sarthak Kamat
Trevor Darrell
Koushil Sreenath
Jitendra Malik
LM&Ro
28
60
0
29 Feb 2024
Beyond Language Models: Byte Models are Digital World Simulators
Shangda Wu
Xu Tan
Zili Wang
Rui Wang
Xiaobing Li
Maosong Sun
35
12
0
29 Feb 2024
Physics Sensor Based Deep Learning Fall Detection System
Zeyuan Qu
Tiange Huang
Yuxin Ji
Yongjun Li
21
0
0
29 Feb 2024
Parallelized Spatiotemporal Binding
Gautam Singh
Yue Wang
Jiawei Yang
Boris Ivanovic
Sungjin Ahn
Marco Pavone
Tong Che
48
1
0
26 Feb 2024
An Automated End-to-End Open-Source Software for High-Quality Text-to-Speech Dataset Generation
Ahmet Gunduz
K. Yuksel
Kareem Darwish
Golara Javadi
Fabio Minazzi
Nicola Sobieski
Sebastien Bratieres
25
0
0
26 Feb 2024
Generative AI in Vision: A Survey on Models, Metrics and Applications
Gaurav Raut
Apoorv Singh
VLM
MedIm
43
6
0
26 Feb 2024
A Survey of Music Generation in the Context of Interaction
Ismael Agchar
Ilja Baumann
Franziska Braun
Paula Andrea Pérez-Toro
Korbinian Riedhammer
Sebastian Trump
Martin Ullrich
MGen
42
0
0
23 Feb 2024
PeriodGrad: Towards Pitch-Controllable Neural Vocoder Based on a Diffusion Probabilistic Model
Yukiya Hono
Kei Hashimoto
Yoshihiko Nankaku
Keiichi Tokuda
DiffM
43
3
0
22 Feb 2024
Generative Probabilistic Time Series Forecasting and Applications in Grid Operations
Xinyi Wang
Lang Tong
Qing Zhao
AI4TS
36
3
0
21 Feb 2024
Structural Knowledge Informed Continual Multivariate Time Series Forecasting
Zijie Pan
Yushan Jiang
Dongjin Song
Sahil Garg
Kashif Rasul
Anderson Schneider
Yuriy Nevmyvaka
CLL
AI4TS
42
3
0
20 Feb 2024
SingVisio: Visual Analytics of Diffusion Model for Singing Voice Conversion
Liumeng Xue
Chaoren Wang
Mingxuan Wang
Xueyao Zhang
Jun Han
Zhizheng Wu
DiffM
32
5
0
20 Feb 2024
Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model
Xiangyu Zhang
Daijiao Liu
Hexin Liu
Qiquan Zhang
Hanyu Meng
Leibny Paola García
Chng Eng Siong
Lina Yao
DiffM
25
3
0
16 Feb 2024
Graph-based Forecasting with Missing Data through Spatiotemporal Downsampling
Ivan Marisca
Cesare Alippi
F. Bianchi
AI4TS
40
8
0
16 Feb 2024
Can Transformers Predict Vibrations?
Fusataka Kuniyoshi
Yoshihide Sawada
27
0
0
16 Feb 2024
BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data
Mateusz Lajszczak
Guillermo Cámbara
Yang Li
Fatih Beyhan
Arent van Korlaar
...
Bartosz Putrycz
Soledad López Gambino
Kayeon Yoo
Elena Sokolova
Thomas Drugman
LM&MA
38
75
0
12 Feb 2024
Forecasting Events in Soccer Matches Through Language
Tiago Mendes-Neves
Luís Meireles
João Mendes-Moreira
18
5
0
09 Feb 2024
GLA-Grad: A Griffin-Lim Extended Waveform Generation Diffusion Model
Haocheng Liu
Teysir Baoueb
Mathieu Fontaine
Jonathan Le Roux
Gaël Richard
37
4
0
09 Feb 2024
Fast Timing-Conditioned Latent Audio Diffusion
Zach Evans
CJ Carr
Josiah Taylor
Scott H. Hawley
Jordi Pons
DiffM
82
102
0
07 Feb 2024
Analysis of Linear Mode Connectivity via Permutation-Based Weight Matching: With Insights into Other Permutation Search Methods
Akira Ito
Masanori Yamada
Atsutoshi Kumagai
MoMe
64
5
0
06 Feb 2024
An Inpainting-Infused Pipeline for Attire and Background Replacement
F. Mahlow
A. F. Zanella
William Alberto Cruz-Castaneda
Marcellus Amadeus
41
0
0
05 Feb 2024