SampleRNN: An Unconditional End-to-End Neural Audio Generation Model

22 December 2016

Aaron Courville

Papers citing "SampleRNN: An Unconditional End-to-End Neural Audio Generation Model"

50 / 274 papers shown

Puffin: pitch-synchronous neural waveform generation for fullband speech on modest devices O. Watts Lovisa Wihlborg Cassia Valentini-Botinhao 27 3 0 25 Nov 2022
HyperSound: Generating Implicit Neural Representations of Audio Signals with Hypernetworks Filip Szatkowski Karol J. Piczak Przemysław Spurek Jacek Tabor Tomasz Trzciński 23 12 0 03 Nov 2022
A Survey on Artificial Intelligence for Music Generation: Agents, Domains and Perspectives Carlos Hernandez-Olivan Javier Hernandez-Olivan J. R. Beltrán MGen 40 6 0 25 Oct 2022
Robust One-Shot Singing Voice Conversion Naoya Takahashi M. Singh Yuki Mitsufuji DiffM 25 8 0 20 Oct 2022
Hierarchical Diffusion Models for Singing Voice Neural Vocoder Naoya Takahashi Mayank Kumar Singh Yuki Mitsufuji DiffM 21 16 0 14 Oct 2022
WaveFit: An Iterative and Non-autoregressive Neural Vocoder based on Fixed-Point Iteration Yuma Koizumi Kohei Yatabe Heiga Zen M. Bacchiani DiffM 42 29 0 03 Oct 2022
Pathway to Future Symbiotic Creativity Yi-Ting Guo Qi-fei Liu Jie Chen Wei Xue Jie Fu ... Fernando Rosas Jeffrey Shaw Xing Wu Jiji Zhang Jianliang Xu 31 0 0 18 Aug 2022
Musika! Fast Infinite Waveform Music Generation Marco Pasini Jan Schlüter MGen 12 29 0 18 Aug 2022
Towards Parametric Speech Synthesis Using Gaussian-Markov Model of Spectral Envelope and Wavelet-Based Decomposition of F0 M. S. Al-Radhi Tamás Gábor Csapó Csaba Zainkó Géza Németh 9 1 0 15 Aug 2022
DDX7: Differentiable FM Synthesis of Musical Instrument Sounds Franco Caspe Andrew Mcpherson Mark Sandler 33 30 0 12 Aug 2022
Latent-Domain Predictive Neural Speech Coding Xue Jiang Xiulian Peng Huaying Xue Yuan Zhang Yan Lu 38 17 0 18 Jul 2022
A Cyclical Approach to Synthetic and Natural Speech Mismatch Refinement of Neural Post-filter for Low-cost Text-to-speech System Yi-Chiao Wu Patrick Lumban Tobing Kazuki Yasuhara Noriyuki Matsunaga Yamato Ohtani T. Toda 42 0 0 13 Jul 2022
Cross-Scale Vector Quantization for Scalable Neural Speech Coding Xue Jiang Xiulian Peng Huaying Xue Yuan Zhang Yan Lu MQ 39 9 0 07 Jul 2022
WOLONet: Wave Outlooker for Efficient and High Fidelity Speech Synthesis Yi Wang Yi Si 20 0 0 20 Jun 2022
Adversarial Audio Synthesis with Complex-valued Polynomial Networks Yongtao Wu Grigorios G. Chrysos V. Cevher DiffM 19 4 0 14 Jun 2022
Multi-instrument Music Synthesis with Spectrogram Diffusion Curtis Hawthorne Ian Simon Adam Roberts Neil Zeghidour Josh Gardner Ethan Manilow Jesse Engel DiffM 21 49 0 11 Jun 2022
BigVGAN: A Universal Neural Vocoder with Large-Scale Training Sang-gil Lee Ming-Yu Liu Boris Ginsburg Bryan Catanzaro Sung-Hoon Yoon 22 228 0 09 Jun 2022
Co-creation and ownership for AI radio Skylar Gordon Robert Mahari Manaswi Mishra Ziv Epstein 24 4 0 01 Jun 2022
cMelGAN: An Efficient Conditional Generative Model Based on Mel Spectrograms Tracy Qian Jackson Kaunismaa Tony Chung MGen GAN MedIm 19 5 0 15 May 2022
Synthetic Data -- what, why and how? James Jordon Lukasz Szpruch F. Houssiau M. Bottarelli Giovanni Cherubin Carsten Maple Samuel N. Cohen Adrian Weller 46 109 0 06 May 2022
Brainish: Formalizing A Multimodal Language for Intelligence and Consciousness Paul Pu Liang 30 4 0 14 Apr 2022
SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping Yuma Koizumi Heiga Zen Kohei Yatabe Nanxin Chen M. Bacchiani DiffM 33 45 0 31 Mar 2022
Symbolic music generation conditioned on continuous-valued emotions Serkan Sulun M. Davies Paula Viana MGen 24 25 0 30 Mar 2022
Long Document Summarization with Top-down and Bottom-up Inference Bo Pang Erik Nijkamp Wojciech Kryściński Silvio Savarese Yingbo Zhou Caiming Xiong RALM BDL 24 55 0 15 Mar 2022
Practical cognitive speech compression Reza Lotfidereshgi P. Gournay 32 2 0 08 Mar 2022
HEAR: Holistic Evaluation of Audio Representations Joseph P. Turian Jordie Shier H. Khan Bhiksha Raj Björn W. Schuller ... P. Esling Pranay Manocha Shinji Watanabe Zeyu Jin Yonatan Bisk 39 100 0 06 Mar 2022
NeuralDPS: Neural Deterministic Plus Stochastic Model with Multiband Excitation for Noise-Controllable Waveform Generation Tao Wang Ruibo Fu Jiangyan Yi J. Tao Zhengqi Wen 9 2 0 05 Mar 2022
Neural Speech Synthesis on a Shoestring: Improving the Efficiency of LPCNet J. Valin Umut Isik Paris Smaragdis A. Krishnaswamy 29 4 0 22 Feb 2022
Wavebender GAN: An architecture for phonetically meaningful speech manipulation Gustavo Teodoro Döhler Beck Ulme Wennberg Zofia Malisz G. Henter AI4CE 24 8 0 22 Feb 2022
It's Raw! Audio Generation with State-Space Models Karan Goel Albert Gu Chris Donahue Christopher Ré 16 186 0 20 Feb 2022
General-purpose, long-context autoregressive modeling with Perceiver AR Curtis Hawthorne Andrew Jaegle Cătălina Cangea Sebastian Borgeaud C. Nash ... Hannah R. Sheahan Neil Zeghidour Jean-Baptiste Alayrac João Carreira Jesse Engel 43 65 0 15 Feb 2022
InferGrad: Improving Diffusion Models for Vocoder by Considering Inference in Training Zehua Chen Xu Tan Ke Wang Shifeng Pan Danilo Mandic Lei He Sheng Zhao DiffM 31 28 0 08 Feb 2022
ItôWave: Itô Stochastic Differential Equation Is All You Need For Wave Generation Shoule Wu Ziqiang Shi DiffM 280 9 0 29 Jan 2022
Audio representations for deep learning in sound synthesis: A review Anastasia Natsiou Seán O'Leary AI4TS 24 18 0 07 Jan 2022
Evaluating Deep Music Generation Methods Using Data Augmentation Toby Godwin Georgios Rizos Alice Baird N. A. Futaisi Vincent Brisse Bjoern W. Schuller MGen 12 0 0 31 Dec 2021
Video Background Music Generation with Controllable Music Transformer Shangzhe Di Jiang Sihan Liu Zhaokai Wang Leyan Zhu Zexin He Hongming Liu Shuicheng Yan 22 91 0 16 Nov 2021
Property Inference Attacks Against GANs Junhao Zhou Yufei Chen Chao Shen Yang Zhang AAML MIACV 30 52 0 15 Nov 2021
RAVE: A variational autoencoder for fast and high-quality neural audio synthesis Antoine Caillon P. Esling DRL 21 109 0 09 Nov 2021
Development of a robust cascaded architecture for intelligent robot grasping using limited labelled data Priya Shukla V. Kushwaha G. C. Nandi 27 4 0 06 Nov 2021
WaveFake: A Data Set to Facilitate Audio Deepfake Detection Joel Frank Lea Schönherr DiffM 129 123 0 04 Nov 2021
Chunked Autoregressive GAN for Conditional Waveform Synthesis Max Morrison Rithesh Kumar Kundan Kumar Prem Seetharaman Aaron Courville Yoshua Bengio GAN 41 68 0 19 Oct 2021
Taming Visually Guided Sound Generation Vladimir E. Iashin Esa Rahtu VLM 32 122 0 17 Oct 2021
Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech Pengfei Wu Junjie Pan Chenchang Xu Junhui Zhang Lin Wu Xiang Yin Zejun Ma 8 16 0 08 Oct 2021
On-device neural speech synthesis Sivanand Achanta Albert Antony L. Golipour Jiangchuan Li T. Raitio ... Francesco Rossi Jennifer Shi Jaimin Upadhyay David Winarsky Hepeng Zhang 35 17 0 17 Sep 2021
Network Modulation Synthesis: New Algorithms for Generating Musical Audio Using Autoencoder Networks Jeremy Hyrkas 11 1 0 04 Sep 2021
Self-Attention for Audio Super-Resolution Nathanaël Carraz Rakotonirina SupR 38 23 0 26 Aug 2021
A Benchmarking Initiative for Audio-Domain Music Generation Using the Freesound Loop Dataset Tun-Min Hung Bo-Yu Chen Yen-Tung Yeh Yi-Hsuan Yang 16 12 0 03 Aug 2021
A Survey on Audio Synthesis and Audio-Visual Multimodal Processing Zhaofeng Shi 26 7 0 01 Aug 2021
Codified audio language modeling learns useful representations for music information retrieval Rodrigo Castellon Chris Donahue Percy Liang 84 86 0 12 Jul 2021
Neural Waveshaping Synthesis B. Hayes C. Saitis Gyorgy Fazekas 36 28 0 11 Jul 2021