ResearchTrend.AI
SampleRNN: An Unconditional End-to-End Neural Audio Generation Model

22 December 2016
Soroush Mehri
Kundan Kumar
Ishaan Gulrajani
Rithesh Kumar
Shubham Jain
Jose M. R. Sotelo
Aaron Courville
Yoshua Bengio

Papers citing "SampleRNN: An Unconditional End-to-End Neural Audio Generation Model"

50 / 274 papers shown
Puffin: pitch-synchronous neural waveform generation for fullband speech on modest devices
O. Watts
Lovisa Wihlborg
Cassia Valentini-Botinhao
27
3
0
25 Nov 2022
HyperSound: Generating Implicit Neural Representations of Audio Signals with Hypernetworks
Filip Szatkowski
Karol J. Piczak
Przemysław Spurek
Jacek Tabor
Tomasz Trzciński
23
12
0
03 Nov 2022
A Survey on Artificial Intelligence for Music Generation: Agents, Domains and Perspectives
Carlos Hernandez-Olivan
Javier Hernandez-Olivan
J. R. Beltrán
MGen
40
6
0
25 Oct 2022
Robust One-Shot Singing Voice Conversion
Naoya Takahashi
M. Singh
Yuki Mitsufuji
DiffM
25
8
0
20 Oct 2022
Hierarchical Diffusion Models for Singing Voice Neural Vocoder
Naoya Takahashi
Mayank Kumar Singh
Yuki Mitsufuji
DiffM
21
16
0
14 Oct 2022
WaveFit: An Iterative and Non-autoregressive Neural Vocoder based on Fixed-Point Iteration
Yuma Koizumi
Kohei Yatabe
Heiga Zen
M. Bacchiani
DiffM
42
29
0
03 Oct 2022
Pathway to Future Symbiotic Creativity
Yi-Ting Guo
Qi-fei Liu
Jie Chen
Wei Xue
Jie Fu
...
Fernando Rosas
Jeffrey Shaw
Xing Wu
Jiji Zhang
Jianliang Xu
31
0
0
18 Aug 2022
Musika! Fast Infinite Waveform Music Generation
Marco Pasini
Jan Schlüter
MGen
12
29
0
18 Aug 2022
Towards Parametric Speech Synthesis Using Gaussian-Markov Model of Spectral Envelope and Wavelet-Based Decomposition of F0
M. S. Al-Radhi
Tamás Gábor Csapó
Csaba Zainkó
Géza Németh
9
1
0
15 Aug 2022
DDX7: Differentiable FM Synthesis of Musical Instrument Sounds
Franco Caspe
Andrew Mcpherson
Mark Sandler
33
30
0
12 Aug 2022
Latent-Domain Predictive Neural Speech Coding
Xue Jiang
Xiulian Peng
Huaying Xue
Yuan Zhang
Yan Lu
38
17
0
18 Jul 2022
A Cyclical Approach to Synthetic and Natural Speech Mismatch Refinement of Neural Post-filter for Low-cost Text-to-speech System
Yi-Chiao Wu
Patrick Lumban Tobing
Kazuki Yasuhara
Noriyuki Matsunaga
Yamato Ohtani
T. Toda
42
0
0
13 Jul 2022
Cross-Scale Vector Quantization for Scalable Neural Speech Coding
Xue Jiang
Xiulian Peng
Huaying Xue
Yuan Zhang
Yan Lu
MQ
39
9
0
07 Jul 2022
WOLONet: Wave Outlooker for Efficient and High Fidelity Speech Synthesis
Yi Wang
Yi Si
20
0
0
20 Jun 2022
Adversarial Audio Synthesis with Complex-valued Polynomial Networks
Yongtao Wu
Grigorios G. Chrysos
V. Cevher
DiffM
19
4
0
14 Jun 2022
Multi-instrument Music Synthesis with Spectrogram Diffusion
Curtis Hawthorne
Ian Simon
Adam Roberts
Neil Zeghidour
Josh Gardner
Ethan Manilow
Jesse Engel
DiffM
21
49
0
11 Jun 2022
BigVGAN: A Universal Neural Vocoder with Large-Scale Training
Sang-gil Lee
Ming-Yu Liu
Boris Ginsburg
Bryan Catanzaro
Sung-Hoon Yoon
22
228
0
09 Jun 2022
Co-creation and ownership for AI radio
Skylar Gordon
Robert Mahari
Manaswi Mishra
Ziv Epstein
24
4
0
01 Jun 2022
cMelGAN: An Efficient Conditional Generative Model Based on Mel Spectrograms
Tracy Qian
Jackson Kaunismaa
Tony Chung
MGen
GAN
MedIm
19
5
0
15 May 2022
Synthetic Data -- what, why and how?
James Jordon
Lukasz Szpruch
F. Houssiau
M. Bottarelli
Giovanni Cherubin
Carsten Maple
Samuel N. Cohen
Adrian Weller
46
109
0
06 May 2022
Brainish: Formalizing A Multimodal Language for Intelligence and Consciousness
Paul Pu Liang
30
4
0
14 Apr 2022
SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping
Yuma Koizumi
Heiga Zen
Kohei Yatabe
Nanxin Chen
M. Bacchiani
DiffM
33
45
0
31 Mar 2022
Symbolic music generation conditioned on continuous-valued emotions
Serkan Sulun
M. Davies
Paula Viana
MGen
24
25
0
30 Mar 2022
Long Document Summarization with Top-down and Bottom-up Inference
Bo Pang
Erik Nijkamp
Wojciech Kryściński
Silvio Savarese
Yingbo Zhou
Caiming Xiong
RALM
BDL
24
55
0
15 Mar 2022
Practical cognitive speech compression
Reza Lotfidereshgi
P. Gournay
32
2
0
08 Mar 2022
HEAR: Holistic Evaluation of Audio Representations
Joseph P. Turian
Jordie Shier
H. Khan
Bhiksha Raj
Björn W. Schuller
...
P. Esling
Pranay Manocha
Shinji Watanabe
Zeyu Jin
Yonatan Bisk
39
100
0
06 Mar 2022
NeuralDPS: Neural Deterministic Plus Stochastic Model with Multiband Excitation for Noise-Controllable Waveform Generation
Tao Wang
Ruibo Fu
Jiangyan Yi
J. Tao
Zhengqi Wen
9
2
0
05 Mar 2022
Neural Speech Synthesis on a Shoestring: Improving the Efficiency of LPCNet
J. Valin
Umut Isik
Paris Smaragdis
A. Krishnaswamy
29
4
0
22 Feb 2022
Wavebender GAN: An architecture for phonetically meaningful speech manipulation
Gustavo Teodoro Döhler Beck
Ulme Wennberg
Zofia Malisz
G. Henter
AI4CE
24
8
0
22 Feb 2022
It's Raw! Audio Generation with State-Space Models
Karan Goel
Albert Gu
Chris Donahue
Christopher Ré
16
186
0
20 Feb 2022
General-purpose, long-context autoregressive modeling with Perceiver AR
Curtis Hawthorne
Andrew Jaegle
Cătălina Cangea
Sebastian Borgeaud
C. Nash
...
Hannah R. Sheahan
Neil Zeghidour
Jean-Baptiste Alayrac
João Carreira
Jesse Engel
43
65
0
15 Feb 2022
InferGrad: Improving Diffusion Models for Vocoder by Considering Inference in Training
Zehua Chen
Xu Tan
Ke Wang
Shifeng Pan
Danilo Mandic
Lei He
Sheng Zhao
DiffM
31
28
0
08 Feb 2022
ItôWave: Itô Stochastic Differential Equation Is All You Need For Wave Generation
Shoule Wu
Ziqiang Shi
DiffM
280
9
0
29 Jan 2022
Audio representations for deep learning in sound synthesis: A review
Anastasia Natsiou
Seán O'Leary
AI4TS
24
18
0
07 Jan 2022
Evaluating Deep Music Generation Methods Using Data Augmentation
Toby Godwin
Georgios Rizos
Alice Baird
N. A. Futaisi
Vincent Brisse
Bjoern W. Schuller
MGen
12
0
0
31 Dec 2021
Video Background Music Generation with Controllable Music Transformer
Shangzhe Di
Jiang
Sihan Liu
Zhaokai Wang
Leyan Zhu
Zexin He
Hongming Liu
Shuicheng Yan
22
91
0
16 Nov 2021
Property Inference Attacks Against GANs
Junhao Zhou
Yufei Chen
Chao Shen
Yang Zhang
AAML
MIACV
30
52
0
15 Nov 2021
RAVE: A variational autoencoder for fast and high-quality neural audio synthesis
Antoine Caillon
P. Esling
DRL
21
109
0
09 Nov 2021
Development of a robust cascaded architecture for intelligent robot grasping using limited labelled data
Priya Shukla
V. Kushwaha
G. C. Nandi
27
4
0
06 Nov 2021
WaveFake: A Data Set to Facilitate Audio Deepfake Detection
Joel Frank
Lea Schönherr
DiffM
129
123
0
04 Nov 2021
Chunked Autoregressive GAN for Conditional Waveform Synthesis
Max Morrison
Rithesh Kumar
Kundan Kumar
Prem Seetharaman
Aaron Courville
Yoshua Bengio
GAN
41
68
0
19 Oct 2021
Taming Visually Guided Sound Generation
Vladimir E. Iashin
Esa Rahtu
VLM
32
122
0
17 Oct 2021
Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech
Pengfei Wu
Junjie Pan
Chenchang Xu
Junhui Zhang
Lin Wu
Xiang Yin
Zejun Ma
8
16
0
08 Oct 2021
On-device neural speech synthesis
Sivanand Achanta
Albert Antony
L. Golipour
Jiangchuan Li
T. Raitio
...
Francesco Rossi
Jennifer Shi
Jaimin Upadhyay
David Winarsky
Hepeng Zhang
35
17
0
17 Sep 2021
Network Modulation Synthesis: New Algorithms for Generating Musical Audio Using Autoencoder Networks
Jeremy Hyrkas
11
1
0
04 Sep 2021
Self-Attention for Audio Super-Resolution
Nathanaël Carraz Rakotonirina
SupR
38
23
0
26 Aug 2021
A Benchmarking Initiative for Audio-Domain Music Generation Using the Freesound Loop Dataset
Tun-Min Hung
Bo-Yu Chen
Yen-Tung Yeh
Yi-Hsuan Yang
16
12
0
03 Aug 2021
A Survey on Audio Synthesis and Audio-Visual Multimodal Processing
Zhaofeng Shi
26
7
0
01 Aug 2021
Codified audio language modeling learns useful representations for music information retrieval
Rodrigo Castellon
Chris Donahue
Percy Liang
84
86
0
12 Jul 2021
Neural Waveshaping Synthesis
B. Hayes
C. Saitis
Gyorgy Fazekas
36
28
0
11 Jul 2021