Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2209.15352
Cited By
AudioGen: Textually Guided Audio Generation
30 September 2022
Felix Kreuk
Gabriel Synnaeve
Adam Polyak
Uriel Singer
Alexandre Défossez
Jade Copet
Devi Parikh
Yaniv Taigman
Yossi Adi
DiffM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"AudioGen: Textually Guided Audio Generation"
28 / 78 papers shown
Title
Real-time Speech Frequency Bandwidth Extension
Yunpeng Li
Marco Tagliasacchi
Oleg Rybakov
Victor Ungureanu
Dominik Roblek
42
49
0
21 Oct 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
177
1,931
0
12 Oct 2020
FSD50K: An Open Dataset of Human-Labeled Sound Events
Eduardo Fonseca
Xavier Favory
Jordi Pons
F. Font
Xavier Serra
71
458
0
01 Oct 2020
A Spectral Energy Distance for Parallel Speech Synthesis
A. Gritsenko
Tim Salimans
Rianne van den Berg
Jasper Snoek
Nal Kalchbrenner
42
70
0
03 Aug 2020
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
540
2,081
0
28 Jul 2020
Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation
Felix Kreuk
Joseph Keshet
Yossi Adi
SSL
52
79
0
27 Jul 2020
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
Alexei Baevski
Henry Zhou
Abdel-rahman Mohamed
Michael Auli
SSL
282
5,790
0
20 Jun 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren
Chenxu Hu
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
105
1,396
0
08 Jun 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
743
41,932
0
28 May 2020
VGGSound: A Large-scale Audio-Visual Dataset
Honglie Chen
Weidi Xie
Andrea Vedaldi
Andrew Zisserman
89
576
0
29 Apr 2020
ViSQOL v3: An Open Source Production Ready Objective Speech and Audio Metric
Michael Chinen
Felicia S. C. Lim
Jan Skoglund
Nikita Gureev
F. O'Gorman
Andrew Hines
61
142
0
20 Apr 2020
Longformer: The Long-Document Transformer
Iz Beltagy
Matthew E. Peters
Arman Cohan
RALM
VLM
162
4,062
0
10 Apr 2020
Compressive Transformers for Long-Range Sequence Modelling
Jack W. Rae
Anna Potapenko
Siddhant M. Jayakumar
Timothy Lillicrap
RALM
VLM
KELM
64
647
0
13 Nov 2019
Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim
56
818
0
25 Oct 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
419
20,127
0
23 Oct 2019
Clotho: An Audio Captioning Dataset
Konstantinos Drossos
Samuel Lipping
Tuomas Virtanen
87
389
0
21 Oct 2019
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
Kundan Kumar
Rithesh Kumar
T. Boissière
L. Gestin
Wei Zhen Teoh
Jose M. R. Sotelo
A. D. Brébisson
Yoshua Bengio
Aaron Courville
GAN
159
953
0
08 Oct 2019
The Curious Case of Neural Text Degeneration
Ari Holtzman
Jan Buys
Li Du
Maxwell Forbes
Yejin Choi
182
3,175
0
22 Apr 2019
wav2vec: Unsupervised Pre-training for Speech Recognition
Steffen Schneider
Alexei Baevski
R. Collobert
Michael Auli
SSL
71
418
0
11 Apr 2019
Semantic Image Synthesis with Spatially-Adaptive Normalization
Taesung Park
Ming-Yuan Liu
Ting-Chun Wang
Jun-Yan Zhu
156
2,685
0
18 Mar 2019
A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras
S. Laine
Timo Aila
563
10,555
0
12 Dec 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.7K
94,770
0
11 Oct 2018
Representation Learning with Contrastive Predictive Coding
Aaron van den Oord
Yazhe Li
Oriol Vinyals
DRL
SSL
312
10,284
0
10 Jul 2018
Neural Discrete Representation Learning
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
BDL
SSL
OCL
226
5,008
0
02 Nov 2017
Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data
Wei-Ning Hsu
Yu Zhang
James R. Glass
BDL
SSL
78
352
0
22 Sep 2017
WaveNet: A Generative Model for Raw Audio
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
401
7,391
0
12 Sep 2016
Layer Normalization
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
410
10,482
0
21 Jul 2016
Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)
Djork-Arné Clevert
Thomas Unterthiner
Sepp Hochreiter
298
5,521
0
23 Nov 2015
Previous
1
2