2304.00830
Cited By
AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models
3 April 2023
Yuancheng Wang
Zeqian Ju
Xuejiao Tan
Lei He
Zhizheng Wu
Jiang Bian
Sheng Zhao
DiffM
Papers citing "AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models"
32 / 32 papers shown
Title
COCOLA: Coherence-Oriented Contrastive Learning of Musical Audio Representations
Ruben Ciranni
Emilian Postolache
Giorgio Mariani
Michele Mancusi
Giorgio Fabbro
Emanuele Rodolà
Luca Cosmo
231
8
0
10 Jan 2025
FlowSep: Language-Queried Sound Separation with Rectified Flow Matching
Yi Yuan
Xubo Liu
Haohe Liu
Mark D. Plumbley
Wenwu Wang
124
9
0
10 Jan 2025
UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models
Yuning Han
Bingyin Zhao
Rui Chu
Feng Luo
Biplab Sikdar
Yingjie Lao
DiffM
AAML
188
1
0
16 Dec 2024
T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models
Chong Mou
Xintao Wang
Liangbin Xie
Yanze Wu
Shuai Liu
Zhongang Qi
Ying Shan
Xiaohu Qie
DiffM
138
1,034
0
16 Feb 2023
Adding Conditional Control to Text-to-Image Diffusion Models
Lvmin Zhang
Anyi Rao
Maneesh Agrawala
AI4CE
184
4,180
1
10 Feb 2023
MusicLM: Generating Music From Text
A. Agostinelli
Timo I. Denk
Zalan Borsos
Jesse Engel
Mauro Verzetti
...
Adam Roberts
Marco Tagliasacchi
Matthew Sharifi
Neil Zeghidour
Christian Frank
MGen
147
450
0
26 Jan 2023
InstructPix2Pix: Learning to Follow Image Editing Instructions
Tim Brooks
Aleksander Holynski
Alexei A. Efros
DiffM
213
1,835
0
17 Nov 2022
AudioGen: Textually Guided Audio Generation
Felix Kreuk
Gabriel Synnaeve
Adam Polyak
Uriel Singer
Alexandre Défossez
Jade Copet
Devi Parikh
Yaniv Taigman
Yossi Adi
DiffM
104
309
0
30 Sep 2022
CMGAN: Conformer-Based Metric-GAN for Monaural Speech Enhancement
Sherif Abdulatif
Ru Cao
Bin Yang
81
74
0
22 Sep 2022
Prompt-to-Prompt Image Editing with Cross Attention Control
Amir Hertz
Ron Mokady
J. Tenenbaum
Kfir Aberman
Yael Pritch
Daniel Cohen-Or
DiffM
206
1,790
0
02 Aug 2022
Classifier-Free Diffusion Guidance
Jonathan Ho
Tim Salimans
FaML
196
3,971
0
26 Jul 2022
Diffsound: Discrete Diffusion Model for Text-to-sound Generation
Dongchao Yang
Jianwei Yu
Helin Wang
Wen Wang
Chao Weng
Yuexian Zou
Dong Yu
DiffM
92
306
0
20 Jul 2022
SpeechPainter: Text-conditioned Speech Inpainting
Zalan Borsos
Matthew Sharifi
Marco Tagliasacchi
93
28
0
15 Feb 2022
RePaint: Inpainting using Denoising Diffusion Probabilistic Models
Andreas Lugmayr
Martin Danelljan
Andrés Romero
Feng Yu
Radu Timofte
Luc Van Gool
DiffM
355
1,425
0
24 Jan 2022
Blended Diffusion for Text-driven Editing of Natural Images
Omri Avrahami
Dani Lischinski
Ohad Fried
DiffM
135
954
0
29 Nov 2021
Palette: Image-to-Image Diffusion Models
Chitwan Saharia
William Chan
Huiwen Chang
Chris A. Lee
Jonathan Ho
Tim Salimans
David J. Fleet
Mohammad Norouzi
DiffM
VLM
486
1,649
0
10 Nov 2021
Diffusion-Based Voice Conversion with Fast Maximum Likelihood Sampling Scheme
Vadim Popov
Ivan Vovk
Vladimir Gogoryan
Tasnima Sadekova
Mikhail Kudinov
Jiansheng Wei
DiffM
BDL
135
136
0
28 Sep 2021
ILVR: Conditioning Method for Denoising Diffusion Probabilistic Models
Jooyoung Choi
Sungwon Kim
Yonghyun Jeong
Youngjune Gwon
Sungroh Yoon
DiffM
157
724
0
06 Aug 2021
DiffSVC: A Diffusion Probabilistic Model for Singing Voice Conversion
Songxiang Liu
Yuewen Cao
Dan Su
Helen Meng
DiffM
69
59
0
28 May 2021
Score-Based Generative Modeling through Stochastic Differential Equations
Yang Song
Jascha Narain Sohl-Dickstein
Diederik P. Kingma
Abhishek Kumar
Stefano Ermon
Ben Poole
DiffM
SyDa
370
6,586
0
26 Nov 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
179
1,952
0
12 Oct 2020
FSD50K: An Open Dataset of Human-Labeled Sound Events
Eduardo Fonseca
Xavier Favory
Jordi Pons
F. Font
Xavier Serra
107
467
0
01 Oct 2020
Unsupervised Cross-Domain Singing Voice Conversion
Adam Polyak
Lior Wolf
Yossi Adi
Yaniv Taigman
49
44
0
06 Aug 2020
Text-to-Text Pre-Training for Data-to-Text Tasks
Mihir Kale
Abhinav Rastogi
AI4CE
68
202
0
21 May 2020
Audio inpainting with generative adversarial network
P. Ebner
Amr Eltelt
GAN
56
24
0
13 Mar 2020
Audio Inpainting: Revisited and Reweighted
Ondřej Mokrý
P. Rajmic
59
21
0
08 Jan 2020
PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition
Qiuqiang Kong
Yin Cao
Turab Iqbal
Yuxuan Wang
Wenwu Wang
Mark D. Plumbley
VLM
SSL
199
1,084
0
21 Dec 2019
WaveFlow: A Compact Flow-based Model for Raw Audio
Wei Ping
Kainan Peng
Kexin Zhao
Z. Song
87
117
0
03 Dec 2019
Vision-Infused Deep Audio Inpainting
Hang Zhou
Ziwei Liu
Lingfeng Guo
Ping Luo
Dahua Lin
142
88
0
24 Oct 2019
Play as You Like: Timbre-enhanced Multi-modal Music Style Transfer
Chien-Yu Lu
Min-Xin Xue
Chia-Che Chang
Che-Rung Lee
Li Su
83
34
0
28 Nov 2018
Efficient Neural Audio Synthesis
Nal Kalchbrenner
Erich Elsen
Karen Simonyan
Seb Noury
Norman Casagrande
Edward Lockhart
Florian Stimberg
Aaron van den Oord
Sander Dieleman
Koray Kavukcuoglu
94
870
0
23 Feb 2018
WaveNet: A Generative Model for Raw Audio
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
406
7,421
0
12 Sep 2016