ZeroSep: Separate Anything in Audio with Zero Training

29 May 2025
Chao Huang
Yuesheng Ma
J. Huang
Susan Liang
Yunlong Tang
Jing Bi
Wenqiang Liu
Nima Mesgarani
Chenliang Xu
DiffM, VLM

Papers citing "ZeroSep: Separate Anything in Audio with Zero Training"

17 / 17 papers shown
Learning to Highlight Audio by Watching Movies
Chao Huang
Ruohan Gao
J. M. F. Tsang
Jan Kurcius
Cagdas Bilen
Chenliang Xu
Anurag Kumar
Sanjeel Parekh
VGen
90
1
0
17 May 2025
FlowSep: Language-Queried Sound Separation with Rectified Flow Matching
Yi Yuan
Xubo Liu
Haohe Liu
Mark D. Plumbley
Wenwu Wang
124
9
0
10 Jan 2025
Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models
Rohit Gandikota
Joanna Materzyńska
Tingrui Zhou
Antonio Torralba
David Bau
DiffM
105
77
0
20 Nov 2023
Separate Anything You Describe
Xubo Liu
Qiuqiang Kong
Yan Zhao
Haohe Liu
Yi Yuan
Yuzhuo Liu
Rui Xia
Yuxuan Wang
Mark D. Plumbley
Wenwu Wang
VLM
77
52
0
09 Aug 2023
Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa, VLM, MLLM
571
4,925
0
17 Apr 2023
Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance
Heeseung Kim
Sungwon Kim
Sungroh Yoon
DiffM, BDL
108
112
0
23 Nov 2021
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech
Vadim Popov
Ivan Vovk
Vladimir Gogoryan
Tasnima Sadekova
Mikhail Kudinov
DiffM
110
543
0
13 May 2021
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
Jinglin Liu
Chengxi Li
Yi Ren
Feiyang Chen
Zhou Zhao
DiffM
148
269
0
06 May 2021
Localizing Visual Sounds the Hard Way
Honglie Chen
Weidi Xie
Triantafyllos Afouras
Arsha Nagrani
Andrea Vedaldi
Andrew Zisserman
ObjD
88
191
0
06 Apr 2021
Diff-TTS: A Denoising Diffusion Model for Text-to-Speech
Myeonghun Jeong
Hyeongju Kim
Sung Jun Cheon
Byoung Jin Choi
N. Kim
DiffM
65
197
0
03 Apr 2021
Attention is All You Need in Speech Separation
Cem Subakan
Mirco Ravanelli
Samuele Cornell
Mirko Bronzi
Jianyuan Zhong
97
565
0
25 Oct 2020
Denoising Diffusion Implicit Models
Jiaming Song
Chenlin Meng
Stefano Ermon
VLM, DiffM
304
7,500
0
06 Oct 2020
DiffWave: A Versatile Diffusion Model for Audio Synthesis
Zhifeng Kong
Ming-Yu Liu
Jiaji Huang
Kexin Zhao
Bryan Catanzaro
DiffM, BDL
169
1,468
0
21 Sep 2020
Denoising Diffusion Probabilistic Models
Jonathan Ho
Ajay Jain
Pieter Abbeel
DiffM
759
18,408
0
19 Jun 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
904
42,463
0
28 May 2020
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation
Yi Luo
N. Mesgarani
171
1,796
0
20 Sep 2018
The Sound of Pixels
Hang Zhao
Chuang Gan
Andrew Rouditchenko
Carl Vondrick
Josh H. McDermott
Antonio Torralba
VLM
102
537
0
09 Apr 2018