2212.07065
Cited By
CLIPSep: Learning Text-queried Sound Separation with Noisy Unlabeled Videos
14 December 2022
Hao-Wen Dong
Naoya Takahashi
Yuki Mitsufuji
Julian McAuley
Taylor Berg-Kirkpatrick
VLM
CLIP
Papers citing
"CLIPSep: Learning Text-queried Sound Separation with Noisy Unlabeled Videos"
21 / 21 papers shown
Hearing and Seeing Through CLIP: A Framework for Self-Supervised Sound Source Localization
Sooyoung Park
Arda Senocak
Joon Son Chung
VLM
50
0
0
08 May 2025
MACS: Multi-source Audio-to-image Generation with Contextual Significance and Semantic Alignment
Hao Zhou
Xiaobao Guo
Yuzhe Zhu
A. Kong
DiffM
60
1
0
13 Mar 2025
Audio-Language Models for Audio-Centric Tasks: A survey
Yi Su
Jisheng Bai
Qisheng Xu
Kele Xu
Yong Dou
AuLLM
99
2
0
28 Jan 2025
Beyond Speaker Identity: Text Guided Target Speech Extraction
Mingyue Huo
Abhinav Jain
Cong Phuoc Huynh
Fanjie Kong
Pichao Wang
Zhu Liu
Vimal Bhat
51
0
0
17 Jan 2025
Audio-Language Datasets of Scenes and Events: A Survey
Gijs Wijngaard
Elia Formisano
Michele Esposito
M. Dumontier
81
2
0
10 Jan 2025
OmniSep: Unified Omni-Modality Sound Separation with Query-Mixup
Xize Cheng
Siqi Zheng
Zehan Wang
Minghui Fang
Ziang Zhang
...
Z. Ma
Shengpeng Ji
Jialong Zuo
Tao Jin
Zhou Zhao
30
1
0
28 Oct 2024
OpenSep: Leveraging Large Language Models with Textual Inversion for Open World Audio Separation
Tanvir Mahmud
Diana Marculescu
VLM
29
2
0
28 Sep 2024
Leveraging Audio-Only Data for Text-Queried Target Sound Extraction
Kohei Saijo
Janek Ebbers
François G. Germain
Sameer Khurana
G. Wichern
Jonathan Le Roux
39
1
0
20 Sep 2024
Language-Queried Target Sound Extraction Without Parallel Training Data
Hao Ma
Zhiyuan Peng
Xu Li
Yukai Li
Mingjie Shao
Qiuqiang Kong
Ju Liu
VLM
69
1
0
14 Sep 2024
OmniBind: Large-scale Omni Multimodal Representation via Binding Spaces
Zehan Wang
Ziang Zhang
Hang Zhang
Luping Liu
Rongjie Huang
Xize Cheng
Hengshuang Zhao
Zhou Zhao
43
9
0
16 Jul 2024
A Reference-free Metric for Language-Queried Audio Source Separation using Contrastive Language-Audio Pretraining
Feiyang Xiao
Jian Guan
Qiaoxi Zhu
Xubo Liu
Wenbo Wang
Shuhan Qi
Kejia Zhang
Jianyuan Sun
Wenwu Wang
30
4
0
06 Jul 2024
Weakly-supervised Audio Separation via Bi-modal Semantic Similarity
Tanvir Mahmud
Saeed Amizadeh
K. Koishida
Diana Marculescu
AI4TS
14
2
0
02 Apr 2024
Cacophony: An Improved Contrastive Audio-Text Model
Ge Zhu
Jordan Darefsky
Zhiyao Duan
AuLLM
40
11
0
10 Feb 2024
Online Similarity-and-Independence-Aware Beamformer for Low-latency Target Sound Extraction
Atsuo Hiroe
27
0
0
27 Dec 2023
Can CLIP Help Sound Source Localization?
Sooyoung Park
Arda Senocak
Joon Son Chung
27
6
0
07 Nov 2023
GASS: Generalizing Audio Source Separation with Large-scale Data
Jordi Pons
Xiaoyu Liu
Santiago Pascual
Joan Serra
16
12
0
29 Sep 2023
Separate Anything You Describe
Xubo Liu
Qiuqiang Kong
Yan Zhao
Haohe Liu
Yiitan Yuan
Yuzhuo Liu
Rui Xia
Yuxuan Wang
Mark D. Plumbley
Wenwu Wang
VLM
25
43
0
09 Aug 2023
Complete and separate: Conditional separation with missing target source attribute completion
Dimitrios Bralios
Efthymios Tzinis
Paris Smaragdis
35
0
0
27 Jul 2023
CLIPSonic: Text-to-Audio Synthesis with Unlabeled Videos and Pretrained Language-Vision Models
Hao-Wen Dong
Xiaoyu Liu
Jordi Pons
Gautam Bhattacharya
Santiago Pascual
Joan Serra
Taylor Berg-Kirkpatrick
Julian McAuley
DiffM
22
19
0
16 Jun 2023
CAPTDURE: Captioned Sound Dataset of Single Sources
Yuki Okamoto
Kanta Shimonishi
Keisuke Imoto
Kota Dohi
Shota Horiguchi
Y. Kawaguchi
24
1
0
28 May 2023
Source separation with weakly labelled data: An approach to computational auditory scene analysis
Qiuqiang Kong
Yuxuan Wang
Xuchen Song
Yin Cao
Wenwu Wang
Mark D. Plumbley
21
47
0
06 Feb 2020