Text-Driven Separation of Arbitrary Sounds

12 April 2022
Kevin Kilgour
Beat Gfeller
Qingqing Huang
A. Jansen
Scott Wisdom
Marco Tagliasacchi
arXiv: 2204.05738

Papers citing "Text-Driven Separation of Arbitrary Sounds"

32 / 32 papers shown
Audio-Language Datasets of Scenes and Events: A Survey
Gijs Wijngaard
Elia Formisano
Michele Esposito
M. Dumontier
171
3
0
10 Jan 2025
Language-Queried Target Sound Extraction Without Parallel Training Data
Hao Ma
Zhiyuan Peng
Xu Li
Yukai Li
Mingjie Shao
Qiuqiang Kong
Xuelong Li
VLM
162
2
0
14 Sep 2024
Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data
Ke Chen
Xingjian Du
Bilei Zhu
Zejun Ma
Taylor Berg-Kirkpatrick
Shlomo Dubnov
95
46
0
15 Dec 2021
Unsupervised Source Separation By Steering Pretrained Music Models
Ethan Manilow
P. O'Reilly
Prem Seetharaman
Bryan Pardo
72
2
0
25 Oct 2021
Audio Retrieval with Natural Language Queries
Andreea-Maria Oncescu
A. Sophia Koepke
João F. Henriques
Zeynep Akata
Samuel Albanie
63
79
0
05 May 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP, VLM
1.0K
29,926
0
26 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM, CLIP
480
3,906
0
11 Feb 2021
ICASSP 2021 Deep Noise Suppression Challenge: Decoupling Magnitude and Phase Optimization with a Two-Stage Deep Network
Andong Li
Wenzhe Liu
Xiaoxue Luo
C. Zheng
Xiaodong Li
66
59
0
08 Feb 2021
Interspeech 2021 Deep Noise Suppression Challenge
Chandan K. A. Reddy
Harishchandra Dubey
K. Koishida
A. Nair
Vishak Gopal
Ross Cutler
Sebastian Braun
H. Gamper
R. Aichner
Sriram Srinivasan
AI4CE
127
164
0
06 Jan 2021
Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds
Efthymios Tzinis
Scott Wisdom
A. Jansen
Shawn Hershey
Tal Remez
D. Ellis
J. Hershey
81
71
0
02 Nov 2020
What's All the FUSS About Free Universal Sound Separation Data?
Scott Wisdom
Hakan Erdogan
D. Ellis
Romain Serizel
Nicolas Turpault
Eduardo Fonseca
Justin Salamon
Prem Seetharaman
J. Hershey
87
82
0
02 Nov 2020
FSD50K: An Open Dataset of Human-Labeled Sound Events
Eduardo Fonseca
Xavier Favory
Jordi Pons
F. Font
Xavier Serra
111
467
0
01 Oct 2020
SEANet: A Multi-modal Speech Enhancement Network
Marco Tagliasacchi
Yunpeng Li
Karolis Misiunas
Dominik Roblek
70
73
0
04 Sep 2020
Listen to What You Want: Neural Network-based Universal Sound Selector
Tsubasa Ochiai
Marc Delcroix
Yuma Koizumi
Hiroaki Ito
K. Kinoshita
S. Araki
68
62
0
10 Jun 2020
Source separation with weakly labelled data: An approach to computational auditory scene analysis
Qiuqiang Kong
Yuxuan Wang
Xuchen Song
Yin Cao
Wenwu Wang
Mark D. Plumbley
87
47
0
06 Feb 2020
Music Source Separation in the Waveform Domain
Alexandre Défossez
Nicolas Usunier
Léon Bottou
Francis R. Bach
134
273
0
27 Nov 2019
Improving Universal Sound Separation Using Sound Classification
Efthymios Tzinis
Scott Wisdom
J. Hershey
A. Jansen
D. Ellis
VLM
76
73
0
18 Nov 2019
Finding Strength in Weakness: Learning to Separate Sounds with Weak Supervision
Fatemeh Pishdadian
Gordon Wichern
Jonathan Le Roux
69
43
0
06 Nov 2019
Clotho: An Audio Captioning Dataset
Konstantinos Drossos
Samuel Lipping
Tuomas Virtanen
112
395
0
21 Oct 2019
Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations
Sawyer Birnbaum
Volodymyr Kuleshov
S. Enam
Pang Wei Koh
Stefano Ermon
AI4TS
78
70
0
14 Sep 2019
Recursive Visual Sound Separation Using Minus-Plus Net
Xudong Xu
Bo Dai
Dahua Lin
91
91
0
30 Aug 2019
Contrastive Multiview Coding
Yonglong Tian
Dilip Krishnan
Phillip Isola
SSL
188
2,412
0
13 Jun 2019
Universal Sound Separation
Ilya Kavalerov
Scott Wisdom
Hakan Erdogan
Brian Patton
K. Wilson
Jonathan Le Roux
J. Hershey
81
187
0
08 May 2019
Co-Separating Sounds of Visual Objects
Ruohan Gao
Kristen Grauman
136
210
0
16 Apr 2019
SDR - half-baked or well done?
Jonathan Le Roux
Scott Wisdom
Hakan Erdogan
J. Hershey
165
1,205
0
06 Nov 2018
End-to-end music source separation: is it possible in the waveform domain?
Francesc Lluís
Jordi Pons
Xavier Serra
74
73
0
29 Oct 2018
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking
Quan Wang
Hannah Muckenhirn
K. Wilson
Prashant Sridhar
Zelin Wu
J. Hershey
Rif A. Saurous
Ron J. Weiss
Ye Jia
Ignacio López Moreno
96
370
0
11 Oct 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM, SSL, SSeg
1.8K
95,324
0
11 Oct 2018
Group Normalization
Yuxin Wu
Kaiming He
249
3,676
0
22 Mar 2018
FiLM: Visual Reasoning with a General Conditioning Layer
Ethan Perez
Florian Strub
H. D. Vries
Vincent Dumoulin
Aaron Courville
FAtt, AIMat, OffRL, AI4CE
375
2,239
0
22 Sep 2017
CNN Architectures for Large-Scale Audio Classification
Shawn Hershey
Sourish Chaudhuri
D. Ellis
J. Gemmeke
A. Jansen
...
Rif A. Saurous
Bryan Seybold
M. Slaney
Ron J. Weiss
K. Wilson
143
2,510
0
29 Sep 2016
Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)
Djork-Arné Clevert
Thomas Unterthiner
Sepp Hochreiter
317
5,539
0
23 Nov 2015