Text-Driven Separation of Arbitrary Sounds

12 April 2022

Papers citing "Text-Driven Separation of Arbitrary Sounds"

Audio-Language Datasets of Scenes and Events: A Survey Gijs Wijngaard Elia Formisano Michele Esposito M. Dumontier 171 3 0 10 Jan 2025
Language-Queried Target Sound Extraction Without Parallel Training Data Hao Ma Zhiyuan Peng Xu Li Yukai Li Mingjie Shao Qiuqiang Kong Xuelong Li VLM 162 2 0 14 Sep 2024
Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data Ke Chen Xingjian Du Bilei Zhu Zejun Ma Taylor Berg-Kirkpatrick Shlomo Dubnov 95 46 0 15 Dec 2021
Unsupervised Source Separation By Steering Pretrained Music Models Ethan Manilow P. O'Reilly Prem Seetharaman Bryan Pardo 72 2 0 25 Oct 2021
Audio Retrieval with Natural Language Queries Andreea-Maria Oncescu A. Sophia Koepke João F. Henriques Zeynep Akata Samuel Albanie 63 79 0 05 May 2021
Learning Transferable Visual Models From Natural Language Supervision Alec Radford Jong Wook Kim Chris Hallacy Aditya A. Ramesh Gabriel Goh ... Amanda Askell Pamela Mishkin Jack Clark Gretchen Krueger Ilya Sutskever CLIP VLM 1.0K 29,926 0 26 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision Chao Jia Yinfei Yang Ye Xia Yi-Ting Chen Zarana Parekh Hieu H. Pham Quoc V. Le Yun-hsuan Sung Zhen Li Tom Duerig VLM CLIP 480 3,906 0 11 Feb 2021
ICASSP 2021 Deep Noise Suppression Challenge: Decoupling Magnitude and Phase Optimization with a Two-Stage Deep Network Andong Li Wenzhe Liu Xiaoxue Luo C. Zheng Xiaodong Li 66 59 0 08 Feb 2021
Interspeech 2021 Deep Noise Suppression Challenge Chandan K. A. Reddy Harishchandra Dubey K. Koishida A. Nair Vishak Gopal Ross Cutler Sebastian Braun H. Gamper R. Aichner Sriram Srinivasan AI4CE 127 164 0 06 Jan 2021
Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds Efthymios Tzinis Scott Wisdom A. Jansen Shawn Hershey Tal Remez D. Ellis J. Hershey 81 71 0 02 Nov 2020
What's All the FUSS About Free Universal Sound Separation Data? Scott Wisdom Hakan Erdogan D. Ellis Romain Serizel Nicolas Turpault Eduardo Fonseca Justin Salamon Prem Seetharaman J. Hershey 87 82 0 02 Nov 2020
FSD50K: An Open Dataset of Human-Labeled Sound Events Eduardo Fonseca Xavier Favory Jordi Pons F. Font Xavier Serra 111 467 0 01 Oct 2020
SEANet: A Multi-modal Speech Enhancement Network Marco Tagliasacchi Yunpeng Li Karolis Misiunas Dominik Roblek 70 73 0 04 Sep 2020
Listen to What You Want: Neural Network-based Universal Sound Selector Tsubasa Ochiai Marc Delcroix Yuma Koizumi Hiroaki Ito K. Kinoshita S. Araki 68 62 0 10 Jun 2020
Source separation with weakly labelled data: An approach to computational auditory scene analysis Qiuqiang Kong Yuxuan Wang Xuchen Song Yin Cao Wenwu Wang Mark D. Plumbley 87 47 0 06 Feb 2020
Music Source Separation in the Waveform Domain Alexandre Défossez Nicolas Usunier Léon Bottou Francis R. Bach 134 273 0 27 Nov 2019
Improving Universal Sound Separation Using Sound Classification Efthymios Tzinis Scott Wisdom J. Hershey A. Jansen D. Ellis VLM 76 73 0 18 Nov 2019
Finding Strength in Weakness: Learning to Separate Sounds with Weak Supervision Fatemeh Pishdadian Gordon Wichern Jonathan Le Roux 69 43 0 06 Nov 2019
Clotho: An Audio Captioning Dataset Konstantinos Drossos Samuel Lipping Tuomas Virtanen 112 395 0 21 Oct 2019
Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations Sawyer Birnbaum Volodymyr Kuleshov S. Enam Pang Wei Koh Stefano Ermon AI4TS 78 70 0 14 Sep 2019
Recursive Visual Sound Separation Using Minus-Plus Net Xudong Xu Bo Dai Dahua Lin 91 91 0 30 Aug 2019
Contrastive Multiview Coding Yonglong Tian Dilip Krishnan Phillip Isola SSL 188 2,412 0 13 Jun 2019
Universal Sound Separation Ilya Kavalerov Scott Wisdom Hakan Erdogan Brian Patton K. Wilson Jonathan Le Roux J. Hershey 81 187 0 08 May 2019
Co-Separating Sounds of Visual Objects Ruohan Gao Kristen Grauman 136 210 0 16 Apr 2019
SDR - half-baked or well done? Jonathan Le Roux Scott Wisdom Hakan Erdogan J. Hershey 165 1,205 0 06 Nov 2018
End-to-end music source separation: is it possible in the waveform domain? Francesc Lluís Jordi Pons Xavier Serra 74 73 0 29 Oct 2018
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking Quan Wang Hannah Muckenhirn K. Wilson Prashant Sridhar Zelin Wu J. Hershey Rif A. Saurous Ron J. Weiss Ye Jia Ignacio López Moreno 96 370 0 11 Oct 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding Jacob Devlin Ming-Wei Chang Kenton Lee Kristina Toutanova VLM SSL SSeg 1.8K 95,324 0 11 Oct 2018
Group Normalization Yuxin Wu Kaiming He 249 3,676 0 22 Mar 2018
FiLM: Visual Reasoning with a General Conditioning Layer Ethan Perez Florian Strub H. D. Vries Vincent Dumoulin Aaron Courville FAtt AIMat OffRL AI4CE 375 2,239 0 22 Sep 2017
CNN Architectures for Large-Scale Audio Classification Shawn Hershey Sourish Chaudhuri D. Ellis J. Gemmeke A. Jansen ... Rif A. Saurous Bryan Seybold M. Slaney Ron J. Weiss K. Wilson 143 2,510 0 29 Sep 2016
Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) Djork-Arné Clevert Thomas Unterthiner Sepp Hochreiter 317 5,539 0 23 Nov 2015