arXiv:1810.04826
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking
11 October 2018
Quan Wang
Hannah Muckenhirn
K. Wilson
Prashant Sridhar
Zelin Wu
J. Hershey
Rif A. Saurous
Ron J. Weiss
Ye Jia
Ignacio López Moreno
Papers citing
"VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking"
50 / 82 papers shown
Listen to Extract: Onset-Prompted Target Speaker Extraction
Pengjie Shen
Kangrui Chen
Shulin He
Pengru Chen
Shuqi Yuan
He Kong
Xueliang Zhang
Zehao Wang
53
0
0
08 May 2025
End-to-End Target Speaker Speech Recognition Using Context-Aware Attention Mechanisms for Challenging Enrollment Scenario
Mohsen Ghane
Mohammad Sadegh Safari
78
0
0
28 Jan 2025
Beyond Speaker Identity: Text Guided Target Speech Extraction
Mingyue Huo
Abhinav Jain
Cong Phuoc Huynh
Fanjie Kong
Pichao Wang
Zhu Liu
Vimal Bhat
56
0
0
17 Jan 2025
USED: Universal Speaker Extraction and Diarization
Junyi Ao
Mehmet Sinan Yildirim
Ruijie Tao
Mengyao Ge
Shuai Wang
Yan-min Qian
Haizhou Li
43
6
0
17 Jan 2025
Reading to Listen at the Cocktail Party: Multi-Modal Speech Separation
Akam Rahimi
Triantafyllos Afouras
Andrew Zisserman
45
28
0
02 Jan 2025
Cross-attention Inspired Selective State Space Models for Target Sound Extraction
Donghang Wu
Yiwen Wang
Xihong Wu
T. Qu
Mamba
37
3
0
07 Sep 2024
USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction
Bang Zeng
Ming Li
45
3
0
04 Sep 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
Shuai Wang
Zheng-Shou Chen
Kong Aik Lee
Yan-min Qian
Haizhou Li
44
4
0
21 Jul 2024
Joint Speaker Features Learning for Audio-visual Multichannel Speech Separation and Recognition
Guinan Li
Jiajun Deng
Youjun Chen
Mengzhe Geng
Shujie Hu
...
Zengrui Jin
Tianzi Wang
Xurong Xie
Helen Meng
Xunying Liu
VLM
34
0
0
14 Jun 2024
Single-Channel Robot Ego-Speech Filtering during Human-Robot Interaction
Yue Li
Koen V. Hindriks
Florian A. Kunneman
35
2
0
05 Mar 2024
Continuous Target Speech Extraction: Enhancing Personalized Diarization and Extraction on Complex Recordings
He Zhao
Hangting Chen
Jianwei Yu
Yuehai Wang
53
0
0
29 Jan 2024
TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down Fusion
Samuel Pegg
Kai Li
Xiaolin Hu
34
1
0
25 Jan 2024
Audio Prompt Tuning for Universal Sound Separation
Yuzhuo Liu
Xubo Liu
Yan Zhao
Yuanyuan Wang
Rui Xia
Pingchuan Tain
Yuxuan Wang
VLM
41
5
0
30 Nov 2023
Audio-Visual Speaker Tracking: Progress, Challenges, and Future Directions
Jinzheng Zhao
Yong-mei Xu
Xinyuan Qian
Davide Berghi
Peipei Wu
Meng Cui
Jianyuan Sun
Philip J. B. Jackson
Wenwu Wang
BDL
47
7
0
23 Oct 2023
SpeechX: Neural Codec Language Model as a Versatile Speech Transformer
Xiaofei Wang
Manthan Thakker
Zhuo Chen
Naoyuki Kanda
Sefik Emre Eskimez
Sanyuan Chen
M. Tang
Shujie Liu
Jinyu Li
Takuya Yoshioka
26
80
0
14 Aug 2023
Complete and separate: Conditional separation with missing target source attribute completion
Dimitrios Bralios
Efthymios Tzinis
Paris Smaragdis
37
0
0
27 Jul 2023
BASEN: Time-Domain Brain-Assisted Speech Enhancement Network with Convolutional Cross Attention in Multi-talker Conditions
Jie Zhang
Qingquan Xu
Qiu-shi Zhu
Zhenhua Ling
27
11
0
17 May 2023
Universal Source Separation with Weakly Labelled Data
Qiuqiang Kong
K. Chen
Haohe Liu
Xingjian Du
Taylor Berg-Kirkpatrick
Shlomo Dubnov
Mark D. Plumbley
18
17
0
11 May 2023
Transformers in Speech Processing: A Survey
S. Latif
Aun Zaidi
Heriberto Cuayáhuitl
Fahad Shamshad
Moazzam Shoukat
Junaid Qadir
46
47
0
21 Mar 2023
Target Sound Extraction with Variable Cross-modality Clues
Chenda Li
Yao Qian
Zhuo Chen
Dongmei Wang
Takuya Yoshioka
Shujie Liu
Y. Qian
Michael Zeng
VLM
29
13
0
15 Mar 2023
Neural Target Speech Extraction: An Overview
Kateřina Žmolíková
Marc Delcroix
Tsubasa Ochiai
K. Kinoshita
Jan "Honza" Černocký
Dong Yu
23
86
0
31 Jan 2023
Improving Target Speaker Extraction with Sparse LDA-transformed Speaker Embeddings
Kai Liu
Xucheng Wan
Z.C. Du
Huan Zhou
VLM
27
1
0
16 Jan 2023
Deep neural network techniques for monaural speech enhancement: state of the art analysis
P. Ochieng
35
21
0
01 Dec 2022
The Potential of Neural Speech Synthesis-based Data Augmentation for Personalized Speech Enhancement
Anastasia Kuznetsova
Aswin Sivaraman
Minje Kim
32
3
0
14 Nov 2022
Real-Time Joint Personalized Speech Enhancement and Acoustic Echo Cancellation
Sefik Emre Eskimez
Takuya Yoshioka
Alex Ju
M. Tang
Tanel Pärnamaa
Huaming Wang
32
7
0
04 Nov 2022
Spatially Selective Deep Non-linear Filters for Speaker Extraction
Kristina Tesch
Timo Gerkmann
32
17
0
04 Nov 2022
Hierarchical speaker representation for target speaker extraction
Shulin He
Huaiwen Zhang
Wei Rao
Kanghao Zhang
Yukai Ju
Yang-Rui Yang
Xueliang Zhang
37
4
0
28 Oct 2022
AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation
Efthymios Tzinis
Scott Wisdom
Tal Remez
J. Hershey
41
30
0
20 Jul 2022
Multi-channel target speech enhancement based on ERB-scaled spatial coherence features
Yicheng Hsu
Yonghan Lee
M. Bai
31
1
0
17 Jul 2022
Speaker Verification in Multi-Speaker Environments Using Temporal Feature Fusion
Ahmad Aloradi
Wolfgang Mack
Mohamed Elminshawi
Emanuel Habets
38
5
0
28 Jun 2022
Semi-supervised Time Domain Target Speaker Extraction with Attention
Zhepei Wang
Ritwik Giri
Shrikant Venkataramani
Umut Isik
J. Valin
Paris Smaragdis
Mike Goodwin
A. Krishnaswamy
24
7
0
18 Jun 2022
NeuralEcho: A Self-Attentive Recurrent Neural Network For Unified Acoustic Echo Suppression And Speech Enhancement
Meng Yu
Yong-mei Xu
Chunlei Zhang
Shizhong Zhang
Dong Yu
22
11
0
20 May 2022
Text-Driven Separation of Arbitrary Sounds
Kevin Kilgour
Beat Gfeller
Qingqing Huang
A. Jansen
Scott Wisdom
Marco Tagliasacchi
30
30
0
12 Apr 2022
Listen only to me! How well can target speech extraction handle false alarms?
Marc Delcroix
K. Kinoshita
Tsubasa Ochiai
Kateřina Žmolíková
Hiroshi Sato
Tomohiro Nakatani
34
15
0
11 Apr 2022
SoundBeam: Target sound extraction conditioned on sound-class labels and enrollment clues for increased performance and continuous learning
Marc Delcroix
Jorge Bennasar Vázquez
Tsubasa Ochiai
K. Kinoshita
Yasunori Ohishi
S. Araki
VLM
22
32
0
08 Apr 2022
Heterogeneous Target Speech Separation
Hyunjae Cho
Wonbin Jung
Junhyeok Lee
Paris Smaragdis
Sanghyun Woo
51
26
0
07 Apr 2022
RaDur: A Reference-aware and Duration-robust Network for Target Sound Detection
Dongchao Yang
Helin Wang
Zhongjie Ye
Yuexian Zou
Wenwu Wang
28
0
0
05 Apr 2022
Speaker Extraction with Co-Speech Gestures Cue
Zexu Pan
Xinyuan Qian
Haizhou Li
SLR
23
27
0
31 Mar 2022
Separate What You Describe: Language-Queried Audio Source Separation
Xubo Liu
Haohe Liu
Qiuqiang Kong
Xinhao Mei
Jinzheng Zhao
Qiushi Huang
Mark D. Plumbley
Wenwu Wang
42
58
0
28 Mar 2022
Single microphone speaker extraction using unified time-frequency Siamese-Unet
Aviad Eisenberg
Sharon Gannot
Shlomo E. Chazan
30
3
0
06 Mar 2022
SpeechPainter: Text-conditioned Speech Inpainting
Zalan Borsos
Matthew Sharifi
Marco Tagliasacchi
16
26
0
15 Feb 2022
SkiM: Skipping Memory LSTM for Low-Latency Real-Time Continuous Speech Separation
Chenda Li
Lei Yang
Weiqin Wang
Y. Qian
34
25
0
26 Jan 2022
Directed Speech Separation for Automatic Speech Recognition of Long Form Conversational Speech
Rohit Paturi
S. Srinivasan
Katrin Kirchhoff
Daniel Garcia-Romero
19
9
0
10 Dec 2021
Learning-based personal speech enhancement for teleconferencing by exploiting spatial-spectral features
Yicheng Hsu
Yonghan Lee
M. Bai
24
10
0
10 Dec 2021
Target Speech Extraction: Independent Vector Extraction Guided by Supervised Speaker Identification
J. Málek
Jakub Janský
Zbyněk Koldovský
Tomás Kounovský
Jaroslav Cmejla
J. Zdánský
25
10
0
05 Nov 2021
Cross-attention conformer for context modeling in speech enhancement for ASR
A. Narayanan
Chung-Cheng Chiu
Tom O'Malley
Quan Wang
Yanzhang He
24
14
0
30 Oct 2021
One model to enhance them all: array geometry agnostic multi-channel personalized speech enhancement
H. Taherian
Sefik Emre Eskimez
Takuya Yoshioka
Huaming Wang
Zhuo Chen
Xuedong Huang
30
21
0
20 Oct 2021
Personalized Speech Enhancement: New Models and Comprehensive Evaluation
Sefik Emre Eskimez
Takuya Yoshioka
Huaming Wang
Xiaofei Wang
Zhuo Chen
Xuedong Huang
32
62
0
18 Oct 2021
Controllable Multichannel Speech Dereverberation based on Deep Neural Networks
Ziteng Wang
Yueyue Na
Biao Tian
Q. Fu
21
0
0
16 Oct 2021
USEV: Universal Speaker Extraction with Visual Cue
Zexu Pan
Meng Ge
Haizhou Li
34
41
0
30 Sep 2021