Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2109.14831
Cited By
USEV: Universal Speaker Extraction with Visual Cue
30 September 2021
Zexu Pan
Meng Ge
Haizhou Li
Re-assign community
ArXiv
PDF
HTML
Papers citing
"USEV: Universal Speaker Extraction with Visual Cue"
21 / 21 papers shown
Title
USED: Universal Speaker Extraction and Diarization
Junyi Ao
Mehmet Sinan Yildirim
Ruijie Tao
Mengyao Ge
Shuai Wang
Yan-min Qian
Haizhou Li
33
5
0
17 Jan 2025
Noise-Robust Target-Speaker Voice Activity Detection Through Self-Supervised Pretraining
H. S. Bovbjerg
Jan Østergaard
Jesper Jensen
Zheng-Hua Tan
36
0
0
06 Jan 2025
DENSE: Dynamic Embedding Causal Target Speech Extraction
Yiwen Wang
Zeyu Yuan
Xihong Wu
41
0
0
10 Sep 2024
RAVSS: Robust Audio-Visual Speech Separation in Multi-Speaker Scenarios with Missing Visual Cues
Tianrui Pan
Jie Liu
Bohan Wang
Jie Tang
Gangshan Wu
35
2
0
27 Jul 2024
Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention
Ruijie Tao
Xinyuan Qian
Yidi Jiang
Junjie Li
Jiadong Wang
Haizhou Li
32
1
0
29 Apr 2024
Target Speech Extraction with Pre-trained AV-HuBERT and Mask-And-Recover Strategy
Wenxuan Wu
Xueyuan Chen
Xixin Wu
Haizhou Li
Helen M. Meng
26
1
0
24 Mar 2024
NeuroHeed+: Improving Neuro-steered Speaker Extraction with Joint Auditory Attention Detection
Zexu Pan
G. Wichern
François G. Germain
Sameer Khurana
Jonathan Le Roux
26
8
0
12 Dec 2023
Audio-Visual Active Speaker Extraction for Sparsely Overlapped Multi-talker Speech
Jun Yu Li
Ruijie Tao
Zexu Pan
Meng Ge
Shuai Wang
Haizhou Li
24
5
0
15 Sep 2023
PIAVE: A Pose-Invariant Audio-Visual Speaker Extraction Network
Qinghua Liu
Meng Ge
Zhizheng Wu
Haizhou Li
19
0
0
13 Sep 2023
Focus on the Sound around You: Monaural Target Speaker Extraction via Distance and Speaker Information
Jiuxin Lin
Peng Wang
Heinrich Dinkel
Jun Chen
Zhiyong Wu
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Yujun Wang
11
7
0
28 Jun 2023
AV-SepFormer: Cross-Attention SepFormer for Audio-Visual Target Speaker Extraction
Jiuxin Lin
X. Cai
Heinrich Dinkel
Jun Chen
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Zhiyong Wu
Yujun Wang
Helen M. Meng
22
21
0
25 Jun 2023
Rethinking the visual cues in audio-visual speaker extraction
Junjie Li
Meng Ge
Zexu Pan
Rui Cao
Longbiao Wang
J. Dang
Shiliang Zhang
20
9
0
05 Jun 2023
Late Audio-Visual Fusion for In-The-Wild Speaker Diarization
Zexu Pan
G. Wichern
François G. Germain
Aswin Shanmugam Subramanian
Jonathan Le Roux
VGen
21
1
0
02 Nov 2022
ImagineNET: Target Speaker Extraction with Intermittent Visual Cue through Embedding Inpainting
Zexu Pan
Wupeng Wang
Marvin Borsdorf
Haizhou Li
6
10
0
31 Oct 2022
VCSE: Time-Domain Visual-Contextual Speaker Extraction Network
Junjie Li
Meng Ge
Zexu Pan
Longbiao Wang
J. Dang
18
10
0
09 Oct 2022
RaDur: A Reference-aware and Duration-robust Network for Target Sound Detection
Dongchao Yang
Helin Wang
Zhongjie Ye
Yuexian Zou
Wenwu Wang
26
0
0
05 Apr 2022
A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction
Zexu Pan
Meng Ge
Haizhou Li
18
17
0
31 Mar 2022
Speaker Extraction with Co-Speech Gestures Cue
Zexu Pan
Xinyuan Qian
Haizhou Li
SLR
21
26
0
31 Mar 2022
New Insights on Target Speaker Extraction
Mohamed Elminshawi
Wolfgang Mack
Srikanth Raj Chetupalli
Soumitro Chakrabarty
Emanuel Habets
11
18
0
01 Feb 2022
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
219
2,233
0
14 Jun 2018
Wave-U-Net: A Multi-Scale Neural Network for End-to-End Audio Source Separation
Daniel Stoller
Sebastian Ewert
S. Dixon
AI4TS
104
588
0
08 Jun 2018
1