ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2102.01326
  4. Cited By
Multimodal Attention Fusion for Target Speaker Extraction

Multimodal Attention Fusion for Target Speaker Extraction

2 February 2021
Hiroshi Sato
Tsubasa Ochiai
K. Kinoshita
Marc Delcroix
Tomohiro Nakatani
S. Araki
ArXivPDFHTML

Papers citing "Multimodal Attention Fusion for Target Speaker Extraction"

8 / 8 papers shown
Title
Listen to Extract: Onset-Prompted Target Speaker Extraction
Listen to Extract: Onset-Prompted Target Speaker Extraction
Pengjie Shen
Kangrui Chen
Shulin He
Pengru Chen
Shuqi Yuan
He Kong
Xueliang Zhang
Zehao Wang
55
0
0
08 May 2025
Noise-Robust Target-Speaker Voice Activity Detection Through Self-Supervised Pretraining
Noise-Robust Target-Speaker Voice Activity Detection Through Self-Supervised Pretraining
H. S. Bovbjerg
Jan Østergaard
Jesper Jensen
Zheng-Hua Tan
45
0
0
06 Jan 2025
Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention
Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention
Ruijie Tao
Xinyuan Qian
Yidi Jiang
Junjie Li
Jiadong Wang
Haizhou Li
36
1
0
29 Apr 2024
AV-SepFormer: Cross-Attention SepFormer for Audio-Visual Target Speaker
  Extraction
AV-SepFormer: Cross-Attention SepFormer for Audio-Visual Target Speaker Extraction
Jiuxin Lin
X. Cai
Heinrich Dinkel
Jun Chen
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Zhiyong Wu
Yujun Wang
Helen M. Meng
29
21
0
25 Jun 2023
Neural Target Speech Extraction: An Overview
Neural Target Speech Extraction: An Overview
Kateřina Žmolíková
Marc Delcroix
Tsubasa Ochiai
K. Kinoshita
JanHonza'' vCernocký
Dong Yu
23
86
0
31 Jan 2023
VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer
VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer
Juan F. Montesinos
V. S. Kadandale
G. Haro
ViT
25
19
0
08 Mar 2022
USEV: Universal Speaker Extraction with Visual Cue
USEV: Universal Speaker Extraction with Visual Cue
Zexu Pan
Meng Ge
Haizhou Li
36
41
0
30 Sep 2021
Deep Extractor Network for Target Speaker Recovery From Single Channel
  Speech Mixtures
Deep Extractor Network for Target Speaker Recovery From Single Channel Speech Mixtures
Jun Wang
Jie Chen
Dan Su
Lianwu Chen
Meng Yu
Y. Qian
Dong Yu
46
90
0
24 Jul 2018
1