Mitigating Non-Target Speaker Bias in Guided Speaker Embedding

14 June 2025
Shota Horiguchi
Takanori Ashihara
Marc Delcroix
Atsushi Ando
Naohiro Tawara
Main: 4 pages
Figures: 4
Bibliography: 1 page
Tables: 3
Abstract

Obtaining high-quality speaker embeddings in multi-speaker conditions is crucial for many applications. A recently proposed guided speaker embedding framework, which uses the speech activities of target and non-target speakers as clues, drastically improved embeddings under severe overlap at the cost of a small degradation in low-overlap cases. However, since extreme overlaps are rare in natural conversations, this degradation cannot be overlooked. This paper first reveals that the degradation is caused by the global-statistics-based modules widely used in speaker embedding extractors, which are overly sensitive to intervals containing only non-target speakers. As a countermeasure, we propose an extension of such modules that exploits the target speaker activity clues to compute statistics from intervals where the target is active. The proposed method improves speaker verification performance at both low and high overlap ratios, as well as diarization performance on multiple datasets.
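
The idea of computing statistics from target-active intervals can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which global-statistics modules are extended, so the example assumes plain mean and standard deviation pooling over frames, and the function name `masked_stats_pooling`, its arguments, and the fallback to uniform weights are all hypothetical choices made for illustration.

```python
import math


def masked_stats_pooling(frames, target_activity, eps=1e-8):
    """Pool frame-level features into a [mean..., std...] vector,
    weighting each frame by the target speaker's activity.

    frames: list of T feature vectors (each a list of D floats).
    target_activity: list of T weights in [0, 1] (1 = target active).
    Hypothetical helper for illustration, not the paper's module.
    """
    if not frames:
        raise ValueError("frames must not be empty")
    if len(frames) != len(target_activity):
        raise ValueError("frames and target_activity must have the same length")

    total = sum(target_activity)
    if total <= eps:
        # Assumption of this sketch: with no target-active frame,
        # fall back to uniform weights over all frames.
        weights = [1.0 / len(frames)] * len(frames)
    else:
        weights = [a / total for a in target_activity]

    dim = len(frames[0])
    # Activity-weighted mean per feature dimension.
    mean = [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]
    # Activity-weighted variance around that mean.
    var = [
        sum(w * (f[d] - mean[d]) ** 2 for w, f in zip(weights, frames))
        for d in range(dim)
    ]
    std = [math.sqrt(v + eps) for v in var]
    return mean + std


# Two target-active frames followed by two frames with only a non-target speaker.
frames = [[0.0, 1.0], [0.2, 1.2], [5.0, 9.0], [5.2, 9.2]]
masked = masked_stats_pooling(frames, [1, 1, 0, 0])
unmasked = masked_stats_pooling(frames, [1, 1, 1, 1])
```

In this toy input, the unmasked mean is pulled toward the non-target-only frames, while the masked mean stays at the average of the two target-active frames. That is the kind of bias the abstract attributes to global statistics computed over all intervals.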

@article{horiguchi2025_2506.12500,
  title={Mitigating Non-Target Speaker Bias in Guided Speaker Embedding},
  author={Shota Horiguchi and Takanori Ashihara and Marc Delcroix and Atsushi Ando and Naohiro Tawara},
  journal={arXiv preprint arXiv:2506.12500},
  year={2025}
}