ResearchTrend.AI
Guided Speaker Embedding

3 January 2025
Shota Horiguchi
Takafumi Moriya
Atsushi Ando
Takanori Ashihara
Hiroshi Sato
Naohiro Tawara
Marc Delcroix
arXiv:2410.12182

Papers citing "Guided Speaker Embedding"

26 / 26 papers shown
Alignment-Free Training for Transducer-based Multi-Talker ASR
Takafumi Moriya
Shota Horiguchi
Marc Delcroix
Ryo Masumura
Takanori Ashihara
Hiroshi Sato
Kohei Matsuura
Masato Mimura
71
4
0
30 Sep 2024
Multi-channel Conversational Speaker Separation via Neural Diarization
H. Taherian
DeLiang Wang
BDL
64
17
0
15 Nov 2023
NTT speaker diarization system for CHiME-7: multi-domain, multi-microphone End-to-end and vector clustering diarization
Naohiro Tawara
Marc Delcroix
Atsushi Ando
A. Ogawa
72
11
0
22 Sep 2023
Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding with Sequence-to-Sequence Architecture
Gaobin Yang
Maokui He
Shutong Niu
Ruoyu Wang
Yanyan Yue
Shuangqing Qian
Shilong Wu
Jun Du
Chin-Hui Lee
81
12
0
17 Sep 2023
A Teacher-Student approach for extracting informative speaker embeddings from speech mixtures
Tobias Cord-Landwehr
Christoph Boeddeker
Catalin Zorila
R. Doddipatla
Reinhold Haeb-Umbach
82
3
0
01 Jun 2023
Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings
Naoyuki Kanda
Jian Wu
Yu Wu
Xiong Xiao
Zhong Meng
Xiaofei Wang
Yashesh Gaur
Zhuo Chen
Jinyu Li
Takuya Yoshioka
57
27
0
30 Mar 2022
Streaming Multi-Talker ASR with Token-Level Serialized Output Training
Naoyuki Kanda
Jian Wu
Yu Wu
Xiong Xiao
Zhong Meng
Xiaofei Wang
Yashesh Gaur
Zhuo Chen
Jinyu Li
Takuya Yoshioka
106
60
0
02 Feb 2022
Endpoint Detection for Streaming End-to-End Multi-talker ASR
Liang Lu
Jinyu Li
Yifan Gong
107
19
0
24 Jan 2022
Multi-turn RNN-T for streaming recognition of multi-party speech
Ilya Sklyar
A. Piunova
Xianrui Zheng
Yulan Liu
98
24
0
19 Dec 2021
M2MeT: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge
Fan Yu
Shiliang Zhang
Yihui Fu
Lei Xie
Siqi Zheng
...
Pengcheng Guo
Zhijie Yan
B. Ma
Xin Xu
Hui Bu
73
119
0
14 Oct 2021
Advances in integration of end-to-end neural and clustering-based diarization for real conversational speech
K. Kinoshita
Marc Delcroix
Naohiro Tawara
111
61
0
19 May 2021
AISHELL-4: An Open Source Dataset for Speech Enhancement, Separation, Recognition and Speaker Diarization in Conference Scenario
Yihui Fu
Luyao Cheng
Shubo Lv
Yukai Jv
Yuxiang Kong
...
Jian Wu
Hui Bu
Xin Xu
Jun Du
Jingdong Chen
97
98
0
08 Apr 2021
Streaming Multi-talker Speech Recognition with Joint Speaker Identification
Liang Lu
Naoyuki Kanda
Jinyu Li
Jiawei Liu
75
20
0
05 Apr 2021
Speaker activity driven neural speech extraction
Marc Delcroix
Kateřina Žmolíková
Tsubasa Ochiai
K. Kinoshita
Tomohiro Nakatani
101
35
0
14 Jan 2021
Streaming end-to-end multi-talker speech recognition
Liang Lu
Naoyuki Kanda
Jinyu Li
Jiawei Liu
67
44
0
26 Nov 2020
Integrating end-to-end neural and clustering-based diarization: Getting the best of both worlds
K. Kinoshita
Marc Delcroix
Naohiro Tawara
71
84
0
26 Oct 2020
The IDLAB VoxSRC-20 Submission: Large Margin Fine-Tuning and Quality-Aware Score Calibration in DNN Based Speaker Verification
Jenthe Thienpondt
Brecht Desplanques
Kris Demuynck
80
84
0
21 Oct 2020
MIRNet: Learning multiple identities representations in overlapped speech
Hyewon Han
Soo-Whan Chung
Hong-Goo Kang
60
8
0
04 Aug 2020
Spot the conversation: speaker diarisation in the wild
Joon Son Chung
Jaesung Huh
Arsha Nagrani
Triantafyllos Afouras
Andrew Zisserman
VGen
97
150
0
02 Jul 2020
Target-Speaker Voice Activity Detection: a Novel Approach for Multi-Speaker Diarization in a Dinner Party Scenario
Ivan Medennikov
M. Korenevsky
Tatiana Prisyach
Yuri Y. Khokhlov
Mariya Korenevskaya
...
Anton Mitrofanov
A. Andrusenko
Ivan Podluzhny
A. Laptev
A. Romanenko
58
205
0
14 May 2020
ECAPA-TDNN: Emphasized Channel Attention, Propagation and Aggregation in TDNN Based Speaker Verification
Brecht Desplanques
Jenthe Thienpondt
Kris Demuynck
86
1,347
0
14 May 2020
CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings
Shinji Watanabe
Michael I. Mandel
Jon Barker
Emmanuel Vincent
Ashish Arora
...
Yusuke Fujita
Shota Horiguchi
Naoyuki Kanda
Takuya Yoshioka
Neville Ryant
75
308
0
20 Apr 2020
pyannote.audio: neural building blocks for speaker diarization
H. Bredin
Ruiqing Yin
Juan Manuel Coria
G. Gelly
Pavel Korshunov
Marvin Lavechin
D. Fustes
Hadrien Titeux
Wassim Bouaziz
Marie-Philippe Gill
244
326
0
04 Nov 2019
Simultaneous Speech Recognition and Speaker Diarization for Monaural Dialogue Recordings with Target-Speaker Acoustic Models
Naoyuki Kanda
Shota Horiguchi
Yusuke Fujita
Yawen Xue
Kenji Nagamatsu
Shinji Watanabe
53
36
0
17 Sep 2019
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.3K
194,641
0
10 Dec 2015
MUSAN: A Music, Speech, and Noise Corpus
David Snyder
Guoguo Chen
Daniel Povey
92
1,357
0
28 Oct 2015