Guided Speaker Embedding

3 January 2025

Papers citing "Guided Speaker Embedding"

26 / 26 papers shown

Title
Alignment-Free Training for Transducer-based Multi-Talker ASR Takafumi Moriya Shota Horiguchi Marc Delcroix Ryo Masumura Takanori Ashihara Hiroshi Sato Kohei Matsuura Masato Mimura 71 4 0 30 Sep 2024
Multi-channel Conversational Speaker Separation via Neural Diarization H. Taherian DeLiang Wang BDL 64 17 0 15 Nov 2023
NTT speaker diarization system for CHiME-7: multi-domain, multi-microphone End-to-end and vector clustering diarization Naohiro Tawara Marc Delcroix Atsushi Ando A. Ogawa 72 11 0 22 Sep 2023
Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding with Sequence-to-Sequence Architecture Gaobin Yang Maokui He Shutong Niu Ruoyu Wang Yanyan Yue Shuangqing Qian Shilong Wu Jun Du Chin-Hui Lee 81 12 0 17 Sep 2023
A Teacher-Student approach for extracting informative speaker embeddings from speech mixtures Tobias Cord-Landwehr Christoph Boeddeker Catalin Zorila R. Doddipatla Reinhold Haeb-Umbach 82 3 0 01 Jun 2023
Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings Naoyuki Kanda Jian Wu Yu Wu Xiong Xiao Zhong Meng Xiaofei Wang Yashesh Gaur Zhuo Chen Jinyu Li Takuya Yoshioka 57 27 0 30 Mar 2022
Streaming Multi-Talker ASR with Token-Level Serialized Output Training Naoyuki Kanda Jian Wu Yu Wu Xiong Xiao Zhong Meng Xiaofei Wang Yashesh Gaur Zhuo Chen Jinyu Li Takuya Yoshioka 106 60 0 02 Feb 2022
Endpoint Detection for Streaming End-to-End Multi-talker ASR Liang Lu Jinyu Li Yifan Gong 107 19 0 24 Jan 2022
Multi-turn RNN-T for streaming recognition of multi-party speech Ilya Sklyar A. Piunova Xianrui Zheng Yulan Liu 98 24 0 19 Dec 2021
M2MeT: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge Fan Yu Shiliang Zhang Yihui Fu Lei Xie Siqi Zheng ... Pengcheng Guo Zhijie Yan B. Ma Xin Xu Hui Bu 73 119 0 14 Oct 2021
Advances in integration of end-to-end neural and clustering-based diarization for real conversational speech K. Kinoshita Marc Delcroix Naohiro Tawara 111 61 0 19 May 2021
AISHELL-4: An Open Source Dataset for Speech Enhancement, Separation, Recognition and Speaker Diarization in Conference Scenario Yihui Fu Luyao Cheng Shubo Lv Yukai Jv Yuxiang Kong ... Jian Wu Hui Bu Xin Xu Jun Du Jingdong Chen 97 98 0 08 Apr 2021
Streaming Multi-talker Speech Recognition with Joint Speaker Identification Liang Lu Naoyuki Kanda Jinyu Li Jiawei Liu 75 20 0 05 Apr 2021
Speaker activity driven neural speech extraction Marc Delcroix Kateřina Žmolíková Tsubasa Ochiai K. Kinoshita Tomohiro Nakatani 101 35 0 14 Jan 2021
Streaming end-to-end multi-talker speech recognition Liang Lu Naoyuki Kanda Jinyu Li Jiawei Liu 67 44 0 26 Nov 2020
Integrating end-to-end neural and clustering-based diarization: Getting the best of both worlds K. Kinoshita Marc Delcroix Naohiro Tawara 71 84 0 26 Oct 2020
The IDLAB VoxSRC-20 Submission: Large Margin Fine-Tuning and Quality-Aware Score Calibration in DNN Based Speaker Verification Jenthe Thienpondt Brecht Desplanques Kris Demuynck 80 84 0 21 Oct 2020
MIRNet: Learning multiple identities representations in overlapped speech Hyewon Han Soo-Whan Chung Hong-Goo Kang 60 8 0 04 Aug 2020
Spot the conversation: speaker diarisation in the wild Joon Son Chung Jaesung Huh Arsha Nagrani Triantafyllos Afouras Andrew Zisserman VGen 97 150 0 02 Jul 2020
Target-Speaker Voice Activity Detection: a Novel Approach for Multi-Speaker Diarization in a Dinner Party Scenario Ivan Medennikov M. Korenevsky Tatiana Prisyach Yuri Y. Khokhlov Mariya Korenevskaya ... Anton Mitrofanov A. Andrusenko Ivan Podluzhny A. Laptev A. Romanenko 58 205 0 14 May 2020
ECAPA-TDNN: Emphasized Channel Attention, Propagation and Aggregation in TDNN Based Speaker Verification Brecht Desplanques Jenthe Thienpondt Kris Demuynck 86 1,347 0 14 May 2020
CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings Shinji Watanabe Michael I. Mandel Jon Barker Emmanuel Vincent Ashish Arora ... Emmanuel Vincent Shota Horiguchi Naoyuki Kanda Takuya Yoshioka Neville Ryant 75 308 0 20 Apr 2020
pyannote.audio: neural building blocks for speaker diarization H. Bredin Ruiqing Yin Juan Manuel Coria G. Gelly Pavel Korshunov Marvin Lavechin D. Fustes Hadrien Titeux Wassim Bouaziz Marie-Philippe Gill 244 326 0 04 Nov 2019
Simultaneous Speech Recognition and Speaker Diarization for Monaural Dialogue Recordings with Target-Speaker Acoustic Models Naoyuki Kanda Shota Horiguchi Yusuke Fujita Yawen Xue Kenji Nagamatsu Shinji Watanabe 53 36 0 17 Sep 2019
Deep Residual Learning for Image Recognition Kaiming He Xiangyu Zhang Shaoqing Ren Jian Sun MedIm 2.3K 194,641 0 10 Dec 2015
MUSAN: A Music, Speech, and Noise Corpus David Snyder Guoguo Chen Daniel Povey 92 1,357 0 28 Oct 2015