ResearchTrend.AI
CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings

20 April 2020
Shinji Watanabe
Michael I. Mandel
Jon Barker
Emmanuel Vincent
Ashish Arora
Xuankai Chang
Sanjeev Khudanpur
Vimal Manohar
Daniel Povey
Desh Raj
David Snyder
Aswin Shanmugam Subramanian
Jan "Yenda" Trmal
Bar Ben Yair
Christoph Boeddeker
Zhaoheng Ni
Yusuke Fujita
Shota Horiguchi
Naoyuki Kanda
Takuya Yoshioka
Neville Ryant

Papers citing "CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings"

50 / 58 papers shown
BERSting at the Screams: A Benchmark for Distanced, Emotional and Shouted Speech Recognition
Paige Tuttosi
Mantaj Dhillon
Luna Sang
Shane Eastwood
Poorvi Bhatia
Quang Minh Dinh
Avni Kapoor
Yewon Jin
Angelica Lim
24
0
0
30 Apr 2025
Microphone Array Signal Processing and Deep Learning for Speech Enhancement
Reinhold Haeb-Umbach
Tomohiro Nakatani
Marc Delcroix
Christoph Boeddeker
Tsubasa Ochiai
43
0
0
13 Jan 2025
Guided Speaker Embedding
Shota Horiguchi
Takafumi Moriya
Atsushi Ando
Takanori Ashihara
Hiroshi Sato
Naohiro Tawara
Marc Delcroix
45
0
0
03 Jan 2025
MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models
Thai-Binh Nguyen
Alexander Waibel
74
1
0
27 Nov 2024
SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios
Kai Li
Wendi Sang
Chang Zeng
Runxuan Yang
Guo Chen
Xiaolin Hu
26
2
0
02 Oct 2024
Advancing Multi-talker ASR Performance with Large Language Models
Mohan Shi
Zengrui Jin
Yaoxun Xu
Yong Xu
Shi-Xiong Zhang
Kun Wei
Yiwen Shao
Chunlei Zhang
Dong Yu
29
1
0
30 Aug 2024
The CHiME-8 DASR Challenge for Generalizable and Array Agnostic Distant Automatic Speech Recognition and Diarization
Samuele Cornell
Taejin Park
Steve Huang
Christoph Boeddeker
Xuankai Chang
Matthew Maciejewski
Matthew Wiesner
Paola García
Shinji Watanabe
31
9
0
23 Jul 2024
LLM-based speaker diarization correction: A generalizable approach
Georgios Efstathiadis
Vijay Yadav
Anzar Abbas
43
3
0
07 Jun 2024
SA-SOT: Speaker-Aware Serialized Output Training for Multi-Talker ASR
Zhiyun Fan
Linhao Dong
Jun Zhang
Lu Lu
Zejun Ma
41
4
0
04 Mar 2024
On Speaker Attribution with SURT
Desh Raj
Matthew Wiesner
Matthew Maciejewski
Leibny Paola García-Perera
Daniel Povey
Sanjeev Khudanpur
24
3
0
28 Jan 2024
Lattice Rescoring Based on Large Ensemble of Complementary Neural Language Models
A. Ogawa
Naohiro Tawara
Marc Delcroix
S. Araki
27
3
0
20 Dec 2023
The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System
T. Park
He Huang
Ante Jukić
Kunal Dhawan
Krishna C. Puvvada
Nithin Rao Koluguri
Nikolay Karpov
A. Laptev
Jagadeesh Balam
Boris Ginsburg
19
6
0
18 Oct 2023
Meeting Recognition with Continuous Speech Separation and Transcription-Supported Diarization
Thilo von Neumann
Christoph Boeddeker
Tobias Cord-Landwehr
Marc Delcroix
Reinhold Haeb-Umbach
21
7
0
28 Sep 2023
PP-MeT: a Real-world Personalized Prompt based Meeting Transcription System
Xiang Lyu
Yuhang Cao
Qing Wang
Jingjing Yin
Yuguang Yang
Pengpeng Zou
G. Zachmann
Heng Lu
VLM
26
3
0
28 Sep 2023
Integrating Emotion Recognition with Speech Recognition and Speaker Diarisation for Conversations
Wen Wu
C. Zhang
P. Woodland
29
3
0
14 Aug 2023
MeetEval: A Toolkit for Computation of Word Error Rates for Meeting Transcription Systems
Thilo von Neumann
Christoph Boeddeker
Marc Delcroix
Reinhold Haeb-Umbach
13
16
0
21 Jul 2023
Mixture Encoder for Joint Speech Separation and Recognition
Simon Berger
Peter Vieting
Christoph Boeddeker
Ralf Schlüter
Reinhold Haeb-Umbach
16
6
0
21 Jun 2023
STHG: Spatial-Temporal Heterogeneous Graph Learning for Advanced Audio-Visual Diarization
Kyle Min
29
5
0
18 Jun 2023
SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition
Desh Raj
Daniel Povey
Sanjeev Khudanpur
VLM
26
9
0
18 Jun 2023
An Experimental Review of Speaker Diarization methods with application to Two-Speaker Conversational Telephone Speech recordings
L. Serafini
Samuele Cornell
Giovanni Morrone
Enrico Zovato
A. Brutti
S. Squartini
37
9
0
29 May 2023
BA-SOT: Boundary-Aware Serialized Output Training for Multi-Talker ASR
Yuhao Liang
Fan Yu
Yangze Li
Pengcheng Guo
Shiliang Zhang
Qian Chen
Linfu Xie
27
7
0
23 May 2023
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement
Wei-Ning Hsu
Tal Remez
Bowen Shi
Jacob Donley
Yossi Adi
DiffM
27
11
0
21 Dec 2022
GPU-accelerated Guided Source Separation for Meeting Transcription
Desh Raj
Daniel Povey
Sanjeev Khudanpur
13
34
0
10 Dec 2022
Better Transcription of UK Supreme Court Hearings
Hadeel Saadany
C. Breslin
Constantin Orasan
Sophie Walker
AILaw
11
6
0
29 Nov 2022
On Word Error Rate Definitions and their Efficient Computation for Multi-Speaker Speech Recognition Systems
Thilo von Neumann
Christoph Boeddeker
K. Kinoshita
Marc Delcroix
Reinhold Haeb-Umbach
16
19
0
29 Nov 2022
DiaCorrect: End-to-end error correction for speaker diarization
Jiangyu Han
Yuhang Cao
Heng Lu
Yanhua Long
37
0
0
31 Oct 2022
G-Augment: Searching for the Meta-Structure of Data Augmentation Policies for ASR
Gary Wang
Ekin D. Cubuk
Andrew Rosenberg
Shuyang Cheng
Ron J. Weiss
Bhuvana Ramabhadran
Pedro J. Moreno
Quoc V. Le
Daniel S. Park
20
1
0
19 Oct 2022
Joint Speech Activity and Overlap Detection with Multi-Exit Architecture
Ziqing Du
Kai Liu
Xucheng Wan
Huan Zhou
25
0
0
24 Sep 2022
VarArray Meets t-SOT: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition
Naoyuki Kanda
Jian Wu
Xiaofei Wang
Zhuo Chen
Jinyu Li
Takuya Yoshioka
19
16
0
12 Sep 2022
A Meeting Transcription System for an Ad-Hoc Acoustic Sensor Network
Tobias Gburrek
Christoph Boeddeker
Thilo von Neumann
Tobias Cord-Landwehr
Joerg Schmalenstroeer
Reinhold Haeb-Umbach
11
5
0
02 May 2022
Multichannel Speech Separation with Narrow-band Conformer
Changsheng Quan
Xiaofei Li
23
12
0
09 Apr 2022
An Initialization Scheme for Meeting Separation with Spatial Mixture Models
Christoph Boeddeker
Tobias Cord-Landwehr
Thilo von Neumann
Reinhold Haeb-Umbach
22
10
0
04 Apr 2022
EEND-SS: Joint End-to-End Neural Speaker Diarization and Speech Separation for Flexible Number of Speakers
Soumi Maiti
Yushi Ueda
Shinji Watanabe
Chunlei Zhang
Meng Yu
Shi-Xiong Zhang
Yong-mei Xu
31
32
0
31 Mar 2022
Generation of Speaker Representations Using Heterogeneous Training Batch Assembly
Yu-Huai Peng
Hung-Shin Lee
Pin-Tuan Huang
Hsin-Min Wang
11
0
0
30 Mar 2022
Multi-scale Speaker Diarization with Dynamic Scale Weighting
Tae Jin Park
Nithin Rao Koluguri
Jagadeesh Balam
Boris Ginsburg
16
19
0
30 Mar 2022
A Conformer Based Acoustic Model for Robust Automatic Speech Recognition
Yufeng Yang
Peidong Wang
DeLiang Wang
20
12
0
01 Mar 2022
Measuring the Impact of Individual Domain Factors in Self-Supervised Pre-Training
Ramon Sanabria
Wei-Ning Hsu
Alexei Baevski
Michael Auli
19
7
0
01 Mar 2022
Summary On The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge
Fan Yu
Shiliang Zhang
Pengcheng Guo
Yihui Fu
Zhihao Du
...
Kong Aik Lee
Zhijie Yan
B. Ma
Xin Xu
Hui Bu
13
28
0
08 Feb 2022
VoxSRC 2021: The Third VoxCeleb Speaker Recognition Challenge
A. Brown
Jaesung Huh
Joon Son Chung
Arsha Nagrani
Daniel Garcia-Romero
Andrew Zisserman
28
40
0
12 Jan 2022
Robust Self-Supervised Audio-Visual Speech Recognition
Bowen Shi
Wei-Ning Hsu
Abdel-rahman Mohamed
24
90
0
05 Jan 2022
SA-SDR: A novel loss function for separation of meeting style data
Thilo von Neumann
K. Kinoshita
Christoph Boeddeker
Marc Delcroix
Reinhold Haeb-Umbach
24
20
0
29 Oct 2021
Lhotse: a speech data representation library for the modern deep learning ecosystem
Piotr Żelasko
Daniel Povey
Jan "Yenda" Trmal
Sanjeev Khudanpur
AuLLM
AI4TS
17
31
0
25 Oct 2021
M2MeT: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge
Fan Yu
Shiliang Zhang
Yihui Fu
Lei Xie
Siqi Zheng
...
Pengcheng Guo
Zhijie Yan
B. Ma
Xin Xu
Hui Bu
8
104
0
14 Oct 2021
Multi-channel Narrow-band Deep Speech Separation with Full-band Permutation Invariant Training
Changsheng Quan
Xiaofei Li
30
14
0
12 Oct 2021
Graph-PIT: Generalized permutation invariant training for continuous separation of arbitrary numbers of speakers
Thilo von Neumann
K. Kinoshita
Christoph Boeddeker
Marc Delcroix
Reinhold Haeb-Umbach
8
23
0
30 Jul 2021
Speeding Up Permutation Invariant Training for Source Separation
Thilo von Neumann
Christoph Boeddeker
K. Kinoshita
Marc Delcroix
Reinhold Haeb-Umbach
14
6
0
30 Jul 2021
A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio
Naoyuki Kanda
Xiong Xiao
Jian Wu
Tianyan Zhou
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
16
13
0
06 Jul 2021
Encoder-Decoder Based Attractors for End-to-End Neural Diarization
Shota Horiguchi
Yusuke Fujita
Shinji Watanabe
Yawen Xue
Leibny Paola García-Perera
31
64
0
20 Jun 2021
Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: theory, implementation and analysis on standard tasks
Federico Landini
Jan Profant
Mireia Díez
L. Burget
216
198
0
29 Dec 2020
WPD++: An Improved Neural Beamformer for Simultaneous Speech Separation and Dereverberation
Zhaoheng Ni
Yong-mei Xu
Meng Yu
Bo Wu
Shi-Xiong Zhang
Dong Yu
Michael I. Mandel
12
8
0
18 Nov 2020