ResearchTrend.AI
Continuous speech separation: dataset and analysis


30 January 2020
Zhuo Chen
Takuya Yoshioka
Liang Lu
Tianyan Zhou
Zhong Meng
Yi Luo
Jian Wu
Xiong Xiao
Jinyu Li

Papers citing "Continuous speech separation: dataset and analysis"

47 / 47 papers shown
TS-SUPERB: A Target Speech Processing Benchmark for Speech Self-Supervised Learning Models
Junyi Peng
Takanori Ashihara
Marc Delcroix
Tsubasa Ochiai
Oldrich Plchot
Shoko Araki
J. Černocký
ELM
29
0
0
10 May 2025
USED: Universal Speaker Extraction and Diarization
Junyi Ao
Mehmet Sinan Yildirim
Ruijie Tao
Mengyao Ge
Shuai Wang
Yan-min Qian
Haizhou Li
41
6
0
17 Jan 2025
Microphone Array Signal Processing and Deep Learning for Speech Enhancement
Reinhold Haeb-Umbach
Tomohiro Nakatani
Marc Delcroix
Christoph Boeddeker
Tsubasa Ochiai
43
0
0
13 Jan 2025
MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models
Thai-Binh Nguyen
Alexander Waibel
82
1
0
27 Nov 2024
StoryTeller: Improving Long Video Description through Global Audio-Visual Character Identification
Yichen He
Yuan Lin
Jianchao Wu
Hanchong Zhang
Yuchen Zhang
Ruicheng Le
VGen
VLM
177
2
0
11 Nov 2024
SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios
Kai Li
Wendi Sang
Chang Zeng
Runxuan Yang
Guo Chen
Xiaolin Hu
34
2
0
02 Oct 2024
RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization
Bing Yang
Changsheng Quan
Yabo Wang
Pengyu Wang
Yujie Yang
Ying Fang
Nian Shao
Hui Bu
Xin Xu
Xiaofei Li
43
5
0
28 Jun 2024
Can Large Language Models Understand Spatial Audio?
Changli Tang
Wenyi Yu
Guangzhi Sun
Xianzhao Chen
Tian Tan
...
Jun Zhang
Lu Lu
Zejun Ma
Yuxuan Wang
Chao Zhang
49
4
0
12 Jun 2024
Meeting Recognition with Continuous Speech Separation and Transcription-Supported Diarization
Thilo von Neumann
Christoph Boeddeker
Tobias Cord-Landwehr
Marc Delcroix
Reinhold Haeb-Umbach
23
7
0
28 Sep 2023
MeetEval: A Toolkit for Computation of Word Error Rates for Meeting Transcription Systems
Thilo von Neumann
Christoph Boeddeker
Marc Delcroix
Reinhold Haeb-Umbach
29
16
0
21 Jul 2023
SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition
Desh Raj
Daniel Povey
Sanjeev Khudanpur
VLM
34
9
0
18 Jun 2023
An Experimental Review of Speaker Diarization methods with application to Two-Speaker Conversational Telephone Speech recordings
L. Serafini
Samuele Cornell
Giovanni Morrone
Enrico Zovato
Alessio Brutti
S. Squartini
47
9
0
29 May 2023
BA-SOT: Boundary-Aware Serialized Output Training for Multi-Talker ASR
Yuhao Liang
Fan Yu
Yangze Li
Pengcheng Guo
Shiliang Zhang
Qian Chen
Linfu Xie
30
8
0
23 May 2023
AfroDigits: A Community-Driven Spoken Digit Dataset for African Languages
Chris C. Emezue
Sanchit Gandhi
Lewis Tunstall
Abubakar Abid
Josh Meyer
...
Douwe Kiela
Yacine Jernite
Julien Chaumond
Merve Noyan
Omar Sanseviero
33
2
0
22 Mar 2023
Multi-resolution location-based training for multi-channel continuous speech separation
H. Taherian
DeLiang Wang
38
7
0
16 Jan 2023
GPU-accelerated Guided Source Separation for Meeting Transcription
Desh Raj
Daniel Povey
Sanjeev Khudanpur
26
35
0
10 Dec 2022
On Word Error Rate Definitions and their Efficient Computation for Multi-Speaker Speech Recognition Systems
Thilo von Neumann
Christoph Boeddeker
K. Kinoshita
Marc Delcroix
Reinhold Haeb-Umbach
37
19
0
29 Nov 2022
Self-Remixing: Unsupervised Speech Separation via Separation and Remixing
Kohei Saijo
Tetsuji Ogawa
SSL
22
11
0
18 Nov 2022
VarArray Meets t-SOT: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition
Naoyuki Kanda
Jian Wu
Xiaofei Wang
Zhuo Chen
Jinyu Li
Takuya Yoshioka
29
16
0
12 Sep 2022
A Meeting Transcription System for an Ad-Hoc Acoustic Sensor Network
Tobias Gburrek
Christoph Boeddeker
Thilo von Neumann
Tobias Cord-Landwehr
Joerg Schmalenstroeer
Reinhold Haeb-Umbach
11
5
0
02 May 2022
An Initialization Scheme for Meeting Separation with Spatial Mixture Models
Christoph Boeddeker
Tobias Cord-Landwehr
Thilo von Neumann
Reinhold Haeb-Umbach
30
10
0
04 Apr 2022
End-to-end multi-talker audio-visual ASR using an active speaker attention module
R. Rose
Olivier Siohan
13
3
0
01 Apr 2022
EEND-SS: Joint End-to-End Neural Speaker Diarization and Speech Separation for Flexible Number of Speakers
Soumi Maiti
Yushi Ueda
Shinji Watanabe
Chunlei Zhang
Meng Yu
Shi-Xiong Zhang
Yong-mei Xu
39
32
0
31 Mar 2022
Summary On The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge
Fan Yu
Shiliang Zhang
Pengcheng Guo
Yihui Fu
Zhihao Du
...
Kong Aik Lee
Zhijie Yan
B. Ma
Xin Xu
Hui Bu
18
28
0
08 Feb 2022
SkiM: Skipping Memory LSTM for Low-Latency Real-Time Continuous Speech Separation
Chenda Li
Lei Yang
Weiqin Wang
Y. Qian
32
25
0
26 Jan 2022
Multi-turn RNN-T for streaming recognition of multi-party speech
Ilya Sklyar
A. Piunova
Xianrui Zheng
Yulan Liu
24
22
0
19 Dec 2021
SA-SDR: A novel loss function for separation of meeting style data
Thilo von Neumann
K. Kinoshita
Christoph Boeddeker
Marc Delcroix
Reinhold Haeb-Umbach
29
20
0
29 Oct 2021
Continuous Speech Separation with Recurrent Selective Attention Network
Yixuan Zhang
Zhuo Chen
Jian Wu
Takuya Yoshioka
Peidong Wang
Zhong Meng
Jinyu Li
BDL
27
7
0
28 Oct 2021
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
121
1,715
0
26 Oct 2021
The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks
Darius Petermann
Gordon Wichern
Zhong-Qiu Wang
Jonathan Le Roux
23
37
0
19 Oct 2021
M2MeT: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge
Fan Yu
Shiliang Zhang
Yihui Fu
Lei Xie
Siqi Zheng
...
Pengcheng Guo
Zhijie Yan
B. Ma
Xin Xu
Hui Bu
8
106
0
14 Oct 2021
All-neural beamformer for continuous speech separation
Zhuohuang Zhang
Takuya Yoshioka
Naoyuki Kanda
Zhuo Chen
Xiaofei Wang
Dongmei Wang
Sefik Emre Eskimez
33
15
0
13 Oct 2021
VarArray: Array-Geometry-Agnostic Continuous Speech Separation
Takuya Yoshioka
Xiaofei Wang
Dongmei Wang
M. Tang
Zirun Zhu
Zhuo Chen
Naoyuki Kanda
17
37
0
12 Oct 2021
USEV: Universal Speaker Extraction with Visual Cue
Zexu Pan
Meng Ge
Haizhou Li
34
41
0
30 Sep 2021
Graph-PIT: Generalized permutation invariant training for continuous separation of arbitrary numbers of speakers
Thilo von Neumann
K. Kinoshita
Christoph Boeddeker
Marc Delcroix
Reinhold Haeb-Umbach
28
23
0
30 Jul 2021
Speeding Up Permutation Invariant Training for Source Separation
Thilo von Neumann
Christoph Boeddeker
K. Kinoshita
Marc Delcroix
Reinhold Haeb-Umbach
16
6
0
30 Jul 2021
A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio
Naoyuki Kanda
Xiong Xiao
Jian Wu
Tianyan Zhou
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
19
14
0
06 Jul 2021
Encoder-Decoder Based Attractors for End-to-End Neural Diarization
Shota Horiguchi
Yusuke Fujita
Shinji Watanabe
Yawen Xue
Leibny Paola García-Perera
37
64
0
20 Jun 2021
End-to-End Diarization for Variable Number of Speakers with Local-Global Networks and Discriminative Speaker Embeddings
Soumi Maiti
Hakan Erdogan
K. Wilson
Scott Wisdom
Shinji Watanabe
J. Hershey
27
21
0
05 May 2021
Configurable Privacy-Preserving Automatic Speech Recognition
Ranya Aloufi
Hamed Haddadi
David E. Boyle
25
10
0
01 Apr 2021
A Review of Speaker Diarization: Recent Advances with Deep Learning
Tae Jin Park
Naoyuki Kanda
Dimitrios Dimitriadis
Kyu Jeong Han
Shinji Watanabe
Shrikanth Narayanan
VLM
274
326
0
24 Jan 2021
Streaming end-to-end multi-talker speech recognition
Liang Lu
Naoyuki Kanda
Jinyu Li
Jiawei Liu
13
41
0
26 Nov 2020
Multi-class Spectral Clustering with Overlaps for Speaker Diarization
Desh Raj
Zili Huang
Sanjeev Khudanpur
31
30
0
05 Nov 2020
BW-EDA-EEND: Streaming End-to-End Neural Speaker Diarization for a Variable Number of Speakers
Eunjung Han
Chul Lee
A. Stolcke
24
42
0
05 Nov 2020
Microsoft Speaker Diarization System for the VoxCeleb Speaker Recognition Challenge 2020
Xiong Xiao
Naoyuki Kanda
Zhuo Chen
Tianyan Zhou
Takuya Yoshioka
...
Yu-Huan Wu
Jian Wu
Shujie Liu
Jinyu Li
Jiawei Liu
27
62
0
22 Oct 2020
Multi-microphone Complex Spectral Mapping for Utterance-wise and Continuous Speech Separation
Zhong-Qiu Wang
Peidong Wang
DeLiang Wang
33
88
0
04 Oct 2020
Speaker-Conditional Chain Model for Speech Separation and Extraction
Jing Shi
Jiaming Xu
Yusuke Fujita
Shinji Watanabe
Bo Xu
BDL
43
20
0
25 Jun 2020