2409.00819
LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization
1 September 2024
Zengrui Jin
Yifan Yang
Mohan Shi
Wei Kang
Xiaoyu Yang
Zengwei Yao
Fangjun Kuang
Liyong Guo
Lingwei Meng
Long Lin
Yong Xu
Shi-Xiong Zhang
Daniel Povey
Papers citing
"LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization"
20 / 20 papers shown
Title
MossFormer2: Combining Transformer and RNN-Free Recurrent Network for Enhanced Time-Domain Monaural Speech Separation
Shengkui Zhao
Yukun Ma
Chongjia Ni
Chong Zhang
Hao Wang
Trung Hieu Nguyen
Kun Zhou
J. Yip
Dianwen Ng
Bin Ma
70
26
0
19 Dec 2023
Powerset multi-class cross entropy loss for neural speaker diarization
Alexis Plaquet
H. Bredin
149
110
0
19 Oct 2023
Semantic VAD: Low-Latency Voice Activity Detection for Speech Interaction
Mohan Shi
Yuchun Shu
Lingyun Zuo
Qiang Chen
Shiliang Zhang
Jie Zhang
Lirong Dai
VLM
53
3
0
21 May 2023
A Sidecar Separator Can Convert a Single-Talker Speech Recognition System to a Multi-Talker One
Lingwei Meng
Jiawen Kang
Mingyu Cui
Yuejiao Wang
Xixin Wu
Helen M. Meng
40
17
0
20 Feb 2023
Unified End-to-End Speech Recognition and Endpointing for Fast and Efficient Speech Systems
Shaan Bijwadia
Shuo-yiin Chang
Yue Liu
Tara N. Sainath
Chaoyang Zhang
Yanzhang He
70
8
0
01 Nov 2022
From Simulated Mixtures to Simulated Conversations as Training Data for End-to-End Neural Diarization
Federico Landini
Alicia Lozano-Diez
Mireia Díez
Lukáš Burget
48
37
0
02 Apr 2022
M2MeT: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge
Fan Yu
Shiliang Zhang
Yihui Fu
Lei Xie
Siqi Zheng
...
Pengcheng Guo
Zhijie Yan
B. Ma
Xin Xu
Hui Bu
53
119
0
14 Oct 2021
FAST-RIR: Fast neural diffuse room impulse response generator
Anton Ratnarajah
Shi-Xiong Zhang
Meng Yu
Zhenyu Tang
Tianyi Zhou
Dong Yu
50
56
0
07 Oct 2021
End-to-End Speaker-Attributed ASR with Transformer
Naoyuki Kanda
Guoli Ye
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
54
49
0
05 Apr 2021
A Review of Speaker Diarization: Recent Advances with Deep Learning
Tae Jin Park
Naoyuki Kanda
Dimitrios Dimitriadis
Kyu Jeong Han
Shinji Watanabe
Shrikanth Narayanan
VLM
326
335
0
24 Jan 2021
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
Alexei Baevski
Henry Zhou
Abdel-rahman Mohamed
Michael Auli
SSL
285
5,801
0
20 Jun 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
224
3,153
0
16 May 2020
Target-Speaker Voice Activity Detection: a Novel Approach for Multi-Speaker Diarization in a Dinner Party Scenario
Ivan Medennikov
M. Korenevsky
Tatiana Prisyach
Yuri Y. Khokhlov
Mariya Korenevskaya
...
Anton Mitrofanov
A. Andrusenko
Ivan Podluzhny
A. Laptev
A. Romanenko
45
203
0
14 May 2020
CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings
Shinji Watanabe
Michael I. Mandel
Jon Barker
Emmanuel Vincent
Ashish Arora
...
Shota Horiguchi
Naoyuki Kanda
Takuya Yoshioka
Neville Ryant
61
308
0
20 Apr 2020
An empirical study of Conv-TasNet
Berkan Kadıoğlu
Michael Horgan
Xiaoyu Liu
Jordi Pons
Dan Darcy
Vivek Kumar
40
44
0
20 Feb 2020
Transformer-based Acoustic Modeling for Hybrid Speech Recognition
Yongqiang Wang
Abdel-rahman Mohamed
Duc Le
Chunxi Liu
Alex Xiao
...
Xiaohui Zhang
Frank Zhang
Christian Fuegen
Geoffrey Zweig
M. Seltzer
52
249
0
22 Oct 2019
MIMO-SPEECH: End-to-End Multi-Channel Multi-Speaker Speech Recognition
Xuankai Chang
Wangyou Zhang
Y. Qian
Jonathan Le Roux
Shinji Watanabe
71
120
0
15 Oct 2019
WHAM!: Extending Speech Separation to Noisy Environments
Gordon Wichern
J. Antognini
Michael Flynn
Licheng Richard Zhu
E. McQuinn
Dwight Crow
Ethan Manilow
Jonathan Le Roux
82
351
0
02 Jul 2019
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation
Yi Luo
N. Mesgarani
159
1,794
0
20 Sep 2018
ESPnet: End-to-End Speech Processing Toolkit
Shinji Watanabe
Takaaki Hori
Shigeki Karita
Tomoki Hayashi
Jiro Nishitoba
...
Jahn Heymann
Sanjeev Khudanpur
Nanxin Chen
Adithya Renduchintala
Tsubasa Ochiai
VLM
109
1,509
0
30 Mar 2018