End-to-End Monaural Multi-speaker ASR System without Pretraining

5 November 2018

Papers citing "End-to-End Monaural Multi-speaker ASR System without Pretraining"

SA-SOT: Speaker-Aware Serialized Output Training for Multi-Talker ASR Zhiyun Fan Linhao Dong Jun Zhang Lu Lu Zejun Ma 43 5 0 04 Mar 2024
Conformer-based Target-Speaker Automatic Speech Recognition for Single-Channel Audio Yang Zhang Krishna C. Puvvada Vitaly Lavrukhin Boris Ginsburg 38 14 0 09 Aug 2023
Mixture Encoder for Joint Speech Separation and Recognition Simon Berger Peter Vieting Christoph Boeddeker Ralf Schluter Reinhold Haeb-Umbach 24 6 0 21 Jun 2023
SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition Desh Raj Daniel Povey Sanjeev Khudanpur 34 9 0 18 Jun 2023
BA-SOT: Boundary-Aware Serialized Output Training for Multi-Talker ASR Yuhao Liang Fan Yu Yangze Li Pengcheng Guo Shiliang Zhang Qian Chen Linfu Xie 30 8 0 23 May 2023
VarArray Meets t-SOT: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition Naoyuki Kanda Jian Wu Xiaofei Wang Zhuo Chen Jinyu Li Takuya Yoshioka 29 16 0 12 Sep 2022
Multi-turn RNN-T for streaming recognition of multi-party speech Ilya Sklyar A. Piunova Xianrui Zheng Yulan Liu 24 22 0 19 Dec 2021
A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio Naoyuki Kanda Xiong Xiao Jian Wu Tianyan Zhou Yashesh Gaur Xiaofei Wang Zhong Meng Zhuo Chen Takuya Yoshioka 19 14 0 06 Jul 2021
A Review of Speaker Diarization: Recent Advances with Deep Learning Tae Jin Park Naoyuki Kanda Dimitrios Dimitriadis Kyu Jeong Han Shinji Watanabe Shrikanth Narayanan 274 327 0 24 Jan 2021
Streaming end-to-end multi-talker speech recognition Liang Lu Naoyuki Kanda Jinyu Li Jiawei Liu 13 41 0 26 Nov 2020
On End-to-end Multi-channel Time Domain Speech Separation in Reverberant Environments Jisi Zhang Catalin Zorila R. Doddipatla Jon Barker 14 46 0 11 Nov 2020
An End-to-end Architecture of Online Multi-channel Speech Separation Jian Wu Zhuo Chen Jinyu Li Takuya Yoshioka Zhili Tan Ed Lin Yi Luo Lei Xie 19 21 0 07 Sep 2020
Serialized Output Training for End-to-End Overlapped Speech Recognition Naoyuki Kanda Yashesh Gaur Xiaofei Wang Zhong Meng Takuya Yoshioka 8 113 0 28 Mar 2020
End-to-End Multi-speaker Speech Recognition with Transformer Xuankai Chang Wangyou Zhang Y. Qian Jonathan Le Roux Shinji Watanabe 25 103 0 10 Feb 2020
End-to-end training of time domain audio separation and recognition Thilo von Neumann K. Kinoshita Lukas Drude Christoph Boeddeker Marc Delcroix Tomohiro Nakatani Reinhold Haeb-Umbach 25 34 0 18 Dec 2019
WHAMR!: Noisy and Reverberant Single-Channel Speech Separation Matthew Maciejewski Gordon Wichern E. McQuinn Jonathan Le Roux 22 180 0 22 Oct 2019
Analysis of Deep Clustering as Preprocessing for Automatic Speech Recognition of Sparsely Overlapping Speech T. Menne Ilya Sklyar Ralf Schluter Hermann Ney 24 35 0 09 May 2019