Recognizing Multi-talker Speech with Permutation Invariant Training


22 March 2017
Dong Yu
Xuankai Chang
Y. Qian

Papers citing "Recognizing Multi-talker Speech with Permutation Invariant Training"

21 / 21 papers shown
Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC
Jiawen Kang
Lingwei Meng
Mingyu Cui
Yuejiao Wang
Xixin Wu
Xunying Liu
Helen Meng
44
2
0
19 Sep 2024
Advancing Multi-talker ASR Performance with Large Language Models
Mohan Shi
Zengrui Jin
Yaoxun Xu
Yong Xu
Shi-Xiong Zhang
Kun Wei
Yiwen Shao
Chunlei Zhang
Dong Yu
31
1
0
30 Aug 2024
Speaker Mask Transformer for Multi-talker Overlapped Speech Recognition
Peng Shen
Xugang Lu
Hisashi Kawai
35
1
0
18 Dec 2023
SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition
Desh Raj
Daniel Povey
Sanjeev Khudanpur
VLM
34
9
0
18 Jun 2023
BA-SOT: Boundary-Aware Serialized Output Training for Multi-Talker ASR
Yuhao Liang
Fan Yu
Yangze Li
Pengcheng Guo
Shiliang Zhang
Qian Chen
Linfu Xie
33
8
0
23 May 2023
CASA-ASR: Context-Aware Speaker-Attributed ASR
Mohan Shi
Zhihao Du
Qian Chen
Fan Yu
Yangze Li
Shiliang Zhang
Jie Zhang
Lirong Dai
36
8
0
21 May 2023
High-resolution embedding extractor for speaker diarisation
Hee-Soo Heo
Youngki Kwon
Bong-Jin Lee
You Jin Kim
Jee-weon Jung
32
5
0
08 Nov 2022
VarArray Meets t-SOT: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition
Naoyuki Kanda
Jian Wu
Xiaofei Wang
Zhuo Chen
Jinyu Li
Takuya Yoshioka
29
16
0
12 Sep 2022
Summary On The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge
Fan Yu
Shiliang Zhang
Pengcheng Guo
Yihui Fu
Zhihao Du
...
Kong Aik Lee
Zhijie Yan
B. Ma
Xin Xu
Hui Bu
18
28
0
08 Feb 2022
The RoyalFlush System of Speech Recognition for M2MeT Challenge
Shuaishuai Ye
Peiyao Wang
Shunfei Chen
Xinhui Hu
Xinkang Xu
24
5
0
03 Feb 2022
Multi-turn RNN-T for streaming recognition of multi-party speech
Ilya Sklyar
A. Piunova
Xianrui Zheng
Yulan Liu
24
22
0
19 Dec 2021
Recent Advances in End-to-End Automatic Speech Recognition
Jinyu Li
VLM
37
363
0
02 Nov 2021
A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio
Naoyuki Kanda
Xiong Xiao
Jian Wu
Tianyan Zhou
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
19
14
0
06 Jul 2021
A Review of Speaker Diarization: Recent Advances with Deep Learning
Tae Jin Park
Naoyuki Kanda
Dimitrios Dimitriadis
Kyu Jeong Han
Shinji Watanabe
Shrikanth Narayanan
VLM
274
327
0
24 Jan 2021
Streaming end-to-end multi-talker speech recognition
Liang Lu
Naoyuki Kanda
Jinyu Li
Jiawei Liu
13
41
0
26 Nov 2020
An End-to-end Architecture of Online Multi-channel Speech Separation
Jian Wu
Zhuo Chen
Jinyu Li
Takuya Yoshioka
Zhili Tan
Ed Lin
Yi Luo
Lei Xie
3DV
19
21
0
07 Sep 2020
Serialized Output Training for End-to-End Overlapped Speech Recognition
Naoyuki Kanda
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Takuya Yoshioka
13
113
0
28 Mar 2020
End-to-End Multi-speaker Speech Recognition with Transformer
Xuankai Chang
Wangyou Zhang
Y. Qian
Jonathan Le Roux
Shinji Watanabe
ViT
27
103
0
10 Feb 2020
End-to-end training of time domain audio separation and recognition
Thilo von Neumann
K. Kinoshita
Lukas Drude
Christoph Boeddeker
Marc Delcroix
Tomohiro Nakatani
Reinhold Haeb-Umbach
25
34
0
18 Dec 2019
Analysis of Deep Clustering as Preprocessing for Automatic Speech Recognition of Sparsely Overlapping Speech
T. Menne
Ilya Sklyar
Ralf Schluter
Hermann Ney
27
35
0
09 May 2019
Deep Extractor Network for Target Speaker Recovery From Single Channel Speech Mixtures
Jun Wang
Jie Chen
Dan Su
Lianwu Chen
Meng Yu
Y. Qian
Dong Yu
46
90
0
24 Jul 2018