ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2102.01547
  4. Cited By
WeNet: Production oriented Streaming and Non-streaming End-to-End Speech
  Recognition Toolkit

WeNet: Production oriented Streaming and Non-streaming End-to-End Speech Recognition Toolkit

2 February 2021
Zhuoyuan Yao
Di Wu
Xiong Wang
Binbin Zhang
Fan Yu
Chao Yang
Zhendong Peng
Xiaoyu Chen
Lei Xie
X. Lei
ArXivPDFHTML

Papers citing "WeNet: Production oriented Streaming and Non-streaming End-to-End Speech Recognition Toolkit"

50 / 54 papers shown
Title
CS-Dialogue: A 104-Hour Dataset of Spontaneous Mandarin-English Code-Switching Dialogues for Speech Recognition
CS-Dialogue: A 104-Hour Dataset of Spontaneous Mandarin-English Code-Switching Dialogues for Speech Recognition
Jiaming Zhou
Yujie Guo
Songtao Zhao
Haoqin Sun
Hui Wang
...
Shiyao Wang
Xi Yang
Yibo Wang
Yonghua Lin
Yong Qin
51
0
0
26 Feb 2025
SegAug: CTC-Aligned Segmented Augmentation For Robust RNN-Transducer Based Speech Recognition
SegAug: CTC-Aligned Segmented Augmentation For Robust RNN-Transducer Based Speech Recognition
Khanh Le
Tuan Vu Ho
Dung Tran
Duc Thanh Chau
59
0
0
20 Feb 2025
CR-CTC: Consistency regularization on CTC for improved speech recognition
CR-CTC: Consistency regularization on CTC for improved speech recognition
Zengwei Yao
Wei Kang
Xiaoyu Yang
Fangjun Kuang
Liyong Guo
Han Zhu
Zengrui Jin
Zhaoqing Li
Long Lin
Daniel Povey
56
0
0
17 Feb 2025
Improving Zero-Shot Chinese-English Code-Switching ASR with kNN-CTC and Gated Monolingual Datastores
Improving Zero-Shot Chinese-English Code-Switching ASR with kNN-CTC and Gated Monolingual Datastores
Jiaming Zhou
Songtao Zhao
Hui Wang
Tian-Hao Zhang
Haoqin Sun
Xuechen Wang
Yong Qin
166
3
0
20 Jan 2025
Benchmarking Rotary Position Embeddings for Automatic Speech Recognition
Benchmarking Rotary Position Embeddings for Automatic Speech Recognition
Shucong Zhang
Titouan Parcollet
Rogier van Dalen
Sourav Bhattacharya
51
0
0
10 Jan 2025
LUPET: Incorporating Hierarchical Information Path into Multilingual ASR
LUPET: Incorporating Hierarchical Information Path into Multilingual ASR
Wei Liu
Jingyong Hou
Dong Yang
Muyong Cao
Tan Lee
80
1
0
10 Jan 2025
DCIM-AVSR : Efficient Audio-Visual Speech Recognition via Dual Conformer Interaction Module
DCIM-AVSR : Efficient Audio-Visual Speech Recognition via Dual Conformer Interaction Module
Xinyu Wang
Qian Wang
Haolin Huang
Yu Fang
Mengjie Xu
Qian Wang
31
0
0
31 Aug 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep
  Speaker Representation Learning
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
Shuai Wang
Zheng-Shou Chen
Kong Aik Lee
Yan-min Qian
Haizhou Li
39
4
0
21 Jul 2024
Towards Effective and Efficient Non-autoregressive Decoding Using
  Block-based Attention Mask
Towards Effective and Efficient Non-autoregressive Decoding Using Block-based Attention Mask
Tianzi Wang
Xurong Xie
Zhaoqing Li
Shoukang Hu
Zengrui Jin
...
Shujie Hu
Mengzhe Geng
Guinan Li
Helen Meng
Xunying Liu
34
0
0
14 Jun 2024
Vec-Tok-VC+: Residual-enhanced Robust Zero-shot Voice Conversion with
  Progressive Constraints in a Dual-mode Training Strategy
Vec-Tok-VC+: Residual-enhanced Robust Zero-shot Voice Conversion with Progressive Constraints in a Dual-mode Training Strategy
Linhan Ma
Xinfa Zhu
Yuanjun Lv
Zhichao Wang
Ziqian Wang
Wendi He
Hongbin Zhou
Lei Xie
42
2
0
14 Jun 2024
Enhancing Voice Wake-Up for Dysarthria: Mandarin Dysarthria Speech
  Corpus Release and Customized System Design
Enhancing Voice Wake-Up for Dysarthria: Mandarin Dysarthria Speech Corpus Release and Customized System Design
Ming Gao
Hang Chen
Jun Du
Xin Xu
Hongxiao Guo
Hui Bu
Jianxing Yang
Ming Li
Chin-Hui Lee
42
2
0
14 Jun 2024
ONNXPruner: ONNX-Based General Model Pruning Adapter
ONNXPruner: ONNX-Based General Model Pruning Adapter
Dongdong Ren
Wenbin Li
Tianyu Ding
Lei Wang
Qi Fan
Jing Huo
Hongbing Pan
Yang Gao
41
3
0
10 Apr 2024
StreamVoice: Streamable Context-Aware Language Modeling for Real-time
  Zero-Shot Voice Conversion
StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion
Zhichao Wang
Yuan-Jui Chen
Xinsheng Wang
Lei Xie
Yuping Wang
29
6
0
19 Jan 2024
MERBench: A Unified Evaluation Benchmark for Multimodal Emotion
  Recognition
MERBench: A Unified Evaluation Benchmark for Multimodal Emotion Recognition
Zheng Lian
Guoying Zhao
Yong Ren
Hao Gu
Haiyang Sun
Lan Chen
Bin Liu
Jianhua Tao
21
12
0
07 Jan 2024
Accent-VITS:accent transfer for end-to-end TTS
Accent-VITS:accent transfer for end-to-end TTS
Linhan Ma
Yongmao Zhang
Xinfa Zhu
Yinjiao Lei
Ziqian Ning
Pengcheng Zhu
Lei Xie
27
7
0
28 Dec 2023
Stateful Conformer with Cache-based Inference for Streaming Automatic
  Speech Recognition
Stateful Conformer with Cache-based Inference for Streaming Automatic Speech Recognition
Vahid Noroozi
Somshubra Majumdar
Ankur Kumar
Jagadeesh Balam
Boris Ginsburg
30
10
0
27 Dec 2023
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
Xueyao Zhang
Liumeng Xue
Yicheng Gu
Yuancheng Wang
Haorui He
...
Mingxuan Wang
Jun Han
Kai Chen
Haizhou Li
Zhizheng Wu
29
28
0
15 Dec 2023
SponTTS: modeling and transferring spontaneous style for TTS
SponTTS: modeling and transferring spontaneous style for TTS
Hanzhao Li
Xinfa Zhu
Liumeng Xue
Yang Song
Yunlin Chen
Lei Xie
35
7
0
13 Nov 2023
CDSD: Chinese Dysarthria Speech Database
CDSD: Chinese Dysarthria Speech Database
Mengyi Sun
Ming Gao
Xinchen Kang
Shiru Wang
Jun Du
Dengfeng Yao
Su-Jing Wang
33
3
0
24 Oct 2023
Personalization of CTC-based End-to-End Speech Recognition Using
  Pronunciation-Driven Subword Tokenization
Personalization of CTC-based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization
Zhihong Lei
Ernest Pusateri
Shiyi Han
Leo Liu
Mingbin Xu
...
R. Travadi
Youyuan Zhang
Mirko Hannemann
Man-Hung Siu
Zhen Huang
23
9
0
16 Oct 2023
LAE-ST-MoE: Boosted Language-Aware Encoder Using Speech Translation
  Auxiliary Task for E2E Code-switching ASR
LAE-ST-MoE: Boosted Language-Aware Encoder Using Speech Translation Auxiliary Task for E2E Code-switching ASR
Guodong Ma
Wenxuan Wang
Yuke Li
Yuting Yang
Binbin Du
Haoran Fu
31
5
0
28 Sep 2023
Semi-Autoregressive Streaming ASR With Label Context
Semi-Autoregressive Streaming ASR With Label Context
Siddhant Arora
G. Saon
Shinji Watanabe
Brian Kingsbury
AI4TS
23
5
0
19 Sep 2023
ApproBiVT: Lead ASR Models to Generalize Better Using Approximated
  Bias-Variance Tradeoff Guided Early Stopping and Checkpoint Averaging
ApproBiVT: Lead ASR Models to Generalize Better Using Approximated Bias-Variance Tradeoff Guided Early Stopping and Checkpoint Averaging
Fangyuan Wang
Ming Hao
Yuhai Shi
Bo Xu
MoMe
21
0
0
05 Aug 2023
CIF-T: A Novel CIF-based Transducer Architecture for Automatic Speech
  Recognition
CIF-T: A Novel CIF-based Transducer Architecture for Automatic Speech Recognition
Tian-Hao Zhang
Dinghao Zhou
Guiping Zhong
Jiaming Zhou
Baoxiang Li
20
3
0
26 Jul 2023
Pseudo-Siamese Network based Timbre-reserved Black-box Adversarial
  Attack in Speaker Identification
Pseudo-Siamese Network based Timbre-reserved Black-box Adversarial Attack in Speaker Identification
Qing Wang
Jixun Yao
Ziqian Wang
Pengcheng Guo
Linfu Xie
AAML
29
1
0
30 May 2023
DualVC: Dual-mode Voice Conversion using Intra-model Knowledge
  Distillation and Hybrid Predictive Coding
DualVC: Dual-mode Voice Conversion using Intra-model Knowledge Distillation and Hybrid Predictive Coding
Ziqian Ning
Yuepeng Jiang
Pengcheng Zhu
Jixun Yao
Shuai Wang
Linfu Xie
Mengxiao Bi
34
10
0
21 May 2023
FunASR: A Fundamental End-to-End Speech Recognition Toolkit
FunASR: A Fundamental End-to-End Speech Recognition Toolkit
Zhifu Gao
Zerui Li
Jiaming Wang
Haoneng Luo
Xian Shi
...
Yabin Li
Lingyun Zuo
Zhihao Du
Zhangyu Xiao
Shiliang Zhang
37
54
0
18 May 2023
Dynamic Chunk Convolution for Unified Streaming and Non-Streaming
  Conformer ASR
Dynamic Chunk Convolution for Unified Streaming and Non-Streaming Conformer ASR
Xilai Li
Goeric Huybrechts
S. Ronanki
Jeffrey J. Farris
S. Bodapati
38
6
0
18 Apr 2023
MER 2023: Multi-label Learning, Modality Robustness, and Semi-Supervised
  Learning
MER 2023: Multi-label Learning, Modality Robustness, and Semi-Supervised Learning
Zheng Lian
Haiyang Sun
Guoying Zhao
Kang Chen
Mingyu Xu
...
Meng Wang
Min Zhang
Guoying Zhao
Björn W. Schuller
Jianhua Tao
40
48
0
18 Apr 2023
Pyramid Multi-branch Fusion DCNN with Multi-Head Self-Attention for
  Mandarin Speech Recognition
Pyramid Multi-branch Fusion DCNN with Multi-Head Self-Attention for Mandarin Speech Recognition
Kai Liu
Hailiang Xiong
Gangqiang Yang
Zhengfeng Du
Yewen Cao
D. Shah
18
0
0
23 Mar 2023
Expressive-VC: Highly Expressive Voice Conversion with Attention Fusion
  of Bottleneck and Perturbation Features
Expressive-VC: Highly Expressive Voice Conversion with Attention Fusion of Bottleneck and Perturbation Features
Ziqian Ning
Qicong Xie
Pengcheng Zhu
Zhichao Wang
Liumeng Xue
Jixun Yao
Linfu Xie
Mengxiao Bi
27
16
0
09 Nov 2022
Distinguishable Speaker Anonymization based on Formant and Fundamental
  Frequency Scaling
Distinguishable Speaker Anonymization based on Formant and Fundamental Frequency Scaling
Jixun Yao
Qing Wang
Yi Lei
Pengcheng Guo
Linfu Xie
Namin Wang
Jie Liu
38
13
0
06 Nov 2022
The ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge
  (ICSRC): Dataset, Tracks, Baseline and Results
The ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge (ICSRC): Dataset, Tracks, Baseline and Results
Ao Zhang
F. Yu
Kaixun Huang
Linfu Xie
Longbiao Wang
E. Chng
Hui Bu
Binbin Zhang
Wei Chen
Xin Xu
32
4
0
03 Nov 2022
Variable Attention Masking for Configurable Transformer Transducer
  Speech Recognition
Variable Attention Masking for Configurable Transformer Transducer Speech Recognition
P. Swietojanski
Stefan Braun
Dogan Can
Thiago Fraga da Silva
Arnab Ghoshal
...
Henry Mason
Erik McDermott
Honza Silovsky
R. Travadi
Xiaodan Zhuang
40
13
0
02 Nov 2022
Wespeaker: A Research and Production oriented Speaker Embedding Learning
  Toolkit
Wespeaker: A Research and Production oriented Speaker Embedding Learning Toolkit
Hongji Wang
Che-Yuan Liang
Shuai Wang
Zhengyang Chen
Binbin Zhang
Xu Xiang
Yan Deng
Y. Qian
32
117
0
31 Oct 2022
The NPU-ASLP System for The ISCSLP 2022 Magichub Code-Swiching ASR
  Challenge
The NPU-ASLP System for The ISCSLP 2022 Magichub Code-Swiching ASR Challenge
Yuhao Liang
Pei-Ning Chen
F. Yu
Xinfa Zhu
Tianyi Xu
Linfu Xie
28
0
0
26 Oct 2022
Towards Personalization of CTC Speech Recognition Models with Contextual
  Adapters and Adaptive Boosting
Towards Personalization of CTC Speech Recognition Models with Contextual Adapters and Adaptive Boosting
Saket Dingliwal
Monica Sunkara
S. Bodapati
S. Ronanki
Jeffrey J. Farris
Katrin Kirchhoff
33
0
0
18 Oct 2022
A Policy-based Approach to the SpecAugment Method for Low Resource E2E
  ASR
A Policy-based Approach to the SpecAugment Method for Low Resource E2E ASR
Rui Li
Guodong Ma
Dexin Zhao
Ranran Zeng
Xiaoyu Li
Haolin Huang
29
2
0
16 Oct 2022
NWPU-ASLP System for the VoicePrivacy 2022 Challenge
NWPU-ASLP System for the VoicePrivacy 2022 Challenge
Jixun Yao
Qing Wang
Li Zhang
Pengcheng Guo
Yuhao Liang
Linfu Xie
PICV
26
16
0
24 Sep 2022
Learning a Dual-Mode Speech Recognition Model via Self-Pruning
Learning a Dual-Mode Speech Recognition Model via Self-Pruning
Chunxi Liu
Yuan Shangguan
Haichuan Yang
Yangyang Shi
Raghuraman Krishnamoorthi
Ozlem Kalinli
SSL
29
7
0
25 Jul 2022
Improving Mandarin Speech Recogntion with Block-augmented Transformer
Improving Mandarin Speech Recogntion with Block-augmented Transformer
Xiaoming Ren
Huifeng Zhu
Liuwei Wei
Minghui Wu
Jie Hao
38
9
0
24 Jul 2022
Branchformer: Parallel MLP-Attention Architectures to Capture Local and
  Global Context for Speech Recognition and Understanding
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
Yifan Peng
Siddharth Dalmia
Ian Lane
Shinji Watanabe
30
143
0
06 Jul 2022
TALCS: An Open-Source Mandarin-English Code-Switching Corpus and a
  Speech Recognition Baseline
TALCS: An Open-Source Mandarin-English Code-Switching Corpus and a Speech Recognition Baseline
Chengfei Li
Shuhao Deng
Yaoping Wang
Guangjing Wang
Y. Gong
Changbin Chen
Jinfeng Bai
33
16
0
27 Jun 2022
End-to-End Voice Conversion with Information Perturbation
End-to-End Voice Conversion with Information Perturbation
Qicong Xie
Shan Yang
Yinjiao Lei
Linfu Xie
Dan Su
38
7
0
15 Jun 2022
AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation
AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation
Kun Song
Heyang Xue
Xinsheng Wang
Jian Cong
Yongmao Zhang
Linfu Xie
Bing Yang
Xiong Zhang
Dan Su
19
5
0
01 Jun 2022
Multi-Level Modeling Units for End-to-End Mandarin Speech Recognition
Multi-Level Modeling Units for End-to-End Mandarin Speech Recognition
Yuting Yang
Binbin Du
Yuke Li
26
1
0
24 May 2022
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit
Hui Zhang
Tian Yuan
Junkun Chen
Xintong Li
Renjie Zheng
...
Zeyu Chen
Xiaoguang Hu
Dianhai Yu
Yanjun Ma
Liang Huang
AuLLM
31
24
0
20 May 2022
CTA-RNN: Channel and Temporal-wise Attention RNN Leveraging Pre-trained
  ASR Embeddings for Speech Emotion Recognition
CTA-RNN: Channel and Temporal-wise Attention RNN Leveraging Pre-trained ASR Embeddings for Speech Emotion Recognition
Chengxin Chen
Pengyuan Zhang
AI4TS
16
10
0
31 Mar 2022
Integrating Lattice-Free MMI into End-to-End Speech Recognition
Integrating Lattice-Free MMI into End-to-End Speech Recognition
Jinchuan Tian
Jianwei Yu
Chao Weng
Yuexian Zou
Dong Yu
35
8
0
29 Mar 2022
WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit
WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit
Binbin Zhang
Di Wu
Zhendong Peng
Xingcheng Song
Zhuoyuan Yao
Hang Lv
Linfu Xie
Chao Yang
Fuping Pan
Jianwei Niu
VLM
29
94
0
29 Mar 2022
12
Next