Zipformer: A faster and better encoder for automatic speech recognition

17 October 2023

Zengwei Yao, Liyong Guo, Xiaoyu Yang, Wei Kang, Zengrui Jin, Long Lin

Papers citing "Zipformer: A faster and better encoder for automatic speech recognition"

CR-CTC: Consistency regularization on CTC for improved speech recognition Zengwei Yao Wei Kang Xiaoyu Yang Fangjun Kuang Liyong Guo Han Zhu Zengrui Jin Zhaoqing Li Long Lin Daniel Povey 53 0 0 17 Feb 2025
End-to-End Target Speaker Speech Recognition Using Context-Aware Attention Mechanisms for Challenging Enrollment Scenario Mohsen Ghane Mohammad Sadegh Safari 73 0 0 28 Jan 2025
HAINAN: Fast and Accurate Transducer for Hybrid-Autoregressive ASR Hainan Xu Travis M. Bartley Vladimir Bataev Boris Ginsburg 157 0 0 03 Oct 2024
MT2KD: Towards A General-Purpose Encoder for Speech, Speaker, and Audio Events Xiaoyu Yang Qiujia Li Chao Zhang P. Woodland 24 0 0 25 Sep 2024
LM-assisted keyword biasing with Aho-Corasick algorithm for Transducer-based ASR Iuliia Thorbecke Juan Zuluaga-Gomez Esaú Villatoro-Tello Andres Carofilis Shashi Kumar P. Motlícek Karthik Pandia A. Ganapathiraju 32 0 0 20 Sep 2024
Advancing Multi-talker ASR Performance with Large Language Models Mohan Shi Zengrui Jin Yaoxun Xu Yong Xu Shi-Xiong Zhang Kun Wei Yiwen Shao Chunlei Zhang Dong Yu 29 1 0 30 Aug 2024
Improving Neural Biasing for Contextual Speech Recognition by Early Context Injection and Text Perturbation Ruizhe Huang M. Yarmohammadi Sanjeev Khudanpur Dan Povey 35 2 0 14 Jul 2024
EfficientASR: Speech Recognition Network Compression via Attention Redundancy and Chunk-Level FFN Optimization Jianzong Wang Ziqi Liang Xulong Zhang Ning Cheng Jing Xiao 38 0 0 30 Apr 2024
On Speaker Attribution with SURT Desh Raj Matthew Wiesner Matthew Maciejewski Leibny Paola García-Perera Daniel Povey Sanjeev Khudanpur 32 3 0 28 Jan 2024
VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech Chenpeng Du Yiwei Guo Hankun Wang Yifan Yang Zhikang Niu Shuai Wang Hui Zhang Xie Chen Kai Yu VLM 24 25 0 25 Jan 2024
Fast and parallel decoding for transducer Wei Kang Liyong Guo Fangjun Kuang Long Lin Mingshuang Luo Zengwei Yao Xiaoyu Yang Piotr Żelasko Daniel Povey AI4TS 19 15 0 31 Oct 2022
Accelerating RNN-T Training and Inference Using CTC guidance Yongqiang Wang Zhehuai Chen Cheng-yong Zheng Yu Zhang Wei Han Parisa Haghani 34 23 0 29 Oct 2022
E-Branchformer: Branchformer with Enhanced merging for speech recognition Kwangyoun Kim Felix Wu Yifan Peng Jing Pan Prashant Sridhar Kyu Jeong Han Shinji Watanabe 55 105 0 30 Sep 2022
NeMo: a toolkit for building AI applications using Neural Modules Oleksii Kuchaiev Jason Chun Lok Li Huyen Nguyen Oleksii Hrinchuk Ryan Leary ... Jack Cook P. Castonguay Mariya Popova Jocelyn Huang Jonathan M. Cohen 211 292 0 14 Sep 2019
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications Andrew G. Howard Menglong Zhu Bo Chen Dmitry Kalenichenko Weijun Wang Tobias Weyand M. Andreetto Hartwig Adam 3DH 950 20,567 0 17 Apr 2017