2506.14434
Cited By
Unifying Streaming and Non-streaming Zipformer-based ASR
17 June 2025
Bidisha Sharma
Karthik Pandia Durai
Shankar Venkatesan
Jeena Prakash
Shashi Kumar
Malolan Chetlur
Andreas Stolcke
Papers citing
"Unifying Streaming and Non-streaming Zipformer-based ASR"
14 / 14 papers shown
Variable Attention Masking for Configurable Transformer Transducer Speech Recognition
P. Swietojanski
Stefan Braun
Dogan Can
Thiago Fraga da Silva
Arnab Ghoshal
...
Henry Mason
Erik McDermott
Honza Silovsky
R. Travadi
Xiaodan Zhuang
82
16
0
02 Nov 2022
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
254
1,896
0
26 Oct 2021
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
Wei-Ning Hsu
Benjamin Bolte
Yao-Hung Hubert Tsai
Kushal Lakhotia
Ruslan Salakhutdinov
Abdel-rahman Mohamed
SSL
180
2,989
0
14 Jun 2021
Developing Real-time Streaming Transformer Transducer for Speech Recognition on Large-scale Dataset
Xie Chen
Yu Wu
Zhenghao Wang
Shujie Liu
Jinyu Li
114
176
0
22 Oct 2020
Emformer: Efficient Memory Transformer Based Acoustic Model For Low Latency Streaming Speech Recognition
Yangyang Shi
Yongqiang Wang
Chunyang Wu
Ching-Feng Yeh
Julian Chan
Frank Zhang
Duc Le
M. Seltzer
133
172
0
21 Oct 2020
Transformer Transducer: One Model Unifying Streaming and Non-streaming Speech Recognition
Anshuman Tripathi
Jaeyoung Kim
Qian Zhang
Han Lu
Hasim Sak
50
43
0
07 Oct 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
227
3,153
0
16 May 2020
ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context
Wei Han
Zhengdong Zhang
Yu Zhang
Jiahui Yu
Chung-Cheng Chiu
James Qin
Anmol Gulati
Ruoming Pang
Yonghui Wu
68
264
0
07 May 2020
A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency
Tara N. Sainath
Yanzhang He
Yue Liu
A. Narayanan
Ruoming Pang
...
Trevor Strohman
Mirkó Visontai
Yonghui Wu
Yu Zhang
Ding Zhao
68
218
0
28 Mar 2020
Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss
Qian Zhang
Han Lu
Hasim Sak
Anshuman Tripathi
Erik McDermott
Stephen Koo
Shankar Kumar
88
481
0
07 Feb 2020
Jasper: An End-to-End Convolutional Neural Acoustic Model
Jason Li
Vitaly Lavrukhin
Boris Ginsburg
Ryan Leary
Oleksii Kuchaiev
Jonathan M. Cohen
Huyen Nguyen
R. Gadde
DRL
VLM
AuLLM
54
265
0
05 Apr 2019
Exploring Architectures, Data and Units For Streaming End-to-End Speech Recognition with RNN-Transducer
Kanishka Rao
Hasim Sak
Rohit Prabhavalkar
AI4TS
81
348
0
02 Jan 2018
State-of-the-art Speech Recognition With Sequence-to-Sequence Models
Chung-Cheng Chiu
Tara N. Sainath
Yonghui Wu
Rohit Prabhavalkar
Patrick Nguyen
...
Katya Gonina
Navdeep Jaitly
Yue Liu
J. Chorowski
M. Bacchiani
AI4TS
91
1,154
0
05 Dec 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
728
132,199
0
12 Jun 2017