Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2001.02674
Cited By
Streaming automatic speech recognition with the transformer model
8 January 2020
Niko Moritz
Takaaki Hori
Jonathan Le Roux
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Streaming automatic speech recognition with the transformer model"
42 / 42 papers shown
Title
A 71.2-
μ
μ
μ
W Speech Recognition Accelerator with Recurrent Spiking Neural Network
Chih-Chyau Yang
Tian-Sheuan Chang
65
1
0
27 Mar 2025
Learning from Demonstration with Implicit Nonlinear Dynamics Models
Peter David Fagan
Subramanian Ramamoorthy
216
0
0
27 Sep 2024
Streaming Sequence Transduction through Dynamic Compression
Weiting Tan
Yunmo Chen
Tongfei Chen
Guanghui Qin
Haoran Xu
Heidi C. Zhang
Benjamin Van Durme
Philipp Koehn
26
2
0
02 Feb 2024
Stateful Conformer with Cache-based Inference for Streaming Automatic Speech Recognition
Vahid Noroozi
Somshubra Majumdar
Ankur Kumar
Jagadeesh Balam
Boris Ginsburg
41
10
0
27 Dec 2023
Incremental Blockwise Beam Search for Simultaneous Speech Translation with Controllable Quality-Latency Tradeoff
Peter Polák
Brian Yan
Shinji Watanabe
A. Waibel
Ondrej Bojar
28
9
0
20 Sep 2023
Semi-Autoregressive Streaming ASR With Label Context
Siddhant Arora
G. Saon
Shinji Watanabe
Brian Kingsbury
AI4TS
23
5
0
19 Sep 2023
ApproBiVT: Lead ASR Models to Generalize Better Using Approximated Bias-Variance Tradeoff Guided Early Stopping and Checkpoint Averaging
Fangyuan Wang
Ming Hao
Yuhai Shi
Bo Xu
MoMe
23
0
0
05 Aug 2023
TST: Time-Sparse Transducer for Automatic Speech Recognition
Xiaohui Zhang
Mangui Liang
Zhengkun Tian
Jiangyan Yi
J. Tao
14
0
0
17 Jul 2023
Streaming Speech-to-Confusion Network Speech Recognition
Denis Filimonov
Prabhat Pandey
Ariya Rastrow
Ankur Gandhe
A. Stolcke
HAI
37
0
0
02 Jun 2023
Streaming Audio Transformers for Online Audio Tagging
Heinrich Dinkel
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Yujun Wang
Bin Wang
37
4
0
29 May 2023
Self-regularised Minimum Latency Training for Streaming Transformer-based Speech Recognition
Mohan Li
R. Doddipatla
Catalin Zorila
30
0
0
24 Apr 2023
Streaming Audio-Visual Speech Recognition with Alignment Regularization
Pingchuan Ma
Niko Moritz
Stavros Petridis
Christian Fuegen
Maja Pantic
37
2
0
03 Nov 2022
Variable Attention Masking for Configurable Transformer Transducer Speech Recognition
P. Swietojanski
Stefan Braun
Dogan Can
Thiago Fraga da Silva
Arnab Ghoshal
...
Henry Mason
Erik McDermott
Honza Silovsky
R. Travadi
Xiaodan Zhuang
42
13
0
02 Nov 2022
E-Branchformer: Branchformer with Enhanced merging for speech recognition
Kwangyoun Kim
Felix Wu
Yifan Peng
Jing Pan
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
61
105
0
30 Sep 2022
ConvRNN-T: Convolutional Augmented Recurrent Neural Network Transducers for Streaming Speech Recognition
Martin H. Radfar
Rohit Barnwal
Rupak Vignesh Swaminathan
Feng-Ju Chang
Grant P. Strimel
Nathan Susanj
Athanasios Mouchtaris
36
13
0
29 Sep 2022
Improving Speech Emotion Recognition Through Focus and Calibration Attention Mechanisms
Junghun Kim
Yoojin An
Jihie Kim
20
13
0
21 Aug 2022
Improving Streaming End-to-End ASR on Transformer-based Causal Models with Encoder States Revision Strategies
Zehan Li
Haoran Miao
Keqi Deng
Gaofeng Cheng
Sanli Tian
Ta Li
Yonghong Yan
KELM
27
4
0
06 Jul 2022
Blockwise Streaming Transformer for Spoken Language Understanding and Simultaneous Speech Translation
Keqi Deng
Shinji Watanabe
Jiatong Shi
Siddhant Arora
33
15
0
19 Apr 2022
Personal VAD 2.0: Optimizing Personal Voice Activity Detection for On-Device Speech Recognition
Shaojin Ding
R. Rikhye
Qiao Liang
Yanzhang He
Quan Wang
A. Narayanan
Tom O'Malley
Ian McGraw
29
27
0
08 Apr 2022
Transformer-based Streaming ASR with Cumulative Attention
Mohan Li
Shucong Zhang
Catalin Zorila
R. Doddipatla
27
9
0
11 Mar 2022
Solving Probability and Statistics Problems by Program Synthesis
Leonard Tang
Elizabeth Ke
Nikhil Singh
Nakul Verma
Iddo Drori
19
15
0
16 Nov 2021
Recent Advances in End-to-End Automatic Speech Recognition
Jinyu Li
VLM
40
363
0
02 Nov 2021
Visualization: the missing factor in Simultaneous Speech Translation
Sara Papi
Matteo Negri
Marco Turchi
19
2
0
31 Oct 2021
Study of positional encoding approaches for Audio Spectrogram Transformers
L. Pepino
Pablo Riera
Luciana Ferrer
ViT
28
6
0
13 Oct 2021
SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition
Jing Pan
Tao Lei
Kwangyoun Kim
Kyu Jeong Han
Shinji Watanabe
VLM
34
9
0
11 Oct 2021
VideoModerator: A Risk-aware Framework for Multimodal Video Moderation in E-Commerce
Tan Tang
Yanhong Wu
Lingyun Yu
Yuhong Li
Yingcai Wu
46
22
0
08 Sep 2021
A Dialogue-based Information Extraction System for Medical Insurance Assessment
Shuang Peng
Mengdi Zhou
Minghui Yang
Haitao Mi
Shaosheng Cao
Zujie Wen
Teng Xu
Hongbin Wang
Lei Liu
26
4
0
13 Jul 2021
Direct Simultaneous Speech-to-Text Translation Assisted by Synchronized Streaming ASR
Junkun Chen
Mingbo Ma
Renjie Zheng
Liang Huang
31
31
0
11 Jun 2021
Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers
Takaaki Hori
Niko Moritz
Chiori Hori
Jonathan Le Roux
30
34
0
19 Apr 2021
TransVG: End-to-End Visual Grounding with Transformers
Jiajun Deng
Zhengyuan Yang
Tianlang Chen
Wen-gang Zhou
Houqiang Li
ViT
28
332
0
17 Apr 2021
Transformer-based ASR Incorporating Time-reduction Layer and Fine-tuning with Self-Knowledge Distillation
Md. Akmal Haidar
Chao Xing
Mehdi Rezagholizadeh
27
7
0
17 Mar 2021
Motion-Based Handwriting Recognition and Word Reconstruction
Junshen Kevin Chen
Wanze Xie
Yutong He
29
1
0
15 Jan 2021
s-Transformer: Segment-Transformer for Robust Neural Speech Synthesis
Xi Wang
Huaiping Ming
Lei He
Frank Soong
19
5
0
17 Nov 2020
Block-Online Guided Source Separation
Shota Horiguchi
Yusuke Fujita
Kenji Nagamatsu
25
4
0
16 Nov 2020
Transformer-based End-to-End Speech Recognition with Local Dense Synthesizer Attention
Menglong Xu
Shengqiang Li
Xiao-Lei Zhang
27
31
0
23 Oct 2020
Developing Real-time Streaming Transformer Transducer for Speech Recognition on Large-scale Dataset
Xie Chen
Yu-Huan Wu
Zhenghao Wang
Shujie Liu
Jinyu Li
22
169
0
22 Oct 2020
Emformer: Efficient Memory Transformer Based Acoustic Model For Low Latency Streaming Speech Recognition
Yangyang Shi
Yongqiang Wang
Chunyang Wu
Ching-Feng Yeh
Julian Chan
Frank Zhang
Duc Le
M. Seltzer
56
168
0
21 Oct 2020
Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling
Jiahui Yu
Wei Han
Anmol Gulati
Chung-Cheng Chiu
Bo-wen Li
Tara N. Sainath
Yonghui Wu
Ruoming Pang
30
18
0
12 Oct 2020
Transformer with Bidirectional Decoder for Speech Recognition
Xi Chen
Songyang Zhang
Dandan Song
P. Ouyang
Shouyi Yin
18
13
0
11 Aug 2020
Streaming Transformer ASR with Blockwise Synchronous Beam Search
E. Tsunoo
Yosuke Kashiwagi
Shinji Watanabe
22
11
0
25 Jun 2020
A Comparison of Label-Synchronous and Frame-Synchronous End-to-End Models for Speech Recognition
Linhao Dong
Cheng Yi
Jianzong Wang
Shiyu Zhou
Shuang Xu
X. Jia
Bo Xu
36
17
0
20 May 2020
Streaming Transformer-based Acoustic Models Using Self-attention with Augmented Memory
Chunyang Wu
Yongqiang Wang
Yangyang Shi
Ching-Feng Yeh
Frank Zhang
RALM
31
60
0
16 May 2020
1