ResearchTrend.AI
Improved Training for End-to-End Streaming Automatic Speech Recognition Model with Punctuation

2 June 2023
Hanbyul Kim
S. Seo
Lukas Lee
Seolki Baek

Papers citing "Improved Training for End-to-End Streaming Automatic Speech Recognition Model with Punctuation"

19 / 19 papers shown
Streaming Punctuation for Long-form Dictation with Transformers
Piyush Behre
S.S. Tan
Padma Varadharajan
Shuangyu Chang
62
6
0
11 Oct 2022
End-to-end Speech-to-Punctuated-Text Recognition
Jumon Nozaki
Tatsuya Kawahara
K. Ishizuka
Taiichi Hashimoto
45
12
0
07 Jul 2022
E2E Segmenter: Joint Segmenting and Decoding for Long-Form ASR
Wenjie Huang
Shuo-yiin Chang
David Rybach
Rohit Prabhavalkar
Tara N. Sainath
Cyril Allauzen
Cal Peyser
Zhiyun Lu
VLM
74
24
0
22 Apr 2022
CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming ASR
Keyu An
Huahuan Zheng
Zhijian Ou
Hongyu Xiang
Ke Ding
Guanglu Wan
AI4TS
45
19
0
31 Mar 2022
Multimodal Semi-supervised Learning Framework for Punctuation Prediction in Conversational Speech
Monica Sunkara
S. Ronanki
Dhanush Bekal
S. Bodapati
Katrin Kirchhoff
55
32
0
03 Aug 2020
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
Alexei Baevski
Henry Zhou
Abdel-rahman Mohamed
Michael Auli
SSL
295
5,837
0
20 Jun 2020
CAT: A CTC-CRF based ASR Toolkit Bridging the Hybrid and the End-to-end Approaches towards Data Efficiency and Low Latency
Keyu An
Hongyu Xiang
Zhijian Ou
43
20
0
27 May 2020
Streaming Chunk-Aware Multihead Attention for Online End-to-End Speech Recognition
Shiliang Zhang
Zhifu Gao
Haoneng Luo
Ming Lei
Jie Ying Gao
Zhijie Yan
Lei Xie
56
29
0
21 May 2020
Streaming Transformer-based Acoustic Models Using Self-attention with Augmented Memory
Chunyang Wu
Yongqiang Wang
Yangyang Shi
Ching-Feng Yeh
Frank Zhang
RALM
66
63
0
16 May 2020
Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss
Qian Zhang
Han Lu
Hasim Sak
Anshuman Tripathi
Erik McDermott
Stephen Koo
Shankar Kumar
88
481
0
07 Feb 2020
Transformer-based Online CTC/attention End-to-End Speech Recognition Architecture
Haoran Miao
Gaofeng Cheng
Changfeng Gao
Pengyuan Zhang
Yonghong Yan
58
104
0
15 Jan 2020
Streaming automatic speech recognition with the transformer model
Niko Moritz
Takaaki Hori
Jonathan Le Roux
76
187
0
08 Jan 2020
fairseq: A Fast, Extensible Toolkit for Sequence Modeling
Myle Ott
Sergey Edunov
Alexei Baevski
Angela Fan
Sam Gross
Nathan Ng
David Grangier
Michael Auli
VLM
FaML
114
3,156
0
01 Apr 2019
Self-Attention Aligner: A Latency-Control End-to-End Model for ASR Using Self-Attention Network and Chunk-Hopping
Linhao Dong
Feng Wang
Bo Xu
53
91
0
18 Feb 2019
Streaming End-to-end Speech Recognition For Mobile Devices
Yanzhang He
Tara N. Sainath
Rohit Prabhavalkar
Ian McGraw
R. Álvarez
...
K. Sim
Tom Bagby
Shuo-yiin Chang
Kanishka Rao
A. Gruenstein
114
627
0
15 Nov 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
95,175
0
11 Oct 2018
SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing
Taku Kudo
John Richardson
201
3,528
0
19 Aug 2018
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
732
132,363
0
12 Jun 2017
Constructing Long Short-Term Memory based Deep Recurrent Neural Networks for Large Vocabulary Speech Recognition
Xiangang Li
Xihong Wu
86
309
0
16 Oct 2014