arXiv:1910.11455
Recognizing long-form speech using streaming end-to-end models
24 October 2019
A. Narayanan
Rohit Prabhavalkar
Chung-Cheng Chiu
David Rybach
Tara N. Sainath
Trevor Strohman
Papers citing
"Recognizing long-form speech using streaming end-to-end models"
47 / 97 papers shown
Title
Attentive Temporal Pooling for Conformer-based Streaming Language Identification in Long-form Speech
Quan Wang
Yang Yu
Jason W. Pelecanos
Yiling Huang
Ignacio López Moreno
21
14
0
24 Feb 2022
VADOI: Voice-Activity-Detection Overlapping Inference For End-to-end Long-form Speech Recognition
Jinhan Wang
Xiaosu Tong
Jinxi Guo
Di He
Roland Maas
21
5
0
22 Feb 2022
Neural-FST Class Language Model for End-to-End Speech Recognition
A. Bruguier
Duc Le
Rohit Prabhavalkar
Dangna Li
Zhe Liu
Bo Wang
Eun Chang
Fuchun Peng
Ozlem Kalinli
M. Seltzer
20
6
0
28 Jan 2022
A Likelihood Ratio based Domain Adaptation Method for E2E Models
Chhavi Choudhury
Ankur Gandhe
Xiaohan Ding
I. Bulyko
27
10
0
10 Jan 2022
Deliberation of Streaming RNN-Transducer by Non-autoregressive Decoding
Weiran Wang
Ke Hu
Tara N. Sainath
35
21
0
01 Dec 2021
A Conformer-based ASR Frontend for Joint Acoustic Echo Cancellation, Speech Enhancement and Speech Separation
Tom O'Malley
A. Narayanan
Quan Wang
Alex Park
James Walker
N. Howard
25
27
0
18 Nov 2021
Recent Advances in End-to-End Automatic Speech Recognition
Jinyu Li
VLM
35
363
0
02 Nov 2021
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
118
1,715
0
26 Oct 2021
Partial Variable Training for Efficient On-Device Federated Learning
Tien-Ju Yang
Dhruv Guliani
F. Beaufays
Giovanni Motta
FedML
24
25
0
11 Oct 2021
Personalized Automatic Speech Recognition Trained on Small Disordered Speech Datasets
Jimmy Tobin
Katrin Tomanek
24
27
0
09 Oct 2021
Exploring Heterogeneous Characteristics of Layers in ASR Models for More Efficient Training
Lillian Zhou
Dhruv Guliani
Andreas Kabel
Giovanni Motta
F. Beaufays
26
1
0
08 Oct 2021
Input Length Matters: Improving RNN-T and MWER Training for Long-form Telephony Speech Recognition
Zhiyun Lu
Yanwei Pan
Thibault Doutre
Parisa Haghani
Liangliang Cao
Rohit Prabhavalkar
C. Zhang
Trevor Strohman
AuLLM
83
14
0
08 Oct 2021
Enabling On-Device Training of Speech Recognition Models with Federated Dropout
Dhruv Guliani
Lillian Zhou
Changwan Ryu
Tien-Ju Yang
Harry Zhang
Yong Xiao
F. Beaufays
Giovanni Motta
FedML
33
16
0
07 Oct 2021
Large-scale ASR Domain Adaptation using Self- and Semi-supervised Learning
DongSeon Hwang
Ananya Misra
Zhouyuan Huo
Nikhil Siddhartha
Shefali Garg
David Qiu
K. Sim
Trevor Strohman
F. Beaufays
Yanzhang He
65
34
0
01 Oct 2021
Incremental Layer-wise Self-Supervised Learning for Efficient Speech Domain Adaptation On Device
Zhouyuan Huo
Dong-Gyo Hwang
K. Sim
Shefali Garg
Ananya Misra
Nikhil Siddhartha
Trevor Strohman
Françoise Beaufays
48
7
0
01 Oct 2021
Tied & Reduced RNN-T Decoder
Rami Botros
Tara N. Sainath
R. David
Emmanuel Guzman
Wei Li
Yanzhang He
38
55
0
15 Sep 2021
Residual Adapters for Parameter-Efficient ASR Adaptation to Atypical and Accented Speech
Katrin Tomanek
Vicky Zayats
Dirk Padfield
K. Vaillancourt
Fadi Biadsy
59
57
0
14 Sep 2021
Learning a Neural Diff for Speech Models
J. Macoskey
Grant P. Strimel
Ariya Rastrow
18
2
0
03 Aug 2021
Translatotron 2: High-quality direct speech-to-speech translation with voice preservation
Ye Jia
Michelle Tadmor Ramanovich
Tal Remez
Roi Pomerantz
26
67
0
19 Jul 2021
VAD-free Streaming Hybrid CTC/Attention ASR for Unsegmented Recording
Hirofumi Inaguma
Tatsuya Kawahara
19
2
0
15 Jul 2021
Noisy Training Improves E2E ASR for the Edge
Dilin Wang
Yuan Shangguan
Haichuan Yang
P. Chuang
Jiatong Zhou
Meng Li
Ganesh Venkatesh
Ozlem Kalinli
Vikas Chandra
14
4
0
09 Jul 2021
Comparing Supervised Models And Learned Speech Representations For Classifying Intelligibility Of Disordered Speech On Selected Phrases
Subhashini Venugopalan
Joel Shor
Manoj Plakal
Jimmy Tobin
Katrin Tomanek
Jordan R. Green
Michael P. Brenner
27
12
0
08 Jul 2021
Multi-user VoiceFilter-Lite via Attentive Speaker Embedding
R. Rikhye
Quan Wang
Qiao Liang
Yanzhang He
Ian McGraw
29
8
0
02 Jul 2021
Personalized Keyphrase Detection using Speaker and Environment Information
R. Rikhye
Quan Wang
Qiao Liang
Yanzhang He
Ding Zhao
Yiteng Huang
A. Narayanan
Ian McGraw
26
11
0
28 Apr 2021
Bridging the gap between streaming and non-streaming ASR systems by distilling ensembles of CTC and RNN-T models
Thibault Doutre
Wei Han
Chung-Cheng Chiu
Ruoming Pang
Olivier Siohan
Liangliang Cao
38
5
0
25 Apr 2021
Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers
Takaaki Hori
Niko Moritz
Chiori Hori
Jonathan Le Roux
27
34
0
19 Apr 2021
Lookup-Table Recurrent Language Models for Long Tail Speech Recognition
Yifan Jiang
Tara N. Sainath
Cal Peyser
Shankar Kumar
David Rybach
Trevor Strohman
RALM
LMTD
28
5
0
09 Apr 2021
Transformer Based Deliberation for Two-Pass Speech Recognition
Ke Hu
Ruoming Pang
Tara N. Sainath
Trevor Strohman
27
37
0
27 Jan 2021
Hypothesis Stitcher for End-to-End Speaker-attributed ASR on Long-form Multi-talker Recordings
Xuankai Chang
Naoyuki Kanda
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Takuya Yoshioka
RALM
11
15
0
06 Jan 2021
AV Taris: Online Audio-Visual Speech Recognition
George Sterpu
N. Harte
27
1
0
14 Dec 2020
Less Is More: Improved RNN-T Decoding Using Limited Label Context and Path Merging
Rohit Prabhavalkar
Yanzhang He
David Rybach
S. Campbell
A. Narayanan
Trevor Strohman
Tara N. Sainath
49
35
0
12 Dec 2020
A Better and Faster End-to-End Model for Streaming ASR
Bo-wen Li
Anmol Gulati
Jiahui Yu
Tara N. Sainath
Chung-Cheng Chiu
...
Wei Han
Qiao Liang
Yu Zhang
Trevor Strohman
Yonghui Wu
AuLLM
25
123
0
21 Nov 2020
Improving RNN-T ASR Accuracy Using Context Audio
A. Schwarz
Ilya Sklyar
Simon Wiesler
16
9
0
20 Nov 2020
Cascade RNN-Transducer: Syllable Based Streaming On-device Mandarin Speech Recognition with a Syllable-to-Character Converter
Xiong Wang
Zhuoyuan Yao
Xian Shi
Lei Xie
19
30
0
17 Nov 2020
Improving RNN Transducer Based ASR with Auxiliary Tasks
Chunxi Liu
Frank Zhang
Duc Le
Suyoun Kim
Yatharth Saraf
Geoffrey Zweig
26
49
0
05 Nov 2020
Cascaded encoders for unifying streaming and non-streaming ASR
A. Narayanan
Tara N. Sainath
Ruoming Pang
Jiahui Yu
Chung-Cheng Chiu
Rohit Prabhavalkar
Ehsan Variani
Trevor Strohman
AuLLM
8
85
0
27 Oct 2020
Improving Streaming Automatic Speech Recognition With Non-Streaming Model Distillation On Unsupervised Data
Thibault Doutre
Wei Han
Min Ma
Zhiyun Lu
Chung-Cheng Chiu
Ruoming Pang
A. Narayanan
Ananya Misra
Yu Zhang
Liangliang Cao
69
22
0
22 Oct 2020
Parallel Rescoring with Transformer for Streaming On-Device Speech Recognition
Wei Li
James Qin
Chung-Cheng Chiu
Ruoming Pang
Yanzhang He
20
14
0
30 Aug 2020
Learning to Count Words in Fluent Speech enables Online Speech Recognition
George Sterpu
Christian Saam
N. Harte
16
4
0
08 Jun 2020
Analyzing the Quality and Stability of a Streaming End-to-End On-Device Speech Recognizer
Yuan Shangguan
Kate Knister
Yanzhang He
Ian McGraw
F. Beaufays
16
12
0
02 Jun 2020
Improving Proper Noun Recognition in End-to-End ASR By Customization of the MWER Loss Criterion
Cal Peyser
Tara N. Sainath
Golan Pundak
20
13
0
19 May 2020
A New Training Pipeline for an Improved Neural Transducer
Albert Zeyer
André Merboldt
Ralf Schlüter
Hermann Ney
AI4TS
MedIm
22
52
0
19 May 2020
RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions
Chung-Cheng Chiu
A. Narayanan
Wei Han
Rohit Prabhavalkar
Yu Zhang
...
Ruoming Pang
Tara N. Sainath
Patrick Nguyen
Liangliang Cao
Yonghui Wu
19
42
0
07 May 2020
A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency
Tara N. Sainath
Yanzhang He
Bo-wen Li
A. Narayanan
Ruoming Pang
...
Trevor Strohman
Mirkó Visontai
Yonghui Wu
Yu Zhang
Ding Zhao
25
215
0
28 Mar 2020
High Performance Sequence-to-Sequence Model for Streaming Speech Recognition
T. Nguyen
Ngoc-Quan Pham
S. Stueker
A. Waibel
6
7
0
22 Mar 2020
Deliberation Model Based Two-Pass End-to-End Speech Recognition
Ke Hu
Tara N. Sainath
Ruoming Pang
Rohit Prabhavalkar
16
85
0
17 Mar 2020
Optimizing Speech Recognition For The Edge
Yuan Shangguan
Jian Li
Qiao Liang
R. Álvarez
Ian McGraw
28
64
0
26 Sep 2019