Streaming End-to-end Speech Recognition For Mobile Devices


15 November 2018
Yanzhang He
Tara N. Sainath
Rohit Prabhavalkar
Ian McGraw
R. Álvarez
Ding Zhao
David Rybach
Anjuli Kannan
Yonghui Wu
Ruoming Pang
Qiao Liang
Deepti Bhatia
Yuan Shangguan
Bo-wen Li
Golan Pundak
K. Sim
Tom Bagby
Shuo-yiin Chang
Kanishka Rao
A. Gruenstein

Papers citing "Streaming End-to-end Speech Recognition For Mobile Devices"

50 / 154 papers shown
Transformer-based ASR Incorporating Time-reduction Layer and Fine-tuning with Self-Knowledge Distillation
Md. Akmal Haidar
Chao Xing
Mehdi Rezagholizadeh
27
7
0
17 Mar 2021
Learning Word-Level Confidence For Subword End-to-End ASR
David Qiu
Qiujia Li
Yanzhang He
Yu Zhang
Bo-wen Li
...
Deepti Bhatia
Wei Li
Ke Hu
Tara N. Sainath
Ian McGraw
32
32
0
11 Mar 2021
Mitigating Edge Machine Learning Inference Bottlenecks: An Empirical Study on Accelerating Google Edge Models
Amirali Boroumand
Saugata Ghose
Berkin Akin
Ravi Narayanaswami
Geraldo F. Oliveira
Xiaoyu Ma
Eric Shiu
O. Mutlu
27
28
0
01 Mar 2021
Jira: a Kurdish Speech Recognition System Designing and Building Speech Corpus and Pronunciation Lexicon
H. Veisi
Hawre Hosseini
Mohammad MohammadAmini
Wirya Fathy
Aso Mahmudi
10
4
0
15 Feb 2021
UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data
Chengyi Wang
Yu-Huan Wu
Yao Qian
K. Kumatani
Shujie Liu
Furu Wei
Michael Zeng
Xuedong Huang
OT
SSL
38
112
0
19 Jan 2021
A review of on-device fully neural end-to-end automatic speech recognition algorithms
Chanwoo Kim
Dhananjaya N. Gowda
Dongsoo Lee
Jiyeon Kim
Ankur Kumar
Sungsoo Kim
Abhinav Garg
C. Han
27
27
0
14 Dec 2020
Improving accuracy of rare words for RNN-Transducer through unigram shallow fusion
Vijay Ravi
Yile Gu
Ankur Gandhe
Ariya Rastrow
Linda Liu
Denis Filimonov
Scott Novotney
I. Bulyko
27
9
0
30 Nov 2020
Streaming end-to-end multi-talker speech recognition
Liang Lu
Naoyuki Kanda
Jinyu Li
Jiawei Liu
13
41
0
26 Nov 2020
Using Synthetic Audio to Improve The Recognition of Out-Of-Vocabulary Words in End-To-End ASR Systems
Xianrui Zheng
Yulan Liu
Deniz Gunceler
D. Willett
17
78
0
23 Nov 2020
A Better and Faster End-to-End Model for Streaming ASR
Bo-wen Li
Anmol Gulati
Jiahui Yu
Tara N. Sainath
Chung-Cheng Chiu
...
Wei Han
Qiao Liang
Yu Zhang
Trevor Strohman
Yonghui Wu
AuLLM
25
123
0
21 Nov 2020
Empowering Things with Intelligence: A Survey of the Progress, Challenges, and Opportunities in Artificial Intelligence of Things
Jing Zhang
Dacheng Tao
45
462
0
17 Nov 2020
Deep Shallow Fusion for RNN-T Personalization
Duc Le
Gil Keren
Julian Chan
Jay Mahadeokar
Christian Fuegen
M. Seltzer
21
77
0
16 Nov 2020
Improving RNN Transducer Based ASR with Auxiliary Tasks
Chunxi Liu
Frank Zhang
Duc Le
Suyoun Kim
Yatharth Saraf
Geoffrey Zweig
26
49
0
05 Nov 2020
Multitask Training with Text Data for End-to-End Speech Recognition
Peidong Wang
Tara N. Sainath
Ron J. Weiss
16
27
0
27 Oct 2020
Improved Neural Language Model Fusion for Streaming Recurrent Neural Network Transducer
Suyoun Kim
Shangguan Yuan
Jay Mahadeokar
A. Bruguier
Christian Fuegen
M. Seltzer
Duc Le
15
28
0
26 Oct 2020
Confidence Estimation for Attention-based Sequence-to-sequence Models for Speech Recognition
Qiujia Li
David Qiu
Yu Zhang
Bo-wen Li
Yanzhang He
P. Woodland
Liangliang Cao
Trevor Strohman
12
46
0
22 Oct 2020
Developing Real-time Streaming Transformer Transducer for Speech Recognition on Large-scale Dataset
Xie Chen
Yu-Huan Wu
Zhenghao Wang
Shujie Liu
Jinyu Li
22
169
0
22 Oct 2020
Replacing Human Audio with Synthetic Audio for On-device Unspoken Punctuation Prediction
Daria Soboleva
Ondrej Skopek
Márius Šajgalík
Victor Cărbune
Felix Weissenberger
...
B. Prisacari
Daniel Valcarce
Justin Lu
Rohit Prabhavalkar
Balint Miklos
32
9
0
20 Oct 2020
Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling
Jiahui Yu
Wei Han
Anmol Gulati
Chung-Cheng Chiu
Bo-wen Li
Tara N. Sainath
Yonghui Wu
Ruoming Pang
30
18
0
12 Oct 2020
Utterance-level Intent Recognition from Keywords
Wenda Chen
Jonathan Huang
M. Hasegawa-Johnson
19
1
0
17 Sep 2020
VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition
Quan Wang
Ignacio López Moreno
Mert Saglam
K. Wilson
Alan Chiao
...
Yanzhang He
Wei Li
Jason W. Pelecanos
M. Nika
A. Gruenstein
VLM
39
82
0
09 Sep 2020
Improving Tail Performance of a Deliberation E2E ASR Model Using a Large Text Corpus
Cal Peyser
S. Mavandadi
Tara N. Sainath
J. Apfel
Ruoming Pang
Shankar Kumar
29
46
0
24 Aug 2020
Conv-Transformer Transducer: Low Latency, Low Frame Rate, Streamable End-to-End Speech Recognition
Wenyong Huang
Wenchao Hu
Y. Yeung
Xiao Chen
25
50
0
13 Aug 2020
Transformer with Bidirectional Decoder for Speech Recognition
Xi Chen
Songyang Zhang
Dandan Song
P. Ouyang
Shouyi Yin
18
13
0
11 Aug 2020
Unacceptable, where is my privacy? Exploring Accidental Triggers of Smart Speakers
Lea Schönherr
Maximilian Golla
Thorsten Eisenhofer
Jan Wiele
D. Kolossa
Thorsten Holz
25
41
0
02 Aug 2020
Modular End-to-end Automatic Speech Recognition Framework for Acoustic-to-word Model
Qi Liu
Zhehuai Chen
Hao Li
Mingkun Huang
Yizhou Lu
Kai Yu
21
6
0
31 Jul 2020
Efficient minimum word error rate training of RNN-Transducer for end-to-end speech recognition
Jinxi Guo
Gautam Tiwari
J. Droppo
Maarten Van Segbroeck
Che-Wei Huang
A. Stolcke
Roland Maas
21
55
0
27 Jul 2020
Deep multi-metric learning for text-independent speaker verification
Jiwei Xu
Xinggang Wang
Bin Feng
Wenyu Liu
46
25
0
17 Jul 2020
Attention-based Transducer for Online Speech Recognition
Bin Wang
Yan Yin
Hui-Ching Lin
18
4
0
18 May 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
83
3,038
0
16 May 2020
Streaming Transformer-based Acoustic Models Using Self-attention with Augmented Memory
Chunyang Wu
Yongqiang Wang
Yangyang Shi
Ching-Feng Yeh
Frank Zhang
RALM
31
60
0
16 May 2020
Streaming keyword spotting on mobile devices
Oleg Rybakov
Natasha Kononenko
Niranjan A. Subrahmanya
Mirkó Visontai
Stella Laurenzo
AI4TS
19
109
0
14 May 2020
Fast and Robust Unsupervised Contextual Biasing for Speech Recognition
Young Mo Kang
Yingbo Zhou
11
13
0
04 May 2020
Exploring Pre-training with Alignments for RNN Transducer based End-to-End Speech Recognition
Hu Hu
Rui Zhao
Jinyu Li
Liang Lu
Jiawei Liu
19
27
0
01 May 2020
Language-agnostic Multilingual Modeling
A. Datta
Bhuvana Ramabhadran
Jesse Emond
Anjuli Kannan
Brian Roark
24
35
0
20 Apr 2020
A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency
Tara N. Sainath
Yanzhang He
Bo-wen Li
A. Narayanan
Ruoming Pang
...
Trevor Strohman
Mirkó Visontai
Yonghui Wu
Yu Zhang
Ding Zhao
25
215
0
28 Mar 2020
High-Accuracy and Low-Latency Speech Recognition with Two-Head Contextual Layer Trajectory LSTM Model
Jinyu Li
Rui Zhao
Eric Sun
J. H. M. Wong
Amit Das
Zhong Meng
Jiawei Liu
VLM
24
24
0
17 Mar 2020
Hybrid Autoregressive Transducer (hat)
Ehsan Variani
David Rybach
Cyril Allauzen
Michael Riley
21
158
0
12 Mar 2020
Small-Footprint Open-Vocabulary Keyword Spotting with Quantized LSTM Networks
Théodore Bluche
Maël Primet
Thibault Gisselbrecht
ObjD
MQ
28
24
0
25 Feb 2020
Accelerating RNN Transducer Inference via One-Step Constrained Beam Search
Juntae Kim
Yoonhan Lee
20
22
0
10 Feb 2020
End-to-End Automatic Speech Recognition Integrated With CTC-Based Voice Activity Detection
Takenori Yoshimura
Tomoki Hayashi
K. Takeda
Shinji Watanabe
37
49
0
03 Feb 2020
End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures
Gabriel Synnaeve
Qiantong Xu
Jacob Kahn
Tatiana Likhomanenko
Edouard Grave
Vineel Pratap
Anuroop Sriram
Vitaliy Liptchinsky
R. Collobert
SSL
AI4TS
36
246
0
19 Nov 2019
A Simplified Fully Quantized Transformer for End-to-end Speech Recognition
Alex Bie
Bharat Venkitesh
João Monteiro
Md. Akmal Haidar
Mehdi Rezagholizadeh
MQ
32
27
0
09 Nov 2019
A comparison of end-to-end models for long-form speech recognition
Chung-Cheng Chiu
Wei Han
Yu Zhang
Ruoming Pang
S. Kishchenko
...
Anjuli Kannan
Rohit Prabhavalkar
Z. Chen
Tara N. Sainath
Yonghui Wu
AuLLM
14
82
0
06 Nov 2019
Transformer-Transducer: End-to-End Speech Recognition with Self-Attention
Ching-Feng Yeh
Jay Mahadeokar
Kaustubh Kalgaonkar
Yongqiang Wang
Duc Le
Mahaveer Jain
Kjell Schubert
Christian Fuegen
M. Seltzer
27
147
0
28 Oct 2019
Recognizing long-form speech using streaming end-to-end models
A. Narayanan
Rohit Prabhavalkar
Chung-Cheng Chiu
David Rybach
Tara N. Sainath
Trevor Strohman
29
129
0
24 Oct 2019
G2G: TTS-Driven Pronunciation Learning for Graphemic Hybrid ASR
Duc Le
T. Koehler
Christian Fuegen
M. Seltzer
30
16
0
22 Oct 2019
GPU-Accelerated Viterbi Exact Lattice Decoder for Batched Online and Offline Speech Recognition
Hugo Braun
Justin Luitjens
Ryan Leary
Tim Kaldewey
Daniel Povey
OffRL
13
12
0
22 Oct 2019
Transformer-based Acoustic Modeling for Hybrid Speech Recognition
Yongqiang Wang
Abdel-rahman Mohamed
Duc Le
Chunxi Liu
Alex Xiao
...
Xiaohui Zhang
Frank Zhang
Christian Fuegen
Geoffrey Zweig
M. Seltzer
16
248
0
22 Oct 2019
Self-Attention Transducers for End-to-End Speech Recognition
Zhengkun Tian
Jiangyan Yi
J. Tao
Ye Bai
Zhengqi Wen
AI4TS
29
70
0
28 Sep 2019