Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2201.10240
Cited By
Improving the fusion of acoustic and text representations in RNN-T
25 January 2022
Chao Zhang
Yue Liu
Zhiyun Lu
Tara N. Sainath
Shuo-yiin Chang
AI4CE
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Improving the fusion of acoustic and text representations in RNN-T"
32 / 32 papers shown
Title
Tree-constrained Pointer Generator for End-to-end Contextual Speech Recognition
Guangzhi Sun
Chao Zhang
P. Woodland
45
32
0
01 Sep 2021
Scaling End-to-End Models for Large-Scale Multilingual ASR
Yue Liu
Ruoming Pang
Tara N. Sainath
Anmol Gulati
Yu Zhang
James Qin
Parisa Haghani
Wenjie Huang
Min Ma
Junwen Bai
CLL
87
77
0
30 Apr 2021
Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion
Duc Le
Mahaveer Jain
Gil Keren
Suyoun Kim
Yangyang Shi
...
Yuan Shangguan
Christian Fuegen
Ozlem Kalinli
Yatharth Saraf
M. Seltzer
65
95
0
05 Apr 2021
Advancing RNN Transducer Technology for Speech Recognition
G. Saon
Zoltan Tueske
Daniel Bolaños
Brian Kingsbury
63
87
0
17 Mar 2021
Internal Language Model Training for Domain-Adaptive End-to-End Speech Recognition
Zhong Meng
Naoyuki Kanda
Yashesh Gaur
S. Parthasarathy
Eric Sun
Liang Lu
Xie Chen
Jinyu Li
Jiawei Liu
AuLLM
64
52
0
02 Feb 2021
Tiny Transducer: A Highly-efficient Speech Recognition Model on Edge Devices
Yuekai Zhang
Sining Sun
Long Ma
54
29
0
18 Jan 2021
Using Synthetic Audio to Improve The Recognition of Out-Of-Vocabulary Words in End-To-End ASR Systems
Xianrui Zheng
Yulan Liu
Deniz Gunceler
D. Willett
97
78
0
23 Nov 2020
Cascaded encoders for unifying streaming and non-streaming ASR
A. Narayanan
Tara N. Sainath
Ruoming Pang
Jiahui Yu
Chung-Cheng Chiu
Rohit Prabhavalkar
Ehsan Variani
Trevor Strohman
AuLLM
75
85
0
27 Oct 2020
Improving Streaming Automatic Speech Recognition With Non-Streaming Model Distillation On Unsupervised Data
Thibault Doutre
Wei Han
Min Ma
Zhiyun Lu
Chung-Cheng Chiu
Ruoming Pang
A. Narayanan
Ananya Misra
Yu Zhang
Liangliang Cao
87
22
0
22 Oct 2020
Combination of Deep Speaker Embeddings for Diarisation
Guangzhi Sun
Chao Zhang
P. Woodland
42
21
0
22 Oct 2020
Developing Real-time Streaming Transformer Transducer for Speech Recognition on Large-scale Dataset
Xie Chen
Yu-Huan Wu
Zhenghao Wang
Shujie Liu
Jinyu Li
96
174
0
22 Oct 2020
Emformer: Efficient Memory Transformer Based Acoustic Model For Low Latency Streaming Speech Recognition
Yangyang Shi
Yongqiang Wang
Chunyang Wu
Ching-Feng Yeh
Julian Chan
Frank Zhang
Duc Le
M. Seltzer
86
171
0
21 Oct 2020
Developing RNN-T Models Surpassing High-Performance Hybrid Models with Customization Capability
Jinyu Li
Rui Zhao
Zhong Meng
Yanqing Liu
Wenning Wei
...
V. Mazalov
Zhenghao Wang
Lei He
Sheng Zhao
Jiawei Liu
48
108
0
30 Jul 2020
A New Training Pipeline for an Improved Neural Transducer
Albert Zeyer
André Merboldt
Ralf Schluter
Hermann Ney
AI4TS
MedIm
37
52
0
19 May 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
210
3,117
0
16 May 2020
Hybrid Autoregressive Transducer (hat)
Ehsan Variani
David Rybach
Cyril Allauzen
Michael Riley
46
160
0
12 Mar 2020
A Density Ratio Approach to Language Model Fusion in End-To-End Automatic Speech Recognition
Erik McDermott
Hasim Sak
Ehsan Variani
52
113
0
26 Feb 2020
Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss
Qian Zhang
Han Lu
Hasim Sak
Anshuman Tripathi
Erik McDermott
Stephen Koo
Shankar Kumar
61
480
0
07 Feb 2020
SpecAugment on Large Scale Datasets
Daniel S. Park
Yu Zhang
Chung-Cheng Chiu
Youzheng Chen
Yue Liu
William Chan
Quoc V. Le
Yonghui Wu
55
138
0
11 Dec 2019
Multimodal Intelligence: Representation Learning, Information Fusion, and Applications
Chao Zhang
Zichao Yang
Xiaodong He
Li Deng
HAI
AI4TS
62
330
0
10 Nov 2019
Improving RNN Transducer Modeling for End-to-End Speech Recognition
Jinyu Li
Rui Zhao
Hu Hu
Jiawei Liu
43
170
0
26 Sep 2019
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling
Jonathan Shen
Patrick Nguyen
Yonghui Wu
Zhiwen Chen
Mengzhao Chen
...
William Chan
Shubham Toshniwal
Baohua Liao
M. Nirschl
Pat Rondon
VLM
68
211
0
21 Feb 2019
Streaming End-to-end Speech Recognition For Mobile Devices
Yanzhang He
Tara N. Sainath
Rohit Prabhavalkar
Ian McGraw
R. Álvarez
...
K. Sim
Tom Bagby
Shuo-yiin Chang
Kanishka Rao
A. Gruenstein
89
625
0
15 Nov 2018
Overcoming Language Priors in Visual Question Answering with Adversarial Regularization
S. Ramakrishnan
Aishwarya Agrawal
Stefan Lee
AAML
61
239
0
08 Oct 2018
Exploring Architectures, Data and Units For Streaming End-to-End Speech Recognition with RNN-Transducer
Kanishka Rao
Hasim Sak
Rohit Prabhavalkar
AI4TS
70
347
0
02 Jan 2018
Exploring Neural Transducers for End-to-End Speech Recognition
Eric Battenberg
Jitong Chen
R. Child
Adam Coates
Yashesh Gaur Yi Li
...
Hairong Liu
S. Satheesh
David Seetapun
Anuroop Sriram
Zhenyao Zhu
AI4TS
66
230
0
24 Jul 2017
Joint CTC-Attention based End-to-End Speech Recognition using Multi-task Learning
Suyoun Kim
Takaaki Hori
Shinji Watanabe
66
925
0
21 Sep 2016
Highway Long Short-Term Memory RNNs for Distant Speech Recognition
Yu Zhang
Guoguo Chen
Dong Yu
Kaisheng Yao
Sanjeev Khudanpur
James R. Glass
3DV
AI4TS
64
291
0
30 Oct 2015
Listen, Attend and Spell
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
RALM
147
2,264
0
05 Aug 2015
Attention-Based Models for Speech Recognition
J. Chorowski
Dzmitry Bahdanau
Dmitriy Serdyuk
Kyunghyun Cho
Yoshua Bengio
113
2,606
0
24 Jun 2015
Speech Recognition with Deep Recurrent Neural Networks
Alex Graves
Abdel-rahman Mohamed
Geoffrey E. Hinton
190
8,504
0
22 Mar 2013
Sequence Transduction with Recurrent Neural Networks
Alex Graves
163
1,866
0
14 Nov 2012
1