ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2201.10240
  4. Cited By
Improving the fusion of acoustic and text representations in RNN-T

Improving the fusion of acoustic and text representations in RNN-T

25 January 2022
Chao Zhang
Yue Liu
Zhiyun Lu
Tara N. Sainath
Shuo-yiin Chang
    AI4CE
ArXivPDFHTML

Papers citing "Improving the fusion of acoustic and text representations in RNN-T"

32 / 32 papers shown
Title
Tree-constrained Pointer Generator for End-to-end Contextual Speech
  Recognition
Tree-constrained Pointer Generator for End-to-end Contextual Speech Recognition
Guangzhi Sun
Chao Zhang
P. Woodland
45
32
0
01 Sep 2021
Scaling End-to-End Models for Large-Scale Multilingual ASR
Scaling End-to-End Models for Large-Scale Multilingual ASR
Yue Liu
Ruoming Pang
Tara N. Sainath
Anmol Gulati
Yu Zhang
James Qin
Parisa Haghani
Wenjie Huang
Min Ma
Junwen Bai
CLL
87
77
0
30 Apr 2021
Contextualized Streaming End-to-End Speech Recognition with Trie-Based
  Deep Biasing and Shallow Fusion
Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion
Duc Le
Mahaveer Jain
Gil Keren
Suyoun Kim
Yangyang Shi
...
Yuan Shangguan
Christian Fuegen
Ozlem Kalinli
Yatharth Saraf
M. Seltzer
65
95
0
05 Apr 2021
Advancing RNN Transducer Technology for Speech Recognition
Advancing RNN Transducer Technology for Speech Recognition
G. Saon
Zoltan Tueske
Daniel Bolaños
Brian Kingsbury
63
87
0
17 Mar 2021
Internal Language Model Training for Domain-Adaptive End-to-End Speech
  Recognition
Internal Language Model Training for Domain-Adaptive End-to-End Speech Recognition
Zhong Meng
Naoyuki Kanda
Yashesh Gaur
S. Parthasarathy
Eric Sun
Liang Lu
Xie Chen
Jinyu Li
Jiawei Liu
AuLLM
64
52
0
02 Feb 2021
Tiny Transducer: A Highly-efficient Speech Recognition Model on Edge
  Devices
Tiny Transducer: A Highly-efficient Speech Recognition Model on Edge Devices
Yuekai Zhang
Sining Sun
Long Ma
54
29
0
18 Jan 2021
Using Synthetic Audio to Improve The Recognition of Out-Of-Vocabulary
  Words in End-To-End ASR Systems
Using Synthetic Audio to Improve The Recognition of Out-Of-Vocabulary Words in End-To-End ASR Systems
Xianrui Zheng
Yulan Liu
Deniz Gunceler
D. Willett
97
78
0
23 Nov 2020
Cascaded encoders for unifying streaming and non-streaming ASR
Cascaded encoders for unifying streaming and non-streaming ASR
A. Narayanan
Tara N. Sainath
Ruoming Pang
Jiahui Yu
Chung-Cheng Chiu
Rohit Prabhavalkar
Ehsan Variani
Trevor Strohman
AuLLM
75
85
0
27 Oct 2020
Improving Streaming Automatic Speech Recognition With Non-Streaming
  Model Distillation On Unsupervised Data
Improving Streaming Automatic Speech Recognition With Non-Streaming Model Distillation On Unsupervised Data
Thibault Doutre
Wei Han
Min Ma
Zhiyun Lu
Chung-Cheng Chiu
Ruoming Pang
A. Narayanan
Ananya Misra
Yu Zhang
Liangliang Cao
87
22
0
22 Oct 2020
Combination of Deep Speaker Embeddings for Diarisation
Combination of Deep Speaker Embeddings for Diarisation
Guangzhi Sun
Chao Zhang
P. Woodland
42
21
0
22 Oct 2020
Developing Real-time Streaming Transformer Transducer for Speech
  Recognition on Large-scale Dataset
Developing Real-time Streaming Transformer Transducer for Speech Recognition on Large-scale Dataset
Xie Chen
Yu-Huan Wu
Zhenghao Wang
Shujie Liu
Jinyu Li
96
174
0
22 Oct 2020
Emformer: Efficient Memory Transformer Based Acoustic Model For Low
  Latency Streaming Speech Recognition
Emformer: Efficient Memory Transformer Based Acoustic Model For Low Latency Streaming Speech Recognition
Yangyang Shi
Yongqiang Wang
Chunyang Wu
Ching-Feng Yeh
Julian Chan
Frank Zhang
Duc Le
M. Seltzer
86
171
0
21 Oct 2020
Developing RNN-T Models Surpassing High-Performance Hybrid Models with
  Customization Capability
Developing RNN-T Models Surpassing High-Performance Hybrid Models with Customization Capability
Jinyu Li
Rui Zhao
Zhong Meng
Yanqing Liu
Wenning Wei
...
V. Mazalov
Zhenghao Wang
Lei He
Sheng Zhao
Jiawei Liu
48
108
0
30 Jul 2020
A New Training Pipeline for an Improved Neural Transducer
A New Training Pipeline for an Improved Neural Transducer
Albert Zeyer
André Merboldt
Ralf Schluter
Hermann Ney
AI4TS
MedIm
37
52
0
19 May 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
210
3,117
0
16 May 2020
Hybrid Autoregressive Transducer (hat)
Hybrid Autoregressive Transducer (hat)
Ehsan Variani
David Rybach
Cyril Allauzen
Michael Riley
46
160
0
12 Mar 2020
A Density Ratio Approach to Language Model Fusion in End-To-End
  Automatic Speech Recognition
A Density Ratio Approach to Language Model Fusion in End-To-End Automatic Speech Recognition
Erik McDermott
Hasim Sak
Ehsan Variani
52
113
0
26 Feb 2020
Transformer Transducer: A Streamable Speech Recognition Model with
  Transformer Encoders and RNN-T Loss
Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss
Qian Zhang
Han Lu
Hasim Sak
Anshuman Tripathi
Erik McDermott
Stephen Koo
Shankar Kumar
61
480
0
07 Feb 2020
SpecAugment on Large Scale Datasets
SpecAugment on Large Scale Datasets
Daniel S. Park
Yu Zhang
Chung-Cheng Chiu
Youzheng Chen
Yue Liu
William Chan
Quoc V. Le
Yonghui Wu
55
138
0
11 Dec 2019
Multimodal Intelligence: Representation Learning, Information Fusion,
  and Applications
Multimodal Intelligence: Representation Learning, Information Fusion, and Applications
Chao Zhang
Zichao Yang
Xiaodong He
Li Deng
HAI
AI4TS
62
330
0
10 Nov 2019
Improving RNN Transducer Modeling for End-to-End Speech Recognition
Improving RNN Transducer Modeling for End-to-End Speech Recognition
Jinyu Li
Rui Zhao
Hu Hu
Jiawei Liu
43
170
0
26 Sep 2019
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence
  Modeling
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling
Jonathan Shen
Patrick Nguyen
Yonghui Wu
Zhiwen Chen
Mengzhao Chen
...
William Chan
Shubham Toshniwal
Baohua Liao
M. Nirschl
Pat Rondon
VLM
68
211
0
21 Feb 2019
Streaming End-to-end Speech Recognition For Mobile Devices
Streaming End-to-end Speech Recognition For Mobile Devices
Yanzhang He
Tara N. Sainath
Rohit Prabhavalkar
Ian McGraw
R. Álvarez
...
K. Sim
Tom Bagby
Shuo-yiin Chang
Kanishka Rao
A. Gruenstein
89
625
0
15 Nov 2018
Overcoming Language Priors in Visual Question Answering with Adversarial
  Regularization
Overcoming Language Priors in Visual Question Answering with Adversarial Regularization
S. Ramakrishnan
Aishwarya Agrawal
Stefan Lee
AAML
61
239
0
08 Oct 2018
Exploring Architectures, Data and Units For Streaming End-to-End Speech
  Recognition with RNN-Transducer
Exploring Architectures, Data and Units For Streaming End-to-End Speech Recognition with RNN-Transducer
Kanishka Rao
Hasim Sak
Rohit Prabhavalkar
AI4TS
70
347
0
02 Jan 2018
Exploring Neural Transducers for End-to-End Speech Recognition
Exploring Neural Transducers for End-to-End Speech Recognition
Eric Battenberg
Jitong Chen
R. Child
Adam Coates
Yashesh Gaur Yi Li
...
Hairong Liu
S. Satheesh
David Seetapun
Anuroop Sriram
Zhenyao Zhu
AI4TS
66
230
0
24 Jul 2017
Joint CTC-Attention based End-to-End Speech Recognition using Multi-task
  Learning
Joint CTC-Attention based End-to-End Speech Recognition using Multi-task Learning
Suyoun Kim
Takaaki Hori
Shinji Watanabe
66
925
0
21 Sep 2016
Highway Long Short-Term Memory RNNs for Distant Speech Recognition
Highway Long Short-Term Memory RNNs for Distant Speech Recognition
Yu Zhang
Guoguo Chen
Dong Yu
Kaisheng Yao
Sanjeev Khudanpur
James R. Glass
3DV
AI4TS
64
291
0
30 Oct 2015
Listen, Attend and Spell
Listen, Attend and Spell
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
RALM
147
2,264
0
05 Aug 2015
Attention-Based Models for Speech Recognition
Attention-Based Models for Speech Recognition
J. Chorowski
Dzmitry Bahdanau
Dmitriy Serdyuk
Kyunghyun Cho
Yoshua Bengio
113
2,606
0
24 Jun 2015
Speech Recognition with Deep Recurrent Neural Networks
Speech Recognition with Deep Recurrent Neural Networks
Alex Graves
Abdel-rahman Mohamed
Geoffrey E. Hinton
190
8,504
0
22 Mar 2013
Sequence Transduction with Recurrent Neural Networks
Sequence Transduction with Recurrent Neural Networks
Alex Graves
163
1,866
0
14 Nov 2012
1