Improving RNN Transducer Based ASR with Auxiliary Tasks

5 November 2020
Chunxi Liu
Frank Zhang
Duc Le
Suyoun Kim
Yatharth Saraf
Geoffrey Zweig

Papers citing "Improving RNN Transducer Based ASR with Auxiliary Tasks"

37 / 37 papers shown
Contextualized End-to-end Automatic Speech Recognition with Intermediate Biasing Loss
Muhammad Shakeel
Yui Sudo
Yifan Peng
Shinji Watanabe
AI4CE
29
2
0
23 Jun 2024
PI-Whisper: An Adaptive and Incremental ASR Framework for Diverse and Evolving Speaker Characteristics
Amir Nassereldine
Dancheng Liu
Chenhui Xu
Jinjun Xiong
36
0
0
21 Jun 2024
Alternating Weak Triphone/BPE Alignment Supervision from Hybrid Model Improves End-to-End ASR
Jintao Jiang
Yingbo Gao
Mohammad Zeineldeen
Zoltán Tüske
34
0
0
23 Feb 2024
Weak Alignment Supervision from Hybrid Model Improves End-to-end ASR
Jintao Jiang
Yingbo Gao
Zoltán Tüske
21
1
0
24 Nov 2023
Augmenting text for spoken language understanding with Large Language Models
Roshan Sharma
Suyoun Kim
Daniel Lazar
Trang Le
Akshat Shrivastava
Kwanghoon Ahn
Piyush Kansal
Leda Sari
Ozlem Kalinli
Michael Seltzer
23
2
0
17 Sep 2023
Towards Word-Level End-to-End Neural Speaker Diarization with Auxiliary Network
Yiling Huang
Weiran Wang
Guanlong Zhao
Hank Liao
Wei Xia
Quan Wang
22
4
0
15 Sep 2023
TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-device ASR Models
Shangguan Yuan
Haichuan Yang
Danni Li
Chunyang Wu
Yassir Fathullah
...
J. Jia
Jay Mahadeokar
Xin Lei
Michael Seltzer
Vikas Chandra
24
2
0
05 Sep 2023
Multi-Head State Space Model for Speech Recognition
Yassir Fathullah
Chunyang Wu
Yuan Shangguan
J. Jia
Wenhan Xiong
...
Chunxi Liu
Yangyang Shi
Ozlem Kalinli
M. Seltzer
Mark J. F. Gales
24
13
0
21 May 2023
Pushing the performances of ASR models on English and Spanish accents
Pooja Chitkara
M. Rivière
Jade Copet
Frank Zhang
Yatharth Saraf
13
0
0
22 Dec 2022
Anchored Speech Recognition with Neural Transducers
Desh Raj
J. Jia
Jay Mahadeokar
Chunyang Wu
Niko Moritz
Xiaohui Zhang
Ozlem Kalinli
11
2
0
20 Oct 2022
Learning ASR pathways: A sparse multilingual ASR model
Mu Yang
Andros Tjandra
Chunxi Liu
David C. Zhang
Duc Le
Ozlem Kalinli
38
13
0
13 Sep 2022
Learning a Dual-Mode Speech Recognition Model via Self-Pruning
Chunxi Liu
Yuan Shangguan
Haichuan Yang
Yangyang Shi
Raghuraman Krishnamoorthi
Ozlem Kalinli
SSL
29
7
0
25 Jul 2022
Intermediate-layer output Regularization for Attention-based Speech Recognition with Shared Decoder
Jicheng Zhang
Yizhou Peng
Haihua Xu
Yi He
Chng Eng Siong
Hao-Ming Huang
AuLLM
20
6
0
09 Jul 2022
Transfer Learning for Robust Low-Resource Children's Speech ASR with Transformers and Source-Filter Warping
Jenthe Thienpondt
Kris Demuynck
12
11
0
19 Jun 2022
Squeezeformer: An Efficient Transformer for Automatic Speech Recognition
Sehoon Kim
A. Gholami
Albert Eaton Shaw
Nicholas Lee
K. Mangalam
Jitendra Malik
Michael W. Mahoney
Kurt Keutzer
19
99
0
02 Jun 2022
Improving CTC-based ASR Models with Gated Interlayer Collaboration
Yuting Yang
Yuke Li
Binbin Du
26
11
0
25 May 2022
Multi-Level Modeling Units for End-to-End Mandarin Speech Recognition
Yuting Yang
Binbin Du
Yuke Li
16
1
0
24 May 2022
Memory-Efficient Training of RNN-Transducer with Sampled Softmax
Jaesong Lee
Lukas Lee
Shinji Watanabe
25
8
0
31 Mar 2022
Visual Speech Recognition for Multiple Languages in the Wild
Pingchuan Ma
Stavros Petridis
M. Pantic
VLM
119
144
0
26 Feb 2022
A Study of Transducer based End-to-End ASR with ESPnet: Architecture, Auxiliary Loss and Decoding Strategies
Florian Boyer
Yusuke Shinohara
Takaaki Ishii
H. Inaguma
Shinji Watanabe
27
34
0
14 Jan 2022
Towards Measuring Fairness in Speech Recognition: Casual Conversations Dataset Transcriptions
Chunxi Liu
M. Picheny
Leda Sari
Pooja Chitkara
Alex Xiao
Xiaohui Zhang
Mark Chou
Andres Alvarado
C. Hazirbas
Yatharth Saraf
23
41
0
18 Nov 2021
Sequence Transduction with Graph-based Supervision
Niko Moritz
Takaaki Hori
Shinji Watanabe
Jonathan Le Roux
16
6
0
01 Nov 2021
Back from the future: bidirectional CTC decoding using future information in speech recognition
Namkyu Jung
Geon-min Kim
Han-Gyu Kim
31
3
0
07 Oct 2021
Noisy Training Improves E2E ASR for the Edge
Dilin Wang
Yuan Shangguan
Haichuan Yang
P. Chuang
Jiatong Zhou
Meng Li
Ganesh Venkatesh
Ozlem Kalinli
Vikas Chandra
14
4
0
09 Jul 2021
On lattice-free boosted MMI training of HMM and CTC-based full-context ASR models
Xiaohui Zhang
Vimal Manohar
David C. Zhang
Frank Zhang
Yangyang Shi
Nayan Singhal
Julian Chan
Fuchun Peng
Yatharth Saraf
M. Seltzer
20
14
0
09 Jul 2021
Collaborative Training of Acoustic Encoders for Speech Recognition
Varun K. Nagaraja
Yangyang Shi
Ganesh Venkatesh
Ozlem Kalinli
M. Seltzer
Vikas Chandra
37
11
0
16 Jun 2021
CoDERT: Distilling Encoder Representations with Co-learning for Transducer-based Speech Recognition
R. Swaminathan
Brian King
Grant P. Strimel
J. Droppo
Athanasios Mouchtaris
18
15
0
14 Jun 2021
Non-autoregressive Mandarin-English Code-switching Speech Recognition
Shun-Po Chuang
Heng-Jui Chang
Sung-Feng Huang
Hung-yi Lee
16
15
0
06 Apr 2021
Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion
Duc Le
Mahaveer Jain
Gil Keren
Suyoun Kim
Yangyang Shi
...
Yuan Shangguan
Christian Fuegen
Ozlem Kalinli
Yatharth Saraf
M. Seltzer
27
90
0
05 Apr 2021
Dynamic Encoder Transducer: A Flexible Solution For Trading Off Accuracy For Latency
Yangyang Shi
Varun K. Nagaraja
Chunyang Wu
Jay Mahadeokar
Duc Le
...
Ching-Feng Yeh
Julian Chan
Christian Fuegen
Ozlem Kalinli
M. Seltzer
25
15
0
05 Apr 2021
A Practical Survey on Faster and Lighter Transformers
Quentin Fournier
G. Caron
Daniel Aloise
14
93
0
26 Mar 2021
Intermediate Loss Regularization for CTC-based Speech Recognition
Jaesong Lee
Shinji Watanabe
113
135
0
05 Feb 2021
Deep Shallow Fusion for RNN-T Personalization
Duc Le
Gil Keren
Julian Chan
Jay Mahadeokar
Christian Fuegen
M. Seltzer
21
77
0
16 Nov 2020
Benchmarking LF-MMI, CTC and RNN-T Criteria for Streaming ASR
Xiaohui Zhang
Frank Zhang
Chunxi Liu
Kjell Schubert
Julian Chan
...
Jun Liu
Ching-Feng Yeh
Fuchun Peng
Yatharth Saraf
Geoffrey Zweig
17
20
0
09 Nov 2020
Transformer in action: a comparative study of transformer-based acoustic models for large scale speech recognition applications
Yongqiang Wang
Yangyang Shi
Frank Zhang
Chunyang Wu
Julian Chan
Ching-Feng Yeh
Alex Xiao
10
21
0
27 Oct 2020
Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives
Duo Li
Qifeng Chen
140
19
0
24 Mar 2020
Minimum Bayes Risk Training of RNN-Transducer for End-to-End Speech Recognition
Chao Weng
Chengzhu Yu
Jia Cui
Chunlei Zhang
Dong Yu
69
39
0
28 Nov 2019