State-of-the-art Speech Recognition With Sequence-to-Sequence Models


5 December 2017
Chung-Cheng Chiu
Tara N. Sainath
Yonghui Wu
Rohit Prabhavalkar
Patrick Nguyen
Zhehuai Chen
Anjuli Kannan
Ron J. Weiss
Kanishka Rao
Katya Gonina
Navdeep Jaitly
Bo Li
J. Chorowski
M. Bacchiani
    AI4TS

Papers citing "State-of-the-art Speech Recognition With Sequence-to-Sequence Models"

50 / 501 papers shown
RAND: Robustness Aware Norm Decay For Quantized Seq2seq Models
David Qiu
David Rim
Shaojin Ding
Oleg Rybakov
Yanzhang He
MQ
35
4
0
24 May 2023
Rethinking Speech Recognition with A Multimodal Perspective via Acoustic and Semantic Cooperative Decoding
Tianren Zhang
Haibo Qin
Zhibing Lai
Songlu Chen
Qi Liu
Feng Chen
Xinyuan Qian
Xu-Cheng Yin
38
0
0
23 May 2023
A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks
Yifan Peng
Kwangyoun Kim
Felix Wu
Brian Yan
Siddhant Arora
William Chen
Jiyang Tang
Suwon Shon
Prashant Sridhar
Shinji Watanabe
39
17
0
18 May 2023
Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization
Hamza Kheddar
Yassine Himeur
S. Al-Maadeed
Abbes Amira
F. Bensaali
52
76
0
27 Apr 2023
Self-regularised Minimum Latency Training for Streaming Transformer-based Speech Recognition
Mohan Li
R. Doddipatla
Catalin Zorila
30
0
0
24 Apr 2023
Machine Learning Research Trends in Africa: A 30 Years Overview with Bibliometric Analysis Review
A. Ezugwu
O. N. Oyelade
A. M. Ikotun
Jeffery O. Agushaka
Y. Ho
40
17
0
15 Apr 2023
Lego-Features: Exporting modular encoder features for streaming and deliberation ASR
Rami Botros
Rohit Prabhavalkar
J. Schalkwyk
Ciprian Chelba
Tara N. Sainath
Françoise Beaufays
AuLLM
26
3
0
31 Mar 2023
Practical Conformer: Optimizing size, speed and flops of Conformer for on-Device and cloud ASR
Rami Botros
Anmol Gulati
Tara N. Sainath
K. Choromanski
Ruoming Pang
Trevor Strohman
Weiran Wang
Jiahui Yu
MQ
28
3
0
31 Mar 2023
Dialog act guided contextual adapter for personalized speech recognition
Feng-Ju Chang
Thejaswi Muniyappa
Kanthashree Mysore Sathyendra
Kailin Wei
Grant P. Strimel
Ross McGowan
24
4
0
31 Mar 2023
A Deliberation-based Joint Acoustic and Text Decoder
S. Mavandadi
Tara N. Sainath
Ke Hu
Zelin Wu
26
7
0
23 Mar 2023
Pyramid Multi-branch Fusion DCNN with Multi-Head Self-Attention for Mandarin Speech Recognition
Kai Liu
Hailiang Xiong
Gangqiang Yang
Zhengfeng Du
Yewen Cao
D. Shah
18
0
0
23 Mar 2023
A Complete Survey on Generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 All You Need?
Chaoning Zhang
Chenshuang Zhang
Sheng Zheng
Yu Qiao
Chenghao Li
...
Lik-Hang Lee
Yang Yang
Heng Tao Shen
In So Kweon
Choong Seon Hong
85
160
0
21 Mar 2023
Visual Information Matters for ASR Error Correction
Bannihati Kumar Vanya
Shanbo Cheng
Ningxin Peng
Yuchen Zhang
32
3
0
16 Mar 2023
End-to-End Speech Recognition: A Survey
Rohit Prabhavalkar
Takaaki Hori
Tara N. Sainath
Ralf Schlüter
Shinji Watanabe
VLM
31
153
0
03 Mar 2023
Federated Learning for ASR based on Wav2vec 2.0
Tuan Nguyen
Salima Mdhaffar
N. Tomashenko
J. Bonastre
Yannick Esteve
FedML
52
10
0
20 Feb 2023
JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition
Zhong Meng
Weiran Wang
Rohit Prabhavalkar
Tara N. Sainath
Tongzhou Chen
Ehsan Variani
Yu Zhang
Bo Li
Andrew Rosenberg
Bhuvana Ramabhadran
AuLLM
VLM
38
11
0
16 Feb 2023
Characterizing Financial Market Coverage using Artificial Intelligence
Jean Marie Tshimula
D'Jeff K. Nkashama
Patrick Owusu
Marc Frappier
Pierre Martin Tardif
F. Kabanza
Armelle Brun
Jean-Marc Patenaude
Shengrui Wang
Belkacem Chikhaoui
AIFin
33
2
0
07 Feb 2023
Towards Rigorous Understanding of Neural Networks via Semantics-preserving Transformations
Maximilian Schlüter
Gerrit Nolte
Alnis Murtovi
Bernhard Steffen
34
6
0
19 Jan 2023
From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition
Chao-Han Huck Yang
Bo Li
Yu Zhang
Nanxin Chen
Rohit Prabhavalkar
Tara N. Sainath
Trevor Strohman
19
28
0
19 Jan 2023
Learning Feature Recovery Transformer for Occluded Person Re-identification
Boqiang Xu
Lingxiao He
Jian Liang
Zhenan Sun
ViT
25
53
0
05 Jan 2023
Macro-block dropout for improved regularization in training end-to-end speech recognition models
Chanwoo Kim
Sathish Indurti
Jinhwan Park
Wonyong Sung
33
0
0
29 Dec 2022
Fast and accurate factorized neural transducer for text adaption of end-to-end speech recognition models
Rui Zhao
Jian Xue
P. Parthasarathy
Veljko Miljanic
Jinyu Li
21
13
0
05 Dec 2022
Probabilistic Verification of ReLU Neural Networks via Characteristic Functions
Joshua Pilipovsky
Vignesh Sivaramakrishnan
Meeko Oishi
Panagiotis Tsiotras
42
5
0
03 Dec 2022
CorrectNet: Robustness Enhancement of Analog In-Memory Computing for Neural Networks by Error Suppression and Compensation
Amro Eldebiky
Grace Li Zhang
G. Böcherer
Bing Li
Ulf Schlichtmann
55
17
0
27 Nov 2022
Why the pseudo label based semi-supervised learning algorithm is effective?
Zeping Min
Qian Ge
Cheng Tai
MLT
34
4
0
18 Nov 2022
Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities
Andros Tjandra
Nayan Singhal
David C. Zhang
Ozlem Kalinli
Abdel-rahman Mohamed
Duc Le
M. Seltzer
42
12
0
10 Nov 2022
Understanding the Role of Mixup in Knowledge Distillation: An Empirical Study
Hongjun Choi
Eunyeong Jeon
Ankita Shukla
Pavan Turaga
26
8
0
08 Nov 2022
LAMASSU: Streaming Language-Agnostic Multilingual Speech Recognition and Translation Using Neural Transducers
Peidong Wang
Eric Sun
Jian Xue
Yu-Huan Wu
Long Zhou
Yashesh Gaur
Shujie Liu
Jinyu Li
34
8
0
05 Nov 2022
A Weakly-Supervised Streaming Multilingual Speech Model with Truly Zero-Shot Capability
Jian Xue
Peidong Wang
Jinyu Li
Eric Sun
32
10
0
04 Nov 2022
Phonetic-assisted Multi-Target Units Modeling for Improving Conformer-Transducer ASR system
Li Li
Dongxing Xu
Haoran Wei
Yanhua Long
26
2
0
03 Nov 2022
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Yonggan Fu
Yang Zhang
Kaizhi Qian
Zhifan Ye
Zhongzhi Yu
Cheng-I Jeff Lai
Yingyan Lin
35
8
0
02 Nov 2022
Internal Language Model Estimation based Adaptive Language Model Fusion for Domain Adaptation
Rao Ma
Xiaobo Wu
Jin Qiu
Yanan Qin
Haihua Xu
Peihao Wu
Zejun Ma
32
2
0
02 Nov 2022
InterMPL: Momentum Pseudo-Labeling with Intermediate CTC Loss
Yosuke Higuchi
Tetsuji Ogawa
Tetsunori Kobayashi
Shinji Watanabe
32
0
0
02 Nov 2022
Speech-text based multi-modal training with bidirectional attention for improved speech recognition
Yuhang Yang
Haihua Xu
Hao-Ming Huang
Eng Siong Chng
Sheng Li
47
7
0
01 Nov 2022
Joint Audio/Text Training for Transformer Rescorer of Streaming Speech Recognition
Suyoun Kim
Ke Li
Lucas Kabela
Rongqing Huang
Jiedan Zhu
Ozlem Kalinli
Duc Le
35
8
0
31 Oct 2022
Modular Hybrid Autoregressive Transducer
Zhong Meng
Tongzhou Chen
Rohit Prabhavalkar
Yu Zhang
Gary Wang
...
Bhuvana Ramabhadran
Wenjie Huang
Ehsan Variani
Yinghui Huang
Pedro J. Moreno
34
20
0
31 Oct 2022
BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model
Yosuke Higuchi
Brian Yan
Siddhant Arora
Tetsuji Ogawa
Tetsunori Kobayashi
Shinji Watanabe
59
25
0
29 Oct 2022
Accelerating RNN-T Training and Inference Using CTC guidance
Yongqiang Wang
Zhehuai Chen
Cheng-yong Zheng
Yu Zhang
Wei Han
Parisa Haghani
42
23
0
29 Oct 2022
Random Utterance Concatenation Based Data Augmentation for Improving Short-video Speech Recognition
Yist Y. Lin
Tao Han
Haihua Xu
Van Tung Pham
Yerbolat Khassanov
Tze Yuang Chong
Yi He
Lu Lu
Zejun Ma
21
2
0
28 Oct 2022
Towards automatic generation of Piping and Instrumentation Diagrams (P&IDs) with Artificial Intelligence
Edwin Hirtreiter
Lukas Schulze Balhorn
Artur M. Schweidtmann
AI4CE
29
14
0
26 Oct 2022
End-to-End Integration of Speech Recognition, Dereverberation, Beamforming, and Self-Supervised Learning Representation
Yoshiki Masuyama
Xuankai Chang
Samuele Cornell
Shinji Watanabe
Nobutaka Ono
27
19
0
19 Oct 2022
Acoustic-aware Non-autoregressive Spell Correction with Mask Sample Decoding
Ruchao Fan
Guoli Ye
Yashesh Gaur
Jinyu Li
19
4
0
16 Oct 2022
JOIST: A Joint Speech and Text Streaming Model For ASR
Tara N. Sainath
Rohit Prabhavalkar
Ankur Bapna
Yu Zhang
Zhouyuan Huo
Zhehuai Chen
Bo Li
Weiran Wang
Trevor Strohman
RALM
AuLLM
53
35
0
13 Oct 2022
Streaming Intended Query Detection using E2E Modeling for Continued Conversation
Shuo-yiin Chang
Guru Prakash
Zelin Wu
Qiao Liang
Tara N. Sainath
Bo Li
Adam Stambler
Shyam Upadhyay
Manaal Faruqui
Trevor Strohman
42
5
0
29 Aug 2022
Turn-Taking Prediction for Natural Conversational Speech
Shuo-yiin Chang
Bo Li
Tara N. Sainath
Chaoyang Zhang
Trevor Strohman
Qiao Liang
Yanzhang He
43
19
0
29 Aug 2022
Multimodal Lecture Presentations Dataset: Understanding Multimodality in Educational Slides
Dong Won Lee
Chaitanya Ahuja
Paul Pu Liang
Sanika Natu
Louis-Philippe Morency
27
7
0
17 Aug 2022
VQ-T: RNN Transducers using Vector-Quantized Prediction Network States
Jiatong Shi
G. Saon
David Haws
Shinji Watanabe
Brian Kingsbury
32
3
0
03 Aug 2022
Learning Phone Recognition from Unpaired Audio and Phone Sequences Based on Generative Adversarial Network
Da-Rong Liu
Po-Chun Hsu
Yi-Chen Chen
Sung-Feng Huang
Shun-Po Chuang
Da-Yi Wu
Hung-yi Lee
GAN
31
7
0
29 Jul 2022
Improving Mandarin Speech Recogntion with Block-augmented Transformer
Xiaoming Ren
Huifeng Zhu
Liuwei Wei
Minghui Wu
Jie Hao
40
9
0
24 Jul 2022
ILASR: Privacy-Preserving Incremental Learning for Automatic Speech Recognition at Production Scale
Gopinath Chennupati
Milind Rao
Gurpreet Chadha
Aaron Eakin
A. Raju
...
Andrew Oberlin
Buddha Nandanoor
Prahalad Venkataramanan
Zheng Wu
Pankaj Sitpure
CLL
29
8
0
19 Jul 2022