ResearchTrend.AI
EESEN: End-to-End Speech Recognition using Deep RNN Models and WFST-based Decoding

29 July 2015
Yajie Miao
M. Gowayyed
Florian Metze

Papers citing "EESEN: End-to-End Speech Recognition using Deep RNN Models and WFST-based Decoding"

50 / 264 papers shown
Machine Learning and statistical classification of CRISPR-Cas12a diagnostic assays
Nathan Khosla
Jake M. Lesinski
Marcus Haywood-Alexander
Andrew J. deMello
Daniel A. Richards
36
0
0
08 Jan 2025
Target word activity detector: An approach to obtain ASR word boundaries without lexicon
S. Sivasankaran
Eric Sun
Jinyu Li
Yan-ping Huang
Jing Pan
30
0
0
20 Sep 2024
Comparing Discrete and Continuous Space LLMs for Speech Recognition
Yaoxun Xu
Shi-Xiong Zhang
Jianwei Yu
Zhiyong Wu
Dong Yu
AuLLM
17
3
0
01 Sep 2024
Benchmarking the Performance of Large Language Models on the Cerebras Wafer Scale Engine
Zuoning Zhang
Dhruv Parikh
Youning Zhang
Viktor Prasanna
31
1
0
30 Aug 2024
Improving Speech Recognition Error Prediction for Modern and Off-the-shelf Speech Recognizers
Prashant Serai
Peidong Wang
Eric Fosler-Lussier
24
6
0
21 Aug 2024
Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition
Ye Bai
Jingping Chen
Jitong Chen
Wei Chen
Zhuo Chen
...
Wanyi Zhang
Yang Zhang
Yawei Zhang
Yijie Zheng
Ming Zou
AuLLM
49
19
0
05 Jul 2024
Decoder-only Architecture for Streaming End-to-end Speech Recognition
E. Tsunoo
Hayato Futami
Yosuke Kashiwagi
Siddhant Arora
Shinji Watanabe
RALM
AuLLM
36
6
0
23 Jun 2024
Transformer-based Model for ASR N-Best Rescoring and Rewriting
Iwen E. Kang
Christophe Van Gysel
Man-Hung Siu
39
2
0
12 Jun 2024
Whistle: Data-Efficient Multilingual and Crosslingual Speech Recognition via Weakly Phonetic Supervision
Saierdaer Yusuyin
Te Ma
Hao Huang
Wenbo Zhao
Zhijian Ou
49
2
0
04 Jun 2024
A Multimodal Approach to Device-Directed Speech Detection with Large Language Models
Dominik Wagner
Alexander W. Churchill
Siddharth Sigtia
Panayiotis Georgiou
Matt Mirsamadi
Aarshee Mishra
Erik Marchi
49
6
0
21 Mar 2024
A unified multichannel far-field speech recognition system: combining neural beamforming with attention based end-to-end model
Dongdi Zhao
Jianbo Ma
Lu Lu
Jinke Li
Xuan Ji
Lei Zhu
Fuming Fang
Ming-Yu Liu
Feijun Jiang
15
1
0
05 Jan 2024
Revisiting the Entropy Semiring for Neural Speech Recognition
Oscar Chang
DongSeon Hwang
Olivier Siohan
24
2
0
13 Dec 2023
Multimodal Data and Resource Efficient Device-Directed Speech Detection with Large Foundation Models
Dominik Wagner
Alexander W. Churchill
Siddharth Sigtia
Panayiotis Georgiou
Matt Mirsamadi
Aarshee Mishra
Erik Marchi
17
3
0
06 Dec 2023
Hierarchically Gated Recurrent Neural Network for Sequence Modeling
Zhen Qin
Songlin Yang
Yiran Zhong
36
74
0
08 Nov 2023
Personalization of CTC-based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization
Zhihong Lei
Ernest Pusateri
Shiyi Han
Leo Liu
Mingbin Xu
...
R. Travadi
Youyuan Zhang
Mirko Hannemann
Man-Hung Siu
Zhen Huang
23
9
0
16 Oct 2023
Learning from Flawed Data: Weakly Supervised Automatic Speech Recognition
Dongji Gao
Hainan Xu
Desh Raj
Leibny Paola García Perera
Daniel Povey
Sanjeev Khudanpur
27
4
0
26 Sep 2023
Massive End-to-end Models for Short Search Queries
Weiran Wang
Rohit Prabhavalkar
Dongseong Hwang
Qiujia Li
K. Sim
...
Zhong Meng
CJ Zheng
Yanzhang He
Tara N. Sainath
P. M. Mengibar
32
2
0
22 Sep 2023
Self-distillation Regularized Connectionist Temporal Classification Loss for Text Recognition: A Simple Yet Effective Approach
Ziyin Zhang
Ning Lu
Minghui Liao
Yongshuai Huang
Cheng Li
Min Wang
Wei Peng
28
11
0
17 Aug 2023
Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition
Hanjing Zhu
Dongji Gao
Gaofeng Cheng
Daniel Povey
Pengyuan Zhang
Yonghong Yan
NoLa
38
4
0
12 Aug 2023
Integration of Frame- and Label-synchronous Beam Search for Streaming Encoder-decoder Speech Recognition
E. Tsunoo
Hayato Futami
Yosuke Kashiwagi
Siddhant Arora
Shinji Watanabe
27
4
0
24 Jul 2023
Accelerating Transducers through Adjacent Token Merging
Yuang Li
Yu-Huan Wu
Jinyu Li
Shujie Liu
22
4
0
28 Jun 2023
EM-Network: Oracle Guided Self-distillation for Sequence Learning
J. Yoon
Sunghwan Ahn
Hyeon Seung Lee
Minchan Kim
Seokhwan Kim
N. Kim
VLM
30
2
0
14 Jun 2023
Improving Frame-level Classifier for Word Timings with Non-peaky CTC in End-to-End Automatic Speech Recognition
Xianzhao Chen
Yist Y. Lin
Kang Wang
Yi He
Zejun Ma
26
2
0
09 Jun 2023
Bypass Temporal Classification: Weakly Supervised Automatic Speech Recognition with Imperfect Transcripts
Dongji Gao
Matthew Wiesner
Hainan Xu
Leibny Paola García
Daniel Povey
Sanjeev Khudanpur
13
8
0
01 Jun 2023
Weakly-supervised forced alignment of disfluent speech using phoneme-level modeling
Theodoros Kouzelis
Georgios Paraskevopoulos
Athanasios Katsamanis
V. Katsouros
18
8
0
30 May 2023
Study of GANs for Noisy Speech Simulation from Clean Speech
L. Maben
Zixun Guo
Chen Chen
Utkarsh Chudiwal
Chng Eng Siong
14
0
0
21 May 2023
Blank-regularized CTC for Frame Skipping in Neural Transducer
Yifan Yang
Xiaoyu Yang
Liyong Guo
Zengwei Yao
Wei Kang
Fangjun Kuang
Long Lin
Xie Chen
Daniel Povey
16
8
0
19 May 2023
Masked Audio Text Encoders are Effective Multi-Modal Rescorers
Jason (Jinglun) Cai
Monica Sunkara
Xilai Li
Anshu Bhatia
Xiao Pan
S. Bodapati
28
3
0
11 May 2023
Powerful and Extensible WFST Framework for RNN-Transducer Losses
A. Laptev
Vladimir Bataev
Igor Gitman
Boris Ginsburg
21
3
0
18 Mar 2023
End-to-End Speech Recognition: A Survey
Rohit Prabhavalkar
Takaaki Hori
Tara N. Sainath
Ralf Schluter
Shinji Watanabe
VLM
26
149
0
03 Mar 2023
LiteLSTM Architecture Based on Weights Sharing for Recurrent Neural Networks
Nelly Elsayed
Zag ElSayed
Anthony Maida
29
0
0
12 Jan 2023
Once-for-All Sequence Compression for Self-Supervised Speech Models
Hsuan-Jui Chen
Yen Meng
Hung-yi Lee
27
4
0
04 Nov 2022
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Yonggan Fu
Yang Zhang
Kaizhi Qian
Zhifan Ye
Zhongzhi Yu
Cheng-I Jeff Lai
Yingyan Lin
27
8
0
02 Nov 2022
Blank Collapse: Compressing CTC emission for the faster decoding
Minkyu Jung
Ohhyeok Kwon
S. Seo
Soonshin Seo
31
3
0
31 Oct 2022
Improving Semi-supervised End-to-end Automatic Speech Recognition using CycleGAN and Inter-domain Losses
C. Li
Ngoc Thang Vu
21
2
0
20 Oct 2022
End-to-End Integration of Speech Recognition, Dereverberation, Beamforming, and Self-Supervised Learning Representation
Yoshiki Masuyama
Xuankai Chang
Samuele Cornell
Shinji Watanabe
Nobutaka Ono
17
19
0
19 Oct 2022
LeVoice ASR Systems for the ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge
Yan Jia
Mihee Hong
Jingyu Hou
Kailong Ren
Sifan Ma
Jin Wang
Fangzhen Peng
Yinglin Ji
Lin Yang
Junjie Wang
25
1
0
14 Oct 2022
On Compressing Sequences for Self-Supervised Speech Models
Yen Meng
Hsuan-Jui Chen
Jiatong Shi
Shinji Watanabe
Paola García
Hung-yi Lee
Hao Tang
SSL
15
14
0
13 Oct 2022
ASR2K: Speech Recognition for Around 2000 Languages without Audio
Xinjian Li
Florian Metze
David R. Mortensen
A. Black
Shinji Watanabe
20
27
0
06 Sep 2022
A Deep Learning Approach to Detect Lean Blowout in Combustion Systems
Tryambak Gangopadhyay
S. De
Qisai Liu
A. Mukhopadhyay
S. Sen
S. Sarkar
18
1
0
03 Aug 2022
PoLyScriber: Integrated Fine-tuning of Extractor and Lyrics Transcriber for Polyphonic Music
Xiaoxue Gao
Chitralekha Gupta
Haizhou Li
33
7
0
15 Jul 2022
Improving Streaming End-to-End ASR on Transformer-based Causal Models with Encoder States Revision Strategies
Zehan Li
Haoran Miao
Keqi Deng
Gaofeng Cheng
Sanli Tian
Ta Li
Yonghong Yan
KELM
24
4
0
06 Jul 2022
Finstreder: Simple and fast Spoken Language Understanding with Finite State Transducers using modern Speech-to-Text models
Daniel Bermuth
Alexander Poeppel
W. Reif
23
7
0
29 Jun 2022
Residual Language Model for End-to-end Speech Recognition
E. Tsunoo
Yosuke Kashiwagi
Chaitanya Narisetty
Shinji Watanabe
17
11
0
15 Jun 2022
LegoNN: Building Modular Encoder-Decoder Models
Siddharth Dalmia
Dmytro Okhonko
M. Lewis
Sergey Edunov
Shinji Watanabe
Florian Metze
Luke Zettlemoyer
Abdel-rahman Mohamed
AuLLM
MoE
29
14
0
07 Jun 2022
Improving CTC-based ASR Models with Gated Interlayer Collaboration
Yuting Yang
Yuke Li
Binbin Du
28
11
0
25 May 2022
Insights on Neural Representations for End-to-End Speech Recognition
A. Ollerenshaw
Md. Asif Jalal
Thomas Hain
14
7
0
19 May 2022
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages
Felix Wu
Kwangyoun Kim
Shinji Watanabe
Kyu Jeong Han
Ryan T. McDonald
Kilian Q. Weinberger
Yoav Artzi
SyDa
45
37
0
02 May 2022
Supervised Attention in Sequence-to-Sequence Models for Speech Recognition
Gene-Ping Yang
Hao Tang
15
2
0
25 Apr 2022
Memory-Efficient Training of RNN-Transducer with Sampled Softmax
Jaesong Lee
Lukas Lee
Shinji Watanabe
25
8
0
31 Mar 2022