Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition

31 October 2016
H. Soltau
H. Liao
Hasim Sak

Papers citing "Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition"

50 / 64 papers shown
Whistle: Data-Efficient Multilingual and Crosslingual Speech Recognition via Weakly Phonetic Supervision
Saierdaer Yusuyin
Te Ma
Hao Huang
Wenbo Zhao
Zhijian Ou
52
2
0
04 Jun 2024
Transformers versus LSTMs for electronic trading
Paul Bilokon
Yitao Qiu
AI4TS
AIFin
21
13
0
20 Sep 2023
Timestamped Embedding-Matching Acoustic-to-Word CTC ASR
Woojay Jeon
27
0
0
20 Jun 2023
Dual-Attention Neural Transducers for Efficient Wake Word Spotting in Speech Recognition
Saumya Yashmohini Sahai
Jing Liu
Thejaswi Muniyappa
Kanthashree Mysore Sathyendra
Anastasios Alexandridis
...
Ross McGowan
Ariya Rastrow
Feng-Ju Chang
Athanasios Mouchtaris
Siegfried Kunzmann
39
5
0
03 Apr 2023
LiteLSTM Architecture Based on Weights Sharing for Recurrent Neural Networks
Nelly Elsayed
Zag ElSayed
Anthony Maida
32
0
0
12 Jan 2023
MOPRD: A multidisciplinary open peer review dataset
Jialiang Lin
Jiaxin Song
Zhangping Zhou
Yidong Chen
X. Shi
31
12
0
09 Dec 2022
Accelerating RNN-T Training and Inference Using CTC guidance
Yongqiang Wang
Zhehuai Chen
Cheng-yong Zheng
Yu Zhang
Wei Han
Parisa Haghani
40
23
0
29 Oct 2022
Improving Semi-supervised End-to-end Automatic Speech Recognition using CycleGAN and Inter-domain Losses
C. Li
Ngoc Thang Vu
21
2
0
20 Oct 2022
Effectiveness of Mining Audio and Text Pairs from Public Data for Improving ASR Systems for Low-Resource Languages
Kaushal Bhogale
A. Raman
Tahir Javed
Sumanth Doddapaneni
Anoop Kunchukuttan
Pratyush Kumar
Mitesh M. Khapra
36
22
0
26 Aug 2022
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages
Felix Wu
Kwangyoun Kim
Shinji Watanabe
Kyu Jeong Han
Ryan T. McDonald
Kilian Q. Weinberger
Yoav Artzi
SyDa
48
38
0
02 May 2022
BERT-LID: Leveraging BERT to Improve Spoken Language Identification
Yuting Nie
Junhong Zhao
Weiqiang Zhang
Jinfeng Bai
VLM
30
5
0
01 Mar 2022
LiteLSTM Architecture for Deep Recurrent Neural Networks
Nelly Elsayed
Zag ElSayed
Anthony Maida
40
5
0
27 Jan 2022
Speech recognition for air traffic control via feature learning and end-to-end training
Peng Fan
Dongyue Guo
Yi Lin
Bo Yang
Jianwei Zhang
15
7
0
04 Nov 2021
Recent Advances in End-to-End Automatic Speech Recognition
Jinyu Li
VLM
35
363
0
02 Nov 2021
Hierarchical Conditional End-to-End ASR with CTC and Multi-Granular Subword Units
Yosuke Higuchi
Keita Karube
Tetsuji Ogawa
Tetsunori Kobayashi
18
23
0
08 Oct 2021
Google Neural Network Models for Edge Devices: Analyzing and Mitigating Machine Learning Inference Bottlenecks
Amirali Boroumand
Saugata Ghose
Berkin Akin
Ravi Narayanaswami
Geraldo F. Oliveira
Xiaoyu Ma
Eric Shiu
O. Mutlu
25
82
0
29 Sep 2021
Tied & Reduced RNN-T Decoder
Rami Botros
Tara N. Sainath
R. David
Emmanuel Guzman
Wei Li
Yanzhang He
38
55
0
15 Sep 2021
Knowledge Distillation from BERT Transformer to Speech Transformer for Intent Classification
Yiding Jiang
Bidisha Sharma
Maulik C. Madhavi
Haizhou Li
36
25
0
05 Aug 2021
End-to-End Speech Recognition from Federated Acoustic Models
Yan Gao
Titouan Parcollet
Salah Zaiem
Javier Fernandez-Marques
Pedro Porto Buarque de Gusmão
Daniel J. Beutel
Nicholas D. Lane
28
43
0
29 Apr 2021
Leveraging Acoustic and Linguistic Embeddings from Pretrained speech and language Models for Intent Classification
Bidisha Sharma
Maulik C. Madhavi
Haizhou Li
24
19
0
15 Feb 2021
Transformer Language Models with LSTM-based Cross-utterance Information Representation
G. Sun
C. Zhang
P. Woodland
76
32
0
12 Feb 2021
Dual Application of Speech Enhancement for Automatic Speech Recognition
Ashutosh Pandey
Chunxi Liu
Yun Wang
Yatharth Saraf
41
37
0
07 Nov 2020
Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition
Zhong Meng
S. Parthasarathy
Eric Sun
Yashesh Gaur
Naoyuki Kanda
Liang Lu
Xie Chen
Rui Zhao
Jinyu Li
Jiawei Liu
AuLLM
19
107
0
03 Nov 2020
Event Prediction in the Big Data Era: A Systematic Survey
Liang Zhao
AI4TS
35
53
0
19 Jul 2020
A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency
Tara N. Sainath
Yanzhang He
Bo-wen Li
A. Narayanan
Ruoming Pang
...
Trevor Strohman
Mirkó Visontai
Yonghui Wu
Yu Zhang
Ding Zhao
25
215
0
28 Mar 2020
High-Accuracy and Low-Latency Speech Recognition with Two-Head Contextual Layer Trajectory LSTM Model
Jinyu Li
Rui Zhao
Eric Sun
J. H. M. Wong
Amit Das
Zhong Meng
Jiawei Liu
VLM
24
24
0
17 Mar 2020
Hybrid Autoregressive Transducer (hat)
Ehsan Variani
David Rybach
Cyril Allauzen
Michael Riley
21
158
0
12 Mar 2020
A Density Ratio Approach to Language Model Fusion in End-To-End Automatic Speech Recognition
Erik McDermott
Hasim Sak
Ehsan Variani
25
112
0
26 Feb 2020
Minimum Bayes Risk Training of RNN-Transducer for End-to-End Speech Recognition
Chao Weng
Chengzhu Yu
Jia Cui
Chunlei Zhang
Dong Yu
91
39
0
28 Nov 2019
End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures
Gabriel Synnaeve
Qiantong Xu
Jacob Kahn
Tatiana Likhomanenko
Edouard Grave
Vineel Pratap
Anuroop Sriram
Vitaliy Liptchinsky
R. Collobert
SSL
AI4TS
36
246
0
19 Nov 2019
A comparison of end-to-end models for long-form speech recognition
Chung-Cheng Chiu
Wei Han
Yu Zhang
Ruoming Pang
S. Kishchenko
...
Anjuli Kannan
Rohit Prabhavalkar
Z. Chen
Tara N. Sainath
Yonghui Wu
AuLLM
25
82
0
06 Nov 2019
Recognizing long-form speech using streaming end-to-end models
A. Narayanan
Rohit Prabhavalkar
Chung-Cheng Chiu
David Rybach
Tara N. Sainath
Trevor Strohman
29
129
0
24 Oct 2019
G2G: TTS-Driven Pronunciation Learning for Graphemic Hybrid ASR
Duc Le
T. Koehler
Christian Fuegen
M. Seltzer
30
16
0
22 Oct 2019
Linking emotions to behaviors through deep transfer learning
Haoqi Li
Brian R. Baucom
P. Georgiou
13
19
0
08 Oct 2019
End-to-End Code-Switching ASR for Low-Resourced Language Pairs
Xianghu Yue
Grandee Lee
Emre Yilmaz
Fang Deng
Haizhou Li
11
30
0
27 Sep 2019
Improving RNN Transducer Modeling for End-to-End Speech Recognition
Jinyu Li
Rui Zhao
Hu Hu
Jiawei Liu
19
170
0
26 Sep 2019
Espresso: A Fast End-to-end Neural Speech Recognition Toolkit
Yiming Wang
Tongfei Chen
Hainan Xu
Shuoyang Ding
Hang Lv
Yiwen Shao
Nanyun Peng
Lei Xie
Shinji Watanabe
Sanjeev Khudanpur
VLM
27
73
0
18 Sep 2019
An Investigation Into On-device Personalization of End-to-end Automatic Speech Recognition Models
K. Sim
P. Zadrazil
F. Beaufays
31
58
0
14 Sep 2019
Multilingual Graphemic Hybrid ASR with Massive Data Augmentation
Chunxi Liu
Qiaochu Zhang
Xiaohui Zhang
Kritika Singh
Yatharth Saraf
Geoffrey Zweig
29
27
0
14 Sep 2019
Joint Speech Recognition and Speaker Diarization via Sequence Transduction
Laurent El Shafey
H. Soltau
Izhak Shafran
33
99
0
09 Jul 2019
Word-level Speech Recognition with a Letter to Word Encoder
R. Collobert
Awni Y. Hannun
Gabriel Synnaeve
3DV
24
4
0
10 Jun 2019
Acoustic-to-Word Models with Conversational Context Information
Suyoun Kim
Florian Metze
22
7
0
21 May 2019
Deep Learning for Audio Signal Processing
Hendrik Purwins
Bo-wen Li
Tuomas Virtanen
Jan Schlüter
Shuo-yiin Chang
Tara N. Sainath
VLM
24
586
0
30 Apr 2019
Guiding CTC Posterior Spike Timings for Improved Posterior Fusion and Knowledge Distillation
Gakuto Kurata
Kartik Audhkhasi
16
46
0
17 Apr 2019
Using multi-task learning to improve the performance of acoustic-to-word and conventional hybrid models
T. Nguyen
Sebastian Stüker
A. Waibel
33
1
0
02 Feb 2019
Speaker Adaptation for End-to-End CTC Models
Ke Li
Jinyu Li
Yong Zhao
Kshitiz Kumar
Jiawei Liu
18
24
0
04 Jan 2019
Bytes are All You Need: End-to-End Multilingual Speech Recognition and Synthesis with Bytes
Bo-wen Li
Yu Zhang
Tara N. Sainath
Yonghui Wu
William Chan
AuLLM
24
129
0
22 Nov 2018
Adversarial Training of End-to-end Speech Recognition Using a Criticizing Language Model
Alexander H. Liu
Hung-yi Lee
Lin-Shan Lee
AuLLM
10
46
0
02 Nov 2018
Toward domain-invariant speech recognition via large scale training
A. Narayanan
Ananya Misra
K. Sim
Golan Pundak
Anshuman Tripathi
Mohamed G. Elfeky
Parisa Haghani
Trevor Strohman
M. Bacchiani
VLM
16
107
0
16 Aug 2018
Multimodal Language Analysis with Recurrent Multistage Fusion
Paul Pu Liang
Liu Ziyin
Amir Zadeh
Louis-Philippe Morency
30
198
0
12 Aug 2018