arXiv:1508.01211
Listen, Attend and Spell

5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
    RALM

Papers citing "Listen, Attend and Spell"

50 / 1,041 papers shown
ReconVAT: A Semi-Supervised Automatic Music Transcription Framework for Low-Resource Real-World Data
K. Cheuk
Dorien Herremans
Li Su
208
34
0
11 Jul 2021
On lattice-free boosted MMI training of HMM and CTC-based full-context ASR models
Xiaohui Zhang
Vimal Manohar
David C. Zhang
Frank Zhang
Yangyang Shi
Nayan Singhal
Julian Chan
Fuchun Peng
Yatharth Saraf
M. Seltzer
91
14
0
09 Jul 2021
End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning
Tomohiro Tanaka
Ryo Masumura
Mana Ihori
Akihiko Takashima
Shota Orihashi
Naoki Makishima
45
4
0
07 Jul 2021
Instant One-Shot Word-Learning for Context-Specific Neural Sequence-to-Sequence Speech Recognition
Christian Huber
Juan Hussain
Sebastian Stüker
A. Waibel
73
27
0
05 Jul 2021
Relaxed Attention: A Simple Method to Boost Performance of End-to-End Automatic Speech Recognition
Timo Lohrenz
P. Schwarz
Zhengyang Li
Tim Fingscheidt
54
11
0
02 Jul 2021
What do End-to-End Speech Models Learn about Speaker, Language and Channel Information? A Layer-wise and Neuron-level Analysis
Shammur A. Chowdhury
Nadir Durrani
Ahmed M. Ali
120
16
0
01 Jul 2021
On joint training with interfaces for spoken language understanding
A. Raju
Milind Rao
Gautam Tiwari
Pranav Dheram
Bryan Anderson
Zhe Zhang
Chul Lee
Bach Bui
Ariya Rastrow
VLM
55
11
0
30 Jun 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
139
359
0
29 Jun 2021
Where are we in semantic concept extraction for Spoken Language Understanding?
Sahar Ghannay
Antoine Caubrière
Salima Mdhaffar
G. Laperriere
Bassam Jabaian
Yannick Esteve
67
18
0
24 Jun 2021
Towards Automatic Speech to Sign Language Generation
Parul Kapoor
Rudrabha Mukhopadhyay
Sindhu B. Hegde
Vinay P. Namboodiri
C. V. Jawahar
SLR
54
12
0
24 Jun 2021
Efficient Conformer with Prob-Sparse Attention Mechanism for End-to-End Speech Recognition
Xiong Wang
Sining Sun
Lei Xie
Long Ma
65
20
0
17 Jun 2021
Layer Pruning on Demand with Intermediate CTC
Jaesong Lee
Jingu Kang
Shinji Watanabe
45
18
0
17 Jun 2021
Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition
Yosuke Higuchi
Niko Moritz
Jonathan Le Roux
Takaaki Hori
VLM
132
53
0
16 Jun 2021
Attention-Based Keyword Localisation in Speech using Visual Grounding
Kayode Olaleye
Herman Kamper
66
13
0
16 Jun 2021
SynthASR: Unlocking Synthetic Data for Speech Recognition
A. Fazel
Wei Yang
Yulan Liu
Roberto Barra-Chicote
Yi Meng
Roland Maas
J. Droppo
SyDa
110
51
0
14 Jun 2021
Improving RNN-T ASR Performance with Date-Time and Location Awareness
Swayambhu Nath Ray
Soumyajit Mitra
Raghavendra Bilgi
Sri Garimella
38
5
0
11 Jun 2021
Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition
Max W. Y. Lam
Jun Wang
Chao Weng
Jane Polak Scowcroft
Dong Yu
68
6
0
08 Jun 2021
Data Augmentation Methods for End-to-end Speech Recognition on Distant-Talk Scenarios
E. Tsunoo
Kentarou Shibata
Chaitanya Narisetty
Yosuke Kashiwagi
Shinji Watanabe
71
12
0
07 Jun 2021
Approximate Fixed-Points in Recurrent Neural Networks
Zhengxiong Wang
Anton Ragni
27
4
0
04 Jun 2021
Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition
Zhong Meng
Yu-Huan Wu
Naoyuki Kanda
Liang Lu
Xie Chen
Guoli Ye
Eric Sun
Jinyu Li
Jiawei Liu
MoMe
101
21
0
04 Jun 2021
Towards One Model to Rule All: Multilingual Strategy for Dialectal Code-Switching Arabic ASR
Shammur A. Chowdhury
A. Hussein
Ahmed Abdelali
Ahmed M. Ali
78
36
0
31 May 2021
Listen with Intent: Improving Speech Recognition with Audio-to-Intent Front-End
Swayambhu Nath Ray
Minhua Wu
A. Raju
Pegah Ghahremani
Raghavendra Bilgi
Milind Rao
Harish Arsikere
Ariya Rastrow
A. Stolcke
J. Droppo
68
11
0
14 May 2021
Exploring CTC Based End-to-End Techniques for Myanmar Speech Recognition
Khin Me Me Chit
Laet Laet Lin
40
4
0
13 May 2021
Quantifying and Maximizing the Benefits of Back-End Noise Adaption on Attention-Based Speech Recognition Models
Coleman Hooper
Thierry Tambe
Gu-Yeon Wei
43
0
0
03 May 2021
On the limit of English conversational speech recognition
Zoltán Tüske
G. Saon
Brian Kingsbury
94
50
0
03 May 2021
On Addressing Practical Challenges for RNN-Transducer
Rui Zhao
Jian Xue
Jinyu Li
Wenning Wei
Lei He
Jiawei Liu
72
32
0
27 Apr 2021
Bridging the gap between streaming and non-streaming ASR systems by distilling ensembles of CTC and RNN-T models
Thibault Doutre
Wei Han
Chung-Cheng Chiu
Ruoming Pang
Olivier Siohan
Liangliang Cao
68
5
0
25 Apr 2021
LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech
Solène Evain
H. Nguyen
Hang Le
Marcely Zanon Boito
Salima Mdhaffar
...
François Portet
Solange Rossato
Fabien Ringeval
D. Schwab
Laurent Besacier
SSL
119
70
0
23 Apr 2021
Fast Text-Only Domain Adaptation of RNN-Transducer Prediction Network
Janne Pylkkönen
Antti Ukkonen
Juho Kilpikoski
Samu Tamminen
Hannes Heikinheimo
60
27
0
22 Apr 2021
Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers
Takaaki Hori
Niko Moritz
Chiori Hori
Jonathan Le Roux
76
34
0
19 Apr 2021
Non-linear Functional Modeling using Neural Networks
Aniruddha Rajendra Rao
M. Reimherr
66
31
0
19 Apr 2021
Acoustic Data-Driven Subword Modeling for End-to-End Speech Recognition
Wei Zhou
Mohammad Zeineldeen
Zuoyun Zheng
Ralf Schluter
Hermann Ney
77
14
0
19 Apr 2021
A Method to Reveal Speaker Identity in Distributed ASR Training, and How to Counter It
Trung D. Q. Dang
Om Thakkar
Swaroop Indra Ramaswamy
Rajiv Mathews
Peter Chin
Françoise Beaufays
FedML
53
10
0
15 Apr 2021
Integration of Pre-trained Networks with Continuous Token Interface for End-to-End Spoken Language Understanding
S. Seo
Donghyun Kwak
Bowon Lee
85
33
0
15 Apr 2021
Annealing Knowledge Distillation
A. Jafari
Mehdi Rezagholizadeh
Pranav Sharma
A. Ghodsi
98
79
0
14 Apr 2021
Efficient conformer-based speech recognition with linear attention
Shengqiang Li
Menglong Xu
Xiao-Lei Zhang
60
23
0
14 Apr 2021
Investigating Methods to Improve Language Model Integration for Attention-based Encoder-Decoder ASR Models
Mohammad Zeineldeen
Aleksandr Glushko
Wilfried Michel
Albert Zeyer
Ralf Schluter
Hermann Ney
AuLLM
48
40
0
12 Apr 2021
Non-autoregressive Transformer-based End-to-end ASR using BERT
Fu-Hao Yu
Kuan-Yu Chen
59
23
0
10 Apr 2021
Lip reading using external viseme decoding
J. Peymanfard
Mohammad Reza Mohammadi
Hossein Zeinali
N. Mozayani
53
11
0
10 Apr 2021
Boundary and Context Aware Training for CIF-based Non-Autoregressive End-to-end ASR
Fan Yu
Haoneng Luo
Pengcheng Guo
Yuhao Liang
Zhuoyuan Yao
Lei Xie
Yingying Gao
Leijing Hou
Shilei Zhang
36
11
0
10 Apr 2021
Language model fusion for streaming end to end speech recognition
Rodrigo Cabrera
Xiaofeng Liu
M. Ghodsi
Zebulun Matteson
Eugene Weinstein
Anjuli Kannan
MoMe AI4TS
70
14
0
09 Apr 2021
On Architectures and Training for Raw Waveform Feature Extraction in ASR
Peter Vieting
Christoph Luscher
Wilfried Michel
Ralf Schluter
Hermann Ney
59
10
0
09 Apr 2021
FSR: Accelerating the Inference Process of Transducer-Based Models by Applying Fast-Skip Regularization
Zhengkun Tian
Jiangyan Yi
Ye Bai
J. Tao
Shuai Zhang
Zhengqi Wen
51
16
0
07 Apr 2021
Darts-Conformer: Towards Efficient Gradient-Based Neural Architecture Search For End-to-End ASR
Xian Shi
Pan Zhou
Wei Chen
Lei Xie
81
17
0
07 Apr 2021
Extremely Low Footprint End-to-End ASR System for Smart Device
Zhifu Gao
Yiwu Yao
Shiliang Zhang
Jun Yang
Ming Lei
Ian Mcloughlin
43
13
0
06 Apr 2021
Non-autoregressive Mandarin-English Code-switching Speech Recognition
Shun-Po Chuang
Heng-Jui Chang
Sung-Feng Huang
Hung-yi Lee
90
15
0
06 Apr 2021
Understanding Medical Conversations: Rich Transcription, Confidence Scores & Information Extraction
H. Soltau
Mingqiu Wang
Izhak Shafran
Laurent El Shafey
MedIm LM&MA
75
13
0
06 Apr 2021
Dissecting User-Perceived Latency of On-Device E2E Speech Recognition
Yuan Shangguan
Rohit Prabhavalkar
Hang Su
Jay Mahadeokar
Yangyang Shi
...
Chunyang Wu
Duc Le
Ozlem Kalinli
Christian Fuegen
M. Seltzer
57
29
0
06 Apr 2021
SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network
William Chan
Daniel S. Park
Chris A. Lee
Yu Zhang
Quoc V. Le
Mohammad Norouzi
AI4TS
90
138
0
05 Apr 2021
Streaming Multi-talker Speech Recognition with Joint Speaker Identification
Liang Lu
Naoyuki Kanda
Jinyu Li
Jiawei Liu
75
20
0
05 Apr 2021