Deep Speech: Scaling up end-to-end speech recognition

17 December 2014
Awni Y. Hannun
Carl Case
Jared Casper
Bryan Catanzaro
G. Diamos
Erich Elsen
R. Prenger
S. Satheesh
Shubho Sengupta
Adam Coates
A. Ng

Papers citing "Deep Speech: Scaling up end-to-end speech recognition"

50 / 750 papers shown
MSDT: Masked Language Model Scoring Defense in Text Domain
Jaechul Roh
Minhao Cheng
Yajun Fang
AAML
15
1
0
10 Nov 2022
Peak-First CTC: Reducing the Peak Latency of CTC Models by Applying Peak-First Regularization
Zhengkun Tian
Hongyu Xiang
Min Li
Fei Lin
Ke Ding
Guanglu Wan
13
6
0
07 Nov 2022
Data-free Defense of Black Box Models Against Adversarial Attacks
Gaurav Kumar Nayak
Inder Khatri
Ruchit Rawal
Anirban Chakraborty
AAML
25
1
0
03 Nov 2022
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Yonggan Fu
Yang Zhang
Kaizhi Qian
Zhifan Ye
Zhongzhi Yu
Cheng-I Jeff Lai
Yingyan Lin
30
8
0
02 Nov 2022
Modular Hybrid Autoregressive Transducer
Zhong Meng
Tongzhou Chen
Rohit Prabhavalkar
Yu Zhang
Gary Wang
...
Bhuvana Ramabhadran
Yifan Jiang
Ehsan Variani
Yinghui Huang
Pedro J. Moreno
34
20
0
31 Oct 2022
BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model
Yosuke Higuchi
Brian Yan
Siddhant Arora
Tetsuji Ogawa
Tetsunori Kobayashi
Shinji Watanabe
54
25
0
29 Oct 2022
ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition
Sanchit Gandhi
Patrick von Platen
Alexander M. Rush
30
24
0
24 Oct 2022
10 hours data is all you need
Zeping Min
Qian Ge
Zhong Li
18
2
0
24 Oct 2022
Improving Semi-supervised End-to-end Automatic Speech Recognition using CycleGAN and Inter-domain Losses
C. Li
Ngoc Thang Vu
21
2
0
20 Oct 2022
Accelerating Transfer Learning with Near-Data Computation on Cloud Object Stores
Arsany Guirguis
Diana Petrescu
Florin Dinu
D. Quoc
Javier Picorel
R. Guerraoui
40
0
0
16 Oct 2022
Deep learning model compression using network sensitivity and gradients
M. Sakthi
N. Yadla
Raj Pawate
21
2
0
11 Oct 2022
JoeyS2T: Minimalistic Speech-to-Text Modeling with JoeyNMT
Mayumi Ohta
Julia Kreutzer
Stefan Riezler
19
0
0
05 Oct 2022
E-Branchformer: Branchformer with Enhanced merging for speech recognition
Kwangyoun Kim
Felix Wu
Yifan Peng
Jing Pan
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
61
105
0
30 Sep 2022
ZeroEGGS: Zero-shot Example-based Gesture Generation from Speech
Saeed Ghorbani
Ylva Ferstl
Daniel Holden
N. Troje
M. Carbonneau
30
79
0
15 Sep 2022
Deep Speech Synthesis from Articulatory Representations
Peter Wu
Shinji Watanabe
L. Goldstein
A. Black
Gopala K. Anumanchipalli
39
24
0
13 Sep 2022
Synthesizing Photorealistic Virtual Humans Through Cross-modal Disentanglement
S. Ravichandran
Ondrej Texler
Dimitar Dinev
Hyun Jae Kang
17
4
0
03 Sep 2022
Universal Fourier Attack for Time Series
Elizabeth Coda
B. Clymer
Chance N. DeSmet
Y. Watkins
Michael Girard
28
1
0
02 Sep 2022
RL-DistPrivacy: Privacy-Aware Distributed Deep Inference for low latency IoT systems
Emna Baccour
A. Erbad
Amr M. Mohamed
Mounir Hamdi
M. Guizani
30
12
0
27 Aug 2022
Not All GPUs Are Created Equal: Characterizing Variability in Large-Scale, Accelerator-Rich Systems
Prasoon Sinha
Akhil Guliani
Rutwik Jain
Brandon Tran
Matthew D. Sinclair
Shivaram Venkataraman
19
17
0
23 Aug 2022
How does the degree of novelty impacts semi-supervised representation learning for novel class retrieval?
Q. Leroy
Olivier Buisson
Alexis Joly
SSL
21
0
0
17 Aug 2022
Unifying Gradients to Improve Real-world Robustness for Deep Networks
Yingwen Wu
Sizhe Chen
Kun Fang
X. Huang
AAML
32
3
0
12 Aug 2022
Zeus: Understanding and Optimizing GPU Energy Consumption of DNN Training
Jie You
Jaehoon Chung
Mosharaf Chowdhury
26
75
0
12 Aug 2022
Pronunciation-aware unique character encoding for RNN Transducer-based Mandarin speech recognition
Peng Shen
Xugang Lu
Hisashi Kawai
19
2
0
29 Jul 2022
Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis
Shuai Shen
Wanhua Li
Zhengbiao Zhu
Yueqi Duan
Jie Zhou
Jiwen Lu
CVBM
25
105
0
24 Jul 2022
Improving spatial cues for hearables using a parameterized binaural CDR estimator
Reza Ghanavi
C. Jin
16
1
0
17 Jul 2022
End-to-End Spoken Language Understanding: Performance analyses of a voice command task in a low resource setting
Thierry Desot
François Portet
Michel Vacher
27
12
0
17 Jul 2022
pMCT: Patched Multi-Condition Training for Robust Speech Recognition
Pablo Peso Parada
A. Dobrowolska
Karthikeyan P. Saravanan
Mete Ozay
40
6
0
11 Jul 2022
Adversarial Ensemble Training by Jointly Learning Label Dependencies and Member Models
Lele Wang
B. Liu
UQCV
23
4
0
29 Jun 2022
The Makerere Radio Speech Corpus: A Luganda Radio Corpus for Automatic Speech Recognition
Jonathan Mukiibi
Andrew Katumba
J. Nakatumba‐Nabende
Ali Hussein
Josh Meyer
22
7
0
20 Jun 2022
Residual Language Model for End-to-end Speech Recognition
E. Tsunoo
Yosuke Kashiwagi
Chaitanya Narisetty
Shinji Watanabe
19
11
0
15 Jun 2022
Local Identifiability of Deep ReLU Neural Networks: the Theory
Joachim Bona-Pellissier
François Malgouyres
F. Bachoc
FAtt
67
6
0
15 Jun 2022
AS2T: Arbitrary Source-To-Target Adversarial Attack on Speaker Recognition Systems
Guangke Chen
Zhe Zhao
Fu Song
Sen Chen
Lingling Fan
Yang Liu
AAML
32
18
0
07 Jun 2022
LegoNN: Building Modular Encoder-Decoder Models
Siddharth Dalmia
Dmytro Okhonko
M. Lewis
Sergey Edunov
Shinji Watanabe
Florian Metze
Luke Zettlemoyer
Abdel-rahman Mohamed
AuLLM
MoE
29
14
0
07 Jun 2022
Speech Augmentation Based Unsupervised Learning for Keyword Spotting
Jian Luo
Jianzong Wang
Ning Cheng
Haobin Tang
Jing Xiao
SSL
23
2
0
28 May 2022
Improving CTC-based ASR Models with Gated Interlayer Collaboration
Yuting Yang
Yuke Li
Binbin Du
31
11
0
25 May 2022
Deep Learning for Visual Speech Analysis: A Survey
Changchong Sheng
Gangyao Kuang
L. Bai
Chen Hou
Y. Guo
Xin Xu
M. Pietikäinen
Li Liu
VLM
26
33
0
22 May 2022
Cardinality-Minimal Explanations for Monotonic Neural Networks
Ouns El Harzli
Bernardo Cuenca Grau
Ian Horrocks
FAtt
38
5
0
19 May 2022
Emotion-Controllable Generalized Talking Face Generation
Sanjana Sinha
S. Biswas
Ravindra Yadav
Brojeshwar Bhowmick
CVBM
15
49
0
02 May 2022
A Novel Speech-Driven Lip-Sync Model with CNN and LSTM
Xiaohong Li
Xiang Wang
Kai Wang
Shiguo Lian
16
4
0
02 May 2022
Extricating IoT Devices from Vendor Infrastructure with Karl
Gina Yuan
David Mazières
Matei A. Zaharia
18
5
0
28 Apr 2022
Improving Self-Supervised Learning-based MOS Prediction Networks
Bálint Gyires-Tóth
Csaba Zainkó
SSL
14
1
0
23 Apr 2022
Adversarial Scratches: Deployable Attacks to CNN Classifiers
Loris Giulivi
Malhar Jere
Loris Rossi
F. Koushanfar
Gabriela F. Cretu-Ciocarlie
Briland Hitaj
Giacomo Boracchi
AAML
20
18
0
20 Apr 2022
STRATA: Word Boundaries & Phoneme Recognition From Continuous Urdu Speech using Transfer Learning, Attention, & Data Augmentation
Saad Naeem
Omer Beg
6
0
0
16 Apr 2022
A Unified Cascaded Encoder ASR Model for Dynamic Model Sizes
Shaojin Ding
Weiran Wang
Ding Zhao
Tara N. Sainath
Yanzhang He
...
Qiao Liang
Dongseong Hwang
Ian McGraw
Rohit Prabhavalkar
Trevor Strohman
30
17
0
13 Apr 2022
A Wav2vec2-Based Experimental Study on Self-Supervised Learning Methods to Improve Child Speech Recognition
Rishabh Jain
Andrei Barcovschi
Mariam Yiwere
Dan Bigioi
Peter Corcoran
H. Cucu
22
31
0
06 Apr 2022
Successes and critical failures of neural networks in capturing human-like speech recognition
Federico Adolfi
J. Bowers
David Poeppel
UQCV
22
19
0
06 Apr 2022
Lip to Speech Synthesis with Visual Context Attentional GAN
Minsu Kim
Joanna Hong
Y. Ro
23
50
0
04 Apr 2022
Deep Speech Based End-to-End Automated Speech Recognition (ASR) for Indian-English Accents
P. Dubey
B. Shah
6
13
0
03 Apr 2022
Multi-task RNN-T with Semantic Decoder for Streamable Spoken Language Understanding
Xuandi Fu
Feng-Ju Chang
Martin H. Radfar
Kailin Wei
Jing Liu
Grant P. Strimel
Kanthashree Mysore Sathyendra
16
4
0
01 Apr 2022
End-to-End Integration of Speech Recognition, Speech Enhancement, and Self-Supervised Learning Representation
Xuankai Chang
Takashi Maekaku
Yuya Fujita
Shinji Watanabe
VLM
51
45
0
01 Apr 2022