Deep Speech: Scaling up end-to-end speech recognition


17 December 2014
Awni Y. Hannun
Carl Case
Jared Casper
Bryan Catanzaro
G. Diamos
Erich Elsen
R. Prenger
S. Satheesh
Shubho Sengupta
Adam Coates
A. Ng

Papers citing "Deep Speech: Scaling up end-to-end speech recognition"

50 / 750 papers shown
User-friendly automatic transcription of low-resource languages: Plugging ESPnet into Elpis
Oliver Adams
Benjamin Galliot
Guillaume Wisniewski
Nicholas Lambourne
Ben Foley
...
Laurent Besacier
Christopher Cox
Katya Aplonova
Guillaume Jacques
Nathan W. Hill
32
10
0
15 Dec 2020
C2C-GenDA: Cluster-to-Cluster Generation for Data Augmentation of Slot Filling
Yutai Hou
Sanyuan Chen
Wanxiang Che
Cheng Chen
Ting Liu
8
19
0
13 Dec 2020
Confidence Estimation via Auxiliary Models
Charles Corbière
Nicolas Thome
A. Saporta
Tuan-Hung Vu
Matthieu Cord
P. Pérez
TPM
29
47
0
11 Dec 2020
Speech Recognition for Endangered and Extinct Samoyedic languages
N. Partanen
Mika Hämäläinen
T. Klooster
15
11
0
09 Dec 2020
Analyzing Finite Neural Networks: Can We Trust Neural Tangent Kernel Theory?
Mariia Seleznova
Gitta Kutyniok
AAML
24
29
0
08 Dec 2020
Frame-level SpecAugment for Deep Convolutional Neural Networks in Hybrid ASR Systems
Xinwei Li
Yuanyuan Zhang
Xiaodan Zhuang
Daben Liu
6
6
0
07 Dec 2020
End to End ASR System with Automatic Punctuation Insertion
Yushi Guan
3DV
21
5
0
03 Dec 2020
Learning to dance: A graph convolutional adversarial network to generate realistic dance motions from audio
João P. Ferreira
Thiago M. Coutinho
Thiago L. Gomes
J. F. Neto
Rafael Azevedo
Renato Martins
Erickson R. Nascimento
GAN
36
68
0
25 Nov 2020
Dynamic backdoor attacks against federated learning
Anbu Huang
AAML
FedML
26
20
0
15 Nov 2020
Recognizing More Emotions with Less Data Using Self-supervised Transfer Learning
Jonathan Boigne
Biman Liyanage
Ted Östrem
18
20
0
11 Nov 2020
Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition
Zhong Meng
S. Parthasarathy
Eric Sun
Yashesh Gaur
Naoyuki Kanda
Liang Lu
Xie Chen
Rui Zhao
Jinyu Li
Jiawei Liu
AuLLM
19
107
0
03 Nov 2020
FaceLeaks: Inference Attacks against Transfer Learning Models via Black-box Queries
Seng Pei Liew
Tsubasa Takahashi
MIACV
FedML
12
9
0
27 Oct 2020
HarperValleyBank: A Domain-Specific Spoken Dialog Corpus
Mike Wu
J. Nafziger
A. Scodary
Andrew L. Maas
31
17
0
26 Oct 2020
Stop Bugging Me! Evading Modern-Day Wiretapping Using Adversarial Perturbations
Yael Mathov
Tal Senior
A. Shabtai
Yuval Elovici
36
5
0
24 Oct 2020
On Minimum Word Error Rate Training of the Hybrid Autoregressive Transducer
Liang Lu
Zhong Meng
Naoyuki Kanda
Jinyu Li
Jiawei Liu
24
12
0
23 Oct 2020
Class-Conditional Defense GAN Against End-to-End Speech Attacks
Mohammad Esmaeilpour
P. Cardinal
Alessandro Lameiras Koerich
AAML
21
14
0
22 Oct 2020
Dynamic Layer Customization for Noise Robust Speech Emotion Recognition in Heterogeneous Condition Training
Alex Wilf
E. Provost
26
5
0
21 Oct 2020
Investigating Cross-Domain Losses for Speech Enhancement
Sherif Abdulatif
Karim Armanious
Jayasankar T. Sajeev
Karim Guirguis
B. Yang
19
7
0
20 Oct 2020
Lightweight End-to-End Speech Recognition from Raw Audio Data Using Sinc-Convolutions
Ludwig Kurzinger
Nicolas Lindae
Palle Klewitz
Gerhard Rigoll
27
5
0
15 Oct 2020
Towards Resistant Audio Adversarial Examples
Tom Dörr
Karla Markert
Nicolas M. Muller
Konstantin Böttinger
AAML
25
7
0
14 Oct 2020
Jointly Optimizing Sensing Pipelines for Multimodal Mixed Reality Interaction
Darshana Rathnayake
Ashen de Silva
Dasun Puwakdandawa
L. Meegahapola
Archan Misra
I. Perera
19
3
0
13 Oct 2020
Conditioning Trick for Training Stable GANs
Mohammad Esmaeilpour
Raymel Alfonso Sallo
Olivier St-Georges
P. Cardinal
Alessandro Lameiras Koerich
22
0
0
12 Oct 2020
Pkwrap: a PyTorch Package for LF-MMI Training of Acoustic Models
S. Madikeri
Sibo Tong
Juan Pablo Zuluaga
Apoorv Vyas
P. Motlícek
H. Bourlard
VLM
8
19
0
07 Oct 2020
Digital Voicing of Silent Speech
David Gaddy
Dana Klein
14
50
0
06 Oct 2020
A Unifying Review of Deep and Shallow Anomaly Detection
Lukas Ruff
Jacob R. Kauffmann
Robert A. Vandermeulen
G. Montavon
Wojciech Samek
Marius Kloft
Thomas G. Dietterich
Klaus-Robert Muller
UQCV
20
780
0
24 Sep 2020
A Crowdsourced Open-Source Kazakh Speech Corpus and Initial Speech Recognition Baseline
Yerbolat Khassanov
Saida Mussakhojayeva
A. Mirzakhmetov
A. Adiyev
Mukhamet Nurpeiissov
H. A. Varol
22
30
0
22 Sep 2020
PodSumm -- Podcast Audio Summarization
Aneesh Vartakavi
Amanmeet Garg
6
10
0
22 Sep 2020
End-to-End Bengali Speech Recognition
S. Mandal
Sarthak Yadav
A. Rai
6
5
0
21 Sep 2020
Grounded Adaptation for Zero-shot Executable Semantic Parsing
Victor Zhong
M. Lewis
Sida I. Wang
Luke Zettlemoyer
41
98
0
16 Sep 2020
How Much Can We Really Trust You? Towards Simple, Interpretable Trust Quantification Metrics for Deep Neural Networks
A. Wong
Xiao Yu Wang
Andrew Hryniowski
11
23
0
12 Sep 2020
RECOApy: Data recording, pre-processing and phonetic transcription for end-to-end speech-based applications
Adriana Stan
23
5
0
11 Sep 2020
Explanation of Unintended Radiated Emission Classification via LIME
Tom Grimes
E. Church
W. Pitts
Lynn Wood
11
5
0
04 Sep 2020
CLAN: Continuous Learning using Asynchronous Neuroevolution on Commodity Edge Devices
Parth Mannan
A. Samajdar
T. Krishna
31
2
0
27 Aug 2020
Geometry-guided Dense Perspective Network for Speech-Driven Facial Animation
Jing-ying Liu
Binyuan Hui
Kun Li
Yunke Liu
Yu-Kun Lai
Yuxiang Zhang
Yebin Liu
Jingyu Yang
3DH
CVBM
27
22
0
23 Aug 2020
MASRI-HEADSET: A Maltese Corpus for Speech Recognition
C. Mena
Albert Gatt
A. DeMarco
Claudia Borg
Lonneke van der Plas
Amanda Muscat
Ian Padovani
6
12
0
13 Aug 2020
Attention-based Fully Gated CNN-BGRU for Russian Handwritten Text
Abdelrahman Abdallah
Mohamed Hamada
D. Nurseitov
27
42
0
12 Aug 2020
Transformer with Bidirectional Decoder for Speech Recognition
Xi Chen
Songyang Zhang
Dandan Song
P. Ouyang
Shouyi Yin
18
13
0
11 Aug 2020
TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices
A. Wong
M. Famouri
Maya Pavlova
Siddharth Surana
69
33
0
10 Aug 2020
Improving the Accuracy of Global Forecasting Models using Time Series Data Augmentation
Kasun Bandara
Hansika Hewamalage
Yuan-Hao Liu
Yanfei Kang
Christoph Bergmeir
AI4TS
21
114
0
06 Aug 2020
FRMDN: Flow-based Recurrent Mixture Density Network
S. Razavi
Reshad Hosseini
Tina Behzad
BDL
16
0
0
05 Aug 2020
Word meaning in minds and machines
Brenden M. Lake
G. Murphy
NAI
15
117
0
04 Aug 2020
Privacy-preserving Voice Analysis via Disentangled Representations
Ranya Aloufi
Hamed Haddadi
David E. Boyle
DRL
19
58
0
29 Jul 2020
Autosegmental Neural Nets: Should Phones and Tones be Synchronous or Asynchronous?
Jialu Li
M. Hasegawa-Johnson
20
5
0
28 Jul 2020
Self-Expressing Autoencoders for Unsupervised Spoken Term Discovery
Saurabhchand Bhati
Jesús Villalba
Piotr Żelasko
Najim Dehak
SSL
23
16
0
26 Jul 2020
MP3 Compression To Diminish Adversarial Noise in End-to-End Speech Recognition
I. Andronic
Ludwig Kurzinger
Edgar Ricardo Chavez Rosas
Gerhard Rigoll
B. Seeber
14
15
0
25 Jul 2020
Audio Adversarial Examples for Robust Hybrid CTC/Attention Speech Recognition
Ludwig Kurzinger
Edgar Ricardo Chavez Rosas
Lujun Li
Tobias Watzel
Gerhard Rigoll
AAML
19
4
0
21 Jul 2020
Learning to Generate Customized Dynamic 3D Facial Expressions
Rolandos Alexandros Potamias
Jiali Zheng
Stylianos Ploumpis
Giorgos Bouritsas
Evangelos Ververas
S. Zafeiriou
3DH
31
22
0
19 Jul 2020
Robust Image Classification Using A Low-Pass Activation Function and DCT Augmentation
Md Tahmid Hossain
S. Teng
Ferdous Sohel
Guojun Lu
16
10
0
18 Jul 2020
EZLDA: Efficient and Scalable LDA on GPUs
Shilong Wang
Hang Liu
Anil Gaihre
Hengyong Yu
6
1
0
17 Jul 2020
Data augmentation enhanced speaker enrollment for text-dependent speaker verification
A. K. Sarkar
H. Sarma
Priyanka Dwivedi
Zheng-Hua Tan
6
3
0
12 Jul 2020