Deep Speech: Scaling up end-to-end speech recognition

17 December 2014

Papers citing "Deep Speech: Scaling up end-to-end speech recognition"

50 / 750 papers shown

User-friendly automatic transcription of low-resource languages: Plugging ESPnet into Elpis Oliver Adams Benjamin Galliot Guillaume Wisniewski Nicholas Lambourne Ben Foley ... Laurent Besacier Christopher Cox Katya Aplonova Guillaume Jacques Nathan W. Hill 32 10 0 15 Dec 2020
C2C-GenDA: Cluster-to-Cluster Generation for Data Augmentation of Slot Filling Yutai Hou Sanyuan Chen Wanxiang Che Cheng Chen Ting Liu 8 19 0 13 Dec 2020
Confidence Estimation via Auxiliary Models Charles Corbière Nicolas Thome A. Saporta Tuan-Hung Vu Matthieu Cord P. Pérez TPM 29 47 0 11 Dec 2020
Speech Recognition for Endangered and Extinct Samoyedic languages N. Partanen Mika Hämäläinen T. Klooster 15 11 0 09 Dec 2020
Analyzing Finite Neural Networks: Can We Trust Neural Tangent Kernel Theory? Mariia Seleznova Gitta Kutyniok AAML 24 29 0 08 Dec 2020
Frame-level SpecAugment for Deep Convolutional Neural Networks in Hybrid ASR Systems Xinwei Li Yuanyuan Zhang Xiaodan Zhuang Daben Liu 6 6 0 07 Dec 2020
End to End ASR System with Automatic Punctuation Insertion Yushi Guan 3DV 21 5 0 03 Dec 2020
Learning to dance: A graph convolutional adversarial network to generate realistic dance motions from audio João P. Ferreira Thiago M. Coutinho Thiago L. Gomes J. F. Neto Rafael Azevedo Renato Martins Erickson R. Nascimento GAN 36 68 0 25 Nov 2020
Dynamic backdoor attacks against federated learning Anbu Huang AAML FedML 26 20 0 15 Nov 2020
Recognizing More Emotions with Less Data Using Self-supervised Transfer Learning Jonathan Boigne Biman Liyanage Ted Östrem 18 20 0 11 Nov 2020
Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition Zhong Meng S. Parthasarathy Eric Sun Yashesh Gaur Naoyuki Kanda Liang Lu Xie Chen Rui Zhao Jinyu Li Jiawei Liu AuLLM 19 107 0 03 Nov 2020
FaceLeaks: Inference Attacks against Transfer Learning Models via Black-box Queries Seng Pei Liew Tsubasa Takahashi MIACV FedML 12 9 0 27 Oct 2020
HarperValleyBank: A Domain-Specific Spoken Dialog Corpus Mike Wu J. Nafziger A. Scodary Andrew L. Maas 31 17 0 26 Oct 2020
Stop Bugging Me! Evading Modern-Day Wiretapping Using Adversarial Perturbations Yael Mathov Tal Senior A. Shabtai Yuval Elovici 36 5 0 24 Oct 2020
On Minimum Word Error Rate Training of the Hybrid Autoregressive Transducer Liang Lu Zhong Meng Naoyuki Kanda Jinyu Li Jiawei Liu 24 12 0 23 Oct 2020
Class-Conditional Defense GAN Against End-to-End Speech Attacks Mohammad Esmaeilpour P. Cardinal Alessandro Lameiras Koerich AAML 21 14 0 22 Oct 2020
Dynamic Layer Customization for Noise Robust Speech Emotion Recognition in Heterogeneous Condition Training Alex Wilf E. Provost 26 5 0 21 Oct 2020
Investigating Cross-Domain Losses for Speech Enhancement Sherif Abdulatif Karim Armanious Jayasankar T. Sajeev Karim Guirguis B. Yang 19 7 0 20 Oct 2020
Lightweight End-to-End Speech Recognition from Raw Audio Data Using Sinc-Convolutions Ludwig Kurzinger Nicolas Lindae Palle Klewitz Gerhard Rigoll 27 5 0 15 Oct 2020
Towards Resistant Audio Adversarial Examples Tom Dörr Karla Markert Nicolas M. Muller Konstantin Böttinger AAML 25 7 0 14 Oct 2020
Jointly Optimizing Sensing Pipelines for Multimodal Mixed Reality Interaction Darshana Rathnayake Ashen de Silva Dasun Puwakdandawa L. Meegahapola Archan Misra I. Perera 19 3 0 13 Oct 2020
Conditioning Trick for Training Stable GANs Mohammad Esmaeilpour Raymel Alfonso Sallo Olivier St-Georges P. Cardinal Alessandro Lameiras Koerich 22 0 0 12 Oct 2020
Pkwrap: a PyTorch Package for LF-MMI Training of Acoustic Models S. Madikeri Sibo Tong Juan Pablo Zuluaga Apoorv Vyas P. Motlícek H. Bourlard VLM 8 19 0 07 Oct 2020
Digital Voicing of Silent Speech David Gaddy Dana Klein 14 50 0 06 Oct 2020
A Unifying Review of Deep and Shallow Anomaly Detection Lukas Ruff Jacob R. Kauffmann Robert A. Vandermeulen G. Montavon Wojciech Samek Marius Kloft Thomas G. Dietterich Klaus-Robert Muller UQCV 20 780 0 24 Sep 2020
A Crowdsourced Open-Source Kazakh Speech Corpus and Initial Speech Recognition Baseline Yerbolat Khassanov Saida Mussakhojayeva A. Mirzakhmetov A. Adiyev Mukhamet Nurpeiissov H. A. Varol 22 30 0 22 Sep 2020
PodSumm -- Podcast Audio Summarization Aneesh Vartakavi Amanmeet Garg 6 10 0 22 Sep 2020
End-to-End Bengali Speech Recognition S. Mandal Sarthak Yadav A. Rai 6 5 0 21 Sep 2020
Grounded Adaptation for Zero-shot Executable Semantic Parsing Victor Zhong M. Lewis Sida I. Wang Luke Zettlemoyer 41 98 0 16 Sep 2020
How Much Can We Really Trust You? Towards Simple, Interpretable Trust Quantification Metrics for Deep Neural Networks A. Wong Xiao Yu Wang Andrew Hryniowski 11 23 0 12 Sep 2020
RECOApy: Data recording, pre-processing and phonetic transcription for end-to-end speech-based applications Adriana Stan 23 5 0 11 Sep 2020
Explanation of Unintended Radiated Emission Classification via LIME Tom Grimes E. Church W. Pitts Lynn Wood 11 5 0 04 Sep 2020
CLAN: Continuous Learning using Asynchronous Neuroevolution on Commodity Edge Devices Parth Mannan A. Samajdar T. Krishna 31 2 0 27 Aug 2020
Geometry-guided Dense Perspective Network for Speech-Driven Facial Animation Jing-ying Liu Binyuan Hui Kun Li Yunke Liu Yu-Kun Lai Yuxiang Zhang Yebin Liu Jingyu Yang 3DH CVBM 27 22 0 23 Aug 2020
MASRI-HEADSET: A Maltese Corpus for Speech Recognition C. Mena Albert Gatt A. DeMarco Claudia Borg Lonneke van der Plas Amanda Muscat Ian Padovani 6 12 0 13 Aug 2020
Attention-based Fully Gated CNN-BGRU for Russian Handwritten Text Abdelrahman Abdallah Mohamed Hamada D. Nurseitov 27 42 0 12 Aug 2020
Transformer with Bidirectional Decoder for Speech Recognition Xi Chen Songyang Zhang Dandan Song P. Ouyang Shouyi Yin 18 13 0 11 Aug 2020
TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices A. Wong M. Famouri Maya Pavlova Siddharth Surana 69 33 0 10 Aug 2020
Improving the Accuracy of Global Forecasting Models using Time Series Data Augmentation Kasun Bandara Hansika Hewamalage Yuan-Hao Liu Yanfei Kang Christoph Bergmeir AI4TS 21 114 0 06 Aug 2020
FRMDN: Flow-based Recurrent Mixture Density Network S. Razavi Reshad Hosseini Tina Behzad BDL 16 0 0 05 Aug 2020
Word meaning in minds and machines Brenden M. Lake G. Murphy NAI 15 117 0 04 Aug 2020
Privacy-preserving Voice Analysis via Disentangled Representations Ranya Aloufi Hamed Haddadi David E. Boyle DRL 19 58 0 29 Jul 2020
Autosegmental Neural Nets: Should Phones and Tones be Synchronous or Asynchronous? Jialu Li M. Hasegawa-Johnson 20 5 0 28 Jul 2020
Self-Expressing Autoencoders for Unsupervised Spoken Term Discovery Saurabhchand Bhati Jesús Villalba Piotr Żelasko Najim Dehak SSL 23 16 0 26 Jul 2020
MP3 Compression To Diminish Adversarial Noise in End-to-End Speech Recognition I. Andronic Ludwig Kurzinger Edgar Ricardo Chavez Rosas Gerhard Rigoll B. Seeber 14 15 0 25 Jul 2020
Audio Adversarial Examples for Robust Hybrid CTC/Attention Speech Recognition Ludwig Kurzinger Edgar Ricardo Chavez Rosas Lujun Li Tobias Watzel Gerhard Rigoll AAML 19 4 0 21 Jul 2020
Learning to Generate Customized Dynamic 3D Facial Expressions Rolandos Alexandros Potamias Jiali Zheng Stylianos Ploumpis Giorgos Bouritsas Evangelos Ververas S. Zafeiriou 3DH 31 22 0 19 Jul 2020
Robust Image Classification Using A Low-Pass Activation Function and DCT Augmentation Md Tahmid Hossain S. Teng Ferdous Sohel Guojun Lu 16 10 0 18 Jul 2020
EZLDA: Efficient and Scalable LDA on GPUs Shilong Wang Hang Liu Anil Gaihre Hengyong Yu 6 1 0 17 Jul 2020
Data augmentation enhanced speaker enrollment for text-dependent speaker verification A. K. Sarkar H. Sarma Priyanka Dwivedi Zheng-Hua Tan 6 3 0 12 Jul 2020