Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1412.5567
Cited By
Deep Speech: Scaling up end-to-end speech recognition
17 December 2014
Awni Y. Hannun
Carl Case
Jared Casper
Bryan Catanzaro
G. Diamos
Erich Elsen
R. Prenger
S. Satheesh
Shubho Sengupta
Adam Coates
A. Ng
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Deep Speech: Scaling up end-to-end speech recognition"
50 / 750 papers shown
Title
Trustworthy Sensor Fusion against Inaudible Command Attacks in Advanced Driver-Assistance System
Jiwei Guan
Lei Pan
Chen Wang
Shui Yu
Longxiang Gao
Xi Zheng
AAML
19
3
0
30 May 2023
RAND: Robustness Aware Norm Decay For Quantized Seq2seq Models
David Qiu
David Rim
Shaojin Ding
Oleg Rybakov
Yanzhang He
MQ
35
4
0
24 May 2023
Evaluating OpenAI's Whisper ASR for Punctuation Prediction and Topic Modeling of life histories of the Museum of the Person
L. Gris
R. Marcacini
Arnaldo Cândido Júnior
Edresson Casanova
A. S. Soares
S. Aluísio
18
7
0
23 May 2023
QFA2SR: Query-Free Adversarial Transfer Attacks to Speaker Recognition Systems
Guangke Chen
Yedi Zhang
Zhe Zhao
Fu Song
AAML
41
11
0
23 May 2023
Study of GANs for Noisy Speech Simulation from Clean Speech
L. Maben
Zixun Guo
Chen Chen
Utkarsh Chudiwal
Chng Eng Siong
16
0
0
21 May 2023
Decision-based iterative fragile watermarking for model integrity verification
Z. Yin
Heng Yin
Hang Su
Xinpeng Zhang
Zhenzhe Gao
AAML
25
3
0
13 May 2023
Masked Audio Text Encoders are Effective Multi-Modal Rescorers
Jason (Jinglun) Cai
Monica Sunkara
Xilai Li
Anshu Bhatia
Xiao Pan
S. Bodapati
28
3
0
11 May 2023
Deep Learning and Geometric Deep Learning: an introduction for mathematicians and physicists
R. Fioresi
F. Zanchetta
PINN
25
4
0
09 May 2023
Mask The Bias: Improving Domain-Adaptive Generalization of CTC-based ASR with Internal Language Model Estimation
Nilaksh Das
Monica Sunkara
S. Bodapati
Jason (Jinglun) Cai
Devang Kulshreshtha
Jeffrey J. Farris
Katrin Kirchhoff
28
2
0
05 May 2023
GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation
Zhenhui Ye
Jinzheng He
Ziyue Jiang
Rongjie Huang
Jia-Bin Huang
Jinglin Liu
Yixiang Ren
Xiang Yin
Zejun Ma
Zhou Zhao
CVBM
57
29
0
01 May 2023
Affective social anthropomorphic intelligent system
Md. Adyelullahil Mamun
Hasnat Md. Abdullah
Md. Golam Rabiul Alam
Muhammad Mehedi Hassan
Md. Zia Uddin
17
1
0
19 Apr 2023
ASPEST: Bridging the Gap Between Active Learning and Selective Prediction
Jiefeng Chen
Jinsung Yoon
Sayna Ebrahimi
Sercan Ö. Arik
S. Jha
Tomas Pfister
36
1
0
07 Apr 2023
Robustmix: Improving Robustness by Regularizing the Frequency Bias of Deep Nets
Jonas Ngnawé
Marianne Abémgnigni Njifon
Jonathan Heek
Yann N. Dauphin
OOD
18
4
0
06 Apr 2023
Style Transfer for 2D Talking Head Animation
Trong-Thang Pham
Nhat Le
Tuong Khanh Long Do
Hung Nguyen
Erman Tjiputra
Quang-Dieu Tran
A. Nguyen
22
3
0
17 Mar 2023
Improving Few-Shot Learning for Talking Face System with TTS Data Augmentation
Qi Chen
Ziyang Ma
Tao Liu
Xuejiao Tan
Qu Lu
Xie Chen
K. Yu
CVBM
35
5
0
09 Mar 2023
DINet: Deformation Inpainting Network for Realistic Face Visually Dubbing on High Resolution Video
Zhimeng Zhang
Zhipeng Hu
W. Deng
Changjie Fan
Tangjie Lv
Yu-qiong Ding
3DH
CVBM
38
59
0
07 Mar 2023
End-to-End Speech Recognition: A Survey
Rohit Prabhavalkar
Takaaki Hori
Tara N. Sainath
Ralf Schluter
Shinji Watanabe
VLM
26
149
0
03 Mar 2023
Variational EP with Probabilistic Backpropagation for Bayesian Neural Networks
Kehinde Olobatuyi
BDL
11
0
0
02 Mar 2023
A Comparison of Speech Data Augmentation Methods Using S3PRL Toolkit
Mina Huh
Ruchira Ray
Corey Karnei
24
3
0
27 Feb 2023
Explanations for Automatic Speech Recognition
Xiao-lan Wu
P. Bell
A. Rajan
11
6
0
27 Feb 2023
Improving Medical Speech-to-Text Accuracy with Vision-Language Pre-training Model
Jaeyoung Huh
Sangjoon Park
Jeonghyeon Lee
Jong Chul Ye
LM&MA
17
9
0
27 Feb 2023
Pose-Controllable 3D Facial Animation Synthesis using Hierarchical Audio-Vertex Attention
Bin Liu
Xiaolin K. Wei
Bo Li
Junjie Cao
Yunyu Lai
CVBM
27
1
0
24 Feb 2023
Evaluating Automatic Speech Recognition in an Incremental Setting
Ryan Whetten
M. Imtiaz
C. Kennington
9
1
0
23 Feb 2023
Using Semantic Information for Defining and Detecting OOD Inputs
Ramneet Kaur
Xiayan Ji
Souradeep Dutta
Michele Caprio
Yahan Yang
E. Bernardis
O. Sokolsky
Insup Lee
OODD
37
7
0
21 Feb 2023
JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition
Zhong Meng
Weiran Wang
Rohit Prabhavalkar
Tara N. Sainath
Tongzhou Chen
Ehsan Variani
Yu Zhang
Bo-wen Li
Andrew Rosenberg
Bhuvana Ramabhadran
AuLLM
VLM
36
11
0
16 Feb 2023
GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis
Zhenhui Ye
Ziyue Jiang
Yi Ren
Jinglin Liu
Jinzheng He
Zhou Zhao
CVBM
25
122
0
31 Jan 2023
Open Problems in Applied Deep Learning
M. Raissi
AI4CE
42
2
0
26 Jan 2023
A Data-Efficient Visual-Audio Representation with Intuitive Fine-tuning for Voice-Controlled Robots
Peixin Chang
Shuijing Liu
Tianchen Ji
Neeloy Chakraborty
Kaiwen Hong
Katherine Driggs-Campbell
51
3
0
23 Jan 2023
Neural Architecture Search: Insights from 1000 Papers
Colin White
Mahmoud Safari
R. Sukthanker
Binxin Ru
T. Elsken
Arber Zela
Debadeepta Dey
Frank Hutter
3DV
AI4CE
34
130
0
20 Jan 2023
DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation
Shuai Shen
Wenliang Zhao
Zibin Meng
Wanhua Li
Zhengbiao Zhu
Jie Zhou
Jiwen Lu
DiffM
VGen
38
99
0
10 Jan 2023
Audio-Visual Efficient Conformer for Robust Speech Recognition
Maxime Burchi
Radu Timofte
VLM
11
33
0
04 Jan 2023
Imitator: Personalized Speech-driven 3D Facial Animation
Balamurugan Thambiraja
I. Habibie
S. Aliakbarian
Darren Cosker
Christian Theobalt
Justus Thies
CVBM
47
49
0
30 Dec 2022
End-to-End Automatic Speech Recognition model for the Sudanese Dialect
Ayman Mansour
Wafaa F. Mukhtar
19
1
0
21 Dec 2022
KL Regularized Normalization Framework for Low Resource Tasks
Neeraj Kumar
Ankur Narang
Brejesh Lall
26
1
0
21 Dec 2022
VSVC: Backdoor attack against Keyword Spotting based on Voiceprint Selection and Voice Conversion
Hanbo Cai
Pengcheng Zhang
Hai Dong
Yan Xiao
Shunhui Ji
13
5
0
20 Dec 2022
A Review of Speech-centric Trustworthy Machine Learning: Privacy, Safety, and Fairness
Tiantian Feng
Rajat Hebbar
Nicholas Mehlman
Xuan Shi
Aditya Kommineni
and Shrikanth Narayanan
43
31
0
18 Dec 2022
An Exploratory Study of AI System Risk Assessment from the Lens of Data Distribution and Uncertainty
Zhijie Wang
Yuheng Huang
L. Ma
Haruki Yokoyama
Susumu Tokumoto
Kazuki Munakata
24
4
0
13 Dec 2022
Estimator: An Effective and Scalable Framework for Transportation Mode Classification over Trajectories
Danlei Hu
Ziquan Fang
Hanxi Fang
Tianyi Li
Chun-ru Shen
Lu Chen
Yunjun Gao
22
5
0
11 Dec 2022
Memories are One-to-Many Mapping Alleviators in Talking Face Generation
Anni Tang
Tianyu He
Xuejiao Tan
Jun Ling
Liang Song
CVBM
26
23
0
09 Dec 2022
Thales: Formulating and Estimating Architectural Vulnerability Factors for DNN Accelerators
Abhishek Tyagi
Yiming Gan
Shaoshan Liu
Bo Yu
P. Whatmough
Yuhao Zhu
AAML
21
9
0
05 Dec 2022
PiPar: Pipeline Parallelism for Collaborative Machine Learning
Zihan Zhang
Philip Rodgers
Peter Kilpatrick
I. Spence
Blesson Varghese
FedML
43
3
0
01 Dec 2022
Evaluating and reducing the distance between synthetic and real speech distributions
Christoph Minixhofer
Ondˇrej Klejch
P. Bell
36
7
0
29 Nov 2022
Deep representation learning: Fundamentals, Perspectives, Applications, and Open Challenges
K. T. Baghaei
Amirreza Payandeh
Pooya Fayyazsanavi
Shahram Rahimi
Zhiqian Chen
Somayeh Bakhtiari Ramezani
FaML
AI4TS
35
6
0
27 Nov 2022
Dynamic Neural Portraits
M. Doukas
Stylianos Ploumpis
S. Zafeiriou
3DH
24
1
0
25 Nov 2022
HARL: Hierarchical Adaptive Reinforcement Learning Based Auto Scheduler for Neural Networks
Zining Zhang
Bingsheng He
Zhenjie Zhang
14
5
0
21 Nov 2022
Phonemic Adversarial Attack against Audio Recognition in Real World
Jiakai Wang
Zhendong Chen
Zixin Yin
Qinghong Yang
Xianglong Liu
AAML
34
3
0
19 Nov 2022
VeLO: Training Versatile Learned Optimizers by Scaling Up
Luke Metz
James Harrison
C. Freeman
Amil Merchant
Lucas Beyer
...
Naman Agrawal
Ben Poole
Igor Mordatch
Adam Roberts
Jascha Narain Sohl-Dickstein
35
60
0
17 Nov 2022
Hey ASR System! Why Aren't You More Inclusive? Automatic Speech Recognition Systems' Bias and Proposed Bias Mitigation Techniques. A Literature Review
Mikel K. Ngueajio
Gloria J. Washington
34
35
0
17 Nov 2022
Improving Children's Speech Recognition by Fine-tuning Self-supervised Adult Speech Representations
Renée Lu
M. Shahin
Beena Ahmed
32
4
0
14 Nov 2022
FullPack: Full Vector Utilization for Sub-Byte Quantized Inference on General Purpose CPUs
Hossein Katebi
Navidreza Asadi
M. Goudarzi
MQ
27
0
0
13 Nov 2022
Previous
1
2
3
4
5
6
...
13
14
15
Next