1412.5567
Deep Speech: Scaling up end-to-end speech recognition
17 December 2014
Awni Y. Hannun
Carl Case
Jared Casper
Bryan Catanzaro
G. Diamos
Erich Elsen
R. Prenger
S. Satheesh
Shubho Sengupta
Adam Coates
A. Ng
Papers citing
"Deep Speech: Scaling up end-to-end speech recognition"
50 / 750 papers shown
Title
OT-Talk: Animating 3D Talking Head with Optimal Transportation
Xinmu Wang
Xiang Gao
Xiyun Song
Heather Yu
Zongfang Lin
Liang Peng
Xianfeng Gu
22
0
0
03 May 2025
Audio-Driven Talking Face Video Generation with Joint Uncertainty Learning
Yifan Xie
Fei Ma
Yi Bin
Ying He
Fei Richard Yu
57
0
0
26 Apr 2025
Poem Meter Classification of Recited Arabic Poetry: Integrating High-Resource Systems for a Low-Resource Task
Maged S. Al-Shaibani
Zaid Alyafeai
Irfan Ahmad
38
0
0
16 Apr 2025
Making Acoustic Side-Channel Attacks on Noisy Keyboards Viable with LLM-Assisted Spectrograms' "Typo" Correction
Seyyed Ali Ayati
Jin Hyun Park
Yichen Cai
Marcus Botacin
31
0
0
15 Apr 2025
Whisper-LM: Improving ASR Models with Language Models for Low-Resource Languages
Xabier de Zuazo
Eva Navas
Ibon Saratxaga
Inma Hernáez Rioja
37
0
0
30 Mar 2025
SyncDiff: Diffusion-based Talking Head Synthesis with Bottlenecked Temporal Visual Prior for Improved Synchronization
Xulin Fan
Heting Gao
Ziyi Chen
Peng Chang
Mei Han
Mark Hasegawa-Johnson
DiffM
57
0
0
17 Mar 2025
Cafe-Talk: Generating 3D Talking Face Animation with Multimodal Coarse- and Fine-grained Control
Hejia Chen
Haoxian Zhang
Shoulong Zhang
Xiaoqiang Liu
Sisi Zhuang
Yuan Zhang
Pengfei Wan
Di Zhang
Shuai Li
54
1
0
14 Mar 2025
Towards High-fidelity 3D Talking Avatar with Personalized Dynamic Texture
X. Li
Jianyu Wang
Yuhao Cheng
Yikun Zeng
X. Ren
W. Zhu
Weiming Zhao
Yichao Yan
31
0
0
01 Mar 2025
InsTaG: Learning Personalized 3D Talking Head from Few-Second Video
Jiahe Li
Jiawei Zhang
Xiao Bai
Jin Zheng
J. Zhou
L. Gu
57
0
0
27 Feb 2025
Logit Disagreement: OoD Detection with Bayesian Neural Networks
Kevin Raina
UQCV
BDL
UD
PER
66
0
0
24 Feb 2025
A Differentiable Alignment Framework for Sequence-to-Sequence Modeling via Optimal Transport
Yacouba Kaloga
Shashi Kumar
P. Motlíček
Ina Kodrasi
OT
74
0
0
03 Feb 2025
Privacy-Preserving Edge Speech Understanding with Tiny Foundation Models
A. Benazir
Felix Xiaozhu Lin
41
0
0
29 Jan 2025
From Audio Deepfake Detection to AI-Generated Music Detection -- A Pathway and Overview
Yupei Li
M. Milling
Lucia Specia
Björn Schuller
89
6
0
30 Nov 2024
BanglaDialecto: An End-to-End AI-Powered Regional Speech Standardization
Md. Nazmus Sadat Samin
Jawad Ibn Ahad
Tanjila Ahmed Medha
Fuad Rahman
M. R. Amin
Nabeel Mohammed
Shafin Rahman
34
0
0
16 Nov 2024
Exploring the Stability Gap in Continual Learning: The Role of the Classification Head
Wojciech Łapacz
Daniel Marczak
Filip Szatkowski
Tomasz Trzciński
34
1
0
06 Nov 2024
RELATE: A Modern Processing Platform for Romanian Language
V. Pais
Radu Ion
Andrei-Marius Avram
Maria Mitrofan
D. Tufis
VLM
19
0
0
29 Oct 2024
Interventional Speech Noise Injection for ASR Generalizable Spoken Language Understanding
Yeonjoon Jung
Jaeseong Lee
Seungtaek Choi
Dohyeon Lee
Minsoo Kim
S. Hwang
18
0
0
21 Oct 2024
UniGlyph: A Seven-Segment Script for Universal Language Representation
G. V. Bency Sherin
A. Abijesh Euphrine
A. Lenora Moreen
L. Arun Jose
31
0
0
11 Oct 2024
The First VoicePrivacy Attacker Challenge Evaluation Plan
N. Tomashenko
Xiaoxiao Miao
Emmanuel Vincent
Junichi Yamagishi
120
2
0
09 Oct 2024
A two-stage transliteration approach to improve performance of a multilingual ASR
Rohit Kumar
13
0
0
09 Oct 2024
WeHelp: A Shared Autonomy System for Wheelchair Users
Abulikemu Abuduweili
Alice Wu
Tianhao Wei
Weiye Zhao
35
0
0
18 Sep 2024
3DFacePolicy: Speech-Driven 3D Facial Animation with Diffusion Policy
Xuanmeng Sha
Liyun Zhang
Tomohiro Mashita
Yuki Uranishi
VGen
25
0
0
17 Sep 2024
Reassessing Noise Augmentation Methods in the Context of Adversarial Speech
Karla Pizzi
Matías Pizarro
Asja Fischer
28
0
0
03 Sep 2024
Contrastive Augmentation: An Unsupervised Learning Approach for Keyword Spotting in Speech Technology
Weinan Dai
Yifeng Jiang
Yuanjing Liu
Jinkun Chen
Xin Sun
Jinglei Tao
SSL
24
0
0
31 Aug 2024
Subgroup Analysis via Model-based Rule Forest
I-Ling Cheng
Chan Hsu
Chantung Ku
Pei-Ju Lee
Yihuang Kang
11
0
0
27 Aug 2024
The State of Commercial Automatic French Legal Speech Recognition Systems and their Impact on Court Reporters et al
Nicolas Garneau
Olivier Bolduc
ELM
AILaw
45
0
0
21 Aug 2024
Content and Style Aware Audio-Driven Facial Animation
Qingju Liu
Hyeongwoo Kim
Gaurav Bharaj
DiffM
30
1
0
13 Aug 2024
Audio Enhancement for Computer Audition -- An Iterative Training Paradigm Using Sample Importance
M. Milling
Shuo Liu
Andreas Triantafyllopoulos
Ilhan Aslan
Björn W. Schuller
29
2
0
12 Aug 2024
Style-Preserving Lip Sync via Audio-Aware Style Reference
Weizhi Zhong
Jichang Li
Yinqi Cai
Liang Lin
Guanbin Li
29
2
0
10 Aug 2024
EmoFace: Audio-driven Emotional 3D Face Animation
Chang Liu
Qunfen Lin
Zijiao Zeng
Ye Pan
CVBM
38
4
0
17 Jul 2024
Leveraging LLM-Respondents for Item Evaluation: a Psychometric Analysis
Yunting Liu
Shreya Bhandari
Z. Pardos
25
8
0
15 Jul 2024
CUSIDE-T: Chunking, Simulating Future and Decoding for Transducer based Streaming ASR
Wenbo Zhao
Ziwei Li
Chuan Yu
Zhijian Ou
AI4TS
21
0
0
14 Jul 2024
Exploring State Space and Reasoning by Elimination in Tsetlin Machines
A. K. Kadhim
Ole-Christoffer Granmo
Lei Jiao
R. Shafik
36
2
0
12 Jul 2024
Zero-Query Adversarial Attack on Black-box Automatic Speech Recognition Systems
Zheng Fang
Tao Wang
Lingchen Zhao
Shenyi Zhang
Bowen Li
Yunjie Ge
Q. Li
Chao Shen
Qian Wang
16
4
0
27 Jun 2024
NLDF: Neural Light Dynamic Fields for Efficient 3D Talking Head Generation
Niu Guanchen
3DH
39
0
0
17 Jun 2024
Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation
Eungbeom Kim
Hantae Kim
Kyogu Lee
32
1
0
12 Jun 2024
Embedded Distributed Inference of Deep Neural Networks: A Systematic Review
Federico Nicolás Peccia
Oliver Bringmann
28
0
0
06 May 2024
Deep Learning Models in Speech Recognition: Measuring GPU Energy Consumption, Impact of Noise and Model Quantization for Edge Deployment
Aditya Chakravarty
23
0
0
02 May 2024
TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting
Jiahe Li
Jiawei Zhang
Xiao Bai
Jin Zheng
Xin Ning
Jun Zhou
Lin Gu
3DGS
46
15
0
23 Apr 2024
Towards Fast Setup and High Throughput of GPU Serverless Computing
Han Zhao
Weihao Cui
Quan Chen
Shulai Zhang
Zijun Li
Jingwen Leng
Chao Li
Deze Zeng
Minyi Guo
23
2
0
23 Apr 2024
Effective internal language model training and fusion for factorized transducer model
Jinxi Guo
Niko Moritz
Yingyi Ma
Frank Seide
Chunyang Wu
Jay Mahadeokar
Ozlem Kalinli
Christian Fuegen
Michael Seltzer
30
1
0
02 Apr 2024
PID Control-Based Self-Healing to Improve the Robustness of Large Language Models
Zhuotong Chen
Zihu Wang
Yifan Yang
Qianxiao Li
Zheng Zhang
AAML
34
1
0
31 Mar 2024
FastPerson: Enhancing Video Learning through Effective Video Summarization that Preserves Linguistic and Visual Contexts
Kazuki Kawamura
Jun Rekimoto
27
3
0
26 Mar 2024
Not Just Change the Labels, Learn the Features: Watermarking Deep Neural Networks with Multi-View Data
Yuxuan Li
S. K. Maharana
Yunhui Guo
AAML
38
0
0
15 Mar 2024
SpeechColab Leaderboard: An Open-Source Platform for Automatic Speech Recognition Evaluation
Jiayu Du
Jinpeng Li
Guoguo Chen
Wei-Qiang Zhang
ELM
35
3
0
13 Mar 2024
A Cross-Modal Approach to Silent Speech with LLM-Enhanced Recognition
Tyler Benster
G. Wilson
Reshef Elisha
Francis R. Willett
S. Druckmann
30
6
0
02 Mar 2024
Speaker-Independent Dysarthria Severity Classification using Self-Supervised Transformers and Multi-Task Learning
Lauren Stumpf
B. Kadirvelu
Sigourney Waibel
A. A. Faisal
21
2
0
29 Feb 2024
Representing Online Handwriting for Recognition in Large Vision-Language Models
Anastasiia Fadeeva
Philippe Schlattner
Andrii Maksai
Mark Collier
Efi Kokiopoulou
Jesse Berent
C. Musat
46
4
0
23 Feb 2024
The Balancing Act: Unmasking and Alleviating ASR Biases in Portuguese
Ajinkya Kulkarni
Anna Tokareva
Rameez Qureshi
Miguel Couceiro
20
4
0
12 Feb 2024
Arabic Synonym BERT-based Adversarial Examples for Text Classification
Norah M. Alshahrani
Saied Alshahrani
Esma Wali
Jeanna Neefe Matthews
AAML
14
5
0
05 Feb 2024