1712.01769
Cited By
State-of-the-art Speech Recognition With Sequence-to-Sequence Models
5 December 2017
Chung-Cheng Chiu
Tara N. Sainath
Yonghui Wu
Rohit Prabhavalkar
Patrick Nguyen
Zhehuai Chen
Anjuli Kannan
Ron J. Weiss
Kanishka Rao
Katya Gonina
Navdeep Jaitly
Bo Li
J. Chorowski
M. Bacchiani
AI4TS
Papers citing
"State-of-the-art Speech Recognition With Sequence-to-Sequence Models"
50 / 501 papers shown
ReLI: A Language-Agnostic Approach to Human-Robot Interaction
Linus Nwankwo
Bjoern Ellensohn
Ozan Özdenizci
Elmar Rueckert
LM&Ro
71
0
0
03 May 2025
Robust Latent Matters: Boosting Image Generation with Sampling Error Synthesis
Kai Qiu
Xianrui Li
Jason Kuen
Hongyu Chen
Xiaohao Xu
Jiuxiang Gu
Yinyi Luo
Bhiksha Raj
Zhe Lin
Marios Savvides
62
0
0
11 Mar 2025
Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers
Adam Stooke
Rohit Prabhavalkar
K. Sim
P. M. Mengibar
41
0
0
06 Feb 2025
Multimodal Human-Autonomous Agents Interaction Using Pre-Trained Language and Visual Foundation Models
Linus Nwankwo
Elmar Rueckert
55
2
0
31 Dec 2024
Unified Speech Recognition: A Single Model for Auditory, Visual, and Audiovisual Inputs
A. Haliassos
Rodrigo Mira
Honglie Chen
Zoe Landgraf
Stavros Petridis
Maja Pantic
SSL
39
6
0
04 Nov 2024
All models are wrong, some are useful: Model Selection with Limited Labels
Patrik Okanovic
Andreas Kirsch
Jannes Kasper
Torsten Hoefler
Andreas Krause
Nezihe Merve Gürel
VLM
28
0
0
17 Oct 2024
A two-stage transliteration approach to improve performance of a multilingual ASR
Rohit Kumar
18
0
0
09 Oct 2024
Speechworthy Instruction-tuned Language Models
Hyundong Justin Cho
Nicolaas Jedema
Leonardo F. R. Ribeiro
Karishma Sharma
Pedro Szekely
Alessandro Moschitti
Ruben Janssen
Jonathan May
ALM
47
1
0
23 Sep 2024
What does it take to get state of the art in simultaneous speech-to-speech translation?
Vincent Wilmet
Johnson Du
27
0
0
02 Sep 2024
Measuring the Accuracy of Automatic Speech Recognition Solutions
Korbinian Kuhn
Verena Kersken
Benedikt Reuter
Niklas Egger
Gottfried Zimmermann
35
20
0
29 Aug 2024
Toward Improving Synthetic Audio Spoofing Detection Robustness via Meta-Learning and Disentangled Training With Adversarial Examples
Zhenyu Wang
John H. L. Hansen
AAML
43
1
0
23 Aug 2024
BasisN: Reprogramming-Free RRAM-Based In-Memory-Computing by Basis Combination for Deep Neural Networks
Amro Eldebiky
Grace Li Zhang
Xunzhao Yin
Cheng Zhuo
Ing-Chao Lin
Ulf Schlichtmann
Bing Li
26
0
0
04 Jul 2024
Token-Weighted RNN-T for Learning from Flawed Data
Gil Keren
Wei Zhou
Ozlem Kalinli
48
0
0
26 Jun 2024
Text Injection for Neural Contextual Biasing
Zhong Meng
Zelin Wu
Rohit Prabhavalkar
Cal Peyser
Weiran Wang
Nanxin Chen
Tara N. Sainath
Bhuvana Ramabhadran
48
3
0
05 Jun 2024
Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models
Yuchen Hu
Chen Chen
Chao-Han Huck Yang
Chengwei Qin
Pin-Yu Chen
Chng Eng Siong
Chao Zhang
VLM
35
3
0
23 May 2024
You don't understand me!: Comparing ASR results for L1 and L2 speakers of Swedish
Ronald Cumbal
Birger Moell
José Lopes
Olov Engwall
16
20
0
22 May 2024
AIris: An AI-powered Wearable Assistive Device for the Visually Impaired
Dionysia Danai Brilli
Evangelos Georgaras
Stefania Tsilivaki
Nikos Melanitis
Konstantina S. Nikita
25
1
0
13 May 2024
Efficient Sample-Specific Encoder Perturbations
Yassir Fathullah
Mark Gales
31
0
0
01 May 2024
Advanced Long-Content Speech Recognition With Factorized Neural Transducer
Xun Gong
Yu Wu
Jinyu Li
Shujie Liu
Rui Zhao
Xie Chen
Yanmin Qian
37
6
0
20 Mar 2024
Automatic Speech Recognition (ASR) for the Diagnosis of pronunciation of Speech Sound Disorders in Korean children
Taekyung Ahn
Yeonjung Hong
Younggon Im
Do Hyung Kim
Dayoung Kang
...
Jae Won Kim
Min Jung Kim
Ah-ra Cho
Dae-Hyun Jang
Hosung Nam
32
1
0
13 Mar 2024
Typist Experiment: an Investigation of Human-to-Human Dictation via Role-play to Inform Voice-based Text Authoring
Can Liu
Si-Yuan Hu
Li Feng
Mingming Fan
38
3
0
09 Mar 2024
Automatic Speech Recognition using Advanced Deep Learning Approaches: A survey
Hamza Kheddar
Mustapha Hemis
Yassine Himeur
OffRL
48
59
0
02 Mar 2024
Representing Online Handwriting for Recognition in Large Vision-Language Models
Anastasiia Fadeeva
Philippe Schlattner
Andrii Maksai
Mark Collier
Efi Kokiopoulou
Jesse Berent
C. Musat
54
4
0
23 Feb 2024
Sheet Music Transformer: End-To-End Optical Music Recognition Beyond Monophonic Transcription
Antonio Ríos-Vila
Jorge Calvo-Zaragoza
Thierry Paquet
46
10
0
12 Feb 2024
Automated speech audiometry: Can it work using open-source pre-trained Kaldi-NL automatic speech recognition?
Gloria Araiza-Illan
Luke Meyer
K. Truong
D. Başkent
14
5
0
19 Dec 2023
PhasePerturbation: Speech Data Augmentation via Phase Perturbation for Automatic Speech Recognition
Chengxi Lei
Satwinder Singh
Feng Hou
Xiaoyun Jia
Ruili Wang
30
1
0
13 Dec 2023
USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models
Shaojin Ding
David Qiu
David Rim
Yanzhang He
Oleg Rybakov
...
Tara N. Sainath
Zhonglin Han
Jian Li
Amir Yazdanbakhsh
Shivani Agrawal
MQ
34
9
0
13 Dec 2023
D4AM: A General Denoising Framework for Downstream Acoustic Models
H. Wang
Yu Tsao
Hsin-Min Wang
Chu-Song Chen
21
4
0
28 Nov 2023
Neural Network Methods for Radiation Detectors and Imaging
S. Lin
S. Ning
H. Zhu
T. Zhou
C. L. Morris
S. Clayton
M. Cherukara
R. T. Chen
Z. Wang
AI4CE
37
5
0
09 Nov 2023
TACNET: Temporal Audio Source Counting Network
Amirreza Ahmadnejad
Ahmad Mahmmodian Darviishani
Mohmmad Mehrdad Asadi
Sajjad Saffariyeh
Pedram Yousef
Emad Fatemizadeh
42
2
0
04 Nov 2023
Boosting Decision-Based Black-Box Adversarial Attack with Gradient Priors
Han Liu
Xingshuo Huang
Xiaotong Zhang
Qimai Li
Fenglong Ma
Wen Wang
Hongyang Chen
Hong Yu
Xianchao Zhang
AAML
50
1
0
29 Oct 2023
Quantifying the Dialect Gap and its Correlates Across Languages
Anjali Kantharuban
Ivan Vulić
Anna Korhonen
57
21
0
23 Oct 2023
Unveiling Energy Efficiency in Deep Learning: Measurement, Prediction, and Scoring across Edge Devices
Xiaolong Tu
Anik Mallik
Dawei Chen
Kyungtae Han
Onur Altintas
Haoxin Wang
Jiang Xie
27
12
0
19 Oct 2023
Insightful analysis of historical sources at scales beyond human capabilities using unsupervised Machine Learning and XAI
Oliver Eberle
Jochen Büttner
Hassan el-Hajj
G. Montavon
Klaus-Robert Muller
Matteo Valleriani
25
1
0
13 Oct 2023
Generative Speech Recognition Error Correction with Large Language Models and Task-Activating Prompting
Chao-Han Huck Yang
Yile Gu
Yi-Chieh Liu
Shalini Ghosh
I. Bulyko
A. Stolcke
KELM
LRM
43
40
0
27 Sep 2023
Memory-augmented conformer for improved end-to-end long-form ASR
Carlos Carvalho
A. Abad
RALM
34
1
0
22 Sep 2023
Hybrid Attention-based Encoder-decoder Model for Efficient Language Model Adaptation
Shaoshi Ling
Guoli Ye
Rui Zhao
Yifan Gong
VLM
26
1
0
14 Sep 2023
Typing on Any Surface: A Deep Learning-based Method for Real-Time Keystroke Detection in Augmented Reality
Xingyu Fu
Mingze Xi
14
0
0
31 Aug 2023
Bilingual Streaming ASR with Grapheme units and Auxiliary Monolingual Loss
M. Soleymanpour
Mahmoud Al Ismail
F. Bahmaninezhad
Kshitiz Kumar
Jian Wu
24
0
0
11 Aug 2023
On-Device Speaker Anonymization of Acoustic Embeddings for ASR based on Flexible Location Gradient Reversal Layer
Md. Asif Jalal
Pablo Peso Parada
Jisi Zhang
Karthikeyan P. Saravanan
Mete Ozay
Myoungji Han
Jung In Lee
Seokyeong Jung
28
1
0
25 Jul 2023
Analyzing sports commentary in order to automatically recognize events and extract insights
Yanis Miraoui
15
0
0
18 Jul 2023
Toward Interactive Dictation
Belinda Z. Li
J. Eisner
Adam Pauls
Sam Thomson
KELM
21
2
0
08 Jul 2023
Multi-pass Training and Cross-information Fusion for Low-resource End-to-end Accented Speech Recognition
Xuefei Wang
Yanhua Long
Yijie Li
Haoran Wei
37
4
0
20 Jun 2023
SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition
Desh Raj
Daniel Povey
Sanjeev Khudanpur
VLM
36
9
0
18 Jun 2023
Research on an improved Conformer end-to-end Speech Recognition Model with R-Drop Structure
Weidong Ji
Shijie Zan
Guohui Zhou
Xu Wang
SyDa
27
1
0
14 Jun 2023
Text-only Domain Adaptation using Unified Speech-Text Representation in Transducer
Lu Huang
Yangqiu Song
Jun Zhang
Lu Lu
Zejun Ma
38
2
0
07 Jun 2023
Edit Distance based RL for RNNT decoding
DongSeon Hwang
Changwan Ryu
K. Sim
24
0
0
31 May 2023
Graph Neural Networks for Contextual ASR with the Tree-Constrained Pointer Generator
Guangzhi Sun
Chuxu Zhang
P. Woodland
20
4
0
30 May 2023
Repeated Random Sampling for Minimizing the Time-to-Accuracy of Learning
Patrik Okanovic
R. Waleffe
Vasilis Mageirakos
Konstantinos E. Nikolakakis
Amin Karbasi
Dionysis Kalogerias
Nezihe Merve Gürel
Theodoros Rekatsinas
DD
58
12
0
28 May 2023
VioLA: Unified Codec Language Models for Speech Recognition, Synthesis, and Translation
Tianrui Wang
Long Zhou
Zi-Hua Zhang
Yu-Huan Wu
Shujie Liu
Yashesh Gaur
Zhuo Chen
Jinyu Li
Furu Wei
40
101
0
25 May 2023