Jasper: An End-to-End Convolutional Neural Acoustic Model


5 April 2019
Jason Chun Lok Li
Vitaly Lavrukhin
Boris Ginsburg
Ryan Leary
Oleksii Kuchaiev
Jonathan M. Cohen
Huyen Nguyen
R. Gadde
    DRL
    VLM
    AuLLM

Papers citing "Jasper: An End-to-End Convolutional Neural Acoustic Model"

50 / 57 papers shown
AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges
Ranjan Sapkota
Konstantinos I Roumeliotis
Manoj Karkee
AI4TS
26
0
0
15 May 2025
Learning state and proposal dynamics in state-space models using differentiable particle filters and neural networks
Benjamin Cox
Santiago Segarra
Victor Elvira
81
0
0
23 Nov 2024
Training Large ASR Encoders with Differential Privacy
Geeticka Chauhan
Steve Chien
Om Thakkar
Abhradeep Thakurta
Arun Narayanan
33
1
0
21 Sep 2024
ApproBiVT: Lead ASR Models to Generalize Better Using Approximated Bias-Variance Tradeoff Guided Early Stopping and Checkpoint Averaging
Fangyuan Wang
Ming Hao
Yuhai Shi
Bo Xu
MoMe
21
0
0
05 Aug 2023
The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios
Samuele Cornell
Matthew Wiesner
Shinji Watanabe
Desh Raj
Xuankai Chang
...
Matthew Maciejewski
Yoshiki Masuyama
Zhong-Qiu Wang
S. Squartini
Sanjeev Khudanpur
32
51
0
23 Jun 2023
Hystoc: Obtaining word confidences for fusion of end-to-end ASR systems
Karel Beneš
M. Kocour
L. Burget
37
2
0
21 May 2023
A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks
Yifan Peng
Kwangyoun Kim
Felix Wu
Brian Yan
Siddhant Arora
William Chen
Jiyang Tang
Suwon Shon
Prashant Sridhar
Shinji Watanabe
29
17
0
18 May 2023
Pyramid Multi-branch Fusion DCNN with Multi-Head Self-Attention for Mandarin Speech Recognition
Kai Liu
Hailiang Xiong
Gangqiang Yang
Zhengfeng Du
Yewen Cao
D. Shah
18
0
0
23 Mar 2023
Open Problems in Applied Deep Learning
M. Raissi
AI4CE
42
2
0
26 Jan 2023
BayesSpeech: A Bayesian Transformer Network for Automatic Speech Recognition
Will Rieger
BDL
UQCV
19
0
0
16 Jan 2023
Inter-KD: Intermediate Knowledge Distillation for CTC-Based Automatic Speech Recognition
J. Yoon
Beom Jun Woo
Sunghwan Ahn
Hyeon Seung Lee
N. Kim
VLM
25
9
0
28 Nov 2022
E-Branchformer: Branchformer with Enhanced merging for speech recognition
Kwangyoun Kim
Felix Wu
Yifan Peng
Jing Pan
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
61
105
0
30 Sep 2022
ConvRNN-T: Convolutional Augmented Recurrent Neural Network Transducers for Streaming Speech Recognition
Martin H. Radfar
Rohit Barnwal
R. Swaminathan
Feng-Ju Chang
Grant P. Strimel
Nathan Susanj
Athanasios Mouchtaris
34
13
0
29 Sep 2022
Attention Enhanced Citrinet for Speech Recognition
Xianchao Wu
13
1
0
01 Sep 2022
Uconv-Conformer: High Reduction of Input Sequence Length for End-to-End Speech Recognition
A. Andrusenko
R. Nasretdinov
A. Romanenko
20
18
0
16 Aug 2022
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
Yifan Peng
Siddharth Dalmia
Ian Lane
Shinji Watanabe
30
143
0
06 Jul 2022
Generating gender-ambiguous voices for privacy-preserving speech recognition
Dimitrios Stoidis
Andrea Cavallaro
36
14
0
03 Jul 2022
FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning
Yeonghyeon Lee
Kangwook Jang
Jahyun Goo
Youngmoon Jung
Hoi-Rim Kim
28
29
0
01 Jul 2022
Squeezeformer: An Efficient Transformer for Automatic Speech Recognition
Sehoon Kim
A. Gholami
Albert Eaton Shaw
Nicholas Lee
K. Mangalam
Jitendra Malik
Michael W. Mahoney
Kurt Keutzer
32
99
0
02 Jun 2022
Easter2.0: Improving convolutional models for handwritten text recognition
Kartik Chaudhary
Raghav Bali
36
9
0
30 May 2022
aaeCAPTCHA: The Design and Implementation of Audio Adversarial CAPTCHA
Md. Imran Hossen
X. Hei
31
4
0
05 Mar 2022
Load-balanced Gather-scatter Patterns for Sparse Deep Neural Networks
Fei Sun
Minghai Qin
Tianyun Zhang
Xiaolong Ma
Haoran Li
Junwen Luo
Zihao Zhao
Yen-kuang Chen
Yuan Xie
22
1
0
20 Dec 2021
LipSound2: Self-Supervised Pre-Training for Lip-to-Speech Reconstruction and Lip Reading
Leyuan Qu
C. Weber
S. Wermter
38
23
0
09 Dec 2021
Oracle Teacher: Leveraging Target Information for Better Knowledge Distillation of CTC Models
J. Yoon
H. Kim
Hyeon Seung Lee
Sunghwan Ahn
N. Kim
38
1
0
05 Nov 2021
Speech recognition for air traffic control via feature learning and end-to-end training
Peng Fan
Dongyue Guo
Yi Lin
Bo Yang
Jianwei Zhang
15
7
0
04 Nov 2021
Discovery of Single Independent Latent Variable
Uri Shaham
Jonathan Svirsky
Ori Katz
Ronen Talmon
CML
28
2
0
12 Oct 2021
Real World Robustness from Systematic Noise
Yan Wang
Yuhang Li
Ruihao Gong
36
7
0
02 Sep 2021
One TTS Alignment To Rule Them All
Rohan Badlani
A. Lancucki
Kevin J. Shih
Rafael Valle
Ming-Yu Liu
Bryan Catanzaro
38
82
0
23 Aug 2021
Automatic Speech Recognition And Limited Vocabulary: A Survey
J. L. E. K. Fendji
D. Tala
B. Yenke
M. Atemkeng
23
3
0
23 Aug 2021
VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion
Disong Wang
Liqun Deng
Y. Yeung
Xiao Chen
Xunying Liu
Helen Meng
DRL
22
136
0
18 Jun 2021
Accent Recognition with Hybrid Phonetic Features
Zhan Zhang
Xi Chen
Yuehai Wang
Jianyi Yang
21
18
0
05 May 2021
End-to-End Video-To-Speech Synthesis using Generative Adversarial Networks
Rodrigo Mira
Konstantinos Vougioukas
Pingchuan Ma
Stavros Petridis
Björn W. Schuller
M. Pantic
32
43
0
27 Apr 2021
Efficient and Generic 1D Dilated Convolution Layer for Deep Learning
Narendra Chaudhary
Sanchit Misra
Dhiraj D. Kalamkar
A. Heinecke
E. Georganas
Barukh Ziv
Menachem Adelman
Bharat Kaul
32
9
0
16 Apr 2021
Error-driven Fixed-Budget ASR Personalization for Accented Speakers
Abhijeet Awasthi
A. Kansal
Sunita Sarawagi
P. Jyothi
21
9
0
04 Mar 2021
Missing Value Imputation on Multidimensional Time Series
Parikshit Bansal
Prathamesh Deshpande
Sunita Sarawagi
AI4TS
14
63
0
02 Mar 2021
Improving speech recognition models with small samples for air traffic control systems
Yi Lin
Qin Li
Bo Yang
Zhen Yan
Huachun Tan
Zhengmao Chen
34
32
0
16 Feb 2021
Nanopore Base Calling on the Edge
Peter Perešíni
V. Boža
Broňa Brejová
T. Vinař
19
38
0
09 Nov 2020
MarbleNet: Deep 1D Time-Channel Separable Convolutional Neural Network for Voice Activity Detection
Fei Jia
Somshubra Majumdar
Boris Ginsburg
19
48
0
26 Oct 2020
Towards End-to-End Training of Automatic Speech Recognition for Nigerian Pidgin
Daniel A. Ajisafe
O. Adegboro
Esther Oduntan
T. Arulogun
13
4
0
21 Oct 2020
End-to-End Speech Recognition and Disfluency Removal
Paria Jamshid Lou
Mark Johnson
19
32
0
22 Sep 2020
Exploration of End-to-End ASR for OpenSTT -- Russian Open Speech-to-Text Dataset
A. Andrusenko
A. Laptev
Ivan Medennikov
VLM
24
12
0
15 Jun 2020
audino: A Modern Annotation Tool for Audio and Speech
Manraj Singh Grover
P. Bamdev
Ratin Kumar Brala
Yaman Kumar Singla
Mika Hama
R. Shah
17
12
0
09 Jun 2020
Attention-based Transducer for Online Speech Recognition
Bin Wang
Yan Yin
Hui-Ching Lin
18
4
0
18 May 2020
Multimodal Target Speech Separation with Voice and Face References
Leyuan Qu
C. Weber
S. Wermter
CVBM
19
19
0
17 May 2020
Speech to Text Adaptation: Towards an Efficient Cross-Modal Distillation
Won Ik Cho
Donghyun Kwak
J. Yoon
N. Kim
31
26
0
17 May 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
101
3,038
0
16 May 2020
ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context
Wei Han
Zhengdong Zhang
Yu Zhang
Jiahui Yu
Chung-Cheng Chiu
James Qin
Anmol Gulati
Ruoming Pang
Yonghui Wu
21
259
0
07 May 2020
Imputer: Sequence Modelling via Imputation and Dynamic Programming
William Chan
Chitwan Saharia
Geoffrey E. Hinton
Mohammad Norouzi
Navdeep Jaitly
BDL
AI4TS
21
114
0
20 Feb 2020
CGCNN: Complex Gabor Convolutional Neural Network on raw speech
Paul-Gauthier Noé
Titouan Parcollet
Mohamed Morchid
22
29
0
11 Feb 2020
Multimodal Machine Translation through Visuals and Speech
U. Sulubacak
Ozan Caglayan
Stig-Arne Gronroos
Aku Rouhe
Desmond Elliott
Lucia Specia
Jörg Tiedemann
49
73
0
28 Nov 2019