2309.09950
Cited By
Investigating End-to-End ASR Architectures for Long Form Audio Transcription
18 September 2023
Nithin Rao Koluguri
Samuel Kriman
Georgy Zelenfroind
Somshubra Majumdar
Dima Rekesh
Vahid Noroozi
Jagadeesh Balam
Boris Ginsburg
AuLLM
Papers citing "Investigating End-to-End ASR Architectures for Long Form Audio Transcription"
8 / 8 papers shown
Title
DOTA-ME-CS: Daily Oriented Text Audio-Mandarin English-Code Switching Dataset
Yupei Li
Zifan Wei
Heng Yu
Huichi Zhou
Björn Schuller
29
0
0
21 Jan 2025
Reducing the Gap Between Pretrained Speech Enhancement and Recognition Models Using a Real Speech-Trained Bridging Module
Zhongjian Cui
Chenrui Cui
Tianrui Wang
Mengnan He
Hao Shi
Meng Ge
Caixia Gong
Longbiao Wang
J. Dang
33
0
0
05 Jan 2025
Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity
Mutian He
Philip N. Garner
82
0
0
09 Oct 2024
Longer is (Not Necessarily) Stronger: Punctuated Long-Sequence Training for Enhanced Speech Recognition and Translation
Nithin Rao Koluguri
Travis M. Bartley
Hainan Xu
Oleksii Hrinchuk
Jagadeesh Balam
Boris Ginsburg
Georg Kucsko
41
3
0
09 Sep 2024
SONICS: Synthetic Or Not -- Identifying Counterfeit Songs
Md Awsafur Rahman
Zaber Ibn Abdul Hakim
Najibul Haque Sarker
Bishmoy Paul
S. Fattah
46
7
0
26 Aug 2024
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification
Yifan Peng
Yui Sudo
Muhammad Shakeel
Shinji Watanabe
VLM
37
17
0
20 Feb 2024
How Much Context Does My Attention-Based ASR System Need?
Robert Flynn
Anton Ragni
32
1
0
24 Oct 2023
Earnings-21: A Practical Benchmark for ASR in the Wild
Miguel Del Rio
Natalie Delworth
Ryan Westerman
Michelle Huang
Nishchal Bhandari
Joseph Palakapilly
Quinten McNamara
Joshua Dong
Piotr Żelasko
Miguel Jetté
66
47
0
22 Apr 2021