Investigating End-to-End ASR Architectures for Long Form Audio Transcription

18 September 2023
Nithin Rao Koluguri
Samuel Kriman
Georgy Zelenfroind
Somshubra Majumdar
Dima Rekesh
Vahid Noroozi
Jagadeesh Balam
Boris Ginsburg
    AuLLM

Papers citing "Investigating End-to-End ASR Architectures for Long Form Audio Transcription"

8 / 8 papers shown
DOTA-ME-CS: Daily Oriented Text Audio-Mandarin English-Code Switching Dataset
Yupei Li
Zifan Wei
Heng Yu
Huichi Zhou
Björn Schuller
29
0
0
21 Jan 2025
Reducing the Gap Between Pretrained Speech Enhancement and Recognition Models Using a Real Speech-Trained Bridging Module
Zhongjian Cui
Chenrui Cui
Tianrui Wang
Mengnan He
Hao Shi
Meng Ge
Caixia Gong
Longbiao Wang
J. Dang
33
0
0
05 Jan 2025
Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity
Mutian He
Philip N. Garner
82
0
0
09 Oct 2024
Longer is (Not Necessarily) Stronger: Punctuated Long-Sequence Training for Enhanced Speech Recognition and Translation
Nithin Rao Koluguri
Travis M. Bartley
Hainan Xu
Oleksii Hrinchuk
Jagadeesh Balam
Boris Ginsburg
Georg Kucsko
41
3
0
09 Sep 2024
SONICS: Synthetic Or Not -- Identifying Counterfeit Songs
Md Awsafur Rahman
Zaber Ibn Abdul Hakim
Najibul Haque Sarker
Bishmoy Paul
S. Fattah
46
7
0
26 Aug 2024
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification
Yifan Peng
Yui Sudo
Muhammad Shakeel
Shinji Watanabe
VLM
37
17
0
20 Feb 2024
How Much Context Does My Attention-Based ASR System Need?
Robert Flynn
Anton Ragni
32
1
0
24 Oct 2023
Earnings-21: A Practical Benchmark for ASR in the Wild
Miguel Rio
Natalie Delworth
Ryan Westerman
Michelle Huang
Nishchal Bhandari
Joseph Palakapilly
Quinten McNamara
Joshua Dong
Piotr Żelasko
Miguel Jetté
66
47
0
22 Apr 2021