ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2307.11795
  4. Cited By
Prompting Large Language Models with Speech Recognition Abilities

Prompting Large Language Models with Speech Recognition Abilities

21 July 2023
Yassir Fathullah
Chunyang Wu
Egor Lakomkin
J. Jia
Yuan Shangguan
Ke Li
Jinxi Guo
Wenhan Xiong
Jay Mahadeokar
Ozlem Kalinli
Christian Fuegen
M. Seltzer
    AuLLM
ArXivPDFHTML

Papers citing "Prompting Large Language Models with Speech Recognition Abilities"

24 / 24 papers shown
Title
SIFT-50M: A Large-Scale Multilingual Dataset for Speech Instruction Fine-Tuning
SIFT-50M: A Large-Scale Multilingual Dataset for Speech Instruction Fine-Tuning
Prabhat Pandey
R. Swaminathan
K V Vijay Girish
Arunasish Sen
Jian Xie
Grant P. Strimel
Andreas Schwarz
140
0
0
12 Apr 2025
QualiSpeech: A Speech Quality Assessment Dataset with Natural Language Reasoning and Descriptions
QualiSpeech: A Speech Quality Assessment Dataset with Natural Language Reasoning and Descriptions
Siyin Wang
Wenyi Yu
Xianzhao Chen
Xiaohai Tian
Jianwei Zhang
Lu Lu
Yu Tsao
Junichi Yamagishi
Yue Wang
Chao Zhang
AuLLM
83
0
0
26 Mar 2025
Audio-CoT: Exploring Chain-of-Thought Reasoning in Large Audio Language Model
Audio-CoT: Exploring Chain-of-Thought Reasoning in Large Audio Language Model
Z. Ma
Zhuo Chen
Yue Wang
Eng Siong Chng
Xie Chen
AuLLM
LRM
64
9
0
13 Jan 2025
Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison
Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison
Tsz Kin Lam
Marco Gaido
Sara Papi
L. Bentivogli
Barry Haddow
36
0
0
04 Jan 2025
Performance evaluation of SLAM-ASR: The Good, the Bad, the Ugly, and the Way Forward
Performance evaluation of SLAM-ASR: The Good, the Bad, the Ugly, and the Way Forward
Shashi Kumar
Iuliia Thorbecke
Sergio Burdisso
Esaú Villatoro-Tello
Marcelo Errecalde
Kadri Hacioğlu
Pradeep Rangappa
P. Motlícek
A. Ganapathiraju
Andreas Stolcke
55
1
0
06 Nov 2024
Thought2Text: Text Generation from EEG Signal using Large Language Models (LLMs)
Thought2Text: Text Generation from EEG Signal using Large Language Models (LLMs)
Abhijit Mishra
Shreya Shukla
Jose Torres
Jacek Gwizdka
Shounak Roychowdhury
48
4
0
10 Oct 2024
Frozen Large Language Models Can Perceive Paralinguistic Aspects of Speech
Frozen Large Language Models Can Perceive Paralinguistic Aspects of Speech
Wonjune Kang
J. Jia
Chunyang Wu
Wei Zhou
Egor Lakomkin
...
Leda Sari
Suyoun Kim
Ke Li
Jay Mahadeokar
Ozlem Kalinli
AuLLM
46
2
0
02 Oct 2024
SSR: Alignment-Aware Modality Connector for Speech Language Models
SSR: Alignment-Aware Modality Connector for Speech Language Models
Weiting Tan
Hirofumi Inaguma
Ning Dong
Paden Tomasello
Xutai Ma
32
3
0
30 Sep 2024
Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation
Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation
Siyin Wang
Wenyi Yu
Yudong Yang
Changli Tang
Yixuan Li
...
Jun Zhang
Guangzhi Sun
Lu Lu
Yuxuan Wang
Chao Zhang
AuLLM
LM&MA
67
5
0
25 Sep 2024
Large Language Models are Strong Audio-Visual Speech Recognition Learners
Large Language Models are Strong Audio-Visual Speech Recognition Learners
Umberto Cappellazzo
Minsu Kim
Honglie Chen
Pingchuan Ma
Stavros Petridis
Daniele Falavigna
Alessio Brutti
Maja Pantic
36
9
0
18 Sep 2024
Chain-of-Thought Prompting for Speech Translation
Chain-of-Thought Prompting for Speech Translation
Ke Hu
Zhehuai Chen
Chao-Han Huck Yang
Piotr Żelasko
Oleksii Hrinchuk
Vitaly Lavrukhin
Jagadeesh Balam
Boris Ginsburg
LRM
39
2
0
17 Sep 2024
Advancing Multi-talker ASR Performance with Large Language Models
Advancing Multi-talker ASR Performance with Large Language Models
Mohan Shi
Zengrui Jin
Yaoxun Xu
Yong Xu
Shi-Xiong Zhang
Kun Wei
Yiwen Shao
Chunlei Zhang
Dong Yu
31
1
0
30 Aug 2024
Bridging the Language Gap: Enhancing Multilingual Prompt-Based Code Generation in LLMs via Zero-Shot Cross-Lingual Transfer
Bridging the Language Gap: Enhancing Multilingual Prompt-Based Code Generation in LLMs via Zero-Shot Cross-Lingual Transfer
Mingda Li
Abhijit Mishra
Utkarsh Mujumdar
42
0
0
19 Aug 2024
Enhancing Large Language Model-based Speech Recognition by
  Contextualization for Rare and Ambiguous Words
Enhancing Large Language Model-based Speech Recognition by Contextualization for Rare and Ambiguous Words
Kento Nozawa
Takashi Masuko
Toru Taniguchi
43
1
0
15 Aug 2024
DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech
  Units for Spoken Language Understanding
DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding
Suwon Shon
Kwangyoun Kim
Yi-Te Hsu
Prashant Sridhar
Shinji Watanabe
Karen Livescu
AuLLM
46
2
0
13 Jun 2024
Can Large Language Models Understand Spatial Audio?
Can Large Language Models Understand Spatial Audio?
Changli Tang
Wenyi Yu
Guangzhi Sun
Xianzhao Chen
Tian Tan
...
Jun Zhang
Lu Lu
Zejun Ma
Yuxuan Wang
Chao Zhang
49
4
0
12 Jun 2024
MaLa-ASR: Multimedia-Assisted LLM-Based ASR
MaLa-ASR: Multimedia-Assisted LLM-Based ASR
Guanrou Yang
Ziyang Ma
Fan Yu
Zhifu Gao
Shiliang Zhang
Xie Chen
AuLLM
41
2
0
09 Jun 2024
Listen Again and Choose the Right Answer: A New Paradigm for Automatic
  Speech Recognition with Large Language Models
Listen Again and Choose the Right Answer: A New Paradigm for Automatic Speech Recognition with Large Language Models
Yuchen Hu
Chen Chen
Chengwei Qin
Qiushi Zhu
E. Chng
Ruizhe Li
AuLLM
KELM
49
5
0
16 May 2024
It's Never Too Late: Fusing Acoustic Information into Large Language
  Models for Automatic Speech Recognition
It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition
Chen Chen
Ruizhe Li
Yuchen Hu
Sabato Marco Siniscalchi
Pin-Yu Chen
Ensiong Chng
Chao-Han Huck Yang
33
19
0
08 Feb 2024
Large Language Models are Efficient Learners of Noise-Robust Speech
  Recognition
Large Language Models are Efficient Learners of Noise-Robust Speech Recognition
Yuchen Hu
Chen Chen
Chao-Han Huck Yang
Ruizhe Li
Chao Zhang
Pin-Yu Chen
Ensiong Chng
27
20
0
19 Jan 2024
Multimodal Data and Resource Efficient Device-Directed Speech Detection
  with Large Foundation Models
Multimodal Data and Resource Efficient Device-Directed Speech Detection with Large Foundation Models
Dominik Wagner
Alexander W. Churchill
Siddharth Sigtia
Panayiotis Georgiou
Matt Mirsamadi
Aarshee Mishra
Erik Marchi
20
3
0
06 Dec 2023
SALMONN: Towards Generic Hearing Abilities for Large Language Models
SALMONN: Towards Generic Hearing Abilities for Large Language Models
Changli Tang
Wenyi Yu
Guangzhi Sun
Xianzhao Chen
Tian Tan
Wei Li
Lu Lu
Zejun Ma
Chao Zhang
LM&MA
AuLLM
39
200
0
20 Oct 2023
SALM: Speech-augmented Language Model with In-context Learning for
  Speech Recognition and Translation
SALM: Speech-augmented Language Model with In-context Learning for Speech Recognition and Translation
Zhehuai Chen
He Huang
A. Andrusenko
Oleksii Hrinchuk
Krishna C. Puvvada
Jason Chun Lok Li
Subhankar Ghosh
Jagadeesh Balam
Boris Ginsburg
LRM
29
49
0
13 Oct 2023
Can Whisper perform speech-based in-context learning?
Can Whisper perform speech-based in-context learning?
Siyin Wang
Chao-Han Huck Yang
Ji Wu
Chao Zhang
26
24
0
13 Sep 2023
1