2309.07081
Cited By
Can Whisper perform speech-based in-context learning?
13 September 2023
Siyin Wang
Chao-Han Huck Yang
Ji Wu
Chao Zhang
Papers citing
"Can Whisper perform speech-based in-context learning?"
25 / 25 papers shown
Title
Spoken Language Understanding on Unseen Tasks With In-Context Learning
Neeraj Agrawal
Sriram Ganapathy
28
0
0
12 May 2025
M2R-Whisper: Multi-stage and Multi-scale Retrieval Augmentation for Enhancing Whisper
Jiaming Zhou
Songtao Zhao
Jiabei He
Hui Wang
Wenjia Zeng
Yong Chen
Haoqin Sun
Aobo Kong
Yong Qin
57
1
0
13 Mar 2025
FlanEC: Exploring Flan-T5 for Post-ASR Error Correction
Moreno La Quatra
Valerio Mario Salerno
Yu Tsao
Sabato Marco Siniscalchi
94
0
0
22 Jan 2025
What Do Speech Foundation Models Not Learn About Speech?
Abdul Waheed
Hanin Atwany
Bhiksha Raj
Rita Singh
SSL
37
1
0
16 Oct 2024
Efficient Long-Form Speech Recognition for General Speech In-Context Learning
Hao Yen
Shaoshi Ling
Guoli Ye
26
0
0
29 Sep 2024
Using LLM for Real-Time Transcription and Summarization of Doctor-Patient Interactions into ePuskesmas in Indonesia
Azmul Asmar Irfan
Nur Ahmad Khatim
Mansur M. Arief
30
0
0
25 Sep 2024
Chain-of-Thought Prompting for Speech Translation
Ke Hu
Zhehuai Chen
Chao-Han Huck Yang
Piotr Żelasko
Oleksii Hrinchuk
Vitaly Lavrukhin
Jagadeesh Balam
Boris Ginsburg
LRM
39
2
0
17 Sep 2024
Meta-Whisper: Speech-Based Meta-ICL for ASR on Low-Resource Languages
Ming-Hao Hsu
Kuan Po Huang
Hung-yi Lee
46
1
0
16 Sep 2024
LA-RAG: Enhancing LLM-based ASR Accuracy with Retrieval-Augmented Generation
Shaojun Li
Hengchao Shang
Daimeng Wei
Jiaxin Guo
Zongyao Li
Xianghui He
Min Zhang
Hao Yang
40
2
0
13 Sep 2024
Whisper-SV: Adapting Whisper for Low-data-resource Speaker Verification
Li Zhang
Ning Jiang
Qing Wang
Yuehong Li
Quan Lu
Lei Xie
36
6
0
14 Jul 2024
UniAudio 1.5: Large Language Model-driven Audio Codec is A Few-shot Audio Task Learner
Dongchao Yang
Haohan Guo
Yuanyuan Wang
Rongjie Huang
Xiang Li
Xu Tan
Xixin Wu
Helen Meng
AuLLM
47
15
0
14 Jun 2024
Do Prompts Really Prompt? Exploring the Prompt Understanding Capability of Whisper
Chih-Kai Yang
Kuan Po Huang
Hung-yi Lee
42
3
0
09 Jun 2024
Bayesian Example Selection Improves In-Context Learning for Speech, Text, and Visual Modalities
Siyin Wang
Chao-Han Huck Yang
Ji Wu
Chao Zhang
BDL
34
4
0
23 Apr 2024
Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?
Marco Gaido
Sara Papi
Matteo Negri
L. Bentivogli
49
13
0
19 Feb 2024
GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators
Yuchen Hu
Chen Chen
Chao-Han Huck Yang
Ruizhe Li
Dong Zhang
Zhehuai Chen
E. Chng
20
21
0
10 Feb 2024
Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities
Zhifeng Kong
Arushi Goel
Rohan Badlani
Ming-Yu Liu
Rafael Valle
Bryan Catanzaro
AuLLM
LM&MA
MLLM
76
74
0
02 Feb 2024
Large Language Models are Efficient Learners of Noise-Robust Speech Recognition
Yuchen Hu
Chen Chen
Chao-Han Huck Yang
Ruizhe Li
Chao Zhang
Pin-Yu Chen
Ensiong Chng
27
20
0
19 Jan 2024
Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision
Chih-Kai Yang
Kuan-Po Huang
Ke-Han Lu
Chun-Yi Kuan
Chi-Yuan Hsiao
Hung-yi Lee
48
7
0
30 Dec 2023
Investigating the Emergent Audio Classification Ability of ASR Foundation Models
Rao Ma
Adian Liusie
Mark J. F. Gales
Kate Knill
39
8
0
15 Nov 2023
COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning
Jing Pan
Jian Wu
Yashesh Gaur
S. Sivasankaran
Zhuo Chen
Shujie Liu
Jinyu Li
ELM
35
26
0
03 Nov 2023
Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition
S. Radhakrishnan
Chao-Han Huck Yang
S. Khan
Rohit Kumar
N. Kiani
D. Gómez-Cabrero
Jesper N. Tegnér
38
47
0
10 Oct 2023
Scaling Speech Technology to 1,000+ Languages
Vineel Pratap
Andros Tjandra
Bowen Shi
Paden Tomasello
Arun Babu
...
Yossi Adi
Xiaohui Zhang
Wei-Ning Hsu
Alexis Conneau
Michael Auli
VLM
77
301
0
22 May 2023
Can ChatGPT Detect Intent? Evaluating Large Language Models for Spoken Language Understanding
Mutian He
Philip N. Garner
ELM
AI4MH
LRM
48
21
0
22 May 2023
RescoreBERT: Discriminative Speech Recognition Rescoring with BERT
Liyan Xu
Yile Gu
J. Kolehmainen
Haidar Khan
Ankur Gandhe
Ariya Rastrow
A. Stolcke
I. Bulyko
39
45
0
02 Feb 2022
What Makes Good In-Context Examples for GPT-3?
Jiachang Liu
Dinghan Shen
Yizhe Zhang
Bill Dolan
Lawrence Carin
Weizhu Chen
AAML
RALM
275
1,312
0
17 Jan 2021