Can Whisper perform speech-based in-context learning?


13 September 2023
Siyin Wang
Chao-Han Huck Yang
Ji Wu
Chao Zhang
arXiv: 2309.07081

Papers citing "Can Whisper perform speech-based in-context learning?"

25 / 25 papers shown
Spoken Language Understanding on Unseen Tasks With In-Context Learning
Neeraj Agrawal
Sriram Ganapathy
28
0
0
12 May 2025
M2R-Whisper: Multi-stage and Multi-scale Retrieval Augmentation for Enhancing Whisper
Jiaming Zhou
Songtao Zhao
Jiabei He
Hui Wang
Wenjia Zeng
Yong Chen
Haoqin Sun
Aobo Kong
Yong Qin
57
1
0
13 Mar 2025
FlanEC: Exploring Flan-T5 for Post-ASR Error Correction
Moreno La Quatra
Valerio Mario Salerno
Yu Tsao
Sabato Marco Siniscalchi
94
0
0
22 Jan 2025
What Do Speech Foundation Models Not Learn About Speech?
Abdul Waheed
Hanin Atwany
Bhiksha Raj
Rita Singh
SSL
35
1
0
16 Oct 2024
Efficient Long-Form Speech Recognition for General Speech In-Context Learning
Hao Yen
Shaoshi Ling
Guoli Ye
26
0
0
29 Sep 2024
Using LLM for Real-Time Transcription and Summarization of Doctor-Patient Interactions into ePuskesmas in Indonesia
Azmul Asmar Irfan
Nur Ahmad Khatim
Mansur M. Arief
30
0
0
25 Sep 2024
Chain-of-Thought Prompting for Speech Translation
Ke Hu
Zhehuai Chen
Chao-Han Huck Yang
Piotr Żelasko
Oleksii Hrinchuk
Vitaly Lavrukhin
Jagadeesh Balam
Boris Ginsburg
LRM
39
2
0
17 Sep 2024
Meta-Whisper: Speech-Based Meta-ICL for ASR on Low-Resource Languages
Ming-Hao Hsu
Kuan Po Huang
Hung-yi Lee
43
1
0
16 Sep 2024
LA-RAG:Enhancing LLM-based ASR Accuracy with Retrieval-Augmented Generation
Shaojun Li
Hengchao Shang
Daimeng Wei
Jiaxin Guo
Zongyao Li
Xianghui He
Min Zhang
Hao Yang
40
2
0
13 Sep 2024
Whisper-SV: Adapting Whisper for Low-data-resource Speaker Verification
Li Lyna Zhang
Ning Jiang
Qing Wang
Yuehong Li
Quan Lu
Lei Xie
36
6
0
14 Jul 2024
UniAudio 1.5: Large Language Model-driven Audio Codec is A Few-shot Audio Task Learner
Dongchao Yang
Haohan Guo
Yuanyuan Wang
Rongjie Huang
Xiang Li
Xu Tan
Xixin Wu
Helen Meng
AuLLM
47
15
0
14 Jun 2024
Do Prompts Really Prompt? Exploring the Prompt Understanding Capability of Whisper
Chih-Kai Yang
Kuan Po Huang
Hung-yi Lee
42
3
0
09 Jun 2024
Bayesian Example Selection Improves In-Context Learning for Speech, Text, and Visual Modalities
Siyin Wang
Chao-Han Huck Yang
Ji Wu
Chao Zhang
BDL
32
4
0
23 Apr 2024
Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?
Marco Gaido
Sara Papi
Matteo Negri
L. Bentivogli
43
13
0
19 Feb 2024
GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators
Yuchen Hu
Chen Chen
Chao-Han Huck Yang
Ruizhe Li
Dong Zhang
Zhehuai Chen
E. Chng
20
21
0
10 Feb 2024
Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities
Zhifeng Kong
Arushi Goel
Rohan Badlani
Ming-Yu Liu
Rafael Valle
Bryan Catanzaro
AuLLM
LM&MA
MLLM
76
73
0
02 Feb 2024
Large Language Models are Efficient Learners of Noise-Robust Speech Recognition
Yuchen Hu
Chen Chen
Chao-Han Huck Yang
Ruizhe Li
Chao Zhang
Pin-Yu Chen
Ensiong Chng
27
20
0
19 Jan 2024
Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision
Chih-Kai Yang
Kuan-Po Huang
Ke-Han Lu
Chun-Yi Kuan
Chi-Yuan Hsiao
Hung-yi Lee
48
7
0
30 Dec 2023
Investigating the Emergent Audio Classification Ability of ASR Foundation Models
Rao Ma
Adian Liusie
Mark J. F. Gales
Kate Knill
39
7
0
15 Nov 2023
COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning
Jing Pan
Jian Wu
Yashesh Gaur
S. Sivasankaran
Zhuo Chen
Shujie Liu
Jinyu Li
ELM
32
26
0
03 Nov 2023
Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition
S. Radhakrishnan
Chao-Han Huck Yang
S. Khan
Rohit Kumar
N. Kiani
D. Gómez-Cabrero
Jesper N. Tegnér
38
47
0
10 Oct 2023
Scaling Speech Technology to 1,000+ Languages
Vineel Pratap
Andros Tjandra
Bowen Shi
Paden Tomasello
Arun Babu
...
Yossi Adi
Xiaohui Zhang
Wei-Ning Hsu
Alexis Conneau
Michael Auli
VLM
77
300
0
22 May 2023
Can ChatGPT Detect Intent? Evaluating Large Language Models for Spoken Language Understanding
Mutian He
Philip N. Garner
ELM
AI4MH
LRM
46
21
0
22 May 2023
RescoreBERT: Discriminative Speech Recognition Rescoring with BERT
Liyan Xu
Yile Gu
J. Kolehmainen
Haidar Khan
Ankur Gandhe
Ariya Rastrow
A. Stolcke
I. Bulyko
36
45
0
02 Feb 2022
What Makes Good In-Context Examples for GPT-3?
Jiachang Liu
Dinghan Shen
Yizhe Zhang
Bill Dolan
Lawrence Carin
Weizhu Chen
AAML
RALM
275
1,312
0
17 Jan 2021