ResearchTrend.AI
Chain-of-Thought Prompting for Speech Translation


17 September 2024
Ke Hu
Zhehuai Chen
Chao-Han Huck Yang
Piotr Żelasko
Oleksii Hrinchuk
Vitaly Lavrukhin
Jagadeesh Balam
Boris Ginsburg
    LRM

Papers citing "Chain-of-Thought Prompting for Speech Translation"

40 / 40 papers shown
Title
MAVL: A Multilingual Audio-Video Lyrics Dataset for Animated Song Translation
Woohyun Cho
Youngmin Kim
Sunghyun Lee
Youngjae Yu
VGen
36
0
0
24 May 2025
LiSTEN: Learning Soft Token Embeddings for Neural Audio LLMs
Pooneh Mousavi
Shubham Gupta
Cem Subakan
Mirco Ravanelli
30
0
0
24 May 2025
Efficient and Direct Duplex Modeling for Speech-to-Speech Language Model
Ke Hu
Ehsan Hosseini-Asl
Chen Chen
Edresson Casanova
Subhankar Ghosh
Piotr Żelasko
Zhiwen Chen
Jia-Nan Li
Jagadeesh Balam
Boris Ginsburg
AuLLM
116
0
0
21 May 2025
On The Landscape of Spoken Language Models: A Comprehensive Survey
Siddhant Arora
Kai-Wei Chang
Chung-Ming Chien
Yifan Peng
Haibin Wu
Yossi Adi
Emmanuel Dupoux
Hung-yi Lee
Karen Livescu
Shinji Watanabe
120
14
0
11 Apr 2025
XiHeFusion: Harnessing Large Language Models for Science Communication in Nuclear Fusion
Xinyu Wang
Qingquan Yang
Fuling Wang
Qiang Chen
Wentao Wu
...
Wanli Lv
Meiwen Chen
Zehua Chen
Guosheng Xu
Jin Tang
AI4CE
92
0
0
08 Feb 2025
Making LLMs Better Many-to-Many Speech-to-Text Translators with Curriculum Learning
Yexing Du
Youcheng Pan
Ziyang Ma
Yifan Yang
Keqi Deng
Xie Chen
Yang Xiang
Ming Liu
Bing Qin
LRM
111
9
0
29 Sep 2024
LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models
Xi Chen
Songyang Zhang
Qibing Bai
Kai-xiang Chen
Satoshi Nakamura
AuLLM
74
7
0
22 Jul 2024
GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators
Yuchen Hu
Chen Chen
Chao-Han Huck Yang
Ruizhe Li
Dong Zhang
Zhehuai Chen
Eng Siong Chng
59
21
0
10 Feb 2024
Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities
Zhifeng Kong
Arushi Goel
Rohan Badlani
Ming-Yu Liu
Rafael Valle
Bryan Catanzaro
AuLLM
LM&MA
MLLM
113
89
0
02 Feb 2024
Speech Translation with Large Language Models: An Industrial Practice
Zhichao Huang
Rong Ye
Tom Ko
Qianqian Dong
Shanbo Cheng
Mingxuan Wang
Hang Li
88
18
0
21 Dec 2023
SALMONN: Towards Generic Hearing Abilities for Large Language Models
Changli Tang
Wenyi Yu
Guangzhi Sun
Xianzhao Chen
Tian Tan
Wei Li
Lu Lu
Zejun Ma
Chao Zhang
LM&MA
AuLLM
86
254
0
20 Oct 2023
SALM: Speech-augmented Language Model with In-context Learning for Speech Recognition and Translation
Zhehuai Chen
He Huang
A. Andrusenko
Oleksii Hrinchuk
Krishna C. Puvvada
Jason Chun Lok Li
Subhankar Ghosh
Jagadeesh Balam
Boris Ginsburg
LRM
68
58
0
13 Oct 2023
LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT
Zhihao Du
Jiaming Wang
Qian Chen
Yunfei Chu
Zhifu Gao
...
Wen Wang
Siqi Zheng
Chang Zhou
Zhijie Yan
Shiliang Zhang
LLMAG
VLM
AuLLM
LM&MA
92
85
0
07 Oct 2023
SLM: Bridge the thin gap between speech and text foundation models
Mingqiu Wang
Wei Han
Izhak Shafran
Zelin Wu
Chung-Cheng Chiu
...
Zhong Meng
Golan Pundak
Nikhil Siddhartha
J. Schalkwyk
Yonghui Wu
AuLLM
85
57
0
30 Sep 2023
Qwen Technical Report
Jinze Bai
Shuai Bai
Yunfei Chu
Zeyu Cui
Kai Dang
...
Zhenru Zhang
Chang Zhou
Jingren Zhou
Xiaohuan Zhou
Tianhang Zhu
OSLM
259
1,827
0
28 Sep 2023
Generative Speech Recognition Error Correction with Large Language Models and Task-Activating Prompting
Chao-Han Huck Yang
Yile Gu
Yi-Chieh Liu
Shalini Ghosh
I. Bulyko
A. Stolcke
KELM
LRM
95
49
0
27 Sep 2023
Joint Audio and Speech Understanding
Yuan Gong
Alexander H. Liu
Hongyin Luo
Leonid Karlinsky
James R. Glass
AuLLM
61
80
0
25 Sep 2023
End-to-End Speech Recognition Contextualization with Large Language Models
Egor Lakomkin
Chunyang Wu
Yassir Fathullah
Ozlem Kalinli
M. Seltzer
Christian Fuegen
82
22
0
19 Sep 2023
PromptASR for contextualized ASR with controllable style
Xiaoyu Yang
Wei Kang
Zengwei Yao
Yifan Yang
Liyong Guo
Fangjun Kuang
Long Lin
Daniel Povey
73
12
0
14 Sep 2023
Can Whisper perform speech-based in-context learning?
Siyin Wang
Chao-Han Huck Yang
Ji Wu
Chao Zhang
80
28
0
13 Sep 2023
SeamlessM4T: Massively Multilingual & Multimodal Machine Translation
Seamless Communication
Loïc Barrault
Yu-An Chung
Mariano Cora Meglioli
David Dale
...
Holger Schwenk
Paden Tomasello
Changhan Wang
Jeff Wang
Skyler Wang
83
91
0
22 Aug 2023
Prompting Large Language Models with Speech Recognition Abilities
Yassir Fathullah
Chunyang Wu
Egor Lakomkin
Junteng Jia
Yuan Shangguan
...
Wenhan Xiong
Jay Mahadeokar
Ozlem Kalinli
Christian Fuegen
M. Seltzer
AuLLM
78
142
0
21 Jul 2023
On decoder-only architecture for speech-to-text and large language model integration
Jian Wu
Yashesh Gaur
Zhuo Chen
Long Zhou
Yilun Zhu
...
Jinyu Li
Shujie Liu
Bo Ren
Linquan Liu
Yu-Huan Wu
AuLLM
71
133
0
08 Jul 2023
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
1.4K
14,359
0
15 Mar 2023
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALM
PILM
1.5K
13,247
0
27 Feb 2023
Robust Speech Recognition via Large-Scale Weak Supervision
Alec Radford
Jong Wook Kim
Tao Xu
Greg Brockman
C. McLeavey
Ilya Sutskever
OffRL
188
3,684
0
06 Dec 2022
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech
Alexis Conneau
Min Ma
Simran Khanuja
Yu Zhang
Vera Axelrod
Siddharth Dalmia
Jason Riesa
Clara E. Rivera
Ankur Bapna
VLM
131
322
0
25 May 2022
PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM
LRM
480
6,240
0
05 Apr 2022
Deliberation Model for On-Device Spoken Language Understanding
Duc Le
Akshat Shrivastava
Paden Tomasello
Suyoun Kim
Aleksandr Livshits
Ozlem Kalinli
M. Seltzer
AuLLM
64
12
0
04 Apr 2022
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
Sewon Min
Xinxi Lyu
Ari Holtzman
Mikel Artetxe
M. Lewis
Hannaneh Hajishirzi
Luke Zettlemoyer
LLMAG
LRM
163
1,485
0
25 Feb 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
814
9,387
0
28 Jan 2022
LoRA: Low-Rank Adaptation of Large Language Models
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRL
AI4TS
AI4CE
ALM
AIMat
466
10,367
0
17 Jun 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
570
4,047
0
18 Apr 2021
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Xiang Lisa Li
Percy Liang
242
4,261
0
01 Jan 2021
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
789
42,055
0
28 May 2020
Deliberation Model Based Two-Pass End-to-End Speech Recognition
Ke Hu
Tara N. Sainath
Ruoming Pang
Rohit Prabhavalkar
78
87
0
17 Mar 2020
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
427
20,181
0
23 Oct 2019
Root Mean Square Layer Normalization
Biao Zhang
Rico Sennrich
91
740
0
16 Oct 2019
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Mohammad Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
329
1,904
0
17 Sep 2019
NeMo: a toolkit for building AI applications using Neural Modules
Oleksii Kuchaiev
Jason Chun Lok Li
Huyen Nguyen
Oleksii Hrinchuk
Ryan Leary
...
Jack Cook
P. Castonguay
Mariya Popova
Jocelyn Huang
Jonathan M. Cohen
255
306
0
14 Sep 2019