ResearchTrend.AI

Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision

30 December 2023
Chih-Kai Yang
Kuan-Po Huang
Ke-Han Lu
Chun-Yi Kuan
Chi-Yuan Hsiao
Hung-yi Lee

Papers citing "Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision"

13 / 13 papers shown
Zero Resource Code-switched Speech Benchmark Using Speech Utterance Pairs For Multiple Spoken Languages
Kuan-Po Huang
Chih-Kai Yang
Yu-Kuan Fu
Ewan Dunbar
Hung-yi Lee
82
10
0
04 Oct 2023
Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech
Chien-yu Huang
Ke-Han Lu
Shi Wang
Chi-Yuan Hsiao
Chun-Yi Kuan
...
Roshan S. Sharma
Shinji Watanabe
Bhiksha Ramakrishnan
Shady Shehata
Hung-yi Lee
AuLLM
81
63
0
18 Sep 2023
Can Whisper perform speech-based in-context learning?
Siyin Wang
Chao-Han Huck Yang
Ji Wu
Chao Zhang
103
29
0
13 Sep 2023
SeamlessM4T: Massively Multilingual & Multimodal Machine Translation
Seamless Communication
Loïc Barrault
Yu-An Chung
Mariano Cora Meglioli
David Dale
...
Holger Schwenk
Paden Tomasello
Changhan Wang
Jeff Wang
Skyler Wang
107
96
0
22 Aug 2023
Robust Speech Recognition via Large-Scale Weak Supervision
Alec Radford
Jong Wook Kim
Tao Xu
Greg Brockman
C. McLeavey
Ilya Sutskever
OffRL
216
3,757
0
06 Dec 2022
Reducing Language confusion for Code-switching Speech Recognition with Token-level Language Diarization
Hexin Liu
Haihua Xu
Leibny Paola García
Andy W. H. Khong
Yi He
Sanjeev Khudanpur
58
25
0
26 Oct 2022
End-to-End Speech Translation for Code Switched Speech
Orion Weller
Matthias Sperber
Telmo Pires
Hendra Setiawan
Christian Gollan
Dominic Telaar
Matthias Paulik
133
29
0
11 Apr 2022
Self-supervised Learning with Random-projection Quantizer for Speech Recognition
Chung-Cheng Chiu
James Qin
Yu Zhang
Jiahui Yu
Yonghui Wu
SSL
107
169
0
03 Feb 2022
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
273
1,908
0
26 Oct 2021
W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training
Yu-An Chung
Yu Zhang
Wei Han
Chung-Cheng Chiu
James Qin
Ruoming Pang
Yonghui Wu
SSL, VLM
67
429
0
07 Aug 2021
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
Wei-Ning Hsu
Benjamin Bolte
Yao-Hung Hubert Tsai
Kushal Lakhotia
Ruslan Salakhutdinov
Abdel-rahman Mohamed
SSL
184
3,004
0
14 Jun 2021
GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio
Guoguo Chen
Shuzhou Chai
Guan-Bo Wang
Jiayu Du
Weiqiang Zhang
...
Xuchen Yao
Yongqing Wang
Yujun Wang
Zhao You
Zhiyong Yan
116
385
0
13 Jun 2021
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
Alexei Baevski
Henry Zhou
Abdel-rahman Mohamed
Michael Auli
SSL
301
5,849
0
20 Jun 2020