ResearchTrend.AI
CoVoST 2 and Massively Multilingual Speech-to-Text Translation


20 July 2020
Changhan Wang
Anne Wu
J. Pino
    SLR

Papers citing "CoVoST 2 and Massively Multilingual Speech-to-Text Translation"

48 / 48 papers shown
Capybara-OMNI: An Efficient Paradigm for Building Omni-Modal Language Models
Xingguang Ji
Jiakang Wang
Hongzhi Zhang
Jingyuan Zhang
Haonan Zhou
Chenxi Sun
Yong-Jin Liu
Qi Wang
Fuzheng Zhang
MLLM
VLM
58
0
0
10 Apr 2025
M2-omni: Advancing Omni-MLLM for Comprehensive Modality Support with Competitive Performance
Qingpei Guo
Kaiyou Song
Zipeng Feng
Ziping Ma
Qinglong Zhang
...
Yunxiao Sun
Tai-Wei Chang
Jingdong Chen
Ming Yang
Jun Zhou
MLLM
VLM
90
3
0
26 Feb 2025
Speech Translation Refinement using Large Language Models
Huaixia Dou
Xinyu Tian
Xinglin Lyu
Jie Zhu
Junhui Li
Lifan Guo
149
0
0
28 Jan 2025
Text2Data: Low-Resource Data Generation with Textual Control
Shiyu Wang
Yihao Feng
Tian Lan
Ning Yu
Yu Bai
Ran Xu
Hairu Wang
Caiming Xiong
Shri Kiran Srinivasan
DiffM
85
0
0
03 Jan 2025
Distilling an End-to-End Voice Assistant Without Instruction Training Data
William B. Held
Ella Li
Michael Joseph Ryan
Weiyan Shi
Yanzhe Zhang
Diyi Yang
AuLLM
47
8
0
03 Oct 2024
MathBridge: A Large Corpus Dataset for Translating Spoken Mathematical Expressions into LaTeX Formulas for Improved Readability
Kyudan Jung
Sieun Hyeon
Jeong Youn Kwon
N. Kim
Hyun Gon Ryu
Hyuk-Jae Lee
Jaeyoung Do
31
1
0
07 Aug 2024
Speech-MASSIVE: A Multilingual Speech Dataset for SLU and Beyond
Beomseok Lee
Ioan Calapodescu
Marco Gaido
Matteo Negri
Laurent Besacier
AuLLM
39
3
0
07 Aug 2024
Towards Achieving Human Parity on End-to-end Simultaneous Speech Translation via LLM Agent
Shanbo Cheng
Zhichao Huang
Tom Ko
Hang Li
Ningxin Peng
Lu Xu
Qini Zhang
48
3
0
31 Jul 2024
Qwen2-Audio Technical Report
Yunfei Chu
Jin Xu
Qian Yang
Haojie Wei
Xipin Wei
...
Yuanjun Lv
Jinzheng He
Junyang Lin
Chang Zhou
Jingren Zhou
AuLLM
VLM
37
105
0
15 Jul 2024
Finetuning End-to-End Models for Estonian Conversational Spoken Language Translation
Tiia Sildam
Andra Velve
Tanel Alumäe
48
0
0
04 Jul 2024
Multi-Modal Retrieval For Large Language Model Based Speech Recognition
J. Kolehmainen
Aditya Gourav
Prashanth Gurunath Shivakumar
Yile Gu
Ankur Gandhe
Ariya Rastrow
Grant P. Strimel
I. Bulyko
40
4
0
13 Jun 2024
StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning
Shaolei Zhang
Qingkai Fang
Shoutao Guo
Zhengrui Ma
Min Zhang
Yang Feng
29
4
0
05 Jun 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
38
19
0
15 Apr 2024
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification
Yifan Peng
Yui Sudo
Muhammad Shakeel
Shinji Watanabe
VLM
37
17
0
20 Feb 2024
Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models
Yunfei Chu
Jin Xu
Xiaohuan Zhou
Qian Yang
Shiliang Zhang
Zhijie Yan
Chang Zhou
Jingren Zhou
AuLLM
42
268
0
14 Nov 2023
Towards Real-World Streaming Speech Translation for Code-Switched Speech
Belen Alastruey
Matthias Sperber
Christian Gollan
Dominic Telaar
Tim Ng
Aashish Agarwal
14
2
0
19 Oct 2023
DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation
Qingkai Fang
Yan Zhou
Yangzhou Feng
40
6
0
11 Oct 2023
DecoderLens: Layerwise Interpretation of Encoder-Decoder Transformers
Anna Langedijk
Hosein Mohebbi
Gabriele Sarti
Willem H. Zuidema
Jaap Jumelet
32
10
0
05 Oct 2023
On decoder-only architecture for speech-to-text and large language model integration
Jian Wu
Yashesh Gaur
Zhuo Chen
Long Zhou
Yilun Zhu
...
Jinyu Li
Shujie Liu
Bo Ren
Linquan Liu
Yu-Huan Wu
AuLLM
30
118
0
08 Jul 2023
AudioPaLM: A Large Language Model That Can Speak and Listen
Paul Kishan Rubenstein
Chulayuth Asawaroengchai
D. Nguyen
Ankur Bapna
Zalan Borsos
...
Neil Zeghidour
Yu Zhang
Zhishuai Zhang
Lukás Zilka
Christian Frank
LM&MA
AuLLM
VLM
35
259
0
22 Jun 2023
KIT's Multilingual Speech Translation System for IWSLT 2023
Danni Liu
Thai-Binh Nguyen
Sai Koneru
Enes Yavuz Ugan
Ngoc-Quan Pham
Tuan-Nam Nguyen
Tu Anh Dinh
Carlos Mullov
A. Waibel
J. Niehues
26
6
0
08 Jun 2023
Speech Translation with Foundation Models and Optimal Transport: UPC at IWSLT23
Ioannis Tsiamas
Gerard I. Gállego
José A. R. Fonollosa
Marta R. Costa-jussá
OT
16
3
0
02 Jun 2023
Duplex Diffusion Models Improve Speech-to-Speech Translation
Xianchao Wu
DiffM
20
4
0
22 May 2023
Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization
Puyuan Peng
Brian Yan
Shinji Watanabe
David Harwath
VLM
LRM
40
46
0
18 May 2023
Understanding and Bridging the Modality Gap for Speech Translation
Qingkai Fang
Yang Feng
27
25
0
15 May 2023
Joint Speech Transcription and Translation: Pseudo-Labeling with Out-of-Distribution Data
Mozhdeh Gheini
Tatiana Likhomanenko
Matthias Sperber
Hendra Setiawan
33
5
0
20 Dec 2022
CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning
Chutong Meng
Junyi Ao
Tom Ko
Mingxuan Wang
Haizhou Li
SSL
47
6
0
08 Oct 2022
Direct Speech Translation for Automatic Subtitling
Sara Papi
Marco Gaido
Alina Karakanta
Mauro Cettolo
Matteo Negri
Marco Turchi
54
11
0
27 Sep 2022
The YiTrans End-to-End Speech Translation System for IWSLT 2022 Offline Shared Task
Ziqiang Zhang
Junyi Ao
Long Zhou
Shujie Liu
Furu Wei
Jinyu Li
17
9
0
12 Jun 2022
T-Modules: Translation Modules for Zero-Shot Cross-Modal Machine Translation
Paul-Ambroise Duquenne
Hongyu Gong
Benoît Sagot
Holger Schwenk
24
18
0
24 May 2022
Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition
Qianying Liu
Zhuo Gong
Zhengdong Yang
Yuhang Yang
Sheng Li
...
N. Minematsu
Hao-Ming Huang
Fei Cheng
Chenhui Chu
Sadao Kurohashi
24
5
0
08 Apr 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities
Hsiang-Sheng Tsai
Heng-Jui Chang
Wen-Chin Huang
Zili Huang
Kushal Lakhotia
...
Hsuan-Jui Chen
Shang-Wen Li
Shinji Watanabe
Abdel-rahman Mohamed
Hung-yi Lee
26
109
0
14 Mar 2022
Sequence-to-Sequence Resources for Catalan
Ona de Gibert
Ksenia Kharitonova
B. Figueras
Jordi Armengol-Estapé
Maite Melero
11
0
0
14 Feb 2022
Tackling data scarcity in speech translation using zero-shot multilingual machine translation techniques
Tu Anh Dinh
Danni Liu
J. Niehues
24
6
0
26 Jan 2022
Textless Speech-to-Speech Translation on Real Data
Ann Lee
Hongyu Gong
Paul-Ambroise Duquenne
Holger Schwenk
Peng-Jen Chen
...
Sravya Popuri
Yossi Adi
J. Pino
Jiatao Gu
Wei-Ning Hsu
28
142
0
15 Dec 2021
CORAA: a large corpus of spontaneous and prepared speech manually validated for speech recognition in Brazilian Portuguese
Arnaldo Cândido Júnior
Edresson Casanova
A. S. Soares
F. S. Oliveira
L. Oliveira
...
Daniel Peixoto Pinto da Silva
Fernando Gorgulho Fayet
B. Carlotto
L. Gris
S. Aluísio
13
14
0
14 Oct 2021
Translatotron 2: High-quality direct speech-to-speech translation with voice preservation
Ye Jia
Michelle Tadmor Ramanovich
Tal Remez
Roi Pomerantz
26
67
0
19 Jul 2021
Zero-shot Speech Translation
Tu Anh Dinh
30
6
0
13 Jul 2021
The NiuTrans End-to-End Speech Translation System for IWSLT 2021 Offline Task
Chen Xu
Xiaoqian Liu
Xiaowen Liu
Laohu Wang
Canan Huang
Tong Xiao
Jingbo Zhu
34
5
0
06 Jul 2021
Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling
Hongyu Gong
Yun Tang
J. Pino
Xian Li
33
11
0
21 Jun 2021
End-to-End Speech Translation with Pre-trained Models and Adapters: UPC at IWSLT 2021
Gerard I. Gállego
Ioannis Tsiamas
Carlos Escolano
José A. R. Fonollosa
Marta R. Costa-jussá
25
30
0
10 May 2021
Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks
Siddharth Dalmia
Brian Yan
Vikas Raunak
Florian Metze
Shinji Watanabe
37
30
0
02 May 2021
Source and Target Bidirectional Knowledge Distillation for End-to-end Speech Translation
H. Inaguma
Tatsuya Kawahara
Shinji Watanabe
29
42
0
13 Apr 2021
BSTC: A Large-Scale Chinese-English Speech Translation Dataset
Ruiqing Zhang
Xiyang Wang
Chuanqiang Zhang
Zhongjun He
Hua-Hong Wu
Zhi Li
Haifeng Wang
Ying Chen
Qinfei Li
17
39
0
08 Apr 2021
An Approach to Improve Robustness of NLP Systems against ASR Errors
Tong Cui
Jinghui Xiao
Liangyou Li
Xin Jiang
Qun Liu
19
11
0
25 Mar 2021
The Multilingual TEDx Corpus for Speech Recognition and Translation
Elizabeth Salesky
Matthew Wiesner
Jacob Bremerman
R. Cattoni
Matteo Negri
Marco Turchi
Douglas W. Oard
Matt Post
11
119
0
02 Feb 2021
Dual-decoder Transformer for Joint Automatic Speech Recognition and Multilingual Speech Translation
Hang Le
J. Pino
Changhan Wang
Jiatao Gu
D. Schwab
Laurent Besacier
39
82
0
02 Nov 2020
End-to-End Automatic Speech Translation of Audiobooks
Alexandre Berard
Laurent Besacier
A. Kocabiyikoglu
Olivier Pietquin
75
190
0
12 Feb 2018