Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation

5 November 2018
Ye Jia
Melvin Johnson
Wolfgang Macherey
Ron J. Weiss
Yuan Cao
Chung-Cheng Chiu
Naveen Ari
Stella Laurenzo
Yonghui Wu

Papers citing "Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation"

50 / 69 papers shown
Length Aware Speech Translation for Video Dubbing
Harveen Chadha
Aswin Shanmugam Subramanian
Vikas Joshi
Shubham Bansal
Jian Xue
R. Mehta
Jinyu Li
17
0
0
31 May 2025
AdaST: Dynamically Adapting Encoder States in the Decoder for End-to-End Speech-to-Text Translation
Wuwei Huang
Dexin Wang
Deyi Xiong
86
4
0
18 Mar 2025
Joint Training And Decoding for Multilingual End-to-End Simultaneous Speech Translation
Wuwei Huang
Renren Jin
Wen Zhang
Jian Luan
Bin Wang
Deyi Xiong
100
1
0
14 Mar 2025
When End-to-End is Overkill: Rethinking Cascaded Speech-to-Text Translation
Anna Min
Chenxu Hu
Yi Ren
Hang Zhao
107
1
0
01 Feb 2025
Sparks of Large Audio Models: A Survey and Outlook
S. Latif
Moazzam Shoukat
Fahad Shamshad
Muhammad Usama
Yi Ren
...
Wenwu Wang
Xulong Zhang
Roberto Togneri
Min Zhang
Björn W. Schuller
LM&MA, AuLLM
188
39
0
24 Aug 2023
Recent Advances in Direct Speech-to-text Translation
Chen Xu
Rong Ye
Qianqian Dong
Chengqi Zhao
Tom Ko
Mingxuan Wang
Tong Xiao
Jingbo Zhu
109
22
0
20 Jun 2023
ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation
Chenyang Le
Yao Qian
Long Zhou
Shujie Liu
Yanmin Qian
Michael Zeng
Xuedong Huang
72
13
0
24 May 2023
Back Translation for Speech-to-text Translation Without Transcripts
Qingkai Fang
Yang Feng
65
14
0
15 May 2023
Understanding and Bridging the Modality Gap for Speech Translation
Qingkai Fang
Yang Feng
75
26
0
15 May 2023
Improving Speech Translation by Cross-Modal Multi-Grained Contrastive Learning
Hao Zhang
Nianwen Si
Yaqi Chen
Wenlin Zhang
Xukui Yang
Dan Qu
Weiqiang Zhang
77
10
0
20 Apr 2023
WhisperX: Time-Accurate Speech Transcription of Long-Form Audio
Max Bain
Jaesung Huh
Tengda Han
Andrew Zisserman
137
242
0
01 Mar 2023
SegAugment: Maximizing the Utility of Speech Translation Data with Segmentation-based Augmentations
Ioannis Tsiamas
José A. R. Fonollosa
Marta R. Costa-jussá
80
6
0
19 Dec 2022
WACO: Word-Aligned Contrastive Learning for Speech Translation
Siqi Ouyang
Rong Ye
Lei Li
104
28
0
19 Dec 2022
Improving End-to-end Speech Translation by Leveraging Auxiliary Speech and Text Data
Yuhao Zhang
Chen Xu
Bojie Hu
Chunliang Zhang
Tong Xiao
Jingbo Zhu
62
16
0
04 Dec 2022
Efficient Speech Translation with Pre-trained Models
Zhaolin Li
Jan Niehues
54
2
0
09 Nov 2022
Discrete Cross-Modal Alignment Enables Zero-Shot Speech Translation
Chen Wang
Yuchen Liu
Boxing Chen
Jiajun Zhang
Wei Luo
Zhongqiang Huang
Chengqing Zong
64
10
0
18 Oct 2022
Generating Synthetic Speech from SpokenVocab for Speech Translation
Jinming Zhao
Gholamreza Haffari
Ehsan Shareghi
50
6
0
15 Oct 2022
Leveraging Pseudo-labeled Data to Improve Direct Speech-to-Speech Translation
Qianqian Dong
Fengpeng Yue
Tom Ko
Mingxuan Wang
Qibing Bai
Yu Zhang
90
16
0
18 May 2022
Large-Scale Streaming End-to-End Speech Translation with Neural Transducers
Jian Xue
Peidong Wang
Jinyu Li
Matt Post
Yashesh Gaur
AI4TS
72
31
0
11 Apr 2022
GigaST: A 10,000-hour Pseudo Speech Translation Corpus
Rong Ye
Chengqi Zhao
Tom Ko
Chutong Meng
Tao Wang
Mingxuan Wang
Jun Cao
85
23
0
08 Apr 2022
Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation
Sravya Popuri
Peng-Jen Chen
Changhan Wang
J. Pino
Yossi Adi
Jiatao Gu
Wei-Ning Hsu
Ann Lee
142
58
0
06 Apr 2022
Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation
Ye Jia
Yifan Ding
Ankur Bapna
Colin Cherry
Yu Zhang
Alexis Conneau
Nobuyuki Morioka
94
21
0
24 Mar 2022
STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation
Qingkai Fang
Rong Ye
Lei Li
Yang Feng
Mingxuan Wang
120
100
0
20 Mar 2022
Under the Morphosyntactic Lens: A Multifaceted Evaluation of Gender Bias in Speech Translation
Beatrice Savoldi
Marco Gaido
L. Bentivogli
Matteo Negri
Marco Turchi
75
27
0
18 Mar 2022
Sample, Translate, Recombine: Leveraging Audio Alignments for Data Augmentation in End-to-end Speech Translation
Tsz Kin Lam
Shigehiko Schamoni
Stefan Riezler
72
34
0
16 Mar 2022
Learning When to Translate for Streaming Speech
Qianqian Dong
Yaoming Zhu
Mingxuan Wang
Lei Li
100
30
0
15 Sep 2021
Non-autoregressive End-to-end Speech Translation with Parallel Autoregressive Rescoring
Hirofumi Inaguma
Yosuke Higuchi
Kevin Duh
Tatsuya Kawahara
Shinji Watanabe
87
11
0
09 Sep 2021
Translatotron 2: High-quality direct speech-to-speech translation with voice preservation
Ye Jia
Michelle Tadmor Ramanovich
Tal Remez
Roi Pomerantz
97
73
0
19 Jul 2021
Zero-shot Speech Translation
Tu Anh Dinh
79
6
0
13 Jul 2021
Direct speech-to-speech translation with discrete units
Ann Lee
Peng-Jen Chen
Changhan Wang
Jiatao Gu
Sravya Popuri
...
Yossi Adi
Qing He
Yun Tang
J. Pino
Wei-Ning Hsu
89
192
0
12 Jul 2021
Dealing with training and test segmentation mismatch: FBK@IWSLT2021
Sara Papi
Marco Gaido
Matteo Negri
Marco Turchi
71
6
0
23 Jun 2021
RealTranS: End-to-End Simultaneous Speech Translation with Convolutional Weighted-Shrinking Transformer
Xingshan Zeng
Liangyou Li
Qun Liu
67
48
0
09 Jun 2021
Cascade versus Direct Speech Translation: Do the Differences Still Make a Difference?
L. Bentivogli
Mauro Cettolo
Marco Gaido
Alina Karakanta
A. Martinelli
Matteo Negri
Marco Turchi
76
83
0
02 Jun 2021
The Volctrans Neural Speech Translation System for IWSLT 2021
Chengqi Zhao
Zhicheng Liu
Jian-Fei Tong
Tao Wang
Mingxuan Wang
Rong Ye
Qianqian Dong
Jun Cao
Lei Li
59
8
0
16 May 2021
Learning Shared Semantic Space for Speech-to-Text Translation
Chi Han
Mingxuan Wang
Heng Ji
Lei Li
112
78
0
07 May 2021
Beyond Voice Activity Detection: Hybrid Audio Segmentation for Direct Speech Translation
Marco Gaido
Matteo Negri
Mauro Cettolo
Marco Turchi
VLM
109
25
0
23 Apr 2021
Large-Scale Self- and Semi-Supervised Learning for Speech Translation
Changhan Wang
Anne Wu
J. Pino
Alexei Baevski
Michael Auli
Alexis Conneau
SSL
76
46
0
14 Apr 2021
Fused Acoustic and Text Encoding for Multimodal Bilingual Pretraining and Speech Translation
Renjie Zheng
Junkun Chen
Mingbo Ma
Liang Huang
155
69
0
10 Feb 2021
AI Choreographer: Music Conditioned 3D Dance Generation with AIST++
Ruilong Li
Shan Yang
David A. Ross
Angjoo Kanazawa
ViT
290
506
0
21 Jan 2021
Bridging the Modality Gap for Speech-to-Text Translation
Yuchen Liu
Junnan Zhu
Jiajun Zhang
Chengqing Zong
77
69
0
28 Oct 2020
TTS-by-TTS: TTS-driven Data Augmentation for Fast and High-Quality Speech Synthesis
Min-Jae Hwang
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim
44
32
0
26 Oct 2020
Orthros: Non-autoregressive End-to-end Speech Translation with Dual-decoder
Hirofumi Inaguma
Yosuke Higuchi
Kevin Duh
Tatsuya Kawahara
Shinji Watanabe
66
22
0
25 Oct 2020
Multilingual Speech Translation with Efficient Finetuning of Pretrained Models
Xian Li
Changhan Wang
Yun Tang
C. Tran
Yuqing Tang
J. Pino
Alexei Baevski
Alexis Conneau
Michael Auli
58
6
0
24 Oct 2020
A General Multi-Task Learning Framework to Leverage Text Data for Speech to Text Tasks
Yun Tang
J. Pino
Changhan Wang
Xutai Ma
Dmitriy Genzel
81
75
0
21 Oct 2020
Cascaded Models With Cyclic Feedback For Direct Speech Translation
Tsz Kin Lam
Shigehiko Schamoni
Stefan Riezler
94
13
0
21 Oct 2020
Adaptive Feature Selection for End-to-End Speech Translation
Biao Zhang
Ivan Titov
Barry Haddow
Rico Sennrich
77
41
0
16 Oct 2020
Improving Low Resource Code-switched ASR using Augmented Code-switched TTS
Yash Sharma
Basil Abraham
Karan Taneja
Preethi Jyothi
61
21
0
12 Oct 2020
Consecutive Decoding for Speech-to-text Translation
Qianqian Dong
Mingxuan Wang
Hao Zhou
Shuang Xu
Bo Xu
Lei Li
SLR
111
41
0
21 Sep 2020
On Target Segmentation for Direct Speech Translation
Mattia Antonino Di Gangi
Marco Gaido
Matteo Negri
Marco Turchi
79
14
0
10 Sep 2020
Large-scale Transfer Learning for Low-resource Spoken Language Understanding
X. Jia
Jianzong Wang
Zhiyong Zhang
Ning Cheng
Jing Xiao
73
17
0
13 Aug 2020