
Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation

6 April 2022
Sravya Popuri
Peng-Jen Chen
Changhan Wang
J. Pino
Yossi Adi
Jiatao Gu
Wei-Ning Hsu
Ann Lee

Papers citing "Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation"

19 / 19 papers shown
A Unit-based System and Dataset for Expressive Direct Speech-to-Speech Translation
Anna Min
Chenxu Hu
Yi Ren
Hang Zhao
78
0
0
01 Feb 2025
When End-to-End is Overkill: Rethinking Cascaded Speech-to-Text Translation
Anna Min
Chenxu Hu
Yi Ren
Hang Zhao
80
0
0
01 Feb 2025
Can We Achieve High-quality Direct Speech-to-Speech Translation without Parallel Speech Data?
Qingkai Fang
Shaolei Zhang
Zhengrui Ma
Min Zhang
Yang Feng
VLM
54
1
0
11 Jun 2024
Direct Punjabi to English speech translation using discrete units
Prabhjot Kaur
L. A. M. Bush
Weisong Shi
43
0
0
25 Feb 2024
Towards audio language modeling -- an overview
Haibin Wu
Xuanjun Chen
Yi-Cheng Lin
Kai-Wei Chang
Ho-Lam Chung
Alexander H. Liu
Hung-yi Lee
AuLLM
47
29
0
20 Feb 2024
AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation
Rongjie Huang
Huadai Liu
Xize Cheng
Yi Ren
Lin Li
...
Jinzheng He
Lichao Zhang
Jinglin Liu
Xiaoyue Yin
Zhou Zhao
82
8
0
24 May 2023
Textually Pretrained Speech Language Models
Michael Hassid
Tal Remez
Tu Nguyen
Itai Gat
Alexis Conneau
...
Alexandre Défossez
Gabriel Synnaeve
Emmanuel Dupoux
Roy Schwartz
Yossi Adi
VLM
SyDa
59
54
0
22 May 2023
Back Translation for Speech-to-text Translation Without Transcripts
Qingkai Fang
Yang Feng
45
13
0
15 May 2023
Speech-to-Speech Translation For A Real-world Unwritten Language
Peng-Jen Chen
Ke M. Tran
Yilin Yang
Jingfei Du
Justine T. Kao
...
Sravya Popuri
Changhan Wang
J. Pino
Wei-Ning Hsu
Ann Lee
58
26
0
11 Nov 2022
SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations
Paul-Ambroise Duquenne
Hongyu Gong
Ning Dong
Jingfei Du
Ann Lee
Vedanuj Goswami
Changhan Wang
J. Pino
Benoît Sagot
Holger Schwenk
65
34
0
08 Nov 2022
Audio Language Modeling using Perceptually-Guided Discrete Representations
Felix Kreuk
Yaniv Taigman
Adam Polyak
Jade Copet
Gabriel Synnaeve
Alexandre Défossez
Yossi Adi
40
4
0
02 Nov 2022
Textless Direct Speech-to-Speech Translation with Discrete Speech Representation
Xinjian Li
Ye Jia
Chung-Cheng Chiu
61
26
0
31 Oct 2022
Joint Pre-Training with Speech and Bilingual Text for Direct Speech to Speech Translation
Kun Wei
Long Zhou
Zi-Hua Zhang
Liping Chen
Shujie Liu
Lei He
Jinyu Li
Furu Wei
47
13
0
31 Oct 2022
Bootstrapping meaning through listening: Unsupervised learning of spoken sentence embeddings
Jian Zhu
Zuoyu Tian
Yadong Liu
Cong Zhang
Chia-wen Lo
SSL
50
2
0
23 Oct 2022
Augmentation Invariant Discrete Representation for Generative Spoken Language Modeling
Itai Gat
Felix Kreuk
Tu Nguyen
Ann Lee
Jade Copet
Gabriel Synnaeve
Emmanuel Dupoux
Yossi Adi
56
11
0
30 Sep 2022
AudioGen: Textually Guided Audio Generation
Felix Kreuk
Gabriel Synnaeve
Adam Polyak
Uriel Singer
Alexandre Défossez
Jade Copet
Devi Parikh
Yaniv Taigman
Yossi Adi
DiffM
32
291
0
30 Sep 2022
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing
Junyi Ao
Rui Wang
Long Zhou
Chengyi Wang
Shuo Ren
...
Yu Zhang
Zhihua Wei
Yao Qian
Jinyu Li
Furu Wei
118
196
0
14 Oct 2021
Generative Spoken Language Modeling from Raw Audio
Kushal Lakhotia
Evgeny Kharitonov
Wei-Ning Hsu
Yossi Adi
Adam Polyak
...
Tu Nguyen
Jade Copet
Alexei Baevski
A. Mohamed
Emmanuel Dupoux
AuLLM
201
348
0
01 Feb 2021
End-to-End Automatic Speech Translation of Audiobooks
Alexandre Berard
Laurent Besacier
A. Kocabiyikoglu
Olivier Pietquin
83
190
0
12 Feb 2018