Stacked Acoustic-and-Textual Encoding: Integrating the Pre-trained Models into Speech Translation Encoders

12 May 2021

Jingbo Zhu

Papers citing "Stacked Acoustic-and-Textual Encoding: Integrating the Pre-trained Models into Speech Translation Encoders"

46 / 46 papers shown

Title
Adaptive Inner Speech-Text Alignment for LLM-based Speech Translation Henglyu Liu Andong Chen Kehai Chen X. Bai M. Zhong Yuan Qiu Min Zhang 45 0 0 13 Mar 2025
A Modular-based Strategy for Mitigating Gradient Conflicts in Simultaneous Speech Translation Xiaoqian Liu Yangfan Du Jiadong Wang Yuan Ge Chen Xu Tong Xiao Guocheng Chen Jingbo Zhu 36 0 0 31 Dec 2024
Unveiling the Role of Pretraining in Direct Speech Translation Belen Alastruey Gerard I. Gállego Marta R. Costa-jussá 36 0 0 26 Sep 2024
SBAAM! Eliminating Transcript Dependency in Automatic Subtitling Marco Gaido Sara Papi Matteo Negri Mauro Cettolo L. Bentivogli 43 1 0 17 May 2024
Pushing the Limits of Zero-shot End-to-End Speech Translation Ioannis Tsiamas Gerard I. Gállego José A. R. Fonollosa Marta R. Costa-jussá 43 7 0 16 Feb 2024
Soft Alignment of Modality Space for End-to-end Speech Translation Yuhao Zhang Kaiqi Kou Bei Li Chen Xu Chunliang Zhang Tong Xiao Jingbo Zhu 29 0 0 18 Dec 2023
Rethinking and Improving Multi-task Learning for End-to-end Speech Translation Yuhao Zhang Chen Xu Bei Li Hao Chen Tong Xiao Chunliang Zhang Jingbo Zhu 26 5 0 07 Nov 2023
Towards a Deep Understanding of Multilingual End-to-End Speech Translation Haoran Sun Xiaohu Zhao Yikun Lei Shaolin Zhu Deyi Xiong 39 8 0 31 Oct 2023
Tuning Large language model for End-to-end Speech Translation Hao Zhang Nianwen Si Yaqi Chen Wenlin Zhang Xu Yang Dan Qu Xiaolin Jiao 20 8 0 03 Oct 2023
Bridging the Gaps of Both Modality and Language: Synchronous Bilingual CTC for Speech Translation and Speech Recognition Chen Xu Xiaoqian Liu Erfeng He Yuhao Zhang Qianqian Dong Tong Xiao Jingbo Zhu Dapeng Man Wu Yang 35 0 0 21 Sep 2023
An Empirical Study of Consistency Regularization for End-to-End Speech-to-Text Translation Pengzhi Gao Ruiqing Zhang Zhongjun He Hua Wu Haifeng Wang 25 4 0 28 Aug 2023
Recent Advances in Direct Speech-to-text Translation Chen Xu Rong Ye Qianqian Dong Chengqi Zhao Tom Ko Mingxuan Wang Tong Xiao Jingbo Zhu 21 18 0 20 Jun 2023
Modality Adaption or Regularization? A Case Study on End-to-End Speech Translation Yucheng Han Chen Xu Tong Xiao Jingbo Zhu 27 3 0 13 Jun 2023
Speech Translation with Foundation Models and Optimal Transport: UPC at IWSLT23 Ioannis Tsiamas Gerard I. Gállego José A. R. Fonollosa Marta R. Costa-jussá OT 16 3 0 02 Jun 2023
CTC-based Non-autoregressive Speech Translation Chen Xu Xiaoqian Liu Xiaowen Liu Qingxuan Sun Yuhao Zhang ... Tom Ko Mingxuan Wang Tong Xiao Anxiang Ma Jingbo Zhu 25 11 0 27 May 2023
Bridging the Granularity Gap for Acoustic Modeling Chen Xu Yuhao Zhang Chengbo Jiao Xiaoqian Liu Chi Hu Xin Zeng Tong Xiao Anxiang Ma Huizhen Wang JingBo Zhu 29 6 0 27 May 2023
ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation Chenyang Le Yao Qian Long Zhou Shujie Liu Yanmin Qian Michael Zeng Xuedong Huang 24 13 0 24 May 2023
CMOT: Cross-modal Mixup via Optimal Transport for Speech Translation Yan Zhou Qingkai Fang Yang Feng OT 45 26 0 24 May 2023
Improving speech translation by fusing speech and text Wenbiao Yin Zhicheng Liu Chengqi Zhao Tao Wang Jian-Fei Tong Rong Ye 15 4 0 23 May 2023
Back Translation for Speech-to-text Translation Without Transcripts Qingkai Fang Yang Feng 38 13 0 15 May 2023
Understanding and Bridging the Modality Gap for Speech Translation Qingkai Fang Yang Feng 32 25 0 15 May 2023
Improving Speech Translation by Cross-Modal Multi-Grained Contrastive Learning Hao Zhang Nianwen Si Yaqi Chen Wenlin Zhang Xukui Yang Dan Qu Weiqiang Zhang 35 9 0 20 Apr 2023
When Good and Reproducible Results are a Giant with Feet of Clay: The Importance of Software Quality in NLP Sara Papi Marco Gaido Andrea Pilzer Matteo Negri 59 10 0 28 Mar 2023
Efficient CTC Regularization via Coarse Labels for End-to-End Speech Translation Biao Zhang Barry Haddow Rico Sennrich 17 3 0 21 Feb 2023
Pre-training for Speech Translation: CTC Meets Optimal Transport Hang Le Hongyu Gong Changhan Wang J. Pino Benjamin Lecouteux D. Schwab OT 13 21 0 27 Jan 2023
WACO: Word-Aligned Contrastive Learning for Speech Translation Siqi Ouyang Rong Ye Lei Li 32 25 0 19 Dec 2022
AdaTranS: Adapting with Boundary-based Shrinking for End-to-End Speech Translation Xingshan Zeng Liangyou Li Qun Liu 24 5 0 17 Dec 2022
M3ST: Mix at Three Levels for Speech Translation Xuxin Cheng Qianqian Dong Fengpeng Yue Tom Ko Mingxuan Wang Yuexian Zou 30 40 0 07 Dec 2022
Improving End-to-end Speech Translation by Leveraging Auxiliary Speech and Text Data Yuhao Zhang Chen Xu Bojie Hu Chunliang Zhang Tong Xiao Jingbo Zhu 27 15 0 04 Dec 2022
Does Joint Training Really Help Cascaded Speech Translation? Viet Anh Khoa Tran David Thulke Yingbo Gao Christian Herold Hermann Ney 30 3 0 24 Oct 2022
Discrete Cross-Modal Alignment Enables Zero-Shot Speech Translation Chen Wang Yuchen Liu Boxing Chen Jiajun Zhang Wei Luo Zhongqiang Huang Chengqing Zong 39 10 0 18 Oct 2022
SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training Zi-Hua Zhang Long Zhou Junyi Ao Shujie Liu Lirong Dai Jinyu Li Furu Wei 61 57 0 07 Oct 2022
JoeyS2T: Minimalistic Speech-to-Text Modeling with JoeyNMT Mayumi Ohta Julia Kreutzer Stefan Riezler 19 0 0 05 Oct 2022
M-Adapter: Modality Adaptation for End-to-End Speech-to-Text Translation Jinming Zhao Haomiao Yang Ehsan Shareghi Gholamreza Haffari 48 19 0 03 Jul 2022
Revisiting End-to-End Speech-to-Text Translation From Scratch Biao Zhang Barry Haddow Rico Sennrich 24 36 0 09 Jun 2022
T-Modules: Translation Modules for Zero-Shot Cross-Modal Machine Translation Paul-Ambroise Duquenne Hongyu Gong Benoît Sagot Holger Schwenk 30 18 0 24 May 2022
Efficient yet Competitive Speech Translation: FBK@IWSLT2022 Marco Gaido Sara Papi Dennis Fucci G. Fiameni Matteo Negri Marco Turchi 31 19 0 05 May 2022
Cross-modal Contrastive Learning for Speech Translation Rong Ye Mingxuan Wang Lei Li SSL 27 84 0 05 May 2022
Genre-conditioned Acoustic Models for Automatic Lyrics Transcription of Polyphonic Music Xiaoxue Gao Chitralekha Gupta Haizhou Li 32 21 0 07 Apr 2022
STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation Qingkai Fang Rong Ye Lei Li Yang Feng Mingxuan Wang 35 95 0 20 Mar 2022
Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates Hirofumi Inaguma Siddharth Dalmia Brian Yan Shinji Watanabe 65 11 0 27 Sep 2021
Non-autoregressive End-to-end Speech Translation with Parallel Autoregressive Rescoring Hirofumi Inaguma Yosuke Higuchi Kevin Duh Tatsuya Kawahara Shinji Watanabe 63 11 0 09 Sep 2021
The NiuTrans End-to-End Speech Translation System for IWSLT 2021 Offline Task Chen Xu Xiaoqian Liu Xiaowen Liu Laohu Wang Canan Huang Tong Xiao Jingbo Zhu 34 5 0 06 Jul 2021
Learning Light-Weight Translation Models from Deep Transformer Bei Li Ziyang Wang Hui Liu Quan Du Tong Xiao Chunliang Zhang Jingbo Zhu VLM 120 40 0 27 Dec 2020
Tied Multitask Learning for Neural Speech Translation Antonios Anastasopoulos David Chiang 100 172 0 19 Feb 2018
End-to-End Automatic Speech Translation of Audiobooks Alexandre Berard Laurent Besacier A. Kocabiyikoglu Olivier Pietquin 75 190 0 12 Feb 2018