ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1703.08581
  4. Cited By
Sequence-to-Sequence Models Can Directly Translate Foreign Speech

Sequence-to-Sequence Models Can Directly Translate Foreign Speech

24 March 2017
Ron J. Weiss
J. Chorowski
Navdeep Jaitly
Yonghui Wu
Z. Chen
ArXivPDFHTML

Papers citing "Sequence-to-Sequence Models Can Directly Translate Foreign Speech"

50 / 68 papers shown
Title
AdaST: Dynamically Adapting Encoder States in the Decoder for End-to-End Speech-to-Text Translation
AdaST: Dynamically Adapting Encoder States in the Decoder for End-to-End Speech-to-Text Translation
Wuwei Huang
Dexin Wang
Deyi Xiong
72
4
0
18 Mar 2025
Speech to Speech Translation with Translatotron: A State of the Art Review
Speech to Speech Translation with Translatotron: A State of the Art Review
Jules R. Kala
Emmanuel Adetiba
Abdultaofeek Abayom
Oluwatobi E. Dare
Ayodele H. Ifijeh
153
0
0
21 Feb 2025
High-Fidelity Simultaneous Speech-To-Speech Translation
High-Fidelity Simultaneous Speech-To-Speech Translation
Tom Labiausse
Laurent Mazaré
Edouard Grave
P. Pérez
Alexandre Défossez
Neil Zeghidour
163
0
0
05 Feb 2025
Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison
Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison
Tsz Kin Lam
Marco Gaido
Sara Papi
L. Bentivogli
Barry Haddow
31
0
0
04 Jan 2025
CoSTA: Code-Switched Speech Translation using Aligned Speech-Text
  Interleaving
CoSTA: Code-Switched Speech Translation using Aligned Speech-Text Interleaving
Bhavani Shankar
P. Jyothi
Pushpak Bhattacharyya
40
1
0
16 Jun 2024
End-to-End Single-Channel Speaker-Turn Aware Conversational Speech
  Translation
End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation
Juan Pablo Zuluaga
Zhaocheng Huang
Xing Niu
Rohit Paturi
S. Srinivasan
Prashant Mathur
Brian Thompson
Marcello Federico
BDL
25
2
0
01 Nov 2023
Direct Models for Simultaneous Translation and Automatic Subtitling:
  FBK@IWSLT2023
Direct Models for Simultaneous Translation and Automatic Subtitling: FBK@IWSLT2023
Sara Papi
Marco Gaido
Matteo Negri
41
7
0
27 Sep 2023
End-to-End Simultaneous Speech Translation with Differentiable
  Segmentation
End-to-End Simultaneous Speech Translation with Differentiable Segmentation
Shaolei Zhang
Yang Feng
13
17
0
25 May 2023
Improving speech translation by fusing speech and text
Improving speech translation by fusing speech and text
Wenbiao Yin
Zhicheng Liu
Chengqi Zhao
Tao Wang
Jian-Fei Tong
Rong Ye
15
4
0
23 May 2023
DropDim: A Regularization Method for Transformer Networks
DropDim: A Regularization Method for Transformer Networks
Hao Zhang
Dan Qu
Kejia Shao
Xu Yang
20
12
0
20 Apr 2023
Improving Speech Translation by Cross-Modal Multi-Grained Contrastive
  Learning
Improving Speech Translation by Cross-Modal Multi-Grained Contrastive Learning
Hao Zhang
Nianwen Si
Yaqi Chen
Wenlin Zhang
Xukui Yang
Dan Qu
Weiqiang Zhang
35
9
0
20 Apr 2023
Transformers in Speech Processing: A Survey
Transformers in Speech Processing: A Survey
S. Latif
Aun Zaidi
Heriberto Cuayáhuitl
Fahad Shamshad
Moazzam Shoukat
Junaid Qadir
42
47
0
21 Mar 2023
SegAugment: Maximizing the Utility of Speech Translation Data with
  Segmentation-based Augmentations
SegAugment: Maximizing the Utility of Speech Translation Data with Segmentation-based Augmentations
Ioannis Tsiamas
José A. R. Fonollosa
Marta R. Costa-jussá
38
6
0
19 Dec 2022
WACO: Word-Aligned Contrastive Learning for Speech Translation
WACO: Word-Aligned Contrastive Learning for Speech Translation
Siqi Ouyang
Rong Ye
Lei Li
24
25
0
19 Dec 2022
Improving End-to-end Speech Translation by Leveraging Auxiliary Speech
  and Text Data
Improving End-to-end Speech Translation by Leveraging Auxiliary Speech and Text Data
Yuhao Zhang
Chen Xu
Bojie Hu
Chunliang Zhang
Tong Xiao
Jingbo Zhu
16
15
0
04 Dec 2022
Align, Write, Re-order: Explainable End-to-End Speech Translation via
  Operation Sequence Generation
Align, Write, Re-order: Explainable End-to-End Speech Translation via Operation Sequence Generation
Motoi Omachi
Brian Yan
Siddharth Dalmia
Yuya Fujita
Shinji Watanabe
LRM
25
3
0
11 Nov 2022
Direct Speech Translation for Automatic Subtitling
Direct Speech Translation for Automatic Subtitling
Sara Papi
Marco Gaido
Alina Karakanta
Mauro Cettolo
Matteo Negri
Marco Turchi
46
11
0
27 Sep 2022
Non-Parametric Domain Adaptation for End-to-End Speech Translation
Non-Parametric Domain Adaptation for End-to-End Speech Translation
Yichao Du
Weizhi Wang
Zhirui Zhang
Boxing Chen
Tong Bill Xu
Jun Xie
Enhong Chen
51
18
0
23 May 2022
Who Are We Talking About? Handling Person Names in Speech Translation
Who Are We Talking About? Handling Person Names in Speech Translation
Marco Gaido
Matteo Negri
Marco Turchi
20
7
0
13 May 2022
Large-Scale Streaming End-to-End Speech Translation with Neural
  Transducers
Large-Scale Streaming End-to-End Speech Translation with Neural Transducers
Jian Xue
Peidong Wang
Jinyu Li
Matt Post
Yashesh Gaur
AI4TS
19
26
0
11 Apr 2022
Enhanced Direct Speech-to-Speech Translation Using Self-supervised
  Pre-training and Data Augmentation
Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation
Sravya Popuri
Peng-Jen Chen
Changhan Wang
J. Pino
Yossi Adi
Jiatao Gu
Wei-Ning Hsu
Ann Lee
20
56
0
06 Apr 2022
Leveraging unsupervised and weakly-supervised data to improve direct
  speech-to-speech translation
Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation
Ye Jia
Yifan Ding
Ankur Bapna
Colin Cherry
Yu Zhang
Alexis Conneau
Nobuyuki Morioka
39
20
0
24 Mar 2022
STEMM: Self-learning with Speech-text Manifold Mixup for Speech
  Translation
STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation
Qingkai Fang
Rong Ye
Lei Li
Yang Feng
Mingxuan Wang
22
95
0
20 Mar 2022
Keyword localisation in untranscribed speech using visually grounded
  speech models
Keyword localisation in untranscribed speech using visually grounded speech models
Kayode Olaleye
Dan Oneaţă
Herman Kamper
19
7
0
02 Feb 2022
Visualization: the missing factor in Simultaneous Speech Translation
Visualization: the missing factor in Simultaneous Speech Translation
Sara Papi
Matteo Negri
Marco Turchi
14
2
0
31 Oct 2021
Machine Translation Verbosity Control for Automatic Dubbing
Machine Translation Verbosity Control for Automatic Dubbing
Surafel Melaku Lakew
Marcello Federico
Yue Wang
Cuong Hoang
Yogesh Virkar
Roberto Barra-Chicote
Robert Enyedi
11
21
0
08 Oct 2021
Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with
  Non-Autoregressive Hidden Intermediates
Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates
H. Inaguma
Siddharth Dalmia
Brian Yan
Shinji Watanabe
57
11
0
27 Sep 2021
Cross-modal Spectrum Transformation Network For Acoustic Scene
  classification
Cross-modal Spectrum Transformation Network For Acoustic Scene classification
Yang Liu
A. Neophytou
Sunando Sengupta
Eric Sommerlade
19
9
0
13 Aug 2021
Simultaneous Speech Translation for Live Subtitling: from Delay to
  Display
Simultaneous Speech Translation for Live Subtitling: from Delay to Display
Alina Karakanta
Sara Papi
Matteo Negri
Marco Turchi
20
10
0
19 Jul 2021
Translatotron 2: High-quality direct speech-to-speech translation with
  voice preservation
Translatotron 2: High-quality direct speech-to-speech translation with voice preservation
Ye Jia
Michelle Tadmor Ramanovich
Tal Remez
Roi Pomerantz
26
67
0
19 Jul 2021
Between Flexibility and Consistency: Joint Generation of Captions and
  Subtitles
Between Flexibility and Consistency: Joint Generation of Captions and Subtitles
Alina Karakanta
Marco Gaido
Matteo Negri
Marco Turchi
19
9
0
13 Jul 2021
The NiuTrans End-to-End Speech Translation System for IWSLT 2021 Offline
  Task
The NiuTrans End-to-End Speech Translation System for IWSLT 2021 Offline Task
Chen Xu
Xiaoqian Liu
Xiaowen Liu
Laohu Wang
Canan Huang
Tong Xiao
Jingbo Zhu
29
5
0
06 Jul 2021
Dealing with training and test segmentation mismatch: FBK@IWSLT2021
Dealing with training and test segmentation mismatch: FBK@IWSLT2021
Sara Papi
Marco Gaido
Matteo Negri
Marco Turchi
31
6
0
23 Jun 2021
Stacked Acoustic-and-Textual Encoding: Integrating the Pre-trained
  Models into Speech Translation Encoders
Stacked Acoustic-and-Textual Encoding: Integrating the Pre-trained Models into Speech Translation Encoders
Chen Xu
Bojie Hu
Yanyang Li
Yuhao Zhang
Shen Huang
Qi Ju
Tong Xiao
Jingbo Zhu
17
75
0
12 May 2021
Searchable Hidden Intermediates for End-to-End Models of Decomposable
  Sequence Tasks
Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks
Siddharth Dalmia
Brian Yan
Vikas Raunak
Florian Metze
Shinji Watanabe
37
30
0
02 May 2021
Large-Scale Self- and Semi-Supervised Learning for Speech Translation
Large-Scale Self- and Semi-Supervised Learning for Speech Translation
Changhan Wang
Anne Wu
J. Pino
Alexei Baevski
Michael Auli
Alexis Conneau
SSL
31
44
0
14 Apr 2021
NeurST: Neural Speech Translation Toolkit
NeurST: Neural Speech Translation Toolkit
Chengqi Zhao
Mingxuan Wang
Qianqian Dong
Rong Ye
Lei Li
22
32
0
18 Dec 2020
Dual-decoder Transformer for Joint Automatic Speech Recognition and
  Multilingual Speech Translation
Dual-decoder Transformer for Joint Automatic Speech Recognition and Multilingual Speech Translation
Hang Le
J. Pino
Changhan Wang
Jiatao Gu
D. Schwab
Laurent Besacier
39
82
0
02 Nov 2020
Evaluating Gender Bias in Speech Translation
Evaluating Gender Bias in Speech Translation
Marta R. Costa-jussá
Christine Basta
Gerard I. Gállego
18
21
0
27 Oct 2020
Multilingual Speech Translation with Efficient Finetuning of Pretrained
  Models
Multilingual Speech Translation with Efficient Finetuning of Pretrained Models
Xian Li
Changhan Wang
Yun Tang
C. Tran
Yuqing Tang
J. Pino
Alexei Baevski
Alexis Conneau
Michael Auli
19
6
0
24 Oct 2020
A Technical Report: BUT Speech Translation Systems
A Technical Report: BUT Speech Translation Systems
Hari Krishna Vydana
L. Burget
J. Černocký
22
0
0
22 Oct 2020
A General Multi-Task Learning Framework to Leverage Text Data for Speech
  to Text Tasks
A General Multi-Task Learning Framework to Leverage Text Data for Speech to Text Tasks
Yun Tang
J. Pino
Changhan Wang
Xutai Ma
Dmitriy Genzel
18
73
0
21 Oct 2020
On Target Segmentation for Direct Speech Translation
On Target Segmentation for Direct Speech Translation
Mattia Antonino Di Gangi
Marco Gaido
Matteo Negri
Marco Turchi
31
14
0
10 Sep 2020
Contextualized Translation of Automatically Segmented Speech
Contextualized Translation of Automatically Segmented Speech
Marco Gaido
Mattia Antonino Di Gangi
Matteo Negri
Mauro Cettolo
Marco Turchi
23
18
0
05 Aug 2020
Consistent Transcription and Translation of Speech
Consistent Transcription and Translation of Speech
Matthias Sperber
Hendra Setiawan
Christian Gollan
Udhyakumar Nallasamy
Matthias Paulik
21
18
0
24 Jul 2020
Self-Supervised Representations Improve End-to-End Speech Translation
Self-Supervised Representations Improve End-to-End Speech Translation
Anne Wu
Changhan Wang
J. Pino
Jiatao Gu
SSL
17
40
0
22 Jun 2020
End-to-End Speech-Translation with Knowledge Distillation: FBK@IWSLT2020
End-to-End Speech-Translation with Knowledge Distillation: FBK@IWSLT2020
Marco Gaido
Mattia Antonino Di Gangi
Matteo Negri
Marco Turchi
14
53
0
04 Jun 2020
Self-Training for End-to-End Speech Translation
Self-Training for End-to-End Speech Translation
J. Pino
Qiantong Xu
Xutai Ma
M. Dousti
Yun Tang
33
59
0
03 Jun 2020
Unmet Needs and Opportunities for Mobile Translation AI
Unmet Needs and Opportunities for Mobile Translation AI
Susanne Putze
Michael Bonfert
Pitt Michelmann
Sebastian Höffner
Dirk Wenig
Rainer Malaka
Jan David Smeddinck
14
80
0
27 Feb 2020
SkinAugment: Auto-Encoding Speaker Conversions for Automatic Speech
  Translation
SkinAugment: Auto-Encoding Speaker Conversions for Automatic Speech Translation
Arya D. McCarthy
Liezl Puzon
J. Pino
23
24
0
27 Feb 2020
12
Next