Online and Linear-Time Attention by Enforcing Monotonic Alignments


3 April 2017
Colin Raffel
Minh-Thang Luong
Peter J. Liu
Ron J. Weiss
Douglas Eck

Papers citing "Online and Linear-Time Attention by Enforcing Monotonic Alignments"

50 / 55 papers shown
Spatial Speech Translation: Translating Across Space With Binaural Hearables
Tuochao Chen
Qirui Wang
Runlin He
Shyam Gollakota
31
0
0
25 Apr 2025
Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers
Adam Stooke
Rohit Prabhavalkar
K. Sim
P. M. Mengibar
39
0
0
06 Feb 2025
StyleSSP: Sampling StartPoint Enhancement for Training-free Diffusion-based Method for Style Transfer
Ruojun Xu
Weijie Xi
Xiaodi Wang
Yongbo Mao
Zach Cheng
DiffM
39
1
0
20 Jan 2025
LLaMA-Omni: Seamless Speech Interaction with Large Language Models
Qingkai Fang
Shoutao Guo
Yan Zhou
Zhengrui Ma
Shaolei Zhang
Yang Feng
AuLLM
33
32
0
10 Sep 2024
A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Any Translation
Zhengrui Ma
Qingkai Fang
Shaolei Zhang
Shoutao Guo
Yang Feng
Min Zhang
53
9
0
11 Jun 2024
DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations
Tianhao Qi
Shancheng Fang
Yanze Wu
Hongtao Xie
Jiawei Liu
Lang Chen
Qian He
Yongdong Zhang
DiffM
25
32
0
11 Mar 2024
From Lengthy to Lucid: A Systematic Literature Review on NLP Techniques for Taming Long Sentences
Tatiana Passali
Efstathios Chatzikyriakidis
Stelios Andreadis
Thanos G. Stavropoulos
Anastasia Matonaki
A. Fachantidis
Grigorios Tsoumakas
24
1
0
08 Dec 2023
Non-autoregressive Streaming Transformer for Simultaneous Translation
Zhengrui Ma
Shaolei Zhang
Shoutao Guo
Chenze Shao
Min Zhang
Yang Feng
32
13
0
23 Oct 2023
End-to-End Simultaneous Speech Translation with Differentiable Segmentation
Shaolei Zhang
Yang Feng
23
17
0
25 May 2023
TAPIR: Learning Adaptive Revision for Incremental Natural Language Understanding with a Two-Pass Model
Patrick Kahardipraja
Brielen Madureira
David Schlangen
CLL
34
9
0
18 May 2023
ELODIN: Naming Concepts in Embedding Spaces
Rodrigo Mello
Filipe Calegario
Geber Ramalho
DiffM
28
1
0
07 Mar 2023
Monotonic segmental attention for automatic speech recognition
Albert Zeyer
Robin Schmitt
Wei Zhou
Ralf Schlüter
Hermann Ney
16
8
0
26 Oct 2022
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
...
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
VLM
90
5,797
0
23 May 2022
Inducing and Using Alignments for Transition-based AMR Parsing
Andrew Drozdov
Jiawei Zhou
Radu Florian
Andrew McCallum
Tahira Naseem
Yoon Kim
Ramón Fernández Astudillo
39
27
0
03 May 2022
Large-Scale Streaming End-to-End Speech Translation with Neural Transducers
Jian Xue
Peidong Wang
Jinyu Li
Matt Post
Yashesh Gaur
AI4TS
32
26
0
11 Apr 2022
End to End Lip Synchronization with a Temporal AutoEncoder
Yoav Shalev
Lior Wolf
16
7
0
30 Mar 2022
Exploring Continuous Integrate-and-Fire for Adaptive Simultaneous Speech Translation
Chih-Chiang Chang
Hung-yi Lee
27
13
0
22 Mar 2022
Gaussian Multi-head Attention for Simultaneous Machine Translation
Shaolei Zhang
Yang Feng
21
22
0
17 Mar 2022
Decision Attentive Regularization to Improve Simultaneous Speech Translation Systems
Mohd Abbas Zaidi
Beomseok Lee
Sangha Kim
Chanwoo Kim
24
5
0
13 Oct 2021
Translating Images into Maps
Avishkar Saha
Oscar Alejandro Mendez Maldonado
Chris Russell
Richard Bowden
ViT
21
144
0
03 Oct 2021
Universal Simultaneous Machine Translation with Mixture-of-Experts Wait-k Policy
Shaolei Zhang
Yang Feng
MoE
30
39
0
11 Sep 2021
Infusing Future Information into Monotonic Attention Through Language Models
Mohd Abbas Zaidi
S. Indurthi
Beomseok Lee
Nikhil Kumar Lakumarapu
Sangha Kim
27
2
0
07 Sep 2021
Sequence-to-Sequence Learning with Latent Neural Grammars
Yoon Kim
33
40
0
02 Sep 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
18
352
0
29 Jun 2021
Multi-mode Transformer Transducer with Stochastic Future Context
Kwangyoun Kim
Felix Wu
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
30
9
0
17 Jun 2021
Review of end-to-end speech synthesis technology based on deep learning
Zhaoxi Mu
Xinyu Yang
Yizhuo Dong
AuLLM
ALM
26
24
0
20 Apr 2021
A study of latent monotonic attention variants
Albert Zeyer
Ralf Schlüter
Hermann Ney
24
5
0
30 Mar 2021
A Survey on Deep Reinforcement Learning for Audio-Based Applications
S. Latif
Heriberto Cuayáhuitl
Farrukh Pervez
Fahad Shamshad
Hafiz Shehbaz Ali
Min Zhang
OffRL
47
73
0
01 Jan 2021
Learning to Rationalize for Nonmonotonic Reasoning with Distant Supervision
Faeze Brahman
Vered Shwartz
Rachel Rudinger
Yejin Choi
LRM
15
42
0
14 Dec 2020
A Better and Faster End-to-End Model for Streaming ASR
Bo-wen Li
Anmol Gulati
Jiahui Yu
Tara N. Sainath
Chung-Cheng Chiu
...
Wei Han
Qiao Liang
Yu Zhang
Trevor Strohman
Yonghui Wu
AuLLM
25
123
0
21 Nov 2020
SimulMT to SimulST: Adapting Simultaneous Text Translation to End-to-End Simultaneous Speech Translation
Xutai Ma
J. Pino
Philipp Koehn
17
94
0
03 Nov 2020
Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling
Jiahui Yu
Wei Han
Anmol Gulati
Chung-Cheng Chiu
Bo-wen Li
Tara N. Sainath
Yonghui Wu
Ruoming Pang
30
18
0
12 Oct 2020
Controllable neural text-to-speech synthesis using intuitive prosodic features
T. Raitio
Ramya Rasipuram
D. Castellani
34
66
0
14 Sep 2020
Class LM and word mapping for contextual biasing in End-to-End ASR
Rongqing Huang
Ossama Abdel-Hamid
Xinwei Li
G. Evermann
25
47
0
10 Jul 2020
A Comparison of Label-Synchronous and Frame-Synchronous End-to-End Models for Speech Recognition
Linhao Dong
Cheng Yi
Jianzong Wang
Shiyu Zhou
Shuang Xu
X. Jia
Bo Xu
36
17
0
20 May 2020
Efficient Wait-k Models for Simultaneous Machine Translation
Maha Elbayad
Laurent Besacier
Jakob Verbeek
VLM
24
77
0
18 May 2020
Minimum Latency Training Strategies for Streaming Sequence-to-Sequence ASR
Hirofumi Inaguma
Yashesh Gaur
Liang Lu
Jinyu Li
Jiawei Liu
AI4TS
27
46
0
10 Apr 2020
Neural Machine Translation: A Review and Survey
Felix Stahlberg
3DV
AI4TS
MedIm
20
312
0
04 Dec 2019
Understanding and Improving Layer Normalization
Jingjing Xu
Xu Sun
Zhiyuan Zhang
Guangxiang Zhao
Junyang Lin
FAtt
32
342
0
16 Nov 2019
Teacher-Student Training for Robust Tacotron-based TTS
Rui Liu
Berrak Sisman
Jingdong Li
F. Bao
Guanglai Gao
Haizhou Li
19
38
0
07 Nov 2019
A comparison of end-to-end models for long-form speech recognition
Chung-Cheng Chiu
Wei Han
Yu Zhang
Ruoming Pang
S. Kishchenko
...
Anjuli Kannan
Rohit Prabhavalkar
Z. Chen
Tara N. Sainath
Yonghui Wu
AuLLM
16
82
0
06 Nov 2019
Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis
Eric Battenberg
RJ Skerry-Ryan
Soroosh Mariooryad
Daisy Stanton
David Kao
Matt Shannon
Tom Bagby
33
113
0
23 Oct 2019
Monotonic Multihead Attention
Xutai Ma
J. Pino
James Cross
Liezl Puzon
Jiatao Gu
25
137
0
26 Sep 2019
Sequence to Sequence Neural Speech Synthesis with Prosody Modification Capabilities
Slava Shechtman
A. Sorin
14
33
0
23 Sep 2019
Initial investigation of an encoder-decoder end-to-end TTS framework using marginalization of monotonic hard latent alignments
Yusuke Yasuda
Xin Wang
Junichi Yamagishi
21
8
0
30 Aug 2019
Monotonic Infinite Lookback Attention for Simultaneous Machine Translation
N. Arivazhagan
Colin Cherry
Wolfgang Macherey
Chung-Cheng Chiu
Semih Yavuz
Ruoming Pang
Wei Li
Colin Raffel
CLL
11
190
0
12 Jun 2019
Evaluating Sequence-to-Sequence Models for Handwritten Text Recognition
Johannes Michael
R. Labahn
Tobias Grüning
Jochen Zöllner
21
112
0
18 Mar 2019
AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms
Kou Tanaka
Hirokazu Kameoka
Takuhiro Kaneko
Nobukatsu Hojo
17
111
0
09 Nov 2018
You May Not Need Attention
Ofir Press
Noah A. Smith
14
27
0
31 Oct 2018
Extending Recurrent Neural Aligner for Streaming End-to-End Speech Recognition in Mandarin
Linhao Dong
Shiyu Zhou
Wei Chen
Bo Xu
24
22
0
17 Jun 2018