1704.00784
Cited By
Online and Linear-Time Attention by Enforcing Monotonic Alignments
3 April 2017
Colin Raffel
Minh-Thang Luong
Peter J. Liu
Ron J. Weiss
Douglas Eck
Papers citing
"Online and Linear-Time Attention by Enforcing Monotonic Alignments"
50 / 55 papers shown
Spatial Speech Translation: Translating Across Space With Binaural Hearables
Tuochao Chen
Qirui Wang
Runlin He
Shyam Gollakota
31
0
0
25 Apr 2025
Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers
Adam Stooke
Rohit Prabhavalkar
K. Sim
P. M. Mengibar
39
0
0
06 Feb 2025
StyleSSP: Sampling StartPoint Enhancement for Training-free Diffusion-based Method for Style Transfer
Ruojun Xu
Weijie Xi
Xiaodi Wang
Yongbo Mao
Zach Cheng
DiffM
39
1
0
20 Jan 2025
LLaMA-Omni: Seamless Speech Interaction with Large Language Models
Qingkai Fang
Shoutao Guo
Yan Zhou
Zhengrui Ma
Shaolei Zhang
Yang Feng
AuLLM
33
32
0
10 Sep 2024
A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Any Translation
Zhengrui Ma
Qingkai Fang
Shaolei Zhang
Shoutao Guo
Yang Feng
Min Zhang
53
9
0
11 Jun 2024
DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations
Tianhao Qi
Shancheng Fang
Yanze Wu
Hongtao Xie
Jiawei Liu
Lang Chen
Qian He
Yongdong Zhang
DiffM
25
32
0
11 Mar 2024
From Lengthy to Lucid: A Systematic Literature Review on NLP Techniques for Taming Long Sentences
Tatiana Passali
Efstathios Chatzikyriakidis
Stelios Andreadis
Thanos G. Stavropoulos
Anastasia Matonaki
A. Fachantidis
Grigorios Tsoumakas
24
1
0
08 Dec 2023
Non-autoregressive Streaming Transformer for Simultaneous Translation
Zhengrui Ma
Shaolei Zhang
Shoutao Guo
Chenze Shao
Min Zhang
Yang Feng
32
13
0
23 Oct 2023
End-to-End Simultaneous Speech Translation with Differentiable Segmentation
Shaolei Zhang
Yang Feng
23
17
0
25 May 2023
TAPIR: Learning Adaptive Revision for Incremental Natural Language Understanding with a Two-Pass Model
Patrick Kahardipraja
Brielen Madureira
David Schlangen
CLL
34
9
0
18 May 2023
ELODIN: Naming Concepts in Embedding Spaces
Rodrigo Mello
Filipe Calegario
Geber Ramalho
DiffM
28
1
0
07 Mar 2023
Monotonic segmental attention for automatic speech recognition
Albert Zeyer
Robin Schmitt
Wei Zhou
Ralf Schluter
Hermann Ney
16
8
0
26 Oct 2022
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
...
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
VLM
90
5,797
0
23 May 2022
Inducing and Using Alignments for Transition-based AMR Parsing
Andrew Drozdov
Jiawei Zhou
Radu Florian
Andrew McCallum
Tahira Naseem
Yoon Kim
Ramón Fernández Astudillo
39
27
0
03 May 2022
Large-Scale Streaming End-to-End Speech Translation with Neural Transducers
Jian Xue
Peidong Wang
Jinyu Li
Matt Post
Yashesh Gaur
AI4TS
32
26
0
11 Apr 2022
End to End Lip Synchronization with a Temporal AutoEncoder
Yoav Shalev
Lior Wolf
16
7
0
30 Mar 2022
Exploring Continuous Integrate-and-Fire for Adaptive Simultaneous Speech Translation
Chih-Chiang Chang
Hung-yi Lee
27
13
0
22 Mar 2022
Gaussian Multi-head Attention for Simultaneous Machine Translation
Shaolei Zhang
Yang Feng
21
22
0
17 Mar 2022
Decision Attentive Regularization to Improve Simultaneous Speech Translation Systems
Mohd Abbas Zaidi
Beomseok Lee
Sangha Kim
Chanwoo Kim
24
5
0
13 Oct 2021
Translating Images into Maps
Avishkar Saha
Oscar Alejandro Mendez Maldonado
Chris Russell
Richard Bowden
ViT
21
144
0
03 Oct 2021
Universal Simultaneous Machine Translation with Mixture-of-Experts Wait-k Policy
Shaolei Zhang
Yang Feng
MoE
30
39
0
11 Sep 2021
Infusing Future Information into Monotonic Attention Through Language Models
Mohd Abbas Zaidi
S. Indurthi
Beomseok Lee
Nikhil Kumar Lakumarapu
Sangha Kim
27
2
0
07 Sep 2021
Sequence-to-Sequence Learning with Latent Neural Grammars
Yoon Kim
33
40
0
02 Sep 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
18
352
0
29 Jun 2021
Multi-mode Transformer Transducer with Stochastic Future Context
Kwangyoun Kim
Felix Wu
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
30
9
0
17 Jun 2021
Review of end-to-end speech synthesis technology based on deep learning
Zhaoxi Mu
Xinyu Yang
Yizhuo Dong
AuLLM
ALM
26
24
0
20 Apr 2021
A study of latent monotonic attention variants
Albert Zeyer
Ralf Schluter
Hermann Ney
24
5
0
30 Mar 2021
A Survey on Deep Reinforcement Learning for Audio-Based Applications
S. Latif
Heriberto Cuayáhuitl
Farrukh Pervez
Fahad Shamshad
Hafiz Shehbaz Ali
Min Zhang
OffRL
47
73
0
01 Jan 2021
Learning to Rationalize for Nonmonotonic Reasoning with Distant Supervision
Faeze Brahman
Vered Shwartz
Rachel Rudinger
Yejin Choi
LRM
15
42
0
14 Dec 2020
A Better and Faster End-to-End Model for Streaming ASR
Bo-wen Li
Anmol Gulati
Jiahui Yu
Tara N. Sainath
Chung-Cheng Chiu
...
Wei Han
Qiao Liang
Yu Zhang
Trevor Strohman
Yonghui Wu
AuLLM
25
123
0
21 Nov 2020
SimulMT to SimulST: Adapting Simultaneous Text Translation to End-to-End Simultaneous Speech Translation
Xutai Ma
J. Pino
Philipp Koehn
17
94
0
03 Nov 2020
Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling
Jiahui Yu
Wei Han
Anmol Gulati
Chung-Cheng Chiu
Bo-wen Li
Tara N. Sainath
Yonghui Wu
Ruoming Pang
30
18
0
12 Oct 2020
Controllable neural text-to-speech synthesis using intuitive prosodic features
T. Raitio
Ramya Rasipuram
D. Castellani
34
66
0
14 Sep 2020
Class LM and word mapping for contextual biasing in End-to-End ASR
Rongqing Huang
Ossama Abdel-Hamid
Xinwei Li
G. Evermann
25
47
0
10 Jul 2020
A Comparison of Label-Synchronous and Frame-Synchronous End-to-End Models for Speech Recognition
Linhao Dong
Cheng Yi
Jianzong Wang
Shiyu Zhou
Shuang Xu
X. Jia
Bo Xu
36
17
0
20 May 2020
Efficient Wait-k Models for Simultaneous Machine Translation
Maha Elbayad
Laurent Besacier
Jakob Verbeek
VLM
24
77
0
18 May 2020
Minimum Latency Training Strategies for Streaming Sequence-to-Sequence ASR
Hirofumi Inaguma
Yashesh Gaur
Liang Lu
Jinyu Li
Jiawei Liu
AI4TS
27
46
0
10 Apr 2020
Neural Machine Translation: A Review and Survey
Felix Stahlberg
3DV
AI4TS
MedIm
20
312
0
04 Dec 2019
Understanding and Improving Layer Normalization
Jingjing Xu
Xu Sun
Zhiyuan Zhang
Guangxiang Zhao
Junyang Lin
FAtt
32
342
0
16 Nov 2019
Teacher-Student Training for Robust Tacotron-based TTS
Rui Liu
Berrak Sisman
Jingdong Li
F. Bao
Guanglai Gao
Haizhou Li
19
38
0
07 Nov 2019
A comparison of end-to-end models for long-form speech recognition
Chung-Cheng Chiu
Wei Han
Yu Zhang
Ruoming Pang
S. Kishchenko
...
Anjuli Kannan
Rohit Prabhavalkar
Z. Chen
Tara N. Sainath
Yonghui Wu
AuLLM
16
82
0
06 Nov 2019
Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis
Eric Battenberg
RJ Skerry-Ryan
Soroosh Mariooryad
Daisy Stanton
David Kao
Matt Shannon
Tom Bagby
33
113
0
23 Oct 2019
Monotonic Multihead Attention
Xutai Ma
J. Pino
James Cross
Liezl Puzon
Jiatao Gu
25
137
0
26 Sep 2019
Sequence to Sequence Neural Speech Synthesis with Prosody Modification Capabilities
Slava Shechtman
A. Sorin
14
33
0
23 Sep 2019
Initial investigation of an encoder-decoder end-to-end TTS framework using marginalization of monotonic hard latent alignments
Yusuke Yasuda
Xin Wang
Junichi Yamagishi
21
8
0
30 Aug 2019
Monotonic Infinite Lookback Attention for Simultaneous Machine Translation
N. Arivazhagan
Colin Cherry
Wolfgang Macherey
Chung-Cheng Chiu
Semih Yavuz
Ruoming Pang
Wei Li
Colin Raffel
CLL
11
190
0
12 Jun 2019
Evaluating Sequence-to-Sequence Models for Handwritten Text Recognition
Johannes Michael
R. Labahn
Tobias Grüning
Jochen Zöllner
21
112
0
18 Mar 2019
AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms
Kou Tanaka
Hirokazu Kameoka
Takuhiro Kaneko
Nobukatsu Hojo
17
111
0
09 Nov 2018
You May Not Need Attention
Ofir Press
Noah A. Smith
14
27
0
31 Oct 2018
Extending Recurrent Neural Aligner for Streaming End-to-End Speech Recognition in Mandarin
Linhao Dong
Shiyu Zhou
Wei Chen
Bo Xu
24
22
0
17 Jun 2018