1704.00784
Cited By
Online and Linear-Time Attention by Enforcing Monotonic Alignments
3 April 2017
Colin Raffel
Minh-Thang Luong
Peter J. Liu
Ron J. Weiss
Douglas Eck
Papers citing "Online and Linear-Time Attention by Enforcing Monotonic Alignments"
50 / 155 papers shown
Title
Spatial Speech Translation: Translating Across Space With Binaural Hearables
Tuochao Chen
Qirui Wang
Runlin He
Shyam Gollakota
75
0
0
25 Apr 2025
Non-Monotonic Attention-based Read/Write Policy Learning for Simultaneous Translation
Zeeshan Ahmed
Frank Seide
Zhe Liu
Rastislav Rabatin
J. Kolár
Niko Moritz
Ruiming Xie
Simone Merello
Christian Fuegen
OffRL
121
0
0
28 Mar 2025
Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers
Adam Stooke
Rohit Prabhavalkar
K. Sim
P. M. Mengibar
194
0
0
06 Feb 2025
StyleSSP: Sampling StartPoint Enhancement for Training-free Diffusion-based Method for Style Transfer
Ruojun Xu
Weijie Xi
Xiaodi Wang
Yongbo Mao
Zach Cheng
DiffM
124
1
0
20 Jan 2025
LLaMA-Omni: Seamless Speech Interaction with Large Language Models
Qingkai Fang
Shoutao Guo
Yan Zhou
Zhengrui Ma
Shaolei Zhang
Yang Feng
AuLLM
121
54
0
10 Sep 2024
Fixed and Adaptive Simultaneous Machine Translation Strategies Using Adapters
Abderrahmane Issam
Yusuf Can Semerci
Jan Scholtes
Gerasimos Spanakis
73
0
0
18 Jul 2024
Navigating the Minefield of MT Beam Search in Cascaded Streaming Speech Translation
Rastislav Rabatin
Frank Seide
Ernie Chang
40
1
0
26 Jun 2024
Speech ReaLLM -- Real-time Streaming Speech Recognition with Multimodal LLMs by Teaching the Flow of Time
Frank Seide
Morrie Doulaty
Yangyang Shi
Yashesh Gaur
Junteng Jia
Chunyang Wu
AuLLM
KELM
79
11
0
13 Jun 2024
VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment
Bing Han
Long Zhou
Shujie Liu
Sanyuan Chen
Lingwei Meng
Yanming Qian
Yanqing Liu
Sheng Zhao
Jinyu Li
Furu Wei
112
24
0
12 Jun 2024
A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Any Translation
Zhengrui Ma
Qingkai Fang
Shaolei Zhang
Shoutao Guo
Yang Feng
Min Zhang
85
11
0
11 Jun 2024
Agent-SiMT: Agent-assisted Simultaneous Machine Translation with Large Language Models
Shoutao Guo
Shaolei Zhang
Zhengrui Ma
Min Zhang
Yang Feng
LLMAG
104
1
0
11 Jun 2024
DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations
Tianhao Qi
Shancheng Fang
Yanze Wu
Hongtao Xie
Jiawei Liu
Lang Chen
Qian He
Yongdong Zhang
DiffM
79
43
0
11 Mar 2024
TransLLaMa: LLM-based Simultaneous Translation System
Roman Koshkin
Katsuhito Sudoh
Satoshi Nakamura
55
26
0
07 Feb 2024
DreamTuner: Single Image is Enough for Subject-Driven Generation
Miao Hua
Jiawei Liu
Fei Ding
Wei Liu
Jie Wu
Qian He
72
31
0
21 Dec 2023
From Lengthy to Lucid: A Systematic Literature Review on NLP Techniques for Taming Long Sentences
Tatiana Passali
Efstathios Chatzikyriakidis
Stelios Andreadis
Thanos G. Stavropoulos
Anastasia Matonaki
A. Fachantidis
Grigorios Tsoumakas
65
1
0
08 Dec 2023
Efficient Monotonic Multihead Attention
Xutai Ma
Anna Y. Sun
Siqi Ouyang
Hirofumi Inaguma
Paden Tomasello
75
4
0
07 Dec 2023
Bigger is not Always Better: The Effect of Context Size on Speech Pre-Training
Sean Robertson
Ewan Dunbar
SSL
71
1
0
03 Dec 2023
Unified Segment-to-Segment Framework for Simultaneous Sequence Generation
Shaolei Zhang
Yang Feng
91
8
0
27 Oct 2023
Enhanced Simultaneous Machine Translation with Word-level Policies
Kang Kim
Hankyu Cho
104
3
0
25 Oct 2023
Non-autoregressive Streaming Transformer for Simultaneous Translation
Zhengrui Ma
Shaolei Zhang
Shoutao Guo
Chenze Shao
Min Zhang
Yang Feng
78
16
0
23 Oct 2023
Simultaneous Machine Translation with Large Language Models
Minghan Wang
Jinming Zhao
Thuy-Trang Vu
Fatemeh Shiri
Ehsan Shareghi
Gholamreza Haffari
124
5
0
13 Sep 2023
RSDiff: Remote Sensing Image Generation from Text Using Diffusion Model
A. Sebaq
Mohamed ElHelw
DiffM
139
26
0
03 Sep 2023
Rhythm-controllable Attention with High Robustness for Long Sentence Speech Synthesis
Dengfeng Ke
Yayue Deng
Yukang Jia
Jinlong Xue
Qi Luo
Ya Li
Jianqing Sun
Jiaen Liang
Binghuai Lin
41
0
0
05 Jun 2023
Building Accurate Low Latency ASR for Streaming Voice Search
Abhinav Goyal
Nikesh Garera
35
1
0
29 May 2023
End-to-End Simultaneous Speech Translation with Differentiable Segmentation
Shaolei Zhang
Yang Feng
73
18
0
25 May 2023
TAPIR: Learning Adaptive Revision for Incremental Natural Language Understanding with a Two-Pass Model
Patrick Kahardipraja
Brielen Madureira
David Schlangen
CLL
90
10
0
18 May 2023
Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks
Yun Tang
Anna Y. Sun
Hirofumi Inaguma
Xinyue Chen
Ning Dong
Xutai Ma
Paden Tomasello
J. Pino
108
22
0
04 May 2023
ELODIN: Naming Concepts in Embedding Spaces
Rodrigo Mello
Filipe Calegario
Geber Ramalho
DiffM
139
1
0
07 Mar 2023
End-to-End Speech Recognition: A Survey
Rohit Prabhavalkar
Takaaki Hori
Tara N. Sainath
Ralf Schluter
Shinji Watanabe
VLM
94
172
0
03 Mar 2023
Hidden Markov Transformer for Simultaneous Machine Translation
Shaolei Zhang
Yang Feng
80
27
0
01 Mar 2023
Efficient Encoders for Streaming Sequence Tagging
Ayush Kaushal
Aditya Gupta
Shyam Upadhyay
Manaal Faruqui
69
4
0
23 Jan 2023
Attention as a Guide for Simultaneous Speech Translation
Sara Papi
Matteo Negri
Marco Turchi
93
31
0
15 Dec 2022
Monotonic segmental attention for automatic speech recognition
Albert Zeyer
Robin Schmitt
Wei Zhou
Ralf Schluter
Hermann Ney
63
9
0
26 Oct 2022
Information-Transport-based Policy for Simultaneous Translation
Shaolei Zhang
Yang Feng
112
52
0
22 Oct 2022
Compositional Generalisation with Structured Reordering and Fertility Layers
Matthias Lindemann
Alexander Koller
Ivan Titov
CoGe
89
7
0
06 Oct 2022
Adaptive Sparse and Monotonic Attention for Transformer-based Automatic Speech Recognition
Chendong Zhao
Jianzong Wang
Wentao Wei
Xiaoyang Qu
Haoqian Wang
Jing Xiao
81
2
0
30 Sep 2022
Multimodal Speech Emotion Recognition using Cross Attention with Aligned Audio and Text
Yoonhyung Lee
Seunghyun Yoon
Kyomin Jung
140
21
0
26 Jul 2022
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
...
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
VLM
666
6,107
0
23 May 2022
Inducing and Using Alignments for Transition-based AMR Parsing
Andrew Drozdov
Jiawei Zhou
Radu Florian
Andrew McCallum
Tahira Naseem
Yoon Kim
Ramón Fernández Astudillo
78
27
0
03 May 2022
Large-Scale Streaming End-to-End Speech Translation with Neural Transducers
Jian Xue
Peidong Wang
Jinyu Li
Matt Post
Yashesh Gaur
AI4TS
79
31
0
11 Apr 2022
End to End Lip Synchronization with a Temporal AutoEncoder
Yoav Shalev
Lior Wolf
43
7
0
30 Mar 2022
Exploring Continuous Integrate-and-Fire for Adaptive Simultaneous Speech Translation
Chih-Chiang Chang
Hung-yi Lee
90
13
0
22 Mar 2022
Modeling Dual Read/Write Paths for Simultaneous Machine Translation
Shaolei Zhang
Yang Feng
67
27
0
17 Mar 2022
Gaussian Multi-head Attention for Simultaneous Machine Translation
Shaolei Zhang
Yang Feng
69
24
0
17 Mar 2022
Reducing Position Bias in Simultaneous Machine Translation with Length-Aware Framework
Shaolei Zhang
Yang Feng
92
22
0
17 Mar 2022
Anticipation-Free Training for Simultaneous Machine Translation
Chih-Chiang Chang
Shun-Po Chuang
Hung-yi Lee
76
7
0
30 Jan 2022
A comparison of streaming models and data augmentation methods for robust speech recognition
Jiyeon Kim
Mehul Kumar
Dhananjaya N. Gowda
Abhinav Garg
Chanwoo Kim
86
6
0
19 Nov 2021
Recent Advances in End-to-End Automatic Speech Recognition
Jinyu Li
VLM
173
379
0
02 Nov 2021
Simultaneous Neural Machine Translation with Constituent Label Prediction
Yasumasa Kano
Katsuhito Sudoh
Satoshi Nakamura
50
3
0
26 Oct 2021
Direct Simultaneous Speech-to-Speech Translation with Variational Monotonic Multihead Attention
Xutai Ma
Hongyu Gong
Danni Liu
Ann Lee
Yun Tang
Peng-Jen Chen
Wei-Ning Hsu
P. Koehn
J. Pino
99
9
0
15 Oct 2021