Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation
Alessandro Raganato, Yves Scherrer, Jörg Tiedemann
arXiv:2002.10260, 24 February 2020

Papers citing "Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation" (16 of 16 papers shown)

Only Send What You Need: Learning to Communicate Efficiently in Federated Multilingual Machine Translation
Yun-Wei Chu, Dong-Jun Han, Christopher G. Brinton
15 Jan 2024 (4 citations)

Attention-Guided Adaptation for Code-Switching Speech Recognition
Bobbi Aditya, Mahdin Rohmatillah, Liang-Hsuan Tai, Jen-Tzung Chien
14 Dec 2023 (8 citations)

Fast-FNet: Accelerating Transformer Encoder Models via Efficient Fourier Layers
Nurullah Sevim, Ege Ozan Özyedek, Furkan Şahinuç, Aykut Koç
26 Sep 2022 (11 citations)

ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
Z. Yao, Reza Yazdani Aminabadi, Minjia Zhang, Xiaoxia Wu, Conglong Li, Yuxiong He
04 Jun 2022 (442 citations; tags: VLM, MQ)

Sparse Mixers: Combining MoE and Mixing to build a more efficient BERT
James Lee-Thorp, Joshua Ainslie
24 May 2022 (11 citations; tags: MoE)

Attention Mechanism with Energy-Friendly Operations
Boyi Deng, Baosong Yang, Dayiheng Liu, Rong Xiao, Derek F. Wong, Haibo Zhang, Boxing Chen, Lidia S. Chao
28 Apr 2022 (1 citation; tags: MU)

Paying More Attention to Self-attention: Improving Pre-trained Language Models via Attention Guiding
Shanshan Wang, Zhumin Chen, Z. Ren, Huasheng Liang, Qiang Yan, Pengjie Ren
06 Apr 2022 (9 citations)

ETSformer: Exponential Smoothing Transformers for Time-series Forecasting
Gerald Woo, Chenghao Liu, Doyen Sahoo, Akshat Kumar, S. Hoi
03 Feb 2022 (161 citations; tags: AI4TS)

FNet: Mixing Tokens with Fourier Transforms
James Lee-Thorp, Joshua Ainslie, Ilya Eckstein, Santiago Ontanon
09 May 2021 (517 citations)

Transformer-based End-to-End Speech Recognition with Local Dense Synthesizer Attention
Menglong Xu, Shengqiang Li, Xiao-Lei Zhang
23 Oct 2020 (31 citations)

Direct Feedback Alignment Scales to Modern Deep Learning Tasks and Architectures
Julien Launay, Iacopo Poli, François Boniface, Florent Krzakala
23 Jun 2020 (62 citations)

Input-independent Attention Weights Are Expressive Enough: A Study of Attention in Self-supervised Audio Transformers
Tsung-Han Wu, Chun-Chen Hsieh, Yen-Hao Chen, Po-Han Chi, Hung-yi Lee
09 Jun 2020 (1 citation)

GOBO: Quantizing Attention-Based NLP Models for Low Latency and Energy Efficient Inference
Ali Hadi Zadeh, Isak Edo, Omar Mohamed Awad, Andreas Moshovos
08 May 2020 (183 citations; tags: MQ)

The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives
Elena Voita, Rico Sennrich, Ivan Titov
03 Sep 2019 (181 citations)

OpenNMT: Open-Source Toolkit for Neural Machine Translation
Guillaume Klein, Yoon Kim, Yuntian Deng, Jean Senellart, Alexander M. Rush
10 Jan 2017 (1,896 citations)

Effective Approaches to Attention-based Neural Machine Translation
Thang Luong, Hieu H. Pham, Christopher D. Manning
17 Aug 2015 (7,926 citations)