ResearchTrend.AI
Modeling Localness for Self-Attention Networks

24 October 2018
Baosong Yang
Zhaopeng Tu
Derek F. Wong
Fandong Meng
Lidia S. Chao
Tong Zhang
Community: MILM

Papers citing "Modeling Localness for Self-Attention Networks"

41 papers shown.

  1. Finite-context Indexing of Restricted Output Space for NLP Models Facing Noisy Input
     Minh Nguyen, Nancy F. Chen (21 Oct 2023)
  2. CTC-based Non-autoregressive Speech Translation
     Chen Xu, Xiaoqian Liu, Xiaowen Liu, Qingxuan Sun, Yuhao Zhang, ..., Tom Ko, Mingxuan Wang, Tong Xiao, Anxiang Ma, Jingbo Zhu (27 May 2023)
  3. TranSFormer: Slow-Fast Transformer for Machine Translation
     Bei Li, Yi Jing, Xu Tan, Zhen Xing, Tong Xiao, Jingbo Zhu (26 May 2023)
  4. End-to-End Simultaneous Speech Translation with Differentiable Segmentation
     Shaolei Zhang, Yang Feng (25 May 2023)
  5. Recouple Event Field via Probabilistic Bias for Event Extraction
     Xingyu Bai, Taiqiang Wu, Han Guo, Zhe Zhao, Xuefeng Yang, Jiayin Li, Weijie Liu, Qi Ju, Weigang Guo, Yujiu Yang (19 May 2023)
  6. Bird-Eye Transformers for Text Generation Models
     Lei Sha, Yuhang Song, Yordan Yordanov, Tommaso Salvatori, Thomas Lukasiewicz (08 Oct 2022)
  7. Enhancing Pre-trained Models with Text Structure Knowledge for Question Generation
     Zichen Wu, Xin Jia, Fanyi Qu, Yunfang Wu (09 Sep 2022)
  8. Improving Multi-Document Summarization through Referenced Flexible Extraction with Credit-Awareness
     Yun-Zhu Song, Yi-Syuan Chen, Hong-Han Shuai (04 May 2022)
  9. Attention Mechanism with Energy-Friendly Operations
     Boyi Deng, Baosong Yang, Dayiheng Liu, Rong Xiao, Derek F. Wong, Haibo Zhang, Boxing Chen, Lidia S. Chao [MU] (28 Apr 2022)
  10. Attention Mechanism in Neural Networks: Where it Comes and Where it Goes
     Derya Soydaner [3DV] (27 Apr 2022)
  11. Skeletal Graph Self-Attention: Embedding a Skeleton Inductive Bias into Sign Language Production
     Ben Saunders, Necati Cihan Camgöz, Richard Bowden [SLR] (06 Dec 2021)
  12. Modeling Concentrated Cross-Attention for Neural Machine Translation with Gaussian Mixture Model
     Shaolei Zhang, Yang Feng (11 Sep 2021)
  13. Can Transformers Jump Around Right in Natural Language? Assessing Performance Transfer from SCAN
     Rahma Chaabouni, Roberto Dessì, Eugene Kharitonov (03 Jul 2021)
  14. A Survey of Transformers
     Tianyang Lin, Yuxin Wang, Xiangyang Liu, Xipeng Qiu [ViT] (08 Jun 2021)
  15. Stacked Acoustic-and-Textual Encoding: Integrating the Pre-trained Models into Speech Translation Encoders
     Chen Xu, Bojie Hu, Yanyang Li, Yuhao Zhang, Shen Huang, Qi Ju, Tong Xiao, Jingbo Zhu (12 May 2021)
  16. Mask Attention Networks: Rethinking and Strengthen Transformer
     Zhihao Fan, Yeyun Gong, Dayiheng Liu, Zhongyu Wei, Siyuan Wang, Jian Jiao, Nan Duan, Ruofei Zhang, Xuanjing Huang (25 Mar 2021)
  17. Improving BERT with Syntax-aware Local Attention
     Zhongli Li, Qingyu Zhou, Chao Li, Ke Xu, Yunbo Cao (30 Dec 2020)
  18. How Does Selective Mechanism Improve Self-Attention Networks?
     Xinwei Geng, Longyue Wang, Xing Wang, Bing Qin, Ting Liu, Zhaopeng Tu [AAML] (03 May 2020)
  19. Hard-Coded Gaussian Attention for Neural Machine Translation
     Weiqiu You, Simeng Sun, Mohit Iyyer (02 May 2020)
  20. Capsule-Transformer for Neural Machine Translation
     Sufeng Duan, Juncheng Cao, Hai Zhao [MedIm] (30 Apr 2020)
  21. Highway Transformer: Self-Gating Enhanced Self-Attentive Networks
     Yekun Chai, Jin Shuo, Xinwen Hou (17 Apr 2020)
  22. Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation
     Alessandro Raganato, Yves Scherrer, Jörg Tiedemann (24 Feb 2020)
  23. Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection
     Guangxiang Zhao, Junyang Lin, Zhiyuan Zhang, Xuancheng Ren, Qi Su, Xu Sun (25 Dec 2019)
  24. Neural Simile Recognition with Cyclic Multitask Learning and Local Attention
     Jiali Zeng, Linfeng Song, Jinsong Su, Jun Xie, Wei Song, Jiebo Luo (19 Dec 2019)
  25. Multi-Scale Self-Attention for Text Classification
     Qipeng Guo, Xipeng Qiu, Pengfei Liu, Xiangyang Xue, Zheng-Wei Zhang [ViT] (02 Dec 2019)
  26. Two-Headed Monster And Crossed Co-Attention Networks
     Yaoyiran Li, Jing Jiang (10 Nov 2019)
  27. Location Attention for Extrapolation to Longer Sequences
     Yann Dubois, Gautier Dagan, Dieuwke Hupkes, Elia Bruni (10 Nov 2019)
  28. Syntax-Infused Transformer and BERT models for Machine Translation and Natural Language Understanding
     Dhanasekar Sundararaman, Vivek Subramanian, Guoyin Wang, Shijing Si, Dinghan Shen, Dong Wang, Lawrence Carin (10 Nov 2019)
  29. Learning to Copy for Automatic Post-Editing
     Xuancheng Huang, Yang Liu, Huanbo Luan, Jingfang Xu, Maosong Sun (09 Nov 2019)
  30. SesameBERT: Attention for Anywhere
     Ta-Chun Su, Hsiang-Chih Cheng (08 Oct 2019)
  31. Multi-Granularity Self-Attention for Neural Machine Translation
     Jie Hao, Xing Wang, Shuming Shi, Jinfeng Zhang, Zhaopeng Tu [MILM] (05 Sep 2019)
  32. Towards Better Modeling Hierarchical Structure for Self-Attention with Ordered Neurons
     Jie Hao, Xing Wang, Shuming Shi, Jinfeng Zhang, Zhaopeng Tu (04 Sep 2019)
  33. Improving Multi-Head Attention with Capsule Networks
     Shuhao Gu, Yang Feng (31 Aug 2019)
  34. On Identifiability in Transformers
     Gino Brunner, Yang Liu, Damian Pascual, Oliver Richter, Massimiliano Ciaramita, Roger Wattenhofer [ViT] (12 Aug 2019)
  35. Investigating Self-Attention Network for Chinese Word Segmentation
     Leilei Gan, Yue Zhang (26 Jul 2019)
  36. Convolutional Self-Attention Networks
     Baosong Yang, Longyue Wang, Derek F. Wong, Lidia S. Chao, Zhaopeng Tu (05 Apr 2019)
  37. Context-Aware Self-Attention Networks
     Baosong Yang, Jian Li, Derek F. Wong, Lidia S. Chao, Xing Wang, Zhaopeng Tu (15 Feb 2019)
  38. Multi-Head Attention with Disagreement Regularization
     Jian Li, Zhaopeng Tu, Baosong Yang, Michael R. Lyu, Tong Zhang (24 Oct 2018)
  39. Tied Multitask Learning for Neural Speech Translation
     Antonios Anastasopoulos, David Chiang (19 Feb 2018)
  40. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
     Yonghui Wu, M. Schuster, Zhehuai Chen, Quoc V. Le, Mohammad Norouzi, ..., Alex Rudnick, Oriol Vinyals, G. Corrado, Macduff Hughes, J. Dean [AIMat] (26 Sep 2016)
  41. Effective Approaches to Attention-based Neural Machine Translation
     Thang Luong, Hieu H. Pham, Christopher D. Manning (17 Aug 2015)