Improve Transformer Models with Better Relative Position Embeddings

28 September 2020
Zhiheng Huang
Davis Liang
Peng Xu
Bing Xiang
    ViT

Papers citing "Improve Transformer Models with Better Relative Position Embeddings"

50 / 61 papers shown
Person Recognition at Altitude and Range: Fusion of Face, Body Shape and Gait
Feng Liu
Nicholas Chimitt
Lanqing Guo
Jitesh Jain
Aditya Kane
...
Arun Ross
Humphrey Shi
Zhangyang Wang
A. Jain
Xiaoming Liu
CVBM
32
1
0
07 May 2025
Layer-Specific Scaling of Positional Encodings for Superior Long-Context Modeling
Zhenghua Wang
Yiran Ding
Changze Lv
Zhibo Xu
Tianlong Li
Tianyuan Shi
Xiaoqing Zheng
Xuanjing Huang
48
0
0
06 Mar 2025
UniNet: A Unified Multi-granular Traffic Modeling Framework for Network Security
Binghui Wu
D. Divakaran
M. Gurusamy
57
0
0
06 Mar 2025
The Role of Sparsity for Length Generalization in Transformers
Noah Golowich
Samy Jelassi
David Brandfonbrener
Sham Kakade
Eran Malach
42
0
0
24 Feb 2025
Understanding Knowledge Hijack Mechanism in In-context Learning through Associative Memory
Shuo Wang
Issei Sato
76
0
0
16 Dec 2024
Mitigating Object Hallucination via Concentric Causal Attention
Yun Xing
Yiheng Li
Ivan Laptev
Shijian Lu
53
18
0
21 Oct 2024
EMCNet : Graph-Nets for Electron Micrographs Classification
Sakhinana Sagar Srinivas
Rajat Kumar Sarkar
Venkataramana Runkana
40
0
0
21 Aug 2024
OPDR: Order-Preserving Dimension Reduction for Semantic Embedding of Multimodal Scientific Data
Chengyu Gong
Gefei Shen
Luanzheng Guo
Nathan R. Tallent
Dongfang Zhao
26
1
0
15 Aug 2024
Rethinking Attention Module Design for Point Cloud Analysis
Chengzhi Wu
Kaige Wang
Zeyun Zhong
Hao Fu
Junwei Zheng
Jiaming Zhang
Julius Pfrommer
Jürgen Beyerer
3DPC
51
1
0
27 Jul 2024
A Morphology-Based Investigation of Positional Encodings
Poulami Ghosh
Shikhar Vashishth
Raj Dabre
Pushpak Bhattacharyya
34
1
0
06 Apr 2024
EulerFormer: Sequential User Behavior Modeling with Complex Vector Attention
Zhen Tian
Wayne Xin Zhao
Changwang Zhang
Xin Zhao
Zhongrui Ma
Ji-Rong Wen
38
2
0
26 Mar 2024
KeyPoint Relative Position Encoding for Face Recognition
Minchul Kim
Yiyang Su
Feng Liu
Anil Jain
Xiaoming Liu
CVBM
49
7
0
21 Mar 2024
Local and Global Contexts for Conversation
Zuoquan Lin
Xinyi Shen
29
1
0
31 Jan 2024
SymTC: A Symbiotic Transformer-CNN Net for Instance Segmentation of Lumbar Spine MRI
Jiasong Chen
Linchen Qian
Linhai Ma
Timur Urakov
Weiyong Gu
Liang Liang
MedIm
39
4
0
17 Jan 2024
The What, Why, and How of Context Length Extension Techniques in Large Language Models -- A Detailed Survey
Saurav Pawar
S.M. Towhidul Islam Tonmoy
S. M. M. Zaman
Vinija Jain
Aman Chadha
Amitava Das
37
28
0
15 Jan 2024
Compositional Generalization in Spoken Language Understanding
Avik Ray
Yilin Shen
Hongxia Jin
CoGe
30
1
0
25 Dec 2023
Long-MIL: Scaling Long Contextual Multiple Instance Learning for Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Chenglu Zhu
Jiatong Cai
Sunyi Zheng
Lin Yang
VLM
43
4
0
21 Nov 2023
Window Attention is Bugged: How not to Interpolate Position Embeddings
Daniel Bolya
Chaitanya K. Ryali
Judy Hoffman
Christoph Feichtenhofer
43
10
0
09 Nov 2023
HyPE: Attention with Hyperbolic Biases for Relative Positional Encoding
Giorgio Angelotti
16
0
0
30 Oct 2023
Transformers as Graph-to-Graph Models
James Henderson
Alireza Mohammadshahi
Andrei Catalin Coman
Lesly Miculicich
GNN
35
6
0
27 Oct 2023
Revisiting File Context for Source Code Summarization
Aakash Bansal
Chia-Yi Su
Collin McMillan
25
4
0
05 Sep 2023
DETR Doesn't Need Multi-Scale or Locality Design
Yutong Lin
Yuhui Yuan
Zheng-Wei Zhang
Chen Li
Nanning Zheng
Han Hu
37
5
0
03 Aug 2023
Linearized Relative Positional Encoding
Zhen Qin
Weixuan Sun
Kaiyue Lu
Huizhong Deng
Dong Li
Xiaodong Han
Yuchao Dai
Lingpeng Kong
Yiran Zhong
20
12
0
18 Jul 2023
Length Generalization in Arithmetic Transformers
Samy Jelassi
Stéphane d'Ascoli
Carles Domingo-Enrich
Yuhuai Wu
Yuan-Fang Li
François Charton
30
38
0
27 Jun 2023
Improving Position Encoding of Transformers for Multivariate Time Series Classification
Navid Mohammadi Foumani
Chang Wei Tan
Geoffrey I. Webb
Mahsa Salehi
AI4TS
30
74
0
26 May 2023
Deep Multiple Instance Learning with Distance-Aware Self-Attention
Georg Wolflein
Lucie Charlotte Magister
Pietro Lio
David J. Harrison
Ognjen Arandjelovic
27
2
0
17 May 2023
SLSG: Industrial Image Anomaly Detection by Learning Better Feature Embeddings and One-Class Classification
Minghui Yang
Jing Liu
Zhiwei Yang
Zhaoyang Wu
35
8
0
30 Apr 2023
Technical Report: Impact of Position Bias on Language Models in Token Classification
Mehdi Ben Amor
Michael Granitzer
Jelena Mitrović
33
3
0
26 Apr 2023
Causal Decision Transformer for Recommender Systems via Offline Reinforcement Learning
Siyu Wang
Xiaocong Chen
Dietmar Jannach
Lina Yao
CML
OffRL
24
27
0
17 Apr 2023
Enhancing Multivariate Time Series Classifiers through Self-Attention and Relative Positioning Infusion
Mehryar Abbasi
Parvaneh Saeedi
AI4TS
27
6
0
13 Feb 2023
Real-World Compositional Generalization with Disentangled Sequence-to-Sequence Learning
Hao Zheng
Mirella Lapata
OOD
CoGe
DRL
24
5
0
12 Dec 2022
P-Transformer: Towards Better Document-to-Document Neural Machine Translation
Yachao Li
Junhui Li
Jing Jiang
Shimin Tao
Hao Yang
Hao Fei
ViT
25
9
0
12 Dec 2022
Mega: Moving Average Equipped Gated Attention
Xuezhe Ma
Chunting Zhou
Xiang Kong
Junxian He
Liangke Gui
Graham Neubig
Jonathan May
Luke Zettlemoyer
33
183
0
21 Sep 2022
NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis
Chenfei Wu
Jian Liang
Xiaowei Hu
Zhe Gan
Jianfeng Wang
Lijuan Wang
Zicheng Liu
Yuejian Fang
Nan Duan
VGen
27
72
0
20 Jul 2022
Parameterization of Cross-Token Relations with Relative Positional Encoding for Vision MLP
Zhicai Wang
Y. Hao
Xingyu Gao
Hao Zhang
Shuo Wang
Tingting Mu
Xiangnan He
21
8
0
15 Jul 2022
VReBERT: A Simple and Flexible Transformer for Visual Relationship Detection
Yunbo Cui
M. Farazi
ViT
25
1
0
18 Jun 2022
Peripheral Vision Transformer
Juhong Min
Yucheng Zhao
Chong Luo
Minsu Cho
ViT
MDE
32
30
0
14 Jun 2022
LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks
Tuan Dinh
Yuchen Zeng
Ruisu Zhang
Ziqian Lin
Michael Gira
Shashank Rajput
Jy-yong Sohn
Dimitris Papailiopoulos
Kangwook Lee
LMTD
45
128
0
14 Jun 2022
Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives
Jun Li
Junyu Chen
Yucheng Tang
Ce Wang
Bennett A. Landman
S. K. Zhou
ViT
OOD
MedIm
23
22
0
02 Jun 2022
KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation
Ta-Chung Chi
Ting-Han Fan
Peter J. Ramadge
Alexander I. Rudnicky
47
65
0
20 May 2022
Hero-Gang Neural Model For Named Entity Recognition
Jinpeng Hu
Yaling Shen
Yang Liu
Xiang Wan
Tsung-Hui Chang
27
14
0
15 May 2022
TANet: Thread-Aware Pretraining for Abstractive Conversational Summarization
Ze Yang
Liran Wang
Zhoujin Tian
Wei Wu
Zhoujun Li
30
4
0
09 Apr 2022
FIGARO: Generating Symbolic Music with Fine-Grained Artistic Control
Dimitri von Rutte
Luca Biggio
Yannic Kilcher
Thomas Hofmann
33
0
0
26 Jan 2022
Relative Molecule Self-Attention Transformer
Lukasz Maziarka
Dawid Majchrowski
Tomasz Danel
Piotr Gaiński
Jacek Tabor
Igor T. Podolak
Pawel M. Morkisz
Stanislaw Jastrzebski
MedIm
42
34
0
12 Oct 2021
Disentangled Sequence to Sequence Learning for Compositional Generalization
Hao Zheng
Mirella Lapata
CoGe
DRL
30
39
0
09 Oct 2021
Multiplicative Position-aware Transformer Models for Language Understanding
Zhiheng Huang
Davis Liang
Peng Xu
Bing Xiang
9
1
0
27 Sep 2021
SHAPE: Shifted Absolute Position Embedding for Transformers
Shun Kiyono
Sosuke Kobayashi
Jun Suzuki
Kentaro Inui
236
45
0
13 Sep 2021
The Impact of Positional Encodings on Multilingual Compression
Vinit Ravishankar
Anders Søgaard
25
5
0
11 Sep 2021
SIGN: Spatial-information Incorporated Generative Network for Generalized Zero-shot Semantic Segmentation
Jiaxin Cheng
Soumyaroop Nandi
Premkumar Natarajan
Wael AbdAlmageed
VLM
32
55
0
27 Aug 2021
Rethinking and Improving Relative Position Encoding for Vision Transformer
Kan Wu
Houwen Peng
Minghao Chen
Jianlong Fu
Hongyang Chao
ViT
53
330
0
29 Jul 2021