2006.15595
Cited By
Rethinking Positional Encoding in Language Pre-training
28 June 2020
Guolin Ke
Di He
Tie-Yan Liu
Papers citing
"Rethinking Positional Encoding in Language Pre-training"
50 / 63 papers shown
Title
Spline-based Transformers
Prashanth Chandran
Agon Serifi
Markus Gross
Moritz Bächer
41
0
0
03 Apr 2025
Positional Encoding in Transformer-Based Time Series Models: A Survey
Habib Irani
Vangelis Metsis
AI4TS
53
0
0
17 Feb 2025
ArtFormer: Controllable Generation of Diverse 3D Articulated Objects
Jiayi Su
Youhe Feng
Zheng Li
Jinhua Song
Yangfan He
Botao Ren
Botian Xu
AI4CE
91
2
0
10 Dec 2024
Spatioformer: A Geo-encoded Transformer for Large-Scale Plant Species Richness Prediction
Yiqing Guo
K. Mokany
S. Levick
Jinyan Yang
P. Moghadam
MDE
47
2
0
25 Oct 2024
MLissard: Multilingual Long and Simple Sequential Reasoning Benchmarks
M. Bueno
R. Lotufo
Rodrigo Nogueira
LRM
31
0
0
08 Oct 2024
Towards LifeSpan Cognitive Systems
Yu Wang
Chi Han
Tongtong Wu
Xiaoxin He
Wangchunshu Zhou
...
Zexue He
Wei Wang
Gholamreza Haffari
Heng Ji
Julian McAuley
KELM
CLL
185
1
0
20 Sep 2024
TeXBLEU: Automatic Metric for Evaluate LaTeX Format
Kyudan Jung
N. Kim
Hyongon Ryu
Sieun Hyeon
Seung-jun Lee
Hyeok-jae Lee
37
0
0
10 Sep 2024
Are queries and keys always relevant? A case study on Transformer wave functions
Riccardo Rende
Luciano Loris Viteritti
29
5
0
29 May 2024
Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency
Akila Wickramasekara
F. Breitinger
Mark Scanlon
52
8
0
29 Feb 2024
Large Language Models: A Survey
Shervin Minaee
Tomáš Mikolov
Narjes Nikzad
M. Asgari-Chenaghlu
R. Socher
Xavier Amatriain
Jianfeng Gao
ALM
LM&MA
ELM
134
371
0
09 Feb 2024
SSIN: Self-Supervised Learning for Rainfall Spatial Interpolation
Jia Li
Yanyan Shen
Lei Chen
Charles Wang Wai Ng
22
3
0
27 Nov 2023
Long-MIL: Scaling Long Contextual Multiple Instance Learning for Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Chenglu Zhu
Jiatong Cai
Sunyi Zheng
Lin Yang
VLM
43
4
0
21 Nov 2023
From Words and Exercises to Wellness: Farsi Chatbot for Self-Attachment Technique
Sina Elahimanesh
Shayan Salehi
Sara Zahedi Movahed
Lisa Alazraki
Ruoyu Hu
Abbas Edalat
24
0
0
13 Oct 2023
GrowLength: Accelerating LLMs Pretraining by Progressively Growing Training Length
Hongye Jin
Xiaotian Han
Jingfeng Yang
Zhimeng Jiang
Chia-Yuan Chang
Xia Hu
33
11
0
01 Oct 2023
Frameless Graph Knowledge Distillation
Dai Shi
Zhiqi Shao
Yi Guo
Junbin Gao
39
4
0
13 Jul 2023
Relational Temporal Graph Reasoning for Dual-task Dialogue Language Understanding
Bowen Xing
Ivor W. Tsang
43
13
0
15 Jun 2023
Pre-training Language Model as a Multi-perspective Course Learner
Beiduo Chen
Shaohan Huang
Zi-qiang Zhang
Wu Guo
Zhen-Hua Ling
Haizhen Huang
Furu Wei
Weiwei Deng
Qi Zhang
34
0
0
06 May 2023
HST-MRF: Heterogeneous Swin Transformer with Multi-Receptive Field for Medical Image Segmentation
Xiaofei Huang
Hongfang Gong
Jin Zhang
MedIm
34
2
0
10 Apr 2023
Ankh: Optimized Protein Language Model Unlocks General-Purpose Modelling
Ahmed Elnaggar
Hazem Essam
Wafaa Salah-Eldin
Walid Moustafa
Mohamed Elkerdawy
Charlotte Rochereau
B. Rost
167
87
0
16 Jan 2023
You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model
Sheng Tang
Yaqing Wang
Zhenglun Kong
Tianchi Zhang
Yao Li
Caiwen Ding
Yanzhi Wang
Yi Liang
Dongkuan Xu
33
31
0
21 Nov 2022
The Curious Case of Absolute Position Embeddings
Koustuv Sinha
Amirhossein Kazemnejad
Siva Reddy
J. Pineau
Dieuwke Hupkes
Adina Williams
87
15
0
23 Oct 2022
Transformers Learn Shortcuts to Automata
Bingbin Liu
Jordan T. Ash
Surbhi Goel
A. Krishnamurthy
Cyril Zhang
OffRL
LRM
48
156
0
19 Oct 2022
Melody Infilling with User-Provided Structural Context
Chih-Pin Tan
A. Su
Yi-Hsuan Yang
36
3
0
06 Oct 2022
Mega: Moving Average Equipped Gated Attention
Xuezhe Ma
Chunting Zhou
Xiang Kong
Junxian He
Liangke Gui
Graham Neubig
Jonathan May
Luke Zettlemoyer
33
183
0
21 Sep 2022
Do we really need temporal convolutions in action segmentation?
Dazhao Du
Bing-Huang Su
Yu Li
Zhongang Qi
Hui Xiong
Ying Shan
ViT
29
16
0
26 May 2022
KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation
Ta-Chung Chi
Ting-Han Fan
Peter J. Ramadge
Alexander I. Rudnicky
47
65
0
20 May 2022
Trading Positional Complexity vs. Deepness in Coordinate Networks
Jianqiao Zheng
Sameera Ramasinghe
Xueqian Li
Simon Lucey
31
18
0
18 May 2022
Zero-shot Code-Mixed Offensive Span Identification through Rationale Extraction
Manikandan Ravikiran
Bharathi Raja Chakravarthi
22
3
0
12 May 2022
Decoupled Side Information Fusion for Sequential Recommendation
Yueqi Xie
Peilin Zhou
Sunghun Kim
30
111
0
23 Apr 2022
3D Shuffle-Mixer: An Efficient Context-Aware Vision Learner of Transformer-MLP Paradigm for Dense Prediction in Medical Volume
Jianye Pang
Cheng Jiang
Yihao Chen
Jianbo Chang
M. Feng
Renzhi Wang
Jianhua Yao
ViT
MedIm
28
11
0
14 Apr 2022
METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals
Payal Bajaj
Chenyan Xiong
Guolin Ke
Xiaodong Liu
Di He
Saurabh Tiwary
Tie-Yan Liu
Paul N. Bennett
Xia Song
Jianfeng Gao
50
32
0
13 Apr 2022
LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware Transformer Network
Zhigang Jiang
Zhongzheng Xiang
Jinhua Xu
Mingbi Zhao
ViT
3DV
27
34
0
03 Mar 2022
FastRPB: a Scalable Relative Positional Encoding for Long Sequence Tasks
Maksim Zubkov
Daniil Gavrilov
27
0
0
23 Feb 2022
General-purpose, long-context autoregressive modeling with Perceiver AR
Curtis Hawthorne
Andrew Jaegle
Cătălina Cangea
Sebastian Borgeaud
C. Nash
...
Hannah R. Sheahan
Neil Zeghidour
Jean-Baptiste Alayrac
João Carreira
Jesse Engel
43
65
0
15 Feb 2022
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Peng Wang
An Yang
Rui Men
Junyang Lin
Shuai Bai
Zhikang Li
Jianxin Ma
Chang Zhou
Jingren Zhou
Hongxia Yang
MLLM
ObjD
68
850
0
07 Feb 2022
Improving Sample Efficiency of Value Based Models Using Attention and Vision Transformers
Amir Ardalan Kalantari
Mohammad Amini
Sarath Chandar
Doina Precup
52
4
0
01 Feb 2022
Rewiring with Positional Encodings for Graph Neural Networks
Rickard Brüel-Gabrielsson
Mikhail Yurochkin
Justin Solomon
AI4CE
25
32
0
29 Jan 2022
SwinTrack: A Simple and Strong Baseline for Transformer Tracking
Liting Lin
Heng Fan
Zhipeng Zhang
Yong-mei Xu
Haibin Ling
ViT
34
303
0
02 Dec 2021
Swin Transformer V2: Scaling Up Capacity and Resolution
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
...
Yue Cao
Zheng-Wei Zhang
Li Dong
Furu Wei
B. Guo
ViT
82
1,754
0
18 Nov 2021
Theme Transformer: Symbolic Music Generation with Theme-Conditioned Transformer
Yi-Jen Shih
Shih-Lun Wu
Frank Zalkow
Meinard Müller
Yi-Hsuan Yang
35
76
0
07 Nov 2021
Can Vision Transformers Perform Convolution?
Shanda Li
Xiangning Chen
Di He
Cho-Jui Hsieh
ViT
49
19
0
02 Nov 2021
Relative Molecule Self-Attention Transformer
Lukasz Maziarka
Dawid Majchrowski
Tomasz Danel
Piotr Gaiński
Jacek Tabor
Igor T. Podolak
Pawel M. Morkisz
Stanislaw Jastrzebski
MedIm
40
34
0
12 Oct 2021
Learning to Iteratively Solve Routing Problems with Dual-Aspect Collaborative Transformer
Yining Ma
Jingwen Li
Zhiguang Cao
Wen Song
Le Zhang
Zhenghua Chen
Jing Tang
83
129
0
06 Oct 2021
Multiplicative Position-aware Transformer Models for Language Understanding
Zhiheng Huang
Davis Liang
Peng Xu
Bing Xiang
9
1
0
27 Sep 2021
The Impact of Positional Encodings on Multilingual Compression
Vinit Ravishankar
Anders Søgaard
25
5
0
11 Sep 2021
Ultra-high Resolution Image Segmentation via Locality-aware Context Fusion and Alternating Local Enhancement
Wenxi Liu
Qi Li
Xin Lin
Weixiang Yang
Shengfeng He
Yuanlong Yu
29
7
0
06 Sep 2021
Teaching Autoregressive Language Models Complex Tasks By Demonstration
Gabriel Recchia
26
22
0
05 Sep 2021
SpectralFormer: Rethinking Hyperspectral Image Classification with Transformers
Danfeng Hong
Zhu Han
Jing Yao
Lianru Gao
Bing Zhang
Antonio J. Plaza
Jocelyn Chanussot
ViT
34
867
0
07 Jul 2021
Large-Scale Chemical Language Representations Capture Molecular Structure and Properties
Jerret Ross
Brian M. Belgodere
Vijil Chenthamarakshan
Inkit Padhi
Youssef Mroueh
Payel Das
AI4CE
27
272
0
17 Jun 2021
Do Transformers Really Perform Bad for Graph Representation?
Chengxuan Ying
Tianle Cai
Shengjie Luo
Shuxin Zheng
Guolin Ke
Di He
Yanming Shen
Tie-Yan Liu
GNN
33
433
0
09 Jun 2021