Rethinking Positional Encoding in Language Pre-training

28 June 2020
Guolin Ke
Di He
Tie-Yan Liu

Papers citing "Rethinking Positional Encoding in Language Pre-training"

50 / 64 papers shown
Title
Spline-based Transformers
Prashanth Chandran
Agon Serifi
Markus Gross
Moritz Bächer
41
0
0
03 Apr 2025
Positional Encoding in Transformer-Based Time Series Models: A Survey
Habib Irani
Vangelis Metsis
AI4TS
53
0
0
17 Feb 2025
ArtFormer: Controllable Generation of Diverse 3D Articulated Objects
Jiayi Su
Youhe Feng
Zheng Li
Jinhua Song
Yangfan He
Botao Ren
Botian Xu
AI4CE
91
2
0
10 Dec 2024
Spatioformer: A Geo-encoded Transformer for Large-Scale Plant Species Richness Prediction
Yiqing Guo
K. Mokany
S. Levick
Jinyan Yang
P. Moghadam
MDE
47
2
0
25 Oct 2024
MLissard: Multilingual Long and Simple Sequential Reasoning Benchmarks
M. Bueno
R. Lotufo
Rodrigo Nogueira
LRM
31
0
0
08 Oct 2024
Towards LifeSpan Cognitive Systems
Yu Wang
Chi Han
Tongtong Wu
Xiaoxin He
Wangchunshu Zhou
...
Zexue He
Wei Wang
Gholamreza Haffari
Heng Ji
Julian McAuley
KELM
CLL
188
1
0
20 Sep 2024
TeXBLEU: Automatic Metric for Evaluate LaTeX Format
Kyudan Jung
N. Kim
Hyongon Ryu
Sieun Hyeon
Seung-jun Lee
Hyeok-jae Lee
37
0
0
10 Sep 2024
Are queries and keys always relevant? A case study on Transformer wave functions
Riccardo Rende
Luciano Loris Viteritti
29
5
0
29 May 2024
Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency
Akila Wickramasekara
F. Breitinger
Mark Scanlon
52
8
0
29 Feb 2024
Large Language Models: A Survey
Shervin Minaee
Tomáš Mikolov
Narjes Nikzad
M. Asgari-Chenaghlu
R. Socher
Xavier Amatriain
Jianfeng Gao
ALM
LM&MA
ELM
134
371
0
09 Feb 2024
SSIN: Self-Supervised Learning for Rainfall Spatial Interpolation
Jia Li
Yanyan Shen
Lei Chen
Charles Wang Wai Ng
22
3
0
27 Nov 2023
Long-MIL: Scaling Long Contextual Multiple Instance Learning for Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Chenglu Zhu
Jiatong Cai
Sunyi Zheng
Lin Yang
VLM
43
4
0
21 Nov 2023
From Words and Exercises to Wellness: Farsi Chatbot for Self-Attachment Technique
Sina Elahimanesh
Shayan Salehi
Sara Zahedi Movahed
Lisa Alazraki
Ruoyu Hu
Abbas Edalat
24
0
0
13 Oct 2023
GrowLength: Accelerating LLMs Pretraining by Progressively Growing Training Length
Hongye Jin
Xiaotian Han
Jingfeng Yang
Zhimeng Jiang
Chia-Yuan Chang
Xia Hu
33
11
0
01 Oct 2023
Frameless Graph Knowledge Distillation
Dai Shi
Zhiqi Shao
Yi Guo
Junbin Gao
39
4
0
13 Jul 2023
Relational Temporal Graph Reasoning for Dual-task Dialogue Language Understanding
Bowen Xing
Ivor W. Tsang
43
13
0
15 Jun 2023
Pre-training Language Model as a Multi-perspective Course Learner
Beiduo Chen
Shaohan Huang
Zi-qiang Zhang
Wu Guo
Zhen-Hua Ling
Haizhen Huang
Furu Wei
Weiwei Deng
Qi Zhang
34
0
0
06 May 2023
HST-MRF: Heterogeneous Swin Transformer with Multi-Receptive Field for Medical Image Segmentation
Xiaofei Huang
Hongfang Gong
Jin Zhang
MedIm
34
2
0
10 Apr 2023
Ankh: Optimized Protein Language Model Unlocks General-Purpose Modelling
Ahmed Elnaggar
Hazem Essam
Wafaa Salah-Eldin
Walid Moustafa
Mohamed Elkerdawy
Charlotte Rochereau
B. Rost
167
87
0
16 Jan 2023
You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model
Sheng Tang
Yaqing Wang
Zhenglun Kong
Tianchi Zhang
Yao Li
Caiwen Ding
Yanzhi Wang
Yi Liang
Dongkuan Xu
33
31
0
21 Nov 2022
The Curious Case of Absolute Position Embeddings
Koustuv Sinha
Amirhossein Kazemnejad
Siva Reddy
J. Pineau
Dieuwke Hupkes
Adina Williams
87
15
0
23 Oct 2022
Transformers Learn Shortcuts to Automata
Bingbin Liu
Jordan T. Ash
Surbhi Goel
A. Krishnamurthy
Cyril Zhang
OffRL
LRM
48
156
0
19 Oct 2022
What Makes Convolutional Models Great on Long Sequence Modeling?
Yuhong Li
Tianle Cai
Yi Zhang
De-huai Chen
Debadeepta Dey
VLM
39
96
0
17 Oct 2022
Melody Infilling with User-Provided Structural Context
Chih-Pin Tan
A. Su
Yi-Hsuan Yang
36
3
0
06 Oct 2022
Mega: Moving Average Equipped Gated Attention
Xuezhe Ma
Chunting Zhou
Xiang Kong
Junxian He
Liangke Gui
Graham Neubig
Jonathan May
Luke Zettlemoyer
33
183
0
21 Sep 2022
Do we really need temporal convolutions in action segmentation?
Dazhao Du
Bing-Huang Su
Yu Li
Zhongang Qi
Hui Xiong
Ying Shan
ViT
29
16
0
26 May 2022
KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation
Ta-Chung Chi
Ting-Han Fan
Peter J. Ramadge
Alexander I. Rudnicky
47
65
0
20 May 2022
Trading Positional Complexity vs. Deepness in Coordinate Networks
Jianqiao Zheng
Sameera Ramasinghe
Xueqian Li
Simon Lucey
31
18
0
18 May 2022
Zero-shot Code-Mixed Offensive Span Identification through Rationale Extraction
Manikandan Ravikiran
Bharathi Raja Chakravarthi
22
3
0
12 May 2022
Decoupled Side Information Fusion for Sequential Recommendation
Yueqi Xie
Peilin Zhou
Sunghun Kim
30
111
0
23 Apr 2022
3D Shuffle-Mixer: An Efficient Context-Aware Vision Learner of Transformer-MLP Paradigm for Dense Prediction in Medical Volume
Jianye Pang
Cheng Jiang
Yihao Chen
Jianbo Chang
M. Feng
Renzhi Wang
Jianhua Yao
ViT
MedIm
28
11
0
14 Apr 2022
METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals
Payal Bajaj
Chenyan Xiong
Guolin Ke
Xiaodong Liu
Di He
Saurabh Tiwary
Tie-Yan Liu
Paul N. Bennett
Xia Song
Jianfeng Gao
50
32
0
13 Apr 2022
LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware Transformer Network
Zhigang Jiang
Zhongzheng Xiang
Jinhua Xu
Mingbi Zhao
ViT
3DV
27
34
0
03 Mar 2022
FastRPB: a Scalable Relative Positional Encoding for Long Sequence Tasks
Maksim Zubkov
Daniil Gavrilov
27
0
0
23 Feb 2022
General-purpose, long-context autoregressive modeling with Perceiver AR
Curtis Hawthorne
Andrew Jaegle
Cătălina Cangea
Sebastian Borgeaud
C. Nash
...
Hannah R. Sheahan
Neil Zeghidour
Jean-Baptiste Alayrac
João Carreira
Jesse Engel
43
65
0
15 Feb 2022
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Peng Wang
An Yang
Rui Men
Junyang Lin
Shuai Bai
Zhikang Li
Jianxin Ma
Chang Zhou
Jingren Zhou
Hongxia Yang
MLLM
ObjD
74
850
0
07 Feb 2022
Improving Sample Efficiency of Value Based Models Using Attention and Vision Transformers
Amir Ardalan Kalantari
Mohammad Amini
Sarath Chandar
Doina Precup
52
4
0
01 Feb 2022
Rewiring with Positional Encodings for Graph Neural Networks
Rickard Brüel-Gabrielsson
Mikhail Yurochkin
Justin Solomon
AI4CE
25
32
0
29 Jan 2022
SwinTrack: A Simple and Strong Baseline for Transformer Tracking
Liting Lin
Heng Fan
Zhipeng Zhang
Yong-mei Xu
Haibin Ling
ViT
37
303
0
02 Dec 2021
Swin Transformer V2: Scaling Up Capacity and Resolution
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
...
Yue Cao
Zheng-Wei Zhang
Li Dong
Furu Wei
B. Guo
ViT
82
1,754
0
18 Nov 2021
Theme Transformer: Symbolic Music Generation with Theme-Conditioned Transformer
Yi-Jen Shih
Shih-Lun Wu
Frank Zalkow
Meinard Muller
Yi-Hsuan Yang
35
76
0
07 Nov 2021
Can Vision Transformers Perform Convolution?
Shanda Li
Xiangning Chen
Di He
Cho-Jui Hsieh
ViT
49
19
0
02 Nov 2021
Relative Molecule Self-Attention Transformer
Lukasz Maziarka
Dawid Majchrowski
Tomasz Danel
Piotr Gaiński
Jacek Tabor
Igor T. Podolak
Pawel M. Morkisz
Stanislaw Jastrzebski
MedIm
40
34
0
12 Oct 2021
Learning to Iteratively Solve Routing Problems with Dual-Aspect Collaborative Transformer
Yining Ma
Jingwen Li
Zhiguang Cao
Wen Song
Le Zhang
Zhenghua Chen
Jing Tang
83
129
0
06 Oct 2021
Multiplicative Position-aware Transformer Models for Language Understanding
Zhiheng Huang
Davis Liang
Peng Xu
Bing Xiang
9
1
0
27 Sep 2021
The Impact of Positional Encodings on Multilingual Compression
Vinit Ravishankar
Anders Søgaard
25
5
0
11 Sep 2021
Ultra-high Resolution Image Segmentation via Locality-aware Context Fusion and Alternating Local Enhancement
Wenxi Liu
Qi Li
Xin Lin
Weixiang Yang
Shengfeng He
Yuanlong Yu
29
7
0
06 Sep 2021
Teaching Autoregressive Language Models Complex Tasks By Demonstration
Gabriel Recchia
26
22
0
05 Sep 2021
SpectralFormer: Rethinking Hyperspectral Image Classification with Transformers
Danfeng Hong
Zhu Han
Jing Yao
Lianru Gao
Bing Zhang
Antonio J. Plaza
Jocelyn Chanussot
ViT
34
867
0
07 Jul 2021
Large-Scale Chemical Language Representations Capture Molecular Structure and Properties
Jerret Ross
Brian M. Belgodere
Vijil Chenthamarakshan
Inkit Padhi
Youssef Mroueh
Payel Das
AI4CE
27
272
0
17 Jun 2021