ResearchTrend.AI
Self-Attention with Relative Position Representations

6 March 2018
Peter Shaw
Jakob Uszkoreit
Ashish Vaswani

Papers citing "Self-Attention with Relative Position Representations"

50 / 411 papers shown
Positional Encoding Helps Recurrent Neural Networks Handle a Large Vocabulary
Takashi Morita
21
3
0
31 Jan 2024
On the generalization capacity of neural networks during generic multimodal reasoning
Takuya Ito
Soham Dan
Mattia Rigotti
James Kozloski
Murray Campbell
LRM
40
2
0
26 Jan 2024
Cross Initialization for Personalized Text-to-Image Generation
Lianyu Pang
Jian Yin
Haoran Xie
Qiping Wang
Qing Li
Xudong Mao
DiffM
35
7
0
26 Dec 2023
Delving Deeper Into Astromorphic Transformers
Md. Zesun Ahmed Mia
Malyaban Bal
Abhronil Sengupta
36
1
0
18 Dec 2023
Zebra: Extending Context Window with Layerwise Grouped Local-Global Attention
Kaiqiang Song
Xiaoyang Wang
Sangwoo Cho
Xiaoman Pan
Dong Yu
34
7
0
14 Dec 2023
Out of Context: How important is Local Context in Neural Program Repair?
Julian Aron Prenner
Romain Robbes
29
9
0
08 Dec 2023
DiffiT: Diffusion Vision Transformers for Image Generation
Ali Hatamizadeh
Jiaming Song
Guilin Liu
Jan Kautz
Arash Vahdat
39
67
0
04 Dec 2023
SSIN: Self-Supervised Learning for Rainfall Spatial Interpolation
Jia Li
Yanyan Shen
Lei Chen
Charles Wang Wai Ng
22
3
0
27 Nov 2023
Large Language Models in Education: Vision and Opportunities
Wensheng Gan
Zhenlian Qi
Jiayang Wu
Chun-Wei Lin
AI4Ed
44
71
0
22 Nov 2023
Long-MIL: Scaling Long Contextual Multiple Instance Learning for Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Chenglu Zhu
Jiatong Cai
Sunyi Zheng
Lin Yang
VLM
43
4
0
21 Nov 2023
COSTAR: Improved Temporal Counterfactual Estimation with Self-Supervised Learning
Chuizheng Meng
Yihe Dong
Sercan Ö. Arik
Yan Liu
Tomas Pfister
CML
AI4TS
29
0
0
01 Nov 2023
The Expressibility of Polynomial based Attention Scheme
Zhao Song
Guangyi Xu
Junze Yin
34
5
0
30 Oct 2023
DPP-TTS: Diversifying prosodic features of speech via determinantal point processes
Seongho Joo
Hyukhun Koh
Kyomin Jung
DiffM
47
0
0
23 Oct 2023
GTA: A Geometry-Aware Attention Mechanism for Multi-View Transformers
Takeru Miyato
Bernhard Jaeger
Max Welling
Andreas Geiger
ViT
39
14
0
16 Oct 2023
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
Yulong Shi
Mingwei Sun
Yongshuai Wang
Hui Sun
Zengqiang Chen
34
4
0
10 Oct 2023
RoFormer for Position Aware Multiple Instance Learning in Whole Slide Image Classification
Etienne Pochet
Rami Maroun
Roger Trullo
MedIm
23
2
0
03 Oct 2023
Tackling VQA with Pretrained Foundation Models without Further Training
Alvin De Jun Tan
Bingquan Shen
MLLM
37
1
0
27 Sep 2023
PDPCRN: Parallel Dual-Path CRN with Bi-directional Inter-Branch Interactions for Multi-Channel Speech Enhancement
Jia-Yu Pan
Shulin He
Tianci Wu
Hui Zhang
Xueliang Zhang
24
0
0
19 Sep 2023
Matcha-TTS: A fast TTS architecture with conditional flow matching
Shivam Mehta
Ruibo Tu
Jonas Beskow
Éva Székely
G. Henter
24
72
0
06 Sep 2023
Rubric-Specific Approach to Automated Essay Scoring with Augmentation Training
Brian Cho
Youngbin Jang
Jaewoong Yoon
33
1
0
06 Sep 2023
An Interpretable and Attention-based Method for Gaze Estimation Using Electroencephalography
Nina Weng
M. Płomecka
Manuel Kaufmann
Ard Kastrati
Roger Wattenhofer
N. Langer
26
1
0
09 Aug 2023
DETR Doesn't Need Multi-Scale or Locality Design
Yutong Lin
Yuhui Yuan
Zheng-Wei Zhang
Chen Li
Nanning Zheng
Han Hu
37
5
0
03 Aug 2023
MLIC++: Linear Complexity Multi-Reference Entropy Modeling for Learned Image Compression
Wei Jiang
Jiayu Yang
Yongqi Zhai
Feng Gao
Ronggang Wang
39
32
0
28 Jul 2023
LEA: Improving Sentence Similarity Robustness to Typos Using Lexical Attention Bias
Mario Almagro
Emilio Almazán
Diego Ortego
David Jiménez
29
3
0
06 Jul 2023
Implicit Memory Transformer for Computationally Efficient Simultaneous Speech Translation
Matthew Raffel
Lizhong Chen
9
5
0
03 Jul 2023
Shiftable Context: Addressing Training-Inference Context Mismatch in Simultaneous Speech Translation
Matthew Raffel
Drew Penney
Lizhong Chen
24
3
0
03 Jul 2023
Research on Named Entity Recognition in Improved transformer with R-Drop structure
Weidong Ji
Yousheng Zhang
Guohui Zhou
Xu Wang
34
0
0
14 Jun 2023
Everybody Compose: Deep Beats To Music
Conghao Shen
Violet Z. Yao
Yixin Liu
13
0
0
09 Jun 2023
A2B: Anchor to Barycentric Coordinate for Robust Correspondence
Weiyue Zhao
Hao Lu
Zhiguo Cao
Xin Li
29
4
0
05 Jun 2023
TranSFormer: Slow-Fast Transformer for Machine Translation
Bei Li
Yi Jing
Xu Tan
Zhen Xing
Tong Xiao
Jingbo Zhu
49
7
0
26 May 2023
Overcoming Topology Agnosticism: Enhancing Skeleton-Based Action Recognition through Redefined Skeletal Topology Awareness
Yuxuan Zhou
Zhi-Qi Cheng
Ju He
Bin Luo
Yifeng Geng
Xuansong Xie
31
11
0
19 May 2023
Toward Moiré-Free and Detail-Preserving Demosaicking
Xuan-Yi Li
Y. Niu
Bo Zhao
Haoyuan Shi
Zitong An
34
1
0
15 May 2023
MaxViT-UNet: Multi-Axis Attention for Medical Image Segmentation
Abdul Rehman Khan
Asifullah Khan
ViT
MedIm
47
14
0
15 May 2023
Reconstruct Before Summarize: An Efficient Two-Step Framework for Condensing and Summarizing Meeting Transcripts
Haochen Tan
Han Wu
Wei Shao
Xinyun Zhang
Mingjie Zhan
Zhaohui Hou
Ding Liang
Linqi Song
47
0
0
13 May 2023
SLTUNET: A Simple Unified Model for Sign Language Translation
Biao Zhang
Mathias Müller
Rico Sennrich
SLR
43
33
0
02 May 2023
Early Detection of Alzheimer's Disease using Bottleneck Transformers
Arunima Jaiswal
Ananya Sadana
MedIm
26
2
0
01 May 2023
Multi-Modal Deep Learning for Credit Rating Prediction Using Text and Numerical Data Streams
M. Tavakoli
Rohitash Chandra
Fengrui Tian
Cristián Bravo
29
8
0
21 Apr 2023
Region-Enhanced Feature Learning for Scene Semantic Segmentation
Xin Kang
Chaoqun Wang
Xuejin Chen
32
3
0
15 Apr 2023
Dynamic Graph Representation Learning with Neural Networks: A Survey
Leshanshui Yang
Sébastien Adam
Clément Chatelain
AI4TS
AI4CE
39
14
0
12 Apr 2023
Diffusion Models as Masked Autoencoders
Chen Wei
K. Mangalam
Po-Yao (Bernie) Huang
Yanghao Li
Haoqi Fan
Hu Xu
Huiyu Wang
Cihang Xie
Alan Yuille
Christoph Feichtenhofer
DiffM
SyDa
36
48
0
06 Apr 2023
Inductive biases in deep learning models for weather prediction
Jannik Thümmel
Matthias Karlbauer
S. Otte
C. Zarfl
Georg Martius
...
Thomas Scholten
Ulrich Friedrich
V. Wulfmeyer
B. Goswami
Martin Volker Butz
AI4CE
46
6
0
06 Apr 2023
CoRe-Sleep: A Multimodal Fusion Framework for Time Series Robust to Imperfect Modalities
Konstantinos Kontras
Christos Chatzichristos
Huy P Phan
Johan A. K. Suykens
Marina De Vos
AI4TS
24
11
0
27 Mar 2023
EdgeTran: Co-designing Transformers for Efficient Inference on Mobile Edge Platforms
Shikhar Tuli
N. Jha
36
3
0
24 Mar 2023
AdPE: Adversarial Positional Embeddings for Pretraining Vision Transformers via MAE+
Tianlin Li
Ying Wang
Ziwei Xuan
Guo-Jun Qi
ViT
45
3
0
14 Mar 2023
Unifying Layout Generation with a Decoupled Diffusion Model
Mude Hui
Zhizheng Zhang
Xiaoyi Zhang
Wenxuan Xie
Yuwang Wang
Yan Lu
DiffM
21
39
0
09 Mar 2023
Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling
Zi-Hua Zhang
Long Zhou
Chengyi Wang
Sanyuan Chen
Yu Wu
...
Huaming Wang
Jinyu Li
Lei He
Sheng Zhao
Furu Wei
VLM
36
171
0
07 Mar 2023
Diffusing Graph Attention
Daniel Glickman
Eran Yahav
GNN
47
3
0
01 Mar 2023
Applying Plain Transformers to Real-World Point Clouds
Lanxiao Li
M. Heizmann
3DPC
ViT
26
3
0
28 Feb 2023
Sequential Query Encoding For Complex Query Answering on Knowledge Graphs
Jiaxin Bai
Tianshi Zheng
Yangqiu Song
24
13
0
25 Feb 2023
Embeddings for Tabular Data: A Survey
Rajat Singh
Srikanta J. Bedathur
LMTD
37
2
0
23 Feb 2023