Stand-Alone Self-Attention in Vision Models
Prajit Ramachandran, Niki Parmar, Ashish Vaswani, Irwan Bello, Anselm Levskaya, Jonathon Shlens
VLM, SLR, ViT · 13 June 2019

Papers citing "Stand-Alone Self-Attention in Vision Models"

Showing 50 of 588 citing papers.
Fast Point Transformer
Chunghyun Park, Yoonwoo Jeong, Minsu Cho, Jaesik Park
3DPC, ViT · 09 Dec 2021

Relating Blindsight and AI: A Review
Joshua Bensemann, Qiming Bao, Gaël Gendron, Tim Hartill, Michael Witbrock
09 Dec 2021

Recurrent Glimpse-based Decoder for Detection with Transformer
Zhe Chen, Jing Zhang, Dacheng Tao
ViT · 09 Dec 2021

Fully Attentional Network for Semantic Segmentation
Qi Song, Jie Li, Chenghong Li, Hao Guo, Rui Huang
3DPC · 08 Dec 2021

CTIN: Robust Contextual Transformer Network for Inertial Navigation
Bingbing Rao, Ehsan Kazemi, Yifan Ding, D. Shila, F. M. Tucker, Liqiang Wang
3DPC · 03 Dec 2021

Localized Feature Aggregation Module for Semantic Segmentation
Ryouichi Furukawa, Kazuhiro Hotta
03 Dec 2021

Hybrid Instance-aware Temporal Fusion for Online Video Instance Segmentation
Xiang Li, Jinglu Wang, Xiao Li, Yan Lu
03 Dec 2021

TBN-ViT: Temporal Bilateral Network with Vision Transformer for Video Scene Parsing
Bo Yan, Leilei Cao, Hongbin Wang
ViT · 02 Dec 2021

Reconstruction Student with Attention for Student-Teacher Pyramid Matching
Shinji Yamada, Kazuhiro Hotta
30 Nov 2021

On the Integration of Self-Attention and Convolution
Xuran Pan, Chunjiang Ge, Rui Lu, S. Song, Guanfu Chen, Zeyi Huang, Gao Huang
SSL · 29 Nov 2021

Video Frame Interpolation Transformer
Zhihao Shi, Xiangyu Xu, Xiaohong Liu, Jun Chen, Ming-Hsuan Yang
ViT · 27 Nov 2021

BoxeR: Box-Attention for 2D and 3D Transformers
Duy-Kien Nguyen, Jihong Ju, Olaf Booji, Martin R. Oswald, Cees G. M. Snoek
ViT · 25 Nov 2021

PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers
Xiaoyi Dong, Jianmin Bao, Ting Zhang, Dongdong Chen, Weiming Zhang, Lu Yuan, Dong Chen, Fang Wen, Nenghai Yu, Baining Guo
ViT · 24 Nov 2021

NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion
Chenfei Wu, Jian Liang, Lei Ji, Fan Yang, Yuejian Fang, Daxin Jiang, Nan Duan
ViT, VGen · 24 Nov 2021

MetaFormer Is Actually What You Need for Vision
Weihao Yu, Mi Luo, Pan Zhou, Chenyang Si, Yichen Zhou, Xinchao Wang, Jiashi Feng, Shuicheng Yan
22 Nov 2021

PointMixer: MLP-Mixer for Point Cloud Understanding
Jaesung Choe, Chunghyun Park, François Rameau, Jaesik Park, In So Kweon
3DPC · 22 Nov 2021

DuDoTrans: Dual-Domain Transformer Provides More Attention for Sinogram Restoration in Sparse-View CT Reconstruction
Ce Wang, Kun Shang, Haimiao Zhang, Qian Li, Yuan Hui, S. Kevin Zhou
ViT, MedIm · 21 Nov 2021

Mask-guided Spectral-wise Transformer for Efficient Hyperspectral Image Reconstruction
Yuanhao Cai, Jing Lin, Xiaowan Hu, Haoqian Wang, X. Yuan, Yulun Zhang, Radu Timofte, Luc Van Gool
15 Nov 2021

Attention Mechanisms in Computer Vision: A Survey
Meng-Hao Guo, Tianhan Xu, Jiangjiang Liu, Zheng-Ning Liu, Peng-Tao Jiang, Tai-Jiang Mu, Song-Hai Zhang, Ralph Robert Martin, Ming-Ming Cheng, Shimin Hu
15 Nov 2021

Searching for TrioNet: Combining Convolution with Local and Global Self-Attention
Huaijin Pi, Huiyu Wang, Yingwei Li, Zizhang Li, Alan Yuille
ViT · 15 Nov 2021

Local Multi-Head Channel Self-Attention for Facial Expression Recognition
Roberto Pecoraro, Valerio Basile, Viviana Bono, Sara Gallo
ViT · 14 Nov 2021

Full-attention based Neural Architecture Search using Context Auto-regression
Yuan Zhou, Haiyang Wang, Shuwei Huo, Boyu Wang
13 Nov 2021

A Survey of Visual Transformers
Yang Liu, Yao Zhang, Yixin Wang, Feng Hou, Jin Yuan, Jiang Tian, Yang Zhang, Zhongchao Shi, Jianping Fan, Zhiqiang He
3DGS, ViT · 11 Nov 2021

Are Transformers More Robust Than CNNs?
Yutong Bai, Jieru Mei, Alan Yuille, Cihang Xie
ViT, AAML · 10 Nov 2021

Off-policy Imitation Learning from Visual Inputs
Zhihao Cheng, Li Shen, Dacheng Tao
08 Nov 2021

Sampling Equivariant Self-attention Networks for Object Detection in Aerial Images
Guo-Ye Yang, Xiang-Li Li, Ralph Robert Martin, Shimin Hu
3DPC · 05 Nov 2021

Implicit Deep Adaptive Design: Policy-Based Experimental Design without Likelihoods
Desi R. Ivanova, Adam Foster, Steven Kleinegesse, Michael U. Gutmann, Tom Rainforth
OffRL · 03 Nov 2021

Relational Self-Attention: What's Missing in Attention for Video Understanding
Manjin Kim, Heeseung Kwon, Chunyu Wang, Suha Kwak, Minsu Cho
ViT · 02 Nov 2021

Gabor filter incorporated CNN for compression
Akihiro Imamura, N. Arizumi
CVBM · 29 Oct 2021

Dispensed Transformer Network for Unsupervised Domain Adaptation
Yunxiang Li, Jingxiong Li, Ruilong Dan, Shuai Wang, Kai Jin, ..., Qianni Zhang, Huiyu Zhou, Qun Jin, Li Wang, Yaqi Wang
OOD, MedIm · 28 Oct 2021

Denoised Non-Local Neural Network for Semantic Segmentation
Qi Song, Jie Li, Hao Guo, Rui Huang
27 Oct 2021

HRFormer: High-Resolution Transformer for Dense Prediction
Yuhui Yuan, Rao Fu, Lang Huang, Weihong Lin, Chao Zhang, Xilin Chen, Jingdong Wang
ViT · 18 Oct 2021

Finding Strong Gravitational Lenses Through Self-Attention
H. Thuruthipilly, A. Zadrożny, Agnieszka Pollo, Marek Biesiada
18 Oct 2021

Multi-View Stereo Network with attention thin volume
Zihang Wan
3DV · 16 Oct 2021

MEDUSA: Multi-scale Encoder-Decoder Self-Attention Deep Neural Network Architecture for Medical Image Analysis
Hossein Aboutalebi, Maya Pavlova, Hayden Gunraj, M. Shafiee, A. Sabri, Amer Alaref, Alexander Wong
12 Oct 2021

A Deep Generative Model for Reordering Adjacency Matrices
Oh-Hyun Kwon, Chiun-How Kao, Chun-Houh Chen, K. Ma
11 Oct 2021

Context-LGM: Leveraging Object-Context Relation for Context-Aware Object Recognition
Mingzhou Liu, Xinwei Sun, Fandong Zhang, Yizhou Yu, Yizhou Wang
08 Oct 2021

Token Pooling in Vision Transformers
D. Marin, Jen-Hao Rick Chang, Anurag Ranjan, Anish K. Prabhu, Mohammad Rastegari, Oncel Tuzel
ViT · 08 Oct 2021

Adversarial Robustness Comparison of Vision Transformer and MLP-Mixer to CNNs
Philipp Benz, Soomin Ham, Chaoning Zhang, Adil Karjauv, In So Kweon
AAML, ViT · 06 Oct 2021

Attentive Walk-Aggregating Graph Neural Networks
M. F. Demirel, Shengchao Liu, Siddhant Garg, Zhenmei Shi, Yingyu Liang
06 Oct 2021

MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer
Sachin Mehta, Mohammad Rastegari
ViT · 05 Oct 2021

VTAMIQ: Transformers for Attention Modulated Image Quality Assessment
Andrei Chubarau, James Clark
ViT · 04 Oct 2021

Improving Axial-Attention Network Classification via Cross-Channel Weight Sharing
Nazmul Shahadat, Anthony Maida
04 Oct 2021

GT U-Net: A U-Net Like Group Transformer Network for Tooth Root Segmentation
Yunxiang Li, Shuai Wang, Jun Wang, G. Zeng, Wenjun Liu, Qianni Zhang, Qun Jin, Yaqi Wang
ViT, MedIm · 30 Sep 2021

Localizing Objects with Self-Supervised Transformers and no Labels
Oriane Siméoni, Gilles Puy, Huy V. Vo, Simon Roburin, Spyros Gidaris, Andrei Bursuc, P. Pérez, Renaud Marlet, Jean Ponce
ViT · 29 Sep 2021

PETA: Photo Albums Event Recognition using Transformers Attention
Tamar Glaser, Emanuel Ben-Baruch, Gilad Sharir, Nadav Zamir, Asaf Noy, Lihi Zelnik-Manor
ViT · 26 Sep 2021

Is Attention Better Than Matrix Decomposition?
Zhengyang Geng, Meng-Hao Guo, Hongxu Chen, Xia Li, Ke Wei, Zhouchen Lin
09 Sep 2021

Learning the Physics of Particle Transport via Transformers
O. Pastor-Serrano, Zoltán Perkó
MedIm · 08 Sep 2021

Ultra-high Resolution Image Segmentation via Locality-aware Context Fusion and Alternating Local Enhancement
Wenxi Liu, Qi Li, Xin Lin, Weixiang Yang, Shengfeng He, Yuanlong Yu
06 Sep 2021

Revisiting 3D ResNets for Video Recognition
Xianzhi Du, Yeqing Li, Huayu Chen, Rui Qian, Jing Li, Irwan Bello
03 Sep 2021