ResearchTrend.AI
Stand-Alone Self-Attention in Vision Models

13 June 2019
Prajit Ramachandran
Niki Parmar
Ashish Vaswani
Irwan Bello
Anselm Levskaya
Jonathon Shlens
VLM, SLR, ViT

Papers citing "Stand-Alone Self-Attention in Vision Models"

50 / 588 papers shown
Vision Transformers with Self-Distilled Registers
Yinjie Chen
Zipeng Yan
Chong Zhou
Bo Dai
Andrew F. Luo
54
0
0
27 May 2025
Locality-Aware Zero-Shot Human-Object Interaction Detection
Sanghyun Kim
Deunsol Jung
Minsu Cho
VLM
183
0
0
26 May 2025
Burst Image Super-Resolution via Multi-Cross Attention Encoding and Multi-Scan State-Space Decoding
Tengda Huang
Yu Zhang
Tianren Li
Yufu Qu
Fulin Liu
Zhenzhong Wei
SupR
81
0
0
26 May 2025
Grouping First, Attending Smartly: Training-Free Acceleration for Diffusion Transformers
Sucheng Ren
Qihang Yu
Ju He
Alan Yuille
Liang-Chieh Chen
133
0
0
20 May 2025
Attention-based clustering
Rodrigo Maulen-Soto
Claire Boyer
Pierre Marion
77
0
0
19 May 2025
Multimodal Fusion of Glucose Monitoring and Food Imagery for Caloric Content Prediction
Adarsh Kumar
158
0
0
13 May 2025
Person Recognition at Altitude and Range: Fusion of Face, Body Shape and Gait
Feng Liu
Nicholas Chimitt
Lanqing Guo
Jitesh Jain
Aditya Kane
...
Arun Ross
Humphrey Shi
Zhangyang Wang
A. Jain
Xiaoming Liu
CVBM
62
1
0
07 May 2025
Back to Fundamentals: Low-Level Visual Features Guided Progressive Token Pruning
Yuanbing Ouyang
Yizhuo Liang
Qingpeng Li
Xinfei Guo
Yiming Luo
Di Wu
Hao Wang
Yushan Pan
ViT, VLM
98
0
0
25 Apr 2025
Generalized Neighborhood Attention: Multi-dimensional Sparse Attention at the Speed of Light
Ali Hassani
Fengzhe Zhou
Aditya Kane
Jiannan Huang
Chieh-Yun Chen
...
Bing Xu
Haicheng Wu
Wen-mei W. Hwu
Xuan Li
Humphrey Shi
56
1
0
23 Apr 2025
LOOPE: Learnable Optimal Patch Order in Positional Embeddings for Vision Transformers
M. Chowdhury
Md Rifat Ur Rahman
Akil Ahmad Taki
59
0
0
19 Apr 2025
Group-based Distinctive Image Captioning with Memory Difference Encoding and Attention
Jiuniu Wang
Wenjia Xu
Qingzhong Wang
Antoni B. Chan
181
0
0
03 Apr 2025
Fourier Feature Attribution: A New Efficiency Attribution Method
Zechen Liu
Feiyang Zhang
Wei Song
Xuelong Li
Wei Wei
FAtt
127
0
0
02 Apr 2025
3D Medical Imaging Segmentation on Non-Contrast CT
Canxuan Gang
Yuhan Peng
104
0
0
11 Mar 2025
Deep Learning of the Evolution Operator Enables Forecasting of Out-of-Training Dynamics in Chaotic Systems
Ira J. S. Shokar
Peter H. Haynes
R. Kerswell
AI4TS
86
1
0
28 Feb 2025
OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels
Meng Lou
Yizhou Yu
313
2
0
27 Feb 2025
DeepInteraction++: Multi-Modality Interaction for Autonomous Driving
Zeyu Yang
Nan Song
Wei Li
Xiatian Zhu
Lefei Zhang
Philip H. S. Torr
162
4
0
24 Feb 2025
FeatSharp: Your Vision Model Features, Sharper
Mike Ranzinger
Greg Heinrich
Pavlo Molchanov
Jan Kautz
Bryan Catanzaro
Andrew Tao
CLIP, VLM
131
0
0
22 Feb 2025
A Novel Convolutional-Free Method for 3D Medical Imaging Segmentation
Canxuan Gang
MedIm, ViT
89
0
0
08 Feb 2025
PDC-ViT: Source Camera Identification using Pixel Difference Convolution and Vision Transformer
O. Elharrouss
Y. Akbari
Noor Almaadeed
S. Al-Maadeed
F. Khelifi
Ahmed Bouridane
61
2
0
28 Jan 2025
GCI-ViTAL: Gradual Confidence Improvement with Vision Transformers for Active Learning on Label Noise
Moseli Motsóehli
Kyungim Baek
90
1
0
08 Nov 2024
Harmformer: Harmonic Networks Meet Transformers for Continuous Roto-Translation Equivariance
Tomáš Karella
Adam Harmanec
J. Kotera
Jan Blažek
F. Šroubek
72
1
0
06 Nov 2024
Is Smoothness the Key to Robustness? A Comparison of Attention and Convolution Models Using a Novel Metric
Baiyuan Chen
MLT
90
0
0
23 Oct 2024
Spatiotemporal Object Detection for Improved Aerial Vehicle Detection in Traffic Monitoring
Kristina Telegraph
Christos Kyrkou
ObjD
26
0
0
17 Oct 2024
Learning to rumble: Automated elephant call classification, detection and endpointing using deep architectures
Christiaan M. Geldenhuys
Thomas R. Niesler
29
0
0
15 Oct 2024
UnSeGArmaNet: Unsupervised Image Segmentation using Graph Neural Networks with Convolutional ARMA Filters
Kovvuri Sai Gopal Reddy
Bodduluri Saran
A. M. Adityaja
Saurabh J. Shigwan
Nitin Kumar
Snehasis Mukherjee
110
1
0
08 Oct 2024
Mutually-Aware Feature Learning for Few-Shot Object Counting
Yerim Jeon
Subeen Lee
Jihwan Kim
Jae-Pil Heo
96
1
0
19 Aug 2024
MCPDepth: Omnidirectional Depth Estimation via Stereo Matching from Multi-Cylindrical Panoramas
Feng Qiao
Zhexiao Xiong
Xinge Zhu
Yuexin Ma
Qiumeng He
Nathan Jacobs
MDE
71
1
0
03 Aug 2024
Leaf Angle Estimation using Mask R-CNN and LETR Vision Transformer
Venkat Margapuri
Prapti Thapaliya
Trevor Rife
28
0
0
01 Aug 2024
Cross-Layer Feature Pyramid Transformer for Small Object Detection in Aerial Images
Zewen Du
Zhenjiang Hu
Guiyu Zhao
Ying Jin
Hongbin Ma
ViT
114
4
0
29 Jul 2024
Rethinking Attention Module Design for Point Cloud Analysis
Chengzhi Wu
Kaige Wang
Zeyun Zhong
Hao Fu
Junwei Zheng
Jiaming Zhang
Julius Pfrommer
Jürgen Beyerer
3DPC
109
2
0
27 Jul 2024
SwinSF: Image Reconstruction from Spatial-Temporal Spike Streams
Liangyan Jiang
Chuang Zhu
Yanxu Chen
90
2
0
22 Jul 2024
Learning Natural Consistency Representation for Face Forgery Video Detection
Daichi Zhang
Zihao Xiao
Shikun Li
Fanzhao Lin
Jianmin Li
Shiming Ge
CVBM
103
13
0
15 Jul 2024
HiT-SR: Hierarchical Transformer for Efficient Image Super-Resolution
Xiang Zhang
Yulun Zhang
Fisher Yu
88
23
0
08 Jul 2024
SCSA: Exploring the Synergistic Effects Between Spatial and Channel Attention
Yunzhong Si
Huiying Xu
Xinzhong Zhu
Wenhao Zhang
Yao Dong
Yuxing Chen
Hongbo Li
114
36
0
06 Jul 2024
Fibottention: Inceptive Visual Representation Learning with Diverse Attention Across Heads
Ali Khaleghi Rahimian
Manish Kumar Govind
Subhajit Maity
Dominick Reilly
Christian Kummerle
Srijan Das
A. Dutta
78
1
0
27 Jun 2024
Few-Shot Medical Image Segmentation with High-Fidelity Prototypes
Song Tang
Shaxu Yan
Xiaozhi Qi
Jianxin Gao
Mao Ye
Jianwei Zhang
Xiatian Zhu
103
0
0
26 Jun 2024
The Balanced-Pairwise-Affinities Feature Transform
Daniel Shalam
Simon Korman
108
1
0
25 Jun 2024
AdaNCA: Neural Cellular Automata As Adaptors For More Robust Vision Transformer
Yitao Xu
Tong Zhang
Sabine Süsstrunk
ViT
86
1
0
12 Jun 2024
SYM3D: Learning Symmetric Triplanes for Better 3D-Awareness of GANs
Jing Yang
Kyle Fogarty
Fangcheng Zhong
Cengiz Oztireli
95
1
0
10 Jun 2024
Convolution and Attention-Free Mamba-based Cardiac Image Segmentation
Abbas Khan
Muhammad Asad
Martin Benning
C. Roney
Gregory Slabaugh
Mamba
71
4
0
09 Jun 2024
SDL-MVS: View Space and Depth Deformable Learning Paradigm for Multi-View Stereo Reconstruction in Remote Sensing
Yongqiang Mao
Hanbo Bi
Liangyu Xu
Kaiqiang Chen
Zhirui Wang
Xian Sun
Kun Fu
46
4
0
27 May 2024
Steerable Transformers for Volumetric Data
Soumyabrata Kundu
Risi Kondor
LLMSV, ViT
114
1
0
24 May 2024
Neighborhood Attention Transformer with Progressive Channel Fusion for Speaker Verification
Nian Li
Jianguo Wei
ViT
71
0
0
20 May 2024
Compression-Realized Deep Structural Network for Video Quality Enhancement
Hanchi Sun
Xiaohong Liu
Xinyang Jiang
Yifei Shen
Dongsheng Li
Xiongkuo Min
Guangtao Zhai
81
1
0
10 May 2024
UnSegGNet: Unsupervised Image Segmentation using Graph Neural Networks
Kovvuri Sai
Bodduluri Saran
A. M. Adityaja
Saurabh J. Shigwan
Nitin Kumar
60
1
0
09 May 2024
CSA-Net: Channel-wise Spatially Autocorrelated Attention Networks
Nick Nikzad
Yongsheng Gao
Jun Zhou
66
1
0
09 May 2024
Transformer-Enhanced Motion Planner: Attention-Guided Sampling for State-Specific Decision Making
Zhuang Lei
Jingdong Zhao
Yuntao Li
Zichun Xu
Liangliang Zhao
Hong Liu
65
1
0
30 Apr 2024
CA-Stream: Attention-based pooling for interpretable image recognition
Felipe Torres
Hanwei Zhang
R. Sicre
Stéphane Ayache
Yannis Avrithis
82
1
0
23 Apr 2024
When Medical Imaging Met Self-Attention: A Love Story That Didn't Quite Work Out
Tristan Piater
Niklas Penzel
Gideon Stein
Joachim Denzler
79
2
0
18 Apr 2024
Computer-Aided Diagnosis of Thoracic Diseases in Chest X-rays using hybrid CNN-Transformer Architecture
Sonit Singh
MedIm, ViT
53
1
0
18 Apr 2024