2303.01542
Self-attention in Vision Transformers Performs Perceptual Grouping, Not Attention
2 March 2023
Paria Mehrani
John K. Tsotsos
Papers citing
"Self-attention in Vision Transformers Performs Perceptual Grouping, Not Attention"
28 / 28 papers shown
Title
A recurrent vision transformer shows signatures of primate visual attention
Jonathan Morgan
Badr Albanna
James P. Herman
94
0
0
16 Feb 2025
Dissecting Query-Key Interaction in Vision Transformers
Xu Pan
Aaron Philip
Ziqian Xie
Odelia Schwartz
101
1
0
04 Apr 2024
RDRN: Recursively Defined Residual Network for Image Super-Resolution
Alexander Panaetov
Karim Elhadji Daou
Igor Samenko
Evgeny Tetin
Ilya A Ivanov
SupR
46
3
0
17 Nov 2022
A Benchmark for Compositional Visual Reasoning
Aimen Zerroug
Mohit Vaishnav
Julien Colin
Sebastian Musslick
Thomas Serre
OCL
CoGe
68
34
0
11 Jun 2022
Understanding The Robustness in Vision Transformers
Daquan Zhou
Zhiding Yu
Enze Xie
Chaowei Xiao
Anima Anandkumar
Jiashi Feng
J. Álvarez
ViT
140
191
0
26 Apr 2022
A ConvNet for the 2020s
Zhuang Liu
Hanzi Mao
Chao-Yuan Wu
Christoph Feichtenhofer
Trevor Darrell
Saining Xie
ViT
178
5,213
0
10 Jan 2022
ConvNets vs. Transformers: Whose Visual Representations are More Transferable?
Hong-Yu Zhou
Chi-Ken Lu
Sibei Yang
Yizhou Yu
ViT
64
54
0
11 Aug 2021
Vision Transformer with Progressive Sampling
Xiaoyu Yue
Shuyang Sun
Zhanghui Kuang
Meng Wei
Philip Torr
Wayne Zhang
Dahua Lin
ViT
78
85
0
03 Aug 2021
Early Convolutions Help Transformers See Better
Tete Xiao
Mannat Singh
Eric Mintun
Trevor Darrell
Piotr Dollár
Ross B. Girshick
55
771
0
28 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
289
2,841
0
15 Jun 2021
CoAtNet: Marrying Convolution and Attention for All Data Sizes
Zihang Dai
Hanxiao Liu
Quoc V. Le
Mingxing Tan
ViT
126
1,208
0
09 Jun 2021
Vision Transformers are Robust Learners
Sayak Paul
Pin-Yu Chen
ViT
67
312
0
17 May 2021
Are Convolutional Neural Networks or Transformers more like human vision?
Shikhar Tuli
Ishita Dasgupta
Erin Grant
Thomas Griffiths
ViT
FaML
59
185
0
15 May 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
708
6,127
0
29 Apr 2021
CvT: Introducing Convolutions to Vision Transformers
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
ViT
154
1,917
0
29 Mar 2021
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
Chun-Fu Chen
Quanfu Fan
Yikang Shen
ViT
71
1,483
0
27 Mar 2021
Understanding Robustness of Transformers for Image Classification
Srinadh Bhojanapalli
Ayan Chakrabarti
Daniel Glasner
Daliang Li
Thomas Unterthiner
Andreas Veit
ViT
93
386
0
26 Mar 2021
ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases
Stéphane d'Ascoli
Hugo Touvron
Matthew L. Leavitt
Ari S. Morcos
Giulio Biroli
Levent Sagun
ViT
131
833
0
19 Mar 2021
Bottleneck Transformers for Visual Recognition
A. Srinivas
Nayeon Lee
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
SLR
360
994
0
27 Jan 2021
A Survey on Visual Transformer
Kai Han
Yunhe Wang
Hanting Chen
Xinghao Chen
Jianyuan Guo
...
Chunjing Xu
Yixing Xu
Zhaohui Yang
Yiman Zhang
Dacheng Tao
ViT
205
2,245
0
23 Dec 2020
Explicitly Modeled Attention Maps for Image Classification
Andong Tan
D. Nguyen
Maximilian Dax
Matthias Nießner
Thomas Brox
52
8
0
14 Jun 2020
Do Saliency Models Detect Odd-One-Out Targets? New Datasets and Evaluations
Iuliia Kotseruba
C. Wloka
Amir Rasouli
John K. Tsotsos
64
19
0
13 May 2020
Quantifying Attention Flow in Transformers
Samira Abnar
Willem H. Zuidema
169
802
0
02 May 2020
On the Relationship between Self-Attention and Convolutional Layers
Jean-Baptiste Cordonnier
Andreas Loukas
Martin Jaggi
116
535
0
08 Nov 2019
Structured Attention Networks
Yoon Kim
Carl Denton
Luong Hoang
Alexander M. Rush
118
463
0
03 Feb 2017
PsyPhy: A Psychophysics Driven Evaluation Framework for Visual Recognition
Brandon RichardWebster
Samuel E. Anthony
Walter J. Scheirer
43
72
0
19 Nov 2016
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
348
10,079
0
10 Feb 2015
Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition
C. Cadieu
Ha Hong
Daniel L. K. Yamins
Nicolas Pinto
Diego Ardila
E. Solomon
N. Majaj
J. DiCarlo
94
787
0
12 Jun 2014