ResearchTrend.AI
What do Vision Transformers Learn? A Visual Exploration

arXiv:2212.06727 · 13 December 2022
Amin Ghiasi, Hamid Kazemi, Eitan Borgnia, Steven Reich, Manli Shu, Micah Goldblum, A. Wilson, Tom Goldstein · ViT

Papers citing "What do Vision Transformers Learn? A Visual Exploration"

46 / 46 papers shown

• Towards Scalable Foundation Model for Multi-modal and Hyperspectral Geospatial Data
  Haozhe Si, Yuxuan Wan, Minh Do, Deepak Vasisht, Han Zhao, Hendrik Hamann · 17 Mar 2025
• Discovering Influential Neuron Path in Vision Transformers
  Yifan Wang, Yifei Liu, Yingdong Shi, Chong Li, Anqi Pang, Sibei Yang, Jingyi Yu, Kan Ren · ViT · 12 Mar 2025
• Sparse Autoencoders for Scientifically Rigorous Interpretation of Vision Models
  Samuel Stevens, Wei-Lun Chao, T. Berger-Wolf, Yu-Chuan Su · VLM · 10 Feb 2025
• Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment
  Harrish Thasarathan, Julian Forsyth, Thomas Fel, M. Kowal, Konstantinos G. Derpanis · 06 Feb 2025
• Incorporating Feature Pyramid Tokenization and Open Vocabulary Semantic Segmentation
  J. Zhang, Li Zhang, Shijian Li · VLM · 18 Dec 2024
• Memory Efficient Matting with Adaptive Token Routing
  Yiheng Lin, Yihan Hu, Chenyi Zhang, Ting Liu, Xiaochao Qu, Luoqi Liu, Yao Zhao, Y. X. Wei · 14 Dec 2024
• Asynchronous Feedback Network for Perceptual Point Cloud Quality Assessment
  Yujie Zhang, Qi Yang, Ziyu Shan, Yiling Xu · 3DPC · 13 Jul 2024
• IRSAM: Advancing Segment Anything Model for Infrared Small Target Detection
  Mingjin Zhang, Yuchun Wang, Jie-Ru Guo, Yunsong Li, Xinbo Gao, Jing Zhang · VLM · 10 Jul 2024
• Visualize and Paint GAN Activations
  Rudolf Herdt, Peter Maass · GAN, FAtt · 24 May 2024
• BoQ: A Place is Worth a Bag of Learnable Queries
  A. Ali-bey, B. Chaib-draa, Philippe Giguère · 12 May 2024
• Examining Changes in Internal Representations of Continual Learning Models Through Tensor Decomposition
  Nishant Suresh Aswani, Amira Guesmi, Muhammad Abdullah Hanif, Muhammad Shafique · CLL · 06 May 2024
• Saliency Suppressed, Semantics Surfaced: Visual Transformations in Neural Networks and the Brain
  Gustaw Opielka, Jessica Loke, Steven Scholte · 29 Apr 2024
• EIVEN: Efficient Implicit Attribute Value Extraction using Multimodal LLM
  Henry Peng Zou, Gavin Heqing Yu, Ziwei Fan, Dan Bu, Han Liu, Peng Dai, Dongmei Jia, Cornelia Caragea · 13 Apr 2024
• Dissecting Query-Key Interaction in Vision Transformers
  Xu Pan, Aaron Philip, Ziqian Xie, Odelia Schwartz · 04 Apr 2024
• If CLIP Could Talk: Understanding Vision-Language Model Representations Through Their Preferred Concept Descriptions
  Reza Esfandiarpoor, Cristina Menghini, Stephen H. Bach · CoGe, VLM · 25 Mar 2024
• LiFT: A Surprisingly Simple Lightweight Feature Transform for Dense ViT Descriptors
  Saksham Suri, Matthew Walmer, Kamal Gupta, Abhinav Shrivastava · 21 Mar 2024
• What do we learn from inverting CLIP models?
  Hamid Kazemi, Atoosa Malemir Chegini, Jonas Geiping, S. Feizi, Tom Goldstein · 05 Mar 2024
• Feature Accentuation: Revealing 'What' Features Respond to in Natural Images
  Christopher Hamblin, Thomas Fel, Srijani Saha, Talia Konkle, George A. Alvarez · FAtt · 15 Feb 2024
• LHRS-Bot: Empowering Remote Sensing with VGI-Enhanced Large Multimodal Language Model
  Dilxat Muhtar, Zhenshi Li, Feng-Xue Gu, Xue-liang Zhang, P. Xiao · 04 Feb 2024
• Understanding Video Transformers via Universal Concept Discovery
  M. Kowal, Achal Dave, Rares Ambrus, Adrien Gaidon, Konstantinos G. Derpanis, P. Tokmakov · ViT · 19 Jan 2024
• Explainable Multi-Camera 3D Object Detection with Transformer-Based Saliency Maps
  Till Beemelmanns, Wassim Zahr, Lutz Eckstein · 22 Dec 2023
• TagCLIP: A Local-to-Global Framework to Enhance Open-Vocabulary Multi-Label Classification of CLIP Without Training
  Yuqi Lin, Minghao Chen, Kaipeng Zhang, Hengjia Li, Mingming Li, Zheng Yang, Dongqin Lv, Binbin Lin, Haifeng Liu, Deng Cai · CLIP, VLM · 20 Dec 2023
• Multimodal Pretraining of Medical Time Series and Notes
  Ryan N. King, Tianbao Yang, Bobak J. Mortazavi · 11 Dec 2023
• ViT-Lens: Towards Omni-modal Representations
  Weixian Lei, Yixiao Ge, Kun Yi, Jianfeng Zhang, Difei Gao, Dylan Sun, Yuying Ge, Ying Shan, Mike Zheng Shou · 27 Nov 2023
• Explainability of Vision Transformers: A Comprehensive Review and New Perspectives
  Rojina Kashefi, Leili Barekatain, Mohammad Sabokrou, Fatemeh Aghaeipoor · ViT · 12 Nov 2023
• Battle of the Backbones: A Large-Scale Comparison of Pretrained Models across Computer Vision Tasks
  Micah Goldblum, Hossein Souri, Renkun Ni, Manli Shu, Viraj Prabhu, ..., Adrien Bardes, Judy Hoffman, Ramalingam Chellappa, Andrew Gordon Wilson, Tom Goldstein · VLM · 30 Oct 2023
• Analyzing Vision Transformers for Image Classification in Class Embedding Space
  Martina G. Vilas, Timothy Schaumlöffel, Gemma Roig · ViT · 29 Oct 2023
• NeuroInspect: Interpretable Neuron-based Debugging Framework through Class-conditional Visualizations
  Yeong-Joon Ju, Ji-Hoon Park, Seong-Whan Lee · AAML · 11 Oct 2023
• SPADE: Sparsity-Guided Debugging for Deep Neural Networks
  Arshia Soltani Moakhar, Eugenia Iofinova, Elias Frantar, Dan Alistarh · 06 Oct 2023
• Out-of-Distribution Detection by Leveraging Between-Layer Transformation Smoothness
  Fran Jelenić, Josip Jukić, Martin Tutek, Mate Puljiz, Jan Šnajder · OODD · 04 Oct 2023
• Channel Vision Transformers: An Image Is Worth 1 x 16 x 16 Words
  Yu Bao, Srinivasan Sivanandan, Theofanis Karaletsos · ViT · 28 Sep 2023
• Learning to Generate Training Datasets for Robust Semantic Segmentation
  Marwane Hariat, Olivier Laurent, Rémi Kazmierczak, Shihao Zhang, Andrei Bursuc, Angela Yao, Gianni Franchi · UQCV · 01 Aug 2023
• Scale Alone Does not Improve Mechanistic Interpretability in Vision Models
  Roland S. Zimmermann, Thomas Klein, Wieland Brendel · 11 Jul 2023
• Sketch-A-Shape: Zero-Shot Sketch-to-3D Shape Generation
  Aditya Sanghi, P. Jayaraman, Arianna Rampini, Joseph Lambourne, Hooman Shayani, Evan Atherton, Saeid Asgari Taghanaki · 3DV · 08 Jul 2023
• Unlocking Feature Visualization for Deeper Networks with MAgnitude Constrained Optimization
  Thomas Fel, Thibaut Boissin, Victor Boutin, Agustin Picard, Paul Novello, ..., Drew Linsley, Tom Rousseau, Rémi Cadène, Laurent Gardes, Thomas Serre · FAtt · 11 Jun 2023
• Segment Anything in High Quality
  Lei Ke, Mingqiao Ye, Martin Danelljan, Yifan Liu, Yu-Wing Tai, Chi-Keung Tang, F. I. F. Richard Yu · VLM · 02 Jun 2023
• Contextual Vision Transformers for Robust Representation Learning
  Yu Bao, Theofanis Karaletsos · ViT · 30 May 2023
• Cinematic Mindscapes: High-quality Video Reconstruction from Brain Activity
  Zijiao Chen, Jiaxin Qing, J. Zhou · DiffM, VGen · 19 May 2023
• AttentionViz: A Global View of Transformer Attention
  Catherine Yeh, Yida Chen, Aoyu Wu, Cynthia Chen, Fernanda Viégas, Martin Wattenberg · ViT · 04 May 2023
• A Cookbook of Self-Supervised Learning
  Randall Balestriero, Mark Ibrahim, Vlad Sobal, Ari S. Morcos, Shashank Shekhar, ..., Pierre Fernandez, Amir Bar, Hamed Pirsiavash, Yann LeCun, Micah Goldblum · SyDa, FedML, SSL · 24 Apr 2023
• Self-attention in Vision Transformers Performs Perceptual Grouping, Not Attention
  Paria Mehrani, John K. Tsotsos · 02 Mar 2023
• Single Cells Are Spatial Tokens: Transformers for Spatial Transcriptomic Data Imputation
  Haifang Wen, Wenzhuo Tang, Wei Jin, Jiayuan Ding, Renming Liu, Xinnan Dai, Feng Shi, Lulu Shang, Jiliang Tang, Yuying Xie · 06 Feb 2023
• Teaching Matters: Investigating the Role of Supervision in Vision Transformers
  Matthew Walmer, Saksham Suri, Kamal Gupta, Abhinav Shrivastava · 07 Dec 2022
• Plug-In Inversion: Model-Agnostic Inversion for Vision with Data Augmentations
  Amin Ghiasi, Hamid Kazemi, Steven Reich, Chen Zhu, Micah Goldblum, Tom Goldstein · 31 Jan 2022
• Masked Autoencoders Are Scalable Vision Learners
  Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross B. Girshick · ViT, TPM · 11 Nov 2021
• Intriguing Properties of Vision Transformers
  Muzammal Naseer, Kanchana Ranasinghe, Salman Khan, Munawar Hayat, F. Khan, Ming-Hsuan Yang · ViT · 21 May 2021