2308.16271
Cited By
Emergence of Segmentation with Minimalistic White-Box Transformers
30 August 2023
Yaodong Yu
Tianzhe Chu
Shengbang Tong
Ziyang Wu
Druv Pai
Sam Buchanan
Y. Ma
ViT
Papers citing
"Emergence of Segmentation with Minimalistic White-Box Transformers"
20 / 20 papers shown
Title
Register and CLS tokens yield a decoupling of local and global features in large ViTs
Alexander Lappe
M. Giese
24
0
0
09 May 2025
EDIT: Enhancing Vision Transformers by Mitigating Attention Sink through an Encoder-Decoder Architecture
Wenfeng Feng
Guoying Sun
31
0
0
09 Apr 2025
Rethinking Decoders for Transformer-based Semantic Segmentation: A Compression Perspective
Qishuai Wen
Chun-Guang Li
ViT
37
0
0
05 Nov 2024
A Global Geometric Analysis of Maximal Coding Rate Reduction
Peng Wang
Huikang Liu
Druv Pai
Yaodong Yu
Zhihui Zhu
Q. Qu
Yi Ma
34
6
0
04 Jun 2024
Scaling White-Box Transformers for Vision
Jinrui Yang
Xianhang Li
Druv Pai
Yuyin Zhou
Yi Ma
Yaodong Yu
Cihang Xie
ViT
44
9
0
30 May 2024
U-Nets as Belief Propagation: Efficient Classification, Denoising, and Diffusion in Generative Hierarchical Models
Song Mei
3DV
AI4CE
DiffM
43
11
0
29 Apr 2024
Masked Completion via Structured Diffusion with White-Box Transformers
Druv Pai
Ziyang Wu
Sam Buchanan
Yaodong Yu
Yi Ma
35
13
0
03 Apr 2024
Neural Clustering based Visual Representation Learning
Guikun Chen
Xia Li
Yi Yang
Wenguan Wang
SSL
37
8
0
26 Mar 2024
FAST: Factorizable Attention for Speeding up Transformers
Armin Gerami
Monte Hoover
P. S. Dulepet
R. Duraiswami
29
0
0
12 Feb 2024
SPFormer: Enhancing Vision Transformer with Superpixel Representation
Jieru Mei
Liang-Chieh Chen
Alan L. Yuille
Cihang Xie
ViT
MDE
21
4
0
05 Jan 2024
SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference
Feng Wang
Jieru Mei
Alan L. Yuille
VLM
29
55
0
04 Dec 2023
Meta-Prior: Meta learning for Adaptive Inverse Problem Solvers
M. Terris
Thomas Moreau
29
0
0
30 Nov 2023
Deep Tensor Network
Yifan Zhang
32
0
0
18 Nov 2023
Sub-token ViT Embedding via Stochastic Resonance Transformers
Dong Lao
Yangchao Wu
Tian Yu Liu
Alex Wong
Stefano Soatto
VOS
30
4
0
06 Oct 2023
Vision Transformers Need Registers
Timothée Darcet
Maxime Oquab
Julien Mairal
Piotr Bojanowski
ViT
59
312
0
28 Sep 2023
Deep Networks as Denoising Algorithms: Sample-Efficient Learning of Diffusion Models in High-Dimensional Graphical Models
Song Mei
Yuchen Wu
DiffM
31
26
0
20 Sep 2023
In-context Learning and Induction Heads
Catherine Olsson
Nelson Elhage
Neel Nanda
Nicholas Joseph
Nova DasSarma
...
Tom B. Brown
Jack Clark
Jared Kaplan
Sam McCandlish
C. Olah
250
460
0
24 Sep 2022
On the Principles of Parsimony and Self-Consistency for the Emergence of Intelligence
Y. Ma
Doris Y. Tsao
H. Shum
67
75
0
11 Jul 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
308
7,443
0
11 Nov 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
335
5,785
0
29 Apr 2021