2309.16588
Vision Transformers Need Registers
28 September 2023
Timothée Darcet
Maxime Oquab
Julien Mairal
Piotr Bojanowski
ViT
Papers citing
"Vision Transformers Need Registers"
40 / 240 papers shown
Title
Situation Awareness for Driver-Centric Driving Style Adaptation
Johann Haselberger
Bonifaz Stuhr
Bernhard Schick
Steffen Müller
37
1
0
28 Mar 2024
CoDA: Instructive Chain-of-Domain Adaptation with Severity-Aware Visual Prompt Tuning
Ziyang Gong
Fuhao Li
Yupeng Deng
Deblina Bhattacharjee
Xianzheng Ma
Xiangwei Zhu
Zhenming Ji
73
9
0
26 Mar 2024
Continuous, Subject-Specific Attribute Control in T2I Models by Identifying Semantic Directions
S. A. Baumann
Felix Krause
Michael Neumayr
Nick Stracke
Vincent Tao Hu
Björn Ommer
DiffM
LM&Ro
70
11
0
25 Mar 2024
Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation
Mu Hu
Wei Yin
C. Zhang
Zhipeng Cai
Xiaoxiao Long
Kaixuan Wang
Gang Yu
Chunhua Shen
Shaojie Shen
3DGS
54
116
0
22 Mar 2024
LUWA Dataset: Learning Lithic Use-Wear Analysis on Microscopic Images
Jing Zhang
Irving Fang
Juexiao Zhang
Hao Wu
Akshat Kaushik
Alice Rodriguez
Hanwen Zhao
Zhuo Zheng
Radu Iovita
Chen Feng
24
3
0
19 Mar 2024
TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models
Lisa Weijler
Muhammad Jehanzeb Mirza
Leon Sick
Can Ekkazan
Pedro Hermosilla
TTA
41
0
0
18 Mar 2024
Conditional computation in neural networks: principles and research trends
Simone Scardapane
Alessandro Baiocchi
Alessio Devoto
V. Marsocci
Pasquale Minervini
Jary Pomponi
34
1
0
12 Mar 2024
ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes
H. Malik
Muhammad Huzaifa
Muzammal Naseer
Salman Khan
Fahad Shahbaz Khan
DiffM
40
2
0
07 Mar 2024
ComFe: An Interpretable Head for Vision Transformers
Evelyn J. Mannix
Howard Bondell
VLM
ViT
26
1
0
07 Mar 2024
HyenaPixel: Global Image Context with Convolutions
Julian Spravil
Sebastian Houben
Sven Behnke
31
1
0
29 Feb 2024
Massive Activations in Large Language Models
Mingjie Sun
Xinlei Chen
J. Zico Kolter
Zhuang Liu
71
68
0
27 Feb 2024
GROUNDHOG: Grounding Large Language Models to Holistic Segmentation
Yichi Zhang
Ziqiao Ma
Xiaofeng Gao
Suhaila Shakiah
Qiaozi Gao
Joyce Chai
MLLM
VLM
42
39
0
26 Feb 2024
General Purpose Image Encoder DINOv2 for Medical Image Registration
Xin Song
Xuanang Xu
Pingkun Yan
MedIm
38
5
0
24 Feb 2024
Attention-aware Semantic Communications for Collaborative Inference
Jiwoong Im
Nayoung Kwon
Taewoo Park
Jiheon Woo
Jaeho Lee
Yongjune Kim
46
2
0
23 Feb 2024
Convincing Rationales for Visual Question Answering Reasoning
Kun Li
G. Vosselman
Michael Ying Yang
44
1
0
06 Feb 2024
FindingEmo: An Image Dataset for Emotion Recognition in the Wild
Laurent Mertens
E. Yargholi
H. O. D. Beeck
Jan Van den Stock
Joost Vennekens
VLM
33
4
0
02 Feb 2024
Understanding Video Transformers via Universal Concept Discovery
M. Kowal
Achal Dave
Rares Ambrus
Adrien Gaidon
Konstantinos G. Derpanis
P. Tokmakov
ViT
37
8
0
19 Jan 2024
Multi-Track Timeline Control for Text-Driven 3D Human Motion Generation
Mathis Petrovich
Or Litany
Umar Iqbal
Michael J. Black
Gül Varol
Xue Bin Peng
Davis Rempe
DiffM
VGen
37
40
0
16 Jan 2024
RudolfV: A Foundation Model by Pathologists for Pathologists
Jonas Dippel
Barbara Feulner
Tobias Winterhoff
Timo Milbich
Stephan Tietz
...
David Horst
Lukas Ruff
Klaus-Robert Müller
Frederick Klauschen
Maximilian Alber
36
29
0
08 Jan 2024
Analyzing Local Representations of Self-supervised Vision Transformers
Ani Vanyan
Alvard Barseghyan
Hakob Tamazyan
Vahan Huroyan
Hrant Khachatrian
Martin Danelljan
42
3
0
31 Dec 2023
SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation
Yuxuan Zhang
Yiren Song
Jiaming Liu
Rui Wang
Jinpeng Yu
...
Huaxia Li
Xu Tang
Yao Hu
Han Pan
Zhongliang Jing
43
58
0
26 Dec 2023
CLIP-DINOiser: Teaching CLIP a few DINO tricks for open-vocabulary semantic segmentation
Monika Wysoczańska
Oriane Siméoni
Michael Ramamonjisoa
Andrei Bursuc
Tomasz Trzciński
Patrick Pérez
VLM
CLIP
34
29
0
19 Dec 2023
Open Vocabulary Semantic Scene Sketch Understanding
Ahmed Bourouis
Judith E. Fan
Yulia Gryaditskaya
VLM
3DV
23
1
0
18 Dec 2023
Diffusion Illusions: Hiding Images in Plain Sight
R. Burgert
Xiang Li
Abe Leite
Kanchana Ranasinghe
Michael S. Ryoo
50
17
0
06 Dec 2023
Class-Discriminative Attention Maps for Vision Transformers
L. Brocki
Jakub Binda
N. C. Chung
MedIm
30
3
0
04 Dec 2023
FoundPose: Unseen Object Pose Estimation with Foundation Features
Evin Pınar Örnek
Yann Labbé
Bugra Tekin
Lingni Ma
Cem Keskin
Christian Forster
Tomáš Hodaň
30
48
0
30 Nov 2023
Emergent Open-Vocabulary Semantic Segmentation from Off-the-shelf Vision-Language Models
Jiayun Luo
Siddhesh Khandelwal
Leonid Sigal
Boyang Albert Li
MLLM
VLM
35
7
0
28 Nov 2023
Frozen Transformers in Language Models Are Effective Visual Encoder Layers
Ziqi Pang
Ziyang Xie
Yunze Man
Yu-xiong Wang
50
25
0
19 Oct 2023
Guiding Language Model Math Reasoning with Planning Tokens
Xinyi Wang
Lucas Caccia
O. Ostapenko
Xingdi Yuan
William Yang Wang
Alessandro Sordoni
LRM
33
19
0
09 Oct 2023
Think before you speak: Training Language Models With Pause Tokens
Sachin Goyal
Ziwei Ji
A. S. Rawat
A. Menon
Sanjiv Kumar
Vaishnavh Nagarajan
LRM
22
95
0
03 Oct 2023
Dynamic Attention-Guided Diffusion for Image Super-Resolution
Brian B. Moser
Stanislav Frolov
Federico Raue
Sebastián M. Palacio
Andreas Dengel
DiffM
32
3
0
15 Aug 2023
PerceptionCLIP: Visual Classification by Inferring and Conditioning on Contexts
Bang An
Sicheng Zhu
Michael-Andrei Panaitescu-Liess
Chaithanya Kumar Mummadi
Furong Huang
VLM
30
7
0
02 Aug 2023
CoTracker: It is Better to Track Together
Nikita Karaev
Ignacio Rocco
Benjamin Graham
Natalia Neverova
Andrea Vedaldi
Christian Rupprecht
VOT
ViT
51
246
0
14 Jul 2023
OpenVIS: Open-vocabulary Video Instance Segmentation
Pinxue Guo
Tony Huang
Peiyang He
Xuefeng Liu
Tianjun Xiao
Zhaoyu Chen
Wenqiang Zhang
VLM
33
16
0
26 May 2023
Training-Free Acceleration of ViTs with Delayed Spatial Merging
J. Heo
Seyedarmin Azizi
A. Fayyazi
Massoud Pedram
38
3
0
04 Mar 2023
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
305
7,443
0
11 Nov 2021
ViDT: An Efficient and Effective Fully Transformer-based Object Detector
Hwanjun Song
Deqing Sun
Sanghyuk Chun
Varun Jampani
Dongyoon Han
Byeongho Heo
Wonjae Kim
Ming-Hsuan Yang
87
76
0
08 Oct 2021
Localizing Objects with Self-Supervised Transformers and no Labels
Oriane Siméoni
Gilles Puy
Huy V. Vo
Simon Roburin
Spyros Gidaris
Andrei Bursuc
P. Pérez
Renaud Marlet
Jean Ponce
ViT
180
196
0
29 Sep 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
320
5,785
0
29 Apr 2021
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
296
39,198
0
01 Sep 2014