2104.14294
v1
v2 (latest)
Emerging Properties in Self-Supervised Vision Transformers
29 April 2021
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
Papers citing
"Emerging Properties in Self-Supervised Vision Transformers"
50 / 4,176 papers shown
Title
NViST: In the Wild New View Synthesis from a Single Image with Transformers
Wonbong Jang
Lourdes Agapito
ViT
89
10
0
13 Dec 2023
Efficient-NeRF2NeRF: Streamlining Text-Driven 3D Editing with Multiview Correspondence-Enhanced Diffusion Models
Liangchen Song
Liangliang Cao
Jiatao Gu
Yifan Jiang
Junsong Yuan
Hao Tang
DiffM
88
15
0
13 Dec 2023
PAD: Self-Supervised Pre-Training with Patchwise-Scale Adapter for Infrared Images
Tao Zhang
Kun Ding
Jinyong Wen
Yu Xiong
Zeyu Zhang
Shiming Xiang
Chunhong Pan
55
3
0
13 Dec 2023
uSF: Learning Neural Semantic Field with Uncertainty
Vsevolod Skorokhodov
Darya Drozdova
Dmitry Yudin
114
0
0
13 Dec 2023
DTL: Disentangled Transfer Learning for Visual Recognition
Minghao Fu
Ke Zhu
Jianxin Wu
106
19
0
13 Dec 2023
Foundation Models in Robotics: Applications, Challenges, and the Future
Roya Firoozi
Johnathan Tucker
Stephen Tian
Anirudha Majumdar
Jiankai Sun
...
Brian Ichter
Danny Driess
Jiajun Wu
Cewu Lu
Mac Schwager
LM&Ro
AI4CE
LRM
VLM
111
161
0
13 Dec 2023
A Foundational Multimodal Vision Language AI Assistant for Human Pathology
Ming Y. Lu
Bowen Chen
Drew F. K. Williamson
Richard J. Chen
Kenji Ikamura
...
Ivy Liang
L. Le
Tong Ding
Anil V. Parwani
Faisal Mahmood
MedIm
LM&MA
86
23
0
13 Dec 2023
CLIP as RNN: Segment Countless Visual Concepts without Training Endeavor
Shuyang Sun
Runjia Li
Philip Torr
Xiuye Gu
Siyang Li
VLM
CLIP
140
34
0
12 Dec 2023
FreeInit: Bridging Initialization Gap in Video Diffusion Models
Tianxing Wu
Chenyang Si
Yuming Jiang
Ziqi Huang
Ziwei Liu
DiffM
VGen
84
53
0
12 Dec 2023
FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition
Sicheng Mo
Fangzhou Mu
Kuan Heng Lin
Yanli Liu
Bochen Guan
Yin Li
Bolei Zhou
DiffM
107
67
0
12 Dec 2023
Weakly Supervised 3D Object Detection via Multi-Level Visual Guidance
Kuan-Chih Huang
Yi-Hsuan Tsai
Ming-Hsuan Yang
3DPC
86
4
0
12 Dec 2023
Exploring Plain ViT Reconstruction for Multi-class Unsupervised Anomaly Detection
Jiangning Zhang
Xuhai Chen
Yabiao Wang
Chengjie Wang
Yong Liu
Xiangtai Li
Ming-Hsuan Yang
Dacheng Tao
130
28
0
12 Dec 2023
NearbyPatchCL: Leveraging Nearby Patches for Self-Supervised Patch-Level Multi-Class Classification in Whole-Slide Images
Gia-Bao Le
Van-Tien Nguyen
Trung-Truc Huynh-Le
Minh-Triet Tran
74
1
0
12 Dec 2023
Boosting Latent Diffusion with Flow Matching
Johannes S. Fischer
Ming Gui
Pingchuan Ma
Nick Stracke
S. A. Baumann
Bjorn Ommer
104
24
0
12 Dec 2023
CLIP in Medical Imaging: A Comprehensive Survey
Zihao Zhao
Yuxiao Liu
Han Wu
Yonghao Li
Sheng Wang
L. Teng
Disheng Liu
Zhiming Cui
Qian Wang
Dinggang Shen
CLIP
MedIm
LM&MA
VLM
158
43
0
12 Dec 2023
Expand-and-Quantize: Unsupervised Semantic Segmentation Using High-Dimensional Space and Product Quantization
Jiyoung Kim
Kyuhong Shim
Insu Lee
B. Shim
71
2
0
12 Dec 2023
Learned representation-guided diffusion models for large-image generation
Alexandros Graikos
Srikar Yellapragada
Minh-Quan Le
S. Kapse
Prateek Prasanna
Joel H. Saltz
Dimitris Samaras
DiffM
104
30
0
12 Dec 2023
Benchmarking Pretrained Vision Embeddings for Near- and Duplicate Detection in Medical Images
Tuan Truong
Farnaz Khun Jush
Matthias Lenga
78
3
0
12 Dec 2023
One-Step Diffusion Distillation via Deep Equilibrium Models
Zhengyang Geng
Ashwini Pokle
Trevor Killeen
75
33
0
12 Dec 2023
CLASS-M: Adaptive stain separation-based contrastive learning with pseudo-labeling for histopathological image classification
Bodong Zhang
Hamid Manoochehri
M. M. Ho
Fahimeh Fooladgar
Yosep Chong
Beatrice Knudsen
Deepika Sirohi
Tolga Tasdizen
116
5
0
12 Dec 2023
Photorealistic Video Generation with Diffusion Models
Agrim Gupta
Lijun Yu
Kihyuk Sohn
Xiuye Gu
Meera Hahn
Fei-Fei Li
Irfan Essa
Lu Jiang
José Lezama
VGen
155
201
0
11 Dec 2023
TULIP: Transformer for Upsampling of LiDAR Point Clouds
Bin Yang
Patrick Pfreundschuh
Roland Siegwart
Marco Hutter
Peyman Moghadam
Vaishakh Patil
3DPC
80
6
0
11 Dec 2023
Proxy-based Item Representation for Attribute and Context-aware Recommendation
Jinseok Seol
Minseok Gang
Sang-goo Lee
Jaehui Park
90
5
0
11 Dec 2023
Counterfactual World Modeling for Physical Dynamics Understanding
Rahul Venkatesh
Honglin Chen
Kevin T. Feigelis
Daniel M. Bear
Khaled Jedoui
...
Wanhee Lee
Sherry Liu
Kevin A. Smith
Judith E. Fan
Daniel L. K. Yamins
VGen
94
2
0
11 Dec 2023
Mining Gaze for Contrastive Learning toward Computer-Assisted Diagnosis
Zihao Zhao
Sheng Wang
Qian Wang
Dinggang Shen
MedIm
85
7
0
11 Dec 2023
Deciphering 'What' and 'Where' Visual Pathways from Spectral Clustering of Layer-Distributed Neural Representations
Xiao Zhang
David Yunis
Michael Maire
56
4
0
11 Dec 2023
Learning Naturally Aggregated Appearance for Efficient 3D Editing
Ka Leong Cheng
Qiuyu Wang
Zifan Shi
Kecheng Zheng
Yinghao Xu
Ouyang Hao
Qifeng Chen
Yujun Shen
3DH
143
4
0
11 Dec 2023
AM-RADIO: Agglomerative Vision Foundation Model -- Reduce All Domains Into One
Michael Ranzinger
Greg Heinrich
Jan Kautz
Pavlo Molchanov
VLM
167
50
0
10 Dec 2023
Diffusion for Natural Image Matting
Yihan Hu
Yiheng Lin
Wei Wang
Yao-Min Zhao
Yunchao Wei
Humphrey Shi
103
9
0
10 Dec 2023
Transformer-based Selective Super-Resolution for Efficient Image Refinement
Tianyi Zhang
Kishore Kasichainula
Yaoxin Zhuo
Baoxin Li
Jae-sun Seo
Yu Cao
48
7
0
10 Dec 2023
The Counterattack of CNNs in Self-Supervised Learning: Larger Kernel Size might be All You Need
Tianjin Huang
Tianlong Chen
Zhangyang Wang
Shiwei Liu
80
1
0
09 Dec 2023
Identifying and Mitigating Model Failures through Few-shot CLIP-aided Diffusion Generation
Atoosa Malemir Chegini
Soheil Feizi
VLM
69
4
0
09 Dec 2023
Emergence and Function of Abstract Representations in Self-Supervised Transformers
Quentin RV. Ferry
Joshua Ching
Takashi Kawai
82
3
0
08 Dec 2023
Human-in-the-Loop Visual Re-ID for Population Size Estimation
Gustavo Pérez
Daniel Sheldon
Grant Van Horn
Subhransu Maji
96
1
0
08 Dec 2023
Bridging Synthetic and Real Worlds for Pre-training Scene Text Detectors
Tongkun Guan
Wei Shen
Xuehang Yang
Xuehui Wang
Xiaokang Yang
109
7
0
08 Dec 2023
Benchmarking and Analysis of Unsupervised Object Segmentation from Real-world Single Images
Yafei Yang
Bo Yang
OCL
74
4
0
08 Dec 2023
Adapting Vision Transformer for Efficient Change Detection
Yang Zhao
Yuxiang Zhang
Yanni Dong
Bo Du
VLM
75
2
0
08 Dec 2023
Generating Illustrated Instructions
Sachit Menon
Ishan Misra
Rohit Girdhar
DiffM
86
5
0
07 Dec 2023
Multimodal Industrial Anomaly Detection by Crossmodal Feature Mapping
Alex Costanzino
Pierluigi Zama Ramirez
Giuseppe Lisanti
Luigi Di Stefano
85
19
0
07 Dec 2023
Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation
Zhiwu Qing
Shiwei Zhang
Jiayu Wang
Xiang Wang
Yujie Wei
Yingya Zhang
Changxin Gao
Nong Sang
VGen
DiffM
64
43
0
07 Dec 2023
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding
Zhen Li
Mingdeng Cao
Xintao Wang
Zhongang Qi
Ming-Ming Cheng
Ying Shan
DiffM
144
201
0
07 Dec 2023
DreamVideo: Composing Your Dream Videos with Customized Subject and Motion
Yujie Wei
Shiwei Zhang
Zhiwu Qing
Hangjie Yuan
Zhiheng Liu
Yu Liu
Yingya Zhang
Jingren Zhou
Hongming Shan
DiffM
VGen
82
98
0
07 Dec 2023
Cascade-Zero123: One Image to Highly Consistent 3D with Self-Prompted Nearby Views
Yabo Chen
Jiemin Fang
Yuyang Huang
Taoran Yi
Xiaopeng Zhang
Lingxi Xie
Xinggang Wang
Wenrui Dai
Hongkai Xiong
Qi Tian
DiffM
91
21
0
07 Dec 2023
Instance Tracking in 3D Scenes from Egocentric Videos
Yunhan Zhao
Haoyu Ma
Shu Kong
Charless C. Fowlkes
3DPC
91
4
0
07 Dec 2023
LiDAR: Sensing Linear Probing Performance in Joint Embedding SSL Architectures
Vimal Thilak
Chen Huang
Omid Saremi
Laurent Dinh
Hanlin Goh
Preetum Nakkiran
Josh Susskind
Etai Littwin
112
10
0
07 Dec 2023
Auto-Vocabulary Semantic Segmentation
Osman Ülger
Maksymilian Kulicki
Yuki M. Asano
Martin R. Oswald
VLM
146
2
0
07 Dec 2023
Language Model Alignment with Elastic Reset
Michael Noukhovitch
Samuel Lavoie
Florian Strub
Aaron Courville
KELM
164
27
0
06 Dec 2023
Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
Zeyi Sun
Ye Fang
Tong Wu
Pan Zhang
Yuhang Zang
Shu Kong
Yuanjun Xiong
Dahua Lin
Jiaqi Wang
VLM
CLIP
131
91
0
06 Dec 2023
TokenCompose: Text-to-Image Diffusion with Token-level Supervision
Zirui Wang
Zhizhou Sha
Zheng Ding
Yilin Wang
Zhuowen Tu
DiffM
105
23
0
06 Dec 2023
Low-shot Object Learning with Mutual Exclusivity Bias
Anh Thai
Ahmad Humayun
Stefan Stojanov
Zixuan Huang
Bikram Boote
James M. Rehg
102
3
0
06 Dec 2023