Emerging Properties in Self-Supervised Vision Transformers

29 April 2021
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin

Papers citing "Emerging Properties in Self-Supervised Vision Transformers"

50 / 4,175 papers shown
Consistency-aware Self-Training for Iterative-based Stereo Matching
Jingyi Zhou
Peng Ye
Han Zhang
Jiakang Yuan
Rao Qiang
Liu YangChenXu
Wu Cailin
Feng Xu
Tao Chen
3DV
70
0
0
31 Mar 2025
Can Diffusion Models Disentangle? A Theoretical Perspective
Liming Wang
Muhammad Jehanzeb Mirza
Yishu Gong
Yuan Gong
Jiaqi Zhang
Brian Tracey
Katerina Placek
Marco Vilela
James Glass
DiffM, CoGe
118
0
0
31 Mar 2025
Training-Free Text-Guided Image Editing with Visual Autoregressive Model
Yufei Wang
Lanqing Guo
Zhihao Li
Jiaxing Huang
Pichao Wang
Bihan Wen
Jingchao Wang
DiffM
111
1
0
31 Mar 2025
VideoGen-Eval: Agent-based System for Video Generation Evaluation
Yuhang Yang
Ke Fan
Siyang Song
Hongxiang Li
Ailing Zeng
FeiLin Han
Wei-dong Zhai
Wen Liu
Yang Cao
Zheng-jun Zha
EGVM, VGen
123
1
0
30 Mar 2025
Map Feature Perception Metric for Map Generation Quality Assessment and Loss Optimization
Chenxing Sun
Jing Bai
EGVM
102
1
0
30 Mar 2025
Beyond Academic Benchmarks: Critical Analysis and Best Practices for Visual Industrial Anomaly Detection
Aimira Baitieva
Yacine Bouaouni
Alexandre Briot
Dick Ameln
Souhaiel Khalfaoui
S. Akçay
90
0
0
30 Mar 2025
ViT-Linearizer: Distilling Quadratic Knowledge into Linear-Time Vision Models
Guoyizhe Wei
Rama Chellappa
101
2
0
30 Mar 2025
Can Visuo-motor Policies Benefit from Random Exploration Data? A Case Study on Stacking
Shutong Jin
Axel Kaliff
Ruiyu Wang
Muhammad Zahid
Florian T. Pokorny
VGen
59
0
0
30 Mar 2025
Multi-label classification for multi-temporal, multi-spatial coral reef condition monitoring using vision foundation model with adapter learning
Xinlei Shao
Hongruixuan Chen
Fan Zhao
Kirsty Magson
Jundong Chen
Peiran Li
Jingchao Wang
Jun Sasaki
125
0
0
29 Mar 2025
Z-SASLM: Zero-Shot Style-Aligned SLI Blending Latent Manipulation
Alessio Borgi
Luca Maiano
Irene Amerini
64
0
0
29 Mar 2025
Large Self-Supervised Models Bridge the Gap in Domain Adaptive Object Detection
Marc-Antoine Lavoie
Anas Mahmoud
Steven Waslander
125
0
0
29 Mar 2025
Tokenization of Gaze Data
Tim Rolff
Jurik Karimian
Niklas Hypki
S. Schmidt
Markus Lappe
Frank Steinicke
109
0
0
28 Mar 2025
EndoLRMGS: Complete Endoscopic Scene Reconstruction combining Large Reconstruction Modelling and Gaussian Splatting
Xiang Wang
Shuai Zhang
Baoru Huang
Danail Stoyanov
E. Mazomenos
3DGS
134
1
0
28 Mar 2025
LIM: Large Interpolator Model for Dynamic Reconstruction
Remy Sabathier
Niloy J. Mitra
David Novotny
79
0
0
28 Mar 2025
Semantix: An Energy Guided Sampler for Semantic Style Transfer
Huiang He
Minghui Hu
C. Zheng
Chaoyue Wang
Tat-Jen Cham
DiffM
88
0
0
28 Mar 2025
Zero4D: Training-Free 4D Video Generation From Single Video Using Off-the-Shelf Video Diffusion
Jangho Park
Taesung Kwon
Jong Chul Ye
VGen
105
1
0
28 Mar 2025
Assessing Foundation Models for Sea Ice Type Segmentation in Sentinel-1 SAR Imagery
Samira Alkaee Taleghan
Morteza Karimzadeh
A. Barrett
Walter N. Meier
F. Banaei-Kashani
133
0
0
28 Mar 2025
CTRL-O: Language-Controllable Object-Centric Visual Representation Learning
Aniket Didolkar
Andrii Zadaianchuk
Rabiul Awal
Maximilian Seitzer
E. Gavves
Aishwarya Agrawal
OCL, VLM
178
3
0
27 Mar 2025
Towards Generating Realistic 3D Semantic Training Data for Autonomous Driving
Lucas Nunes
Rodrigo Marcuzzi
Jens Behley
C. Stachniss
3DPC
151
1
0
27 Mar 2025
VideoMage: Multi-Subject and Motion Customization of Text-to-Video Diffusion Models
Chi-Pin Huang
Yen-Siang Wu
Hung-Kai Chung
Kai-Po Chang
Fu-En Yang
Yu-Jie Wang
DiffM, VGen
98
1
0
27 Mar 2025
CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models
Qingqing Zhao
Yao Lu
Moo Jin Kim
Zipeng Fu
Zhuoyang Zhang
...
Ankur Handa
Xuan Li
Donglai Xiang
Gordon Wetzstein
Nayeon Lee
LM&Ro, LRM
99
33
0
27 Mar 2025
StyledStreets: Multi-style Street Simulator with Spatial and Temporal Consistency
Yuyin Chen
Yida Wang
Xinyu Zhang
Kun Zhan
Peng Jia
Yifei Zhan
Xianpeng Lang
3DGS
89
1
0
27 Mar 2025
Semantic Consistent Language Gaussian Splatting for Point-Level Open-vocabulary Querying
Hairong Yin
Huangying Zhan
Yi Tian Xu
Raymond A. Yeh
68
0
0
27 Mar 2025
Prototype Guided Backdoor Defense
Venkat Adithya Amula
Sunayana Samavedam
Saurabh Saini
Avani Gupta
Narayanan P J
AAML
76
0
0
26 Mar 2025
Mamba-3D as Masked Autoencoders for Accurate and Data-Efficient Analysis of Medical Ultrasound Videos
Jiaheng Zhou
Yanfeng Zhou
Wei Fang
Yuxing Tang
Le Lu
Ge Yang
Mamba
515
0
0
26 Mar 2025
Contrastive Learning Guided Latent Diffusion Model for Image-to-Image Translation
Qi Si
Bo Wang
Zhao Zhang
107
0
0
26 Mar 2025
Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency
T. Liu
Z. Huang
Zhaoxi Chen
Guangcong Wang
Shoukang Hu
Liao Shen
Huiqiang Sun
Z. Cao
Wei Li
Ziwei Liu
VGen, 3DGS
127
1
0
26 Mar 2025
Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields
Shijie Zhou
Hui Ren
Yijia Weng
Shuwang Zhang
Zhen Wang
...
Zhiwen Fan
Suya You
Ziyi Wang
Leonidas Guibas
A. Kadambi
VGen, 3DGS
145
3
0
26 Mar 2025
ICE: Intrinsic Concept Extraction from a Single Image via Diffusion Models
Fernando Julio Cendra
Kai Han
VLM
136
0
0
25 Mar 2025
AvatarArtist: Open-Domain 4D Avatarization
Hongyu Liu
Xuan Wang
Bo Liu
Yue Ma
Jingye Chen
Yanbo Fan
Yujun Shen
Yibing Song
Qifeng Chen
132
3
0
25 Mar 2025
Self-Supervised Learning of Motion Concepts by Optimizing Counterfactuals
Stefan Stojanov
David Wendt
Seungwoo Kim
R. Venkatesh
Kevin T. Feigelis
Jiajun Wu
Daniel L. K. Yamins
SSL
99
0
0
25 Mar 2025
Reverse Prompt: Cracking the Recipe Inside Text-to-Image Generation
Zhiyao Ren
Yibing Zhan
B. Yu
Dacheng Tao
DiffM
95
0
0
25 Mar 2025
ChA-MAEViT: Unifying Channel-Aware Masked Autoencoders and Multi-Channel Vision Transformers for Improved Cross-Channel Learning
Chau Pham
Juan C. Caicedo
Bryan A. Plummer
78
0
0
25 Mar 2025
Vanishing Depth: A Depth Adapter with Positional Depth Encoding for Generalized Image Encoders
Paul Koch
Jörg Krüger
Ankit Chowdhury
O. Heimann
MDE
89
0
0
25 Mar 2025
Scaling Vision Pre-Training to 4K Resolution
Baifeng Shi
Boyi Li
Han Cai
Yaojie Lu
Sifei Liu
...
Jan Kautz
Enze Xie
Trevor Darrell
Pavlo Molchanov
Hongxu Yin
CLIP
411
0
0
25 Mar 2025
Surg-3M: A Dataset and Foundation Model for Perception in Surgical Settings
Chengan Che
Chao Wang
Tom Vercauteren
Sophia Tsoka
Luis C. Garcia-Peraza-Herrera
MedIm
84
1
0
25 Mar 2025
LPOSS: Label Propagation Over Patches and Pixels for Open-vocabulary Semantic Segmentation
Vladan Stojnić
Yannis Kalantidis
Jirí Matas
Giorgos Tolias
VLM
120
0
0
25 Mar 2025
FullDiT: Multi-Task Video Generative Foundation Model with Full Attention
Xuan Ju
Weicai Ye
Quande Liu
Qiulin Wang
Xintao Wang
Pengfei Wan
Di Zhang
Kun Gai
Qiang Xu
VGen
108
4
0
25 Mar 2025
Scene-agnostic Pose Regression for Visual Localization
Junwei Zheng
Ruiping Liu
Yuxiao Chen
Zhenfang Chen
Kailun Yang
Jiaming Zhang
Rainer Stiefelhagen
81
0
0
25 Mar 2025
LRSCLIP: A Vision-Language Foundation Model for Aligning Remote Sensing Image with Longer Text
Weizhi Chen
Jingbo Chen
Yupeng Deng
Jiansheng Chen
Yuman Feng
Zhihao Xi
Diyou Liu
Kai Li
Yu Meng
VLM
102
1
0
25 Mar 2025
Why Representation Engineering Works: A Theoretical and Empirical Study in Vision-Language Models
Bowei Tian
Xuntao Lyu
Meng Liu
Hongyi Wang
Ang Li
94
0
0
25 Mar 2025
Self-Supervised Learning based on Transformed Image Reconstruction for Equivariance-Coherent Feature Representation
Qin Wang
Benjamin Bruns
Hanno Scharr
Kai Krajsek
99
0
0
24 Mar 2025
Out-of-distribution evaluations of channel agnostic masked autoencoders in fluorescence microscopy
Christian John Hurry
Jinjie Zhang
Olubukola Ishola
Emma Slade
Cuong Q. Nguyen
OOD, OODD
89
0
0
24 Mar 2025
Revisiting Automatic Data Curation for Vision Foundation Models in Digital Pathology
Boqi Chen
Cédric Vincent-Cuaz
Lydia A. Schoenpflug
Manuel Madeira
Lisa Fournier
...
D. Thanou
V. Koelzer
Pascal Frossard
Gabriele Campanella
Gunnar Rätsch
97
1
0
24 Mar 2025
U-REPA: Aligning Diffusion U-Nets to ViTs
Yuchuan Tian
Hanting Chen
Mengyu Zheng
Yuchen Liang
Chao Xu
Yunhe Wang
112
2
0
24 Mar 2025
k-NN as a Simple and Effective Estimator of Transferability
Moein Sorkhei
Christos Matsoukas
Johan Fredin Haslum
Emir Konuk
Kevin Smith
94
0
0
24 Mar 2025
AMD-Hummingbird: Towards an Efficient Text-to-Video Model
Takashi Isobe
He Cui
Dong Zhou
Mengmeng Ge
D. Li
E. Barsoum
VGen
93
1
0
24 Mar 2025
Surface-Aware Distilled 3D Semantic Features
Lukas Uzolas
E. Eisemann
Petr Kellnhofer
3DPC, 3DH
119
0
0
24 Mar 2025
CoMP: Continual Multimodal Pre-training for Vision Foundation Models
Yuxiao Chen
L. Meng
Wujian Peng
Zuxuan Wu
Yu-Gang Jiang
VLM
211
1
0
24 Mar 2025
Do Your Best and Get Enough Rest for Continual Learning
Hankyul Kang
Gregor Seifer
Donghyun Lee
Jongbin Ryu
VLM
94
0
0
24 Mar 2025