
Emerging Properties in Self-Supervised Vision Transformers

29 April 2021

Papers citing "Emerging Properties in Self-Supervised Vision Transformers"

50 / 4,175 papers shown

Title
REN: Fast and Efficient Region Encodings from Patch-Based Image Encoders Savya Khosla Sethuraman TV Barnett Lee Alexander Schwing Derek Hoiem VGen 167 0 0 23 May 2025
Semantic Correspondence: Unified Benchmarking and a Strong Baseline Kaiyan Zhang Xinghui Li Jingyi Lu Kai Han 3DV 87 1 0 23 May 2025
HOFT: Householder Orthogonal Fine-tuning Alejandro Moreno Arcas Albert Sanchis Jorge Civera Alfons Juan 69 0 0 22 May 2025
ScanBot: Towards Intelligent Surface Scanning in Embodied Robotic Systems Zhiling Chen Yang Zhang Fardin Jalil Piran Qianyu Zhou Jiong Tang Farhad Imani LM&Ro 92 0 0 22 May 2025
Redemption Score: An Evaluation Framework to Rank Image Captions While Redeeming Image Semantics and Language Pragmatics Ashim Dahal Ankit Ghimire Saydul Akbar Murad Nick Rahimi 54 0 0 22 May 2025
Bootstrapping your behavior: a new pretraining strategy for user behavior sequence data Weichang Wu Xiaolu Zhang Jun Zhou Yuchen Li Wenwen Xia 20 0 0 22 May 2025
Transformer brain encoders explain human high-level visual responses Hossein Adeli Minni Sun N. Kriegeskorte 232 0 0 22 May 2025
Generative AI for Autonomous Driving: A Review Katharina Winter Abhishek Vivekanandan Rupert Polley Yinzhe Shen Christian Schlauch ... Christian Wirth Omer Sahin Tas Nadja Klein Fabian B. Flohr Hanno Gottschalk 94 0 0 21 May 2025
gen2seg: Generative Models Enable Generalizable Instance Segmentation Om Khangaonkar Hamed Pirsiavash DiffM VLM 134 0 0 21 May 2025
Stronger ViTs With Octic Equivariance David Nordström Johan Edstedt Fredrik Kahl Georg Bökman ViT 225 0 0 21 May 2025
ReGUIDE: Data Efficient GUI Grounding via Spatial Reasoning and Search Hyunseok Lee Jeonghoon Kim Beomjun Kim Jihoon Tack Chansong Jo Jaehong Lee Cheonbok Park Sookyo In Jinwoo Shin Kang Min Yoo 124 0 0 21 May 2025
Lung Nodule-SSM: Self-Supervised Lung Nodule Detection and Classification in Thoracic CT Images Muniba Noreen Furqan Shaukat ViT 94 0 0 21 May 2025
Collaborative Unlabeled Data Optimization Xinyi Shang Peng Sun Fengyuan Liu Tao Lin 78 0 0 20 May 2025
Place Recognition Meet Multiple Modalitie: A Comprehensive Review, Current Challenges and Future Directions Zhenyu Li Tianyi Shang Pengjie Xu ZhaoJun Deng 125 0 0 20 May 2025
SSPS: Self-Supervised Positive Sampling for Robust Self-Supervised Speaker Verification Theo Lepage Reda Dehak 76 1 0 20 May 2025
From Local Details to Global Context: Advancing Vision-Language Models with Attention-Based Selection Lincan Cai Jingxuan Kang Shuang Li Wenxuan Ma Binhui Xie Zhida Qin Jian Liang VLM 88 0 0 19 May 2025
Simplicity is Key: An Unsupervised Pretraining Approach for Sparse Radio Channels Jonathan Ott Maximilian Stahlke Tobias Feigl Bjoern M. Eskofier Christopher Mutschler 97 0 0 19 May 2025
Industrial Synthetic Segment Pre-training Shinichi Mae Ryousuke Yamada Hirokatsu Kataoka VLM 88 0 0 19 May 2025
No Free Lunch in Active Learning: LLM Embedding Quality Dictates Query Strategy Success Lukas Rauch Moritz Wirth Denis Huseljic M. Herde Bernhard Sick Matthias Aßenmacher 12 0 0 18 May 2025
Ditch the Denoiser: Emergence of Noise Robustness in Self-Supervised Learning from Data Curriculum Wenquan Lu Jiaqi Zhang Hugues Van Assel Randall Balestriero 116 0 0 18 May 2025
AdaDim: Dimensionality Adaptation for SSL Representational Dynamics Kiran Kokilepersaud Mohit Prabhushankar Ghassan AlRegib 86 0 0 18 May 2025
PRETI: Patient-Aware Retinal Foundation Model via Metadata-Guided Representation Learning Yeonkyung Lee Woojung Han Youngjun Jun Hyeonmin Kim Jungkyung Cho Seong Jae Hwang MedIm 70 0 0 18 May 2025
Is Semantic SLAM Ready for Embedded Systems ? A Comparative Survey Calvin Galagain Martyna Poreba François Goulette 74 1 0 18 May 2025
Continuous Subspace Optimization for Continual Learning Quan Cheng Yuanyu Wan Lingyu Wu Chenping Hou Lijun Zhang CLL 75 0 0 17 May 2025
iSegMan: Interactive Segment-and-Manipulate 3D Gaussians Yian Zhao Wanshi Xu Ruochong Zheng Pengchong Qiao Chang Liu Jie Chen 3DGS 91 0 0 17 May 2025
Equally Critical: Samples, Targets, and Their Mappings in Datasets Runkang Yang Peng Sun Xinyi Shang Yi Tang Tao R. Lin 22 0 0 17 May 2025
Cross-Model Transfer of Task Vectors via Few-Shot Orthogonal Alignment Kazuhiko Kawamoto Atsuhiro Endo Hiroshi Kera 71 0 0 17 May 2025
Are vision language models robust to uncertain inputs? Xi Wang Eric Nalisnick AAML VLM 144 1 0 17 May 2025
DDAE++: Enhancing Diffusion Models Towards Unified Generative and Discriminative Learning Weilai Xiang Hongyu Yang Di Huang Yunhong Wang 120 0 0 16 May 2025
Questioning Representational Optimism in Deep Learning: The Fractured Entangled Representation Hypothesis Akarsh Kumar Jeff Clune Joel Lehman Kenneth O. Stanley OOD 71 0 0 16 May 2025
CellCLIP -- Learning Perturbation Effects in Cell Painting via Text-Guided Contrastive Learning Mingyu Lu Ethan Weinberger Chanwoo Kim Su-In Lee 23 0 0 16 May 2025
Object-Centric Representations Improve Policy Generalization in Robot Manipulation Alexandre Chapin Bruno Machado Emmanuel Dellandrea Liming Chen OCL 134 0 0 16 May 2025
CleanPatrick: A Benchmark for Image Data Cleaning Fabian Gröger Simone Lionetti Philippe Gottfrois Alvaro Gonzalez-Jimenez Ludovic Amruthalingam ... Philipp Tschandl A. Koochek Matthew Groh Alexander A. Navarini Marc Pouly OOD 73 0 0 16 May 2025
Invariant Representations via Wasserstein Correlation Maximization Keenan Eikenberry Lizuo Liu Yoonsang Lee OOD 72 0 0 16 May 2025
GeoMM: On Geodesic Perspective for Multi-modal Learning Shibin Mei Hang Wang Bingbing Ni 74 0 0 16 May 2025
Self-supervised perception for tactile skin covered dexterous hands Akash Sharma Carolina Higuera Chaithanya Krishna Bodduluri Ziqiang Liu Taosha Fan ... Byron Boots Michael Kaess Tingfan Wu Francois Robert Hogan Mustafa Mukadam SSL 84 2 0 16 May 2025
IMAGE-ALCHEMY: Advancing subject fidelity in personalised text-to-image generation Amritanshu Tiwari Cherish Puniani Kaustubh Sharma Ojasva Nema DiffM 101 0 0 15 May 2025
GAIA: A Foundation Model for Operational Atmospheric Dynamics Ata Akbari Asanjan Olivia Alexander Tom Berg Clara Zhang Matt Yang ... Stephen Peng Arun Ravindran Olivier Raiman David Potere David Bell 24 0 0 15 May 2025
A Unified and Scalable Membership Inference Method for Visual Self-supervised Encoder via Part-aware Capability Jie Zhu Jirong Zha Ding Li Leye Wang 131 1 0 15 May 2025
CLIP Embeddings for AI-Generated Image Detection: A Few-Shot Study with Lightweight Classifier Ziyang Ou VLM 75 0 0 15 May 2025
EmbodiedMAE: A Unified 3D Multi-Modal Representation for Robot Manipulation Zibin Dong Fei Ni Yifu Yuan Yinchuan Li Jianye Hao 115 0 0 15 May 2025
Endo-CLIP: Progressive Self-Supervised Pre-training on Raw Colonoscopy Records Yili He Yan Zhu Peiyao Fu Ruijie Yang Tianyi Chen Zhihua Wang Quanlin Li Pinghong Zhou Xiaoyu Yang Shuo Wang MedIm VLM 60 0 0 14 May 2025
Don't Forget your Inverse DDIM for Image Editing Guillermo Gomez-Trenado Pablo Mesejo Ó. Cordón Stéphane Lathuilière DiffM 56 0 0 14 May 2025
Few-shot Novel Category Discovery Chunming Li Shidong Wang Haofeng Zhang 60 0 0 13 May 2025
Simple Semi-supervised Knowledge Distillation from Vision-Language Models via Dual-Head Optimization Seongjae Kang Dong Bok Lee Hyungjoon Jang Sung Ju Hwang VLM 101 0 0 12 May 2025
ShotAdapter: Text-to-Multi-Shot Video Generation with Diffusion Models Ozgur Kara Krishna Kumar Singh Feng Liu Duygu Ceylan James M. Rehg Tobias Hinz DiffM VGen 82 0 0 12 May 2025
A Reproduction Study: The Kernel PCA Interpretation of Self-Attention Fails Under Scrutiny Karahan Sarıtaş Çağatay Yıldız 57 0 0 12 May 2025
Vision Foundation Model Embedding-Based Semantic Anomaly Detection M. Ronecker Matthew Foutter Amine Elhafsi Daniele Gammelli Ihor Barakaiev Marco Pavone Daniel Watzenig 59 1 0 12 May 2025
Joint Low-level and High-level Textual Representation Learning with Multiple Masking Strategies Zhengmi Tang Yuto Mitsui Tomo Miyazaki S. Omachi 89 0 0 11 May 2025
Image Classification Using a Diffusion Model as a Pre-Training Model Kosuke Ukita Ye Xiaolong Tsuyoshi Okita DiffM MedIm VLM 71 0 0 11 May 2025