Emerging Properties in Self-Supervised Vision Transformers

29 April 2021

Papers citing "Emerging Properties in Self-Supervised Vision Transformers"

50 / 4,175 papers shown

Title
DMC-VB: A Benchmark for Representation Learning for Control with Visual Distractors Joseph Ortiz Antoine Dedieu Wolfgang Lehrach Swaroop Guntupalli Carter Wendelken Ahmad Humayun Guangyao Zhou Sivaramakrishnan Swaminathan Miguel Lázaro-Gredilla Kevin P. Murphy OffRL 76 1 0 26 Sep 2024
Evaluation of Security of ML-based Watermarking: Copy and Removal Attacks Vitaliy Kinakh Brian Pulfer Yury Belousov Pierre Fernandez Teddy Furon Slava Voloshynovskiy 81 2 0 26 Sep 2024
Robot See Robot Do: Imitating Articulated Object Manipulation with Monocular 4D Reconstruction Justin Kerr Chung Min Kim Mingxuan Wu Brent Yi Qianqian Wang Ken Goldberg Angjoo Kanazawa 77 18 0 26 Sep 2024
Self-supervised Pretraining for Cardiovascular Magnetic Resonance Cine Segmentation Rob A. J. de Mooij Josien P. W. Pluim Cian M. Scannell 59 0 0 26 Sep 2024
FreeEdit: Mask-free Reference-based Image Editing with Multi-modal Instruction Runze He Kai Ma Linjiang Huang Shaofei Huang Jialin Gao Xiaoming Wei Jiao Dai Jizhong Han Si Liu DiffM 76 9 0 26 Sep 2024
Improving satellite imagery segmentation using multiple Sentinel-2 revisits Kartik Jindgar Grace W. Lindsay 71 0 0 25 Sep 2024
ChatCam: Empowering Camera Control through Conversational AI Xinhang Liu Yu-Wing Tai Chi-Keung Tang VGen 78 3 0 25 Sep 2024
PACE: Marrying generalization in PArameter-efficient fine-tuning with Consistency rEgularization Yao Ni Shan Zhang Piotr Koniusz 459 8 0 25 Sep 2024
Unveiling Ontological Commitment in Multi-Modal Foundation Models Mert Keser Gesina Schwalbe Niki Amini-Naieni Matthias Rottmann Alois Knoll 53 1 0 25 Sep 2024
Degradation-Guided One-Step Image Super-Resolution with Diffusion Priors Aiping Zhang Zongsheng Yue Renjing Pei Wenqi Ren Xiaochun Cao 66 11 0 25 Sep 2024
HVT: A Comprehensive Vision Framework for Learning in Non-Euclidean Space Jacob Fein-Ashley Ethan Feng Minh Pham 61 3 0 25 Sep 2024
GeoBiked: A Dataset with Geometric Features and Automated Labeling Techniques to Enable Deep Generative Models in Engineering Design Phillip Mueller Sebastian Mueller Lars Mikelsons 112 2 0 25 Sep 2024
DIAL: Dense Image-text ALignment for Weakly Supervised Semantic Segmentation Soojin Jang Jungmin Yun Junehyoung Kwon Eunju Lee Youngbin Kim 104 3 0 24 Sep 2024
ImPoster: Text and Frequency Guidance for Subject Driven Action Personalization using Diffusion Models D. Kothandaraman Kuldeep Kulkarni Sumit Shekhar Balaji Vasan Srinivasan Dinesh Manocha DiffM 95 1 0 24 Sep 2024
OW-Rep: Open World Object Detection with Instance Representation Learning Sunoh Lee Minsik Jeon Jihong Min Junwon Seo ObjD 488 0 0 24 Sep 2024
Clinical-grade Multi-Organ Pathology Report Generation for Multi-scale Whole Slide Images via a Semantically Guided Medical Text Foundation Model J. Tan SeungKyu Kim Eunsu Kim Sung Hak Lee Sangjeong Ahn Won-Ki Jeong 59 2 0 23 Sep 2024
A Novel Framework for the Automated Characterization of Gram-Stained Blood Culture Slides Using a Large-Scale Vision Transformer Jack McMahon Naofumi Tomita Elizabeth S. Tatishev Adrienne A. Workman Cristina R Costales Niaz Banaei Isabella W. Martin Saeed Hassanpour 51 1 0 23 Sep 2024
VLEU: a Method for Automatic Evaluation for Generalizability of Text-to-Image Models Jingtao Cao Zheng Zhang Hongru Wang Kam-Fai Wong 59 0 0 23 Sep 2024
Hierarchical end-to-end autonomous navigation through few-shot waypoint detection A. Ghafourian Zhongying CuiZhu Debo Shi Ian Chuang François Charette Rithik Sachdeva Iman Soltani 69 1 0 23 Sep 2024
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions Weifeng Lin Xinyu Wei Renrui Zhang Le Zhuo Shitian Zhao ... Junlin Xie Yu Qiao Peng Gao Hongsheng Li MLLM DiffM 190 14 0 23 Sep 2024
Quantifying Context Bias in Domain Adaptation for Object Detection Hojun Son Asma Almutairi Arpan Kusari AI4CE 137 1 0 23 Sep 2024
SOS: Segment Object System for Open-World Instance Segmentation With Object Priors Christian Wilms Tim Rolff Maris Hillemann Robert Johanson Simone Frintrop VLM 85 1 0 22 Sep 2024
Dormant: Defending against Pose-driven Human Image Animation Jiachen Zhou Mingsi Wang Tianlin Li Guozhu Meng Kai Chen 160 5 0 22 Sep 2024
SplatLoc: 3D Gaussian Splatting-based Visual Localization for Augmented Reality Hongjia Zhai Xiyu Zhang Boming Zhao Hai Li Yijia He Zhaopeng Cui Hujun Bao Guofeng Zhang 3DGS 80 12 0 21 Sep 2024
Simple Unsupervised Knowledge Distillation With Space Similarity Aditya Singh Haohan Wang 144 2 0 20 Sep 2024
ViTGuard: Attention-aware Detection against Adversarial Examples for Vision Transformer Shihua Sun Kenechukwu Nwodo Shridatt Sugrim Angelos Stavrou Haining Wang AAML 85 1 0 20 Sep 2024
LCM: Log Conformal Maps for Robust Representation Learning to Mitigate Perspective Distortion Meenakshi Subhash Chippa Prakash Chandra Chhipa Kanjar De Marcus Liwicki Rajkumar Saini 45 0 0 20 Sep 2024
Formula-Supervised Visual-Geometric Pre-training Ryosuke Yamada Kensho Hara Hirokatsu Kataoka Koshi Makihara Nakamasa Inoue Rio Yokota Y. Satoh 55 1 0 20 Sep 2024
RingMo-Aerial: An Aerial Remote Sensing Foundation Model With Affine Transformation Contrastive Learning Wenhui Diao Haichen Yu Kaiyue Kang Tong Ling Di Liu ... Hanbo Bi Libo Ren Xuexue Li Yongqiang Mao Xian Sun 265 1 0 20 Sep 2024
Personalized 2D Binary Patient Codes of Tissue Images and Immunogenomic Data Through Multimodal Self-Supervised Fusion Areej Alsaafin A. Shafique Saghir Alfasly H. R. Tizhoosh 50 0 0 19 Sep 2024
Is Tokenization Needed for Masked Particle Modelling? Matthew Leigh Samuel Klein François Charton Tobias Golling Lukas Heinrich Michael Kagan Ines Ochoa Margarita Osadchy 95 8 0 19 Sep 2024
From Linguistic Giants to Sensory Maestros: A Survey on Cross-Modal Reasoning with Large Language Models Shengsheng Qian Zuyi Zhou Dizhan Xue Bing Wang Changsheng Xu LRM 148 2 0 19 Sep 2024
GauTOAO: Gaussian-based Task-Oriented Affordance of Objects Jiawen Wang Dingsheng Luo 62 0 0 18 Sep 2024
Knowledge Adaptation Network for Few-Shot Class-Incremental Learning Ye Wang Yaxiong Wang Guoshuai Zhao Xueming Qian CLL 90 1 0 18 Sep 2024
Gradient-Driven 3D Segmentation and Affordance Transfer in Gaussian Splatting Using 2D Masks Joji Joseph B. Amrutur Shalabh Bhatnagar 3DGS 100 1 0 18 Sep 2024
Multimodal Generalized Category Discovery Yuchang Su Renping Zhou Siyu Huang Xingjian Li Tianyang Wang Ziyue Wang Min Xu 108 0 0 18 Sep 2024
Open-Set Semantic Uncertainty Aware Metric-Semantic Graph Matching Kurran Singh John J. Leonard 55 1 0 17 Sep 2024
CLIP Adaptation by Intra-modal Overlap Reduction A. Kravets V. Namboodiri VLM 57 0 0 17 Sep 2024
KVPruner: Structural Pruning for Faster and Memory-Efficient Large Language Models Bo Lv Quan Zhou Xuanang Ding Yan Wang Zeming Ma VLM 67 2 0 17 Sep 2024
Down-Sampling Inter-Layer Adapter for Parameter and Computation Efficient Ultra-Fine-Grained Image Recognition Edwin Arkel Rios Femiloye Oyerinde Min-Chun Hu Bo-Cheng Lai 74 0 0 17 Sep 2024
Sparks of Artificial General Intelligence(AGI) in Semiconductor Material Science: Early Explorations into the Next Frontier of Generative AI-Assisted Electron Micrograph Analysis Sakhinana Sagar Srinivas Geethan Sannidhi Sreeja Gangasani Chidaksh Ravuru Venkataramana Runkana 92 0 0 17 Sep 2024
Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT Ryota Komatsu Takahiro Shinozaki SSL 108 1 0 16 Sep 2024
Frequency-Guided Masking for Enhanced Vision Self-Supervised Learning Amin Karimi Monsefi Mengxi Zhou Nastaran Karimi Monsefi Ser-Nam Lim Wei-Lun Chao R. Ramnath 132 1 0 16 Sep 2024
Robust image representations with counterfactual contrastive learning Mélanie Roschewitz Fabio De Sousa Ribeiro Tian Xia G. Khara Ben Glocker OOD MedIm 140 2 0 16 Sep 2024
Pre-Training for 3D Hand Pose Estimation with Contrastive Learning on Large-Scale Hand Images in the Wild Nie Lin Takehiko Ohkawa Mingfang Zhang Yifei Huang Ryosuke Furuta Yoichi Sato 3DH 71 2 0 15 Sep 2024
EditBoard: Towards a Comprehensive Evaluation Benchmark for Text-Based Video Editing Models Yupeng Chen Penglin Chen Xiaoyu Zhang Yixian Huang Qian Xie DiffM 104 1 0 15 Sep 2024
On the Generalizability of Foundation Models for Crop Type Mapping Yi-Chia Chang Adam J. Stewart Favyen Bastani Piper Wolters Shreya Kannan George R. Huber Jingtong Wang Arindam Banerjee 109 1 0 14 Sep 2024
Evaluating Pre-trained Convolutional Neural Networks and Foundation Models as Feature Extractors for Content-based Medical Image Retrieval Amirreza Mahbod Nematollah Saeidi Sepideh Hatamikia Ramona Woitek VLM MedIm 126 4 0 14 Sep 2024
Phikon-v2, A large and public feature extractor for biomarker prediction Alexandre Filiot Paul Jacob Alice Mac Kain Charlie Saillard MedIm 87 21 0 13 Sep 2024
Detect Fake with Fake: Leveraging Synthetic Data-driven Representation for Synthetic Image Detection Hina Otake Yoshihiro Fukuhara Yoshiki Kubotani Shigeo Morishima ViT 80 0 0 13 Sep 2024