Emerging Properties in Self-Supervised Vision Transformers

29 April 2021
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin

Papers citing "Emerging Properties in Self-Supervised Vision Transformers"

50 / 4,176 papers shown
Title
Learning Semantic Proxies from Visual Prompts for Parameter-Efficient Fine-Tuning in Deep Metric Learning
Li Ren
Chen Chen
Liqiang Wang
Kien Hua
77
5
0
04 Feb 2024
Video Editing for Video Retrieval
Bin Zhu
Kevin Flanagan
A. Fragomeni
Michael Wray
Dima Damen
CLIP
85
1
0
04 Feb 2024
MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning
Zhe Li
Laurence T. Yang
Bocheng Ren
Xin Nie
Zhangyang Gao
Cheng Tan
Stan Z. Li
VLM
79
16
0
03 Feb 2024
Hypergraph-Transformer (HGT) for Interactive Event Prediction in Laparoscopic and Robotic Surgery
Lianhao Yin
Yutong Ban
J. Eckhoff
O. Meireles
Daniela Rus
Guy Rosman
119
3
0
03 Feb 2024
SynthCLIP: Are We Ready for a Fully Synthetic CLIP Training?
Hasan Hammoud
Hani Itani
Fabio Pizzati
Philip Torr
Adel Bibi
Guohao Li
CLIP, VLM
239
38
0
02 Feb 2024
A Probabilistic Model behind Self-Supervised Learning
Alice Bizeul
Bernhard Schölkopf
Carl Allen
SSL
85
2
0
02 Feb 2024
Guiding Masked Representation Learning to Capture Spatio-Temporal Relationship of Electrocardiogram
Yeongyeon Na
Minje Park
Yunwon Tae
S. Joo
92
31
0
02 Feb 2024
A Survey for Foundation Models in Autonomous Driving
Haoxiang Gao
Yaqian Li
Kaiwen Long
Ming Yang
Yiqing Shen
VLM, LRM
109
32
0
02 Feb 2024
Neural Slot Interpreters: Grounding Object Semantics in Emergent Slot Representations
Bhishma Dedhia
N. Jha
OCL
151
1
0
02 Feb 2024
Exploring Homogeneous and Heterogeneous Consistent Label Associations for Unsupervised Visible-Infrared Person ReID
Lingfeng He
De Cheng
Nannan Wang
Xinbo Gao
87
5
0
01 Feb 2024
Self-supervised learning of video representations from a child's perspective
A. Orhan
Wentao Wang
Alex N. Wang
Mengye Ren
Brenden M. Lake
67
4
0
01 Feb 2024
Distillation Enhanced Time Series Forecasting Network with Momentum Contrastive Learning
Haozhi Gao
Qianqian Ren
Jinbao Li
AI4TS
83
3
0
31 Jan 2024
What Do Self-Supervised Speech and Speaker Models Learn? New Findings From a Cross Model Layer-Wise Analysis
Takanori Ashihara
Marc Delcroix
Takafumi Moriya
Kohei Matsuura
Taichi Asami
Yusuke Ijima
SSL
86
7
0
31 Jan 2024
Local Feature Matching Using Deep Learning: A Survey
Shibiao Xu
Shunpeng Chen
Rongtao Xu
Changwei Wang
Peng Lu
Li Guo
ObjD
117
40
0
31 Jan 2024
MouSi: Poly-Visual-Expert Vision-Language Models
Xiaoran Fan
Tao Ji
Changhao Jiang
Shuo Li
Senjie Jin
...
Qi Zhang
Xipeng Qiu
Xuanjing Huang
Zuxuan Wu
Yunchun Jiang
VLM
54
17
0
30 Jan 2024
Pick-and-Draw: Training-free Semantic Guidance for Text-to-Image Personalization
Henglei Lv
Jiayu Xiao
Liang Li
Qingming Huang
DiffM
100
6
0
30 Jan 2024
MuSc: Zero-Shot Industrial Anomaly Classification and Segmentation with Mutual Scoring of the Unlabeled Images
Xurui Li
Ziming Huang
Feng Xue
Yu Zhou
110
23
0
30 Jan 2024
Computer Vision for Primate Behavior Analysis in the Wild
Richard Vogg
Timo Lüddecke
Jonathan Henrich
Sharmita Dey
Matthias Nuske
...
Alexander Gail
Stefan Treue
H. Scherberger
Florentin Wörgötter
Alexander S. Ecker
131
6
0
29 Jan 2024
Bridging Generative and Discriminative Models for Unified Visual Perception with Diffusion Priors
Shiyin Dong
Mingrui Zhu
Kun Cheng
Nannan Wang
Xinbo Gao
DiffM
45
3
0
29 Jan 2024
MLEM: Generative and Contrastive Learning as Distinct Modalities for Event Sequences
Viktor Moskvoretskii
Dmitry Osin
Egor Shvetsov
Igor Udovichenko
Maxim Zhelnin
Andrey Dukhovny
Anna Zhimerikina
Evgeny Burnaev
AI4TS
111
2
0
29 Jan 2024
An objective comparison of methods for augmented reality in laparoscopic liver resection by preoperative-to-intraoperative image fusion
Sharib Ali
Yamid Espinel
Yueming Jin
Peng Liu
Bianca Güttner
...
Micha Pfeiffer
Shahid Farid
Lena Maier-Hein
E. Buc
Adrien Bartoli
115
16
0
28 Jan 2024
A Survey on Data Augmentation in Large Model Era
Yue Zhou
Chenlu Guo
Xu Wang
Yi-Ju Chang
Yuan Wu
LM&MA, VLM
137
27
0
27 Jan 2024
Do deep neural networks utilize the weight space efficiently?
Onur Can Koyun
B. U. Toreyin
59
0
0
26 Jan 2024
Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities
Yiyuan Zhang
Xiaohan Ding
Kaixiong Gong
Yixiao Ge
Ying Shan
Xiangyu Yue
ViT
142
7
0
25 Jan 2024
Cross-Domain Few-Shot Learning via Adaptive Transformer Networks
Naeem Paeedeh
Mahardhika Pratama
M. A. Ma'sum
Wolfgang Mayer
Zehong Cao
Ryszard Kowalczyk
79
13
0
25 Jan 2024
Rethinking Patch Dependence for Masked Autoencoders
Letian Fu
Long Lian
Renhao Wang
Baifeng Shi
Xudong Wang
Adam Yala
Trevor Darrell
Alexei A. Efros
Ken Goldberg
144
17
0
25 Jan 2024
Democratizing Fine-grained Visual Recognition with Large Language Models
Mingxuan Liu
Subhankar Roy
Wenjing Li
Zhun Zhong
N. Sebe
Elisa Ricci
VLM
106
13
0
24 Jan 2024
Masked Particle Modeling on Sets: Towards Self-Supervised High Energy Physics Foundation Models
T. Golling
Lukas Heinrich
Michael Kagan
Samuel Klein
Matthew Leigh
Margarita Osadchy
J. A. Raine
94
27
0
24 Jan 2024
Do You Guys Want to Dance: Zero-Shot Compositional Human Dance Generation with Multiple Persons
Zhe Xu
Kun-Juan Wei
Xu Yang
Cheng Deng
DiffM
46
4
0
24 Jan 2024
Memory Consistency Guided Divide-and-Conquer Learning for Generalized Category Discovery
Yuanpeng Tu
Zhun Zhong
Yuxi Li
Hengshuang Zhao
99
0
0
24 Jan 2024
Facing the Elephant in the Room: Visual Prompt Tuning or Full Finetuning?
Cheng Han
Qifan Wang
Yiming Cui
Wenguan Wang
Lifu Huang
Siyuan Qi
Dongfang Liu
VLM
164
22
0
23 Jan 2024
DatUS^2: Data-driven Unsupervised Semantic Segmentation with Pre-trained Self-supervised Vision Transformer
Sonal Kumar
Arijit Sur
R. Baruah
ViT
102
2
0
23 Jan 2024
Self-Supervised Vision Transformers Are Efficient Segmentation Learners for Imperfect Labels
Seungho Lee
Seoungyoon Kang
Hyunjung Shim
ViT, VLM
68
0
0
23 Jan 2024
Self-supervised Learning of LiDAR 3D Point Clouds via 2D-3D Neural Calibration
Yifan Zhang
Siyu Ren
Xianqiang Lyu
Jinjian Wu
Guangming Shi
SSL, 3DPC
309
3
0
23 Jan 2024
Exploring Simple Open-Vocabulary Semantic Segmentation
Zihang Lai
VLM
76
0
0
22 Jan 2024
Template-Free Single-View 3D Human Digitalization with Diffusion-Guided LRM
Zhenzhen Weng
Jingyuan Liu
Hao Tan
Zhan Xu
Yang Zhou
Serena Yeung-Levy
Jimei Yang
3DH
82
10
0
22 Jan 2024
EmerDiff: Emerging Pixel-level Semantic Knowledge in Diffusion Models
Koichi Namekata
Amirmojtaba Sabour
Sanja Fidler
Seung Wook Kim
126
22
0
22 Jan 2024
MVSFormer++: Revealing the Devil in Transformer's Details for Multi-View Stereo
Chenjie Cao
Xinlin Ren
Yanwei Fu
96
29
0
22 Jan 2024
Zoom-shot: Fast and Efficient Unsupervised Zero-Shot Transfer of CLIP to Vision Encoders with Multimodal Loss
Jordan Shipard
Arnold Wiliem
Kien Nguyen Thanh
Wei Xiang
Clinton Fookes
VLM, CLIP
103
2
0
22 Jan 2024
UniM-OV3D: Uni-Modality Open-Vocabulary 3D Scene Understanding with Fine-Grained Feature Representation
Qingdong He
Jinlong Peng
Zhengkai Jiang
Kai Wu
Xiaozhong Ji
Jiangning Zhang
Yabiao Wang
Chengjie Wang
Mingang Chen
Yunsheng Wu
3DPC
73
8
0
21 Jan 2024
A Novel Benchmark for Few-Shot Semantic Segmentation in the Era of Foundation Models
Reda Bensaid
Vincent Gripon
François Leduc-Primeau
Lukas Mauch
G. B. Hacene
Fabien Cardinaux
VLM
99
7
0
20 Jan 2024
PhotoBot: Reference-Guided Interactive Photography via Natural Language
Oliver Limoyo
J. Li
D. Rivkin
Jonathan Kelly
Gregory Dudek
LM&Ro
75
0
0
19 Jan 2024
Memorization in Self-Supervised Learning Improves Downstream Generalization
Wenhao Wang
Muhammad Ahmad Kaleem
Adam Dziedzic
Michael Backes
Nicolas Papernot
Franziska Boenisch
SSL
88
11
0
19 Jan 2024
LDReg: Local Dimensionality Regularized Self-Supervised Learning
Hanxun Huang
R. Campello
S. Erfani
Xingjun Ma
Michael E. Houle
James Bailey
91
5
0
19 Jan 2024
Exploring scalable medical image encoders beyond text supervision
Fernando Pérez-García
Harshita Sharma
Sam Bond-Taylor
Kenza Bouzid
Valentina Salvatelli
...
Maria T. A. Wetscherek
Noel C. F. Codella
Stephanie L. Hyland
Javier Alvarez-Valle
Ozan Oktay
LM&MA, MedIm
146
9
0
19 Jan 2024
Reconstructing the Invisible: Video Frame Restoration through Siamese Masked Conditional Variational Autoencoder
Yongchen Zhou
Richard Jiang
46
0
0
18 Jan 2024
Supervised Fine-tuning in turn Improves Visual Foundation Models
Xiaohu Jiang
Yixiao Ge
Yuying Ge
Dachuan Shi
Chun Yuan
Ying Shan
VLM, CLIP
94
9
0
18 Jan 2024
CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects
Zhao Wang
Aoxue Li
Lingting Zhu
Yong Guo
Qi Dou
Zhenguo Li
VGen, DiffM
106
44
0
18 Jan 2024
Image Translation as Diffusion Visual Programmers
Cheng Han
James Liang
Qifan Wang
Majid Rabbani
S. Dianat
Raghuveer M. Rao
Ying Nian Wu
Dongfang Liu
79
8
0
18 Jan 2024
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Lianghui Zhu
Bencheng Liao
Qian Zhang
Xinlong Wang
Wenyu Liu
Xinggang Wang
Mamba
125
817
0
17 Jan 2024