Masked Autoencoders Are Scalable Vision Learners

11 November 2021

Piotr Dollár

Papers citing "Masked Autoencoders Are Scalable Vision Learners"

50 / 4,777 papers shown

Title
Next state prediction gives rise to entangled, yet compositional representations of objects Tankred Saanum Luca M. Schulze Buschoff Peter Dayan Eric Schulz OCL CoGe OOD 65 1 0 07 Oct 2024
A Simple Image Segmentation Framework via In-Context Examples Yang Liu Chenchen Jing Hengtao Li Muzhi Zhu Hao Chen Xinlong Wang Chunhua Shen 99 8 0 07 Oct 2024
Resource-Efficient Multiview Perception: Integrating Semantic Masking with Masked Autoencoders Kosta Dakic Kanchana Thilakarathna Rodrigo N. Calheiros Teng Joon Lim 60 0 0 07 Oct 2024
Masked Autoencoder with Swin Transformer Network for Mitigating Electrode Shift in HD-EMG-based Gesture Recognition Kasra Laamerad Mehran Shabanpour Md. Rabiul Islam Arash Mohammadi 38 0 0 07 Oct 2024
On Efficient Variants of Segment Anything Model: A Survey Xiaorui Sun Jing Liu Jikang Cheng Xiaofeng Zhu Ping Hu VLM 143 7 0 07 Oct 2024
Learning De-Biased Representations for Remote-Sensing Imagery Zichen Tian Zhaozheng Chen Qianru Sun 62 0 0 06 Oct 2024
Self-Supervised Anomaly Detection in the Wild: Favor Joint Embeddings Methods Daniel Otero Rafael Mateus Randall Balestriero 51 0 0 05 Oct 2024
Implicit to Explicit Entropy Regularization: Benchmarking ViT Fine-tuning under Noisy Labels Maria Marrium Arif Mahmood Mohammed Bennamoun NoLa AAML 102 0 0 05 Oct 2024
SyllableLM: Learning Coarse Semantic Units for Speech Language Models Alan Baade Puyuan Peng David Harwath 126 8 0 05 Oct 2024
IT$^3$: Idempotent Test-Time Training Nikita Durasov Assaf Shocher Doruk Öner Gal Chechik Alexei A. Efros Pascal Fua OOD VLM 117 1 0 05 Oct 2024
Not All Diffusion Model Activations Have Been Evaluated as Discriminative Features Benyuan Meng Qianqian Xu Zitai Wang Xiaochun Cao Qingming Huang 82 7 0 04 Oct 2024
VEDIT: Latent Prediction Architecture For Procedural Video Representation Learning Han Lin Tushar Nagarajan Nicolas Ballas Mido Assran Mojtaba Komeili Joey Tianyi Zhou Koustuv Sinha AI4TS 110 5 0 04 Oct 2024
Self-supervised Spatio-Temporal Graph Mask-Passing Attention Network for Perceptual Importance Prediction of Multi-point Tactility Dazhong He Qian Liu 28 0 0 04 Oct 2024
Adaptive Masking Enhances Visual Grounding Sen Jia Lei Li 75 0 0 04 Oct 2024
ECHOPulse: ECG controlled echocardio-grams video generation Yiwei Li Sekeun Kim Zihao Wu Hanqi Jiang Yi Pan ... Sifan Song Yucheng Shi Tianming Liu Quanzheng Li Xiang Li VGen 69 1 0 04 Oct 2024
Predictive Coding for Decision Transformer Tung M. Luu Donghoon Lee Chang D. Yoo OffRL 129 2 0 04 Oct 2024
AirLetters: An Open Video Dataset of Characters Drawn in the Air Rishit Dagli Guillaume Berger Joanna Materzynska Ingo Bax Roland Memisevic VGen 67 1 0 03 Oct 2024
Task-Decoupled Image Inpainting Framework for Class-specific Object Remover Changsuk Oh H. J. Kim 97 0 0 03 Oct 2024
A Foundation Model for the Solar Dynamics Observatory James Walsh Daniel G. Gass Raul Ramos Pollan P. Wright Richard Galvez Noah Kasmanoff Jason Naradowsky Anne Spalding James Parr Atılım Güneş Baydin 3DGS 14 0 0 03 Oct 2024
Personalized Federated Learning for Generative AI-Assisted Semantic Communications Yubo Peng Feibo Jiang Li Dong Kezhi Wang Kun Yang 80 2 0 03 Oct 2024
Unsupervised Meta-Learning via Dynamic Head and Heterogeneous Task Construction for Few-Shot Classification Yunchuan Guan Yu Liu Ketong Liu Ke Zhou Zhiqi Shen 78 1 0 03 Oct 2024
EmbedLLM: Learning Compact Representations of Large Language Models Richard Zhuang Tianhao Wu Zhaojin Wen Andrew Li Jiantao Jiao Kannan Ramchandran AIFin 69 6 0 03 Oct 2024
BiSSL: Enhancing the Alignment Between Self-Supervised Pretraining and Downstream Fine-Tuning via Bilevel Optimization Gustav Wagner Zakarias Lars Kai Hansen Zheng-Hua Tan 81 0 0 03 Oct 2024
MDSGen: Fast and Efficient Masked Diffusion Temporal-Aware Transformers for Open-Domain Sound Generation T. Pham Tri Ton Chang D. Yoo 105 3 0 03 Oct 2024
TAEGAN: Generating Synthetic Tabular Data For Data Augmentation Jiayu Li Zilong Zhao Kevin Yee Uzair Javaid Biplab Sikdar LMTD 73 1 0 02 Oct 2024
Forte: Finding Outliers with Representation Typicality Estimation Debargha Ganguly Warren Morningstar A. Yu Vipin Chaudhary OODD 93 2 0 02 Oct 2024
Denoising with a Joint-Embedding Predictive Architecture Dengsheng Chen Jie Hu Xiaoming Wei Enhua Wu DiffM 172 3 0 02 Oct 2024
Text2PDE: Latent Diffusion Models for Accessible Physics Simulation Anthony Zhou Zijie Li Michael Schneier John R Buchanan Jr Amir Barati Farimani AI4CE DiffM 170 8 0 02 Oct 2024
Pre-training with Synthetic Patterns for Audio Yuchi Ishikawa Tatsuya Komatsu Yoshimitsu Aoki 58 0 0 01 Oct 2024
Domain Aware Multi-Task Pretraining of 3D Swin Transformer for T1-weighted Brain MRI Jonghun Kim Mansu Kim Hyunjin Park MedIm ViT 54 0 0 01 Oct 2024
CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus Dataset Xiao Wang Fuling Wang Yuehang Li Qingchuan Ma Shiao Wang Bo Jiang Chuanfu Li Jin Tang 119 4 0 01 Oct 2024
MAP: Unleashing Hybrid Mamba-Transformer Vision Backbone's Potential with Masked Autoregressive Pretraining Yunze Liu Li Yi Mamba 190 3 0 01 Oct 2024
Advancing Medical Radiograph Representation Learning: A Hybrid Pre-training Paradigm with Multilevel Semantic Granularity Hanqi Jiang Xixuan Hao Yuzhou Huang Chong Ma Jiaxun Zhang Yi Pan Ruimao Zhang MedIm 175 0 0 01 Oct 2024
Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers Lirui Wang Xinlei Chen Jialiang Zhao Kaiming He 73 44 0 30 Sep 2024
AI Foundation Model for Heliophysics: Applications, Design, and Implementation Sujit Roy Talwinder Singh Marcus Freitag J. Schmude Rohit Lal ... Berkay Aydin Nikolai Pogorelov Juan Bernabé-Moreno M. Maskey Rahul Ramachandran MedIm AI4CE 94 0 0 30 Sep 2024
Task-Oriented Pre-Training for Drivable Area Detection Fulong Ma Guoyang Zhao Weiqing Qi Ming Liu Jun Ma VLM 66 1 0 30 Sep 2024
Masked Autoregressive Model for Weather Forecasting Doyi Kim Minseok Seo Hakjin Lee Junghoon Seo 77 0 0 30 Sep 2024
SurgPETL: Parameter-Efficient Image-to-Surgical-Video Transfer Learning for Surgical Phase Recognition Shu Yang Zhiyuan Cai Luyang Luo Ning Ma Shuchang Xu Hao Chen 67 1 0 30 Sep 2024
Image Copy Detection for Diffusion Models Wenhao Wang Yifan Sun Zhentao Tan Yi Yang 76 1 0 30 Sep 2024
MaskMamba: A Hybrid Mamba-Transformer Model for Masked Image Generation Wenchao Chen Liqiang Niu Ziyao Lu Fandong Meng Jie Zhou Mamba 96 4 0 30 Sep 2024
Towards Open-Vocabulary Semantic Segmentation Without Semantic Labels Heeseong Shin Chaehyun Kim Sunghwan Hong Seokju Cho Anurag Arnab Paul Hongsuck Seo Seungryong Kim VLM 82 1 0 30 Sep 2024
Annotation-Free Curb Detection Leveraging Altitude Difference Image Fulong Ma Peng Hou Yuxuan Liu Yang Liu Ming Liu Jun Ma 53 0 0 30 Sep 2024
Feature Extractor or Decision Maker: Rethinking the Role of Visual Encoders in Visuomotor Policies Ruiyu Wang Zheyu Zhuang Shutong Jin Nils Ingelhag Danica Kragic Florian T. Pokorny 97 0 0 30 Sep 2024
Vision-Language Models are Strong Noisy Label Detectors Tong Wei Haoyang Li Chun-Shu Li Jiang-Xin Shi Yu-Feng Li Min-Ling Zhang VLM 76 9 0 29 Sep 2024
Text-driven Human Motion Generation with Motion Masked Diffusion Model Xingyu Chen DiffM VGen 57 2 0 29 Sep 2024
Self-supervised Auxiliary Learning for Texture and Model-based Hybrid Robust and Fair Featuring in Face Analysis Shukesh Reddy Nishit Poddar Srijan Das Abhijit Das CVBM 74 0 0 29 Sep 2024
BiPC: Bidirectional Probability Calibration for Unsupervised Domain Adaption Wenlve Zhou Zhiheng Zhou Junyuan Shang Chang Niu Mingyue Zhang Xiyuan Tao Tianlei Wang 78 0 0 29 Sep 2024
Contrastive ground-level image and remote sensing pre-training improves representation learning for natural world imagery Andy V. Huynh Lauren E. Gillespie Jael Lopez-Saucedo Claire Tang Rohan Sikand Moisés Expósito-Alonso SSL 120 5 0 28 Sep 2024
Fast Encoding and Decoding for Implicit Video Representation Hao Chen Saining Xie Ser-Nam Lim Abhinav Shrivastava 83 1 0 28 Sep 2024
Brain-JEPA: Brain Dynamics Foundation Model with Gradient Positioning and Spatiotemporal Masking Zijian Dong Ruilin Li Yilei Wu Thuan Tinh Nguyen J. Chong Fang Ji Nathanael Ren Jie Tong Christopher Li Hsian Chen Juan Helen Zhou 60 9 0 28 Sep 2024