Masked Autoencoders Are Scalable Vision Learners

11 November 2021
Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross B. Girshick
ViT · TPM
arXiv: 2111.06377 (abs · PDF · HTML)

Papers citing "Masked Autoencoders Are Scalable Vision Learners"

Showing 50 of 4,777 citing papers.

Revisiting adapters with adversarial training
Sylvestre-Alvise Rebuffi, Francesco Croce, Sven Gowal
AAML · 10 Oct 2022

Exploiting map information for self-supervised learning in motion forecasting
Caio Azevedo, Thomas Gilles, S. Sabatini, D. Tsishkou
SSL · 10 Oct 2022

Denoising Masked AutoEncoders Help Robust Classification
Quanlin Wu, Hang Ye, Yuntian Gu, Huishuai Zhang, Liwei Wang, Di He
10 Oct 2022

A Comprehensive Survey of Data Augmentation in Visual Reinforcement Learning
Guozheng Ma, Zhen Wang, Zhecheng Yuan, Xueqian Wang, Bo Yuan, Dacheng Tao
OffRL · 10 Oct 2022

Scaling Up Probabilistic Circuits by Latent Variable Distillation
Hoang Trung-Dung, Honghua Zhang, Guy Van den Broeck
TPM · 10 Oct 2022

Learning to Decompose Visual Features with Latent Textual Prompts
Feng Wang, Manling Li, Xudong Lin, Hairong Lv, Alex Schwing, Heng Ji
VLM · 09 Oct 2022

MAMO: Masked Multimodal Modeling for Fine-Grained Vision-Language Representation Learning
Zijia Zhao, Longteng Guo, Xingjian He, Shuai Shao, Zehuan Yuan, Jing Liu
09 Oct 2022

Deep Span Representations for Named Entity Recognition
Enwei Zhu, Yiyang Liu, Jinpeng Li
09 Oct 2022

Self-supervised Video Representation Learning with Motion-Aware Masked Autoencoders
Haosen Yang, Deng Huang, Bin Wen, Jiannan Wu, Huanjin Yao, Yi Jiang, Xiatian Zhu, Zehuan Yuan
09 Oct 2022

VoLTA: Vision-Language Transformer with Weakly-Supervised Local-Feature Alignment
Shraman Pramanick, Li Jing, Sayan Nag, Jiachen Zhu, Hardik Shah, Yann LeCun, Ramalingam Chellappa
09 Oct 2022

Robustness of Unsupervised Representation Learning without Labels
Aleksandar Petrov, Marta Z. Kwiatkowska
OffRL · 08 Oct 2022

(Fusionformer):Exploiting the Joint Motion Synergy with Fusion Network Based On Transformer for 3D Human Pose Estimation
Xinwei Yu, Xiaohua Zhang
ViT · 08 Oct 2022

ViewFool: Evaluating the Robustness of Visual Recognition to Adversarial Viewpoints
Yinpeng Dong, Shouwei Ruan, Hang Su, Cai Kang, Xingxing Wei, Junyi Zhu
AAML · 08 Oct 2022

AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models
S. Kwon, Jeonghoon Kim, Jeongin Bae, Kang Min Yoo, Jin-Hwa Kim, Baeseong Park, Byeongwook Kim, Jung-Woo Ha, Nako Sung, Dongsoo Lee
MQ · 08 Oct 2022

SVL-Adapter: Self-Supervised Adapter for Vision-Language Pretrained Models
Omiros Pantazis, Gabriel J. Brostow, Kate E. Jones, Oisin Mac Aodha
VLM · 07 Oct 2022

Pre-trained Adversarial Perturbations
Y. Ban, Yinpeng Dong
AAML · 07 Oct 2022

Critical Learning Periods for Multisensory Integration in Deep Networks
Michael Kleinman, Alessandro Achille, Stefano Soatto
06 Oct 2022

Real-World Robot Learning with Masked Visual Pre-training
Ilija Radosavovic, Tete Xiao, Stephen James, Pieter Abbeel, Jitendra Malik, Trevor Darrell
SSL · 06 Oct 2022

VIMA: General Robot Manipulation with Multimodal Prompts
Yunfan Jiang, Agrim Gupta, Zichen Zhang, Guanzhi Wang, Yongqiang Dou, Yanjun Chen, Li Fei-Fei, Anima Anandkumar, Yuke Zhu, Linxi Fan
LM&Ro · 06 Oct 2022

The Lie Derivative for Measuring Learned Equivariance
Nate Gruver, Marc Finzi, Micah Goldblum, A. Wilson
06 Oct 2022

Effective Self-supervised Pre-training on Low-compute Networks without Distillation
Fuwen Tan, F. Saleh, Brais Martínez
06 Oct 2022

PSVRF: Learning to restore Pitch-Shifted Voice without reference
Yangfu Li, Xiaodan Lin, Jiaxin Yang
06 Oct 2022

Active Image Indexing
Pierre Fernandez, Matthijs Douze, Hervé Jégou, Teddy Furon
VLM · 05 Oct 2022

Image Masking for Robust Self-Supervised Monocular Depth Estimation
Hemang Chawla, Kishaan Jeeveswaran, Elahe Arani, Bahram Zonooz
MDE · 05 Oct 2022

Vision+X: A Survey on Multimodal Learning in the Light of Data
Ye Zhu, Yuehua Wu, N. Sebe, Yan Yan
05 Oct 2022

RankMe: Assessing the downstream performance of pretrained self-supervised representations by their rank
Q. Garrido, Randall Balestriero, Laurent Najman, Yann LeCun
SSL · 05 Oct 2022

Exploring The Role of Mean Teachers in Self-supervised Masked Auto-Encoders
Youngwan Lee, Jeffrey Willette, Jonghee Kim, Juho Lee, Sung Ju Hwang
05 Oct 2022

Self-supervised Pre-training for Semantic Segmentation in an Indoor Scene
Sulabh Shrestha, Yimeng Li, Jana Kosecka
3DPC · SSL · SSeg · 04 Oct 2022

Backdoor Attacks in the Supply Chain of Masked Image Modeling
Xinyue Shen, Xinlei He, Zheng Li, Yun Shen, Michael Backes, Yang Zhang
04 Oct 2022

VICRegL: Self-Supervised Learning of Local Visual Features
Adrien Bardes, Jean Ponce, Yann LeCun
SSL · 04 Oct 2022

Learning from the Best: Contrastive Representations Learning Across Sensor Locations for Wearable Activity Recognition
Vitor Fortes Rey, Sungho Suh, P. Lukowicz
SSL · HAI · 04 Oct 2022

MTSMAE: Masked Autoencoders for Multivariate Time-Series Forecasting
Peiwang Tang, Xianchao Zhang
AI4TS · 04 Oct 2022

CLIP2Point: Transfer CLIP to Point Cloud Classification with Image-Depth Pre-training
Tianyu Huang, Bowen Dong, Yunhan Yang, Xiaoshui Huang, Rynson W. H. Lau, Wanli Ouyang, W. Zuo
VLM · 3DPC · CLIP · 03 Oct 2022

Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning
Weicong Liang, Yuhui Yuan, Henghui Ding, Xiao Luo, Weihong Lin, Ding Jia, Zheng Zhang, Chao Zhang, Hanhua Hu
03 Oct 2022

Attention Distillation: self-supervised vision transformer students need more guidance
Kai Wang, Fei Yang, Joost van de Weijer
ViT · 03 Oct 2022

Enhancing Fine-Grained 3D Object Recognition using Hybrid Multi-Modal Vision Transformer-CNN Models
Songsong Xiong, Georgios Tziafas, Hamidreza Kasaei
ViT · 03 Oct 2022

Masked Supervised Learning for Semantic Segmentation
H. Zunair, A. Ben Hamza
03 Oct 2022

Fill in Fabrics: Body-Aware Self-Supervised Inpainting for Image-Based Virtual Try-On
H. Zunair, Y. Gobeil, Samuel Mercier, A. Ben Hamza
03 Oct 2022

Towards a Unified View on Visual Parameter-Efficient Transfer Learning
Bruce X. B. Yu, Jianlong Chang, Lin Liu, Qi Tian, Changan Chen
VPVLM · VLM · 03 Oct 2022

Under the Cover Infant Pose Estimation using Multimodal Data
Daniel G. Kyrollos, A. Fuller, K. Greenwood, J. Harrold, J.R. Green
3DH · 03 Oct 2022

Contrastive Audio-Visual Masked Autoencoder
Yuan Gong, Andrew Rouditchenko, Alexander H. Liu, David Harwath, Leonid Karlinsky, Hilde Kuehne, James R. Glass
02 Oct 2022

Federated Training of Dual Encoding Models on Small Non-IID Client Datasets
Raviteja Vemulapalli, Warren Morningstar, Philip Mansfield, Hubert Eichner, K. Singhal, Arash Afkanpour, Bradley Green
FedML · 30 Sep 2022

VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training
Yecheng Jason Ma, Shagun Sodhani, Dinesh Jayaraman, Osbert Bastani, Vikash Kumar, Amy Zhang
SSL · OffRL · 30 Sep 2022

Where Should I Spend My FLOPS? Efficiency Evaluations of Visual Pre-training Methods
Skanda Koppula, Yazhe Li, Evan Shelhamer, Andrew Jaegle, Nikhil Parthasarathy, Relja Arandjelović, João Carreira, Olivier J. Hénaff
30 Sep 2022

Slimmable Networks for Contrastive Self-supervised Learning
Shuai Zhao, Xiaohan Wang, Linchao Zhu, Yi Yang
30 Sep 2022

Rethinking the Learning Paradigm for Facial Expression Recognition
Weijie Wang, N. Sebe, Bruno Lepri
30 Sep 2022

Learning Transferable Spatiotemporal Representations from Natural Script Knowledge
Ziyun Zeng, Yuying Ge, Xihui Liu, Bin Chen, Ping Luo, Shutao Xia, Yixiao Ge
AI4TS · 30 Sep 2022

Universal Prompt Tuning for Graph Neural Networks
Taoran Fang, Yunchao Zhang, Yang Yang, Chunping Wang, Lei Chen
30 Sep 2022

Self-Distillation for Further Pre-training of Transformers
Seanie Lee, Minki Kang, Juho Lee, Sung Ju Hwang, Kenji Kawaguchi
30 Sep 2022

Improving Molecular Pretraining with Complementary Featurizations
Yanqiao Zhu, Dingshuo Chen, Yuanqi Du, Yingze Wang, Qiang Liu, Shu Wu
AI4CE · 29 Sep 2022