Scaling Vision Transformers


8 June 2021
Xiaohua Zhai
Alexander Kolesnikov
N. Houlsby
Lucas Beyer
    ViT

Papers citing "Scaling Vision Transformers"

50 / 751 papers shown
VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
Limin Wang
Bingkun Huang
Zhiyu Zhao
Zhan Tong
Yinan He
Yi Wang
Yali Wang
Yu Qiao
VGen
71
329
0
29 Mar 2023
Unmasked Teacher: Towards Training-Efficient Video Foundation Models
Kunchang Li
Yali Wang
Yizhuo Li
Yi Wang
Yinan He
Limin Wang
Yu Qiao
VGen
57
156
0
28 Mar 2023
Sigmoid Loss for Language Image Pre-Training
Xiaohua Zhai
Basil Mustafa
Alexander Kolesnikov
Lucas Beyer
CLIP
VLM
36
960
0
27 Mar 2023
Text-to-Image Diffusion Models are Zero-Shot Classifiers
Kevin Clark
P. Jaini
DiffM
VLM
38
107
0
27 Mar 2023
Spatio-Temporal driven Attention Graph Neural Network with Block Adjacency matrix (STAG-NN-BA)
U. Nazir
W. Islam
M. Taj
38
3
0
25 Mar 2023
VILA: Learning Image Aesthetics from User Comments with Vision-Language Pretraining
Junjie Ke
Keren Ye
Jiahui Yu
Yonghui Wu
P. Milanfar
Feng Yang
VLM
46
55
0
24 Mar 2023
The Quantization Model of Neural Scaling
Eric J. Michaud
Ziming Liu
Uzay Girit
Max Tegmark
MILM
27
77
0
23 Mar 2023
The effectiveness of MAE pre-pretraining for billion-scale pretraining
Mannat Singh
Quentin Duval
Kalyan Vasudev Alwala
Haoqi Fan
Vaibhav Aggarwal
...
Piotr Dollár
Christoph Feichtenhofer
Ross B. Girshick
Rohit Girdhar
Ishan Misra
LRM
126
63
0
23 Mar 2023
CoBIT: A Contrastive Bi-directional Image-Text Generation Model
Haoxuan You
Mandy Guo
Zhecan Wang
Kai-Wei Chang
Jason Baldridge
Jiahui Yu
DiffM
49
13
0
23 Mar 2023
An Extended Study of Human-like Behavior under Adversarial Training
Paul Gavrikov
J. Keuper
M. Keuper
AAML
31
9
0
22 Mar 2023
CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation
Seokju Cho
Heeseong Shin
Sung‐Jin Hong
Anurag Arnab
Paul Hongsuck Seo
Seung Wook Kim
VLM
29
104
0
21 Mar 2023
Large AI Models in Health Informatics: Applications, Challenges, and the Future
Jianing Qiu
Lin Li
Jiankai Sun
Jiachuan Peng
Peilun Shi
...
Bo Xiao
Wu Yuan
Ningli Wang
Dong Xu
Benny Lo
AI4MH
LM&MA
42
127
0
21 Mar 2023
EVA-02: A Visual Representation for Neon Genesis
Yuxin Fang
Quan-Sen Sun
Xinggang Wang
Tiejun Huang
Xinlong Wang
Yue Cao
VLM
ViT
CLIP
40
259
0
20 Mar 2023
What does it take to catch a Chinchilla? Verifying Rules on Large-Scale Neural Network Training via Compute Monitoring
Yonadav Shavit
31
22
0
20 Mar 2023
Dual-path Adaptation from Image to Video Transformers
Jungin Park
Jiyoung Lee
Kwanghoon Sohn
ViT
21
37
0
17 Mar 2023
SemDeDup: Data-efficient learning at web-scale through semantic deduplication
Amro Abbas
Kushal Tirumala
Daniel Simig
Surya Ganguli
Ari S. Morcos
31
162
0
16 Mar 2023
Stabilizing Transformer Training by Preventing Attention Entropy Collapse
Shuangfei Zhai
Tatiana Likhomanenko
Etai Littwin
Dan Busbridge
Jason Ramapuram
Yizhe Zhang
Jiatao Gu
J. Susskind
AAML
46
65
0
11 Mar 2023
Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
Chenfei Wu
Sheng-Kai Yin
Weizhen Qi
Xiaodong Wang
Zecheng Tang
Nan Duan
MLLM
LRM
53
614
0
08 Mar 2023
Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?
Boris Knyazev
Doha Hwang
Simon Lacoste-Julien
AI4CE
37
17
0
07 Mar 2023
UniHCP: A Unified Model for Human-Centric Perceptions
Yuanzheng Ci
Yizhou Wang
Meilin Chen
Shixiang Tang
Lei Bai
Feng Zhu
Rui Zhao
F. Yu
Donglian Qi
Wanli Ouyang
82
51
0
06 Mar 2023
Adversarial Attacks on Machine Learning in Embedded and IoT Platforms
Christian Westbrook
S. Pasricha
AAML
25
3
0
03 Mar 2023
Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
Shiwei Liu
Tianlong Chen
Zhenyu Zhang
Xuxi Chen
Tianjin Huang
Ajay Jaiswal
Zhangyang Wang
32
29
0
03 Mar 2023
Visual Atoms: Pre-training Vision Transformers with Sinusoidal Waves
Sora Takashima
Ryo Hayamizu
Nakamasa Inoue
Hirokatsu Kataoka
Rio Yokota
68
18
0
02 Mar 2023
OmniForce: On Human-Centered, Large Model Empowered and Cloud-Edge Collaborative AutoML System
Chao Xue
Wei Liu
Shunxing Xie
Zhenfang Wang
Jiaxing Li
...
Shi-Yong Chen
Yibing Zhan
Jing Zhang
Chaoyue Wang
Dacheng Tao
43
2
0
01 Mar 2023
Generic-to-Specific Distillation of Masked Autoencoders
Wei Huang
Zhiliang Peng
Li Dong
Furu Wei
Jianbin Jiao
QiXiang Ye
32
22
0
28 Feb 2023
The Dormant Neuron Phenomenon in Deep Reinforcement Learning
Ghada Sokar
Rishabh Agarwal
Pablo Samuel Castro
Utku Evci
CLL
51
89
0
24 Feb 2023
Language-Driven Representation Learning for Robotics
Siddharth Karamcheti
Suraj Nair
Annie S. Chen
Thomas Kollar
Chelsea Finn
Dorsa Sadigh
Percy Liang
LM&Ro
SSL
47
145
0
24 Feb 2023
Learning Visual Representations via Language-Guided Sampling
Mohamed El Banani
Karan Desai
Justin Johnson
SSL
VLM
21
28
0
23 Feb 2023
On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective
Jindong Wang
Xixu Hu
Wenxin Hou
Hao Chen
Runkai Zheng
...
Weirong Ye
Xiubo Geng
Binxing Jiao
Yue Zhang
Xingxu Xie
AI4MH
52
220
0
22 Feb 2023
Open-domain Visual Entity Recognition: Towards Recognizing Millions of Wikipedia Entities
Hexiang Hu
Yi Luan
Yang Chen
Urvashi Khandelwal
Mandar Joshi
Kenton Lee
Kristina Toutanova
Ming-Wei Chang
VLM
55
55
0
22 Feb 2023
Time to Embrace Natural Language Processing (NLP)-based Digital Pathology: Benchmarking NLP- and Convolutional Neural Network-based Deep Learning Pipelines
M. Cen
Xingyu Li
Bangwei Guo
J. Jonnagaddala
Hong Zhang
Xuesong Xu
MedIm
LM&MA
16
0
0
21 Feb 2023
Optical Transformers
Maxwell G. Anderson
Shifan Ma
Tianyu Wang
Logan G. Wright
Peter L. McMahon
20
20
0
20 Feb 2023
Scaling Laws for Multilingual Neural Machine Translation
Patrick Fernandes
Behrooz Ghorbani
Xavier Garcia
Markus Freitag
Orhan Firat
38
29
0
19 Feb 2023
Tuning computer vision models with task rewards
André Susano Pinto
Alexander Kolesnikov
Yuge Shi
Lucas Beyer
Xiaohua Zhai
VLM
27
40
0
16 Feb 2023
Towards Efficient Visual Adaption via Structural Re-parameterization
Gen Luo
Minglang Huang
Yiyi Zhou
Xiaoshuai Sun
Guannan Jiang
Zhiyu Wang
Rongrong Ji
VLM
VPVLM
14
78
0
16 Feb 2023
Data pruning and neural scaling laws: fundamental limitations of score-based algorithms
Fadhel Ayed
Soufiane Hayou
14
9
0
14 Feb 2023
Symbolic Discovery of Optimization Algorithms
Xiangning Chen
Chen Liang
Da Huang
Esteban Real
Kaiyuan Wang
...
Xuanyi Dong
Thang Luong
Cho-Jui Hsieh
Yifeng Lu
Quoc V. Le
67
353
0
13 Feb 2023
Quantum Neuron Selection: Finding High Performing Subnetworks With Quantum Algorithms
Tim Whitaker
33
1
0
12 Feb 2023
Synaptic Stripping: How Pruning Can Bring Dead Neurons Back To Life
Tim Whitaker
L. D. Whitley
CVBM
33
2
0
11 Feb 2023
Scaling Vision Transformers to 22 Billion Parameters
Mostafa Dehghani
Josip Djolonga
Basil Mustafa
Piotr Padlewski
Jonathan Heek
...
Mario Lučić
Xiaohua Zhai
Daniel Keysers
Jeremiah Harmsen
N. Houlsby
MLLM
66
572
0
10 Feb 2023
SimCon Loss with Multiple Views for Text Supervised Semantic Segmentation
Yash J. Patel
Yusheng Xie
Yi Zhu
Srikar Appalaraju
R. Manmatha
35
4
0
07 Feb 2023
AIM: Adapting Image Models for Efficient Video Action Recognition
Taojiannan Yang
Yi Zhu
Yusheng Xie
Aston Zhang
Cheng Chen
Mu Li
ViT
58
144
0
06 Feb 2023
Mixed Precision Post Training Quantization of Neural Networks with Sensitivity Guided Search
Clemens J. S. Schaefer
Elfie Guo
Caitlin Stanton
Xiaofan Zhang
T. Jablin
Navid Lambert-Shirzad
Jian Li
Chia-Wei Chou
Siddharth Joshi
Yu Wang
MQ
31
3
0
02 Feb 2023
Dual PatchNorm
Manoj Kumar
Mostafa Dehghani
N. Houlsby
UQCV
ViT
29
11
0
02 Feb 2023
Does Vision Accelerate Hierarchical Generalization of Neural Language Learners?
Tatsuki Kuribayashi
VLM
19
1
0
01 Feb 2023
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video
Haiyang Xu
Qinghao Ye
Mingshi Yan
Yaya Shi
Jiabo Ye
...
Guohai Xu
Ji Zhang
Songfang Huang
Feiran Huang
Jingren Zhou
MLLM
VLM
MoE
43
160
0
01 Feb 2023
Spyker: High-performance Library for Spiking Deep Neural Networks
Shahriar Rezghi Shirsavar
M. Dehaqani
14
0
0
31 Jan 2023
Adaptive Computation with Elastic Input Sequence
Fuzhao Xue
Valerii Likhosherstov
Anurag Arnab
N. Houlsby
Mostafa Dehghani
Yang You
31
19
0
30 Jan 2023
A Closer Look at Few-shot Classification Again
Xu Luo
Hao Wu
Ji Zhang
Lianli Gao
Jing Xu
Jingkuan Song
24
48
0
28 Jan 2023
Norm-based Generalization Bounds for Compositionally Sparse Neural Networks
Tomer Galanti
Mengjia Xu
Liane Galanti
T. Poggio
35
9
0
28 Jan 2023