ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2302.05442
  4. Cited By
Scaling Vision Transformers to 22 Billion Parameters

Scaling Vision Transformers to 22 Billion Parameters

10 February 2023
Mostafa Dehghani
Josip Djolonga
Basil Mustafa
Piotr Padlewski
Jonathan Heek
Justin Gilmer
Andreas Steiner
Mathilde Caron
Robert Geirhos
Ibrahim M. Alabdulmohsin
Rodolphe Jenatton
Lucas Beyer
Michael Tschannen
Anurag Arnab
Tianlin Li
C. Riquelme
Matthias Minderer
J. Puigcerver
Utku Evci
Manoj Kumar
Sjoerd van Steenkiste
Gamaleldin F. Elsayed
Aravindh Mahendran
Feng Yu
Avital Oliver
Fantine Huot
Jasmijn Bastings
Mark Collier
A. Gritsenko
Vighnesh Birodkar
C. N. Vasconcelos
Yi Tay
Thomas Mensink
Alexander Kolesnikov
Filip Pavetić
Dustin Tran
Thomas Kipf
Mario Luvcić
Xiaohua Zhai
Daniel Keysers
Jeremiah Harmsen
N. Houlsby
    MLLM
ArXivPDFHTML

Papers citing "Scaling Vision Transformers to 22 Billion Parameters"

50 / 418 papers shown
Title
ZigMa: A DiT-style Zigzag Mamba Diffusion Model
ZigMa: A DiT-style Zigzag Mamba Diffusion Model
Vincent Tao Hu
S. A. Baumann
Ming Gui
Olga Grebenkova
Pingchuan Ma
Johannes S. Fischer
Bjorn Ommer
42
42
0
20 Mar 2024
When Do We Not Need Larger Vision Models?
When Do We Not Need Larger Vision Models?
Baifeng Shi
Ziyang Wu
Maolin Mao
Xin Wang
Trevor Darrell
VLM
LRM
54
41
0
19 Mar 2024
ADAPT to Robustify Prompt Tuning Vision Transformers
ADAPT to Robustify Prompt Tuning Vision Transformers
Masih Eskandar
Tooba Imtiaz
Zifeng Wang
Jennifer Dy
VPVLM
VLM
AAML
38
0
0
19 Mar 2024
Dynamic Tuning Towards Parameter and Inference Efficiency for ViT
  Adaptation
Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation
Wangbo Zhao
Jiasheng Tang
Yizeng Han
Yibing Song
Kai Wang
Gao Huang
F. Wang
Yang You
40
11
0
18 Mar 2024
Frozen Feature Augmentation for Few-Shot Image Classification
Frozen Feature Augmentation for Few-Shot Image Classification
Andreas Bär
N. Houlsby
Mostafa Dehghani
Manoj Kumar
VLM
34
4
0
15 Mar 2024
Depth-induced Saliency Comparison Network for Diagnosis of Alzheimer's
  Disease via Jointly Analysis of Visual Stimuli and Eye Movements
Depth-induced Saliency Comparison Network for Diagnosis of Alzheimer's Disease via Jointly Analysis of Visual Stimuli and Eye Movements
Yu Liu
Wenlin Zhang
Shaochu Wang
Fangyu Zuo
Peiguang Jing
Yong Ji
27
0
0
15 Mar 2024
Generalizing Denoising to Non-Equilibrium Structures Improves
  Equivariant Force Fields
Generalizing Denoising to Non-Equilibrium Structures Improves Equivariant Force Fields
Yi-Lun Liao
Tess E. Smidt
Abhishek Das
DiffM
AI4CE
40
12
0
14 Mar 2024
Language models scale reliably with over-training and on downstream
  tasks
Language models scale reliably with over-training and on downstream tasks
S. Gadre
Georgios Smyrnis
Vaishaal Shankar
Suchin Gururangan
Mitchell Wortsman
...
Y. Carmon
Achal Dave
Reinhard Heckel
Niklas Muennighoff
Ludwig Schmidt
ALM
ELM
LRM
108
40
0
13 Mar 2024
Not just Birds and Cars: Generic, Scalable and Explainable Models for
  Professional Visual Recognition
Not just Birds and Cars: Generic, Scalable and Explainable Models for Professional Visual Recognition
Junde Wu
Jiayuan Zhu
Min Xu
Yueming Jin
38
0
0
08 Mar 2024
Rule-driven News Captioning
Rule-driven News Captioning
Ning Xu
Tingting Zhang
Hongshuo Tian
An-An Liu
68
0
0
08 Mar 2024
ACC-ViT : Atrous Convolution's Comeback in Vision Transformers
ACC-ViT : Atrous Convolution's Comeback in Vision Transformers
Nabil Ibtehaz
Ning Yan
Masood S. Mortazavi
Daisuke Kihara
ViT
32
3
0
07 Mar 2024
Batch size invariant Adam
Batch size invariant Adam
Xi Wang
Laurence Aitchison
46
2
0
29 Feb 2024
Disentangling the Causes of Plasticity Loss in Neural Networks
Disentangling the Causes of Plasticity Loss in Neural Networks
Clare Lyle
Zeyu Zheng
Khimya Khetarpal
H. V. Hasselt
Razvan Pascanu
James Martens
Will Dabney
AI4CE
55
32
0
29 Feb 2024
Sora: A Review on Background, Technology, Limitations, and Opportunities
  of Large Vision Models
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Yixin Liu
Kai Zhang
Yuan Li
Zhiling Yan
Chujie Gao
...
Yue Huang
Hanchi Sun
Jianfeng Gao
Lifang He
Lichao Sun
VLM
VGen
EGVM
75
260
0
27 Feb 2024
Why Transformers Need Adam: A Hessian Perspective
Why Transformers Need Adam: A Hessian Perspective
Yushun Zhang
Congliang Chen
Tian Ding
Ziniu Li
Ruoyu Sun
Zhimin Luo
40
43
0
26 Feb 2024
Pretrained Visual Uncertainties
Pretrained Visual Uncertainties
Michael Kirchhof
Mark Collier
Seong Joon Oh
Enkelejda Kasneci
UQCV
410
8
1
26 Feb 2024
StochCA: A Novel Approach for Exploiting Pretrained Models with
  Cross-Attention
StochCA: A Novel Approach for Exploiting Pretrained Models with Cross-Attention
SeungWon Seo
Suho Lee
Sangheum Hwang
38
0
0
25 Feb 2024
Parameter-efficient Prompt Learning for 3D Point Cloud Understanding
Parameter-efficient Prompt Learning for 3D Point Cloud Understanding
Hongyu Sun
Yongcai Wang
Wang Chen
Haoran Deng
Deying Li
VPVLM
53
5
0
24 Feb 2024
Genie: Generative Interactive Environments
Genie: Generative Interactive Environments
Jake Bruce
Michael Dennis
Ashley D. Edwards
Jack Parker-Holder
Yuge Shi
...
Konrad Zolna
Jeff Clune
Nando de Freitas
Satinder Singh
Tim Rocktaschel
VGen
VLM
74
146
0
23 Feb 2024
Pushing Auto-regressive Models for 3D Shape Generation at Capacity and
  Scalability
Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability
Xue-Qing Qian
Yu Wang
Simian Luo
Yinda Zhang
Ying Tai
...
Xiangyang Xue
Bo Zhao
Tiejun Huang
Yunsheng Wu
Yanwei Fu
29
6
0
19 Feb 2024
Linear Transformers with Learnable Kernel Functions are Better
  In-Context Models
Linear Transformers with Learnable Kernel Functions are Better In-Context Models
Yaroslav Aksenov
Nikita Balagansky
Sofia Maria Lo Cicero Vaina
Boris Shaposhnikov
Alexey Gorbatovski
Daniil Gavrilov
KELM
41
5
0
16 Feb 2024
Bridging Associative Memory and Probabilistic Modeling
Bridging Associative Memory and Probabilistic Modeling
Rylan Schaeffer
Nika Zahedi
Mikail Khona
Dhruv Pai
Sang T. Truong
...
Sarthak Chandra
Andres Carranza
Ila Rani Fiete
Andrey Gromov
Oluwasanmi Koyejo
DiffM
43
4
0
15 Feb 2024
HEAL-ViT: Vision Transformers on a spherical mesh for medium-range
  weather forecasting
HEAL-ViT: Vision Transformers on a spherical mesh for medium-range weather forecasting
Vivek Ramavajjala
38
2
0
14 Feb 2024
For Better or For Worse? Learning Minimum Variance Features With Label Augmentation
For Better or For Worse? Learning Minimum Variance Features With Label Augmentation
Muthuraman Chidambaram
Rong Ge
AAML
30
0
0
10 Feb 2024
Time-, Memory- and Parameter-Efficient Visual Adaptation
Time-, Memory- and Parameter-Efficient Visual Adaptation
Otniel-Bogdan Mercea
Alexey Gritsenko
Cordelia Schmid
Anurag Arnab
VLM
35
13
0
05 Feb 2024
ClipFormer: Key-Value Clipping of Transformers on Memristive Crossbars
  for Write Noise Mitigation
ClipFormer: Key-Value Clipping of Transformers on Memristive Crossbars for Write Noise Mitigation
Abhiroop Bhattacharjee
Abhishek Moitra
Priyadarshini Panda
CLIP
24
6
0
04 Feb 2024
A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer
A Graph is Worth KKK Words: Euclideanizing Graph using Pure Transformer
Zhangyang Gao
Daize Dong
Cheng Tan
Jun Xia
Bozhen Hu
Stan Z. Li
46
6
0
04 Feb 2024
Revisiting the Power of Prompt for Visual Tuning
Revisiting the Power of Prompt for Visual Tuning
Yuzhu Wang
Lechao Cheng
Chaowei Fang
Dingwen Zhang
Manni Duan
Meng Wang
VLM
56
14
0
04 Feb 2024
A General Framework for Learning from Weak Supervision
A General Framework for Learning from Weak Supervision
Hao Chen
Jindong Wang
Lei Feng
Xiang Li
Yidong Wang
Xing Xie
Masashi Sugiyama
Rita Singh
Bhiksha Raj
36
3
0
02 Feb 2024
Leveraging Large Language Models for Analyzing Blood Pressure Variations
  Across Biological Sex from Scientific Literature
Leveraging Large Language Models for Analyzing Blood Pressure Variations Across Biological Sex from Scientific Literature
Yuting Guo
Seyedeh Somayyeh Mousavi
Reza Sameni
Abeed Sarker
22
0
0
02 Feb 2024
Simulation of Graph Algorithms with Looped Transformers
Simulation of Graph Algorithms with Looped Transformers
Artur Back de Luca
K. Fountoulakis
58
14
0
02 Feb 2024
Convolution Meets LoRA: Parameter Efficient Finetuning for Segment
  Anything Model
Convolution Meets LoRA: Parameter Efficient Finetuning for Segment Anything Model
Zihan Zhong
Zhiqiang Tang
Tong He
Haoyang Fang
Chun Yuan
48
41
0
31 Jan 2024
VIALM: A Survey and Benchmark of Visually Impaired Assistance with Large
  Models
VIALM: A Survey and Benchmark of Visually Impaired Assistance with Large Models
Yi Zhao
Yilin Zhang
Rong Xiang
Jing Li
Hillming Li
43
16
0
29 Jan 2024
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass
  Diffusion Transformers
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers
Katherine Crowson
Stefan Andreas Baumann
Alex Birch
Tanishq Mathew Abraham
Daniel Z. Kaplan
Enrico Shippole
29
48
0
21 Jan 2024
Accelerating Heterogeneous Tensor Parallelism via Flexible Workload
  Control
Accelerating Heterogeneous Tensor Parallelism via Flexible Workload Control
Zhigang Wang
Xu Zhang
Ning Wang
Chuanfei Xu
Jie Nie
Zhiqiang Wei
Yu Gu
Ge Yu
21
0
0
21 Jan 2024
Exploring scalable medical image encoders beyond text supervision
Exploring scalable medical image encoders beyond text supervision
Fernando Pérez-García
Harshita Sharma
Sam Bond-Taylor
Kenza Bouzid
Valentina Salvatelli
...
Maria T. A. Wetscherek
Noel C. F. Codella
Stephanie L. Hyland
Javier Alvarez-Valle
Ozan Oktay
LM&MA
MedIm
50
9
0
19 Jan 2024
Scalable Pre-training of Large Autoregressive Image Models
Scalable Pre-training of Large Autoregressive Image Models
Alaaeldin El-Nouby
Michal Klein
Shuangfei Zhai
Miguel Angel Bautista
Alexander Toshev
Vaishaal Shankar
J. Susskind
Armand Joulin
VLM
33
72
0
16 Jan 2024
Transformer for Object Re-Identification: A Survey
Transformer for Object Re-Identification: A Survey
Mang Ye
Shuo Chen
Chenyue Li
Wei-Shi Zheng
David J. Crandall
Bo Du
ViT
98
13
0
13 Jan 2024
OTAS: An Elastic Transformer Serving System via Token Adaptation
OTAS: An Elastic Transformer Serving System via Token Adaptation
Jinyu Chen
Wenchao Xu
Zicong Hong
Song Guo
Yining Qi
Jie Zhang
Deze Zeng
38
4
0
10 Jan 2024
Revisiting Adversarial Training at Scale
Revisiting Adversarial Training at Scale
Zeyu Wang
Xianhang Li
Hongru Zhu
Cihang Xie
34
15
0
09 Jan 2024
Towards Truly Zero-shot Compositional Visual Reasoning with LLMs as
  Programmers
Towards Truly Zero-shot Compositional Visual Reasoning with LLMs as Programmers
Aleksandar Stanić
Sergi Caelles
Michael Tschannen
LRM
VLM
27
9
0
03 Jan 2024
SpecFormer: Guarding Vision Transformer Robustness via Maximum Singular
  Value Penalization
SpecFormer: Guarding Vision Transformer Robustness via Maximum Singular Value Penalization
Xixu Hu
Runkai Zheng
Jindong Wang
Cheuk Hang Leung
Qi Wu
Xing Xie
35
1
0
02 Jan 2024
Analyzing Local Representations of Self-supervised Vision Transformers
Analyzing Local Representations of Self-supervised Vision Transformers
Ani Vanyan
Alvard Barseghyan
Hakob Tamazyan
Vahan Huroyan
Hrant Khachatrian
Martin Danelljan
47
3
0
31 Dec 2023
MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining
MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining
Jacob P. Portes
Alex Trott
Sam Havens
Daniel King
Abhinav Venigalla
Moin Nadeem
Nikhil Sardana
D. Khudia
Jonathan Frankle
26
16
0
29 Dec 2023
An Empirical Study of Scaling Law for OCR
An Empirical Study of Scaling Law for OCR
Miao Rang
Zhenni Bi
Chuanjian Liu
Yunhe Wang
Kai Han
41
6
0
29 Dec 2023
Learning Vision from Models Rivals Learning Vision from Data
Learning Vision from Models Rivals Learning Vision from Data
Yonglong Tian
Lijie Fan
Kaifeng Chen
Dina Katabi
Dilip Krishnan
Phillip Isola
27
45
0
28 Dec 2023
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision,
  Language, Audio, and Action
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action
Jiasen Lu
Christopher Clark
Sangho Lee
Zichen Zhang
Savya Khosla
Ryan Marten
Derek Hoiem
Aniruddha Kembhavi
VLM
MLLM
40
144
0
28 Dec 2023
Make Me a BNN: A Simple Strategy for Estimating Bayesian Uncertainty
  from Pre-trained Models
Make Me a BNN: A Simple Strategy for Estimating Bayesian Uncertainty from Pre-trained Models
Gianni Franchi
Olivier Laurent
Maxence Leguéry
Andrei Bursuc
Andrea Pilzer
Angela Yao
UQCV
BDL
23
4
0
23 Dec 2023
How Smooth Is Attention?
How Smooth Is Attention?
Valérie Castin
Pierre Ablin
Gabriel Peyré
AAML
40
9
0
22 Dec 2023
InternVL: Scaling up Vision Foundation Models and Aligning for Generic
  Visual-Linguistic Tasks
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks
Zhe Chen
Jiannan Wu
Wenhai Wang
Weijie Su
Guo Chen
...
Bin Li
Ping Luo
Tong Lu
Yu Qiao
Jifeng Dai
VLM
MLLM
176
943
0
21 Dec 2023
Previous
123456789
Next