Scaling Vision Transformers

8 June 2021
Xiaohua Zhai
Alexander Kolesnikov
N. Houlsby
Lucas Beyer
    ViT

Papers citing "Scaling Vision Transformers"

50 / 751 papers shown
Tell, don't show: Declarative facts influence how LLMs generalize
Alexander Meinke
Owain Evans
21
7
0
12 Dec 2023
CLIP in Medical Imaging: A Comprehensive Survey
Zihao Zhao
Yuxiao Liu
Han Wu
Yonghao Li
Sheng Wang
L. Teng
Disheng Liu
Zhiming Cui
Qian Wang
Dinggang Shen
CLIP
MedIm
LM&MA
VLM
28
2
0
12 Dec 2023
Photorealistic Video Generation with Diffusion Models
Agrim Gupta
Lijun Yu
Kihyuk Sohn
Xiuye Gu
Meera Hahn
Fei-Fei Li
Irfan Essa
Lu Jiang
José Lezama
VGen
57
174
0
11 Dec 2023
Bad Students Make Great Teachers: Active Learning Accelerates Large-Scale Visual Understanding
Talfan Evans
Shreya Pathak
Hamza Merzic
Jonathan Schwarz
Ryutaro Tanno
Olivier J. Hénaff
18
16
0
08 Dec 2023
Scaling Laws of Synthetic Images for Model Training ... for Now
Lijie Fan
Kaifeng Chen
Dilip Krishnan
Dina Katabi
Phillip Isola
Yonglong Tian
CLIP
VLM
41
61
0
07 Dec 2023
GenTron: Diffusion Transformers for Image and Video Generation
Shoufa Chen
Mengmeng Xu
Jiawei Ren
Yuren Cong
Sen He
Yanping Xie
Animesh Sinha
Ping Luo
Tao Xiang
Juan-Manuel Perez-Rua
VGen
36
38
0
07 Dec 2023
Transformer-Powered Surrogates Close the ICF Simulation-Experiment Gap with Extremely Limited Data
M. Olson
Shusen Liu
Jayaraman J. Thiagarajan
B. Kustowski
Weng-Keen Wong
Rushil Anirudh
AI4CE
30
1
0
06 Dec 2023
MoSA: Mixture of Sparse Adapters for Visual Efficient Tuning
Qizhe Zhang
Bocheng Zou
Ruichuan An
Jiaming Liu
Shanghang Zhang
MoE
27
2
0
05 Dec 2023
SEVA: Leveraging sketches to evaluate alignment between human and machine visual abstraction
Kushin Mukherjee
Holly Huey
Xuanchen Lu
Yael Vinker
Rio Aguina-Kang
Ariel Shamir
Judith E. Fan
27
11
0
05 Dec 2023
Rejuvenating image-GPT as Strong Visual Representation Learners
Sucheng Ren
Zeyu Wang
Hongru Zhu
Junfei Xiao
Alan L. Yuille
Cihang Xie
VLM
57
7
0
04 Dec 2023
GIVT: Generative Infinite-Vocabulary Transformers
Michael Tschannen
Cian Eastwood
Fabian Mentzer
31
34
0
04 Dec 2023
Bootstrapping SparseFormers from Vision Foundation Models
Ziteng Gao
Zhan Tong
K. Lin
Joya Chen
Mike Zheng Shou
38
0
0
04 Dec 2023
InstructTA: Instruction-Tuned Targeted Attack for Large Vision-Language Models
Xunguang Wang
Zhenlan Ji
Pingchuan Ma
Zongjie Li
Shuai Wang
MLLM
43
11
0
04 Dec 2023
Improve Supervised Representation Learning with Masked Image Modeling
Kaifeng Chen
Daniel M. Salz
Huiwen Chang
Kihyuk Sohn
Dilip Krishnan
Mojtaba Seyedhosseini
SSL
ViT
39
3
0
01 Dec 2023
Brainformer: Mimic Human Visual Brain Functions to Machine Vision Models via fMRI
Xuan-Bac Nguyen
Xin Li
Pawan Sinha
Samee U. Khan
Khoa Luu
ViT
MedIm
29
0
0
30 Nov 2023
Leveraging VLM-Based Pipelines to Annotate 3D Objects
Rishabh Kabra
Loic Matthey
Alexander Lerchner
Niloy J. Mitra
24
6
0
29 Nov 2023
Side4Video: Spatial-Temporal Side Network for Memory-Efficient Image-to-Video Transfer Learning
Huanjin Yao
Wenhao Wu
Zhiheng Li
VLM
95
9
0
27 Nov 2023
Insect-Foundation: A Foundation Model and Large-scale 1M Dataset for Visual Insect Understanding
Hoang-Quan Nguyen
Thanh-Dat Truong
Xuan-Bac Nguyen
Ashley Dowling
Xin Li
Khoa Luu
VLM
24
19
0
26 Nov 2023
Zero redundancy distributed learning with differential privacy
Zhiqi Bu
Justin Chiu
Ruixuan Liu
Sheng Zha
George Karypis
45
8
0
20 Nov 2023
Security Fence Inspection at Airports Using Object Detection
Nils Friederich
Andreas Specker
Jürgen Beyerer
17
2
0
18 Nov 2023
Challenges in data-based geospatial modeling for environmental research and practice
Diana Koldasbayeva
P. Tregubova
M. Gasanov
Alexey Zaytsev
Anna Petrovskaia
E. Burnaev
AI4CE
32
1
0
18 Nov 2023
Deep Tensor Network
Yifan Zhang
29
0
0
18 Nov 2023
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
Bin Xiao
Haiping Wu
Weijian Xu
Xiyang Dai
Houdong Hu
Yumao Lu
Michael Zeng
Ce Liu
Lu Yuan
VLM
45
143
0
10 Nov 2023
PolyMaX: General Dense Prediction with Mask Transformer
Xuan S. Yang
Liangzhe Yuan
Kimberly Wilber
Astuti Sharma
Xiuye Gu
...
Stephanie Debats
Huisheng Wang
Hartwig Adam
Mikhail Sirotenko
Liang-Chieh Chen
28
14
0
09 Nov 2023
Window Attention is Bugged: How not to Interpolate Position Embeddings
Daniel Bolya
Chaitanya K. Ryali
Judy Hoffman
Christoph Feichtenhofer
43
10
0
09 Nov 2023
Univariate Radial Basis Function Layers: Brain-inspired Deep Neural Layers for Low-Dimensional Inputs
Daniel Jost
Basavasagar Patil
Xavier Alameda-Pineda
Chris Reinke
16
0
0
07 Nov 2023
FLORA: Fine-grained Low-Rank Architecture Search for Vision Transformer
Chi-Chih Chang
Yuan-Yao Sung
Shixing Yu
N. Huang
Diana Marculescu
Kai-Chiang Wu
ViT
28
1
0
07 Nov 2023
Navigating Scaling Laws: Compute Optimality in Adaptive Model Training
Sotiris Anagnostidis
Gregor Bachmann
Imanol Schlag
Thomas Hofmann
33
2
0
06 Nov 2023
GTP-ViT: Efficient Vision Transformers via Graph-based Token Propagation
Xuwei Xu
Sen Wang
Yudong Chen
Yanping Zheng
Zhewei Wei
Jiajun Liu
ViT
27
8
0
06 Nov 2023
Deep Double Descent for Time Series Forecasting: Avoiding Undertrained Models
Valentino Assandri
Sam Heshmati
Burhaneddin Yaman
Anton Iakovlev
Ariel Emiliano Repetur
45
0
0
02 Nov 2023
AiluRus: A Scalable ViT Framework for Dense Prediction
Jin Li
Yaoming Wang
Xiaopeng Zhang
Bowen Shi
Dongsheng Jiang
Chenglin Li
Wenrui Dai
Hongkai Xiong
Qi Tian
59
5
0
02 Nov 2023
Generating QM1B with PySCF$_{\text{IPU}}$
Alexander Mathiasen
Hatem Helal
Kerstin Klaser
Paul Balanca
Josef Dean
Carlo Luschi
Dominique Beaini
Andrew Fitzgibbon
Dominic Masters
25
1
0
02 Nov 2023
Text Rendering Strategies for Pixel Language Models
Jonas F. Lotz
Elizabeth Salesky
Phillip Rust
Desmond Elliott
VLM
29
11
0
01 Nov 2023
Res-Tuning: A Flexible and Efficient Tuning Paradigm via Unbinding Tuner from Backbone
Zeyinzi Jiang
Chaojie Mao
Ziyuan Huang
Ao Ma
Yiliang Lv
Yujun Shen
Deli Zhao
Jingren Zhou
30
15
0
30 Oct 2023
Analyzing Vision Transformers for Image Classification in Class Embedding Space
Martina G. Vilas
Timothy Schaumlöffel
Gemma Roig
ViT
21
23
0
29 Oct 2023
Multi-scale Diffusion Denoised Smoothing
Jongheon Jeong
Jinwoo Shin
DiffM
26
8
0
25 Oct 2023
ConvNets Match Vision Transformers at Scale
Samuel L. Smith
Andrew Brock
Leonard Berrada
Soham De
13
23
0
25 Oct 2023
Gramian Attention Heads are Strong yet Efficient Vision Learners
Jongbin Ryu
Dongyoon Han
J. Lim
32
1
0
25 Oct 2023
Pre-Training on Large-Scale Generated Docking Conformations with HelixDock to Unlock the Potential of Protein-ligand Structure Prediction Models
Lihang Liu
Shanzhuo Zhang
Donglong He
Xianbin Ye
Jingbo Zhou
...
Fan Wang
Jingzhou He
Liang Zheng
Yonghui Li
Xiaomin Fang
AI4CE
22
8
0
21 Oct 2023
SILC: Improving Vision Language Pretraining with Self-Distillation
Muhammad Ferjad Naeem
Yongqin Xian
Xiaohua Zhai
Lukas Hoyer
Luc Van Gool
F. Tombari
VLM
26
33
0
20 Oct 2023
KAKURENBO: Adaptively Hiding Samples in Deep Neural Network Training
Truong Thao Nguyen
Balazs Gerofi
Edgar Josafat Martinez-Noriega
François Trahay
M. Wahib
32
1
0
16 Oct 2023
Farzi Data: Autoregressive Data Distillation
Noveen Sachdeva
Zexue He
Wang-Cheng Kang
Jianmo Ni
D. Cheng
Julian McAuley
DD
21
3
0
15 Oct 2023
An Unbiased Look at Datasets for Visuo-Motor Pre-Training
Sudeep Dasari
Mohan Kumar Srirama
Unnat Jain
Abhinav Gupta
SSL
34
34
0
13 Oct 2023
PaLI-3 Vision Language Models: Smaller, Faster, Stronger
Xi Chen
Xiao Wang
Lucas Beyer
Alexander Kolesnikov
Jialin Wu
...
Keran Rong
Tianli Yu
Daniel Keysers
Xiaohua Zhai
Radu Soricut
MLLM
VLM
38
94
0
13 Oct 2023
Efficient Adaptation of Large Vision Transformer via Adapter Re-Composing
Wei Dong
Dawei Yan
Zhijun Lin
Peng Wang
19
21
0
10 Oct 2023
Learning Interactive Real-World Simulators
Mengjiao Yang
Yilun Du
Kamyar Ghasemipour
Jonathan Tompson
Leslie Kaelbling
Dale Schuurmans
Pieter Abbeel
LM&Ro
PINN
30
180
0
09 Oct 2023
Transformer Fusion with Optimal Transport
Moritz Imfeld
Jacopo Graldi
Marco Giordano
Thomas Hofmann
Sotiris Anagnostidis
Sidak Pal Singh
ViT
MoMe
29
16
0
09 Oct 2023
No Token Left Behind: Efficient Vision Transformer via Dynamic Token Idling
Xuwei Xu
Changlin Li
Yudong Chen
Xiaojun Chang
Jiajun Liu
Sen Wang
ViT
21
5
0
09 Oct 2023
Plug n' Play: Channel Shuffle Module for Enhancing Tiny Vision Transformers
Xuwei Xu
Sen Wang
Yudong Chen
Jiajun Liu
ViT
21
1
0
09 Oct 2023
Module-wise Adaptive Distillation for Multimodality Foundation Models
Chen Liang
Jiahui Yu
Ming-Hsuan Yang
Matthew A. Brown
Huayu Chen
Tuo Zhao
Boqing Gong
Tianyi Zhou
11
10
0
06 Oct 2023