arXiv:2103.10697
ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases
19 March 2021
Stéphane d'Ascoli
Hugo Touvron
Matthew L. Leavitt
Ari S. Morcos
Giulio Biroli
Levent Sagun
ViT
Papers citing "ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases"
50 / 399 papers shown
Title
Hyb-KAN ViT: Hybrid Kolmogorov-Arnold Networks Augmented Vision Transformer
Sainath Dey
Mitul Goswami
Jashika Sethi
Prasant Kumar Pattnaik
ViT
30
0
0
07 May 2025
Vision Transformers in Precision Agriculture: A Comprehensive Survey
Saber Mehdipour
Seyed Abolghasem Mirroshandel
Seyed Amirhossein Tabatabaei
34
0
0
30 Apr 2025
SFi-Former: Sparse Flow Induced Attention for Graph Transformer
Z. Li
J. Q. Shi
X. Zhang
Miao Zhang
B. Li
44
0
0
29 Apr 2025
A Simple DropConnect Approach to Transfer-based Targeted Attack
Tongrui Su
Qingbin Li
Shengyu Zhu
Wei Chen
Xueqi Cheng
AAML
69
0
0
24 Apr 2025
LoRAX: LoRA eXpandable Networks for Continual Synthetic Image Attribution
Danielle Sullivan-Pao
Nicole Tian
Pooya Khorrami
CLL
57
0
0
10 Apr 2025
Spectral-Adaptive Modulation Networks for Visual Perception
Guhnoo Yun
J. Yoo
Kijung Kim
Jeongho Lee
Paul Hongsuck Seo
Dong Hwan Kim
42
0
0
31 Mar 2025
Image-to-Text for Medical Reports Using Adaptive Co-Attention and Triple-LSTM Module
Yishen Liu
Shengda Liu
Hudan Pan
MedIm
50
0
0
24 Mar 2025
Beyond Accuracy: What Matters in Designing Well-Behaved Models?
Robin Hesse
Doğukan Bağcı
Bernt Schiele
Simone Schaub-Meyer
Stefan Roth
VLM
62
0
0
21 Mar 2025
FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute
Sotiris Anagnostidis
Gregor Bachmann
Yeongmin Kim
Jonas Kohler
Markos Georgopoulos
A. Sanakoyeu
Yuming Du
Albert Pumarola
Ali K. Thabet
Edgar Schönfeld
89
0
0
27 Feb 2025
E2ENet: Dynamic Sparse Feature Fusion for Accurate and Efficient 3D Medical Image Segmentation
Boqian Wu
Q. Xiao
Shiwei Liu
Lu Yin
Mykola Pechenizkiy
D. Mocanu
M. V. Keulen
Elena Mocanu
MedIm
53
4
0
20 Feb 2025
On the Ability of Deep Networks to Learn Symmetries from Data: A Neural Kernel Theory
Andrea Perin
Stéphane Deny
93
1
0
16 Dec 2024
Multi-Token Enhancing for Vision Representation Learning
Zhong-Yu Li
Yu-Song Hu
Bo Yin
Ming-Ming Cheng
66
1
0
24 Nov 2024
Improving Transferable Targeted Attacks with Feature Tuning Mixup
K. Liang
Xuelong Dai
Yanjie Li
Dong Wang
Bin Xiao
AAML
152
0
0
23 Nov 2024
Reducing catastrophic forgetting of incremental learning in the absence of rehearsal memory with task-specific token
Young Jo Choi
Min Kyoon Yoo
Yu Rang Park
24
0
0
06 Nov 2024
A Mamba Foundation Model for Time Series Forecasting
Haoyu Ma
Yushu Chen
Wenlai Zhao
Jinzhe Yang
Yingsheng Ji
Xinghua Xu
Xiaozhu Liu
Hao Jing
Shengzhuo Liu
Guangwen Yang
AI4TS
Mamba
41
2
0
05 Nov 2024
Towards High-fidelity Head Blending with Chroma Keying for Industrial Applications
Hah Min Lew
Sahng-Min Yoo
Hyunwoo Kang
Gyeong-Moon Park
31
0
0
01 Nov 2024
ProTransformer: Robustify Transformers via Plug-and-Play Paradigm
Zhichao Hou
Weizhi Gao
Yuchen Shen
Feiyi Wang
Xiaorui Liu
VLM
28
2
0
30 Oct 2024
DiRecNetV2: A Transformer-Enhanced Network for Aerial Disaster Recognition
Demetris Shianios
Panayiotis Kolios
Christos Kyrkou
28
3
0
17 Oct 2024
Locality Alignment Improves Vision-Language Models
Ian Covert
Tony Sun
James Y. Zou
Tatsunori Hashimoto
VLM
67
3
0
14 Oct 2024
On the Adversarial Transferability of Generalized "Skip Connections"
Yisen Wang
Yichuan Mo
Dongxian Wu
Mingjie Li
Xingjun Ma
Zhouchen Lin
AAML
28
2
0
11 Oct 2024
Audio Description Generation in the Era of LLMs and VLMs: A Review of Transferable Generative AI Technologies
Yingqiang Gao
Lukas Fischer
Alexa Lintner
Sarah Ebling
31
0
0
11 Oct 2024
Action Selection Learning for Multi-label Multi-view Action Recognition
Trung Thanh Nguyen
Yasutomo Kawanishi
Takahiro Komamizu
Ichiro Ide
31
2
0
04 Oct 2024
The Overfocusing Bias of Convolutional Neural Networks: A Saliency-Guided Regularization Approach
David Bertoin
Eduardo Hugo Sanchez
Mehdi Zouitine
Emmanuel Rachelson
28
0
0
25 Sep 2024
Sparks of Artificial General Intelligence(AGI) in Semiconductor Material Science: Early Explorations into the Next Frontier of Generative AI-Assisted Electron Micrograph Analysis
Sakhinana Sagar Srinivas
Geethan Sannidhi
Sreeja Gangasani
Chidaksh Ravuru
Venkataramana Runkana
33
0
0
17 Sep 2024
HiTSR: A Hierarchical Transformer for Reference-based Super-Resolution
Masoomeh Aslahishahri
Jordan R. Ubbens
Ian Stavness
33
0
0
30 Aug 2024
Parameter-Efficient Quantized Mixture-of-Experts Meets Vision-Language Instruction Tuning for Semiconductor Electron Micrograph Analysis
Sakhinana Sagar Srinivas
Chidaksh Ravuru
Geethan Sannidhi
Venkataramana Runkana
43
0
0
27 Aug 2024
Multi-Modal Instruction-Tuning Small-Scale Language-and-Vision Assistant for Semiconductor Electron Micrograph Analysis
Sakhinana Sagar Srinivas
Geethan Sannidhi
Venkataramana Runkana
38
1
0
27 Aug 2024
GenFormer -- Generated Images are All You Need to Improve Robustness of Transformers on Small Datasets
Sven Oehri
Nikolas Ebert
Ahmed Abdullah
Didier Stricker
Oliver Wasenmüller
ViT
26
5
0
26 Aug 2024
Hierarchical Network Fusion for Multi-Modal Electron Micrograph Representation Learning with Foundational Large Language Models
Sakhinana Sagar Srinivas
Geethan Sannidhi
Venkataramana Runkana
35
0
0
24 Aug 2024
Preliminary Investigations of a Multi-Faceted Robust and Synergistic Approach in Semiconductor Electron Micrograph Analysis: Integrating Vision Transformers with Large Language and Multimodal Models
Sakhinana Sagar Srinivas
Geethan Sannidhi
Sreeja Gangasani
Chidaksh Ravuru
Venkataramana Runkana
29
0
0
24 Aug 2024
Foundational Model for Electron Micrograph Analysis: Instruction-Tuning Small-Scale Language-and-Vision Assistant for Enterprise Adoption
Sakhinana Sagar Srinivas
Chidaksh Ravuru
Geethan Sannidhi
Venkataramana Runkana
33
0
0
23 Aug 2024
Vision HgNN: An Electron-Micrograph is Worth Hypergraph of Hypernodes
Sakhinana Sagar Srinivas
Rajat Kumar Sarkar
Sreeja Gangasani
Venkataramana Runkana
35
2
0
21 Aug 2024
EMCNet : Graph-Nets for Electron Micrographs Classification
Sakhinana Sagar Srinivas
Rajat Kumar Sarkar
Venkataramana Runkana
32
0
0
21 Aug 2024
Enhancing Adversarial Transferability with Adversarial Weight Tuning
Jiahao Chen
Zhou Feng
Rui Zeng
Yuwen Pu
Chunyi Zhou
Yi Jiang
Yuyou Gan
Jinbao Li
Shouling Ji
AAML
35
0
0
18 Aug 2024
UMono: Physical Model Informed Hybrid CNN-Transformer Framework for Underwater Monocular Depth Estimation
Jian Wang
Jing Wang
Shenghui Rong
Bo He
32
1
0
25 Jul 2024
Learn to Preserve and Diversify: Parameter-Efficient Group with Orthogonal Regularization for Domain Generalization
Jiajun Hu
Jian Zhang
Lei Qi
Yinghuan Shi
Yang Gao
OOD
33
4
0
21 Jul 2024
Double-Shot 3D Shape Measurement with a Dual-Branch Network
Mingyang Lei
Jingfan Fan
Long Shao
Hong Song
Deqiang Xiao
Danni Ai
Tianyu Fu
Ying Gu
Jian Yang
3DPC
3DV
23
0
0
19 Jul 2024
Improving Representation of High-frequency Components for Medical Visual Foundation Models
Yuetan Chu
Yilan Zhang
Zhongyi Han
Changchun Yang
Longxi Zhou
Gongning Luo
Chao Huang
Xin Gao
MedIm
45
1
0
19 Jul 2024
DuoFormer: Leveraging Hierarchical Visual Representations by Local and Global Attention
Xiaoya Tang
Bodong Zhang
Beatrice S. Knudsen
Tolga Tasdizen
ViT
MedIm
45
1
0
18 Jul 2024
GroupMamba: Efficient Group-Based Visual State Space Model
Abdelrahman M. Shaker
Syed Talal Wasim
Salman Khan
Juergen Gall
Fahad Shahbaz Khan
Mamba
56
0
0
18 Jul 2024
OPa-Ma: Text Guided Mamba for 360-degree Image Out-painting
Penglei Gao
Kai Yao
Tiandi Ye
Steven Wang
Yuan Yao
Xiaofeng Wang
Mamba
27
1
0
15 Jul 2024
HDKD: Hybrid Data-Efficient Knowledge Distillation Network for Medical Image Classification
Omar S. El-Assiouti
Ghada Hamed
Dina Khattab
H. M. Ebied
35
1
0
10 Jul 2024
PosMLP-Video: Spatial and Temporal Relative Position Encoding for Efficient Video Recognition
Y. Hao
Diansong Zhou
Zhicai Wang
Chong-Wah Ngo
Meng Wang
ViT
32
4
0
03 Jul 2024
Fibottention: Inceptive Visual Representation Learning with Diverse Attention Across Heads
Ali Khaleghi Rahimian
Manish Kumar Govind
Subhajit Maity
Dominick Reilly
Christian Kummerle
Srijan Das
A. Dutta
38
1
0
27 Jun 2024
Retain, Blend, and Exchange: A Quality-aware Spatial-Stereo Fusion Approach for Event Stream Recognition
Lan Chen
Dong Li
Xiao Wang
Pengpeng Shao
Wei Zhang
Yaowei Wang
Yonghong Tian
Jin Tang
68
2
0
27 Jun 2024
MD tree: a model-diagnostic tree grown on loss landscape
Yefan Zhou
Jianlong Chen
Qinxue Cao
Konstantin Schürholt
Yaoqing Yang
31
2
0
24 Jun 2024
Just How Flexible are Neural Networks in Practice?
Ravid Shwartz-Ziv
Micah Goldblum
Arpit Bansal
C. B. Bruss
Yann LeCun
Andrew Gordon Wilson
40
4
0
17 Jun 2024
AdaNCA: Neural Cellular Automata As Adaptors For More Robust Vision Transformer
Yitao Xu
Tong Zhang
Sabine Süsstrunk
ViT
42
0
0
12 Jun 2024
A Comparative Survey of Vision Transformers for Feature Extraction in Texture Analysis
Leonardo F. S. Scabini
Andre Sacilotti
Kallil M. C. Zielinski
L. C. Ribas
B. De Baets
Odemir M. Bruno
ViT
33
3
0
10 Jun 2024
The 3D-PC: a benchmark for visual perspective taking in humans and machines
Drew Linsley
Peisen Zhou
A. Ashok
Akash Nagaraj
Gaurav Gaonkar
Francis E Lewis
Zygmunt Pizlo
Thomas Serre
48
6
0
06 Jun 2024