2103.00112
Cited By
v3 (latest)
Transformer in Transformer
27 February 2021
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
Github (4228★)
Papers citing "Transformer in Transformer"
50 / 558 papers shown
Title
Graph Transformers: A Survey
Ahsan Shehzad
Xiwei Xu
Shagufta Abid
Ciyuan Peng
Shuo Yu
Dongyu Zhang
Karin Verspoor
AI4CE
141
14
0
13 Jul 2024
MambaVision: A Hybrid Mamba-Transformer Vision Backbone
Ali Hatamizadeh
Jan Kautz
Mamba
156
74
0
10 Jul 2024
Isomorphic Pruning for Vision Models
Gongfan Fang
Xinyin Ma
Michael Bi Mi
Xinchao Wang
VLM
ViT
86
8
0
05 Jul 2024
ISWSST: Index-space-wave State Superposition Transformers for Multispectral Remotely Sensed Imagery Semantic Segmentation
Chang Li
Pengfei Zhang
Yu Wang
73
0
0
03 Jul 2024
Research on Reliable and Safe Occupancy Grid Prediction in Underground Parking Lots
JiaQi Luo
76
0
0
02 Jul 2024
LPViT: Low-Power Semi-structured Pruning for Vision Transformers
Kaixin Xu
Zhe Wang
Chunyun Chen
Xue Geng
Jie Lin
Xulei Yang
Min-man Wu
Min Wu
Xiaoli Li
Weisi Lin
ViT
VLM
217
10
0
02 Jul 2024
VELO: A Vector Database-Assisted Cloud-Edge Collaborative LLM QoS Optimization Framework
Zhi Yao
Zhiqing Tang
Jiong Lou
Ping Shen
Weijia Jia
86
10
0
19 Jun 2024
A Comprehensive Taxonomy and Analysis of Talking Head Synthesis: Techniques for Portrait Generation, Driving Mechanisms, and Editing
Ming Meng
Yufei Zhao
Bo Zhang
Yonggui Zhu
Weimin Shi
Maxwell Wen
Zhaoxin Fan
VGen
106
2
0
15 Jun 2024
SecureNet: A Comparative Study of DeBERTa and Large Language Models for Phishing Detection
Sakshi Mahendru
Tejul Pandit
79
1
0
10 Jun 2024
Convolutional Neural Networks and Vision Transformers for Fashion MNIST Classification: A Literature Review
Sonia Bbouzidi
Ghazala Hcini
Imen Jdey
Fadoua Drira
99
5
0
05 Jun 2024
Block Transformer: Global-to-Local Language Modeling for Fast Inference
Namgyu Ho
Sangmin Bae
Taehyeon Kim
Hyunjik Jo
Yireun Kim
Tal Schuster
Adam Fisch
James Thorne
Se-Young Yun
116
9
0
04 Jun 2024
Mamba as Decision Maker: Exploring Multi-scale Sequence Modeling in Offline Reinforcement Learning
Jiahang Cao
Qiang Zhang
Ziqing Wang
Jiaxu Wang
Hao Cheng
Yecheng Shao
Wen Zhao
Gang Han
Yijie Guo
Renjing Xu
Mamba
101
2
0
04 Jun 2024
Automatic Channel Pruning for Multi-Head Attention
Eunho Lee
Youngbae Hwang
ViT
81
1
0
31 May 2024
Activator: GLU Activation Function as the Core Component of a Vision Transformer
Abdullah Nazhat Abdullah
Tarkan Aydin
ViT
79
0
0
24 May 2024
PerLLM: Personalized Inference Scheduling with Edge-Cloud Collaboration for Diverse LLM Services
Zheming Yang
Yuanhao Yang
Chang Zhao
Qi Guo
Wenkai He
Wen Ji
85
17
0
23 May 2024
Multi-Scale VMamba: Hierarchy in Hierarchy Visual State Space Model
Yuheng Shi
Minjing Dong
Chang Xu
Mamba
114
36
0
23 May 2024
Configuring Data Augmentations to Reduce Variance Shift in Positional Embedding of Vision Transformers
Bum Jun Kim
Sang Woo Kim
ViT
68
1
0
23 May 2024
From Human-to-Human to Human-to-Bot Conversations in Software Engineering
Ranim Khojah
Francisco Gomes de Oliveira Neto
Philipp Leitner
61
1
0
21 May 2024
Multi-View Attentive Contextualization for Multi-View 3D Object Detection
Xianpeng Liu
Ce Zheng
Ming Qian
Nan Xue
Chong Chen
Zhebin Zhang
Chen Li
Tianfu Wu
127
3
0
20 May 2024
Large Language Models for Medicine: A Survey
Yanxin Zheng
Wensheng Gan
Zefeng Chen
Zhenlian Qi
Qian Liang
Philip S. Yu
LM&MA
98
22
0
20 May 2024
A Survey of Generative Techniques for Spatial-Temporal Data Mining
Qianru Zhang
Haixin Wang
Cheng Long
Liangcai Su
Xingwei He
...
Tailin Wu
Hongzhi Yin
Siu-Ming Yiu
Qi Tian
Christian S. Jensen
AI4TS
95
9
0
15 May 2024
A Semantic and Motion-Aware Spatiotemporal Transformer Network for Action Detection
Matthew Korban
Peter Youngs
Scott T. Acton
ViT
75
7
0
13 May 2024
HybridHash: Hybrid Convolutional and Self-Attention Deep Hashing for Image Retrieval
Chao He
Hongxi Wei
79
10
0
13 May 2024
ExplainableDetector: Exploring Transformer-based Language Modeling Approach for SMS Spam Detection with Explainability Analysis
Mohammad Amaz Uddin
Muhammad Nazrul Islam
Leandros A. Maglaras
Helge Janicke
Iqbal H. Sarker
75
3
0
12 May 2024
G-SAP: Graph-based Structure-Aware Prompt Learning over Heterogeneous Knowledge for Commonsense Reasoning
Ruiting Dai
Yuqiao Tan
Lisi Mo
Shuang Liang
Guohao Huo
Jiayi Luo
Yao Cheng
ReLM
RALM
LRM
67
1
0
09 May 2024
CNN-LSTM and Transfer Learning Models for Malware Classification based on Opcodes and API Calls
A. Bensaoud
Jugal Kalita
65
18
0
04 May 2024
Revolutionizing Traffic Sign Recognition: Unveiling the Potential of Vision Transformers
Susano Mingwin
Yulong Shisu
Yongshuai Wanwag
Sunshin Huing
58
2
0
29 Apr 2024
Data-independent Module-aware Pruning for Hierarchical Vision Transformers
Yang He
Qiufeng Wang
ViT
85
5
0
21 Apr 2024
Nested-TNT: Hierarchical Vision Transformers with Multi-Scale Feature Processing
Yuang Liu
Zhiheng Qiu
Xiaokai Qin
ViT
94
0
0
20 Apr 2024
X-Light: Cross-City Traffic Signal Control Using Transformer on Transformer as Meta Multi-Agent Reinforcement Learner
Haoyuan Jiang
Ziyue Li
Hua Wei
Xuantang Xiong
Jingqing Ruan
Jiaming Lu
Hangyu Mao
Rui Zhao
92
10
0
18 Apr 2024
Unblind Text Inputs: Predicting Hint-text of Text Input in Mobile Apps via LLM
Zhe Liu
Chunyang Chen
Junjie Wang
Mengzhuo Chen
Boyu Wu
Yuekai Huang
Jun Hu
Qing Wang
60
14
0
03 Apr 2024
Scene Adaptive Sparse Transformer for Event-based Object Detection
Yansong Peng
Hebei Li
Yueyi Zhang
Xiaoyan Sun
Feng Wu
ViT
115
18
0
02 Apr 2024
Heracles: A Hybrid SSM-Transformer Model for High-Resolution Image and Time-Series Analysis
Badri N. Patro
Suhas Ranganath
Vinay P. Namboodiri
Vijay Srinivas Agneeswaran
99
3
0
26 Mar 2024
Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression
Hancheng Ye
Chong Yu
Peng Ye
Renqiu Xia
Yansong Tang
Jiwen Lu
Tao Chen
Bo Zhang
90
3
0
23 Mar 2024
Cost-Efficient Large Language Model Serving for Multi-turn Conversations with CachedAttention
Bin Gao
Zhuomin He
Puru Sharma
Qingxuan Kang
Djordje Jevdjic
Junbo Deng
Xingkun Yang
Zhou Yu
Pengfei Zuo
141
56
0
23 Mar 2024
Accelerating ViT Inference on FPGA through Static and Dynamic Pruning
Dhruv Parikh
Shouyi Li
Bingyi Zhang
Rajgopal Kannan
Carl E. Busart
Viktor Prasanna
99
2
0
21 Mar 2024
Transformer based Multitask Learning for Image Captioning and Object Detection
Debolena Basak
P. K. Srijith
M. Desarkar
74
2
0
10 Mar 2024
NiNformer: A Network in Network Transformer with Token Mixing Generated Gating Function
Abdullah Nazhat Abdullah
Tarkan Aydin
79
0
0
04 Mar 2024
MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection
Tianxiang Chen
Zi Ye
Zhentao Tan
Tao Gong
Yue-bo Wu
Qi Chu
Bin Liu
Nenghai Yu
Jieping Ye
Mamba
111
60
0
04 Mar 2024
AllSpark: Reborn Labeled Features from Unlabeled in Transformer for Semi-Supervised Semantic Segmentation
Haonan Wang
Qixiang Zhang
Yi Li
Xiaomeng Li
128
20
0
04 Mar 2024
GIN-SD: Source Detection in Graphs with Incomplete Nodes via Positional Encoding and Attentive Fusion
Le Cheng
Peican Zhu
Keke Tang
Chao Gao
Zhen Wang
72
19
0
27 Feb 2024
A Comprehensive Survey of Convolutions in Deep Learning: Applications, Challenges, and Future Trends
Abolfazl Younesi
Mohsen Ansari
Mohammadamin Fazli
A. Ejlali
Muhammad Shafique
Joerg Henkel
3DV
126
52
0
23 Feb 2024
Label-efficient multi-organ segmentation with a diffusion model
Yongzhi Huang
Jinxin Zhu
Haseeb Hassan
Liyilei Su
Jingyu Li
Binding Huang
Yun Peng
Jingyu Li
Jun Ma
Bingding Huang
DiffM
MedIm
93
0
0
23 Feb 2024
WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition
Lianghui Zhu
Junwei Zhou
Yan Liu
Xin Hao
Wenyu Liu
Xinggang Wang
VLM
69
7
0
22 Feb 2024
An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach
Mohammad Amaz Uddin
Iqbal H. Sarker
75
18
0
21 Feb 2024
TDViT: Temporal Dilated Video Transformer for Dense Video Tasks
Guanxiong Sun
Yang Hua
Guosheng Hu
N. Robertson
ViT
59
1
0
14 Feb 2024
Masked LoGoNet: Fast and Accurate 3D Image Analysis for Medical Domain
Amin Karimi Monsefi
Payam Karisani
Mengxi Zhou
Stacey S. Choi
Nathan Doble
Heng Ji
Srinivasan Parthasarathy
R. Ramnath
100
5
0
09 Feb 2024
Expediting In-Network Federated Learning by Voting-Based Consensus Model Compression
Xiaoxin Su
Yipeng Zhou
Laizhong Cui
Song Guo
FedML
58
3
0
06 Feb 2024
A Survey on Transformer Compression
Yehui Tang
Yunhe Wang
Jianyuan Guo
Zhijun Tu
Kai Han
Hailin Hu
Dacheng Tao
156
35
0
05 Feb 2024
NOAH: Learning Pairwise Object Category Attentions for Image Classification
Chao Li
Aojun Zhou
Anbang Yao
VLM
71
2
0
04 Feb 2024