Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2011.09315
Cited By
End-to-End Object Detection with Adaptive Clustering Transformer
18 November 2020
Minghang Zheng
Peng Gao
Renrui Zhang
Kunchang Li
Xiaogang Wang
Hongsheng Li
Hao Dong
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"End-to-End Object Detection with Adaptive Clustering Transformer"
50 / 103 papers shown
Title
A 2D Semantic-Aware Position Encoding for Vision Transformers
Xi Chen
Shiyang Zhou
Muqi Huang
Jiaxu Feng
Yun Xiong
...
Yuyao Zhang
Huishuai Bao
Sijia Peng
Chong Li
Feng Shi
ViT
31
0
0
14 May 2025
Context Aware Grounded Teacher for Source Free Object Detection
Tajamul Ashraf
Rajes Manna
Partha Sarathi Purkayastha
Tavaheed Tariq
Janibul Bashir
25
0
0
21 Apr 2025
Spectral-Adaptive Modulation Networks for Visual Perception
Guhnoo Yun
J. Yoo
Kijung Kim
Jeongho Lee
Paul Hongsuck Seo
Dong Hwan Kim
42
0
0
31 Mar 2025
Efficient Token Compression for Vision Transformer with Spatial Information Preserved
Junzhu Mao
Yang Shen
Jinyang Guo
Yazhou Yao
Xiansheng Hua
ViT
36
0
0
30 Mar 2025
LD-DETR: Loop Decoder DEtection TRansformer for Video Moment Retrieval and Highlight Detection
Pengcheng Zhao
Zhixian He
Fuwei Zhang
Shujin Lin
Fan Zhou
42
1
0
18 Jan 2025
ENACT: Entropy-based Clustering of Attention Input for Improving the Computational Performance of Object Detection Transformers
Giorgos Savathrakis
Antonis Argyros
ViT
25
0
0
11 Sep 2024
A Review of Transformer-Based Models for Computer Vision Tasks: Capturing Global Context and Spatial Relationships
Gracile Astlin Pereira
Muhammad Hussain
ViT
40
7
0
27 Aug 2024
Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation
Yifan Feng
Jiangang Huang
Shaoyi Du
Shihui Ying
Jun-Hai Yong
Yipeng Li
Guiguang Ding
Rongrong Ji
Yue Gao
ObjD
43
7
0
09 Aug 2024
Neural-based Video Compression on Solar Dynamics Observatory Images
Atefeh Khoshkhahtinat
Ali Zafari
P. Mehta
Nasser M. Nasrabadi
Barbara J. Thompson
M. Kirk
D. D. Silva
48
0
0
12 Jul 2024
Learning to Adapt Category Consistent Meta-Feature of CLIP for Few-Shot Classification
Jiaying Shi
Xuetong Xue
Shenghui Xu
VLM
37
0
0
08 Jul 2024
Diff3Dformer: Leveraging Slice Sequence Diffusion for Enhanced 3D CT Classification with Transformer Networks
Zihao Jin
Yingying Fang
Jiahao Huang
Caiwen Xu
Simon Walsh
Guang Yang
MedIm
75
0
0
24 Jun 2024
ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers
Narges Norouzi
Svetlana Orlova
Daan de Geus
Gijs Dubbelman
ViT
FedML
48
4
0
14 Jun 2024
Weight Copy and Low-Rank Adaptation for Few-Shot Distillation of Vision Transformers
Diana-Nicoleta Grigore
Mariana-Iuliana Georgescu
J. A. Justo
T. Johansen
Andreea-Iuliana Ionescu
Radu Tudor Ionescu
36
0
0
14 Apr 2024
CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers
Alex Ranne
Liming Kuang
Yordanka Velikova
Nassir Navab
F. R. Y. Baena
OOD
42
1
0
21 Mar 2024
PEEB: Part-based Image Classifiers with an Explainable and Editable Language Bottleneck
Thang M. Pham
Peijie Chen
Tin Nguyen
Seunghyun Yoon
Trung Bui
Anh Nguyen
VLM
54
7
0
08 Mar 2024
DEYO: DETR with YOLO for End-to-End Object Detection
Haodong Ouyang
23
7
0
26 Feb 2024
Semi-supervised Counting via Pixel-by-pixel Density Distribution Modelling
Hui Lin
Zhiheng Ma
Rongrong Ji
Yao Wang
Zhou Su
Xiaopeng Hong
Deyu Meng
33
2
0
23 Feb 2024
Weakly Supervised Open-Vocabulary Object Detection
Jianghang Lin
Yunhang Shen
Bingquan Wang
Shaohui Lin
Ke Li
Liujuan Cao
WSOD
33
7
0
19 Dec 2023
PixelLM: Pixel Reasoning with Large Multimodal Model
Zhongwei Ren
Zhicheng Huang
Yunchao Wei
Yao-Min Zhao
Dongmei Fu
Jiashi Feng
Xiaojie Jin
VLM
MLLM
LRM
28
82
0
04 Dec 2023
AiluRus: A Scalable ViT Framework for Dense Prediction
Jin Li
Yaoming Wang
Xiaopeng Zhang
Bowen Shi
Dongsheng Jiang
Chenglin Li
Wenrui Dai
Hongkai Xiong
Qi Tian
64
5
0
02 Nov 2023
Improving Robustness for Vision Transformer with a Simple Dynamic Scanning Augmentation
Shashank Kotyan
Danilo Vasconcellos Vargas
ViT
27
2
0
01 Nov 2023
UniTime: A Language-Empowered Unified Model for Cross-Domain Time Series Forecasting
Xu Liu
Junfeng Hu
Yuan N. Li
Shizhe Diao
Keli Zhang
Bryan Hooi
Roger Zimmermann
AI4TS
27
76
0
15 Oct 2023
Anchor-Intermediate Detector: Decoupling and Coupling Bounding Boxes for Accurate Object Detection
Yilong Lv
Min Li
Yujie He
Shaopeng Li
Zhuzhen He
Aitao Yang
26
1
0
09 Oct 2023
Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy
Pingzhi Li
Zhenyu (Allen) Zhang
Prateek Yadav
Yi-Lin Sung
Yu Cheng
Mohit Bansal
Tianlong Chen
MoMe
26
33
0
02 Oct 2023
ClusterFormer: Clustering As A Universal Visual Learner
James Liang
Yiming Cui
Qifan Wang
Tong Geng
Wenguan Wang
Dongfang Liu
VLM
37
8
0
22 Sep 2023
CL-MAE: Curriculum-Learned Masked Autoencoders
Neelu Madan
Nicolae-Cătălin Ristea
Kamal Nasrollahi
T. Moeslund
Radu Tudor Ionescu
19
10
0
31 Aug 2023
SPANet: Frequency-balancing Token Mixer using Spectral Pooling Aggregation Modulation
Guhnoo Yun
J. Yoo
Kijung Kim
Jeongho Lee
Dong Hwan Kim
MoE
33
8
0
22 Aug 2023
ASAG: Building Strong One-Decoder-Layer Sparse Detectors via Adaptive Sparse Anchor Generation
Sheng-Hsiang Fu
Junkai Yan
Yipeng Gao
Xiaohua Xie
Wei-Shi Zheng
28
6
0
18 Aug 2023
Revisiting Vision Transformer from the View of Path Ensemble
Shuning Chang
Pichao Wang
Haowen Luo
Fan Wang
Mike Zheng Shou
ViT
37
3
0
12 Aug 2023
Graph Ladling: Shockingly Simple Parallel GNN Training without Intermediate Communication
A. Jaiswal
Shiwei Liu
Tianlong Chen
Ying Ding
Zhangyang Wang
GNN
42
5
0
18 Jun 2023
Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models
A. Jaiswal
Shiwei Liu
Tianlong Chen
Ying Ding
Zhangyang Wang
VLM
32
22
0
18 Jun 2023
The Emergence of Essential Sparsity in Large Pre-trained Models: The Weights that Matter
Ajay Jaiswal
Shiwei Liu
Tianlong Chen
Zhangyang Wang
VLM
23
33
0
06 Jun 2023
Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation
Shilin Yan
Renrui Zhang
Ziyu Guo
Wenchao Chen
Wei Zhang
Hongyang Li
Yu Qiao
Hao Dong
Zhongjiang He
Peng Gao
VOS
22
30
0
25 May 2023
SSD-MonoDETR: Supervised Scale-aware Deformable Transformer for Monocular 3D Object Detection
Xuan He
Fan Yang
Kailun Yang
Jiacheng Lin
Haolong Fu
Hao Wu
Jin Yuan
Zhiyong Li
ViT
20
12
0
12 May 2023
AutoFocusFormer: Image Segmentation off the Grid
Chen Ziwen
K. Patnaik
Shuangfei Zhai
Alvin Wan
Zhile Ren
A. Schwing
Alex Colburn
Li Fuxin
24
9
0
24 Apr 2023
Transformer-Based Visual Segmentation: A Survey
Xiangtai Li
Henghui Ding
Haobo Yuan
Wenwei Zhang
Jiangmiao Pang
Guangliang Cheng
Kai-xiang Chen
Ziwei Liu
Chen Change Loy
ViT
MedIm
42
132
0
19 Apr 2023
Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers
Cong Wei
Brendan Duke
R. Jiang
P. Aarabi
Graham W. Taylor
Florian Shkurti
ViT
46
14
0
24 Mar 2023
OcTr: Octree-based Transformer for 3D Object Detection
Chao Zhou
Yanan Zhang
Jiaxin Chen
Di Huang
3DPC
ViT
27
42
0
22 Mar 2023
Making Vision Transformers Efficient from A Token Sparsification View
Shuning Chang
Pichao Wang
Ming Lin
Fan Wang
David Junhao Zhang
Rong Jin
Mike Zheng Shou
ViT
45
24
0
15 Mar 2023
PiMAE: Point Cloud and Image Interactive Masked Autoencoders for 3D Object Detection
Anthony Chen
Kevin Zhang
Renrui Zhang
Zihan Wang
Yuheng Lu
Yandong Guo
Shanghang Zhang
3DPC
70
60
0
14 Mar 2023
HiCLIP: Contrastive Language-Image Pretraining with Hierarchy-aware Attention
Shijie Geng
Jianbo Yuan
Yu Tian
Yuxiao Chen
Yongfeng Zhang
CLIP
VLM
49
44
0
06 Mar 2023
Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers
Tianlong Chen
Zhenyu (Allen) Zhang
Ajay Jaiswal
Shiwei Liu
Zhangyang Wang
MoE
38
46
0
02 Mar 2023
iQuery: Instruments as Queries for Audio-Visual Sound Separation
Jiaben Chen
Renrui Zhang
Dongze Lian
Jiaqi Yang
Ziyao Zeng
Jianbo Shi
34
26
0
07 Dec 2022
Vision Transformer Computation and Resilience for Dynamic Inference
Kavya Sreedhar
Jason Clemons
Rangharajan Venkatesan
S. Keckler
M. Horowitz
26
2
0
06 Dec 2022
Vision Transformer with Super Token Sampling
Huaibo Huang
Xiaoqiang Zhou
Jie Cao
Ran He
Tieniu Tan
ViT
23
56
0
21 Nov 2022
Vision Transformers in Medical Imaging: A Review
Emerald U. Henry
Onyeka Emebob
C. Omonhinmin
ViT
MedIm
34
34
0
18 Nov 2022
Pair DETR: Contrastive Learning Speeds Up DETR Training
M. Iranmanesh
Xiaotong Chen
Kuo-Chin Lien
ViT
21
0
0
29 Oct 2022
Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning
Weicong Liang
Yuhui Yuan
Henghui Ding
Xiao Luo
Weihong Lin
Ding Jia
Zheng-Wei Zhang
Chao Zhang
Hanhua Hu
32
25
0
03 Oct 2022
CALIP: Zero-Shot Enhancement of CLIP with Parameter-free Attention
Ziyu Guo
Renrui Zhang
Longtian Qiu
Xianzheng Ma
Xupeng Miao
Xuming He
Bin Cui
VLM
AAML
59
109
0
28 Sep 2022
Embracing Consistency: A One-Stage Approach for Spatio-Temporal Video Grounding
Yang Jin
Yongzhi Li
Zehuan Yuan
Yadong Mu
31
32
0
27 Sep 2022
1
2
3
Next