ResearchTrend.AI

InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions

10 November 2022
Wenhai Wang
Jifeng Dai
Zhe Chen
Zhenhang Huang
Zhiqi Li
Xizhou Zhu
Xiao-hua Hu
Tong Lu
Lewei Lu
Hongsheng Li
Xiaogang Wang
Yu Qiao
    VLM

Papers citing "InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions"

50 / 321 papers shown
Learning from Rich Semantics and Coarse Locations for Long-tailed Object Detection
Lingchen Meng
Xiyang Dai
Jianwei Yang
Dongdong Chen
Yinpeng Chen
Mengchen Liu
Yi-Ling Chen
Zuxuan Wu
Lu Yuan
Yu-Gang Jiang
21
6
0
18 Oct 2023
SegmATRon: Embodied Adaptive Semantic Segmentation for Indoor Environment
T. Zemskova
Margarita Kichik
Dmitry A. Yudin
A. Staroverov
Aleksandr I. Panov
31
1
0
18 Oct 2023
IDRNet: Intervention-Driven Relation Network for Semantic Segmentation
Zhenchao Jin
Xiaowei Hu
Lingting Zhu
Luchuan Song
Li Yuan
Lequan Yu
13
18
0
16 Oct 2023
Unifying Image Processing as Visual Prompting Question Answering
Yihao Liu
Xiangyu Chen
Xianzheng Ma
Xintao Wang
Jiantao Zhou
Yu Qiao
Chao Dong
MLLM
30
18
0
16 Oct 2023
Large Models for Time Series and Spatio-Temporal Data: A Survey and Outlook
Ming Jin
Qingsong Wen
Keli Zhang
Chaoli Zhang
Siqiao Xue
...
Shirui Pan
Vincent S. Tseng
Yu Zheng
Lei Chen
Hui Xiong
AI4TS
SyDa
42
118
0
16 Oct 2023
Learning to Adapt SAM for Segmenting Cross-domain Point Clouds
Xidong Peng
Runnan Chen
Feng Qiao
Lingdong Kong
You-Chen Liu
Tai Wang
Xinge Zhu
Yuexin Ma
41
12
0
13 Oct 2023
Exploring Large Language Models for Multi-Modal Out-of-Distribution Detection
Yi Dai
Hao Lang
Kaisheng Zeng
Fei Huang
Yongbin Li
OODD
26
11
0
12 Oct 2023
PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm
Haoyi Zhu
Honghui Yang
Xiaoyang Wu
Di Huang
Sha Zhang
...
Hengshuang Zhao
Chunhua Shen
Yu Qiao
Tong He
Wanli Ouyang
SSL
77
43
0
12 Oct 2023
Foundation Models Meet Visualizations: Challenges and Opportunities
Weikai Yang
Mengchen Liu
Zheng Wang
Shixia Liu
44
36
0
09 Oct 2023
Geometry Aware Field-to-field Transformations for 3D Semantic Segmentation
Dominik Hollidt
Clinton Jia Wang
Polina Golland
Marc Pollefeys
33
0
0
08 Oct 2023
Sub-token ViT Embedding via Stochastic Resonance Transformers
Dong Lao
Yangchao Wu
Tian Yu Liu
Alex Wong
Stefano Soatto
VOS
36
4
0
06 Oct 2023
DeformUX-Net: Exploring a 3D Foundation Backbone for Medical Image Segmentation with Depthwise Deformable Convolution
Ho Hin Lee
Quan Liu
Qi Yang
Xin Yu
Shunxing Bao
Yuankai Huo
Bennett A. Landman
MedIm
22
2
0
30 Sep 2023
Text-image Alignment for Diffusion-based Perception
Neehar Kondapaneni
Markus Marks
Manuel Knott
Rogério Guimarães
Pietro Perona
VLM
DiffM
24
32
0
29 Sep 2023
YOLOR-Based Multi-Task Learning
Hung-Shuo Chang
Chien-Yao Wang
Hang Yan
Yukun Zhu
Hongpeng Liao
MoE
VLM
27
17
0
29 Sep 2023
Parameter-Saving Adversarial Training: Reinforcing Multi-Perturbation Robustness via Hypernetworks
Huihui Gong
Minjing Dong
Siqi Ma
S. Çamtepe
Surya Nepal
Chang Xu
AAML
OOD
25
1
0
28 Sep 2023
The Robust Semantic Segmentation UNCV2023 Challenge Results
Xuanlong Yu
Yi Zuo
Zitao Wang
Xiaowen Zhang
Jiaxuan Zhao
...
Angela Yao
Wenlong Chen
Ivor J. A. Simpson
Neill D. F. Campbell
Gianni Franchi
UQCV
40
4
0
27 Sep 2023
RMT: Retentive Networks Meet Vision Transformers
Qihang Fan
Huaibo Huang
Mingrui Chen
Hongmin Liu
Ran He
ViT
43
75
0
20 Sep 2023
ClusterFusion: Leveraging Radar Spatial Features for Radar-Camera 3D Object Detection in Autonomous Vehicles
Irfan Tito Kurniawan
B. Trilaksono
38
6
0
07 Sep 2023
DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
Zhuofan Xia
Xuran Pan
Shiji Song
Li Erran Li
Gao Huang
ViT
39
25
0
04 Sep 2023
RevColV2: Exploring Disentangled Representations in Masked Image Modeling
Qi Han
Yuxuan Cai
Xiangyu Zhang
41
7
0
02 Sep 2023
Learning Modulated Transformation in GANs
Ceyuan Yang
Qihang Zhang
Yinghao Xu
Jiapeng Zhu
Yujun Shen
Bo Dai
22
1
0
29 Aug 2023
Computation-efficient Deep Learning for Computer Vision: A Survey
Yulin Wang
Yizeng Han
Chaofei Wang
Shiji Song
Qi Tian
Gao Huang
VLM
38
20
0
27 Aug 2023
MOFA: A Model Simplification Roadmap for Image Restoration on Mobile Devices
Xiangyu Chen
Ruiwen Zhen
Shuai Li
Xiaotian Li
Guanghui Wang
30
0
0
24 Aug 2023
SparseBEV: High-Performance Sparse 3D Object Detection from Multi-Camera Videos
Haisong Liu
Yao Teng
Tao Lu
Haiguang Wang
Liming Wang
16
98
0
18 Aug 2023
Dynamic Kernel-Based Adaptive Spatial Aggregation for Learned Image Compression
Huairui Wang
Nianxiang Fu
Zhenzhong Chen
Shanghui Liu
38
2
0
17 Aug 2023
A Unified Interactive Model Evaluation for Classification, Object Detection, and Instance Segmentation in Computer Vision
Changjian Chen
Yukai Guo
Fengyuan Tian
Siyi Liu
Weikai Yang
Zhao-Ming Wang
Jing Wu
Hang Su
Hanspeter Pfister
Shixia Liu
26
15
0
09 Aug 2023
MixReorg: Cross-Modal Mixed Patch Reorganization is a Good Mask Learner for Open-World Semantic Segmentation
Kaixin Cai
Pengzhen Ren
Yi Zhu
Hang Xu
Jian-zhuo Liu
Changlin Li
Guangrun Wang
Xiaodan Liang
VLM
29
14
0
09 Aug 2023
Mask Frozen-DETR: High Quality Instance Segmentation with One GPU
Zhanhao Liang
Yuhui Yuan
ISeg
31
4
0
07 Aug 2023
The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World
Weiyun Wang
Min Shi
Qingyun Li
Wen Wang
Zhenhang Huang
...
Zhiguo Cao
Yushi Chen
Tong Lu
Jifeng Dai
Yu Qiao
LRM
MLLM
53
84
0
03 Aug 2023
DETR Doesn't Need Multi-Scale or Locality Design
Yutong Lin
Yuhui Yuan
Zheng-Wei Zhang
Chen Li
Nanning Zheng
Han Hu
37
5
0
03 Aug 2023
PPI-NET: End-to-End Parametric Primitive Inference
Liang Wang
Xiaogang Wang
37
1
0
03 Aug 2023
Tracking Anything in High Quality
Jiawen Zhu
Zhe Chen
Zeqi Hao
Shijie Chang
Lu Zhang
...
Bin Luo
Ju He
Jinpeng Lan
Hanyuan Chen
Chenyang Li
VOS
21
7
0
26 Jul 2023
When Multi-Task Learning Meets Partial Supervision: A Computer Vision Review
Maxime Fontana
Michael W. Spratling
Miaojing Shi
56
6
0
25 Jul 2023
Industrial Segment Anything -- a Case Study in Aircraft Manufacturing, Intralogistics, Maintenance, Repair, and Overhaul
Keno Moenck
Arne Wendt
Philipp Prünte
Julian Koch
Arne Sahrhage
...
Falko Kähler
Dirk Holst
Martin Gomse
Thorsten Schuppstuhl
Daniel Schoepflin
VLM
36
6
0
24 Jul 2023
Meta-Transformer: A Unified Framework for Multimodal Learning
Yiyuan Zhang
Kaixiong Gong
Kaipeng Zhang
Hongsheng Li
Yu Qiao
Wanli Ouyang
Xiangyu Yue
33
137
0
20 Jul 2023
MVA2023 Small Object Detection Challenge for Spotting Birds: Dataset, Methods, and Results
Yuki Kondo
Norimichi Ukita
Takayuki Yamaguchi
Haoran Hou
Mu-Yi Shen
...
Ichiro Ide
Yosuke Shinya
Xinyao Liu
Guang Liang
S. Yasui
28
13
0
18 Jul 2023
Scale-Aware Modulation Meet Transformer
Wei-Shiang Lin
Ziheng Wu
Jiayu Chen
Jun Huang
Lianwen Jin
MoE
ViT
35
66
0
17 Jul 2023
Improving Data Efficiency for Plant Cover Prediction with Label Interpolation and Monte-Carlo Cropping
Matthias Körschens
S. F. Bucher
Christine Romermann
Joachim Denzler
36
1
0
17 Jul 2023
DreamTeacher: Pretraining Image Backbones with Deep Generative Models
Daiqing Li
Huan Ling
Amlan Kar
David Acuna
Seung Wook Kim
Karsten Kreis
Antonio Torralba
Sanja Fidler
VLM
DiffM
22
27
0
14 Jul 2023
Flexible and Fully Quantized Ultra-Lightweight TinyissimoYOLO for Ultra-Low-Power Edge Systems
Julian Moosmann
H. Mueller
Nicky Zimmerman
Georg Rutishauser
Luca Benini
Michele Magno
38
8
0
12 Jul 2023
TBGC: Task-level Backbone-Oriented Gradient Clip for Multi-Task Foundation Model Learning
Z. Zhang
Xue Pan
19
0
0
07 Jul 2023
FB-OCC: 3D Occupancy Prediction based on Forward-Backward View Transformation
Zhiqi Li
Zhiding Yu
David Austin
Mingsheng Fang
Shiyi Lan
Jan Kautz
J. Álvarez
46
101
0
04 Jul 2023
End-to-end Autonomous Driving: Challenges and Frontiers
Li Chen
Peng Wu
Kashyap Chitta
Bernhard Jaeger
Andreas Geiger
Hongyang Li
3DV
69
266
0
29 Jun 2023
What a MESS: Multi-Domain Evaluation of Zero-Shot Semantic Segmentation
Benedikt Blumenstiel
Johannes Jakubik
Hilde Kuhne
Michael Vossing
VLM
32
15
0
27 Jun 2023
MachMap: End-to-End Vectorized Solution for Compact HD-Map Construction
Limeng Qiao
Yongchao Zheng
Peng Zhang
Wenjie Ding
Xi Qiu
Xing Wei
Chi Zhang
3DGS
3DPC
3DV
28
16
0
17 Jun 2023
PanoOcc: Unified Occupancy Representation for Camera-based 3D Panoptic Segmentation
Yu-Quan Wang
Yuntao Chen
Xingyu Liao
Lue Fan
Zhaoxiang Zhang
91
74
0
16 Jun 2023
Parameter-efficient is not sufficient: Exploring Parameter, Memory, and Time Efficient Adapter Tuning for Dense Predictions
Dongshuo Yin
Xueting Han
Bin Li
Hao Feng
Jinghua Bai
VPVLM
38
18
0
16 Jun 2023
Robustness Analysis on Foundational Segmentation Models
Madeline Chantry Schiappa
Shehreen Azad
V. Sachidanand
Yunhao Ge
O. Mikšík
Yogesh S Rawat
Vibhav Vineet
OOD
VLM
AAML
30
6
0
15 Jun 2023
Transferring Knowledge for Food Image Segmentation using Transformers and Convolutions
Grant Sinha
Krishna Parmar
Hilda Azimi
Chi-en Amy Tai
Yuhao Chen
A. Wong
Pengcheng Xi
ViT
36
4
0
15 Jun 2023
detrex: Benchmarking Detection Transformers
Tianhe Ren
Siyi Liu
Feng Li
Hao Zhang
Ailing Zeng
...
Zhaoyang Zeng
Xianbiao Qi
Yuhui Yuan
Jianwei Yang
Lei Zhang
42
13
0
12 Jun 2023