
InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions

10 November 2022
Wenhai Wang
Jifeng Dai
Zhe Chen
Zhenhang Huang
Zhiqi Li
Xizhou Zhu
Xiao-hua Hu
Tong Lu
Lewei Lu
Hongsheng Li
Xiaogang Wang
Yu Qiao
    VLM

Papers citing "InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions"

50 / 322 papers shown
Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures
Yuchen Duan
Weiyun Wang
Zhe Chen
Xizhou Zhu
Lewei Lu
Tong Lu
Yu Qiao
Hongsheng Li
Jifeng Dai
Wenhai Wang
ViT
46
44
0
04 Mar 2024
ELA: Efficient Local Attention for Deep Convolutional Neural Networks
Wei Xu
Yi Wan
47
32
0
02 Mar 2024
HyenaPixel: Global Image Context with Convolutions
Julian Spravil
Sebastian Houben
Sven Behnke
31
1
0
29 Feb 2024
OccTransformer: Improving BEVFormer for 3D camera-only occupancy prediction
Jian Liu
Sipeng Zhang
Chuixin Kong
Wenyuan Zhang
Yuhang Wu
Yikang Ding
Borun Xu
Ruibo Ming
Dong-Lai Wei
Xianming Liu
32
7
0
28 Feb 2024
Instance-aware Exploration-Verification-Exploitation for Instance ImageGoal Navigation
X. Lei
Min Wang
Wen-gang Zhou
Li Li
Houqiang Li
53
5
0
25 Feb 2024
A Comprehensive Review of Machine Learning Advances on Data Change: A Cross-Field Perspective
Jeng-Lin Li
Chih-Fan Hsu
Ming-Ching Chang
Wei-Chao Chen
OOD
56
2
0
20 Feb 2024
Efficient and Scalable Fine-Tune of Language Models for Genome Understanding
Huixin Zhan
Ying Nian Wu
Zijun Zhang
ALM
35
1
0
12 Feb 2024
Delving into Multi-modal Multi-task Foundation Models for Road Scene Understanding: From Learning Paradigm Perspectives
Sheng Luo
Wei Chen
Wanxin Tian
Rui Liu
Luanxuan Hou
...
Ling Shao
Yi Yang
Bojun Gao
Qun Li
Guobin Wu
55
13
0
05 Feb 2024
Region-Based Representations Revisited
Michal Shlapentokh-Rothman
Ansel Blume
Yao Xiao
Yuqun Wu
TV Sethuraman
Heyi Tao
Jae Yong Lee
Wilfredo Torres
Yu-xiong Wang
Derek Hoiem
42
5
0
04 Feb 2024
SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks
Serdar Erişen
SSeg
35
11
0
28 Jan 2024
Enhancing Small Object Encoding in Deep Neural Networks: Introducing Fast&Focused-Net with Volume-wise Dot Product Layer
Tofik Ali
Partha Pratim Roy
ObjD
35
2
0
18 Jan 2024
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Lianghui Zhu
Bencheng Liao
Qian Zhang
Xinlong Wang
Wenyu Liu
Xinggang Wang
Mamba
50
719
0
17 Jan 2024
MapNeXt: Revisiting Training and Scaling Practices for Online Vectorized HD Map Construction
Toyota Li
26
6
0
14 Jan 2024
UniVision: A Unified Framework for Vision-Centric 3D Perception
Yu Hong
Qian Liu
Huayuan Cheng
Danjiao Ma
Hang Dai
Yu Wang
Guangzhi Cao
Yong Ding
53
7
0
13 Jan 2024
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications
Yuwen Xiong
Zhiqi Li
Yuntao Chen
Feng Wang
Xizhou Zhu
...
Hongsheng Li
Yu Qiao
Lewei Lu
Jie Zhou
Jifeng Dai
36
51
0
11 Jan 2024
Latency-aware Road Anomaly Segmentation in Videos: A Photorealistic Dataset and New Metrics
Beiwen Tian
Huan-ang Gao
Leiyao Cui
Yupeng Zheng
Lan Luo
Baofeng Wang
Rong Zhi
Guyue Zhou
Hao Zhao
31
4
0
10 Jan 2024
WoodScape Motion Segmentation for Autonomous Driving -- CVPR 2023 OmniCV Workshop Challenge
Saravanabalagi Ramachandran
Nathaniel Cibik
Ganesh Sistu
John L McDonald
43
0
0
31 Dec 2023
Visual Point Cloud Forecasting enables Scalable Autonomous Driving
Zetong Yang
Li Chen
Yanan Sun
Hongyang Li
3DPC
27
40
0
29 Dec 2023
Harnessing Diffusion Models for Visual Perception with Meta Prompts
Qiang Wan
Zilong Huang
Bingyi Kang
Jiashi Feng
Li Zhang
MDE
VLM
26
15
0
22 Dec 2023
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks
Zhe Chen
Jiannan Wu
Wenhai Wang
Weijie Su
Guo Chen
...
Bin Li
Ping Luo
Tong Lu
Yu Qiao
Jifeng Dai
VLM
MLLM
176
972
0
21 Dec 2023
Rethinking of Feature Interaction for Multi-task Learning on Dense Prediction
Jingdong Zhang
Jiayuan Fan
Peng Ye
Bo Zhang
Hancheng Ye
Baopu Li
Yancheng Cai
Tao Chen
28
2
0
21 Dec 2023
Point Deformable Network with Enhanced Normal Embedding for Point Cloud Analysis
Xingyilang Yin
Xi Yang
Liangchen Liu
Nannan Wang
Xinbo Gao
3DPC
33
3
0
20 Dec 2023
TADAP: Trajectory-Aided Drivable area Auto-labeling with Pre-trained self-supervised features in winter driving conditions
Eerik Alamikkotervo
Risto Ojala
Alvari Seppänen
Kari Tammi
32
0
0
20 Dec 2023
RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering Assisted Distillation
Haiming Zhang
Xu Yan
Dongfeng Bai
Jiantao Gao
Pan Wang
Bingbing Liu
Shuguang Cui
Zhen Li
91
22
0
19 Dec 2023
UniChest: Conquer-and-Divide Pre-training for Multi-Source Chest X-Ray Classification
Tianjie Dai
Ruipeng Zhang
Feng Hong
Jiangchao Yao
Ya Zhang
Yanfeng Wang
40
8
0
18 Dec 2023
Semantic-Aware Transformation-Invariant RoI Align
Guo-Ye Yang
George Kiyohiro Nakayama
Zikai Xiao
Tai-Jiang Mu
Xiaolei Huang
Shi-Min Hu
ObjD
37
0
0
15 Dec 2023
WeatherProof: A Paired-Dataset Approach to Semantic Segmentation in Adverse Weather
Blake Gella
Howard Zhang
Rishi Upadhyay
Tiffany Chang
Matthew Waliman
Yunhao Ba
Alex Wong
A. Kadambi
37
6
0
15 Dec 2023
ThinkBot: Embodied Instruction Following with Thought Chain Reasoning
Guanxing Lu
Ziwei Wang
Changliu Liu
Jiwen Lu
Yansong Tang
LRM
25
8
0
12 Dec 2023
Loss Functions in the Era of Semantic Segmentation: A Survey and Outlook
Reza Azad
Moein Heidary
Kadir Yilmaz
Michael Huttemann
Sanaz Karimijafarbigloo
Yuli Wu
Anke Schmeink
Dorit Merhof
VLM
SSeg
47
18
0
08 Dec 2023
AI-SAM: Automatic and Interactive Segment Anything Model
Yimu Pan
Sitao Zhang
Alison D. Gernand
Jeffery A. Goldstein
J. Z. Wang
VLM
32
4
0
05 Dec 2023
Implicit Learning of Scene Geometry from Poses for Global Localization
Mohammad Altillawi
Shile Li
Sai Manoj Prakhya
Ziyuan Liu
Joan Serrat
SSL
25
2
0
04 Dec 2023
BEVNeXt: Reviving Dense BEV Frameworks for 3D Object Detection
Zhenxin Li
Shiyi Lan
Jose M. Alvarez
Zuxuan Wu
41
16
0
04 Dec 2023
A New Learning Paradigm for Foundation Model-based Remote Sensing Change Detection
Kaiyu Li
Xiangyong Cao
Deyu Meng
36
52
0
02 Dec 2023
A Graph-Based Approach for Category-Agnostic Pose Estimation
Or Hirschorn
S. Avidan
44
10
0
29 Nov 2023
TransNeXt: Robust Foveal Visual Perception for Vision Transformers
Dai Shi
ViT
23
77
0
28 Nov 2023
UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
Xiaohan Ding
Yiyuan Zhang
Yixiao Ge
Sijie Zhao
Lin Song
Xiangyu Yue
Ying Shan
VLM
AI4TS
SSL
29
104
0
27 Nov 2023
Adapter is All You Need for Tuning Visual Tasks
Dongshuo Yin
Leiyi Hu
Bin Li
Youqun Zhang
18
15
0
25 Nov 2023
IDD-AW: A Benchmark for Safe and Robust Segmentation of Drive Scenes in Unstructured Traffic and Adverse Weather
Furqan Ahmed Shaik
Abhishek Malreddy
Nikhil Reddy Billa
Kunal Chaudhary
Sunny Manchanda
Girish Varma
25
11
0
24 Nov 2023
T-Rex: Counting by Visual Prompting
Qing Jiang
Feng Li
Tianhe Ren
Shilong Liu
Zhaoyang Zeng
Kent Yu
Lei Zhang
24
12
0
22 Nov 2023
LABELMAKER: Automatic Semantic Label Generation from RGB-D Trajectories
Silvan Weder
Hermann Blum
Francis Engelmann
Marc Pollefeys
VLM
32
11
0
20 Nov 2023
Sparse4D v3: Advancing End-to-End 3D Detection and Tracking
Xuewu Lin
Zi-Hui Pei
Tianwei Lin
Lichao Huang
Zhizhong Su
25
35
0
20 Nov 2023
On the Importance of Large Objects in CNN Based Object Detection Algorithms
Ahmed Ben Saad
Gabriele Facciolo
Axel Davy
ObjD
27
2
0
20 Nov 2023
Towards Open-Ended Visual Recognition with Large Language Model
Qihang Yu
Xiaohui Shen
Liang-Chieh Chen
VLM
22
8
0
14 Nov 2023
OmniVec: Learning robust representations with cross modal sharing
Siddharth Srivastava
Gaurav Sharma
SSL
42
64
0
07 Nov 2023
GQKVA: Efficient Pre-training of Transformers by Grouping Queries, Keys, and Values
Farnoosh Javadi
Walid Ahmed
Habib Hajimolahoseini
Foozhan Ataiefard
Mohammad Hassanpour
Saina Asani
Austin Wen
Omar Mohamed Awad
Kangling Liu
Yang Liu
VLM
42
7
0
06 Nov 2023
Ultra-Efficient On-Device Object Detection on AI-Integrated Smart Glasses with TinyissimoYOLO
Julian Moosmann
Pietro Bonazzi
Yawei Li
Sizhen Bian
Philipp Mayer
Luca Benini
Michele Magno
43
12
0
02 Nov 2023
DDC-PIM: Efficient Algorithm/Architecture Co-design for Doubling Data Capacity of SRAM-based Processing-In-Memory
Cenlin Duan
Jianlei Yang
Xiaolin He
Yingjie Qi
Yikun Wang
...
Bonan Yan
Xueyan Wang
Xiaotao Jia
Weitao Pan
Weisheng Zhao
32
5
0
31 Oct 2023
Self-Supervised Pre-Training for Precipitation Post-Processor
Sojung An
Junha Lee
Jiyeon Jang
Inchae Na
Wooyeon Park
Sujeong You
AI4Cl
26
1
0
31 Oct 2023
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
Meng Lou
Hong-Yu Zhou
Sibei Yang
Chuan Wu
Yizhou Yu
ViT
49
36
0
30 Oct 2023
Out-of-distribution Object Detection through Bayesian Uncertainty Estimation
Tianhao Zhang
Shenglin Wang
N. Bouaynaya
R. Calinescu
Lyudmila Mihaylova
OODD
21
2
0
29 Oct 2023