1506.01497
Cited By
v1
v2
v3 (latest)
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
4 June 2015
Shaoqing Ren
Kaiming He
Ross B. Girshick
Jian Sun
AIMat
ObjD
Papers citing
"Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks"
50 / 10,536 papers shown
Title
RTGen: Generating Region-Text Pairs for Open-Vocabulary Object Detection
Fangyi Chen
Han Zhang
Zhantao Yang
Hao Chen
Kai Hu
Marios Savvides
ObjD
VLM
93
5
0
30 May 2024
Improving Object Detector Training on Synthetic Data by Starting With a Strong Baseline Methodology
Frank Ruis
Alma M. Liezenga
Friso G. Heslinga
Luca Ballan
Thijs A. Eker
Richard J. M. den Hollander
Martin C. van Leeuwen
Judith Dijk
Wyke Huizinga
81
4
0
30 May 2024
YotoR-You Only Transform One Representation
José Ignacio Díaz Villa
P. Loncomilla
Javier Ruiz-del-Solar
ViT
71
1
0
30 May 2024
Evaluating the Effectiveness and Robustness of Visual Similarity-based Phishing Detection Models
Fujiao Ji
Kiho Lee
Hyungjoon Koo
Wenhao You
Euijin Choo
Hyoungshick Kim
Doowon Kim
AAML
99
2
0
30 May 2024
Enabling Visual Recognition at Radio Frequency
Haowen Lai
Gaoxiang Luo
Yifei Liu
Mingmin Zhao
79
4
0
29 May 2024
Exploring AI-based Anonymization of Industrial Image and Video Data in the Context of Feature Preservation
Sabrina Cynthia Triess
Timo Leitritz
Christian Jauch
72
0
0
29 May 2024
RGB-T Object Detection via Group Shuffled Multi-receptive Attention and Multi-modal Supervision
Jinzhong Wang
Xuetao Tian
Shun Dai
Tao Zhuo
Haorui Zeng
Hongjuan Liu
Jiaqi Liu
Xiuwei Zhang
Yanning Zhang
89
1
0
29 May 2024
SSGA-Net: Stepwise Spatial Global-local Aggregation Networks for Autonomous Driving
Yiming Cui
Cheng Han
Dongfang Liu
96
0
0
29 May 2024
Vulnerable Road User Detection and Safety Enhancement: A Comprehensive Survey
Renato M. Silva
Gregório F. Azevedo
M. V. Berto
Jean R. Rocha
E. C. Fidelis
Matheus V. Nogueira
Pedro H. Lisboa
Tiago A. Almeida
96
4
0
29 May 2024
Track Initialization and Re-Identification for 3D Multi-View Multi-Object Tracking
L. Ma
Tran Thien Dat Nguyen
B. Vo
Hyunsung Jang
Moongu Jeon
92
16
0
28 May 2024
A Review and Implementation of Object Detection Models and Optimizations for Real-time Medical Mask Detection during the COVID-19 Pandemic
Ioanna C. Gogou
Dimitrios A. Koutsomitropoulos
3DH
57
2
0
28 May 2024
Deep Learning Innovations for Underwater Waste Detection: An In-Depth Analysis
Jaskaran Singh Walia
K. PavithraL
92
4
0
28 May 2024
Flow-Assisted Motion Learning Network for Weakly-Supervised Group Activity Recognition
Muhammad Adi Nugroho
Sangmin Woo
Sumin Lee
Jinyoung Park
Yooseung Wang
Donguk Kim
Changick Kim
71
1
0
28 May 2024
OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision
Junjie Wang
Bin Chen
Bin Kang
Yulin Li
Yichi Chen
Weizhi Xian
Huifeng Chang
VLM
ObjD
86
7
0
28 May 2024
Reliable Object Tracking by Multimodal Hybrid Feature Extraction and Transformer-Based Fusion
Hongze Sun
Rui Liu
Wuque Cai
Jun Wang
Yue Wang
Huajin Tang
Yan Cui
Dezhong Yao
Daqing Guo
126
9
0
28 May 2024
Adapting Pre-Trained Vision Models for Novel Instance Detection and Segmentation
Ya Lu
Jishnu Jaykumar
Yunhui Guo
Nicholas Ruozzi
Yu Xiang
VLM
ISeg
157
5
0
28 May 2024
XFormParser: A Simple and Effective Multimodal Multilingual Semi-structured Form Parser
Xianfu Cheng
Hang Zhang
Jian Yang
Xiang Li
Weixiao Zhou
...
Fei Liu
Wei Zhang
Tao Sun
Tongliang Li
Zhoujun Li
105
3
0
27 May 2024
LLM-Optic: Unveiling the Capabilities of Large Language Models for Universal Visual Grounding
Haoyu Zhao
Wenhang Ge
Ying-Cong Chen
ObjD
MLLM
VLM
98
5
0
27 May 2024
Efficient Visual Fault Detection for Freight Train via Neural Architecture Search with Data Volume Robustness
Yang Zhang
Mingying Li
Huilin Pan
Moyun Liu
Yang Zhou
64
0
0
27 May 2024
OED: Towards One-stage End-to-End Dynamic Scene Graph Generation
Guan-Bo Wang
Zhiming Li
Qingchao Chen
Yang Liu
103
11
0
27 May 2024
VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models
Zejun Li
Ruipu Luo
Jiwen Zhang
Minghui Qiu
Zhongyu Wei
LRM
MLLM
191
17
0
27 May 2024
Vision-Based Approach for Food Weight Estimation from 2D Images
Chathura Wimalasiri
P. Sahoo
33
0
0
26 May 2024
Active Learning for Finely-Categorized Image-Text Retrieval by Selecting Hard Negative Unpaired Samples
D. Jo
Kyuewang Lee
Jaeho Chung
Jin Young Choi
80
0
0
25 May 2024
Multimodal Object Detection via Probabilistic a priori Information Integration
Hafsa El Hafyani
Bastien Pasdeloup
Camille Yver
Pierre Romenteau
78
0
0
24 May 2024
Composed Image Retrieval for Remote Sensing
Bill Psomas
Ioannis Kakogeorgiou
Nikos Efthymiadis
Giorgos Tolias
Ondřej Chum
Yannis Avrithis
Konstantinos Karantzalos
118
7
0
24 May 2024
A PST Algorithm for FPs Suppression in Two-stage CNN Detection Methods
Qiang Guo
148
0
0
24 May 2024
Retro: Reusing teacher projection head for efficient embedding distillation on Lightweight Models via Self-supervised Learning
Khanh-Binh Nguyen
Chae Jung Park
86
0
0
24 May 2024
Unbiased Faster R-CNN for Single-source Domain Generalized Object Detection
Yajing Liu
Shijun Zhou
Xiyao Liu
Chunhui Hao
Baojie Fan
Jiandong Tian
ObjD
123
14
0
24 May 2024
Diversifying Human Pose in Synthetic Data for Aerial-view Human Detection
Yingzhe Shen
Hyungtae Lee
Heesung Kwon
Shuvra S. Bhattacharyya
126
5
0
24 May 2024
Concept Visualization: Explaining the CLIP Multi-modal Embedding Using WordNet
Loris Giulivi
Giacomo Boracchi
72
2
0
23 May 2024
Improving Single Domain-Generalized Object Detection: A Focus on Diversification and Alignment
M. S. Danish
Muhammad Haris Khan
Muhammad Akhtar Munir
M. Sarfraz
Mohsen Ali
ObjD
97
10
0
23 May 2024
Multi-modality Regional Alignment Network for Covid X-Ray Survival Prediction and Report Generation
Zhusi Zhong
Jie Li
J. Sollee
Scott Collins
Harrison X. Bai
Paul J Zhang
Terrance Healey
Michael Atalay
Xinbo Gao
Zhicheng Jiao
81
1
0
23 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
368
54
0
23 May 2024
Animal Behavior Analysis Methods Using Deep Learning: A Survey
Edoardo Fazzari
Donato Romano
Fabrizio Falchi
Cesare Stefanini
79
6
0
22 May 2024
A General Framework for Jersey Number Recognition in Sports Video
Maria Koshkina
James H. Elder
75
4
0
22 May 2024
AltChart: Enhancing VLM-based Chart Summarization Through Multi-Pretext Tasks
Omar Moured
Jiaming Zhang
M. Sarfraz
Rainer Stiefelhagen
67
3
0
22 May 2024
Multi Player Tracking in Ice Hockey with Homographic Projections
Harish Prakash
Jia Cheng Shang
Ken M. Nsiempba
Yuhao Chen
David A Clausi
John S. Zelek
87
1
0
22 May 2024
From CNNs to Transformers in Multimodal Human Action Recognition: A Survey
Muhammad Bilal Shaikh
Syed Mohammed Shamsul Islam
Douglas Chai
Naveed Akhtar
110
10
0
22 May 2024
BiomedParse: a biomedical foundation model for image parsing of everything everywhere all at once
Theodore Zhao
Yu Gu
Jianwei Yang
Naoto Usuyama
Ho Hin Lee
...
B. Piening
Carlo Bifulco
Mu-Hsin Wei
Hoifung Poon
Sheng Wang
MedIm
109
29
0
21 May 2024
AMFD: Distillation via Adaptive Multimodal Fusion for Multispectral Pedestrian Detection
Zizhao Chen
Yeqiang Qian
Xiaoxiao Yang
Chunxiang Wang
Ming Yang
68
3
0
21 May 2024
A Survey on Multi-modal Machine Translation: Tasks, Methods and Challenges
Huangjun Shen
Liangying Shao
Wenbo Li
Zhibin Lan
Zhanyu Liu
Jinsong Su
104
3
0
21 May 2024
Scene Graph Generation Strategy with Co-occurrence Knowledge and Learnable Term Frequency
H. Kim
Sangwon Kim
Dasom Ahn
Jong Taek Lee
ByoungChul Ko
118
4
0
21 May 2024
Active Object Detection with Knowledge Aggregation and Distillation from Large Models
Dejie Yang
Yang Liu
102
5
0
21 May 2024
A Survey of Deep Learning-based Radiology Report Generation Using Multimodal Data
Xinyi Wang
Grazziela Figueredo
Ruizhe Li
Wei Emma Zhang
Weitong Chen
Xin Chen
MedIm
ViT
124
2
0
21 May 2024
CSTA: CNN-based Spatiotemporal Attention for Video Summarization
Jaewon Son
Jaehun Park
Kwangsu Kim
AI4TS
ViT
102
9
0
20 May 2024
DATR: Unsupervised Domain Adaptive Detection Transformer with Dataset-Level Adaptation and Prototypical Alignment
Jianhong Han
Liang Chen
Yupei Wang
ViT
85
2
0
20 May 2024
DLAFormer: An End-to-End Transformer For Document Layout Analysis
Jiawei Wang
Kai Hu
Qiang Huo
3DV
ViT
75
3
0
20 May 2024
Versatile Teacher: A Class-aware Teacher-student Framework for Cross-domain Adaptation
Runou Yang
Tian Tian
Jinwen Tian
108
3
0
20 May 2024
Advancing 6-DoF Instrument Pose Estimation in Variable X-Ray Imaging Geometries
Christiaan Viviers
Lena Filatova
Maurice Termeer
Peter H. N. de With
Fons van der Sommen
95
8
0
19 May 2024
InfRS: Incremental Few-Shot Object Detection in Remote Sensing Images
Wuzhou Li
Jiawei Zhou
Xiang Li
Yi Cao
Guanglu Jin
Xuemin Zhang
115
2
0
18 May 2024