1506.01497
Cited By
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
4 June 2015
Shaoqing Ren
Kaiming He
Ross B. Girshick
Jian Sun
AIMat
ObjD
Papers citing
"Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks"
50 / 7,093 papers shown
Title
OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models
Wenwen Yu
Zhibo Yang
Jianqiang Wan
Sibo Song
J. Tang
Wenqing Cheng
Yunxing Liu
Xiang Bai
55
3
0
22 Feb 2025
Visible-Thermal Tiny Object Detection: A Benchmark Dataset and Baselines
Xinyi Ying
Chao Xiao
Ruojing Li
Xu He
Boyang Li
...
Miao Li
Shilin Zhou
Wei An
Weidong Sheng
Li Liu
157
7
0
21 Feb 2025
YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection
Yuming Chen
Xinbin Yuan
Ruiqi Wu
Jiabao Wang
Qibin Hou
Ming-Ming Cheng
ObjD
159
51
0
21 Feb 2025
RSNet: A Light Framework for The Detection of Multi-scale Remote Sensing Targets
Hongyu Chen
Chong Chen
Fei Wang
Yuhu Shi
Weiming Zeng
52
1
0
20 Feb 2025
A Comprehensive Survey on Composed Image Retrieval
Xuemeng Song
Haoqiang Lin
Haokun Wen
Bohan Hou
Mingzhu Xu
Liqiang Nie
53
1
0
19 Feb 2025
Natural Language Generation from Visual Sequences: Challenges and Future Directions
Aditya K Surikuchi
Raquel Fernández
Sandro Pezzelle
EGVM
266
0
0
18 Feb 2025
Robust Adaptation of Large Multimodal Models for Retrieval Augmented Hateful Meme Detection
Jingbiao Mei
Jinghong Chen
Guangyu Yang
Weizhe Lin
Bill Byrne
VLM
108
0
0
18 Feb 2025
Detecting Systematic Weaknesses in Vision Models along Predefined Human-Understandable Dimensions
Sujan Sai Gannamaneni
Rohil Prakash Rao
Michael Mock
Maram Akila
Stefan Wrobel
AAML
198
0
0
17 Feb 2025
CLoCKDistill: Consistent Location-and-Context-aware Knowledge Distillation for DETRs
Qizhen Lan
Qing Tian
55
0
0
15 Feb 2025
A Mathematics Framework of Artificial Shifted Population Risk and Its Further Understanding Related to Consistency Regularization
Xiliang Yang
Shenyang Deng
Shicong Liu
Yuanchi Suo
Wing W. Y. Ng
Jianjun Zhang
40
0
0
15 Feb 2025
Visual Graph Question Answering with ASP and LLMs for Language Parsing
Jakob Johannes Bauer
Thomas Eiter
Nelson Higuera Ruiz
J. Oetsch
GNN
64
0
0
13 Feb 2025
Generalized Class Discovery in Instance Segmentation
Cuong Manh Hoang
Yeejin Lee
Byeongkeun Kang
ISeg
92
0
0
12 Feb 2025
Dense Object Detection Based on De-homogenized Queries
Yueming Huang
Chenrui Ma
Hao Zhou
Hao Wu
Guowu Yuan
127
0
0
11 Feb 2025
SparseFormer: Detecting Objects in HRW Shots via Sparse Vision Transformer
Wenxi Li
Yuchen Guo
Jilai Zheng
Haozhe Lin
Chao Ma
Lu Fang
Xiaokang Yang
ViT
62
1
0
11 Feb 2025
Vision-Language Models for Edge Networks: A Comprehensive Survey
Ahmed Sharshar
Latif U. Khan
Waseem Ullah
Mohsen Guizani
VLM
70
3
0
11 Feb 2025
Foreign-Object Detection in High-Voltage Transmission Line Based on Improved YOLOv8m
Zhenyue Wang
Guowu Yuan
Hao Zhou
Yi Ma
Yutang Ma
47
18
0
11 Feb 2025
PLATTER: A Page-Level Handwritten Text Recognition System for Indic Scripts
Badri Vishal Kasuba
Dhruv Kudale
Venkatapathy Subramanian
P. Chaudhuri
Ganesh Ramakrishnan
48
0
0
10 Feb 2025
Enhancing Document Key Information Localization Through Data Augmentation
Yue Dai
83
0
0
10 Feb 2025
An Appearance Defect Detection Method for Cigarettes Based on C-CenterNet
Hongyu Liu
Guowu Yuan
Lei Yang
Kunxiao Liu
Hao Zhou
64
22
0
10 Feb 2025
Energy-Efficient Autonomous Aerial Navigation with Dynamic Vision Sensors: A Physics-Guided Neuromorphic Approach
Sourav Sanyal
Amogh Joshi
M. Nagaraj
Rohan Kumar Manna
Kaushik Roy
84
1
0
09 Feb 2025
Secure Visual Data Processing via Federated Learning
Pedro Santos
Tânia Carvalho
Filipe Magalhães
Luís Antunes
FedML
64
0
0
09 Feb 2025
Demystifying Catastrophic Forgetting in Two-Stage Incremental Object Detector
Qirui Wu
Shizhou Zhang
De Cheng
Yinghui Xing
Di Xu
Peng Wang
Yanning Zhang
ObjD
66
0
0
08 Feb 2025
MMHMER: Multi-viewer and Multi-task for Handwritten Mathematical Expression Recognition
Kehua Chen
Haoyang Shen
Lifan Zhong
Mingyi Chen
48
0
0
08 Feb 2025
Drone Detection and Tracking with YOLO and a Rule-based Method
Purbaditya Bhattacharya
Patrick Nowak
55
0
0
07 Feb 2025
Éclair -- Extracting Content and Layout with Integrated Reading Order for Documents
Ilia Karmanov
A. Deshmukh
Lukas Voegtle
Philipp Fischer
Kateryna Chumachenko
...
Jarno Seppänen
Jupinder Parmar
Joseph Jennings
Andrew Tao
Karan Sapra
73
0
0
06 Feb 2025
RS-YOLOX: A High Precision Detector for Object Detection in Satellite Remote Sensing Images
Lei Yang
Guowu Yuan
Hao Zhou
Hongyu Liu
Jian Chen
Hao Wu
108
30
0
05 Feb 2025
Solar Panel Mapping via Oriented Object Detection
Conor Wallace
Isaac Corley
Jonathan Lwowski
39
1
0
05 Feb 2025
AquaticCLIP: A Vision-Language Foundation Model for Underwater Scene Analysis
B. Alawode
I. I. Ganapathi
S. Javed
Naoufel Werghi
Mohammed Bennamoun
Arif Mahmood
CLIP
VLM
78
1
0
03 Feb 2025
Foundation Model-Based Apple Ripeness and Size Estimation for Selective Harvesting
Keyi Zhu
Jiajia Li
Kaixiang Zhang
Chaaran Arunachalam
Siddhartha Bhattacharya
R. Lu
Zhaojian Li
89
0
0
03 Feb 2025
A Survey on Class-Agnostic Counting: Advancements from Reference-Based to Open-World Text-Guided Approaches
Luca Ciampi
Ali Azmoudeh
Elif Ecem Akbaba
Erdi Sarıtaş
Ziya Ata Yazıcı
H. K. Ekenel
Giuseppe Amato
Fabrizio Falchi
102
0
0
31 Jan 2025
Transfer Learning for Keypoint Detection in Low-Resolution Thermal TUG Test Images
Wei-Lun Chen
Chia-Yeh Hsieh
Yu-Hsiang Kao
Kai-Chun Liu
Sheng-Yu Peng
Yu Tsao
92
0
0
30 Jan 2025
Assessing the Capability of YOLO- and Transformer-based Object Detectors for Real-time Weed Detection
Alicia Allmendinger
Ahmet Oğuz Saltık
Gerassimos G. Peteinatos
Anthony Stein
Roland Gerhards
84
1
0
29 Jan 2025
PatentLMM: Large Multimodal Model for Generating Descriptions for Patent Figures
Shivalika Singh
Nakul Sharma
Manish Gupta
Anand Mishra
55
1
0
28 Jan 2025
A Privacy Enhancing Technique to Evade Detection by Street Video Cameras Without Using Adversarial Accessories
Jacob Shams
Ben Nassi
Satoru Koda
A. Shabtai
Yuval Elovici
193
0
0
28 Jan 2025
E2E-MFD: Towards End-to-End Synchronous Multimodal Fusion Detection
Jiaqing Zhang
Mingxiang Cao
Weiying Xie
Jie Lei
Daixun Li
Wenbo Huang
Yunsong Li
Xue Yang
62
5
0
28 Jan 2025
On the use of neural networks for the structural characterization of polymeric porous materials
Jorge Torre
Suset Barroso-Solares
M.A. Rodríguez-Pérez
Javier Pinto
46
5
0
25 Jan 2025
High-Precision Fabric Defect Detection via Adaptive Shape Convolutions and Large Kernel Spatial Modeling
Shuai Wang
Yongjun Xu
Hui Zheng
Baotian Li
39
0
0
24 Jan 2025
YOLO11-JDE: Fast and Accurate Multi-Object Tracking with Self-Supervised Re-ID
Iñaki Erregue
Kamal Nasrollahi
Sergio Escalera
VOT
62
0
0
23 Jan 2025
Rethinking the Sample Relations for Few-Shot Classification
Guowei Yin
Sheng Huang
Luwen Huangfu
Yi Zhang
Xiaohong Zhang
VLM
38
0
0
23 Jan 2025
Combining Knowledge Graph and LLMs for Enhanced Zero-shot Visual Question Answering
Qian Tao
Xiaoyang Fan
Yong Xu
Xingquan Zhu
Yufei Tang
52
0
0
22 Jan 2025
TOFFE -- Temporally-binned Object Flow from Events for High-speed and Energy-Efficient Object Detection and Tracking
Adarsh Kosta
Amogh Joshi
Arjun Roy
Rohan Kumar Manna
M. Nagaraj
Kaushik Roy
48
2
0
21 Jan 2025
Memory Storyboard: Leveraging Temporal Segmentation for Streaming Self-Supervised Learning from Egocentric Videos
Yanlai Yang
Mengye Ren
241
0
0
21 Jan 2025
Investigating Market Strength Prediction with CNNs on Candlestick Chart Images
Thanh Nam Duong
Trung Kien Hoang
Quoc Khanh Duong
Quoc Dat Dinh
Duc Hoan Le
Huy Tuan Nguyen
Xuan Bach Nguyen
Quy Ban Tran
75
0
0
21 Jan 2025
SVGS-DSGAT: An IoT-Enabled Innovation in Underwater Robotic Object Detection Technology
Dongli Wu
Ling Luo
46
0
0
21 Jan 2025
Enhancing SAR Object Detection with Self-Supervised Pre-training on Masked Auto-Encoders
Xinyang Pu
Feng Xu
39
0
0
20 Jan 2025
MRI2Speech: Speech Synthesis from Articulatory Movements Recorded by Real-time MRI
N. Shah
Ayan Kashyap
Shirish S. Karande
Vineet Gandhi
52
0
0
20 Jan 2025
AgRegNet: A Deep Regression Network for Flower and Fruit Density Estimation, Localization, and Counting in Orchards
Uddhav Bhattarai
Santosh Bhusal
Qin Zhang
Manoj Karkee
103
2
0
20 Jan 2025
TextureCrop: Enhancing Synthetic Image Detection through Texture-based Cropping
Despina Konstantinidou
C. Koutlis
Symeon Papadopoulos
81
2
0
17 Jan 2025
Embodied Scene Understanding for Vision Language Models via MetaVQA
Weizhen Wang
Chenda Duan
Zhenghao Peng
Yuxin Liu
Bolei Zhou
LM&Ro
49
0
0
17 Jan 2025
Multi-visual modality micro drone-based structural damage detection
Isaac Osei Agyemang
Liaoyuan Zeng
Jianwen Chen
Isaac Adjei-Mensah
D. Acheampong
62
5
0
15 Jan 2025