1506.01497
Cited By
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
4 June 2015
Shaoqing Ren
Kaiming He
Ross B. Girshick
Jian Sun
AIMat
ObjD
Papers citing
"Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks"
50 / 6,983 papers shown
Title
High-Precision Transformer-Based Visual Servoing for Humanoid Robots in Aligning Tiny Objects
Jialong Xue
Wei Gao
Yu Wang
Chao Ji
Dongdong Zhao
Shi Yan
Shiwu Zhang
45
0
0
06 Mar 2025
Inclusive STEAM Education: A Framework for Teaching Coding and Robotics to Students with Visually Impairment Using Advanced Computer Vision
Mahmoud Hamash
Md Raqib Khan
Peter Tiernan
42
0
0
06 Mar 2025
Teach YOLO to Remember: A Self-Distillation Approach for Continual Object Detection
Riccardo De Monte
Davide Dalle Pezze
Gian Antonio Susto
CLL
68
0
0
06 Mar 2025
AHCPTQ: Accurate and Hardware-Compatible Post-Training Quantization for Segment Anything Model
Wenlun Zhang
Shimpei Ando
Kentaro Yoshioka
VLM
MQ
67
0
0
05 Mar 2025
MIAdapt: Source-free Few-shot Domain Adaptive Object Detection for Microscopic Images
Nimra Dilawar
Sara Nadeem
Javed Iqbal
Waqas Sultani
Mohsen Ali
61
0
0
05 Mar 2025
Periodontal Bone Loss Analysis via Keypoint Detection With Heuristic Post-Processing
Ryan Banks
Vishal Thengane
María Eugenia Guerrero
Nelly Maria García-Madueño
Yunpeng Li
Hongying Tang
A. Chaurasia
54
0
0
05 Mar 2025
LION-FS: Fast & Slow Video-Language Thinker as Online Video Assistant
Wei Li
Bing Hu
Rui Shao
Leyang Shen
Liqiang Nie
47
2
0
05 Mar 2025
Robust detection of overlapping bioacoustic sound events
Louis Mahon
Benjamin Hoffman
Logan S James
M. Cusimano
Masato Hagiwara
Sarah C Woolley
Olivier Pietquin
73
0
0
04 Mar 2025
Adaptive Camera Sensor for Vision Models
Eunsu Baek
Sunghwan Han
Taesik Gong
Hyung-Sin Kim
VLM
Presented at ResearchTrend Connect | VLM on 28 Mar 2025
164
0
0
04 Mar 2025
Catheter Detection and Segmentation in X-ray Images via Multi-task Learning
Lin Xi
Yingliang Ma
Ethan Koland
Sandra Howell
Aldo Rinaldi
Kawal S. Rhode
69
0
0
04 Mar 2025
An Efficient Approach to Detecting Lung Nodules Using Swin Transformer
Saeed Shakuri
Alireza Rezvanian
ViT
MedIm
53
1
0
03 Mar 2025
UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface
Hao Tang
Chenwei Xie
Haiyang Wang
Xiaoyi Bao
Tingyu Weng
Pandeng Li
Yun Zheng
Liwei Wang
ObjD
VLM
62
0
0
03 Mar 2025
Identity documents recognition and detection using semantic segmentation with convolutional neural network
Mykola Kozlenko
Volodymyr Sendetskyi
Oleksiy Simkiv
Nazar Savchenko
Andy Bosyi
63
3
0
03 Mar 2025
AC-Lite: A Lightweight Image Captioning Model for Low-Resource Assamese Language
Pankaj Choudhury
Yogesh Aggarwal
Prabhanjan Jadhav
Prithwijit Guha
Sukumar Nandi
82
0
0
03 Mar 2025
Enhancing Object Detection Accuracy in Underwater Sonar Images through Deep Learning-based Denoising
Ziyu Wang
Tao Xue
Yanbin Wang
Jingkai Li
Haibin Zhang
Zhiqiang Xu
Gaofei Xu
72
0
0
03 Mar 2025
Aerial Infrared Health Monitoring of Solar Photovoltaic Farms at Scale
Isaac Corley
Conor Wallace
Sourav Agrawal
Burton Putrah
Jonathan Lwowski
58
0
0
03 Mar 2025
MI-DETR: An Object Detection Model with Multi-time Inquiries Mechanism
Zhixiong Nan
Xianghong Li
Jifeng Dai
Tao Xiang
51
0
0
03 Mar 2025
Zero-Trust Artificial Intelligence Model Security Based on Moving Target Defense and Content Disarm and Reconstruction
Daniel Gilkarov
Ran Dubin
74
0
0
03 Mar 2025
Solving Instance Detection from an Open-World Perspective
Qianqian Shen
Yunhan Zhao
Nahyun Kwon
Jeeeun Kim
Yanan Li
Shu Kong
43
0
0
01 Mar 2025
RFWNet: A Lightweight Remote Sensing Object Detector Integrating Multi-Scale Receptive Fields and Foreground Focus Mechanism
Yujie Lei
Wenjie Sun
Sen Jia
Qingquan Li
Jie Zhang
49
0
0
01 Mar 2025
The Common Objects Underwater (COU) Dataset for Robust Underwater Object Detection
Rishi Mukherjee
Sakshi Singh
Jack McWilliams
Junaed Sattar
59
1
0
28 Feb 2025
RTGen: Real-Time Generative Detection Transformer
Chi Ruan
ObjD
VLM
52
0
0
28 Feb 2025
Towards long-term player tracking with graph hierarchies and domain-specific features
Maria Koshkina
J. Elder
36
0
0
28 Feb 2025
Transformers with Joint Tokens and Local-Global Attention for Efficient Human Pose Estimation
K. A. Kinfu
René Vidal
ViT
26
0
0
28 Feb 2025
Data-free Universal Adversarial Perturbation with Pseudo-semantic Prior
Chanhui Lee
Yeonghwan Song
Jeany Son
AAML
192
0
0
28 Feb 2025
WalnutData: A UAV Remote Sensing Dataset of Green Walnuts and Model Evaluation
Mingjie Wu
Chenggui Yang
Huihua Wang
Chen Xue
Yibo Wang
...
Yuqi Han
R. Li
Lijun Yun
Zaiqing Chen
Shri Kiran Srinivasan
62
0
0
27 Feb 2025
CS-PaperSum: A Large-Scale Dataset of AI-Generated Summaries for Scientific Papers
Javin Liu
Aryan Vats
Zihao He
39
0
0
27 Feb 2025
OpenTAD: A Unified Framework and Comprehensive Study of Temporal Action Detection
Shuming Liu
Chen Zhao
Fatimah Zohra
Mattia Soldan
Alejandro Pardo
...
Juan Carlos León Alcázar
A. Cioppa
Silvio Giancola
Carlos Hinojosa
Bernard Ghanem
68
3
0
27 Feb 2025
Attention-Guided Integration of CLIP and SAM for Precise Object Masking in Robotic Manipulation
Muhammad A. Muttaqien
Tomohiro Motoda
Ryo Hanai
Domae Yukiyasu
46
0
0
26 Feb 2025
LiGT: Layout-infused Generative Transformer for Visual Question Answering on Vietnamese Receipts
Thanh-Phong Le
Trung Le Chi Phan
Nghia Hieu Nguyen
Kiet Van Nguyen
ViT
49
0
0
26 Feb 2025
Grad-ECLIP: Gradient-based Visual and Textual Explanations for CLIP
Chenyang Zhao
Kun Wang
J. H. Hsiao
Antoni B. Chan
CLIP
71
0
0
26 Feb 2025
Advanced YOLO-based Real-time Power Line Detection for Vegetation Management
Shuaiang Rong
Lina He
S. Atici
Ahmet Enis Cetin
43
0
0
26 Feb 2025
Automatic Vehicle Detection using DETR: A Transformer-Based Approach for Navigating Treacherous Roads
Istiaq Ahmed Fahad
Abdullah Ibne Hanif Arean
Nazmus Sakib Ahmed
Mahmudul Hasan
ViT
51
1
0
25 Feb 2025
Weakly Supervised Pixel-Level Annotation with Visual Interpretability
Basma Nasir
Tehseen Zia
Muhammad Nawaz
Catarina Moreira
FAtt
87
0
0
25 Feb 2025
Autonomous Vision-Guided Resection of Central Airway Obstruction
M. E. Smith
N. Yilmaz
T. Watts
P. M. Scheikl
J. Ge
A. Deguet
A. Kuntz
A. Krieger
65
1
0
25 Feb 2025
Improved YOLOv7x-Based Defect Detection Algorithm for Power Equipment
Jin Hou
Hao Tang
69
0
0
25 Feb 2025
Complex Networks for Pattern-Based Data Classification
Josimar Chire
Khalid Mahmood
Zhao Liang
41
0
0
25 Feb 2025
IBURD: Image Blending for Underwater Robotic Detection
Jungseok Hong
Sakshi Singh
Junaed Sattar
62
1
0
24 Feb 2025
PhysAug: A Physical-guided and Frequency-based Data Augmentation for Single-Domain Generalized Object Detection
Xiaoran Xu
Jiangang Yang
Wenhui Shi
Siyuan Ding
Luqing Luo
Jian Liu
92
1
0
24 Feb 2025
Soybean pod and seed counting in both outdoor fields and indoor laboratories using unions of deep neural networks
Tianyou Jiang
Mingshun Shao
Tianyi Zhang
Xiaoyu Liu
Qun Yu
65
0
0
24 Feb 2025
SentiFormer: Metadata Enhanced Transformer for Image Sentiment Analysis
Bin Feng
Shulan Ruan
Mingzheng Yang
Dongxuan Han
Huijie Liu
Kai Zhang
Qi Liu
ViT
59
0
0
24 Feb 2025
Enriching Physical-Virtual Interaction in AR Gaming by Tracking Identical Real Objects
Liuchuan Yu
Ching-I Huang
Hsueh-Cheng Wang
L. Yu
46
0
0
24 Feb 2025
ZeroPS: High-quality Cross-modal Knowledge Transfer for Zero-Shot 3D Part Segmentation
Yuheng Xue
Nenglun Chen
Jun Liu
Wenyun Sun
3DPC
75
7
0
24 Feb 2025
Hierarchical Context Transformer for Multi-level Semantic Scene Understanding
Luoying Hao
Yan Hu
Yang Yue
Li Wu
Huazhu Fu
Jinming Duan
Jiang Liu
68
0
0
24 Feb 2025
Semi-Supervised Weed Detection in Vegetable Fields: In-domain and Cross-domain Experiments
Boyang Deng
Yuzhen Lu
41
0
0
24 Feb 2025
MQADet: A Plug-and-Play Paradigm for Enhancing Open-Vocabulary Object Detection via Multimodal Question Answering
Caixiong Li
Xiongwei Zhao
Jinhang Zhang
Xing Zhang
Qihao Sun
Zhou Wu
ObjD
MLLM
VLM
56
0
0
23 Feb 2025
Cross-domain Few-shot Object Detection with Multi-modal Textual Enrichment
Zeyu Shangguan
Daniel Seita
Mohammad Rostami
ObjD
61
0
0
23 Feb 2025
EDocNet: Efficient Datasheet Layout Analysis Based on Focus and Global Knowledge Distillation
Hong Cai Chen
Longchang Wu
Yang Zhang
38
0
0
23 Feb 2025
Prompt as Knowledge Bank: Boost Vision-language model via Structural Representation for zero-shot medical detection
Yuguang Yang
Tongfei Chen
Haoyu Huang
Linlin Yang
Chunyu Xie
Dawei Leng
Xianbin Cao
Baochang Zhang
MedIm
45
0
0
22 Feb 2025
OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models
Wenwen Yu
Zhibo Yang
Jianqiang Wan
Sibo Song
J. Tang
Wenqing Cheng
Yunxing Liu
Xiang Bai
53
3
0
22 Feb 2025