1902.09630
Cited By
Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression
25 February 2019
S. Hamid Rezatofighi
Deyuan Li
JunYoung Gwak
Amir Sadeghian
Ian Reid
Silvio Savarese
Papers citing
"Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression"
50 / 1,104 papers shown
Title
Temporal Action Localization with Cross Layer Task Decoupling and Refinement
Qiang Li
Di Liu
Jun Kong
Sen Li
Hui Xu
Jianzhong Wang
121
0
0
12 Dec 2024
Mojito: Motion Trajectory and Intensity Control for Video Generation
Xuehai He
Shuohang Wang
Jianwei Yang
Xiaoxia Wu
Yansen Wang
Kuan-Chieh Wang
Z. Zhan
Olatunji Ruwase
Yelong Shen
Xinze Wang
VGen
236
2
0
12 Dec 2024
TimeRefine: Temporal Grounding with Time Refining Video LLM
Xizi Wang
Feng Cheng
Ziyang Wang
Huiyu Wang
Md. Mohaiminul Islam
Lorenzo Torresani
Joey Tianyi Zhou
Gedas Bertasius
David J. Crandall
208
2
0
12 Dec 2024
Swap Path Network for Robust Person Search Pre-training
Lucas Jaffe
A. Zakhor
3DPC
112
0
0
06 Dec 2024
VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval
Dhiman Paul
Md Rizwan Parvez
Nabeel Mohammed
Shafin Rahman
VGen
123
0
0
02 Dec 2024
CopyrightShield: Spatial Similarity Guided Backdoor Defense against Copyright Infringement in Diffusion Models
Zhixiang Guo
Siyuan Liang
Aishan Liu
Dacheng Tao
AAML
135
3
0
02 Dec 2024
DLaVA: Document Language and Vision Assistant for Answer Localization with Enhanced Interpretability and Trustworthiness
Ahmad Mohammadshirazi
Pinaki Prasad Guha Neogi
Ser-Nam Lim
R. Ramnath
126
1
0
29 Nov 2024
Improving Accuracy and Generalization for Efficient Visual Tracking
Ram J. Zaveri
Shivang Patel
Yu Gu
Gianfranco Doretto
VLM
159
0
0
28 Nov 2024
HUPE: Heuristic Underwater Perceptual Enhancement with Semantic Collaborative Learning
Zengxi Zhang
Zhiying Jiang
Long Ma
Jinyuan Liu
Xin-Yue Fan
Risheng Liu
160
3
0
27 Nov 2024
ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
Qing Jiang
Gen Luo
Yuqin Yang
Yuda Xiong
Yihao Chen
Zhaoyang Zeng
Tianhe Ren
Lei Zhang
VLM
LRM
212
10
0
27 Nov 2024
Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models
Ronghuan Wu
Wanchao Su
Jing Liao
DiffM
128
4
0
25 Nov 2024
Leverage Task Context for Object Affordance Ranking
Haojie Huang
Hongchen Luo
Wei-dong Zhai
Yang Cao
Zheng-jun Zha
137
0
0
25 Nov 2024
Corner2Net: Detecting Objects as Cascade Corners
Chenglong Liu
Jintao Liu
Haorao Wei
Jinze Yang
Liangyu Xu
Yuchen Guo
Lu Fang
90
0
0
24 Nov 2024
MambaTrack: Exploiting Dual-Enhancement for Night UAV Tracking
Chunhui Zhang
Li Liu
Hao Wen
Xi Zhou
Yijiao Wang
Mamba
162
2
0
24 Nov 2024
3D Reconstruction by Looking: Instantaneous Blind Spot Detector for Indoor SLAM through Mixed Reality
Hanbeom Chang
Jongseong Brad Choi
C. Yeum
98
0
0
19 Nov 2024
WoodYOLO: A Novel Object Detector for Wood Species Detection in Microscopic Images
Lars Nieradzik
Henrike Stephani
Jördis Sieburg-Rockel
Stephanie Helmling
Andrea Olbrich
Stephanie Wrage
J. Keuper
112
0
0
18 Nov 2024
Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection
Wentao Bao
Keqin Li
Yuxiao Chen
Deep Patel
Martin Renqiang Min
Yu Kong
VLM
ObjD
96
2
0
17 Nov 2024
RETR: Multi-View Radar Detection Transformer for Indoor Perception
Ryoma Yataka
Adriano Cardace
Peng Wang
P. Boufounos
R. Takahashi
154
2
0
15 Nov 2024
Grounded Video Caption Generation
Evangelos Kazakos
Cordelia Schmid
Josef Sivic
71
0
0
12 Nov 2024
AutoGameUI: Constructing High-Fidelity Game UIs via Multimodal Learning and Interactive Web-Based Tool
Zhongliang Tang
Mengchen Tan
Fei Xia
Qingrong Cheng
Hao Jiang
Yize Zhang
53
0
0
06 Nov 2024
SIRA: Scalable Inter-frame Relation and Association for Radar Perception
Ryoma Yataka
Peng Wang
P. Boufounos
R. Takahashi
92
5
0
04 Nov 2024
Polar R-CNN: End-to-End Lane Detection with Fewer Anchors
Shengqi Wang
Junmin Liu
Xiangyong Cao
Zengjie Song
Kai Sun
113
1
0
03 Nov 2024
Is Multiple Object Tracking a Matter of Specialization?
Gianluca Mancusi
Mattia Bernardi
Aniello Panariello
Angelo Porrello
Rita Cucchiara
Simone Calderara
MoMe
98
2
0
01 Nov 2024
LAM-YOLO: Drones-based Small Object Detection on Lighting-Occlusion Attention Mechanism YOLO
Yuchen Zheng
Yuxin Jing
Jufeng Zhao
Guangmang Cui
ObjD
89
1
0
01 Nov 2024
GigaCheck: Detecting LLM-generated Content
Irina Tolstykh
Aleksandra Tsybina
Sergey Yakubson
Aleksandr Gordeev
Vladimir Dokholyan
Maksim Kuprashevich
DeLMO
86
2
0
31 Oct 2024
Phrase Decoupling Cross-Modal Hierarchical Matching and Progressive Position Correction for Visual Grounding
Minghong Xie
Ming Wang
Huafeng Li
Yafei Zhang
Dapeng Tao
Z. Yu
ObjD
58
1
0
31 Oct 2024
Unbiased Regression Loss for DETRs
Edric
Ueta Daisuke
Kurokawa Yukimasa
Karlekar Jayashree
Sugiri Pranata
66
0
0
30 Oct 2024
PK-YOLO: Pretrained Knowledge Guided YOLO for Brain Tumor Detection in Multiplanar MRI Slices
Ming Kang
F. F. Ting
Raphaël C.-W. Phan
C. Ting
ViT
MedIm
173
1
0
29 Oct 2024
Referring Human Pose and Mask Estimation in the Wild
Bo Miao
Mingtao Feng
Zijie Wu
Mohammed Bennamoun
Yongsheng Gao
Ajmal Mian
86
0
0
27 Oct 2024
AlphaChimp: Tracking and Behavior Recognition of Chimpanzees
Xiaoxuan Ma
Yutang Lin
Yuan Xu
Stephan P. Kaufhold
Jack Terwilliger
Andres Meza
Yixin Zhu
Federico Rossano
Yizhou Wang
122
0
0
22 Oct 2024
Sketch2Code: Evaluating Vision-Language Models for Interactive Web Design Prototyping
Ryan Li
Yanzhe Zhang
Diyi Yang
3DV
59
5
0
21 Oct 2024
A Paradigm Shift in Mouza Map Vectorization: A Human-Machine Collaboration Approach
Mahir Shahriar Dhrubo
Samira Akter
Anwarul Bashir Shuaib
Md Toki Tahmid
Zahid Hasan
A. B. M. Alim Al Islam
73
0
0
21 Oct 2024
SINGAPO: Single Image Controlled Generation of Articulated Parts in Objects
Jiayi Liu
Denys Iliash
Angel X. Chang
Manolis Savva
Ali Mahdavi-Amiri
163
13
0
21 Oct 2024
ContextDet: Temporal Action Detection with Adaptive Context Aggregation
Ning Wang
Yun Xiao
Xiaopeng Peng
Xiaojun Chang
Xuanhong Wang
Dingyi Fang
102
2
0
20 Oct 2024
Let Me Finish My Sentence: Video Temporal Grounding with Holistic Text Understanding
Jongbhin Woo
H. Ryu
Youngjoon Jang
Jae-Won Cho
Joon Son Chung
69
1
0
17 Oct 2024
VividMed: Vision Language Model with Versatile Visual Grounding for Medicine
Lingxiao Luo
Bingda Tang
Xuanzhong Chen
Rong Han
Ting Chen
VLM
93
3
0
16 Oct 2024
Multiview Scene Graph
Juexiao Zhang
Gao Zhu
Sihang Li
Xinhao Liu
Haorui Song
Xinran Tang
Chen Feng
3DV
75
2
0
15 Oct 2024
Point Cloud Mixture-of-Domain-Experts Model for 3D Self-supervised Learning
Yaohua Zha
Tao Dai
Yanzi Wang
Hang Guo
Bin Chen
Zhihao Ouyang
Chunlin Fan
3DPC
88
1
0
13 Oct 2024
Token Pruning using a Lightweight Background Aware Vision Transformer
Sudhakar Sah
Ravish Kumar
Honnesh Rohmetra
Ehsan Saboori
ViT
122
1
0
12 Oct 2024
OneRef: Unified One-tower Expression Grounding and Segmentation with Mask Referring Modeling
Linhui Xiao
Xiaoshan Yang
Fang Peng
Yaowei Wang
Changsheng Xu
ObjD
122
7
0
10 Oct 2024
Saliency-Guided DETR for Moment Retrieval and Highlight Detection
Aleksandr Gordeev
Vladimir Dokholyan
Irina Tolstykh
Maksim Kuprashevich
77
7
0
02 Oct 2024
KPCA-CAM: Visual Explainability of Deep Computer Vision Models using Kernel PCA
Sachin Karmani
Thanushon Sivakaran
Gaurav Prasad
Mehmet Ali
Wenbo Yang
Sheyang Tang
FAtt
105
4
0
30 Sep 2024
Improving Visual Object Tracking through Visual Prompting
Shih-Fang Chen
Jun-Cheng Chen
I-Hong Jhuo
Yen-Yu Lin
VLM
84
1
0
27 Sep 2024
SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion
Ming Dai
Lingfeng Yang
Yihao Xu
Zhenhua Feng
Wankou Yang
ObjD
125
13
0
26 Sep 2024
MorphoSeg: An Uncertainty-Aware Deep Learning Method for Biomedical Segmentation of Complex Cellular Morphologies
Tianhao Zhang
Heather J. McCourty
Berardo M. Sanchez-Tafolla
Anton Nikolaev
Lyudmila Mihaylova
73
0
0
25 Sep 2024
OW-Rep: Open World Object Detection with Instance Representation Learning
Sunoh Lee
Minsik Jeon
Jihong Min
Junwon Seo
ObjD
488
0
0
24 Sep 2024
Language-based Audio Moment Retrieval
Hokuto Munakata
Taichi Nishimura
Shota Nakada
Tatsuya Komatsu
131
2
0
24 Sep 2024
Provably Efficient Exploration in Inverse Constrained Reinforcement Learning
Bo Yue
Jian Li
Guiliang Liu
124
3
0
24 Sep 2024
MCTrack: A Unified 3D Multi-Object Tracking Framework for Autonomous Driving
Xiyang Wang
Shouzheng Qi
Jieyou Zhao
Hangning Zhou
Siyu Zhang
...
Kai Tu
Songlin Guo
Jianbo Zhao
Jian Li
Mu Yang
VOT
97
6
0
23 Sep 2024
Region Prompt Tuning: Fine-grained Scene Text Detection Utilizing Region Text Prompt
Xingtao Lin
Heqian Qiu
Lanxiao Wang
Ruihang Wang
Linfeng Xu
Hongliang Li
VLM
51
0
0
20 Sep 2024