Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2403.11463
Cited By
Siamese Learning with Joint Alignment and Regression for Weakly-Supervised Video Paragraph Grounding
18 March 2024
Chaolei Tan
Jian-Huang Lai
Wei-Shi Zheng
Jianfang Hu
AI4TS
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Siamese Learning with Joint Alignment and Regression for Weakly-Supervised Video Paragraph Grounding"
12 / 12 papers shown
Title
Object-Shot Enhanced Grounding Network for Egocentric Video
Yisen Feng
Haoyu Zhang
Meng Liu
Weili Guan
Liqiang Nie
38
0
0
07 May 2025
Collaborative Temporal Consistency Learning for Point-supervised Natural Language Video Localization
Zhuo Tao
Liang Li
Qi Chen
Yunbin Tu
Zheng-Jun Zha
Ming-Hsuan Yang
Yuankai Qi
Qingming Huang
45
0
0
22 Mar 2025
Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding
Xin Gu
Yaojie Shen
Chenxi Luo
Tiejian Luo
Yan Huang
Yuewei Lin
Heng Fan
L. Zhang
63
1
0
16 Feb 2025
Dual-task Mutual Reinforcing Embedded Joint Video Paragraph Retrieval and Grounding
M. Wang
Huafeng Li
Yafei Zhang
Jinxing Li
Minghong Xie
Dapeng Tao
AI4TS
67
2
0
26 Nov 2024
SynopGround: A Large-Scale Dataset for Multi-Paragraph Video Grounding from TV Dramas and Synopses
Chaolei Tan
Zihang Lin
Junfu Pu
Zhongang Qi
Wei-Yi Pei
Zhi Qu
Yexin Wang
Ying Shan
Wei-Shi Zheng
Jianfang Hu
AI4TS
43
0
0
03 Aug 2024
Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
Wenhao Wu
Haipeng Luo
Bo Fang
Jingdong Wang
Wanli Ouyang
95
80
0
31 Dec 2022
DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR
Shilong Liu
Feng Li
Hao Zhang
X. Yang
Xianbiao Qi
Hang Su
Jun Zhu
Lei Zhang
ViT
138
728
0
28 Jan 2022
Self-supervised Learning for Semi-supervised Temporal Language Grounding
Fan Luo
Shaoxiang Chen
Jingjing Chen
Zuxuan Wu
Yu-Gang Jiang
VLM
49
11
0
23 Sep 2021
End-to-End Dense Video Grounding via Parallel Regression
Fengyuan Shi
Weilin Huang
Limin Wang
37
10
0
23 Sep 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
311
5,773
0
29 Apr 2021
Natural Language Video Localization: A Revisit in Span-based Question Answering Framework
Hao Zhang
Aixin Sun
Wei Jing
Liangli Zhen
Joey Tianyi Zhou
Rick Siow Mong Goh
111
84
0
26 Feb 2021
Multi-modal Transformer for Video Retrieval
Valentin Gabeur
Chen Sun
Alahari Karteek
Cordelia Schmid
ViT
412
595
0
21 Jul 2020
1