2004.13931
Cited By
Span-based Localizing Network for Natural Language Video Localization
29 April 2020
Hao Zhang
Aixin Sun
Wei Jing
Qiufeng Wang
Papers citing
"Span-based Localizing Network for Natural Language Video Localization"
50 / 179 papers shown
Title
Object-Shot Enhanced Grounding Network for Egocentric Video
Yisen Feng
Haoyu Zhang
Meng Liu
Weili Guan
Liqiang Nie
41
0
0
07 May 2025
Exploiting Inter-Sample Correlation and Intra-Sample Redundancy for Partially Relevant Video Retrieval
Junlong Ren
Gangjian Zhang
Y. Hu
Jian Shu
Haoran Wang
29
0
0
28 Apr 2025
Ask2Loc: Learning to Locate Instructional Visual Answers by Asking Questions
Chang Zong
Bin Li
Shoujun Zhou
Jian Wan
Lei Zhang
135
0
0
22 Apr 2025
VideoExpert: Augmented LLM for Temporal-Sensitive Video Understanding
Henghao Zhao
Ge-Peng Ji
Rui Yan
Huan Xiong
Zechao Li
24
0
0
10 Apr 2025
SVLTA: Benchmarking Vision-Language Temporal Alignment via Synthetic Video Situation
Hao Du
Bo Wu
Yan Lu
Zhendong Mao
27
0
0
08 Apr 2025
Moment Quantization for Video Temporal Grounding
Xiaolong Sun
Le Wang
Sanping Zhou
Liushuai Shi
Kun Xia
Mengnan Liu
Yabing Wang
Gang Hua
MQ
31
0
0
03 Apr 2025
Collaborative Temporal Consistency Learning for Point-supervised Natural Language Video Localization
Zhuo Tao
Liang Li
Qi Chen
Yunbin Tu
Zheng-Jun Zha
Ming-Hsuan Yang
Yuankai Qi
Qingming Huang
45
0
0
22 Mar 2025
EgoDTM: Towards 3D-Aware Egocentric Video-Language Pretraining
Boshen Xu
Yuting Mei
Xinbi Liu
Sipeng Zheng
Qin Jin
VLM
MDE
65
0
0
19 Mar 2025
VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning
Yong-Jin Liu
Kevin Qinghong Lin
C. Chen
Mike Zheng Shou
LM&Ro
LRM
84
0
0
17 Mar 2025
V-STaR: Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning
Zixu Cheng
Jian Hu
Ziquan Liu
Chenyang Si
Wei Li
Shaogang Gong
LRM
72
2
0
14 Mar 2025
Deep Understanding of Sign Language for Sign to Subtitle Alignment
Youngjoon Jang
Jeongsoo Choi
Junseok Ahn
Joon Son Chung
SLR
79
0
0
05 Mar 2025
Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning
Baoqi Pei
Y. Huang
Jilan Xu
Guo Chen
Yuping He
...
Yali Wang
Weidi Xie
Yu Qiao
Fei Wu
Limin Wang
41
0
0
02 Mar 2025
LD-DETR: Loop Decoder DEtection TRansformer for Video Moment Retrieval and Highlight Detection
Pengcheng Zhao
Zhixian He
Fuwei Zhang
Shujin Lin
Fan Zhou
42
1
0
18 Jan 2025
Length-Aware DETR for Robust Moment Retrieval
S. Park
Jiho Choi
Kyungjune Baek
Hyunjung Shim
36
0
0
31 Dec 2024
FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding
Zhuo Cao
Bingqing Zhang
Heming Du
Xin Yu
Xue Li
Sen Wang
67
1
0
18 Dec 2024
TimeRefine: Temporal Grounding with Time Refining Video LLM
Xizi Wang
Feng Cheng
Ziyang Wang
Huiyu Wang
Md. Mohaiminul Islam
Lorenzo Torresani
Joey Tianyi Zhou
Gedas Bertasius
David J. Crandall
109
1
0
12 Dec 2024
Streaming Detection of Queried Event Start
Cristobal Eyzaguirre
Eric Tang
S. Buch
Adrien Gaidon
Jiajun Wu
Juan Carlos Niebles
76
0
0
04 Dec 2024
VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval
Dhiman Paul
Md Rizwan Parvez
Nabeel Mohammed
Shafin Rahman
VGen
69
0
0
02 Dec 2024
Vid-Morp: Video Moment Retrieval Pretraining from Unlabeled Videos in the Wild
Peijun Bao
Chenqi Kong
Zihao Shao
Boon Poh Ng
Meng Hwa Er
Alex C. Kot
63
2
0
01 Dec 2024
On the Consistency of Video Large Language Models in Temporal Comprehension
Minjoon Jung
Junbin Xiao
Byoung-Tak Zhang
Angela Yao
87
2
0
20 Nov 2024
Grounded Video Caption Generation
Evangelos Kazakos
Cordelia Schmid
Josef Sivic
36
0
0
12 Nov 2024
Learning to Unify Audio, Visual and Text for Audio-Enhanced Multilingual Visual Answer Localization
Zhibin Wen
Bin Li
34
1
0
05 Nov 2024
Let Me Finish My Sentence: Video Temporal Grounding with Holistic Text Understanding
Jongbhin Woo
H. Ryu
Youngjoon Jang
Jae-Won Cho
Joon Son Chung
35
1
0
17 Oct 2024
Grounding is All You Need? Dual Temporal Grounding for Video Dialog
You Qin
Wei Ji
Xinze Lan
Hao Fei
Xun Yang
Dan Guo
Roger Zimmermann
Lizi Liao
VGen
41
0
0
08 Oct 2024
Frame-Voyager: Learning to Query Frames for Video Large Language Models
Sicheng Yu
Chengkai Jin
Huanyu Wang
Zhenghao Chen
Sheng Jin
...
Zhenbang Sun
Bingni Zhang
Jiawei Wu
Hao Zhang
Qianru Sun
67
5
0
04 Oct 2024
UAL-Bench: The First Comprehensive Unusual Activity Localization Benchmark
Hasnat Md Abdullah
Tian Liu
Kangda Wei
Shu Kong
Ruihong Huang
34
3
0
02 Oct 2024
ChatVTG: Video Temporal Grounding via Chat with Video Dialogue Large Language Models
Mengxue Qu
Xiaodong Chen
Wu Liu
Alicia Li
Yao Zhao
44
13
0
01 Oct 2024
Show and Guide: Instructional-Plan Grounded Vision and Language Model
Diogo Glória-Silva
David Semedo
João Magalhães
23
0
0
27 Sep 2024
Beyond Uncertainty: Evidential Deep Learning for Robust Video Temporal Grounding
Kaijing Ma
Haojian Huang
Jin Chen
Haodong Chen
Pengliang Ji
...
Han Fang
Chao Ban
Hao Sun
Mulin Chen
Xuelong Li
37
7
0
29 Aug 2024
Disentangle and denoise: Tackling context misalignment for video moment retrieval
Kaijing Ma
Han Fang
Xianghao Zang
Chao Ban
Lanxiang Zhou
Zhongjiang He
Yongxiang Li
Hao Sun
Zerun Feng
Xingsong Hou
57
1
0
14 Aug 2024
ActPrompt: In-Domain Feature Adaptation via Action Cues for Video Temporal Grounding
Yubin Wang
Xinyang Jiang
De Cheng
Dongsheng Li
Cairong Zhao
VLM
35
1
0
13 Aug 2024
Lighthouse: A User-Friendly Library for Reproducible Video Moment Retrieval and Highlight Detection
Taichi Nishimura
Shota Nakada
Hokuto Munakata
Tatsuya Komatsu
VLM
28
1
0
06 Aug 2024
Infusing Environmental Captions for Long-Form Video Language Grounding
Hyogun Lee
Soyeon Hong
Mujeen Sung
Jinwoo Choi
40
0
0
05 Aug 2024
Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval
Yiyang Jiang
Wengyu Zhang
Xu-Lu Zhang
Xiaoyong Wei
Chang Wen Chen
Qing Li
46
4
0
21 Jul 2024
Temporally Grounding Instructional Diagrams in Unconstrained Videos
Jiahao Zhang
Frederic Z. Zhang
Cristian Rodriguez
Yizhak Ben-Shabat
A. Cherian
Stephen Gould
39
2
0
16 Jul 2024
SHINE: Saliency-aware HIerarchical NEgative Ranking for Compositional Temporal Grounding
Zixu Cheng
Yujiang Pu
Shaogang Gong
Parisa Kordjamshidi
Yu Kong
AI4TS
30
0
0
06 Jul 2024
ReXTime: A Benchmark Suite for Reasoning-Across-Time in Videos
Jr-Jen Chen
Yu-Chien Liao
Hsi-Che Lin
Yu-Chu Yu
Yen-Chun Chen
Yu-Chiang Frank Wang
37
10
0
27 Jun 2024
LoongTrain: Efficient Training of Long-Sequence LLMs with Head-Context Parallelism
Diandian Gu
Peng Sun
Qinghao Hu
Ting Huang
Xun Chen
...
Jiarui Fang
Yonggang Wen
Tianwei Zhang
Xin Jin
Xuanzhe Liu
LRM
40
7
0
26 Jun 2024
EgoVideo: Exploring Egocentric Foundation Model and Downstream Adaptation
Baoqi Pei
Guo Chen
Jilan Xu
Yuping He
Yicheng Liu
...
Yifei Huang
Yali Wang
Tong Lu
Limin Wang
Yu Qiao
EgoV
42
14
0
26 Jun 2024
MLLM as Video Narrator: Mitigating Modality Imbalance in Video Moment Retrieval
Weitong Cai
Jiabo Huang
Shaogang Gong
Hailin Jin
Yang Liu
44
0
0
25 Jun 2024
ObjectNLQ @ Ego4D Episodic Memory Challenge 2024
Yisen Feng
Haoyu Zhang
Yuquan Xie
Zaijing Li
Meng Liu
Liqiang Nie
23
3
0
22 Jun 2024
CARLOR @ Ego4D Step Grounding Challenge: Bayesian temporal-order priors for test time refinement
Carlos Plou
Lorenzo Mur-Labadia
Ruben Martinez-Cantin
Ana C. Murillo
BDL
54
1
0
13 Jun 2024
SViTT-Ego: A Sparse Video-Text Transformer for Egocentric Video
Hector A. Valdez
Kyle Min
Subarna Tripathi
VLM
44
1
0
13 Jun 2024
Simplify Implant Depth Prediction as Video Grounding: A Texture Perceive Implant Depth Prediction Network
Xinquan Yang
Xuguang Li
Xiaoling Luo
Leilei Zeng
Yudi Zhang
Linlin Shen
Yongqiang Deng
MedIm
38
2
0
07 Jun 2024
Context-Enhanced Video Moment Retrieval with Large Language Models
Weijia Liu
Bo Miao
Jiuxin Cao
Xueling Zhu
Bo Liu
Mehwish Nasim
Ajmal Saeed Mian
31
2
0
21 May 2024
Prompt When the Animal is: Temporal Animal Behavior Grounding with Positional Recovery Training
Sheng Yan
Xin Du
Zongying Li
Yi Wang
Hongcang Jin
Mengyuan Liu
OOD
VLM
27
0
0
09 May 2024
MLP: Motion Label Prior for Temporal Sentence Localization in Untrimmed 3D Human Motions
Sheng Yan
Mengyuan Liu
Yong Wang
Yang Liu
Cheng Chen
Hong Liu
46
0
0
21 Apr 2024
Video sentence grounding with temporally global textual knowledge
Cai Chen
Runzhong Zhang
Jianjun Gao
Kejun Wu
Kim-Hui Yap
Yi Wang
32
0
0
21 Apr 2024
TrafficVLM: A Controllable Visual Language Model for Traffic Video Captioning
Quang Minh Dinh
Minh Khoi Ho
Anh Quan Dang
Hung Phong Tran
45
6
0
14 Apr 2024
Task-Driven Exploration: Decoupling and Inter-Task Feedback for Joint Moment Retrieval and Highlight Detection
Jin Yang
Ping Wei
Huan Li
Ziyang Ren
51
8
0
14 Apr 2024