Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2109.08039
Cited By
v1
v2 (latest)
A Survey on Temporal Sentence Grounding in Videos
16 September 2021
Xiaohan Lan
Yitian Yuan
Xin Eric Wang
Zhi Wang
Wenwu Zhu
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"A Survey on Temporal Sentence Grounding in Videos"
34 / 84 papers shown
Title
Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos
Zhu Zhang
Zhijie Lin
Zhou Zhao
Zhenxin Xiao
49
213
0
06 Jun 2019
Tripping through time: Efficient Localization of Activities in Videos
Meera Hahn
Asim Kadav
James M. Rehg
H. Graf
70
86
0
22 Apr 2019
Heterogeneous Memory Enhanced Multimodal Attention Model for Video Question Answering
Chenyou Fan
Xiaofan Zhang
Shu Zhang
Wensheng Wang
Chi Zhang
Heng-Chiao Huang
52
278
0
08 Apr 2019
Weakly Supervised Video Moment Retrieval From Text Queries
Niluthpol Chowdhury Mithun
S. Paul
Amit K. Roy-Chowdhury
117
194
0
05 Apr 2019
ExCL: Extractive Clip Localization Using Natural Language Descriptions
Soham Ghosh
Anuva Agarwal
Zarana Parekh
Alexander G. Hauptmann
CLIP
51
152
0
04 Apr 2019
Read, Watch, and Move: Reinforcement Learning for Temporally Grounding Natural Language Descriptions in Videos
Dongliang He
Xiang Zhao
Jizhou Huang
Fu Li
Xiao-Chang Liu
Shilei Wen
63
153
0
21 Jan 2019
Weakly Supervised Dense Event Captioning in Videos
Xuguang Duan
Wen-bing Huang
Chuang Gan
Jingdong Wang
Wenwu Zhu
Junzhou Huang
69
150
0
10 Dec 2018
MAN: Moment Alignment Network for Natural Language Moment Retrieval via Iterative Graph Adjustment
Da Zhang
Xiyang Dai
Xin Eric Wang
Yuan-fang Wang
L. Davis
68
305
0
30 Nov 2018
MAC: Mining Activity Concepts for Language-based Temporal Localization
Runzhou Ge
J. Gao
Kan Chen
Ram Nevatia
73
179
0
21 Nov 2018
Exploring Visual Relationship for Image Captioning
Ting Yao
Yingwei Pan
Yehao Li
Tao Mei
76
834
0
19 Sep 2018
TVQA: Localized, Compositional Video Question Answering
Muhammad Abdul Wahab
Licheng Yu
Mounir Nasr Allah
Tamara L. Berg
90
640
0
05 Sep 2018
Localizing Moments in Video with Temporal Language
Lisa Anne Hendricks
Oliver Wang
Eli Shechtman
Josef Sivic
Trevor Darrell
Bryan C. Russell
79
159
0
05 Sep 2018
End-to-End Audio Visual Scene-Aware Dialog using Multimodal Attention-Based Video Features
Chiori Hori
Huda AlAmri
Jue Wang
Gordon Wichern
Takaaki Hori
...
Raphael Gontijo-Lopes
Abhishek Das
Irfan Essa
Dhruv Batra
Devi Parikh
VGen
64
125
0
21 Jun 2018
To Find Where You Talk: Temporal Sentence Localization in Video with Attention Based Location Regression
Yitian Yuan
Tao Mei
Wenwu Zhu
78
333
0
19 Apr 2018
Multilevel Language and Vision Integration for Text-to-Clip Retrieval
Huijuan Xu
Kun He
Bryan A. Plummer
Leonid Sigal
Stan Sclaroff
Kate Saenko
CLIP
63
323
0
13 Apr 2018
MAttNet: Modular Attention Network for Referring Expression Comprehension
Licheng Yu
Zhe Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Joey Tianyi Zhou
Tamara L. Berg
ObjD
97
828
0
24 Jan 2018
Single Shot Temporal Action Detection
Tianwei Lin
Xu Zhao
Zheng Shou
77
455
0
17 Oct 2017
Localizing Moments in Video with Natural Language
Lisa Anne Hendricks
Oliver Wang
Eli Shechtman
Josef Sivic
Trevor Darrell
Bryan C. Russell
115
946
0
04 Aug 2017
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
121
4,216
0
25 Jul 2017
Query-Focused Video Summarization: Dataset, Evaluation, and A Memory Network Based Approach
Aidean Sharghi
Jacob S. Laurel
Boqing Gong
EgoV
92
137
0
16 Jul 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
701
131,652
0
12 Jun 2017
TALL: Temporal Activity Localization via Language Query
J. Gao
Chen Sun
Zhenheng Yang
Ram Nevatia
123
820
0
05 May 2017
Dense-Captioning Events in Videos
Ranjay Krishna
Kenji Hata
F. Ren
Li Fei-Fei
Juan Carlos Niebles
136
1,244
0
02 May 2017
Reading Wikipedia to Answer Open-Domain Questions
Danqi Chen
Adam Fisch
Jason Weston
Antoine Bordes
RALM
114
2,015
0
31 Mar 2017
Video Captioning with Transferred Semantic Attributes
Yingwei Pan
Ting Yao
Houqiang Li
Tao Mei
63
329
0
23 Nov 2016
Boosting Image Captioning with Attributes
Ting Yao
Yingwei Pan
Yehao Li
Zhaofan Qiu
Tao Mei
VLM
89
622
0
05 Nov 2016
Semi-Supervised Classification with Graph Convolutional Networks
Thomas Kipf
Max Welling
GNN
SSL
641
29,076
0
09 Sep 2016
Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding
Gunnar Sigurdsson
Gül Varol
Xinyu Wang
Ali Farhadi
Ivan Laptev
Abhinav Gupta
VGen
104
1,245
0
06 Apr 2016
End-to-end Learning of Action Detection from Frame Glimpses in Videos
Serena Yeung
Olga Russakovsky
Greg Mori
Li Fei-Fei
EgoV
106
608
0
22 Nov 2015
Natural Language Object Retrieval
Ronghang Hu
Huazhe Xu
Marcus Rohrbach
Jiashi Feng
Kate Saenko
Trevor Darrell
ObjD
94
553
0
13 Nov 2015
Jointly Modeling Embedding and Translation to Bridge Video and Language
Yingwei Pan
Tao Mei
Ting Yao
Houqiang Li
Y. Rui
77
534
0
07 May 2015
VQA: Visual Question Answering
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
202
5,478
0
03 May 2015
Microsoft COCO Captions: Data Collection and Evaluation Server
Xinlei Chen
Hao Fang
Nayeon Lee
Ramakrishna Vedantam
Saurabh Gupta
Piotr Dollar
C. L. Zitnick
215
2,478
0
01 Apr 2015
Rich feature hierarchies for accurate object detection and semantic segmentation
Ross B. Girshick
Jeff Donahue
Trevor Darrell
Jitendra Malik
ObjD
289
26,193
0
11 Nov 2013
Previous
1
2