A Survey on Temporal Sentence Grounding in Videos

16 September 2021
Xiaohan Lan
Yitian Yuan
Xin Eric Wang
Zhi Wang
Wenwu Zhu

Papers citing "A Survey on Temporal Sentence Grounding in Videos"

34 / 84 papers shown
Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos
Zhu Zhang
Zhijie Lin
Zhou Zhao
Zhenxin Xiao
49
213
0
06 Jun 2019
Tripping through time: Efficient Localization of Activities in Videos
Meera Hahn
Asim Kadav
James M. Rehg
H. Graf
70
86
0
22 Apr 2019
Heterogeneous Memory Enhanced Multimodal Attention Model for Video Question Answering
Chenyou Fan
Xiaofan Zhang
Shu Zhang
Wensheng Wang
Chi Zhang
Heng-Chiao Huang
52
278
0
08 Apr 2019
Weakly Supervised Video Moment Retrieval From Text Queries
Niluthpol Chowdhury Mithun
S. Paul
Amit K. Roy-Chowdhury
117
194
0
05 Apr 2019
ExCL: Extractive Clip Localization Using Natural Language Descriptions
Soham Ghosh
Anuva Agarwal
Zarana Parekh
Alexander G. Hauptmann
CLIP
51
152
0
04 Apr 2019
Read, Watch, and Move: Reinforcement Learning for Temporally Grounding Natural Language Descriptions in Videos
Dongliang He
Xiang Zhao
Jizhou Huang
Fu Li
Xiao-Chang Liu
Shilei Wen
63
153
0
21 Jan 2019
Weakly Supervised Dense Event Captioning in Videos
Xuguang Duan
Wen-bing Huang
Chuang Gan
Jingdong Wang
Wenwu Zhu
Junzhou Huang
69
150
0
10 Dec 2018
MAN: Moment Alignment Network for Natural Language Moment Retrieval via Iterative Graph Adjustment
Da Zhang
Xiyang Dai
Xin Eric Wang
Yuan-fang Wang
L. Davis
68
305
0
30 Nov 2018
MAC: Mining Activity Concepts for Language-based Temporal Localization
Runzhou Ge
J. Gao
Kan Chen
Ram Nevatia
73
179
0
21 Nov 2018
Exploring Visual Relationship for Image Captioning
Ting Yao
Yingwei Pan
Yehao Li
Tao Mei
76
834
0
19 Sep 2018
TVQA: Localized, Compositional Video Question Answering
Jie Lei
Licheng Yu
Mohit Bansal
Tamara L. Berg
90
640
0
05 Sep 2018
Localizing Moments in Video with Temporal Language
Lisa Anne Hendricks
Oliver Wang
Eli Shechtman
Josef Sivic
Trevor Darrell
Bryan C. Russell
79
159
0
05 Sep 2018
End-to-End Audio Visual Scene-Aware Dialog using Multimodal Attention-Based Video Features
Chiori Hori
Huda AlAmri
Jue Wang
Gordon Wichern
Takaaki Hori
...
Raphael Gontijo-Lopes
Abhishek Das
Irfan Essa
Dhruv Batra
Devi Parikh
VGen
64
125
0
21 Jun 2018
To Find Where You Talk: Temporal Sentence Localization in Video with Attention Based Location Regression
Yitian Yuan
Tao Mei
Wenwu Zhu
78
333
0
19 Apr 2018
Multilevel Language and Vision Integration for Text-to-Clip Retrieval
Huijuan Xu
Kun He
Bryan A. Plummer
Leonid Sigal
Stan Sclaroff
Kate Saenko
CLIP
63
323
0
13 Apr 2018
MAttNet: Modular Attention Network for Referring Expression Comprehension
Licheng Yu
Zhe Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Mohit Bansal
Tamara L. Berg
ObjD
97
828
0
24 Jan 2018
Single Shot Temporal Action Detection
Tianwei Lin
Xu Zhao
Zheng Shou
77
455
0
17 Oct 2017
Localizing Moments in Video with Natural Language
Lisa Anne Hendricks
Oliver Wang
Eli Shechtman
Josef Sivic
Trevor Darrell
Bryan C. Russell
115
946
0
04 Aug 2017
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
121
4,216
0
25 Jul 2017
Query-Focused Video Summarization: Dataset, Evaluation, and A Memory Network Based Approach
Aidean Sharghi
Jacob S. Laurel
Boqing Gong
EgoV
92
137
0
16 Jul 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
701
131,652
0
12 Jun 2017
TALL: Temporal Activity Localization via Language Query
J. Gao
Chen Sun
Zhenheng Yang
Ram Nevatia
123
820
0
05 May 2017
Dense-Captioning Events in Videos
Ranjay Krishna
Kenji Hata
F. Ren
Li Fei-Fei
Juan Carlos Niebles
136
1,244
0
02 May 2017
Reading Wikipedia to Answer Open-Domain Questions
Danqi Chen
Adam Fisch
Jason Weston
Antoine Bordes
RALM
114
2,015
0
31 Mar 2017
Video Captioning with Transferred Semantic Attributes
Yingwei Pan
Ting Yao
Houqiang Li
Tao Mei
63
329
0
23 Nov 2016
Boosting Image Captioning with Attributes
Ting Yao
Yingwei Pan
Yehao Li
Zhaofan Qiu
Tao Mei
VLM
89
622
0
05 Nov 2016
Semi-Supervised Classification with Graph Convolutional Networks
Thomas Kipf
Max Welling
GNN, SSL
641
29,076
0
09 Sep 2016
Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding
Gunnar Sigurdsson
Gül Varol
Xiaolong Wang
Ali Farhadi
Ivan Laptev
Abhinav Gupta
VGen
104
1,245
0
06 Apr 2016
End-to-end Learning of Action Detection from Frame Glimpses in Videos
Serena Yeung
Olga Russakovsky
Greg Mori
Li Fei-Fei
EgoV
106
608
0
22 Nov 2015
Natural Language Object Retrieval
Ronghang Hu
Huazhe Xu
Marcus Rohrbach
Jiashi Feng
Kate Saenko
Trevor Darrell
ObjD
94
553
0
13 Nov 2015
Jointly Modeling Embedding and Translation to Bridge Video and Language
Yingwei Pan
Tao Mei
Ting Yao
Houqiang Li
Y. Rui
77
534
0
07 May 2015
VQA: Visual Question Answering
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
202
5,478
0
03 May 2015
Microsoft COCO Captions: Data Collection and Evaluation Server
Xinlei Chen
Hao Fang
Tsung-Yi Lin
Ramakrishna Vedantam
Saurabh Gupta
Piotr Dollar
C. L. Zitnick
215
2,478
0
01 Apr 2015
Rich feature hierarchies for accurate object detection and semantic segmentation
Ross B. Girshick
Jeff Donahue
Trevor Darrell
Jitendra Malik
ObjD
289
26,193
0
11 Nov 2013