1705.02101
TALL: Temporal Activity Localization via Language Query
5 May 2017
J. Gao
Chen Sun
Zhenheng Yang
Ram Nevatia
Papers citing
"TALL: Temporal Activity Localization via Language Query"
50 / 433 papers shown
You Can Ground Earlier than See: An Effective and Efficient Pipeline for Temporal Sentence Grounding in Compressed Videos
Xiang Fang
Daizong Liu
Pan Zhou
Guoshun Nan
86
39
0
14 Mar 2023
Generation-Guided Multi-Level Unified Network for Video Grounding
Xingyi Cheng
Xiangyu Wu
Dong Shen
Hezheng Lin
Fan Yang
63
0
0
14 Mar 2023
Align and Attend: Multimodal Summarization with Dual Contrastive Losses
Bo He
Jun Wang
Jielin Qiu
Trung Bui
Abhinav Shrivastava
Zhaowen Wang
85
71
0
13 Mar 2023
Towards Diverse Temporal Grounding under Single Positive Labels
Hao Zhou
Chongyang Zhang
Yanjun Chen
Chuanping Hu
69
1
0
12 Mar 2023
Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos
Teng Wang
Jinrui Zhang
Feng Zheng
Wenhao Jiang
Ran Cheng
Ping Luo
VLM
82
11
0
11 Mar 2023
Text-Visual Prompting for Efficient 2D Temporal Video Grounding
Yimeng Zhang
Xin Chen
Jinghan Jia
Sijia Liu
Ke Ding
89
27
0
09 Mar 2023
Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training
Dezhao Luo
Jiabo Huang
S. Gong
Hailin Jin
Yang Liu
VGen
89
30
0
28 Feb 2023
Deep Visual Forced Alignment: Learning to Align Transcription with Talking Face Video
Minsu Kim
Chae Won Kim
Y. Ro
CVBM
DiffM
64
3
0
27 Feb 2023
Localizing Moments in Long Video Via Multimodal Guidance
Wayner Barrios
Mattia Soldan
Alberto M. Ceballos-Arroyo
Fabian Caba Heilbron
Guohao Li
89
21
0
26 Feb 2023
Tracking Objects and Activities with Attention for Temporal Sentence Grounding
Zeyu Xiong
Daizong Liu
Pan Zhou
Jiahao Zhu
59
5
0
21 Feb 2023
Constraint and Union for Partially-Supervised Temporal Sentence Grounding
Chen Ju
Haicheng Wang
Jinxian Liu
Chaofan Ma
Ya Zhang
Peisen Zhao
Jianlong Chang
Qi Tian
49
15
0
20 Feb 2023
MINOTAUR: Multi-task Video Grounding From Multimodal Queries
Raghav Goyal
E. Mavroudi
Xitong Yang
Sainbayar Sukhbaatar
Leonid Sigal
Matt Feiszli
Lorenzo Torresani
Du Tran
95
7
0
16 Feb 2023
Multi-video Moment Ranking with Multimodal Clue
Danyang Hou
Liang Pang
Yanyan Lan
Huawei Shen
Xueqi Cheng
48
1
0
29 Jan 2023
Variational Cross-Graph Reasoning and Adaptive Structured Semantics Learning for Compositional Temporal Grounding
Juncheng Li
Siliang Tang
Linchao Zhu
Wenqiao Zhang
Yi Yang
Tat-Seng Chua
Fei Wu
Yueting Zhuang
BDL
81
17
0
22 Jan 2023
Exploiting Auxiliary Caption for Video Grounding
Hongxiang Li
Meng Cao
Xuxin Cheng
Zhihong Zhu
Yaowei Li
Yuexian Zou
70
10
0
15 Jan 2023
Hypotheses Tree Building for One-Shot Temporal Sentence Localization
Daizong Liu
Xiang Fang
Pan Zhou
Xing Di
Weining Lu
Yu Cheng
79
19
0
05 Jan 2023
NaQ: Leveraging Narrations as Queries to Supervise Episodic Memory
Santhosh Kumar Ramakrishnan
Ziad Al-Halah
Kristen Grauman
207
42
0
02 Jan 2023
Rethinking the Video Sampling and Reasoning Strategies for Temporal Sentence Grounding
Jiahao Zhu
Daizong Liu
Pan Zhou
Xing Di
Yu Cheng
...
Wenzheng Xu
Zichuan Xu
Yao Wan
Lichao Sun
Zeyu Xiong
80
18
0
02 Jan 2023
MRTNet: Multi-Resolution Temporal Network for Video Sentence Grounding
Wei Ji
Long Chen
Yin-wei Wei
Yiming Wu
Tat-Seng Chua
AI4TS
72
19
0
26 Dec 2022
Distilling Vision-Language Pre-training to Collaborate with Weakly-Supervised Temporal Action Localization
Chen Ju
Kunhao Zheng
Jinxian Liu
Peisen Zhao
Ya Zhang
Jianlong Chang
Yanfeng Wang
Qi Tian
58
11
0
19 Dec 2022
SimVTP: Simple Video Text Pre-training with Masked Autoencoders
Yue Ma
Tianyu Yang
Yin Shan
Xiu Li
88
27
0
07 Dec 2022
InternVideo-Ego4D: A Pack of Champion Solutions to Ego4D Challenges
Guo Chen
Sen Xing
Zhe Chen
Yi Wang
Kunchang Li
...
Hongjie Zhang
Tong Lu
Yali Wang
Liming Wang
Yu Qiao
82
49
0
17 Nov 2022
Soft-Landing Strategy for Alleviating the Task Discrepancy Problem in Temporal Action Localization Tasks
Hyolim Kang
Hanjung Kim
Joungbin An
Minsu Cho
Seon Joo Kim
79
5
0
11 Nov 2022
Zero-shot Video Moment Retrieval With Off-the-Shelf Models
Anuj Diwan
Puyuan Peng
Raymond J. Mooney
VLM
67
3
0
03 Nov 2022
FedVMR: A New Federated Learning method for Video Moment Retrieval
Yan Wang
Xin Luo
Zhen-Duo Chen
P. Zhang
Meng Liu
Xin-Shun Xu
FedML
73
3
0
28 Oct 2022
Language-free Training for Zero-shot Video Grounding
Dahye Kim
Jungin Park
Jiyoung Lee
S. Park
Kwanghoon Sohn
94
20
0
24 Oct 2022
Modal-specific Pseudo Query Generation for Video Corpus Moment Retrieval
Minjoon Jung
Seongho Choi
Joo-Kyung Kim
Jin-Hwa Kim
Byoung-Tak Zhang
95
10
0
23 Oct 2022
Weakly-Supervised Temporal Article Grounding
Long Chen
Yulei Niu
Brian Chen
Xudong Lin
G. Han
Christopher Thomas
Hammad A. Ayyubi
Heng Ji
Shih-Fu Chang
AI4TS
86
13
0
22 Oct 2022
Fine-grained Semantic Alignment Network for Weakly Supervised Temporal Language Grounding
Yuechen Wang
Wen-gang Zhou
Houqiang Li
AI4TS
58
13
0
21 Oct 2022
Selective Query-guided Debiasing for Video Corpus Moment Retrieval
Sunjae Yoon
Jiajing Hong
Eunseop Yoon
Dahyun Kim
Junyeong Kim
Hee Suk Yoon
Changdong Yoo
142
23
0
17 Oct 2022
Semantic Video Moments Retrieval at Scale: A New Task and a Baseline
Na Li
113
0
0
15 Oct 2022
ConTra: (Con)text (Tra)nsformer for Cross-Modal Video Retrieval
A. Fragomeni
Michael Wray
Dima Damen
CLIP
ViT
56
4
0
09 Oct 2022
Video Referring Expression Comprehension via Transformer with Content-aware Query
Ji Jiang
Meng Cao
Tengtao Song
Yuexian Zou
83
5
0
06 Oct 2022
Vision+X: A Survey on Multimodal Learning in the Light of Data
Ye Zhu
Yuehua Wu
N. Sebe
Yan Yan
105
19
0
05 Oct 2022
Locate before Answering: Answer Guided Question Localization for Video Question Answering
Tianwen Qian
Ran Cui
Jingjing Chen
Pai Peng
Xiao-Wei Guo
Yu-Gang Jiang
94
18
0
05 Oct 2022
Embracing Consistency: A One-Stage Approach for Spatio-Temporal Video Grounding
Yang Jin
Yongzhi Li
Zehuan Yuan
Yadong Mu
83
34
0
27 Sep 2022
Towards Parameter-Efficient Integration of Pre-Trained Language Models In Temporal Video Grounding
Erica K. Shimomoto
Edison Marrese-Taylor
Hiroya Takamura
Ichiro Kobayashi
Hideki Nakayama
Yusuke Miyao
81
7
0
26 Sep 2022
Multi-Modal Cross-Domain Alignment Network for Video Moment Retrieval
Xiang Fang
Daizong Liu
Pan Zhou
Yuchong Hu
199
43
0
23 Sep 2022
CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding
Zhijian Hou
Wanjun Zhong
Lei Ji
Difei Gao
Kun Yan
W. Chan
Chong-Wah Ngo
Zheng Shou
Nan Duan
AI4TS
118
26
0
22 Sep 2022
WildQA: In-the-Wild Video Question Answering
Santiago Castro
Naihao Deng
Pingxuan Huang
Mihai Burzo
Rada Mihalcea
152
7
0
14 Sep 2022
Frame-Subtitle Self-Supervision for Multi-Modal Video Question Answering
Jiong Wang
Zhou Zhao
Weike Jin
65
0
0
08 Sep 2022
An Empirical Study of End-to-End Video-Language Transformers with Masked Visual Modeling
Tsu-Jui Fu
Linjie Li
Zhe Gan
Kevin Qinghong Lin
William Yang Wang
Lijuan Wang
Zicheng Liu
VLM
130
65
0
04 Sep 2022
Video-Guided Curriculum Learning for Spoken Video Grounding
Yan Xia
Zhou Zhao
Shangwei Ye
Yang Zhao
Haoyuan Li
Yi Ren
77
11
0
01 Sep 2022
Hierarchical Local-Global Transformer for Temporal Sentence Grounding
Xiang Fang
Daizong Liu
Pan Zhou
Zichuan Xu
Rui Li
113
30
0
31 Aug 2022
Partially Relevant Video Retrieval
Jianfeng Dong
Xianke Chen
Minsong Zhang
Xun Yang
Shujie Chen
Xirong Li
Xun Wang
80
45
0
26 Aug 2022
Dilated Context Integrated Network with Cross-Modal Consensus for Temporal Emotion Localization in Videos
Juncheng Billy Li
Junlin Xie
Linchao Zhu
Long Qian
Siliang Tang
...
Haochen Shi
Shengyu Zhang
Longhui Wei
Qi Tian
Yueting Zhuang
70
13
0
03 Aug 2022
Video Question Answering with Iterative Video-Text Co-Tokenization
A. Piergiovanni
K. Morton
Weicheng Kuo
Michael S. Ryoo
A. Angelova
104
18
0
01 Aug 2022
Can Shuffling Video Benefit Temporal Bias Problem: A Novel Training Framework for Temporal Grounding
Jiachang Hao
Haifeng Sun
Pengfei Ren
Jingyu Wang
Q. Qi
J. Liao
95
26
0
29 Jul 2022
Reducing the Vision and Language Bias for Temporal Sentence Grounding
Daizong Liu
Xiaoye Qu
Wei Hu
105
50
0
27 Jul 2022
Skimming, Locating, then Perusing: A Human-Like Framework for Natural Language Video Localization
Daizong Liu
Wei Hu
101
40
0
27 Jul 2022