TALL: Temporal Activity Localization via Language Query

5 May 2017
J. Gao
Chen Sun
Zhenheng Yang
Ram Nevatia

Papers citing "TALL: Temporal Activity Localization via Language Query"

50 / 433 papers shown
Title
Meta Spatio-Temporal Debiasing for Video Scene Graph Generation
Li Xu
Haoxuan Qu
Jason Kuen
Jiuxiang Gu
Jun Liu
CML
93
27
0
23 Jul 2022
EgoEnv: Human-centric environment representations from egocentric video
Tushar Nagarajan
Santhosh Kumar Ramakrishnan
Ruta Desai
James M. Hillis
Kristen Grauman
EgoV
109
20
0
22 Jul 2022
LocVTP: Video-Text Pre-training for Temporal Localization
Meng Cao
Tianyu Yang
Junwu Weng
Can Zhang
Jue Wang
Yuexian Zou
90
65
0
21 Jul 2022
Unifying Event Detection and Captioning as Sequence Generation via Pre-Training
Qi Zhang
Yuqing Song
Qin Jin
68
26
0
18 Jul 2022
Gaussian Kernel-based Cross Modal Network for Spatio-Temporal Video Grounding
Zeyu Xiong
Daizong Liu
Technology
21
8
0
02 Jul 2022
Video Activity Localisation with Uncertainties in Temporal Boundary
Jiabo Huang
Hailin Jin
S. Gong
Yang Liu
104
24
0
26 Jun 2022
Multimodal Dialogue State Tracking
Hung Le
Nancy F. Chen
Guosheng Lin
67
9
0
16 Jun 2022
Beyond Grounding: Extracting Fine-Grained Event Hierarchies Across Modalities
Hammad A. Ayyubi
Christopher Thomas
Lovish Chum
R. Lokesh
Long Chen
...
Xudong Lin
Xuande Feng
Jaywon Koo
Sounak Ray
Shih-Fu Chang
AI4TS
64
0
0
14 Jun 2022
Egocentric Video-Language Pretraining
Kevin Qinghong Lin
Alex Jinpeng Wang
Mattia Soldan
Michael Wray
Rui Yan
...
Hongfa Wang
Dima Damen
Guohao Li
Wei Liu
Mike Zheng Shou
VLM
EgoV
99
207
0
03 Jun 2022
You Need to Read Again: Multi-granularity Perception Network for Moment Retrieval in Videos
Xin Sun
Xinyu Wang
Jialin Gao
Qiong Liu
Xiaoping Zhou
89
34
0
25 May 2022
Entity-aware and Motion-aware Transformers for Language-driven Action Localization in Videos
Shuo Yang
Xinxiao Wu
77
15
0
12 May 2022
Contrastive Language-Action Pre-training for Temporal Localization
Mengmeng Xu
Erhan Gundogdu
Maksim
Guohao Li
M. Donoser
Loris Bazzani
100
27
0
26 Apr 2022
Video Moment Retrieval from Text Queries via Single Frame Annotation
Ran Cui
Tianwen Qian
Pai Peng
E. Daskalaki
Jingjing Chen
Xiao-Wei Guo
Huyang Sun
Yu-Gang Jiang
97
37
0
20 Apr 2022
Animal Kingdom: A Large and Diverse Dataset for Animal Behavior Understanding
Xun Long Ng
Kian Eng Ong
Qichen Zheng
Yun Ni
S. Yeo
Jing Liu
VGen
79
88
0
18 Apr 2022
Position-aware Location Regression Network for Temporal Video Grounding
Sunoh Kim
Kimin Yun
J. Choi
52
4
0
12 Apr 2022
Learning Commonsense-aware Moment-Text Alignment for Fast Video Temporal Grounding
Ziyue Wu
Junyu Gao
Shucheng Huang
Changsheng Xu
84
4
0
04 Apr 2022
GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval
Yuxuan Wang
Difei Gao
Licheng Yu
Stan Weixian Lei
Matt Feiszli
Mike Zheng Shou
98
25
0
01 Apr 2022
TubeDETR: Spatio-Temporal Video Grounding with Transformers
Antoine Yang
Antoine Miech
Josef Sivic
Ivan Laptev
Cordelia Schmid
ViT
106
95
0
30 Mar 2022
AxIoU: An Axiomatically Justified Measure for Video Moment Retrieval
Riku Togashi
Mayu Otani
Yuta Nakashima
Esa Rahtu
J. Heikkilä
T. Sakai
57
0
0
30 Mar 2022
Searching for fingerspelled content in American Sign Language
Bowen Shi
D. Brentari
G. Shakhnarovich
Karen Livescu
58
5
0
24 Mar 2022
Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence Learning
Juncheng Li
Junlin Xie
Long Qian
Linchao Zhu
Siliang Tang
Leilei Gan
Yi Yang
Yueting Zhuang
Xinze Wang
95
75
0
24 Mar 2022
UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection
Ye Liu
Siyuan Li
Yang Wu
C. Chen
Ying Shan
Xiaohu Qie
ViT
104
151
0
23 Mar 2022
How Do You Do It? Fine-Grained Action Understanding with Pseudo-Adverbs
Hazel Doughty
Cees G. M. Snoek
133
19
0
23 Mar 2022
Towards Visual-Prompt Temporal Answering Grounding in Medical Instructional Video
Bin Li
Yixuan Weng
Bin Sun
Shutao Li
135
33
0
13 Mar 2022
A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach
Xiaohan Lan
Yitian Yuan
Xin Eric Wang
Long Chen
Zhi Wang
Lin Ma
Wenwu Zhu
CML
70
16
0
10 Mar 2022
Multi-Scale Self-Contrastive Learning with Hard Negative Mining for Weakly-Supervised Query-based Video Grounding
Shentong Mo
Daizong Liu
Wei Hu
SSL
73
6
0
08 Mar 2022
Exploring Optical-Flow-Guided Motion and Detection-Based Appearance for Temporal Sentence Grounding
Daizong Liu
Xiang Fang
Wei Hu
Pan Zhou
98
37
0
06 Mar 2022
When Did It Happen? Duration-informed Temporal Localization of Narrated Actions in Vlogs
Oana Ignat
Santiago Castro
Yuhang Zhou
Jiajun Bao
Dandan Shan
Rada Mihalcea
46
3
0
16 Feb 2022
Explore-And-Match: Bridging Proposal-Based and Proposal-Free With Transformer for Sentence Grounding in Videos
Sangmin Woo
Jinyoung Park
Inyong Koo
Sumin Lee
Minki Jeong
Changick Kim
86
4
0
25 Jan 2022
Temporal Sentence Grounding in Videos: A Survey and Future Directions
Hao Zhang
Aixin Sun
Wei Jing
Qiufeng Wang
3DGS
98
41
0
20 Jan 2022
Unsupervised Temporal Video Grounding with Deep Semantic Clustering
Daizong Liu
Xiaoye Qu
Yinzhen Wang
Xing Di
Kai Zou
Yu Cheng
Zichuan Xu
Pan Zhou
97
51
0
14 Jan 2022
Learning Sample Importance for Cross-Scenario Video Temporal Grounding
P. Bao
Yadong Mu
72
13
0
08 Jan 2022
Exploring Motion and Appearance Information for Temporal Sentence Grounding
Daizong Liu
Xiaoye Qu
Pan Zhou
Yang Liu
92
42
0
03 Jan 2022
Memory-Guided Semantic Learning Network for Temporal Sentence Grounding
Daizong Liu
Xiaoye Qu
Xing Di
Yu Cheng
Zichuan Xu
Pan Zhou
107
60
0
03 Jan 2022
LocFormer: Enabling Transformers to Perform Temporal Moment Localization on Long Untrimmed Videos With a Feature Sampling Approach
Cristian Rodriguez-Opazo
Edison Marrese-Taylor
Basura Fernando
Hiroya Takamura
Qi Wu
ViT
41
3
0
19 Dec 2021
Progressive Attention on Multi-Level Dense Difference Maps for Generic Event Boundary Detection
Jiaqi Tang
Zhaoyang Liu
Chao Qian
Wayne Wu
Limin Wang
96
18
0
09 Dec 2021
Classification-Then-Grounding: Reformulating Video Scene Graphs as Temporal Bipartite Graphs
Kaifeng Gao
Long Chen
Yulei Niu
Jian Shao
Jun Xiao
59
29
0
08 Dec 2021
SNEAK: Synonymous Sentences-Aware Adversarial Attack on Natural Language Video Localization
Wenbo Gou
Wen Shi
Jian Lou
Lijie Huang
Pan Zhou
Ruixuan Li
AAML
69
2
0
08 Dec 2021
MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions
Mattia Soldan
Alejandro Pardo
Juan Carlos León Alcázar
Fabian Caba Heilbron
Chen Zhao
Silvio Giancola
Guohao Li
VGen
116
100
0
01 Dec 2021
AssistSR: Task-oriented Video Segment Retrieval for Personal AI Assistant
Stan Weixian Lei
Difei Gao
Yuxuan Wang
Dongxing Mao
Zihan Liang
L. Ran
Mike Zheng Shou
61
8
0
30 Nov 2021
VIOLET : End-to-End Video-Language Transformers with Masked Visual-token Modeling
Tsu-Jui Fu
Linjie Li
Zhe Gan
Kevin Qinghong Lin
Wenjie Wang
Lijuan Wang
Zicheng Liu
VLM
146
221
0
24 Nov 2021
Exploring Segment-level Semantics for Online Phase Recognition from Surgical Videos
Xinpeng Ding
Xiaomeng Li
98
36
0
22 Nov 2021
Towards Debiasing Temporal Sentence Grounding in Video
Hao Zhang
Aixin Sun
Wei Jing
Qiufeng Wang
98
16
0
08 Nov 2021
Multi-scale 2D Representation Learning for weakly-supervised moment retrieval
Ding Li
Rui Wu
Yongqiang Tang
Zhizhong Zhang
Wensheng Zhang
55
2
0
04 Nov 2021
Hierarchical Deep Residual Reasoning for Temporal Moment Localization
Ziyang Ma
Xianjing Han
Xuemeng Song
Yiran Cui
Liqiang Nie
56
9
0
31 Oct 2021
Visual Keyword Spotting with Attention
Prajwal K R
Liliane Momeni
Triantafyllos Afouras
Andrew Zisserman
72
13
0
29 Oct 2021
Video and Text Matching with Conditioned Embeddings
Ameen Ali
Idan Schwartz
Tamir Hazan
Lior Wolf
180
14
0
21 Oct 2021
Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos
Zongmeng Zhang
Xianjing Han
Xuemeng Song
Yan Yan
Liqiang Nie
118
37
0
12 Oct 2021
Relation-aware Video Reading Comprehension for Temporal Language Grounding
Jialin Gao
Xin Sun
Mengmeng Xu
Xi Zhou
Guohao Li
96
48
0
12 Oct 2021
Weakly Supervised Human-Object Interaction Detection in Video via Contrastive Spatiotemporal Regions
Shuang Li
Yilun Du
Antonio Torralba
Josef Sivic
Bryan C. Russell
91
16
0
07 Oct 2021