TALL: Temporal Activity Localization via Language Query

5 May 2017
J. Gao
Chen Sun
Zhenheng Yang
Ram Nevatia

Papers citing "TALL: Temporal Activity Localization via Language Query"

50 / 433 papers shown
Harvest Video Foundation Models via Efficient Post-Pretraining
Yizhuo Li
Kunchang Li
Yinan He
Yi Wang
Yali Wang
Limin Wang
Yu Qiao
Ping Luo
CLIP, VLM, VGen
106
2
0
30 Oct 2023
Learning Temporal Sentence Grounding From Narrated EgoVideos
Kevin Flanagan
Dima Damen
Michael Wray
67
3
0
26 Oct 2023
Exploring Iterative Refinement with Diffusion Models for Video Grounding
Xiao Liang
Tao Shi
Yaoyuan Liang
Te Tao
Shao-Lun Huang
DiffM
83
2
0
26 Oct 2023
Video Referring Expression Comprehension via Transformer with Content-conditioned Query
Jiang Ji
Meng Cao
Tengtao Song
Long Chen
Yi Wang
Yuexian Zou
85
6
0
25 Oct 2023
Temporally Aligning Long Audio Interviews with Questions: A Case Study in Multimodal Data Integration
Piyush Singh Pasi
Karthikeya Battepati
Preethi Jyothi
Ganesh Ramakrishnan
T. Mahapatra
Manoj Singh
82
0
0
10 Oct 2023
GMMFormer: Gaussian-Mixture-Model Based Transformer for Efficient Partially Relevant Video Retrieval
Yuting Wang
Jinpeng Wang
Bin Chen
Ziyun Zeng
Shu-Tao Xia
68
11
0
08 Oct 2023
SCANet: Scene Complexity Aware Network for Weakly-Supervised Video Moment Retrieval
Sunjae Yoon
Gwanhyeong Koo
Dahyun Kim
Changdong Yoo
93
12
0
08 Oct 2023
A Hierarchical Graph-based Approach for Recognition and Description Generation of Bimanual Actions in Videos
Fatemeh Ziaeetabar
Reza Safabakhsh
S. Momtazi
M. Tamosiunaite
Florentin Wörgötter
58
2
0
01 Oct 2023
VidChapters-7M: Video Chapters at Scale
Antoine Yang
Arsha Nagrani
Ivan Laptev
Josef Sivic
Cordelia Schmid
VGen
98
28
0
25 Sep 2023
Towards Surveillance Video-and-Language Understanding: New Dataset, Baselines, and Challenges
Tongtong Yuan
Xuange Zhang
Kun Liu
Bo Liu
Chen Chen
Jian Jin
Zhenzhen Jiao
AI4TS
105
19
0
25 Sep 2023
Dual-Path Temporal Map Optimization for Make-up Temporal Video Grounding
Jiaxiu Li
Kun Li
Jia Li
Guoliang Chen
Dan Guo
Meng Wang
68
3
0
12 Sep 2023
Can I Trust Your Answer? Visually Grounded Video Question Answering
Junbin Xiao
Angela Yao
Yicong Li
Tat-Seng Chua
137
61
0
04 Sep 2023
Language-Conditioned Change-point Detection to Identify Sub-Tasks in Robotics Domains
Divyanshu Raj
Chitta Baral
N. Gopalan
128
1
0
01 Sep 2023
Zero-Shot Video Moment Retrieval from Frozen Vision-Language Models
Dezhao Luo
Jiabo Huang
Shaogang Gong
Hailin Jin
Yang Liu
VLM
135
11
0
01 Sep 2023
Distraction-free Embeddings for Robust VQA
Atharvan Dogra
Deeksha Varshney
Ashwin Kalyan
Ameet Deshpande
Neeraj Kumar
95
0
0
31 Aug 2023
DiffusionVMR: Diffusion Model for Joint Video Moment Retrieval and Highlight Detection
Henghao Zhao
Kevin Qinghong Lin
Rui Yan
Zechao Li
VGen, DiffM
111
3
0
29 Aug 2023
Multi-event Video-Text Retrieval
Gengyuan Zhang
Jisen Ren
Jindong Gu
Volker Tresp
78
14
0
22 Aug 2023
UnLoc: A Unified Framework for Video Localization Tasks
Shengjia Yan
Xuehan Xiong
Arsha Nagrani
Anurag Arnab
Zhonghao Wang
Weina Ge
David A. Ross
Cordelia Schmid
133
55
0
21 Aug 2023
Temporal Sentence Grounding in Streaming Videos
Tian Gan
Xiao Wang
Yan Sun
Jianlong Wu
Qingpei Guo
Liqiang Nie
81
3
0
14 Aug 2023
Knowing Where to Focus: Event-aware Transformer for Video Grounding
Jinhyun Jang
Jungin Park
Jin-Hwa Kim
Hyeongjun Kwon
Kwanghoon Sohn
70
51
0
14 Aug 2023
ViGT: Proposal-free Video Grounding with Learnable Token in Transformer
Kun Li
Dan Guo
Meng Wang
ViT
79
42
0
11 Aug 2023
Encode-Store-Retrieve: Enhancing Memory Augmentation through Language-Encoded Egocentric Perception
Junxiao Shen
John J. Dudley
Per Ola Kristensson
RALM
42
0
0
10 Aug 2023
Counterfactual Cross-modality Reasoning for Weakly Supervised Video Moment Localization
Zezhong Lv
Fuchun Sun
Ji-Rong Wen
101
16
0
10 Aug 2023
Local-Global Information Interaction Debiasing for Dynamic Scene Graph Generation
Xinyu Lyu
Jingwei Liu
Yuyu Guo
Lianli Gao
109
1
0
10 Aug 2023
D3G: Exploring Gaussian Prior for Temporal Sentence Grounding with Glance Annotation
Hanjun Li
Xiujun Shu
Su He
Ruizhi Qiao
Wei Wen
Taian Guo
Bei Gan
Xing Sun
65
12
0
08 Aug 2023
Efficient Temporal Sentence Grounding in Videos with Multi-Teacher Knowledge Distillation
Renjie Liang
Yiming Yang
Hui Lu
Li Li
93
10
0
07 Aug 2023
UniVTG: Towards Unified Video-Language Temporal Grounding
Kevin Qinghong Lin
Pengchuan Zhang
Joya Chen
Shraman Pramanick
Difei Gao
Alex Jinpeng Wang
Rui Yan
Mike Zheng Shou
104
123
0
31 Jul 2023
Learning Multi-modal Representations by Watching Hundreds of Surgical Video Lectures
Kun Yuan
V. Srivastav
Tong Yu
Joël L. Lavanchy
J. Marescaux
Pietro Mascagni
Nassir Navab
N. Padoy
186
23
0
27 Jul 2023
G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory
Hongxiang Li
Meng Cao
Xuxin Cheng
Yaowei Li
Zhihong Zhu
Yuexian Zou
114
20
0
26 Jul 2023
Towards Video Anomaly Retrieval from Video Anomaly Detection: New Benchmarks and Model
Peng Wu
Jing Liu
Xiangteng He
Yuxin Peng
Peng Wang
Yanning Zhang
124
34
0
24 Jul 2023
No-frills Temporal Video Grounding: Multi-Scale Neighboring Attention and Zoom-in Boundary Detection
Qi Zhang
S. Zheng
Qin Jin
90
0
0
20 Jul 2023
EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone
Shraman Pramanick
Yale Song
Sayan Nag
Kevin Qinghong Lin
Hardik Shah
Mike Zheng Shou
Ramalingam Chellappa
Pengchuan Zhang
VLM
112
100
0
11 Jul 2023
MomentDiff: Generative Video Moment Retrieval from Random to Real
P. Li
Chen-Wei Xie
Hongtao Xie
Liming Zhao
Lei Zhang
Yun Zheng
Deli Zhao
Yongdong Zhang
DiffM, VGen
109
60
0
06 Jul 2023
Zero-Shot Dense Video Captioning by Jointly Optimizing Text and Moment
Yongrae Jo
Seongyun Lee
Aiden Seung Joon Lee
Hyunji Lee
Hanseok Oh
Minjoon Seo
47
2
0
05 Jul 2023
SpotEM: Efficient Video Search for Episodic Memory
Santhosh Kumar Ramakrishnan
Ziad Al-Halah
Kristen Grauman
VLM
93
9
0
28 Jun 2023
Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion
Isha Rawal
Alexander Matyasko
Shantanu Jaiswal
Basura Fernando
Cheston Tan
55
3
0
15 Jun 2023
A Survey on Video Moment Localization
Meng Liu
Liqiang Nie
Yunxiao Wang
Meng Wang
Yong Rui
120
28
0
13 Jun 2023
MS-DETR: Natural Language Video Localization with Sampling Moment-Moment Interaction
Jiashuo Wang
Aixin Sun
Hao Zhang
Xiaoli Li
ViT
73
14
0
30 May 2023
Deep Neural Networks in Video Human Action Recognition: A Review
Zihan Wang
Yang Yang
Zhi Liu
Y. Zheng
89
5
0
25 May 2023
Faster Video Moment Retrieval with Point-Level Supervision
Xun Jiang
Zailei Zhou
Xing Xu
Yang Yang
Guoqing Wang
Heng Tao Shen
73
16
0
23 May 2023
Movie101: A New Movie Understanding Benchmark
Zihao Yue
Qi Zhang
Anwen Hu
Liang Zhang
Ziheng Wang
Qin Jin
VGen
77
17
0
20 May 2023
Joint Moment Retrieval and Highlight Detection Via Natural Language Queries
Richard Luo
Austin Peng
Heidi Yap
Koby Beard
ViT
55
0
0
08 May 2023
Transform-Equivariant Consistency Learning for Temporal Sentence Grounding
Daizong Liu
Xiaoye Qu
Jianfeng Dong
Pan Zhou
Zichuan Xu
Yining Qi
Xing Di
Weining Lu
Yu Cheng
126
8
0
06 May 2023
TMR: Text-to-Motion Retrieval Using Contrastive 3D Human Motion Synthesis
Mathis Petrovich
Michael J. Black
Gül Varol
VGen
125
85
0
02 May 2023
MH-DETR: Video Moment and Highlight Detection with Cross-modal Transformer
Yifang Xu
Yunzhuo Sun
Yang Li
Yilei Shi
Xiaoxia Zhu
S. Du
ViT
117
35
0
29 Apr 2023
Boundary-Denoising for Video Activity Localization
Mengmeng Xu
Mattia Soldan
Jialin Gao
Shuming Liu
Juan-Manuel Perez-Rua
Guohao Li
70
10
0
06 Apr 2023
Sketch-based Video Object Localization
Sangmin Woo
So-Yeong Jeon
Jinyoung Park
Minji Son
Sumin Lee
Changick Kim
87
0
0
02 Apr 2023
Learning Action Changes by Measuring Verb-Adverb Textual Relationships
Davide Moltisanti
Frank Keller
Hakan Bilen
Laura Sevilla-Lara
110
7
0
27 Mar 2023
Query-Dependent Video Representation for Moment Retrieval and Highlight Detection
WonJun Moon
Sangeek Hyun
S. Park
Dongchan Park
Jae-Pil Heo
ViT
105
115
0
24 Mar 2023
Scanning Only Once: An End-to-end Framework for Fast Temporal Grounding in Long Videos
Yulin Pan
Xiangteng He
Biao Gong
Yiliang Lv
Yujun Shen
Yuxin Peng
Deli Zhao
110
13
0
15 Mar 2023