ResearchTrend.AI
Temporal Query Networks for Fine-grained Video Understanding

19 April 2021
Chuhan Zhang
Ankush Gupta
Andrew Zisserman

Papers citing "Temporal Query Networks for Fine-grained Video Understanding"

23 / 23 papers shown
Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis
Amir Hosein Fadaei
M. Dehaqani
45
0
0
11 Feb 2025
MS-Temba : Multi-Scale Temporal Mamba for Efficient Temporal Action Detection
Arkaprava Sinha
Monish Soundar Raj
Pu Wang
Ahmed Helmy
Srijan Das
Mamba
53
3
0
10 Jan 2025
End-to-End Spatio-Temporal Action Localisation with Video Transformers
A. Gritsenko
Xuehan Xiong
Josip Djolonga
Mostafa Dehghani
Chen Sun
Mario Lucic
Cordelia Schmid
Anurag Arnab
ViT
32
13
0
24 Apr 2023
Co-Occurrence Matters: Learning Action Relation for Temporal Action Localization
Congqi Cao
Yizhe Wang
Yuelie Lu
X. Zhang
Yanning Zhang
28
4
0
15 Mar 2023
Building Scalable Video Understanding Benchmarks through Sports
Aniket Agarwal
Alex Zhang
Karthik Narasimhan
Igor Gilitschenski
Vishvak Murahari
Yash Kant
19
1
0
17 Jan 2023
Cross-Modal Learning with 3D Deformable Attention for Action Recognition
Sangwon Kim
Dasom Ahn
ByoungChul Ko
ViT
3DPC
27
24
0
12 Dec 2022
PromptonomyViT: Multi-Task Prompt Learning Improves Video Transformers using Synthetic Scene Data
Roei Herzig
Ofir Abramovich
Elad Ben-Avraham
Assaf Arbelle
Leonid Karlinsky
Ariel Shamir
Trevor Darrell
Amir Globerson
34
16
0
08 Dec 2022
Learning State-Aware Visual Representations from Audible Interactions
Himangi Mittal
Pedro Morgado
Unnat Jain
Abhinav Gupta
72
22
0
27 Sep 2022
Vision-Centric BEV Perception: A Survey
Yuexin Ma
Tai Wang
Xuyang Bai
Huitong Yang
Yuenan Hou
Yaming Wang
Yu Qiao
Ruigang Yang
Dinesh Manocha
Xinge Zhu
43
129
0
04 Aug 2022
GateHUB: Gated History Unit with Background Suppression for Online Action Detection
Junwen Chen
Gaurav Mittal
Ye Yu
Yu Kong
Mei Chen
36
33
0
09 Jun 2022
A Survey on Video Action Recognition in Sports: Datasets, Methods and Applications
Fei Wu
Qingzhong Wang
Jian Bian
Haoyi Xiong
Ning Ding
Feixiang Lu
Junqing Cheng
Dejing Dou
AI4TS
24
52
0
02 Jun 2022
P3IV: Probabilistic Procedure Planning from Instructional Videos with Weak Supervision
Henghui Zhao
Isma Hadji
Nikita Dvornik
Konstantinos G. Derpanis
Richard P. Wildes
Allan D. Jepson
26
45
0
04 May 2022
TALLFormer: Temporal Action Localization with a Long-memory Transformer
Feng Cheng
Gedas Bertasius
ViT
24
91
0
04 Apr 2022
X-Pool: Cross-Modal Language-Video Attention for Text-Video Retrieval
S. Gorti
Noël Vouitsis
Junwei Ma
Keyvan Golestan
M. Volkovs
Animesh Garg
Guangwei Yu
31
149
0
28 Mar 2022
How Do You Do It? Fine-Grained Action Understanding with Pseudo-Adverbs
Hazel Doughty
Cees G. M. Snoek
25
19
0
23 Mar 2022
Gate-Shift-Fuse for Video Action Recognition
Swathikiran Sudhakaran
Sergio Escalera
O. Lanz
22
22
0
16 Mar 2022
TFCNet: Temporal Fully Connected Networks for Static Unbiased Temporal Reasoning
Shiwen Zhang
AI4TS
19
9
0
11 Mar 2022
MS-TCT: Multi-Scale Temporal ConvTransformer for Action Detection
Rui Dai
Srijan Das
Kumara Kahatapitiya
Michael S. Ryoo
F. Brémond
ViT
39
73
0
07 Dec 2021
BEVT: BERT Pretraining of Video Transformers
Rui Wang
Dongdong Chen
Zuxuan Wu
Yinpeng Chen
Xiyang Dai
Mengchen Liu
Yu-Gang Jiang
Luowei Zhou
Lu Yuan
ViT
36
203
0
02 Dec 2021
Object-Region Video Transformers
Roei Herzig
Elad Ben-Avraham
K. Mangalam
Amir Bar
Gal Chechik
Anna Rohrbach
Trevor Darrell
Amir Globerson
ViT
21
82
0
13 Oct 2021
QVHighlights: Detecting Moments and Highlights in Videos via Natural Language Queries
Jie Lei
Tamara L. Berg
Mohit Bansal
ViT
24
62
0
20 Jul 2021
Multi-modal Transformer for Video Retrieval
Valentin Gabeur
Chen Sun
Alahari Karteek
Cordelia Schmid
ViT
415
596
0
21 Jul 2020
Grouped Spatial-Temporal Aggregation for Efficient Action Recognition
Chenxu Luo
Alan Yuille
127
150
0
28 Sep 2019