ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1806.01810
  4. Cited By
Videos as Space-Time Region Graphs

Videos as Space-Time Region Graphs

5 June 2018
Xinyu Wang
Abhinav Gupta
ArXivPDFHTML

Papers citing "Videos as Space-Time Region Graphs"

50 / 154 papers shown
Title
Interacted Object Grounding in Spatio-Temporal Human-Object Interactions
Interacted Object Grounding in Spatio-Temporal Human-Object Interactions
Xiaoyang Liu
Boran Wen
Xinpeng Liu
Zizheng Zhou
Hongwei Fan
Cewu Lu
Lizhuang Ma
Yulong Chen
Yongqian Li
56
2
0
27 Dec 2024
Situational Scene Graph for Structured Human-centric Situation Understanding
Situational Scene Graph for Structured Human-centric Situation Understanding
Chinthani Sugandhika
Chen Li
Deepu Rajan
Basura Fernando
200
1
0
30 Oct 2024
Rethinking Image-to-Video Adaptation: An Object-centric Perspective
Rethinking Image-to-Video Adaptation: An Object-centric Perspective
Rui Qian
Shuangrui Ding
Dahua Lin
OCL
52
1
0
09 Jul 2024
A Semantic-Aware and Multi-Guided Network for Infrared-Visible Image Fusion
A Semantic-Aware and Multi-Guided Network for Infrared-Visible Image Fusion
Xiaoli Zhang
Liying Wang
Libo Zhao
Xiongfei Li
Siwei Ma
42
0
0
11 Jun 2024
MOSA: Music Motion with Semantic Annotation Dataset for Cross-Modal
  Music Processing
MOSA: Music Motion with Semantic Annotation Dataset for Cross-Modal Music Processing
Yu-Fen Huang
Nikki Moran
Simon Coleman
Jon Kelly
Shun-Hwa Wei
...
Chih-Hsuan Li
Da-Yu Huang
Hsuan-Kai Kao
Ting-Wei Lin
Li Su
41
1
0
10 Jun 2024
3VL: Using Trees to Improve Vision-Language Models' Interpretability
3VL: Using Trees to Improve Vision-Language Models' Interpretability
Nir Yellinek
Leonid Karlinsky
Raja Giryes
CoGe
VLM
49
4
0
28 Dec 2023
Video Recognition in Portrait Mode
Video Recognition in Portrait Mode
Mingfei Han
Linjie Yang
Xiaojie Jin
Jiashi Feng
Xiaojun Chang
Heng Wang
30
3
0
21 Dec 2023
Action Scene Graphs for Long-Form Understanding of Egocentric Videos
Action Scene Graphs for Long-Form Understanding of Egocentric Videos
Ivan Rodin
Antonino Furnari
Kyle Min
Subarna Tripathi
G. Farinella
EgoV
27
12
0
06 Dec 2023
Object-based (yet Class-agnostic) Video Domain Adaptation
Object-based (yet Class-agnostic) Video Domain Adaptation
Dantong Niu
Amir Bar
Roei Herzig
Trevor Darrell
Anna Rohrbach
40
1
0
29 Nov 2023
Towards Weakly Supervised End-to-end Learning for Long-video Action
  Recognition
Towards Weakly Supervised End-to-end Learning for Long-video Action Recognition
Jiaming Zhou
Hanjun Li
Kun-Yu Lin
Junwei Liang
29
1
0
28 Nov 2023
Sheaf Hypergraph Networks
Sheaf Hypergraph Networks
Iulia Duta
Giulia Cassara
Fabrizio Silvestri
Pietro Lió
17
18
0
29 Sep 2023
Multimodal Distillation for Egocentric Action Recognition
Multimodal Distillation for Egocentric Action Recognition
Gorjan Radevski
Dusan Grujicic
Marie-Francine Moens
Matthew Blaschko
Tinne Tuytelaars
EgoV
30
23
0
14 Jul 2023
How can objects help action recognition?
How can objects help action recognition?
Xingyi Zhou
Anurag Arnab
Chen Sun
Cordelia Schmid
42
14
0
20 Jun 2023
GEST: the Graph of Events in Space and Time as a Common Representation
  between Vision and Language
GEST: the Graph of Events in Space and Time as a Common Representation between Vision and Language
Mihai Masala
Nicolae Cudlenco
Traian Rebedea
Marius Leordeanu
14
0
0
22 May 2023
End-to-End Spatio-Temporal Action Localisation with Video Transformers
End-to-End Spatio-Temporal Action Localisation with Video Transformers
A. Gritsenko
Xuehan Xiong
Josip Djolonga
Mostafa Dehghani
Chen Sun
Mario Lucic
Cordelia Schmid
Anurag Arnab
ViT
40
13
0
24 Apr 2023
On the Benefits of 3D Pose and Tracking for Human Action Recognition
On the Benefits of 3D Pose and Tracking for Human Action Recognition
Jathushan Rajasegaran
Georgios Pavlakos
Angjoo Kanazawa
Christoph Feichtenhofer
Jitendra Malik
39
30
0
03 Apr 2023
Deep Learning for Video-based Person Re-Identification: A Survey
Deep Learning for Video-based Person Re-Identification: A Survey
Khawar Islam
26
6
0
21 Mar 2023
Local-to-Global Information Communication for Real-Time Semantic
  Segmentation Network Search
Local-to-Global Information Communication for Real-Time Semantic Segmentation Network Search
Guangliang Cheng
Peng Sun
Ting-Bing Xu
Shuchang Lyu
Peiwen Lin
26
1
0
16 Feb 2023
Optical Flow Estimation in 360$^\circ$ Videos: Dataset, Model and
  Application
Optical Flow Estimation in 360∘^\circ∘ Videos: Dataset, Model and Application
Bin Duan
Keshav Bhandari
Gaowen Liu
Yan Yan
24
0
0
27 Jan 2023
GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group
  Propagation
GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation
Chenhongyi Yang
Jiarui Xu
Shalini De Mello
Elliot J. Crowley
Xinyu Wang
ViT
38
21
0
13 Dec 2022
PromptonomyViT: Multi-Task Prompt Learning Improves Video Transformers
  using Synthetic Scene Data
PromptonomyViT: Multi-Task Prompt Learning Improves Video Transformers using Synthetic Scene Data
Roei Herzig
Ofir Abramovich
Elad Ben-Avraham
Assaf Arbelle
Leonid Karlinsky
Ariel Shamir
Trevor Darrell
Amir Globerson
41
16
0
08 Dec 2022
Multi-Task Edge Prediction in Temporally-Dynamic Video Graphs
Multi-Task Edge Prediction in Temporally-Dynamic Video Graphs
Osman Ulger
Julian Wiederer
Mohsen Ghafoorian
Vasileios Belagiannis
Pascal Mettes
43
0
0
06 Dec 2022
Teaching Structured Vision&Language Concepts to Vision&Language Models
Teaching Structured Vision&Language Concepts to Vision&Language Models
Sivan Doveh
Assaf Arbelle
Sivan Harary
Yikang Shen
Roei Herzig
...
Donghyun Kim
Raja Giryes
Rogerio Feris
S. Ullman
Leonid Karlinsky
VLM
CoGe
56
70
0
21 Nov 2022
Dynamic Temporal Filtering in Video Models
Dynamic Temporal Filtering in Video Models
Fuchen Long
Zhaofan Qiu
Yingwei Pan
Ting Yao
Chong-Wah Ngo
Tao Mei
AI4TS
27
17
0
15 Nov 2022
Discovering A Variety of Objects in Spatio-Temporal Human-Object
  Interactions
Discovering A Variety of Objects in Spatio-Temporal Human-Object Interactions
Yong-Lu Li
Hongwei Fan
Zuoyu Qiu
Yiming Dou
Liang Xu
...
Peiyang Guo
Haisheng Su
Dongliang Wang
Wei Wu
Cewu Lu
35
7
0
14 Nov 2022
EgoSpeed-Net: Forecasting Speed-Control in Driver Behavior from
  Egocentric Video Data
EgoSpeed-Net: Forecasting Speed-Control in Driver Behavior from Egocentric Video Data
Yichen Ding
Ziming Zhang
Yanhua Li
Xun Zhou
42
3
0
27 Sep 2022
Graph Reasoning Transformer for Image Parsing
Graph Reasoning Transformer for Image Parsing
Dong Zhang
Jinhui Tang
Kwang-Ting Cheng
ViT
24
16
0
20 Sep 2022
Occlusion-Aware Instance Segmentation via BiLayer Network Architectures
Occlusion-Aware Instance Segmentation via BiLayer Network Architectures
Lei Ke
Yu-Wing Tai
Chi-Keung Tang
ISeg
27
11
0
08 Aug 2022
Equivariant and Invariant Grounding for Video Question Answering
Equivariant and Invariant Grounding for Video Question Answering
Yicong Li
Xiang Wang
Junbin Xiao
Tat-Seng Chua
23
25
0
26 Jul 2022
Is an Object-Centric Video Representation Beneficial for Transfer?
Is an Object-Centric Video Representation Beneficial for Transfer?
Chuhan Zhang
Ankush Gupta
Andrew Zisserman
ViT
37
27
0
20 Jul 2022
ViGAT: Bottom-up event recognition and explanation in video using
  factorized graph attention network
ViGAT: Bottom-up event recognition and explanation in video using factorized graph attention network
Nikolaos Gkalelis
Dimitrios Daskalakis
Vasileios Mezaris
19
10
0
20 Jul 2022
Learning Sequence Representations by Non-local Recurrent Neural Memory
Learning Sequence Representations by Non-local Recurrent Neural Memory
Wenjie Pei
Xin Feng
Canmiao Fu
Qi Cao
Guangming Lu
Yu-Wing Tai
AI4TS
27
1
0
20 Jul 2022
Beyond Transfer Learning: Co-finetuning for Action Localisation
Beyond Transfer Learning: Co-finetuning for Action Localisation
Anurag Arnab
Xuehan Xiong
A. Gritsenko
Rob Romijnders
Josip Djolonga
Mostafa Dehghani
Chen Sun
Mario Lucic
Cordelia Schmid
38
8
0
08 Jul 2022
Predicting Team Performance with Spatial Temporal Graph Convolutional
  Networks
Predicting Team Performance with Spatial Temporal Graph Convolutional Networks
Shengnan Hu
G. Sukthankar
GNN
23
2
0
21 Jun 2022
Hierarchical Self-supervised Representation Learning for Movie
  Understanding
Hierarchical Self-supervised Representation Learning for Movie Understanding
Fanyi Xiao
Kaustav Kundu
Joseph Tighe
Davide Modolo
SSL
44
24
0
06 Apr 2022
Gate-Shift-Fuse for Video Action Recognition
Gate-Shift-Fuse for Video Action Recognition
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz
22
22
0
16 Mar 2022
Motion-driven Visual Tempo Learning for Video-based Action Recognition
Motion-driven Visual Tempo Learning for Video-based Action Recognition
Yuanzhong Liu
Junsong Yuan
Zhigang Tu
27
58
0
24 Feb 2022
Adaptive Graph Convolutional Networks for Weakly Supervised Anomaly
  Detection in Videos
Adaptive Graph Convolutional Networks for Weakly Supervised Anomaly Detection in Videos
Congqi Cao
Xin Zhang
Shizhou Zhang
Peng Wang
Yanning Zhang
AI4TS
25
22
0
14 Feb 2022
Action Keypoint Network for Efficient Video Recognition
Action Keypoint Network for Efficient Video Recognition
Xu Chen
Yahong Han
Xiaohan Wang
Yifang Sun
Yi Yang
3DPC
27
6
0
17 Jan 2022
Representing Videos as Discriminative Sub-graphs for Action Recognition
Representing Videos as Discriminative Sub-graphs for Action Recognition
Dong Li
Zhaofan Qiu
Yingwei Pan
Ting Yao
Houqiang Li
Tao Mei
42
25
0
11 Jan 2022
Distillation of Human-Object Interaction Contexts for Action Recognition
Distillation of Human-Object Interaction Contexts for Action Recognition
Muna Almushyti
Frederick W. Li
34
3
0
17 Dec 2021
TCGL: Temporal Contrastive Graph for Self-supervised Video
  Representation Learning
TCGL: Temporal Contrastive Graph for Self-supervised Video Representation Learning
Yang Liu
Keze Wang
Lingbo Liu
Hao Lan
Liang Lin
SSL
AI4TS
53
113
0
07 Dec 2021
Multi-Scale Semantics-Guided Neural Networks for Efficient
  Skeleton-Based Human Action Recognition
Multi-Scale Semantics-Guided Neural Networks for Efficient Skeleton-Based Human Action Recognition
Pengfei Zhang
Cuiling Lan
Wenjun Zeng
Junliang Xing
Jianru Xue
Nanning Zheng
35
6
0
07 Nov 2021
Relational Self-Attention: What's Missing in Attention for Video
  Understanding
Relational Self-Attention: What's Missing in Attention for Video Understanding
Manjin Kim
Heeseung Kwon
Chunyu Wang
Suha Kwak
Minsu Cho
ViT
27
28
0
02 Nov 2021
BiC-Net: Learning Efficient Spatio-Temporal Relation for Text-Video
  Retrieval
BiC-Net: Learning Efficient Spatio-Temporal Relation for Text-Video Retrieval
Ning Han
Jingjing Chen
Chuhao Shi
Yawen Zeng
Guangyi Xiao
Hao Chen
22
10
0
29 Oct 2021
Temporal-attentive Covariance Pooling Networks for Video Recognition
Temporal-attentive Covariance Pooling Networks for Video Recognition
Zilin Gao
Qilong Wang
Bingbing Zhang
Q. Hu
P. Li
21
24
0
27 Oct 2021
A Variational Graph Autoencoder for Manipulation Action Recognition and
  Prediction
A Variational Graph Autoencoder for Manipulation Action Recognition and Prediction
Gamze Akyol
Sanem Sariel
E. Aksoy
GNN
DRL
BDL
41
2
0
25 Oct 2021
Object-Region Video Transformers
Object-Region Video Transformers
Roei Herzig
Elad Ben-Avraham
K. Mangalam
Amir Bar
Gal Chechik
Anna Rohrbach
Trevor Darrell
Amir Globerson
ViT
30
82
0
13 Oct 2021
Modelling Neighbor Relation in Joint Space-Time Graph for Video
  Correspondence Learning
Modelling Neighbor Relation in Joint Space-Time Graph for Video Correspondence Learning
Zixu Zhao
Yueming Jin
Pheng-Ann Heng
SSL
37
21
0
28 Sep 2021
TSM: Temporal Shift Module for Efficient and Scalable Video
  Understanding on Edge Device
TSM: Temporal Shift Module for Efficient and Scalable Video Understanding on Edge Device
Ji Lin
Chuang Gan
Kuan-Chieh Jackson Wang
Song Han
40
64
0
27 Sep 2021
1234
Next