Something-Else: Compositional Action Recognition with Spatial-Temporal Interaction Networks

20 December 2019
Joanna Materzynska
Tete Xiao
Roei Herzig
Huijuan Xu
Xiaolong Wang
Trevor Darrell
    CoGe

Papers citing "Something-Else: Compositional Action Recognition with Spatial-Temporal Interaction Networks"

43 / 43 papers shown
Magma: A Foundation Model for Multimodal AI Agents
Jianwei Yang
Reuben Tan
Qianhui Wu
Ruijie Zheng
Baolin Peng
...
Seonghyeon Ye
Joel Jang
Yuquan Deng
Lars Liden
Jianfeng Gao
VLM
AI4TS
122
9
0
18 Feb 2025
Interacted Object Grounding in Spatio-Temporal Human-Object Interactions
Xiaoyang Liu
Boran Wen
Xinpeng Liu
Zizheng Zhou
Hongwei Fan
Cewu Lu
Lizhuang Ma
Yulong Chen
Yong Li
56
2
0
27 Dec 2024
InterDyn: Controllable Interactive Dynamics with Video Diffusion Models
Rick Akkerman
Haiwen Feng
M. Black
Dimitrios Tzionas
Victoria Fernandez-Abrevaya
VGen
AI4CE
105
3
0
16 Dec 2024
Rethinking Image-to-Video Adaptation: An Object-centric Perspective
Rui Qian
Shuangrui Ding
Dahua Lin
OCL
52
1
0
09 Jul 2024
Rank2Reward: Learning Shaped Reward Functions from Passive Video
Daniel Yang
Davin Tjia
Jacob Berg
Dima Damen
Pulkit Agrawal
Abhishek Gupta
OffRL
40
5
0
23 Apr 2024
3VL: Using Trees to Improve Vision-Language Models' Interpretability
Nir Yellinek
Leonid Karlinsky
Raja Giryes
CoGe
VLM
49
4
0
28 Dec 2023
EgoPCA: A New Framework for Egocentric Hand-Object Interaction Understanding
Yue Xu
Yong-Lu Li
Zhemin Huang
Michael Xu Liu
Cewu Lu
Yu-Wing Tai
Chi-Keung Tang
EgoV
25
9
0
05 Sep 2023
Does Visual Pretraining Help End-to-End Reasoning?
Chen Sun
Calvin Luo
Xingyi Zhou
Anurag Arnab
Cordelia Schmid
OCL
LRM
ViT
38
3
0
17 Jul 2023
Multimodal Distillation for Egocentric Action Recognition
Gorjan Radevski
Dusan Grujicic
Marie-Francine Moens
Matthew Blaschko
Tinne Tuytelaars
EgoV
26
23
0
14 Jul 2023
How can objects help action recognition?
Xingyi Zhou
Anurag Arnab
Chen Sun
Cordelia Schmid
42
14
0
20 Jun 2023
LASER: A Neuro-Symbolic Framework for Learning Spatial-Temporal Scene Graphs with Weak Supervision
Jiani Huang
Ziyang Li
Mayur Naik
Ser-Nam Lim
37
3
0
15 Apr 2023
PromptonomyViT: Multi-Task Prompt Learning Improves Video Transformers using Synthetic Scene Data
Roei Herzig
Ofir Abramovich
Elad Ben-Avraham
Assaf Arbelle
Leonid Karlinsky
Ariel Shamir
Trevor Darrell
Amir Globerson
41
16
0
08 Dec 2022
Multi-Task Learning of Object State Changes from Uncurated Videos
Tomáš Souček
Jean-Baptiste Alayrac
Antoine Miech
Ivan Laptev
Josef Sivic
34
11
0
24 Nov 2022
Teaching Structured Vision&Language Concepts to Vision&Language Models
Sivan Doveh
Assaf Arbelle
Sivan Harary
Yikang Shen
Roei Herzig
...
Donghyun Kim
Raja Giryes
Rogerio Feris
S. Ullman
Leonid Karlinsky
VLM
CoGe
56
70
0
21 Nov 2022
Discovering A Variety of Objects in Spatio-Temporal Human-Object Interactions
Yong-Lu Li
Hongwei Fan
Zuoyu Qiu
Yiming Dou
Liang Xu
...
Peiyang Guo
Haisheng Su
Dongliang Wang
Wei Wu
Cewu Lu
35
7
0
14 Nov 2022
Simple Primitives with Feasibility- and Contextuality-Dependence for Open-World Compositional Zero-shot Learning
Zhe Liu
Yun Yvonna Li
Lina Yao
Xiaojun Chang
Wei Fang
Xiaojun Wu
Yi Yang
CoGe
29
1
0
05 Nov 2022
Holistic Interaction Transformer Network for Action Detection
Gueter Josmy Faure
Min-Hung Chen
S. Lai
33
37
0
23 Oct 2022
EgoSpeed-Net: Forecasting Speed-Control in Driver Behavior from Egocentric Video Data
Yichen Ding
Ziming Zhang
Yanhua Li
Xun Zhou
42
3
0
27 Sep 2022
Action Recognition based on Cross-Situational Action-object Statistics
Satoshi Tsutsui
Xizi Wang
Guangyuan Weng
Yayun Zhang
David J. Crandall
Chen Yu
43
2
0
15 Aug 2022
Graph Inverse Reinforcement Learning from Diverse Videos
Sateesh Kumar
Jonathan Zamora
Nicklas Hansen
Rishabh Jangir
Xiaolong Wang
35
53
0
28 Jul 2022
Is an Object-Centric Video Representation Beneficial for Transfer?
Chuhan Zhang
Ankush Gupta
Andrew Zisserman
ViT
37
27
0
20 Jul 2022
Disentangled Action Recognition with Knowledge Bases
Zhekun Luo
Shalini Ghosh
Devin Guillory
Keizo Kato
Trevor Darrell
Huijuan Xu
21
7
0
04 Jul 2022
BasicTAD: an Astounding RGB-Only Baseline for Temporal Action Detection
Mingdong Yang
Guo Chen
Yin-Dong Zheng
Tong Lu
Limin Wang
41
45
0
05 May 2022
Discovering Human-Object Interaction Concepts via Self-Compositional Learning
Zhi Hou
Baosheng Yu
Dacheng Tao
27
18
0
27 Mar 2022
Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence Learning
Juncheng Li
Junlin Xie
Long Qian
Linchao Zhu
Siliang Tang
Fei Wu
Yi Yang
Yueting Zhuang
Qing Guo
39
73
0
24 Mar 2022
How Do You Do It? Fine-Grained Action Understanding with Pseudo-Adverbs
Hazel Doughty
Cees G. M. Snoek
32
19
0
23 Mar 2022
TFCNet: Temporal Fully Connected Networks for Static Unbiased Temporal Reasoning
Shiwen Zhang
AI4TS
27
9
0
11 Mar 2022
Distillation of Human-Object Interaction Contexts for Action Recognition
Muna Almushyti
Frederick W. Li
34
3
0
17 Dec 2021
Auto-X3D: Ultra-Efficient Video Understanding via Finer-Grained Neural Architecture Search
Yi Ding
Xinyu Gong
Junru Wu
Humphrey Shi
Zhicheng Yan
Zhangyang Wang
VGen
52
1
0
09 Dec 2021
Video-Text Pre-training with Learned Regions
Rui Yan
Mike Zheng Shou
Yixiao Ge
Alex Jinpeng Wang
Xudong Lin
Guanyu Cai
Jinhui Tang
33
23
0
02 Dec 2021
A Variational Graph Autoencoder for Manipulation Action Recognition and Prediction
Gamze Akyol
Sanem Sariel
E. Aksoy
GNN
DRL
BDL
41
2
0
25 Oct 2021
Object-Region Video Transformers
Roei Herzig
Elad Ben-Avraham
K. Mangalam
Amir Bar
Gal Chechik
Anna Rohrbach
Trevor Darrell
Amir Globerson
ViT
30
82
0
13 Oct 2021
Searching for Two-Stream Models in Multivariate Space for Video Recognition
Xinyu Gong
Heng Wang
Zheng Shou
Matt Feiszli
Zhangyang Wang
Zhicheng Yan
39
9
0
30 Aug 2021
EAN: Event Adaptive Network for Enhanced Action Recognition
Yuan Tian
Yichao Yan
Guangtao Zhai
G. Guo
Zhiyong Gao
35
41
0
22 Jul 2021
Composable Augmentation Encoding for Video Representation Learning
Chen Sun
Arsha Nagrani
Yonglong Tian
Cordelia Schmid
SSL
AI4TS
37
17
0
01 Apr 2021
Grounding Physical Concepts of Objects and Events Through Dynamic Visual Reasoning
Zhenfang Chen
Jiayuan Mao
Jiajun Wu
Kwan-Yee K. Wong
J. Tenenbaum
Chuang Gan
VGen
36
92
0
30 Mar 2021
ACTION-Net: Multipath Excitation for Action Recognition
Zhengwei Wang
Qi She
A. Smolic
3DPC
39
165
0
11 Mar 2021
Reconstructing Hand-Object Interactions in the Wild
Zhe Cao
Ilija Radosavovic
Angjoo Kanazawa
Jitendra Malik
3DH
25
146
0
17 Dec 2020
Learning Object Detection from Captions via Textual Scene Attributes
Achiya Jerbi
Roei Herzig
Jonathan Berant
Gal Chechik
Amir Globerson
27
21
0
30 Sep 2020
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization
Junting Pan
Siyu Chen
Zheng Shou
Yu Liu
Jing Shao
Hongsheng Li
3DPC
19
150
0
14 Jun 2020
Graph-Based Global Reasoning Networks
Yunpeng Chen
Marcus Rohrbach
Zhicheng Yan
Shuicheng Yan
Jiashi Feng
Yannis Kalantidis
GNN
NAI
268
457
0
30 Nov 2018
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Chelsea Finn
Pieter Abbeel
Sergey Levine
OOD
377
11,700
0
09 Mar 2017
Interaction Networks for Learning about Objects, Relations and Physics
Peter W. Battaglia
Razvan Pascanu
Matthew Lai
Danilo Jimenez Rezende
Koray Kavukcuoglu
AI4CE
OCL
PINN
GNN
280
1,401
0
01 Dec 2016