arXiv:1912.09930
Something-Else: Compositional Action Recognition with Spatial-Temporal Interaction Networks
20 December 2019
Joanna Materzynska
Tete Xiao
Roei Herzig
Huijuan Xu
Xiaolong Wang
Trevor Darrell
CoGe
Papers citing "Something-Else: Compositional Action Recognition with Spatial-Temporal Interaction Networks"
43 / 43 papers shown
Magma: A Foundation Model for Multimodal AI Agents
Jianwei Yang
Reuben Tan
Qianhui Wu
Ruijie Zheng
Baolin Peng
...
Seonghyeon Ye
Joel Jang
Yuquan Deng
Lars Liden
Jianfeng Gao
VLM
AI4TS
122
9
0
18 Feb 2025
Interacted Object Grounding in Spatio-Temporal Human-Object Interactions
Xiaoyang Liu
Boran Wen
Xinpeng Liu
Zizheng Zhou
Hongwei Fan
Cewu Lu
Lizhuang Ma
Yulong Chen
Yong Li
56
2
0
27 Dec 2024
InterDyn: Controllable Interactive Dynamics with Video Diffusion Models
Rick Akkerman
Haiwen Feng
M. Black
Dimitrios Tzionas
Victoria Fernandez-Abrevaya
VGen
AI4CE
105
3
0
16 Dec 2024
Rethinking Image-to-Video Adaptation: An Object-centric Perspective
Rui Qian
Shuangrui Ding
Dahua Lin
OCL
52
1
0
09 Jul 2024
Rank2Reward: Learning Shaped Reward Functions from Passive Video
Daniel Yang
Davin Tjia
Jacob Berg
Dima Damen
Pulkit Agrawal
Abhishek Gupta
OffRL
40
5
0
23 Apr 2024
3VL: Using Trees to Improve Vision-Language Models' Interpretability
Nir Yellinek
Leonid Karlinsky
Raja Giryes
CoGe
VLM
49
4
0
28 Dec 2023
EgoPCA: A New Framework for Egocentric Hand-Object Interaction Understanding
Yue Xu
Yong-Lu Li
Zhemin Huang
Michael Xu Liu
Cewu Lu
Yu-Wing Tai
Chi-Keung Tang
EgoV
25
9
0
05 Sep 2023
Does Visual Pretraining Help End-to-End Reasoning?
Chen Sun
Calvin Luo
Xingyi Zhou
Anurag Arnab
Cordelia Schmid
OCL
LRM
ViT
38
3
0
17 Jul 2023
Multimodal Distillation for Egocentric Action Recognition
Gorjan Radevski
Dusan Grujicic
Marie-Francine Moens
Matthew Blaschko
Tinne Tuytelaars
EgoV
30
23
0
14 Jul 2023
How can objects help action recognition?
Xingyi Zhou
Anurag Arnab
Chen Sun
Cordelia Schmid
42
14
0
20 Jun 2023
LASER: A Neuro-Symbolic Framework for Learning Spatial-Temporal Scene Graphs with Weak Supervision
Jiani Huang
Ziyang Li
Mayur Naik
Ser-Nam Lim
37
3
0
15 Apr 2023
PromptonomyViT: Multi-Task Prompt Learning Improves Video Transformers using Synthetic Scene Data
Roei Herzig
Ofir Abramovich
Elad Ben-Avraham
Assaf Arbelle
Leonid Karlinsky
Ariel Shamir
Trevor Darrell
Amir Globerson
41
16
0
08 Dec 2022
Multi-Task Learning of Object State Changes from Uncurated Videos
Tomáš Souček
Jean-Baptiste Alayrac
Antoine Miech
Ivan Laptev
Josef Sivic
34
11
0
24 Nov 2022
Teaching Structured Vision&Language Concepts to Vision&Language Models
Sivan Doveh
Assaf Arbelle
Sivan Harary
Yikang Shen
Roei Herzig
...
Donghyun Kim
Raja Giryes
Rogerio Feris
S. Ullman
Leonid Karlinsky
VLM
CoGe
56
70
0
21 Nov 2022
Discovering A Variety of Objects in Spatio-Temporal Human-Object Interactions
Yong-Lu Li
Hongwei Fan
Zuoyu Qiu
Yiming Dou
Liang Xu
...
Peiyang Guo
Haisheng Su
Dongliang Wang
Wei Wu
Cewu Lu
35
7
0
14 Nov 2022
Simple Primitives with Feasibility- and Contextuality-Dependence for Open-World Compositional Zero-shot Learning
Zhe Liu
Yun Yvonna Li
Lina Yao
Xiaojun Chang
Wei Fang
Xiaojun Wu
Yi Yang
CoGe
29
1
0
05 Nov 2022
Holistic Interaction Transformer Network for Action Detection
Gueter Josmy Faure
Min-Hung Chen
S. Lai
33
37
0
23 Oct 2022
EgoSpeed-Net: Forecasting Speed-Control in Driver Behavior from Egocentric Video Data
Yichen Ding
Ziming Zhang
Jun Luo
Xun Zhou
42
3
0
27 Sep 2022
Action Recognition based on Cross-Situational Action-object Statistics
Satoshi Tsutsui
Xizi Wang
Guangyuan Weng
Yayun Zhang
David J. Crandall
Chen Yu
43
2
0
15 Aug 2022
Graph Inverse Reinforcement Learning from Diverse Videos
Sateesh Kumar
Jonathan Zamora
Nicklas Hansen
Rishabh Jangir
Xiaolong Wang
35
53
0
28 Jul 2022
Is an Object-Centric Video Representation Beneficial for Transfer?
Chuhan Zhang
Ankush Gupta
Andrew Zisserman
ViT
37
27
0
20 Jul 2022
Disentangled Action Recognition with Knowledge Bases
Zhekun Luo
Shalini Ghosh
Devin Guillory
Keizo Kato
Trevor Darrell
Huijuan Xu
21
7
0
04 Jul 2022
BasicTAD: an Astounding RGB-Only Baseline for Temporal Action Detection
Mingdong Yang
Guo Chen
Yin-Dong Zheng
Tong Lu
Limin Wang
46
45
0
05 May 2022
Discovering Human-Object Interaction Concepts via Self-Compositional Learning
Zhi Hou
Baosheng Yu
Dacheng Tao
27
18
0
27 Mar 2022
Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence Learning
Juncheng Li
Junlin Xie
Long Qian
Linchao Zhu
Siliang Tang
Fei Wu
Yi Yang
Yueting Zhuang
Qing Guo
39
73
0
24 Mar 2022
How Do You Do It? Fine-Grained Action Understanding with Pseudo-Adverbs
Hazel Doughty
Cees G. M. Snoek
32
19
0
23 Mar 2022
TFCNet: Temporal Fully Connected Networks for Static Unbiased Temporal Reasoning
Shiwen Zhang
AI4TS
27
9
0
11 Mar 2022
Distillation of Human-Object Interaction Contexts for Action Recognition
Muna Almushyti
Frederick W. Li
34
3
0
17 Dec 2021
Auto-X3D: Ultra-Efficient Video Understanding via Finer-Grained Neural Architecture Search
Yi Ding
Xinyu Gong
Junru Wu
Humphrey Shi
Zhicheng Yan
Zhangyang Wang
VGen
52
1
0
09 Dec 2021
Video-Text Pre-training with Learned Regions
Rui Yan
Mike Zheng Shou
Yixiao Ge
Alex Jinpeng Wang
Xudong Lin
Guanyu Cai
Jinhui Tang
33
23
0
02 Dec 2021
A Variational Graph Autoencoder for Manipulation Action Recognition and Prediction
Gamze Akyol
Sanem Sariel
E. Aksoy
GNN
DRL
BDL
41
2
0
25 Oct 2021
Object-Region Video Transformers
Roei Herzig
Elad Ben-Avraham
K. Mangalam
Amir Bar
Gal Chechik
Anna Rohrbach
Trevor Darrell
Amir Globerson
ViT
30
82
0
13 Oct 2021
Searching for Two-Stream Models in Multivariate Space for Video Recognition
Xinyu Gong
Heng Wang
Zheng Shou
Matt Feiszli
Zhangyang Wang
Zhicheng Yan
42
9
0
30 Aug 2021
EAN: Event Adaptive Network for Enhanced Action Recognition
Yuan Tian
Yichao Yan
Guangtao Zhai
G. Guo
Zhiyong Gao
35
41
0
22 Jul 2021
Composable Augmentation Encoding for Video Representation Learning
Chen Sun
Arsha Nagrani
Yonglong Tian
Cordelia Schmid
SSL
AI4TS
37
17
0
01 Apr 2021
Grounding Physical Concepts of Objects and Events Through Dynamic Visual Reasoning
Zhenfang Chen
Jiayuan Mao
Jiajun Wu
Kwan-Yee K. Wong
J. Tenenbaum
Chuang Gan
VGen
36
92
0
30 Mar 2021
ACTION-Net: Multipath Excitation for Action Recognition
Zhengwei Wang
Qi She
A. Smolic
3DPC
39
165
0
11 Mar 2021
Reconstructing Hand-Object Interactions in the Wild
Zhe Cao
Ilija Radosavovic
Angjoo Kanazawa
Jitendra Malik
3DH
25
146
0
17 Dec 2020
Learning Object Detection from Captions via Textual Scene Attributes
Achiya Jerbi
Roei Herzig
Jonathan Berant
Gal Chechik
Amir Globerson
27
21
0
30 Sep 2020
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization
Junting Pan
Siyu Chen
Zheng Shou
Yu Liu
Jing Shao
Hongsheng Li
3DPC
19
150
0
14 Jun 2020
Graph-Based Global Reasoning Networks
Yunpeng Chen
Marcus Rohrbach
Zhicheng Yan
Shuicheng Yan
Jiashi Feng
Yannis Kalantidis
GNN
NAI
268
457
0
30 Nov 2018
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Chelsea Finn
Pieter Abbeel
Sergey Levine
OOD
383
11,700
0
09 Mar 2017
Interaction Networks for Learning about Objects, Relations and Physics
Peter W. Battaglia
Razvan Pascanu
Matthew Lai
Danilo Jimenez Rezende
Koray Kavukcuoglu
AI4CE
OCL
PINN
GNN
283
1,401
0
01 Dec 2016