Grounded Video Situation Recognition

19 October 2022
Zeeshan Khan
C. V. Jawahar
Makarand Tapaswi

Papers citing "Grounded Video Situation Recognition"

33 / 33 papers shown
Effectively Leveraging CLIP for Generating Situational Summaries of Images and Videos
Dhruv Verma
Debaditya Roy
Basura Fernando
63
1
0
30 Jul 2024
Hierarchical Self-supervised Representation Learning for Movie Understanding
Fanyi Xiao
Kaustav Kundu
Joseph Tighe
Davide Modolo
SSL
58
25
0
06 Apr 2022
Collaborative Transformers for Grounded Situation Recognition
Junhyeong Cho
Youngseok Yoon
Suha Kwak
ViT
46
26
0
30 Mar 2022
TubeDETR: Spatio-Temporal Video Grounding with Transformers
Antoine Yang
Antoine Miech
Josef Sivic
Ivan Laptev
Cordelia Schmid
ViT
61
94
0
30 Mar 2022
Rethinking the Two-Stage Framework for Grounded Situation Recognition
Meng Wei
Long Chen
Wei Ji
Xiaoyu Yue
Tat-Seng Chua
57
30
0
10 Dec 2021
Grounded Situation Recognition with Transformers
Junhyeong Cho
Youngseok Yoon
Hyeonjun Lee
Suha Kwak
ViT
48
18
0
19 Nov 2021
Visual Semantic Role Labeling for Video Understanding
Arka Sadhu
Tanmay Gupta
Mark Yatskar
Ram Nevatia
Aniruddha Kembhavi
VLM
45
70
0
02 Apr 2021
End-to-End Object Detection with Transformers
Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
ViT
3DV
PINN
349
12,966
0
26 May 2020
Grounded Situation Recognition
Sarah M Pratt
Mark Yatskar
Luca Weihs
Ali Farhadi
Aniruddha Kembhavi
79
112
0
26 Mar 2020
Video Object Grounding using Semantic Roles in Language Description
Arka Sadhu
Kan Chen
Ram Nevatia
92
48
0
24 Mar 2020
Where Does It Exist: Spatio-Temporal Video Grounding for Multi-Form Sentences
Zhu Zhang
Zhou Zhao
Yang Zhao
Qi Wang
Huasheng Liu
Lianli Gao
60
115
0
19 Jan 2020
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
354
42,299
0
03 Dec 2019
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
Jiasen Lu
Dhruv Batra
Devi Parikh
Stefan Lee
SSL
VLM
217
3,667
0
06 Aug 2019
BMN: Boundary-Matching Network for Temporal Action Proposal Generation
Tianwei Lin
Xiao-Chang Liu
Xin Li
Errui Ding
Shilei Wen
131
601
0
23 Jul 2019
ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering
Zhou Yu
D. Xu
Jun-chen Yu
Ting Yu
Zhou Zhao
Yueting Zhuang
Dacheng Tao
101
458
0
06 Jun 2019
VideoBERT: A Joint Model for Video and Language Representation Learning
Chen Sun
Austin Myers
Carl Vondrick
Kevin Patrick Murphy
Cordelia Schmid
VLM
SSL
69
1,243
0
03 Apr 2019
Long-Term Feature Banks for Detailed Video Understanding
Chao-Yuan Wu
Christoph Feichtenhofer
Haoqi Fan
Kaiming He
Philipp Krahenbuhl
Ross B. Girshick
159
480
0
12 Dec 2018
SlowFast Networks for Video Recognition
Christoph Feichtenhofer
Haoqi Fan
Jitendra Malik
Kaiming He
162
3,262
0
10 Dec 2018
Video Action Transformer Network
Rohit Girdhar
João Carreira
Carl Doersch
Andrew Zisserman
ViT
124
708
0
06 Dec 2018
Actor-Centric Relation Network
Chen Sun
Abhinav Shrivastava
Carl Vondrick
Kevin Patrick Murphy
Rahul Sukthankar
Cordelia Schmid
86
220
0
28 Jul 2018
BSN: Boundary Sensitive Network for Temporal Action Proposal Generation
Tianwei Lin
Xu Zhao
Haisheng Su
Chongjing Wang
Ming Yang
195
701
0
08 Jun 2018
Situation Recognition with Graph Neural Networks
Ruiyu Li
Makarand Tapaswi
Renjie Liao
Jiaya Jia
R. Urtasun
Sanja Fidler
GNN
52
131
0
14 Aug 2017
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
111
4,208
0
25 Jul 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
624
130,942
0
12 Jun 2017
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
João Carreira
Andrew Zisserman
219
7,989
0
22 May 2017
Temporal Segment Networks: Towards Good Practices for Deep Action Recognition
Limin Wang
Yuanjun Xiong
Zhe Wang
Yu Qiao
Dahua Lin
Xiaoou Tang
Luc Van Gool
ViT
98
3,825
0
02 Aug 2016
Movie Description
Anna Rohrbach
Atousa Torabi
Marcus Rohrbach
Niket Tandon
C. Pal
Hugo Larochelle
Aaron Courville
Bernt Schiele
3DV
VGen
73
357
0
12 May 2016
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
...
Yannis Kalantidis
Li Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li
194
5,726
0
23 Feb 2016
MovieQA: Understanding Stories in Movies through Question-Answering
Makarand Tapaswi
Yukun Zhu
Rainer Stiefelhagen
Antonio Torralba
R. Urtasun
Sanja Fidler
101
742
0
09 Dec 2015
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Shaoqing Ren
Kaiming He
Ross B. Girshick
Jian Sun
AIMat
ObjD
461
62,122
0
04 Jun 2015
Visual Semantic Role Labeling
Saurabh Gupta
Jitendra Malik
64
408
0
17 May 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.4K
149,842
0
22 Dec 2014
CIDEr: Consensus-based Image Description Evaluation
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh
250
4,471
0
20 Nov 2014