arXiv:1705.07750
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

22 May 2017
João Carreira
Andrew Zisserman

Papers citing "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"

50 / 3,647 papers shown
Title
The State of Aerial Surveillance: A Survey
Kien Nguyen Thanh
Clinton Fookes
Sridha Sridharan
Yingli Tian
Feng Liu
Xiaoming Liu
Arun Ross
133
25
0
09 Jan 2022
Learning Sample Importance for Cross-Scenario Video Temporal Grounding
P. Bao
Yadong Mu
76
13
0
08 Jan 2022
Sign Language Video Retrieval with Free-Form Textual Queries
A. Duarte
Samuel Albanie
Xavier Giró-i-Nieto
Gül Varol
SLR
90
29
0
07 Jan 2022
Detecting Human-to-Human-or-Object (H2O) Interactions with DIABOLO
Astrid Orcesi
Romaric Audigier
Fritz Poka Toukam
B. Luvison
78
3
0
07 Jan 2022
Cross-Modality Deep Feature Learning for Brain Tumor Segmentation
Dingwen Zhang
Guohai Huang
Qiang Zhang
Jungong Han
Junwei Han
Yizhou Yu
96
221
0
07 Jan 2022
Advancing 3D Medical Image Analysis with Variable Dimension Transform based Supervised 3D Pre-training
Shu Zhen Zhang
Zihao Li
Hong-Yu Zhou
Jiechao Ma
Yizhou Yu
60
12
0
05 Jan 2022
Exploring Motion and Appearance Information for Temporal Sentence Grounding
Daizong Liu
Xiaoye Qu
Pan Zhou
Yang Liu
97
42
0
03 Jan 2022
Memory-Guided Semantic Learning Network for Temporal Sentence Grounding
Daizong Liu
Xiaoye Qu
Xing Di
Yu Cheng
Zichuan Xu
Pan Zhou
117
60
0
03 Jan 2022
TVNet: Temporal Voting Network for Action Localization
Hanyuan Wang
Dima Damen
Majid Mirmehdi
Toby Perrett
96
6
0
02 Jan 2022
StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2
Ivan Skorokhodov
Sergey Tulyakov
Mohamed Elhoseiny
VGen
148
289
0
29 Dec 2021
AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition
Yulin Wang
Yang Yue
Yuanze Lin
Haojun Jiang
Zihang Lai
V. Kulikov
Nikita Orlov
Humphrey Shi
Gao Huang
102
50
0
28 Dec 2021
Extended Self-Critical Pipeline for Transforming Videos to Text (TRECVID-VTT Task 2021) -- Team: MMCUniAugsburg
Philipp Harzig
Moritz Einfalt
K. Ludwig
Rainer Lienhart
ViT
94
0
0
28 Dec 2021
Synchronized Audio-Visual Frames with Fractional Positional Encoding for Transformers in Video-to-Text Translation
Philipp Harzig
Moritz Einfalt
Rainer Lienhart
ViT
68
2
0
28 Dec 2021
Cross Modal Retrieval with Querybank Normalisation
Simion-Vlad Bogolin
Ioana Croitoru
Hailin Jin
Yang Liu
Samuel Albanie
109
90
0
23 Dec 2021
3D Skeleton-based Few-shot Action Recognition with JEANIE is not so Naïve
Lei Wang
Jun Liu
Piotr Koniusz
98
21
0
23 Dec 2021
Recur, Attend or Convolve? On Whether Temporal Modeling Matters for Cross-Domain Robustness in Action Recognition
Sofia Broomé
Ernest Pokropek
Boyu Li
Hedvig Kjellström
84
7
0
22 Dec 2021
Expansion-Squeeze-Excitation Fusion Network for Elderly Activity Recognition
Xiangbo Shu
Jiawen Yang
Rui Yan
Yan Song
103
149
0
21 Dec 2021
ACGNet: Action Complement Graph Network for Weakly-supervised Temporal Action Localization
Zichen Yang
Jie Qin
Di Huang
80
61
0
21 Dec 2021
Are Large-scale Datasets Necessary for Self-Supervised Pre-training?
Alaaeldin El-Nouby
Gautier Izacard
Hugo Touvron
Ivan Laptev
Hervé Jégou
Edouard Grave
SSL
118
152
0
20 Dec 2021
LocFormer: Enabling Transformers to Perform Temporal Moment Localization on Long Untrimmed Videos With a Feature Sampling Approach
Cristian Rodriguez-Opazo
Edison Marrese-Taylor
Basura Fernando
Hiroya Takamura
Qi Wu
ViT
53
3
0
19 Dec 2021
Precondition and Effect Reasoning for Action Recognition
Hongsang Yoo
Haopeng Li
Qiuhong Ke
Liangchen Liu
Rui Zhang
CML
94
4
0
19 Dec 2021
Tell me what you see: A zero-shot action recognition method based on natural language descriptions
Valter Estevam
Rayson Laroca
David Menotti
Hélio Pedrini
88
13
0
18 Dec 2021
Adversarial Memory Networks for Action Prediction
Zhiqiang Tao
Yue Bai
Handong Zhao
Sheng Li
Yuanyuan Kong
Y. Fu
GAN
30
2
0
18 Dec 2021
Cross-Model Pseudo-Labeling for Semi-Supervised Action Recognition
Yinghao Xu
Fangyun Wei
Xiao Sun
Ceyuan Yang
Yujun Shen
Bo Dai
Bolei Zhou
Stephen Lin
VLM
64
56
0
17 Dec 2021
Distillation of Human-Object Interaction Contexts for Action Recognition
Muna Almushyti
Frederick W. Li
96
3
0
17 Dec 2021
Masked Feature Prediction for Self-Supervised Visual Pre-Training
Chen Wei
Haoqi Fan
Saining Xie
Chaoxia Wu
Alan Yuille
Christoph Feichtenhofer
ViT
217
677
0
16 Dec 2021
Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition
Benjia Zhou
Pichao Wang
Jun Wan
Yanyan Liang
Fan Wang
Du Zhang
Zhen Lei
Hao Li
Rong Jin
105
31
0
16 Dec 2021
Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation
Yujia Zhang
L. Po
Xuyuan Xu
Mengyang Liu
Yexin Wang
Weifeng Ou
Yuzhi Zhao
Weikang Yu
SSL
AI4TS
85
17
0
16 Dec 2021
Spatio-Temporal CNN baseline method for the Sports Video Task of MediaEval 2021 benchmark
Pierre-Etienne Martin
50
7
0
16 Dec 2021
Two Stream Network for Stroke Detection in Table Tennis
Anam Zahra
Pierre-Etienne Martin
47
4
0
16 Dec 2021
Sports Video: Fine-Grained Action Detection and Classification of Table Tennis Strokes from Videos for MediaEval 2021
Pierre-Etienne Martin
J. Calandre
Boris Mansencal
J. Benois-Pineau
Renaud Péteri
L. Mascarilla
J. Morlier
AI4TS
61
7
0
16 Dec 2021
Rethinking Nearest Neighbors for Visual Classification
Menglin Jia
Bor-Chun Chen
Zuxuan Wu
Claire Cardie
Serge Belongie
Ser-Nam Lim
SSL
94
10
0
15 Dec 2021
Dense Video Captioning Using Unsupervised Semantic Information
Valter Estevam
Rayson Laroca
Hélio Pedrini
David Menotti
100
10
0
15 Dec 2021
Vision Transformer Based Video Hashing Retrieval for Tracing the Source of Fake Videos
Pengfei Pei
Xianfeng Zhao
Yun Cao
Jinchuan Li
Xiaowei Yi
ViT
65
8
0
15 Dec 2021
Temporal Action Proposal Generation with Background Constraint
Haosen Yang
Wenhao Wu
Lining Wang
Sheng Jin
Boyang Xia
Huanjin Yao
Hujie Huang
137
28
0
15 Dec 2021
Temporal Shuffling for Defending Deep Action Recognition Models against Adversarial Attacks
Ian Ryu
Huan Zhang
Jun-Ho Choi
Cho-Jui Hsieh
Jong-Seok Lee
AAML
81
5
0
15 Dec 2021
Temporal Transformer Networks with Self-Supervision for Action Recognition
Yongkang Zhang
Jun Li
Guoming Wu
Hanjie Zhang
Zhiping Shi
Zhaoxun Liu
Zizhang Wu
ViT
68
6
0
14 Dec 2021
A real-time spatiotemporal AI model analyzes skill in open surgical videos
E. Goodman
Krishna K. Patel
Yilun Zhang
William Locke
C. Kennedy
...
Maren Downing
Hechang Chen
Jevin Z. Clark
G. Brat
Serena Yeung
67
21
0
14 Dec 2021
Co-training Transformer with Videos and Images Improves Action Recognition
Bowen Zhang
Jiahui Yu
Christopher Fifty
Wei Han
Andrew M. Dai
Ruoming Pang
Fei Sha
ViT
85
54
0
14 Dec 2021
SVIP: Sequence VerIfication for Procedures in Videos
Yichen Qian
Weixin Luo
Dongze Lian
Xu Tang
P. Zhao
Shenghua Gao
ViT
119
18
0
13 Dec 2021
Video as Conditional Graph Hierarchy for Multi-Granular Question Answering
Junbin Xiao
Angela Yao
Zhiyuan Liu
Yicong Li
Wei Ji
Tat-Seng Chua
99
114
0
12 Dec 2021
COMPOSER: Compositional Reasoning of Group Activity in Videos with Keypoint-Only Modality
Honglu Zhou
Asim Kadav
Aviv Shamsian
Shijie Geng
Farley Lai
Long Zhao
Tingxi Liu
Mubbasir Kapadia
H. Graf
71
24
0
11 Dec 2021
Self-supervised Spatiotemporal Representation Learning by Exploiting Video Continuity
Hanwen Liang
N. Quader
Zhixiang Chi
Lizhe Chen
Peng Dai
Juwei Lu
Yang Wang
SSL
AI4TS
90
32
0
11 Dec 2021
Cross-Modal Transferable Adversarial Attacks from Images to Videos
Zhipeng Wei
Jingjing Chen
Zuxuan Wu
Yu-Gang Jiang
AAML
97
42
0
10 Dec 2021
Rethinking the Two-Stage Framework for Grounded Situation Recognition
Meng Wei
Long Chen
Wei Ji
Xiaoyu Yue
Tat-Seng Chua
91
31
0
10 Dec 2021
Contextualized Spatio-Temporal Contrastive Learning with Self-Supervision
Liangzhe Yuan
Rui Qian
Huayu Chen
Boqing Gong
Florian Schroff
Ming-Hsuan Yang
Hartwig Adam
Ting Liu
AI4TS
105
16
0
09 Dec 2021
Spatio-temporal Relation Modeling for Few-shot Action Recognition
Anirudh Thatipelli
Sanath Narayan
Salman Khan
Rao Muhammad Anwer
Fahad Shahbaz Khan
Guohao Li
ViT
83
92
0
09 Dec 2021
Progressive Attention on Multi-Level Dense Difference Maps for Generic Event Boundary Detection
Jiaqi Tang
Zhaoyang Liu
Chao Qian
Wayne Wu
Limin Wang
103
18
0
09 Dec 2021
Auto-X3D: Ultra-Efficient Video Understanding via Finer-Grained Neural Architecture Search
Yi Ding
Xinyu Gong
Junru Wu
Humphrey Shi
Zhicheng Yan
Zhangyang Wang
VGen
90
1
0
09 Dec 2021
DualFormer: Local-Global Stratified Transformer for Efficient Video Recognition
Yuxuan Liang
Pan Zhou
Roger Zimmermann
Shuicheng Yan
ViT
84
21
0
09 Dec 2021