ResearchTrend.AI
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

22 May 2017
João Carreira
Andrew Zisserman

Papers citing "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"

50 / 3,647 papers shown
Learning by Aligning Videos in Time
S. Haresh
Sateesh Kumar
Huseyin Coskun
S. N. Syed
Andrey Konin
M. Zia
Quoc-Huy Tran
AI4TS
99
64
0
31 Mar 2021
Dogfight: Detecting Drones from Drones Videos
M. W. Ashraf
Waqas Sultani
M. Shah
69
58
0
31 Mar 2021
Embracing Uncertainty: Decoupling and De-bias for Robust Temporal Grounding
Hao Zhou
Chongyang Zhang
Yan Luo
Yanjun Chen
Chuanping Hu
77
52
0
31 Mar 2021
Learning Representational Invariances for Data-Efficient Action Recognition
Yuliang Zou
Jinwoo Choi
Qitong Wang
Jia-Bin Huang
108
41
0
30 Mar 2021
Broaden Your Views for Self-Supervised Video Learning
Adrià Recasens
Pauline Luc
Jean-Baptiste Alayrac
Luyu Wang
Ross Hemsley
...
Florent Altché
M. Valko
Jean-Bastien Grill
Aaron van den Oord
Andrew Zisserman
SSL
AI4TS
137
128
0
30 Mar 2021
Recognizing Actions in Videos from Unseen Viewpoints
A. Piergiovanni
Michael S. Ryoo
63
25
0
30 Mar 2021
Read and Attend: Temporal Localisation in Sign Language Videos
Gül Varol
Liliane Momeni
Samuel Albanie
Triantafyllos Afouras
Andrew Zisserman
SLR
84
41
0
30 Mar 2021
Face Forensics in the Wild
Tianfei Zhou
Wenguan Wang
Zhiyuan Liang
Jianbing Shen
CVBM
84
122
0
30 Mar 2021
Augmented Transformer with Adaptive Graph for Temporal Action Proposal Generation
Shuning Chang
Pichao Wang
F. Wang
Hao Li
Jiashi Feng
ViT
86
42
0
30 Mar 2021
PLAN-B: Predicting Likely Alternative Next Best Sequences for Action Prediction
D. Scarafoni
Irfan Essa
Thomas Ploetz
27
1
0
29 Mar 2021
ViViT: A Video Vision Transformer
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
242
2,178
0
29 Mar 2021
Memory Enhanced Embedding Learning for Cross-Modal Video-Text Retrieval
Rui Zhao
Kecheng Zheng
Zhengjun Zha
Hongtao Xie
Jiebo Luo
47
3
0
29 Mar 2021
Unified Graph Structured Models for Video Understanding
Anurag Arnab
Chen Sun
Cordelia Schmid
125
46
0
29 Mar 2021
Busy-Quiet Video Disentangling for Video Classification
Guoxi Huang
A. Bors
58
7
0
29 Mar 2021
SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events
Li Xu
He Huang
Jun Liu
ViT
LRM
114
88
0
29 Mar 2021
Automated freezing of gait assessment with marker-based motion capture and multi-stage spatial-temporal graph convolutional neural networks
Benjamin Filtjens
Pieter Ginis
A. Nieuwboer
P. Slaets
Bart Vanrumste
30
21
0
29 Mar 2021
No frame left behind: Full Video Action Recognition
X. Liu
S. Pintea
Fatemeh Karimi Nejadasl
Olaf Booij
Jan van Gemert
90
41
0
29 Mar 2021
Low-Fidelity End-to-End Video Encoder Pre-training for Temporal Action Localization
Mengmeng Xu
Juan-Manuel Perez-Rua
Xiatian Zhu
Guohao Li
Brais Martinez
81
28
0
28 Mar 2021
A Comprehensive Review of the Video-to-Text Problem
Jesus Perez-Martin
B. Bustos
S. Guimarães
I. Sipiran
Jorge A. Pérez
Grethel Coello Said
71
17
0
27 Mar 2021
GPRAR: Graph Convolutional Network based Pose Reconstruction and Action Recognition for Human Trajectory Prediction
Manh Huynh
G. Alaghband
3DH
41
2
0
25 Mar 2021
Discriminative Semantic Transitive Consistency for Cross-Modal Learning
Kranti K. Parida
Gaurav Sharma
83
1
0
25 Mar 2021
An Image is Worth 16x16 Words, What is a Video Worth?
Gilad Sharir
Asaf Noy
Lihi Zelnik-Manor
ViT
104
125
0
25 Mar 2021
Structured Co-reference Graph Attention for Video-grounded Dialogue
Junyeong Kim
Sunjae Yoon
Dahyun Kim
Chang D. Yoo
68
26
0
24 Mar 2021
The Blessings of Unlabeled Background in Untrimmed Videos
Yuan Liu
Jingyuan Chen
Zhenfang Chen
Bing Deng
Jianqiang Huang
Hanwang Zhang
CML
96
44
0
24 Mar 2021
Temporal Context Aggregation Network for Temporal Action Proposal Refinement
Zhiwu Qing
Haisheng Su
Weihao Gan
Dongliang Wang
Wei Wu
Xiang Wang
Yu Qiao
Junjie Yan
Changxin Gao
Nong Sang
125
175
0
24 Mar 2021
Learning Salient Boundary Feature for Anchor-free Temporal Action Localization
Chuming Lin
C. Xu
Donghao Luo
Yabiao Wang
Ying Tai
Chengjie Wang
Jilin Li
Feiyue Huang
Yanwei Fu
91
256
0
24 Mar 2021
Learning Comprehensive Motion Representation for Action Recognition
Mingyu Wu
Boyuan Jiang
Donghao Luo
Junchi Yan
Yabiao Wang
Ying Tai
Chengjie Wang
Jilin Li
Feiyue Huang
Xiaokang Yang
58
12
0
23 Mar 2021
Measuring and modeling the motor system with machine learning
Sébastien B Hausmann
Alessandro Marin Vargas
Alexander Mathis
Mackenzie W. Mathis
98
51
0
22 Mar 2021
AdaSGN: Adapting Joint Number and Model Size for Efficient Skeleton-Based Action Recognition
Lei Shi
Yifan Zhang
Jian Cheng
Hanqing Lu
73
48
0
22 Mar 2021
Context-aware Biaffine Localizing Network for Temporal Sentence Grounding
Daizong Liu
Xiaoye Qu
Jianfeng Dong
Pan Zhou
Yu Cheng
Wei Wei
Zichuan Xu
Yulai Xie
88
145
0
22 Mar 2021
PGT: A Progressive Method for Training Models on Long Videos
Bo Pang
Gao Peng
Yizhuo Li
Cewu Lu
VLM
42
12
0
21 Mar 2021
Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation
M. Sarfraz
Naila Murray
Vivek Sharma
Ali Diba
Luc Van Gool
Rainer Stiefelhagen
138
72
0
20 Mar 2021
Efficient Spatialtemporal Context Modeling for Action Recognition
Congqi Cao
Yue Lu
Yifan Zhang
Dengyang Jiang
Yanning Zhang
90
4
0
20 Mar 2021
TDIOT: Target-driven Inference for Deep Video Object Tracking
Filiz Gurkan
L. Cerkezi
Ozgun Cirakman
Bilge Günsel
VOT
95
16
0
19 Mar 2021
Hopper: Multi-hop Transformer for Spatiotemporal Reasoning
Honglu Zhou
Asim Kadav
Farley Lai
Alexandru Niculescu-Mizil
Martin Renqiang Min
Mubbasir Kapadia
H. Graf
LRM
89
18
0
19 Mar 2021
CLTA: Contents and Length-based Temporal Attention for Few-shot Action Recognition
Yang Bo
Yangdi Lu
Wenbo He
VLM
97
0
0
18 Mar 2021
Computer Vision Aided URLL Communications: Proactive Service Identification and Coexistence
Muhammad Alrabeiah
Umut Demirhan
Andrew Hredzak
Ahmed Alkhateeb
24
4
0
18 Mar 2021
Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning
Mandela Patrick
Yuki M. Asano
Bernie Huang
Ishan Misra
Florian Metze
Joao Henriques
Andrea Vedaldi
AI4TS
100
35
0
18 Mar 2021
Decoupled Spatial Temporal Graphs for Generic Visual Grounding
Qi Feng
Yunchao Wei
Mingming Cheng
Yi Yang
64
5
0
18 Mar 2021
Enhancing Transformer for Video Understanding Using Gated Multi-Level Attention and Temporal Adversarial Training
Saurabh Sahu
Palash Goyal
ViT
42
2
0
18 Mar 2021
NAS-TC: Neural Architecture Search on Temporal Convolutions for Complex Action Recognition
Pengzhen Ren
Gang Xiao
Xiaojun Chang
Yun Xiao
Zhihui Li
Xiaojiang Chen
ViT
79
4
0
17 Mar 2021
Skeleton Aware Multi-modal Sign Language Recognition
Songyao Jiang
Bin Sun
Lichen Wang
Yue Bai
Kunpeng Li
Y. Fu
SLR
85
178
0
16 Mar 2021
Boundary Proposal Network for Two-Stage Natural Language Video Localization
Shaoning Xiao
Long Chen
Songyang Zhang
Wei Ji
Jian Shao
Lu Ye
Jun Xiao
76
160
0
15 Mar 2021
ACTION-Net: Multipath Excitation for Action Recognition
Zhengwei Wang
Qi She
A. Smolic
3DPC
81
172
0
11 Mar 2021
Temporal Action Segmentation from Timestamp Supervision
Zhe Li
Yazan Abu Farha
Juergen Gall
86
82
0
11 Mar 2021
ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis
Yinan He
Bei Gan
Siyu Chen
Yichun Zhou
Guojun Yin
Luchuan Song
Lu Sheng
Jing Shao
Ziwei Liu
AAML
130
137
0
09 Mar 2021
Behavior-Driven Synthesis of Human Dynamics
A. Blattmann
Timo Milbich
Michael Dorkenwald
Bjorn Ommer
79
14
0
08 Mar 2021
VIPriors 1: Visual Inductive Priors for Data-Efficient Deep Learning Challenges
Robert-Jan Bruintjes
A. Lengyel
Marcos Baptista-Rios
O. Kayhan
Jan van Gemert
VLM
48
12
0
05 Mar 2021
Unsupervised Motion Representation Enhanced Network for Action Recognition
Xiaohang Yang
Lingtong Kong
Jie Yang
43
4
0
05 Mar 2021
Modeling Multi-Label Action Dependencies for Temporal Action Localization
Praveen Tirupattur
Kevin Duarte
Yogesh S Rawat
M. Shah
82
57
0
04 Mar 2021