Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

22 May 2017
João Carreira
Andrew Zisserman

Papers citing "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"

50 / 3,647 papers shown
Title
Comparing Correspondences: Video Prediction with Correspondence-wise Losses
Daniel Geng
Max Hamilton
Andrew Owens
3DH
97
16
0
19 Apr 2021
Temporal Query Networks for Fine-grained Video Understanding
Chuhan Zhang
Ankush Gupta
Andrew Zisserman
119
87
0
19 Apr 2021
Agent-Centric Representations for Multi-Agent Reinforcement Learning
Wenling Shang
L. Espeholt
Anton Raichuk
Tim Salimans
EgoV
55
10
0
19 Apr 2021
BM-NAS: Bilevel Multimodal Neural Architecture Search
Yihang Yin
Siyu Huang
Xiang Zhang
84
27
0
19 Apr 2021
Camera Calibration and Player Localization in SoccerNet-v2 and Investigation of their Representations for Action Spotting
A. Cioppa
Adrien Deliège
Floriane Magera
Silvio Giancola
Olivier Barnich
Guohao Li
Marc Van Droogenbroeck
75
58
0
19 Apr 2021
Metadata Normalization
Mandy Lu
Qingyu Zhao
Jiequan Zhang
K. Pohl
L. Fei-Fei
Juan Carlos Niebles
Ehsan Adeli
70
20
0
19 Apr 2021
Higher Order Recurrent Space-Time Transformer for Video Action Prediction
Tsung-Ming Tai
G. Fiameni
Cheng-Kuang Lee
Oswald Lanz
66
9
0
17 Apr 2021
TEACHTEXT: CrossModal Generalized Distillation for Text-Video Retrieval
Ioana Croitoru
Simion-Vlad Bogolin
Marius Leordeanu
Hailin Jin
Andrew Zisserman
Samuel Albanie
Yang Liu
VGen
67
125
0
16 Apr 2021
Temporally smooth online action detection using cycle-consistent future anticipation
Young Hwi Kim
Seonghyeon Nam
Seon Joo Kim
OffRL
72
30
0
16 Apr 2021
VGNMN: Video-grounded Neural Module Network to Video-Grounded Language Tasks
Hung Le
Nancy F. Chen
Guosheng Lin
MLLM
83
19
0
16 Apr 2021
Action Segmentation with Mixed Temporal Domain Adaptation
Min-Hung Chen
Baopu Li
Yingze Bao
Ghassan AlRegib
120
30
0
15 Apr 2021
Weakly Supervised Video Anomaly Detection via Center-guided Discriminative Learning
Boyang Wan
Yuming Fang
Xue Xia
Jiajie Mei
58
135
0
15 Apr 2021
Adaptive Intermediate Representations for Video Understanding
Juhana Kangaspunta
A. Piergiovanni
Rico Jonschkowski
Michael S. Ryoo
A. Angelova
51
3
0
14 Apr 2021
Temporally-Aware Feature Pooling for Action Spotting in Soccer Broadcasts
Silvio Giancola
Guohao Li
65
45
0
14 Apr 2021
Revisiting Hierarchical Approach for Persistent Long-Term Video Prediction
Wonkwang Lee
Whie Jung
Han Zhang
Ting Chen
Jing Yu Koh
Thomas E. Huang
Hyungsuk Yoon
Honglak Lee
Seunghoon Hong
57
29
0
14 Apr 2021
ADNet: Temporal Anomaly Detection in Surveillance Videos
H. Öztürk
Ahmet Burak Can
130
15
0
14 Apr 2021
Towards Extremely Compact RNNs for Video Recognition with Fully Decomposed Hierarchical Tucker Structure
Miao Yin
Siyu Liao
Xiao-Yang Liu
Xiaodong Wang
Bo Yuan
AI4TS
87
31
0
12 Apr 2021
Tensor Processing Primitives: A Programming Abstraction for Efficiency and Portability in Deep Learning & HPC Workloads
E. Georganas
Dhiraj D. Kalamkar
Sasikanth Avancha
Menachem Adelman
Deepti Aggarwal
...
Ramanarayan Mohanty
Hans Pabst
Brian Retford
Barukh Ziv
A. Heinecke
114
18
0
12 Apr 2021
Object Priors for Classifying and Localizing Unseen Actions
Pascal Mettes
William Thong
Cees G. M. Snoek
85
21
0
10 Apr 2021
Unidentified Video Objects: A Benchmark for Dense, Open-World Segmentation
Weiyao Wang
Matt Feiszli
Heng Wang
Du Tran
VOS
89
127
0
10 Apr 2021
Video-aided Unsupervised Grammar Induction
Songyang Zhang
Linfeng Song
Lifeng Jin
Kun Xu
Dong Yu
Jiebo Luo
63
27
0
09 Apr 2021
FIBER: Fill-in-the-Blanks as a Challenging Video Understanding Evaluation Framework
Santiago Castro
Ruoyao Wang
Pingxuan Huang
Ian Stewart
Oana Ignat
Nan Liu
Jonathan C. Stroud
Rada Mihalcea
AIMat
91
11
0
09 Apr 2021
TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild
Vida Adeli
Mahsa Ehsanpour
Ian Reid
Juan Carlos Niebles
Silvio Savarese
Ehsan Adeli
Hamid Rezatofighi
76
61
0
08 Apr 2021
Few-Shot Action Recognition with Compromised Metric via Optimal Transport
Su Lu
Han-Jia Ye
De-Chuan Zhan
90
18
0
08 Apr 2021
Progressive Temporal Feature Alignment Network for Video Inpainting
Xueyan Zou
Linjie Yang
Ding Liu
Yong Jae Lee
84
57
0
08 Apr 2021
Self-Supervised Learning for Semi-Supervised Temporal Action Proposal
Xiang Wang
Shiwei Zhang
Zhiwu Qing
Yuanjie Shao
Changxin Gao
Nong Sang
74
68
0
07 Apr 2021
The SARAS Endoscopic Surgeon Action Detection (ESAD) dataset: Challenges and methods
V. Bawa
Gurkirt Singh
Francis KapingA
I. Skarga-Bandurova
Elettra Oleari
...
Li Li
Armando Stabile
Francesco Setti
R. Muradore
Fabio Cuzzolin
61
41
0
07 Apr 2021
ACM-Net: Action Context Modeling Network for Weakly-Supervised Temporal Action Localization
Sanqing Qu
Guang Chen
Zhijun Li
Lijun Zhang
Fan Lu
Alois C. Knoll
102
55
0
07 Apr 2021
The Multi-Agent Behavior Dataset: Mouse Dyadic Social Interactions
Jennifer J. Sun
Tomomi Karigo
Dipam Chakraborty
Sharada Mohanty
Benjamin Wild
...
Chen Chen
D. Anderson
Pietro Perona
Yisong Yue
Ann Kennedy
136
49
0
06 Apr 2021
Zeus: Efficiently Localizing Actions in Videos using Reinforcement Learning
Pramod Chunduri
J. Bang
Yao Lu
Joy Arulraj
54
12
0
06 Apr 2021
Few-Shot Transformation of Common Actions into Time and Space
Pengwan Yang
Pascal Mettes
Cees G. M. Snoek
VLM
ViT
53
10
0
06 Apr 2021
Adaptive Mutual Supervision for Weakly-Supervised Temporal Action Localization
Chen Ju
Peisen Zhao
Siheng Chen
Ya Zhang
Xiaoyun Zhang
Qi Tian
WSOL
80
20
0
06 Apr 2021
Can audio-visual integration strengthen robustness under multimodal attacks?
Yapeng Tian
Chenliang Xu
AAML
107
39
0
05 Apr 2021
MIST: Multiple Instance Self-Training Framework for Video Anomaly Detection
Jianfeng Feng
Fa-Ting Hong
Weishi Zheng
114
251
0
04 Apr 2021
Deepfake Detection Scheme Based on Vision Transformer and Distillation
Young-Jin Heo
Y. Choi
Young-Woon Lee
Byung-Gyu Kim
ViT
74
59
0
03 Apr 2021
Beyond Short Clips: End-to-End Video-Level Learning with Collaborative Memories
Xitong Yang
Haoqi Fan
Lorenzo Torresani
L. Davis
Heng Wang
VLM
84
21
0
02 Apr 2021
On the Pitfalls of Learning with Limited Data: A Facial Expression Recognition Case Study
Miguel Rodríguez Santander
Juan Felipe Hernandez Albarracin
Adín Ramirez Rivera
68
4
0
02 Apr 2021
M3L: Language-based Video Editing via Multi-Modal Multi-Level Transformers
Tsu-Jui Fu
Xinze Wang
Scott T. Grafton
Miguel P. Eckstein
Wenjie Wang
122
9
0
02 Apr 2021
Visual Semantic Role Labeling for Video Understanding
Arka Sadhu
Tanmay Gupta
Mark Yatskar
Ram Nevatia
Aniruddha Kembhavi
VLM
97
71
0
02 Apr 2021
UAV-Human: A Large Benchmark for Human Behavior Understanding with Unmanned Aerial Vehicles
Tianjiao Li
Jun Liu
Wei Emma Zhang
Yun Ni
Wenqian Wang
Zhiheng Li
AI4TS
109
192
0
02 Apr 2021
Self-supervised Video Representation Learning by Context and Motion Decoupling
Lianghua Huang
Yu Liu
Bin Wang
Pan Pan
Yinghui Xu
Rong Jin
SSL
114
51
0
02 Apr 2021
Memorability: An image-computable measure of information utility
Zoya Bylinskii
L. Goetschalckx
Anelise Newman
A. Oliva
HAI
40
19
0
01 Apr 2021
Multiview Pseudo-Labeling for Semi-supervised Learning from Video
Bo Xiong
Haoqi Fan
Kristen Grauman
Christoph Feichtenhofer
SSL
70
51
0
01 Apr 2021
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval
Max Bain
Arsha Nagrani
Gül Varol
Andrew Zisserman
VGen
236
1,194
0
01 Apr 2021
Motion Guided Attention Fusion to Recognize Interactions from Videos
Tae Soo Kim
Jonathan D. Jones
Gregory Hager
43
15
0
01 Apr 2021
CUPID: Adaptive Curation of Pre-training Data for Video-and-Language Representation Learning
Luowei Zhou
Jingjing Liu
Yu Cheng
Zhe Gan
Lei Zhang
75
7
0
01 Apr 2021
Self-supervised Motion Learning from Static Images
Ziyuan Huang
Shiwei Zhang
Jianwen Jiang
Mingqian Tang
Rong Jin
M. Ang
SSL
59
29
0
01 Apr 2021
A Survey on Natural Language Video Localization
Xinfang Liu
Xiushan Nie
Zhifang Tan
Jie Guo
Yilong Yin
123
7
0
01 Apr 2021
Adaptive Configuration of In Situ Lossy Compression for Cosmology Simulations via Fine-Grained Rate-Quality Modeling
Sian Jin
Jesus Pulido
Pascal Grosset
Jiannan Tian
Dingwen Tao
J. Ahrens
78
23
0
01 Apr 2021
Rethinking Self-supervised Correspondence Learning: A Video Frame-level Similarity Perspective
Jiarui Xu
Xiaolong Wang
VOS
194
95
0
31 Mar 2021