Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

22 May 2017
João Carreira
Andrew Zisserman

Papers citing "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"

50 / 1,475 papers shown
Title
Local-Global Video-Text Interactions for Temporal Grounding
Jonghwan Mun
Minsu Cho
Bohyung Han
36
267
0
16 Apr 2020
Towards Anomaly Detection in Dashcam Videos
S. Haresh
Sateesh Kumar
M. Zia
Quoc-Huy Tran
27
30
0
11 Apr 2020
Would Mega-scale Datasets Further Enhance Spatiotemporal 3D CNNs?
Hirokatsu Kataoka
Tenga Wakamiya
Kensho Hara
Y. Satoh
3DPC
31
87
0
10 Apr 2020
X3D: Expanding Architectures for Efficient Video Recognition
Christoph Feichtenhofer
78
1,004
0
09 Apr 2020
Dense Regression Network for Video Grounding
Runhao Zeng
Haoming Xu
Wenbing Huang
Peihao Chen
Mingkui Tan
Chuang Gan
22
284
0
07 Apr 2020
TEA: Temporal Excitation and Aggregation for Action Recognition
Yan-Ran Li
Bin Ji
Xintian Shi
Jianguo Zhang
Bin Kang
Limin Wang
ViT
42
439
0
03 Apr 2020
Knowing What, Where and When to Look: Efficient Video Action Modeling with Attention
Juan-Manuel Perez-Rua
Brais Martínez
Xiatian Zhu
Antoine Toisoul
Victor Escorcia
Tao Xiang
48
19
0
02 Apr 2020
Temporal Accumulative Features for Sign Language Recognition
A. Kındıroglu
Ogulcan Özdemir
L. Akarun
SLR
16
18
0
02 Apr 2020
Learning Longterm Representations for Person Re-Identification Using Radio Signals
Lijie Fan
Tianhong Li
Rongyao Fang
Rumen Hristov
Yuan Yuan
Dina Katabi
27
86
0
02 Apr 2020
Weakly-Supervised Action Localization with Expectation-Maximization Multi-Instance Learning
Zhekun Luo
Devin Guillory
Baifeng Shi
Wei Ke
Fang Wan
Trevor Darrell
Huijuan Xu
19
119
0
31 Mar 2020
Explaining Motion Relevance for Activity Recognition in Video Deep Learning Models
Liam Hiley
Alun D. Preece
Y. Hicks
Supriyo Chakraborty
Prudhvi K. Gurram
Richard J. Tomsett
FAtt
25
15
0
31 Mar 2020
Long Short-Term Relation Networks for Video Action Detection
Dong Li
Ting Yao
Zhaofan Qiu
Houqiang Li
Tao Mei
12
22
0
31 Mar 2020
Spatio-Temporal Graph for Video Captioning with Knowledge Distillation
Boxiao Pan
Haoye Cai
De-An Huang
Kuan-Hui Lee
Adrien Gaidon
Ehsan Adeli
Juan Carlos Niebles
31
235
0
31 Mar 2020
Combining detection and tracking for human pose estimation in videos
Manchen Wang
Joseph Tighe
Davide Modolo
VOT
26
109
0
30 Mar 2020
Speech2Action: Cross-modal Supervision for Action Recognition
Arsha Nagrani
Chen Sun
David A. Ross
Rahul Sukthankar
Cordelia Schmid
Andrew Zisserman
33
54
0
30 Mar 2020
Learning Interactions and Relationships between Movie Characters
Anna Kukleva
Makarand Tapaswi
Ivan Laptev
41
51
0
29 Mar 2020
Learning a Weakly-Supervised Video Actor-Action Segmentation Model with a Wise Selection
Jie Chen
Zhiheng Li
Jiebo Luo
Chenliang Xu
27
13
0
29 Mar 2020
Omni-sourced Webly-supervised Learning for Video Recognition
Haodong Duan
Yue Zhao
Yuanjun Xiong
Wentao Liu
Dahua Lin
VLM
23
88
0
29 Mar 2020
Actor-Transformers for Group Activity Recognition
Kirill Gavrilyuk
Ryan Sanford
Mehrsan Javan
Cees G. M. Snoek
ViT
19
178
0
28 Mar 2020
Multi-Granularity Reference-Aided Attentive Feature Aggregation for Video-based Person Re-identification
Zhizheng Zhang
Cuiling Lan
Wenjun Zeng
Zhibo Chen
VOS
27
98
0
27 Mar 2020
Negative Margin Matters: Understanding Margin in Few-shot Classification
Bin Liu
Yue Cao
Yutong Lin
Qi Li
Zheng-Wei Zhang
Mingsheng Long
Han Hu
35
318
0
26 Mar 2020
Coronary Artery Segmentation in Angiographic Videos Using A 3D-2D CE-Net
Lu Wang
Dongxue Liang
Xiao-Lei Yin
Jing Qiu
Zhi-Yun Yang
Jun-Hui Xing
Jian-Zeng Dong
Zhao-Yuan Ma
MedIm
21
0
0
26 Mar 2020
Learning Object Permanence from Video
Aviv Shamsian
Ofri Kleinfeld
Amir Globerson
Gal Chechik
SSL
47
31
0
23 Mar 2020
Normalized and Geometry-Aware Self-Attention Network for Image Captioning
Longteng Guo
Jing Liu
Xinxin Zhu
Peng Yao
Shichen Lu
Hanqing Lu
ViT
135
189
0
19 Mar 2020
PIC: Permutation Invariant Convolution for Recognizing Long-range Activities
Noureldien Hussein
E. Gavves
A. Smeulders
VLM
31
13
0
18 Mar 2020
Multi-modal Dense Video Captioning
Vladimir E. Iashin
Esa Rahtu
22
165
0
17 Mar 2020
On Translation Invariance in CNNs: Convolutional Layers can Exploit Absolute Spatial Location
O. Kayhan
Jan van Gemert
213
233
0
16 Mar 2020
SF-Net: Single-Frame Supervision for Temporal Action Localization
Fan Ma
Linchao Zhu
Yi Yang
Shengxin Cindy Zha
Gourab Kundu
Matt Feiszli
Zheng Shou
18
140
0
15 Mar 2020
Interaction Graphs for Object Importance Estimation in On-road Driving Videos
Zehua Zhang
Ashish Tawari
Sujitha Martin
David J. Crandall
GNN
FAtt
17
23
0
12 Mar 2020
Visual Grounding in Video for Unsupervised Word Translation
Gunnar A. Sigurdsson
Jean-Baptiste Alayrac
Aida Nematzadeh
Lucas Smaira
Mateusz Malinowski
João Carreira
Phil Blunsom
Andrew Zisserman
VGen
27
49
0
11 Mar 2020
Accurate Temporal Action Proposal Generation with Relation-Aware Pyramid Network
Jialin Gao
Zhixiang Shi
Jiani Li
Guanshuo Wang
Yufeng Yuan
Shiming Ge
Xiaoping Zhou
13
73
0
09 Mar 2020
Better Captioning with Sequence-Level Exploration
Jia Chen
Qin Jin
37
12
0
08 Mar 2020
Transferring Cross-domain Knowledge for Video Sign Language Recognition
Dongxu Li
Xin Yu
Chenchen Xu
L. Petersson
Hongdong Li
SLR
36
104
0
08 Mar 2020
Noise Estimation Using Density Estimation for Self-Supervised Multimodal Learning
Elad Amrani
Rami Ben-Ari
Daniel Rotman
A. Bronstein
17
121
0
06 Mar 2020
Detecting Attended Visual Targets in Video
Eunji Chong
Yongxin Wang
Nataniel Ruiz
James M. Rehg
199
112
0
05 Mar 2020
Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications
Biagio Brattoli
Joseph Tighe
Fedor Zhdanov
Pietro Perona
Krzysztof Chalupka
VLM
137
127
0
03 Mar 2020
Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning
Shizhe Chen
Yida Zhao
Qin Jin
Qi Wu
48
310
0
01 Mar 2020
Joint 2D-3D Breast Cancer Classification
G. Liang
Xiaoqin Wang
Yu Zhang
Xin Xing
Hunter Blanton
Tawfiq Salem
Nathan Jacobs
31
39
0
27 Feb 2020
Evolving Losses for Unsupervised Video Representation Learning
A. Piergiovanni
A. Angelova
Michael S. Ryoo
SSL
27
138
0
26 Feb 2020
Fine-Grained Instance-Level Sketch-Based Video Retrieval
Peng Xu
Kun Liu
Tao Xiang
Timothy M. Hospedales
Zhanyu Ma
Jun Guo
Yi-Zhe Song
32
32
0
21 Feb 2020
Strength from Weakness: Fast Learning Using Weak Supervision
Joshua Robinson
Stefanie Jegelka
S. Sra
43
32
0
19 Feb 2020
Over-the-Air Adversarial Flickering Attacks against Video Recognition Networks
Roi Pony
I. Naeh
Shie Mannor
AAML
21
51
0
12 Feb 2020
An End-to-End Visual-Audio Attention Network for Emotion Recognition in User-Generated Videos
Sicheng Zhao
Yunsheng Ma
Yang Gu
Jufeng Yang
Tengfei Xing
Pengfei Xu
Runbo Hu
Hua Chai
Kurt Keutzer
19
98
0
12 Feb 2020
Dynamic Inference: A New Approach Toward Efficient Video Action Recognition
Wenhao Wu
Dongliang He
Xiao Tan
Shifeng Chen
Yi Yang
Shilei Wen
24
35
0
09 Feb 2020
Weakly-Supervised Multi-Person Action Recognition in 360° Videos
Junnan Li
Jianquan Liu
Yongkang Wong
Shoji Nishimura
Mohan S. Kankanhalli
31
13
0
09 Feb 2020
Solving Raven's Progressive Matrices with Neural Networks
Tao Zhuo
Mohan S. Kankanhalli
27
26
0
05 Feb 2020
Action Graphs: Weakly-supervised Action Localization with Graph Convolution Networks
M. Rashid
Hedvig Kjellström
Yong Jae Lee
WSOL
GNN
19
46
0
04 Feb 2020
Interpreting video features: a comparison of 3D convolutional networks and convolutional LSTM networks
Joonatan Mänttäri
Sofia Broomé
John Folkesson
Hedvig Kjellström
FAtt
27
27
0
02 Feb 2020
TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
Jie Lei
Licheng Yu
Tamara L. Berg
Joey Tianyi Zhou
119
277
0
24 Jan 2020
Audiovisual SlowFast Networks for Video Recognition
Fanyi Xiao
Yong Jae Lee
Kristen Grauman
Jitendra Malik
Christoph Feichtenhofer
197
207
0
23 Jan 2020