Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

22 May 2017
João Carreira
Andrew Zisserman

Papers citing "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"

50 / 3,645 papers shown
Title
Multi-Agent Reinforcement Learning Based Frame Sampling for Effective Untrimmed Video Recognition
Wenhao Wu
Dongliang He
Xiao Tan
Shifeng Chen
Shilei Wen
88
127
0
31 Jul 2019
Learning Question-Guided Video Representation for Multi-Turn Video Question Answering
Guan-Lin Chao
Abhinav Rastogi
Semih Yavuz
Dilek Z. Hakkani-Tür
Jindong Chen
Ian Lane
51
6
0
31 Jul 2019
Open Set Domain Adaptation for Image and Action Recognition
Pau Panareda Busto
Ahsan Iqbal
Juergen Gall
VLM
73
89
0
30 Jul 2019
Deep Multi-Kernel Convolutional LSTM Networks and an Attention-Based Mechanism for Videos
Sebastian Agethen
Winston H. Hsu
HAI
79
25
0
30 Jul 2019
Temporal Attentive Alignment for Large-Scale Video Domain Adaptation
Min-Hung Chen
Z. Kira
G. Al-Regib
Jaekwon Yoo
Ruxin Chen
Jian Zheng
TTA
AI4TS
101
180
0
30 Jul 2019
ChaLearn Looking at People: IsoGD and ConGD Large-scale RGB-D Gesture Recognition
Jun Wan
Chi Lin
Longyin Wen
Yunan Li
Qiguang Miao
Sergio Escalera
G. Anbarjafari
Isabelle M Guyon
G. Guo
Stan Z. Li
3DV
60
26
0
29 Jul 2019
Using 3D Convolutional Neural Networks to Learn Spatiotemporal Features for Automatic Surgical Gesture Recognition in Video
Isabel Funke
S. Bodenstedt
F. Oehme
F. Bechtolsheim
Jürgen Weitz
Stefanie Speidel
83
85
0
26 Jul 2019
Submission to ActivityNet Challenge 2019: Task B Spatio-temporal Action Localization
Chunfei Ma
Joonhyang Choi
Byeongwon Lee
Seungji Yang
26
0
0
25 Jul 2019
Zero-Shot Sign Language Recognition: Can Textual Data Uncover Sign Languages?
Yunus Can Bilge
Nazli Ikizler-Cinbis
R. G. Cinbis
SLR
72
29
0
24 Jul 2019
Motion-Aware Feature for Improved Video Anomaly Detection
Yi Zhu
Shawn D. Newsam
79
159
0
24 Jul 2019
Switchable Normalization for Learning-to-Normalize Deep Representation
Ping Luo
Ruimao Zhang
Jiamin Ren
Zhanglin Peng
Jingyu Li
129
74
0
22 Jul 2019
Domain-Specific Priors and Meta Learning for Few-Shot First-Person Action Recognition
Huseyin Coskun
Zeeshan Zia
Bugra Tekin
Federica Bogo
Nassir Navab
Federico Tombari
H. Sawhney
79
27
0
22 Jul 2019
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods
Aditya Mogadala
M. Kalimuthu
Dietrich Klakow
VLM
141
136
0
22 Jul 2019
Attention Filtering for Multi-person Spatiotemporal Action Detection on Deep Two-Stream CNN Architectures
João Antunes
Pedro Abreu
Alexandre Bernardino
A. Smailagic
D. Siewiorek
23
1
0
21 Jul 2019
An Efficient 3D CNN for Action/Object Segmentation in Video
Rui Hou
Chong Chen
Rahul Sukthankar
M. Shah
67
28
0
21 Jul 2019
Only Time Can Tell: Discovering Temporal Data for Temporal Modeling
Laura Sevilla-Lara
Shengxin Cindy Zha
Zhicheng Yan
Vedanuj Goswami
Matt Feiszli
Lorenzo Torresani
105
76
0
19 Jul 2019
Adversarial Video Generation on Complex Datasets
Aidan Clark
Jeff Donahue
Karen Simonyan
VGen
GAN
111
77
0
15 Jul 2019
A Short Note on the Kinetics-700 Human Action Dataset
João Carreira
Eric Noland
Chloe Hillier
Andrew Zisserman
93
458
0
15 Jul 2019
AVD: Adversarial Video Distillation
M. Tavakolian
Mohammad Sabokrou
Abdenour Hadid
VGen
58
6
0
12 Jul 2019
Activitynet 2019 Task 3: Exploring Contexts for Dense Captioning Events in Videos
Shizhe Chen
Yuqing Song
Yida Zhao
Qin Jin
Zhaoyang Zeng
Bei Liu
Jianlong Fu
Alexander G. Hauptmann
58
12
0
11 Jul 2019
Micro-expression Action Unit Detection with Spatio-temporal Adaptive Pooling
Yante Li
Xiaohua Huang
Guoying Zhao
70
3
0
11 Jul 2019
Two-stream Spatiotemporal Feature for Video QA Task
Chiwan Song
Woobin Im
Sung-eui Yoon
21
0
0
11 Jul 2019
Video Action Recognition Via Neural Architecture Searching
Wei Peng
Xiaopeng Hong
Guoying Zhao
99
36
0
10 Jul 2019
GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing
Jian Guo
He He
Tong He
Leonard Lausen
Mu Li
...
Hang Zhang
Zhi-Li Zhang
Zhongyue Zhang
Shuai Zheng
Yi Zhu
VLM
BDL
101
198
0
09 Jul 2019
Positional Normalization
Boyi Li
Felix Wu
Kilian Q. Weinberger
Serge J. Belongie
78
92
0
09 Jul 2019
Multimodal Transformer Networks for End-to-End Video-Grounded Dialogue Systems
Hung Le
Doyen Sahoo
Nancy F. Chen
Guosheng Lin
63
112
0
02 Jul 2019
Learnable Gated Temporal Shift Module for Deep Video Inpainting
Ya-Liang Chang
Zhe-Yu Liu
Kuan-Ying Lee
Winston H. Hsu
75
12
0
02 Jul 2019
INN: Inflated Neural Networks for IPMN Diagnosis
Rodney LaLonde
Irene Tanner
K. Nikiforaki
G. Papadakis
Pujan Kandel
C. Bolan
Michael B. Wallace
Ulas Bagci
41
12
0
30 Jun 2019
Loss Switching Fusion with Similarity Search for Video Classification
Lei Wang
D. Huynh
M. Mansour
73
24
0
27 Jun 2019
Few-Shot Video Classification via Temporal Alignment
Kaidi Cao
Jingwei Ji
Zhangjie Cao
C. Chang
Juan Carlos Niebles
AI4TS
93
242
0
27 Jun 2019
A Comparative Review of Recent Kinect-based Action Recognition Algorithms
Lei Wang
D. Huynh
Piotr Koniusz
61
214
0
24 Jun 2019
Baidu-UTS Submission to the EPIC-Kitchens Action Recognition Challenge 2019
Xiaohan Wang
Yu Wu
Linchao Zhu
Yi Yang
77
19
0
22 Jun 2019
Towards Real-Time Action Recognition on Mobile Devices Using Deep Models
Chen-Da Liu-Zhang
Xin-Xin Liu
Jianxin Wu
HAI
31
9
0
17 Jun 2019
Spatio-Temporal Fusion Networks for Action Recognition
Sangwoo Cho
H. Foroosh
52
12
0
17 Jun 2019
Delving into 3D Action Anticipation from Streaming Videos
Hongsong Wang
Jiashi Feng
85
4
0
15 Jun 2019
Hallucinating IDT Descriptors and I3D Optical Flow Features for Action Recognition with CNNs
Lei Wang
Piotr Koniusz
D. Huynh
32
7
0
13 Jun 2019
Learning Video Representations using Contrastive Bidirectional Transformer
Chen Sun
Fabien Baradel
Kevin Patrick Murphy
Cordelia Schmid
SSL
ViT
136
134
0
13 Jun 2019
Learning Spatio-Temporal Representation with Local and Global Diffusion
Zhaofan Qiu
Ting Yao
Chong-Wah Ngo
Xinmei Tian
Tao Mei
83
171
0
13 Jun 2019
Identifying Visible Actions in Lifestyle Vlogs
Oana Ignat
Laura Burdick
Jia Deng
Rada Mihalcea
55
14
0
10 Jun 2019
FASTER Recurrent Networks for Efficient Video Classification
Linchao Zhu
Laura Sevilla-Lara
Du Tran
Matt Feiszli
Yi Yang
Heng Wang
85
6
0
10 Jun 2019
The role of ego vision in view-invariant action recognition
Gaurvi Goyal
Nicoletta Noceti
Francesca Odone
A. Sciutti
EgoV
24
0
0
10 Jun 2019
UniDual: A Unified Model for Image and Video Understanding
Yufei Wang
Du Tran
Lorenzo Torresani
36
2
0
10 Jun 2019
An Attention-based Recurrent Convolutional Network for Vehicle Taillight Recognition
Kuan-Hui Lee
Takaaki Tagawa
Jia Pan
Adrien Gaidon
B. Douillard
ViT
32
15
0
09 Jun 2019
Video Modeling with Correlation Networks
Heng Wang
Du Tran
Lorenzo Torresani
Matt Feiszli
116
129
0
07 Jun 2019
Detecting the Starting Frame of Actions in Video
Iljung S. Kwak
Jian-Zhong Guo
Adam Hantman
D. Kriegman
K. Branson
23
8
0
07 Jun 2019
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips
Antoine Miech
Dimitri Zhukov
Jean-Baptiste Alayrac
Makarand Tapaswi
Ivan Laptev
Josef Sivic
VGen
130
1,212
0
07 Jun 2019
Recognizing American Sign Language Manual Signs from RGB-D Videos
Longlong Jing
Elahe Vahdani
Matt Huenerfauth
Yingli Tian
SLR
64
26
0
07 Jun 2019
Scaling Autoregressive Video Models
Dirk Weissenborn
Oscar Täckström
Jakob Uszkoreit
DiffM
VGen
124
204
0
06 Jun 2019
Weakly-Supervised Spatio-Temporally Grounding Natural Sentence in Video
Zhenfang Chen
Lin Ma
Wenhan Luo
Kwan-Yee K. Wong
98
103
0
06 Jun 2019
Detecting Kissing Scenes in a Database of Hollywood Films
Amir Ziai
20
2
0
05 Jun 2019