1705.07750
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
22 May 2017
João Carreira
Andrew Zisserman
Papers citing
"Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"
50 / 3,647 papers shown
Title
Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions
Hongwei Xue
Tiankai Hang
Yanhong Zeng
Yuchong Sun
Bei Liu
Huan Yang
Jianlong Fu
B. Guo
AI4TS
VLM
81
194
0
19 Nov 2021
M2A: Motion Aware Attention for Accurate Video Action Recognition
Brennan Gebotys
Alexander Wong
David A Clausi
54
3
0
18 Nov 2021
Evaluating Transformers for Lightweight Action Recognition
Raivo Koot
Markus Hennerbichler
Haiping Lu
ViT
82
8
0
18 Nov 2021
Learning to Align Sequential Actions in the Wild
Weizhe Liu
Bugra Tekin
Huseyin Coskun
Vibhav Vineet
Pascal Fua
Marc Pollefeys
80
24
0
17 Nov 2021
Language bias in Visual Question Answering: A Survey and Taxonomy
Desen Yuan
103
13
0
16 Nov 2021
Learnable Locality-Sensitive Hashing for Video Anomaly Detection
Yue Lu
Congqi Cao
Yanning Zhang
62
27
0
15 Nov 2021
Weakly-Supervised Dense Action Anticipation
Haotong Zhang
Fuhai Chen
Angela Yao
AI4TS
55
4
0
15 Nov 2021
Unsupervised Action Localization Crop in Video Retargeting for 3D ConvNets
Prithwish Jana
Swarnabja Bhaumik
Partha Pratim Mohanta
52
3
0
14 Nov 2021
Co-segmentation Inspired Attention Module for Video-based Computer Vision Tasks
Arulkumar Subramaniam
Jayesh Vaidya
Muhammed Ameen
Athira M. Nambiar
Anurag Mittal
78
7
0
14 Nov 2021
Dense Unsupervised Learning for Video Segmentation
Nikita Araslanov
Simone Schaub-Meyer
Stefan Roth
VOS
72
31
0
11 Nov 2021
Towards Domain-Independent and Real-Time Gesture Recognition Using mmWave Signal
Yadong Li
Dongheng Zhang
Jinbo Chen
Jinwei Wan
Dong Zhang
Yang Hu
Qibin Sun
Yan Chen
91
76
0
11 Nov 2021
Sparse Adversarial Video Attacks with Spatial Transformations
Ronghui Mu
Wenjie Ruan
Leandro Soriano Marcolino
Q. Ni
AAML
95
19
0
10 Nov 2021
Cross Attentional Audio-Visual Fusion for Dimensional Emotion Recognition
R Gnana Praveen
Eric Granger
P. Cardinal
CVBM
80
44
0
09 Nov 2021
Towards Debiasing Temporal Sentence Grounding in Video
Hao Zhang
Aixin Sun
Wei Jing
Qiufeng Wang
103
16
0
08 Nov 2021
Are we ready for a new paradigm shift? A Survey on Visual Deep MLP
Ruiyang Liu
Hai-Tao Zheng
Li Tao
Dun Liang
Haitao Zheng
217
100
0
07 Nov 2021
NarrationBot and InfoBot: A Hybrid System for Automated Video Description
Shasta Ihorn
Y. Siu
Aditya Bodi
Lothar D Narins
Jose M. Castanon
Yash Kant
Abhishek Das
Ilmi Yoon
Pooyan Fazli
48
3
0
07 Nov 2021
Will You Ever Become Popular? Learning to Predict Virality of Dance Clips
Jiahao Wang
Yunhong Wang
Nina Weng
Tianrui Chai
Annan Li
Faxi Zhang
Sansi Yu
61
13
0
06 Nov 2021
BBC-Oxford British Sign Language Dataset
Samuel Albanie
Gül Varol
Liliane Momeni
Hannah Bull
Triantafyllos Afouras
...
Neil Fox
B. Woll
Robert J. Cooper
A. McParland
Andrew Zisserman
SLR
59
33
0
05 Nov 2021
Sequence-to-Sequence Modeling for Action Identification at High Temporal Resolution
Aakash Kaku
Kangning Liu
A. Parnandi
H. Rajamohan
Kannan Venkataramanan
Anita Venkatesan
Audre Wirtanen
Natasha Pandit
Heidi M. Schambra
C. Fernandez‐Granda
38
5
0
03 Nov 2021
Revisiting spatio-temporal layouts for compositional action recognition
Gorjan Radevski
Marie-Francine Moens
Tinne Tuytelaars
104
26
0
02 Nov 2021
Relational Self-Attention: What's Missing in Attention for Video Understanding
Manjin Kim
Heeseung Kwon
Chunyu Wang
Suha Kwak
Minsu Cho
ViT
88
29
0
02 Nov 2021
A Critical Study on the Recent Deep Learning Based Semi-Supervised Video Anomaly Detection Methods
M. Baradaran
R. Bergevin
92
18
0
02 Nov 2021
Masking Modalities for Cross-modal Video Retrieval
Valentin Gabeur
Arsha Nagrani
Chen Sun
Alahari Karteek
Cordelia Schmid
88
30
0
01 Nov 2021
AdaPool: Exponential Adaptive Pooling for Information-Retaining Downsampling
Alexandros Stergiou
R. Poppe
102
83
0
01 Nov 2021
Hierarchical Deep Residual Reasoning for Temporal Moment Localization
Ziyang Ma
Xianjing Han
Xuemeng Song
Yiran Cui
Liqiang Nie
67
9
0
31 Oct 2021
WHU-NERCMS at TRECVID2021: Instance Search Task
Yanrui Niu
Jingya Yang
Ankang Lu
Baojin Huang
Yue Zhang
...
Shishi Wen
Dongshu Xu
Chao Liang
Zhongyuan Wang
Jun Chen
53
4
0
30 Oct 2021
Multi-Task and Multi-Modal Learning for RGB Dynamic Gesture Recognition
Dinghao Fan
Hengjie Lu
Shugong Xu
Shan Cao
67
16
0
29 Oct 2021
Attacking Video Recognition Models with Bullet-Screen Comments
Kai-xiang Chen
Zhipeng Wei
Jingjing Chen
Zuxuan Wu
Yu-Gang Jiang
AAML
90
23
0
29 Oct 2021
BiC-Net: Learning Efficient Spatio-Temporal Relation for Text-Video Retrieval
Ning Han
Jingjing Chen
Chuhao Shi
Yawen Zeng
Guangyi Xiao
Hao Chen
113
11
0
29 Oct 2021
ST-ABN: Visual Explanation Taking into Account Spatio-temporal Information for Video Recognition
Masahiro Mitsuhara
Tsubasa Hirakawa
Takayoshi Yamashita
H. Fujiyoshi
61
1
0
29 Oct 2021
Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing
Aadarsh Sahoo
Rutav Shah
Yikang Shen
Kate Saenko
Abir Das
80
65
0
28 Oct 2021
Skeleton-Based Mutually Assisted Interacted Object Localization and Human Action Recognition
Liang Xu
Cuiling Lan
Wenjun Zeng
Cewu Lu
64
25
0
28 Oct 2021
Temporal-attentive Covariance Pooling Networks for Video Recognition
Zilin Gao
Qilong Wang
Bingbing Zhang
Q. Hu
P. Li
116
25
0
27 Oct 2021
Image Comes Dancing with Collaborative Parsing-Flow Video Synthesis
Bowen Wu
Zhenyu Xie
Xiaodan Liang
Yubei Xiao
Haoye Dong
Liang Lin
3DH
75
6
0
27 Oct 2021
Zero-Shot Action Recognition from Diverse Object-Scene Compositions
Carlo Bretti
Pascal Mettes
OCL
59
9
0
26 Oct 2021
CTRN: Class-Temporal Relational Network for Action Detection
Rui Dai
Srijan Das
Francois Bremond
ViT
73
22
0
26 Oct 2021
Self-Denoising Neural Networks for Few Shot Learning
S. Schwarcz
Sai Saketh Rambhatla
Ramalingam Chellappa
81
1
0
26 Oct 2021
IIP-Transformer: Intra-Inter-Part Transformer for Skeleton-Based Action Recognition
Qingtian Wang
Jianlin Peng
Shuze Shi
Tingxi Liu
Jiabin He
Renliang Weng
ViT
79
37
0
26 Oct 2021
Using Motion History Images with 3D Convolutional Networks in Isolated Sign Language Recognition
Hamed Valizadegan
D. Caldwell
SLR
72
51
0
24 Oct 2021
A Closer Look at Few-Shot Video Classification: A New Baseline and Benchmark
Zhenxi Zhu
Limin Wang
Sheng Guo
Gangshan Wu
151
32
0
24 Oct 2021
Multimodal Learning using Optimal Transport for Sarcasm and Humor Detection
Shraman Pramanick
A. Roy
Vishal M. Patel
82
58
0
21 Oct 2021
LARNet: Latent Action Representation for Human Action Synthesis
Naman Biyani
A. J. Rana
Shruti Vyas
Yogesh S Rawat
68
4
0
21 Oct 2021
Few-Shot Temporal Action Localization with Query Adaptive Transformer
Sauradip Nag
Xiatian Zhu
Tao Xiang
73
19
0
20 Oct 2021
GTM: Gray Temporal Model for Video Recognition
Yanping Zhang
Yongxin Yu
52
0
0
20 Oct 2021
Domain Generalization through Audio-Visual Relative Norm Alignment in First Person Action Recognition
M. Planamente
Chiara Plizzari
Emanuele Alberti
Barbara Caputo
EgoV
123
35
0
19 Oct 2021
LSTC: Boosting Atomic Action Detection with Long-Short-Term Context
Yuxi Li
Boshen Zhang
Jian Li
Yabiao Wang
Weiyao Lin
Chengjie Wang
Jilin Li
Feiyue Huang
81
5
0
19 Oct 2021
Boosting the Transferability of Video Adversarial Examples via Temporal Translation
Zhipeng Wei
Jingjing Chen
Zuxuan Wu
Yu-Gang Jiang
AAML
112
34
0
18 Oct 2021
TEAM-Net: Multi-modal Learning for Video Action Recognition with Partial Decoding
Zhengwei Wang
Qi She
A. Smolic
74
9
0
17 Oct 2021
ASFormer: Transformer for Action Segmentation
Fangqiu Yi
Hongyu Wen
Tingting Jiang
ViT
139
177
0
16 Oct 2021
"Knights": First Place Submission for VIPriors21 Action Recognition Challenge at ICCV 2021
Ishan R. Dave
Naman Biyani
Brandon Clark
Rohit Gupta
Yogesh S Rawat
M. Shah
ViT
78
3
0
14 Oct 2021