1705.07750
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
22 May 2017
João Carreira
Andrew Zisserman
Papers citing
"Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"
50 / 3,647 papers shown
Title
Dynamic Curriculum Learning for Great Ape Detection in the Wild
Xinyu Yang
T. Burghardt
Majid Mirmehdi
95
14
0
30 Apr 2022
On Negative Sampling for Audio-Visual Contrastive Learning from Movies
Mahdi M. Kalayeh
Shervin Ardeshir
Lingyi Liu
Nagendra Kamath
Ashok Chandrashekar
SSL
70
3
0
29 Apr 2022
Tragedy Plus Time: Capturing Unintended Human Activities from Weakly-labeled Videos
Arnav Chakravarthy
Zhiyuan Fang
Yezhou Yang
77
2
0
28 Apr 2022
The Wisdom of Crowds: Temporal Progressive Attention for Early Action Prediction
Alexandros Stergiou
Dima Damen
AI4TS
EgoV
EDL
89
8
0
28 Apr 2022
Human-Centered Prior-Guided and Task-Dependent Multi-Task Representation Learning for Action Recognition Pre-Training
Guanhong Wang
Ke Lu
Yang Zhou
Zhanhao He
Gaoang Wang
SSL
76
3
0
27 Apr 2022
Contrastive Language-Action Pre-training for Temporal Localization
Mengmeng Xu
Erhan Gundogdu
Maksim
Guohao Li
M. Donoser
Loris Bazzani
102
27
0
26 Apr 2022
Adaptive Split-Fusion Transformer
Zixuan Su
Hao Zhang
Jingjing Chen
Lei Pang
Chong-Wah Ngo
Yu-Gang Jiang
ViT
103
8
0
26 Apr 2022
ClothFormer: Taming Video Virtual Try-on in All Module
Jianbin Jiang
Tan Wang
He Yan
Junhui Liu
88
28
0
26 Apr 2022
BronchoPose: an analysis of data and model configuration for vision-based bronchoscopy pose estimation
Juan Borrego-Carazo
Carles Sánchez
David Castells-Rufas
J. Carrabina
D. Gil
62
14
0
25 Apr 2022
Temporal Relevance Analysis for Video Action Models
Quanfu Fan
Donghyun Kim
Chun-Fu Chen
Stan Sclaroff
Kate Saenko
Sarah Adel Bargal
FAtt
68
0
0
25 Apr 2022
Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications
Han Cai
Ji Lin
Chengyue Wu
Zhijian Liu
Haotian Tang
Hanrui Wang
Ligeng Zhu
Song Han
116
117
0
25 Apr 2022
Estimation of Reliable Proposal Quality for Temporal Action Detection
Junshan Hu
Chaoxu Guo
Liansheng Zhuang
Biao Wang
T. Ge
Yuning Jiang
Houqiang Li
90
4
0
25 Apr 2022
A Spatio-Temporal Multilayer Perceptron for Gesture Recognition
Adrian Holzbock
Alexander Tsaregorodtsev
Youssef Dawoud
Klaus C. J. Dietmayer
Vasileios Belagiannis
87
12
0
25 Apr 2022
iCAR: Bridging Image Classification and Image-text Alignment for Visual Recognition
Yixuan Wei
Yue Cao
Zheng Zhang
Zhuliang Yao
Zhenda Xie
Han Hu
B. Guo
VLM
61
11
0
22 Apr 2022
Future Object Detection with Spatiotemporal Transformers
Adam Tonderski
Joakim Johnander
Christoffer Petersson
Kalle Åström
ViT
67
1
0
21 Apr 2022
THORN: Temporal Human-Object Relation Network for Action Recognition
Mohammed Guermal
Rui Dai
Francois Bremond
EgoV
74
3
0
20 Apr 2022
FenceNet: Fine-grained Footwork Recognition in Fencing
Kevin Zhu
Alexander Wong
J. McPhee
56
18
0
20 Apr 2022
Video Moment Retrieval from Text Queries via Single Frame Annotation
Ran Cui
Tianwen Qian
Pai Peng
E. Daskalaki
Jingjing Chen
Xiao-Wei Guo
Huyang Sun
Yu-Gang Jiang
106
37
0
20 Apr 2022
Attention in Attention: Modeling Context Correlation for Efficient Video Classification
Y. Hao
Shuo Wang
P. Cao
Xinjian Gao
Tong Xu
Jinmeng Wu
Xiangnan He
93
41
0
20 Apr 2022
Sound-Guided Semantic Video Generation
Seung Hyun Lee
Gyeongrok Oh
Wonmin Byeon
Chanyoung Kim
Wonjae Ryoo
Sang Ho Yoon
Hyunjun Cho
Jihyun Bae
Jinkyu Kim
Sangpil Kim
VGen
100
27
0
20 Apr 2022
A Survey of Video-based Action Quality Assessment
Shunli Wang
Dingkang Yang
Peng Zhai
Qing Yu
Tao Suo
Zhan Sun
Ka Li
Lihua Zhang
69
19
0
20 Apr 2022
ActAR: Actor-Driven Pose Embeddings for Video Action Recognition
Soufiane Lamghari
Guillaume-Alexandre Bilodeau
Nicolas Saunier
197
4
0
19 Apr 2022
Temporally Efficient Vision Transformer for Video Instance Segmentation
Shusheng Yang
Xinggang Wang
Yu Li
Yuxin Fang
Jiemin Fang
Wenyu Liu
Xun Zhao
Ying Shan
ViT
72
68
0
18 Apr 2022
OMG: Observe Multiple Granularities for Natural Language-Based Vehicle Retrieval
Yunhao Du
Binyu Zhang
Xiang Ruan
Fei Su
Zhicheng Zhao
Hong Chen
58
5
0
18 Apr 2022
Animal Kingdom: A Large and Diverse Dataset for Animal Behavior Understanding
Xun Long Ng
Kian Eng Ong
Qichen Zheng
Yun Ni
S. Yeo
Jing Liu
VGen
79
89
0
18 Apr 2022
End-to-end Dense Video Captioning as Sequence Generation
Wanrong Zhu
Bo Pang
Ashish V. Thapliyal
William Yang Wang
Radu Soricut
DiffM
61
34
0
18 Apr 2022
Video Action Detection: Analysing Limitations and Challenges
Rajat Modi
A. J. Rana
Akash Kumar
Praveen Tirupattur
Shruti Vyas
Yogesh S Rawat
M. Shah
105
12
0
17 Apr 2022
Attention Mechanism based Cognition-level Scene Understanding
Xuejiao Tang
Tai Le Quy
LRM
86
0
0
17 Apr 2022
Clothes-Changing Person Re-identification with RGB Modality Only
Xinqian Gu
Hong Chang
Bingpeng Ma
Shutao Bai
Shiguang Shan
Xilin Chen
71
167
0
14 Apr 2022
3D Convolutional Networks for Action Recognition: Application to Sport Gesture Recognition
Pierre-Etienne Martin
J. Benois-Pineau
Renaud Péteri
A. Zemmari
J. Morlier
68
5
0
13 Apr 2022
Do You Really Mean That? Content Driven Audio-Visual Deepfake Dataset and Multimodal Method for Temporal Forgery Localization
Zhixi Cai
Kalin Stefanov
Abhinav Dhall
Munawar Hayat
79
3
0
13 Apr 2022
Calibrating Class Weights with Multi-Modal Information for Partial Video Domain Adaptation
Xiyu Wang
Yuecong Xu
K. Mao
Jianfei Yang
77
8
0
13 Apr 2022
Position-aware Location Regression Network for Temporal Video Grounding
Sunoh Kim
Kimin Yun
J. Choi
58
4
0
12 Apr 2022
SOS! Self-supervised Learning Over Sets Of Handled Objects In Egocentric Action Recognition
Victor Escorcia
Ricardo Guerrero
Xiatian Zhu
Brais Martínez
EgoV
72
9
0
10 Apr 2022
CholecTriplet2021: A benchmark challenge for surgical action triplet recognition
C. Nwoye
Deepak Alapatt
Tong Yu
Armine Vardazaryan
Fangfang Xia
...
Didier Mutter
Pietro Mascagni
B. Seeliger
Cristians Gonzalez
N. Padoy
62
53
0
10 Apr 2022
A Comparative Analysis of Decision-Level Fusion for Multimodal Driver Behaviour Understanding
Alina Roitberg
Kunyu Peng
Zdravko Marinov
C. Seibold
David Schneider
Rainer Stiefelhagen
110
19
0
10 Apr 2022
Is my Driver Observation Model Overconfident? Input-guided Calibration Networks for Reliable and Interpretable Confidence Estimates
Alina Roitberg
Kunyu Peng
David Schneider
Kailun Yang
Marios Koulakis
Manuel Martínez
Rainer Stiefelhagen
UQCV
68
9
0
10 Apr 2022
Learning Pixel-Level Distinctions for Video Highlight Detection
Fanyue Wei
Biao Wang
T. Ge
Yuning Jiang
Wen Li
Lixin Duan
49
20
0
10 Apr 2022
Self-Supervised Video Representation Learning with Motion-Contrastive Perception
Jin-Yuan Liu
Ying Cheng
Yuejie Zhang
Ruiwei Zhao
Rui Feng
SSL
72
1
0
10 Apr 2022
Multimodal Transformer for Nursing Activity Recognition
Momal Ijaz
Renato Diaz
Chong Chen
ViT
103
27
0
09 Apr 2022
Probabilistic Representations for Video Contrastive Learning
Jungin Park
Jiyoung Lee
Ig-Jae Kim
Kwanghoon Sohn
SSL
111
47
0
08 Apr 2022
Frequency Selective Augmentation for Video Representation Learning
Jinhyung Kim
Taeoh Kim
Minho Shim
Dongyoon Han
Dongyoon Wee
Junmo Kim
AI4TS
101
4
0
08 Apr 2022
FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality Assessment
Jinglin Xu
Yongming Rao
Xumin Yu
Guangyi Chen
Jie Zhou
Jiwen Lu
84
97
0
07 Apr 2022
Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer
Songwei Ge
Thomas Hayes
Harry Yang
Xiaoyue Yin
Guan Pang
David Jacobs
Jia-Bin Huang
Devi Parikh
ViT
191
223
0
07 Apr 2022
Video Diffusion Models
Jonathan Ho
Tim Salimans
Alexey A. Gritsenko
William Chan
Mohammad Norouzi
David J. Fleet
DiffM
VGen
433
1,650
0
07 Apr 2022
Continual Inference: A Library for Efficient Online Inference with Deep Neural Networks in PyTorch
Lukas Hedegaard
Alexandros Iosifidis
BDL
3DV
CLL
48
6
0
07 Apr 2022
Detection of Distracted Driver using Convolution Neural Network
Narayana Darapaneni
Jai Arora
MoniShankar Hazra
Naman Vig
Simrandeep Singh Gandhi
Saurabh Gupta
A. Paduri
22
8
0
07 Apr 2022
Hierarchical Self-supervised Representation Learning for Movie Understanding
Fanyi Xiao
Kaustav Kundu
Joseph Tighe
Davide Modolo
SSL
100
26
0
06 Apr 2022
Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency
Zhiwu Qing
Shiwei Zhang
Ziyuan Huang
Yi Tian Xu
Xiang Wang
Mingqian Tang
Changxin Gao
Rong Jin
Nong Sang
SSL
AI4TS
76
17
0
06 Apr 2022
Video Demoireing with Relation-Based Temporal Consistency
Peng Dai
Xin Yu
Lan Ma
Baoheng Zhang
Jia Li
Wenbo Li
Jiajun Shen
Xiaojuan Qi
86
28
0
06 Apr 2022