1705.07750
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
22 May 2017
João Carreira
Andrew Zisserman
Papers citing
"Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"
50 / 1,495 papers shown
Title
How Do You Do It? Fine-Grained Action Understanding with Pseudo-Adverbs
Hazel Doughty
Cees G. M. Snoek
45
19
0
23 Mar 2022
Enabling faster and more reliable sonographic assessment of gestational age through machine learning
Chace Lee
Angelica Willis
Christina W. Chen
M. Sieniek
Akib A Uddin
...
Rory Pilgrim
Katherine Chou
Daniel Tse
S. Shetty
Ryan G. Gomes
31
0
0
22 Mar 2022
Look for the Change: Learning Object States and State-Modifying Actions from Untrimmed Web Videos
Tomáš Souček
Jean-Baptiste Alayrac
Antoine Miech
Ivan Laptev
Josef Sivic
28
32
0
22 Mar 2022
PressureVision: Estimating Hand Pressure from a Single RGB Image
Patrick Grady
Chengcheng Tang
Samarth Brahmbhatt
Christopher D. Twigg
Chengde Wan
James Hays
Charles C. Kemp
3DH
22
19
0
19 Mar 2022
DirecFormer: A Directed Attention in Transformer Approach to Robust Action Recognition
Thanh-Dat Truong
Quoc-Huy Bui
C. Duong
Han-Seok Seo
Son Lam Phung
Xin Li
Khoa Luu
ViT
44
49
0
19 Mar 2022
Multi-input segmentation of damaged brain in acute ischemic stroke patients using slow fusion with skip connection
Luca Tomasetti
M. Khanmohammadi
K. Engan
Liv Jorunn Høllesli
K. D. Kurz
20
5
0
18 Mar 2022
Local-Global Context Aware Transformer for Language-Guided Video Segmentation
Chen Liang
Wenguan Wang
Tianfei Zhou
Jiaxu Miao
Yawei Luo
Yi Yang
VOS
34
74
0
18 Mar 2022
Look Outside the Room: Synthesizing A Consistent Long-Term 3D Scene Video from A Single Image
Xuanchi Ren
Xiaolong Wang
VGen
26
58
0
17 Mar 2022
Gate-Shift-Fuse for Video Action Recognition
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz
30
22
0
16 Mar 2022
On the Pitfalls of Batch Normalization for End-to-End Video Learning: A Study on Surgical Workflow Analysis
Dominik Rivoir
Isabel Funke
Stefanie Speidel
24
17
0
15 Mar 2022
All in One: Exploring Unified Video-Language Pre-training
Alex Jinpeng Wang
Yixiao Ge
Rui Yan
Yuying Ge
Xudong Lin
Guanyu Cai
Jianping Wu
Ying Shan
Xiaohu Qie
Mike Zheng Shou
40
200
0
14 Mar 2022
RCL: Recurrent Continuous Localization for Temporal Action Detection
Qiang Wang
Yanhao Zhang
Yun Zheng
Pan Pan
ObjD
32
38
0
14 Mar 2022
Active Learning by Feature Mixing
Amin Parvaneh
Ehsan Abbasnejad
Damien Teney
Reza Haffari
Anton Van Den Hengel
Javen Qinfeng Shi
35
90
0
14 Mar 2022
WLASL-LEX: a Dataset for Recognising Phonological Properties in American Sign Language
Federico Tavella
Viktor Schlegel
Marta Romeo
Aphrodite Galata
Angelo Cangelosi
34
9
0
11 Mar 2022
TFCNet: Temporal Fully Connected Networks for Static Unbiased Temporal Reasoning
Shiwen Zhang
AI4TS
34
10
0
11 Mar 2022
A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach
Xiaohan Lan
Yitian Yuan
Xin Eric Wang
Long Chen
Zhi Wang
Lin Ma
Wenwu Zhu
CML
32
15
0
10 Mar 2022
End-to-End Semantic Video Transformer for Zero-Shot Action Recognition
Keval Doshi
Yasin Yılmaz
ViT
40
2
0
10 Mar 2022
OpenTAL: Towards Open Set Temporal Action Localization
Wentao Bao
Qi Yu
Yu Kong
EDL
42
26
0
10 Mar 2022
Human Gaze Guided Attention for Surgical Activity Recognition
Abdishakour Awale
Duygu Sarikaya
18
0
0
09 Mar 2022
A Simple Multi-Modality Transfer Learning Baseline for Sign Language Translation
Yutong Chen
Fangyun Wei
Xiao Sun
Zhirong Wu
Stephen Lin
SLR
35
99
0
08 Mar 2022
Gait Recognition with Mask-based Regularization
Chuanfu Shen
Beibei Lin
Shunli Zhang
George Q. Huang
Shiqi Yu
Xin-cen Yu
CVBM
51
17
0
08 Mar 2022
Live Laparoscopic Video Retrieval with Compressed Uncertainty
Tong Yu
Pietro Mascagni
J. Verde
J. Marescaux
Didier Mutter
N. Padoy
42
7
0
08 Mar 2022
Parallel Training of GRU Networks with a Multi-Grid Solver for Long Sequences
G. Moon
E. Cyr
25
5
0
07 Mar 2022
Learnable Irrelevant Modality Dropout for Multimodal Action Recognition on Modality-Specific Annotated Videos
Saghir Alfasly
Jian Lu
C. Xu
Yuru Zou
46
18
0
06 Mar 2022
Exploring Optical-Flow-Guided Motion and Detection-Based Appearance for Temporal Sentence Grounding
Daizong Liu
Xiang Fang
Wei Hu
Pan Zhou
32
37
0
06 Mar 2022
Weakly Supervised Temporal Action Localization via Representative Snippet Knowledge Propagation
Linjiang Huang
Liang Wang
Hongsheng Li
AI4TS
17
66
0
06 Mar 2022
Machine Learning Applications in Lung Cancer Diagnosis, Treatment and Prognosis
Yawei Li
Xin Wu
P. Yang
Guoqian Jiang
Yuan Luo
AI4CE
30
2
0
05 Mar 2022
Audio-visual speech separation based on joint feature representation with cross-modal attention
Jun Xiong
Peng Zhang
Lei Xie
Wei Huang
Yufei Zha
Yanni Zhang
30
3
0
05 Mar 2022
A Large-scale Comprehensive Dataset and Copy-overlap Aware Evaluation Protocol for Segment-level Video Copy Detection
Sifeng He
Xudong Yang
Chenhan Jiang
Gang Liang
Wei Zhang
...
Kaiming Huang
Yuan Cheng
Feng Qian
Xiaobo Zhang
Lei Yang
36
12
0
05 Mar 2022
HOI4D: A 4D Egocentric Dataset for Category-Level Human-Object Interaction
Yunze Liu
Yun-Hai Liu
Chen Jiang
Kangbo Lyu
Weikang Wan
Hao Shen
Bo-Hua Liang
Zhoujie Fu
He Wang
Li Yi
50
174
0
03 Mar 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabaleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
45
106
0
02 Mar 2022
Colar: Effective and Efficient Online Action Detection by Consulting Exemplars
Le Yang
Junwei Han
Dingwen Zhang
27
35
0
02 Mar 2022
TransDARC: Transformer-based Driver Activity Recognition with Latent Space Feature Calibration
Kunyu Peng
Alina Roitberg
Kailun Yang
Jiaming Zhang
Rainer Stiefelhagen
ViT
41
32
0
02 Mar 2022
Temporal Perceiver: A General Architecture for Arbitrary Boundary Detection
Jing Tan
Yuhong Wang
Gangshan Wu
Limin Wang
64
14
0
01 Mar 2022
Motion-driven Visual Tempo Learning for Video-based Action Recognition
Yuanzhong Liu
Junsong Yuan
Zhigang Tu
27
58
0
24 Feb 2022
Skeleton Sequence and RGB Frame Based Multi-Modality Feature Fusion Network for Action Recognition
Xiaoguang Zhu
Ye Zhu
Haoyu Wang
Honglin Wen
Yan Yan
Peilin Liu
35
25
0
23 Feb 2022
Predicting emotion from music videos: exploring the relative contribution of visual and auditory information to affective responses
Phoebe K. Chua
D. Makris
Dorien Herremans
Gemma Roig
Design
29
8
0
19 Feb 2022
VLP: A Survey on Vision-Language Pre-training
Feilong Chen
Duzhen Zhang
Minglun Han
Xiuyi Chen
Jing Shi
Shuang Xu
Bo Xu
VLM
82
213
0
18 Feb 2022
When Did It Happen? Duration-informed Temporal Localization of Narrated Actions in Vlogs
Oana Ignat
Santiago Castro
Yuhang Zhou
Jiajun Bao
Dandan Shan
Rada Mihalcea
18
3
0
16 Feb 2022
ActionFormer: Localizing Moments of Actions with Transformers
Chenlin Zhang
Jianxin Wu
Yin Li
ViT
31
333
0
16 Feb 2022
HAKE: A Knowledge Engine Foundation for Human Activity Understanding
Yong-Lu Li
Xinpeng Liu
Xiaoqian Wu
Yizhuo Li
Zuoyu Qiu
Liang Xu
Yue Xu
Haoshu Fang
Cewu Lu
32
38
0
14 Feb 2022
Adaptive Graph Convolutional Networks for Weakly Supervised Anomaly Detection in Videos
Congqi Cao
Xin Zhang
Shizhou Zhang
Peng Wang
Yanning Zhang
AI4TS
27
22
0
14 Feb 2022
Robust Deepfake On Unrestricted Media: Generation And Detection
Trung-Nghia Le
H. Nguyen
Junichi Yamagishi
Isao Echizen
41
7
0
13 Feb 2022
Joint-bone Fusion Graph Convolutional Network for Semi-supervised Skeleton Action Recognition
Zhigang Tu
Jiaxu Zhang
Hongyan Li
Yujin Chen
Junsong Yuan
40
77
0
08 Feb 2022
Towards To-a-T Spatio-Temporal Focus for Skeleton-Based Action Recognition
Lipeng Ke
Kuan-Chuan Peng
Siwei Lyu
3DPC
34
34
0
04 Feb 2022
Video Violence Recognition and Localization Using a Semi-Supervised Hard Attention Model
Hamid Reza Mohammadi
Ehsan Nazerfard
32
24
0
04 Feb 2022
Should I take a walk? Estimating Energy Expenditure from Video Data
Kunyu Peng
Alina Roitberg
Kailun Yang
Jiaming Zhang
Rainer Stiefelhagen
21
4
0
01 Feb 2022
Learning To Recognize Procedural Activities with Distant Supervision
Xudong Lin
Fabio Petroni
Gedas Bertasius
Marcus Rohrbach
Shih-Fu Chang
Lorenzo Torresani
37
83
0
26 Jan 2022
UniFormer: Unifying Convolution and Self-attention for Visual Recognition
Kunchang Li
Yali Wang
Junhao Zhang
Peng Gao
Guanglu Song
Yu Liu
Hongsheng Li
Yu Qiao
ViT
162
363
0
24 Jan 2022
vCLIMB: A Novel Video Class Incremental Learning Benchmark
Andrés Villa
Kumail Alhamoud
Juan Carlos León Alcázar
Fabian Caba Heilbron
Victor Escorcia
Guohao Li
CLL
79
32
0
23 Jan 2022