Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset


22 May 2017
João Carreira
Andrew Zisserman

Papers citing "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"

50 / 1,495 papers shown
How Do You Do It? Fine-Grained Action Understanding with Pseudo-Adverbs
Hazel Doughty
Cees G. M. Snoek
45
19
0
23 Mar 2022
Enabling faster and more reliable sonographic assessment of gestational age through machine learning
Chace Lee
Angelica Willis
Christina W. Chen
M. Sieniek
Akib A Uddin
...
Rory Pilgrim
Katherine Chou
Daniel Tse
S. Shetty
Ryan G. Gomes
31
0
0
22 Mar 2022
Look for the Change: Learning Object States and State-Modifying Actions from Untrimmed Web Videos
Tomáš Souček
Jean-Baptiste Alayrac
Antoine Miech
Ivan Laptev
Josef Sivic
28
32
0
22 Mar 2022
PressureVision: Estimating Hand Pressure from a Single RGB Image
Patrick Grady
Chengcheng Tang
Samarth Brahmbhatt
Christopher D. Twigg
Chengde Wan
James Hays
Charles C. Kemp
3DH
22
19
0
19 Mar 2022
DirecFormer: A Directed Attention in Transformer Approach to Robust Action Recognition
Thanh-Dat Truong
Quoc-Huy Bui
C. Duong
Han-Seok Seo
Son Lam Phung
Xin Li
Khoa Luu
ViT
44
49
0
19 Mar 2022
Multi-input segmentation of damaged brain in acute ischemic stroke patients using slow fusion with skip connection
Luca Tomasetti
M. Khanmohammadi
K. Engan
Liv Jorunn Høllesli
K. D. Kurz
20
5
0
18 Mar 2022
Local-Global Context Aware Transformer for Language-Guided Video Segmentation
Chen Liang
Wenguan Wang
Tianfei Zhou
Jiaxu Miao
Yawei Luo
Yi Yang
VOS
34
74
0
18 Mar 2022
Look Outside the Room: Synthesizing A Consistent Long-Term 3D Scene Video from A Single Image
Xuanchi Ren
Xiaolong Wang
VGen
26
58
0
17 Mar 2022
Gate-Shift-Fuse for Video Action Recognition
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz
30
22
0
16 Mar 2022
On the Pitfalls of Batch Normalization for End-to-End Video Learning: A Study on Surgical Workflow Analysis
Dominik Rivoir
Isabel Funke
Stefanie Speidel
24
17
0
15 Mar 2022
All in One: Exploring Unified Video-Language Pre-training
Alex Jinpeng Wang
Yixiao Ge
Rui Yan
Yuying Ge
Xudong Lin
Guanyu Cai
Jianping Wu
Ying Shan
Xiaohu Qie
Mike Zheng Shou
40
200
0
14 Mar 2022
RCL: Recurrent Continuous Localization for Temporal Action Detection
Qiang Wang
Yanhao Zhang
Yun Zheng
Pan Pan
ObjD
32
38
0
14 Mar 2022
Active Learning by Feature Mixing
Amin Parvaneh
Ehsan Abbasnejad
Damien Teney
Reza Haffari
Anton Van Den Hengel
Javen Qinfeng Shi
35
90
0
14 Mar 2022
WLASL-LEX: a Dataset for Recognising Phonological Properties in American Sign Language
Federico Tavella
Viktor Schlegel
Marta Romeo
Aphrodite Galata
Angelo Cangelosi
34
9
0
11 Mar 2022
TFCNet: Temporal Fully Connected Networks for Static Unbiased Temporal Reasoning
Shiwen Zhang
AI4TS
34
10
0
11 Mar 2022
A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach
Xiaohan Lan
Yitian Yuan
Xin Eric Wang
Long Chen
Zhi Wang
Lin Ma
Wenwu Zhu
CML
32
15
0
10 Mar 2022
End-to-End Semantic Video Transformer for Zero-Shot Action Recognition
Keval Doshi
Yasin Yılmaz
ViT
40
2
0
10 Mar 2022
OpenTAL: Towards Open Set Temporal Action Localization
Wentao Bao
Qi Yu
Yu Kong
EDL
42
26
0
10 Mar 2022
Human Gaze Guided Attention for Surgical Activity Recognition
Abdishakour Awale
Duygu Sarikaya
18
0
0
09 Mar 2022
A Simple Multi-Modality Transfer Learning Baseline for Sign Language Translation
Yutong Chen
Fangyun Wei
Xiao Sun
Zhirong Wu
Stephen Lin
SLR
35
99
0
08 Mar 2022
Gait Recognition with Mask-based Regularization
Chuanfu Shen
Beibei Lin
Shunli Zhang
George Q. Huang
Shiqi Yu
Xin-cen Yu
CVBM
51
17
0
08 Mar 2022
Live Laparoscopic Video Retrieval with Compressed Uncertainty
Tong Yu
Pietro Mascagni
J. Verde
J. Marescaux
Didier Mutter
N. Padoy
42
7
0
08 Mar 2022
Parallel Training of GRU Networks with a Multi-Grid Solver for Long Sequences
G. Moon
E. Cyr
25
5
0
07 Mar 2022
Learnable Irrelevant Modality Dropout for Multimodal Action Recognition on Modality-Specific Annotated Videos
Saghir Alfasly
Jian Lu
C. Xu
Yuru Zou
46
18
0
06 Mar 2022
Exploring Optical-Flow-Guided Motion and Detection-Based Appearance for Temporal Sentence Grounding
Daizong Liu
Xiang Fang
Wei Hu
Pan Zhou
32
37
0
06 Mar 2022
Weakly Supervised Temporal Action Localization via Representative Snippet Knowledge Propagation
Linjiang Huang
Liang Wang
Hongsheng Li
AI4TS
17
66
0
06 Mar 2022
Machine Learning Applications in Lung Cancer Diagnosis, Treatment and Prognosis
Yawei Li
Xin Wu
P. Yang
Guoqian Jiang
Yuan Luo
AI4CE
30
2
0
05 Mar 2022
Audio-visual speech separation based on joint feature representation with cross-modal attention
Jun Xiong
Peng Zhang
Lei Xie
Wei Huang
Yufei Zha
Yanni Zhang
30
3
0
05 Mar 2022
A Large-scale Comprehensive Dataset and Copy-overlap Aware Evaluation Protocol for Segment-level Video Copy Detection
Sifeng He
Xudong Yang
Chenhan Jiang
Gang Liang
Wei Zhang
...
Kaiming Huang
Yuan Cheng
Feng Qian
Xiaobo Zhang
Lei Yang
36
12
0
05 Mar 2022
HOI4D: A 4D Egocentric Dataset for Category-Level Human-Object Interaction
Yunze Liu
Yun-Hai Liu
Chen Jiang
Kangbo Lyu
Weikang Wan
Hao Shen
Bo-Hua Liang
Zhoujie Fu
He Wang
Li Yi
50
174
0
03 Mar 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabaleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
45
106
0
02 Mar 2022
Colar: Effective and Efficient Online Action Detection by Consulting Exemplars
Le Yang
Junwei Han
Dingwen Zhang
27
35
0
02 Mar 2022
TransDARC: Transformer-based Driver Activity Recognition with Latent Space Feature Calibration
Kunyu Peng
Alina Roitberg
Kailun Yang
Jiaming Zhang
Rainer Stiefelhagen
ViT
41
32
0
02 Mar 2022
Temporal Perceiver: A General Architecture for Arbitrary Boundary Detection
Jing Tan
Yuhong Wang
Gangshan Wu
Limin Wang
64
14
0
01 Mar 2022
Motion-driven Visual Tempo Learning for Video-based Action Recognition
Yuanzhong Liu
Junsong Yuan
Zhigang Tu
27
58
0
24 Feb 2022
Skeleton Sequence and RGB Frame Based Multi-Modality Feature Fusion Network for Action Recognition
Xiaoguang Zhu
Ye Zhu
Haoyu Wang
Honglin Wen
Yan Yan
Peilin Liu
35
25
0
23 Feb 2022
Predicting emotion from music videos: exploring the relative contribution of visual and auditory information to affective responses
Phoebe K. Chua
D. Makris
Dorien Herremans
Gemma Roig
Design
29
8
0
19 Feb 2022
VLP: A Survey on Vision-Language Pre-training
Feilong Chen
Duzhen Zhang
Minglun Han
Xiuyi Chen
Jing Shi
Shuang Xu
Bo Xu
VLM
82
213
0
18 Feb 2022
When Did It Happen? Duration-informed Temporal Localization of Narrated Actions in Vlogs
Oana Ignat
Santiago Castro
Yuhang Zhou
Jiajun Bao
Dandan Shan
Rada Mihalcea
18
3
0
16 Feb 2022
ActionFormer: Localizing Moments of Actions with Transformers
Chen-Lin Zhang
Jianxin Wu
Yin Li
ViT
31
333
0
16 Feb 2022
HAKE: A Knowledge Engine Foundation for Human Activity Understanding
Yong-Lu Li
Xinpeng Liu
Xiaoqian Wu
Yizhuo Li
Zuoyu Qiu
Liang Xu
Yue Xu
Haoshu Fang
Cewu Lu
32
38
0
14 Feb 2022
Adaptive Graph Convolutional Networks for Weakly Supervised Anomaly Detection in Videos
Congqi Cao
Xin Zhang
Shizhou Zhang
Peng Wang
Yanning Zhang
AI4TS
27
22
0
14 Feb 2022
Robust Deepfake On Unrestricted Media: Generation And Detection
Trung-Nghia Le
H. Nguyen
Junichi Yamagishi
Isao Echizen
41
7
0
13 Feb 2022
Joint-bone Fusion Graph Convolutional Network for Semi-supervised Skeleton Action Recognition
Zhigang Tu
Jiaxu Zhang
Hongyan Li
Yujin Chen
Junsong Yuan
40
77
0
08 Feb 2022
Towards To-a-T Spatio-Temporal Focus for Skeleton-Based Action Recognition
Lipeng Ke
Kuan-Chuan Peng
Siwei Lyu
3DPC
34
34
0
04 Feb 2022
Video Violence Recognition and Localization Using a Semi-Supervised Hard Attention Model
Hamid Reza Mohammadi
Ehsan Nazerfard
32
24
0
04 Feb 2022
Should I take a walk? Estimating Energy Expenditure from Video Data
Kunyu Peng
Alina Roitberg
Kailun Yang
Jiaming Zhang
Rainer Stiefelhagen
21
4
0
01 Feb 2022
Learning To Recognize Procedural Activities with Distant Supervision
Xudong Lin
Fabio Petroni
Gedas Bertasius
Marcus Rohrbach
Shih-Fu Chang
Lorenzo Torresani
37
83
0
26 Jan 2022
UniFormer: Unifying Convolution and Self-attention for Visual Recognition
Kunchang Li
Yali Wang
Junhao Zhang
Peng Gao
Guanglu Song
Yu Liu
Hongsheng Li
Yu Qiao
ViT
162
363
0
24 Jan 2022
vCLIMB: A Novel Video Class Incremental Learning Benchmark
Andrés Villa
Kumail Alhamoud
Juan Carlos León Alcázar
Fabian Caba Heilbron
Victor Escorcia
Guohao Li
CLL
79
32
0
23 Jan 2022