Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

22 May 2017
João Carreira
Andrew Zisserman

Papers citing "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"

50 / 1,638 papers shown
Title
Learning from What is Already Out There: Few-shot Sign Language Recognition with Online Dictionaries
Matyáš Boháček
M. Hrúz
35
5
0
10 Jan 2023
Ancilia: Scalable Intelligent Video Surveillance for the Artificial Intelligence of Things
Armin Danesh Pazho
Christopher Neff
Ghazal Alinezhad Noghre
B. R. Ardabili
S. Yao
Mohammadreza Baharani
Hamed Tabkhi
24
38
0
09 Jan 2023
Simplifying Open-Set Video Domain Adaptation with Contrastive Learning
Giacomo Zara
Victor G. Turrisi da Costa
Subhankar Roy
Paolo Rota
Elisa Ricci
50
1
0
09 Jan 2023
EgoDistill: Egocentric Head Motion Distillation for Efficient Video Understanding
Shuhan Tan
Tushar Nagarajan
Kristen Grauman
34
21
0
05 Jan 2023
Ego-Only: Egocentric Action Detection without Exocentric Transferring
Huiyu Wang
Mitesh Singh
Lorenzo Torresani
EgoV
94
25
0
03 Jan 2023
Look, Listen, and Attack: Backdoor Attacks Against Video Action Recognition
Hasan Hammoud
Shuming Liu
Mohammad Alkhrashi
Fahad Albalawi
Guohao Li
AAML
51
8
0
03 Jan 2023
Efficient Robustness Assessment via Adversarial Spatial-Temporal Focus on Videos
Xingxing Wei
Songping Wang
Huanqian Yan
AAML
39
15
0
03 Jan 2023
Rethinking the Video Sampling and Reasoning Strategies for Temporal Sentence Grounding
Jiahao Zhu
Daizong Liu
Pan Zhou
Xing Di
Yu Cheng
...
Wenzheng Xu
Zichuan Xu
Yao Wan
Lichao Sun
Zeyu Xiong
34
18
0
02 Jan 2023
Hierarchical Explanations for Video Action Recognition
Sadaf Gulshad
Teng Long
Nanne van Noord
FAtt
42
6
0
01 Jan 2023
Skeletal Video Anomaly Detection using Deep Learning: Survey, Challenges and Future Directions
Pratik K. Mishra
Alex Mihailidis
Shehroz S. Khan
46
17
0
31 Dec 2022
MonoNeRF: Learning a Generalizable Dynamic Radiance Field from Monocular Videos
Fengrui Tian
S. Du
Yueqi Duan
VGen
29
42
0
26 Dec 2022
StepNet: Spatial-temporal Part-aware Network for Isolated Sign Language Recognition
Xi Shen
Zhedong Zheng
Yi Yang
SLR
54
14
0
25 Dec 2022
SLGTformer: An Attention-Based Approach to Sign Language Recognition
Neil Song
Yu Xiang
SLR
32
0
0
21 Dec 2022
METEOR Guided Divergence for Video Captioning
D. Rothenpieler
Shahin Amiriparian
34
3
0
20 Dec 2022
Re-evaluating the Need for Multimodal Signals in Unsupervised Grammar Induction
Boyi Li
Rodolfo Corona
K. Mangalam
Catherine Chen
Daniel Flaherty
Serge Belongie
Kilian Q. Weinberger
Jitendra Malik
Trevor Darrell
Dan Klein
26
1
0
20 Dec 2022
C2F-TCN: A Framework for Semi and Fully Supervised Temporal Action Segmentation
Dipika Singhania
R. Rahaman
Angela Yao
27
29
0
20 Dec 2022
A Survey on Human Action Recognition
Zhou Shuchang
34
0
0
20 Dec 2022
Graph Neural Network based Child Activity Recognition
Sanka Mohottala
Pradeepa Samarasinghe
D. Kasthurirathna
Charith Abhayaratne
BDL
GNN
35
5
0
18 Dec 2022
Inductive Attention for Video Action Anticipation
Tsung-Ming Tai
G. Fiameni
Cheng-Kuang Lee
Simon See
Oswald Lanz
50
1
0
17 Dec 2022
Weakly Supervised Video Anomaly Detection Based on Cross-Batch Clustering Guidance
Congqi Cao
Xin Zhang
Shizhou Zhang
Peng Wang
Yanning Zhang
32
6
0
16 Dec 2022
EgoLoc: Revisiting 3D Object Localization from Egocentric Videos with Visual Queries
Jinjie Mai
Abdullah Hamdi
Silvio Giancola
Chen Zhao
Guohao Li
EgoV
45
14
0
14 Dec 2022
PV3D: A 3D Generative Model for Portrait Video Generation
Eric Xu
Jianfeng Zhang
Jun Hao Liew
Wenqing Zhang
Song Bai
Jiashi Feng
Mike Zheng Shou
VGen
39
21
0
13 Dec 2022
Adversarially Robust Video Perception by Seeing Motion
Lingyu Zhang
Chengzhi Mao
Junfeng Yang
Carl Vondrick
VGen
AAML
49
2
0
13 Dec 2022
Egocentric Video Task Translation
Zihui Xue
Yale Song
Kristen Grauman
Lorenzo Torresani
EgoV
36
12
0
13 Dec 2022
Contextual Explainable Video Representation: Human Perception-based Understanding
Khoa T. Vo
Kashu Yamazaki
Phong H. Nguyen
Pha Nguyen
Khoa Luu
Ngan Le
26
9
0
12 Dec 2022
Reconstructing Humpty Dumpty: Multi-feature Graph Autoencoder for Open Set Action Recognition
Dawei Du
Ameya Shringi
A. Hoogs
Christopher Funk
26
2
0
12 Dec 2022
Cross-Modal Learning with 3D Deformable Attention for Action Recognition
Sangwon Kim
Dasom Ahn
ByoungChul Ko
ViT
3DPC
45
24
0
12 Dec 2022
OpenPack: A Large-scale Dataset for Recognizing Packaging Works in IoT-enabled Logistic Environments
Naoya Yoshimura
Jaime Morales
T. Maekawa
Takahiro Hara
30
19
0
10 Dec 2022
MAGVIT: Masked Generative Video Transformer
Lijun Yu
Yong Cheng
Kihyuk Sohn
José Lezama
Han Zhang
...
Alexander G. Hauptmann
Ming-Hsuan Yang
Yuan Hao
Irfan Essa
Lu Jiang
DiffM
VGen
40
234
0
10 Dec 2022
Audiovisual Masked Autoencoders
Mariana-Iuliana Georgescu
Eduardo Fonseca
Radu Tudor Ionescu
Mario Lucic
Cordelia Schmid
Anurag Arnab
SSL
46
44
0
09 Dec 2022
VideoCoCa: Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners
Shen Yan
Tao Zhu
Zirui Wang
Yuan Cao
Mi Zhang
Soham Ghosh
Yonghui Wu
Jiahui Yu
VLM
VGen
39
47
0
09 Dec 2022
Multimodal Prototype-Enhanced Network for Few-Shot Action Recognition
Xin Ni
Yong Liu
Hao Wen
Yatai Ji
Jing Xiao
Yujiu Yang
46
10
0
09 Dec 2022
Tencent AVS: A Holistic Ads Video Dataset for Multi-modal Scene Segmentation
Jie Jiang
Zhimin Li
Jiangfeng Xiong
Rongwei Quan
Qinglin Lu
Wei Liu
38
2
0
09 Dec 2022
Learning Video Representations from Large Language Models
Yue Zhao
Ishan Misra
Philipp Krahenbuhl
Rohit Girdhar
VLM
AI4TS
42
169
0
08 Dec 2022
PromptonomyViT: Multi-Task Prompt Learning Improves Video Transformers using Synthetic Scene Data
Roei Herzig
Ofir Abramovich
Elad Ben-Avraham
Assaf Arbelle
Leonid Karlinsky
Ariel Shamir
Trevor Darrell
Amir Globerson
46
16
0
08 Dec 2022
Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection
Chen Zhang
Guorong Li
Yuankai Qi
Shuhui Wang
Laiyun Qing
Qingming Huang
Ming-Hsuan Yang
40
54
0
08 Dec 2022
Learning to Dub Movies via Hierarchical Prosody Models
Gaoxiang Cong
Liang Li
Yuankai Qi
Zhengjun Zha
Qi Wu
Wen-yu Wang
Bin Jiang
Ming-Hsuan Yang
Qin Huang
80
26
0
08 Dec 2022
Multimodal Vision Transformers with Forced Attention for Behavior Analysis
Tanay Agrawal
Michal Balazia
Philippe Muller
François Brémond
ViT
48
9
0
07 Dec 2022
iQuery: Instruments as Queries for Audio-Visual Sound Separation
Jiaben Chen
Renrui Zhang
Dongze Lian
Jiaqi Yang
Ziyao Zeng
Jianbo Shi
39
27
0
07 Dec 2022
DroneAttention: Sparse Weighted Temporal Attention for Drone-Camera Based Activity Recognition
Santosh Kumar Yadav
Achleshwar Luthra
Esha Pahwa
K. Tiwari
Heena Rathore
Hari Mohan Pandey
Peter Corcoran
47
12
0
07 Dec 2022
Fine-tuned CLIP Models are Efficient Video Learners
H. Rasheed
Muhammad Uzair Khattak
Muhammad Maaz
Salman Khan
Fahad Shahbaz Khan
CLIP
VLM
52
153
0
06 Dec 2022
Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning
A. Piergiovanni
Weicheng Kuo
A. Angelova
ViT
48
56
0
06 Dec 2022
InternVideo: General Video Foundation Models via Generative and Discriminative Learning
Yi Wang
Kunchang Li
Yizhuo Li
Yinan He
Bingkun Huang
...
Junting Pan
Jiashuo Yu
Yali Wang
Limin Wang
Yu Qiao
VLM
VGen
86
315
0
06 Dec 2022
Self-supervised and Weakly Supervised Contrastive Learning for Frame-wise Action Representations
Minghao Chen
Renbo Tu
Chenxi Huang
Yuqi Lin
Boxi Wu
Deng Cai
SSL
AI4TS
42
1
0
06 Dec 2022
Location-Aware Self-Supervised Transformers for Semantic Segmentation
Mathilde Caron
N. Houlsby
Cordelia Schmid
ViT
39
12
0
05 Dec 2022
Day2Dark: Pseudo-Supervised Activity Recognition beyond Silent Daylight
Yunhua Zhang
Hazel Doughty
Cees G. M. Snoek
VLM
53
0
0
05 Dec 2022
Joint Self-Supervised Image-Volume Representation Learning with Intra-Inter Contrastive Clustering
D. M. Nguyen
Hoangvu Nguyen
M. T. N. Truong
T. Cao
Binh Duc Nguyen
Nhat Ho
Paul Swoboda
Shadi Albarqouni
P. Xie
Daniel Sonntag
SSL
38
21
0
04 Dec 2022
VLG: General Video Recognition with Web Textual Knowledge
Jintao Lin
Zhaoyang Liu
Wenhai Wang
Wayne Wu
Limin Wang
46
0
0
03 Dec 2022
Video-based Pose-Estimation Data as Source for Transfer Learning in Human Activity Recognition
Shrutarv Awasthi
Fernando Moya Rueda
Gernot A. Fink
43
1
0
02 Dec 2022
Multilingual Communication System with Deaf Individuals Utilizing Natural and Visual Languages
Tuan-Luc Huynh
Khoi-Nguyen Nguyen-Ngoc
Chi-Bien Chu
Minh-Triet Tran
Trung-Nghia Le
SLR
23
0
0
01 Dec 2022