Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

22 May 2017
João Carreira
Andrew Zisserman

Papers citing "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"

50 / 1,440 papers shown
DNeRV: Modeling Inherent Dynamics via Difference Neural Representation for Videos
Qi Zhao
Ulugbek S. Kamilov
Zhan Ma
26
32
0
13 Apr 2023
Zoom-VQA: Patches, Frames and Clips Integration for Video Quality Assessment
Kai Zhao
Kun Yuan
Ming-Ting Sun
Xingsen Wen
21
20
0
13 Apr 2023
Sign Language Translation from Instructional Videos
Laia Tarrés
Gerard I. Gállego
A. Duarte
Jordi Torres
Xavier Giró-i-Nieto
SLR
31
30
0
13 Apr 2023
ASL Citizen: A Community-Sourced Dataset for Advancing Isolated Sign Language Recognition
Aashaka Desai
Lauren Berger
Fyodor O. Minakov
Vanessa Milan
Chinmay Singh
...
Richard E. Ladner
Hal Daumé
Alex X. Lu
Naomi K. Caselli
Danielle Bragg
SLR
26
21
0
12 Apr 2023
Co-attention Propagation Network for Zero-Shot Video Object Segmentation
Gensheng Pei
Yazhou Yao
Fumin Shen
Daniel Huang
Xing-Rui Huang
Hengtao Shen
VOS
40
12
0
08 Apr 2023
Machine Learning with Requirements: a Manifesto
Eleonora Giunchiglia
F. Imrie
M. Schaar
Thomas Lukasiewicz
AI4TS
OffRL
VLM
45
5
0
07 Apr 2023
Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting
Syed Talal Wasim
Muzammal Naseer
Salman Khan
Fahad Shahbaz Khan
M. Shah
VLM
VPVLM
39
75
0
06 Apr 2023
Synthetic Sample Selection for Generalized Zero-Shot Learning
Shreyank N. Gowda
30
16
0
06 Apr 2023
VicTR: Video-conditioned Text Representations for Activity Recognition
Kumara Kahatapitiya
Anurag Arnab
Arsha Nagrani
Michael S. Ryoo
42
20
0
05 Apr 2023
Bodily expressed emotion understanding through integrating Laban movement analysis
Chenyan Wu
Dolzodmaa Davaasuren
T. Shafir
Rachelle Tsachor
James Z. Wang
32
6
0
05 Apr 2023
DIR-AS: Decoupling Individual Identification and Temporal Reasoning for Action Segmentation
Peiyao Wang
Haibin Ling
15
2
0
04 Apr 2023
Black Box Few-Shot Adaptation for Vision-Language models
Yassine Ouali
Adrian Bulat
Brais Martínez
Georgios Tzimiropoulos
VLM
39
31
0
04 Apr 2023
On the Benefits of 3D Pose and Tracking for Human Action Recognition
Jathushan Rajasegaran
Georgios Pavlakos
Angjoo Kanazawa
Christoph Feichtenhofer
Jitendra Malik
41
30
0
03 Apr 2023
MoLo: Motion-augmented Long-short Contrastive Learning for Few-shot Action Recognition
Xiang Wang
Shiwei Zhang
Zhiwu Qing
Changxin Gao
Yingya Zhang
Deli Zhao
Nong Sang
24
40
0
03 Apr 2023
Unbiased Scene Graph Generation in Videos
Sayak Nag
Kyle Min
Subarna Tripathi
Amit K. Roy-Chowdhury
34
29
0
03 Apr 2023
DOAD: Decoupled One Stage Action Detection Network
Shuning Chang
Pichao Wang
Fan Wang
Jiashi Feng
Mike Zheng Shou
26
4
0
01 Apr 2023
Diffusion Action Segmentation
Dao-jun Liu
Qiyue Li
A. Dinh
Ting Jiang
Mubarak Shah
Chan Xu
VGen
DiffM
37
68
0
31 Mar 2023
Decomposed Cross-modal Distillation for RGB-based Temporal Action Detection
Pilhyeon Lee
Taeoh Kim
Minho Shim
Dongyoon Wee
H. Byun
38
11
0
30 Mar 2023
Hierarchical Video-Moment Retrieval and Step-Captioning
Abhaysinh Zala
Jaemin Cho
Satwik Kottur
Xilun Chen
Barlas Oğuz
Yashar Mehdad
Joey Tianyi Zhou
3DV
20
51
0
29 Mar 2023
Egocentric Auditory Attention Localization in Conversations
Fiona Ryan
Hao Jiang
Abhinav Shukla
James M. Rehg
V. Ithapu
EgoV
31
16
0
28 Mar 2023
SELF-VS: Self-supervised Encoding Learning For Video Summarization
Hojjat Mokhtarabadi
Kaveh Bahraman
M. Hosseinzadeh
M. Eftekhari
AI4TS
SSL
ViT
25
0
0
28 Mar 2023
EVA-CLIP: Improved Training Techniques for CLIP at Scale
Quan-Sen Sun
Yuxin Fang
Ledell Yu Wu
Xinlong Wang
Yue Cao
CLIP
VLM
81
470
0
27 Mar 2023
Unified Keypoint-based Action Recognition Framework via Structured Keypoint Pooling
Ryo Hachiuma
Fumiaki Sato
Taiki Sekii
3DPC
29
37
0
27 Mar 2023
Seer: Language Instructed Video Prediction with Latent Diffusion Models
Xianfan Gu
Chuan Wen
Weirui Ye
Jiaming Song
Yang Gao
DiffM
VGen
26
40
0
27 Mar 2023
DiffTAD: Temporal Action Detection with Proposal Denoising Diffusion
Sauradip Nag
Xiatian Zhu
Jiankang Deng
Yi-Zhe Song
Tao Xiang
DiffM
VGen
45
21
0
27 Mar 2023
PDPP: Projected Diffusion for Procedure Planning in Instructional Videos
Hanlin Wang
Yilu Wu
Sheng Guo
Limin Wang
VGen
DiffM
78
30
0
26 Mar 2023
Aligning Step-by-Step Instructional Diagrams to Video Demonstrations
Jiahao Zhang
A. Cherian
Yanbin Liu
Yizhak Ben-Shabat
Cristian Rodriguez-Opazo
Stephen Gould
37
8
0
24 Mar 2023
Conditional Image-to-Video Generation with Latent Flow Diffusion Models
Haomiao Ni
Changhao Shi
Kaican Li
Sharon X. Huang
Martin Renqiang Min
VGen
DiffM
37
165
0
24 Mar 2023
AI Models Close to your Chest: Robust Federated Learning Strategies for Multi-site CT
Edward H. Lee
B. Kelly
E. Altinmakas
H. Doğan
M. Mohammadzadeh
...
Faezeh Sazgara
S. Wong
Michael E. Moseley
S. Halabi
Kristen W. Yeom
FedML
OOD
28
1
0
23 Mar 2023
VADER: Video Alignment Differencing and Retrieval
Alexander Black
Simon Jenni
Tu Bui
Md. Mehrab Tanjim
Stefano Petrangeli
Ritwik Sinha
Viswanathan Swaminathan
John Collomosse
31
2
0
23 Mar 2023
Machine Learning for Brain Disorders: Transformers and Visual Transformers
Robin Courant
Maika Edberg
Nicolas Dufour
Vicky Kalogeiton
MedIm
ViT
40
1
0
21 Mar 2023
Multi-modal Prompting for Low-Shot Temporal Action Localization
Chen Ju
Zeqian Li
Peisen Zhao
Ya Zhang
Xiaopeng Zhang
Qi Tian
Yanfeng Wang
Weidi Xie
41
18
0
21 Mar 2023
TemporalMaxer: Maximize Temporal Context with only Max Pooling for Temporal Action Localization
Tuan N. Tang
Kwonyoung Kim
Kwanghoon Sohn
29
29
0
16 Mar 2023
Aerial Diffusion: Text Guided Ground-to-Aerial View Translation from a Single Image using Diffusion Models
D. Kothandaraman
Ming Lin
Dinesh Manocha
33
5
0
15 Mar 2023
Co-Occurrence Matters: Learning Action Relation for Temporal Action Localization
Congqi Cao
Yizhe Wang
Yuelie Lu
X. Zhang
Yanning Zhang
33
4
0
15 Mar 2023
PoseRAC: Pose Saliency Transformer for Repetitive Action Counting
Ziyu Yao
Xuxin Cheng
Yuexian Zou
ViT
27
19
0
15 Mar 2023
Activity Recognition From Newborn Resuscitation Videos
Øyvind Meinich-Bache
Simon Lennart Austnes
K. Engan
Ivar Austvoll
T. Eftestøl
H. Myklebust
S. Kusulla
H. Kidanto
H. Ersdal
13
19
0
14 Mar 2023
Generation-Guided Multi-Level Unified Network for Video Grounding
Xingyi Cheng
Xiangyu Wu
Dong Shen
Hezheng Lin
Fan Yang
21
0
0
14 Mar 2023
DECOMPL: Decompositional Learning with Attention Pooling for Group Activity Recognition from a Single Volleyball Image
Berker Demirel
Huseyin Ozkan
31
2
0
11 Mar 2023
Neuron Structure Modeling for Generalizable Remote Physiological Measurement
Hao Lu
Zitong Yu
Xuesong Niu
Yingke Chen
34
31
0
10 Mar 2023
Rethinking Self-Supervised Visual Representation Learning in Pre-training for 3D Human Pose and Shape Estimation
Hongsuk Choi
Hyeongjin Nam
T. Lee
Gyeongsik Moon
Kyoung Mu Lee
39
7
0
09 Mar 2023
TAEC: Unsupervised Action Segmentation with Temporal-Aware Embedding and Clustering
Wei Lin
Anna Kukleva
Horst Possegger
Hilde Kuehne
Horst Bischof
48
2
0
09 Mar 2023
CLIP-guided Prototype Modulating for Few-shot Action Recognition
Xiang Wang
Shiwei Zhang
Jun Cen
Changxin Gao
Yingya Zhang
Deli Zhao
Nong Sang
VLM
27
53
0
06 Mar 2023
Maximizing Spatio-Temporal Entropy of Deep 3D CNNs for Efficient Video Recognition
Junyan Wang
Zhenhong Sun
Yichen Qian
Dong Gong
Xiuyu Sun
Ming Lin
Maurice Pagnucco
Yang Song
3DPC
25
11
0
05 Mar 2023
MITFAS: Mutual Information based Temporal Feature Alignment and Sampling for Aerial Video Action Recognition
Ruiqi Xian
Xijun Wang
Tianyi Zhou
37
10
0
05 Mar 2023
AZTR: Aerial Video Action Recognition with Auto Zoom and Temporal Reasoning
Xijun Wang
Ruiqi Xian
Tianrui Guan
Celso M. de Melo
Stephen M. Nogar
Aniket Bera
Tianyi Zhou
24
11
0
02 Mar 2023
CLIPER: A Unified Vision-Language Framework for In-the-Wild Facial Expression Recognition
Hanting Li
Hongjing Niu
Zhaoqing Zhu
Feng Zhao
VLM
CLIP
26
26
0
01 Mar 2023
Knowledge Augmented Relation Inference for Group Activity Recognition
Xianglong Lang
Zhuming Wang
Zun Li
Meng-Syue Tian
Ge Shi
Lifang Wu
Liang Wang
21
3
0
28 Feb 2023
LIT-Former: Linking In-plane and Through-plane Transformers for Simultaneous CT Image Denoising and Deblurring
Zhihao Chen
Chuang Niu
Qi Gao
Ge Wang
Hongming Shan
MedIm
ViT
3DV
46
20
0
21 Feb 2023
Video Action Recognition Collaborative Learning with Dynamics via PSO-ConvNet Transformer
N. H. Phong
B. Ribeiro
29
15
0
17 Feb 2023