ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1808.01340
  4. Cited By
A Short Note about Kinetics-600

A Short Note about Kinetics-600

3 August 2018
João Carreira
Eric Noland
Andras Banki-Horvath
Chloe Hillier
Andrew Zisserman
ArXivPDFHTML

Papers citing "A Short Note about Kinetics-600"

50 / 114 papers shown
Title
TT-DF: A Large-Scale Diffusion-Based Dataset and Benchmark for Human Body Forgery Detection
TT-DF: A Large-Scale Diffusion-Based Dataset and Benchmark for Human Body Forgery Detection
Wenkui Yang
Zhida Zhang
Xiaoqiang Zhou
Junxian Duan
Jie Cao
DiffM
33
0
0
13 May 2025
Direct Motion Models for Assessing Generated Videos
Direct Motion Models for Assessing Generated Videos
Kelsey R. Allen
Carl Doersch
Guangyao Zhou
Mohammed Suhail
Danny Driess
...
Thomas Kipf
Mehdi S. M. Sajjadi
Kevin P. Murphy
João Carreira
Sjoerd van Steenkiste
EGVM
DiffM
VGen
78
0
0
30 Apr 2025
Is Temporal Prompting All We Need For Limited Labeled Action Recognition?
Is Temporal Prompting All We Need For Limited Labeled Action Recognition?
Shreyank N. Gowda
Boyan Gao
Xiao Gu
Xiaobo Jin
VLM
41
0
0
02 Apr 2025
Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis
Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis
Amir Hosein Fadaei
M. Dehaqani
45
0
0
11 Feb 2025
Human Activity Recognition in an Open World
Human Activity Recognition in an Open World
D. Prijatelj
Samuel Grieggs
Jin Huang
Dawei Du
Ameya Shringi
Christopher Funk
Adam Kaufman
Eric Robertson
Walter J. Scheirer University of Notre Dame
72
3
0
17 Jan 2025
Do Language Models Understand Time?
Do Language Models Understand Time?
Xi Ding
Lei Wang
184
0
0
18 Dec 2024
Towards Student Actions in Classroom Scenes: New Dataset and Baseline
Towards Student Actions in Classroom Scenes: New Dataset and Baseline
Zhuolin Tan
Chenqiang Gao
Anyong Qin
Ruixin Chen
Tiecheng Song
Feng Yang
Deyu Meng
29
0
0
02 Sep 2024
Masked Image Modeling: A Survey
Masked Image Modeling: A Survey
Vlad Hondru
Florinel-Alin Croitoru
Shervin Minaee
Radu Tudor Ionescu
N. Sebe
72
6
0
13 Aug 2024
A Comprehensive Review of Few-shot Action Recognition
A Comprehensive Review of Few-shot Action Recognition
Yuyang Wanyan
Xiaoshan Yang
Weiming Dong
Changsheng Xu
VLM
80
3
0
20 Jul 2024
Video In-context Learning: Autoregressive Transformers are Zero-Shot Video Imitators
Video In-context Learning: Autoregressive Transformers are Zero-Shot Video Imitators
Wentao Zhang
Junliang Guo
Tianyu He
Li Zhao
Linli Xu
Jiang Bian
47
3
0
10 Jul 2024
AWT: Transferring Vision-Language Models via Augmentation, Weighting,
  and Transportation
AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation
Yuhan Zhu
Yuyang Ji
Zhiyu Zhao
Gangshan Wu
Limin Wang
VLM
47
7
0
05 Jul 2024
Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation
Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation
Jiaming Zhou
Teli Ma
Kun-Yu Lin
Ronghe Qiu
Zifan Wang
Junwei Liang
52
4
0
20 Jun 2024
No Time to Waste: Squeeze Time into Channel for Mobile Video
  Understanding
No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding
Yingjie Zhai
Wenshuo Li
Yehui Tang
Xinghao Chen
Yunhe Wang
ViT
30
0
0
14 May 2024
Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition
Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition
Yash Jain
David M. Chan
Pranav Dheram
Aparna Khare
Olabanji Shonibare
Venkatesh Ravichandran
Shalini Ghosh
40
2
0
28 Mar 2024
VideoPrism: A Foundational Visual Encoder for Video Understanding
VideoPrism: A Foundational Visual Encoder for Video Understanding
Long Zhao
N. B. Gundavarapu
Liangzhe Yuan
Hao Zhou
Shen Yan
...
Huisheng Wang
Hartwig Adam
Mikhail Sirotenko
Ting Liu
Boqing Gong
VGen
45
29
0
20 Feb 2024
STREAM: Spatio-TempoRal Evaluation and Analysis Metric for Video
  Generative Models
STREAM: Spatio-TempoRal Evaluation and Analysis Metric for Video Generative Models
Pum Jun Kim
Seojun Kim
Jaejun Yoo
EGVM
30
3
0
30 Jan 2024
Photorealistic Video Generation with Diffusion Models
Photorealistic Video Generation with Diffusion Models
Agrim Gupta
Lijun Yu
Kihyuk Sohn
Xiuye Gu
Meera Hahn
Fei-Fei Li
Irfan Essa
Lu Jiang
José Lezama
VGen
59
177
0
11 Dec 2023
Unsupervised Video Domain Adaptation with Masked Pre-Training and Collaborative Self-Training
Unsupervised Video Domain Adaptation with Masked Pre-Training and Collaborative Self-Training
Arun V. Reddy
William Paul
Corban Rivera
Ketul Shah
Celso M. de Melo
Rama Chellappa
37
4
0
05 Dec 2023
Towards Weakly Supervised End-to-end Learning for Long-video Action
  Recognition
Towards Weakly Supervised End-to-end Learning for Long-video Action Recognition
Jiaming Zhou
Hanjun Li
Kun-Yu Lin
Junwei Liang
29
1
0
28 Nov 2023
Telling Stories for Common Sense Zero-Shot Action Recognition
Telling Stories for Common Sense Zero-Shot Action Recognition
Shreyank N. Gowda
Carolina Scarton
LM&Ro
30
2
0
29 Sep 2023
SlowFast Network for Continuous Sign Language Recognition
SlowFast Network for Continuous Sign Language Recognition
Junseok Ahn
Youngjoon Jang
Joon Son Chung
SLR
38
10
0
21 Sep 2023
AI-Generated Content (AIGC) for Various Data Modalities: A Survey
AI-Generated Content (AIGC) for Various Data Modalities: A Survey
Lin Geng Foo
Hossein Rahmani
Jun Liu
78
31
0
27 Aug 2023
Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action
  Recognition
Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition
Syed Talal Wasim
Muhammad Uzair Khattak
Muzammal Naseer
Salman Khan
M. Shah
Fahad Shahbaz Khan
ViT
54
19
0
13 Jul 2023
Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting
Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting
Syed Talal Wasim
Muzammal Naseer
Salman Khan
Fahad Shahbaz Khan
M. Shah
VLM
VPVLM
39
74
0
06 Apr 2023
Unmasked Teacher: Towards Training-Efficient Video Foundation Models
Unmasked Teacher: Towards Training-Efficient Video Foundation Models
Kunchang Li
Yali Wang
Yizhuo Li
Yi Wang
Yinan He
Limin Wang
Yu Qiao
VGen
57
156
0
28 Mar 2023
EVA-CLIP: Improved Training Techniques for CLIP at Scale
EVA-CLIP: Improved Training Techniques for CLIP at Scale
Quan-Sen Sun
Yuxin Fang
Ledell Yu Wu
Xinlong Wang
Yue Cao
CLIP
VLM
81
470
0
27 Mar 2023
Confidence Attention and Generalization Enhanced Distillation for
  Continuous Video Domain Adaptation
Confidence Attention and Generalization Enhanced Distillation for Continuous Video Domain Adaptation
Xiyu Wang
Yuecong Xu
Jianfei Yang
Xiaoli Li
Zhenghua Chen
TTA
32
0
0
18 Mar 2023
Adapting Pre-trained Vision Transformers from 2D to 3D through Weight
  Inflation Improves Medical Image Segmentation
Adapting Pre-trained Vision Transformers from 2D to 3D through Weight Inflation Improves Medical Image Segmentation
Yuhui Zhang
Shihua Huang
Zhengping Zhou
M. Lungren
Serena Yeung
ViT
MedIm
18
8
0
08 Feb 2023
Baseline Method for the Sport Task of MediaEval 2022 with 3D CNNs using
  Attention Mechanisms
Baseline Method for the Sport Task of MediaEval 2022 with 3D CNNs using Attention Mechanisms
Pierre-Etienne Martin
22
1
0
06 Feb 2023
ADAPT: Action-aware Driving Caption Transformer
ADAPT: Action-aware Driving Caption Transformer
Bu Jin
Xinyi Liu
Yupeng Zheng
Pengfei Li
Hao Zhao
Tong Zhang
Yuhang Zheng
Guyue Zhou
Jingjing Liu
30
69
0
01 Feb 2023
CNN-Based Action Recognition and Pose Estimation for Classifying Animal
  Behavior from Videos: A Survey
CNN-Based Action Recognition and Pose Estimation for Classifying Animal Behavior from Videos: A Survey
Michael Perez
Corey Toler-Franklin
MedIm
36
14
0
15 Jan 2023
Multi-Stage Spatio-Temporal Aggregation Transformer for Video Person
  Re-identification
Multi-Stage Spatio-Temporal Aggregation Transformer for Video Person Re-identification
Ziyi Tang
Ruimao Zhang
Zhanglin Peng
Jinrui Chen
Liang Lin
33
18
0
02 Jan 2023
A Survey on Human Action Recognition
A Survey on Human Action Recognition
Zhou Shuchang
29
0
0
20 Dec 2022
2D Pose Estimation based Child Action Recognition
2D Pose Estimation based Child Action Recognition
Sanka Mohottala
Sandun Abeygunawardana
Pradeepa Samarasinghe
D. Kasthurirathna
Charith Abhayaratne
24
2
0
18 Dec 2022
MAGVIT: Masked Generative Video Transformer
MAGVIT: Masked Generative Video Transformer
Lijun Yu
Yong Cheng
Kihyuk Sohn
José Lezama
Han Zhang
...
Alexander G. Hauptmann
Ming-Hsuan Yang
Yuan Hao
Irfan Essa
Lu Jiang
DiffM
VGen
38
228
0
10 Dec 2022
Fine-tuned CLIP Models are Efficient Video Learners
Fine-tuned CLIP Models are Efficient Video Learners
H. Rasheed
Muhammad Uzair Khattak
Muhammad Maaz
Salman Khan
Fahad Shahbaz Khan
CLIP
VLM
34
150
0
06 Dec 2022
InternVideo: General Video Foundation Models via Generative and
  Discriminative Learning
InternVideo: General Video Foundation Models via Generative and Discriminative Learning
Yi Wang
Kunchang Li
Yizhuo Li
Yinan He
Bingkun Huang
...
Junting Pan
Jiashuo Yu
Yali Wang
Limin Wang
Yu Qiao
VLM
VGen
57
311
0
06 Dec 2022
UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video
  UniFormer
UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
Kunchang Li
Yali Wang
Yinan He
Yizhuo Li
Yi Wang
Limin Wang
Yu Qiao
ViT
30
107
0
17 Nov 2022
Token Turing Machines
Token Turing Machines
Michael S. Ryoo
K. Gopalakrishnan
Kumara Kahatapitiya
Ted Xiao
Kanishka Rao
Austin Stone
Yao Lu
Julian Ibarz
Anurag Arnab
27
21
0
16 Nov 2022
EVA: Exploring the Limits of Masked Visual Representation Learning at
  Scale
EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
Yuxin Fang
Wen Wang
Binhui Xie
Quan-Sen Sun
Ledell Yu Wu
Xinggang Wang
Tiejun Huang
Xinlong Wang
Yue Cao
VLM
CLIP
87
679
0
14 Nov 2022
AltCLIP: Altering the Language Encoder in CLIP for Extended Language
  Capabilities
AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities
Zhongzhi Chen
Guangyi Liu
Bo-Wen Zhang
Fulong Ye
Qinghong Yang
Ledell Yu Wu
VLM
37
80
0
12 Nov 2022
Adversarial Domain Adaptation for Action Recognition Around the Clock
Adversarial Domain Adaptation for Action Recognition Around the Clock
Anwaar Ulhaq
22
3
0
25 Oct 2022
Solving Reasoning Tasks with a Slot Transformer
Solving Reasoning Tasks with a Slot Transformer
Ryan Faulkner
Daniel Zoran
LRM
26
1
0
20 Oct 2022
Transfer-learning for video classification: Video Swin Transformer on multiple domains
Transfer-learning for video classification: Video Swin Transformer on multiple domains
Daniel de Oliveira
David Martins de Matos
ViT
24
0
0
18 Oct 2022
Phenaki: Variable Length Video Generation From Open Domain Textual
  Description
Phenaki: Variable Length Video Generation From Open Domain Textual Description
Ruben Villegas
Mohammad Babaeizadeh
Pieter-Jan Kindermans
Hernan Moraldo
Han Zhang
M. Saffar
Santiago Castro
Julius Kunze
D. Erhan
DiffM
VGen
68
374
0
05 Oct 2022
Expediting Large-Scale Vision Transformer for Dense Prediction without
  Fine-tuning
Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning
Weicong Liang
Yuhui Yuan
Henghui Ding
Xiao Luo
Weihong Lin
Ding Jia
Zheng-Wei Zhang
Chao Zhang
Hanhua Hu
35
25
0
03 Oct 2022
Learning in Audio-visual Context: A Review, Analysis, and New
  Perspective
Learning in Audio-visual Context: A Review, Analysis, and New Perspective
Yake Wei
Di Hu
Yapeng Tian
Xuelong Li
46
55
0
20 Aug 2022
Expanding Language-Image Pretrained Models for General Video Recognition
Expanding Language-Image Pretrained Models for General Video Recognition
Bolin Ni
Houwen Peng
Minghao Chen
Songyang Zhang
Gaofeng Meng
Jianlong Fu
Shiming Xiang
Haibin Ling
VLM
CLIP
ViT
40
313
0
04 Aug 2022
Video Swin Transformers for Egocentric Video Understanding @ Ego4D
  Challenges 2022
Video Swin Transformers for Egocentric Video Understanding @ Ego4D Challenges 2022
María Escobar
Laura Alexandra Daza
Cristina González
Jordi Pont-Tuset
Pablo Arbelaez
18
8
0
22 Jul 2022
Learning from Label Relationships in Human Affect
Learning from Label Relationships in Human Affect
Niki Maria Foteinopoulou
Ioannis Patras
CVBM
25
8
0
12 Jul 2022
123
Next