ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1705.07750
  4. Cited By
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

22 May 2017
João Carreira
Andrew Zisserman
ArXivPDFHTML

Papers citing "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"

50 / 1,478 papers shown
Title
Temporal Action Localization Using Gated Recurrent Units
Temporal Action Localization Using Gated Recurrent Units
Hassan Keshvari Khojasteh
Hoda Mohammadzade
H. Behroozi
23
3
0
07 Aug 2021
Token Shift Transformer for Video Classification
Token Shift Transformer for Video Classification
Hao Zhang
Y. Hao
Chong-Wah Ngo
ViT
32
116
0
05 Aug 2021
Hybrid Reasoning Network for Video-based Commonsense Captioning
Hybrid Reasoning Network for Video-based Commonsense Captioning
Weijiang Yu
Jian Liang
Lei Ji
Lu Li
Yuejian Fang
Nong Xiao
Nan Duan
24
10
0
05 Aug 2021
Enhancing Self-supervised Video Representation Learning via Multi-level
  Feature Optimization
Enhancing Self-supervised Video Representation Learning via Multi-level Feature Optimization
Rui Qian
Yuxi Li
Huabin Liu
John See
Shuangrui Ding
Xian Liu
Dian Li
Weiyao Lin
35
42
0
04 Aug 2021
Optimizing Latency for Online Video CaptioningUsing Audio-Visual
  Transformers
Optimizing Latency for Online Video CaptioningUsing Audio-Visual Transformers
Chiori Hori
Takaaki Hori
Jonathan Le Roux
25
4
0
04 Aug 2021
Skeleton Cloud Colorization for Unsupervised 3D Action Representation
  Learning
Skeleton Cloud Colorization for Unsupervised 3D Action Representation Learning
Siyuan Yang
Jun Liu
Shijian Lu
Meng Hwa Er
Alex C. Kot
3DH
3DPC
43
91
0
04 Aug 2021
OncoNet: Weakly Supervised Siamese Network to automate cancer treatment
  response assessment between longitudinal FDG PET/CT examinations
OncoNet: Weakly Supervised Siamese Network to automate cancer treatment response assessment between longitudinal FDG PET/CT examinations
Anirudh Joshi
Sabri Eyuboglu
Shih-Cheng Huang
Jared A. Dunnmon
Arjun Soin
G. Davidzon
Akshay S. Chaudhari
M. Lungren
14
3
0
03 Aug 2021
Video Generation from Text Employing Latent Path Construction for
  Temporal Modeling
Video Generation from Text Employing Latent Path Construction for Temporal Modeling
Amir Mazaheri
M. Shah
30
8
0
29 Jul 2021
Insights from Generative Modeling for Neural Video Compression
Insights from Generative Modeling for Neural Video Compression
Ruihan Yang
Yibo Yang
Joseph Marino
Stephan Mandt
VGen
35
16
0
28 Jul 2021
Enriching Local and Global Contexts for Temporal Action Localization
Enriching Local and Global Contexts for Temporal Action Localization
Zixin Zhu
Wei Tang
Le Wang
N. Zheng
G. Hua
29
108
0
27 Jul 2021
Spatial-Temporal Transformer for Dynamic Scene Graph Generation
Spatial-Temporal Transformer for Dynamic Scene Graph Generation
Yuren Cong
Wentong Liao
H. Ackermann
Bodo Rosenhahn
M. Yang
ViT
24
122
0
26 Jul 2021
HANet: Hierarchical Alignment Networks for Video-Text Retrieval
HANet: Hierarchical Alignment Networks for Video-Text Retrieval
Peng Wu
Xiangteng He
Mingqian Tang
Yiliang Lv
Jing Liu
42
52
0
26 Jul 2021
Spatio-Temporal Representation Factorization for Video-based Person
  Re-Identification
Spatio-Temporal Representation Factorization for Video-based Person Re-Identification
Abhishek Aich
Meng Zheng
Srikrishna Karanam
Terrence Chen
Amit K. Roy-Chowdhury
Ziyan Wu
42
70
0
25 Jul 2021
Adaptive Recursive Circle Framework for Fine-grained Action Recognition
Adaptive Recursive Circle Framework for Fine-grained Action Recognition
Hanxi Lin
Xinxiao Wu
Jiebo Luo
30
1
0
25 Jul 2021
EAN: Event Adaptive Network for Enhanced Action Recognition
EAN: Event Adaptive Network for Enhanced Action Recognition
Yuan Tian
Yichao Yan
Guangtao Zhai
G. Guo
Zhiyong Gao
40
41
0
22 Jul 2021
Evidential Deep Learning for Open Set Action Recognition
Evidential Deep Learning for Open Set Action Recognition
Wentao Bao
Qi Yu
Yu Kong
CML
EDL
26
135
0
21 Jul 2021
Multi-modal Residual Perceptron Network for Audio-Video Emotion
  Recognition
Multi-modal Residual Perceptron Network for Audio-Video Emotion Recognition
Xin Chang
W. Skarbek
30
19
0
21 Jul 2021
UNIK: A Unified Framework for Real-world Skeleton-based Action
  Recognition
UNIK: A Unified Framework for Real-world Skeleton-based Action Recognition
Di Yang
Yaohui Wang
A. Dantcheva
Lorenzo Garattoni
Gianpiero Francesca
Francois Bremond
32
47
0
19 Jul 2021
Is attention to bounding boxes all you need for pedestrian action
  prediction?
Is attention to bounding boxes all you need for pedestrian action prediction?
Lina Achaji
Julien Moreau
Thibault Fouqueray
François Aioun
François Charpillet
23
30
0
16 Jul 2021
End-to-end Multi-modal Video Temporal Grounding
End-to-end Multi-modal Video Temporal Grounding
Yi-Wen Chen
Yi-Hsuan Tsai
Ming-Hsuan Yang
11
51
0
12 Jul 2021
Aligning Correlation Information for Domain Adaptation in Action
  Recognition
Aligning Correlation Information for Domain Adaptation in Action Recognition
Yuecong Xu
Jianfei Yang
Haozhi Cao
K. Mao
Jianxiong Yin
Simon See
24
38
0
11 Jul 2021
TA2N: Two-Stage Action Alignment Network for Few-shot Action Recognition
TA2N: Two-Stage Action Alignment Network for Few-shot Action Recognition
Shuyuan Li
Huabin Liu
Rui Qian
Yuxi Li
John See
Mengjuan Fei
Xiaoyuan Yu
W. Lin
28
75
0
10 Jul 2021
Long Short-Term Transformer for Online Action Detection
Long Short-Term Transformer for Online Action Detection
Mingze Xu
Yuanjun Xiong
Hao Chen
Xinyu Li
Wei Xia
Zhuowen Tu
Stefano Soatto
ViT
40
130
0
07 Jul 2021
Do Different Tracking Tasks Require Different Appearance Models?
Do Different Tracking Tasks Require Different Appearance Models?
Zhongdao Wang
Hengshuang Zhao
Yali Li
Shengjin Wang
Philip Torr
Luca Bertinetto
42
82
0
05 Jul 2021
Action Transformer: A Self-Attention Model for Short-Time Pose-Based
  Human Action Recognition
Action Transformer: A Self-Attention Model for Short-Time Pose-Based Human Action Recognition
Vittorio Mazzia
Simone Angarano
Francesco Salvetti
Federico Angelini
Marcello Chiaberge
ViT
38
137
0
01 Jul 2021
Productivity, Portability, Performance: Data-Centric Python
Productivity, Portability, Performance: Data-Centric Python
Yiheng Wang
Yao Zhang
Yanzhang Wang
Yan Wan
Jiao Wang
Zhongyuan Wu
Yuhao Yang
Bowen She
59
95
0
01 Jul 2021
Attention Bottlenecks for Multimodal Fusion
Attention Bottlenecks for Multimodal Fusion
Arsha Nagrani
Shan Yang
Anurag Arnab
A. Jansen
Cordelia Schmid
Chen Sun
48
544
0
30 Jun 2021
Spatio-Temporal Context for Action Detection
Spatio-Temporal Context for Action Detection
Manuel Sarmiento Calderó
David Varas
Elisenda Bou
27
2
0
29 Jun 2021
Can An Image Classifier Suffice For Action Recognition?
Can An Image Classifier Suffice For Action Recognition?
Quanfu Fan
Chun-Fu Chen
Chen
Yikang Shen
ViT
36
33
0
26 Jun 2021
Detection of Deepfake Videos Using Long Distance Attention
Detection of Deepfake Videos Using Long Distance Attention
Wei Lu
Lingyi Liu
Junwei Luo
Xianfeng Zhao
Yicong Zhou
Jiwu Huang
CVBM
32
22
0
24 Jun 2021
IA-RED$^2$: Interpretability-Aware Redundancy Reduction for Vision
  Transformers
IA-RED2^22: Interpretability-Aware Redundancy Reduction for Vision Transformers
Bowen Pan
Yikang Shen
Yi Ding
Zhangyang Wang
Rogerio Feris
A. Oliva
VLM
ViT
44
156
0
23 Jun 2021
Towards Long-Form Video Understanding
Towards Long-Form Video Understanding
Chaoxia Wu
Philipp Krahenbuhl
VLM
ViT
59
166
0
21 Jun 2021
TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?
TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?
Michael S. Ryoo
A. Piergiovanni
Anurag Arnab
Mostafa Dehghani
A. Angelova
ViT
37
127
0
21 Jun 2021
Does Optimal Source Task Performance Imply Optimal Pre-training for a
  Target Task?
Does Optimal Source Task Performance Imply Optimal Pre-training for a Target Task?
Steven Gutstein
Brent Lance
Sanjay Shakkottai
27
1
0
21 Jun 2021
OadTR: Online Action Detection with Transformers
OadTR: Online Action Detection with Transformers
Xiang Wang
Shiwei Zhang
Zhiwu Qing
Yuanjie Shao
Zhe Zuo
Changxin Gao
Nong Sang
OffRL
ViT
34
110
0
21 Jun 2021
Weakly-Supervised Temporal Action Localization Through Local-Global
  Background Modeling
Weakly-Supervised Temporal Action Localization Through Local-Global Background Modeling
Xiang Wang
Zhiwu Qing
Ziyuan Huang
Yutong Feng
Shiwei Zhang
Jianwen Jiang
Mingqian Tang
Yuanjie Shao
Nong Sang
29
4
0
20 Jun 2021
Video Summarization through Reinforcement Learning with a 3D
  Spatio-Temporal U-Net
Video Summarization through Reinforcement Learning with a 3D Spatio-Temporal U-Net
Tianrui Liu
Qingjie Meng
Jun-Jie Huang
Athanasios Vlontzos
Daniel Rueckert
Bernhard Kainz
OffRL
AI4TS
26
70
0
19 Jun 2021
Attend What You Need: Motion-Appearance Synergistic Networks for Video
  Question Answering
Attend What You Need: Motion-Appearance Synergistic Networks for Video Question Answering
Ahjeong Seo
Gi-Cheon Kang
J. Park
Byoung-Tak Zhang
18
53
0
19 Jun 2021
Self-supervised Video Representation Learning with Cross-Stream
  Prototypical Contrasting
Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting
Martine Toering
Ioannis Gatopoulos
M. Stol
Vincent Tao Hu
SSL
40
11
0
18 Jun 2021
EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge for Action
  Recognition 2021: Team M3EM Technical Report
EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge for Action Recognition 2021: Team M3EM Technical Report
Lijin Yang
Yifei Huang
Yusuke Sugano
Yoichi Sato
16
5
0
18 Jun 2021
Relation Modeling in Spatio-Temporal Action Localization
Relation Modeling in Spatio-Temporal Action Localization
Yutong Feng
Jianwen Jiang
Ziyuan Huang
Zhiwu Qing
Xiang Wang
Shiwei Zhang
Mingqian Tang
Yue Gao
33
11
0
15 Jun 2021
Multi-level Attention Fusion Network for Audio-visual Event Recognition
Multi-level Attention Fusion Network for Audio-visual Event Recognition
Mathilde Brousmiche
Jean Rouat
Stéphane Dupont
27
11
0
12 Jun 2021
Space-time Mixing Attention for Video Transformer
Space-time Mixing Attention for Video Transformer
Adrian Bulat
Juan-Manuel Perez-Rua
Swathikiran Sudhakaran
Brais Martínez
Georgios Tzimiropoulos
ViT
36
124
0
10 Jun 2021
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers
Mandela Patrick
Dylan Campbell
Yuki M. Asano
Ishan Misra
Ishan Misra Florian Metze
Christoph Feichtenhofer
Andrea Vedaldi
João F. Henriques
32
274
0
09 Jun 2021
Towards Training Stronger Video Vision Transformers for
  EPIC-KITCHENS-100 Action Recognition
Towards Training Stronger Video Vision Transformers for EPIC-KITCHENS-100 Action Recognition
Ziyuan Huang
Zhiwu Qing
Xiang Wang
Yutong Feng
Shiwei Zhang
Jianwen Jiang
Zhurong Xia
Mingqian Tang
Nong Sang
M. Ang
ViT
27
11
0
09 Jun 2021
Image2Point: 3D Point-Cloud Understanding with 2D Image Pretrained
  Models
Image2Point: 3D Point-Cloud Understanding with 2D Image Pretrained Models
Chenfeng Xu
Shijia Yang
Tomer Galanti
Bichen Wu
Xiangyu Yue
Bohan Zhai
Wei Zhan
Peter Vajda
Kurt Keutzer
Masayoshi Tomizuka
3DPC
39
53
0
08 Jun 2021
How to Design a Three-Stage Architecture for Audio-Visual Active Speaker
  Detection in the Wild
How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild
Okan Kopuklu
Maja Taseska
Gerhard Rigoll
3DV
31
45
0
07 Jun 2021
Hierarchical Video Generation for Complex Data
Hierarchical Video Generation for Complex Data
Lluis Castrejon
Nicolas Ballas
Aaron Courville
VGen
22
4
0
04 Jun 2021
CT-Net: Channel Tensorization Network for Video Classification
CT-Net: Channel Tensorization Network for Video Classification
Kunchang Li
Xianhang Li
Yali Wang
Jun Wang
Yu Qiao
ViT
30
55
0
03 Jun 2021
Continual 3D Convolutional Neural Networks for Real-time Processing of
  Videos
Continual 3D Convolutional Neural Networks for Real-time Processing of Videos
Lukas Hedegaard
Alexandros Iosifidis
3DPC
25
14
0
31 May 2021
Previous
123...171819...282930
Next