Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

22 May 2017
João Carreira
Andrew Zisserman

Papers citing "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"

50 / 3,647 papers shown
Anomaly Detection in Video Sequences: A Benchmark and Computational Model
Boyang Wan
Wenhui Jiang
Yuming Fang
Zhiyuan Luo
Guanqun Ding
AI4TS
75
48
0
16 Jun 2021
Watching Too Much Television is Good: Self-Supervised Audio-Visual Representation Learning from Movies and TV Shows
Mahdi M. Kalayeh
Nagendra Kamath
Lingyi Liu
Ashok Chandrashekar
SSL
31
2
0
16 Jun 2021
Gradient Forward-Propagation for Large-Scale Temporal Video Modelling
Mateusz Malinowski
Dimitrios Vytiniotis
G. Swirszcz
Viorica Patraucean
João Carreira
65
8
0
15 Jun 2021
Multi-StyleGAN: Towards Image-Based Simulation of Time-Lapse Live-Cell Microscopy
Christoph Reich
Tim Prangemeier
C. Wildner
Heinz Koeppl
AI4CE
61
9
0
15 Jun 2021
Relation Modeling in Spatio-Temporal Action Localization
Yutong Feng
Jianwen Jiang
Ziyuan Huang
Zhiwu Qing
Xiang Wang
Shiwei Zhang
Mingqian Tang
Yue Gao
70
11
0
15 Jun 2021
A Stronger Baseline for Ego-Centric Action Detection
Zhiwu Qing
Ziyuan Huang
Xiang Wang
Yutong Feng
Shiwei Zhang
Jianwen Jiang
Mingqian Tang
Changxin Gao
M. Ang
Nong Sang
EgoV
61
3
0
13 Jun 2021
Multi-level Attention Fusion Network for Audio-visual Event Recognition
Mathilde Brousmiche
Jean Rouat
Stéphane Dupont
161
11
0
12 Jun 2021
Team RUC_AIM3 Technical Report at ActivityNet 2021: Entities Object Localization
Ludan Ruan
Jieting Chen
Yuqing Song
Shizhe Chen
Qin Jin
34
0
0
11 Jun 2021
Space-time Mixing Attention for Video Transformer
Adrian Bulat
Juan-Manuel Perez-Rua
Swathikiran Sudhakaran
Brais Martínez
Georgios Tzimiropoulos
ViT
95
127
0
10 Jun 2021
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers
Mandela Patrick
Dylan Campbell
Yuki M. Asano
Ishan Misra
Florian Metze
Christoph Feichtenhofer
Andrea Vedaldi
João F. Henriques
114
282
0
09 Jun 2021
Towards Training Stronger Video Vision Transformers for EPIC-KITCHENS-100 Action Recognition
Ziyuan Huang
Zhiwu Qing
Xiang Wang
Yutong Feng
Shiwei Zhang
Jianwen Jiang
Zhurong Xia
Mingqian Tang
Nong Sang
M. Ang
ViT
64
11
0
09 Jun 2021
Image2Point: 3D Point-Cloud Understanding with 2D Image Pretrained Models
Chenfeng Xu
Shijia Yang
Tomer Galanti
Bichen Wu
Xiangyu Yue
Bohan Zhai
Wei Zhan
Peter Vajda
Kurt Keutzer
Masayoshi Tomizuka
3DPC
62
55
0
08 Jun 2021
Few-Shot Action Localization without Knowing Boundaries
Tingting Xie
Christos Tzelepis
Fan Fu
Ioannis Patras
67
5
0
08 Jun 2021
Novel View Video Prediction Using a Dual Representation
Sarah Shiraz
Krishna Regmi
Shruti Vyas
Yogesh S Rawat
M. Shah
74
6
0
07 Jun 2021
How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild
Okan Kopuklu
Maja Taseska
Gerhard Rigoll
3DV
100
46
0
07 Jun 2021
Transformed ROIs for Capturing Visual Transformations in Videos
Abhinav Rai
Fadime Sener
Angela Yao
ViT
69
3
0
06 Jun 2021
Hierarchical Video Generation for Complex Data
Lluis Castrejon
Nicolas Ballas
Aaron Courville
VGen
69
4
0
04 Jun 2021
Anticipative Video Transformer
Rohit Girdhar
Kristen Grauman
ViT
96
212
0
03 Jun 2021
Cross-Domain First Person Audio-Visual Action Recognition through Relative Norm Alignment
M. Planamente
Chiara Plizzari
Emanuele Alberti
Barbara Caputo
EgoV
127
12
0
03 Jun 2021
CT-Net: Channel Tensorization Network for Video Classification
Kunchang Li
Xianhang Li
Yali Wang
Jun Wang
Yu Qiao
ViT
74
55
0
03 Jun 2021
Deconfounded Video Moment Retrieval with Causal Intervention
Xun Yang
Fuli Feng
Wei Ji
Meng Wang
Tat-Seng Chua
CML, VGen
82
191
0
03 Jun 2021
TSI: Temporal Saliency Integration for Video Action Recognition
Haisheng Su
Kunchang Li
Jinyuan Feng
Dongliang Wang
Weihao Gan
Wei Wu
Yu Qiao
67
4
0
02 Jun 2021
Dual Normalization Multitasking for Audio-Visual Sounding Object Localization
Tokuhiro Nishikawa
Daiki Shimada
Jerry Jun Yokono
29
0
0
01 Jun 2021
Continual 3D Convolutional Neural Networks for Real-time Processing of Videos
Lukas Hedegaard
Alexandros Iosifidis
3DPC
94
15
0
31 May 2021
Connecting Language and Vision for Natural Language-Based Vehicle Retrieval
Shuai Bai
Zhedong Zheng
Xiaohan Wang
Junyang Lin
Zhu Zhang
Chang Zhou
Yi Yang
Hongxia Yang
103
27
0
31 May 2021
Towards Diverse Paragraph Captioning for Untrimmed Videos
Yuqing Song
Shizhe Chen
Qin Jin
68
38
0
30 May 2021
Maintaining Common Ground in Dynamic Environments
Takuma Udagawa
Akiko Aizawa
48
13
0
29 May 2021
SSCAP: Self-supervised Co-occurrence Action Parsing for Unsupervised Temporal Action Segmentation
Zhe Wang
Hao Chen
Xinyu Li
Chunhui Liu
Yuanjun Xiong
Joseph Tighe
Charless C. Fowlkes
118
20
0
29 May 2021
Unsupervised Action Segmentation by Joint Representation Learning and Online Clustering
Sateesh Kumar
S. Haresh
Awais Ahmed
Andrey Konin
M. Zia
Quoc-Huy Tran
SSL
108
48
0
27 May 2021
Tracking Without Re-recognition in Humans and Machines
Drew Linsley
Girik Malik
Junkyung Kim
L. Govindarajan
E. Mingolla
Thomas Serre
69
18
0
27 May 2021
SSAN: Separable Self-Attention Network for Video Representation Learning
Xudong Guo
Xun Guo
Yan Lu
ViT, AI4TS
55
26
0
27 May 2021
Detecting Biological Locomotion in Video: A Computational Approach
Soo-Min Kang
Richard P. Wildes
45
0
0
26 May 2021
Improving Sign Language Translation with Monolingual Data by Sign Back-Translation
Hao Zhou
Wen-gang Zhou
Weizhen Qi
Junfu Pu
Houqiang Li
SLR
65
194
0
26 May 2021
DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning
Wenhao Wu
Yuxiang Zhao
Yanwu Xu
Xiao Tan
Dongliang He
...
Jinxing Ye
Yingying Li
Mingde Yao
Zichao Dong
Yifeng Shi
AI4TS
93
30
0
25 May 2021
Temporal Action Proposal Generation with Transformers
Lining Wang
Haosen Yang
Wenhao Wu
Huanjin Yao
Hujie Huang
ViT
85
28
0
25 May 2021
GAN for Vision, KG for Relation: a Two-stage Deep Network for Zero-shot Action Recognition
Bin Sun
Dehui Kong
Shaofan Wang
Jinghua Li
Baocai Yin
Xiaonan Luo
55
18
0
25 May 2021
ST-HOI: A Spatial-Temporal Baseline for Human-Object Interaction Detection in Videos
Meng-Jiun Chiou
Chun-Yu Liao
Li-Wei Wang
Roger Zimmermann
Jiashi Feng
107
27
0
25 May 2021
FineAction: A Fine-Grained Video Dataset for Temporal Action Localization
Yi Liu
Limin Wang
Yali Wang
Xiao Ma
Yu Qiao
102
62
0
24 May 2021
Coarse to Fine Multi-Resolution Temporal Convolutional Network
Dipika Singhania
R. Rahaman
Angela Yao
AI4TS
85
55
0
23 May 2021
Video-based Person Re-identification without Bells and Whistles
Chih-Ting Liu
Jun-Cheng Chen
Chu-Song Chen
Shao-Yi Chien
115
15
0
22 May 2021
Sharing Pain: Using Pain Domain Transfer for Video Recognition of Low Grade Orthopedic Pain in Horses
Sofia Broomé
K. Ask
Maheen Rashid-Engström
Pia Haubro Andersen
Hedvig Kjellström
85
12
0
21 May 2021
Egocentric Activity Recognition and Localization on a 3D Map
Miao Liu
Lingni Ma
Kiran Somasundaram
Yin Li
Kristen Grauman
James M. Rehg
Chao Li
EgoV
69
20
0
20 May 2021
Medical Image Segmentation Using Squeeze-and-Expansion Transformers
Shaohua Li
Xiuchao Sui
Xiangde Luo
Xinxing Xu
Yong Liu
Rick Siow Mong Goh
ViT, MedIm
78
170
0
20 May 2021
Non-contact Pain Recognition from Video Sequences with Remote Physiological Measurements Prediction
Ruijing Yang
Ziyu Guan
Zitong Yu
Xiaoyi Feng
Jinye Peng
Guoying Zhao
37
10
0
18 May 2021
Parallel Attention Network with Sequence Matching for Video Grounding
Hao Zhang
Aixin Sun
Wei Jing
Liangli Zhen
Qiufeng Wang
Rick Siow Mong Goh
109
41
0
18 May 2021
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions
Junbin Xiao
Xindi Shang
Angela Yao
Tat-Seng Chua
205
507
0
18 May 2021
VPN++: Rethinking Video-Pose embeddings for understanding Activities of Daily Living
Srijan Das
Rui Dai
Di Yang
Francois Bremond
ViT
104
70
0
17 May 2021
Leveraging Semantic Scene Characteristics and Multi-Stream Convolutional Architectures in a Contextual Approach for Video-Based Visual Emotion Recognition in the Wild
Ioannis Pikoulis
P. Filntisis
Petros Maragos
89
14
0
16 May 2021
MultiSports: A Multi-Person Video Dataset of Spatio-Temporally Localized Sports Actions
Yixuan Li
Lei Chen
Runyu He
Zhenzhi Wang
Gangshan Wu
Limin Wang
127
100
0
16 May 2021
Cross-Modal Progressive Comprehension for Referring Segmentation
Si Liu
Tianrui Hui
Shaofei Huang
Yunchao Wei
Yue Liu
Guanbin Li
EgoV, VOS
86
130
0
15 May 2021