arXiv:1905.00561
Large-scale weakly-supervised pre-training for video action recognition
2 May 2019
Deepti Ghadiyaram
Matt Feiszli
Du Tran
Xueting Yan
Heng Wang
D. Mahajan
Papers citing
"Large-scale weakly-supervised pre-training for video action recognition"
32 / 82 papers shown
Title
Context-Aware Personality Inference in Dyadic Scenarios: Introducing the UDIVA Dataset
Cristina Palmero
Javier Selva
Sorina Smeureanu
Julio C. S. Jacques Junior
Albert Clapés
...
Zejian Zhang
D. Gallardo-Pujol
G. Guilera
D. Leiva
Sergio Escalera
30
53
0
28 Dec 2020
A Comprehensive Study of Deep Video Action Recognition
Yi Zhu
Xinyu Li
Chunhui Liu
Mohammadreza Zolfaghari
Yuanjun Xiong
Chongruo Wu
Zhi-Li Zhang
Joseph Tighe
R. Manmatha
Mu Li
VLM
AI4TS
38
185
0
11 Dec 2020
Proactive Interaction Framework for Intelligent Social Receptionist Robots
Yang Xue
Fan Wang
Hao Tian
Min Zhao
Jiangyong Li
Haiqing Pan
Yueqiang Dong
29
9
0
09 Dec 2020
t-EVA: Time-Efficient t-SNE Video Annotation
Soroosh Poorgholi
O. Kayhan
Jan van Gemert
16
5
0
26 Nov 2020
TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks
Humam Alwassel
Silvio Giancola
Bernard Ghanem
33
123
0
23 Nov 2020
Multi-Temporal Convolutions for Human Action Recognition in Videos
Alexandros Stergiou
R. Poppe
29
1
0
08 Nov 2020
Deep Analysis of CNN-based Spatio-temporal Representations for Action Recognition
Chun-Fu Chen
Yikang Shen
K. Ramakrishnan
Rogerio Feris
J. M. Cohn
A. Oliva
Quanfu Fan
23
95
0
22 Oct 2020
Making a Case for 3D Convolutions for Object Segmentation in Videos
Sabarinath Mahadevan
A. Athar
Aljosa Osep
Sebastian Hennen
Laura Leal-Taixé
Bastian Leibe
VOS
21
87
0
26 Aug 2020
Spatiotemporal Contrastive Video Representation Learning
Rui Qian
Tianjian Meng
Boqing Gong
Ming-Hsuan Yang
Huisheng Wang
Serge J. Belongie
Yin Cui
SSL
AI4TS
41
493
0
09 Aug 2020
Late Temporal Modeling in 3D CNN Architectures with BERT for Action Recognition
M. E. Kalfaoglu
Sinan Kalkan
A. Aydin Alatan
3DPC
39
140
0
03 Aug 2020
The End-of-End-to-End: A Video Understanding Pentathlon Challenge (2020)
Samuel Albanie
Yang Liu
Arsha Nagrani
Antoine Miech
Ernesto Coto
...
Kaixu Cui
Hui Liu
Chen Wang
Yudong Jiang
Xiaoshuai Hao
34
9
0
03 Aug 2020
Hierarchical Action Classification with Network Pruning
Mahdi Davoodikakhki
KangKang Yin
34
19
0
30 Jul 2020
Learning Video Representations from Textual Web Supervision
Jonathan C. Stroud
Zhichao Lu
Chen Sun
Jia Deng
Rahul Sukthankar
Cordelia Schmid
David A. Ross
SSL
40
48
0
29 Jul 2020
Self-Supervised MultiModal Versatile Networks
Jean-Baptiste Alayrac
Adrià Recasens
R. Schneider
Relja Arandjelović
Jason Ramapuram
J. De Fauw
Lucas Smaira
Sander Dieleman
Andrew Zisserman
SSL
40
372
0
29 Jun 2020
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos
Andrew Rouditchenko
Angie Boggust
David Harwath
Brian Chen
D. Joshi
...
Rogerio Feris
Brian Kingsbury
M. Picheny
Antonio Torralba
James R. Glass
SSL
22
141
0
16 Jun 2020
Egocentric Object Manipulation Graphs
Eadom Dessalene
Michael Maynord
Chinmaya Devaraj
Cornelia Fermuller
Yiannis Aloimonos
EgoV
30
19
0
05 Jun 2020
Would Mega-scale Datasets Further Enhance Spatiotemporal 3D CNNs?
Hirokatsu Kataoka
Tenga Wakamiya
Kensho Hara
Y. Satoh
3DPC
31
87
0
10 Apr 2020
Speech2Action: Cross-modal Supervision for Action Recognition
Arsha Nagrani
Chen Sun
David A. Ross
Rahul Sukthankar
Cordelia Schmid
Andrew Zisserman
33
54
0
30 Mar 2020
Omni-sourced Webly-supervised Learning for Video Recognition
Haodong Duan
Yue Zhao
Yuanjun Xiong
Wentao Liu
Dahua Lin
VLM
23
88
0
29 Mar 2020
Transferring Cross-domain Knowledge for Video Sign Language Recognition
Dongxu Li
Xin Yu
Chenchen Xu
L. Petersson
Hongdong Li
SLR
36
104
0
08 Mar 2020
Self-supervising Action Recognition by Statistical Moment and Subspace Descriptors
Lei Wang
Piotr Koniusz
26
50
0
14 Jan 2020
ClusterFit: Improving Generalization of Visual Representations
Xueting Yan
Ishan Misra
Abhinav Gupta
Deepti Ghadiyaram
D. Mahajan
SSL
VLM
27
132
0
06 Dec 2019
A Multigrid Method for Efficiently Training Video Models
Chao-Yuan Wu
Ross B. Girshick
Kaiming He
Christoph Feichtenhofer
Philipp Krähenbühl
32
94
0
02 Dec 2019
Gate-Shift Networks for Video Action Recognition
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz
3DPC
19
155
0
01 Dec 2019
Self-Supervised Learning by Cross-Modal Audio-Video Clustering
Humam Alwassel
D. Mahajan
Bruno Korbar
Lorenzo Torresani
Bernard Ghanem
Du Tran
SSL
42
428
0
28 Nov 2019
Guided Weak Supervision for Action Recognition with Scarce Data to Assess Skills of Children with Autism
Prashant Pandey
Prathosh A. P.
Manu Kohli
Joshua K. Pritchard
29
33
0
11 Nov 2019
Class Feature Pyramids for Video Explanation
Alexandros Stergiou
G. Kapidis
Grigorios Kalliatakis
C. Chrysoulas
R. Poppe
R. Veltkamp
FAtt
33
18
0
18 Sep 2019
Transferability and Hardness of Supervised Classification Tasks
Anh Tran
Cuong V Nguyen
Tal Hassner
134
164
0
21 Aug 2019
Use What You Have: Video Retrieval Using Representations From Collaborative Experts
Yang Liu
Samuel Albanie
Arsha Nagrani
Andrew Zisserman
36
387
0
31 Jul 2019
Baidu-UTS Submission to the EPIC-Kitchens Action Recognition Challenge 2019
Xiaohan Wang
Yu Wu
Linchao Zhu
Yi Yang
24
19
0
22 Jun 2019
What Makes Training Multi-Modal Classification Networks Hard?
Weiyao Wang
Du Tran
Matt Feiszli
34
443
0
29 May 2019
ECO: Efficient Convolutional Network for Online Video Understanding
Mohammadreza Zolfaghari
Kamaljeet Singh
Thomas Brox
142
496
0
24 Apr 2018