YouTube-8M: A Large-Scale Video Classification Benchmark

27 September 2016

Joonseok Lee

Balakrishnan Varadarajan

Sudheendra Vijayanarasimhan

VLM

Papers citing "YouTube-8M: A Large-Scale Video Classification Benchmark"

50 / 211 papers shown

Title
Learning to Anticipate Egocentric Actions by Imagination Yu Wu Linchao Zhu Xiaohan Wang Yi Yang Fei Wu EgoV 85 69 0 13 Jan 2021
Advances in Electron Microscopy with Deep Learning Jeffrey M. Ede 35 2 0 04 Jan 2021
Context-Aware Personality Inference in Dyadic Scenarios: Introducing the UDIVA Dataset Cristina Palmero Javier Selva Sorina Smeureanu Julio C. S. Jacques Junior Albert Clapés ... Zejian Zhang D. Gallardo-Pujol G. Guilera D. Leiva Sergio Escalera 28 53 0 28 Dec 2020
SMART Frame Selection for Action Recognition Shreyank N. Gowda Marcus Rohrbach Laura Sevilla-Lara 26 141 0 19 Dec 2020
Multi-shot Temporal Event Localization: a Benchmark Xiaolong Liu Yao Hu S. Bai Fei Ding X. Bai Philip Torr 46 81 0 17 Dec 2020
A Comprehensive Study of Deep Video Action Recognition Yi Zhu Xinyu Li Chunhui Liu Mohammadreza Zolfaghari Yuanjun Xiong Chongruo Wu Zhi-Li Zhang Joseph Tighe R. Manmatha Mu Li VLM AI4TS 38 185 0 11 Dec 2020
MEVA: A Large-Scale Multiview, Multimodal Video Dataset for Activity Detection Kellie Corona Katie Osterdahl Roderic Collins A. Hoogs 22 63 0 02 Dec 2020
Multi-Modal Detection of Alzheimer's Disease from Speech and Text Amish Mittal Sourav Sahoo Arnhav Datar Juned Kadiwala H. Shalu Jimson Mathew 12 20 0 30 Nov 2020
SoccerNet-v2: A Dataset and Benchmarks for Holistic Understanding of Broadcast Soccer Videos A. Deliège A. Cioppa Silvio Giancola M. J. Seikavandi J. Dueholm Kamal Nasrollahi Guohao Li T. Moeslund Marc Van Droogenbroeck 18 152 0 26 Nov 2020
Video Big Data Analytics in the Cloud: A Reference Architecture, Survey, Opportunities, and Open Research Issues A. Alam I. Ullah Young-Koo Lee 42 22 0 16 Nov 2020
Multimodal Pretraining for Dense Video Captioning Gabriel Huang Bo Pang Zhenhai Zhu Clara E. Rivera Radu Soricut 21 81 0 10 Nov 2020
Deep Analysis of CNN-based Spatio-temporal Representations for Action Recognition Chun-Fu Chen Yikang Shen K. Ramakrishnan Rogerio Feris J. M. Cohn A. Oliva Quanfu Fan 23 95 0 22 Oct 2020
MLRSNet: A Multi-label High Spatial Resolution Remote Sensing Dataset for Semantic Scene Understanding Xiaoman Qi P. Zhu Yuebin Wang Liqiang Zhang Junhuan Peng Mengfan Wu Jialong Chen Xudong Zhao Ning Zang P. Mathiopoulos 11 109 0 01 Oct 2020
Learning Visual Voice Activity Detection with an Automatically Annotated Dataset Sylvain Guy Stéphane Lathuilière Pablo Mesejo Radu Horaud 27 11 0 23 Sep 2020
DeepRemaster: Temporal Source-Reference Attention Networks for Comprehensive Video Enhancement S. Iizuka E. Simo-Serra 24 39 0 18 Sep 2020
Review: Deep Learning in Electron Microscopy Jeffrey M. Ede 34 79 0 17 Sep 2020
Real-Time Selfie Video Stabilization Ji-yang Yu R. Ramamoorthi Ke-Li Cheng M. Sarkis N. Bi 24 22 0 04 Sep 2020
Look, Listen, and Attend: Co-Attention Network for Self-Supervised Audio-Visual Representation Learning Ying Cheng Ruize Wang Zhihao Pan Rui Feng Yuejie Zhang SSL 33 106 0 13 Aug 2020
Spatiotemporal Contrastive Video Representation Learning Rui Qian Tianjian Meng Boqing Gong Ming-Hsuan Yang Huisheng Wang Serge J. Belongie Yin Cui SSL AI4TS 36 492 0 09 Aug 2020
Pixel-wise Crowd Understanding via Synthetic Data Wang Qi Junyu Gao Lin Wei Yuan Yuan 33 117 0 30 Jul 2020
Learning Video Representations from Textual Web Supervision Jonathan C. Stroud Zhichao Lu Chen Sun Jia Deng Rahul Sukthankar Cordelia Schmid David A. Ross SSL 40 48 0 29 Jul 2020
End-to-end Learning of Compressible Features Saurabh Singh Sami Abu-El-Haija Nick Johnston Johannes Ballé Abhinav Shrivastava G. Toderici SSL 97 71 0 23 Jul 2020
Context-Aware RCNN: A Baseline for Action Detection in Videos Jianchao Wu Zhanghui Kuang Limin Wang Wayne Zhang Gangshan Wu 30 79 0 20 Jul 2020
On Robustness and Transferability of Convolutional Neural Networks Josip Djolonga Jessica Yung Michael Tschannen Rob Romijnders Lucas Beyer ... D. Moldovan Sylvain Gelly N. Houlsby Xiaohua Zhai Mario Lucic OOD 13 154 0 16 Jul 2020
TinyVIRAT: Low-resolution Video Action Recognition Ugur Demir Yogesh S Rawat M. Shah 33 36 0 14 Jul 2020
AViD Dataset: Anonymized Videos from Diverse Countries A. Piergiovanni Michael S. Ryoo 25 35 0 10 Jul 2020
Self-Supervised MultiModal Versatile Networks Jean-Baptiste Alayrac Adrià Recasens R. Schneider Relja Arandjelović Jason Ramapuram J. Fauw Lucas Smaira Sander Dieleman Andrew Zisserman SSL 40 371 0 29 Jun 2020
Hierarchical Patch VAE-GAN: Generating Diverse Videos from a Single Sample Shir Gur Sagie Benaim Lior Wolf VGen GAN DRL 25 69 0 22 Jun 2020
Naive-Student: Leveraging Semi-Supervised Learning in Video Sequences for Urban Scene Segmentation Liang-Chieh Chen Raphael Gontijo-Lopes Bowen Cheng Maxwell D. Collins E. D. Cubuk Barret Zoph Hartwig Adam Jonathon Shlens 28 76 0 20 May 2020
Learning to Segment Actions from Observation and Narration Daniel Fried Jean-Baptiste Alayrac Phil Blunsom Chris Dyer S. Clark Aida Nematzadeh 30 31 0 07 May 2020
Action recognition in real-world videos Waqas Sultani Qazi Ammar Arshad Chen Chen 26 2 0 22 Apr 2020
Would Mega-scale Datasets Further Enhance Spatiotemporal 3D CNNs? Hirokatsu Kataoka Tenga Wakamiya Kensho Hara Y. Satoh 3DPC 28 87 0 10 Apr 2020
Gradient Centralization: A New Optimization Technique for Deep Neural Networks Hongwei Yong Jianqiang Huang Xiansheng Hua Lei Zhang ODL 27 183 0 03 Apr 2020
M2m: Imbalanced Classification via Major-to-minor Translation Jaehyung Kim Jongheon Jeong Jinwoo Shin 15 220 0 01 Apr 2020
Learning Interactions and Relationships between Movie Characters Anna Kukleva Makarand Tapaswi Ivan Laptev 41 51 0 29 Mar 2020
Watching the World Go By: Representation Learning from Unlabeled Videos Daniel Gordon Kiana Ehsani Dieter Fox Ali Farhadi SSL AI4TS 29 87 0 18 Mar 2020
Evolving Losses for Unsupervised Video Representation Learning A. Piergiovanni A. Angelova Michael S. Ryoo SSL 27 138 0 26 Feb 2020
Automatic Shortcut Removal for Self-Supervised Representation Learning Matthias Minderer Olivier Bachem N. Houlsby Michael Tschannen SSL 13 73 0 20 Feb 2020
Deep Audio-Visual Learning: A Survey Hao Zhu Mandi Luo Rui Wang A. Zheng Ran He 31 156 0 14 Jan 2020
Neural Data Server: A Large-Scale Search Engine for Transfer Learning Data Xi Yan David Acuna Sanja Fidler 24 42 0 09 Jan 2020
Multi-attention Networks for Temporal Localization of Video-level Labels Lijun Zhang Srinath Nizampatnam Ahana Gangopadhyay Marcos V. Conde 30 7 0 15 Nov 2019
Comprehensive Video Understanding: Video summarization with content-based video recommender design Yudong Jiang Kaixu Cui B. Peng Changliang Xu BDL 12 28 0 30 Oct 2019
Semi-supervised Learning using Adversarial Training with Good and Bad Samples Wenyuan Li Zichen Wang Yuguang Yue Jiayun Li W. Speier Mingyuan Zhou C. Arnold GAN 13 22 0 18 Oct 2019
Moviescope: Large-scale Analysis of Movies using Multiple Modalities Paola Cascante-Bonilla Kalpathy Sitaraman Mengjia Luo Vicente Ordonez 22 39 0 08 Aug 2019
Few-Shot Video Classification via Temporal Alignment Kaidi Cao Jingwei Ji Zhangjie Cao C. Chang Juan Carlos Niebles AI4TS 27 235 0 27 Jun 2019
Baidu-UTS Submission to the EPIC-Kitchens Action Recognition Challenge 2019 Xiaohan Wang Yu Wu Linchao Zhu Yi Yang 16 19 0 22 Jun 2019
Two-Stream Region Convolutional 3D Network for Temporal Activity Detection Huijuan Xu Abir Das Kate Saenko 3DPC 13 46 0 05 Jun 2019
Hallucinating Optical Flow Features for Video Classification Yongyi Tang Lin Ma Lianqiang Zhou 19 19 0 28 May 2019
A Compressive Sensing Video dataset using Pixel-wise coded exposure Sathyaprakash Narayanan Y. Bethi Chetan Singh Thakur 16 4 0 24 May 2019
Decentralized Learning of Generative Adversarial Networks from Non-iid Data Ryo Yonetani Tomohiro Takahashi Atsushi Hashimoto Yoshitaka Ushiku 42 24 0 23 May 2019