arXiv: 1609.08675
YouTube-8M: A Large-Scale Video Classification Benchmark


27 September 2016
Sami Abu-El-Haija
Nisarg Kothari
Joonseok Lee
Apostol Natsev
G. Toderici
Balakrishnan Varadarajan
Sudheendra Vijayanarasimhan
    VLM

Papers citing "YouTube-8M: A Large-Scale Video Classification Benchmark"

50 / 204 papers shown
On Negative Sampling for Audio-Visual Contrastive Learning from Movies
Mahdi M. Kalayeh
Shervin Ardeshir
Lingyi Liu
Nagendra Kamath
Ashok Chandrashekar
SSL
32
3
0
29 Apr 2022
Google Scanned Objects: A High-Quality Dataset of 3D Scanned Household Items
Laura Downs
Anthony G. Francis
Nate Koenig
Brandon Kinman
R. Hickman
Krista Reymann
T. B. McHugh
Vincent Vanhoucke
LM&Ro
52
473
0
25 Apr 2022
Empirical Evaluation and Theoretical Analysis for Representation Learning: A Survey
Kento Nozawa
Issei Sato
AI4TS
21
4
0
18 Apr 2022
BYOL for Audio: Exploring Pre-trained General-purpose Audio Representations
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
N. Harada
K. Kashino
SSL
36
53
0
15 Apr 2022
Brainish: Formalizing A Multimodal Language for Intelligence and Consciousness
Paul Pu Liang
24
4
0
14 Apr 2022
Leveraging Adversarial Examples to Quantify Membership Information Leakage
Ganesh Del Grosso
Hamid Jalalzai
Georg Pichler
C. Palamidessi
Pablo Piantanida
MIACV
31
21
0
17 Mar 2022
Transframer: Arbitrary Frame Prediction with Generative Models
C. Nash
João Carreira
Jacob Walker
Iain Barr
Andrew Jaegle
Mateusz Malinowski
Peter W. Battaglia
ViT
27
37
0
17 Mar 2022
GrainSpace: A Large-scale Dataset for Fine-grained and Domain-adaptive Recognition of Cereal Grains
Lei Fan
Yiwen Ding
Dongdong Fan
Donglin Di
M. Pagnucco
Yang Song
AI4TS
29
19
0
10 Mar 2022
Gait Recognition with Mask-based Regularization
Chuanfu Shen
Beibei Lin
Shunli Zhang
George Q. Huang
Shiqi Yu
Xin-cen Yu
CVBM
41
17
0
08 Mar 2022
Skating-Mixer: Long-Term Sport Audio-Visual Modeling with MLPs
Jingfei Xia
Mingchen Zhuge
Tiantian Geng
Shun Fan
Yuantai Wei
Zhenyu He
Feng Zheng
23
14
0
08 Mar 2022
Spatio-temporal Vision Transformer for Super-resolution Microscopy
Charles N Christensen
M. Lu
Edward N. Ward
Pietro Liò
C. Kaminski
27
8
0
28 Feb 2022
When Did It Happen? Duration-informed Temporal Localization of Narrated Actions in Vlogs
Oana Ignat
Santiago Castro
Yuhang Zhou
Jiajun Bao
Dandan Shan
Rada Mihalcea
18
3
0
16 Feb 2022
Boundary-aware Self-supervised Learning for Video Scene Segmentation
Jonghwan Mun
Minchul Shin
Gunsoo Han
Sangho Lee
S. Ha
Joonseok Lee
Eun-Sol Kim
SSL
46
20
0
14 Jan 2022
MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound
Rowan Zellers
Jiasen Lu
Ximing Lu
Youngjae Yu
Yanpeng Zhao
Mohammadreza Salehi
Aditya Kusupati
Jack Hessel
Ali Farhadi
Yejin Choi
33
207
0
07 Jan 2022
InstaIndoor and Multi-modal Deep Learning for Indoor Scene Recognition
A. Glavan
Estefanía Talavera
21
10
0
23 Dec 2021
SVIP: Sequence VerIfication for Procedures in Videos
Yichen Qian
Weixin Luo
Dongze Lian
Xu Tang
P. Zhao
Shenghua Gao
ViT
29
17
0
13 Dec 2021
Time-Equivariant Contrastive Video Representation Learning
Simon Jenni
Hailin Jin
SSL
AI4TS
143
60
0
07 Dec 2021
Self-supervised Video Transformer
Kanchana Ranasinghe
Muzammal Naseer
Salman Khan
F. Khan
Michael S. Ryoo
ViT
39
84
0
02 Dec 2021
Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions
Hongwei Xue
Tiankai Hang
Yanhong Zeng
Yuchong Sun
Bei Liu
Huan Yang
Jianlong Fu
B. Guo
AI4TS
VLM
29
189
0
19 Nov 2021
Video Background Music Generation with Controllable Music Transformer
Shangzhe Di
Jiang
Sihan Liu
Zhaokai Wang
Leyan Zhu
Zexin He
Hongming Liu
Shuicheng Yan
19
91
0
16 Nov 2021
Occluded Video Instance Segmentation: Dataset and ICCV 2021 Challenge
Jiyang Qi
Yan Gao
Yao Hu
Xinggang Wang
Xiaoyu Liu
Xiang Bai
Serge Belongie
Alan Yuille
Philip Torr
S. Bai
VOS
27
6
0
15 Nov 2021
Masking Modalities for Cross-modal Video Retrieval
Valentin Gabeur
Arsha Nagrani
Chen Sun
Alahari Karteek
Cordelia Schmid
16
29
0
01 Nov 2021
Can't Fool Me: Adversarially Robust Transformer for Video Understanding
D. Choudhary
Palash Goyal
Saurabh Sahu
ViT
33
0
0
26 Oct 2021
SERAB: A multi-lingual benchmark for speech emotion recognition
Neil Scheidwasser
M. Kegler
P. Beckmann
Milos Cernak
32
44
0
07 Oct 2021
Motion-aware Contrastive Video Representation Learning via Foreground-background Merging
Shuangrui Ding
Maomao Li
Tianyu Yang
Rui Qian
Haohang Xu
Qingyi Chen
Jue Wang
Hongkai Xiong
SSL
28
49
0
30 Sep 2021
LIGAR: Lightweight General-purpose Action Recognition
Evgeny Izutov
12
3
0
30 Aug 2021
Video Ads Content Structuring by Combining Scene Confidence Prediction and Tagging
Tomoyuki Suzuki
Antonio Tejero-de-Pablos
25
1
0
20 Aug 2021
Weakly-supervised Joint Anomaly Detection and Classification
Snehashis Majhi
Srijan Das
F. Brémond
Ratnakar Dash
Pankaj K. Sa
19
19
0
20 Aug 2021
Cross-modal Spectrum Transformation Network For Acoustic Scene classification
Yang Liu
A. Neophytou
Sunando Sengupta
Eric Sommerlade
21
9
0
13 Aug 2021
FakeAVCeleb: A Novel Audio-Video Multimodal Deepfake Dataset
Hasam Khalid
Shahroz Tariq
Minha Kim
Simon S. Woo
36
184
0
11 Aug 2021
Enhancing Self-supervised Video Representation Learning via Multi-level Feature Optimization
Rui Qian
Yuxi Li
Huabin Liu
John See
Shuangrui Ding
Xian Liu
Dian Li
Weiyao Lin
35
42
0
04 Aug 2021
Adaptive Hierarchical Graph Reasoning with Semantic Coherence for Video-and-Language Inference
Juncheng Li
Siliang Tang
Linchao Zhu
Haochen Shi
Xuanwen Huang
Fei Wu
Yi Yang
Yueting Zhuang
25
28
0
26 Jul 2021
FoleyGAN: Visually Guided Generative Adversarial Network-Based Synchronous Sound Generation in Silent Videos
Sanchita Ghose
John J. Prevost
GAN
27
26
0
20 Jul 2021
MERLOT: Multimodal Neural Script Knowledge Models
Rowan Zellers
Ximing Lu
Jack Hessel
Youngjae Yu
J. S. Park
Jize Cao
Ali Farhadi
Yejin Choi
VLM
LRM
22
372
0
04 Jun 2021
FineAction: A Fine-Grained Video Dataset for Temporal Action Localization
Yi Liu
Limin Wang
Yali Wang
Xiao Ma
Yu Qiao
22
56
0
24 May 2021
MultiSports: A Multi-Person Video Dataset of Spatio-Temporally Localized Sports Actions
Yixuan Li
Lei Chen
Runyu He
Zhenzhi Wang
Gangshan Wu
Limin Wang
27
97
0
16 May 2021
Audio Retrieval with Natural Language Queries
Andreea-Maria Oncescu
A. Sophia Koepke
João F. Henriques
Zeynep Akata
Samuel Albanie
21
77
0
05 May 2021
Comparison and Analysis of Deep Audio Embeddings for Music Emotion Recognition
E. Koh
Shlomo Dubnov
24
38
0
13 Apr 2021
Automatic Generation of Descriptive Titles for Video Clips Using Deep Learning
Soheyla Amirian
Khaled Rasheed
T. Taha
H. Arabnia
VLM
VGen
16
23
0
07 Apr 2021
The Multi-Agent Behavior Dataset: Mouse Dyadic Social Interactions
Jennifer J. Sun
Tomomi Karigo
Dipam Chakraborty
Sharada Mohanty
Benjamin Wild
...
Chen Chen
D. Anderson
Pietro Perona
Yisong Yue
Ann Kennedy
32
47
0
06 Apr 2021
Broaden Your Views for Self-Supervised Video Learning
Adrià Recasens
Pauline Luc
Jean-Baptiste Alayrac
Luyu Wang
Ross Hemsley
...
Florent Altché
M. Valko
Jean-Bastien Grill
Aaron van den Oord
Andrew Zisserman
SSL
AI4TS
33
127
0
30 Mar 2021
Adaptive Methods for Real-World Domain Generalization
Abhimanyu Dubey
Vignesh Ramanathan
Alex Pentland
D. Mahajan
OOD
30
88
0
29 Mar 2021
Enhancing Transformer for Video Understanding Using Gated Multi-Level Attention and Temporal Adversarial Training
Saurabh Sahu
Palash Goyal
ViT
29
2
0
18 Mar 2021
RMS-Net: Regression and Masking for Soccer Event Spotting
Matteo Tomei
Lorenzo Baraldi
Simone Calderara
Simone Bronzin
Rita Cucchiara
35
28
0
15 Feb 2021
Occluded Video Instance Segmentation: A Benchmark
Jiyang Qi
Yan Gao
Yao Hu
Xinggang Wang
Xiaoyu Liu
Xiang Bai
Serge Belongie
Alan Yuille
Philip Torr
S. Bai
VOS
VLM
27
135
0
02 Feb 2021
Clairvoyant Prefetching for Distributed Machine Learning I/O
Nikoli Dryden
Roman Böhringer
Tal Ben-Nun
Torsten Hoefler
31
55
0
21 Jan 2021
Learning to Anticipate Egocentric Actions by Imagination
Yu Wu
Linchao Zhu
Xiaohan Wang
Yi Yang
Fei Wu
EgoV
85
69
0
13 Jan 2021
Advances in Electron Microscopy with Deep Learning
Jeffrey M. Ede
32
2
0
04 Jan 2021
Context-Aware Personality Inference in Dyadic Scenarios: Introducing the UDIVA Dataset
Cristina Palmero
Javier Selva
Sorina Smeureanu
Julio C. S. Jacques Junior
Albert Clapés
...
Zejian Zhang
D. Gallardo-Pujol
G. Guilera
D. Leiva
Sergio Escalera
28
53
0
28 Dec 2020
SMART Frame Selection for Action Recognition
Shreyank N. Gowda
Marcus Rohrbach
Laura Sevilla-Lara
26
141
0
19 Dec 2020