
EventNet: A Large Scale Structured Concept Library for Complex Event Detection in Video

Abstract

Event-specific concepts are semantic entities designed specifically for the events of interest, and they can serve as a mid-level representation of complex events in video. Existing methods focus on defining event-specific concepts for a small number of pre-defined events and cannot handle novel, unseen events. This motivates us to build a large-scale event-specific concept library that covers as many real-world events and their concepts as possible. Specifically, we choose WikiHow as our event discovery resource. We perform a coarse-to-fine event discovery process and discover 500 events from WikiHow articles. We then use each event name as a query to search YouTube and discover event-specific concepts from the tags of the returned videos. After an automatic filtering process, we end up with around 95,321 videos and 4,490 concepts for a total of 500 events. We train a CNN model on the 95,321 videos over the 500 events, and use the model to extract deep learning features from video content. With the learned deep learning features, we train 4,490 binary SVM classifiers as the event-specific concept library. The concepts and events are further organized in a hierarchical structure, and the resulting concept library is called EventNet. The EventNet library is used to generate concept-based representations of event videos. To the best of our knowledge, EventNet is the first video event ontology that organizes events and their concepts into a semantic structure, and it offers great potential for event retrieval and browsing. Extensive experiments over various video event detection tasks show that the proposed EventNet consistently and significantly outperforms the state-of-the-art concept library, by a large margin of up to 173%. We also show that the EventNet structure can help users find relevant concepts for novel event queries that cannot be well addressed by conventional text-based semantic analysis alone.
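The pipeline described above (deep features per video, one binary SVM per concept, and a concept-based representation formed from the classifier scores) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the toy features, concept names, dimensions, and the plain hinge-loss training loop (the linear-SVM objective) are all assumptions standing in for the real CNN features and 4,490 trained classifiers.

```python
# Illustrative sketch of the concept-library stage: one binary linear
# classifier per concept, trained with the hinge loss (the linear-SVM
# objective), on toy stand-ins for CNN video features.
# All data, names, and dimensions here are hypothetical.
import random

random.seed(0)

FEAT_DIM = 8  # toy stand-in for the CNN feature dimension


def train_linear_svm(xs, ys, epochs=50, lr=0.1, lam=0.01):
    """Subgradient descent on the regularized hinge loss; ys in {-1, +1}."""
    w = [0.0] * FEAT_DIM
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Hinge-loss subgradient step plus weight decay.
                w = [wi + lr * (y * xi - lam * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def score(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def toy_video(positive):
    """Toy feature vector: positives cluster around +1, negatives around -1."""
    center = 1.0 if positive else -1.0
    return [center + random.gauss(0, 0.3) for _ in range(FEAT_DIM)]


# Train one binary SVM per concept, mirroring the 4,490-classifier library.
concepts = ["toy_concept_a", "toy_concept_b"]  # hypothetical concept names
library = {}
for concept in concepts:
    xs = [toy_video(True) for _ in range(20)] + [toy_video(False) for _ in range(20)]
    ys = [1] * 20 + [-1] * 20
    library[concept] = train_linear_svm(xs, ys)

# Concept-based representation of a new video: one classifier score per concept.
new_video = toy_video(True)
representation = [score(w, b, new_video) for (w, b) in library.values()]
print(len(representation))
```

The resulting score vector is the concept-based representation the abstract refers to: each dimension measures how strongly one concept fires on the video, which is what makes the representation usable for event detection and retrieval.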
