NEU_MITLL @ TRECVid 2015: multimedia event detection by pre-trained CNN models

January 1, 2015

Conference Paper

Author:

Joseph P. Robinson

…

Published in:

TRECVid 2015.

R&D Area:

Cyber Security and Information Sciences

R&D Group:

Artificial Intelligence Technology and Systems

NEU_MITLL @ TRECVid 2015: multimedia event detection by pre-trained CNN models

Summary

We introduce a framework for multimedia event detection (MED), which was developed for TRECVID 2015 using convolutional neural networks (CNNs) to detect complex events via deterministic models trained on video frame data. We used several well-known CNN models designed to detect objects, scenes, and a combination of both (i.e., Hybrid-CNN). We also experimented with features from different networks fused together in different ways. The best score achieved was by fusing objects and scene detections at the feature-level (i.e., early fusion), resulting in a mean average precision (MAP) of 16.02%. Results showed that our framework is capable of detecting various complex events in videos when there are only a few instances of each within a large video search pool.