Video scene location recognition with neural networks

Publication at the Faculty of Mathematics and Physics, Central Library | 2021

Abstract

This paper examines whether scene locations can be recognized from a video sequence that has a small set of recurring shooting locations (as in a television series) using artificial neural networks. The basic idea of the presented approach is to select a set of frames from each scene, transform each frame with a pre-trained convolutional network that pre-processes single images, and classify the scene location with the subsequent layers of the neural network.
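The abstract does not say how frames are chosen from a scene. As a minimal sketch of the frame selection step, the following assumes evenly spaced sampling. The function name select_frame_indices and its parameters are hypothetical and not taken from the paper.

```python
def select_frame_indices(scene_start: int, scene_end: int, num_frames: int) -> list[int]:
    """Pick num_frames evenly spaced frame indices from [scene_start, scene_end).

    Assumption: even spacing. The paper may use a different selection rule.
    """
    length = scene_end - scene_start
    if length <= 0 or num_frames <= 0:
        raise ValueError("scene must be non-empty and num_frames positive")
    # Split the scene into num_frames equal segments and take the
    # frame at the centre of each segment.
    return [
        scene_start + int((i + 0.5) * length / num_frames)
        for i in range(num_frames)
    ]


if __name__ == "__main__":
    # A scene spanning frames 0..99, sampled at 4 positions.
    print(select_frame_indices(0, 100, 4))
```

If a scene is shorter than num_frames, this sketch returns repeated indices rather than failing, which keeps the number of inputs to the later layers fixed.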

The considered networks have been tested and compared on a dataset obtained from The Big Bang Theory television series. We have investigated different neural network layers to combine individual frames, particularly AveragePooling, MaxPooling, Product, Flatten, LSTM, and Bidirectional LSTM layers.
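To illustrate how the weight-free combining layers differ, the sketch below applies them to a list of per-frame feature vectors in plain Python. It assumes each layer reduces across the frame axis, element by element. The function names are hypothetical, and the LSTM and Bidirectional LSTM variants are left out because they carry trained weights.

```python
from math import prod

# Each inner list stands for the feature vector of one frame,
# as produced by the pre-trained convolutional network.
Frames = list[list[float]]


def average_pool(frames: Frames) -> list[float]:
    """Element-wise mean across frames."""
    return [sum(col) / len(col) for col in zip(*frames)]


def max_pool(frames: Frames) -> list[float]:
    """Element-wise maximum across frames."""
    return [max(col) for col in zip(*frames)]


def product_pool(frames: Frames) -> list[float]:
    """Element-wise product across frames."""
    return [prod(col) for col in zip(*frames)]


def flatten(frames: Frames) -> list[float]:
    """Concatenate all frame vectors, keeping frame order."""
    return [value for frame in frames for value in frame]


if __name__ == "__main__":
    features = [[1.0, 2.0], [3.0, 4.0]]  # two frames, two features each
    print(average_pool(features))
    print(max_pool(features))
    print(product_pool(features))
    print(flatten(features))
```

The first three reductions give the same result regardless of frame order and keep the output size equal to one frame vector. Flatten preserves the order and grows with the number of frames, so the layers that follow it need a fixed frame count.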

We have observed that only some of the approaches are suitable for the task at hand.