In this paper, we survey different state-of-the-art visual processing methods and utilize them in hyperlinking. Visual information, calculated using Features Signatures, SIMILE descriptors and convolutional neural networks (CNN), is utilized as similarity between video frames and used to find similar faces, objects and setting.
Visual concepts in frames are also automatically recognized and textual output of the recognition is combined with search based on subtitles and transcripts. All presented experiments were performed in the Search and Hyperlinking 2014 MediaEval task and Video Hyperlinking 2015 TRECVid task.