
Biased k-NN Similarity Content Based Prediction of Movie Tweets Popularity

Publication at Faculty of Mathematics and Physics | 2015

Abstract

In this paper we describe our approach to the RecSys Challenge 2014: User Engagement as Evaluation. The challenge was based on a dataset of tweets generated when users rate movies on IMDb through its iOS smartphone app.

The task for participants was to rank these tweets by expected user interaction, expressed in terms of retweet and favorite counts. In our experiments we tested several current off-the-shelf prediction techniques and proposed a variant of the item-biased k-NN algorithm that better reflects user engagement and the content-based attributes of the movie domain.
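To make the idea concrete, the following is a minimal sketch of a generic item-biased k-NN predictor with content-based similarity. It is not the authors' algorithm, whose details the abstract does not give. It assumes that engagement is the sum of retweet and favorite counts, that similarity is the Jaccard overlap of movie attribute sets, and that the bias is the per-movie mean engagement relative to the global mean. All names and the toy data are hypothetical.

```python
from collections import defaultdict

def engagement(tweet):
    # Assumption: engagement is retweets plus favorites.
    return tweet["retweets"] + tweet["favorites"]

def jaccard(a, b):
    # Content-based similarity between two attribute sets (e.g. genres).
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def fit_item_bias(train):
    # Global mean engagement and each movie's deviation from it.
    global_mean = sum(engagement(t) for t in train) / len(train)
    sums, counts = defaultdict(float), defaultdict(int)
    for t in train:
        sums[t["movie"]] += engagement(t)
        counts[t["movie"]] += 1
    bias = {m: sums[m] / counts[m] - global_mean for m in sums}
    return global_mean, bias

def predict(tweet, train, attrs, global_mean, bias, k=2):
    # Baseline (global mean + item bias) plus the similarity-weighted
    # average residual of the k most similar training tweets.
    baseline = global_mean + bias.get(tweet["movie"], 0.0)
    target = attrs.get(tweet["movie"], set())
    scored = []
    for t in train:
        sim = jaccard(target, attrs.get(t["movie"], set()))
        if sim > 0:
            residual = engagement(t) - (global_mean + bias[t["movie"]])
            scored.append((sim, residual))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    top = scored[:k]
    norm = sum(sim for sim, _ in top)
    return baseline + (sum(s * r for s, r in top) / norm if norm else 0.0)

# Hypothetical toy data: movie attributes and observed training tweets.
attrs = {"A": {"drama", "crime"}, "B": {"drama"}, "C": {"comedy"}}
train = [
    {"movie": "A", "retweets": 3, "favorites": 1},
    {"movie": "A", "retweets": 1, "favorites": 1},
    {"movie": "B", "retweets": 0, "favorites": 1},
    {"movie": "C", "retweets": 0, "favorites": 0},
    {"movie": "C", "retweets": 0, "favorites": 0},
]
global_mean, bias = fit_item_bias(train)

# Rank unseen tweets by predicted engagement, highest first.
candidates = [{"id": "t1", "movie": "C"}, {"id": "t2", "movie": "A"},
              {"id": "t3", "movie": "B"}]
ranked = sorted(candidates,
                key=lambda t: predict(t, train, attrs, global_mean, bias),
                reverse=True)
print([t["id"] for t in ranked])
```

The item bias lets a tweet about a movie with historically high engagement start from a higher baseline, while the neighbors adjust that baseline using tweets about movies with similar content.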

Our final solution (placed in the third quartile of the challenge leaderboard) aggregates several runs of this algorithm with some off-the-shelf predictors. The paper further describes the dataset used, data filtering, algorithm details and settings, as well as decisions made during the challenge and the dead ends we explored.