Group recommender systems (GRS) are a specific case of recommender systems (RS) in which recommendations are constructed for a group of users rather than for an individual. GRS have diverse application areas, including trip planning, recommending movies to watch together, and playing music in shared environments.
However, because large datasets with group decision-making feedback, or even with group definitions, are lacking, GRS approaches are often evaluated offline with respect to individual user feedback and artificially generated groups. These synthetic groups are usually constructed with respect to a pre-defined group size and an inter-user similarity metric.
While numerous variants of synthetic group generation procedures have been used so far, their impact on evaluation results has not been sufficiently discussed. In this paper, we address this research gap by investigating the impact of various synthetic group generation procedures, namely the use of different user similarity metrics and the effect of group size.
We consider these factors in the context of "outlier vs. majority" groups, where a group of similar users is extended with one or more diverse ones. Experimental results indicate a strong impact of the selected similarity metric on both the typical characteristics of the selected outliers and the performance of individual GRS algorithms.
Moreover, we show that certain algorithms adapt better to larger groups than others.
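
To make the "outlier vs. majority" construction concrete, the following is a minimal illustrative sketch and not necessarily the procedure used in the experiments. It assumes a dense user-item rating matrix, a seed user chosen by the caller, and two common similarity metrics (cosine and Pearson). The function names `cosine_similarity`, `pearson_similarity`, and `outlier_vs_majority_group`, as well as the rule of selecting outliers by lowest mean similarity to the majority, are hypothetical choices made for illustration.

```python
import numpy as np


def cosine_similarity(ratings):
    """User-user cosine similarity for a (users x items) rating matrix."""
    ratings = np.asarray(ratings, dtype=float)
    norms = np.linalg.norm(ratings, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero rows
    unit = ratings / norms
    return unit @ unit.T


def pearson_similarity(ratings):
    """User-user Pearson correlation: cosine similarity of mean-centered rows."""
    ratings = np.asarray(ratings, dtype=float)
    centered = ratings - ratings.mean(axis=1, keepdims=True)
    return cosine_similarity(centered)


def outlier_vs_majority_group(similarity, seed_user, group_size, n_outliers=1):
    """Build one synthetic group (assumed procedure, for illustration only).

    The majority consists of the seed user and its most similar users; the
    outliers are the remaining users with the lowest mean similarity to
    that majority.
    """
    n_users = similarity.shape[0]
    n_majority = group_size - n_outliers
    if n_majority < 1 or group_size > n_users:
        raise ValueError("invalid group_size / n_outliers combination")

    # Majority: seed user plus its (n_majority - 1) nearest neighbours.
    order = np.argsort(-similarity[seed_user])
    order = order[order != seed_user]
    majority = [seed_user] + order[: n_majority - 1].tolist()

    # Outliers: candidates with the lowest mean similarity to the majority.
    candidates = np.setdiff1d(np.arange(n_users), majority)
    mean_sim = similarity[np.ix_(candidates, majority)].mean(axis=1)
    outliers = candidates[np.argsort(mean_sim)[:n_outliers]].tolist()
    return majority, outliers


if __name__ == "__main__":
    # Toy data: users 0, 1, 2, 4 share similar tastes, user 3 is the opposite.
    ratings = np.array([
        [5, 5, 1, 1],
        [5, 4, 1, 1],
        [4, 5, 1, 2],
        [1, 1, 5, 5],
        [5, 5, 2, 1],
    ])
    for metric in (cosine_similarity, pearson_similarity):
        sim = metric(ratings)
        print(metric.__name__, outlier_vs_majority_group(sim, 0, group_size=4))
```

Swapping the similarity function changes which users count as "similar" or "diverse", which is the kind of sensitivity the experiments examine; varying `group_size` and `n_outliers` corresponds to the group-size factor.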