How can we identify individuals at risk of being drawn into online sex work? The spread of online communication lowers transaction costs and enables a greater number of people to become involved in illicit activities, including the online sex trade. As a result, social media platforms often serve as a springboard for criminal careers, posing a significant risk to the economy, public health and trust.
Detecting deviant behaviors online is hampered by the scarcity of ground-truth data and of suitable machine learning tools. Unlike prior work, which relies exclusively on either qualitative or quantitative methods, this paper combines covert online ethnography with semi-supervised learning, using data from a popular European adult forum.
We obtained risk assessments for 78 users through covert online ethnography, and set out to build a machine learning model that can predict the risk factor for another 28,832 users. Results show that a combination-based approach, in which all features are used, yields the most accurate predictions.
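
To make the semi-supervised setting concrete, the sketch below shows one generic way a handful of labeled users can be used to label a much larger unlabeled pool. It is illustrative only: the abstract does not say which semi-supervised algorithm or features were used, so the self-training loop with a nearest-centroid rule, the two numeric features, the `margin` threshold, and the synthetic data are all assumptions made for this example and not the paper's method.

```python
import math
import random


def centroid(rows):
    """Component-wise mean of a non-empty list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def class_centroids(train):
    """Centroids of the low-risk (0) and high-risk (1) training points."""
    return (centroid([x for x, y in train if y == 0]),
            centroid([x for x, y in train if y == 1]))


def self_train(labeled, unlabeled, rounds=5, margin=1.0):
    """Self-training with a nearest-centroid rule (hypothetical stand-in model).

    labeled:   list of (features, label) pairs, label in {0, 1}
    unlabeled: list of feature vectors
    Returns one predicted label per unlabeled vector.
    """
    train = list(labeled)
    pending = list(range(len(unlabeled)))
    preds = {}
    for _ in range(rounds):
        c0, c1 = class_centroids(train)
        undecided = []
        for i in pending:
            d0 = math.dist(unlabeled[i], c0)
            d1 = math.dist(unlabeled[i], c1)
            if abs(d0 - d1) >= margin:
                # Confident prediction: adopt it as a pseudo-label.
                preds[i] = 0 if d0 < d1 else 1
                train.append((unlabeled[i], preds[i]))
            else:
                undecided.append(i)
        if len(undecided) == len(pending):
            break  # no new pseudo-labels this round
        pending = undecided
    # Label whatever is still undecided with the final centroids.
    c0, c1 = class_centroids(train)
    for i in pending:
        d0 = math.dist(unlabeled[i], c0)
        d1 = math.dist(unlabeled[i], c1)
        preds[i] = 0 if d0 <= d1 else 1
    return [preds[i] for i in range(len(unlabeled))]


def sample(center, n):
    """Synthetic 2-feature users drawn around a cluster center (made-up data)."""
    return [[random.gauss(center, 0.5), random.gauss(center, 0.5)]
            for _ in range(n)]


random.seed(0)
labeled = [(x, 0) for x in sample(0.0, 4)] + [(x, 1) for x in sample(5.0, 4)]
unlabeled = sample(0.0, 100) + sample(5.0, 100)
truth = [0] * 100 + [1] * 100

predictions = self_train(labeled, unlabeled)
accuracy = sum(p == t for p, t in zip(predictions, truth)) / len(truth)
```

In this toy setup the pseudo-labels added in each round shift the class centroids toward the bulk of the unlabeled data, which is the basic idea behind propagating a small set of expert assessments to a large user base.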