Neighborhood features in geospatial machine learning: the case of population disaggregation

Publication at Faculty of Science


High-resolution population density data are crucial for advanced geographical analysis but are difficult to obtain owing to personal data protection. This paper presents a method to obtain these data through spatial disaggregation of aggregate data using random forests.
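To illustrate what spatial disaggregation means in practice, the following minimal sketch splits one aggregate count across finer cells in proportion to per-cell weights. It assumes the weights come from some predictive model, such as the relative densities predicted by a random forest. The abstract does not specify this step, so the function name `disaggregate`, the uniform fallback, and the sample weights are all hypothetical.

```python
def disaggregate(total, weights):
    """Split one aggregate count across cells in proportion to non-negative weights."""
    weight_sum = sum(weights)
    if weight_sum == 0:
        # No information about the cells: fall back to a uniform split (an assumption).
        return [total / len(weights)] * len(weights)
    return [total * w / weight_sum for w in weights]


# Hypothetical weights, e.g. a model's predicted relative density per grid cell.
cell_weights = [0.0, 1.0, 3.0]
cell_population = disaggregate(1000, cell_weights)
```

Scaling by the weight sum keeps the total of the cell estimates equal to the original aggregate count, so the disaggregated surface stays consistent with the source data.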

Ancillary topographic data are drawn from open data sources, namely OpenStreetMap, Urban Atlas, and the NASA Shuttle Radar Topography Mission (SRTM). To increase disaggregation accuracy, the paper systematically conceptualizes proximity (neighborhood) features.
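One simple way to picture a neighborhood feature is as a summary of an ancillary variable over the cells surrounding each grid cell. The sketch below computes a moving-window mean on a small raster. It is only an illustration under stated assumptions: the square window, the edge clipping, the function name `neighborhood_mean`, and the sample `building_share` grid are hypothetical and are not taken from the paper's toolbox.

```python
def neighborhood_mean(grid, radius):
    """Mean of grid values in a (2*radius+1)-square window, clipped at the edges."""
    n_rows, n_cols = len(grid), len(grid[0])
    result = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            # Collect all cells within `radius` rows and columns of (r, c).
            window = [
                grid[i][j]
                for i in range(max(0, r - radius), min(n_rows, r + radius + 1))
                for j in range(max(0, c - radius), min(n_cols, c + radius + 1))
            ]
            row.append(sum(window) / len(window))
        result.append(row)
    return result


# Hypothetical ancillary layer: share of each cell covered by buildings.
building_share = [
    [0.0, 0.0, 0.0],
    [0.0, 0.9, 0.0],
    [0.0, 0.0, 0.0],
]
smoothed = neighborhood_mean(building_share, radius=1)
```

Each cell of `smoothed` can then serve as an extra predictor alongside the cell's own value, so that a model sees what lies near a cell as well as what lies in it.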

The method is implemented as a toolbox for Python and PostGIS and is tested on three cities in Central and Eastern Europe: Prague, Maribor, and Tallinn. It is shown that this approach produces more accurate predictions than other comparable approaches.