Effective monitoring of peat bog ecosystems depends on the collection, processing, and analysis of spatial data that capture peat bog dynamics. Over two seasons, we collected groundwater level (GWL) and soil moisture (SM) ground truth data at two contrasting locations in the Rokytka peat bog in the Sumava Mountains, Czechia.
These data served as reference data and were modeled using a suite of candidate predictors derived from drone mapping: digital surface models (DSMs) and RGB, multispectral, and thermal orthoimages, which capture topomorphometry, vegetation, and surface temperature. We used 34 predictors as input to the random forest (RF) algorithm.
Predictor selection, hyperparameter tuning, and performance assessment were performed with a target-oriented leave-location-out (LLO) spatial cross-validation (CV) strategy combined with forward feature selection (FFS), both to avoid overfitting and to assess how well the models predict at unsampled locations. The spatial CV statistics showed model performance ranging from low (R² = 0.12) to high (R² = 0.78).
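The LLO-CV and FFS combination can be illustrated with a minimal Python sketch. It is not the authors' implementation: the data are synthetic, the four location labels in `groups` are hypothetical, and the greedy loop adds one predictor at a time, which is a simplification of forward feature selection. Hyperparameter tuning is omitted. Each fold holds out all samples from one location, so the score reflects prediction at locations the model has not seen.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score


def llo_score(X, y, groups, cols, seed=0):
    """Mean R^2 over folds that each hold out one whole location."""
    model = RandomForestRegressor(n_estimators=50, random_state=seed)
    scores = cross_val_score(model, X[:, cols], y, groups=groups,
                             cv=LeaveOneGroupOut(), scoring="r2")
    return scores.mean()


def forward_feature_selection(X, y, groups, max_features=3):
    """Greedily add the predictor that most improves the LLO score."""
    selected, best = [], -np.inf
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_features:
        # Score every candidate predictor added to the current subset.
        trial = {c: llo_score(X, y, groups, selected + [c]) for c in remaining}
        cand = max(trial, key=trial.get)
        if trial[cand] <= best:
            break  # no candidate improves the spatial CV score
        selected.append(cand)
        remaining.remove(cand)
        best = trial[cand]
    return selected, best


# Synthetic stand-in data (assumption): 6 predictors, 4 sampling locations.
rng = np.random.default_rng(42)
n, p = 120, 6
X = rng.normal(size=(n, p))
groups = np.repeat(np.arange(4), n // 4)  # hypothetical location labels
y = 2.0 * X[:, 0] - 1.0 * X[:, 3] + rng.normal(scale=0.3, size=n)

selected, score = forward_feature_selection(X, y, groups)
print("selected predictor columns:", selected)
print("LLO-CV R^2:", round(score, 2))
```

Because predictors are chosen by their held-out-location score rather than by a random split, variables that only help the model memorize a particular location tend not to be selected.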
Predictor importance was used for model interpretation: temperature had a strong impact on GWL and SM, and we found significant contributions from other predictors, such as the Normalized Difference Vegetation Index (NDVI), Normalized Difference Index (NDI), Enhanced Red-Green-Blue Vegetation Index (ERGBVE), Shape Index (SHP), Green Leaf Index (GLI), Brightness Index (BI), Coloration Index (CI), Redness Index (RI), Primary Colours Hue Index (HI), Overall Hue Index (HUE), SAGA Wetness Index (TWI), Plan Curvature (PlnCurv), Topographic Position Index (TPI), and Vector Ruggedness Measure (VRM). Additionally, we estimated the area of applicability (AOA) and present maps showing where the prediction model yielded reliable results and where predictions were highly uncertain. This matters because machine learning (ML) models can produce predictions far from the sampling locations, in environments about which the training data carry no information.
The AOA method is well suited to planning and decision-making about sampling strategy, particularly when data are limited.
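The idea behind the AOA can be sketched as a dissimilarity index (DI): the distance from a new location to its nearest training sample in standardized predictor space, divided by the mean distance between training samples. The sketch below is a simplified illustration under stated assumptions, not the authors' implementation. Predictors are weighted equally unless `weights` is given (importance weighting is optional here), the threshold is taken as the largest non-outlier DI among the training data computed across CV folds, and all data and fold labels are synthetic.

```python
import numpy as np


def pairwise_dist(A, B):
    """Euclidean distances between every row of A and every row of B."""
    return np.sqrt(((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2))


def area_of_applicability(X_train, X_new, folds, weights=None):
    """Return DI of new points, an inside-AOA mask, and the DI threshold."""
    mu, sd = X_train.mean(axis=0), X_train.std(axis=0)
    w = np.ones(X_train.shape[1]) if weights is None else np.asarray(weights)
    Z_train = (X_train - mu) / sd * w
    Z_new = (X_new - mu) / sd * w

    d_train = pairwise_dist(Z_train, Z_train)
    n = len(Z_train)
    mean_dist = d_train[np.triu_indices(n, k=1)].mean()

    # Training DI: distance to the nearest sample from a different CV fold.
    other_fold = folds[:, None] != folds[None, :]
    di_train = np.where(other_fold, d_train, np.inf).min(axis=1) / mean_dist

    # Threshold: largest training DI that is not an outlier (Q3 + 1.5 * IQR).
    q1, q3 = np.percentile(di_train, [25, 75])
    upper = q3 + 1.5 * (q3 - q1)
    threshold = di_train[di_train <= upper].max()

    di_new = pairwise_dist(Z_new, Z_train).min(axis=1) / mean_dist
    return di_new, di_new <= threshold, threshold


# Synthetic stand-in data (assumption): 60 samples, 3 predictors, 3 folds.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(60, 3))
folds = np.repeat(np.arange(3), 20)  # hypothetical location-based folds
X_new = np.vstack([X_train[0] + 0.01,        # close to a training sample
                   [10.0, 10.0, 10.0]])      # far outside the sampled range

di_new, inside, threshold = area_of_applicability(X_train, X_new, folds)
print("DI:", di_new, "inside AOA:", inside)
```

Applied to every pixel of a predictor stack, the `inside` mask gives a map of where predictions are supported by the training data. Areas outside it are natural candidates for additional sampling.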