Step 1: (Input) 20,575 picks of the sub-Cretaceous unconformity within the study area boundary were used to identify the extent of the subcropping geological units and delineate the subcrop polygons.
Step 2: (Creating subcrop polygons)
- Point locations attributed with the subcropping formation were converted to spatial geometries. The formation name, which is the target for prediction, was encoded as a categorical (factor) variable suitable for classification models.
- A suite of raster datasets was prepared from the elevation and elevation residual sub-Cretaceous surfaces. These raster datasets were used as the predictors in the model to describe the morphometry of the surfaces in terms of surface curvature, depth of depressions, height relative to local highs/lows in the surface, and the depth of channel-like forms. Additional predictors that describe the horizontal proximity to edges of the salt dissolution front were also used.
- A training set was generated by extracting the values of the raster predictors at the point locations, so that each point's feature values could be associated with the target variable.
- Multiple machine learning classifiers (K-nearest neighbours, linear discriminant analysis, naïve Bayes, extremely randomized trees, and XGBoost) were evaluated based on (a) their predictive performance using 10-fold cross-validation; and (b) the geological plausibility of the predicted subcrop classification maps.
- The best-performing and most geologically realistic model, XGBoost, was selected for the final map prediction. The predicted map was generalized/smoothed and converted to vector features.
- Boundaries were checked and modified manually to enforce stratigraphical relationships and honour data points.
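
The training-set extraction and cross-validation steps above can be illustrated with a minimal, standard-library-only sketch. It is not the workflow's actual implementation: the grid layout, the two toy rasters, the formation labels, and every function name are hypothetical, a hand-written K-nearest neighbours classifier stands in for the evaluated models, and the folds are drawn at random because the text does not say how the folds were constructed.

```python
import random
from collections import Counter


def sample_raster(raster, origin_x, origin_y, cell_size, x, y):
    """Return the value of the raster cell containing point (x, y).

    raster is a list of rows with row 0 the northernmost;
    (origin_x, origin_y) is the upper-left corner of the grid.
    """
    col = int((x - origin_x) // cell_size)
    row = int((origin_y - y) // cell_size)
    if not (0 <= row < len(raster) and 0 <= col < len(raster[0])):
        raise ValueError("point lies outside the raster extent")
    return raster[row][col]


def build_training_set(points, rasters, origin_x, origin_y, cell_size):
    """Attach one feature per predictor raster to every labelled point."""
    features, labels = [], []
    for x, y, formation in points:
        features.append(
            [sample_raster(r, origin_x, origin_y, cell_size, x, y) for r in rasters]
        )
        labels.append(formation)
    return features, labels


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest samples (squared Euclidean distance)."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, query)), label)
        for row, label in zip(train_x, train_y)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_val_accuracy(features, labels, n_folds=10, k=3, seed=0):
    """Mean accuracy over n_folds randomly drawn (non-spatial) folds."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train_x = [features[i] for i in indices if i not in held_out]
        train_y = [labels[i] for i in indices if i not in held_out]
        correct = sum(
            knn_predict(train_x, train_y, features[i], k) == labels[i] for i in fold
        )
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)


# Hypothetical 4 x 4 predictor rasters with 100 m cells.
residual = [
    [-6.0, -5.0, 4.0, 5.0],
    [-5.5, -4.5, 4.5, 5.5],
    [-6.5, -5.0, 5.0, 6.0],
    [-5.0, -4.0, 4.0, 6.5],
]
curvature = [
    [0.1, 0.2, -0.3, -0.2],
    [0.2, 0.1, -0.2, -0.1],
    [0.3, 0.2, -0.1, -0.3],
    [0.1, 0.3, -0.2, -0.2],
]

# One hypothetical pick at each cell centre; the western half subcrops
# "Formation A" and the eastern half "Formation B".
points = [
    (col * 100.0 + 50.0, 400.0 - (row * 100.0 + 50.0),
     "Formation A" if col < 2 else "Formation B")
    for row in range(4)
    for col in range(4)
]

features, labels = build_training_set(points, [residual, curvature], 0.0, 400.0, 100.0)
print(f"10-fold accuracy: {cross_val_accuracy(features, labels):.2f}")
```

Random folds such as these can overstate accuracy when neighbouring picks are spatially correlated, which is one reason a visual check of geological plausibility, as in criterion (b), is a useful complement to the cross-validation score.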