Abstract:
During the Anthropocene, and especially in the past decades, Earth's environment has undergone major changes, and the planetary boundaries are increasingly under pressure. Because soil affects climate as a compartment of the carbon and nitrogen cycles, it is an important resource in addressing these environmental problems. Consequently, knowledge about soil, soil processes and soil functions plays an essential role in research on, and solutions for, these severe environmental and socio-economic challenges. The mapping and modelling of soil provides spatial knowledge of soil status and its changes over time, which makes it possible to assess and evaluate soil management practices and attempts to solve environmental problems. Machine learning methods have proven suitable for the spatial mapping and modelling of soil, but they are often black boxes whose decisions and prediction results remain unexplained. Explainable soil models based on machine learning, however, would facilitate the detection of environmental changes, contribute to decision making for environmental protection and foster acceptance in science, politics, and society. Recent efforts in machine learning have therefore aimed to expand the conventional machine learning framework to explainable machine learning in order to 1) justify decisions, 2) control and 3) improve models, and 4) discover new knowledge. The core elements of explainable machine learning are transparency, interpretability and explainability. Additionally, domain knowledge and scientific consistency are crucial. To date, however, the concepts of explainable machine learning have played only a marginal role in soil modelling and mapping. The objective of this thesis was to explore and describe how transparency, interpretability and explainability can be achieved in the soil mapping framework. The example studies showed how scientific consistency can be evaluated with model comparison and how domain knowledge was incorporated in digital soil mapping (DSM) models.
The studies further showed how transparency can be accomplished with reproducible sample and covariate selection, and how the interpretation of the models can be linked with domain knowledge about soil formation and soil processes to explain the model results.