AUTHOR=Baiddah Abdeslam, Krimissa Samira, Hajji Sonia, Ismaili Maryem, Abdelrahman Kamal, El Bouzekraoui Meryem, Eloudi Hasna, Elaloui Abdenbi, Khouz Abdellah, Badreldin Nasem, Namous Mustapha

TITLE=Head-cut gully erosion susceptibility mapping in semi-arid region using machine learning methods: insight from the High Atlas, Morocco

JOURNAL=Frontiers in Earth Science

VOLUME=11

YEAR=2023

URL=https://www.frontiersin.org/journals/earth-science/articles/10.3389/feart.2023.1184038

DOI=10.3389/feart.2023.1184038

ISSN=2296-6463

ABSTRACT=<p>Gully erosion has been identified in recent decades as a global threat to people and property. This problem also affects the socioeconomic stability of societies and therefore limits their sustainable development, as it affects soil, a resource that is nonrenewable on a human time scale. The focus of this study is to evaluate the prediction performance of four machine learning (ML) models: Logistic Regression (LR), Classification and Regression Tree (CART), Linear Discriminant Analysis (LDA), and k-Nearest Neighbors (kNN), which are novel approaches in gully erosion modeling research, particularly in semi-arid regions with a mountainous character. A total of 204 samples of erosion areas and 204 samples of non-erosion areas were collected through field surveys and high-resolution satellite images, and 17 significant factors were considered. The samples were randomly split into training (70%) and testing (30%) sets to assess the robustness of the different models. The functional relationship between soil erosion and the effective factors was computed using the ML models. The ML models were evaluated using different metrics, including accuracy and the kappa coefficient. kNN is the best-performing model for this study. The AUC value from the ROC curve for the kNN testing dataset is 0.93; the remaining models also achieve high AUC values, close to that of kNN. The AUC values from ROC of GLM, LDA, and CART for the testing datasets are 0.90, 0.91, and 0.84, respectively. The accuracy values for the validation datasets of LDA, CART, kNN, and GLM are 0.85, 0.82, 0.89, and 0.84, respectively. The Kappa values of LDA, CART, and GLM for the testing datasets are 0.70, 0.65, and 0.68, respectively. The ML models, in particular kNN, GLM, and LDA, achieved outstanding results in creating soil erosion susceptibility maps.
The maps created with the most reliable models could be a useful tool for sustainable management, watershed conservation, and prevention of soil and water losses.</p>