AUTHOR=Zydroń Tymoteusz, Demczuk Piotr, Gruchot Andrzej

TITLE=Assessment of Landslide Susceptibility of the Wiśnickie Foothills Mts. (The Flysch Carpathians, Poland) Using Selected Machine Learning Algorithms

JOURNAL=Frontiers in Earth Science

VOLUME=Volume 10 - 2022

YEAR=2022

URL=https://www.frontiersin.org/journals/earth-science/articles/10.3389/feart.2022.872192

DOI=10.3389/feart.2022.872192

ISSN=2296-6463

ABSTRACT=Landslides are well-known phenomena that significantly alter terrain relief, often damaging technical infrastructure and causing loss of life. One way to reduce their negative impact on people's lives and property is to identify areas that are prone to their occurrence. The most common approach to this problem is to prepare landslide susceptibility maps. These can factor in the actual location of landslides or the causal relationship between landslides and selected environmental factors. Classifying landslide-prone areas is a challenging task when landslide density is not high and the area of analysis is large. We prepared shallow landslide susceptibility maps at 10 × 10 m resolution for the Wiśnickie Foothills (Western Carpathians, Poland) using eleven machine learning algorithms from the Python libraries scikit-learn and imbalanced-learn. The analyzed area is characterized by a mean density of 3.4 surficial landslides (composed of soils and rocks) per km². We also compared different approaches to imbalanced data sets. The eleven algorithms were Logistic Regression, Naive Bayes, Random Forest, AdaBoost, Bagging, ExtraTrees (Extremely Randomized Trees), EasyEnsemble, Balanced Bagging, Balanced Random Forest, RUSBoost, and a hybrid model combining the RandomUnderSampler and Multi-layer Perceptron algorithms. The environmental factors (slope inclination and aspect, distance from rivers, geological complexes, soil type and permeability, groundwater table depth, profile and plan curvature, mean annual rainfall) were categorized and divided into training (70%) and testing (30%) sets. Accuracy, recall, G-mean, and area under the receiver operating characteristic curve (AUC) were used to validate the quality of the models. The results confirmed that algorithms based on decision tree classifiers are suitable for preparing landslide susceptibility maps.
We also found that methods that generate random undersampling subsets (EasyEnsemble, Balanced Bagging) and ensemble methods (Bagging, AdaBoost, ExtraTrees) yield test results very similar to those of methods that use full sets of data for training. Relatively high-quality results were also obtained by integrating the RandomUnderSampler algorithm with the Multi-layer Perceptron algorithm.
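
The role of recall and G-mean in validating models on imbalanced data can be illustrated with a minimal sketch. It assumes binary labels (1 = landslide cell, 0 = stable cell) and the standard definition of G-mean as the square root of recall times specificity. The toy labels and all function names below are hypothetical and are not taken from the study.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fn, tn, fp) for binary labels, where 1 = landslide cell."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp, fn, tn, fp


def accuracy(y_true, y_pred):
    """Share of all cells that were classified correctly."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fn + tn + fp)


def recall(y_true, y_pred):
    """Share of true landslide cells that were detected (sensitivity)."""
    tp, fn, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(y_true, y_pred):
    """Share of true stable cells that were classified as stable."""
    _, _, tn, fp = confusion_counts(y_true, y_pred)
    return tn / (tn + fp) if (tn + fp) else 0.0


def g_mean(y_true, y_pred):
    """Geometric mean of recall and specificity."""
    return math.sqrt(recall(y_true, y_pred) * specificity(y_true, y_pred))


# Hypothetical toy data: 95 stable cells followed by 5 landslide cells.
y_true = [0] * 95 + [1] * 5

# Model A (hypothetical): labels every cell as stable.
always_stable = [0] * 100

# Model B (hypothetical): 15 false alarms, but detects 4 of the 5 landslides.
detector = [1] * 15 + [0] * 80 + [1] * 4 + [0] * 1

for name, y_pred in [("always stable", always_stable), ("detector", detector)]:
    print(
        f"{name}: accuracy={accuracy(y_true, y_pred):.2f}, "
        f"recall={recall(y_true, y_pred):.2f}, "
        f"G-mean={g_mean(y_true, y_pred):.2f}"
    )
```

In this toy case the model that never predicts a landslide has the higher accuracy but zero recall and zero G-mean, which suggests why accuracy alone is a weak quality measure when landslide cells are rare.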