AUTHOR=Su Qiaomei , Tao Weiheng , Mei Shiguang , Zhang Xiaoyuan , Li Kaixin , Su Xiaoye , Guo Jianli , Yang Yonggang TITLE=Landslide Susceptibility Zoning Using C5.0 Decision Tree, Random Forest, Support Vector Machine and Comparison of Their Performance in a Coal Mine Area JOURNAL=Frontiers in Earth Science VOLUME=9 YEAR=2021 URL=https://www.frontiersin.org/journals/earth-science/articles/10.3389/feart.2021.781472 DOI=10.3389/feart.2021.781472 ISSN=2296-6463 ABSTRACT=

The main purpose of this study is to establish an effective landslide susceptibility zoning model and test whether underground mined areas and ground collapse in coal mine areas seriously affect the occurrence of landslides. Taking the Fenxi Coal Mine Area of Shanxi Province in China as the research area, landslide data has been investigated by the Shanxi Geological Environment Monitoring Center; adopting the 5-fold cross-validation method, and through Geostatistics analysis means the datasets of all non-landslides and landslides were divided into 80:20 proportions randomly for training and validating models. A set of 15 condition factors including terrain, geological, hydrological, land cover, and human engineering activity factors (distance to road, distance to mined area, ground collapse density) were selected as the evaluation indices to construct the susceptibility assessment model. Three machine learning algorithms for landslide susceptibility prediction (LSP) including C5.0 Decision Tree (C5.0), Random Forest (RF), and Support Vector Machine (SVM) have been selected and compared through the Areas under the Receiver Operating Characteristics (ROC) Curves (AUC), and several statistical estimates. The study revealed that for these three models the value range of prediction accuracies vary from 83.49 to 99.29% (in the training stage), and 62.26–73.58% (in the validation stage). In the two stages, AUCs are between 0.92 to 0.99 and 0.71 to 0.80 respectively. Using Jenks Natural Breaks algorithm, three LSPs levels are established as very low, low, medium, high, and very high probability of landslide by dividing the indices of the LSP. Compared with RF and SVM, C5.0 is considered better in five categories according to quantities and distribution of the landslides and their area percentage for different LSP zones. Four factors such as distance to road, lithology, profile curvature, and ground collapse density are the most suitable condition factors for LSP. The distance to mine area factor has a medium contribution and plays an obvious role in the occurrence of landslides in all the models. The result reveals that C5.0 possesses better prediction efficiency than RF and SVM, and underground mined area and ground collapse sifnigicantly affect significantly the occurrence of landslides in the Fenxi Coal Mine Area.