AUTHOR=Mohammed Musaab A. A. , Kaya Fuat , Mohamed Ahmed , Alarifi Saad S. , Abdelrady Ahmed , Keshavarzi Ali , Szabó Norbert P. , Szűcs Péter TITLE=Application of GIS-based machine learning algorithms for prediction of irrigational groundwater quality indices JOURNAL=Frontiers in Earth Science VOLUME=11 YEAR=2023 URL=https://www.frontiersin.org/journals/earth-science/articles/10.3389/feart.2023.1274142 DOI=10.3389/feart.2023.1274142 ISSN=2296-6463 ABSTRACT=

Agriculture is considered one of the primary elements for socioeconomic stability in most parts of Sudan. Consequently, the irrigation water should be properly managed to achieve sustainable crop yield and soil fertility. This research aims to predict the irrigation indices of sodium adsorption ratio (SAR), sodium percentage (Na%), permeability index (PI), and potential salinity (PS) using innovative machine learning (ML) techniques, including K-nearest neighbor (KNN), random forest (RF), support vector regression (SVR), and Gaussian process regression (GPR). Thirty-seven groundwater samples are collected and analyzed for twelve physiochemical parameters (TDS, pH, EC, TH, Ca+2, Mg+2, Na+, HCO3, Cl, SO4−2, and NO3) to assess the hydrochemical characteristics of groundwater and its suitability for irrigation purposes. The primary investigation indicated that the samples are dominated by Ca-Mg-HCO3 and Na-HCO3 water types resulted from groundwater recharge and ion exchange reactions. The observed irrigation indices of SAR, Na%, PI, and PS showed average values of 7, 42.5%, 64.7%, and 0.5, respectively. The ML modeling is based on the ion’s concentration as input and the observed values of the indices as output. The data is divided into two sets for training (70%) and validation (30%), and the models are validated using a 10-fold cross-validation technique. The models are tested with three statistical criteria, including mean square error (MSE), root means square error (RMSE), and correlation coefficient (R2). The SVR algorithm showed the best performance in predicting the irrigation indices, with the lowest RMSE value of 1.45 for SAR. The RMSE values for the other indices, Na%, PI, and PS, were 6.70, 7.10, and 0.55, respectively. The models were applied to digital predictive data in the Nile River area of Khartoum state, and the uncertainty of the maps was estimated by running the models 10 times iteratively. The standard deviation maps were generated to assess the model’s sensitivity to the data, and the uncertainty of the model can be used to identify areas where a denser sampling is needed to improve the accuracy of the irrigation indices estimates.