AUTHOR=McKelvey Sean, Abassi Amirhassan, Nataraj C., Duran Metin TITLE=Data-driven modeling techniques for prediction of settled water turbidity in drinking water treatment JOURNAL=Frontiers in Environmental Engineering VOLUME=3 YEAR=2024 URL=https://www.frontiersin.org/journals/environmental-engineering/articles/10.3389/fenve.2024.1401180 DOI=10.3389/fenve.2024.1401180 ISSN=2813-5067 ABSTRACT=
Drinking water treatment is a complex system of chemical, physical, and biological processes that is highly dependent on water quality and the design of the treatment process. To create decision-support tools, the prediction of key performance indicators, such as settled water turbidity, is needed. A variety of data-driven modeling techniques is available to formulate such predictions. Data-driven models provide valuable tools for formulating predictions where mechanistic models are lacking or the mechanisms are not fully understood, as in surface water treatment. The objective of this paper is to evaluate and compare the effectiveness of various data-driven techniques for this important, but difficult, problem. Recognizing that the size and quality of the dataset are most critical in this kind of analysis, this work uses one of the largest datasets applied in this context, consisting of 2,527 vectors of water quality and operational data (a 2,527 × 9 data frame) from a full-scale water treatment plant. The paper constructs several data-driven models, including k-nearest neighbor (KNN) regression, polynomial regression, and artificial neural networks (ANN), and compares their performance. Based on test scaled root mean square error (RMSE), the ANN model was the most predictive (0.124). Similarly, the ANN model had the best predictive performance based on total scaled RMSE (0.086). These results show that ANNs have a high potential for the development of a future decision-support system for selecting appropriate coagulant doses based on settled water turbidity.
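As an illustration of the comparison metric, the following minimal Python sketch shows one way a scaled RMSE could be computed, together with a toy KNN regression. It is not the authors' implementation. The abstract does not define the scaling, so min-max scaling of the target to [0, 1] is an assumption here, and the function names (`min_max_scale`, `scaled_rmse`, `knn_predict`) and all numeric values are hypothetical.

```python
import math


def min_max_scale(values, lo, hi):
    """Map values onto [0, 1] relative to the bounds lo and hi.

    Assumption: the abstract does not state how RMSE was scaled;
    min-max scaling is used here purely for illustration.
    """
    span = hi - lo
    if span == 0:
        raise ValueError("cannot scale a constant series")
    return [(v - lo) / span for v in values]


def scaled_rmse(y_true, y_pred):
    """RMSE after scaling both series by the range of the observed values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    lo, hi = min(y_true), max(y_true)
    t = min_max_scale(y_true, lo, hi)
    p = min_max_scale(y_pred, lo, hi)
    mse = sum((a - b) ** 2 for a, b in zip(t, p)) / len(t)
    return math.sqrt(mse)


def knn_predict(train_X, train_y, x, k=3):
    """Predict the target at x as the mean target of the k nearest
    training rows, using Euclidean distance."""
    by_distance = sorted(
        (math.dist(row, x), y) for row, y in zip(train_X, train_y)
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Hypothetical toy data: one input feature and a turbidity-like target.
train_X = [[0.0], [1.0], [10.0]]
train_y = [1.0, 3.0, 100.0]

prediction = knn_predict(train_X, train_y, [0.5], k=2)
error = scaled_rmse([0.0, 2.0], [1.0, 1.0])
```

Because the error is expressed relative to the observed range of the target, values such as those reported above can be compared across models trained on the same data. In practice the input features would also be scaled before computing KNN distances, since the method is sensitive to differences in feature magnitude.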