AUTHOR=Wang Peng, Cheng Shuwen, Li Yaxin, Liu Li, Liu Jia, Zhao Qiang, Luo Shuang TITLE=Prediction of Lumbar Drainage-Related Meningitis Based on Supervised Machine Learning Algorithms JOURNAL=Frontiers in Public Health VOLUME=10 YEAR=2022 URL=https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2022.910479 DOI=10.3389/fpubh.2022.910479 ISSN=2296-2565 ABSTRACT=Background

Lumbar drainage is widely used in clinical practice; however, tools for forecasting lumbar drainage-related meningitis (LDRM) are limited. We aimed to establish prediction models using supervised machine learning (ML) algorithms.

Methods

We used a cohort of 273 eligible lumbar drainage cases. Data were preprocessed and split into training and testing sets. Optimal hyper-parameters were obtained by 10-fold cross-validation and grid search. The support vector machine (SVM), random forest (RF), and artificial neural network (ANN) were adopted for model training. The area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), true positive ratio (TPR), true negative ratio (TNR), specificity, sensitivity, accuracy, and kappa coefficient were used for model evaluation. All trained models were internally validated. The importance of features was also analyzed.
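
As an illustration of the threshold-based metrics listed above, the following minimal sketch computes TPR, TNR, accuracy, and Cohen's kappa from binary labels. It is not the authors' code: the function names and the toy labels are assumptions made up for this example, and it relies on the usual definitions, under which TPR coincides with sensitivity and TNR with specificity.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = meningitis, 0 = no meningitis)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def evaluate(y_true, y_pred):
    """Return TPR, TNR, accuracy, and Cohen's kappa as a dict."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    n = tp + tn + fp + fn
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")  # sensitivity
    tnr = tn / (tn + fp) if (tn + fp) else float("nan")  # specificity
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    if p_expected == 1.0:
        kappa = float("nan")  # undefined when chance agreement is perfect
    else:
        kappa = (accuracy - p_expected) / (1.0 - p_expected)
    return {"tpr": tpr, "tnr": tnr, "accuracy": accuracy, "kappa": kappa}


# Hypothetical labels for ten cases; not data from the study.
toy_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
toy_metrics = evaluate(toy_true, toy_pred)
```

Kappa corrects accuracy for the agreement expected by chance, which matters when the positive class is rare, as an outcome such as meningitis typically is.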

Results

In the training set, all the models had an AUROC exceeding 0.8. The SVM and RF models had an AUPRC of more than 0.6, but the ANN model had an unexpectedly low AUPRC (0.380). The RF and ANN models showed similar TPR, whereas the ANN model had a higher TNR and demonstrated better specificity, sensitivity, accuracy, and kappa coefficient. In the testing set, most performance indicators of the established models decreased. However, the RF and SVM models maintained adequate AUROC (0.828 vs. 0.719) and AUPRC (0.413 vs. 0.520), and the RF model also had better TPR, specificity, sensitivity, accuracy, and kappa coefficient. Site leakage showed the largest mean decrease in accuracy.
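
The idea behind a mean decrease in accuracy can be sketched as follows: shuffle one feature at a time and measure how much the model's accuracy drops. This is a model-agnostic permutation version, not the out-of-bag computation a random forest implementation would normally use, and not the authors' code. The function names, the toy classifier, and the two-feature data set are all hypothetical.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(1 for row, label in zip(X, y) if model(row) == label) / len(y)


def mean_decrease_accuracy(model, X, y, n_repeats=20, seed=0):
    """Average drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical data: feature 0 plays the role of an informative predictor,
# feature 1 is noise. These values are invented for the example.
toy_X = [[1, 0], [1, 1], [1, 0], [1, 1], [0, 0], [0, 1], [0, 0], [0, 1]]
toy_y = [1, 1, 1, 1, 0, 0, 0, 0]


def toy_model(row):
    """Stand-in classifier that predicts from feature 0 only."""
    return row[0]


toy_importances = mean_decrease_accuracy(toy_model, toy_X, toy_y)
```

A feature the model ignores scores zero here, while a feature the predictions depend on scores higher. That is the sense in which the predictor with the largest decrease is ranked as most important.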

Conclusions

The RF and SVM models could predict LDRM; of the two, the RF model had the best performance, and site leakage was the most meaningful predictor.