AUTHOR=Wen Zhongjian, Wang Yiren, Chen Shouying, Li Yunfei, Deng Hairui, Pang Haowen, Guo Shengmin, Zhou Ping, Zhu Shiqin TITLE=Construction of a predictive model for postoperative hospitalization time in colorectal cancer patients based on interpretable machine learning algorithm: a prospective preliminary study JOURNAL=Frontiers in Oncology VOLUME=14 YEAR=2024 URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2024.1384931 DOI=10.3389/fonc.2024.1384931 ISSN=2234-943X ABSTRACT=Objective

This study aims to construct a machine learning model that predicts the risk of prolonged postoperative hospital stay in colorectal cancer patients, and to analyze the preoperative and postoperative factors associated with extended hospitalization.

Methods

We prospectively collected clinical data from 83 colorectal cancer patients. The dataset comprised 40 variables (39 predictors and 1 target). Important variables were selected with the Lasso regression algorithm, and predictive models were built with ten machine learning algorithms: Logistic Regression, Decision Tree, Random Forest, Support Vector Machine, Light Gradient Boosting Machine, K-Nearest Neighbors (KNN), Extreme Gradient Boosting, Categorical Boosting, Artificial Neural Network, and Deep Forest. Model performance was evaluated using Bootstrap ROC curves and calibration curves; the best-performing model was then interpreted with the SHAP explainability algorithm.
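The Lasso variable-selection step can be illustrated with a minimal sketch. It assumes a linear Lasso fitted by coordinate descent on standardized predictors, with a 0/1 outcome treated as numeric. The abstract does not state which Lasso variant, penalty value, or software the study used, so the function names, the `penalty` values, and the synthetic cohort below are all hypothetical.

```python
import random
import statistics


def standardize_columns(rows):
    """Scale every column to mean 0 and population standard deviation 1."""
    scaled = []
    for col in zip(*rows):
        mean = statistics.fmean(col)
        sd = statistics.pstdev(col) or 1.0  # guard against constant columns
        scaled.append([(v - mean) / sd for v in col])
    return [list(r) for r in zip(*scaled)]


def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; values within the penalty become 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coefficients(rows, target, penalty, n_sweeps=200):
    """Minimise (1/2n)*||y - Xb||^2 + penalty*||b||_1 by coordinate descent."""
    X = standardize_columns(rows)
    y_mean = statistics.fmean(target)
    residual = [t - y_mean for t in target]  # equals y - Xb while b is all zeros
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            # Covariance of column j with the residual that excludes column j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            updated = soft_threshold(rho, penalty)
            delta = updated - beta[j]
            if delta:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = updated
    return beta


def select_features(rows, target, penalty):
    """Return the indices of predictors whose Lasso coefficient is non-zero."""
    coefs = lasso_coefficients(rows, target, penalty)
    return [j for j, b in enumerate(coefs) if b != 0.0]


def make_synthetic_cohort(n_patients=83, n_features=6, seed=0):
    """Hypothetical data: only predictors 0 and 1 drive the binary outcome."""
    rng = random.Random(seed)
    rows = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_patients)]
    target = [1 if 2 * r[0] - 2 * r[1] + rng.gauss(0, 0.5) > 0 else 0 for r in rows]
    return rows, target


if __name__ == "__main__":
    rows, target = make_synthetic_cohort()
    print("selected predictor indices:", select_features(rows, target, penalty=0.02))
```

A larger `penalty` removes more predictors. In practice the penalty is usually chosen by cross-validation, which this sketch omits.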

Results

Ten important variables were identified through Lasso regression. Validation used 1000 Bootstrap resamplings, with results presented as Bootstrap ROC curves. The Logistic Regression (LR) model achieved the highest AUC (AUC = 0.99, 95% CI = 0.97–0.99). SHAP analysis showed that the distance walked on the third day after surgery was the most important variable in the LR model.
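As a rough illustration of how an AUC with a bootstrap 95% confidence interval can be obtained, the sketch below applies a percentile bootstrap to predicted scores. The percentile method, the function names, and the toy labels and scores are assumptions made for illustration; the abstract does not specify how the study computed its interval.

```python
import random


def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def bootstrap_auc_ci(labels, scores, n_resamples=1000, alpha=0.05, seed=0):
    """Return (point estimate, lower, upper) using a percentile bootstrap."""
    point = roc_auc(labels, scores)  # also fails early if one class is missing
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]  # sample patients with replacement
        sample_labels = [labels[i] for i in idx]
        if len(set(sample_labels)) < 2:
            continue  # this resample lacks one class, so draw again
        aucs.append(roc_auc(sample_labels, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_resamples)]
    upper = aucs[int((1 - alpha / 2) * n_resamples) - 1]
    return point, lower, upper


if __name__ == "__main__":
    # Hypothetical labels (1 = prolonged stay) and model scores.
    labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
    scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.75, 0.85, 0.95]
    point, lower, upper = bootstrap_auc_ci(labels, scores)
    print(f"AUC={point:.2f}, 95% CI={lower:.2f}-{upper:.2f}")
```

Resampling whole patients keeps each label paired with its score, so the spread of the resampled AUCs reflects sampling variability in the cohort.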

Conclusion

This study constructed a model that predicts postoperative length of hospital stay from patients’ clinical data. The model may give healthcare professionals a more precise prediction tool in clinical practice and a basis for personalized nursing interventions, which could in turn improve patient prognosis and quality of life and make the use of medical resources more efficient.