AUTHOR=Dai Pingping, Chang Weifu, Xin Zirui, Cheng Haiwei, Ouyang Wei, Luo Aijing TITLE=Retrospective Study on the Influencing Factors and Prediction of Hospitalization Expenses for Chronic Renal Failure in China Based on Random Forest and LASSO Regression JOURNAL=Frontiers in Public Health VOLUME=9 YEAR=2021 URL=https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2021.678276 DOI=10.3389/fpubh.2021.678276 ISSN=2296-2565 ABSTRACT=

Aim: With the improvement in people's living standards, the incidence of chronic renal failure (CRF) is increasing annually. The increase in the number of patients with CRF has significantly increased pressure on China's medical budget. Predicting hospitalization expenses for CRF can provide guidance for effective allocation and control of medical costs. The purpose of this study was to use the random forest (RF) method and least absolute shrinkage and selection operator (LASSO) regression to predict personal hospitalization expenses of hospitalized patients with CRF and to evaluate related influencing factors.

Methods: The data set was drawn from the front-page data of medical records at three tertiary first-class hospitals, covering the whole of 2016. Factors influencing hospitalization expenses for CRF were analyzed. RF and LASSO regression models were then built to predict the hospitalization expenses of patients with CRF, and the two models were compared and evaluated.
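
For reference, the LASSO estimator named in the Methods is usually written as a penalized least-squares problem. The notation below is an assumption introduced for illustration and does not come from the study: y is the vector of n observed expenses, X the matrix of predictors, beta the coefficient vector, and lambda >= 0 the penalty weight. The 1/(2n) scaling is one common convention; other sources scale the squared-error term differently.

```latex
\hat{\beta}^{\text{LASSO}}
  = \arg\min_{\beta}\;
    \frac{1}{2n}\,\lVert y - X\beta \rVert_2^2
    + \lambda\,\lVert \beta \rVert_1
```

Because the L1 penalty can shrink some coefficients exactly to zero, LASSO performs variable selection as well as shrinkage, which is why it lends itself to screening influencing factors.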

Results: For CRF inpatients, statistically significant differences in hospitalization expenses were found for major procedures, medical payment method, hospitalization frequency, length of stay, number of other diagnoses, and number of procedures. The R² values of the LASSO regression model and the RF regression model were 0.6992 and 0.7946, respectively. The mean absolute error (MAE) and root mean square error (RMSE) of the LASSO regression model were 0.0268 and 0.043, respectively, and the MAE and RMSE of the RF prediction model were 0.0171 and 0.0355, respectively. In the RF model, length of stay had the highest weight (0.730).
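
The three evaluation metrics reported above (R², MAE, and RMSE) can be computed from paired observed and predicted values. The following is a minimal sketch using their standard definitions, not the authors' code; the function name `regression_metrics` and the sample values are hypothetical and were made up for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, MAE, RMSE) for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)              # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(ss_res / n)
    return r2, mae, rmse


# Hypothetical values for illustration only; these are NOT data from the study.
observed = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.37]
r2, mae, rmse = regression_metrics(observed, predicted)
print(f"R2={r2:.4f}  MAE={mae:.4f}  RMSE={rmse:.4f}")
```

A higher R² together with a lower MAE and RMSE indicates a better fit, which is the basis on which the RF model is judged superior to the LASSO model here.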

Conclusions: The hospitalization expenses of patients with CRF are most affected by length of stay. The RF prediction model is superior to the LASSO regression model and can be used to predict the hospitalization expenses of patients with CRF. Health administration departments may consider formulating accurate individualized hospitalization expense reimbursement mechanisms accordingly.