In this retrospective study, we aimed to identify key risk factors for hepatocellular carcinoma (HCC) with a tumor diameter ≥ 5 cm and to build an interpretable prognostic model, using Lasso regression for variable selection, to support risk stratification and clinical decision-making.
We included 843 patients with advanced hepatocellular carcinoma (HCC) and a tumor diameter ≥ 5 cm. Lasso regression was used to screen candidate variables, and the selected variables were used to build Cox proportional hazards regression and random survival forest (RSF) models. The better-performing model was selected by comparing the area under the curve (AUC). That model was then visualized, its predictors were ranked by interpretable importance, and a risk stratification was established to identify high-risk patients.
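To make the model-comparison step concrete, the following is a minimal Python sketch of how an AUC at fixed time horizons could be computed for two sets of risk scores and used to choose between models. It is an illustration under stated assumptions, not the study's analysis: the function names, the toy data, the horizons, and the rule of picking the highest mean AUC are all hypothetical, and patients censored before a horizon are simply dropped rather than handled with censoring weights.

```python
def auc_at_horizon(times, events, scores, horizon):
    """Cumulative/dynamic AUC at `horizon`, without censoring weights.

    Cases: event observed at or before `horizon`.
    Controls: still under follow-up after `horizon`.
    Subjects censored before `horizon` are dropped (a simplification).
    """
    rows = list(zip(times, events, scores))
    cases = [s for t, e, s in rows if e == 1 and t <= horizon]
    controls = [s for t, e, s in rows if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0      # case ranked above control
            elif c == k:
                wins += 0.5      # ties count as half
    return wins / (len(cases) * len(controls))


def pick_model(times, events, model_scores, horizons):
    """Return (best model name, per-model AUC lists).

    "Best" means highest mean AUC across horizons (an assumed rule).
    """
    aucs = {
        name: [auc_at_horizon(times, events, s, h) for h in horizons]
        for name, s in model_scores.items()
    }
    best = max(aucs, key=lambda name: sum(aucs[name]) / len(aucs[name]))
    return best, aucs


# Toy data (made up, not from the study): months, death flag, risk scores.
times = [3, 5, 8, 12, 15, 20, 24, 30]
events = [1, 1, 0, 1, 1, 0, 1, 0]
model_scores = {
    "cox": [0.9, 0.8, 0.4, 0.7, 0.5, 0.3, 0.2, 0.1],
    "rsf": [0.6, 0.9, 0.4, 0.3, 0.7, 0.5, 0.2, 0.1],
}
best, aucs = pick_model(times, events, model_scores, horizons=[6, 12, 18])
print(best, aucs)
```

On this toy input the "cox" scores separate cases from controls at every horizon, so that model is selected. A real analysis would typically use an inverse-probability-of-censoring-weighted estimator from an established survival library rather than this simplified version.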
Lasso regression identified 8 characteristic risk factors. The Lasso-Cox model achieved AUC values of 0.773, 0.758, and 0.799, compared with 0.734, 0.695, and 0.741 for the Lasso-RSF model, so the Lasso-Cox model was selected. Interpretability analysis using SHAP values ranked the most important risk factors, in descending order, as tumor number, Barcelona Clinic Liver Cancer (BCLC) stage, alkaline phosphatase (ALP), ascites, albumin (ALB), and aspartate aminotransferase (AST). In risk-score stratification and subgroup analysis, median overall survival (OS) was significantly longer in the low-risk group than in the middle- and high-risk groups.
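As an illustration of risk-score stratification, the sketch below computes a linear risk score from Cox-type coefficients, splits patients into low-, middle-, and high-risk groups, and reports the Kaplan-Meier median survival of each group. Everything specific in it is an assumption made for the example: the coefficients, the covariate names and coding, the tertile cut-offs, and the nine-patient cohort are hypothetical and are not the study's estimates or data.

```python
from fractions import Fraction

# Hypothetical coefficients for illustration only; not the study's estimates.
BETA = {"tumor_number": 0.5, "ascites": 0.4, "alp_x100": 0.2}


def risk_score(patient):
    """Linear predictor of a Cox-type model: sum of coefficient * covariate."""
    return sum(BETA[name] * patient[name] for name in BETA)


def stratify(scores):
    """Label each score 'low', 'middle', or 'high' by rank-based tertile."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [""] * n
    for rank, i in enumerate(order):
        labels[i] = ("low", "middle", "high")[3 * rank // n]
    return labels


def km_median(times, events):
    """Kaplan-Meier median: first time the survival estimate is <= 1/2.

    Returns None if the estimate never falls that far (median not reached).
    """
    at_risk = len(times)
    surv = Fraction(1)  # exact arithmetic avoids rounding at the 1/2 boundary
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        if deaths:
            surv *= Fraction(at_risk - deaths, at_risk)
            if surv <= Fraction(1, 2):
                return t
        at_risk -= sum(1 for ti in times if ti == t)
    return None


# Toy cohort (made up): covariates, follow-up in months, death indicator.
cohort = [
    {"tumor_number": 1, "ascites": 0, "alp_x100": 0.8, "months": 30, "died": 0},
    {"tumor_number": 1, "ascites": 0, "alp_x100": 1.2, "months": 24, "died": 1},
    {"tumor_number": 1, "ascites": 1, "alp_x100": 1.0, "months": 14, "died": 1},
    {"tumor_number": 2, "ascites": 0, "alp_x100": 1.0, "months": 18, "died": 1},
    {"tumor_number": 2, "ascites": 1, "alp_x100": 1.5, "months": 6, "died": 1},
    {"tumor_number": 3, "ascites": 0, "alp_x100": 2.0, "months": 4, "died": 1},
    {"tumor_number": 3, "ascites": 1, "alp_x100": 2.5, "months": 9, "died": 1},
    {"tumor_number": 1, "ascites": 0, "alp_x100": 1.0, "months": 36, "died": 1},
    {"tumor_number": 2, "ascites": 1, "alp_x100": 1.0, "months": 10, "died": 0},
]
scores = [risk_score(p) for p in cohort]
labels = stratify(scores)
medians = {}
for group in ("low", "middle", "high"):
    members = [p for p, g in zip(cohort, labels) if g == group]
    medians[group] = km_median(
        [p["months"] for p in members], [p["died"] for p in members]
    )
print(medians)
```

For this toy cohort the medians come out as 36, 14, and 6 months for the low-, middle-, and high-risk groups. Testing whether such differences are statistically significant would require an additional step, for example a log-rank test, which is not shown here.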
We developed an interpretable predictive model for intermediate- and advanced-stage HCC with a tumor diameter ≥ 5 cm using Lasso-Cox regression. The model showed good predictive performance (AUC 0.758 to 0.799) and can be used for risk stratification.