AUTHOR=Zhang Wen-hai, Tan Yang, Huang Zhen, Tan Qi-xing, Zhang Yue-mei, Chen Bin-jie, Wei Chang-yuan TITLE=Development and validation of AI models using LR and LightGBM for predicting distant metastasis in breast cancer: a dual-center study JOURNAL=Frontiers in Oncology VOLUME=14 YEAR=2024 URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2024.1409273 DOI=10.3389/fonc.2024.1409273 ISSN=2234-943X ABSTRACT=Objective

This study aims to develop an artificial intelligence model that uses clinical blood markers, ultrasound data, and breast biopsy pathological information to predict distant metastasis in breast cancer patients.

Methods

Data from two medical centers were used. Clinical blood markers, ultrasound data, and breast biopsy pathological information were extracted and selected separately. Feature dimensionality reduction was performed using Spearman correlation and LASSO regression. Predictive models were constructed using the logistic regression (LR) and LightGBM machine learning algorithms and validated on internal and external validation sets. Feature correlation analysis was conducted for both models.
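The Spearman step of the feature reduction can be illustrated with a minimal sketch. The abstract does not state the correlation cutoff or the order in which features were compared, so the 0.9 threshold, the keep-first rule, the feature names, and the toy values below are all assumptions made for illustration, not the authors' procedure. The LASSO and model-fitting steps are not shown.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def drop_redundant(features, threshold=0.9):
    """Keep a feature only if |rho| with every already-kept feature is <= threshold.

    `features` maps a feature name to its list of values. The threshold
    and the keep-first ordering are assumptions, not taken from the study.
    """
    kept = []
    for name, values in features.items():
        if all(abs(spearman(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy data with hypothetical feature names (not from the study).
toy_features = {
    "ck_mb": [1.0, 2.0, 3.0, 4.0, 5.0],
    "ck_mb_rescaled": [10.0, 20.0, 35.0, 40.0, 90.0],  # monotone in ck_mb
    "tumor_size": [3.0, 1.0, 4.0, 1.5, 2.0],
}
selected = drop_redundant(toy_features)
print(selected)
```

In this toy example the rescaled marker is perfectly rank-correlated with the first one and is dropped, while the weakly correlated third feature is kept.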

Results

The LR model achieved AUC values of 0.892, 0.816, and 0.817 in the training, internal validation, and external validation cohorts, respectively. The LightGBM model achieved AUC values of 0.971, 0.861, and 0.890 in the same cohorts. Decision curve analysis showed that the LightGBM model provided a greater net benefit than the LR model in predicting distant metastasis in breast cancer. Key features identified included creatine kinase isoenzyme (CK-MB) and alpha-hydroxybutyrate dehydrogenase.
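For readers unfamiliar with the two metrics reported here, the following sketch shows how an AUC and a decision-curve net benefit are commonly computed. It uses the rank-based (Mann-Whitney) definition of AUC and the standard net benefit formula, TP/n - FP/n * pt/(1 - pt), at a probability threshold pt. The labels and scores are invented toy values, and the abstract does not say how the authors computed either quantity, so this is an illustration rather than their implementation.

```python
def auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case, counting ties as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, scores, threshold):
    """Net benefit at probability threshold pt (0 < pt < 1):
    TP/n - FP/n * pt / (1 - pt), treating score >= pt as a positive call."""
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Toy labels (1 = distant metastasis) and predicted probabilities; invented values.
y_toy = [1, 1, 0, 1, 0, 0]
p_toy = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

toy_auc = auc(y_toy, p_toy)
toy_nb_model = net_benefit(y_toy, p_toy, 0.5)
# "Treat all" reference strategy: every patient is called positive.
toy_nb_all = net_benefit(y_toy, [1.0] * len(y_toy), 0.5)
print(toy_auc, toy_nb_model, toy_nb_all)
```

A decision curve repeats the net benefit calculation over a range of thresholds and compares each model against the "treat all" and "treat none" (net benefit 0) strategies.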

Conclusion

This study developed an artificial intelligence model that uses clinical blood markers, ultrasound data, and pathological information to identify distant metastasis in breast cancer patients. The LightGBM model demonstrated superior predictive accuracy and clinical applicability, which suggests that it is a promising tool for the early diagnosis of distant metastasis in breast cancer.