ORIGINAL RESEARCH article

Front. Oncol.
Sec. Thoracic Oncology
Volume 14 - 2024 | doi: 10.3389/fonc.2024.1403392
This article is part of the Research Topic Advancements and Cutting-Edge Approaches to Counteract the Inefficacy of Immune Checkpoint Inhibitor Therapies in Lung Cancer

Construction of a risk prediction model for lung infection after chemotherapy in lung cancer patients based on the machine learning algorithm

Provisionally accepted
Tao Sun 1*, Jun Liu 2, Houqin Yuan 1, Xin Li 1, Hui Yan 1
  • 1 The Central Hospital of Shaoyang, Shaoyang, China
  • 2 The First Affiliated Hospital of Shaoyang University, Shaoyang, Hunan, China

The final, formatted version of the article will be published soon.

    Objective: This study aimed to develop and validate a machine learning (ML)-based model for predicting the likelihood of lung infection following chemotherapy in patients with lung cancer.

    Methods: A retrospective study was conducted on a cohort of 502 lung cancer patients undergoing chemotherapy. Data on age, body mass index (BMI), underlying disease, chemotherapy cycle, number of hospitalizations, and various blood test results were collected from medical records. The Synthetic Minority Oversampling Technique (SMOTE) was used to handle class imbalance. Feature screening was performed using the Boruta algorithm and the Least Absolute Shrinkage and Selection Operator (LASSO). Subsequently, six ML algorithms, namely Logistic Regression (LR), Random Forest (RF), Gaussian Naive Bayes (GNB), Multi-layer Perceptron (MLP), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN), were used to train and develop ML models with 10-fold cross-validation. Model performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), accuracy, sensitivity, specificity, F1 score, calibration curves, decision curves, clinical impact curves, and confusion matrices. In addition, model interpretation was performed with Shapley Additive Explanations (SHAP) analysis to clarify the importance of each feature and the basis of the model's decisions. Finally, nomograms were constructed to make the model's predictions more readable.

    Results: The combination of Boruta and LASSO identified Gender, Smoke, Drink, Chemotherapy cycles, pleural effusion (PE), neutrophil-lymphocyte count ratio (NLR), neutrophil-monocyte count ratio (NMR), lymphocytes (LYM), and neutrophils (NEUT) as significant predictors. The LR model outperformed the other ML algorithms, achieving an accuracy of 81.8%, a sensitivity of 81.1%, a specificity of 82.5%, an F1 score of 81.6%, and an AUC of 0.888 (95% CI: 0.863-0.911). Furthermore, SHAP analysis identified Chemotherapy cycles and Smoke as the primary factors driving the model's predictions. Finally, interactive nomograms and dynamic nomograms were successfully constructed.

    Conclusion: The ML model, combining demographic and clinical factors, accurately predicted post-chemotherapy lung infections in lung cancer patients. The LR model performed well and may improve early detection and treatment in clinical practice.
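
The threshold-based metrics named in the abstract (accuracy, sensitivity, specificity, F1 score) and the AUC can be computed from predicted labels and scores roughly as sketched here. This is a minimal, standard-library-only Python illustration and not the authors' code: the function names, the 0.5 decision threshold, and the eight-patient toy data are all assumptions made for the example and do not come from the study.

```python
from __future__ import annotations

from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 = infection (assumed coding)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def _ratio(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Accuracy, sensitivity (recall), specificity, and F1 from confusion-matrix counts."""
    sensitivity = _ratio(tp, tp + fn)   # true positive rate
    specificity = _ratio(tn, tn + fp)   # true negative rate
    precision = _ratio(tp, tp + fp)     # positive predictive value
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the positive
    case has the higher score; tied pairs count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))


# Hypothetical toy data: 4 infected (1) and 4 non-infected (0) patients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]   # predicted infection probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed 0.5 decision threshold

metrics = classification_metrics(*confusion_counts(y_true, y_pred))
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Note that AUC is computed from the ranking of the scores and does not depend on the threshold, whereas accuracy, sensitivity, specificity, and F1 all change if the threshold is moved.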

    Keywords: lung infection, chemotherapy, machine learning, logistic regression, predictive model, nomogram

    Received: 19 Mar 2024; Accepted: 23 Jul 2024.

    Copyright: © 2024 Sun, Liu, Yuan, Li and Yan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Tao Sun, The Central Hospital of Shaoyang, Shaoyang, China

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.