ORIGINAL RESEARCH article

Front. Immunol., 20 November 2024
Sec. Cancer Immunity and Immunotherapy
This article is part of the Research Topic: Artificial Intelligence for Cancer Immunotherapy

Prediction of benign and malignant pulmonary nodules using preoperative CT features: using PNI-GARS as a predictor

Yuxin Zhan1†, Feipeng Song2,3†, Wenjia Zhang3, Tong Gong4, Shuai Zhao1*, Fajin Lv2*
  • 1School of Science, Chongqing University of Technology, Chongqing, China
  • 2Department of Radiology, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
  • 3Department of Radiology, The Second Hospital of Shanxi Medical University, Taiyuan, China
  • 4Department of Radiology, Sichuan Provincial People’s Hospital, Chengdu, China

Purpose: The aim of this study was to develop and validate a prediction model for classification of pulmonary nodules based on preoperative CT imaging.

Materials and methods: Pulmonary nodule cases were retrospectively collected from Center 1 (training set: 2633; internal testing set: 1129) and from Centers 2 and 3 (external testing set: 218). Handcrafted features were extracted from noncontrast chest CT scans by three senior radiologists. A total of 22 handcrafted clinical and imaging parameters (age, gender, L-RADS, PNI-GARS, etc.) were used to construct machine learning models (random forest, gradient boosting, and explainable boosting) for the preoperative classification of pulmonary nodules, and the model parameters were adjusted to achieve optimal performance. To evaluate the prediction capacity and robustness of each model, both 5-fold and 10-fold cross-validation were used.

Results: The explainable boosting model had the best performance on our constructed data, achieving an accuracy of 89.9%, a precision of 97.48%, a specificity of 89.5%, a sensitivity of 91.1%, and an AUC of 90.3%. In the human-machine comparison, the AUC of the machine learning models (90.4%, 95% CI: 85.5%–94.8%) was significantly higher than that of the radiologists (60%, 95% CI: 50%–71.4%).

Conclusions: The explainable boosting model exhibited superior performance on our dataset, achieving high accuracy and precision in the diagnosis of pulmonary nodules compared to experienced radiologists.

Highlights

● CT imaging features can be used to predict the benign or malignant nature of pulmonary nodules.

● Preoperative machine learning model predicts malignancy of pulmonary nodules.

● PNI-GARS enhances lung nodule diagnosis by standardizing CT grading and integrating with machine learning for improved malignancy prediction.

Introduction

Lung cancer, the leading cause of cancer-related deaths worldwide, is responsible for a significant proportion of total cancer cases, with lung nodules often being the initial imaging manifestation of early-stage lung cancer (1–3). According to statistics released by the World Health Organization (WHO), there were approximately 2.21 million confirmed cases of lung cancer and 1.8 million deaths in 2020 (4). Lung cancer is one of the most dangerous malignancies, characterized by a poor prognosis and a low overall survival rate due to late detection and the limitations of conventional treatment (5, 6). The detection rate of pulmonary nodules has increased dramatically with the widespread use of multi-detector spiral CT scans. However, the majority of these nodules are benign; according to the 2011 National Lung Screening Trial (NLST), 96.4% of CT-detected lung nodules were not cancerous. Nevertheless, the presence of a lung nodule can cause significant anxiety for patients, creating a need for accurate assessment to differentiate between malignant and benign lesions (7). The transformation of a lung nodule into lung cancer is a complex process influenced by various factors, including the size, morphology, and growth rate of the nodule. For instance, lung nodules with a diameter greater than 15 mm, those located in the upper lobe, and those exhibiting features such as spiculation, pleural retraction, and bronchial truncation are considered high-risk and more likely to be malignant. For the diagnosis of lung nodules, Lung-RADS (Lung Imaging Reporting and Data System), a screening classification system for lung nodules, was proposed by the American College of Radiology (ACR) (8). Although Lung-RADS provides important guidance in the classification and management of pulmonary nodules, its limitations cannot be ignored, especially the lack of a comprehensive assessment of nodule imaging features such as margins, morphology, and spiculation. The development of a lung cancer risk prediction model therefore presents a strategic approach to mitigating the subjectivity and unreliability inherent in radiologist diagnoses, particularly for those with limited experience (9–12). With the rapid development of artificial intelligence technology, machine learning has shown great potential in the diagnosis and treatment of lung cancer (13, 14). Because lung cancer is the leading cause of cancer-related death worldwide, early diagnosis is crucial to improving treatment success and patient survival (15, 16). Machine learning models are able to identify signs of lung cancer by analyzing CT image data (17, 18). These models can automatically detect lung nodules, provide a quantitative assessment of nodule properties, and predict the histological type of lung cancer by analyzing the imaging characteristics of lung nodules, such as shape, margin, transparency, and uniformity (19–22).

The aim of this study is to incorporate a broader range of clinical and radiological features into the model, addressing the limitations of existing diagnostic systems such as the Lung-RADS classification criteria. Additionally, the developed machine learning model is compared with the existing Lung-RADS and PNI-GARS diagnostic systems. This comprehensive evaluation provides a more granular understanding of nodule characteristics and their association with malignancy. Using SHAP values to explain the influence of each feature variable on the model’s output aids in understanding the model’s decision-making process and enhances its interpretability. Overall, this study, through the integration of multicenter data, a large sample size, advanced machine learning techniques, and comprehensive statistical analysis, has developed an efficient, accurate, and interpretable prediction model for lung nodule characterization that can serve as an auxiliary tool for clinical diagnosis.

Materials and methods

Dataset

The institutional review boards of the three participating institutions approved the retrospective multicohort study and waived the requirement for written informed consent. The patient data used in this study were obtained from three centers. The training set data consisted of patients from Center 1, collected between December 2017 and November 2021. Data for the external validation set were obtained from 218 patient CT examinations between December 2021 and March 2022 at Center 2 and Center 3. The inclusion and exclusion criteria were the same across all three centers. All patient data were obtained in daily practice. The inclusion criteria for the study considered the following: (1) patients aged 18 years or older; (2) size of the nodule(s) ≤ 30mm; (3) final pathological results that were definitive. The exclusion criteria for the study were as follows: (1) missing data; (2) poor image quality; (3) the size of the nodules that could not be measured accurately. The patient data selection flowchart is shown in Figure 1. Based on patient inclusion and exclusion criteria, 4,792 malignant and 1,631 benign pulmonary nodules from 5,404 patients at Center 1 were selected as the training set data. One hundred seventy-three malignant and 45 benign pulmonary nodules from 218 patients at Centers 2 and 3 were selected as the validation set data. To create a relatively balanced dataset for modeling, we randomly selected malignant pulmonary nodules and all benign pulmonary nodules from Center 1 at a ratio of 1.3:1 (2,131 malignant and 1,631 benign pulmonary nodules). For external validation, all pulmonary nodules from Centers 2 and 3 were included.
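The down-sampling step described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `downsample_malignant`, the random seed, and the use of integer IDs in place of real nodule records are all assumptions. Only the counts (4,792 malignant, 1,631 benign, 2,131 malignant retained) come from the text.

```python
import random

def downsample_malignant(malignant_ids, n_keep, seed=0):
    """Randomly keep n_keep malignant nodules, sampling without replacement."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    return sorted(rng.sample(malignant_ids, n_keep))

# Placeholder integer IDs standing in for the Center 1 nodule records.
malignant_ids = list(range(4792))
benign_ids = list(range(4792, 4792 + 1631))

kept_malignant = downsample_malignant(malignant_ids, n_keep=2131)
modeling_ids = kept_malignant + benign_ids  # all benign nodules are retained

ratio = len(kept_malignant) / len(benign_ids)
print(len(modeling_ids), round(ratio, 1))  # 3762 1.3
```

All benign nodules are kept and only the majority class is reduced, which matches the description of a roughly 1.3:1 modeling set.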

Figure 1. Flowchart of patient selection and data processing. Center 1, The First Affiliated Hospital of Chongqing Medical University; Center 2, Second Hospital of Shanxi Medical University; Center 3, Sichuan Provincial People’s Hospital.

Clinical features and non-contrast chest CT scan characteristics

Non-contrast chest CT scans were acquired on SOMATOM Definition Flash (Siemens Healthineers, Erlangen, Germany), SOMATOM Force (Siemens Healthineers, Erlangen, Germany), and Discovery CT750 HD (GE Healthcare, Milwaukee, WI, USA) CT scanners (Table 1). All patients were scanned lying on their backs with their hands held over their heads, and were asked to breathe in deeply and hold their breath. The scan range extended from the lung apex to the level of the costophrenic angle. All images were independently and blindly read by two experienced radiologists with 8 and 10 years of experience, respectively. To assess the reliability of the readings, we calculated the agreement rate between the radiologists, which is detailed in Supplementary Table S1. Pathological results for each patient were collected from surgical pathology biopsies. CT features were recorded by each radiologist. When discrepancies occurred, the final assessment was determined by a third radiologist with 12 years of experience, who integrated the differing opinions to provide a conclusive evaluation. After a detailed evaluation, nine clinical characteristics and thirteen radiological characteristics on the CT images were used for model development. The specific features were as follows: (1) Age; (2) Sex; (3) N-nodules (Number of nodules); (4) Nature of the nodule (SN/PSN/GGN); (5) Total diameter (in mm); (6) L-RADS (1/2/3/4A/4B/4X); (7) PNI-GARS (0/I/II/IIIa/IIIb/IIIc/IV); (8) Spiculation (yes/no); (9) Lobulation (yes/no); (10) Vascular sign (yes/no); (11) Pleural indentation (yes/no); (12) Vacuole sign (yes/no); (13) Cavitations (yes/no); (14) M-features (number of malignant features); (15) Margin smooth (yes/no); (16) Pulmonary cord (yes/no); (17) Margin blurring (yes/no); (18) Calcification (yes/no); (19) Fat (yes/no); (20) Satellite feature (yes/no); (21) Nodular patchy shadow (yes/no); (22) N-features (number of benign features).

Table 1. The protocol parameters and reconstruction parameters for the intra-CT protocol trial.

Factor correlation coefficient calculation

In this study, we needed to remove variables with pairwise correlation coefficients > 0.8, both to avoid redundant predictors and to prevent data leakage, where certain variables could directly determine the prediction results. Correlation coefficients > 0.8 indicate the presence of multicollinearity (23) in the data. The correlation test was performed using Pearson’s coefficient (24). As shown in Figure 2, none of the clinical and radiological features we used had a correlation coefficient greater than 0.8, which indicates that these data were suitable for machine learning model development.
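A correlation screen of this kind might look like the sketch below. It is written in plain Python for clarity and is not the authors' implementation: the function names, the toy feature values, and the decision to compare the absolute value of the coefficient with the threshold are assumptions. The 0.8 cut-off is taken from the text.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists.

    Assumes neither list is constant (a constant list would divide by zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def highly_correlated_pairs(features, threshold=0.8):
    """Return (name_a, name_b, r) for every feature pair with |r| > threshold."""
    names = list(features)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(features[a], features[b])
            if abs(r) > threshold:  # using |r| is an assumption
                flagged.append((a, b, round(r, 3)))
    return flagged

# Toy values invented for illustration; not study data.
toy = {
    "age": [45, 52, 60, 67, 71],
    "diameter_mm": [8, 15, 9, 22, 12],
    "diameter_cm": [0.8, 1.5, 0.9, 2.2, 1.2],  # deliberately redundant column
}
flagged = highly_correlated_pairs(toy)
print(flagged)
```

In the toy data only the two diameter columns are flagged, so one of them would be dropped before model fitting.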

Figure 2. Pearson’s correlation coefficient matrix.

Machine learning model development

To build up a machine learning classification model for pathology prediction, we divided the pulmonary nodule data from Center 1 into a training set and an internal test set in a ratio of 7:3, and the data from centers 2 and 3 were all used in the external test set. Machine learning models used in this study included GradientBoosting, RandomForest, and ExplainableBoosting. Gradient boosting is an ensemble learning technique that iteratively trains decision trees to minimize a loss function. The advantage of the gradient boosting algorithm lies in its high-precision prediction capabilities and adaptability to various types of data. It also demonstrates superiority in situations with imbalanced data by exhibiting good robustness and generalization ability. Furthermore, the gradient boosting algorithm can assess feature importance and effectively handle large-scale datasets. Random Forest is an ensemble learning method that improves prediction accuracy and stability by constructing multiple decision trees and combining their results. Each tree in the Random Forest is trained on a randomly selected subset of samples, which reduces the variance of the model and enhances its generalization ability. The Explainable Boosting Machine (EBM) is a tree-based, iteratively gradient-boosted generalized additive model with automatic interaction detection capabilities. The design goal of EBM is to maintain comparable accuracy with state-of-the-art machine learning methods (such as Random Forest and Boosted Trees) while preserving a high degree of interpretability.
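The 7:3 partition of the Center 1 data can be illustrated with a short sketch. The function name, the seed, and the plain random shuffle are assumptions; the text does not say whether the split was stratified by class. Rounding the test-set size up is also an assumption, chosen because it reproduces the 2633/1129 sizes reported in the abstract.

```python
import math
import random

def split_70_30(ids, test_fraction=0.3, seed=0):
    """Shuffle ids and return (train_ids, test_ids) for a 7:3 split."""
    rng = random.Random(seed)  # seed is an assumption
    shuffled = ids[:]
    rng.shuffle(shuffled)
    n_test = math.ceil(test_fraction * len(shuffled))  # round test size up
    return shuffled[n_test:], shuffled[:n_test]

# 2,131 malignant + 1,631 benign nodules from Center 1, as placeholder IDs.
center1_ids = list(range(3762))
train_ids, test_ids = split_70_30(center1_ids)
print(len(train_ids), len(test_ids))  # 2633 1129
```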

Feature preprocessing

Feature preprocessing is a crucial step in machine learning, as it directly impacts the performance and predictive capability of the model, making the model’s predictions easier to understand and interpret. In order to enable all features to be applied for machine learning model building, we used the LabelEncoder from the scikit-learn library to perform numerical encoding on non-numerical attributes. The non-numerical characteristics were as follows: (1) Nature of the nodule; (2) L-RADS; (3) PNI-GARS. The remaining features were represented by the number 1 to indicate the presence of the imaging feature, and by the number 0 to indicate the absence of the imaging feature.
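The encoding step can be illustrated without any dependency. scikit-learn's LabelEncoder assigns integer codes to the sorted distinct labels, and the hypothetical helper `fit_label_encoder` below mimics that behaviour in plain Python. It is a sketch for illustration, not the code used in the study.

```python
def fit_label_encoder(values):
    """Map each distinct category to an integer code, in sorted label order."""
    classes = sorted(set(values))
    return {label: code for code, label in enumerate(classes)}

# PNI-GARS grades as listed in the feature description.
pni_gars_grades = ["0", "I", "II", "IIIa", "IIIb", "IIIc", "IV"]
mapping = fit_label_encoder(pni_gars_grades)

encoded = [mapping[grade] for grade in ["IIIa", "0", "IV"]]
print(mapping)
print(encoded)  # [3, 0, 6]
```

For these particular grade labels, sorted string order happens to coincide with increasing grade, so the integer codes remain ordinal. That is a property of the labels rather than of the encoder, and it would need to be checked for any other categorical feature.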

Experimental settings

In our study, experiments were executed using Python 3.8.3, with the experimental framework outlined in Figure 3. We undertook a grid search approach to hyperparameter optimization for three machine learning algorithms to achieve the best model fit for pulmonary nodule diagnosis. The GradientBoosting model yielded optimal results with a learning rate of 0.1, 140 estimators (n_estimators), a maximum tree depth of 4 (max_depth), and a minimum sample requirement for node splits of 4 (min_samples_split). For the RandomForest model, the grid search identified the most effective parameters as a maximum tree depth of 5 (max_depth), a minimum sample count at leaf nodes of 4 (min_samples_leaf), a minimum sample split of 4 (min_samples_split), 30 trees in the forest (n_estimators), and utilizing 4 jobs for parallel processing (n_jobs). Interestingly, the default parameters were found to be the most effective for the ExplainableBoosting model, indicating that the model’s designers had already established a robust starting point for a wide range of applications. Through this meticulous grid search-based optimization, we enhanced the predictive accuracy and generalizability of our models.
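To make the search space concrete, the sketch below enumerates a small hyperparameter grid in plain Python. The candidate values in `gb_grid` are assumptions chosen for illustration, since the text reports only the selected optimum and not the grid itself. The two `*_reported_best` dictionaries restate the values given above. Note that `n_jobs` controls parallelism only and does not change the fitted model.

```python
from itertools import product

def expand_grid(grid):
    """Return every combination of the candidate values as a list of dicts."""
    keys = list(grid)
    return [dict(zip(keys, combo)) for combo in product(*(grid[k] for k in keys))]

# Candidate values are assumed; only the reported optimum comes from the text.
gb_grid = {
    "learning_rate": [0.01, 0.1],
    "n_estimators": [100, 140, 180],
    "max_depth": [3, 4, 5],
    "min_samples_split": [2, 4],
}
gb_reported_best = {
    "learning_rate": 0.1,
    "n_estimators": 140,
    "max_depth": 4,
    "min_samples_split": 4,
}
rf_reported_best = {
    "max_depth": 5,
    "min_samples_leaf": 4,
    "min_samples_split": 4,
    "n_estimators": 30,
    "n_jobs": 4,  # parallelism setting, not a model hyperparameter
}

candidates = expand_grid(gb_grid)
print(len(candidates))                 # 36
print(gb_reported_best in candidates)  # True
```

A grid search would fit one model per candidate, score each with cross-validation, and keep the best-scoring configuration.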

Figure 3. The overall pipeline of this study. (A) The process of data processing. (B) The process of model prediction and testing.

Statistical analysis

The area under the receiver operating characteristic curve (AUC), accuracy, precision, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were used to assess the diagnostic performance of the model in each cohort. Categorical variables in the patient data were reported as counts (%), while continuous variables were described using means and standard deviations (SD).
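As a reminder of what the AUC measures, it equals the probability that a randomly chosen positive (malignant) case receives a higher score than a randomly chosen negative (benign) case, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The function name and the toy labels and scores are invented for illustration and are not study data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))

# Toy example: 1 = malignant, 0 = benign.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))  # 0.833
```

The pairwise form is quadratic in the number of cases, which is acceptable for cohorts of the size used here.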

Results

Baseline characteristics

In this study, a total of 6423 patients (median lung nodule diameter, 10.9 [IQR, 7.78–29.93] mm; mean age, 56.09 [SD, 11.13]; male, 39.4%) with CT imaging data of pulmonary nodules from three centers were analyzed according to the inclusion and exclusion criteria. The training set included 2660 patients (median lung nodule diameter, 10.4 [IQR, 7.2–29.76] mm; mean age, 55.73 [SD, 10.93]; male, 40.3%). The internal test set contained 1129 patients (median lung nodule diameter, 10.4 [IQR, 7.2–29.76] mm; mean age, 55.73 [SD, 10.93]; male, 40.3%). The external test set contained 218 patients (median lung nodule diameter, 15.7 [IQR, 11.4–30] mm; mean age, 58.12 [SD, 12.17]; male, 53.1%). More detailed clinical and radiological features of the different cohorts are shown in Table 2.

Table 2. Clinical and non-contrast chest CT scan characteristics of patients in three cohorts.

Machine learning model performance

During the training process, different machine learning models showed different performance on clinical-radiological features. In the external-testing set, Explainable Boosting showed the best results, with an AUC of 0.904 (95% CI: 0.855–0.948), accuracy of 0.899, sensitivity of 0.911, specificity of 0.895, PPV of 0.974, and NPV of 0.694. In the internal-testing set, the corresponding values were 0.858 (95% CI: 0.839–0.877), 0.867, 0.767, 0.948, 0.833, and 0.923, respectively. Furthermore, Explainable Boosting compared favorably to the Random Forest model and the Gradient Boosting model, with sensitivity improvements of 0.089 and 0.023, respectively. Specificity was 0.034 higher compared to the Gradient Boosting model. The prediction performance of the models across the three cohorts is shown in Figures 4A–C and Table 3. By calculating the confusion matrix, the values of True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN) (25) can be obtained. The five metrics used to evaluate our model are as follows:

1. Accuracy=(TP + TN)/(TP + TN + FP + FN).

2. Sensitivity=TP/(TP + FN).

3. Specificity=TN/(TN + FP).

4. PPV=TP/(TP + FP).

5. NPV=TN/(FN + TN).
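The five formulas above translate directly into code. In the sketch below, the function name and the confusion-matrix counts are hypothetical and chosen only to make the arithmetic easy to follow. They are not results from this study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute the five metrics listed above from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (fn + tn),
    }

# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts the accuracy is 0.850, sensitivity 0.800, specificity 0.900, PPV 0.889, and NPV 0.818. PPV and NPV depend on class prevalence, which is one reason they can differ sharply between a balanced internal set and an imbalanced external set.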

Table 3. The performance comparison of different models.

Figure 4. ROC curves and confusion matrices for classification in the dataset. (A) Training set. (B) Internal testing set. (C) External testing set. AUC, area under the curve. Confusion matrices showing the classification performance of different models on (D) SN, (E) PSN, and (F) GGN-type nodules (from left to right: Gradient Boosting, Random Forest, Explainable Boosting).

Classification effectiveness assessment

In the second phase of experiments, we evaluated the performance of the machine learning models in classifying three types of pulmonary nodules: SN (solid nodule), PSN (partially solid nodule), and GGN (ground glass nodule). The average probability of misclassifying malignant as benign across the three models was 2.9%, 4.0%, and 6.8% for SN, PSN, and GGN-type nodules, respectively. The average probability of misclassifying benign as malignant was 10.6%, 5.2%, and 16.0%, respectively. Of the three types of nodules, the models performed best on the SN type and worst on the GGN type. In clinical practice, the probability of malignancy is, in descending order, PSN > GGN > SN. In our models, the probability of misclassifying malignant as benign was on average 6.0 percentage points lower than the probability of misclassifying benign as malignant. Specific classification details are shown in Figures 4D–F. We also calculated diagnostic precision and recall scores separately for male and female patients in the dataset to compare the diagnostic performance of the model across genders. The details of the relevant scores are shown in Supplementary Figure S2A, B. The diagnostic performance for male patients was slightly better than that for female patients on the external validation set. However, there were fewer benign cases in the external validation set, which resulted in a lower recall score. Radiologists diagnosed benign nodules more accurately than our model. Thus, the use of machine learning can assist in improving the diagnostic accuracy of pulmonary nodule pathology by radiologists. In addition, by exploring the diagnostic performance at different ages, we found that the diagnostic performance of the model continued to improve with age. This suggests that our model may be particularly useful for the diagnosis of malignant lung nodules in the elderly population. The age-specific diagnostic information is shown in Supplementary Figure S2C.

Model validation

In this study, we used 5-fold cross-validation and 10-fold cross-validation to test the stability of the machine learning model, respectively. The cross-validation process is shown in Supplementary Figure S1A. Cross-validation helps to mitigate potential bias and overfitting problems that can arise from using a single training-testing split. In addition, the cross-validated classification results of different models are shown in Supplementary Figures S1B, C.
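The fold construction behind k-fold cross-validation can be sketched as follows. The generator name, the seed, and the round-robin assignment of shuffled indices to folds are assumptions. The sample count of 218 is borrowed from the external test set purely as an example size, and the text does not state whether the folds were stratified.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    rng = random.Random(seed)  # seed is an assumption
    indices = list(range(n_samples))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # round-robin assignment
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

for k in (5, 10):
    sizes = [len(test) for _, test in k_fold_indices(218, k)]
    print(k, sizes)
```

Each sample appears in exactly one test fold, so every case contributes once to the held-out estimate, and the spread of scores across folds indicates how stable the model is.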

In the 5-fold and 10-fold cross-validation of the different patient cohorts, the three machine learning models showed stable AUC values in the training set and in the internal test set, but the accuracy values fluctuated more on the external validation set, which may be due to the imbalance between benign and malignant samples in that set. With imbalanced binary classes, the AUC and precision metrics are more representative of overall predictive performance.

Model feature interpretability

The SHAP values (26) elucidate the influence of each feature variable on the prediction model output (Figure 5). The importance of features is depicted with a decreasing gradient from top to bottom. In the context of this study, “positive samples” refer to malignant lung nodules, while “negative samples” denote benign nodules. A red color signifies a greater impact on the classification of malignant nodules (positive samples), whereas a blue color indicates a greater impact on the classification of benign nodules (negative samples). The x-axis represents the SHAP value, where positive values suggest a contribution towards a positive classification (malignant), and negative values imply a negative impact on this classification, potentially leaning towards a negative classification (benign). The interaction of features with negative values in the prediction is contingent upon the interplay with other variables. Wider bars on the plot indicate higher density and more recurrent values. As depicted in Figures 5A–C, distinct machine learning models prioritize features differently. We identified the top five features for prediction in each model (Supplementary Table S2). PNI-GARS was identified by three machine learning models as the primary indicator of benign and malignant pulmonary nodules. Upon comparing the diagnostic efficacy of the PNI-GARS and the L-RADS systems for malignant lung nodules using identical patient data, the L-RADS system was observed to be less effective in diagnosing malignant lung nodules across various stages. Conversely, the PNI-GARS system demonstrated incremental improvements in diagnostic accuracy for malignant lung nodules with each progressive grading level. The detailed diagnostic performance, which included data from all cohorts—the training set, the validation set, and the external validation set—is illustrated in Figures 5D, E. 
The PNI-GARS system thus offers superior diagnostic efficacy for lung nodule assessment compared to the L-RADS system. The specific PNI-GARS grading criteria are outlined in Table 4 and depicted in Figure 6.

Figure 5. Feature contributions of different machine learning models. (A) Gradient Boosting. (B) Random Forest. (C) Explainable Boosting. Diagnostic accuracy of the pulmonary nodule classification systems for malignant nodules at different stages. (D) L-RADS system. (E) PNI-GARS system.

Figure 6. Main grading basis of PNI-GARS.

Table 4. PNI-GARS.

Diagnostic performance of the radiologist and machine learning model

A comparison of diagnoses between radiologists and machine learning models is presented in Table 5. In the external validation set, the radiologist with 8 years of clinical experience, Radiologist A, achieved a performance with an AUC of 0.60 (95% CI: 0.500–0.714), accuracy of 0.707, sensitivity of 0.200, specificity of 0.993, positive predictive value (PPV) of 0.684, and negative predictive value (NPV) of 0.986. Another radiologist with 10 years of clinical experience, Radiologist B, demonstrated performance metrics: AUC of 0.720 (95% CI: 0.611–0.792), accuracy of 0.751, sensitivity of 0.552, specificity of 0.982, PPV of 0.754, and NPV of 0.991. Upon analyzing the performance metrics of the radiologists and the machine learning models, it is evident that the machine learning models demonstrate superior diagnostic performance in the classification of pulmonary nodules. Specifically, the Gradient Boosting model achieved an AUC of 0.875 (95% CI: 0.821–0.925), accuracy of 0.866, sensitivity of 0.888, specificity of 0.861, PPV of 0.967, and NPV of 0.625. The Random Forest model showed slightly higher diagnostic accuracy with an AUC of 0.893 (95% CI: 0.806–0.925), accuracy of 0.899, sensitivity of 0.822, specificity of 0.919, PPV of 0.952, and NPV of 0.725. The Explainable Boosting model led the comparison with the highest AUC of 0.904 (95% CI: 0.855–0.948), accuracy of 0.899, sensitivity of 0.911, specificity of 0.895, PPV of 0.974, and NPV of 0.694. The Explainable Boosting model shows high sensitivity and NPV. On the other hand, the specificity and PPV values from the radiologists’ diagnoses indicate a high likelihood of correctly identifying patients with benign nodules. Radiologist A achieved a specificity of 0.993 and a PPV of 0.684, while Radiologist B obtained a specificity of 0.982 and a PPV of 0.754. However, when it comes to sensitivity, the data suggest that the machine learning models are more effective in identifying actual cases of malignancy. 
Radiologist A had a sensitivity of 0.200, and Radiologist B had a sensitivity of 0.552, which, while higher than that of Radiologist A, still fell short of the machine learning models’ sensitivities. This highlights the value of machine learning models in enhancing the detection of malignant nodules. We then combined the malignant nodules left over after balancing the dataset with the benign nodules to form a new dataset, and used the ExplainableBoosting model to predict this dataset as a further evaluation of model performance. The relevant prediction results are shown in Supplementary Tables S3, S4. ExplainableBoosting still showed good classification performance on the remaining data, with an AUC of 0.876 (95% CI: 0.857–0.895). The ROC curve is shown in Supplementary Figure S3. For malignant nodules, the accuracy and recall rates were 0.97 and 0.92, respectively, and for benign nodules, the accuracy and recall rates were 0.94 and 0.79, respectively. This means that the model can identify malignant nodules accurately and comprehensively. For benign nodules, however, accuracy was high but recall was relatively low.

Table 5. Performance comparison between machine learning model and Radiologist in external-validation set.

Discussion

In our study, we found that the machine learning models we developed were highly accurate for pulmonary nodule diagnosis. All three machine learning models ranked PNI-GARS as the most influential indicator of whether a lung nodule was benign or malignant. In contrast, the Lung-RADS classification criteria, which are now widely used internationally, were less effective in classifying lung nodules as benign or malignant in our dataset. The results suggest that our use of clinical patient characteristics and the PNI-GARS grading system can help radiologists improve the accuracy of their diagnosis of pulmonary nodules.

Currently, research is focused on how CT can be used to achieve an accurate diagnosis of pulmonary nodules without interventional procedures. The Mayo Clinic model, the Veterans Affairs (VA) model, the Brock model (PanCan model), and the Herder model have been widely used for pulmonary nodule malignancy diagnosis (19, 20, 27). However, previous studies suggest that these models have limited performance in the clinical prediction of malignant lung nodules (28–30). Moreover, these models were developed with data from lung cancer screening trials, where the majority of patients were clinically asymptomatic and more likely to have benign nodules. In contrast, the model we developed was based on a wider range of patient CT imaging presentations and combined two pulmonary nodule diagnostic systems, and it was validated with internal data as well as across centers. When compared on the same data with the models proposed in previous studies and with the diagnoses made by radiologists, our model had an AUC of 90.3% (95% CI: 85.5%–94.8%), which was higher not only than that of the radiologists (60%; 95% CI: 50.0%–71.4%) but also than those of the Mayo (74.5%; 95% CI: 71.8%–81.5%), Brock (78.3%; 95% CI: 71.5%–83.8%), and VA (70%; 95% CI: 65.5%–71.4%) models.

As artificial intelligence continues to develop, machine learning is also commonly used in the analysis of medical data. Machine learning algorithms specialize in discovering associations between data rather than relying on the one-dimensional statistical methods currently used (e.g., logistic regression) (31). As computing power and storage continue to increase, machine learning algorithms are able to analyze more complex data and make decisions faster (32, 33). In the study by Cui X et al. (34, 35), a one-dimensional logistic model, a traditional approach, was used to predict cancer classification. Our research uses multiple ensemble learning models to fit the data and makes the machine learning decision-making process more transparent by outputting a ranking of model features. Random Forest classification is a method that combines several randomly constructed trees and makes predictions by averaging them. This method is of great interest to the research community due to its high accuracy and strong performance (36). The gradient boosting method can capture complex relationships in clinical research better than methods based on generalized linear models (37). Magunia H et al. stratified patient risk and predicted ICU survival and prognosis by developing a machine learning model (ExplainableBoosting) based on retrospective and prospective clinical data (38). In this paper, all three machine learning models performed well in classifying the medical data.

In addition, the PNI-GARS system is proposed on the premise of standardizing the writing of CT reports on lung cancer and grading different nodules, classifying nodules into grades 0 to V. As the grading level increases, the risk of malignancy of the nodules increases, and the different grades of nodules are closely related to the next step of the diagnostic and therapeutic protocols. However, we found that the PNI-GARS system is limited by combining all features into one level. So, we used machine learning to combine clinical and radiological features and the PNI-GARS system to obtain more accurate predictions.

This study has several limitations. First, our model did not take into account additional clinical indicators such as smoking history, living environment, family history of cancer, work environment, and previous history of cancer, whereas the VA model used smoking history and history of cancer to determine the malignancy of lung nodules. However, even without these indicators, the prediction performance of our model was still higher than that of the VA model. Second, the samples selected for our model were surgically confirmed cases, which may have been treated surgically because the radiologist had a high degree of suspicion of malignancy; this may introduce selection bias. Third, because of the small number of cases in the external dataset of the multicenter study, we did not search for cases with two or more pulmonary nodules at the same time. In subsequent studies, we will include as many additional clinical and lifestyle factors as possible and apply the model to cases that did not involve surgical treatment to further validate the model.

To conclude, we selected the machine learning models based on the best results obtained in previous studies, combined them with our self-developed PNI-GARS system and clinical characterization data, and validated them with data from different centers, obtaining excellent predictions of the nature of pulmonary nodules. This demonstrates that combining the PNI-GARS system with clinical imaging features and using machine learning to predict the nature of lung nodules can clinically assist radiologists in pathological diagnosis.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the Ethics Committee of The First Affiliated Hospital of Chongqing Medical University (No. 2022-K346). Written informed consent was waived because of the retrospective nature of the study. The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required from the participants or the participants’ legal guardians/next of kin in accordance with the national legislation and institutional requirements.

Author contributions

YZ: Conceptualization, Methodology, Software, Writing – original draft. FS: Data curation, Investigation, Writing – review & editing. WZ: Formal analysis, Supervision, Writing – review & editing. TG: Investigation, Writing – review & editing. SZ: Data curation, Methodology, Writing – review & editing. FL: Conceptualization, Data curation, Methodology, Writing – review & editing.

Funding

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fimmu.2024.1446511/full#supplementary-material

Abbreviations

AUC, Area under the curve; CT, Computed tomography; PNI-GARS, Pulmonary node imaging-grading reporting system; L-RADS, Lung imaging-reporting and data system; NPV, Negative predictive value; PPV, Positive predictive value; CI, Confidence interval; IQR, Inter-quartile range; SN, Solid nodule; PSN, Part-solid nodule; GGN, Ground glass nodule.

References

1. Lahiri A, Maji A, Potdar PD, Singh N, Parikh P, Bisht B, et al. Lung cancer immunotherapy: progress, pitfalls, and promises. Mol Cancer. (2023) 22:1–37. doi: 10.1186/s12943-022-01453-4

2. Bray F, Msc ML, Weiderpass E, Soerjomataram I. The ever-increasing importance of cancer as a leading cause of premature death worldwide. Cancer. (2021) 127:3029–30. doi: 10.1002/cncr.33535

3. Xia C, Dong X, Li H, Cao M, Sun D, He S, et al. Cancer statistics in China and United States, 2022: Profiles, trends, and determinants. Chin Med J. (2022) 135:584–90. doi: 10.1097/CM9.0000000000002000

4. World Health Organization. Cancer. Available online at: https://www.who.int/news-room/fact-sheets/detail/cancer (Accessed April 11, 2022).

5. Sharma A, Shambhwani D, Pandey S, Singh J, Lalhlenmawia H, Kumarasamy M, et al. Advances in lung cancer treatment using nanomedicines. ACS Omega. (2022) 8:10–41. doi: 10.1021/acsomega.2c00148

6. Wiener RS, Gould MK, Woloshin S, Schwartz LM, Clark JA. What do you mean, a spot?: a qualitative analysis of patients’ reactions to discussions with their doctors about pulmonary nodules. Chest. (2013) 143:672–7. doi: 10.1378/chest.12-0980

7. Abraham J. Reduced lung-cancer mortality with low-dose computed tomographic screening. N Engl J Med. (2011) 365:395–409. doi: 10.1056/NEJMoa1102873

8. American College of Radiology. Lung CT screening reporting and data system (Lung-RADS). Available online at: http://www.acr.org/Quality-Safety/Resources/LungRADS (Accessed May 12, 2016).

9. Zhang Y, Lv F, Chu Z, Li Q, Bi Q, Jiang X, et al. Evaluation of benign and malignant pulmonary nodules based on thin-layer CT imaging features. Chin J Med Imaging. (2019) 27:182–7. doi: 10.1186/s40548-019-0169-5

10. Cui X, Heuvelmans MA, Han D, Zhao Y, Fan S, Zheng S, et al. Comparison of Veterans Affairs, Mayo, Brock classification models and radiologist diagnosis for classifying the malignancy of pulmonary nodules in Chinese clinical population. Transl Lung Cancer Res. (2019) 8:605. doi: 10.21037/tlcr.2019.06.08

11. Wataya T, Yanagawa M, Tsubamoto M, Sato T, Nishigaki D, Kita K, et al. Radiologists with and without deep learning–based computer-aided diagnosis: comparison of performance and interobserver agreement for characterizing and diagnosing pulmonary nodules/masses. Eur Radiol. (2023) 33:348–59. doi: 10.1007/s00330-022-08180-5

12. van Riel SJ, Jacobs C, Scholten ET, Wittenberg R, Winkler Wille MM, de Hoop B, et al. Observer variability for Lung-RADS categorisation of lung cancer screening CTs: impact on patient management. Eur Radiol. (2019) 29:924–31. doi: 10.1007/s00330-018-5838-8

13. Khalil M, Teunissen CE, Otto M, Piehl F, Sormani MP, Gattringer T, et al. Neurofilaments as biomarkers in neurological disorders. Nat Rev Neurol. (2018) 14:577–89. doi: 10.1038/s41582-018-0099-1

14. Mirza FJ, Zahid S. The role of synapsins in neurological disorders. Neurosci Bull. (2018) 34:349–58. doi: 10.1007/s12264-018-0242-0

15. Bansal M. Cardiovascular disease and COVID-19. Diabetes Metab Syndr. (2020) 14:247–50. doi: 10.1016/j.dsx.2020.03.022

16. Kamdar JH, Jeba Praba JJ, Georrge JJ. Artificial intelligence in medical diagnosis: methods, algorithms and applications. In: Jain V, Chatterjee J. Machine Learning with Health Care Perspective: Machine Learning and Healthcare (2020) Cham: Springer 13:27–37.

17. Wang W, Kiik M, Peek N, Curcin V, Marshall IJ, Rudd AG, et al. A systematic review of machine learning models for predicting outcomes of stroke with structured data. PloS One. (2020) 15:e0234722. doi: 10.1371/journal.pone.0234722

18. Harrison M. Machine learning pocket reference: working with structured data in python. O’Reilly Media (2019).

19. Gould MK, Ananth L, Barnett PG. A clinical model to estimate the pretest probability of lung cancer in patients with solitary pulmonary nodules. Chest. (2007) 131:383–8. doi: 10.1378/chest.06-2143

20. Swensen SJ, Silverstein MD, Ilstrup DM, Schleck CD, Edell ES. The probability of malignancy in solitary pulmonary nodules. Application to small radiologically indeterminate nodules. Arch Intern Med. (1997) 157:849–55. doi: 10.1001/archinte.1997.00440150122025

21. Roberts AS, Shetty AS, Mellnick VM, Pickhardt PJ, Bhalla S, Menias CO. Extramedullary haematopoiesis: radiological imaging features. Clin Radiol. (2016) 71:807–14. doi: 10.1016/j.crad.2016.03.014

22. Vetter P, Vu DL, L’Huillier AG, Schibler M, Kaiser L, Jacquerioz F. Clinical features of covid-19. BMJ. (2020) 369:m2182. doi: 10.1136/bmj.m2182

23. Kalnins A. Multicollinearity: How common factors cause Type 1 errors in multivariate regression. Strat Manag J. (2018) 39:2362–85. doi: 10.1108/SMJ-02-2018-0056

24. Mu Y, Liu X, Wang L. A Pearson’s correlation coefficient based decision tree and its parallel implementation. Inf Sci. (2018) 435:40–58. doi: 10.1016/j.ins.2018.01.030

25. Yang S, Berdine G. The receiver operating characteristic (ROC) curve. Southwest Respir Crit Care Chronicles. (2017) 5:34–6. doi: 10.12746/swrccc.v5i19.391

26. Baptista ML, Goebel K, Henriques EMP. Relation between prognostics predictor evaluation metrics and local interpretability SHAP values. Artif Intell. (2022) 306:103667. doi: 10.1016/j.artint.2022.103667

27. Herder GJ, Van Tinteren H, Golding RP, Kostense PJ, Comans EFI, Smit EF, et al. Clinical prediction model to characterize pulmonary nodules: validation and added value of 18 F-fluorodeoxyglucose positron emission tomography. Chest. (2005) 128:2490–6. doi: 10.1378/chest.128.3.2490

28. Zhang X, Yan HH, Lin JT, Wu ZH, Liu J, Cao XW, et al. Comparison of three mathematical prediction models in patients with a solitary pulmonary nodule. Chin J Cancer Res. (2014) 26:647–57. doi: 10.3978/j.issn.1000-9604.2014.06.09

29. Yang B, Jhun BW, Shin SH, Jeong BH, Um SW, Zo JI, et al. Comparison of four models predicting the malignancy of pulmonary nodules: a single-center study of Korean adults. PloS One. (2018) 13:e0201242. doi: 10.1371/journal.pone.0201242

30. Susam S, Çinkooğlu A, Ceylan KC, Gürsoy S, Kömürcüoğlu BE, Mertoğlu A, et al. Comparison of Brock University, Mayo Clinic and Herder models for pretest probability of cancer in solid pulmonary nodules. Clin Respir J. (2022) 16:740–9. doi: 10.1111/crj.13524

31. Chen K, Nie Y, Park S, Zhang K, Zhang Y, Liu Y, et al. Development and validation of machine learning–based model for the prediction of malignancy in multiple pulmonary nodules: Analysis from multicentric cohorts. Clin Cancer Res. (2021) 27:2255–65. doi: 10.1158/1078-0432.CCR-20-2192

32. Rajkomar A, Dean J, Kohane I. Machine learning in medicine. N Engl J Med. (2019) 380:1347–58. doi: 10.1056/NEJMra1814259

33. Kourous K, Exarchos TP, Exarchos KP, Karamouzis MV, Fotiadis DI. Machine learning applications in cancer prognosis and prediction. Comput Struct Biotechnol J. (2015) 13:8–17. doi: 10.1016/j.csbj.2014.11.005

34. Cui X, Zheng S, Zhang W, Fan S, Wang J, Song F, et al. Prediction of histologic types in solid lung lesions using preoperative contrast-enhanced CT. Eur Radiol. (2023), 1–12. doi: 10.1007/s00330-022-08488-7

35. Priyadarshini A, Aravinth J. (2021). Correlation based breast cancer detection using machine learning, in: 2021 International Conference on Recent Trends on Electronics, Information, Communication & Technology (RTEICT), pp. 499–504. IEEE. doi: 10.1109/RTEICT52447.2021.9579675

36. Parmar A, Katariya R, Patel V. (2019). A review on random forest: An ensemble classifier, in: International conference on intelligent data communication technologies and internet of things (ICICI) 2018, pp. 758–63. Springer International Publishing. doi: 10.1007/978-3-030-12063-5_73

37. Zhang Z, Zhao Y, Canes A, Steinberg D, Lyashevska O, AME Big-Data Clinical Trial Collaborative Group, et al. Predictive analytics with gradient boosting in clinical medicine. Ann Transl Med. (2019) 7:152. doi: 10.21037/atm.2019.03.06

38. Magunia H, Lederer S, Verbuecheln R, Gilot BJ, Koeppen M, Haeberle HA, et al. Machine Learning Identifies ICU Outcome Predictors in a Multicenter COVID-19 Cohort. (2021) 25:295. doi: 10.1101/2021.09.29.20220130.

Keywords: machine learning, pulmonary nodules, computed tomography, pulmonary node imaging-grading reporting system, cancer imaging

Citation: Zhan Y, Song F, Zhang W, Gong T, Zhao S and Lv F (2024) Prediction of benign and malignant pulmonary nodules using preoperative CT features: using PNI-GARS as a predictor. Front. Immunol. 15:1446511. doi: 10.3389/fimmu.2024.1446511

Received: 10 June 2024; Accepted: 30 October 2024;
Published: 20 November 2024.

Edited by:

Shari Pilon-Thomas, Moffitt Cancer Center, United States

Reviewed by:

Rongguo Zhang, Infervision, China
Duilio Divisi, University of L’Aquila, Italy

Copyright © 2024 Zhan, Song, Zhang, Gong, Zhao and Lv. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Fajin Lv, fajinlv@sohu.com; Shuai Zhao, zhaoshuai@cqut.edu.cn

†These authors have contributed equally to this work
