Interpretable machine learning models for predicting short-term prognosis in AChR-Ab+ generalized myasthenia gravis using clinical features and systemic inflammation index

Xu, Yanan; Li, Qi; Pan, Meng; Jia, Xiao; Wang, Wenbin; Guo, Qiqi; Luan, Liqin

doi:10.3389/fneur.2024.1459555

ORIGINAL RESEARCH article

Front. Neurol., 09 October 2024

Sec. Artificial Intelligence in Neurology

Volume 15 - 2024 | https://doi.org/10.3389/fneur.2024.1459555

Yanan Xu¹

Qi Li¹

Meng Pan²

Xiao Jia³

Wenbin Wang¹

Qiqi Guo¹

Liqin Luan¹*

¹Department of Neurology, Nanjing Jiangbei Hospital, Nanjing, China
²Department of Neurology, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing, China
³Department of Neurology, The Affiliated Hospital of Xuzhou Medical University, Xuzhou, China

Background: Myasthenia gravis (MG) is an autoimmune disease that causes generalized muscle weakness in about 80% of patients, most of whom test positive for anti-acetylcholine receptor (AChR) antibodies (AChR-Abs). Because treatment responses range from complete relief to minimal improvement, predicting treatment outcomes is necessary to improve them.

Objective: Our study aims to develop and validate an interpretable machine learning (ML) model that integrates systemic inflammation indices with traditional clinical indicators. The goal is to predict the short-term prognosis (after 6 months of treatment) of AChR-Ab+ generalized myasthenia gravis (GMG) patients to guide personalized treatment strategies.

Methods: We performed a retrospective analysis on 202 AChR-Ab+ GMG patients, dividing them into training and external validation cohorts. The primary outcome of this study was the Myasthenia Gravis Foundation of America (MGFA) post-intervention status assessed after 6 months of treatment initiation. Prognoses were classified as “unchanged or worse” for a poor outcome and “improved or better” for a good outcome. Accordingly, patients were categorized into “good outcome” or “poor outcome” groups. In the training cohort, we developed and internally validated various ML models using systemic inflammation indices, clinical indicators, or a combination of both. We then carried out external validation with the designated cohort. Additionally, we assessed the feature importance of our most effective model using the Shapley Additive Explanations (SHAP) method.

Results: In our study of 202 patients, 28.7% (58 individuals) experienced poor outcomes after 6 months of standard therapy. We identified 11 significant predictors, encompassing both systemic inflammation indices and clinical metrics. The extreme gradient boosting (XGBoost) model demonstrated the best performance, achieving an area under the receiver operating characteristic (ROC) curve (AUC) of 0.944. This was higher than that achieved by logistic regression (Logit) (AUC: 0.882), random forest (RF) (AUC: 0.917), and support vector machines (SVM) (AUC: 0.872). SHAP analysis further highlighted five critical determinants (two clinical indicators and three inflammation indices) as crucial for assessing short-term prognosis in AChR-Ab+ GMG patients.

Conclusion: Our analysis indicates that the XGBoost model, integrating clinical indicators with systemic inflammation indices, effectively predicts short-term prognosis in AChR-Ab+ GMG patients. This approach may support clinical decision-making and help improve patient outcomes.

Introduction

Myasthenia gravis (MG) is an autoimmune disorder marked by autoantibody disruptions at neuromuscular junctions, affecting ocular, bulbar, limb, respiratory, and axial muscles. Its clinical diversity allows categorization into subgroups based on symptoms, antibody specificity, and onset age (1). Approximately 80% of patients with MG develop generalized weakness (2), and among these, 85% test positive for anti-acetylcholine receptor (AChR) antibodies (3). These anti-AChR antibody-positive (AChR-Ab+) generalized myasthenia gravis (GMG) patients constitute the majority of MG cases and are central to trials exploring new immunotherapies. Treatment responses in MG vary significantly, from complete symptom relief to minimal improvement or even progression (4). Unfortunately, a notable portion of patients show suboptimal responses (5), highlighting the urgent need to predict poor treatment outcome to improve therapeutic strategies.

Previous studies have associated traditional clinical characteristics such as disease duration, quantitative myasthenia gravis (QMG) score, and gender with short-term outcome in AChR-Ab+ GMG patients (6, 7). However, these indicators fail to capture the full range of predictive data available. Given that inflammation is a central element in MG pathogenesis (8, 9), exploring systemic inflammation markers, such as the neutrophil to lymphocyte ratio (NLR), platelet to lymphocyte ratio (PLR), lymphocyte to monocyte ratio (LMR), and systemic immune-inflammation index (SII), could be valuable (10). These markers are recognized as significant biomarkers in autoimmune diseases (11–13) and may illuminate the dynamics of AChR-Ab+ GMG. Nevertheless, the intricate and often nonlinear relationships between comprehensive medical data and clinical outcome create significant analytical challenges, diminishing the effectiveness of linear models such as logistic regression (Logit) for accurate predictions. In this context, machine learning (ML), a branch of artificial intelligence well suited to detecting complex patterns in large and intricate datasets, is a natural choice for developing a robust predictive model (14). Common ML classifiers, including support vector machines (SVM) and extreme gradient boosting (XGBoost), have been applied in fields such as oncology (15), cardiology (16), and MG (17). Despite this, research remains sparse on ML models that combine systemic inflammation indices with traditional clinical indicators to predict short-term prognosis in AChR-Ab+ GMG patients.

Although Liang et al. (6) and Zhao et al. (7) developed predictive models for short-term prognosis in patients with AChR-Ab+ GMG, their work primarily focused on clinical characteristics with less emphasis on systemic inflammation indices. Furthermore, their reliance on traditional linear models instead of more advanced ML techniques may have limited the precision of their predictions. In this context, our study aims to develop and validate an interpretable ML model that integrates systemic inflammation indices with traditional clinical indicators. Our goal is to predict the short-term prognosis of AChR-Ab+ GMG patients to guide personalized treatment strategies.

Methods

Ethics approval

This retrospective study adhered to the Declaration of Helsinki principles and was approved by the Ethics Committee of Nanjing Jiangbei Hospital (No. 2024062). Informed consent was obtained from all participants or their relatives.

Patient selection

From January 2016 to December 2023, 566 MG patients were screened at Nanjing Jiangbei Hospital, the Affiliated Brain Hospital of Nanjing Medical University, and the Affiliated Hospital of Xuzhou Medical University. The inclusion criteria were onset symptoms compatible with GMG, seropositivity for anti-AChR antibody, a follow-up period exceeding 6 months post-diagnosis, and age over 18 years. The exclusion criteria were symptoms confined to extraocular muscles; the presence of hyperthyroidism, systemic lupus erythematosus, or other immune diseases; and incomplete or missing medical records. After thorough screening, 202 GMG patients were enrolled in our study. So that the models could be evaluated on patients not used for training, the 141 patients from January 2016 to May 2020 were assigned to the training cohort and the 61 patients from June 2020 to December 2023 were assigned to the external verification cohort (Figure 1). For treatment, 109 patients received prednisone acetate and 143 received tacrolimus; 50 patients received both prednisone acetate tablets and tacrolimus capsules and are counted in both groups. All patients were prescribed pyridostigmine.

Figure 1. Flowchart for patient selection and cohort distribution in developing and validating predictive models for AChR-Ab+ GMG patients. GMG, generalized myasthenia gravis; AChR, acetylcholine receptor; Ab, antibody; MG, myasthenia gravis; SHAP, Shapley Additive Explanations.

Outcome measures

The primary outcome of this study was the Myasthenia Gravis Foundation of America (MGFA) post-intervention status assessed after 6 months of treatment. Prognoses were classified as “unchanged or worse” for poor outcome and “improved or better” for good outcome. This outcome measure, which is widely used in clinical and research settings, demonstrates the robustness and recognized utility of the MGFA post-intervention status in evaluating treatment effectiveness for MG (18).

Data collection

To comprehensively evaluate treatment efficacy predictors in AChR-Ab+ GMG patients, we analyzed a range of clinical and systemic inflammation indicators. Clinical features assessed included age at onset, gender, body mass index (BMI), systolic and diastolic blood pressures (SBP and DBP), scores from the Myasthenia Gravis Foundation of America (MGFA), Quantitative Myasthenia Gravis (QMG), Myasthenia Gravis-Activity of Daily Living (MG-ADL), and the 15-item Myasthenia Gravis Quality of Life questionnaire (MG-QoL). We also considered thymectomy, thymoma presence, autoimmune disease comorbidity, disease duration, anti-AChR antibody titers, and hemoglobin (Hb) levels. Systemic inflammation was evaluated using white blood cell count (WBC), neutrophil, lymphocyte, platelet, monocyte, neutrophil-to-lymphocyte ratio (NLR), platelet-to-lymphocyte ratio (PLR), lymphocyte-to-monocyte ratio (LMR), and the systemic immune-inflammation index (SII), calculated using the formula (platelets * neutrophils)/lymphocytes. The disease duration was defined as the time from the onset of MG symptoms to the patient’s first hospital visit.
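The derived indices described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the `BloodCount` class, the function name, the units, and the example values are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class BloodCount:
    """Absolute cell counts, all in 10^9 cells/L (units assumed for illustration)."""
    neutrophils: float
    lymphocytes: float
    monocytes: float
    platelets: float


def inflammation_indices(cbc: BloodCount) -> dict:
    """Derive NLR, PLR, LMR and SII from one complete blood count."""
    if cbc.lymphocytes <= 0 or cbc.monocytes <= 0:
        raise ValueError("lymphocyte and monocyte counts must be positive")
    return {
        "NLR": cbc.neutrophils / cbc.lymphocytes,
        "PLR": cbc.platelets / cbc.lymphocytes,
        "LMR": cbc.lymphocytes / cbc.monocytes,
        # SII = (platelets * neutrophils) / lymphocytes, as defined in the text
        "SII": cbc.platelets * cbc.neutrophils / cbc.lymphocytes,
    }


# Hypothetical patient values, not taken from the study data.
example = BloodCount(neutrophils=4.0, lymphocytes=2.0, monocytes=0.5, platelets=250.0)
indices = inflammation_indices(example)
print(indices)
```

Because every index divides by a cell count, the sketch rejects non-positive denominators rather than returning an undefined ratio.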

Data preprocessing

Before developing the prediction models, we preprocessed the data so that all variables, covering both clinical features and systemic inflammation indices, were on comparable scales. We applied Z-score normalization to continuous variables to standardize them to a mean of zero and a standard deviation of one. Categorical variables were converted to binary format and assigned values of “0” or “1.”
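As a rough sketch of this preprocessing step, the snippet below standardizes a continuous variable and encodes a two-level categorical one. The function names and sample values are hypothetical, and the use of the population standard deviation is an assumption; the text does not say which variant was used.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize a list of numbers to mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD (assumed); the sample SD is also common
    if sigma == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - mu) / sigma for v in values]


def encode_binary(values, positive_label):
    """Map a two-level categorical variable to 0/1."""
    return [1 if v == positive_label else 0 for v in values]


# Hypothetical QMG scores and genders, for illustration only.
qmg = [8, 12, 15, 20, 25]
gender = ["female", "male", "female", "female", "male"]

qmg_z = zscore(qmg)
gender_bin = encode_binary(gender, positive_label="female")
print([round(z, 3) for z in qmg_z])
print(gender_bin)
```

A common practice, not stated in the text, is to estimate the mean and standard deviation on the training cohort only and reuse them for the validation cohort.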

Selection of features

To keep the model simple, we applied Student’s t-test, the Mann–Whitney U test, and the chi-square test to identify variables that differed significantly between the groups with good and poor outcomes. We then employed least absolute shrinkage and selection operator (LASSO) regression with five-fold cross-validation for dimensionality reduction. Finally, variables with non-zero coefficients were analyzed using multivariable logistic regression to identify independent risk factors, which were then used to construct the ML models.
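For reference, a LASSO-penalized logistic regression for a binary outcome is usually written as the minimization below. The text does not state the exact formulation or software used, so the binomial form, the symbols, and the coding of y_i = 1 for a poor outcome are assumptions made for illustration.

```latex
\[
(\hat{\beta}_0, \hat{\beta}) \;=\; \arg\min_{\beta_0,\,\beta}\;
\left\{ -\frac{1}{n}\sum_{i=1}^{n}\Big[\, y_i\,(\beta_0 + x_i^{\top}\beta)
- \log\!\big(1 + e^{\beta_0 + x_i^{\top}\beta}\big) \Big]
+ \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\}
\]
```

Here x_i holds the p standardized candidate variables for patient i, and λ ≥ 0 is the penalty chosen by cross-validation. A larger λ shrinks more coefficients exactly to zero, which is why variables with non-zero coefficients can be carried forward as the selected set.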

Derivation and internal validation of ML models

To evaluate the short-term prognosis risk in AChR-Ab+ GMG patients, we utilized four established ML classifiers: Logit, random forest (RF), SVM, and XGBoost. Logit, a linear method, is widely used for binary classification because it is simple and easy to interpret, which makes it a natural baseline model (19). RF is an ensemble of decision trees, each trained on a different sample of the dataset. Each tree classifies the input independently, and RF aggregates these predictions by taking the most common outcome, which improves prediction accuracy over a single tree (20). SVM is a kernel-based technique that is proficient at identifying patterns in high-dimensional data. It seeks a hyperplane that separates the classes efficiently, which supports performance in complex datasets with numerous features (21). Finally, XGBoost, a tree-based gradient boosting algorithm that constructs an ensemble of weak decision trees to form a robust model, is known for its strong resistance to overfitting (22). It is notably flexible, managing diverse data types and formats without extensive feature engineering. Additionally, XGBoost excels in structured data problems, often surpassing other algorithms in predictive accuracy (23).

Our predictive models were based on clinical features, systemic inflammation index, and their combination, each forming a unique analytical base. During the training phase, to prevent overfitting, we implemented a triple-repeated five-fold cross-validation method. In this approach, each of the five iterations selects a unique fold as the internal testing set, while the remaining four folds serve as the internal training set. Additionally, this entire process is repeated three times to enhance accuracy further. The cumulative average of these three repetitions provides a reliable estimate of error rate (24). For the RF classifier, we configured 500 trees with node splitting based on the square root of the total features. In SVM, we selected a radial basis function (RBF) kernel, effectively handling non-linear data, and fine-tuned its hyperparameters—adjusting the cost parameter through a grid search of [0.1, 1, 10] and the gamma parameter at [0.001, 0.01, 0.1]. For XGBoost, we meticulously chose parameters to balance model complexity and accuracy, setting a learning rate of 0.02, a maximum tree depth of 4, and deploying an ensemble of 600 trees.
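The resampling scheme can be illustrated with a small index splitter. This is a sketch only: the function name and seed are hypothetical, model fitting is omitted, and the folds are not stratified by outcome because the text does not say whether stratification was used.

```python
import random


def repeated_kfold(n_samples, n_splits=5, n_repeats=3, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh shuffle gives different folds per repeat
        folds = [indices[i::n_splits] for i in range(n_splits)]
        for fold, test_idx in enumerate(folds):
            # All folds except the current one form the internal training set.
            train_idx = [i for j, f in enumerate(folds) if j != fold for i in f]
            yield repeat, fold, sorted(train_idx), sorted(test_idx)


# 141 matches the size of the training cohort; everything else is illustrative.
splits = list(repeated_kfold(n_samples=141, n_splits=5, n_repeats=3))
print(len(splits))  # 3 repeats x 5 folds = 15 train/test splits
for repeat, fold, train_idx, test_idx in splits[:5]:
    print(repeat, fold, len(train_idx), len(test_idx))
```

Each patient appears in exactly one internal testing fold per repeat, so averaging a metric over the 15 splits uses every patient three times for testing.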

Following model development, each was subjected to comprehensive internal validation, evaluating its discrimination, calibration, and clinical utility. The optimal model was chosen based on its superior discrimination, strong calibration, and practical value in a clinical setting.

External validation and interpretability of ML models

To assess the robustness of our models, we performed external validation, evaluating their discriminative ability, calibration, and clinical applicability. Additionally, after selecting the optimal predictive model, we explored the individual contribution of each variable using the Shapley Additive Explanations (SHAP) methodology. SHAP is based on the Shapley value from game theory, developed by economist Lloyd Shapley. This method and its extensions help in explaining machine learning model outputs through optimal credit distribution for local explanations (25). For example, Bi et al. (26) applied SHAP to measure the contribution of each feature in a model by calculating individual SHAP values for training samples. By aggregating these values, they ranked features according to their importance in predicting positive outcomes. SHAP’s interpretability is enhanced by visual plots where each point represents a sample, colored to denote the feature’s value: yellow for higher values and blue for lower ones, with the intensity of the color showing the magnitude of the feature value. We used SHAP dependence plots to assess the significance of specific features and their effects on the model’s output, and SHAP force plots to analyze and interpret the prediction for an individual sample.
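To make the credit-distribution idea concrete, the sketch below computes exact Shapley values by brute force for a tiny toy model. It is not the SHAP library and not the study's XGBoost model: the toy risk score, the background rows, and the choice to fill absent features from background data are all assumptions for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, background):
    """Exact Shapley values of model(x) relative to the mean prediction on background.

    Features outside a coalition are filled in from each background row
    (an 'interventional' value function, chosen here for simplicity).
    """
    n = len(x)

    def value(coalition):
        total = 0.0
        for row in background:
            mixed = [x[j] if j in coalition else row[j] for j in range(n)]
            total += model(mixed)
        return total / len(background)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contribution = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                contribution += weight * (value(s | {i}) - value(s))
        phi.append(contribution)
    base_value = value(set())
    return base_value, phi


# Toy risk score with an interaction term; NOT the study's XGBoost model.
def toy_model(features):
    sii, qmg, female = features
    return 0.3 * sii + 0.2 * qmg + 0.1 * female + 0.05 * sii * qmg


# Hypothetical standardized feature rows: [SII, QMG, female].
background = [[-1.0, -0.5, 0], [0.0, 0.0, 1], [1.0, 0.5, 1], [0.5, 1.0, 0]]
patient = [1.5, 1.0, 1]

base_value, phi = shapley_values(toy_model, patient, background)
print(base_value, phi)
# Additivity: the base value plus the Shapley values recovers the model output.
print(abs(base_value + sum(phi) - toy_model(patient)) < 1e-9)
```

The brute-force loop visits every coalition, so it is only practical for a handful of features; tree-specific algorithms exist for models such as gradient-boosted trees.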

Statistical analysis

We tailored the statistical approach to the data type. We applied the chi-square test to categorical variables and used the Shapiro–Wilk test to evaluate the distribution of continuous variables; the result determined whether the Mann–Whitney U test or the independent-sample t-test was used for further analysis. To evaluate discrimination, we used receiver operating characteristic (ROC) curve analysis with the area under the curve (AUC), together with precision, recall, and F1 score. DeLong’s test was used for AUC comparisons. Additionally, model fit was evaluated using calibration curve analysis and the Brier score to gauge the accuracy of the predicted probabilities. Decision curve analysis (DCA) was conducted to estimate the net benefits of our models across various threshold probabilities, emphasizing their clinical value. Statistical analyses were carried out using IBM SPSS Statistics (version 22.0) and Python (version 3.7.1).
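The performance measures named above have short textbook definitions, sketched here in plain Python. The function names, the 0.5 classification threshold, and the toy labels and probabilities are assumptions for illustration; DeLong's test and calibration curves are not covered.

```python
def roc_auc(y_true, y_prob):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def confusion_counts(y_true, y_prob, threshold):
    """Count true positives, false positives and false negatives at a threshold."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    return tp, fp, fn


def precision_recall_f1(y_true, y_prob, threshold=0.5):
    tp, fp, fn = confusion_counts(y_true, y_prob, threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true, y_prob, threshold):
    """Decision-curve net benefit of acting on patients with risk >= threshold."""
    tp, fp, _ = confusion_counts(y_true, y_prob, threshold)
    n = len(y_true)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Toy labels (1 = poor outcome) and predicted risks; not study data.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

auc = roc_auc(y_true, y_prob)
precision, recall, f1 = precision_recall_f1(y_true, y_prob)
brier = brier_score(y_true, y_prob)
nb = net_benefit(y_true, y_prob, threshold=0.5)
print(auc, precision, recall, f1, brier, nb)
```

A decision curve is obtained by evaluating `net_benefit` over a grid of thresholds and comparing it with the treat-all and treat-none strategies.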

Results

Patient characteristics

The recruitment of study participants is illustrated in the flow diagram (Figure 1), with 202 out of 566 patients successfully enrolled. The poor outcome rates for AChR-Ab+ GMG patients after 6 months of standard therapy were similar between groups: 29.1% (41/141) in the training cohort and 27.9% (17/61) in the external validation cohort, with no statistically significant difference (χ² = 0.030, p = 0.862). Data in Table 1 confirm these findings, showing consistent distributions of clinical features and systemic inflammation indices across both cohorts, with no significant disparities (all p > 0.05).
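The cohort comparison above can be checked by hand with a 2×2 Pearson chi-square test. The sketch below assumes no continuity correction, which the text does not specify; the function name is hypothetical, and the counts are those reported for the two cohorts.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Poor vs good outcomes: training cohort 41 vs 100, external cohort 17 vs 44.
stat, p_value = chi_square_2x2(41, 100, 17, 44)
print(round(stat, 3), round(p_value, 3))
```

Under these assumptions the statistic comes out at about 0.030 with a p-value of about 0.862, in line with the reported values.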

Table 1. Comparisons of clinical parameters and systemic inflammation markers between the training and external verification cohorts.

Feature selection in the training cohort

Table 2 presents a comparison of clinical features and systemic inflammation index levels between patients with good and poor outcomes in the training cohort. Poor outcome was associated with female gender, lower BMI, higher QMG scores, longer disease duration, higher anti-AChR antibody titers, lower Hb levels, higher WBC and neutrophil counts, and higher NLR, PLR, and SII (all p < 0.05). We then applied LASSO regression with five-fold cross-validation to refine the variable set, selecting nine variables at the lambda given by the 1-standard-error rule (Figure 2): gender, BMI, QMG score, duration of disease before treatment, Hb, WBC, NLR, PLR, and SII. To further mitigate the impact of confounding factors, we conducted multivariable logistic regression on these variables to assess their roles as independent predictors of outcome in AChR-Ab+ GMG patients (Table 3). The analysis confirmed that all variables except WBC were significant independent predictors (all p < 0.05). The correlation heatmap (Figure 3) shows that all pairwise correlations were below 0.3, suggesting no strong correlation or multicollinearity among the variables. Finally, the ML models included the following variables: gender, BMI, QMG score, duration of disease before treatment, Hb, NLR, PLR, and SII. As described in the Methods, the continuous variables were Z-score normalized to a mean of zero and a standard deviation of one before being entered into the ML prediction models.
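The 1-standard-error rule mentioned above picks the largest penalty whose cross-validated error is within one standard error of the minimum. A minimal sketch follows; the function name and the cross-validation curve are hypothetical and are not the values behind Figure 2.

```python
def select_lambda(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from cross-validated deviance per lambda."""
    best = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    limit = cv_mean[best] + cv_se[best]
    # 1-SE rule: the largest penalty whose mean deviance stays within
    # one standard error of the minimum mean deviance.
    lambda_1se = max(lam for lam, m in zip(lambdas, cv_mean) if m <= limit)
    return lambdas[best], lambda_1se


# Hypothetical cross-validation curve, for illustration only.
lambdas = [0.005, 0.01, 0.02, 0.04, 0.08, 0.16]
cv_mean = [1.02, 0.98, 0.95, 0.97, 1.01, 1.10]
cv_se = [0.05, 0.05, 0.04, 0.04, 0.05, 0.06]

lambda_min, lambda_1se = select_lambda(lambdas, cv_mean, cv_se)
print(lambda_min, lambda_1se)
```

Choosing the 1-SE lambda rather than the minimum trades a small loss in fit for a sparser model, which is consistent with the stated aim of keeping the variable set small.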

Table 2. Comparison of clinical parameters and systemic inflammation markers in patients with good and poor outcomes.

Figure 2. LASSO regression analysis for feature selection. (A) Coefficient profiles for 11 variables. (B) Determination of the optimal penalty coefficient lambda using five-fold cross-validation. The plot shows partial likelihood deviance against log (lambda), where lambda serves as the tuning parameter. Red dots represent average deviance values per model at each lambda, with error bars for standard error. Optimal values are marked with dotted vertical lines based on minimum criteria and the 1-SE rule.

Table 3. Validation of variables in LASSO regression using multivariable logistic analysis.

Figure 3. Correlation heatmap of variables.

Comparing models for predicting poor outcome risk

In our comprehensive analysis of predictive models for poor outcomes in AChR-Ab+ GMG patients, we assessed four ML classifiers: Logit, SVM, RF, and XGBoost. These classifiers were tested against three sets of predictors: clinical indicators, systemic inflammation indices, and their combination. Table 4 details the performance comparison of these models, while Figures 4–6 display the ROC curves, calibration plots, and DCA. Our findings indicate that models integrating both sets of predictors achieved better discriminative ability (AUC: 0.872–0.944) compared to those using solely clinical indicators (AUC: 0.772–0.831) or systemic inflammation measures (AUC: 0.792–0.855), with statistical significance confirmed by DeLong’s test (p < 0.05).

Table 4. Performance of ML classifiers in predicting poor outcome risk in AChR-Ab+ GMG using clinical data, systemic inflammation markers, and combined datasets.

Figure 4. Comparative performance of ML classifiers (Logit, SVM, RF, XGBoost) on clinical data: (A) ROC curves, (B) calibration plots, and (C) DCA. They achieved ROC-AUCs of 0.772, 0.831, 0.819, and 0.824, respectively. ML, machine learning; ROC, receiver operating characteristic; AUC, area under the curve; DCA, decision curve analysis; Logit, logistic regression; SVM, support vector machine; RF, random forest; XGBoost, extreme gradient boosting.

Figure 5. Comparative performance of ML classifiers (Logit, SVM, RF, XGBoost) on systemic inflammation index: (A) ROC curves, (B) calibration plots, and (C) DCA. They achieved ROC-AUCs of 0.804, 0.792, 0.855, and 0.854, respectively. ML, machine learning; ROC, receiver operating characteristic; AUC, area under the curve; DCA, decision curve analysis; Logit, logistic regression; SVM, support vector machine; RF, random forest; XGBoost, extreme gradient boosting.

Figure 6. Comparative performance of ML classifiers (Logit, SVM, RF, XGBoost) on combined clinical data and systemic inflammation index: (A) ROC curves, (B) calibration plots, and (C) DCA. They achieved ROC-AUCs of 0.882, 0.872, 0.917, and 0.944, respectively. ML, machine learning; ROC, receiver operating characteristic; AUC, area under the curve; DCA, decision curve analysis; Logit, logistic regression; SVM, support vector machine; RF, random forest; XGBoost, extreme gradient boosting.

Among the models that combined clinical indicators and systemic inflammation indices, XGBoost emerged as the most effective, achieving the highest AUC of 0.944 with superior calibration, especially above the 75% threshold. Performance was uniformly validated across all models in DCA. The performance of XGBoost was consistently strong across all key metrics, including precision, recall, F1 score, and Brier score. These results establish XGBoost as the optimal model for predicting poor outcome risk in AChR-Ab+ GMG patients.

Assessing ML model using an external verification cohort

The external verification cohort was utilized to evaluate the predictive accuracy of the XGBoost model for poor outcome, employing ROC, calibration, and DCA analyses (Figure 7). Although there was a slight decrease in performance relative to the training cohort, the XGBoost model maintained a strong discriminative ability, with an AUC of 0.908 (Figure 7A). The calibration curve demonstrated high agreement between the predicted risks and observed outcome (Figure 7B). Furthermore, the DCA curve confirmed the model’s effectiveness by showing significant net benefits (Figure 7C). These findings underscore the XGBoost model’s robustness and clinical value as a predictive tool for assessing poor outcome risk.

Figure 7. Assessing the predictive performance of the optimal ML model using an external verification cohort: (A) ROC curve (AUC = 0.908), (B) calibration curve, and (C) DCA. ML, machine learning; ROC, receiver operating characteristic; AUC, area under the curve; DCA, decision curve analysis.

Interpretation of the model

SHAP analysis was employed to elucidate the impact of individual features in the XGBoost model, quantifying the influence of each by its mean absolute SHAP value. This approach ranked features by importance, revealing two clinical indicators and three systemic inflammation indices as the top five contributors (Figure 8). The SHAP summary plot (Figure 8A) shows one point per feature for every patient. In this visualization, yellow signifies higher feature values and blue denotes lower ones. The SHAP values are displayed along the horizontal axis, where points farther from zero indicate a greater contribution to the predicted short-term prognosis of AChR-Ab+ GMG patients. The importance bar chart (Figure 8B) outlines the impact of each variable on prognosis prediction. In summary, ranked by decreasing importance, the key features are SII, NLR, disease duration, PLR, QMG score, BMI, gender, and Hb.
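Ranking by mean absolute SHAP value amounts to averaging the magnitude of each feature's contribution over patients and sorting. The sketch below uses made-up SHAP values for three patients and three features; the function name and all numbers are hypothetical.

```python
def rank_by_mean_abs_shap(shap_rows, feature_names):
    """Sort features by the mean absolute SHAP value across patients."""
    n = len(shap_rows)
    importance = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


# Hypothetical SHAP values (one row per patient); not the study's output.
feature_names = ["SII", "QMG score", "Hb"]
shap_rows = [
    [0.40, 0.10, -0.05],
    [-0.30, 0.20, 0.02],
    [0.50, -0.15, -0.03],
]

ranking = rank_by_mean_abs_shap(shap_rows, feature_names)
for name, score in ranking:
    print(name, round(score, 3))
```

Taking absolute values before averaging matters: a feature that pushes some patients strongly toward a poor outcome and others strongly away from it would otherwise average out to near zero.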

Figure 8. SHAP analysis of XGBoost model for predicting short-term prognosis. (A) Summary plot and (B) feature importance ranking. SHAP, Shapley Additive Explanations; XGBoost, extreme gradient boosting; SII, systemic immune-inflammation index; QMG, quantitative myasthenia gravis; NLR, neutrophil to lymphocyte ratio; PLR, platelet to lymphocyte ratio; BMI, body mass index; Hb, hemoglobin.

Figure 9 presents SHAP dependence plots for each of the eight factors, elucidating their influence on the outcomes of the XGBoost model. Positive SHAP values indicate a higher risk of poor outcomes in AChR-Ab+ GMG patients. Our findings associate poor outcomes with several factors: increased SII, NLR, and PLR; longer disease duration prior to treatment; elevated QMG scores; female gender; lower BMI; and decreased Hb levels.

Figure 9. SHAP dependence plots of the XGBoost model. SHAP, Shapley Additive Explanations; XGBoost, extreme gradient boosting; SII, systemic immune-inflammation index; NLR, neutrophil to lymphocyte ratio; PLR, platelet to lymphocyte ratio; QMG, quantitative myasthenia gravis; BMI, body mass index; Hb, hemoglobin.

The SHAP force plot shows how individual features contribute to the prediction for a single patient (Figure 10). Yellow areas show features increasing the likelihood of poor outcomes in AChR-Ab+ GMG patients, while red areas show features decreasing this likelihood. The wider the color region, the more significant the impact. The value f(x) is the model output for that patient and equals the base value plus the sum of the patient’s SHAP values, where the base value is the average model output over the reference data. The upper panel illustrates an accurate prediction of a poor outcome, attributed to factors such as female gender and higher SII values (Figure 10A). The lower panel, in contrast, accurately identifies a patient likely to experience a good outcome, based on factors such as a lower QMG score and male gender (Figure 10B). Combined with XGBoost, this approach differentiates between patients at risk for poor or good outcomes, providing individualized risk assessments.

Figure 10. SHAP force plots illustrating individual prediction results: (A) for a patient with a poor outcome; (B) for a patient with a good outcome. SHAP, Shapley Additive Explanations; SII, systemic immune-inflammation index; QMG, quantitative myasthenia gravis; NLR, neutrophil to lymphocyte ratio; PLR, platelet to lymphocyte ratio; BMI, body mass index; Hb, hemoglobin.

Discussion

MG is an autoimmune disease that affects various muscles, leading to generalized weakness in 80% of patients, most of whom test positive for anti-AChR-Abs (2, 3). Inflammation is key in MG, enhancing inflammatory factors, activating B cells, and producing autoantibodies (8, 9, 27). Due to varied treatment responses, precise predictive models are essential. Current models fail in accuracy as they overlook systemic inflammation and use traditional linear approaches instead of advanced ML techniques (6, 7). To address this deficiency, our study developed predictive models using four ML classifiers, incorporating clinical features, the systemic inflammation index, or a combination of both. Our analysis identified the XGBoost model, integrating both clinical features and the systemic inflammation index, as the most effective for predicting short-term prognosis. Notably, integrating SHAP analysis enhanced the interpretability of the XGBoost model, clarifying the influence of the systemic inflammation index in prognosis. This research signifies a substantial advancement in using ML to integrate clinical indicators with the systemic inflammation index for accurate short-term prognosis assessments. Early prediction of treatment response facilitates tailored treatment strategies, potentially offering more intensive or alternative therapies to high-risk patients. Such strategies are likely to increase therapeutic success, slow disease progression, reduce hospital stays, and enhance patient quality of life.

In our study, we selected ML models due to their proficiency in analyzing complex non-linear relationships between variables and outcomes, a capability that exceeds that of conventional linear models (28). We utilized four ML models to assess clinical indicators, the systemic inflammation index, and their integration. The models that combined both data types demonstrated superior efficacy in predicting short-term prognosis, likely because they capture a broader array of factors that directly influence outcomes. This comprehensive methodology significantly improved predictive accuracy.

Among the ML models we evaluated, XGBoost proved to be the most effective. It utilized clinical indicators and the systemic inflammation index to deliver high accuracy, a performance that was consistent even during external validation. Indeed, previous studies have developed ML techniques to predict short-term clinical outcome in MG patients. For example, Zhong et al. (17) analyzed clinical and other characteristics of MG patients with diverse antibody types using an ML model to predict their short-term outcomes. Our research specifically targets AChR-Ab+ GMG patients, with a focus on evaluating the systemic inflammation index to enhance prediction accuracy. To enhance the interpretability of this complex ML model, we employed SHAP analysis. The SHAP feature importance map visually represents the impact of each feature on a model’s output. It displays the SHAP values for each feature, indicating their range and the positive or negative effect on the model. High SHAP values correlate with significant influence (29). Each point in the plot corresponds to a sample, with bar graphs showing SHAP value distributions. The color of these bars reflects feature values within the sample. The position of each bar graph on the plot reveals the feature’s influence direction: leftward shifts indicate negative impacts, while rightward shifts suggest positive effects. This tool aids in identifying critical features for optimizing model performance and guiding feature selection (30, 31). The five most important predictors of short-term prognosis on the SHAP feature importance map include two clinical indicators and three related to the systemic inflammation index. Inflammatory mediators such as interleukins, interferons, and chemokines from inflammatory cells play a key role in modulating the immune response. Studies indicate that macrophages and monocytes in MG release cytokines, initiating inflammatory cascades that activate the immune system (32, 33). 
Supporting evidence includes detection of neuromuscular antigens like AChRs, germinal centers, elevated Tfh cell counts in the thymus, changes in microRNAs, and specific IFN signaling in thymic epithelial cell subpopulations in MG patients with thymoma (34, 35). Circulating inflammatory indices are easy and inexpensive to measure. We therefore focused on circulating inflammatory markers for predicting the short-term prognosis of AChR-Ab+ GMG patients, a focus that is rare in previous studies. In our study, we observed that elevated levels of three key systemic inflammation indices (NLR, PLR, and SII) correlated with poor outcomes in patients with AChR-Ab+ GMG. The robustness of these indices against physiological, pathological, and physical variations makes them more effective than individual metrics such as neutrophils, lymphocytes, monocytes, or platelets (36, 37). Among these, the SII particularly stands out as it encapsulates the dynamic interplay and potential synergy among platelets, neutrophils, and lymphocytes (38). Consequently, compared to other markers like NLR and PLR, the SII potentially offers a more objective representation of the interactions between inflammatory and immune responses. Moreover, our model indicates that higher QMG scores and prolonged disease duration are associated with poorer treatment responses, aligning with the established correlation between these factors and increased disease severity and chronicity. This finding is consistent with prior research (6, 7), underscoring the importance of early and aggressive intervention in patients with severe symptoms or a lengthy disease history to enhance treatment outcomes. Combined with SHAP, XGBoost offered clear insights into how different factors affect outcomes, which is useful for screening for the risk of poor outcomes.
Integrating ML into this screening process holds promise for enabling clinicians to initiate early interventions that improve outcomes for AChR-Ab+ GMG patients.
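
For readers less familiar with the three indices discussed above, the following minimal Python sketch shows how they are commonly computed from a routine complete blood count. It assumes the widely used definitions (NLR = neutrophils / lymphocytes, PLR = platelets / lymphocytes, SII = platelets × neutrophils / lymphocytes); the exact definitions and units used in this study are those given in its Methods. The class name, function name, and example counts are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class BloodCounts:
    """Absolute counts from a complete blood count, in 10^9 cells/L (assumed unit)."""
    neutrophils: float
    lymphocytes: float
    platelets: float


def inflammation_indices(counts: BloodCounts) -> Dict[str, float]:
    """Return NLR, PLR, and SII using the commonly cited definitions."""
    if counts.lymphocytes <= 0:
        # All three indices divide by the lymphocyte count.
        raise ValueError("lymphocyte count must be positive")
    nlr = counts.neutrophils / counts.lymphocytes
    plr = counts.platelets / counts.lymphocytes
    sii = counts.platelets * counts.neutrophils / counts.lymphocytes
    return {"NLR": nlr, "PLR": plr, "SII": sii}


# Hypothetical patient values, not taken from the study cohort.
example = BloodCounts(neutrophils=4.0, lymphocytes=2.0, platelets=250.0)
print(inflammation_indices(example))
```

Because the SII multiplies the platelet count by the NLR, it moves with changes in any of the three cell lines, which is one way to read the statement that it captures their interplay.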

Our findings suggest that this ML model could support clinical management in several ways. First, for patients with a higher systemic inflammation index, the model improves patient-physician communication by flagging a potentially poor outcome, which also allows clinicians to prepare and manage care proactively. Second, it aids early-career clinicians by facilitating referral of patients predicted to have poor outcomes to more specialized and experienced clinicians, thus reducing the risks associated with inexperience. Finally, other clinicians can input clinical features and systemic inflammation indices into our XGBoost model to obtain individualized predictions. The model also provides a SHAP force plot that illustrates the impact of each variable on the predicted outcome, enhancing the accuracy and understanding of prognostic assessment.
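
The idea behind a per-patient SHAP explanation can be illustrated with a small, self-contained Python sketch. It computes exact Shapley values by brute force for a toy scoring function and checks the additivity property that a force plot relies on: the reference prediction plus the sum of the feature contributions equals the prediction for the patient. This is a simplified illustration, not the study's pipeline: the scoring function, its coefficients, the feature values, and the single reference patient are all hypothetical, and SHAP implementations for tree models typically average over a background dataset rather than using one reference point.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict

Features = Dict[str, float]


def shapley_values(predict: Callable[[Features], float],
                   x: Features, baseline: Features) -> Dict[str, float]:
    """Exact Shapley values of `predict` at x relative to a single baseline."""
    names = list(x)
    n = len(names)
    phi = {}
    for name in names:
        others = [m for m in names if m != name]
        total = 0.0
        for k in range(n):  # size of the coalition that excludes `name`
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                # Features outside the coalition take their baseline value.
                without = {m: (x[m] if m in subset else baseline[m]) for m in names}
                with_feature = dict(without)
                with_feature[name] = x[name]
                total += weight * (predict(with_feature) - predict(without))
        phi[name] = total
    return phi


def toy_risk_score(f: Features) -> float:
    """Hypothetical score with one interaction term; NOT the paper's model."""
    high_nlr = 1.0 if f["NLR"] > 3.0 else 0.0
    return 0.02 * f["QMG"] + 0.0004 * f["SII"] + 0.01 * f["QMG"] * high_nlr


patient = {"QMG": 20.0, "SII": 900.0, "NLR": 4.0}     # hypothetical patient
reference = {"QMG": 10.0, "SII": 500.0, "NLR": 2.0}   # hypothetical reference

phi = shapley_values(toy_risk_score, patient, reference)
base = toy_risk_score(reference)
# Additivity: base value plus all contributions reproduces the prediction.
assert abs(base + sum(phi.values()) - toy_risk_score(patient)) < 1e-9
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The brute-force enumeration grows exponentially with the number of features, so it is practical only for a handful of inputs; it is shown here because it makes the definition explicit.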

Our study yielded promising results, yet two limitations should be noted. First, it was limited to three institutions in the same region and involved only 202 patients, so regional bias and the modest sample size may affect the generalizability of the findings. Second, the exclusion of patients without comprehensive clinical records and the retrospective design of the study may have introduced selection bias. Despite these issues, our research highlights the capability of ML models that integrate clinical indicators and the systemic inflammation index to predict the short-term prognosis of AChR-Ab+ GMG patients. Future research should use larger, multi-center, prospective cohorts to enhance the model’s reliability and extend its applicability.

In conclusion, the XGBoost model performed well in predicting the short-term prognosis of AChR-Ab+ GMG patients by integrating clinical indicators with the systemic inflammation index. This ML model supports individualized risk assessment, aiding clinicians in informed decision-making with the aim of improving patient outcomes.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Ethics statement

The studies involving humans were approved by the Ethics Committee of Nanjing Jiangbei Hospital. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.

Author contributions

YX: Writing – original draft, Methodology, Formal analysis, Data curation, Conceptualization. QL: Writing – original draft. MP: Writing – original draft, Formal analysis, Data curation. XJ: Writing – original draft, Data curation. WW: Writing – original draft. QG: Writing – original draft, Data curation. LL: Writing – review & editing, Supervision.

Funding

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Gilhus, NE, Tzartos, S, Evoli, A, Palace, J, Burns, TM, and Verschuuren, J. Myasthenia gravis. Nat Rev Dis Primers. (2019) 5:30. doi: 10.1038/s41572-019-0079-y

2. Hehir, MK, and Silvestri, NJ. Generalized myasthenia gravis: classification, clinical presentation, natural history, and epidemiology. Neurol Clin. (2018) 36:253–60. doi: 10.1016/j.ncl.2018.01.002

3. Lindstrom, JM, Seybold, ME, Lennon, VA, Whittingham, S, and Duane, DD. Antibody to acetylcholine receptor in myasthenia gravis. Prevalence, clinical correlates, and diagnostic value. Neurology. (1976) 26:1054–9. doi: 10.1212/wnl.26.11.1054

4. Silvestri, NJ, and Wolfe, GI. Treatment-refractory myasthenia gravis. J Clin Neuromuscul Dis. (2014) 15:167–78. doi: 10.1097/cnd.0000000000000034

5. Sanders, DB, Wolfe, GI, Benatar, M, Evoli, A, Gilhus, NE, Illa, I, et al. International consensus guidance for management of myasthenia gravis: executive summary. Neurology. (2016) 87:419–25. doi: 10.1212/wnl.0000000000002790

6. Liang, F, Yin, Z, Li, Y, Li, G, Ma, J, Zhang, H, et al. Constructing and validating a nomogram model for short-term prognosis of patients with AChR-Ab+ GMG. Neurol Ther. (2024) 13:551–62. doi: 10.1007/s40120-024-00590-0

7. Zhao, R, Wang, Y, Huan, X, Zhong, H, Zhou, Z, Xi, J, et al. Nomogram for short-term outcome assessment in AChR subtype generalized myasthenia gravis. J Transl Med. (2021) 19:285. doi: 10.1186/s12967-021-02961-9

8. Wang, Z, and Yan, Y. Immunopathogenesis in myasthenia gravis and neuromyelitis optica. Front Immunol. (2017) 8:1785. doi: 10.3389/fimmu.2017.01785

9. Uzawa, A, Kuwabara, S, and Suzuki, S. Roles of cytokines and T cells in the pathogenesis of myasthenia gravis. Clin Exp Immunol. (2021) 203:366–74. doi: 10.1111/cei.13546

10. Nøst, TH, Alcala, K, Urbarova, I, Byrne, KS, Guida, F, Sandanger, TM, et al. Systemic inflammation markers and cancer incidence in the UK Biobank. Eur J Epidemiol. (2021) 36:841–8. doi: 10.1007/s10654-021-00752-6

11. Qin, B, Ma, N, Tang, Q, Wei, T, Yang, M, Fu, H, et al. Neutrophil to lymphocyte ratio (NLR) and platelet to lymphocyte ratio (PLR) were useful markers in assessment of inflammatory response and disease activity in SLE patients. Mod Rheumatol. (2016) 26:372–6. doi: 10.3109/14397595.2015.1091136

12. Gasparyan, AY, and Ayvazyan, L. The platelet-to-lymphocyte ratio as an inflammatory marker in rheumatic diseases. Ann Lab Med. (2019) 39:345–57. doi: 10.3343/alm.2019.39.4.345

13. Xie, H, Zhao, Y, Pan, C, Zhang, J, Zhou, Y, Li, Y, et al. Association of neutrophil-to-lymphocyte ratio (NLR) with the prognosis of first attack neuromyelitis optica spectrum disorder (NMOSD): a retrospective cohort study. BMC Neurol. (2021) 21:389. doi: 10.1186/s12883-021-02432-0

14. Jiang, T, Gradus, JL, and Rosellini, AJ. Supervised machine learning: a brief primer. Behav Ther. (2020) 51:675–87. doi: 10.1016/j.beth.2020.05.002

15. Radak, M, Lafta, HY, and Fallahi, H. Machine learning and deep learning techniques for breast cancer diagnosis and classification: a comprehensive review of medical imaging studies. J Cancer Res Clin Oncol. (2023) 149:10473–91. doi: 10.1007/s00432-023-04956-z

16. Chen, J, Yang, L, Han, J, Wang, L, Wu, T, and Zhao, D. Interpretable machine learning models using peripheral immune cells to predict 90-day readmission or mortality in acute heart failure patients. Clin Appl Thromb Hemost. (2024) 30:10760296241259784. doi: 10.1177/10760296241259784

17. Zhong, H, Ruan, Z, Yan, C, Lv, Z, Zheng, X, Goh, LY, et al. Short-term outcome prediction for myasthenia gravis: an explainable machine learning model. Ther Adv Neurol Disord. (2023) 16:17562864231154976. doi: 10.1177/17562864231154976

18. Yoshimoto, Y, Ishida, S, Hosokawa, T, and Arawaka, S. Assessment of clinical factors affecting outcome of myasthenia gravis. Muscle Nerve. (2021) 64:90–4. doi: 10.1002/mus.27247

19. Omar, ED, Mat, H, Abd Karim, AZ, Sanaudi, R, Ibrahim, FH, Omar, MA, et al. Comparative analysis of logistic regression, gradient boosted trees, SVM, and random forest algorithms for prediction of acute kidney injury requiring dialysis after cardiac surgery. Int J Nephrol Renovasc Dis. (2024) 17:197–204. doi: 10.2147/ijnrd.s461028

20. Liu, L, Liu, W, Jia, Z, Li, Y, Wu, H, Qu, S, et al. Application of machine learning algorithms to predict lymph node metastasis in gastric neuroendocrine neoplasms. Heliyon. (2023) 9:e20928. doi: 10.1016/j.heliyon.2023.e20928

21. Chen, L, Jiang, J, Dou, B, Feng, H, Liu, J, Zhu, Y, et al. Machine learning study of the extended drug-target interaction network informed by pain related voltage-gated sodium channels. Pain. (2024) 165:908–21. doi: 10.1097/j.pain.0000000000003089

22. Ding, R, Deng, M, Wei, H, Zhang, Y, Wei, L, Jiang, G, et al. Machine learning-based prediction of clinical outcomes after traumatic brain injury: hidden information of early physiological time series. CNS Neurosci Ther. (2024) 30:e14848. doi: 10.1111/cns.14848

23. Jannusch, K, Dietzel, F, Bruckmann, NM, Morawitz, J, Boschheidgen, M, Minko, P, et al. Prediction of therapy response of breast cancer patients with machine learning based on clinical data and imaging data derived from breast [¹⁸F]FDG-PET/MRI. Eur J Nucl Med Mol Imaging. (2024) 51:1451–61. doi: 10.1007/s00259-023-06513-9

24. Nakatsu, RT. An evaluation of four resampling methods used in machine learning classification. IEEE Intell Syst. (2020) 36:51–7. doi: 10.1109/MIS.2020.2978066

25. Lundberg, SM, and Lee, S-I (2017). A unified approach to interpreting model predictions. NIPS’17: Proceedings of the 31st International Conference on Neural Information Processing Systems. 4768–4777.

26. Bi, Y, Xiang, D, Ge, Z, Li, F, Jia, C, and Song, J. An interpretable prediction model for identifying N⁷-methylguanosine sites based on XGBoost and SHAP. Mol Ther Nucleic Acids. (2020) 22:362–72. doi: 10.1016/j.omtn.2020.08.022

27. Hu, Y, Wang, J, Rao, J, Xu, X, Cheng, Y, Yan, L, et al. Comparison of peripheral blood B cell subset ratios and B cell-related cytokine levels between ocular and generalized myasthenia gravis. Int Immunopharmacol. (2020) 80:106130. doi: 10.1016/j.intimp.2019.106130

28. Uddin, S, Khan, A, Hossain, ME, and Moni, MA. Comparing different supervised machine learning algorithms for disease prediction. BMC Med Inform Decis Mak. (2019) 19:281. doi: 10.1186/s12911-019-1004-8

29. Zhang, J, Niu, W, Yang, Y, Hou, D, and Dong, B. Machine learning prediction models for compressive strength of calcined sludge-cement composites. Constr Build Mater. (2022) 346:128442. doi: 10.1016/j.conbuildmat.2022.128442

30. Nohara, Y, Matsumoto, K, Soejima, H, and Nakashima, N. Explanation of machine learning models using shapley additive explanation and application for real data in hospital. Comput Methods Prog Biomed. (2022) 214:106584. doi: 10.1016/j.cmpb.2021.106584

31. Xue, B, Li, D, Lu, C, King, CR, Wildes, T, Avidan, MS, et al. Use of machine learning to develop and evaluate models using preoperative and intraoperative data to identify risks of postoperative complications. JAMA Netw Open. (2021) 4:e212240. doi: 10.1001/jamanetworkopen.2021.2240

32. Maselli, RA, Richman, DP, and Wollmann, RL. Inflammation at the neuromuscular junction in myasthenia gravis. Neurology. (1991) 41:1497–504. doi: 10.1212/wnl.41.9.1497

33. Lefeuvre, CM, Payet, CA, Fayet, OM, Maillard, S, Truffault, F, Bondet, V, et al. Risk factors associated with myasthenia gravis in thymoma patients: the potential role of thymic germinal centers. J Autoimmun. (2020) 106:102337. doi: 10.1016/j.jaut.2019.102337

34. Song, Y, Zhou, L, Miao, F, Chen, G, Zhu, Y, Gao, X, et al. Increased frequency of thymic T follicular helper cells in myasthenia gravis patients with thymoma. J Thorac Dis. (2016) 8:314–22. doi: 10.21037/jtd.2016.03.03

35. Sengupta, M, and Wang, BD. MicroRNA and mRNA expression associated with ectopic germinal centers in thymus of myasthenia gravis. PLoS One. (2018) 13:e0205464. doi: 10.1371/journal.pone.0205464

36. Schiefer, S, Wirsik, NM, Kalkum, E, Seide, SE, and Nienhüser, H. Systematic review of prognostic role of blood cell ratios in patients with gastric cancer undergoing surgery. Diagnostics. (2022) 12:3. doi: 10.3390/diagnostics12030593

37. Almășan, O, and Leucuța, DC. Blood cell count inflammatory markers as prognostic indicators of periodontitis: a systematic review and meta-analysis. J Pers Med. (2022) 12:992. doi: 10.3390/jpm12060992

38. Wu, X, Wang, H, Xie, G, Lin, S, and Ji, C. Increased systemic immune-inflammation index can predict respiratory failure in patients with Guillain–Barré syndrome. Neurol Sci. (2022) 43:1223–31. doi: 10.1007/s10072-021-05420-x

Keywords: short-term prognosis, generalized myasthenia gravis, systemic inflammation index, machine learning, prognosis

Citation: Xu Y, Li Q, Pan M, Jia X, Wang W, Guo Q and Luan L (2024) Interpretable machine learning models for predicting short-term prognosis in AChR-Ab+ generalized myasthenia gravis using clinical features and systemic inflammation index. Front. Neurol. 15:1459555. doi: 10.3389/fneur.2024.1459555

Received: 04 July 2024; Accepted: 18 September 2024;
Published: 09 October 2024.

Edited by:

Fei He, Coventry University, United Kingdom

Reviewed by:

Shahar Shelly, Rambam Health Care Campus, Israel
Xinglong Yang, The First Affiliated Hospital of Kunming Medical University, China

Copyright © 2024 Xu, Li, Pan, Jia, Wang, Guo and Luan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Liqin Luan
