Accurately and objectively quantifying the clinical features of Parkinson's disease (PD) is crucial for supporting diagnosis and guiding treatment planning. Using multi-site motor feature data, this study therefore aimed to develop an interpretable machine learning (ML) model for classifying the “OFF” and “ON” status of patients with PD and to identify the motor features most strongly associated with changes in clinical symptoms.
We used support vector machine-based recursive feature elimination (SVM-RFE) to select promising motor features. We then built 12 ML models on the selected features and identified the one with the best classification performance. Finally, we applied the SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME) methods to explain that model and rank the importance of the motor features.
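The feature-selection and classification steps described above can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the study's implementation: the data are synthetic, the linear kernel, the standardization step, the 80/20 split, and the choice to keep eight features are all assumptions, and Gaussian naive Bayes stands in for whichever of the 12 candidate models is being evaluated.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (NOT the study's data): 30 candidate motor features,
# binary label 0 = "OFF", 1 = "ON". Sample size and feature count are arbitrary.
X, y = make_classification(
    n_samples=192, n_features=30, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Standardize using training-set statistics only, so that the linear SVM
# weights used for ranking are comparable across features.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# SVM-RFE: repeatedly fit a linear SVM and drop the feature with the smallest
# absolute weight until the requested number of features remains.
# Keeping eight features is an assumption made for this sketch.
selector = RFE(estimator=SVC(kernel="linear"), n_features_to_select=8, step=1)
selector.fit(X_train_s, y_train)
selected = np.flatnonzero(selector.support_)

# Train one candidate classifier (Gaussian naive Bayes) on the kept features
# and score it on the held-out split.
clf = GaussianNB().fit(selector.transform(X_train_s), y_train)
proba_on = clf.predict_proba(selector.transform(X_test_s))[:, 1]
auc = roc_auc_score(y_test, proba_on)

print("selected feature indices:", selected.tolist())
print("held-out AUC:", round(auc, 3))
```

Fitting the scaler and the selector on the training split only keeps the held-out estimate free of information from the test cases. The same selected feature set could then be passed to each of the other candidate models for comparison.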
A total of 96 patients were included in this study. The naive Bayes (NB) model achieved the highest classification performance (AUC = 0.956; sensitivity = 0.8947, 95% CI 0.6686–0.9870; accuracy = 0.8421, 95% CI 0.6875–0.9398). Based on the NB model, we used the SHAP algorithm to analyze the contribution of eight motor features to the classification results. The Gait: range of motion (RoM) Shank left (L) (degrees) [Mean] feature appeared to be the most important motor feature across all classification horizons.
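Confidence intervals for proportions such as sensitivity and accuracy are often computed as exact binomial (Clopper-Pearson) intervals. The sketch below shows that calculation using only the Python standard library. The abstract does not state which interval method was used or the underlying counts, so both are assumptions here: the counts 17 of 19 and 32 of 38 are hypothetical, chosen only because they reproduce the reported point estimates (17/19 ≈ 0.8947, 32/38 ≈ 0.8421).

```python
from math import comb


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def bisect_root(f, lo=0.0, hi=1.0, iters=60):
    """Find the root of a monotone function f that changes sign on [lo, hi]."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid  # root lies in the upper half
        else:
            hi = mid  # root lies in the lower half
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion k/n."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else bisect_root(
        lambda p: 1 - binom_cdf(k - 1, n, p) - alpha / 2
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else bisect_root(
        lambda p: binom_cdf(k, n, p) - alpha / 2
    )
    return lower, upper


# Hypothetical counts (not stated in the source; see the note above).
sens_ci = clopper_pearson(17, 19)
acc_ci = clopper_pearson(32, 38)
print("sensitivity 17/19, 95% CI:", sens_ci)
print("accuracy 32/38, 95% CI:", acc_ci)
```

With so few test cases, the exact interval is wide and asymmetric around the point estimate, which is worth keeping in mind when comparing models whose intervals overlap.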
The symptoms of PD can be objectively quantified. ML models built on suitable motor features can identify whether patients with PD are in the “ON” or “OFF” status. The variations in these motor features were significantly correlated with improvement rates in patients' quality of life. In the future, these features might serve as objective digital biomarkers for characterizing symptom changes in patients with PD and might help support diagnosis and treatment.