Machine Learning Classifiers to Evaluate Data From Gait Analysis With Depth Cameras in Patients With Parkinson’s Disease

Muñoz-Ospina, Beatriz; Alvarez-Garcia, Daniela; Clavijo-Moran, Hugo Juan Camilo; Valderrama-Chaparro, Jaime Andrés; García-Peña, Melisa; Herrán, Carlos Alfonso; Urcuqui, Christian Camilo; Navarro-Cadavid, Andrés; Orozco, Jorge

doi:10.3389/fnhum.2022.826376

ORIGINAL RESEARCH article

Front. Hum. Neurosci., 19 May 2022

Sec. Motor Neuroscience

Volume 16 - 2022 | https://doi.org/10.3389/fnhum.2022.826376

This article is part of the Research Topic Use of Computerized Gait Analysis in Neurological Pathologies


Beatriz Muñoz-Ospina¹*

Daniela Alvarez-Garcia²,³

Hugo Juan Camilo Clavijo-Moran⁴

Jaime Andrés Valderrama-Chaparro⁵

Melisa García-Peña³

Carlos Alfonso Herrán³

Christian Camilo Urcuqui³

Andrés Navarro-Cadavid³

Jorge Orozco¹

¹Fundación Valle del Lili, Departamento de Neurología, Cali, Colombia
²Fundación Valle del Lili, Departamento de Neurocirugía, Cali, Colombia
³Universidad Icesi, Lab i2t/CENIT, Cali, Colombia
⁴Fundación Valle del Lili, Centro de investigaciones clínicas, Cali, Colombia
⁵Universidad Icesi, Facultad de ciencias de la salud, Cali, Colombia

Introduction: The assessment of motor symptoms in Parkinson’s disease (PD) is usually limited to clinical rating scales (MDS-UPDRS III) and depends on the clinician’s experience. This study aims to propose a machine learning algorithm that uses variables from the upper and lower limbs to distinguish people with PD from healthy people, using data from a portable low-cost device (RGB-D camera). Such an algorithm can be used to support the diagnosis and follow-up of patients in developing countries and remote areas.

Methods: We used the Kinect® eMotion system to capture spatiotemporal gait data from 30 patients with PD and 30 healthy age-matched controls in three walking trials. First, a correlation matrix was built using the variables of the upper and lower limbs. We then applied a backward feature selection model using R and Python to determine the most relevant variables. Three further analyses were done using the variables selected by the backward feature selection model (Dataset A), the variables selected by a movement disorders specialist (Dataset B), and all the variables in the dataset (Dataset C). We ran seven machine learning models on each dataset. Each dataset was divided into 80% for algorithm training and 20% for evaluation. Finally, a causal inference model (CIM) using the DoWhy library was applied to Dataset B because of its accuracy and simplicity.

Results: The Random Forest model was the most accurate for all three variable datasets (Dataset A: 81.8%; Dataset B: 83.6%; Dataset C: 84.5%), followed by the support vector machine. The CIM shows a relation between leg variables and arm swing asymmetry (ASA) and a proportional relationship between ASA and the diagnosis of PD with a robust estimator (1,537).

Conclusions: Machine learning techniques based on objective measures obtained with portable low-cost devices (Kinect® eMotion) are useful and accurate for classifying patients with Parkinson’s disease. This method can be used to evaluate patients remotely and help clinicians make decisions regarding follow-up and treatment.

Introduction

Parkinson’s disease (PD) is the second most prevalent neurodegenerative disease in the world, and the number of affected individuals is growing at an alarming rate: the number of cases is estimated to double between 2015 and 2040 (de Lau and Breteler, 2006; Tysnes and Storstein, 2017; Dorsey and Bloem, 2018). PD is clinically characterized by motor symptoms such as bradykinesia, rigidity, tremor, gait disturbance, and postural instability (Schneider and Obeso, 2014; Postuma et al., 2015; Deb et al., 2021). Diagnosis and follow-up are based on several scales and questionnaires that assess severity, including the Movement Disorder Society-Sponsored Revision of the Unified Parkinson’s Disease Rating Scale (MDS-UPDRS; Goetz et al., 2008). However, these clinical scales are subjective, with high inter-rater variability between clinicians. Furthermore, follow-up is also based on self-report questionnaires that are prone to recall bias (Deb et al., 2021). In the last 20 years, there has been great interest in developing objective measurements focused on early diagnosis, accurate follow-up, evaluation of motor fluctuations, and prognosis in PD, from which technology-based objective measurements (TOMs) have arisen as a complement to clinical assessment (Urcuqui et al., 2018; Deb et al., 2021).

In PD, changes in gait kinematics and spatiotemporal features are hallmarks of the disease. Gait analysis is complex and usually requires a gait and biomechanics laboratory, which is expensive and not globally available for medical consultation (Urcuqui et al., 2018). Recently, several cost-effective instruments, such as RGB-D cameras (Kinect®), have been used to assess PD motor symptoms. Despite the large number of TOMs studies and available data, including those from inertial measurement units (IMUs) that do not need a specialized laboratory, RGB-D cameras are the most accessible technology in remote areas because of their cost and simplicity. However, data processing and classification methods still vary across studies.

Machine learning (ML) techniques have been studied in several medical areas including PD (Sidey-Gibbons and Sidey-Gibbons, 2019) in order to distinguish healthy volunteers from patients using voice analysis (Ozkan, 2016), foot pressure systems (Abdulhay et al., 2018), RGB-D cameras (Buongiorno et al., 2019; Jaggy Castaño-Pino et al., 2019), optoelectronic motion analysis systems (Varrecchia et al., 2021), wearable sensors such as accelerometers or inertial measurement units (IMU; Yoneyama et al., 2013; Caramia et al., 2018), walkway pressure analysis (Wahid et al., 2015), and variables associated with knee and trunk rotation (Varrecchia et al., 2021). Other studies have used unsupervised learning to extract features in the initial stages of the disease (Singh and Samavedham, 2015), propose a method to obtain informative correlation-aware signals (Zhang et al., 2021), and evaluate clustering algorithms to support the prediction of the disease (Sherly Puspha Annabel et al., 2021). Most of the studies that aimed to distinguish healthy people from PD patients focused solely on leg variables, arm variables, or axial trunk and knee rotation, even though the disease involves all four limbs and the arms are affected first (Ospina et al., 2018; Monje et al., 2021).

With the rise of telemedicine in recent years, particularly after the beginning of the SARS-CoV-2 pandemic, it has never been so important to develop simple assessment methods that do not require high costs or specialized equipment, particularly in developing countries where access to specialized medicine is limited. In addition, telemedicine programs in Parkinson’s disease are a growing field, and gait measurement poses many challenges for ensuring quality evaluation of patients in rural regions and developing countries. Remote monitoring with synchronous and asynchronous assessments has included the use of specialized devices and recorded and uploaded videos for the evaluation of motor features such as bradykinesia, gait, and falls (Shalash et al., 2021).

In this work, our aim is to study the causal relationship between gait features from upper and lower extremities and to assess the performance of a machine learning model that distinguishes people with PD from healthy subjects using data from a portable low-cost device (depth camera), the Kinect® eMotion system, in order to support the diagnosis and follow-up of patients with PD in remote areas.

Materials and Methods

Design and Participants

The dataset was extracted from a single-center study carried out between June and December 2016 by the Neurology Service at the Fundación Valle del Lili academic hospital in Cali, Colombia (Muñoz Ospina et al., 2019). We included spatiotemporal gait data from 30 patients with PD and 30 healthy age-matched controls. Each patient was evaluated by a movement disorder specialist and met the UK Parkinson’s Disease Society Brain Bank diagnostic criteria. No participant had major conditions affecting gait (major orthopedic surgeries, osteoarthritis, other neuromuscular disorders, or walking aids). All participants with PD were treated with dopaminergic agonists and were evaluated in the “on” state. Institutional review board approval was obtained prior to starting the study, and all participants provided written informed consent before participation.

Gait data were obtained from previous studies using an RGB-D camera (Kinect® eMotion) coupled with signal processing software. Subjects underwent a single gait evaluation session during which each subject was asked to walk at their preferred speed during three consecutive walking trials. The measurements were made in a corridor 4 m long and 1.5 m wide, free of interference. This distance allowed the Kinect® to record a minimum of one full gait cycle per limb. Figure 1 shows the setup during a measurement campaign in a rural area in the southwest of the country.

FIGURE 1

Figure 1. RGB-D camera setup and gait evaluation zone.

As indicated in previous studies, we used wavelet techniques to extract gait phases and generate several spatiotemporal variables (see Table 1). These variables were obtained from a wavelet decomposition using a Daubechies wavelet (Db8; Jaggy Castaño-Pino et al., 2019).

TABLE 1

Table 1. Gait variables definition.

Preprocessing Features

As we aimed to study arm and leg variables (one dataset for each set of features), all the data were integrated using a unique ID for each patient, and the result was a dataset of 620 records and 28 features. The join produced 96 records without values, which were excluded from the study. After the filtering process, the dataset had a shape of 554 × 28; 37% of the dataset corresponded to healthy controls and 63% to PD patients.

Two datasets with the same shape were generated for further analysis: one with normalized values, because normalization is a prerequisite for some machine learning algorithms, and the other without normalization. For normalization we used the open library ClusterSim (Walesiak and Dudek, 2020), which uses the following formula: $\frac{x - \mathrm{mean}}{\sqrt{\sum (x - \mathrm{mean})^{2}}}$.
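As an illustration of the normalization formula above, the following minimal Python sketch applies it to a single column. It is not the ClusterSim implementation; the function name `normalize_column` and the sample step lengths are hypothetical and serve only to show the arithmetic.

```python
import math


def normalize_column(values):
    """Center a column on its mean and divide by the square root of the
    summed squared deviations, as in the formula given in the text."""
    mean = sum(values) / len(values)
    deviations = [v - mean for v in values]
    scale = math.sqrt(sum(d * d for d in deviations))
    if scale == 0:
        # A constant column has no spread; map it to zeros (an assumption).
        return [0.0 for _ in values]
    return [d / scale for d in deviations]


# Hypothetical step lengths in metres (illustrative values, not study data).
step_length = [0.52, 0.48, 0.61, 0.55]
normalized = normalize_column(step_length)
```

After this transformation each column has zero mean and unit sum of squares, so variables measured in different units (metres, seconds, degrees) become comparable.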

Exploratory Analysis

The data exploration included an evaluation of the relationships between gait variables using a correlation matrix; three thresholds (0.35, 0.4, and 0.9) were selected randomly from a range of 0–1. Each threshold was analyzed: correlations higher than 0.9 did not show similarity between the variables related to arms and legs; at 0.4 the features showed some similarity; and the highest similarities between upper and lower extremities were obtained using 0.35 as the correlation threshold. Scatter plots were created using the correlation matrix. Backward and forward feature selection models were applied using R and Python to determine the most important variables for further analysis, especially to perform a partial correlation analysis with different sets of features. The significance level was 5% for all the variables in the backward feature selection process.
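The thresholding step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the Pearson coefficient is assumed, the comparison uses the absolute value of the correlation, and the variable names and values in `table` are hypothetical rather than taken from the study.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlated_pairs(table, threshold):
    """Return (name_a, name_b, r) for every pair of variables whose
    absolute correlation exceeds the threshold."""
    pairs = []
    for a, b in combinations(sorted(table), 2):
        r = pearson(table[a], table[b])
        if abs(r) > threshold:
            pairs.append((a, b, round(r, 3)))
    return pairs


# Hypothetical gait variables (illustrative values, not study data).
table = {
    "left_arm_speed": [1.0, 2.0, 3.0, 4.0, 5.0],
    "left_arm_magnitude": [2.1, 3.9, 6.2, 8.0, 9.8],
    "right_step_length": [0.5, 0.7, 0.4, 0.6, 0.5],
}
retained = correlated_pairs(table, threshold=0.35)
```

Lowering the threshold admits more variable pairs, which is consistent with the observation that the 0.35 threshold exposed the most relationships between upper and lower extremities.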

Machine Learning and Evaluation

Three further analyses were done using variables selected from: Backward models (Dataset A), movement disorders specialist (Dataset B), and all the variables from the dataset (Dataset C).

Dataset A: To find the most important variables, a backward elimination process was run on the full set of variables for all the models used in this research. The resulting variables were: left-arm swing magnitude, arm swing asymmetry (ASA; Zifchock et al., 2008), left swing time, and left step length.

Dataset B: Eight variables were selected (Swing magnitude of both arms, swing time of both legs, step length of both feet, ASA, and global gait speed) by a movement disorder specialist according to their clinical relevance to PD diagnosis and follow-up.

Dataset C: All variables were included in this dataset.

Seven machine learning algorithms were chosen based on the results of previous studies (Urcuqui et al., 2018; Reyes et al., 2019; Alzubaidi et al., 2021). Six of the selected algorithms were trained using R statistical software (logistic regression, decision tree without processing, pre-pruning decision tree, post-pruning decision tree, naive Bayes, and random forest). Using Python, a support vector machine model was trained (see Table 2: machine learning parameters and commands for execution).

TABLE 2

Table 2. Machine learning parameters and commands for execution.

The experiments applied hold-out (a training set, a validation set, and a testing set were made) and K-fold cross-validation to reduce overfitting. The dataset was divided into: 10 records for final validation, 80% for algorithm training, and 20% for testing. The cross-validation used k = 5 iterations to include different sets of information during the training and validation phases. The classification metrics used in the testing phase were accuracy, false-positive ratio, false-negative ratio, and Cohen’s Kappa, the latter to evaluate the model’s performance given the class imbalance of the dependent variable.
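A minimal Python sketch of this partitioning is given below. It assumes that the 10 validation records are set aside first and that the remaining records are then split 80/20; the text does not state the order explicitly. The function names, the random seed, and the record-level (rather than subject-level) split are likewise assumptions of the sketch, not details confirmed by the study.

```python
import random


def hold_out_split(n_records, n_validation=10, train_fraction=0.8, seed=0):
    """Shuffle record indices, set aside the validation records, and split
    the remainder into training and testing indices."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    validation = indices[:n_validation]
    remaining = indices[n_validation:]
    cut = int(len(remaining) * train_fraction)
    return remaining[:cut], remaining[cut:], validation


def k_fold(indices, k=5):
    """Yield (training, held_out) index lists, one pair per fold, so that
    every index is held out exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        held_out = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, held_out


# 554 is the number of records retained after filtering.
train, test, validation = hold_out_split(554)
folds = list(k_fold(train, k=5))
```

Because each subject contributes several records, a split by record can place records from the same subject on both sides of the partition; grouping by subject ID before splitting would avoid that.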

Causal Inference Model

We examined whether there was a causal relationship between the variables. For this task, we used the DoWhy library (Sharma and Kiciman, 2020) and applied the causal inference model (CIM). The causal model was applied to each relevant variable of the selected dataset.

The DoWhy library is a Python library developed by Microsoft with the aim of sparking causal thinking and analysis. Its main idea is to model causal assumptions explicitly and to validate them, testing these assumptions for any estimation method. The library is based on the Structural Causal Model theory proposed by Pearl (1995) and implements a refutation API to simplify the analysis for non-experts in this area (see Supplementary Material for details on the procedure).

Results

We included data from 30 patients with PD, 17 (57%) men, and 30 healthy age-matched controls. Both groups had a median age of 66 years (IQR 59–75). The median duration of the disease was 5 years (IQR 1–7). Hoehn and Yahr stage classification was stage I for 17% of the patients, stage II for 73%, and stage III for the remaining 10%. The mean of MDS-UPDRS part III was 39.06 (±13.74; see Table 3). We retained a dataset with 554 records and 28 variables; we did not exclude outliers, in order to simulate real clinical situations.

TABLE 3

Table 3. Clinical features of the sample.

First, we conducted an exploratory analysis using a correlation matrix to identify the most relevant variables. We reduced data based on the degree of correlation, retaining only variables with a correlation greater than 0.35 (see Figure 2).

FIGURE 2

Figure 2. Correlation matrix using gait variables.

Based on the correlation matrix we obtained several scatter plots (see Figure 3).

FIGURE 3

Figure 3. Scatter plot using arm swing speed comparing controls and Parkinson’s disease patients: green curve represents the regression curve of the left and right arms swing speed in Parkinson’s disease patients. The red curve represents the regression curve of the left and right arms swing speed in healthy subjects.

Variable Selection

Using the Backward feature selection model, the most relevant variables were: (1) swing magnitude of left arm; (2) swing time of left leg; (3) left step length; and (4) arm swing asymmetry (ASA). Based on previous studies and clinical expertise, a dataset was created (B) to perform further analysis with some selected variables: swing magnitude of both arms, swing time of both legs, step length of both feet, arm swing asymmetry (ASA), and global gait speed.

Machine Learning Results

Results from the coefficient of concordance Kappa and accuracy for each model using each set of variables for the test dataset are shown in Table 4.

As can be seen, the Random Forest model is the most accurate for all three variable datasets (Dataset A: 81.8%; Dataset B: 83.6%; Dataset C: 84.5%), followed by the support vector machine for datasets A and B and by the pre-pruning decision tree for dataset C. The false-positive and false-negative rates for each model and each set of variables are shown in Table 4.

TABLE 4

Table 4. Confusion matrix results showing kappa, accuracy, false positive, and false negative rate for each machine learning model using the test dataset.
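For reference, the four quantities reported in Table 4 can be computed from a confusion matrix as in the following Python sketch. It assumes the common definitions (false-positive rate as FP / (FP + TN), false-negative rate as FN / (FN + TP)) and a PD label of 1; the exact definitions used by the R and Python commands in the study are not stated. The labels in `y_true` and `y_pred` are hypothetical.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, false-positive rate, false-negative rate, and Cohen's kappa
    for a binary classification."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    false_negative_rate = fn / (fn + tp) if (fn + tp) else 0.0

    # Cohen's kappa corrects the observed agreement (accuracy) for the
    # agreement expected by chance given the class proportions.
    chance_positive = ((tp + fp) / n) * ((tp + fn) / n)
    chance_negative = ((tn + fn) / n) * ((tn + fp) / n)
    expected = chance_positive + chance_negative
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    return {
        "accuracy": accuracy,
        "false_positive_rate": false_positive_rate,
        "false_negative_rate": false_negative_rate,
        "kappa": kappa,
    }


# Hypothetical labels: 1 = PD, 0 = control (illustrative, not study data).
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

With imbalanced classes, kappa is lower than accuracy because part of the agreement could be obtained by always predicting the majority class.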

To verify the accuracy of the model, we selected 10 random records from the sample (validation records) and compared the classification (patient or control) predicted by the algorithm with the real diagnosis. The accuracy was 90%, with only one false-positive case.

Relationship Between Arms and Legs Variables

Due to its accuracy (83.6%) and simplicity (eight variables), dataset B was chosen to run the CIM. Using this model and the DoWhy library, relationships between leg gait variables and arm swing variables were analyzed (Figure 4).

FIGURE 4

Figure 4. Causal inference model. Dashed arrows show the causal relations identified by the model between leg and arm gait variables and the diagnosis of PD (PD-Classifier). Solid lines show the causal relations identified by the model between arm and leg gait variables.

Causal inference estimator results show that there is a proportional relationship between ASA and the diagnosis of PD (estimator: 1,536). This can be interpreted as follows: every time the classifier goes up 1 unit (the subject is diagnosed with PD), ASA goes up 1,536 units. To verify the robustness of this estimate, three refuter tests were run. When the “random common cause refuter” is applied to this estimator, the result does not vary significantly (1,520); the same happens with the “data subset refuter” (1,531), which implies the result is robust. With the placebo treatment refuter, the result for ASA is 0.00436, which is very close to 0. This also means the estimator is robust (see Table 5). For this reason, based on the DoWhy analysis, ASA is the most representative variable in the causal inference model.

TABLE 5

Table 5. Causal inference model estimators and refuters.
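The logic behind two of these refuter tests can be illustrated with a short, self-contained Python sketch. It does not use DoWhy and is not the procedure run in the study: the effect is estimated as a simple difference in means on synthetic data, and the function names, sample size, effect size of 1.5, and noise level are all assumptions chosen for illustration.

```python
import random
from statistics import mean


def effect_estimate(treatment, outcome):
    """Difference in mean outcome between records with treatment 1 and 0."""
    treated = [y for t, y in zip(treatment, outcome) if t == 1]
    control = [y for t, y in zip(treatment, outcome) if t == 0]
    return mean(treated) - mean(control)


def placebo_refuter(treatment, outcome, repeats=200, seed=0):
    """Re-estimate after shuffling the treatment labels; if the original
    estimate reflects a real relationship, this average should be near 0."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(repeats):
        placebo = treatment[:]
        rng.shuffle(placebo)
        estimates.append(effect_estimate(placebo, outcome))
    return mean(estimates)


def subset_refuter(treatment, outcome, fraction=0.8, repeats=200, seed=0):
    """Re-estimate on random subsets of the records; a robust estimate
    should stay close to the one obtained on the full data."""
    rng = random.Random(seed)
    n = len(treatment)
    size = int(n * fraction)
    estimates = []
    for _ in range(repeats):
        chosen = rng.sample(range(n), size)
        estimates.append(
            effect_estimate([treatment[i] for i in chosen],
                            [outcome[i] for i in chosen])
        )
    return mean(estimates)


# Synthetic data: a binary label shifts the outcome by 1.5 plus Gaussian noise.
rng = random.Random(42)
treatment = [1] * 150 + [0] * 150
outcome = [1.5 * t + rng.gauss(0, 1) for t in treatment]

estimate = effect_estimate(treatment, outcome)
placebo = placebo_refuter(treatment, outcome)
subset = subset_refuter(treatment, outcome)
```

The pattern matches the one reported in Table 5: the subset re-estimate stays close to the original value, while the placebo re-estimate collapses toward zero.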

Discussion

The main objective of this study was to propose a machine learning-based algorithm to distinguish patients with PD from healthy controls, using a portable RGB-D camera (Kinect® eMotion capture system). These results are in line with our attempt to explore other ways to assess gait variables using a low-cost system that can be used during medical consultation in a developing country. According to our previous results, this machine learning-based algorithm will improve data analysis and clinical efforts to analyze disease-relevant information for physicians and patients.

Correlations and Variable Exploration

As expected, there is a strong positive correlation between arm speed and arm swing magnitude, which indicates that some of the normal dynamics of human gait are preserved even in PD patients. In contrast, the correlation of magnitude between both arms is weak and positive; this could be explained by the limb movement asymmetry secondary to the motor symptoms of the disease (increased rigidity and bradykinesia), which predominantly affect only one side of the body in the PD group. This pathological asymmetry between left and right arm swing magnitudes is represented by the ASA coefficient; arm swing asymmetry is one of the earliest clinical manifestations of PD (Mirelman et al., 2016).

Regarding the results of the non-PD group, controls exhibit a similar speed in both upper limbs, which could be related to the normal pattern of gait unaffected by the disease (see Figure 3).

Variable Selection and Dataset Construction

Variables were selected according to different criteria into three datasets. When the backward technique was applied, predominantly left-side variables (arm swing magnitude, step length, and swing time) were selected, which could be related to the prevalence of left-sided motor symptoms in our sample of PD patients (17/30; 57%).

Also, the gait variables selected by the backward feature selection model are related to the clinical changes expected in PD and to the features needed to fulfill diagnostic criteria: PD patients move their arms and legs more slowly (bradykinesia) and stiffly (rigidity) than controls. For this reason, the magnitude of the arm swing, the time of the leg swing, and the step length differ significantly from those of healthy controls.

Furthermore, the selection of both arm and leg variables suggests alterations in the motor pattern of upper and lower limbs. These complex changes in the gait dynamics indicate that objective examination of gait should consider multiple motor variables of each limb. This consideration is consistent with clinical environments where the patient diagnosis and follow-up are based on a full-body examination using the MDS-UPDRS part III (Goetz et al., 2008; Postuma et al., 2015).

Machine Learning Algorithm

Our results show that it is possible to classify patients from controls using different datasets processed by multiple machine learning techniques with different accuracy levels.

Although dataset C had the best performance, dataset B was chosen for having a high accuracy with a low number of variables, which facilitates the data acquisition and processing.

The clinician accuracy for the diagnosis of Parkinson’s disease varies across studies; however, a systematic review showed that the accuracy of clinical diagnosis of PD is 73.8% (67.8%–79.6%) for non-experts, 79.6% (46%–95.1%) for a movement disorder expert at first consultation, and 83.9% (69.7%–92.6%) at follow-up. Also, the accuracy of the UK Parkinson’s Disease Society Brain Bank diagnostic criteria is 82.7% (62.6%–93%; Rizzo et al., 2016), with a high sensitivity (90%) but a low specificity (30%–40%; Marsili et al., 2018). With an accuracy of 83.3%, the selected random forest machine learning algorithm is not far from the clinical reality in ideal settings. The selected variables are closely related to the PD diagnostic criteria because they represent surrogate measures of the slowness of movement (bradykinesia), asymmetry of arm swing, and rigidity.

The Gait Is Intricate: The Causal Inference Estimator

Although much is known about the gait pattern, asymmetry of arm swing (ASA) is a clinical characteristic that has been widely used in the last decade to describe the affected motor central pattern in PD patients (Lewek et al., 2010; Huang et al., 2012; Roggendorf et al., 2012; Mirelman et al., 2016). According to our causal inference estimator, there is a relation between leg variables and the symmetry of the arms which represents a new opportunity in the research of these dynamics, particularly in pathological conditions such as PD.

Finding Differences Between PD Patients and Controls

As seen in the causal inference estimator, there are some unobserved confounders and other variables that could explain some of the changes secondary to PD. As in other neurodegenerative diseases, its complexity and inter-patient variability make it difficult to obtain higher accuracy levels. The challenge for new methods of signal processing and machine learning in clinical research is to help clinicians achieve clinically meaningful technology-based objective measures (TOMs; Espay et al., 2016).

Related Work

Prior work used RGB-D cameras to classify PD patients; the variables selected included stride length, age, gait speed, stance time, step length, distance, cycle time, and swing time. The model with the best accuracy (82%) was Random Forest. That work included a larger number of variables, not all of which are related to the clinical reality, and no further analysis was made (Urcuqui et al., 2018).

As reported in the literature, other studies used RGB-D cameras to classify PD patients, but with other methods. One of them used neural networks and cross-validation with the variables of gait velocity and stride length, reaching an accuracy of 97.2% (Ťupa et al., 2015); another used a Bayesian classification method, with a maximum accuracy of 94.1% using stride length and age (Procházka et al., 2015). Differences could also be due to different preprocessing, filtering, and exploration of the data. However, these other models reported in the literature used only variables from the legs.

Other studies used foot pressure sensors and selected the variables of stride time, stance time, swing time, and foot strike profile to classify the controls from the PD patients with an accuracy of 92.7% (Abdulhay et al., 2018). Similar accuracy (92.6%) was found with a normalized multiple regression and Random forest using stance time, stride length, time of total stance, and cadence with the same type of device (Wahid et al., 2015).

Arm swing analysis has been a point of interest in the study of PD. Previous studies confirm that arm swing magnitude and speed are significantly reduced in PD for both limbs (Jaggy Castaño-Pino et al., 2019). In addition, several studies have been carried out with wearable technology (inertial measurement units (IMU), accelerometers). Arm swing asymmetry (ASA) can also be extracted with accelerometers; it is calculated from the root mean square (RMS) differences between arm movements. The ASA and RMS differ significantly in PD patients. This could be used in future studies (Rincón et al., 2020).

Advantages, Limitations, and Future Work

The Kinect® eMotion system is a portable RGB-D camera that can be used in different scenarios (Figure 1) and does not require a specialized gait laboratory. For that reason, this technology can be used as a complement to telemedicine in places without specialized medicine to support the diagnosis and management of patients with PD. Our findings suggest that in the future these measures and algorithms could be employed to complement the diagnosis of Parkinson’s disease, and the algorithms could be adapted to evaluate disease progression, clinical subtypes, follow-up, and response to treatment, and to correlate with clinical rating scales such as the MDS-UPDRS.

Some limitations of the study were the sample size, which limited the training of the algorithms and the creation of a more accurate and robust model, and the fact that only one dataset was used for training the algorithms, which could also limit the results. Also, no gait speed matching procedure was implemented; because some spatiotemporal gait parameters are speed-dependent, this may have led to overrepresenting some of the gait variables in the backward feature selection model. Furthermore, some machine learning algorithms described in previous studies for classification between PD and healthy controls, such as artificial neural networks (ANN) and K-nearest neighbor (K-NN), were not implemented. The first was implemented in the first stages of the study; however, its results were similar to those of simple statistical methods with no machine learning, and no further analysis was performed. The latter was left out because it works differently from the other algorithms: it has no explicit training phase and builds no model in advance. These limitations will be considered in the development of future studies.

Further studies are needed to explore the use of RGB-D cameras and machine learning algorithms for follow-up and treatment response, and more data are needed to improve the machine learning training, which will make it possible to achieve higher accuracy.

Conclusions

This study shows that machine learning techniques based on objective measures obtained with portable low-cost devices (Kinect® eMotion) are useful for classifying patients with Parkinson’s disease. The proposed method can be used to evaluate patients remotely and help clinicians make decisions regarding follow-up and treatment.

Data Availability Statement

The data analyzed in this study is subject to the following licenses/restrictions: the raw data supporting the conclusions of this article can be made available by the authors on request, prior approval by the institutional ethics committee. Requests to access these datasets should be directed to beatriz.munoz@fvl.org.co.

Ethics Statement

The studies involving human participants were reviewed and approved by Comité de ética en investigación biomédica (CEIB), Fundación Valle Del Lili. The patients/participants provided their written informed consent to participate in this study.

Author Contributions

JO, AN-C, BM-O, and CU participated in the design of the study, interpretation and revision of data. JV-C organized the database. MG-P, CH, and CU contributed with data processing, statistical analysis, machine learning data processing, and interpretation of the work. JV-C, HC-M, and DA-G participated with data analysis and its interpretation as well as the writing of the first draft of the manuscript. AN-C and BM-O wrote sections of the manuscript. All authors contributed to the article and approved the submitted version.

Funding

This study was financed by Minciencias Grant #38-2021.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Acknowledgments

This work was possible thanks to support from Universidad Icesi, Minciencias, and Fundación Valle del Lili.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnhum.2022.826376/full#supplementary-material.

References

Abdulhay, E., Arunkumar, N., Narasimhan, K., Vellaiappan, E., and Venkatraman, V. (2018). Gait and tremor investigation using machine learning techniques for the diagnosis of Parkinson disease. Future Gener. Comput. Syst. 83, 366–373. doi: 10.1016/j.future.2018.02.009


Alzubaidi, M. S., Shah, U., Dhia Zubaydi, H., Dolaat, K., Abd-Alrazaq, A. A., Ahmed, A., et al. (2021). The role of neural network for the detection of Parkinson’s disease: a scoping review. Healthcare (Basel) 9:740. doi: 10.3390/healthcare9060740


Buongiorno, D., Bortone, I., Cascarano, G. D., Trotta, G. F., Brunetti, A., and Bevilacqua, V. (2019). A low-cost vision system based on the analysis of motor features for recognition and severity rating of Parkinson’s disease. BMC Med. Inform. Decis. Mak. 19:243. doi: 10.1186/s12911-019-0987-5


Caramia, C., Torricelli, D., Schmid, M., Munoz-Gonzalez, A., Gonzalez-Vargas, J., Grandas, F., et al. (2018). IMU-based classification of Parkinson’s disease from gait: a sensitivity analysis on sensor location and feature selection. IEEE J. Biomed. Health Inform. 22, 1765–1774. doi: 10.1109/JBHI.2018.2865218


de Lau, L. M., and Breteler, M. M. (2006). Epidemiology of Parkinson’s disease. Lancet Neurol. 5, 525–535. doi: 10.1016/S1474-4422(06)70471-9


Deb, R., Bhat, G., An, S., Shill, H., and Ogras, U. Y. (2021). Trends in technology usage for Parkinson’s disease assessment: a systematic review. medRxiv [Preprint]. doi: 10.1101/2021.02.01.21250939

CrossRef Full Text | Google Scholar

Dorsey, E. R., and Bloem, B. R. (2018). The Parkinson pandemic—a call to action. JAMA Neurol. 75, 9–10. doi: 10.1001/jamaneurol.2017.3299

PubMed Abstract | CrossRef Full Text | Google Scholar

Espay, A. J., Bonato, P., Nahab, F. B., Maetzler, W., Dean, J. M., Klucken, J., et al. (2016). Technology in Parkinson’s disease: challenges and opportunities: technology in PD. Mov. Disord. 31, 1272–1282. doi: 10.1002/mds.26642

PubMed Abstract | CrossRef Full Text | Google Scholar

Goetz, C. G., Tilley, B. C., Shaftman, S. R., Stebbins, G. T., Fahn, S., Martinez-Martin, P., et al. (2008). Movement disorder society-sponsored revision of the unified Parkinson’s disease rating scale (MDS-UPDRS): scale presentation and clinimetric testing results: MDS-UPDRS: clinimetric assessment. Mov. Disord. 23, 2129–2170. doi: 10.1002/mds.22340

PubMed Abstract | CrossRef Full Text | Google Scholar

Huang, X., Mahoney, J. M., Lewis, M. M., Guangwei, D. U., Piazza, S. J., and Cusumano, J. P. (2012). Both coordination and symmetry of arm swing are reduced in Parkinson’s disease. Gait Posture 35, 373–377. doi: 10.1016/j.gaitpost.2011.10.180

PubMed Abstract | CrossRef Full Text | Google Scholar

Jaggy Castaño-Pino, Y., Navarro, A., Muñoz, B., and Luis Orozco, J. (2019). “Using wavelets for gait and arm swing analysis,” in Wavelet Transform and Complexity, ed D. Baleanu (London: IntechOpen), 1–16. doi: 10.5772/intechopen.84962

CrossRef Full Text | Google Scholar

Lewek, M. D., Poole, R., Johnson, J., Halawa, O., and Huang, X. (2010). Arm swing magnitude and asymmetry during gait in the early stages of Parkinson’s disease. Gait Posture 31, 256–260. doi: 10.1016/j.gaitpost.2009.10.013

PubMed Abstract | CrossRef Full Text | Google Scholar

Marsili, L., Rizzo, G., and Colosimo, C. (2018). Diagnostic criteria for Parkinson’s disease: from james parkinson to the concept of prodromal disease. Front. Neurol. 9:156. doi: 10.3389/fneur.2018.00156

PubMed Abstract | CrossRef Full Text | Google Scholar

Mirelman, A., Bernad-Elazari, H., Thaler, A., Giladi-Yacobi, E., Gurevich, T., Gana-Weisz, M., et al. (2016). Arm swing as a potential new prodromal marker of Parkinson’s disease: arm swing as a new prodromal marker of PD. Mov. Disord. 31, 1527–1534. doi: 10.1002/mds.26720

PubMed Abstract | CrossRef Full Text | Google Scholar

Monje, M. H. G., Sánchez-Ferro, Á., Pineda-Pardo, J. A., Vela-Desojo, L., Alonso-Frech, F., and Obeso, J. A. (2021). Motor onset topography and progression in Parkinson’s disease: the upper limb is first. Mov. Disord. 36, 905–915. doi: 10.1002/mds.28462

PubMed Abstract | CrossRef Full Text | Google Scholar

Muñoz Ospina, B., Valderrama Chaparro, J. A., Arango Paredes, J. D., Castaño Pino, Y. J., Navarro, A., and Orozco, J. L. (2019). Age matters: objective gait assessment in early Parkinson’s disease using an RGB-D camera. Parkinsons Dis. 2019:5050182. doi: 10.1155/2019/5050182

PubMed Abstract | CrossRef Full Text | Google Scholar

Ospina, B. M., Chaparro, J. A. V., Paredes, J. D. A., Pino, Y. J. C., Navarro, A., and Orozco, J. L. (2018). Objective arm swing analysis in early-stage Parkinson’s disease using an RGB-D camera (Kinect^®). J. Parkinsons Dis. 8, 563–570. doi: 10.3233/JPD-181401

PubMed Abstract | CrossRef Full Text | Google Scholar

Ozkan, H. (2016). A comparison of classification methods for telediagnosis of Parkinson’s disease. Entropy 18:115. doi: 10.3390/e18040115

CrossRef Full Text | Google Scholar

Pearl, J. (1995). Causal diagrams for empirical research. Biometrika 82, 669–688. doi: 10.2307/2337329

CrossRef Full Text | Google Scholar

Postuma, R. B., Berg, D., Stern, M., Poewe, W., Olanow, C. W., Oertel, W., et al. (2015). MDS clinical diagnostic criteria for Parkinson’s disease: MDS-PD clinical diagnostic criteria. Mov. Disord. 30, 1591–1601. doi: 10.1002/mds.26424

PubMed Abstract | CrossRef Full Text | Google Scholar

Procházka, A., Vyšata, O., Vališ, M., Ťupa, O., Schätz, M., and Mařík, V. (2015). Bayesian classification and analysis of gait disorders using image and depth sensors of Microsoft Kinect. Digit. Signal Process. 47, 169–177. doi: 10.1016/j.dsp.2015.05.011

CrossRef Full Text | Google Scholar

Reyes, J. F., Steven Montealegre, J., Castano, Y. J., Urcuqui, C., and Navarro, A. (2019). “LSTM and convolution networks exploration for Parkinson’s diagnosis,” in 2019 IEEE Colombian Conference on Communications and Computing (COLCOM) (Barranquilla, Colombia: IEEE), 1–4. doi: 10.1109/ColComCon.2019.8809160

CrossRef Full Text | Google Scholar

Rincón, D., Valderrama, J., González, M. C., Muñoz, B., Orozco, J., Montilla, L., et al. (2020). Wristbands containing accelerometers for objective arm swing analysis in patients with Parkinson’s disease. Sensors (Basel) 20:4339. doi: 10.3390/s20154339

PubMed Abstract | CrossRef Full Text | Google Scholar

Rizzo, G., Copetti, M., Arcuti, S., Martino, D., Fontana, A., and Logroscino, G. (2016). Accuracy of clinical diagnosis of Parkinson disease: a systematic review and meta-analysis. Neurology 86, 566–576. doi: 10.1212/WNL.0000000000002350

PubMed Abstract | CrossRef Full Text | Google Scholar

Roggendorf, J., Chen, S., Baudrexel, S., van de Loo, S., Seifried, C., and Hilker, R. (2012). Arm swing asymmetry in Parkinson’s disease measured with ultrasound based motion analysis during treadmill gait. Gait Posture 35, 116–120. doi: 10.1016/j.gaitpost.2011.08.020

PubMed Abstract | CrossRef Full Text | Google Scholar

Schneider, S. A., and Obeso, J. A. (2014). “Clinical and pathological features of Parkinson’s disease,” in Behavioral Neurobiology of Huntington’s Disease and Parkinson’s Disease, Vol. 22, eds H. H. P. Nguyen, and M. A. Cenci (Berlin, Heidelberg: Springer Berlin Heidelberg), 205–220. doi: 10.1007/7854_2014_317

CrossRef Full Text | Google Scholar

Shalash, A., Spindler, M., and Cubo, E. (2021). Global perspective on telemedicine for Parkinson’s disease. J. Parkinsons Dis. 11, S11–S18. doi: 10.3233/JPD-202411

PubMed Abstract | CrossRef Full Text | Google Scholar

Sharma, A., and Kiciman, E. (2020). DoWhy: an end-to-end library for causal inference. ArXiv [Preprint]. doi: 10.48550/arXiv.2011.04216

CrossRef Full Text | Google Scholar

Sherly Puspha Annabel, L., Sreenidhi, S., and Vishali, N. (2021). “A novel diagnosis system for Parkinson’s disease using K-means clustering and decision tree,” in Communication and Intelligent Systems, Vol. 204, eds H. Sharma, M. K. Gupta, G. S. Tomar, and W. Lipo (Singapore: Springer Singapore), 607–615. doi: 10.1007/978-981-16-1089-9_48

CrossRef Full Text | Google Scholar

Sidey-Gibbons, J. A. M., and Sidey-Gibbons, C. J. (2019). Machine learning in medicine: a practical introduction. BMC Med. Res. Methodol. 19:64. doi: 10.1186/s12874-019-0681-4

PubMed Abstract | CrossRef Full Text | Google Scholar

Singh, G., and Samavedham, L. (2015). Unsupervised learning based feature extraction for differential diagnosis of neurodegenerative diseases: a case study on early-stage diagnosis of Parkinson disease. J. Neurosci. Methods 256, 30–40. doi: 10.1016/j.jneumeth.2015.08.011

PubMed Abstract | CrossRef Full Text | Google Scholar

Ťupa, O., Procházka, A., Vyšata, O., Schätz, M., Mareš, J., Vališ, M., et al. (2015). Motion tracking and gait feature estimation for recognising Parkinson’s disease using MS Kinect. Biomed. Eng. Online 14:97. doi: 10.1186/s12938-015-0092-7

PubMed Abstract | CrossRef Full Text | Google Scholar

Tysnes, O.-B., and Storstein, A. (2017). Epidemiology of Parkinson’s disease. J. Neural Transm. (Vienna) 124, 901–905. doi: 10.1007/s00702-017-1686-y

PubMed Abstract | CrossRef Full Text | Google Scholar

Urcuqui, C., Castano, Y., Delgado, J., Navarro, A., Diaz, J., Munoz, B., et al. (2018). “Exploring machine learning to analyze Parkinson’s disease patients,” in 2018 14th International Conference on Semantics, Knowledge and Grids (SKG), (Guangzhou, China: IEEE), 160–166. doi: 10.1109/SKG.2018.00029

CrossRef Full Text | Google Scholar

Varrecchia, T., Castiglia, S. F., Ranavolo, A., Conte, C., Tatarelli, A., Coppola, G., et al. (2021). An artificial neural network approach to detect presence and severity of Parkinson’s disease via gait parameters. PLoS One 16:e0244396. doi: 10.1371/journal.pone.0244396

PubMed Abstract | CrossRef Full Text | Google Scholar

Wahid, F., Begg, R. K., Hass, C. J., Halgamuge, S., and Ackland, D. C. (2015). Classification of Parkinson’s disease gait using spatial-temporal gait features. IEEE J. Biomed. Health Inform. 19, 1794–1802. doi: 10.1109/JBHI.2015.2450232

PubMed Abstract | CrossRef Full Text | Google Scholar

Walesiak, M., and Dudek, A. (2020). “The choice of variable normalization method in cluster analysis,” in Education Excellence and Innovation Management: A 2025 Vision to Sustain Economic Development During Global Challenges (Sevilla: International Business Information Management Association (IBIMA)), 325–340. Available online at: https://rdrr.io/cran/clusterSim/man/data.Normalization.html#heading-5.

Google Scholar

Yoneyama, M., Kurihara, Y., Watanabe, K., and Mitoma, H. (2013). Accelerometry-based gait analysis and its application to Parkinson’s disease assessment—part 2: a new measure for quantifying walking behavior. IEEE Trans. Neural Syst. Rehabil. Eng. 21, 999–1005. doi: 10.1109/TNSRE.2013.2268251

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhang, X., Wang, Y., Zhang, L., Jin, B., and Zhang, H. (2021). Exploring unsupervised multivariate time series representation learning for chronic disease diagnosis. Int. J. Data Sci. Anal. doi: 10.1007/s41060-021-00290-0

CrossRef Full Text | Google Scholar

Zifchock, R. A., Davis, I., Higginson, J., and Royer, T. (2008). The symmetry angle: a novel, robust method of quantifying asymmetry. Gait Posture 27, 622–627. doi: 10.1016/j.gaitpost.2007.08.006

PubMed Abstract | CrossRef Full Text | Google Scholar

Keywords: Parkinson’s disease, gait, biomechanics, Kinect, depth camera, machine learning

Citation: Muñoz-Ospina B, Alvarez-Garcia D, Clavijo-Moran HJC, Valderrama-Chaparro JA, García-Peña M, Herrán CA, Urcuqui CC, Navarro-Cadavid A and Orozco J (2022) Machine Learning Classifiers to Evaluate Data From Gait Analysis With Depth Cameras in Patients With Parkinson’s Disease. Front. Hum. Neurosci. 16:826376. doi: 10.3389/fnhum.2022.826376

Received: 30 November 2021; Accepted: 13 April 2022;
Published: 19 May 2022.

Edited by:

Marco Iosa, Sapienza University of Rome, Italy

Reviewed by:

Stefano Filippo Castiglia, Sapienza University of Rome, Italy
Jean Meunier, Université de Montréal, Canada
Tomoko Arakaki, Hospital Ramos Mejía, Argentina

Copyright © 2022 Muñoz-Ospina, Alvarez-Garcia, Clavijo-Moran, Valderrama-Chaparro, García-Peña, Herrán, Urcuqui, Navarro-Cadavid and Orozco. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Beatriz Muñoz-Ospina, beatriz.munoz@fvl.org.co

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
