Researchers have spent decades searching for biomarker-based methods to distinguish patients with schizophrenia from healthy individuals. However, no single indicator has proved adequate for this purpose in clinical practice. We aimed to develop a comprehensive machine learning pipeline based on combined neurocognitive and electrophysiological features for distinguishing patients with schizophrenia from healthy individuals.
Sixty-nine patients with schizophrenia and 50 healthy controls participated in the present study. Neurocognitive features (covering seven specific cognitive domains) and electrophysiological features [prepulse inhibition, electroencephalography (EEG) power spectrum, detrended fluctuation analysis, and fractal dimension (FD)] were collected. These features were then combined to build schizophrenia identification models using logistic regression, random forest, and extreme gradient boosting algorithms, and the classification performance of each model was evaluated.
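The modelling step described above can be illustrated with a minimal sketch. It is not the study's implementation: the feature values are synthetic placeholders, the number of electrophysiological features (10) is hypothetical, 5-fold stratified cross-validation is an assumed evaluation scheme, and scikit-learn's `GradientBoostingClassifier` stands in for extreme gradient boosting so that the example needs only NumPy and scikit-learn.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Labels: 69 patients (1) and 50 healthy controls (0), matching the sample sizes.
y = np.array([1] * 69 + [0] * 50)

# Synthetic placeholder features (NOT study data). Seven neurocognitive columns
# mirror the seven cognitive domains; the 10 electrophysiological columns are a
# hypothetical count. A group-dependent shift makes the classes separable.
neurocog = rng.normal(size=(119, 7)) - 0.8 * y[:, None]
electro = rng.normal(size=(119, 10)) + 0.5 * y[:, None]

# Combine both feature sets column-wise into one design matrix.
X = np.hstack([neurocog, electro])

models = {
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    # Stand-in for extreme gradient boosting (assumption, see lead-in).
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}

# Assumed evaluation scheme: 5-fold stratified cross-validation.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
    auc = cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
    results[name] = (acc, auc)
    print(f"{name}: accuracy={acc:.3f}, AUC={auc:.3f}")
```

Replacing `X` with `neurocog` or `electro` alone would give the single-feature-set comparison under the same assumptions.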
The neurocognitive and electrophysiological feature sets each classified well, with a highest accuracy above 85% and a highest area under the curve (AUC) above 90%. The combined neurocognitive and electrophysiological feature set performed best, reaching a highest accuracy of 93.28% and an AUC of 97.91%. Overall, the extreme gradient boosting algorithm gave the most stable and accurate classification.
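For readers less familiar with the two reported metrics, the sketch below shows how accuracy and AUC can be computed from labels and model scores. The toy labels and scores are invented for illustration, and the function names are hypothetical. The last line rests on an assumption not stated in the text: if accuracy were computed once over all 119 participants, 93.28% would correspond to 111 correct classifications.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (invented values): 1 = patient, 0 = healthy control.
y_true = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5

toy_accuracy = accuracy(y_true, y_pred)  # 3 of 5 correct
toy_auc = auc(y_true, scores)            # 5 of 6 positive/negative pairs ordered correctly

# Assumption: accuracy taken over all 119 participants at once.
implied_accuracy_percent = round(100 * 111 / 119, 2)
```

Accuracy depends on the decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two figures are reported together.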
The highest classification accuracy of 93.28%, obtained by combining neurocognitive and electrophysiological features, indicates that both types of measurement are appropriate indicators for discriminating patients with schizophrenia from healthy individuals. Among the three algorithms, extreme gradient boosting outperformed logistic regression and random forest.