Real-time evaluations of the severity of depressive symptoms are of great significance for the diagnosis and treatment of patients with major depressive disorder (MDD). In clinical practice, the evaluation approaches are mainly based on psychological scales and doctor-patient interviews, which are time-consuming and labor-intensive. Also, the accuracy of results mainly depends on the subjective judgment of the clinician. With the development of artificial intelligence (AI) technology, more and more machine learning methods are used to diagnose depression by appearance characteristics. Most of the previous research focused on the study of single-modal data; however, in recent years, many studies have shown that multi-modal data has better prediction performance than single-modal data. This study aimed to develop a measurement of depression severity from expression and action features and to assess its validity among the patients with MDD.
We proposed a multi-modal deep convolutional neural network (CNN) to evaluate the severity of depressive symptoms in real-time, which was based on the detection of patients’ facial expression and body movement from videos captured by ordinary cameras. We established behavioral depression degree (BDD) metrics, which combines expression entropy and action entropy to measure the depression severity of MDD patients.
We found that the information extracted from different modes, when integrated in appropriate proportions, can significantly improve the accuracy of the evaluation, which has not been reported in previous studies. This method presented an over 74% Pearson similarity between BDD and self-rating depression scale (SDS), self-rating anxiety scale (SAS), and Hamilton depression scale (HAMD). In addition, we tracked and evaluated the changes of BDD in patients at different stages of a course of treatment and the results obtained were in agreement with the evaluation from the scales.
The BDD can effectively measure the current state of patients’ depression and its changing trend according to the patient’s expression and action features. Our model may provide an automatic auxiliary tool for the diagnosis and treatment of MDD.