Background

AUTHOR=Shin Daun, Kim Kyungdo, Lee Seung-Bo, Lee Changwoo, Bae Ye Seul, Cho Won Ik, Kim Min Ji, Hyung Keun Park C., Chie Eui Kyu, Kim Nam Soo, Ahn Yong Min

TITLE=Detection of Depression and Suicide Risk Based on Text From Clinical Interviews Using Machine Learning: Possibility of a New Objective Diagnostic Marker

JOURNAL=Frontiers in Psychiatry

VOLUME=13

YEAR=2022

URL=https://www.frontiersin.org/journals/psychiatry/articles/10.3389/fpsyt.2022.801301

DOI=10.3389/fpsyt.2022.801301

ISSN=1664-0640

ABSTRACT=<sec><title>Background</title><p>Depression and suicide are critical social problems worldwide, but tools to objectively diagnose them are lacking. Therefore, this study aimed to diagnose depression through machine learning and to determine whether groups at high risk of suicide can be identified from the words spoken by participants in a semi-structured interview.</p></sec><sec><title>Methods</title><p>A total of 83 healthy participants and 83 patients with depression were recruited. All participants were recorded during the Mini-International Neuropsychiatric Interview. Based on the suicide risk assessment items in the interview, participants with depression were classified into high-suicide-risk (31 participants) and low-suicide-risk (52 participants) groups. Only the words uttered by the participant were extracted from each recording and transcribed into text. In addition, all participants were evaluated for depression, anxiety, suicidal ideation, and impulsivity. The chi-square test and Student’s <italic>t</italic>-test were used to compare clinical variables, and the Naive Bayes classifier was used for the machine learning text model.</p></sec><sec><title>Results</title><p>A total of 21,376 words were extracted from all participants. The model for diagnosing depression from this text achieved an area under the curve (AUC) of 0.905, a sensitivity of 0.699, and a specificity of 0.964. In the model that distinguished the two groups using statistically significant demographic variables, the AUC was only 0.761. The DeLong test (<italic>p</italic> = 0.001) confirmed that the text-based classification was superior to the demographic model. When predicting the high-suicide-risk group, the demographics-based AUC was 0.499, while the text-based AUC was 0.632. However, the AUC of the ensemble model that also incorporated demographic variables was 0.800.</p></sec><sec><title>Conclusion</title><p>The possibility of diagnosing depression using interview text was confirmed; regarding suicide risk, diagnostic accuracy increased when demographic variables were incorporated. Therefore, participants’ words during an interview show significant potential as an objective diagnostic marker through machine learning.</p></sec>
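
The reported sensitivity and specificity can be related back to the group sizes with a short worked example. The sketch below is illustrative only: it assumes both rates were computed over all 83 participants in each group and rounded to three decimals, which the abstract does not state. The helper `consistent_counts` and the recovered counts are hypothetical back-calculations, not figures reported by the authors.

```python
def sensitivity(tp, fn):
    """True positive rate: correctly detected patients / all patients."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: correctly cleared controls / all controls."""
    return tn / (tn + fp)


def consistent_counts(rate, n, digits=3):
    """Return every count k in 0..n for which k/n rounds to `rate`.

    Hypothetical helper: it only checks which whole-number counts are
    compatible with a reported, rounded rate for a group of size n.
    """
    return [k for k in range(n + 1) if round(k / n, digits) == rate]


# Assumption: 83 patients with depression and 83 healthy participants,
# with each rate computed over the full group.
patients_detected = consistent_counts(0.699, 83)  # candidate true positives
controls_cleared = consistent_counts(0.964, 83)   # candidate true negatives

print(patients_detected, controls_cleared)
```

Under these assumptions, exactly one count per group matches each reported rate: 58 of 83 patients flagged as depressed and 80 of 83 healthy participants classified as healthy.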