Acoustic based machine learning approaches for depression detection in Chinese university students

Wei, Yange; Qin, Shisen; Liu, Fengyi; Liu, Rongxun; Zhou, Yunze; Yuanle, Chen; Xiong, Xingliang; Zheng, Wei; Ji, Guangjun; Meng, Yong; Wang, Fei; Zhang, Ruiling

doi:10.3389/fpubh.2025.1561332

ORIGINAL RESEARCH article

Front. Public Health

Sec. Public Mental Health

Volume 13 - 2025 | doi: 10.3389/fpubh.2025.1561332

This article is part of the Research Topic: World Mental Health Day: Mental Health in the Workplace

Provisionally accepted

Yange Wei

Shisen Qin

Fengyi Liu

Rongxun Liu

Yunze Zhou

Chen Yuanle

Xingliang Xiong

Wei Zheng

Guangjun Ji

Yong Meng

Fei Wang

Ruiling Zhang^*

Peking University Sixth Hospital, Beijing, China

The final, formatted version of the article will be published soon.

Background: Depression is a major global public health problem among university students. Currently, the evaluation and monitoring of depression depend predominantly on subjective, self-reported methods, and objective means of identifying depression are urgently needed. Acoustic features, which convey emotional information, have the potential to make depression assessment more objective. This study aimed to investigate the feasibility of using acoustic features for the objective, automated identification and characterization of depression among Chinese university students.

Methods: A cross-sectional study was conducted involving 103 students with depression and 103 controls matched for age, gender, and education. Participants' voices were recorded with a smartphone as they read neutral texts. Acoustic analysis and feature extraction were performed using the OpenSMILE toolkit, yielding 523 features encompassing spectral, glottal, and prosodic characteristics. The extracted features were used for discriminant analysis between the depression and control groups. Pearson correlation analyses were conducted to evaluate the relationship between acoustic features and Patient Health Questionnaire-9 (PHQ-9) scores. Five machine learning algorithms were used for classification: Linear Discriminant Analysis (LDA), Logistic Regression, Support Vector Classification, Naive Bayes, and Random Forest. Ten-fold cross-validation was employed for training and testing. Model performance was assessed using the receiver operating characteristic (ROC) curve, area under the curve (AUC), precision, accuracy, recall, and F1 score. The SHapley Additive exPlanations (SHAP) method was used for model interpretation.

Results: In the depression group, 32 acoustic features (25 spectral features, 5 prosodic features, and 2 glottal features) showed significant alterations compared with controls. Further, 27 acoustic features (10 spectral features, 3 prosodic features, and 1 glottal feature) were significantly correlated with depression severity. Among the five machine learning algorithms, the LDA model demonstrated the highest classification performance, with an AUC of 0.771. SHAP analysis suggested that Mel-frequency cepstral coefficient (MFCC) features contributed most to the model's classification efficacy.

Conclusion: The integration of acoustic features and the LDA model demonstrated high accuracy in distinguishing depression among Chinese university students, suggesting its potential utility in rapid, large-scale depression screening. MFCC features may serve as objective and valid markers for the automated identification of depression on Chinese university campuses.
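The evaluation protocol described above (ten-fold cross-validation on 103 + 103 participants, scored by AUC, precision, accuracy, recall, and F1) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code: the function names, the stratified fold assignment, the random seed, and the convention of counting tied scores as 0.5 are assumptions made for illustration, and the abstract does not state whether the folds were stratified.

```python
import random


def stratified_kfold(labels, k=10, seed=0):
    """Return k lists of test indices with near-equal class balance per fold.

    Stratification is an assumption; the abstract only says "ten-fold".
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin into the folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties contribute 0.5 (a common convention, assumed here).
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = depression)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Group sizes from the abstract: 103 students with depression, 103 controls.
labels = [1] * 103 + [0] * 103
folds = stratified_kfold(labels, k=10)
for fold_number, test_idx in enumerate(folds, start=1):
    n_dep = sum(labels[i] for i in test_idx)
    print(f"fold {fold_number}: {len(test_idx)} test samples, {n_dep} with depression")
```

In each round, one fold serves as the test set and the remaining nine are used for training; per-fold scores from a classifier such as LDA would then be passed to `roc_auc` and `classification_metrics` and averaged across the ten rounds.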

Keywords: Depression, Acoustic features, machine learning, Chinese university students, Campuses

Received: 15 Jan 2025; Accepted: 17 Apr 2025.

Copyright: © 2025 Wei, Qin, Liu, Liu, Zhou, Yuanle, Xiong, Zheng, Ji, Meng, Wang and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Ruiling Zhang, Peking University Sixth Hospital, Beijing, China
