ORIGINAL RESEARCH article

Front. Psychiatry
Sec. Computational Psychiatry
Volume 15 - 2024 | doi: 10.3389/fpsyt.2024.1437569
This article is part of the Research Topic: Machine Learning and Statistical Models: Unraveling Patterns and Enhancing Understanding of Mental Disorders

Predicting Neuroticism with Open-Ended Response Using Natural Language Processing

Provisionally accepted
  • 1 School of Psychology, Korea University, Seoul, Republic of Korea
  • 2 Department of Software, Sejong University, Seoul, Seoul, Republic of Korea
  • 3 KU Mind Health Institute, Seoul, Republic of Korea

The final, formatted version of the article will be published soon.

    With the rapid advancement of natural language processing, predicting personality with this technology has recently attracted considerable research interest. Neuroticism has been identified as one of the core psychological traits that predict psychological distress in various contexts. In this study, verbal responses to a series of open-ended questions, developed based on the five-factor model of personality, were used to predict individual levels of neuroticism. Previous personality prediction studies that relied on pre-existing language data have rarely examined the importance of content. However, identifying appropriate questions for language-based personality assessment (LPA) is particularly important because questions determine the context of the elicited responses. This study examined the model's accuracy and the influence of item content in predicting neuroticism.

    A total of 425 Korean adults were recruited through a consecutive sampling method and provided their consent. Psychological assessment batteries were administered, including a measure of the Five-Factor Model traits, and verbal answers to 18 open-ended questions about participants' personalities were collected through an interview. In total, 30,576 Korean sentences were collected. To develop the prediction models, we employed the pre-trained language model KoBERT.

    We identified questions that effectively predicted participants' neuroticism based on their responses. Prediction models that were theoretically aligned with neuroticism exhibited greater predictive performance.

    Computational personality research benefits from computational science norms, especially in acquiring large datasets for machine-learning approaches. To advance computational personality science and provide meaningful psychological insights, these limitations must be addressed from a psychological perspective (8,9,20).
Limitations may include, but are not limited to, the validity of personality labels, practicability, the context of language use, content validity, and the use of readily accessible data without hypotheses (9,17). One of the persistent issues in the field of computational science applied to psychological and personality assessments is the difficulty in understanding what

    Keywords: Personality prediction, Natural language processing, Language analysis, Neuroticism, Open-ended questions, Computational personality assessment

    Received: 23 May 2024; Accepted: 17 Jul 2024.

    Copyright: © 2024 Yoon, Jang, Son, Park, Hwang, Choeh and Choi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence:
    Joon Yeon Choeh, Department of Software, Sejong University, Seoul, 143-747, Seoul, Republic of Korea
    Kee-Hong Choi, School of Psychology, Korea University, Seoul, Republic of Korea

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.