ORIGINAL RESEARCH article

Front. Soc. Psychol.
Sec. Computational Social Psychology
Volume 3 - 2025 | doi: 10.3389/frsps.2025.1460277
This article is part of the Research Topic Computational Social Psychology

Validating the use of large language models for psychological text classification

Provisionally accepted
  • London School of Economics and Political Science, London, United Kingdom

The final, formatted version of the article will be published soon.

Large language models (LLMs) are being used to classify texts into categories informed by psychological theory ('psychological text classification'). However, the use of LLMs in psychological text classification requires validation, and it remains unclear exactly how psychologists should prompt and validate LLMs for this purpose. To address this gap, we examined the potential of using LLMs for psychological text classification, focussing on ways to ensure validity. We employed OpenAI's GPT-4o to classify (1) reported speech in online diaries, (2) other-initiations of conversational repair in Reddit dialogues, and (3) harm reported in healthcare complaints submitted to NHS hospitals and trusts. Employing a two-stage methodology, we developed and tested the validity of the prompts used to instruct GPT-4o using manually labelled data (N = 1,500 for each task). First, we iteratively developed three types of prompts using one-third of each manually coded dataset, examining their semantic validity, exploratory predictive validity, and content validity. Second, we performed a confirmatory predictive validity test on the final prompts using the remaining two-thirds of each dataset. Our findings contribute to the literature by demonstrating that LLMs can serve as valid coders of psychological phenomena in text, on the condition that researchers work with the LLM to secure semantic, predictive, and content validity. They also demonstrate the potential of using LLMs in rapid and cost-effective iterations over big qualitative datasets, enabling psychologists to explore and iteratively refine their concepts and operationalisations during manual coding and classifier development. Accordingly, as a secondary contribution, we demonstrate that LLMs allow for an intellectual partnership with the researcher, defined by a synergistic and recursive text classification process in which the LLM's generative nature facilitates validity checks. We argue that using LLMs for psychological text classification may signify a paradigm shift towards a novel, iterative approach that could improve the validity of psychological concepts and operationalisations.
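The two-stage validation workflow described above can be sketched in code. Everything in this sketch is illustrative rather than the authors' implementation: `classify_with_llm` is a hypothetical stand-in for a prompted GPT-4o API call, the labelled examples are invented, and the focus is on the development/confirmatory split and the agreement statistics (accuracy and Cohen's kappa) used for predictive-validity checks.

```python
import random

def classify_with_llm(text: str) -> str:
    """Hypothetical stand-in for a prompted GPT-4o call that returns a
    category label for one text (here, 'harm' vs 'no harm')."""
    # Illustrative keyword heuristic only; a real implementation would
    # send the developed prompt plus the text to the model's API and
    # parse the returned label.
    return "harm" if "harm" in text.lower() else "no harm"

def cohen_kappa(gold, pred):
    """Chance-corrected agreement between manual and LLM labels."""
    n = len(gold)
    labels = sorted(set(gold) | set(pred))
    observed = sum(g == p for g, p in zip(gold, pred)) / n
    expected = sum((gold.count(l) / n) * (pred.count(l) / n) for l in labels)
    return (observed - expected) / (1 - expected) if expected < 1 else 1.0

# Manually labelled examples (text, gold label) -- invented data.
dataset = [
    ("The patient came to harm after the delay.", "harm"),
    ("Staff were polite and helpful.", "no harm"),
    ("Serious harm was reported by the family.", "harm"),
    ("The appointment was rescheduled smoothly.", "no harm"),
] * 6  # pretend we have 24 labelled items

random.seed(0)
random.shuffle(dataset)

# Stage 1: develop and refine prompts on one-third of the labelled data.
split = len(dataset) // 3
development, confirmatory = dataset[:split], dataset[split:]

# Stage 2: confirmatory predictive-validity test of the final prompt
# on the held-out two-thirds.
gold = [label for _, label in confirmatory]
pred = [classify_with_llm(text) for text, _ in confirmatory]
accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
kappa = cohen_kappa(gold, pred)
print(f"accuracy={accuracy:.2f} kappa={kappa:.2f}")
```

Because the stub classifier agrees with the invented labels by construction, this sketch reports perfect agreement; with a real model, the confirmatory-stage accuracy and kappa against the held-out manual codes are what establish predictive validity.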

Keywords: large language models, LLM, GPT, psychology, text classification, validity, big qualitative data, artificial intelligence

    Received: 05 Jul 2024; Accepted: 27 Jan 2025.

    Copyright: © 2025 Bunt, Goddard, Reader and Gillespie. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Hannah Laura Bunt, London School of Economics and Political Science, London, United Kingdom

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.