AUTHOR=Neo Jin Rui Edmund , Ser Joon Sin , Tay San San TITLE=Use of large language model-based chatbots in managing the rehabilitation concerns and education needs of outpatient stroke survivors and caregivers JOURNAL=Frontiers in Digital Health VOLUME=6 YEAR=2024 URL=https://www.frontiersin.org/journals/digital-health/articles/10.3389/fdgth.2024.1395501 DOI=10.3389/fdgth.2024.1395501 ISSN=2673-253X ABSTRACT=Background

The utility of large language model-based (LLM) artificial intelligence (AI) chatbots in many aspects of healthcare is becoming apparent though their ability to address patient concerns remains unknown. We sought to evaluate the performance of two well-known, freely-accessible chatbots, ChatGPT and Google Bard, in responding to common questions about stroke rehabilitation posed by patients and their caregivers.

Methods

We collected questions from outpatients and their caregivers through a survey, categorised them by theme, and created representative questions to be posed to both chatbots. We then evaluated the chatbots' responses based on accuracy, safety, relevance, and readability. Interrater agreement was also tracked.

Results

Although both chatbots achieved similar overall scores, Google Bard performed slightly better in relevance and safety. Both provided readable responses with some general accuracy, but struggled with hallucinated responses, were often not specific, and lacked awareness of the possibility for emotional situations with the potential to turn dangerous. Additionally, interrater agreement was low, highlighting the variability in physician acceptance of their responses.

Conclusions

AI chatbots show potential in patient-facing support roles, but issues remain regarding safety, accuracy, and relevance. Future chatbots should address these problems to ensure that they can reliably and independently manage the concerns and questions of stroke patients and their caregivers.