The final, formatted version of the article will be published soon.
BRIEF RESEARCH REPORT article
Front. Public Health
Sec. Digital Public Health
Volume 12 - 2024
doi: 10.3389/fpubh.2024.1491087
This article is part of the Research Topic Extracting Insights from Digital Public Health Data using Artificial Intelligence, Volume III
Piecing Together the Narrative of #LongCOVID: An Unsupervised Deep Learning of 1,354,889 X (formerly Twitter) Posts from 2020 to 2023
Provisionally accepted
- 1 Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Singapore
- 2 Department of Infectious Diseases, Singapore General Hospital, Singapore, Singapore
- 3 Department of General Medicine, Tan Tock Seng Hospital, Singapore, Singapore
- 4 Changi General Hospital, Singapore, Singapore
Objective: To characterize public conversations around long COVID, as expressed through X (formerly Twitter) posts from May 2020 to April 2023.
Methods: Using X as the data source, we extracted tweets containing #long-covid, #long_covid, or "long covid" that were posted from May 2020 to April 2023. We then conducted an unsupervised deep learning analysis using Bidirectional Encoder Representations from Transformers (BERT), which allowed us to process and analyze large-scale textual data, focusing on tweets from individual users. We applied BERT-based topic modeling, followed by reflexive thematic analysis, to group the tweets into coherent themes and interpret the overarching narratives within the long COVID discourse. In contrast to prior studies, the constructs framing our analyses were data driven as well as informed by the tenets of social constructivism.
Results: Out of an initial dataset of 2,905,906 tweets, 1,354,889 unique, English-language tweets from individual users were included in the final dataset for analysis. Three main themes were generated: (1) general discussions of long COVID, (2) skepticism about long COVID, and (3) adverse effects of long COVID on individuals. These themes covered public awareness, community support, misinformation, and personal experiences with long COVID. The analysis also revealed a stable temporal trend in long COVID discussions from 2020 to 2023, indicating sustained public interest in the topic.
Conclusion: Social media, specifically X, helped shape public awareness and perception of long COVID, and the posts demonstrate a collective effort in community building and information sharing.
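The extraction step described in the Methods (keeping posts that contain #long-covid, #long_covid, or "long covid", then retaining only unique posts) can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not say how duplicates were identified or how language and account type were filtered, so the case-insensitive match, the whitespace normalization, and the names `LONG_COVID_PATTERN`, `normalize`, and `filter_unique_tweets` are all assumptions made for illustration.

```python
import re

# Assumed pattern covering the three search terms named in the Methods.
# Case-insensitive matching is an assumption, not stated in the abstract.
LONG_COVID_PATTERN = re.compile(r"#long-covid|#long_covid|long covid", re.IGNORECASE)


def normalize(text):
    """Lowercase and collapse whitespace so trivially different copies compare equal."""
    return " ".join(text.lower().split())


def filter_unique_tweets(tweets):
    """Keep tweets that mention a search term, dropping duplicates after normalization."""
    seen = set()
    kept = []
    for text in tweets:
        if not LONG_COVID_PATTERN.search(text):
            continue  # post does not mention any of the search terms
        key = normalize(text)
        if key in seen:
            continue  # duplicate of an earlier post (hypothetical duplicate rule)
        seen.add(key)
        kept.append(text)
    return kept


# Made-up sample posts, for illustration only.
sample = [
    "Living with long covid for a year now",
    "living with  LONG COVID for a year now",
    "New study on #long_covid fatigue",
    "Unrelated post about the weather",
]
unique = filter_unique_tweets(sample)
```

In a workflow like the one the abstract outlines, the filtered texts would then be passed to a BERT-based topic model; that step is omitted here because it depends on an external library and a large corpus.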
Keywords: Long COVID, X, Twitter, Social Media, BERTopic, Topic modelling, machine learning
Received: 04 Sep 2024; Accepted: 14 Nov 2024.
Copyright: © 2024 Ng, Wee, Lim, Ong, Ong, Venkatachalam and Liew. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Qin Xiang Ng, Saw Swee Hock School of Public Health, National University of Singapore, Singapore, 117549, Singapore
Tau Ming Liew, Saw Swee Hock School of Public Health, National University of Singapore, Singapore, 117549, Singapore
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.