ORIGINAL RESEARCH article

Front. Res. Metr. Anal., 03 November 2023
Sec. Emerging Technologies and Transformative Paradigms in Research
This article is part of the Research Topic Text Mining-Based Mental Health Research

Identifying mental health discussion topic in social media community: subreddit of bipolar disorder analysis

  • 1Department of Information Science, Chiang Mai Rajabhat University, Chiang Mai, Thailand
  • 2School of Management, Shenzhen Polytechnic, Shenzhen, Guangdong, China
  • 3Department of Library and Information Science, Yonsei University, Seoul, Republic of Korea

Online platforms allow individuals to connect with others, share experiences, and find communities with similar interests, providing a sense of belonging and reducing feelings of isolation. Numerous previous studies examined the content of online health communities to gain insights into the sentiments surrounding mental health conditions. However, there is a noticeable gap in the research landscape, as no study has specifically concentrated on conducting an in-depth analysis or providing a comprehensive visualization of Bipolar disorder. Therefore, this study aimed to address this gap by examining the Bipolar subreddit online community, where we collected 1,460,447 posts as plain text documents for analysis. By employing LDA topic modeling and sentiment analysis, we found that the Bipolar disorder online community on Reddit discussed various aspects of the condition, including symptoms, mood swings, diagnosis, and medication. Users shared personal experiences, challenges, and coping strategies, seeking support and connection. Discussions related to therapy and medication were prevalent, emphasizing the importance of finding suitable therapists and managing medication side effects. The online community serves as a platform for seeking help, advice, and information, highlighting the role of social support in managing bipolar disorder. This study enhances our understanding of individuals living with bipolar disorder and provides valuable insights and feedback for researchers developing mental health interventions.

Introduction

With the rapid increase in the number of social media users, social media has become a powerful tool for sharing medical information, gathering user feedback, and establishing support networks (Moorhead et al., 2013). Individuals with severe mental health problems can challenge the stigma associated with their conditions through individual empowerment and by providing hope to others through online communities. They benefit greatly from interacting with peers online, sharing personal experiences, and strategies for living with mental illness (Naslund et al., 2016). Furthermore, social media communities provide researchers with a unique opportunity to learn about patients' health experiences, treatment preferences, and potentially discover new knowledge in the field of health science. Interactions and shared information on social media offer valuable data that can shed light on the impact of drugs, diseases, and medical treatments on patients outside controlled settings.

Previous analyses compiled information on the utilization of social media for various health-related purposes, including health interventions, health promotion initiatives, medical education, and monitoring disease outbreaks (Zhao and Zhang, 2017; Marar et al., 2019; Chen and Wang, 2021). Major social media platforms like Twitter, Facebook, and Reddit have become significant sources of online health information, including mental health-related content (Zhang et al., 2013; Pershad et al., 2018; Record et al., 2018; Foufi et al., 2019). Moreover, these platforms often provide users with natural means of self-expression, including discussions about their behavior, thoughts, and emotions, which can be indicative of their emotional wellbeing (Conway and O'Connor, 2016). This abundance of user-generated content also serves as a valuable data source for researchers. Studies in the field of data mining have employed machine learning and statistical methods to analyze online messages, focusing on mood, psycholinguistic processes, and content topics extracted from posts (Nguyen et al., 2014). Patients are increasingly turning to social media platforms to search for treatments, discuss their experiences with side effects and treatments with healthcare professionals, participate in health-related online forums, and gain further knowledge about their illnesses (Monnier et al., 2002). User-generated content from social media has proven to be a rich resource for health investigations (O'Neill et al., 2014). Users' posts reflect their thoughts and feelings about their medical experiences and often attract the attention of other patients, caregivers, and medical professionals. User-generated content on the web has become a novel source of clinical data for research purposes (Denecke, 2015).

In the era of Big Data and evidence-based research, social media data offers researchers vast opportunities to gather healthcare information, including insights into patients' health behavior and medical experiences. By leveraging interdisciplinary approaches spanning social sciences, information and computer science, and applied statistics, large datasets from social media content have been integrated into biomedical and psychological research endeavors. Collaborative efforts in Big Data science can unveil and explain patterns in psychology, human behavior, cognition, and their impact on sociocultural systems over time, ultimately generating valuable data, such as medical information (Harlow and Oswald, 2016).

Previous studies utilized computer software and social media content analysis techniques to achieve various outcomes. For example, researchers collected depression and schizophrenia-related hashtags on Twitter to identify stigmatizing attitudes and personal experiences (Reavley and Pilkington, 2014), and analyzed online community posts related to mental health issues to understand the characteristics of online depression communities in terms of language styles, topics, and sentiments (Nguyen et al., 2014). Additionally, psycholinguistics was employed as a feature for text analysis to detect emotions in online communication, recognize psychological status, and predict mental conditions. For instance, a study focused on depression analysis on Facebook using machine learning and a set of psycholinguistic features to investigate the effects of depression and provide solutions for mental health problems (Islam et al., 2018). Furthermore, researchers detected posts containing words or phrases consistent with suicidal ideation on Twitter to observe the expression of suicidality based on linguistic patterns (O'Dea et al., 2015) and utilized linguistic metadata features, including specific words, parts of speech (POS), and lexicon-based elements from the Linguistic Inquiry and Word Count (LIWC), to detect depression in Reddit posts (Trotzek et al., 2018).

Similarly, social networks have provided insights into various aspects of bipolar disorder (BD), such as functioning and social skills. For example, linguistic and phonological features were applied to detect the early stages of BD from Twitter using a supervised machine-learning approach (Huang et al., 2017). Pathway analysis was employed to identify predictors of suicide ideation in older adults with BD, using socio-demographically targeted and social media advertising within Facebook and newsfeeds (O'Rourke et al., 2017). Furthermore, language impairment in Reddit was analyzed to study online mental health communities focusing on depression, BD, and schizophrenia, examining reports of positive emotion, exercising, and weight management (Park and Conway, 2018).

Numerous previous studies have examined the content of online health communities to gain insights into the sentiments surrounding mental health conditions. However, there is a noticeable gap in the research landscape, as no study has specifically concentrated on conducting an in-depth analysis or providing a comprehensive visualization of bipolar disorder. Several studies have demonstrated the association between language usage and mental disorders. Consequently, many researchers have concentrated on the linguistic analysis of social media and developed tools to detect language patterns that support clinical mental health care, such as systems for emotion detection from user-generated text. However, research focusing on content and sentiment analysis remains scarce.

Therefore, this study aims to explore the topics and sentiments related to mental health, specifically bipolar disorder (BD), by examining social media texts, in order to answer the following research questions:

RQ1. What are the main topics and discussions related to bipolar disorder within the Bipolar subreddit?

RQ2. What sentiments are expressed by users in the Bipolar subreddit when discussing their experiences with bipolar disorder?

RQ3. How does the Bipolar subreddit serve as a platform for seeking help, advice, and information related to bipolar disorder?

RQ4. What role does social support play in managing bipolar disorder within the Bipolar subreddit?

The findings of this study can contribute to a better understanding of individuals living with BD and offer valuable insights and feedback for researchers developing health interventions. Social media analytics can be used to track the impact and reach of health interventions in real time, allowing for the refinement and adaptation of interventions to maximize their effectiveness. Furthermore, the research framework of this study can be applied to the study of other health conditions as well.

Materials and methods

This study aimed to examine social media discussions within Bipolar subreddit communities. The experiment utilized topic modeling and sentiment analysis to identify hidden topics and observe polarity in social media data, respectively.

Dataset

The data used for this study were collected from the Reddit community (www.reddit.com). Two subreddits related to bipolar disorder, https://www.reddit.com/r/bipolar and https://www.reddit.com/r/bipolar2, were selected for analysis. The subreddit “r/bipolar”, dedicated to bipolar-related issues, has over 85.5 k members, while the subreddit “r/bipolar2” has 13.5 k members; it is intended for people who live with bipolar disorder type 2 and also encompasses the entire bipolar spectrum.

The posts and comments were collected between November 26th, 2018, and December 28th, 2018. During this period, a total of 1,460,447 posts were collected as plain text documents by scraping Reddit using the Python Reddit API Wrapper (PRAW). Posts consist of the initial textual statements that initiate communication with other users, while comments are replies to these posts, organized in a tree-like structure (Gkotsis et al., 2016), as shown in Figure 1.

Figure 1. Reddit post and comment.
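
The tree-like organization of posts and comments can be illustrated with a minimal sketch that flattens one thread into separate plain-text documents. The `Post` and `Comment` classes, the function names, and the example thread are all hypothetical and invented for illustration; they are not the structures returned by PRAW, and the study's actual collection code is not described in the article.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Comment:
    """A reply; `replies` holds nested replies (hypothetical structure)."""
    body: str
    replies: List["Comment"] = field(default_factory=list)


@dataclass
class Post:
    """A submission that opens a thread (hypothetical structure)."""
    title: str
    body: str
    comments: List[Comment] = field(default_factory=list)


def iter_comments(comments: List[Comment]) -> Iterator[Comment]:
    """Depth-first walk over a comment tree."""
    for comment in comments:
        yield comment
        yield from iter_comments(comment.replies)


def thread_to_documents(post: Post) -> List[str]:
    """Return the post text followed by each comment as a separate document."""
    docs = [f"{post.title} {post.body}".strip()]
    docs.extend(c.body for c in iter_comments(post.comments))
    return docs


# Made-up thread, used only to exercise the functions above.
example = Post(
    title="Newly diagnosed",
    body="Any advice?",
    comments=[
        Comment("Welcome!", replies=[Comment("Thank you.")]),
        Comment("Talk to your doctor."),
    ],
)
documents = thread_to_documents(example)
```

A depth-first walk keeps each reply next to the comment it answers, which is one reasonable choice when the thread order matters; a bag-of-words analysis would not depend on it.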

Reddit is one of the largest and most active online communities, where users can engage in discussions and share their experiences on a wide range of topics, including mental health. It boasts a large community of members, and many of them have an extensive history of previous submissions. The platform also hosts substantial content related to various diseases and medical conditions. The language of Reddit text posts is more structured than that on other major social media platforms, such as Twitter. Reddit provides a unique environment for studying online communities due to its structure of subreddits, which are individual, topic-specific communities within the larger platform. While there may be multiple subreddits related to bipolar disorder on Reddit, we selected the “r/bipolar” and “r/bipolar2” subreddits based on factors such as their popularity, activity level, and relevance to the research objectives. Both subreddits are dedicated to in-depth discussions specifically related to bipolar disorder, making them suitable and focused communities in which to investigate the topics and sentiments related to this mental health condition. Furthermore, Reddit's terms and conditions allow the use of its content for research purposes, which constitutes a major advantage for researchers. The Reddit forum also tends to be distinct from similar offline groups: users are more likely to discuss problems that they do not feel comfortable talking about in face-to-face interactions (Johnson and Ambrose, 2006).

The data used in this study was collected from Reddit, a social media platform, in accordance with the platform's terms of service and community guidelines. The data collection process strictly adhered to the ethical standards outlined by Reddit, ensuring respect for user privacy, anonymity, and the responsible use of the platform's content. Consequently, we confirm that the collection, analysis, and reuse of social media data were conducted in strict accordance with Reddit's policies and terms of use, as well as all relevant institutional regulations.

Data processing

Pre-processing

Before analysis, the dataset was pre-processed to ensure data quality. This process included sentence segmentation, which involved splitting the text into sentences based on full-stop delimiters using the OpenNLP Sentence Detector (Apache OpenNLP Development Community, 2011). Additionally, stop word removal eliminated words that do not convey meaningful information, such as “a,” “is,” and “are.” Stanford CoreNLP (Manning et al., 2014) was utilized for part-of-speech (POS) tagging to identify the word types (noun, verb, adverb, and adjective) in the dataset based on tokens and context. This technique proved valuable for examining complex words and accurately determining the meaning of words within sentences. Finally, lemmatization was employed to obtain the lemma of each term. This step reduces inflectional and derivationally related forms of words, returning them to their standard form (Manning et al., 2009; Song et al., 2015).
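
As a rough illustration of the first two steps, the sketch below segments text into sentences and removes stop words using only the Python standard library. It is a simplified stand-in and not the OpenNLP or Stanford CoreNLP pipeline used in the study: the regular expressions, the tiny stop-word list, and the function names are assumptions, the splitter also treats "!" and "?" as sentence boundaries, and POS tagging and lemmatization are omitted because they require a trained model.

```python
import re
from typing import List

# Tiny illustrative stop-word list (an assumption); a real list would be much longer.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "were", "to", "of", "and", "i", "my"}


def split_sentences(text: str) -> List[str]:
    """Split after '.', '!' or '?' when followed by whitespace (crude sentence detector)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence: str) -> List[str]:
    """Lowercase the sentence and keep alphabetic tokens (apostrophes allowed)."""
    return re.findall(r"[a-z']+", sentence.lower())


def remove_stop_words(tokens: List[str]) -> List[str]:
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def preprocess(text: str) -> List[List[str]]:
    """Return one list of content tokens per sentence."""
    return [remove_stop_words(tokenize(s)) for s in split_sentences(text)]


# Made-up input, used only to exercise the functions above.
cleaned = preprocess("My mood is a mess. The doctor changed my medication!")
# cleaned == [["mood", "mess"], ["doctor", "changed", "medication"]]
```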

Topic analysis

The topic modeling approach was employed for content analysis to summarize the topics of bipolar disorder (BD) content on Reddit. Topic modeling relies on a probabilistic generative model based on the assumption that each document contains a mixture of topics, with each topic characterized by the words that have the highest probability under it. This statistical model helps identify abstract topics that occur in collections of documents. A topic represents a cluster of words that frequently appear together. In this study, we utilized Latent Dirichlet Allocation (LDA) for topic analysis, an unsupervised model that automatically clusters words into topics and associates documents with those topics. The underlying assumption is that a text document has a probability distribution over a mixture of “topics,” where each topic is associated with a distribution over words, and each word is drawn from the mixture (Blei et al., 2003).

LDA has been a well-established and widely used technique in natural language processing for many years. Consequently, numerous open-source implementations, libraries, and resources are available for LDA, which makes it relatively easy to integrate into existing workflows. LDA is also a relatively lightweight algorithm that can be trained efficiently on smaller datasets, so it may be a more practical choice for tasks with limited training time and computational resources. In contrast, BERTopic, being a newer approach, has fewer implementations and resources available, which could affect its adoption in certain contexts. BERTopic is based on BERT, a large-scale language model that demands substantial computational resources and training time. Compared to BERTopic, LDA provides a more interpretable representation of topics. LDA models assign probabilities to words in each topic, allowing users to understand and label the topics based on the most representative words. This level of interpretability is crucial in domains where human understanding and explainability are important, such as social sciences and content analysis.

It is important to note that both LDA and the summarization algorithms assume the documents to be a “bag of words” and do not take grammar into account. This purely statistical approach relies on the meanings of documents being conveyed through words (Arora and Ravindran, 2008). Regarding the LDA topic model (shown in Figure 2), the process of generating topics for a document is as follows:

• The Dirichlet distribution with parameter α generates the topic distribution θ_d for document d.

• The multinomial distribution θ_d assigns the topic z_{d,n} to word n of document d.

• The Dirichlet distribution with parameter β_k generates the word distribution φ_{z_{d,n}} of the assigned topic.

• The multinomial distribution φ_{z_{d,n}} generates the word w_{d,n}.

Figure 2. LDA topic model.

In Figure 2, N represents the number of observed words in a document, K represents the total number of topics, and D represents the total number of documents.
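
The generative steps listed above can be sketched in a few lines of Python. This is only an illustration of the sampling process under stated assumptions, not the inference procedure used to fit the model in this study: the vocabulary, the hyperparameter values, and the function names are hypothetical, and the Dirichlet draw is implemented by normalizing Gamma samples from the standard library.

```python
import random
from typing import List, Sequence, Tuple


def sample_dirichlet(alpha: Sequence[float], rng: random.Random) -> List[float]:
    """Draw from a Dirichlet by normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_index(probs: Sequence[float], rng: random.Random) -> int:
    """Draw one index from a categorical (single-trial multinomial) distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_document(
    n_words: int,
    alpha: Sequence[float],
    phi: Sequence[Sequence[float]],
    vocab: Sequence[str],
    rng: random.Random,
) -> Tuple[List[str], List[int]]:
    """Generate one document and the topic assignment of each of its words."""
    theta_d = sample_dirichlet(alpha, rng)      # topic distribution of document d
    words, topics = [], []
    for _ in range(n_words):
        z = sample_index(theta_d, rng)          # topic z_{d,n} for word n
        w = sample_index(phi[z], rng)           # word w_{d,n} drawn from phi_{z_{d,n}}
        topics.append(z)
        words.append(vocab[w])
    return words, topics


rng = random.Random(0)                          # fixed seed for repeatability
vocab = ["mood", "episode", "lithium", "dose", "friend", "family"]  # toy vocabulary
K = 3                                           # number of topics (toy value)
alpha = [0.5] * K                               # assumed document-topic prior
beta = [0.5] * len(vocab)                       # assumed topic-word prior
phi = [sample_dirichlet(beta, rng) for _ in range(K)]  # one word distribution per topic
words, topics = generate_document(8, alpha, phi, vocab, rng)
```

Fitting LDA reverses this process: given only the observed words, it estimates the topic distributions θ_d and the word distributions φ_k that most plausibly generated them.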

Subsequently, LDA was employed for topic modeling to uncover hidden topics and terms in Reddit posts. This process analyzes the extensive amount of social media text to determine the connections between topics and summarizes the dataset on a scale that is impossible to achieve through manual annotation. In this study, we configured the algorithm to identify 30 topics per dataset, with each topic displaying its top 30 words.

The LDA algorithm was employed to perform clustering, aiming to identify coherent topics within the dataset. A higher coherence score indicates better clustering results, so coherence served as a metric for determining the optimal number of clusters. The coherence scores were calculated using the scikit-learn metric package. We conducted experiments by varying the number of topics (K) from 5 to 50 in intervals of 5. Among these values, the coherence score was highest when K was set to 10. However, manual inspection of the data showed that the results were most distinct and interpretable when K was set to 30. Smaller K-values, such as 5 and 10, rendered the characteristics of the topics unclear and incomplete. Conversely, the largest K-value (50) resulted in an overlap of topic words across most topics. Consequently, we determined the optimal number of topics to be 30 for the overview topics and 20 for the sentiment topics.
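
The article does not state which coherence measure was computed. As a minimal sketch, assuming a UMass-style measure based on document co-occurrence counts, the selection of K by highest mean coherence could look as follows. The function names, the toy documents, and the `topics_by_k` mapping (standing in for the top words of models fitted with different K) are hypothetical; model fitting itself is not shown.

```python
import math
from typing import Dict, List, Sequence, Set


def umass_coherence(top_words: Sequence[str], docs: Sequence[Set[str]]) -> float:
    """Sum of log((D(w_m, w_l) + 1) / D(w_l)) over word pairs with l < m.

    D(w) is the number of documents containing w; D(w_m, w_l) counts
    documents containing both words.
    """
    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            d_l = sum(1 for d in docs if top_words[l] in d)
            d_ml = sum(1 for d in docs if top_words[l] in d and top_words[m] in d)
            if d_l:  # skip words that never occur, avoiding division by zero
                score += math.log((d_ml + 1) / d_l)
    return score


def mean_coherence(topics: Sequence[Sequence[str]], docs: Sequence[Set[str]]) -> float:
    """Average coherence over all topics of one model."""
    return sum(umass_coherence(t, docs) for t in topics) / len(topics)


def best_k(topics_by_k: Dict[int, List[List[str]]], docs: Sequence[Set[str]]) -> int:
    """Return the K whose topics have the highest mean coherence."""
    return max(topics_by_k, key=lambda k: mean_coherence(topics_by_k[k], docs))


# Toy documents (as word sets) and toy top-word lists; all invented for illustration.
docs = [
    {"mood", "swing", "episode"},
    {"mood", "swing"},
    {"lithium", "dose"},
    {"lithium", "dose", "weight"},
]
topics_by_k = {
    1: [["mood", "lithium"]],
    2: [["mood", "swing"], ["lithium", "dose"]],
}
chosen_k = best_k(topics_by_k, docs)
```

As the paragraph above reports, the K with the highest score is not necessarily the most interpretable one, so a sweep of this kind is better treated as a starting point for manual inspection than as a final answer.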

Finally, we considered the list of top words in the probability distribution for each topic to assign a name to that topic. This interpretative labeling process involves selecting, from each topic's probability distribution, the set of words that best represents the theme or subject of the topic; these words then serve as the basis for a meaningful label that captures its content. Afterward, we measured the agreement among three annotators regarding the labeling of the topics.
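
The article does not name the statistic used to measure agreement among the three annotators. One common choice for more than two raters is Fleiss' kappa, and the sketch below shows how it could be computed; this is an assumption, not the authors' reported method. The function name and the example ratings are invented for illustration (the label strings merely reuse topic names from this article).

```python
from collections import Counter
from typing import Sequence


def fleiss_kappa(ratings: Sequence[Sequence[str]]) -> float:
    """Fleiss' kappa; ratings[i] holds the labels all raters gave to item i."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(r) != n_raters for r in ratings):
        raise ValueError("every item must be rated by the same number of raters")

    counts = [Counter(r) for r in ratings]
    # Observed agreement per item, then averaged over items.
    p_items = [
        (sum(c * c for c in cnt.values()) - n_raters) / (n_raters * (n_raters - 1))
        for cnt in counts
    ]
    p_bar = sum(p_items) / n_items

    # Agreement expected by chance, from the overall label proportions.
    totals: Counter = Counter()
    for cnt in counts:
        totals.update(cnt)
    p_e = sum((t / (n_items * n_raters)) ** 2 for t in totals.values())

    if p_e == 1.0:
        return 1.0  # kappa is undefined when one label is used throughout; treated as full agreement here
    return (p_bar - p_e) / (1.0 - p_e)


# Made-up labels from three hypothetical annotators for three topics.
example_ratings = [
    ["Therapy", "Therapy", "Therapy"],
    ["Drugs", "Drugs", "Drugs"],
    ["Episode", "Episode", "Emotion expression"],
]
kappa = fleiss_kappa(example_ratings)
```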

Sentiment analysis

An essential task in sentiment analysis is to classify the polarity of a given text at the sentence or document level, determining whether the expressed opinion is positive, negative, or neutral. Advanced sentiment classification can also identify emotions such as happiness, sadness, and anger, providing insights into the psychological patterns in text based on the analysis of human verbal behavior (Gottschalk and Bechtel, 2008). Therefore, sentiment analysis examines sentiment and detects individual words and phrases in the text to reflect the emotional scales.

To investigate the polarity properties of social media posts from Reddit, the SentiStrength software (http://sentistrength.wlv.ac.uk/download.html) was utilized for automatic sentiment analysis. SentiStrength is specifically designed for analyzing social web texts and predicts the strength of positive and negative sentiment in short texts, even for informal language, by employing a lexical approach. The sentiment analysis captures reactions to events through free texts, offering new insights into human behavior and people's positive or negative attitudes toward an event (Thelwall and Buckley, 2013). The software operates in two modes: supervised and unsupervised. In the unsupervised mode, it employs pre-defined sentiment strength weights for the lexicon, while in the supervised mode, it utilizes a training set of data to automatically adjust the lexicon term weights and produce more accurate results.

At the core of SentiStrength is a list of 2,608 words and word stems, each associated with a typical positive or negative sentiment strength for social web texts. The software assigns a positive sentiment strength ranging from 1 (no positive sentiment) to 5 (very strong positive sentiment) and a negative sentiment strength ranging from −1 (no negative sentiment) to −5 (very strong negative sentiment) to each text (zeros are not used). In the presence of multiple sentiment words, the strongest sentiment word is chosen, while in the absence of sentiment words, a neutral sentiment is assumed. Additionally, the software incorporates special rules for handling negations, questions, booster words (e.g., “very”), emoticons, and various other special cases (Thelwall et al., 2010, 2012).
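
The dual-scale scoring described above can be illustrated with a deliberately simplified sketch. It is not SentiStrength and will not reproduce its output: the function name is hypothetical, the toy lexicon contains only the three word weights quoted in this article's examples, and the rules for negations, questions, booster words, and emoticons are left out.

```python
import re
from typing import Dict, Tuple

# Toy lexicon (an assumption): only the word weights quoted in this article's examples.
TOY_LEXICON: Dict[str, int] = {"cried": -3, "wow": 2, "fantastic": 2}


def dual_score(text: str, lexicon: Dict[str, int] = TOY_LEXICON) -> Tuple[int, int]:
    """Return (positive, negative) strengths on the 1..5 and -1..-5 scales."""
    positive, negative = 1, -1  # baseline: no sentiment on either scale
    for token in re.findall(r"[a-z']+", text.lower()):
        weight = lexicon.get(token, 0)
        if weight > 0:
            positive = max(positive, min(weight, 5))   # keep the strongest positive word
        elif weight < 0:
            negative = min(negative, max(weight, -5))  # keep the strongest negative word
    return positive, negative


no_sentiment = dual_score("See you tomorrow")        # (1, -1)
negative_text = dual_score("Cried at work")          # (1, -3)
positive_text = dual_score("Wow, fantastic response")  # (2, -1)
```

Keeping two separate scores, rather than a single net value, preserves texts that are both strongly positive and strongly negative instead of cancelling them out.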

Results

Bipolar disorder topic modeling

To analyze the content of the Reddit data, we assume that the 30 topics in LDA correspond to the issues or events that describe each dataset. Therefore, we fixed K = 30 throughout this study. For each topic, we present only the 30 words with the highest probabilities under that topic.

In this paper, we display the top 10 ranked topics based on the topic probability distribution scores. Each topic consists of 30 words, with stronger influences indicating higher relevance to the topic. Using this information, we manually labeled each topic with a corresponding word. Additionally, we removed groups of words that appeared as stand-alone words or special characters, as they did not provide any relevant information about the topic. We replaced them with the next word in the rank. The main idea of each dataset's content is provided below.

The top 10 topics related to bipolar disorder (BD) and their top 30 words are as follows:

Topic 1. Help and advice (0.161): Social media users shared their conditions, asked for help and information, and expressed appreciation.

Topic 2. Bipolar disorder (0.158): Discussions about symptoms, mood swings, diagnosis, and medication.

Topic 3. Social process (0.139): Patients expressed the need for mental support from their social environment, such as family and friends.

Topic 4. Therapy (0.120): Experiences shared regarding medicine, doctors, and psychiatrists.

Topic 5. Drugs (0.116): Discussions about medications used for BD and related mental health issues on Reddit, including Lithium, Lamictal, and Wellbutrin.

Topic 6. Episode (0.084): Personal accounts of specific periods or conditions experienced by individuals living with BD.

Topic 7. Emotion expression (0.059): Patients with BD expressed their feelings about living with the illness, including emotions such as hate, sadness, and crying.

Topic 8. Friend (0.054): Frequent mentions of individuals who interacted with patients in social settings.

Topic 9. Relaxation (0.047): Techniques shared for relaxation during times of stress and anxiety, such as smoking, drinking, taking drugs, sleeping, eating, and meditation.

Topic 10. Drug-side effect (0.045): Discussions about the side effects of drugs used for BD, with weight gain being the most frequently mentioned side effect (Table 1).

Table 1. Top-10 topics of bipolar disorder discussion in the Reddit communities.

Bipolar disorder conversation sentiment analysis

By employing topic-based sentiment analysis on social media data, it becomes possible to identify people's moods, related issues, and emotional responses to specific situations. For instance, when discussing a health topic, the mention of a drug name often evokes strong positive or negative sentiments within the conversation. To gauge the emotional intensity within the bipolar disorder (BD) community, we utilized SentiStrength 2.3 for sentiment analysis. This tool adopts a lexical approach that begins with a collection of terms known to have sentiment associations. It then applies a set of rules to predict the sentiment of texts based on the occurrence of these words. After obtaining the results of positive and negative text classification, we integrated these findings into the topic modeling approach to determine the common topics associated with each polarity.

Polarity classification results

The system employed sentiment analysis to determine the sentiment strength in Reddit data, categorizing each text as neutral, positive, or negative. However, for the purposes of this study, we focused solely on reporting the positive and negative sentiment texts to observe the experiences related to bipolar disorder. Each text was assigned two numerical values by SentiStrength: a score between 1 and 5 for positive sentiment strength, and a score between −1 and −5 for negative sentiment strength. A score of 1 or −1 indicated no sentiment, while a score of 5 or −5 represented a strong sentiment of that type. For example, a social web text with scores of 1 and −4 would convey no positive sentiment and strong negative sentiment. The system utilized two scales because a text can express both positivity and negativity, and the objective was to detect the expressed viewpoint rather than the overall polarity (Thelwall et al., 2010). Below are some examples of the short text analysis results:

“Cried at work Cried at work”

(Negative) 1 −4 Cried [−3] at [0] work [0] [[Sentence=-4,1=word max,1–5]] [[[1, −4 max of sentences]]]

“Wow, fantastic response. Thank you so much for sharing, you encouraged me!”

(Positive) 4 −1 Wow[2] fantastic[2][+1 Multiple Positive Words] response[0] [[Sentence=-1,4=word max, 1–5]] Thank [1] you [0] so [0] much [0] For [0] sharing [0] you [0] encouraged [1] me [0] [[Sentence= −1,2=word max, 1–5]] [[[4, −1 max of sentences]]]

The software assigned a score to each word in the sentence, and a total weighted score was calculated to predict the text's polarity. Each word received a numerical value based on its strength. For instance, the word “cried” was assigned a strongly negative score of −3, while both “wow” and “fantastic” were rated as moderately positive with a score of 2.

The results of the sentiment analysis indicate that there were more negative texts (641,264) than positive texts (392,352) in the discussions of the bipolar disorder communities on Reddit. Examples of the experimental results are presented in Table 2.

Table 2. Examples of polarity categorization.
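
The article does not spell out how the two SentiStrength scores of a text are mapped to a single polarity category. A minimal sketch, assuming the stronger of the two scales decides and ties are neutral, is shown below. The rule, the function names, and the last two score pairs are assumptions for illustration; the first two pairs are the scores from the examples above.

```python
from collections import Counter
from typing import Iterable, Tuple


def polarity(positive: int, negative: int) -> str:
    """Assumed rule: the stronger scale decides; equal strengths are neutral."""
    if positive > abs(negative):
        return "positive"
    if abs(negative) > positive:
        return "negative"
    return "neutral"


def count_polarities(scores: Iterable[Tuple[int, int]]) -> Counter:
    """Tally how many texts fall into each polarity category."""
    return Counter(polarity(p, n) for p, n in scores)


# (positive, negative) pairs: two from the article's examples, two invented.
scores = [(1, -4), (4, -1), (1, -1), (3, -3)]
tally = count_polarities(scores)
```

Under this rule, the texts labeled positive and negative would form the two datasets that are passed to topic modeling separately.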

Polarity topics results

To investigate the polarity of emotions surrounding BD in social media discourse, we identified common issues that were publicly discussed within the bipolar-related Reddit communities. The texts from the sentiment analysis results were divided into two datasets: positive and negative. These datasets were then subjected to LDA topic modeling to uncover hidden topics within each polarity. We set the number of topics to 20 and specified that each topic should consist of the 20 words with the highest probabilities.

The LDA results present the content of documents with 20 different topic mixtures and 20 words per topic. For the purpose of this paper, we selected the top five topics to report. We manually assigned labels to these top five topics based on the words associated with each topic.

Table 3 illustrates the positively-related topics of BD within the Reddit community. These topics include “Medication”, where users express their feelings and opinions regarding drugs used in bipolar treatment, such as lamictal, abilify, and lithium, and discuss their effectiveness. The next topic is “Social wellbeing”, which explores the sense of social inclusion, individuals' lifestyles, and overall quality of life. The third topic is “Personal concerns”, which delves into social factors with a greater focus on personal experiences, such as career, relationships, and children. The fourth topic is “Social support”, which highlights the importance of having friends, family, and others to rely on during times of need and who provide encouragement when experiencing bipolar-related symptoms, such as mood swings. Lastly, the fifth topic is “Mental therapy”, which revolves around discussions within the community regarding various treatments for the disease.

Table 3. Top-five positive topics.

Table 4 presents the negative topics of BD in the Reddit community. The first topic is “Drugs”, where social media users share their experiences related to medications and their side effects. The following topic is “Mood swings”, where individuals living with bipolar disorder describe their experiences with mixed episodes, including depression, hypomania, and mania. The next topic is “Bipolar symptoms”, which encompasses the various symptoms associated with the condition, such as anxiety, depression, panic, and happiness, typically occurring during the night and morning and significantly impacting their lives. The fourth topic is “Treatments”, where individuals discuss their attempts to address negative feelings through strategies like meeting friends, consulting with a psychiatrist, and seeking medical assistance. Lastly, the fifth topic discussed within the bipolar subreddit community is “Mental disorder”, which takes a broader perspective on the mental challenges they face in both personal and social aspects of their lives.

Table 4. Top-five negative topics.

To illustrate the association between terms and document topics, we constructed a co-word map. The positive and negative text posts were converted to GraphML format to generate word co-occurrences and create a network of co-occurring words. Subsequently, we visualized the graph using Gephi software, adjusting the nodes to highlight those with high betweenness centrality for improved visualization. This allowed us to observe the graphs with key terms (nodes) representing the core concepts within each polarity. The results are presented below:

Figure 3 presents the co-word network of the positive post text. The terms that influenced other nodes the most were “life” (0.00016), followed by “mind” (0.000149), “doctor” (0.000142), “episode” (0.000118), and “medication” (0.000114). These terms exhibited a high betweenness centrality score. The co-word network confirms a core concept of positive conversations, where Reddit users shared experiences about their personal lives, including medication, mind therapy, activities, and social environments. Additionally, the network reveals connected terms such as life-peace, doctor-mind, work-love, medication-change, and song-summer.

Figure 3. Network of co-occurring words in the positive post text dataset.

Figure 4 displays the co-occurrence network of words in the negative post text. The top-ranking terms, which play a crucial role as bridges in the network, are “depression” (0.000256), followed by “life” (0.000164), “episode” (0.000157), “shit” (0.000101), “psychiatrist” (0.000093), and “food” (0.000091). This graph reveals that individuals living with BD expressed concerns about recurrent disorders such as schizophrenia and obesity, along with symptoms of mood swings. Furthermore, they experienced irritability in their relationships with others and struggled with negative thoughts, including suicidal ideation. Notable word pairs identified in the network include depression-schizophrenia, life-fight, episode-spike, shit-thinking, food-panic, and irritability-relationship.

Figure 4. Network of co-occurring words in the negative post text dataset.
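
The co-word procedure described above can be sketched in a few lines of Python. This is only an illustration, not the pipeline used in the study, which exported GraphML and relied on Gephi. The function names `build_cooccurrence` and `betweenness` are hypothetical, the document-level co-occurrence window is an assumption, and the normalized score is computed with Brandes' algorithm for an unweighted, undirected graph.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_cooccurrence(docs):
    """Return {word: set(neighbours)}, linking words that share a document.

    Assumption: two words co-occur when they appear in the same post.
    """
    graph = defaultdict(set)
    for tokens in docs:
        for a, b in combinations(sorted(set(tokens)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def betweenness(graph):
    """Normalized betweenness centrality (Brandes, unweighted, undirected)."""
    score = dict.fromkeys(graph, 0.0)
    for source in graph:
        # Breadth-first search that counts shortest paths from `source`.
        order = []
        preds = {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)   # number of shortest paths
        dist = dict.fromkeys(graph, -1)   # -1 marks an unvisited node
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate dependencies from the farthest nodes inward.
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    n = len(graph)
    # Every undirected pair was counted twice, so divide by (n-1)(n-2).
    scale = 1 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: s * scale for v, s in score.items()}


# Toy posts (hypothetical tokens, not data from the study).
docs = [["life", "peace"], ["life", "doctor"], ["doctor", "mind"]]
bc = betweenness(build_cooccurrence(docs))
for word, value in sorted(bc.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {value:.4f}")
```

In this toy chain (peace, life, doctor, mind), "life" and "doctor" each score 2/3 because they sit on the shortest paths between the other words, while the two end words score 0. Bridging terms such as "life" or "depression" in the figures above are ranked on the same principle.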

Discussion

Topics discussed in the bipolar disorder online community of Reddit users

The online community of Reddit users discussing Bipolar disorder covers a variety of topics related to the condition. The top 10 topics predominantly revolve around symptoms, mood swings, diagnosis, and medication. Users openly share their personal experiences with bipolar episodes, including the challenges they face and strategies for coping. They also express their emotions about living with bipolar disorder, seeking solace and connection with others who can relate to their struggles. Therapy is another crucial topic, with users sharing their experiences and interactions with mental health professionals. They discuss different therapeutic approaches, the benefits of therapy in managing bipolar disorder, and provide tips for finding a suitable therapist.

Discussions related to medication are common, as users share their experiences with specific medications such as Lithium, Lamictal, and Wellbutrin. They talk about side effects and how certain drugs have helped them manage their condition. Weight gain is often mentioned as a common side effect, and users seek advice and strategies for managing medication-related effects.

The findings demonstrate that a large number of social media users use online community platforms to discuss mental health issues. Users have a keen interest in seeking out information related to treatments, engaging in conversations with others to express their thoughts on the efficacy and potential side effects of treatments, participating in discussions within mental health communities to address their queries, and acquiring insights about their medical conditions (Nguyen et al., 2014; Reavley and Pilkington, 2014; Foufi et al., 2019).

Social support plays a significant role in the management of bipolar disorder. Within the online community, users engage in discussions about effective communication with loved ones regarding the condition, seeking understanding, empathy, and establishing a support system. Friends are regarded as essential in the lives of individuals with bipolar disorder, and conversations often revolve around the impact of friendships on mental health, maintaining relationships, and the importance of supportive friends. This finding is consistent with a study on the use of social media for social support among adolescents, including emotional, appraisal, and informational support (Selkie et al., 2020). Discussions also cover various relaxation techniques aimed at managing stress and anxiety associated with bipolar disorder. Methods such as smoking, drinking, medication use, sleeping, eating, and meditation practices are mentioned. It is important to approach coping strategies, including substance use, with caution and consult healthcare professionals for guidance. Moreover, the online community serves as a platform for seeking help and information. Users actively seek advice on managing symptoms, finding effective treatment options, and expressing gratitude for the support they receive. Social media enables people to share information on treatments and research to improve care (De Choudhury and De, 2014; Pershad et al., 2018).

Overall, the Bipolar disorder online community on Reddit encompasses a wide range of topics related to the condition, fostering support, understanding, and the sharing of experiences among its members.

Polarity classification

In the analysis of people's opinions and feelings toward mental health issues, specifically related to Bipolar Disorder (BD), the sentiment was assessed using the SentiStrength software. The software analyzed user-generated text from social media, which described situations related to health problems. The results of the sentiment analysis on the Reddit dataset, under the topic of BD, indicated a higher frequency of negative polarity than positive polarity. This finding suggests that discussions and expressions within the Bipolar Disorder online community on Reddit tended to lean toward a more negative sentiment. It indicates that individuals sharing their experiences, challenges, and emotions related to BD often conveyed negative feelings or perspectives.
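
As a rough illustration of how polarity frequencies like these can be tallied, the sketch below collapses SentiStrength-style dual scores into a single label per post. SentiStrength reports a positive strength from 1 to 5 and a negative strength from -1 to -5 for each text (Thelwall et al., 2010). The rule of summing the two scores is an assumption made for this example, not a rule stated in this section, and the names `polarity` and `polarity_counts` are hypothetical.

```python
from collections import Counter


def polarity(pos: int, neg: int) -> str:
    """Label one post from a (positive, negative) strength pair.

    Assumed rule: the sign of pos + neg decides the label.
    """
    if not (1 <= pos <= 5 and -5 <= neg <= -1):
        raise ValueError("expected pos in 1..5 and neg in -5..-1")
    total = pos + neg
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


def polarity_counts(scores):
    """Count the labels over an iterable of (pos, neg) pairs."""
    return Counter(polarity(p, n) for p, n in scores)


# Hypothetical score pairs, not values from the study.
sample = [(3, -1), (1, -4), (2, -2), (1, -3)]
print(polarity_counts(sample))
```

With the four sample pairs, two posts are labeled negative, one positive, and one neutral. Comparing such counts is what a "higher frequency of negative polarity" amounts to.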

It is important to note that sentiment analysis provides an automated assessment of sentiment based on textual content, and it may not fully capture the nuances and complexities of individual experiences. Additionally, the analysis is limited to the specific dataset and platform analyzed, and sentiments expressed may vary across different social media platforms and online communities. For instance, using Twitter content analysis to measure attitudes toward mental illness and analyzing mental health based on the positive language of Facebook users provided different insights into online discussions about mental health (Reavley and Pilkington, 2014; Bogolyubova et al., 2020).

Further qualitative analysis and a deeper understanding of the specific context and content of the negative sentiment would be necessary to gain insights into the concerns, challenges, and negative experiences expressed by individuals living with or relating to Bipolar Disorder in the Reddit community.

Polarity topics

The study findings revealed that individuals expressed their thoughts and emotions related to both the biological and mental aspects of their conditions. Distinct topics emerged when analyzing positive and negative sentiments. Regarding positive sentiment, Reddit users discussed receiving encouragement and support from various sources, including friends, partners, family members, doctors, and psychiatrists. They expressed satisfaction with their social wellbeing while living with a mental illness. Some users also reported positive experiences with bipolar medication and psychological therapy, although others mentioned negative effects such as weight gain from the drugs.

On the other hand, negative sentiment topics focused on the negative aspects of drugs and treatments, including instances of incorrect diagnosis by physicians when users experienced mental symptoms. Users also shared their struggles with bipolar disorder symptoms, particularly mood swings, which caused both biological and psychological stress.

The polarity findings reinforce the observation that social media users seek information to address their mental health problems and to connect with individuals who can understand them and encourage positive emotions. These findings underscore the importance of addressing patients' wellbeing and providing appropriate treatments for their mental health.

Limitation

It is essential to acknowledge certain limitations inherent in our study. One notable limitation is the nature of bipolar disorder (BD) as a clinical diagnosis. Within the context of our research, participants on social media platforms may encompass a diverse range of individuals, including those who have received a professional diagnosis of BD, those who self-identify as bipolar, and individuals who are family members of those with BD. Consequently, this heterogeneity introduces a potential risk of misattribution when analyzing the sentiments and topics of BD-related discussions. It is important to recognize and consider this limitation.

Furthermore, future work should consider the potential value of predicting users' posting behavior when mental symptoms are present and of comparing sentiment between the Bipolar Disorder (BD) community and other online communities.

Conclusion

This study aimed to explore the topics and sentiments discussed in the Bipolar disorder online community on Reddit. The findings revealed that users engaged in discussions related to various aspects of bipolar disorder, including symptoms, mood swings, diagnosis, medication, therapy, social support, and relaxation techniques. They shared their personal experiences, sought advice, and expressed their emotions about living with bipolar disorder. The analysis showed a predominance of negative sentiment expressed in the community, indicating the presence of challenges and struggles associated with the condition. These findings emphasize the importance of addressing the wellbeing of individuals living with bipolar disorder and highlight the need for appropriate treatment and support. The research framework employed in this study can also be applied to gain insights into other health conditions through social media analysis.

In summary, the primary objective of the present study is to enhance the understanding of how individuals with BD communicate on social media platforms. While not providing definitive medical insights, our research offers opportunities for further exploration. We intend to investigate unmet information needs and aspects requiring professional attention in future research, guided by the input of clinicians and individuals with BD, ensuring practical relevance.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Ethics statement

The data used in this study was collected from Reddit, a social media platform, in accordance with the platform's terms of service and community guidelines. The data collection process strictly adhered to the ethical standards outlined by Reddit, ensuring respect for user privacy, anonymity, and the responsible use of the platform's content. Consequently, we confirm that the collection, analysis, and reuse of social media data were conducted in strict accordance with Reddit's policies and terms of use, as well as all relevant institutional regulations.

Author contributions

TT conceived and co-designed the analysis, collected the data, and co-wrote the paper. QX co-designed the analysis and co-wrote the paper. SL co-wrote the paper. All authors contributed to the article and approved the submitted version.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Apache OpenNLP Development Community (2011). Apache OpenNLP Developer Documentation.pdf. OpenNLP. Available online at: https://opennlp.apache.org/docs/1.9.0/manual/opennlp.html (accessed July 23, 2023).

Arora, R., and Ravindran, B. (2008). “Latent dirichlet allocation based multi-document summarization,” in Proceedings of SIGIR 2008 Workshop on Analytics for Noisy Unstructured Text Data, AND'08. Available online at: http://www.cse.iitm.ac.in/~ravi//papers/Rachit-AND08.pdf (accessed July 23, 2023).

Blei, D. M., Ng, A. Y., and Jordan, M. I. (2003). Latent dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022. Available online at: https://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf (accessed July 23, 2023).

Bogolyubova, O., Panicheva, P., Ledovaya, Y., Tikhonov, R., and Yaminov, B. (2020). The language of positive mental health: findings from a sample of Russian Facebook users. SAGE Open 10, 2158244020924370. doi: 10.1177/2158244020924370

Chen, J., and Wang, Y. (2021). Social media use for health purposes: systematic review. J. Med. Internet Res. 23, e17917. doi: 10.2196/17917

Conway, M., and O'Connor, D. (2016). Social media, big data, and mental health: current advances and ethical implications. Curr. Opin. Psychol. 9, 77–82. doi: 10.1016/j.copsyc.2016.01.004

De Choudhury, M., and De, S. (2014). Mental health discourse on reddit: self-disclosure, social support, and anonymity. Proc. Int. AAAI Conf. Web Soc. Media 8, 71–80. doi: 10.1609/icwsm.v8i1.14526

Denecke, K. (2015). Health Web Science: Social Media Data for Healthcare, eds Y. Zhang. Cham: Springer.

Foufi, V., Timakum, T., Gaudet-Blavignac, C., Lovis, C., and Song, M. (2019). Mining of textual health information from Reddit: analysis of chronic diseases with extracted entities and their relations. J. Med. Internet Res. 21, e12876. doi: 10.2196/12876

Gkotsis, G., Oellrich, A., Hubbard, T., Dobson, R., Liakata, M., Velupillai, S., et al. (2016). “The language of mental health problems in social media,” in Proceedings of the Third Workshop on Computational Linguistics and Clinical Psychology, p. 63–73. Available online at: https://aclanthology.org/W16-0307.pdf (accessed July 23, 2023).

Gottschalk, L. A., and Bechtel, R. J., (eds.). (2008). Computerized Content Analysis of Speech and Verbal Texts and Its Many Applications. New York, NY: Nova Science.

Harlow, L. L., and Oswald, F. L. (2016). Big data in psychology: introduction to the special issue. Psychol. Methods. 21, 447–457. doi: 10.1037/met0000120

Huang, Y.-H., Wei, L.-H., and Chen, Y.-S. (2017). Detection of the Prodromal Phase of Bipolar Disorder from Psychological and Phonological Aspects in Social Media. Available online at: http://arxiv.org/abs/1712.09183 (accessed June 15, 2023).

Islam, R., Kabir, M. A., Ahmed, A., Kamal, A. R. M., and Wang, H. (2018). Depression detection from social network data using machine learning techniques. Health Inform. Sci. Syst. 6, 8. doi: 10.1007/s13755-018-0046-0

Johnson, G., and Ambrose, P. (2006). Neo-tribes: the power and potential of online communities in health care. Commun ACM 49, 107–113. doi: 10.1145/1107458.1107463

Manning, C., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S., and McClosky, D. (2014). “The stanford CoreNLP natural language processing toolkit,” in Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, p. 55–60. Available online at: https://aclanthology.org/P14-5010.pdf (accessed July 23, 2023).

Manning, C. D., Raghavan, P., and Schütze, H. (2009). An Introduction to Information Retrieval. Information Retrieval. Available online at: https://ds.amu.edu.et/xmlui/bitstream/handle/123456789/14697/Book%20558%20pages.pdf?sequence=1&isAllowed=y (accessed July 30, 2023).

Marar, S. D., Al-Madaney, M. M., and Almousawi, F. H. (2019). Health information on social media: perceptions, attitudes, and practices of patients and their companions. Saudi Med. J. 40, 1294. doi: 10.15537/smj.2019.12.24682

Monnier, J., Laken, M., and Carter, C. L. (2002). Patient and caregiver interest in Internet-based cancer services. Cancer Pract. 10, 305–310. doi: 10.1046/j.1523-5394.2002.106005.x

Moorhead, S. A., Hazlett, D. E., Harrison, L., Carroll, J. K., Irwin, A., and Hoving, C. (2013). A new dimension of health care: systematic review of the uses, benefits, and limitations of social media for health communication. J. Med. Internet Res. 15, 1–16. doi: 10.2196/jmir.1933

Naslund, J. A., Aschbrenner, K. A., Marsch, L. A., and Bartels, S. J. (2016). The future of mental health care: peer-to-peer support and social media. Epidemiol. Psychiatr. Sci. 25, 113–122. doi: 10.1017/S2045796015001067

Nguyen, T., Phung, D., Dao, B., Venkatesh, S., and Berk, M. (2014). Affective and content analysis of online depression communities. IEEE Trans. Affect. Comput. 5, 217–226. doi: 10.1109/TAFFC.2014.2315623

O'Dea, B., Wan, S., Batterham, P. J., Calear, A. L., Paris, C., and Christensen, H. (2015). Detecting suicidality on Twitter. Internet Interv. 2, 183–188. doi: 10.1016/j.invent.2015.03.005

O'Neill, B., Ziebland, S., Valderas, J., and Lupiáñez-Villanueva, F. (2014). User-generated online health content: a survey of internet users in the United Kingdom. J. Med. Internet Res. 16, 3187. doi: 10.2196/jmir.3187

O'Rourke, N., Heisel, M. J., Canham, S. L., and Sixsmith, A. (2017). Predictors of suicide ideation among older adults with bipolar disorder. PLoS ONE. 12, e0187632. doi: 10.1371/journal.pone.0187632

Park, A., and Conway, M. (2018). Harnessing Reddit to understand the written-communication challenges experienced by individuals with mental health disorders: analysis of texts from mental health communities. J. Med. Internet Res. 20, e121. doi: 10.2196/jmir.8219

Pershad, Y., Hangge, P. T., Albadawi, H., and Oklu, R. (2018). Social medicine: Twitter in healthcare. J. Clin. Med. 7, 121. doi: 10.3390/jcm7060121

Reavley, N. J., and Pilkington, P. D. (2014). Use of Twitter to monitor attitudes toward depression and schizophrenia: an exploratory study. PeerJ. 2, e647. doi: 10.7717/peerj.647

Record, R. A., Silberman, W. R., Santiago, J. E., and Ham, T. (2018). I sought it, I Reddit: examining health information engagement behaviors among Reddit users. J. Health Commun. 23, 470–476. doi: 10.1080/10810730.2018.1465493

Selkie, E., Adkins, V., Masters, E., Bajpai, A., and Shumer, D. (2020). Transgender adolescents' uses of social media for social support. J. Adoles. Health 66, 275–280. doi: 10.1016/j.jadohealth.2019.08.011

Song, M., Kim, W. C., Lee, D., Heo, G. E., and Kang, K. Y. (2015). PKDE4J: entity and relation extraction for public knowledge discovery. J. Biomed. Inform. 57, 320–332. doi: 10.1016/j.jbi.2015.08.008

Thelwall, M., and Buckley, K. (2013). Topic-based sentiment analysis for the social web: the role of mood and issue-related words. J. Am. Soc. Inform. Sci. Technol. 64, 22872. doi: 10.1002/asi.22872

Thelwall, M., Buckley, K., and Paltoglou, G. (2012). Sentiment strength detection for the social web. J. Am. Soc. Inform. Sci. Technol. 63, 163–173. doi: 10.1002/asi.21662

Thelwall, M., Buckley, K., Paltoglou, G., Cai, D., and Kappas, A. (2010). Sentiment strength detection in short informal text. J. Am. Soc. Inform. Sci. Technol. 61, 2544–2558. doi: 10.1002/asi.21416

Trotzek, M., Koitka, S., and Friedrich, C. M. (2018). Utilizing neural networks and linguistic metadata for early detection of depression indications in text sequences. IEEE Trans. Knowl. Data Eng. 32, 588–601. doi: 10.1109/TKDE.2018.2885515

Zhang, Y., He, D., and Sang, Y. (2013). Facebook as a platform for health information and communication: a case study of a diabetes group. J. Med. Syst. 37, 1–12. doi: 10.1007/s10916-013-9942-7

Zhao, Y., and Zhang, J. (2017). Consumer health information seeking in social media: a literature review. Health Inform. Libraries J. 34, 268–283. doi: 10.1111/hir.12192

Keywords: bipolar, mental disorder, sentiment analysis, topic modeling, network analysis, social media data

Citation: Timakum T, Xie Q and Lee S (2023) Identifying mental health discussion topic in social media community: subreddit of bipolar disorder analysis. Front. Res. Metr. Anal. 8:1243407. doi: 10.3389/frma.2023.1243407

Received: 20 June 2023; Accepted: 17 October 2023;
Published: 03 November 2023.

Edited by:

Kirk Roberts, University of Texas Health Science Center at Houston, United States

Reviewed by:

Susan McRoy, University of Wisconsin–Milwaukee, United States
Tavleen Singh, University of Texas Health Science Center at Houston, United States

Copyright © 2023 Timakum, Xie and Lee. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Tatsawan Timakum, tatsawan_tim@cmru.ac.th

ORCID: Tatsawan Timakum orcid.org/0000-0002-9877-0323
Qing Xie orcid.org/0000-0003-1926-1457
Soobin Lee orcid.org/0000-0002-7515-0199