Investigating pregnant women’s health information needs during pregnancy on internet platforms

Hou, Keke; Hou, Tingting

doi:10.3389/fphys.2022.1038048

ORIGINAL RESEARCH article

Front. Physiol., 28 November 2022

Sec. Developmental Physiology

Volume 13 - 2022 | https://doi.org/10.3389/fphys.2022.1038048

This article is part of the Research Topic “Adverse outcomes of preeclampsia: From mother to baby, pregnancy to postpartum”.

Keke Hou¹

Tingting Hou²*

¹School of Health Sciences, Guangzhou Xinhua University, Guangzhou, China
²School of Management, Zhengzhou University, Zhengzhou, China

Artificial intelligence gives pregnant women another avenue for receiving healthcare information. With the advancement of information and communication technology, searching online for pregnancy information has become commonplace during COVID-19. This study aimed to explore pregnant women’s information-seeking behavior in China based on data mining and text analysis. Posts on maternal and infant-related websites were collected between 1 June 2020 and 31 January 2021. A total of 553,117 valid posts were obtained. Based on the data, we performed correlation analysis, topic analysis, and sentiment analysis. The correlation analysis showed that population, population with a college education or above, and GDP were positively correlated with post counts. The topic analysis extracted six, nineteen, eighteen, thirteen, eleven, sixteen, thirteen, sixteen, nineteen, and fourteen topics in different months of pregnancy, reflecting different information needs in various pregnancy periods. The results of sentiment analysis show that the number of posts peaked in the second month of pregnancy and the proportion of emotionally positive posts peaked in the sixth month of pregnancy. The study provides important insights for understanding pregnant women’s information-seeking behavior.

1 Introduction

Artificial intelligence (AI) creates opportunities for pregnant women to receive healthcare information. Pregnancy is a crucial period in a woman’s life, accompanied by physical change, psychological change, and role transformation. Information-seeking can play an important role in supporting a healthy delivery. Access to useful and relevant information contributes to health-related decisions and to the lives of both pregnant women and unborn children (Kamali et al., 2018). Childbirth-related information is important for delivering beneficial interventions and suggestions to pregnant women (Kamali et al., 2018). For example, health-related information enables women to prepare for pregnancy, pay attention to balanced nutrition and medication use during pregnancy, and make decisions on exercise intensity and mode.

Extant research on health information has addressed the crucial role of research contexts, such as the user group and the subject domain of the information, in determining information needs (Pian et al., 2020; Reifegerste et al., 2020). The development of information technology and the spread of the mobile Internet enable pregnant women to seek information more conveniently and equitably. Centered on the information needs of maternal health, recent studies have shown that pregnant women’s information-seeking behavior is crucial to enriching knowledge of childbirth and maternal health and to improving maternal health outcomes (Kamali et al., 2018; Ahmadian et al., 2020; Jin et al., 2020; Kassim, 2021). For example, in a descriptive study, Kamali et al. (2018) found that pregnant women need information on topics such as psychological and physical complications after delivery and pregnancy nutrition. The qualitative study conducted by Kassim (2021) found that the unavailability of health facilities and limited access to professional health care can lead pregnant women to seek information from non-professional and informal sources. Ahmadian et al. (2020) used a questionnaire to identify commonly searched topics during pregnancy. However, researchers have not examined information-seeking topics and pregnant women’s emotions in much detail using a relatively large amount of data.

The objective of this research is to explore pregnant women’s information-seeking behavior during the whole pregnancy, including the factors that contribute to information-seeking behavior, the topics that attract pregnant women’s attention in different months of pregnancy, and the change in pregnant women’s emotions at different stages of pregnancy. By collecting and analyzing 553,117 valid posts published in the “pregnant section” under “Mama.cn” from 1 June 2020 to 31 January 2021, the current work provides a comprehensive study.

2 Materials and methods

2.1 Data collection

With the advancement of Internet technology, seeking health information online has become a worldwide trend among pregnant women because of insufficient information received from healthcare providers and the natural advantage of the Internet for asking questions anonymously (Al-Dahshan et al., 2021). As one of the largest maternal and child health websites in China, “Mama.cn” has integrated websites, apps, new media, micro-network celebrities, and other media resources, covering hundreds of millions of pan-maternal and infant groups. Dedicated to serving all kinds of needs of pregnant women, the company has built several service sections including information, social networking, tools, and e-commerce, aiming to build a diversified Internet maternal and infant service platform with pregnant women as the core. “Mama.cn” is widely popular among people who are preparing for pregnancy, are pregnant, or are raising children. In August 2019, “Mama.cn” had 16.479 million active users. The number of active users of “Mama.cn” reached 19.31 million in June 2020, ranking first in the parenting subdivision list in China. Therefore, “Mama.cn” was selected as the research data source for this study. This study collected the posts in the “pregnant section” under “Mama.cn” from 1 June 2020 to 31 January 2021, involving data from “the first month of pregnancy” to “the tenth month of pregnancy.” The following information was extracted from each post: username, post time, duration of pregnancy, city, and text. A total of 575,970 posts were obtained. Examples from our dataset are presented in Table 1.

TABLE 1. The examples of dataset.

We pre-processed the original data before formal analysis using the following procedures. First, the raw data may include posts with missing city tags, irrelevant advertising messages, or posts that did not match the actual time of pregnancy. We filtered out these records and finally obtained 553,117 texts. Second, the original messages may contain distracting content, such as punctuation, emoticons, blanks, and hashtags. To reduce data noise and improve the efficiency of data analysis, we used regular-expression operations in Python for text filtering.
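
The text-filtering step can be illustrated with a minimal sketch. The paper does not list the expressions that were actually used, so the three patterns below (hashtags, bracketed emoticon codes, and any remaining punctuation or symbols) and the function name `clean_post` are assumptions made only for illustration.

```python
import re

# Hypothetical patterns; the exact expressions used in the study are not reported.
HASHTAG = re.compile(r"#[^#\s]+#?")            # "#tag" or "#tag#" style hashtags
EMOTICON = re.compile(r"\[[^\[\]]{1,8}\]")     # bracketed emoticon codes, e.g. "[smile]"
NON_TEXT = re.compile(r"[^\w\u4e00-\u9fff]+")  # punctuation, symbols, emoji, whitespace


def clean_post(text: str) -> str:
    """Remove hashtags, emoticon codes, punctuation, and extra blanks from one post."""
    text = HASHTAG.sub(" ", text)
    text = EMOTICON.sub(" ", text)
    text = NON_TEXT.sub(" ", text)
    # Collapse the runs of spaces left behind by the substitutions.
    return " ".join(text.split())
```

Removed spans are replaced by a single space rather than deleted outright, so that the words on either side of a punctuation mark are not fused before word segmentation.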

Measures were taken to ensure data privacy, anonymity, and security. The data collection and analysis did not disclose any identifiable or sensitive information about pregnant women (Favaretto et al., 2020). During data collection, only username, post time, duration of pregnancy, city, and text were extracted. In data processing and analysis, only the duration of pregnancy, city, and text data were used, while the personal information of users was not disclosed. Involving as many samples as possible preserved more anonymity, because any combination of the variables is repeated among the samples (Leon-Sanz, 2019).

2.2 Methods

2.2.1 Text topic analysis based on latent Dirichlet allocation model

The LDA (Latent Dirichlet Allocation) topic model is a probabilistic topic model that builds on the PLSI (Probabilistic Latent Semantic Indexing) model (Blei et al., 2003). It simulates the process of document generation by using a latent random variable that follows a Dirichlet distribution to represent each document’s topic mixture. Its model structure is comparatively complete and clear, and it processes text with a probabilistic inference algorithm, which greatly reduces the dimensionality of the text representation and avoids the curse of dimensionality (Blei, 2012). Therefore, LDA is widely used in text mining, text clustering, language processing, and other areas. The number of topics K in the document set is a hyperparameter. Given the other hyperparameters, selecting K is a search for the optimal number of topics. When the number of topics is too large, many topics carry no clearly distinguishable semantic information. When the number of topics is too small, broad topics are generated that mix two or more distributions (Panichella, 2021). Therefore, determining the optimal number of topics is an important issue. A coherence score was used to determine the optimal number of topics, with a higher coherence score indicating better topic quality (Korenčić et al., 2018; Panichella, 2021; Shah et al., 2021). This study used the open-source LDA tool in the Gensim library. From the trained LDA model, the topic words under each topic were obtained, and the probability of each text belonging to each topic could be predicted directly. Finally, each topic was named according to its topic words. Figure 1 presents the process of topic extraction in this study.
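
The coherence-based choice of K can be sketched as follows. The paper does not say which coherence measure was used, so this sketch assumes the UMass measure, one common option, and implements it with the standard library only. The helper names `umass_coherence` and `best_topic_number` are hypothetical, and the topic word lists are assumed to come from LDA models already trained for each candidate K (in the study, with Gensim).

```python
import math
from itertools import combinations


def umass_coherence(top_words, documents):
    """UMass coherence of one topic.

    Sums log((D(w_hi, w_lo) + 1) / D(w_hi)) over word pairs, where w_hi is the
    higher-ranked word and D counts the documents containing the given words.
    top_words must be ordered from most to least probable.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for hi, lo in combinations(range(len(top_words)), 2):  # hi ranks above lo
        denom = doc_freq(top_words[hi])
        if denom == 0:
            continue  # skip words that never occur in the corpus
        score += math.log((doc_freq(top_words[hi], top_words[lo]) + 1) / denom)
    return score


def best_topic_number(topics_by_k, documents):
    """Return the candidate K whose topics have the highest mean coherence."""
    def mean_coherence(topics):
        return sum(umass_coherence(t, documents) for t in topics) / len(topics)

    return max(topics_by_k, key=lambda k: mean_coherence(topics_by_k[k]))
```

Here `topics_by_k` maps each candidate K to the list of top-word lists of the corresponding model. A topic whose top words tend to occur in the same posts scores higher than one that mixes unrelated words, which is the behavior the selection rule relies on.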

FIGURE 1. The process of topic extraction.

2.2.2 Text sentiment analysis based on SnowNLP

In recent years, there has been increasing interest in sentiment analysis (Wang et al., 2019). Sentiment analysis, also known as opinion mining, applies text mining and computational linguistics to subjective, emotionally colored texts in order to identify the emotional tendencies they contain. It is a process of extracting information from such texts and analyzing, processing, summarizing, and reasoning about it. Through sentiment analysis, researchers can determine users’ emotional orientation in a text. Text-based sentiment analysis methods are mainly divided into three types: sentiment dictionary-based, machine learning-based, and deep learning-based (Xu et al., 2019; Li et al., 2020). The machine learning-based method trains an emotion classifier on emotion-labeled data. Its classification accuracy relies on high-quality human-annotated training sets; large-scale, high-quality training data is labor-intensive to produce, and the subjectivity of human annotation also affects the classification results. The deep learning-based method relies on feature self-learning and deep neural networks. It classifies well when dealing with high-dimensional, unlabeled big data, but it has difficulty accurately classifying the semantically ambiguous, short texts found in social networks. The sentiment dictionary-based method is an unsupervised method that uses a sentiment dictionary to judge the sentiment polarity of texts containing keywords and thereby classify each text. It requires no complex data labeling, and the accuracy of emotion recognition can be improved by adjusting and expanding the vocabulary of the sentiment dictionary according to the specific research background.

SnowNLP, a Python library for Chinese natural language processing, was used to analyze the sentiment of the texts. The tool employs a sentiment dictionary to determine the sentiment orientation of a text. Its main functions include part-of-speech tagging, sentiment analysis, keyword extraction, and text summarization (He et al., 2020; Zhang et al., 2021).

3 Correlation analysis

The correlation analysis was performed using SPSS 24.0. The results of descriptive statistical analysis at the provincial level are presented in Table 2. The data on posts, population, population with a college education or above, illiteracy rate, and GDP cover the provinces of mainland China. More precisely, population, population with a college education or above, and illiteracy rate are all taken from the 2020 census.

TABLE 2. The results of descriptive statistical analysis at the provincial level.

Table 3 presents the correlation analysis results at the provincial level. Post counts were found to be positively related to population (ß = 0.889, p < 0.001), population with a college education or above (ß = 0.835, p < 0.001), and GDP (ß = 0.819, p < 0.001). However, no significant relationship between post counts and illiteracy rate (p > 0.05) was found in this study. This result is consistent with previous research indicating that the illiteracy rate had a small and statistically insignificant correlation with computer and Internet penetration rates (Chinn and Fairlie, 2010).
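
For readers who want to reproduce this kind of provincial-level coefficient outside SPSS, a minimal sketch follows. It assumes that the reported coefficients are Pearson product-moment correlations, which the paper does not state explicitly, and the function name `pearson_r` is a hypothetical helper.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

In the setting of Table 3, `xs` would hold one provincial indicator (for example, population) and `ys` the post counts of the same 31 provinces, listed in the same order.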

TABLE 3. The results of correlation analysis at the provincial level (N = 31).

4 Topic analysis of information needs

4.1 Emerged topics in different months of pregnancy

4.1.1 Information needs in the first month

As mentioned above, the topic analysis was divided by stage of pregnancy, covering the period from “the first month of pregnancy” to “the tenth month of pregnancy”. Table 4 presents the topics identified in the first month of pregnancy, with their relative weights and LDA keywords. Six topics emerged in the first month of pregnancy, of which the most frequent topic, “test strip,” accounts for 20.53% of all topics. “Pregnancy tests consultation,” “early pregnancy inspection,” and “early pregnancy reaction,” accounting for 16.59%, 14.73%, and 14.02%, respectively, were the second, third, and fourth most frequent topics. Among them, early pregnancy reaction refers to pregnant women’s bodily responses during the early pregnancy period. The next two most frequent topics are “appeals and desire” and “question for help,” at 12.54% and 10.96%, respectively.

TABLE 4. Topics in the first month of pregnancy.

4.1.2 Information needs in the second month

Nineteen topics are identified in the second month of pregnancy. The ten most frequent topics in the second month of pregnancy, with their relative weights and LDA keywords, are presented in Table 5. The results show that “precautions for early pregnancy,” “early pregnancy inspection,” and “symptoms of early pregnancy” were the three most frequent topics, accounting for 8.59%, 8.35%, and 7.39%, respectively. The next five most frequent topics are “the gender of baby,” “fetal heart and embryo bud,” “vomiting during pregnancy,” “early pregnancy indicators,” and “calculation of pregnancy period,” at 7.04%, 6.54%, 6.40%, 6.31%, and 6.30%, respectively. The following two topics are “appeals and desire” and “prenatal diet,” at 5.70% and 5.34%, respectively.

TABLE 5. Topics in the second month of pregnancy.

4.1.3 Information needs in the third month

Eighteen topics are extracted in the third month of pregnancy. Table 6 presents the top ten topics in the third month of pregnancy. The results indicate that “nuchal translucency and filling,” “vomiting during pregnancy,” “the gender of baby,” and “symptom of early pregnancy” emerged to be the four most frequent topics, accounting for 13.05%, 12.51%, 8.65%, and 7.94%, respectively. The next four frequent topics are “prenatal diet,” “fetal heart and embryo bud,” “threatened miscarriage,” and “fetus protection,” at 5.80%, 5.72%, 5.12%, and 4.45%, respectively. The following two most frequent topics are “share and exchange” and “household affairs,” at 4.24% and 4.22%, respectively.

TABLE 6. Topics in the third month of pregnancy.

4.1.4 Information needs in the fourth month

Thirteen topics are identified in the fourth month of pregnancy. Table 7 presents the top ten topics in the fourth month of pregnancy. “The gender of baby” accounts for 18.98% of all topics. “Down’s syndrome,” “household affairs,” “nuchal translucency,” and “appeals and desire,” accounting for 9.18%, 7.50%, 7.42%, and 7.25%, respectively, were the second, the third, the fourth, and the fifth most frequent topics. The next five frequent topics are “abnormality in antenatal care,” “prenatal diet,” “fetal movement,” “question for help,” and “belly size and weight,” at 6.91%, 6.85%, 6.83%, 6.26%, and 6.25%, respectively.

TABLE 7. Topics in the fourth month of pregnancy.

4.1.5 Information needs in the fifth month

Eleven topics are extracted in the fifth month of pregnancy. Table 8 indicates the top ten topics in the fifth month of pregnancy. The top two frequent topics are “the gender of baby” and “Down’s syndrome,” at 12.73% and 11.97%. “Fetal movement,” “prenatal diet,” “pregnant women’s physical discomfort,” and “inspection of a large row of deformities” emerged to be the third, fourth, fifth, and sixth frequent topics, accounting for 9.93%, 9.26%, 9.11%, and 8.68%, respectively. The next four most frequent topics are “ponderal growth,” “experience sharing,” “household affairs,” and “appeals and desire,” accounting for 8.09%, 7.34%, 6.95%, and 6.26%, respectively.

TABLE 8. Topics in the fifth month of pregnancy.

4.1.6 Information needs in the sixth month

Sixteen topics are identified in the sixth month of pregnancy. Table 9 shows the top ten topics in the sixth month of pregnancy. The top two topics are “the gender of baby” and “four-dimensional ultrasound,” accounting for 18.65% and 17.38% of all topics. The following four topics, “pregnant women’s physical discomfort,” “prenatal diet,” “household affairs,” and “fetal movement,” comprise 7.53%, 6.33%, 6.03%, and 5.73%, respectively. “Appeals and desire,” “ponderal growth,” “glucose tolerance test,” and “sleep during pregnancy” accounted for 5.55%, 5.21%, 4.79%, and 4.04%, respectively.

TABLE 9. Topics in the sixth month of pregnancy.

4.1.7 Information needs in the seventh month

Thirteen topics are extracted in the seventh month of pregnancy. Table 10 shows the ten most frequent topics, with their rates and LDA keywords. The results show that “the gender of baby” and “pregnant women’s physical discomfort” were the first and second most frequent topics, accounting for 21.67% and 13.34% of all topics, respectively. The following four topics are “sleep during pregnancy,” “ponderal growth,” “glucose tolerance test,” and “prenatal diet,” accounting for 7.19%, 6.63%, 6.58%, and 6.49%, respectively. “Items for childbirth,” “household affairs,” “appeals and desire,” and “fetal movement” then comprised 6.36%, 6.09%, 5.82%, and 5.77%, respectively.

TABLE 10. Topics in the seventh month of pregnancy.

4.1.8 Information needs in the eighth month

Sixteen topics are identified in the eighth month of pregnancy. Table 11 presents the top ten most frequent topics. As shown in the results, the top two topics are “the gender of baby” and “emotion sharing”, accounting for 10.36% and 10.31%. The following four topics, “prenatal care,” “sleep during late pregnancy,” “prenatal diet,” and “appeals and desire,” account for 7.94%, 7.15%, 6.93%, and 6.76%, respectively. The next four topics are “ponderal growth,” “pregnant women’s physical discomfort,” “household affairs,” and “preparation for delivery,” at 6.47%, 6.43%, 6.26%, and 5.88%, respectively.

TABLE 11. Topics in the eighth month of pregnancy.

4.1.9 Information needs in the ninth month

Nineteen topics are extracted from the ninth month of pregnancy. Table 12 presents the top ten topics, rates, and LDA keywords. The results show that “prenatal care,” “the gender of baby,” and “emotion sharing” emerged to be the top three topics, accounting for 10.78%, 7.96%, and 7.13%, respectively. The next four most frequent topics are “items for childbirth,” “sleep during late pregnancy,” “fetal movement,” and “pregnant women’s physical discomfort” which comprised 6.27%, 6.16%, 6.10%, and 6.05%, respectively. “Prenatal diet,” “household affairs,” and “expected date of confinement” emerged to be the last three topics, at 5.72%, 5.40%, and 4.87%.

TABLE 12. Topics in the ninth month of pregnancy.

4.1.10 Information needs in the tenth month

Fourteen topics are identified in the tenth month of pregnancy. Table 13 presents the top ten topics in the tenth month. “Appeals and desire” was the most frequent topic, accounting for 21.17% of all topics. The following five topics, “delivery,” “expected date of confinement,” “pregnant women’s physical discomfort,” “full term,” and “prenatal care,” comprised 13.16%, 7.62%, 6.86%, 6.46%, and 6.19%, respectively. The next four topics are “sleep during late pregnancy,” “nutrition and weight during pregnancy,” “household affairs,” and “good things to recommend,” accounting for 5.96%, 5.64%, 5.28%, and 5.03%, respectively.

TABLE 13. Topics in the tenth month of pregnancy.

4.2 Summary of topic analysis about information needs

To show more vividly the main topics that pregnant women pay attention to during the whole pregnancy, we conducted a word cloud analysis on the LDA keywords of the topics during pregnancy. The results are presented in Figure 2. In a word cloud, word frequency is represented by font size. As shown in Figure 2, words such as “pregnancy,” “child,” and “fetus” are prominent, indicating that the topics of pregnancy are centered on pregnant women and babies. Secondly, words such as “hospital,” “normal,” “doctor,” and “healthy” are also clearly displayed, indicating that obstetric examination is an important topic that pregnant women continue to pay attention to during pregnancy, as it helps them keep abreast of their physical status and fetal growth. Then, words such as “pain,” “belly,” “good,” “eat,” “hungry,” “drink,” and “food” appeared frequently, reflecting pregnant women’s concerns about their physical condition and diet during pregnancy. Words such as “cheer,” “hope,” “happy,” “love,” “boy,” and “girl” reflect pregnant women’s good wishes for their babies and their curiosity about their babies’ gender.
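
The mapping from keyword frequency to font size that underlies a word cloud can be sketched as follows. The linear scaling, the size limits, and the function name `font_sizes` are assumptions for illustration; the paper does not describe the tool or the scaling that produced Figure 2.

```python
from collections import Counter


def font_sizes(keywords, min_size=10, max_size=60):
    """Map each keyword to a font size that grows linearly with its frequency."""
    counts = Counter(keywords)
    if not counts:
        return {}
    lowest, highest = min(counts.values()), max(counts.values())
    span = (highest - lowest) or 1  # avoid division by zero when all counts match
    return {
        word: min_size + (count - lowest) * (max_size - min_size) / span
        for word, count in counts.items()
    }
```

With this scaling the most frequent keyword is drawn at `max_size` and the least frequent at `min_size`, so the words that dominate the figure are those that recur most often among the LDA keywords.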

FIGURE 2. The results of word cloud analysis.

5 Sentiment analysis

Sentiment analysis is performed to further understand the changes in pregnant women’s information-seeking behavior during pregnancy. As discussed earlier, we use Python to call the third-party library SnowNLP to calculate the sentiment value of each post text, and the range of sentiment value results is [0, 1]. Among them, a sentiment with a value greater than 0.5 is positive, and a sentiment less than or equal to 0.5 is negative. The closer the value is to 1, the more positive the emotion; the closer the value is to 0, the more negative the emotion. Figure 3 presents the posts with a sentiment value greater than 0.5 in each pregnancy month.
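
The thresholding rule and the monthly share of positive posts can be sketched as below. The sketch assumes the sentiment values have already been computed for each post (SnowNLP exposes such a value through the `sentiments` attribute of a `SnowNLP` object); the function names and the input layout are hypothetical.

```python
from collections import defaultdict

POSITIVE_THRESHOLD = 0.5  # from the text: > 0.5 is positive, <= 0.5 is negative


def label(score: float) -> str:
    """Classify one sentiment value in [0, 1] using the threshold above."""
    return "positive" if score > POSITIVE_THRESHOLD else "negative"


def positive_share_by_month(scored_posts):
    """Share of positive posts per pregnancy month.

    scored_posts: iterable of (pregnancy_month, sentiment_value) pairs.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for month, score in scored_posts:
        totals[month] += 1
        if score > POSITIVE_THRESHOLD:
            positives[month] += 1
    return {month: positives[month] / totals[month] for month in sorted(totals)}
```

A post scoring exactly 0.5 is counted as negative, which follows the rule stated in the text.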

FIGURE 3. The results of sentiment analysis.

Combining the outcomes of the topic analysis and the sentiment analysis shows that in the first month of pregnancy the number of posts is relatively small, mainly focusing on topics such as “test strip” and “pregnancy tests consultation,” and the proportion of emotionally positive posts is also relatively low. On the one hand, many women have not yet found out that they are pregnant in the first month; on the other hand, pregnancy is often unstable in the first month and pregnant women may feel at a loss, so their emotions are relatively negative.

The number of posts is highest in the second month of pregnancy, but the proportion of posts with positive emotions is also relatively low. In the second month of pregnancy, most pregnant women have already guessed or confirmed the pregnancy, but women who are newly pregnant have little knowledge about pregnancy. Therefore, posts about “precautions for early pregnancy,” “early pregnancy inspection,” “symptoms of early pregnancy,” and other early pregnancy topics surged. However, due to the uncertainty of the baby’s status and pregnant women’s lack of relevant knowledge, the proportion of emotionally positive posts in the second month of pregnancy is relatively low. After the first 2 months of examinations and learning about pregnancy, pregnant women enter a relatively mature stage. At the same time, the status of the baby gradually stabilizes, so the number of posts dropped significantly from the second month of pregnancy to the third month and remained stable until the ninth month of pregnancy.

The proportion of emotionally positive posts from the third month to the ninth month of pregnancy is higher than that in other months, and there is an upward trend from the third month to the sixth month of pregnancy. The proportion is highest in the sixth month of pregnancy and then gradually decreases. After the third month of pregnancy, the baby’s state gradually stabilizes, the pregnant woman’s belly gradually grows, and she can even feel fetal movement, but there is generally no obvious physical discomfort, so pregnant women’s emotions are relatively more positive. From the seventh month of pregnancy, the baby’s weight increases, the belly grows larger, movement gradually becomes clumsy, and discomforts such as soreness and difficulty sleeping appear, so pregnant women show more negative emotions.

The number of posts surged again in the tenth month of pregnancy, second only to the second month of pregnancy, and the proportion of emotionally positive posts dropped sharply, remaining higher only than that in the first month of pregnancy. The tenth month of pregnancy is the month when the baby is about to be born. On the one hand, pregnant women’s body aches and sleep problems are more prominent. On the other hand, pregnant women face the uncertainty of childbirth, and fear and anxiety appear. The results of the topic analysis also show that in this month “appeals and desire” ranked first among the topics that pregnant women paid attention to, accounting for 21.17%. In addition, “expected date of confinement” and “pregnant women’s physical discomfort” are also main concerns for pregnant women.

6 Discussion and conclusion

6.1 Summary of findings

The purpose of the current study was to investigate pregnant women’s information-seeking behavior. Through a combination of descriptive analysis, topic analysis, and sentiment analysis, the current work expands our knowledge by providing important findings. The correlation analysis showed that provinces with larger populations contribute more posts. Moreover, post counts rise with the number of residents who have a college education or above, which suggests that pregnant women with a college education or above are more likely to seek information about pregnancy on internet platforms. More economically developed regions have higher Internet usage. Therefore, pregnant women in these regions are more likely to use Internet platforms to seek information.

Furthermore, the topics from the first month to the tenth month of pregnancy were extracted in topic analysis. The findings show that the topics in different months of pregnancy relate to the present stages of pregnancy. The current paper identified six, nineteen, eighteen, thirteen, eleven, sixteen, thirteen, sixteen, nineteen, and fourteen topics in different months of pregnancy. The specific topics in different stages show the changes in pregnant women’s attention.

In addition, the sentiment analysis showed how pregnant women’s emotions vary during information-seeking. The number of posts peaked in the second month of pregnancy. The proportion of emotionally positive posts reached its peak in the sixth month of pregnancy. Pregnant women’s sentiment is closely related to the results of the topic analysis.

6.2 Practical and theoretical implications

Our study has theoretical and practical significance. First, this is one of the first studies to understand pregnant women’s information-seeking using the methods of data mining and text analysis. Previous studies on the information needs of maternal health revealed the topics that pregnant women pay attention to; however, the existing work is limited to descriptive analysis and self-reported questionnaire data (Kamali et al., 2018; Ahmadian et al., 2020; Jin et al., 2020; Kassim, 2021). This study is distinctive in employing a very large quantity of data that covers a long period. By visualizing the posts of every province, the geographical distribution of pregnant women’s posts was clearly displayed. The current study enriches our understanding of the relationships among pregnant women’s information-seeking, regional economic development level, and educational level.

Second, this study provides comprehensive research involving several kinds of analysis. Compared with previous research (Kamali et al., 2018), the current work divides the data by month, from the first month of pregnancy to the tenth, and analyzes the large amount of data according to the pregnancy period. This study provides important insights for understanding the change of emotions during different stages of pregnancy and for connecting the changes of emotions with the topics that attract pregnant women’s attention. By subdividing the data into different stages of pregnancy, the current work offers a perspective for future research.

Third, the findings of this study have several practical implications. The findings indicate that pregnant women pay attention to different topics during various months of pregnancy. Maternal and infant-related websites should provide customized information recommendations for pregnant women according to their stages of pregnancy. For example, information such as precautions and inspections for early pregnancy should be recommended to pregnant women in the second month of pregnancy. Moreover, the proportion of emotionally positive posts reached its peak in the sixth month of pregnancy and is relatively low in the first and tenth months of pregnancy. The relevant government departments and hospitals should pay attention to anxiety during early pregnancy and before delivery. The popularization of knowledge about pregnancy and childbirth would be useful for improving pregnant women’s emotions.

6.3 Limitations and future research

The study is subject to several limitations. First, the data source of this study is “Mama.cn,” whose users are mainly located in China. Future work needs a cross-national study involving data from countries at different levels of development. The present study lays the groundwork for future research into pregnant women’s information-seeking behavior around the world. Future studies are encouraged to improve the generalizability of the current work by involving data from different countries and by examining the role of cultural identity in determining pregnant women’s information-seeking. Second, data such as personal attributes and specific family environments are not included in the paper, since such data cannot be obtained from the website. It would be interesting to investigate the effect of family-related variables on pregnant women’s sentiment in future work.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Author contributions

KH provided the conceptualization, data collection, initial analysis, review and editing. TH worked on the results, methodology, and writing. All authors contributed to this study and read and agreed to the submitted version of the manuscript.

Funding

This study was funded by the College Youth Innovation Talent Project of Guangdong Province, China (No. 2022WQNCX099), the Higher Education Research Project sponsored by Guangdong Higher Education Academy (No. 22GQN14), and the Teaching and Research Project of Guangzhou Xinhua University (No. 2022J036).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Ahmadian L., Khajouei R., Kamali S., Mirzaee M. (2020). Use of the Internet by pregnant women to seek information about pregnancy and childbirth. Inf. Health Soc. Care 45 (4), 385–395. doi:10.1080/17538157.2020.1769106

PubMed Abstract | CrossRef Full Text | Google Scholar

Al-Dahshan A., Chehab M., Mohamed A., Al-Kubaisi N., Selim N. (2021). Pattern of internet use for pregnancy-related information and its predictors among women visiting primary healthcare in Qatar: A cross-sectional study. BMC Pregnancy Childbirth 21, 747. doi:10.1186/s12884-021-04227-0

PubMed Abstract | CrossRef Full Text | Google Scholar

Blei D. M., Ng A. Y., Jordan M. I. (2003). Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022.

Google Scholar

Blei D. M. (2012). Probabilistic topic models. Commun. ACM 55 (4), 77–84. doi:10.1145/2133806.2133826

CrossRef Full Text | Google Scholar

Chinn M. D., Fairlie R. W. (2010). ICT use in the developing world: An analysis of differences in computer and internet penetration. Rev. Int. Econ. 18 (1), 153–167. doi:10.1111/j.1467-9396.2009.00861.x

CrossRef Full Text | Google Scholar

Favaretto M., Shaw D., De Clercq E., Joda T., Elger B. S. (2020). Big data and digitalization in dentistry: A systematic review of the ethical issues. Int. J. Environ. Res. Public Health 17 (7), 2495. doi:10.3390/ijerph17072495

PubMed Abstract | CrossRef Full Text | Google Scholar

He D., Yao Z., Zhao F., Feng J. (2020). How do weather factors drive online reviews? The mediating role of online reviewers’ affect. Industrial Manag. Data Syst. 120 (11), 2133–2149. doi:10.1108/imds-02-2020-0121

CrossRef Full Text | Google Scholar

Jin H., Wang H., Gong C., Liu L. (2020). A study on the influencing factors of consumer information-seeking behavior in the context of ambient intelligence. J. Ambient. Intell. Humaniz. Comput. 11 (4), 1397–1404. doi:10.1007/s12652-018-1005-y

CrossRef Full Text | Google Scholar

Kamali S., Ahmadian L., Khajouei R., Bahaadinbeigy K. (2018). Health information needs of pregnant women: Information sources, motives and barriers. Health info. Libr. J. 35 (1), 24–37. doi:10.1111/hir.12200

PubMed Abstract | CrossRef Full Text | Google Scholar

Kassim M. (2021). A qualitative study of the maternal health information-seeking behaviour of women of reproductive age in Mpwapwa district, Tanzania. Health info. Libr. J. 38 (3), 182–193. doi:10.1111/hir.12329

PubMed Abstract | CrossRef Full Text | Google Scholar

Korenčić D., Ristov S., Šnajder J. (2018). Document-based topic coherence measures for news media text. Expert Syst. Appl. 114, 357–373. doi:10.1016/j.eswa.2018.07.063

CrossRef Full Text | Google Scholar

Leon-Sanz P. (2019). Key points for an ethical evaluation of healthcare big data. Processes 7 (8), 493. doi:10.3390/pr7080493

CrossRef Full Text | Google Scholar

Li D., Rzepka R., Ptaszynski M., Araki K. (2020). Hemos: A novel deep learning-based fine-grained humor detecting method for sentiment analysis of social media. Inf. Process. Manag. 57 (6), 102290. doi:10.1016/j.ipm.2020.102290

CrossRef Full Text | Google Scholar

Panichella A. (2021). A Systematic Comparison of search-Based approaches for LDA hyperparameter tuning. Inf. Softw. Technol. 130, 106411. doi:10.1016/j.infsof.2020.106411

CrossRef Full Text | Google Scholar

Pian W., Song S., Zhang Y. (2020). Consumer health information needs: A systematic review of measures. Inf. Process. Manag. 57 (2), 102077. doi:10.1016/j.ipm.2019.102077

CrossRef Full Text | Google Scholar

Reifegerste D., Blech S., Dechant P. (2020). Understanding information-seeking about the health of others: Applying the comprehensive model of information-seeking to proxy online health information-seeking. J. Health Commun. 25 (2), 126–135. doi:10.1080/10810730.2020.1716280

PubMed Abstract | CrossRef Full Text | Google Scholar

Shah A. M., Yan X., Qayyum A., Naqvi R. A., Shah S. J. (2021). Mining topic and sentiment dynamics in physician rating websites during the early wave of the COVID-19 pandemic: Machine learning approach. Int. J. Med. Inf. 149, 104434. doi:10.1016/j.ijmedinf.2021.104434

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang L., Niu J., Yu S. (2019). SentiDiff: Combining textual information and sentiment diffusion patterns for Twitter sentiment analysis. IEEE Trans. Knowl. Data Eng. 32 (10), 2026–2039. doi:10.1109/tkde.2019.2913641

CrossRef Full Text | Google Scholar

Xu G., Meng Y., Qiu X., Yu Z., Wu X. (2019). Sentiment analysis of comment texts based on BiLSTM. IEEE Access 7, 51522–51532. doi:10.1109/access.2019.2909919

CrossRef Full Text | Google Scholar

Zhang C., Jiang J., Jin H., Chen T. (2021). The impact of COVID-19 on consumers’ psychological behavior based on data mining for online user comments in the catering industry in China. Int. J. Environ. Res. Public Health 18 (8), 4178. doi:10.3390/ijerph18084178

PubMed Abstract | CrossRef Full Text | Google Scholar

Keywords: pregnancy, health information, text analysis, topic analysis, sentiment analysis

Citation: Hou K and Hou T (2022) Investigating pregnant women’s health information needs during pregnancy on internet platforms. Front. Physiol. 13:1038048. doi: 10.3389/fphys.2022.1038048

Received: 06 September 2022; Accepted: 17 November 2022;
Published: 28 November 2022.

Edited by:

Frank Spradley, University of Mississippi Medical Center, United States

Reviewed by:

Paula Tavares, University of Coimbra, Portugal
Alyssa Cheadle, Hope College, United States

Copyright © 2022 Hou and Hou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Tingting Hou, aHR0OTRAZm94bWFpbC5jb20=
