Background: Carotid plaques are a major risk factor for stroke, and carotid ultrasound can help assess the risk and incidence of stroke. However, large-scale carotid artery screening is time-consuming and laborious, and the diagnostic results inevitably depend to some extent on the subjective judgment of the diagnostician. Deep learning has the potential to address these challenges. We therefore sought to develop an automated deep learning algorithm that identifies the presence and stability of carotid plaques and provides a more consistent and objective diagnostic method.
Methods: A total of 3,860 ultrasound images from 1,339 participants who underwent carotid plaque assessment between January 2021 and March 2023 at the Shanghai Eighth People’s Hospital were split in a 4:1 ratio for training and internal testing. The external test set included 1,564 ultrasound images from 674 participants who underwent carotid plaque assessment between January 2022 and May 2023 at Xinhua Hospital affiliated with Dalian University. Deep learning algorithms based on the fusion of a bilinear convolutional neural network with a residual neural network (BCNN-ResNet) were used to detect carotid plaques and assess plaque stability. The area under the receiver operating characteristic curve (AUC) was the primary evaluation metric, with accuracy, sensitivity, and specificity as secondary metrics.
Results: Modeling for detecting carotid plaques involved training and internal testing on 1,291 ultrasound images, 617 with plaques and 674 without. The external test set comprised 470 ultrasound images, 321 with plaques and 149 without. Modeling for assessing plaque stability involved training and internal testing on 764 ultrasound images, 494 with unstable plaques and 270 with stable plaques. The external test set comprised 279 ultrasound images, 197 with unstable plaques and 82 with stable plaques. For identifying the presence of carotid plaques, our model achieved an AUC of 0.989 (95% CI: 0.840, 0.998) with a sensitivity of 93.2% and a specificity of 99.21% on the internal test. On the external test, the AUC was 0.951 (95% CI: 0.939, 0.962) with a sensitivity of 95.3% and a specificity of 82.24%. For identifying the stability of carotid plaques, our model achieved an AUC of 0.896 (95% CI: 0.865, 0.922) on the internal test with a sensitivity of 81.63% and a specificity of 87.27%. On the external test, the AUC was 0.854 (95% CI: 0.830, 0.889) with a sensitivity of 68.52% and a specificity of 89.49%.
Conclusion: Deep learning using BCNN-ResNet algorithms based on routine ultrasound images could be useful for detecting carotid plaques and assessing plaque instability.
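For readers less familiar with the evaluation indices named in the Methods, the following is a minimal sketch of how sensitivity, specificity, and AUC can be computed from binary labels and model scores. It is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the six-sample toy data are all assumptions made for illustration, and the AUC is computed with the pairwise (Mann-Whitney) formulation rather than any particular library routine.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, 1 = positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (pairwise Mann-Whitney form)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = plaque present.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

# Assumed decision threshold of 0.5 for turning scores into predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc(y_true, scores):.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason an abstract may report AUC as the main index and the other two as auxiliary.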
In the information age, evidence based on real-world data can help extrapolate and supplement data from randomized controlled trials, benefiting clinical trials and drug development and improving public health decision-making. However, the legitimate use of real-world data in China is limited by concerns over patient confidentiality. The use of personal information is a core element of data governance in public health. China’s public health data governance faces practical problems, such as resolving conflicts between personal information protection and public value. In 2021, China adopted the Personal Information Protection Law (PIPL) to provide a consistent legal framework for protecting personal information, including sensitive medical and health data. Although the PIPL offers critical legal safeguards for processing health data, specific issues need further clarification, including the meaning of “separate consent,” cross-border data transfer requirements, and exceptions for scientific research. A shift in the legal and regulatory framework is necessary to further advance public health research and to realize the potential benefits of combining real-world evidence and digital health while respecting privacy in an era of technological and demographic change.
Wikipedia is an open-source online encyclopedia and one of the most-read sources of online health information. Wikipedia page views have also been analyzed to inform public health services and policies. The present review analyzed 29 studies that used Wikipedia page views for health research. Most reviewed studies were published in recent years and originated from high-income countries. Alongside Wikipedia page views, most studies also used data from other internet sources, such as Google, Twitter, YouTube, and Reddit. The reviewed studies covered various non-communicable diseases, infectious diseases, and health interventions. They used page views to describe changes in the utilization of online health information from Wikipedia, to examine the effect of public events on public interest in and use of health-related Wikipedia pages, to estimate and predict the incidence and prevalence of diseases, to predict data from other internet data sources, to evaluate the effectiveness of health education activities, and to explore the evolution of a health topic. Given the limitations in replicating some of the reviewed studies, future research can specify the Wikipedia page or pages analyzed, the language of the pages examined, the dates of data collection, the dates explored, the type of data, whether page views were limited to Internet users, and whether web crawlers and redirects to the Wikipedia page were included. Future research can also explore public interest in other commonly read health topics on Wikipedia, develop Wikipedia-based models to predict disease incidence, and improve Wikipedia-based health education activities.