Assessing the likelihood that a person engages in high-risk sexual behavior can help in delivering tailored educational interventions. This study aimed to identify the most effective machine-learning algorithm and to use it to assess high-risk sexual behavior within the past six months.
The survey, conducted at the Longhua District CDC in Shenzhen, included 2,023 participants employed at 16 factories. Data were collected by questionnaire between October and November 2019. We evaluated each model's overall classification performance using the area under the receiver operating characteristic (ROC) curve (AUC). All analyses were performed in Python 3.9.12.
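To make the evaluation metric concrete, the following is a minimal sketch of how an AUC can be computed, using its interpretation as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. It is not the study's code: the function name `roc_auc` and the four labels and scores are hypothetical, and the original analysis may well have relied on a library routine instead.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of positive/negative pairs that the scores rank correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    # A pair counts 1 if the positive case scores higher, 0.5 if the scores tie.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = reported risky behavior, scores = model probabilities.
labels = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
print(f"AUC = {roc_auc(labels, predicted):.2f}")
```

The pairwise count is quadratic in the number of cases, which is acceptable for a sketch; a rank-based formulation gives the same value more efficiently on larger samples.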
About a quarter of the factory workers had engaged in risky sexual behavior in the past six months. Most of them were Han Chinese (84.53%), held a hukou (household registration) from another province (85.12%) and from a rural area (83.19%), had a junior high school education (55.37%), earned a personal monthly income of RMB3,000 (US$417.54) to RMB4,999 (US$695.76; 64.71%), and were workers (80.67%). The random forest (RF) model outperformed all other models in assessing risky sexual behavior in the past six months and provided acceptable performance (accuracy 78%; sensitivity 11%; specificity 98%; positive predictive value [PPV] 63%; AUC 84%).
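The reported metrics all derive from a confusion matrix, and a short sketch may help show how they relate. The counts below are invented for illustration and are not the study's confusion matrix, which the abstract does not report; they were chosen only so that roughly a quarter of cases are positive. The function name `classification_metrics` is likewise hypothetical.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Standard metrics from confusion-matrix counts (assumes nonzero denominators)."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,    # all correct predictions
        "sensitivity": tp / (tp + fn),    # true positives among actual positives
        "specificity": tn / (tn + fp),    # true negatives among actual negatives
        "ppv": tp / (tp + fp),            # true positives among predicted positives
    }


# Hypothetical counts for 1,000 people, 230 of whom are actual positives.
metrics = classification_metrics(tp=25, fp=15, tn=755, fn=205)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With a positive class of about one quarter, a model that rarely predicts the positive class can still reach high accuracy and specificity while its sensitivity stays low, which is the pattern to keep in mind when reading these metrics together.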
Machine learning can aid in assessing risky sexual behavior within the past six months. Our assessment models could be integrated into the work of government or public health departments to guide sexual health promotion and follow-up services.