Background

AUTHOR=Sun Yuantong, Zheng Weiwei, Zhang Ling, Zhao Huijuan, Li Xun, Zhang Chao, Ma Wuren, Tian Dajun, Yu Kun-Hsing, Xiao Shuo, Jin Liping, Hua Jing

TITLE=Quantifying the Impacts of Pre- and Post-Conception TSH Levels on Birth Outcomes: An Examination of Different Machine Learning Models

JOURNAL=Frontiers in Endocrinology

VOLUME=12

YEAR=2021

URL=https://www.frontiersin.org/journals/endocrinology/articles/10.3389/fendo.2021.755364

DOI=10.3389/fendo.2021.755364

ISSN=1664-2392

ABSTRACT=<sec><title>Background</title><p>While previous studies identified risk factors for diverse pregnancy outcomes, traditional statistical methods had limited ability to quantify their impacts on birth outcomes precisely. We aimed to use a novel approach that applied different machine learning models not only to predict birth outcomes but also to systematically quantify the impacts of pre- and post-conception serum thyroid-stimulating hormone (TSH) levels and other predictive characteristics on birth outcomes.</p></sec><sec><title>Methods</title><p>We used data from women who gave birth in Shanghai First Maternal and Infant Hospital from 2014 to 2015. We included 14,110 women with a preconception TSH measurement in the first analysis and, of these, 3,428 women with both pre- and post-conception TSH measurements in the second analysis. The Synthetic Minority Over-sampling Technique (SMOTE) was applied to adjust for the imbalance of outcomes. We randomly split the data (7:3) into a training set and a test set in both analyses. To assess model performance, we compared the Area Under the Curve (AUC) for dichotomous outcomes and the macro F1 score for categorical outcomes among four machine learning models: a logistic model, a random forest model, an XGBoost model, and a multilayer neural network model. The model with the highest AUC or macro F1 score was used to quantify the importance of predictive features for adverse birth outcomes with the loss function algorithm.</p></sec><sec><title>Results</title><p>The XGBoost model provided prominent advantages in terms of improved performance and prediction of polytomous variables. Predictive models with abnormal preconception TSH or not-well-controlled TSH, a novel indicator combining pre- and post-conception TSH levels, provided similarly robust predictions of birth outcomes. The highest AUC, 98.7%, was achieved by the XGBoost model in predicting low Apgar score with not-well-controlled TSH adjusted. Using the loss function algorithm, we found that not-well-controlled TSH ranked 4<sup>th</sup>, 6<sup>th</sup>, and 7<sup>th</sup> among 14 features in predicting birthweight, induction, and preterm birth, respectively, and 3<sup>rd</sup> among 19 features in predicting low Apgar score.</p></sec><sec><title>Conclusions</title><p>Our four machine learning models offered valid predictions of birth outcomes for women in the pre- and post-conception periods. The predictive feature panel suggested that the combined TSH indicator (not-well-controlled TSH) could be a potentially competitive biomarker for predicting adverse birth outcomes.</p></sec>
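
The evaluation steps named in the Methods (a 7:3 random split, SMOTE oversampling, AUC for dichotomous outcomes, and macro F1 for categorical outcomes) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names (`split_70_30`, `smote`, `auc`, `macro_f1`), the seeds, and the default `k=5` are assumptions made for illustration, and no model is fitted here.

```python
import random


def split_70_30(rows, seed=0):
    """Shuffle a copy of rows and split it 7:3 into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (the SMOTE idea).
    Assumes at least two minority samples, each a tuple of numbers."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half. Labels must be 0 or 1."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        f1s.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(f1s) / len(f1s)
```

In a comparison like the one described, each candidate model would be scored on the held-out 30% with `auc` (dichotomous outcomes) or `macro_f1` (categorical outcomes), and the highest-scoring model kept. Macro averaging weights every class equally, which is why it suits imbalanced categorical outcomes.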