Hypothyroidism is easily misdiagnosed in dogs, and prediction models can support clinical decision-making and help avoid unnecessary testing and treatment. This study aimed to develop and internally validate diagnostic prediction models for hypothyroidism in dogs using machine-learning algorithms.
In this single-institution cross-sectional study, the electronic database of a veterinary teaching hospital was searched for dogs tested for hypothyroidism. Hypothyroidism was diagnosed based on suggestive clinical signs and thyroid function tests. Dogs were excluded if medical records were incomplete or a definitive diagnosis was lacking. Predictors identified after data processing were dermatological signs, alopecia, lethargy, hematocrit, and serum concentrations of cholesterol, creatinine, total thyroxine (tT4), and thyrotropin (cTSH). Four models were created by combining clinical signs with clinicopathological variables expressed as quantitative (models 1 and 2) or qualitative (models 3 and 4) variables. Models 2 and 4 included tT4 and cTSH; models 1 and 3 did not. Six different algorithms were applied to each model. Internal validation was performed using 10-fold cross-validation. Apparent performance was evaluated by calculating the area under the receiver operating characteristic curve (AUROC).
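As a rough illustration of the validation scheme described above, the following Python sketch computes an AUROC from predicted scores and splits a data set into 10 folds. It is not the study's implementation: the helper names `auroc` and `stratified_kfold`, the fixed seed, and the choice of a stratified split are assumptions made for illustration. Only the class sizes (82 hypothyroid, 233 euthyroid) come from the study.

```python
import random

def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_kfold(labels, k=10, seed=0):
    """Return k lists of row indices, each with a similar class balance.
    Stratification is an assumption here, not a detail reported by the study."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin across folds.
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

# Class sizes taken from the study: 82 hypothyroid (1), 233 euthyroid (0).
labels = [1] * 82 + [0] * 233
folds = stratified_kfold(labels, k=10)
```

In a full cross-validation loop, each fold would be held out once while a model is fitted on the other nine, and `auroc` would be applied to the held-out predictions.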
Eighty-two hypothyroid and 233 euthyroid client-owned dogs were included. The best-performing algorithms were naive Bayes in model 1 (AUROC = 0.85; 95% confidence interval [CI] = 0.83–0.86) and model 2 (AUROC = 0.98; 95% CI = 0.97–0.99), logistic regression in model 3 (AUROC = 0.88; 95% CI = 0.86–0.89), and random forest in model 4 (AUROC = 0.99; 95% CI = 0.98–0.99). Positive predictive value was 0.76, 0.84, 0.93, and 0.97 in models 1, 2, 3, and 4, respectively. Negative predictive value was 0.89, 0.89, 0.99, and 0.99 in models 1, 2, 3, and 4, respectively.
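Because positive and negative predictive values depend on how common the disease is in the tested population, the short Python sketch below shows how they follow from sensitivity, specificity, and prevalence via Bayes' theorem. The function name `predictive_values` and the sensitivity and specificity of 0.85 are hypothetical values chosen for illustration and are not results of the study; only the prevalence (82 of 315 dogs) is taken from the reported counts.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test with the given sensitivity and
    specificity applied to a population with the given prevalence."""
    tp = sensitivity * prevalence                # true-positive fraction
    fp = (1 - specificity) * (1 - prevalence)    # false-positive fraction
    fn = (1 - sensitivity) * prevalence          # false-negative fraction
    tn = specificity * (1 - prevalence)          # true-negative fraction
    return tp / (tp + fp), tn / (tn + fn)

# Prevalence in the study sample: 82 hypothyroid dogs out of 315 tested.
study_prevalence = 82 / (82 + 233)

# Hypothetical sensitivity and specificity, for illustration only.
ppv_study, npv_study = predictive_values(0.85, 0.85, study_prevalence)

# The same hypothetical test in a population with 5% prevalence.
ppv_low, npv_low = predictive_values(0.85, 0.85, 0.05)
```

With identical sensitivity and specificity, the positive predictive value is lower in the lower-prevalence population, which is one reason predictive values from a single referral population may not carry over to other settings.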
Machine learning-based prediction models accurately predicted and quantified the likelihood of hypothyroidism in dogs on internal validation at a single institution, but external validation is required to support their clinical applicability.