Machine learning-based risk prediction model for canine myxomatous mitral valve disease using electronic health record data

Kim, Yunji; Kim, Jaejin; Kim, Sehoon; Youn, Hwayoung; Choi, Jihye; Seo, Kyoungwon

doi:10.3389/fvets.2023.1189157

ORIGINAL RESEARCH article

Front. Vet. Sci., 31 August 2023

Sec. Comparative and Clinical Medicine

Volume 10 - 2023 | https://doi.org/10.3389/fvets.2023.1189157

This article is part of the Research Topic "Vetinformatics: An Insight for Decoding Livestock Systems Through In Silico Biology Volume II"

Kyoungwon Seo¹*

¹Department of Veterinary Internal Medicine, College of Veterinary Medicine, Seoul National University, Seoul, Republic of Korea
²School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
³Department of Veterinary Medical Imaging, College of Veterinary Medicine, Seoul National University, Seoul, Republic of Korea

Introduction: Myxomatous mitral valve disease (MMVD) is the most common cause of heart failure in dogs, and assessing the risk of heart failure in dogs with MMVD is often challenging. Machine learning applied to electronic health records (EHRs) is an effective tool for predicting prognosis in the medical field. This study aimed to develop machine learning-based heart failure risk prediction models for dogs with MMVD using a dataset of EHRs.

Methods: A total of 143 dogs with MMVD examined between May 2018 and May 2022 were included. Complete medical records were reviewed for all patients. Demographic data, radiographic measurements, echocardiographic values, and laboratory results were obtained from the clinical database. Four machine-learning algorithms (random forest, K-nearest neighbors, naïve Bayes, support vector machine) were used to develop risk prediction models. Model performance was represented by plotting the receiver operating characteristic (ROC) curve and calculating the area under the curve (AUC). The best-performing model was chosen for the feature-ranking process.

Results: The random forest model showed superior performance to the other models (AUC = 0.88), while the K-nearest neighbors model showed the lowest performance (AUC = 0.69). The top three models showed excellent performance (AUC ≥ 0.8). According to the random forest algorithm’s feature ranking, echocardiographic and radiographic variables had the highest predictive values for heart failure, followed by packed cell volume (PCV) and respiratory rates. Among the electrolyte variables, chloride had the highest predictive value for heart failure.

Discussion: These machine-learning models may support clinicians' decision-making when estimating the prognosis of patients with MMVD.

1. Introduction

Myxomatous mitral valve disease (MMVD) is the most common cardiovascular disease in dogs, accounting for approximately 75% of all canine heart diseases. Moreover, MMVD is the most common cause of heart failure (HF) in small-breed dogs (1–3). The progression of MMVD depends on several risk factors; therefore, the prognosis differs significantly between patients (4). Although many patients have preclinical MMVD, only a few develop congestive HF, while others develop various forms of HF or even die of cardiac causes. Furthermore, cardiogenic pulmonary edema due to HF is one of the leading causes of cardiac death in dogs (1).

Given the importance of the disease, predicting HF in patients with MMVD has become a priority. However, this requires a comprehensive interpretation of various clinical data, which is often too complex to perform in a timely manner. Several studies have been conducted to identify key risk factors contributing to the development of HF in dogs with MMVD; however, there are still limitations in predicting the risk of HF (2). Therefore, new methods to support the assessment and prediction of the risk of HF in patients with MMVD are warranted.

In human medicine, many studies have attempted to use advanced technologies to assess and predict the risk and onset of HF and prognosis in cardiovascular diseases (5–8). Electronic health records (EHRs) are considered useful data sources to reveal correlations with clinical data (9). Machine learning, a branch of artificial intelligence applied to medical records, is an effective tool for prognostic prediction and medical decision-making.

In light of advances in machine-learning technologies, machine learning-based models have outperformed conventional risk prediction models owing to their capability to process large volumes and various data types (9, 10). Several recent medical studies have attempted to develop machine learning-based models and have improved the performance of classifiers from simple infectious diseases to complex heterogeneous diseases by using various machine-learning algorithms in human and veterinary medicine (11–15).

This study aimed to develop machine learning-based risk prediction models for HF in dogs with MMVD using a dataset of EHRs. Additionally, four machine-learning algorithms were used, and significant HF predictive markers were identified through feature ranking of the best-performing algorithm.

2. Materials and methods

An illustrative scheme for conducting machine learning-based modeling of HF prediction and feature ranking is shown in Figure 1.

Figure 1. Scheme for risk prediction model and feature ranking.

2.1. Case selection and data collection

This retrospective study was conducted at the Seoul National University Veterinary Medical Teaching Hospital. The EHRs of 396 cases with MMVD were collected between May 2018 and May 2022. All dogs underwent a physical examination, thoracic radiography, echocardiography, and blood analysis. Complete medical records were manually reviewed for all dogs. Demographic data (breed, sex, neuter status, age, and body condition score), radiographic measurements, echocardiographic values, and laboratory results were extracted from the clinical database.

Thoracic radiographic values indicating cardiac remodeling, such as the vertebral heart scale and vertebral left atrial score, were collected. Based on previous echocardiographic measurement studies on the severity and prognosis of MMVD in dogs, five echocardiographic variables were selected for machine learning modeling: the left atrium-to-aorta ratio (LA/Ao ratio), left ventricular end-diastolic diameter normalized for body weight (LVIDDn), left ventricular fractional shortening, E-wave transmitral peak velocity (E-vel), and ejection fraction (2, 16, 17).

As suggested by a previous study indicating stable performance with a sample size of around 120, stringent selection criteria were adopted during the data selection process (18). The inclusion criteria were a confirmatory diagnosis of MMVD through radiographic and echocardiographic imaging, with every selected feature available within 1 month of admission or at the time of the event. Consequently, out of the 396 initial cases, 165 cases were excluded because of insufficient data in either echocardiography or blood test records. Additionally, 31 cases were removed due to the absence of follow-up data, and 57 cases were excluded because other cardiac diseases coexisted with MMVD or because the cause of heart failure was not MMVD. This rigorous selection process resulted in 143 complete cases being included in the final dataset, supporting the reliability of our machine learning analysis.
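The case counts reported above can be checked with a few lines of arithmetic. The sketch below only replays the exclusions stated in the text; the dictionary keys are illustrative labels written for this example, not field names from the study database.

```python
# Replay of the exclusion steps reported in the text (counts are from the article).
initial_cases = 396
exclusions = {
    "insufficient echocardiography or blood test data": 165,
    "no follow-up data": 31,
    "other cardiac disease or non-MMVD cause of heart failure": 57,
}

remaining = initial_cases
for reason, n_excluded in exclusions.items():
    remaining -= n_excluded
    print(f"after excluding {n_excluded} ({reason}): {remaining} cases")

final_cases = remaining
```

The running total ends at the 143 cases used for modeling.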

The patients were divided into MMVD with HF (HF group) and MMVD without HF (non-HF group). Non-HF was defined as the absence of an HF event for more than one month. MMVD-related HF was confirmed by cardiogenic pulmonary edema on thoracic radiographs in relation to patient history and clinical assessment. Moreover, patients with MMVD who had previous HF episodes or cardiac death were included in the HF group.

Cases with complete demographic data, physical examination data, laboratory results, and clinical imaging data were used for further analysis. Features with categorical data were assessed using counts and corresponding percentages, and continuous numerical data were summarized using the mean value.

2.2. Machine-learning model development

The data values were first transformed into appropriate data types for machine learning. Continuous variables were normalized using min-max normalization, while binary categorization was applied to neuter status, sex, anemia, and tachypnea. HF events were the target of this binary prediction model. Four frequently used machine-learning algorithms were trained to develop an optimized prediction model: random forest (RF), K-nearest neighbors (KNN), naïve Bayes (NB), and support vector machine (SVM).
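As a minimal sketch of the preprocessing described above, min-max normalization rescales each continuous variable to the range 0 to 1, and binary variables are mapped to 0 or 1. The function names and example values are hypothetical and are not taken from the study's R code.

```python
def min_max_normalize(values):
    """Rescale a list of numbers to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def encode_binary(values, positive_label):
    """Map a two-level categorical column to 1 (positive_label) or 0."""
    return [1 if v == positive_label else 0 for v in values]


# Hypothetical example values, not data from the study.
la_ao_ratio = [1.2, 1.6, 2.0, 2.8]
sex = ["male", "female", "female", "male"]

la_ao_scaled = min_max_normalize(la_ao_ratio)
sex_encoded = encode_binary(sex, positive_label="male")
```

The article does not state whether the minimum and maximum were computed on the full dataset or on the training split only; the sketch makes no assumption either way.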

RF is an ensemble learning method based on multiple decision trees. The decision trees are built from random subsets of the variable set, and the prediction is made by majority vote across all decision trees (19). KNN is an instance-based model that classifies a new point according to the classes of its K nearest neighbors (20). NB is a probabilistic classifier based on the assumption of independence between the variables of the problem; it assigns an unclassified sample to its most probable class (21). SVM is a high-performance model for non-linear problems that discriminates between two classes by generating a separating hyperplane and is relatively insensitive to outliers (22).
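To make the instance-based idea concrete, the following is a small sketch of K-nearest neighbors classification by majority vote, assuming Euclidean distance on already normalized features. The study itself used R packages; the function name, the value of k, and the toy data below are illustrative assumptions.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = [
        (math.dist(features, query), label)
        for features, label in zip(train_X, train_y)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical, already min-max-scaled feature pairs; 1 = HF, 0 = non-HF.
train_X = [(0.1, 0.2), (0.2, 0.1), (0.8, 0.9), (0.9, 0.8), (0.7, 0.7)]
train_y = [0, 0, 1, 1, 1]

prediction_low = knn_predict(train_X, train_y, query=(0.15, 0.15), k=3)
prediction_high = knn_predict(train_X, train_y, query=(0.85, 0.85), k=3)
```

Because the vote depends on distances, KNN is sensitive to feature scale, which is one reason normalization is applied before training.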

The model training process was repeated 100 times to yield an average confusion matrix to tune the hyperparameters. The database was split randomly into a training and testing set for each of the 100 executions to prevent model overfitting. In cases of hyperparameter optimization, the database was split into training, testing, and validation sets wherein the prediction results were measured using confusion matrix rates such as sensitivity, specificity, and accuracy. In addition, model performance was represented by plotting the receiver operating characteristic (ROC) curve and calculating the area under the curve (AUC) (23). The best-performing model was defined by having the highest AUC and was chosen for the feature-ranking process.
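The evaluation measures named above can be sketched as follows. Sensitivity, specificity, and accuracy come from the confusion matrix, and the AUC is computed here through its rank interpretation: the probability that a randomly chosen HF case receives a higher score than a randomly chosen non-HF case. The 0.5 decision threshold and the example scores are assumptions for illustration, not values from the study.

```python
def confusion_rates(y_true, y_pred):
    """Return sensitivity, specificity, and accuracy for binary labels (1 = HF)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical test-set labels and predicted HF probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

rates = confusion_rates(y_true, y_pred)
auc_value = auc(y_true, scores)
```

Unlike the three confusion-matrix rates, the AUC does not depend on the chosen threshold, which makes it a convenient single criterion for comparing models.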

2.3. Feature ranking

The top-performing model, RF, was used for the feature-ranking process, wherein the Gini impurity-reduction feature-ranking technique was applied (24). Using a dataset, RF constructs multiple random decision trees and chooses its final output through a majority vote across all trees. Feature ranking is based on the mean Gini decrease, which measures how much the Gini impurity of the tree nodes decreases, averaged over all trees, when a specific variable is used for splitting. The algorithm then compares this value with the values obtained for all other features and ranks the features according to their significance.
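The quantity behind this ranking can be illustrated for a single split. The sketch below computes the Gini impurity of a node and the decrease obtained by splitting it on one variable; in a random forest, such decreases are accumulated over every node that splits on the variable and averaged over the trees. The function names, threshold, and toy values are hypothetical.

```python
def gini_impurity(labels):
    """Gini impurity of a list of binary labels: 1 - sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - (p1 ** 2 + (1.0 - p1) ** 2)


def gini_decrease(values, labels, threshold):
    """Impurity decrease from splitting one node at `values <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted_child = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(labels) - weighted_child


# Hypothetical node with 4 non-HF (0) and 4 HF (1) dogs.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
informative = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]    # separates the classes
uninformative = [0.1, 0.6, 0.2, 0.7, 0.3, 0.8, 0.4, 0.9]  # does not

decrease_informative = gini_decrease(informative, labels, threshold=0.5)
decrease_uninformative = gini_decrease(uninformative, labels, threshold=0.5)
```

A variable that separates the two groups cleanly removes all impurity from the node, whereas a variable unrelated to the outcome removes none, so the former ranks higher.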

2.4. Statistical analysis

Statistical analyses and machine learning were conducted using R version 4.2.0 (R software, R Core Team, Vienna, Austria) and various R packages (class, clusterSim, dplyr, e1071, formula.tools, gmodels, kernlab, pastecs, PRROC, randomForest, ROCR, ROSE, rpart), while Prism 9 (GraphPad Software, San Diego, CA) was used to create graphs. All code and data used for this analysis are available upon request.

3. Results

3.1. Patient characteristics

Among 143 patients, 90 (63%) were labeled as MMVD with HF and 53 (37%) as MMVD without HF. According to the American College of Veterinary Internal Medicine classification, the non-HF and HF groups were classified as stages B and C/D, respectively.

Although 25 variables were collected from the EHRs, only 24 were included in the dataset since the breed variable was removed due to dataset encoding complexity. Of the dogs in the HF group, 48 (53.33%) were male and 42 (46.67%) were female, with a mean age of 11.78 years. Of the patients in the non-HF group, 30 (56.6%) were male and 23 (43.4%) were female, with a mean age of 11.45 years. Several breeds of dogs were included in this study. For the HF group, 15 breeds were recorded, with Maltese being the most frequently observed (50%), followed by Shih Tzu (11%) and Pomeranian (10%); 26 dogs were of 12 other breeds. Thirteen different breeds were included in the non-HF group, with Maltese (24%) being the most observed, followed by Poodle (7%), Shih Tzu (4%), and Chihuahua (4%). Anemia and tachypnea were less frequent in the non-HF group than in the HF group. The demographic and binary variables for each group are summarized in Table 1.

Table 1. Summary of patient demographics and binary features of dogs in the dataset.

The mean thoracic radiographic and three echocardiographic values (LA/Ao ratio, LVIDDn, and E-vel) were higher in the HF group. In addition, the mean of each electrolyte feature was similar between groups, except for chloride. The quantitative characteristics of the datasets are presented in Table 2.

Table 2. Statistical quantitative description of the numeric features of dogs in the dataset.

3.2. Machine-learning performance evaluation

For algorithms that required hyperparameter optimization, such as KNN and SVM, the dataset was randomly split into 60, 20, and 20% for the training, validation, and test sets, respectively. In contrast, for the other algorithms, RF and NB, the dataset was split into 80% for the training set and 20% for the test set. The model prediction results were reported as accuracy, sensitivity, and specificity. ROC plots were generated and AUCs were calculated to estimate model discrimination. The mean result scores of the four methods are presented in Table 3. Figure 2 displays the ROC curves for the four machine-learning models. Of the four methods, RF showed superior performance to the other models in terms of accuracy (0.78), sensitivity (0.85), and AUC (0.88). NB showed the highest specificity (0.87) compared to RF (0.68). The top three models indicated very good performance (AUC ≥ 0.8) in the dataset, while KNN showed the lowest AUC (0.69).
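The splitting scheme can be sketched as below, assuming a simple shuffle-and-cut over case indices. The rounding rule, the seed, and the helper name are assumptions made for this example; the article does not report the exact number of cases in each partition.

```python
import random


def split_indices(n, fractions, rng):
    """Shuffle range(n) and cut it into consecutive parts sized by `fractions`."""
    indices = list(range(n))
    rng.shuffle(indices)
    parts, start = [], 0
    for i, frac in enumerate(fractions):
        # The last part takes whatever remains so every case is used exactly once.
        end = n if i == len(fractions) - 1 else start + round(n * frac)
        parts.append(indices[start:end])
        start = end
    return parts


n_cases = 143           # final dataset size reported in the article
rng = random.Random(0)  # arbitrary seed, chosen here only for repeatability

# 60/20/20 split for algorithms tuned on a validation set (KNN, SVM).
train, valid, test = split_indices(n_cases, [0.6, 0.2, 0.2], rng)

# 80/20 split for algorithms without a validation set (RF, NB).
train_rf, test_rf = split_indices(n_cases, [0.8, 0.2], rng)

# Repeating the split yields a fresh partition each time, as in the 100 executions.
test_sets = [split_indices(n_cases, [0.8, 0.2], rng)[1] for _ in range(100)]
```

Averaging the metrics over many such random partitions reduces the dependence of the reported scores on any single split, which matters with a dataset of this size.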

Table 3. Performance of ML models predicting heart failure risk of MMVD dogs – mean of 100 executions.

Figure 2. ROC curves for the risk prediction models for MMVD dogs. (A) Random forest (AUC 0.887). (B) Support vector machine (AUC 0.870). (C) Naïve Bayes (AUC 0.801). (D) K-nearest neighbors (AUC 0.698). AUC, area under the curve; MMVD, myxomatous mitral valve disease; ROC, receiver operating characteristic.

3.3. Feature selection results

The RF algorithm, which was the top-performing model, was used for feature-ranking analysis. The mean Gini decrease value was used to rank the variables, which were listed in order of importance. The most influential predictor was LVIDDn, followed by LA/Ao ratio and E-vel. Furthermore, both thoracic radiographic values were highly predictive of HF. Among the electrolyte features, chloride had the highest predictive value for HF. Respiratory rate and packed cell volume (PCV) ranked higher than their binary counterparts, tachypnea and anemia. Neuter status and sex were last in the feature ranking. Figure 3 shows the feature ranking selected by RF.

Figure 3. Random-forest feature-ranking selection through mean Gini decrease value.

4. Discussion

In this study, four supervised machine-learning models were developed to predict the risk of HF in dogs with MMVD using EHRs. Several machine-learning algorithms can be used to analyze diseases, each with advantages and disadvantages. This study considered algorithms most frequently used to achieve the best performance. Other studies also showed that RF had superior performance compared to other methods (25–27). The feature-ranking method used RF because of its high evaluation results and efficiency for identifying novel risk predictors and complex interplay between variables (28–30).

The input data quality can strongly influence the performance of a machine-learning model (10). Overall, all classifiers performed well in this study. The RF model outperformed all other methods with the highest AUC; the KNN model showed lower performance but was still acceptable. This finding indicates that the dataset is suitable for developing a risk assessment model for canine MMVD, thus suggesting that machine learning is effective in assessing the preclinical risk of MMVD patients developing HF and death due to underlying cardiac disease.

The degree of disease progression often differs between patients because of heterogeneous features and MMVD manifestations. Moreover, the high burden of comorbidities and unpredictable, complex interactions make it challenging to assess patients and establish treatment strategies. Improved identification of patients with HF could provide opportunities to detect patients early on and provide appropriate monitoring strategies. Therefore, classifying whether a patient is at risk of HF can itself help significantly in the management of patients.

According to the demographic characteristics, the disease pattern is similar to that of other studies, demonstrating a higher percentage of male dogs and smaller (<20 kg) breeds (3). In this study, the number of male dogs was slightly higher than that of female dogs in both groups. Age is also considered a contributing factor to disease development in dogs. Several studies have shown a high prevalence of MMVD associated with aging (2, 4). Similarly, in this study, both groups had a mean age of approximately 11 years, indicating they belonged to a senior population. However, although the study population was generally older, other factors had more predictive power than age in this model. Echocardiographic and radiographic variables had the highest predictive value for HF. The features with the most significant influence on the model were LVIDDn, LA/Ao ratio, and E-vel, which are echocardiographic features that have been proposed as contributing factors to the severity of MMVD (16, 31). These echocardiographic features ranked higher than the thoracic radiographic measurements, which means that the algorithm relied on them more when assessing HF risk.

Numeric features (PCV and respiratory rates) showed a higher mean Gini decrease value than binary features (anemia and tachypnea). The criteria for a binary diagnosis of anemia or tachypnea focus on normal patients. Therefore, the risk level can generally be judged only by existing criteria. However, there may be considerable risk in the case of a patient who is more sensitive to a specific stress, even if it falls within the normal range according to existing criteria. Therefore, it is necessary to reconsider the diagnostic criteria for anemia and tachypnea in patients with MMVD. Considering the multifactorial nature of the disease, this implies that careful assessment of the numeric value is required when the clinical examination results are on the borderline of the normal reference range rather than whether the patients have anemia and tachypnea.

Among the electrolyte variables, chloride showed the highest mean Gini decrease value. Electrolyte abnormalities, including hyponatremia and hypochloremia, can be observed in patients with congestive HF. A recent retrospective study reported that hypochloremia was considered a negative prognostic marker in dogs with HF (32). Concurrent renal impairment and HF, defined as cardiorenal syndrome, has a negative prognostic impact on patients (33, 34). Similarly, serum creatinine had a moderate predictive value in this study. Since dog sizes vary depending on breed and weight, this study used body condition scores to reflect the overall body condition of individual dogs; however, these scores had a relatively lower predictive value in the feature-ranking results.

This study has some limitations. First, due to the imbalance of the dataset, most models obtained better performance in terms of sensitivity than specificity. These results occurred since algorithms were exposed to more HF patient components than those of non-HF patients during training; hence, they are more equipped to recognize HF patient profiles during testing. Second, treatment was not considered, which may have resulted in underestimating certain clinical values, thereby influencing the results. Further investigation on additional external datasets with the same variables from a different cohort of dogs would improve prediction performance.

The most common cardiovascular disease among small-breed dogs is MMVD, and small-breed dogs tend to be prone to developing MMVD (1). Given that most breeds presented in this dataset are small-breed patients (weight ≤ 15 kg), our results are consistent with the prevalence of MMVD documented in previous studies (1, 4). Therefore, the model outputs are optimized for small-breed dogs rather than large-breed dogs. Moreover, the prevalence of heart disease is different in large-breed dogs (e.g., canine dilated cardiomyopathy). Further access to different heart diseases in large-breed patients would enable the development of other risk stratification models.

Patient EHRs are stored with each admission, and diagnostic variables may change over time. Since time-varying values have more evidence for the determination of HF, advanced machine-learning techniques such as recurrent neural networks may show better predictive power by reflecting all the periodic chart data (35). However, training a recurrent neural network with a complex dataset is very difficult and requires significant computing power for each patient, making it more difficult for use in real clinical practice. Even though our RF model was trained with only baseline data regardless of previous history records, it shows very good performance (AUC = 0.88). Our model requires much less computing power and is relatively easy to train for predicting patients at high risk of HF. Therefore, this platform can be further applied to predict significant HF events during disease management.

The data used in this study were extracted from a referral animal hospital, which means that patients may have more complicated clinicopathologic characteristics and different tendencies than general MMVD patients; therefore, known prognostic factors may not apply well. However, machine-learning algorithms consider correlations among large numbers of variables, and this process contributes to the increased ranking of features that are undervalued in general MMVD patients. This can assist the drive toward personalized risk management and provide insight into the veterinarian’s decision-making for patients with MMVD.

In summary, this study obtained significant results in predicting HF events in patients with MMVD. Although further testing and validation are needed for incorporation into clinical practice, this study highlights the potential of machine learning in heart disease management and can encourage future approaches to apply machine learning in veterinary medicine. Simultaneously, a more comprehensive application of machine learning may improve diagnosis and treatment decisions for diseases and risk prediction.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Author contributions

YK, JK, and KS contributed to study design, data analysis, and interpretation. YK and JK wrote and edited code for machine learning. SK contributed to data interpretation, manuscript review, and editing. KS, HY, and JC contributed to manuscript review and editing. YK wrote the original draft. All authors contributed to the article and approved the submitted version.

Acknowledgments

This work was supported by the New Faculty Startup Fund from Seoul National University.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Nelson, RW, and Couto, CG. Small Animal Internal Medicine. 6th ed. Philadelphia: Elsevier Health Sciences (2019).

2. Borgarelli, M, Ferasin, L, Lamb, K, Chiavegato, D, Bussadori, C, D’Agnolo, G, et al. The predictive value of clinical, radiographic, echocardiographic variables and cardiac biomarkers for assessing risk of the onset of heart failure or cardiac death in dogs with preclinical myxomatous mitral valve disease enrolled in the DELAY study. J Vet Cardiol. (2021) 36:77–88. doi: 10.1016/j.jvc.2021.04.009

3. Keene, BW, Atkins, CE, Bonagura, JD, Fox, PR, Häggström, J, Fuentes, VL, et al. ACVIM consensus guidelines for the diagnosis and treatment of myxomatous mitral valve disease in dogs. J Vet Intern Med. (2019) 33:1127–40. doi: 10.1111/jvim.15488

4. Kim, H-T, Han, S-M, Song, W-J, Kim, B, Choi, M, Yoon, J, et al. Retrospective study of degenerative mitral valve disease in small-breed dogs: survival and prognostic variables. J Vet Sci. (2017) 18:369–76. doi: 10.4142/jvs.2017.18.3.369

5. Heo, J, Yoon, JG, Park, H, Kim, YD, Nam, HS, and Heo, JH. Machine learning-based model for prediction of outcomes in acute stroke. Stroke. (2019) 50:1263–5. doi: 10.1161/STROKEAHA.118.024293

6. Weng, SF, Reps, J, Kai, J, Garibaldi, JM, and Qureshi, N. Can machine-learning improve cardiovascular risk prediction using routine clinical data? PLoS One. (2017) 12:e0174944. doi: 10.1371/journal.pone.0174944

7. Lorenzoni, G, Sabato, SS, Lanera, C, Bottigliengo, D, Minot, C, Ocagli, H, et al. Comparison of machine learning techniques for prediction of hospitalization in heart failure patients. J Clin Med. (2019) 8:1298. doi: 10.3390/jcm8091298

8. Sax, DR, Mark, DG, Huang, J, Sofrygin, O, Rana, JS, Collins, SP, et al. Use of machine learning to develop a risk-stratification tool for emergency department patients with acute heart failure. Ann Emerg Med. (2021) 77:237–48. doi: 10.1016/j.annemergmed.2020.09.436

9. Desai, RJ, Wang, SV, Vaduganathan, M, Evers, T, and Schneeweiss, S. Comparison of machine learning methods with traditional models for use of administrative claims with electronic medical records to predict heart failure outcomes. JAMA Netw Open. (2020) 3:e1918962. doi: 10.1001/jamanetworkopen.2019.18962

10. Basran, PS, and Appleby, RB. The unmet potential of artificial intelligence in veterinary medicine. Am J Vet Res. (2022) 83:385–92. doi: 10.2460/ajvr.22.03.0038

11. Ferreira, TS, Santana, EEC, Jacob Junior, AFL, Silva Junior, PF, Bastos, LS, Silva, ALA, et al. Diagnostic classification of cases of canine leishmaniasis using machine learning. Sensors (Basel). (2022) 22:3128. doi: 10.3390/s22093128

12. Reagan, KL, Deng, S, Sheng, J, Sebastian, J, Wang, Z, Huebner, SN, et al. Use of machine-learning algorithms to aid in the early detection of leptospirosis in dogs. J Vet Diagn Investig. (2022) 34:612–21. doi: 10.1177/10406387221096781

13. Schofield, I, Brodbelt, DC, Kennedy, N, Niessen, SJM, Church, DB, Geddes, RF, et al. Machine-learning based prediction of Cushing's syndrome in dogs attending UK primary-care veterinary practice. Sci Rep. (2021) 11:9035. doi: 10.1038/s41598-021-88440-z

14. Bradley, R, Tagkopoulos, I, Kim, M, Kokkinos, Y, Panagiotakos, T, Kennedy, J, et al. Predicting early risk of chronic kidney disease in cats using routine clinical laboratory tests and machine learning. J Vet Intern Med. (2019) 33:2644–56. doi: 10.1111/jvim.15623

15. Renard, J, Faucher, MR, Combes, A, Concordet, D, and Reynolds, BS. Machine-learning algorithm as a prognostic tool in non-obstructive acute-on-chronic kidney disease in the cat. J Feline Med Surg. (2021) 23:1140–8. doi: 10.1177/1098612X211001273

16. Vezzosi, T, Grosso, G, Tognetti, R, Meucci, V, Patata, V, Marchesotti, F, et al. The mitral INsufficiency echocardiographic score: a severity classification of myxomatous mitral valve disease in dogs. J Vet Intern Med. (2021) 35:1238–44. doi: 10.1111/jvim.16131

17. Borgarelli, M, Crosara, S, Lamb, K, Savarino, P, La Rosa, G, Tarducci, A, et al. Survival characteristics and prognostic variables of dogs with preclinical chronic degenerative mitral valve disease attributable to myxomatous degeneration. J Vet Intern Med. (2012) 26:69–75. doi: 10.1111/j.1939-1676.2011.00860.x

18. Rajput, D, Wang, WJ, and Chen, CC. Evaluation of a decided sample size in machine learning applications. BMC Bioinform. (2023) 24:48. doi: 10.1186/s12859-023-05156-9

19. Breiman, L. Random forests. Mach Learn. (2001) 45:5–32. doi: 10.1023/A:1010933404324

20. Cover, TM, and Hart, PE. Nearest neighbor pattern classification. IEEE Trans Inf Theory. (1967) 13:21–7. doi: 10.1109/TIT.1967.1053964

21. Rish, I. An empirical study of the naive bayes classifier. IJCAI Workshop on Empirical Methods in Artificial Intelligence (2001) 41–46

22. Yu, W, Liu, T, Valdez, R, Gwinn, M, and Khoury, MJ. Application of support vector machine modeling for prediction of common diseases: the case of diabetes and pre-diabetes. BMC Med Inform Decis Mak. (2010) 10:16. doi: 10.1186/1472-6947-10-16

23. Greiner, M, Pfeiffer, D, and Smith, RD. Principles and practical application of the receiver-operating characteristic analysis for diagnostic tests. Prev Vet Med. (2000) 45:23–41. doi: 10.1016/s0167-5877(00)00115-x

24. Chicco, D, and Rovelli, C. Computational prediction of diagnosis and feature selection on mesothelioma patient health records. PLoS One. (2019) 14:e0208737. doi: 10.1371/journal.pone.0208737

25. Delpino, FM, Costa, ÂK, Farias, SR, Chiavegatto Filho, ADP, Arcêncio, RA, and Nunes, BP. Machine learning for predicting chronic diseases: a systematic review. Public Health. (2022) 205:14–25. doi: 10.1016/j.puhe.2022.01.007

26. Uddin, S, Khan, A, Hossain, ME, and Moni, MA. Comparing different supervised machine learning algorithms for disease prediction. BMC Med Inform Decis Mak. (2019) 19:281. doi: 10.1186/s12911-019-1004-8

27. Ali, MM, Paul, BK, Ahmed, K, Bui, FM, Quinn, JMW, and Moni, MA. Heart disease prediction using supervised machine learning algorithms: performance analysis and comparison. Comput Biol Med. (2021) 136:104672. doi: 10.1016/j.compbiomed.2021.104672

28. Chicco, D, and Jurman, G. Machine learning can predict survival of patients with heart failure from serum creatinine and ejection fraction alone. BMC Med Inform Decis Mak. (2020) 20:16. doi: 10.1186/s12911-020-1023-5

29. Yang, L, Wu, H, Jin, X, Zheng, P, Hu, S, Xu, X, et al. Study of cardiovascular disease prediction model based on random forest in eastern China. Sci Rep. (2020) 10:5245. doi: 10.1038/s41598-020-62133-5

30. Alaa, AM, Bolton, T, Di Angelantonio, E, Rudd, JHF, and van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: a prospective study of 423,604 UK biobank participants. PLoS One. (2019) 14:e0213653. doi: 10.1371/journal.pone.0213653

31. Larouche-Lebel, É, Loughran, KA, and Oyama, MA. Echocardiographic indices and severity of mitral regurgitation in dogs with preclinical degenerative mitral valve disease. J Vet Intern Med. (2019) 33:489–98. doi: 10.1111/jvim.15461

32. Roche-Catholy, M, Van Cappellen, I, Locquet, L, Broeckx, BJG, Paepe, D, and Smets, P. Clinical relevance of serum electrolytes in dogs and cats with acute heart failure: a retrospective study. J Vet Intern Med. (2021) 35:1652–62. doi: 10.1111/jvim.16187

33. Hadjiphilippou, S, and Kon, SP. Cardiorenal syndrome: review of our current understanding. J R Soc Med. (2016) 109:12–7. doi: 10.1177/0141076815616091

34. Damman, K, Navis, G, Voors, AA, Asselbergs, FW, Smilde, TDJ, Cleland, JGF, et al. Worsening renal function and prognosis in heart failure: systematic review and meta-analysis. J Card Fail. (2007) 13:599–608. doi: 10.1016/j.cardfail.2007.04.008

35. Choi, E, Schuetz, A, Stewart, WF, and Sun, J. Using recurrent neural network models for early detection of heart failure onset. J Am Med Inform Assoc. (2017) 24:361–70. doi: 10.1093/jamia/ocw112

Keywords: canine, artificial intelligence, feature ranking, heart failure, random forest

Citation: Kim Y, Kim J, Kim S, Youn H, Choi J and Seo K (2023) Machine learning-based risk prediction model for canine myxomatous mitral valve disease using electronic health record data. Front. Vet. Sci. 10:1189157. doi: 10.3389/fvets.2023.1189157

Received: 18 March 2023; Accepted: 15 August 2023;
Published: 31 August 2023.

Edited by:

Isaac Karimi, Razi University, Iran

Reviewed by:

Barbara Contiero, University of Padova, Italy
Sirilak Disatian Surachetpong, Chulalongkorn University, Thailand

Copyright © 2023 Kim, Kim, Kim, Youn, Choi and Seo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Kyoungwon Seo, kwseo@snu.ac.kr
