AUTHOR=Droppelmann Guillermo, Tello Manuel, García Nicolás, Greene Cristóbal, Jorquera Carlos, Feijoo Felipe TITLE=Lateral elbow tendinopathy and artificial intelligence: Binary and multilabel findings detection using machine learning algorithms JOURNAL=Frontiers in Medicine VOLUME=Volume 9 - 2022 YEAR=2022 URL=https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2022.945698 DOI=10.3389/fmed.2022.945698 ISSN=2296-858X ABSTRACT=Background: Ultrasound is a valuable technique for detecting degenerative findings and intrasubstance tears in lateral elbow tendinopathy. Machine learning methods can support this radiological diagnosis. Aim: To assess machine learning models for detecting degenerative findings and intrasubstance tears in ultrasound images from patients diagnosed with lateral elbow tendinopathy. Methods: A retrospective study was performed. Ultrasound images and medical records from patients diagnosed with lateral elbow tendinopathy between January 1st, 2017, and December 30th, 2018, were selected. Datasets were built for training and testing the models. For image analysis, feature extraction, texture characteristics, intensity distribution, pixel-pixel co-occurrence patterns, and scale granularity were implemented. Six different supervised learning models were implemented for binary and multilabel classification. All models were trained to classify four tendon findings (hypoechogenicity, neovascularity, enthesopathy, and intrasubstance tear). Accuracy indicators and their confidence intervals were obtained for all models using repeated K-fold cross-validation. To evaluate multilabel prediction, multilabel accuracy, sensitivity, specificity, and ROC with 95% confidence intervals were used. Results: A total of 30,007 ultrasound images (4,324 exams, 2,917 patients) were included in the analysis. In the binary classification, the random forest (RF) model presented the highest mean AUC, sensitivity, and specificity for each degenerative finding.
AUC and sensitivity were highest for intrasubstance tear, at 0.991 [95% CI, 0.99, 0.99] and 0.775 [95% CI, 0.77, 0.77], respectively. In contrast, specificity was highest for hypoechogenicity, at 0.821 [95% CI, 0.82, 0.82]. In the multilabel classifier, RF also presented the highest performance: the accuracy was 0.772 [95% CI, 0.771, 0.773], with a macro AUC of 0.948 [95% CI, 0.94, 0.94] and a micro AUC of 0.962 [95% CI, 0.96, 0.96]. Diagnostic accuracy, sensitivity, and specificity with 95% confidence intervals were calculated. Conclusion: Machine learning algorithms applied to ultrasound images of lateral elbow tendinopathy showed high diagnostic accuracy. The random forest model showed the best performance in both binary and multilabel classification, particularly for intrasubstance tears.
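
The abstract reports multilabel accuracy together with macro and micro AUC scores but does not define them. The sketch below shows one common reading of these three metrics on a tiny invented dataset, using only the Python standard library. It is an illustration, not the authors' evaluation code. The toy labels and scores, the 0.5 decision threshold, the function names, and the interpretation of "multilabel accuracy" as exact-match (subset) accuracy are all assumptions introduced here.

```python
from itertools import product

# The four tendon findings named in the abstract, used here as column order.
LABELS = ["hypoechogenicity", "neovascularity", "enthesopathy", "intrasubstance_tear"]

def auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))

def macro_auc(Y_true, Y_score):
    """Macro average: compute AUC per label, then take the unweighted mean."""
    n_labels = len(Y_true[0])
    per_label = [auc([row[j] for row in Y_true], [row[j] for row in Y_score])
                 for j in range(n_labels)]
    return sum(per_label) / n_labels

def micro_auc(Y_true, Y_score):
    """Micro average: pool every (image, label) pair, then compute one AUC."""
    flat_true = [v for row in Y_true for v in row]
    flat_score = [v for row in Y_score for v in row]
    return auc(flat_true, flat_score)

def subset_accuracy(Y_true, Y_score, threshold=0.5):
    """Fraction of images whose thresholded predictions match all labels.
    Assumed definition; the abstract does not specify one."""
    hits = sum(
        all((s >= threshold) == bool(t) for t, s in zip(t_row, s_row))
        for t_row, s_row in zip(Y_true, Y_score)
    )
    return hits / len(Y_true)

# Invented toy data: 4 images x 4 findings (rows follow LABELS order).
Y_true = [
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]
Y_score = [
    [0.9, 0.2, 0.1, 0.8],
    [0.3, 0.7, 0.4, 0.2],
    [0.6, 0.4, 0.9, 0.1],
    [0.2, 0.1, 0.6, 0.7],
]

print("macro AUC:", macro_auc(Y_true, Y_score))
print("micro AUC:", micro_auc(Y_true, Y_score))
print("subset accuracy:", subset_accuracy(Y_true, Y_score))
```

On this toy data every finding is perfectly ranked within its own column, so the macro AUC is 1.0. The micro AUC is slightly lower because pooling compares scores across findings, and exact-match accuracy is lower still because a single missed label makes the whole image count as wrong. This is one reason macro AUC, micro AUC, and multilabel accuracy can differ for the same model.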