
ORIGINAL RESEARCH article

Front. Artif. Intell.
Sec. Medicine and Public Health
Volume 7 - 2024 | doi: 10.3389/frai.2024.1397388
This article is part of the Research Topic: Soft Computing and Machine Learning Applications for Healthcare Systems

Enhancing Diagnostic Accuracy in Symptom-Based Health Checkers: A Comprehensive Machine Learning Approach with Clinical Vignettes and Benchmarking

Provisionally accepted
Leila Aissaoui 1*, Manel Ben amar 2
  • 1 Ecole Supérieure des Communications de Tunis, Université de Carthage, Ariana, Tunisia
  • 2 Département de Médecine Communautaire, Faculté de Médecine de Monastir, Université de Monastir, Monastir, Monastir, Tunisia

The final, formatted version of the article will be published soon.

    This study presents the development and evaluation of machine learning models for a symptom-based health checker, using a dataset of 9572 samples covering 10 diseases. The dataset was split into training and testing sets for model training and evaluation. Five classifiers were selected and optimized to maximize performance: Decision Tree, Random Forest, Naive Bayes, Logistic Regression, and K-Nearest Neighbors. Model performance was assessed using accuracy, F1 scores, and 10-fold cross-validation. Clinical vignettes were used to gauge the real-world applicability of the models and demonstrated their robustness in providing accurate diagnoses. The role of ROC-AUC curves in assessing the models was also explored, revealing improvements in performance with increasing model complexity. Precision-recall curves were instrumental in evaluating model sensitivity, particularly in scenarios with imbalanced datasets. Overall, this study underscores the importance of comprehensive model evaluation techniques, including clinical vignette testing and analysis of ROC-AUC and precision-recall curves, in ensuring the reliability and sensitivity of symptom-based health checkers.
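
    As a rough illustration of the evaluation steps named in the abstract, the sketch below computes accuracy and a macro-averaged F1 score and partitions sample indices into 10 folds. It is not the authors' implementation: the function names, the macro averaging, the unshuffled contiguous folds, and the toy labels ("flu", "cold") are all assumptions made for the example. Only the sample count (9572) and the number of folds (10) come from the abstract.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) for k contiguous, unshuffled folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Hypothetical toy labels, not data from the study
y_true = ["flu", "flu", "cold", "cold"]
y_pred = ["flu", "cold", "cold", "cold"]
print(accuracy(y_true, y_pred))
print(macro_f1(y_true, y_pred))

# Fold sizes for a dataset of 9572 samples, as in the abstract
folds = list(k_fold_indices(9572, k=10))
print([len(test_idx) for _, test_idx in folds])
```

    In practice the samples would normally be shuffled, or the folds stratified by disease, before splitting, so that each fold contains every class; the contiguous split here only keeps the example short and deterministic.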

    Keywords: health checker, symptoms, machine learning, confusion matrix, ROC/AUC curves, precision-recall curve, clinical vignettes, benchmarking

    Received: 07 Mar 2024; Accepted: 17 Jul 2024.

    Copyright: © 2024 Aissaoui and Ben amar. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Leila Aissaoui, Ecole Supérieure des Communications de Tunis, Université de Carthage, Ariana, Tunisia

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.