ORIGINAL RESEARCH article

Front. Digit. Health
Sec. Health Informatics
Volume 6 - 2024 | doi: 10.3389/fdgth.2024.1498939
This article is part of the Research Topic: Unleashing the Power of Large Data: Models to Improve Individual Health Outcomes

Leveraging Machine Learning and Rule Extraction for Enhanced Transparency in Emergency Department Length of Stay Prediction

Provisionally accepted
Waqar A Sulaiman 1; Charithea Stylianides 2; Andria Nikolaou 2,3; Zinonas Antoniou 4; Ioannis Constantinou 4; Lakis Palazis 5; Anna Vavlitou 5; Theodoros Kyprianou 3; Efthyvoulos Kyriacou 6; Antonis Kakas 3; Marios S Pattichis 7; Andreas S Panayides 2; Constantinos S. Pattichis 2,3*
  • 1 University of Cyprus, Nicosia, Cyprus
  • 2 CYENS Centre of Excellence, Nicosia, Cyprus
  • 3 Computer Science and Biomedical Engineering Research Centre, University of Cyprus, Nicosia, Cyprus
  • 4 Research Development Department, 3aHealth, Nicosia, Cyprus
  • 5 State Health Services Organisation, Nicosia, Cyprus
  • 6 Cyprus University of Technology, Limassol, Cyprus
  • 7 University of New Mexico, Albuquerque, New Mexico, United States

The final, formatted version of the article will be published soon.

    This study applies machine learning techniques to a large healthcare dataset to classify length of stay in the emergency department (ED LOS). Gradient Boosting (GB), Random Forest (RF), Logistic Regression (LR), and Support Vector Machine (SVM) models are used to categorize ED LOS as short (less than 4.5 hours) or long (greater than 4.5 hours), using the MIMIC IV-ED database, which contains deidentified information on more than 400,000 subjects. GB showed slightly better predictive performance than the other models, achieving an AUC of 0.736, an accuracy of 70.68%, a sensitivity of 89.50%, and a specificity of 39.93% on the original dataset. On the balanced dataset, GB achieved an AUC of 0.716, an accuracy of 65.11%, a sensitivity of 75.05%, and a specificity of 57.33%. To emphasize model interpretability, we employ a novel rule extraction process for the Gradient Boosting model. The extracted rules are based on key predictors of ED LOS such as triage acuity, Elixhauser comorbidity index (ECI), arrival method, and medication history. These rules provide healthcare professionals with clear and actionable information, increasing the transparency of the classification process. By further analyzing the rules, we hope to boost confidence in the model's recommendations, allowing for better decision-making and resource management in emergency departments. Our findings indicate that combining predictive models with rule extraction can significantly improve patient flow and optimize care delivery. However, more work is needed to improve both model performance and rule interpretability.
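
As a reading aid for the metrics quoted above, the following minimal Python sketch shows how a 4.5-hour cut-off turns ED LOS into a binary label and how accuracy, sensitivity, and specificity follow from the resulting confusion counts. It is not the authors' pipeline. The function names, the toy data, the choice of "long stay" as the positive class, and the assignment of stays of exactly 4.5 hours to the short class are all assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass

THRESHOLD_HOURS = 4.5  # short/long cut-off stated in the abstract


def label_stay(los_hours: float) -> int:
    """Return 1 for a long stay (> 4.5 h) and 0 otherwise.

    Assumption: "long" is the positive class and a stay of exactly
    4.5 h counts as short; the abstract does not say either way.
    """
    return 1 if los_hours > THRESHOLD_HOURS else 0


@dataclass
class Metrics:
    accuracy: float
    sensitivity: float  # true positive rate
    specificity: float  # true negative rate


def _ratio(numerator: int, denominator: int) -> float:
    """Divide safely; return NaN when a class is absent."""
    return numerator / denominator if denominator else float("nan")


def evaluate(y_true: list[int], y_pred: list[int]) -> Metrics:
    """Compute accuracy, sensitivity, and specificity from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return Metrics(
        accuracy=_ratio(tp + tn, len(pairs)),
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
    )


# Toy, made-up stays in hours (hypothetical, not MIMIC IV-ED data).
toy_los = [1.0, 3.2, 4.5, 5.0, 7.5, 2.0, 6.1, 9.0]
toy_true = [label_stay(h) for h in toy_los]
# Hypothetical classifier output for the same eight visits.
toy_pred = [0, 1, 0, 1, 1, 0, 0, 1]
toy_metrics = evaluate(toy_true, toy_pred)
```

A large gap between sensitivity and specificity, like the one reported for the original dataset, typically reflects class imbalance; this is consistent with the narrower gap the abstract reports after balancing.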

    Keywords: emergency department, Length of Stay, machine learning, gradient boosting, Rule extraction, Predictive Modeling, Explainable AI, healthcare analytics

    Received: 19 Sep 2024; Accepted: 16 Dec 2024.

    Copyright: © 2024 Sulaiman, Stylianides, Nikolaou, Antoniou, Constantinou, Palazis, Vavlitou, Kyprianou, Kyriacou, Kakas, Pattichis, Panayides and Pattichis. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Constantinos S. Pattichis, Computer Science and Biomedical Engineering Research Centre, University of Cyprus, Nicosia, Cyprus

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.