ORIGINAL RESEARCH article

Front. Mol. Biosci.
Sec. Metabolomics
Volume 11 - 2024 | doi: 10.3389/fmolb.2024.1426964
This article is part of the Research Topic: Metabolomics in Personalized Cancer Medicine

Biomarker Discovery and Development of Prognostic Prediction Model Using Metabolomic Panel in Breast Cancer Patients: A Hybrid Methodology Integrating Machine Learning and Explainable Artificial Intelligence

Provisionally accepted
  • 1 Department of Biostatistics and Medical Informatics, Faculty of Medicine, Inonu University, Malatya, Türkiye
  • 2 Other, Sivas, Türkiye
  • 3 Department of Physiology, College of Medicine, King Khalid University, Abha, Saudi Arabia
  • 4 Department of Medical Rehabilitation Sciences, College of Applied Medical Sciences, King Khalid University, Abha, Saudi Arabia
  • 5 College of Applied Sciences, University of Almaarefa, Dariyah, Riyadh, Saudi Arabia
  • 6 Department for Teacher Education, NLA University College, Oslo, Norway

The final, formatted version of the article will be published soon.

    Background: Breast cancer (BC) is a major cause of morbidity and mortality in women. Although metabolism is known to play an important role in the molecular pathogenesis of BC, robust metabolomic biomarkers and predictive models for BC detection and prognosis are still needed. This study aims to identify targeted metabolomic biomarker candidates for the specific detection of BC using explainable artificial intelligence (XAI).

    Methods: Targeted metabolomics data from plasma samples of BC patients (n = 102) and healthy controls (n = 99) were used. Machine learning (ML) models were first developed on the raw data; feature selection methods were then applied, and the results were compared. SHapley Additive exPlanations (SHAP), an XAI method, was used to explain the optimal model's BC predictions in clinical terms.

    Results: Variable selection improved the performance of the ML models in BC classification, and the optimal model was the logistic regression (LR) classifier trained after support vector machine (SVM)-SHAP-based feature selection. SHAP annotations of the LR model indicated that leucine, isoleucine, L-alloisoleucine, norleucine, and homoserine acids were the most important potential diagnostic biomarkers for BC. Combining the identified metabolite markers yielded robust BC classification, with precision, recall, and specificity of 89.50%, 88.38%, and 83.67%, respectively.

    Conclusions: This study adds valuable information to BC biomarker discovery and underscores the potential of targeted metabolomics-based diagnostic advances in the management of BC.
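For readers less familiar with the three reported measures, the following minimal Python sketch shows how precision, recall, and specificity are conventionally derived from a binary confusion matrix. It is not the authors' code: the names `ConfusionCounts`, `confusion_counts`, `precision`, `recall`, and `specificity` are hypothetical, the label convention (1 = BC patient, 0 = healthy control) is an assumption, and the toy labels are invented for illustration and are not data from the study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # patients correctly predicted as BC
    fp: int  # controls incorrectly predicted as BC
    tn: int  # controls correctly predicted as healthy
    fn: int  # patients incorrectly predicted as healthy


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> ConfusionCounts:
    """Tally the four confusion-matrix cells; `positive` marks the BC class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp=tp, fp=fp, tn=tn, fn=fn)


def _ratio(numerator: int, denominator: int) -> float:
    """Return NaN instead of raising when a measure is undefined."""
    return numerator / denominator if denominator else float("nan")


def precision(c: ConfusionCounts) -> float:
    """Share of predicted BC cases that are true BC cases: TP / (TP + FP)."""
    return _ratio(c.tp, c.tp + c.fp)


def recall(c: ConfusionCounts) -> float:
    """Share of true BC cases that were detected (sensitivity): TP / (TP + FN)."""
    return _ratio(c.tp, c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Share of healthy controls correctly ruled out: TN / (TN + FP)."""
    return _ratio(c.tn, c.tn + c.fp)


if __name__ == "__main__":
    # Toy labels for illustration only (NOT study data): 1 = BC, 0 = control.
    y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
    counts = confusion_counts(y_true, y_pred)
    print(f"precision={precision(counts):.2%}")
    print(f"recall={recall(counts):.2%}")
    print(f"specificity={specificity(counts):.2%}")
```

Recall and specificity are computed on different classes (patients and controls, respectively), which is why a model can score well on one and less well on the other.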

    Keywords: breast cancer, metabolomics, feature selection, explainable artificial intelligence, prognostic model

    Received: 02 May 2024; Accepted: 02 Dec 2024.

    Copyright: © 2024 Yagin, Görmez, Al-Hashem, Ahmad, Ahmad and Ardigò. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence:
    Fatma Hilal Yagin, Department of Biostatistics and Medical Informatics, Faculty of Medicine, Inonu University, Malatya, Türkiye
    Luca Paolo Ardigò, Department for Teacher Education, NLA University College, Oslo, 0130, Norway

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.