ORIGINAL RESEARCH article

Front. Med.
Sec. Precision Medicine
Volume 11 - 2024 | doi: 10.3389/fmed.2024.1482726

BreCML: identifying breast cancer cell state in scRNA-seq via machine learning

Provisionally accepted
Shanbao Ke 1, Yuxuan Huang 2, Dong Wang 3, Qiang Jiang 1, Luo Zhanyang 3, Baiyu Li 1, Danfang Yan 4, Jianwei Zhou 5*
  • 1 People's Hospital of Zhengzhou University, Zhengzhou, Henan Province, China
  • 2 Duke Kunshan University, Kunshan, Jiangsu, China
  • 3 Pudong Institute for Health Development, Pudong, Shanghai, China
  • 4 Zhejiang University, Hangzhou, Zhejiang Province, China
  • 5 Henan Provincial People's Hospital, Zhengzhou, China

The final, formatted version of the article will be published soon.

    Breast cancer is a prevalent malignancy and one of the leading causes of cancer-related mortality among women worldwide. The disease typically manifests as the abnormal proliferation and dissemination of malignant cells within breast tissue. Current diagnostic and therapeutic strategies face significant challenges in accurately identifying and localizing specific subtypes of breast cancer. In this study, we developed a machine learning-based predictor, BreCML, designed to classify subpopulations of breast cancer cells and to identify their associated marker genes. BreCML achieves an accuracy of 98.92% on the training dataset. Using the XGBoost algorithm, BreCML reaches an accuracy of 98.67%, a precision of 99.15%, a recall of 99.49%, and an F1-score of 99.79% on the test dataset. By combining machine learning with feature selection techniques, BreCML also identified new key genes. The predictor therefore serves both as a tool for assessing breast cancer cellular status and as a rapid, efficient means of uncovering potential biomarkers, providing insights for precision medicine and therapeutic strategies.
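
    For readers less familiar with the four evaluation metrics named above, the following is a minimal sketch of how they are conventionally computed from true and predicted labels for a single class treated as positive. It is not the authors' evaluation code: the function name `classification_metrics`, the `positive` parameter, and the returned dictionary keys are illustrative assumptions, and the abstract does not state which averaging scheme was used across cell subpopulations.

```python
def classification_metrics(y_true, y_pred, positive):
    """Accuracy, precision, recall and F1 with `positive` as the positive class.

    Hypothetical helper for illustration only; labels can be any hashable
    values (for example cell-state names).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    # Count the four cells of the binary confusion matrix.
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1

    accuracy = (tp + tn) / len(y_true)
    # Fall back to 0.0 when a denominator is empty (no predicted/true positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

    Because F1 is defined here as the harmonic mean of precision and recall, a value computed this way for a given class always lies between that class's precision and recall.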

    Keywords: breast cancer, machine learning, scRNA-seq, cell subpopulations, feature selection

    Received: 18 Aug 2024; Accepted: 15 Oct 2024.

    Copyright: © 2024 Ke, Huang, Wang, Jiang, Zhanyang, Li, Yan and Zhou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Jianwei Zhou, Henan Provincial People's Hospital, Zhengzhou, China

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.