AUTHOR=Hamidi Farzaneh, Gilani Neda, Arabi Belaghi Reza, Yaghoobi Hanif, Babaei Esmaeil, Sarbakhsh Parvin, Malakouti Jamileh

TITLE=Identifying potential circulating miRNA biomarkers for the diagnosis and prediction of ovarian cancer using machine-learning approach: application of Boruta

JOURNAL=Frontiers in Digital Health

VOLUME=Volume 5 - 2023

YEAR=2023

URL=https://www.frontiersin.org/journals/digital-health/articles/10.3389/fdgth.2023.1187578

DOI=10.3389/fdgth.2023.1187578

ISSN=2673-253X

ABSTRACT=Ovarian cancer is a major clinical challenge in gynecologic oncology. Because typical symptoms and effective biomarkers for noninvasive screening are lacking, most patients already have advanced-stage disease at the time of diagnosis. MicroRNAs (miRNAs) are non-coding RNA molecules that have been linked to human cancers, but identifying diagnostic biomarkers that distinguish cancer from non-cancer samples is difficult. Using Boruta, a random forest-based feature selection method, we aimed to identify biomarkers associated with ovarian cancer from cancer and non-cancer samples in the Gene Expression Omnibus (GEO) dataset GSE106817. Two independent GEO datasets, GSE113486 and GSE113740, served as external validation. We used five machine learning algorithms for classification: logistic regression, random forest, decision trees, artificial neural networks, and XGBoost. Four models discovered in GSE113486 had an AUC of 100%, three in GSE113740 had an AUC of over 94%, and four in GSE113486 had an AUC of over 94%. We identified 10 miRNAs that distinguish ovarian cancer cases from normal controls: hsa-miR-1290, hsa-miR-1233-5p, hsa-miR-1914-5p, hsa-miR-1469, hsa-miR-4675, hsa-miR-1228-5p, hsa-miR-3184-5p, hsa-miR-6784-5p, hsa-miR-6800-5p, and hsa-miR-5100. Our findings suggest that these miRNAs are potential biomarkers for ovarian cancer screening, with a view to possible intervention.
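
The shadow-feature idea behind Boruta, and the AUC metric quoted in the abstract, can be illustrated with a minimal sketch. This is not the authors' pipeline: Boruta ranks features by random forest importance and applies a statistical test over many iterations, whereas the sketch below swaps in a simple univariate importance (distance of the feature's AUC from 0.5) and a majority vote. The function names, feature names, and toy values are all hypothetical and chosen only for illustration.

```python
import random


def auc(scores, labels):
    """Rank-based AUC: chance that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def importance(column, labels):
    # Assumption: a stand-in for random forest importance, used here only
    # to keep the sketch dependency-free.
    return abs(auc(column, labels) - 0.5)


def shadow_select(features, labels, n_rounds=50, seed=0):
    """Keep features that beat the best shuffled (shadow) copy in most rounds."""
    rng = random.Random(seed)
    hits = {name: 0 for name in features}
    for _ in range(n_rounds):
        # Shadow features: each column shuffled, which breaks any link to labels.
        shadow_best = 0.0
        for col in features.values():
            shuffled = col[:]
            rng.shuffle(shuffled)
            shadow_best = max(shadow_best, importance(shuffled, labels))
        for name, col in features.items():
            if importance(col, labels) > shadow_best:
                hits[name] += 1
    # Assumption: simple majority vote instead of Boruta's statistical test.
    return [name for name, h in hits.items() if h > n_rounds / 2]


# Hypothetical toy data: 10 controls (0) followed by 10 cases (1).
labels = [0] * 10 + [1] * 10
toy = {
    "feature_informative": list(range(20)),          # separates the classes
    "feature_noise": list(range(10)) + list(range(10)),  # same in both classes
}
selected = shadow_select(toy, labels)
print(selected)
```

In this toy setting the noise feature has an importance of exactly zero, so it can never exceed a shadow feature, while the informative feature separates the two classes perfectly. Real expression data would call for the full Boruta procedure and a proper classifier evaluated on held-out datasets, as the abstract describes.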