AUTHOR=Ding Huanfei, Fawad Muhammad, Xu Xiaolin, Hu Bowen TITLE=A framework for identification and classification of liver diseases based on machine learning algorithms JOURNAL=Frontiers in Oncology VOLUME=12 YEAR=2022 URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2022.1048348 DOI=10.3389/fonc.2022.1048348 ISSN=2234-943X ABSTRACT=

Hepatocellular carcinoma (HCC) is one of the most common liver diseases. Most HCC patients are simultaneously diagnosed with hepatitis B-related cirrhosis, especially in Asian countries. HCC is the fifth most common cancer and the second most common cause of cancer-related death in the world. HCC incidence rates have been rising over the past three decades and are expected to double by 2030 if no effective means of early diagnosis and management is found. Improvements in patient care, research, and policy depend significantly on accurate medical diagnosis, especially for patients with malignant tumors. However, advanced and expensive diagnostic tools such as computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET-CT) are sometimes difficult to access, especially for people who reside in poverty-stricken areas. Therefore, experts are searching for a framework for predicting early liver disease based on basic and simple examinations, such as biochemical and routine blood tests, which are easily accessible all around the world. Disease identification and classification have been significantly enhanced by using artificial intelligence (AI) and machine learning (ML) in conjunction with clinical data. The goal of this research is to extract the most significant risk factors or clinical parameters for liver diseases in 525 patients, based on clinical experience, using machine learning algorithms such as regularized regression (RR), logistic regression (LR), random forest (RF), decision tree (DT), and extreme gradient boosting (XGBoost). The results showed that the RF classifier had the best performance (accuracy = 0.762, recall = 0.843, F1-score = 0.775, and AUC = 0.999) among the five ML algorithms.
The 14 significant risk factors, in order of importance, are as follows: total bilirubin, gamma-glutamyl transferase (GGT), direct bilirubin, hemoglobin, age, platelet count, alkaline phosphatase (ALP), aspartate transaminase (AST), creatinine, alanine aminotransferase (ALT), cholesterol, albumin, urea nitrogen, and white blood cells. ML classifiers might aid medical organizations in the early detection and classification of liver disease, which would be beneficial in low-income regions, and the relative importance of the risk factors would be helpful in the prevention and treatment of liver disease.
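
As an illustration of how the reported accuracy, recall, and F1-score are defined, the following minimal sketch computes them from true and predicted labels. It assumes a binary labelling (1 = liver disease, 0 = no disease), which the abstract does not state; the function name `classification_metrics` and the example labels are hypothetical and are not taken from the study. AUC is omitted because it requires predicted scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels.

    Assumption: labels are 1 (positive, e.g. disease) or 0 (negative).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels for eight patients, for illustration only.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Recall is often the metric of most interest in a screening setting, since a false negative corresponds to a missed case of disease.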