- 1 First Affiliated Hospital of Jinzhou Medical University, Jinzhou, China
- 2 Jinzhou Medical University, Jinzhou, Liaoning, China
Liver cancer is a common malignancy of the digestive system. Hepatocellular carcinoma (HCC) accounts for the vast majority of these tumors and imposes a heavy medical burden on underdeveloped countries and regions. Many factors affect the prognosis of HCC patients; however, no specific statistical model exists to predict the survival time of clinical patients. This study derived a risk factor signature for HCC and a reliable clinical prediction model by statistically analyzing patient information from the Surveillance, Epidemiology, and End Results (SEER) database using open-source packages in the Python environment.
1 Background
Liver cancer is a common malignancy of the digestive system (1, 2). Primary liver cancer mainly includes hepatocellular carcinoma (HCC) and intrahepatic cholangiocarcinoma (ICC) (3). HCC accounts for most of these tumors and is the fifth most common cancer and the fourth leading cause of cancer-related deaths worldwide (4, 5). Men have a higher risk of HCC than women, and HCC is the second leading cause of cancer death in men. Moreover, HCC morbidity and mortality are still rising (6, 7). The main risk factors for HCC development are cirrhosis and chronic liver disease (8). Cirrhosis is an important step in virus-related hepatocarcinogenesis (9). Additionally, chronic hepatitis caused by hepatitis B virus (HBV) and hepatitis C virus (HCV) infections is an important risk factor for liver cancer (10). Most new liver cancer cases occur in developing countries with a high rate of HBV infection. Meanwhile, non-alcoholic fatty liver disease (NAFLD) is the leading cause of HCC in developed countries (11, 12).
Liver Doppler ultrasound and alpha-fetoprotein (AFP) testing are simple methods to screen for liver cancer (13). Elevated AFP and des-gamma-carboxy prothrombin (DCP) levels are typical features of liver cancer (14). Additionally, CT, enhanced CT, MRI, enhanced MRI, and other imaging methods are helpful for precise HCC diagnosis (15). Because liver biopsy carries risks of tumor implantation and bleeding and can yield false-negative results, it is generally not recommended for HCC (16).
At present, the most commonly used staging systems for liver cancer include the TNM (tumor node metastasis), China liver cancer (CNLC), and Barcelona Clinic Liver Cancer (BCLC) staging systems (17). The TNM staging system was jointly proposed by the American Joint Committee on Cancer (AJCC) and the Union for International Cancer Control (UICC) and has been widely used in clinical practice. It is based on tumor morphology (T), regional lymph node metastasis (N), and distant metastasis (M). The TNM staging of liver cancer is very detailed, especially the T staging, which includes the invasion of microvessels around the tumor and can therefore better help evaluate the prognosis.
Radical surgical resection is the primary treatment for early HCC. However, whether advanced HCC patients can benefit from surgery is controversial. Recently, breakthroughs have been made in non-surgical treatments. For example, drug therapy, immunotherapy, and targeted therapy have been successfully applied to treat advanced liver cancer (18). Transcatheter arterial chemoembolization (TACE), hepatic arterial infusion chemotherapy (HAIC), and radiotherapy can improve patient prognosis (19). Some experts believe that conventional chemotherapy can also benefit HCC patients (20). Nevertheless, most experts believe that conventional chemotherapy has little effect on liver cancer (21–23).
The SEER database is a publicly available cancer reporting system funded by the US federal government (24). These representative and reliable data come from 18 US states. Users can retrieve patients' sex, age, surgical method, chemotherapy, radiotherapy, other clinical information, survival time, and survival status. This study obtained permission to use the SEER PLUS database. To further explore HCC risk factors and treatment plans and to establish a machine learning model to guide clinical treatment, we retrieved HCC patient data from the SEER database and analyzed them after screening.
2 Methods
2.1 Data acquisition
Herein, we retrieved data from 107148 HCC patients from the SEER database. Clinical information included gender, age, race, histological type, histological grading, surgical method, regional lymph node dissection, radiotherapy, chemotherapy, diagnosis to treatment time, AFP, TNM staging, survival time, and survival status.
2.2 Excluding factors
To ensure the accuracy of the machine learning model, we did not use automatic imputation of missing information. Data were filtered according to the clinical characteristics of each group, and a total of 102680 patients with missing or unknown information were excluded. Finally, 4468 patients were selected for subsequent analysis.
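The exclusion step described above is a complete-case filter. The following is a minimal sketch of that idea in plain Python, not the authors' code. The field names, the toy records, and the set of "unknown" markers are hypothetical. Only the final patient counts come from this section.

```python
# Hypothetical markers for missing or unknown values (an assumption, not SEER's coding).
UNKNOWN_MARKERS = {None, "", "Unknown"}


def is_complete(record, required_fields):
    """Return True when every required field holds a known value."""
    return all(record.get(field) not in UNKNOWN_MARKERS for field in required_fields)


def complete_cases(records, required_fields):
    """Keep only records with no missing or unknown field (no imputation)."""
    return [r for r in records if is_complete(r, required_fields)]


# Toy records for illustration only; they are not SEER data.
records = [
    {"age": 61, "afp": "Positive", "grade": "II"},
    {"age": 55, "afp": "Unknown", "grade": "I"},
    {"age": 70, "afp": "Negative", "grade": None},
    {"age": 48, "afp": "Negative", "grade": "III"},
]
kept = complete_cases(records, ["age", "afp", "grade"])
excluded = len(records) - len(kept)

# The same bookkeeping applied to the counts reported in this section.
retrieved, reported_excluded = 107148, 102680
remaining = retrieved - reported_excluded
```

With the reported counts, `remaining` equals the 4468 patients carried into the analysis.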
2.3 Statistical methods
All analyses were implemented in Python 3.10.6 (Python Software Foundation, https://www.python.org/). Clinical feature analysis was conducted with TableOne. Cox regression analysis was performed using Lifelines. The random survival forest (RSF) analysis was carried out using Scikit-Survival, and the survival curves of clinical patients were predicted using the RSF model. Model accuracy was evaluated using the concordance index (C-index).
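To make the evaluation metric concrete, here is a minimal sketch of Harrell's C-index for right-censored data in plain Python. It is an illustration and not the implementation in Lifelines or Scikit-Survival. The function name and toy inputs are assumptions, and tied survival times are simply skipped.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when patient i has the shorter time and an
    observed event. The pair is concordant when patient i also has the higher
    predicted risk. Ties in predicted risk count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example: survival times in months, event flags (False = censored), risks.
times = [5, 10, 15, 20]
events = [True, True, False, True]
perfect = concordance_index(times, events, [0.9, 0.7, 0.4, 0.1])
reversed_order = concordance_index(times, events, [0.1, 0.4, 0.7, 0.9])
uninformative = concordance_index(times, events, [0.5, 0.5, 0.5, 0.5])
```

A value of 1.0 means the predicted risks order every comparable pair correctly, 0.5 corresponds to random ordering, and 0.0 means every comparable pair is ordered the wrong way.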
3 Results
3.1 Clinical characteristics
After screening, 4468 patients were selected for further analysis, and their clinical characteristics are summarized in Table 1. A total of 2324 patients received chemotherapy, and 2144 patients did not. Most clinical features differed significantly between the two groups, including gender, race, histological type, surgery, regional lymph node dissection, diagnosis to treatment time, survival time, AFP, survival status, and T, N, M stages (χ2 test, p < 0.05).
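As an illustration of the χ2 comparison between groups, the sketch below computes the Pearson χ2 statistic and p-value for a 2×2 contingency table in plain Python. It is a simplified assumption-laden example: it handles only two categories per variable, applies no continuity correction, and uses made-up counts rather than values from Table 1.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 table (df = 1)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the chi-square tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts: rows = chemotherapy yes/no, columns = a binary feature.
stat, p_value = chi_square_2x2([[30, 10], [10, 30]])
null_stat, null_p = chi_square_2x2([[20, 20], [20, 20]])
```

Features with more than two categories need the general r×c form of the test, which the packages named in the Methods provide.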
3.2 Overall risk factors
Furthermore, we used Cox regression analysis to evaluate the impact of various clinical features on the survival of HCC patients (Table 2). Distant organ metastasis, lymph node metastasis, chemotherapy, positive AFP, histological grade, sex, race, tumor size, and age were risk factors for HCC. In contrast, surgical treatment and early diagnosis and treatment were protective factors (p < 0.05). Radiotherapy and regional lymph node dissection were not significantly associated with survival (p > 0.05). The C-index of the Cox regression model was 0.76 (Figure 1).
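The labels "risk factor" and "protective factor" follow from the hazard ratio (HR), which is the exponential of a Cox regression coefficient. The sketch below shows that conversion with a 95% confidence interval. The function names and the example coefficients are hypothetical and are not taken from Table 2.

```python
import math


def hazard_ratio(coef, se, z=1.96):
    """Return (HR, lower, upper) for a Cox coefficient and its standard error."""
    return math.exp(coef), math.exp(coef - z * se), math.exp(coef + z * se)


def classify_factor(coef, se):
    """Label a covariate by where its 95% confidence interval lies relative to HR = 1."""
    _, lower, upper = hazard_ratio(coef, se)
    if lower > 1.0:
        return "risk factor"
    if upper < 1.0:
        return "protective factor"
    return "not significant"


# Illustrative coefficients only.
examples = {
    "positive coefficient": classify_factor(0.5, 0.1),
    "negative coefficient": classify_factor(-0.4, 0.1),
    "near-zero coefficient": classify_factor(0.05, 0.1),
}
```

An HR above 1 indicates a higher hazard of death for that feature, an HR below 1 indicates a lower hazard, and an interval that spans 1 corresponds to a non-significant result at the 5% level.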
3.3 Risk factors at different stages
To explore the differences in treatment plans for HCC patients at different TNM stages, we divided patients into stage I, II, IIIa, IIIb, IVa, and IVb groups according to the 7th edition of the AJCC staging system. We then applied Cox regression analysis to evaluate the risk in each group (Table 3). Early diagnosis and treatment and timely surgery were protective factors for HCC patients at stages I, II, and IIIa, whereas chemotherapy, radiotherapy, and positive AFP were risk factors associated with a worse prognosis. Surgical treatment and early diagnosis and treatment were also protective factors for stage IV HCC patients. In stage IVa, radiotherapy was a protective factor that reduced the prognostic risk, whereas chemotherapy was not associated with a difference in survival. In the IVb group, both radiotherapy and chemotherapy were protective factors.
3.4 Clinical feature importance and survival prediction
We randomly assigned 25% of the included patients to the test group and used the remaining 75% as the training group. To obtain the best model, survival analysis of the post-screening data was performed using an RSF model whose hyperparameters were optimized by manual adjustment, leading to a C-index of 0.80 for the training set and 0.77 for the testing set. Thus, the RSF model had slightly better reliability than the Cox regression model.
The clinical feature importance ranking indicated that surgical treatment was the most important feature among clinical factors in the RSF model (Table 4). Then, three patients each from the surgery and non-surgery groups were retrieved from the test group to draw predicted survival curves. Patients in the surgery group had a better predicted prognosis than those in the non-surgery group (Figure 2).
Subsequently, we used Streamlit to establish a clinical patient survival prediction platform based on the RSF model. In this framework, clinicians can enter a patient's clinical information to generate predicted survival and cumulative risk curves, and can observe the survival curve change in real time as treatment parameters are adjusted. Therefore, this platform can be used to guide clinical treatment selection (Video 1).
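The 75%/25% random split can be sketched in plain Python as follows. This is an illustration under stated assumptions: the function name and the random seed are hypothetical, and the paper does not report the seed or the exact group sizes, which are derived here from the 4468 included patients.

```python
import random


def train_test_split_indices(n_samples, test_fraction=0.25, seed=0):
    """Shuffle patient indices and return (train_indices, test_indices)."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = train_test_split_indices(4468)
```

With 4468 patients this yields 3351 training and 1117 test indices, and no patient appears in both groups.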
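Assuming that "cumulative risk" here refers to the cumulative hazard, the two curves shown by such a platform are linked by the standard relation H(t) = -ln S(t). The sketch below applies that relation pointwise to a made-up survival curve. It is not the platform's code, and an RSF implementation may estimate the cumulative hazard directly instead.

```python
import math


def cumulative_hazard_from_survival(survival_probs):
    """Apply H(t) = -ln S(t) pointwise; S(t) = 0 maps to infinity."""
    return [math.inf if s <= 0.0 else -math.log(s) for s in survival_probs]


# Hypothetical survival probabilities at successive follow-up times.
survival = [1.0, 0.8, 0.5, 0.25]
hazard = cumulative_hazard_from_survival(survival)
```

Because a survival curve never increases over time, the resulting cumulative hazard never decreases.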
4 Discussion
The incidence and mortality of liver cancer continue to rise, and its treatment remains a global challenge (25). Surgery is the primary treatment of liver cancer (26). Nevertheless, liver cancer treatment has entered a new era with the development of immunotherapy and targeted therapeutic drugs. Since early liver cancer has no specific manifestation, few patients are diagnosed at early stages during regular physical examinations. Hence, most liver cancer patients are diagnosed at advanced stages, when they present with abdominal pain, jaundice, and other symptoms, and have missed the best time for treatment.
Moreover, HCC has brought a heavy medical burden to underdeveloped countries and regions (18). Chronic HBV infection, chronic HCV infection, NAFLD, aflatoxin, and alcohol intake are important causes of HCC. For example, hepatitis B virus vaccination can reduce HCC incidence. Herein, the Cox regression analysis showed that the time from diagnosis to treatment was a protective factor for HCC patients. Thus, early detection and timely treatment might improve the prognosis of HCC patients (HR: 0.92, p < 0.005). Therefore, government departments and relevant medical security institutions should strengthen the health testing of high-risk HCC groups to achieve early detection and treatment, which can prolong the survival time of patients and reduce the economic burden on families and medical security institutions.
We found that positive AFP was also a risk factor for HCC patients at stages I, II, and IIIa. Hence, AFP can be used as an indicator of the prognosis of HCC patients, and similar conclusions have been reached in other studies (27). The Cox regression and RSF models indicated that surgery could reduce HCC risk and improve patient outcomes. Surgical treatment was the most important clinical feature affecting the survival of HCC patients in the RSF model, making it a key factor in HCC management. For patients who can tolerate surgery, appropriate surgical treatment should be implemented as early as possible to avoid missing the optimal timing of treatment. Meanwhile, for patients who are temporarily unsuitable for surgery, neoadjuvant treatments such as targeted therapy, immunotherapy, and TACE can be started promptly, with surgery performed once the patient's condition permits.
We found that chemotherapy and radiotherapy were unsuitable for early liver cancer patients. Unnecessary radiotherapy and chemotherapy can increase the risk of these patients. However, chemotherapy can be used for advanced liver cancer patients, who might benefit from systemic chemotherapy (HR: 0.67, p < 0.005). Sun et al. showed that chemotherapy was a common treatment for advanced HCC, but the effects were not ideal. Adding all-trans-retinoic acid (ATRA) to fluorouracil, leucovorin, and oxaliplatin (FOLFOX4) to treat advanced HCC can improve the overall survival and time to progression of patients.
However, our study also has some limitations. First, the SEER database does not contain specific information on targeted therapy and immunotherapy regimens, which can extend the survival time of patients with recurrent or advanced liver malignancies. Second, we did not evaluate various objective factors affecting tumor patients' survival time, such as economic conditions, medical insurance systems, and the level of medical development in the region. Finally, different machine learning models differ in their prognostic performance. Therefore, this study should only be considered a machine-learning reference for treating tumor patients. With the continuous refinement of local databases and the optimization of artificial intelligence algorithms, machine learning models will come increasingly close to the reality of clinical practice.
Herein, we obtained a relatively reliable machine learning model by RSF. Then, we used this model to establish a survival prediction platform for HCC patients. This platform can generate a predicted survival curve by inputting clinical patient information. Survival curves can also be compared to get the best clinical treatment plan. Since the SEER database does not contain immunotherapy, targeted therapy, TACE, and other information, this platform only tests the feasibility of methods based on existing data to guide further research.
5 Conclusion
In the present study, we found that distant organ metastasis, lymph node metastasis, histological grade, sex, race, tumor size, and age were risk factors for HCC patients. Additionally, early detection and timely treatment might improve the prognosis of HCC patients, and positive AFP might be used as a risk indicator. Moreover, surgical treatment is crucial for HCC patient survival. Chemotherapy and radiotherapy are inappropriate for early liver cancer patients since these treatments can increase their risk. Nevertheless, advanced liver cancer patients might benefit from systemic chemotherapy. Finally, the RSF model can be used for clinical survival prediction.
Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.
Author contributions
X-YG is responsible for writing manuscripts and program codes; M-CS and T-YW are responsible for literature retrieval; X-MW, GL, Y-ML and TY are responsible for the program code; WW is responsible for proofreading and reviewing. All authors contributed to the article and approved the submitted version.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fonc.2023.1067353/full#supplementary-material
References
1. Zhou JM, Wang T, Zhang KH. Afp-L3 for the diagnosis of early hepatocellular carcinoma: A meta-analysis. Med (Baltimore) (2021) 100(43):e27673. doi: 10.1097/MD.0000000000027673
2. Liu M, Zhao Q, Zheng X, Yang L, Zhao Y, Li X, et al. Transcriptome changes in Ergic3-knockdown hepatocellular carcinoma cells: Ergic3 is a novel immune function related gene. PeerJ (2022) 10:e13369. doi: 10.7717/peerj.13369
3. Petrick JL, Braunlin M, Laversanne M, Valery PC, Bray F, McGlynn KA. International trends in liver cancer incidence, overall and by histologic subtype, 1978-2007. Int J Cancer (2016) 139(7):1534–45. doi: 10.1002/ijc.30211
4. Abbate V, Marcantoni M, Giuliante F, Vecchio FM, Gatto I, Mele C, et al. Heppar1-positive circulating microparticles are increased in subjects with hepatocellular carcinoma and predict early recurrence after liver resection. Int J Mol Sci (2017) 18(5). doi: 10.3390/ijms18051043
5. Tang H, You T, Sun Z, Bai C. A comprehensive prognostic analysis of Pold1 in hepatocellular carcinoma. BMC Cancer (2022) 22(1):197. doi: 10.1186/s12885-022-09284-y
6. Chew SA, Moscato S, George S, Azimi B, Danti S. Liver cancer: Current and future trends using biomaterials. Cancers (Basel) (2019) 11(12). doi: 10.3390/cancers11122026
7. Kreidieh M, Zeidan YH, Shamseddine A. The combination of stereotactic body radiation therapy and immunotherapy in primary liver tumors. J Oncol (2019) 2019:4304817. doi: 10.1155/2019/4304817
8. Baek M, Chai JC, Choi HI, Yoo E, Binas B, Lee YS, et al. Comprehensive transcriptome profiling of bet inhibitor-treated Hepg2 cells. PLoS One (2022) 17(4):e0266966. doi: 10.1371/journal.pone.0266966
9. Shang YK, Li F, Zhang Y, Liu ZK, Wang ZL, Bian H, et al. Systems analysis of key genes and pathways in the progression of hepatocellular carcinoma. Med (Baltimore) (2018) 97(23):e10892. doi: 10.1097/md.0000000000010892
10. Chidambaranathan-Reghupaty S, Fisher PB, Sarkar D. Hepatocellular carcinoma (Hcc): Epidemiology, etiology and molecular classification. Adv Cancer Res (2021) 149:1–61. doi: 10.1016/bs.acr.2020.10.001
11. Raza S, Rajak S, Anjum B, Sinha RA. Molecular links between non-alcoholic fatty liver disease and hepatocellular carcinoma. Hepatoma Res (2019) 5:42. doi: 10.20517/2394-5079.2019.014
12. Liu X, Liu F, Yu H, Zhang Q, Liu F. Development and validation of a prediction model for predicting the prognosis of postoperative patients with hepatocellular carcinoma. Int J Gen Med (2022) 15:3625–37. doi: 10.2147/ijgm.S351265
13. Lee Q, Yu X, Yu W. The value of pivka-II versus afp for the diagnosis and detection of postoperative changes in hepatocellular carcinoma. J Interv Med (2021) 4(2):77–81. doi: 10.1016/j.jimed.2021.02.004
14. Ijuin S, Oda K, Mawatari S, Taniyama O, Toyodome A, Sakae H, et al. Serine palmitoyltransferase long chain subunit 3 is associated with hepatocellular carcinoma in patients with nafld. Mol Clin Oncol (2022) 16(2):55. doi: 10.3892/mco.2021.2488
15. Zhang Y, Numata K, Du Y, Maeda S. Contrast agents for hepatocellular carcinoma imaging: Value and progression. Front Oncol (2022) 12:921667. doi: 10.3389/fonc.2022.921667
16. Rios RS, Zheng KI, Zheng MH. Non-alcoholic steatohepatitis and risk of hepatocellular carcinoma. Chin Med J (Engl) (2021) 134(24):2911–21. doi: 10.1097/cm9.0000000000001888
17. Rao QW, Zhang SL, Guo MZ, Yuan FF, Sun JL, Qi F, et al. Sulfiredoxin-1 is a promising novel prognostic biomarker for hepatocellular carcinoma. Cancer Med (2020) 9(22):8318–32. doi: 10.1002/cam4.3430
18. Li Z, Wang R, Qiu C, Cao C, Zhang J, Ge J, et al. Role of dtl in hepatocellular carcinoma and its impact on the tumor microenvironment. Front Immunol (2022) 13:834606. doi: 10.3389/fimmu.2022.834606
19. Hatooka M, Kawaoka T, Aikata H, Inagaki Y, Morio K, Nakahara T, et al. Hepatic arterial infusion chemotherapy followed by sorafenib in patients with advanced hepatocellular carcinoma (Hics 55): An open label, non-comparative, phase ii trial. BMC Cancer (2018) 18(1):633. doi: 10.1186/s12885-018-4519-y
20. Dai HY, Chen HY, Lai WC, Hung MC, Li LY. Targeted expression of bikdd combined with metronomic doxorubicin induces synergistic antitumor effect through bax activation in hepatocellular carcinoma. Oncotarget (2015) 6(27):23807–19. doi: 10.18632/oncotarget.4278
21. Kim EH, Kim MS, Furusawa Y, Uzawa A, Han S, Jung WG, et al. Metformin enhances the radiosensitivity of human liver cancer cells to γ-rays and carbon ion beams. Oncotarget (2016) 7(49):80568–78. doi: 10.18632/oncotarget.12966
22. Köhler BC, Waldburger N, Schlamp K, Jäger D, Weiss KH, Schulze-Bergkamen H, et al. Liver cancers with Stem/Progenitor-cell features - a rare chemotherapy-sensitive malignancy. Oncotarget (2017) 8(35):59991–8. doi: 10.18632/oncotarget.19000
23. Peck-Radosavljevic M, Bota S, Hucke F. Time to stop using hepatic arterial infusion chemotherapy (Haic) for advanced hepatocellular carcinoma?-the scoop-2 trial experience. Ann Transl Med (2020) 8(21):1340. doi: 10.21037/atm-2020-96
24. Zheng Y, Lu Z, Shi X, Tan T, Xing C, Xu J, et al. Lymph node ratio is a superior predictor in surgically treated early-onset pancreatic cancer. Front Oncol (2022) 12:975846. doi: 10.3389/fonc.2022.975846
25. Wang R, Fan H, Sun M, Lv Z, Yi W. Roles of Bmi1 in the initiation, progression, and treatment of hepatocellular carcinoma. Technol Cancer Res Treat (2022) 21:15330338211070689. doi: 10.1177/15330338211070689
26. Chen Y, Li Q, Wu Q. Stepwise encapsulation and controlled two-stage release system for cis-diamminediiodoplatinum. Int J Nanomedicine (2014) 9:3175–82. doi: 10.2147/ijn.S61570
Keywords: HCC (hepatic cellular carcinoma), SEER (Surveillance Epidemiology and End Results) database, machine learning - ML, risk factors, random survival forest model
Citation: Ge X-Y, Sun M-C, Wang T-Y, Wang X-M, Liu G, Yang T, Lu Y-M and Wang W (2023) Analysis of risk factors of hepatocellular carcinoma and establishment of a clinical prognosis model. Front. Oncol. 13:1067353. doi: 10.3389/fonc.2023.1067353
Received: 11 October 2022; Accepted: 10 March 2023;
Published: 22 March 2023.
Edited by:
Hong-Tao Hu, Henan Provincial Cancer Hospital, China
Reviewed by:
Jiang Chen, Zhejiang University, China
Alfredo Caturano, University of Campania Luigi Vanvitelli, Italy
Copyright © 2023 Ge, Sun, Wang, Wang, Liu, Yang, Lu and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Wei Wang, wangwei_lyyy@163.com