- 1Department of Obstetrics and Gynecology, Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai, China
- 2Department of Obstetrics and Gynecology, Renji Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, China
- 3MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, China
Background: Tumor pathology can assess patient prognosis based on a morphological deviation of tumor tissue from normal. Digitizing whole slide images (WSIs) of tissue enables the use of deep learning (DL) techniques in pathology, which may shed light on prognostic indicators of cancers, and avoid biases introduced by human experience.
Purpose: We aim to explore new prognostic indicators of ovarian cancer (OC) patients using the DL framework on WSIs, and provide a valuable approach for OC risk stratification.
Methods: We obtained the TCGA-OV dataset from the NIH Genomic Data Commons Data Portal database. The preprocessing of the dataset was comprised of three stages: 1) The WSIs and corresponding clinical data were paired and filtered based on a unique patient ID; 2) a weakly-supervised CLAM WSI-analysis tool was exploited to segment regions of interest; 3) the pre-trained model ResNet50 on ImageNet was employed to extract feature tensors. We proposed an attention-based network to predict a hazard score for each case. Furthermore, all cases were divided into a high-risk score group and a low-risk one according to the median as the threshold value. The multi-omics data of OC patients were used to assess the potential applications of the risk score. Finally, a nomogram based on risk scores and age features was established.
Results: A total of 90 WSIs were processed, extracted, and fed into the attention-based network. The mean value of the resulting C-index was 0.5789 (0.5096–0.6053), and the resulting p-value was 0.00845. Moreover, the risk score showed a better prediction ability in the HRD + subgroup.
Conclusion: Our deep learning framework is a promising method for searching WSIs, and providing a valuable clinical means for prognosis.
1 Introduction
Ovarian cancer (OC), as the “silent killer” of women’s health, is the leading cause of cancer-related death in gynecologic malignant diseases (Kuroki and Guntupalli, 2020). OC is a highly heterogeneous disease with a variety of subtypes that have various histologic and molecular characteristics (Kurman and Shih Ie, 2016), which raises challenges for effective prognosis stratification and clinical treatment management. High-throughput sequencing technologies have expedited research in cancer biology and provided a comprehensive genetic landscape (Lu et al., 2019). In recent years, more potential biomarkers for diagnosis and prognosis have been discovered based on the rapid advances in sequencing technologies. Similar to high throughput sequencing, the analysis of digital pathological images has provided an opportunity for biomarker detection and prognostic stratification (Desbois et al., 2020; Saillard et al., 2020; Skrede et al., 2020; Shi J. et al., 2021; Jin et al., 2021). Pathological analysis of OC patients is essential for obtaining patient diagnosis and cancer characteristics including histological subtype, grade and stage. Whole slide images (WSIs) harbor vast amount of information, such as growth patterns and intercellular interactions within tumor microenvironment, which is associated with the survival outcome. However, the high-dimensional information of pathology images cannot be recognized by the naked eyes of a pathologist.
Deep learning has presented outstanding advantages in medical image analysis due to its powerful feature representation (Litjens et al., 2017). Recent articles have shown that deep learning can enhance the analysis of pathology images for diagnostic and prognostic stratification (Skrede et al., 2020; Shi J. et al., 2021; Zhang X. et al., 2022). In practice, the labeling task mostly needs to spend much manual labor with experienced experts to implement a specific task for determining the target tissue. Especially, it is extremely challenging to finish a pixel-level labeling task for gigapixel images. Fortunately, the weakly-supervised learning approach can be exploited to alleviate this question because the clinical information almost includes a patient-level label (Lu et al., 2021).
Some works have investigated survival analysis based on time-to-event data via deep learning methods. Both learning the underlying dynamics of the modeling survival data and censoring are two important issues in the survival analysis. The right-censored cases led to the bias in the cross-Entropy-based model. Aimed at this question, the bias between them was analyzed systematically via different deep-learning model comparisons (Zadeh and Schmid, 2021). For right-censored data, the recurrent neural network was also utilized to conduct survival prediction and analysis. In addition, the survival loss function was also improved to reduce the bias by introducing the weight coefficient (Ren et al., 2019). Based on these works, multi-modality data was exploited and fused to predict the risk stratification for multiple types of cancers (Chen et al., 2022). However, risk stratification according to existing clinical indicators is not sufficient for OC patients. Thus, this study proposed a deep survival network based on WSIs to predict risk scores and obtained prediction of prognosis.
2 Materials and methods
2.1 Data collection and processing
For ovarian cancer prognostic analysis, we collected 106 patients’ H&E diagnostic WSIs with corresponding clinic data from TCGA-OV (https://portal.gdc.cancer.gov/projects/TCGA-OV) via the National Cancer Institute GDC Data Portal. The inclusion criteria herein consist of three aspects: 1) The quality of WSIs was assessed by an experienced clinic doctor; 2) retaining only one WSI for each case; 3) the case contained both clinical information and WSIs. As a result, 90 cases (60 uncensored patients and 30 censored ones) were incorporated to obtain a prediction model. Additionally, we exploited the five-cross validation method to train the model.
Herein, we utilized an open-source tool, CLAM WSI-analysis toolbox (Lu et al., 2021), to implement the segmentation and feature extraction tasks for each WSI. In this scenario, each original WSI consists of four levels of different resolutions. First, we segmented the tissue region of interest from the level 0 with the highest resolution in each WSI. Second, the whole slide image for each patient was split into M patches with 256 × 256 pixels without overlap. Third, a pre-trained deep network ResNet50 (trained on the ImageNet dataset) was exploited to extract feature tensors by feeding patches into it. Finally, the third block in the ResNet50 model was selected to output the feature tensor with 1,024 dimensions. Additionally, normalized gene expression was measured as Transcripts Per Kilobase Millions, and we processed the genetic mutation data of the TCGA dataset using the R package “maftools”.
2.2 Deep learning model
For each gigapixel WSI, it is a challenging task to provide pixel-level labels via human labor. The goal of survival data analysis is to train a predictor for hazard probability in a time interval based on plenty of image patches. Thus, we designed a weakly-supervised deep learning architecture as shown in Figure 1A, and the representative images were shown in Figure 1B.
FIGURE.1. The overall network architecture. (A) The weakly-supervised deep learning architecture. (B) The representative images.
We split the entire pipeline into two parts. In the first part, we extract
Due to the size of the dataset being slightly small, a two-layer network is designed to build the prediction layer. The overall network is optimized to predict the four hazard probabilities in four intervals via the loss function. The patient-level risk score is calculated by summing the four hazard scores. For subsequent analysis, the high-group and low-group are divided according to the median of all risk scores.
2.3 Loss function
To realize the survival analysis from patient-level data, we divide evenly the overall patient survival time into four intervals [ti,ti+1), i = 0,1,2,3 according to the uncensored cases.
where
The loss function is exploited as shown in Eq. 3 (Zadeh and Schmid, 2021).
To balance the differences between the uncensored cases and censored counterparts, a hyper-parameter
2.4 HRD analysis
The HRD score is calculated as the sum of telomeric allelic imbalance (TAI), large-scale state transitions (LST), and loss of heterozygosity (LOH) scores (Shi Z. et al., 2021). HRD scores are derived from research by Thorsson et al. (2018). HRD+ was defined as a high HRD score (threshold >42 score).
2.5 Tumor immune infiltration and pathway enrichment analysis
The relative infiltration level of immune cell types was quantified via single sample gene set enrichment analysis (ssGSEA) by the “GSVA” R package (Hanzelmann et al., 2013). GSEA was performed to assess related pathways.
2.6 Chemotherapeutic sensitivity prediction
Chemotherapeutic response prediction for OC samples was conducted in R by using the “oncoPredict” package (Maeser et al., 2021) from the Genomics of Drug Sensitivity in Cancer (GDSC) database (Yang et al., 2013). The ridge regression model was applied to evaluate the half maximal inhibitory concentration (IC50).
2.7 Nomogram construction
We performed a univariate analysis based on clinic parameters and risk scores. Afterward, multivariate Cox regression was conducted using the significant prognostic variables (p < 0.05). The nomogram was generated using the R package “rms”. We used calibration curves to test the consistency between predicted and actual survival rates. A time-dependent Receiver operating characteristic (ROC) curve was also used to assess the predictive accuracy of the nomogram. In addition, the Decision Curve Analysis (DCA) was used to demonstrate the advantage of the prediction curve using the R package “ggDCA”.
2.8 Statistical analysis
The statistical significance for variables with non-normal distribution was analyzed using the Wilcoxon rank sum test. The comparison between two groups of variables with normal distribution was estimated using an unpaired Student’s t-test. Non-parametric correlation analyses were conducted based on Spearman’s rank correlation coefficient. The prognostic analysis was performed by the Kaplan-Meier analysis method, and the log-rank test was used to evaluate significant differences. All statistical analyses were conducted using Python software (version 3.7) and R software (version 4.1.3). p < 0.05 was considered statistically significant.
3 Results
3.1 The training and validation of deep neural network
Overall 90 cases picked from the original TCGA-OV dataset were divided randomly into training (72 cases) and test (18 cases) datasets, respectively. We initialized the network parameters with “nn.init ()” and adopted the Adam solver with a momentum of 0.9 for the training process. Batch size, epoch number, and
C-index was originally proposed to evaluate predictions for binary responses (Harrell et al., 1996). As an evaluation metric utilized widely, it is herein utilized to evaluate the network performance. Figure 2A demonstrated that the C-index overall increased in the process of training. The C-index in the cross-validation experiment varies from 0.5096 to 0.6053 and the corresponding mean value was 0.5789. Figure 2B showed that the loss function reduces overall in 5-fold cross-validation results.
FIGURE 2. Evaluation metric and training loss were visualized. (A) The C-index changes with the epoch. (B) The whole loss reduces with the epoch.
To verify further the effectiveness of the deep survival network, the KM curve was employed to analyze the prediction results. The high- and low-risk cohort was classified via the median value of the sorted hazard value associated with each patient. The KM cure of overall survival and recurrence-free survival were plotted as shown in Figures 3A, B. In addition, the p-value was calculated to analyze these two distributions based on the log-rank test. p-value of 0.00845 demonstrated that there was a significant difference between the high- and the low-risk score group The clinical characteristics between the high and low risk score subgroups, including age, stage, grade, and residuals were presented in Table1 to provide a clearer understanding of the sample distribution. Detailed risk score and clinical characteristics were shown in Supplementary Table S1.
FIGURE 3. The Kaplan-Meier curve was plotted based on the prediction hazard value for each case. (A,B), The overall [(A), p = 0.0085] and recurrence-free [(B), p = 0.15] survival difference between high- and low-risk score groups.
3.2 Predicting survival outcome of HRD patients via risk score
Platinum-based chemotherapy is the essential treatment for OC. Patients with HRD+ (HRD score> 42) and BRCA1/2 mutation are more beneficial from chemotherapy, thus, we evaluated the prognostic applications of the risk score in HRD+ and HRD-subgroups. In the TCGA-OV cohort, the HRD + subgroup displayed a significantly better OS than HRD-group (p < 0.0001, Figure 4A). 61 overlapped patients with WSIs and HRD scores were detected (Figure 4B). The HRD + subgroup harbored mainly high-risk score population, while the HRD-group revealed an opposite result (Figure 4C), which suggested that the risk score may work in a different way from the method based on HRD in predicting survival outcomes. Moreover, survival analysis demonstrated a significantly better OS in patients with high-risk score than that with low-risk score in HRD + subgroup (p = 0.013; Figure 4D). Interestingly, there was no significant difference in HRD-group comparing the OS of the two risk populations (p = 0.92, Figure 4E). ROC curve analysis is used to assess the sensitivity and specificity of prediction models and validate the results of risk prediction values. The time-dependent ROC curve proved the reliable performance of risk score in HRD + subgroup (2-years AUC = 0.743, 3-years AUC = 0.718, 5-years AUC = 0.775, Figure 4F). To present the HRD score, FIGO staging, survival status, and risk score calculated by WSI as a unified system, Sankey diagram was constructed to describe the relationship between these features (Figure 4G).
FIGURE 4. Predicting survival of HRD patients. (A) The survival difference between HRD+ and HRD-groups. (B) Obtaining HRD score and risk score intersections with venn diagrams. (C) In the HRD + subgroup, a high percentage of people with high risk score. (D, E) The survival difference in HRD + subgroup (D) and HRD-group (E) between high and low risk score groups. (F) Time dependent ROC curves of risk score in HRD + group at 2, 3, and 5 years. (G) Described the relationship among HRD score, risk score, stage, and survival status by sankey diagram.
3.3 Mutant landscape and immune infiltration
The relationship between risk score and mutation landscape has been assessed in OC patients. In both high- and low-risk score groups, the top 20 mutated genes were presented in Figures 5A,B. TP53 (93%), USH2A (20%), and TTN (17%) exhibited the most frequent mutations in the high-risk score group, while the highly mutant genes in the low-risk score group were TP53 (79%), TTN (18%) and AHNAK (11%). Moreover, the high-risk score group genes were enriched in HOMOLOGOUS RECOMBINATION, OXIDATIVE_PHOSPHORYLATION, and RIBOSOME pathway (Figure 5C), as well as the low-risk score group genes were enriched in FOCAL ADHESION, JAK STAT SIGNALING PATHWAY, and ECM RECEPTOR INTERACTION pathway (Figure 5D). The mutation (Supplementary Tables S2, S3) and GSEA data (Supplementary Table S4) were shown in Supplementary Material. We also tried to analyze the correlation between the risk score and TMB, but got a negative result (Figure 5E). Since the high antigenicity induced by tumor mutations can recruit a large number of immune cells, we studied the relationship between risk and TIDE score, to evaluate the predictive value of risk score in immunotherapy outcomes. Unfortunately, although the risk score was correlated with TIDE (p = 0.0091, Figure 5F), the spearman correlation coefficient was relatively low. These results were consistent with the poor response of ovarian cancer to immunotherapy. Next, we analyzed the relationship between risk score and immune cell infiltration by ssGSEA. Several types of immune cells, such as the central memory CD4 T-cell, central memory CD8 T-cell, and effector memory CD8 T-cell, were found significantly less recruited in high-risk group tumor environment (Figure 5G). The risk score may be a supplemented as an indicative tool to further investigate immune cell infiltration in ovarian cancer.
FIGURE 5. Mutant landscape and immune infiltration between high and low risk score groups. (A,B) Mutation landscape of OC patient with low risk score (A) and high risk score (B). (C,D) Enrichment analysis based on GSEA in high risk score group (C) and low risk score group (D). (E) Comparison of tumor mutation burden between high and low risk score groups. (F) Relationship between risk score and TIDE score. (G) Relationship between risk score and immune infiltration.
3.4 Chemotherapeutic response analysis
We attempted to identify whether the risk score could be applied to predict the sensitivity of response to chemotherapies using the GDSC database. The results revealed that the low-risk score group had a lower half maximal inhibitory concentration of BMS-754807, doramapimod, JAK1_8709, JQ1, NU7441, RO3306, SB216763, and WZ4003, while the high-risk score group had a lower half maximal inhibitory concentration of cisplatin, leflunomide (Figure 6). The p38MAPK inhibitors ralimetinib have been shown to play an anti-cancer role in ovarian cancer (Campbell et al., 2014). While Doramapimod also showed potent anti-inflammatory effects as p38MAPK inhibitors (Schreiber et al., 2006), perhaps our study will explore a new alternative for the application of Doramapimod in ovarian cancer. Similarly, RO3306 (Yang et al., 2016), SB216763 (Kaltofen et al., 2020) were also shown to play an anti-cancer role in ovarian cancer. Moreover, the combination of JQ1 and cisplatin helped ovarian cancer-bearing mice survive (Yokoyama et al., 2016). However, previous study showed that NU7441 could induce resistance to PARP inhibitor in BRCA1-defective cells (McCormick et al., 2017), and BMS-754807 combined with carboplatin/paclitaxel was observed resistance in ovarian carcinosarcoma of patient-derived xenograft (Glaser et al., 2015). Therefore, we should actively explore new strategies for more drug combinations to avoid the resistance in ovarian cancer. Our model provided possibility for novel pathways of drugs. Also, we hope that more pre-clinical models will prove our predicted results.
3.5 Development of a nomogram for predicting survival
Based on the available clinic features, Cox regression analyses were conducted to identify the possibility that risk score was an independent prognostic factor for OS. The univariate Cox regression analysis revealed significant associations between risk score and OS (HR = 0.48, 95% CI = 0.266–0.86, p = 0.0138, Figure 7A). When other confounding factors were corrected, multivariate Cox regression analysis proved the risk score of WSIs was an independent predictor of prognosis (HR = 0.505, 95% CI = 0.275–0.928, p = 0.028, Figure 7B).
FIGURE 7. Predicting survival by integrating risk score and clinical feature. (A,B) Univariate (A) and multivariate Cox regression analysis (B) showed risk score and age were significantly correlated with overall survival. (C) Nomogram was constructed to predict the 2-, 3-, and 5-years survival of OC patients. (D) Calibration curve of the nomogram for predicting the probability of OS at 2, 3 and 5 years. (E) Time dependent ROC curves of nomogram at 2, 3 and 5 years. (F) Decision curve analysis of OS for the predicted nomogram model.
Following the results of the Cox regression analysis, we have further developed a nomogram incorporating two independent prognostic factors (risk score and age) to offer a quantitative method for estimating 2-, 3-, and 5-years survival rates of OC patients (Figure 7C). In addition, calibration plots showed that the nomogram was comparable to an ideal model (Figure 7D). At 2, 3, and 5 years, the AUC of the risk score was 0.605, 0.578, and 0.680. Nomogram’s accuracy of prediction over 2, 3, and 5 years was significantly higher, at 0.680, 0.613, and 0.745, respectively (Figure 7E). DCA result also indicated our nomogram had a promising potential for clinical application (Figure 7F).
4 Discussion
Tumor histology remains essential in predicting tumor aggressiveness and evaluating prognostic stratification. Previous studies have proved that deep learning models, such as DeepSurv (Katzman et al., 2018)and AECOX (Huang et al., 2020), can provide better predicted performance than traditional Cox regression model in predicting prognosis since learning the complex non-linear interactions (Tran et al., 2021). In addition, several recent studies have provided preliminary evidence that deep learning can help predict patient prognosis from digital pathology images (Skrede et al., 2020; Shi J. et al., 2021), but it is yet unclear how this contributes to OC risk categorization. In our study, we showed that WSIs data had a predictive capacity for survival.
Based on this concept, we exploited an attention-based network architecture to predict the hazard function based on patient-level labels. Unfortunately, the sample size of the TCGA-OV obtained 106 cases and only 90 cases were incorporated into this experiment finally. It was prone to overfitting for small sample datasets. To overcome the overfitting issue, we reduced the parameter number of FC. In the modeling process, the cross-entropy loss function and negative log-likelihood loss function are popular approaches. (Zadeh and Schmid (2021) analyzed the bias between two loss functions. Experiment results showed that the cross-entropy-based network tended to be similar to the respective results obtained from log-likelihood-based training if the censoring rate was high. The censored/uncensored case belongs to two kinds of survival data. For deep learning algorithms, we generally assume that the training dataset and test one are subject to the same distribution. Especially, The TCGA-OV dataset herein only obtained 90 cases after deleting some unsuitable cases. Thus, we randomly split the dataset and made the training have a similar portion in each fold. Of course, we conducted the same operation in the test dataset. The experiment results showed that the network presents a good performance in terms of the C-index. However, more cases incorporated will be helpful to enhance the model’s accuracy. The prognostic analysis of ovarian cancer by pathological images was included in the previous pan-cancer analysis, which achieved a c-index of 0.57243 (Fu et al., 2020), but our c-index was up to 0.6053. Moreover, our method automatically selects regions of interest (ROI) from the entire tissue field, which will allow pathologists to enhance standard clinical workflows without additional manual steps. And previous study has successfully predicted the recurrence in breast cancer without ROI label (Phan et al., 2021). We believe that the approach will avoid subjective bias.
We also performed analysis at the molecular level associated with the pathological images, including in HRD subgroup analysis, pathway enrichment analysis, etc., which was not seen in the previous study. OC patients with HRD can increase sensitivity to PARP inhibitors and improve overall survival benefits. Previous studies have shown that integrating image information and HRD status using machine learning methods has increased the capability of prognostic prediction for OC patients (Boehm et al., 2022), but no studies are using WSIs data to assess prognosis in HRD patients. Thus, we researched the survival differences in the HRD + subgroup with a risk score. The results showed that patient survival with a high-risk score was better than that with a low-risk score. And AUC of time-dependent ROC curves verified the reliable predicted performance of the risk score. Thus, the pathological images can not only determine the risk stratification of overall ovarian cancer patients but also differentiate the prognosis of HRD subgroups of patients, which may provide options for targeted therapy in OC patients.
Tumor microenvironment (TME) has been proved to play an essential role in tumor proliferation, migration and metastasis (Jiang et al., 2020). Here we found the low risk score group slides harbor complex cellular components as a common characteristics in WSI. In addition, Park et al. (2022) analyzed tumor-infiltrating lymphocytes based on AI using WSI, which illustrated the relationship between the characteristics of WSI and TME. The similar relationship between risk score based on WSI and immune cell infiltration was presented in our study.
Ovarian cancer with low tumor-infiltrating lymphocyte is considered “cold” tumor (Yang et al., 2022). The results of our immune infiltration analysis also showed no difference in activated CD8 T-cells between the high and low risk score subgroups. Interestingly, we found that there was significant difference in central memory T-cells, such as central memory CD8 and CD4 T-cells. The central memory T-cells have high proliferative potential compared to effector memory T-cells, which contributes to the formation of the patient’s immune memory pool (Sallusto et al., 2004). Sckisel et al. (2017) showed that the composition of the memory pool at different loci had a strong influence on the overall expression of those markers in the memory pool. For example, the memory ratio of CD8 subpopulations was more skewed toward central memory T-cell in the lymphoid region (Sallusto et al., 2004). Therefore, we hypothesized that the characteristic marker distribution of the immune memory pool in ovarian cancer patients could be detected by the advantage of spatial visualization of WSI, and predicting the recurrence and survival of patients. However, there are fewer studies on central memory T-cells in ovarian cancer, and experiments will be needed to explore in the future.
By GSEA analysis, we found that pathways associated with HRD were enriched, such as homologous recombination pathways. DNA can be repaired by high-fidelity homologous recombination when double-stranded damage occurs. The risk of cancer will increase when HR is dysregulated (Helleday, 2010). In addition, we observed that the ribosomal pathway was also enriched. It has been shown that deficient ribosome assembly was associated with cancer, while mutation in ribosomal proteins regulated the translation and activity of p53, ultimately leading to disease (Goudarzi and Lindstrom, 2016). Interestingly, it was found that pathways associated with TME, such as the ECM pathway (Sangaletti et al., 2017), were enriched by GSEA analysis. The previous study has proven that the invasion and survival of tumor cells can be promoted by ECM-mediated signaling (Conklin and Keely, 2012). Cancer patients, such as pancreatic and colorectal cancer, with high levels of ECM change deposition have a poor prognosis (Levental et al., 2009; Calon et al., 2015; Isella et al., 2015). Similarly, several studies have suggested that abnormal activation of focal adhesion and JAK-STAT was associated with progression and poor prognosis of ovarian cancer (Yang et al., 2019; Zhang J. et al., 2022).
We further observed the different sensitivity to chemotherapy drugs in two risk groups. The risk scores obtained from the WSIs had a low correlation coefficient with the TIDE score, which meant that the risk score was not a good predictor of response to immunotherapy. Our findings demonstrated that the high-risk score group had higher IC50 levels for several chemotherapeutic drugs, indicating the OC patients with low-risk scores were more responsive to the selected drugs. Additionally, the risk score can be utilized as an independent prognostic factor. By combining risk score with age to draw a nomogram, the model had a stable and powerful survival predictive capability. However, some limitations of this research should be noted. Firstly, the risk score has not been validated in external clinical settings, although we are planning the clinical validation of our WSI datasets. Additionally, prospective multi-center studies may be needed to test our model and to overcome possible biases of the retrospective research.
In summary, this study proposed a deep learning framework based on WSI to predict patient prognosis in OC. We believed that the prognostic indicator has the possibility of being used by clinicians to improve decision-making.
Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.
Ethics statement
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.
Author contributions
MW, CZ, WS, and YW designed the study. CZ collected the original slides. CZ, WS, and XY. performed the deep learning algorithm development and validation. MW and JY conducted bioinformatics analysis. The search of the existing literature was conducted by SC, SG, SX, and YW, MW, and CZ drafted the article. CZ, SH, and YW edited the manuscript. All authors contributed to the article and approved the final version.
Funding
This work is supported by the National Natural Science Foundation of China (Grant No. 82072866), the Shanghai Special Program of Biomedical Science and Technology Support (Grant No. 21S31903600), and the Clinical Scientific innovation and Cultivation Fund of Renji Hospital Affiliated School of Medicine, Shanghai Jiao Tong University (Grant No. PYII20-02), Shanghai Municipal Science and Technology Major Project (2021SHZDZX0102), the National Natural Science Foundation of China (Grant No. 61901259, Grant No. 62176159), Natural Science Foundation of Shanghai (Grant No. 21ZR1432200) and Zhejiang Lab’s International Talent Fund for Young Professionals.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2022.1069673/full#supplementary-material
References
Boehm, K. M., Aherne, E. A., Ellenson, L., Nikolovski, I., Alghamdi, M., Vazquez-Garcia, I., et al. (2022). Multimodal data integration using machine learning improves risk stratification of high-grade serous ovarian cancer. Nat. Cancer 3 (6), 723–733. doi:10.1038/s43018-022-00388-9
Calon, A., Lonardo, E., Berenguer-Llergo, A., Espinet, E., Hernando-Momblona, X., Iglesias, M., et al. (2015). Stromal gene expression defines poor-prognosis subtypes in colorectal cancer. Nat. Genet. 47 (4), 320–329. doi:10.1038/ng.3225
Campbell, R. M., Anderson, B. D., Brooks, N. A., Brooks, H. B., Chan, E. M., De Dios, A., et al. (2014). Characterization of LY2228820 dimesylate, a potent and selective inhibitor of p38 MAPK with antitumor activity. Mol. Cancer Ther. 13 (2), 364–374. doi:10.1158/1535-7163.MCT-13-0513
Chen, R. J., Lu, M. Y., Williamson, D. F. K., Chen, T. Y., Lipkova, J., Noor, Z., et al. (2022). Pan-cancer integrative histology-genomic analysis via multimodal deep learning. Cancer Cell 40 (8), 865–878.e6. doi:10.1016/j.ccell.2022.07.004
Conklin, M. W., and Keely, P. J. (2012). Why the stroma matters in breast cancer: Insights into breast cancer patient outcomes through the examination of stromal biomarkers. Cell Adh Migr. 6 (3), 249–260. doi:10.4161/cam.20567
Desbois, M., Udyavar, A. R., Ryner, L., Kozlowski, C., Guan, Y., Durrbaum, M., et al. (2020). Integrated digital pathology and transcriptome analysis identifies molecular mediators of T-cell exclusion in ovarian cancer. Nat. Commun. 11 (1), 5583. doi:10.1038/s41467-020-19408-2
Fu, Y., Jung, A. W., Torne, R. V., Gonzalez, S., Vohringer, H., Shmatko, A., et al. (2020). Pan-cancer computational histopathology reveals mutations, tumor composition and prognosis. Nat. Cancer 1 (8), 800–810. doi:10.1038/s43018-020-0085-8
Glaser, G., Weroha, S. J., Becker, M. A., Hou, X., Enderica-Gonzalez, S., Harrington, S. C., et al. (2015). Conventional chemotherapy and oncogenic pathway targeting in ovarian carcinosarcoma using a patient-derived tumorgraft. PLoS One 10 (5), e0126867. doi:10.1371/journal.pone.0126867
Goudarzi, K. M., and Lindstrom, M. S. (2016). Role of ribosomal protein mutations in tumor development (Review). Int. J. Oncol. 48 (4), 1313–1324. doi:10.3892/ijo.2016.3387
Hanzelmann, S., Castelo, R., and Guinney, J. (2013). Gsva: Gene set variation analysis for microarray and RNA-seq data. BMC Bioinforma. 14, 7. doi:10.1186/1471-2105-14-7
Harrell, F. E., Lee, K. L., and Mark, D. B. (1996). Multivariable prognostic models: Issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat. Med. 15 (4), 361–387.
Helleday, T. (2010). Homologous recombination in cancer development, treatment and development of drug resistance. Carcinogenesis 31 (6), 955–960. doi:10.1093/carcin/bgq064
Huang, Z., Johnson, T. S., Han, Z., Helm, B., Cao, S., Zhang, C., et al. (2020). Deep learning-based cancer survival prognosis from RNA-seq data: Approaches and evaluations. BMC Med. Genomics 13 (5), 41. doi:10.1186/s12920-020-0686-1
Isella, C., Terrasi, A., Bellomo, S. E., Petti, C., Galatola, G., Muratore, A., et al. (2015). Stromal contribution to the colorectal cancer transcriptome. Nat. Genet. 47 (4), 312–319. doi:10.1038/ng.3224
Jiang, Y., Wang, C., and Zhou, S. (2020). Targeting tumor microenvironment in ovarian cancer: Premise and promise. Biochim. Biophys. Acta Rev. Cancer 1873 (2), 188361. doi:10.1016/j.bbcan.2020.188361
Jin, L., Shi, F., Chun, Q., Chen, H., Ma, Y., Wu, S., et al. (2021). Artificial intelligence neuropathologist for glioma classification using deep learning on hematoxylin and eosin stained slide images and molecular markers. Neuro Oncol. 23 (1), 44–52. doi:10.1093/neuonc/noaa163
Kaltofen, T., Preinfalk, V., Schwertler, S., Fraungruber, P., Heidegger, H., Vilsmaier, T., et al. (2020). Potential of platinum-resensitization by Wnt signaling modulators as treatment approach for epithelial ovarian cancer. J. Cancer Res. Clin. Oncol. 146 (10), 2559–2574. doi:10.1007/s00432-020-03317-4
Katzman, J. L., Shaham, U., Cloninger, A., Bates, J., Jiang, T., and Kluger, Y. (2018). DeepSurv: Personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Med. Res. Methodol. 18 (1), 24. doi:10.1186/s12874-018-0482-1
Kurman, R. J., and Shih Ie, M. (2016). The dualistic model of ovarian carcinogenesis: Revisited, revised, and expanded. Am. J. Pathol. 186 (4), 733–747. doi:10.1016/j.ajpath.2015.11.011
Kuroki, L., and Guntupalli, S. R. (2020). Treatment of epithelial ovarian cancer. BMJ 371, m3773. doi:10.1136/bmj.m3773
Levental, K. R., Yu, H., Kass, L., Lakins, J. N., Egeblad, M., Erler, J. T., et al. (2009). Matrix crosslinking forces tumor progression by enhancing integrin signaling. Cell 139 (5), 891–906. doi:10.1016/j.cell.2009.10.027
Litjens, G., Kooi, T., Bejnordi, B. E., Setio, A. A. A., Ciompi, F., Ghafoorian, M., et al. (2017). A survey on deep learning in medical image analysis. Med. Image Anal. 42, 60–88. doi:10.1016/j.media.2017.07.005
Loshchilov, I., and Hutter, F. (2017). “Sgdr: Stochastic gradient descent with warm restarts,” in International Conference on Learning Representations(ICLR), Toulon, France, 22 Jul 2022. doi:10.48550/arXiv.1608.03983
Lu, H. M., Li, S., Black, M. H., Lee, S., Hoiness, R., Wu, S., et al. (2019). Association of breast and ovarian cancers with predisposition genes identified by large-scale sequencing. JAMA Oncol. 5 (1), 51–57. doi:10.1001/jamaoncol.2018.2956
Lu, M. Y., Williamson, D. F. K., Chen, T. Y., Chen, R. J., Barbieri, M., and Mahmood, F. (2021). Data-efficient and weakly supervised computational pathology on whole-slide images. Nat. Biomed. Eng. 5 (6), 555–570. doi:10.1038/s41551-020-00682-w
Maeser, D., Gruener, R. F., and Huang, R. S. (2021). oncoPredict: an R package for predicting in vivo or cancer patient drug response and biomarkers from cell line screening data. Brief. Bioinform 22 (6), bbab260. doi:10.1093/bib/bbab260
McCormick, A., Donoghue, P., Dixon, M., O'Sullivan, R., O'Donnell, R. L., Murray, J., et al. (2017). Ovarian cancers harbor defects in nonhomologous end joining resulting in resistance to rucaparib. Clin. Cancer Res. 23 (8), 2050–2060. doi:10.1158/1078-0432.CCR-16-0564
Park, S., Ock, C. Y., Kim, H., Pereira, S., Park, S., Ma, M., et al. (2022). Artificial intelligence-powered spatial analysis of tumor-infiltrating lymphocytes as complementary biomarker for immune checkpoint inhibition in non-small-cell lung cancer. J. Clin. Oncol. 40 (17), 1916–1928. doi:10.1200/JCO.21.02010
Phan, N. N., Hsu, C. Y., Huang, C. C., Tseng, L. M., and Chuang, E. Y. (2021). Prediction of breast cancer recurrence using a deep convolutional neural network without region-of-interest labeling. Front. Oncol. 11, 734015. doi:10.3389/fonc.2021.734015
Ren, K., Qin, J., Zheng, L., Yang, Z., Zhang, W., Qiu, L., et al. (2019). Deep recurrent survival analysis. Proc. AAAI Conf. Artif. Intell. 33 (01), 4798–4805. doi:10.1609/aaai.v33i01.33014798
Saillard, C., Schmauch, B., Laifa, O., Moarii, M., Toldo, S., Zaslavskiy, M., et al. (2020). Predicting survival after hepatocellular carcinoma resection using deep learning on histological slides. Hepatology 72 (6), 2000–2013. doi:10.1002/hep.31207
Sallusto, F., Geginat, J., and Lanzavecchia, A. (2004). Central memory and effector memory T cell subsets: Function, generation, and maintenance. Annu. Rev. Immunol. 22, 745–763. doi:10.1146/annurev.immunol.22.012703.104702
Sangaletti, S., Chiodoni, C., Tripodo, C., and Colombo, M. P. (2017). The good and bad of targeting cancer-associated extracellular matrix. Curr. Opin. Pharmacol. 35, 75–82. doi:10.1016/j.coph.2017.06.003
Schreiber, S., Feagan, B., D'Haens, G., Colombel, J. F., Geboes, K., Yurcov, M., et al. (2006). Oral p38 mitogen-activated protein kinase inhibition with BIRB 796 for active crohn's disease: A randomized, double-blind, placebo-controlled trial. Clin. Gastroenterol. Hepatol. 4 (3), 325–334. doi:10.1016/j.cgh.2005.11.013
Sckisel, G. D., Mirsoian, A., Minnar, C. M., Crittenden, M., Curti, B., Chen, J. Q., et al. (2017). Differential phenotypes of memory CD4 and CD8 T cells in the spleen and peripheral tissues following immunostimulatory therapy. J. Immunother. Cancer 5, 33. doi:10.1186/s40425-017-0235-4
Shi, J. Y., Wang, X., Ding, G. Y., Dong, Z., Han, J., Guan, Z., et al. (2021). Exploring prognostic indicators in the pathological images of hepatocellular carcinoma based on deep learning. Gut 70 (5), 951–961. doi:10.1136/gutjnl-2020-320930
Shi, Z., Zhao, Q., Lv, B., Qu, X., Han, X., Wang, H., et al. (2021). Identification of biomarkers complementary to homologous recombination deficiency for improving the clinical outcome of ovarian serous cystadenocarcinoma. Clin. Transl. Med. 11 (5), e399. doi:10.1002/ctm2.399
Skrede, O. J., De Raedt, S., Kleppe, A., Hveem, T. S., Liestol, K., Maddison, J., et al. (2020). Deep learning for prediction of colorectal cancer outcome: A discovery and validation study. Lancet 395 (10221), 350–360. doi:10.1016/S0140-6736(19)32998-8
Thorsson, V., Gibbs, D. L., Brown, S. D., Wolf, D., Bortone, D. S., Ou Yang, T. H., et al. (2018). The immune landscape of cancer. Immunity 48 (4), 812–830. e814. doi:10.1016/j.immuni.2018.03.023
Tran, K. A., Kondrashova, O., Bradley, A., Williams, E. D., Pearson, J. V., and Waddell, N. (2021). Deep learning in cancer diagnosis, prognosis and treatment selection. Genome Med. 13 (1), 152. doi:10.1186/s13073-021-00968-x
Yang, J., Xing, H., Lu, D., Wang, J., Li, B., Tang, J., et al. (2019). Role of Jagged1/STAT3 signalling in platinum-resistant ovarian cancer. J. Cell Mol. Med. 23 (6), 4005–4018. doi:10.1111/jcmm.14286
Yang, W., Cho, H., Shin, H. Y., Chung, J. Y., Kang, E. S., Lee, E. J., et al. (2016). Accumulation of cytoplasmic Cdk1 is associated with cancer growth and survival rate in epithelial ovarian cancer. Oncotarget 7 (31), 49481–49497. doi:10.18632/oncotarget.10373
Yang, W., Soares, J., Greninger, P., Edelman, E. J., Lightfoot, H., Forbes, S., et al. (2013). Genomics of drug sensitivity in cancer (GDSC): A resource for therapeutic biomarker discovery in cancer cells. Nucleic Acids Res. 41, D955–D961. doi:10.1093/nar/gks1111
Yang, Y., Zhao, T., Chen, Q., Li, Y., Xiao, Z., Xiang, Y., et al. (2022). Nanomedicine strategies for heating "cold" ovarian cancer (OC): Next evolution in immunotherapy of OC. Adv. Sci. (Weinh) 9 (28), e2202797. doi:10.1002/advs.202202797
Yokoyama, Y., Zhu, H., Lee, J. H., Kossenkov, A. V., Wu, S. Y., Wickramasinghe, J. M., et al. (2016). BET inhibitors suppress ALDH activity by targeting ALDH1A1 super-enhancer in ovarian cancer. Cancer Res. 76 (21), 6320–6330. doi:10.1158/0008-5472.CAN-16-0854
Zadeh, S. G., and Schmid, M. (2021). Bias in cross-entropy-based training of deep survival networks. IEEE Trans. Pattern Anal. Mach. Intell. 43 (9), 3126–3137. doi:10.1109/TPAMI.2020.2979450
Zhang, J., Li, Y., Liu, H., Zhang, J., Wang, J., Xia, J., et al. (2022). Genome-wide CRISPR/Cas9 library screen identifies PCMT1 as a critical driver of ovarian cancer metastasis. J. Exp. Clin. Cancer Res. 41 (1), 24. doi:10.1186/s13046-022-02242-3
Keywords: ovarian cancer, deep learning, prognosis, risk stratification, pathology
Citation: Wu M, Zhu C, Yang J, Cheng S, Yang X, Gu S, Xu S, Wu Y, Shen W, Huang S and Wang Y (2023) Exploring prognostic indicators in the pathological images of ovarian cancer based on a deep survival network. Front. Genet. 13:1069673. doi: 10.3389/fgene.2022.1069673
Received: 14 October 2022; Accepted: 12 December 2022;
Published: 04 January 2023.
Edited by:
Jinhui Liu, Nanjing Medical University, ChinaReviewed by:
Simin Li, Southern Medical University, ChinaKyoung-Ho Pyo, Yonsei University, South Korea
Copyright © 2023 Wu, Zhu, Yang, Cheng, Yang, Gu, Xu, Wu, Shen, Huang and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Yu Wang, cmVuaml3YW5neXVAMTI2LmNvbQ==; Shan Huang, aHVhbmdzaGFuQHJlbmppLmNvbQ==; Wei Shen, d2VpLnNoZW5Ac2p0dS5lZHUuY24=
†These authors have contributed equally to this work