To automatically quantify the colorectal tumor microenvironment (TME) in hematoxylin and eosin-stained whole slide images (WSIs), and to develop a TME signature for prognostic prediction in colorectal cancer (CRC).
A deep learning model based on the VGG19 architecture and a transfer learning strategy was trained to recognize nine tissue types in WSIs of patients with CRC. Seven of the nine tissue types were defined as TME components; the remaining two, background and debris, were not. Thirteen TME features were then calculated from the areas of the TME components. A total of 562 patients with gene expression data, survival information, and WSIs were collected from The Cancer Genome Atlas project for further analysis. A TME signature for prognostic prediction was developed and validated using Cox regression. A prognostic prediction model combining the TME signature with clinical variables was also established. Finally, gene-set enrichment analysis was performed to identify pathways significantly associated with the TME signature by querying the Gene Ontology database and the Kyoto Encyclopedia of Genes and Genomes database.
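As an illustration of how area-based TME features can be derived from tile-level predictions, the following is a minimal sketch and not the authors' implementation. It assumes each WSI has been split into equal-sized tiles, each assigned one tissue label by the classifier, so that tile counts stand in for areas. The label names and the function names `tme_area_fractions` and `area_ratio` are hypothetical, and the definitions of the 13 features used in the study are not given here.

```python
from collections import Counter

# Hypothetical label names: the text only states that background and debris
# are the two tissue types not counted as TME components.
EXCLUDED = {"background", "debris"}


def tme_area_fractions(tile_labels):
    """Return each TME component's share of the total TME tile area.

    Assumes equal-sized tiles, so the tile count is proportional to area.
    """
    counts = Counter(label for label in tile_labels if label not in EXCLUDED)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


def area_ratio(fractions, numerator, denominator):
    """Ratio of two component areas, or None if the denominator is absent."""
    denom = fractions.get(denominator, 0.0)
    if denom == 0:
        return None
    return fractions.get(numerator, 0.0) / denom


# Toy slide: 10 TME tiles plus 7 tiles that are ignored.
tiles = (["tumor"] * 6 + ["stroma"] * 3 + ["lymphocytes"] * 1
         + ["background"] * 5 + ["debris"] * 2)
fractions = tme_area_fractions(tiles)
stroma_to_tumor = area_ratio(fractions, "stroma", "tumor")
```

Per-slide fractions and ratios of this kind could then serve as candidate covariates in a Cox regression.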
The deep learning model achieved an accuracy of 94.2% for tissue type recognition. The developed TME signature was significantly associated with progression-free survival. The model combining the TME signature with clinical variables achieved a concordance index of 0.714. Gene-set enrichment analysis revealed that genes associated with the TME signature were enriched in the neuroactive ligand-receptor interaction pathway.
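For readers unfamiliar with the concordance index, the sketch below computes Harrell's C on toy data. It is a simplified illustration, not the evaluation code used in the study: pairs with tied survival times are skipped, and the function name `concordance_index` and the example values are assumptions made for this example.

```python
def concordance_index(times, events, risks):
    """Harrell's C: fraction of comparable pairs ranked correctly by risk.

    times  : observed follow-up times
    events : 1 if the event (e.g. progression) was observed, 0 if censored
    risks  : predicted risk scores (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # correctly ranked
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied risk counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort: the third patient is censored at time 6.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.7, 0.8, 0.1]
c_index = concordance_index(times, events, risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported 0.714.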
The TME signature was shown to be a prognostic factor, and the associated biological pathways may contribute to a better understanding of the TME in patients with CRC.