Cancer heterogeneity is a major challenge in clinical practice. It is thought to arise, at least in part, from the varying combinations of different cell types and their cross-talk with tumor cells, which together modulate the tumor microenvironment (TME). Despite recent methodological advances in cancer research, a reliable and robust model that can effectively investigate heterogeneity and has direct prognostic or diagnostic clinical application remains elusive.
To investigate cancer heterogeneity, we used single-cell transcriptome data to construct the first indication- and cell type-specific reference gene expression profile (RGEP) for breast cancer (BC) that can accurately predict cellular infiltration. By combining the BC-specific RGEP with a proven deconvolution model (LinDeconSeq), we were able to determine the intrinsic gene expression of 15 cell types in BC tissues. Besides identifying significant differences in cellular proportions between molecular subtypes, we also evaluated the varying degree of immune cell infiltration (basal-like subtype: highest; Her2 subtype: lowest) across all available TCGA-BRCA cohorts. By converting the cellular proportions into functional gene sets, we further developed a prognostic model based on 24 functional gene sets that can effectively discriminate the overall survival (
Herein, we have developed a highly reliable BC-RGEP that adequately annotates different cell types and estimates cellular infiltration. Importantly, the functional gene set-based prognostic model introduced here showed a strong ability to screen patients based on their therapeutic response. More broadly, we offer a perspective for generating similar models in other cancer types to identify shared factors that drive cancer heterogeneity.
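
To illustrate the reference-based deconvolution idea summarized above, the following minimal sketch models a bulk expression vector as a non-negative mixture of cell type signatures and recovers the mixing proportions by projected-gradient non-negative least squares. It is a generic stand-in and not the LinDeconSeq algorithm. The toy signature matrix, the three cell type labels, and the function name `deconvolve` are hypothetical and chosen only for illustration.

```python
import numpy as np


def deconvolve(reference, bulk, n_iter=2000):
    """Estimate non-negative cell type proportions for one bulk sample.

    reference: (genes x cell_types) signature matrix (a toy stand-in for an RGEP)
    bulk:      (genes,) bulk expression vector
    Note: plain projected-gradient NNLS, assumed here for illustration only.
    """
    # Step size 1/L, where L is the squared largest singular value of the
    # signature matrix (Lipschitz constant of the least-squares gradient).
    step = 1.0 / np.linalg.norm(reference, 2) ** 2
    n_types = reference.shape[1]
    weights = np.full(n_types, 1.0 / n_types)
    for _ in range(n_iter):
        gradient = reference.T @ (reference @ weights - bulk)
        # Clip at zero so that no cell type receives a negative weight.
        weights = np.maximum(weights - step * gradient, 0.0)
    total = weights.sum()
    # Rescale so that the weights can be read as proportions.
    return weights / total if total > 0 else weights


# Hypothetical signature matrix: 6 marker genes (rows) x 3 cell types (columns).
cell_types = ["tumor", "t_cell", "fibroblast"]
reference = np.array([
    [10.0, 1.0, 0.0],
    [8.0, 0.0, 1.0],
    [1.0, 9.0, 0.0],
    [0.0, 7.0, 2.0],
    [0.0, 1.0, 12.0],
    [1.0, 0.0, 6.0],
])

# Simulate a noise-free bulk sample with known proportions, then recover them.
true_proportions = np.array([0.5, 0.3, 0.2])
bulk = reference @ true_proportions
estimated = deconvolve(reference, bulk)

for name, value in zip(cell_types, estimated):
    print(f"{name}: {value:.3f}")
```

On real data the bulk profile is noisy and the signature columns are more correlated, so a robust or weighted regression would usually be preferred over this plain least-squares variant.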