Colon cancer is regarded as one of the most common deadly cancers worldwide. Owing to a limited understanding of its prognosis, it has the second-highest morbidity and mortality rate among cancers. A variety of genes participate in colon cancer, and the underlying molecular mechanism remains largely unclear. In addition, many studies have identified differentially expressed genes to investigate gene dysfunction, but most of them analyzed the genes individually. In this study, we constructed a functional interaction network to identify groups of genes that carry out cellular functions, and a protein-protein interaction (PPI) network to better understand protein functions and their biological relationships. A functional evolution network was also generated to analyze dysfunctions from the initial stage to the later stage of colon cancer by investigating gene modules and their molecular functions. The results show that the proposed evolution network is able to detect significant cellular functions, which can be used to explore the evolution process of colon cancer. Moreover, 10 core genes associated with colon cancer were identified: INS, SNAP25, GRIA2, SST, GCG, PVALB, SLC17A7, SLC32A1, SLC17A6, and NPY. The candidate genes and corresponding pathways presented in this study could be used to develop new tumor indicators and novel therapeutic targets for the prevention and treatment of colon cancer.
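The abstract does not state how the core genes were selected from the interaction network. As a minimal sketch, assuming the common convention of ranking genes by degree (the number of distinct interaction partners), hub selection could look like the following. The function name `rank_hub_genes`, the placeholder gene labels `G1` to `G5`, and the edges are all hypothetical and are not data from the study.

```python
from collections import defaultdict


def rank_hub_genes(edges, top_k):
    """Rank genes by degree in an undirected interaction network.

    Assumption: degree is used as the hub criterion; the study may have
    used a different centrality measure.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)  # undirected: record both directions
    # Sort by descending degree; break ties alphabetically so the
    # result is deterministic.
    ranked = sorted(neighbors, key=lambda g: (-len(neighbors[g]), g))
    return [(g, len(neighbors[g])) for g in ranked[:top_k]]


# Toy, made-up edges between placeholder genes (not real interactions).
toy_edges = [
    ("G1", "G2"),
    ("G1", "G3"),
    ("G1", "G4"),
    ("G2", "G3"),
    ("G4", "G5"),
    ("G2", "G1"),  # duplicate of the first edge; counted once
]

top = rank_hub_genes(toy_edges, top_k=2)
print(top)
```

Storing neighbors in sets means duplicate or reversed edges do not inflate a gene's degree, which matters when interaction lists are merged from several sources.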
Single-cell multiomics sequencing techniques have developed rapidly in the past few years. Among these techniques, single-cell cellular indexing of transcriptomes and epitopes (CITE-seq) allows simultaneous quantification of gene expression and surface proteins. Clustering CITE-seq data has great potential to provide a more comprehensive and in-depth view of cell states and interactions. However, CITE-seq data inherit the properties of scRNA-seq data: they are noisy, high-dimensional, and highly sparse. Moreover, the representations of RNA and surface protein are sometimes weakly correlated and contribute differently to the clustering objective. To overcome these obstacles and find a combined representation well suited for clustering, we propose scCTClust for clustering analysis of multiomics data, especially CITE-seq data. Two omics-specific neural networks are introduced to extract cluster information from the omics data. A deep canonical correlation method is adopted to find the maximally correlated representations of the two omics. A novel decentralized clustering method is applied to the linear combination of the latent representations of the two omics. The fusion weights, which account for the contribution of each omics to clustering, are adaptively updated during training. Extensive experiments on both simulated and real CITE-seq data sets demonstrate the power of scCTClust. We also applied scCTClust to transcriptome–epigenome data to illustrate its potential to generalize.
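The linear combination of latent representations can be illustrated with a short sketch. The abstract says only that the fusion weights are adaptively updated during training. The softmax parameterization below (which keeps the weights positive and summing to one) and the names `softmax`, `fuse`, `z_rna`, `z_protein`, and `params` are assumptions made for illustration, not the scCTClust implementation.

```python
import math


def softmax(params):
    """Map unconstrained parameters to positive weights that sum to 1."""
    m = max(params)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in params]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(z_rna, z_protein, params):
    """Linearly combine two latent matrices (cells x latent dims).

    Assumption: one scalar weight per omics, obtained via softmax. In
    training, `params` would be updated together with the networks;
    that update step is not shown here.
    """
    w_rna, w_protein = softmax(params)
    return [
        [w_rna * r + w_protein * p for r, p in zip(row_r, row_p)]
        for row_r, row_p in zip(z_rna, z_protein)
    ]


# Toy latent vectors for a single cell (made-up numbers).
z_rna = [[1.0, 2.0]]
z_protein = [[3.0, 4.0]]

# Equal parameters give equal weights, so the fusion is a plain average.
fused_equal = fuse(z_rna, z_protein, params=[0.0, 0.0])
print(fused_equal)

# A larger RNA parameter shifts the fused representation toward RNA.
fused_rna_heavy = fuse(z_rna, z_protein, params=[2.0, 0.0])
print(fused_rna_heavy)
```

Under this assumed parameterization, the learned weights can be read directly as the relative contribution of each omics to the clustering.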