Objective: Modern medicine needs to shift from a wait-and-react, curative discipline to a preventative, interdisciplinary science that aims to provide personalized, systemic, and precise treatment plans to patients. To this end, we propose a “digital twin” of patients that models the human body as a whole and provides a panoramic view of individuals' conditions.
Methods: We propose a general framework that composes advanced artificial intelligence (AI) approaches and integrates mathematical modeling in order to provide a panoramic view over current and future pathophysiological conditions. Our modular architecture is based on a graph neural network (GNN) forecasting clinically relevant endpoints (such as blood pressure) and a generative adversarial network (GAN) providing a proof of concept of transcriptomic integrability.
Results: We tested our digital twin model on two simulated clinical case studies combining information at the organ, tissue, and cellular levels. We provided a panoramic overview of patients' current and future conditions by using the GNN model to monitor and forecast clinically relevant endpoints that represent the evolution of a patient's vital parameters. We showed how to use the GAN to generate multi-tissue expression data for blood and lung to find associations between cytokines conditioned on the expression of genes in the renin–angiotensin pathway. Our approach was to detect inflammatory cytokines, which are known to have effects on blood pressure and have previously been associated with SARS-CoV-2 infection (e.g., CXCR6, XCL1, and others).
Significance: The graph representation of a computational patient has the potential to solve important technological challenges in integrating multiscale computational modeling with AI. We believe that this work represents a step toward next-generation devices for precision and predictive medicine.
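To make the graph representation concrete, the following is a minimal sketch of one round of neighbour-averaging message passing, the basic update that graph neural networks build on. It is not the authors' GNN: the organ names, the edge list, the scalar node state, and the fixed `self_weight` are all illustrative assumptions, and a trained model would learn its weights rather than fix them.

```python
from typing import Dict, List, Tuple


def message_passing_step(
    state: Dict[str, float],
    edges: List[Tuple[str, str]],
    self_weight: float = 0.5,
) -> Dict[str, float]:
    """Run one round of mean-neighbour aggregation on an undirected graph.

    Each node keeps `self_weight` of its own value and takes the rest from
    the mean of its neighbours. The weight is fixed here (an assumption);
    a real GNN would learn this transformation from data.
    """
    neighbours: Dict[str, List[str]] = {node: [] for node in state}
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)

    new_state: Dict[str, float] = {}
    for node, value in state.items():
        if neighbours[node]:
            mean_nb = sum(state[n] for n in neighbours[node]) / len(neighbours[node])
        else:
            mean_nb = value  # isolated node: nothing to aggregate
        new_state[node] = self_weight * value + (1.0 - self_weight) * mean_nb
    return new_state


# Hypothetical three-organ patient graph with a perturbation at the heart.
state = {"heart": 1.0, "lung": 0.0, "kidney": 0.0}
edges = [("heart", "lung"), ("heart", "kidney")]
next_state = message_passing_step(state, edges)
```

Repeating the step propagates a local perturbation across connected organs, which is the intuition behind forecasting endpoints on a patient graph.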
The functions of proteins are mainly determined by their subcellular localizations. Many computational methods for predicting the subcellular localization of proteins have been proposed. However, these methods require further improvement, especially in how proteins are represented. In this study, we present an embedding-based method for predicting the subcellular localization of proteins. We first learn functional embeddings of KEGG/GO terms, which are then used to represent proteins. Next, we learn network embeddings of proteins on a protein–protein network. The functional and network embeddings are combined into novel protein representations for the construction of the final classification model. On our benchmark dataset of 4,861 proteins from 16 locations, the best model achieves a Matthews correlation coefficient of 0.872 and is thus superior to multiple conventional methods.
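For readers unfamiliar with the evaluation metric, the following is a minimal sketch of the Matthews correlation coefficient in its multiclass generalization, computed from label counts. The abstract does not state which variant was used for the 16-location problem, so the multiclass form is an assumption, and the function name and the toy labels are hypothetical.

```python
from collections import Counter
from math import sqrt
from typing import Hashable, Sequence


def multiclass_mcc(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Multiclass Matthews correlation coefficient.

    Uses the count-based form:
        (c*s - sum_k t_k*p_k) / sqrt((s^2 - sum_k t_k^2) * (s^2 - sum_k p_k^2))
    where s is the number of samples, c the number of correct predictions,
    t_k the true count and p_k the predicted count of class k.
    Returns 0.0 when the denominator is zero (a common convention).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    t_counts = Counter(y_true)
    p_counts = Counter(y_pred)
    labels = set(t_counts) | set(p_counts)

    numerator = c * s - sum(t_counts[k] * p_counts[k] for k in labels)
    var_true = s * s - sum(t_counts[k] ** 2 for k in labels)
    var_pred = s * s - sum(p_counts[k] ** 2 for k in labels)
    denominator = sqrt(var_true * var_pred)
    return numerator / denominator if denominator else 0.0


# Toy labels for illustration only; these are not data from the study.
truth = ["nucleus", "nucleus", "cytoplasm", "membrane"]
perfect_score = multiclass_mcc(truth, truth)
```

A score of 1 indicates perfect agreement, 0 indicates performance no better than chance, and negative values indicate systematic disagreement.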
Breast cancer is the most common malignancy in women and has a high mortality rate, so computational methods that improve the accuracy of breast cancer survival prediction are urgently needed. Although multi-omics data such as gene expression have been used extensively in recent studies, accurate prognosis of breast cancer remains a challenge. Somatic mutations are another important and promising data source for studying cancer development, and their effect on the prognosis of breast cancer remains to be further explored. Moreover, these omics datasets are high-dimensional and redundant. We therefore adopted multiple kernel learning (MKL) to efficiently integrate somatic mutation data with existing molecular data, including gene expression, copy number variation (CNV), methylation, and protein expression data, for the prediction of breast cancer survival. Before integration, the maximum relevance minimum redundancy (mRMR) feature selection method was applied to each data type to select features with high relevance to survival and low redundancy among themselves. The experimental results demonstrated that the proposed method achieved the best performance and that prediction performance improved remarkably when somatic mutations were included, indicating that somatic mutations are critical for improving breast cancer survival prediction. Moreover, mRMR was superior to other feature selection methods used in previous studies, and MKL outperformed traditional classifiers in multi-omics data integration. Our analysis indicates that by employing promising omics data such as somatic mutations and harnessing proper feature selection methods and effective integration frameworks, the accuracy of breast cancer survival prediction can be further increased, thereby supporting better clinical diagnosis and more effective treatment for breast cancer patients.
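The integration step can be illustrated with a minimal sketch of the kernel combination that multiple kernel learning is built around: one kernel matrix per data type, merged as a weighted sum. This is not the authors' implementation. The linear kernel, the equal weights, and the toy matrices are assumptions made for illustration, and an actual MKL method would learn the weights jointly with the classifier.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def linear_kernel(samples: Sequence[Sequence[float]]) -> Matrix:
    """Gram matrix of pairwise dot products between samples (rows)."""
    return [
        [sum(a * b for a, b in zip(xi, xj)) for xj in samples]
        for xi in samples
    ]


def combine_kernels(kernels: Sequence[Matrix], weights: Sequence[float]) -> Matrix:
    """Weighted sum of kernel matrices, one per omics data type.

    The weights are supplied by the caller here (an assumption);
    MKL would estimate them from the training data.
    """
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy data: three patients, two data types.
expression = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # two expression features
mutation = [[1.0], [0.0], [1.0]]                   # one binary mutation flag
combined = combine_kernels(
    [linear_kernel(expression), linear_kernel(mutation)],
    [0.5, 0.5],
)
```

Because a non-negative weighted sum of valid kernels is itself a valid kernel, the combined matrix can be passed to any classifier that accepts a precomputed kernel.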
Acute and chronic inflammation often leads to fibrosis, which is also the common and final pathological outcome of chronic inflammatory diseases. To explore the common genes and pathogenic pathways among different fibrotic diseases, we collected all reported genes for eight fibrotic diseases: eye fibrosis, heart fibrosis, hepatic fibrosis, intestinal fibrosis, lung fibrosis, pancreas fibrosis, renal fibrosis, and skin fibrosis. We calculated the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) enrichment scores of all fibrotic disease genes. Each gene was encoded using its KEGG and GO enrichment scores, which reflect how strongly the gene can affect the corresponding function. For each fibrotic disease, the key KEGG and GO features were identified by comparing the KEGG and GO enrichment scores of reported disease genes with those of other genes using the Monte Carlo feature selection (MCFS) method. We compared the gene overlaps among the eight fibrotic diseases, and connective tissue growth factor (CTGF) was identified as the common key molecule. The key KEGG and GO features of the eight fibrotic diseases were all screened by the MCFS method. Moreover, we found overlapping pathways between renal fibrosis and skin fibrosis, such as GO:1901890 (positive regulation of cell junction assembly), as well as common regulatory genes, such as CTGF, which is the key molecule regulating fibrogenesis. We hope to offer new insight into the cellular and molecular mechanisms underlying fibrosis and thereby help lead to the development of new drugs that specifically delay or even improve the symptoms of fibrosis.
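The abstract does not define how the enrichment scores were computed. One common choice, shown here purely as an assumption, is the negative base-10 logarithm of a hypergeometric tail probability. The function name and all of its parameters are hypothetical.

```python
from math import comb, log10


def hypergeom_enrichment_score(
    population: int, annotated: int, sample: int, overlap: int
) -> float:
    """Return -log10 P(X >= overlap) for a hypergeometric variable X.

    population: total number of genes considered
    annotated:  genes in the population annotated to the KEGG/GO term
    sample:     size of the gene set being tested
    overlap:    genes in the tested set that carry the annotation
    """
    upper = min(annotated, sample)
    if not 0 <= overlap <= upper:
        raise ValueError("overlap must lie between 0 and min(annotated, sample)")
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    p_value = tail / comb(population, sample)
    return -log10(p_value)


# Toy numbers for illustration only: 10 genes, 5 annotated, 2 tested, both hits.
score = hypergeom_enrichment_score(population=10, annotated=5, sample=2, overlap=2)
```

Under this definition a larger score means the overlap is less likely to arise by chance, so each gene can be encoded as a vector with one such score per KEGG pathway or GO term.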
Frontiers in Human Neuroscience
Methods and Applications in Brain-Computer Interfaces