Precision medicine is being developed as a preventive, diagnostic, and therapeutic tool to combat complex human diseases in a personalized manner. High-throughput technologies now generate dynamic ‘omics data, including genomic, epigenomic, and even metagenomic data, yielding large spatiotemporal biological datasets that can be associated with the individual genotypes underlying progressive pathogenic phenotypes. It is therefore necessary to investigate how to integrate these multi-scale ‘omics datasets in order to distinguish novel individual-specific disease causes from conventional cohort-common ones.
Machine learning currently plays an important role in biological and biomedical research, especially in the analysis of big ‘omics data. However, in contrast to traditional big social data, ‘omics datasets are typically “small-sample, high-dimension”: they contain far more measured features than samples, which complicates the application of standard methods and introduces new challenges:
(1) Big ‘omics datasets can be extremely imbalanced, owing to the difficulty of obtaining enough positive samples for rare mutations or rare diseases;
(2) Many machine learning models are “black boxes,” which may be sufficient for social applications. In biological and biomedical fields, however, knowledge of the molecular mechanisms underlying a disease or biological process is necessary to deepen our understanding;
(3) Genotype-phenotype association is the “white clue” captured in conventional big data studies. However, identifying “causality” rather than association would be more helpful for physicians and biologists, because a causal factor can serve as an experimental target for future research.
Therefore, to simultaneously improve phenotype discrimination and genotype interpretability for complex diseases, it is necessary:
To design and implement new machine learning technologies that integrate prior knowledge with new ‘omics datasets, providing transferable learning methods that combine multiple sources of data;
To develop new network-based theories and methods that balance accuracy against interpretability when machine learning is applied in biomedical and biological domains;
To enhance causal inference on “small-sample, high-dimension” data in order to capture personalized causal relationships.
In this Research Topic, we focus on combining wet-lab ‘omics technologies with dry-lab machine learning approaches to further develop precision medicine. The following study designs are especially welcome:
(1) Studies based on individual temporal ‘omics data from disease cohorts or animal models, e.g., samples taken at multiple time points across disease progression or under different treatment options;
(2) Studies based on multiple types of ‘omics data, e.g., the combination of genomic, transcriptomic, epigenomic, and proteomic data for a single disease or condition;
(3) Studies based on the gut metagenome and host ‘omics for the diagnosis and treatment of complex diseases;
(4) Studies based on conditional genotype-phenotype detection with deep learning or other brain-inspired artificial intelligence (AI) technologies.
We encourage submissions of both Original Research and Review articles that address these challenges in the following areas: analysis of ‘omics data across disease progression, detection of dynamic network biomarkers of disease, recognition of disease drivers or controls, predictive models of individual disease, disease re-stratification models, rare disease models, small-sample machine learning models, and “white-box” or “grey-box” deep learning approaches for precision medicine.