About this Research Topic
As large volumes of genetic and genomic data, including multi-omics, whole-genome, and whole-exome sequencing data, become available in national and institutional biobanks, it is important to analyze them jointly and at scale to improve the generalizability of genetic discoveries. Approaches based on summary-level data, including meta-analysis and federated learning methods, provide an attractive way to leverage large sample sizes to discover the genetic basis of human diseases and traits while addressing biobank data privacy and consent concerns. However, these approaches face many challenges, for example overcoming the computational scalability issues that arise with hundreds of millions of variants. They also have limited ability to account for relatedness and population structure. Thus, there is a pressing need to develop powerful, scalable, and resource-efficient methods that leverage summary-level data from large-scale genetic studies and biobanks to study the impact of genetic variants on diseases and traits, risk prediction, and the causal effects of biomarkers on diseases.
This Research Topic is inclusive of both novel methods and novel applications for summary statistics in genetics research. Potential topics include, but are not restricted to:
• Integration of summary statistics across multiple phenotypes to better understand the etiology of related diseases.
• Using summary statistics in causal inference approaches, e.g., to perform mediation analysis.
• Adapting summary statistic methodology developed for GWAS arrays to sequencing studies, for example by accounting for the lack of normality when summary statistics are computed from rare variants.
• Fine-mapping approaches using summary statistics.
• Methods for creating polygenic risk scores with summary statistics.
• Development of biological network models using summary statistics.
• Application of summary statistics from understudied ethnic groups to better understand racial and ethnic differences in disease.
• Application of summary statistics from massive modern datasets to uncover new genetic risk factors for complex phenotypes.
• Tools or pipelines to automatically generate summary statistics from contemporary genetic compendiums with large numbers of outcomes.
Keywords: Summary statistics, Meta-analysis, Genome sequencing
Important Note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.