Large-scale genome-wide association studies (GWASs) have identified many genetic variants, in particular single nucleotide polymorphisms (SNPs), that are associated with important human diseases. Some of these SNPs are also associated with other measured phenotypes, such as blood pressure or body mass index. This overlap can be leveraged to learn about the biological mechanisms underlying these outcomes. Interventions can thus be tailored to well-targeted causal pathways. Personalizing treatment strategies in this way may improve patient outcomes while reducing medical costs.
Individual-level analyses of multiple GWASs are cumbersome and often unnecessary. Researchers are increasingly encouraged to make the summary results of their GWAS analyses publicly available. This has led to the development of publicly accessible databanks (e.g. GWAS Catalog, PhenoScanner, UK Biobank), which offer an unprecedented opportunity to investigate the impact of common variants on complex diseases. In recent years, statistical approaches such as Mendelian randomization, fine-mapping, and colocalization have been developed, all of which synthesize summary statistics from multiple GWASs. These approaches enable the investigation of putative causal mechanisms of certain diseases and the elucidation of shared etiological pathways. However, valid causal inference using these genetic data is challenging for three reasons. Firstly, many SNPs along the genome are in strong linkage disequilibrium with one another, which complicates identification of the causal SNPs. Secondly, an association signal can coincide across multiple studies by chance alone, particularly since the number of SNPs studied is often very large. Lastly, SNPs may influence a disease via distinct causal pathways, either directly or indirectly, so that the genetic association carries less specific information about the effect of intervening on the risk factor under analysis. Consequently, the successful application of these approaches relies on method-specific assumptions, and each approach has its own strengths and limitations.
There is a current need to: (i) investigate the plausibility of these assumptions across the diverse areas in which they are regularly employed; (ii) make the established methodologies more robust by developing extensions that relax some of the assumptions; or (iii) develop advanced statistical methods, either by combining multiple statistical approaches appropriately or by using machine learning, to perform valid causal inference using GWAS data.
Contributions in the form of original research papers, reviews, methods, and technology and code reports are particularly welcome in the following areas:
• Novel methods of Mendelian randomization, fine-mapping and colocalization
• Novel methods that integrate the above three approaches
• Machine learning methods for causal inference using multiple GWASs
• New software supporting the methodologies
• Important applications of existing or new methodologies