- 1Emory University, Atlanta, GA, United States
- 2Carmel High School, Carmel, IN, United States
- 3James Martin High School, Arlington, TX, United States
- 4Next-Gen Intelligent Science Training, Ann Arbor, MI, United States
- 5Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, United States
The collection of expression quantitative trait loci (eQTLs) is an important resource to study complex traits through understanding where and how transcriptional regulations are controlled by genetic variations in the non-coding regions of the genome. Previous studies have focused on associating eQTLs with traits to identify the roles of trait-related eQTLs and their corresponding target genes involved in trait determination. Since most genes function as a part of pathways in a systematic manner, it is crucial to explore the pathways’ involvements in complex traits to test potentially novel hypotheses and to reveal underlying mechanisms of disease pathogenesis. In this study, we expanded and applied loci2path software to perform large-scale eQTLs enrichment [i.e., eQTLs’ target genes (eGenes) enrichment] analysis at pathway level to identify the tissue-specific enriched pathways within trait-related genomic intervals. By utilizing 13,791,909 eQTLs cataloged in the Genotype-Tissue Expression (GTEx) V8 data for 49 tissue types, 2,893 pathway sets reported from MSigDB, and query regions derived from the Phenotype-Genotype Integrator (PheGenI) catalog, we identified intriguing biological pathways that are likely to be involved in ten traits [Alzheimer’s disease (AD), body mass index, Parkinson’s disease (PD), schizophrenia, amyotrophic lateral sclerosis, non-small cell lung cancer (NSCLC), stroke, blood pressure, autism spectrum disorder, and myocardial infarction]. Furthermore, we extracted the most significant pathways for AD, such as BioCarta D4-GDI pathway and WikiPathways sulfation biotransformation reaction and viral acute myocarditis pathways, to study specific genes within pathways. Our data presented new hypotheses in AD pathogenesis supported by previous studies, like the increased level of caspase-3 in the amygdala that cleaves GDP dissociation inhibitor and binds to beta-amyloid, leading to increased apoptosis and neuronal loss. Our findings also revealed potential pathogenesis mechanisms for PD, schizophrenia, NSCLC, blood pressure, autism spectrum disorder, and myocardial infarction, which were consistent with past studies. Our results indicated that loci2path′s eQTLs enrichment test was valuable in unveiling novel biological mechanisms of complex traits. The discovered mechanisms of disease pathogenesis and traits require further in-depth analysis and experimental validation.
1 Introduction
Expression quantitative trait loci (eQTLs) have been one of the major focuses in determining the genetic variants that affect gene expressions locating in non-coding regions of the genome. eQTLs’ nature of influencing expression levels of their target genes (eGenes) makes them powerful at studying transcription regulation (Li et al., 2010). The traditional usage of genomic physical proximity to connect genetic loci with their corresponding eGenes has been proven somewhat ineffective since it has been demonstrated that only about 25% of eQTLs have their physically closest genes to be their eGenes (Zhu et al., 2016; Xu et al., 2020). Further, eQTLs have become an increasingly popular tool for researchers to identify specific genes for diseases and traits.
Researchers often use eQTLs associations to link expression traits to genotypes of genetic variants located in genomic intervals. Multiple studies have been conducted on connecting eQTLs and various traits including Alzheimer’s disease (AD) to determine the roles trait-related eQTLs and their corresponding eGenes play in pathogenesis (Hormozdiari et al., 2016; Zhao et al., 2019; Sieberts et al., 2020). Though many interesting findings have been discussed, the observed eQTLs patterns in cerebral and cerebellar brain regions require further investigations with respect to their potential functions, but so far, to our knowledge, no systematic in-depth studies have been performed to explore the roles of such eQTLs in etiologies of neurodegenerative diseases such as AD (Zhao et al., 2019; Sieberts et al., 2020). Another common practice is to use eQTLs mapping to link an expression trait to genetic variants in certain genomic regions, which holds promise in elucidating gene regulations and predicting gene networks associated with complex phenotypes (Li et al., 2010). By using eQTLs mapping methods, we can generate a comprehensive connection map of eQTLs and their eGenes’ enriched pathways to help us develop a more thorough understanding of eQTLs’ involvement in gene regulation, thus providing insights in discovering hidden biological mechanisms (Gilad et al., 2008). In addition, eQTLs studies can also help reveal the architecture of gene regulation, which in combination with results from previous genetic association studies of human traits may help predict regulatory roles for genetic variants previously associated with particular human phenotypes (Gilad et al., 2008). Therefore, it is crucial to explore the associations between eQTLs and genes at the pathway level in complex traits to develop a systematic review of such associations and infer mechanisms of pathogenesis.
The objective of this study was to perform large-scale eQTLs enrichment tests at the pathway level and determine the tissue-specific enriched pathways for trait-related genomic intervals based on the Bioconductor package loci2path (Xu et al., 2020). There are two key advantages of using loci2path than other existing methods: first, we do not depend on physical proximity to provide a link between an eQTL and its target gene, which could be unreliable; second, eQTLs enable us to produce the regulatory annotation for specific tissue types (Xu et al., 2020). For a specific genomic interval containing multiple eQTLs, if eQTLs enrichment analysis indicates that their corresponding eGenes are participating in the same biological pathway, this could imply a potential relationship between that specific pathway and the genomic interval of interest. The tissue-specific eQTLs sets also can demonstrate in what specific tissues would such enrichment be observed, which could help us generate new hypotheses on the biological mechanisms of disease pathogenesis.
In this study, we used the computer program loci2path to perform eQTLs enrichment analysis for genomic regions of ten traits [AD, body mass index, Parkinson’s disease (PD), schizophrenia, amyotrophic lateral sclerosis, non-small cell lung cancer (NSCLC), stroke, blood pressure, autism spectrum disorder, and myocardial infarction]. We have updated the loci2path to utilize the most current data sets of query regions, eQTLs sets, and pathway sets. We used the entire multi-tissue eQTLs data from the GTEx V8 data release that contains 13,791,909 eQTLs with 32,958 unique eGenes for 49 tissue types. In addition to BioCarta and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway sets that were included in the original loci2path (Xu et al., 2020), we have added pathway sets from three new pathway databases, i.e., Pathway Interaction Database (PID), Reactome, and WikiPathways to generate more comprehensive results.
2 Materials and Methods
2.1 Extension of the loci2path
In this study, we extended the Bioconductor package loci2path (Xu et al., 2020) that runs on an R-based platform, and then applied the extended loci2path to perform eQTLs enrichment analyses at pathway level based on different pathway databases to identify enriched pathways for genomic intervals of multiple traits. The advantage of loci2path is that this computer program uses eQTLs information to directly link to their eGenes, rather than using genome proximity, because an eQTL and its corresponding eGene are not always located near each other. For each gene set, the loci2path will first identify eGenes based on the eQTLs set in the given genomic intervals and then evaluate the significance of these eGenes’ enrichment within a gene set. The eQTLs enrichment program really refers to their corresponding eGenes’ enrichment because multiple eQTLs could target the same eGenes due to linkage disequilibrium. p-values calculated using Fisher’s exact test for an eQTLs set could be computed for each pathway to evaluate the enrichment significance, and those pathways with greater enrichments were indicated by smaller p-values. The results were filtered with a p-value of 10−4, which was chosen after multiple trials to balance the number of most significant tissue-pathway combinations and specificity, and used to construct heatmaps for further analysis. We have tried other p-values and obtained similar outcomes.
2.2 Datasets
2.2.1 GTEx eQTLs
For this study, we used the full set of multi-tissue QTL data from the GTEx V8 data release as the input data of eQTLs sets, consisting of 49 tissue types (GTEx Consortium, 2020). The data were downloaded from GTEx through this link: https://storage.googleapis.com/gtex_analysis_v8/multi_tissue_qtl_data/GTEx_Analysis_v8.metasoft.txt.gz. eQTLs sets for each tissue were filtered with a p-value threshold of 10−4. Each gene’s entrez ID and gene name were obtained by using the given gene’s ensemble gene ID and the Bioconductor package biomaRt.
2.2.2 MSigDB Pathways
A total of 2,893 pathways from BioCarta, KEGG, PID, Reactome, and WikiPathways gene sets were used in this study as the input data of gene sets. The data were downloaded from the MSigDB website: http://www.gsea-msigdb.org/gsea/msigdb/collections.jsp.
2.2.3 Phenotype-Genotype Integrator Query Regions
The list of known trait-associated variants was obtained from National Center for Biotechnology Information (NCBI) via PheGenI website: https://www.ncbi.nlm.nih.gov/gap/phegeni. For a given genetic variant, the genomic region is defined as a flanking 50 kb on each of left and right sides of that variant, which spans 100 kb. Overlapped regions were merged. A total of 9,894 genomic intervals were used in this study, and the numbers of genomic regions for each trait are demonstrated in Table 1.
TABLE 1. The numbers of genomic intervals selected that contain known GWAS variants for each of the ten complex traits.
3 Results
3.1 Overview
The objectives of this study were to identify significantly enriched pathways for eQTLs sets of specific tissues at trait-related genomic intervals to generate potentially novel hypotheses of trait determination. A workflow of the study is presented in Figure 1, showing that the input data were query regions, and the internal process involved usages of eQTLs sets and gene pathway sets, and the output results were enriched pathways and the corresponding tissues sorted by multiplicity-adjusted enrichment p-values. We used loci2path to conduct eQTLs enrichment analyses by computing the p-values of Fisher’s exact test adjusted by Benjamini & Hochberg correction method (Benjamini & Hochberg, 1995), and then converting such results into a heatmap. The heatmap was displayed where each row represents a tissue type, and each column represents a gene pathway. The strong significant enrichments were indicated by red cells, and the weak insignificant enrichments were indicated by blue cells. Other data including eQTLs in pathways, eQTLs in tissues, and hit genes generated by loci2path were used to construct tables. Various adjusted p-values of genes through Fisher’s exact test were used as thresholds to filter out the most significant pathway-tissue combinations for each trait. Specific genes that pathways hit in the eQTLs sets were extracted for further analysis. Additional heatmaps and result tables for traits can be found in Supplementary Figures. The results of three of the ten traits, i.e., body mass index, amyotrophic lateral sclerosis, and stroke were not presented, because the outputs obtained from eQTLs enrichment tests at the pathway level for these traits were insignificant, and no further analyses could be performed on them.
FIGURE 1. A diagram depicting our study’s analysis pipeline, including input data, internal processes, and output results.
3.2 Adding PID, Reactome, and WikiPathways to loci2path
We have extended the loci2path (Xu et al., 2020) by adding gene pathway sets of PID, Reactome, and WikiPathways to loci2path′s pathway collection. The data of pathway links and NCBI entrez gene IDs were retrieved from the MSigDB website: http://www.gsea-msigdb.org/gsea/msigdb/collections.jsp. The data were separated into two text documents with one containing gene links and the other containing the pathway’s gene entrez IDs using a self-written R program (Supplementary Data Sheet S1). The documents were added into the loci2path Bioconductor package at loci2path-master/inst/extdata/geneSet, which could be called by the loci2path-running program to match significant eQTLs at the new gene pathway sets.
3.3 Alzheimer’s Disease
Currently, there are three major pathology divisions for AD: protein accumulation, neuron loss, and reactive process (Duyckaerts et al., 2009). Past studies have shown that the extracellular accumulation and deposition of amyloid-beta (Aβ) protein induce the appearance of senile plaques and create an abnormal neuron environment, which causes cognitive disabilities (Sadigh-Eteghad et al., 2015; Cheignon et al., 2018). Such accumulation of Aβ not only enhances the interaction between amyloid-forming protein and neuronal membrane and increases membrane permeability through hypothetical mechanisms like amyloid-forming protein’s channel-like conductance, but also contributes to the increase in the reactive oxygen species production and thus the disruption of neuronal membrane integrity (Butterfield and Lashuel, 2010; Cheignon et al., 2018).
Figure 2A demonstrated the eQTLs enrichment of AD-related genomic intervals in the BioCarta pathway set. There was a distinct significant enrichment of the D4-GDI pathway in the brain amygdala (Figure 2A). Significant eQTLs enrichment results from the amygdala tissue were extracted for further analysis. The table has demonstrated that most pathways’ gene hit in brain amygdala tissue was Rho GDP dissociation inhibitor beta (ARHGDIB) gene (Table 2). The D4-GDI pathway had the lowest p-value of genes, which was consistent with the data in Figure 2A where the D4-GDI pathway was only enriched in amygdala tissue (Table 2; Figure 2A). D4-GDI represents the negative regulator of Ras-related Rho GTPases, and its removal is crucial to induce apoptosis since Rho GTPases increase the cytoskeletal and membrane modification related to apoptosis (Coleman and Olson, 2002). As an enzyme that cleaves D4-GDI, caspase-3 was found to be positively correlated with mild cognitive deficiency in early AD pathology (Gastard et al., 2003). Clinical research suggested that Aβ could sequester caspase-3 via direct interaction and induce neuronal apoptosis via caspase-3 activation, thus strengthening AD development (Chang et al., 2016). One possible hypothesis was that an increased level of caspase-3 in the amygdala leads to increased apoptosis and neuronal loss and thus contributes to the memory loss symptom of AD.
FIGURE 2. Heatmap of Alzheimer’s disease’s eQTLs enrichment results in (A) BioCarta and (B) WikiPathways pathway sets, respectively.
TABLE 2. P-values Obtained from Fisher’s Exact Test of Significant eQTLs Enrichment for Alzheimer’s Disease in BioCarta Pathway Set for Brain Amygdala Tissue
Similarly, Figure 2B showed significant enrichment of sulfation biotransformation reaction and viral acute myocarditis pathways in brain cortex, IL2 and IL5 signaling pathways in brain cerebellum, and development and heterogeneity of the innate lymphoid cell (ILC) pathway in brain hippocampus for the WikiPathways set (Figure 2B). The significant enrichment of viral acute myocarditis pathway in the brain cortex suggested that the correlation observed between heart failure and AD was due to not only the majority of patients’ age, but also genetic factors (Figure 2B) (Li et al., 2006). Such findings were consistent with a previous study where the viral myocarditis pathway from other pathway sets was identified to be significantly associated with AD (Liu et al., 2014). One population study also found a higher than 80% risk of developing AD for patients with heart failures when major confounders like vascular comorbidities were controlled (Qiu et al., 2006). The significant enrichment in the sulfation biotransformation reaction pathway could also be explained by previous findings (Figure 2B). One research suggested an increased frequency of reduced metabolism and impaired sulfation of xenobiotics among AD patients (McFadden, 1996). A clinical study showed that sulfated curcumin can bind to copper and iron ions that are enriched in the brain cortex of AD patients and induce Aβ peptide formation, thus indicating that impaired sulfation ability would increase risk of AD (Baum and Ng, 2004). One possible connection between acute viral myocarditis and AD is kynurenine 3-monooxygenase (KMO), which is a key regulatory enzyme in the kynurenine metabolism pathway that converts kynurenine to 3-hydroxykynurenine (Kubo et al., 2017). Studies have shown that the absence of KMO increased the production of kynurenine pathway metabolite, which lowered the synthesis of chemokine and thus resulted in the decrease of mortality of viral acute myocarditis by encephalomyocarditis virus in mice (Kubo et al., 2017). Interestingly, another study pointed out that JM6, a KMO inhibitor, was found to be able to prevent memory deficiency and synaptic loss in AD mouse models through the increase of the neuroprotective kynurenine metabolite kynurenic acid (Zwilling et al., 2011). Such interaction may imply a hidden mechanism in AD’s pathogenesis that increases KMO production and thus decreases levels of neuroprotective kynurenine metabolite and enhances AD symptoms, which explains AD’s connection to acute viral myocarditis.
3.4 Parkinson’s Disease
One key sign of PD is the accumulation of α-synuclein and the formation of Lewy bodies in brainstem, limbic system, and cortical areas (Alecu and Bennett, 2019). Pathological hallmarks also include the loss of dopaminergic neurons from the substantia nigra and Lewy bodies in surviving cells of affected brains, which leads to reduced voluntary movements (Gegg et al., 2012).
As demonstrated in the Supplementary Figure S1A, the enrichment of the KEGG sphingolipid metabolism pathway was observed to be highly and uniquely significant in amygdala tissue, which indicates a correlation between sphingolipid metabolism and PD. This is consistent with previous studies since the metabolism of sphingolipid glucosylceramide catalyzed by glucocerebrosidase (GCase) was found to be deficient in PD patients (Gegg et al., 2012). The deficiency of GCase that catalyzes sphingolipid metabolism has reached up to 40% at amygdala for PD patients compared to normal patients, which is likely to cause α-synuclein accumulation as GCase mRNA level decreased in cells with exogenous α-synuclein (Gegg et al., 2012). One possible explanation for the decreasing GCase could be a mutation at glucosylceramidase-beta gene that encodes this lysosomal enzyme. Similarly, the lysosomal-associated membrane protein 2A and heat shock cognate 70 from lysosome had significantly lower expression levels in amygdala of brains with PD compared to brains with AD or normal brains (Alvarez-Erviti et al., 2010). The chaperone-mediated autophagy strongly depends on these two proteins, and the downregulation of lysosomal-associated membrane protein 2A has increased the mean half-life of α-synuclein from 46.5 to 65 h, suggesting a direct link between this protein and PD (Alvarez-Erviti et al., 2010). Since wild-type α-synuclein was mostly degraded by chaperone-mediated autophagy, it is valid to hypothesize that impaired lysosomal functions could initiate the accumulation of α-synuclein and thus lead to PD.
3.5 Schizophrenia
As demonstrated, most significantly enriched pathways in all 49 tissues were immune-related pathways including allograft rejection, graft vs. host disease, and antigen processing and presentation pathways (Table 3). The significantly enriched KEGG allograft rejection pathway in different tissues shared the major histocompatibility complex, Class I, C (HLA-C) gene (Figure 3A; Table 3). HLA-C has been shown to be strongly associated with schizophrenia by multiple past studies. HLA-C*01:02 was positively associated with schizophrenia, while HLA-C*07:01 was negatively associated with schizophrenia (Andreassen et al., 2015; Corvin, 2012). One study suggested that in the absence of glutamic acid at the 74th position of the mature protein encoded by the major histocompatibility complex, Class II, DR Beta 1(HLA-DRB1) gene, the amino acid methionine at the 99th position of HLA-C may contribute to individuals’ susceptibility to schizophrenia, in which the glutamic acid in HLA-DRB1 has a protective function against the disease (Seshasubramanian et al., 2020). Interestingly, HLA-DRB1 was hit by the majority of tissues enriched with the KEGG allograft rejection pathway (Table 3). Similarly, the major histocompatibility complex, Class II, DQ Beta 1 (HLA-DQB1) gene was also shared by most tissues with such a pathway, a molecule that presents peptides derived from extracellular proteins and is expressed in antigen expression cells (Table 3). DQB1*05:01:01 was also positively associated with schizophrenia and the predominant haplotype in the schizophrenia population, while decreased frequency of DQB1*02:01 was found among schizophrenia patients (Katrinli et al., 2019; Seshasubramanian et al., 2020). No studies have been conducted on specific mechanisms of HLA-C, HLA-DRB1, and HLA-DQB1’s interventions in schizophrenia pathogenesis, but their interaction is much likely to contribute to the disease.
In the PID pathway set, the FOXO pathway was significantly enriched in the eQTLs set of thyroid tissue, which suggested a potential correlation between the forehead box transcription factor O family and schizophrenia at thyroid (Figure 3B). The mRNA expression level of FOXO pathway genes including FOXO1 and FOXO3A were significantly lower in patients with acute schizophrenia (Gu et al., 2021).
3.6 Non-Small Cell Lung Cancer
The BioCarta IL1R pathway was shown to be significantly enriched in the testis tissue for NSCLC (Supplementary Figure S2). The IL1R pathway involves signal transduction through interleukin-1. One study found that interleukin-17 (IL-17) was involved in angiogenesis in a variety of inflammatory associated cancers, although it still remains unclear how IL-17 contributes to the process (Pan et al., 2015). It is also known that interleukin-37 (IL-37), a new member of the interleukin-1 family, plays an immunosuppressive role in a variety of inflammatory disorders. A study recently found that IL-37 demonstrates a protective role in cancer development possibly through tumor angiogenesis and that it could serve as a promising therapeutic target for NSCLC (Ge et al., 2016).
In Supplementary Figure S2, the PCG1A pathway was enriched in the kidney. The PCG1A pathway involves the regulation of peroxisome proliferator-activated receptor gamma coactivator-1 alpha (PGC-1a), which is a tissue-specific coactivator that enhances the activity of many nuclear receptors and coordinates transcriptional programs important for energy metabolism and homeostasis. In NSCLC patients, there are a variety of cases where the cells show therapeutic resistance. As a result, a plethora of studies focus on drug resistance mechanisms, but not many have focused on the metabolic flexibility of drug-resistant NSCLC. In one study, it was found that during the development of resistance for tyrosine kinase inhibitors, NSCLC cells switched from glycolysis to oxidative phosphorylation through increasing activity of the mitochondria. Cells were treated with the MCT-1 inhibitor AZD3965 and there was a resulting significant decrease in cell proliferation and motility in TK1-sensitive and TK-resistant cells. A study recently found that IL-37 demonstrates a protective role in cancer development possibly through tumor angiogenesis and that it could serve as a promising therapeutic target for NSCLC (Huang et al., 2020).
3.7 Blood Pressure
For blood pressure, the majority of the pathways most significantly enriched in tissues were immune-related, and the atrial appendages tissue contained the most pathways with the most significant p-values (Table 4). The role of the immune system in the pathogenesis of hypertension has been firmly established by many laboratories. The KEGG viral myocarditis pathway and the tissue heart atrial appendage had one of the most significant p-values at 3.08E-14, the KEGG type I diabetes mellitus pathway was also significantly enriched at the atrial appendage tissue (Table 4).
TABLE 4. Adjusted p-values of Five Most Significant eQTLs for Blood Pressure in KEGG and WikiPathways Sets for Heart Atrial Appendage Tissue.
Myocarditis is a cardiac disease associated with inflammation and injury of the myocardium. It results from various etiologies, but coxsackievirus is considered the dominant etiological agent. Infiltrating macrophages have been proven as a pivotal pathological inflammatory cell subset in coxsackievirus induced viral myocarditis, however, the mechanisms involving initiation and promotion are still unknown (Zhang et al., 2017).
Type 1 diabetes is the autoimmune destruction of the insulin producing beta-cells. High blood pressure is a common symptom of diabetes because the high levels of glucose in the blood damage the blood vessels and lead to hypertension. One study found that the left atrium mechanical functions were impaired in patients with type 1 diabetes (Acar et al., 2009).
3.8 Autism Spectrum Disorder
Few significant pathways were uniquely enriched in one or two tissues for autism spectrum disorder as shown in Supplementary Figures S3–S7. KEGG pathways of drug metabolism by cytochrome p450 and metabolism of xenobiotics by cytochrome p450 were found to be enriched in various tissues and most significantly in the liver tissue (Supplementary Figure S4; Table 5). Out of 29 most significant pathway-tissue combinations passing the p-value threshold of 10−4, genes GSTM3 and GSTM5 were hit 24 times, followed by genes GSTM1, GSTP1, GSTM4, and GSTM2 (Supplementary Table S1). The two most significantly enriched pathways, Reactome phase II conjugation of compounds and KEGG metabolism of xenobiotics by cytochrome p45 pathways, were in liver tissues, and they have both hit genes GSTM2, GSTM3, GSTM4, and GSTM5, which encode for multiple proteins from the glutathione S-transferase mu class (Table 5). The two pathways cover proteins functioning in pharmacological inactivation of chemicals and detoxification, and the mu class enzymes are known for their functions in detoxification of electrophilic compounds by conjugation with glutathione (Cheng et al., 2020). Therefore, such highly significant adjusted p-values suggested a key role glutathione S-transferase mu enzymes play in autism spectrum disorder (Table 5). Studies have shown that when exposed to chronic heavy metal and chemical xenobiotic pollution, patients with autism spectrum disorder demonstrated significantly higher total glutathione and oxidized glutathione in red blood cells (Faber et al., 2019). The study also believed that the elevated glutathione was a compensatory mechanism to the exposure of a high xenobiotic environment (Faber et al., 2019). However, such a mechanism could not deal with oxidative stress as the reduced to oxidized glutathione ratio was lower in autistic patients, which indicates a crucial role glutathione plays in the xenobiotic detoxification among patients with autism spectrum disorder (Faber et al., 2019; Bjørklund et al., 2020).
TABLE 5. Adjusted p-values of 10 Most Significant eQTLs for autism spectrum disorder from 49 tissues.
3.9 Myocardial Infarction
Supplementary Figure S8 demonstrated the eQTLs enrichment in BioCarta and Reactome pathway sets of myocardial infarction-related genomic intervals. The AT1R pathway from the BioCarta pathway set was significantly enriched in brain cortex tissue (Supplementary Figure S8A), and the cell cycle pathway from the Reactome pathway set was enriched in whole blood tissue (Supplementary Figure S8B), respectively. RAC1 gene was hit by the BioCarta AT1R pathway at the brain cortex tissue, and PPP2R5A gene was hit by the Reactome cell cycle pathway at the whole blood tissue (Table 6). In myocardial infarction, the RAC1 protein in the brain cortex tissue paired with the BioCarta AT1R pathway was enriched. The RAC1 protein belongs to the RAS superfamily of small GTP-binding proteins. Members of this superfamily appear to regulate a diverse array of cellular events, including the control of cell growth, cytoskeletal reorganization, and the activation of protein kinases. In terms of myocardial infarction, the RAC1 protein serves as a small GTP-binding protein that regulates NADPH oxidase. NADPH oxidase is a reactive oxygen species (ROS) that contributes to heart failure, such as myocardial infarction. Failing of the myocardium in patients with dilated cardiomyopathy (DCM) and ischemic cardiomyopathy (ICM) is characterized by an upregulation of NADPH oxidase–mediated ROS release associated with increased RAC1 activity (Maack et al., 2003).
TABLE 6. Adjusted p-values of Five Most Significant eQTLs for Myocardial Infarction in BioCarta and Reactome Pathway Sets from 49 tissues.
Furthermore, the AT1R pathway is responsible for promoting hypertension, G protein-dependent signaling, transactivation of growth factor receptors, NADPH oxidase, and ROS signaling explaining why the RAC1 gene was enriched by the AT1R pathway (Kawai et al., 2017). In addition to the RAC1 gene, the PPP2R5A gene in the tissue whole blood paired with the Reactome cell cycle pathway was hit on. The PPP2R5A gene stands for protein phosphatase 2 regulatory subunit B’alpha. The gene serves as a subunit of the protein phosphatase 2A (PP2A) holoenzyme, which plays an essential role in regulating a diverse array of myocyte functions through dephosphorylation of target molecules. Functioning as an important phosphatase, the PP2A holoenzyme is critical for serving as a regulatory module within the heart, such that dysregulation of PP2A function may contribute to cardiac diseases. Alterations in PP2A activity are associated with heart failure and arrhythmia (Lubbers and Mohler, 2016). The varying types of myocardial infarction make it difficult for researchers to pinpoint a cure. In recent years, scientists have recognized multiple types of myocardial infarction with different causes, yet the knowledge of its pathogenic mechanisms is still poorly understood and greatly lacking (DeFilippis et al., 2019). While the different causes of myocardial infarction can be difficult to pinpoint, we can start by identifying the pathways, tissues, genes that are related to the causes. The results have shown some genomic mechanisms contributing to myocardial infarction, whether it be the enrichment of the RAC1 protein leading to the regulation of NADPH oxidase causing heart failure, or the altered regulation in the PP2A gene leading to heart failure and arrhythmia. The importance of these findings is two-fold: first, these results could serve as a pipeline to benefit the scientific community through reducing repeated work, and second, the discovered specific pathway-tissue-gene results could help researchers to reveal pathogenesis mechanisms in myocardial infarction in hopes to lower its occurrence rates or raise the rates of survival.
4 Discussion
We have extended the loci2path (Xu et al., 2020) by using the latest multi-tissue eQTLs data set from GTEx V8 release and adding PID, Reactome, and WikiPathways databases. The total numbers of eQTLs for each of 49 tissues we used in this study are shown in Supplementary Table S2. Our results of enrichment analysis have suggested multiple novel biological hypotheses of disease mechanisms for AD, PD, and schizophrenia. The proposed mechanisms of the increase of caspase-3 level in amygdala tissue and KMO production that may contribute to AD’s memory loss symptoms by increasing apoptosis and neuronal loss and decreasing kynurenine metabolite levels were supported by multiple past studies. The impaired lysosomal functions of GCase, lysosomal-associated membrane protein 2A, and heat shock cognate 70 resulted from mutations in genes corresponding to these proteins may cause α-synuclein accumulation to begin and thus lead to PD. The interaction among HLA-C, HLA-DRB1, and HLA-DQB1 is likely to take part in schizophrenia’s pathogenesis as well.
Our study has extensively evaluated multiple gene pathways’ involvements in the ten traits and further investigated significant genes in each pathway that were hit in the given genomic query regions. The proposed hypotheses have opened new avenues to explore the underlying molecular mechanisms and thus could illuminate further investigations on these traits. We have also found many interesting associations between eQTLs and gene pathways at trait-associated variants of NSCLC, blood pressure, autism spectrum disorder, and myocardial infarction which provided valuable insights into our comprehensive understandings of them. Furthermore, our study has confirmed the advantages of using tissue-specific eQTLs enrichment analysis at pathway level, because our findings based on loci2path software were strongly supported by multiple previous studies (Xu et al., 2020). This has indicated that using eQTLs catalogs to find links between genomic loci and their corresponding eGenes is valid and should be vastly applied in future studies involving gene sets and traits.
There were several limitations in our study. Due to the nature of the statistical analysis, our findings from loci2path could not be considered as providing direct understandings of biological mechanisms underpinning these traits, and we were only able to generate hypotheses for trait determination. These hypotheses should be experimentally verified by conducting further in-depth functional studies by molecular biology laboratories. In addition, loci2path′s reliance on current eQTLs sets data from GTEx could also lead to biased results since the eQTLs sets data from brain tissues were significantly smaller than other tissues like tibial nerves, leg skin without sun exposure, and thyroid. This was caused by the limited sample sizes of brain tissues from GTEx, which may result in missing important biological pathways in brain tissues for neurodegenerative diseases due to inadequate statistical power. The imbalance of eQTLs sizes of various tissues could also bring false-positive results in tissues with more samples and generate coincidental enrichment of certain pathways at tissues not related to the traits. Therefore, results from loci2path need to be treated with extra care, and only the most significant tissue-pathway associations should be extracted for analysis with sufficient past evidence. The software itself also has rooms for improvement, like including new gene pathway sets and adding annotations on pathways uniquely enriched in a tissue.
Future studies on neurodegenerative diseases specifically should implement more data on brain tissues to increase the accuracy of loci2path. Other neurodegenerative diseases like bipolar disorder and attention deficit disorder could be added for a systematic analysis on their patterns to find potential patterns for commonality among this type of disease.
Data Availability Statement
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author.
Author Contributions
BW has modified the loci2path program and generated and processed the raw data. BW, SQ, and JY did the analysis. ZQ and YB supervised the project and provided suggestions and guidance on directions. All authors participated in writing and revising.
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s Note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Acknowledgments
Funding for article processing charge is provided by the Halle Institute for Global Research at Emory University.
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fdata.2021.719737/full#supplementary-material
Abbreviations
AD, Alzheimer’s disease; ARHGDIB, Rho GDP dissociation inhibitor beta; Aß, amyloid-beta; DCM, dilated cardiomyopathy; eGene, target gene of an expression quantitative trait locus; eQTL, expression quantitative trait locus; GCase, glucocerebrosidase; HLA-C, major histocompatibility complex, Class I, C; HLA-DQB1, major histocompatibility complex, Class II, DQ Beta 1; HLA-DRB1, major histocompatibility complex, Class II, DR Beta 1; ICM, ischemic cardiomyopathy; IL-17, interleukin-17; IL-37, interleukin-37; ILC, innate lymphoid cell; KMO, kynurenine 3-monooxygenase; NSCLC, non-small cell lung cancer; PD, Parkinson’s disease; PP2A, protein phosphatase 2A; PPP2R5A, protein phosphatase 2 regulatory subunit B’alpha; ROS, reactive oxygen species.
References
Acar, G., Akcay, A., Sokmen, A., Ozkaya, M., Guler, E., Sokmen, G., et al. (2009). Assessment of Atrial Electromechanical Delay, Diastolic Functions, and Left Atrial Mechanical Functions in Patients with Type 1 Diabetes Mellitus. J. Am. Soc. Echocardiography 22 (6), 732–738. doi:10.1016/j.echo.2009.03.028
Alecu, I., and Bennett, S. A. L. (2019). Dysregulated Lipid Metabolism and its Role in α-Synucleinopathy in Parkinson's Disease. Front. Neurosci. 13, 328. doi:10.3389/fnins.2019.00328
Alvarez-Erviti, L., Rodriguez-Oroz, M. C., Cooper, J. M., Caballero, C., Ferrer, I., Obeso, J. A., et al. (2010). Chaperone-Mediated Autophagy Markers in Parkinson Disease Brains. Arch. Neurol. 67 (12), 1464–1472. doi:10.1001/archneurol.2010.198
Andreassen, O. A., Harbo, H. F., Harbo, H. F., Wang, Y., Thompson, W. K., Schork, A. J., et al. (2015). Genetic Pleiotropy between Multiple Sclerosis and Schizophrenia but Not Bipolar Disorder: Differential Involvement of Immune-Related Gene Loci. Mol. Psychiatry 20 (2), 207–214. doi:10.1038/mp.2013.195
Baum, L., and Ng, A. (2004). Curcumin Interaction with Copper and Iron Suggests One Possible Mechanism of Action in Alzheimer's Disease Animal Models. Jad 6 (4), 367–377. doi:10.3233/JAD-2004-6403
Benjamini, Y., and Hochberg, Y. (1995). Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. J. R. Stat. Soc. Ser. B (Methodological) 57 (1), 289–300. doi:10.1111/j.2517-6161.1995.tb02031.x
Bjørklund, G., Tinkov, A. A., Hosnedlová, B., Kizek, R., Ajsuvakova, O. P., Chirumbolo, S., et al. (2020). The Role of Glutathione Redox Imbalance in Autism Spectrum Disorder: A Review. Free Radic. Biol. Med. 160, 149–162. doi:10.1016/j.freeradbiomed.2020.07.017
Butterfield, S. M., and Lashuel, H. A. (2010). Amyloidogenic Protein-Membrane Interactions: Mechanistic Insight from Model Systems. Angew. Chem. Int. Edition 49 (33), 5628–5654. doi:10.1002/anie.200906670
Chang, Y.-J., Linh, N. H., Shih, Y. H., Yu, H.-M., Li, M. S., and Chen, Y.-R. (2016). Alzheimer's Amyloid-β Sequesters Caspase-3 In Vitro via its C-Terminal Tail. ACS Chem. Neurosci. 7 (8), 1096–1106. doi:10.1021/acschemneuro.6b00049
Cheignon, C., Tomas, M., Bonnefont-Rousselot, D., Faller, P., Hureau, C., and Collin, F. (2018). Oxidative Stress and the Amyloid Beta Peptide in Alzheimer's Disease. Redox Biol. 14, 450–464. doi:10.1016/j.redox.2017.10.014
Cheng, C.-H., Ma, H.-L., Deng, Y.-Q., Feng, J., Chen, X.-L., and Guo, Z.-X. (2020). The Role of Mu-type Glutathione S-Transferase in the Mud Crab (Scylla Paramamosain) during Ammonia Stress. Comp. Biochem. Physiol. C: Toxicol. Pharmacol. 227, 108642. doi:10.1016/j.cbpc.2019.108642
Coleman, M. L., and Olson, M. F. (2002). Rho GTPase Signalling Pathways in the Morphological Changes Associated with Apoptosis. Cell Death Differ 9, 493–504. doi:10.1038/sj.cdd.4400987
Corvin, A. (2012). Irish Schizophrenia Genomics Consortium and the Wellcome Trust Case Control Consortium 2Genome-wide Association Study Implicates HLA-C*01:02 as a Risk Factor at the Major Histocompatibility Complex Locus in Schizophrenia. Biol. Psychiatry 72 (8), 620–628. doi:10.1016/j.biopsych.2012.05.035
DeFilippis, A. P., Chapman, A. R., Mills, N. L., de Lemos, J. A., Arbab-Zadeh, A., Newby, L. K., et al. (2019). Assessment and Treatment of Patients with Type 2 Myocardial Infarction and Acute Nonischemic Myocardial Injury. Circulation 140 (20), 1661–1678. doi:10.1161/CIRCULATIONAHA.119.040631
Duyckaerts, C., Delatour, B., and Potier, M.-C. (2009). Classification and Basic Pathology of Alzheimer Disease. Acta Neuropathol. 118, 5–36. doi:10.1007/s00401-009-0532-1
Faber, S., Fahrenholz, T., Wolle, M. M., Kern, J. C., Pamuku, M., Miller, L., et al. (2019). Chronic Exposure to Xenobiotic Pollution Leads to Significantly Higher Total Glutathione and Lower Reduced to Oxidized Glutathione Ratio in Red Blood Cells of Children with Autism. Free Radic. Biol. Med. 134, 666–677. doi:10.1016/j.freeradbiomed.2019.02.009
Gastard, M. C., Troncoso, J. C., and Koliatsos, V. E. (2003). Caspase Activation in the Limbic Cortex of Subjects with Early Alzheimer's Disease. Ann. Neurol. 54, 393–398. doi:10.1002/ana.10680
Ge, G., Wang, A., Yang, J., Chen, Y., Yang, J., Li, Y., et al. (2016). Interleukin-37 Suppresses Tumor Growth through Inhibition of Angiogenesis in Non-small Cell Lung Cancer. J. Exp. Clin. Cancer Res. 35, 13. doi:10.1186/s13046-016-0293-3
Gegg, M. E., Burke, D., Heales, S. J. R., Cooper, J. M., Hardy, J., Wood, N. W., et al. (2012). Glucocerebrosidase Deficiency in Substantia Nigra of Parkinson Disease Brains. Ann. Neurol. 72, 455–463. doi:10.1002/ana.23614
Gilad, Y., Rifkin, S. A., and Pritchard, J. K. (2008). Revealing the Architecture of Gene Regulation: the Promise of eQTL Studies. Trends Genet. 24 (8), 408–415. doi:10.1016/j.tig.2008.06.001
GTEx Consortium (2020). The GTEx Consortium Atlas of Genetic Regulatory Effects across Human Tissues. Science 369 (6509), 1318–1330. doi:10.1126/science.aaz1776
Gu, S., Cui, F., Yin, J., Fang, C., and Liu, L. (2021). Altered mRNA Expression Levels of Autophagy- and Apoptosis-Related Genes in the FOXO Pathway in Schizophrenia Patients Treated with Olanzapine. Neurosci. Lett. 746, 135669. doi:10.1016/j.neulet.2021.135669
Hormozdiari, F., van de Bunt, M., Segrè, A. V., Li, X., Joo, J. W. J., Bilow, M., et al. (2016). Colocalization of GWAS and eQTL Signals Detects Target Genes. Am. J. Hum. Genet. 99 (6), 1245–1260. doi:10.1016/j.ajhg.2016.10.003
Huang, C.-Y., Hsu, L.-H., Chen, C.-Y., Chang, G.-C., Chang, H.-W., Hung, Y.-M., et al. (2020). Inhibition of Alternative Cancer Cell Metabolism of EGFR Mutated Non-small Cell Lung Cancer Serves as a Potential Therapeutic Strategy. Cancers 12 (1), 181. doi:10.3390/cancers12010181
Katrinli, S., Lori, A., Kilaru, V., Carter, S., Powers, A., Gillespie, C. F., et al. (2019). Association of HLA Locus Alleles with Posttraumatic Stress Disorder. Brain Behav. Immun. 81, 655–658. doi:10.1016/j.bbi.2019.07.016
Kawai, T., Forrester, S. J., O’Brien, S., Baggett, A., Rizzo, V., and Eguchi, S. (2017). AT1 Receptor Signaling Pathways in the Cardiovascular System. Pharmacol. Res. 125, 4–13. doi:10.1016/j.phrs.2017.05.008
Kubo, H., Hoshi, M., Mouri, A., Tashita, C., Yamamoto, Y., Nabeshima, T., et al. (2017). Absence of Kynurenine 3-monooxygenase Reduces Mortality of Acute Viral Myocarditis in Mice. Immunol. Lett. 181, 94–100. doi:10.1016/j.imlet.2016.11.012
Li, D., Parks, S. B., Kushner, J. D., Nauman, D., Burgess, D., Ludwigsen, S., et al. (2006). Mutations of Presenilin Genes in Dilated Cardiomyopathy and Heart Failure. Am. J. Hum. Genet. 79, 1030–1039. doi:10.1086/509900
Li, S., Lu, Q., and Cui, Y. (2010). A Systems Biology Approach for Identifying Novel Pathway Regulators in eQTL Mapping. J. Biopharm. Stat. 20 (2), 373–400. doi:10.1080/10543400903572803
Liu, G., Yao, L., Liu, J., Jiang, Y., Ma, G., Chen, Z., et al. (2014). Cardiovascular Disease Contributes to Alzheimer's Disease: Evidence from Large-Scale Genome-wide Association Studies. Neurobiol. Aging 35 (4), 786–792. doi:10.1016/j.neurobiolaging.2013.10.084
Lubbers, E. R., and Mohler, P. J. (2016). Roles and Regulation of Protein Phosphatase 2A (PP2A) in the Heart. J. Mol. Cell. Cardiol. 101, 127–133. doi:10.1016/j.yjmcc.2016.11.003
Maack, C., Kartes, T., Kilter, H., Schäfers, H. J., Nickenig, G., Böhm, M., et al. (2003). Oxygen Free Radical Release in Human Failing Myocardium Is Associated with Increased Activity of Rac1-GTPase and Represents a Target for Statin Treatment. Circulation 108 (13), 1567–1574. doi:10.1161/01.cir.0000091084.46500.bb
McFadden, S. A. (1996). Phenotypic Variation in Xenobiotic Metabolism and Adverse Environmental Response: Focus on Sulfur-dependent Detoxification Pathways. Toxicology 111 (1–3), 43–65. doi:10.1016/0300-483X(96)03392-6
Pan, B., Shen, J., Cao, J., Zhou, Y., Shang, L., Jin, S., et al. (2015). Interleukin-17 Promotes Angiogenesis by Stimulating VEGF Production of Cancer Cells via the STAT3/GIV Signaling Pathway in Non-small-cell Lung Cancer. Sci. Rep. 5, 16053. doi:10.1038/srep16053
Qiu, C., Winblad, B., Marengoni, A., Klarin, I., Fastbom, J., and Fratiglioni, L. (2006). Heart Failure and Risk of Dementia and Alzheimer Disease: A Population-Based Cohort Study. Arch. Intern. Med. 166 (9), 1003–1008. doi:10.1001/archinte.166.9.1003
Sadigh-Eteghad, S., Sabermarouf, B., Majdi, A., Talebi, M., Farhoudi, M., and Mahmoudi, J. (2015). Amyloid-beta: a Crucial Factor in Alzheimer's Disease. Med. Princ Pract. 24 (1), 1–10. doi:10.1159/000369101
Seshasubramanian, V., Raghavan, V., SathishKannan, A. D., Naganathan, C., Ramachandran, A., Arasu, P., et al. (2020). Association of HLA‐A, ‐B, ‐C, ‐DRB1 and ‐DQB1 Alleles at Amino Acid Level in Individuals with Schizophrenia: A Study from South India. Int. J. Immunogenet. 47 (6), 501–511. doi:10.1111/iji.12507
Sieberts, S. K., Perumal, T. M., Carrasquillo, M. M., Allen, M., Reddy, J. S., Hoffman, G. E., et al. (2020). Large eQTL Meta-Analysis Reveals Differing Patterns between Cerebral Cortical and Cerebellar Brain Regions. Sci. Data 7, 340. doi:10.1038/s41597-020-00642-8
Xu, T., Jin, P., and Qin, Z. S. (2020). Regulatory Annotation of Genomic Intervals Based on Tissue-specific Expression QTLs. Bioinformatics 36 (3), 690–697. doi:10.1093/bioinformatics/btz669
Zhang, H., Yue, Y., Sun, T., Wu, X., and Xiong, S. (2017). Transmissible Endoplasmic Reticulum Stress from Myocardiocytes to Macrophages Is Pivotal for the Pathogenesis of CVB3-Induced Viral Myocarditis. Sci. Rep. 7, 42162. doi:10.1038/srep42162
Zhao, T., Hu, Y., Zang, T., and Wang, Y. (2019). Integrate GWAS, eQTL, and mQTL Data to Identify Alzheimer's Disease-Related Genes. Front. Genet. 10, 1021. doi:10.3389/fgene.2019.01021
Zhu, Z., Zhang, F., Hu, H., Bakshi, A., Robinson, M. R., Powell, J. E., et al. (2016). Integration of Summary Data from GWAS and eQTL Studies Predicts Complex Trait Gene Targets. Nat. Genet. 48, 481–487. doi:10.1038/ng.3538
Keywords: eQTLs, gene pathway sets, gene set enrichment analyses, tissue-pathway association, complex traits
Citation: Wang B, Yang J, Qiu S, Bai Y and Qin ZS (2021) Systematic Exploration in Tissue-Pathway Associations of Complex Traits Using Comprehensive eQTLs Catalog. Front. Big Data 4:719737. doi: 10.3389/fdata.2021.719737
Received: 03 June 2021; Accepted: 13 October 2021;
Published: 03 November 2021.
Edited by:
Prashanti Manda, University of North Carolina at Greensboro, United StatesReviewed by:
Kui Zhang, Michigan Technological University, United StatesTianhua Niu, Tulane University, United States
Copyright © 2021 Wang, Yang, Qiu, Bai and Qin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Zhaohui S. Qin, emhhb2h1aS5xaW5AZW1vcnkuZWR1