- ¹IRCCS Neuromed, Pozzilli, Italy
- ²Institute of Genetics and Biophysics, National Research Council, Naples, Italy
Parkinson Disease (PD) is a complex neurodegenerative disorder characterized by high genetic heterogeneity and missing heritability. Since the genetic background of PD can partly vary among ethnicities and neurological scales have been scarcely investigated in a PD setting, we performed an exploratory Whole Exome Sequencing (WES) analysis of 123 PD patients from mainland Italy, investigating scales assessing motor (UPDRS), cognitive (MoCA), and other non-motor symptoms (NMS). We performed variant prioritization, followed by targeted association testing of prioritized variants in 446 PD cases and 211 controls. Then we ran Exome-Wide Association Scans (EWAS) within sequenced PD cases (N = 113), testing both motor and non-motor PD endophenotypes, as well as their associations with Polygenic Risk Scores (PRS) influencing brain subcortical volumes. We identified a variant associated with PD, rs201330591 in GTF2H2 (5q13; alternative T allele: OR [CI] = 8.16[1.08; 61.52], FDR = 0.048), which was not replicated in an independent cohort of European ancestry (1,148 PD cases, 503 controls). In the EWAS, polygenic analyses revealed statistically significant multivariable associations of amygdala- [β(SE) = −0.039(0.013); FDR = 0.039] and caudate-PRS [0.043(0.013); 0.028] with motor symptoms. All subcortical PRSs in a multivariable model notably increased the variance explained in motor (adjusted-R2 = 38.6%), cognitive (32.2%) and other non-motor symptoms (28.9%), compared to baseline models (~20%). Although the small sample size warrants further replication, these findings suggest a shared genetic architecture between PD symptoms and subcortical structures, and provide interesting clues on PD genetic and neuroimaging features.
Introduction
Parkinson's Disease (PD) is one of the most common neurodegenerative disorders, affecting 1% of the population over 60 years of age and causing a progressive loss of dopaminergic neurons in the substantia nigra pars compacta (1, 2). This results in a wide phenotypic spectrum, including both motor (e.g., rigidity, tremor, and bradykinesia) and non-motor symptoms (e.g., cognitive impairment and depression) (2). PD is characterized by a complex architecture, with a number of genetic and environmental factors influencing susceptibility to the disease (3). This disorder shows an extreme genetic heterogeneity, with 10% of PD cases having Mendelian inheritance (1, 4). The genes which have been most robustly implicated in Mendelian forms of PD include SNCA (5), LRRK2 (6), PARK2 (7), ATP13A2 (8), PINK1 (9), DJ-1 (10), VPS35 (11), DNAJC13 (12), and GBA (13) [see (14–16) for a review]. In these and other genes, rare mutations with either dominant (5, 6) or recessive inheritance modes (7, 9, 10) have been identified, often through genome-wide linkage studies followed by targeted genotyping [e.g., (6)] or, more recently, through Next Generation Sequencing (NGS) studies [e.g., (11, 12)]. In addition to rare mutations, common susceptibility variants such as Single Nucleotide Polymorphisms (SNPs) have also been detected within these genes, e.g., in LRRK2 and SNCA (16). However, the genetic variants identified so far, be they common or rare, explain only a minor part of PD heritability (17), and for a large majority of PD cases the genetic diagnosis remains unresolved. The issue of missing heritability has been tackled through different approaches in recent years, including Genome Wide Association Scans (GWAS) to identify common variants with moderate/weak effect sizes on PD susceptibility [e.g., (18)], and NGS (mostly Whole Exome Sequencing) studies to identify rare causative mutations [e.g., (3, 4, 17, 19–22)].
Moreover, the genetic architecture and the mutational spectrum of PD can vary based on the ethnic and genetic background of the population (2, 23), hence population-specific genetic studies are warranted [as in (17, 21)].
Large-scale genomic studies carried out so far have scarcely investigated inter-individual variation in PD endophenotypes like neurological scales (3, 4, 17–22, 24). A GWAS of age-at-onset in 25,568 PD cases reported two genome-wide significant associations within SNCA and TMEM175 (25), while other preliminary GWAS of cognitive performance and motor symptoms progression are ongoing (26, 27). Other SNP-based genomic studies tested associations of Polygenic Risk Scores (PRS) for PD with alpha-synuclein levels in the cerebrospinal fluid, age-at-onset of the disease, motor/cognitive symptoms and PD status [as reviewed in (28)], detecting significant associations with PD risk (29), earlier PD onset (29, 30), and faster motor and cognitive decline (31). In addition, the largest case-control GWAS on PD carried out so far, involving the analysis of ~56,000 cases and 1.4 million controls, identified significant genetic correlations with structural neuroimaging measures like intracranial and putamen volume (24). However, PRS analyses of Parkinson neuroimaging correlates have never been reported. Overall, no NGS study so far has focused on identifying genetic variants associated with PD endophenotypes, and there is a paucity of genomic studies doing so, in particular with motor, cognitive and non-motor scales, as well as with neuroimaging traits related to PD risk and symptoms, like subcortical volumes (32–35).
Here, we present the first Whole Exome Sequencing (WES) analysis investigating continuous PD endophenotypes and PD genetic susceptibility in mainland Italy. Through an exploratory multi-stage approach, we first performed rare variant prioritization and case-control association testing, attempting replication of findings in an international cohort of PD cases and controls of European ancestry (22). Then, we carried out Exome-Wide Association Scans with continuous neurological scales related to PD, assessing both motor and non-motor symptoms, to identify common variants potentially affecting these domains. Finally, we performed PRS analyses to test associations between polygenic scores influencing subcortical volumes and the above-mentioned scales. Our study contributes to research on the genetic basis of PD, focusing on motor, non-motor, and neuroimaging measures related to the disease.
Subjects and Methods
PD Cohorts
Inclusion criteria for participants in the study were reported Italian ancestry and a clinical diagnosis of PD by a qualified neurologist, according to published diagnostic criteria [see Supplementary Methods and (36)].
Four hundred and seventy-two PD patients [288 males; 196 familial cases; mean (SD) age of 66.6 (8.8) years] were recruited at the Parkinson Center of the specialized clinic IRCCS Neuromed, Pozzilli, Italy, between June 2015 and December 2017. They underwent a detailed phenotypic assessment and diagnostic protocol, which included neurological examination and evaluation of non-motor domains (see Supplementary Methods for details). The mean (SD) age at diagnosis was 58.3 (10.0) years. Along with patients, 121 non-consanguineous family members with no neurological signs or symptoms of PD at the time of recruitment were involved in the study, donating blood samples for targeted genetic analyses [mean (SD) age 62.9 (9.1) years; 44 males].
An additional cohort from mainland Italy was involved in the study, recruited at the Parkinson Institute of Istituti Clinici di Perfezionamento in Milan (hereafter called ICP). This included 82 related familial PD (FPD) patients of Italian ancestry, coming from 42 families with two or more first-degree relatives affected by PD [mean (SD) age 66.7 (10.4) years; mean (SD) age at diagnosis 60.69 (10.62) years; 41 males]. Further details on these cohorts are reported in Table S1a.
The project was approved by the ethical committees of IRCCS Neuromed, Pozzilli, and of ICP, Milan, and written informed consent was obtained from all the participating subjects.
Whole Exome Sequencing, Quality Control (QC) and Annotation
A total of 162 PD cases, including 90 familial cases (FPD, 42 from Neuromed and 48 from ICP) and 72 sporadic cases (SPD, from Neuromed), underwent WES analysis (see Table S1b for details) through the Illumina® HiSeq2000 platform (Illumina, San Diego, CA, USA), using the SureSelect All Exome kit v6 (Agilent® Technologies, Santa Clara, CA, USA) for enrichment of exonic regions. Alignment of reads to GRCh37/hg19 was performed using the Burrows Wheeler Aligner (BWA) MEM v0.7.5 (37). After removal of duplicate reads through the Picard MarkDuplicates command, single nucleotide variants (SNVs) and insertions/deletions (indels) were called using HaplotypeCaller and GenotypeGVCFs in Genome Analysis Toolkit (GATK) v3.5-0-g36282e4 (38). Variant calls with total depth (DP) <8 and genotype quality (GQ) <50 were set to missing, and variants with Minor Allele Count (MAC) = 0, number of alternative alleles ≠ 2 and call rate <95% were filtered out, as well as samples with identical-by-descent (IBD) sharing and sex mismatches, and samples with call rate <90% and intraspecific contamination rate >7%. Similarly, samples were checked for absence of outliers in terms of genome-wide homozygosity, number of singleton variants, and genetic ancestry [through Multidimensional Scaling Analysis in PLINK v 1.9; (39)]. In total, 123 PD cases (52 FPD + 71 SPD) and 334,671 variants (321,967 SNPs + 12,704 indels) passed QC. These variants were annotated to genes (within 10 kb from transcription start/stop site) through Annovar version 1-2-2016 (40) and Ensembl Variant Effect Predictor (VEP) v88 (41). Further details on genotype calling and QC are reported in Supplementary Methods.
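For illustration only, the genotype-level and variant-level filters described above can be sketched in a few lines of Python. This is a minimal sketch and not the pipeline used in the study: the function names are hypothetical, genotypes are assumed to be coded as alternative-allele counts (0/1/2), and a call is masked when it fails either the DP or the GQ threshold, which is one possible reading of the text.

```python
MIN_DP = 8            # minimum total depth per call (threshold from the text)
MIN_GQ = 50           # minimum genotype quality per call (threshold from the text)
MIN_CALL_RATE = 0.95  # minimum variant call rate (threshold from the text)


def mask_call(gt, dp, gq):
    """Return the genotype (0/1/2 alternative alleles), or None if the call fails QC.

    Assumption: a call is set to missing when DP or GQ is below threshold.
    """
    if gt is None or dp < MIN_DP or gq < MIN_GQ:
        return None
    return gt


def variant_passes(calls):
    """calls: list of (gt, dp, gq) tuples, one per sample, for a biallelic variant.

    Keeps the variant when call rate >= 95% and minor allele count > 0.
    """
    if not calls:
        return False
    masked = [mask_call(gt, dp, gq) for gt, dp, gq in calls]
    called = [g for g in masked if g is not None]
    call_rate = len(called) / len(masked)
    alt_count = sum(called)
    minor_allele_count = min(alt_count, 2 * len(called) - alt_count)
    return call_rate >= MIN_CALL_RATE and minor_allele_count > 0
```

Sample-level filters (IBD sharing, sex mismatches, contamination, ancestry outliers) are not covered by this sketch.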
Variants Prioritization, Validation, and Genetic Association Analysis With PD Status
Among the 334,671 variants passing QC, we attempted to detect rare variants potentially associated with PD status in our dataset (123 PD cases). To this end, we applied the following bioinformatic pipeline (summarized in Figure 1):
1. We selected variants with predicted high or moderate impact on protein function, based on VEP annotation (41). These included 2,334 variants assumed to have high (disruptive) impact on the protein, probably causing protein truncation, loss of function or triggering nonsense mediated decay (hereafter called HIGH variants), and 67,047 non-disruptive variants that might change protein effectiveness (hereafter called MODERATE variants; see Table S2 for a detailed classification).
2. We retained variants with an alternative allele frequency (AF) at least five times higher than in three WES databases representative of the European population, namely 1000 Genomes EUR (European samples of the 1000 Genomes Project, phase 3 v5; N = 503) (42), ESP EA (European American samples of NHLBI Exome Sequencing Project 6500 release si-v2; N = 4,300) (43), and ExAC NFE (Non-Finnish Europeans of the Exome Aggregation Consortium version 0.3.1; N = 33,370) (44). This resulted in the selection of 1,120 HIGH and 23,985 MODERATE variants.
3. We ranked resulting variants based on decreasing AF, and validated top-ranked variants in our dataset—namely HIGH variants with AF > 1% and MODERATE variants with AF > 2.5%—through Sanger sequencing or PCR (see Supplementary Methods; Tables S3a,b).
4. We performed targeted genotyping and case-control association analysis of the most frequent validated variants within each of the two functional annotation classes, namely HIGH and MODERATE impact variants. More specifically, we tested two HIGH variants (AF = 2.03%) and one MODERATE variant (AF = 4.66%) in our cohort (Table S4). This analysis was performed on 446 PD cases and 211 controls, which included 121 non-consanguineous relatives of PD patients and 90 unscreened controls (pseudo-controls) belonging to the general Italian population. Association analysis was performed through an allelic Fisher Exact Test with adaptive permutations in PLINK (see Supplementary Methods). Since age and sex were missing for pseudo-controls, no covariates were used in this analysis to avoid a substantial loss of sample size.
5. We attempted to replicate the significant association observed (rs201330591) in an independent case-control WES study of 1,148 young-onset unrelated PD cases (average age at onset 40.6 years; range 35–56 years) and 503 control participants of European ancestry (IPDGC cohort) (22). As above, we performed an allelic Fisher Exact Test and then meta-analyzed the resulting association with that observed in the Italian cohort, through a Mantel-Haenszel meta-analysis in R (45) (see Supplementary Methods for further details).
Figure 1. Bioinformatic pipeline applied in the present study for PD causative variants prioritization and case-control association testing. PD, Parkinson disease; FPD/SPD, familial/sporadic Parkinson disease; IPDGC, International Parkinson Disease Genetics Consortium (22). *Public exomes databases: 1000G EUR, European Sample of the 1000 Genomes project, phase 3 (N = 503) (42); ESP EA, NHLBI Exome Sequencing Project 6500 release si-v2, European ancestry (N = 4,300) (43); ExAC NFE, Exome Aggregation Consortium version 0.3.1, Non-Finnish Europeans (N = 33,370) (44).
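Steps 1–3 of the pipeline above can be illustrated with a short Python sketch. It is not the code used in the study: the variant records, identifiers and reference frequencies below are hypothetical, a variant is assumed to be retained only when its cohort AF is at least five times the AF in all three reference databases, and a variant absent from a database is assumed to have AF = 0 there.

```python
FOLD = 5.0                                  # AF enrichment factor (from the text)
IMPACTS = {"HIGH", "MODERATE"}              # VEP impact classes retained in step 1
VALIDATION_AF = {"HIGH": 0.01, "MODERATE": 0.025}  # step 3 thresholds (from the text)


def prioritize(variants):
    """Keep HIGH/MODERATE variants enriched >= 5x vs. every reference AF,
    then rank them by decreasing cohort AF."""
    kept = [
        v for v in variants
        if v["impact"] in IMPACTS
        and all(v["af"] >= FOLD * ref_af for ref_af in v["ref_afs"])
    ]
    return sorted(kept, key=lambda v: v["af"], reverse=True)


def needs_validation(variant):
    """True when a prioritized variant exceeds the AF cut-off for its impact class."""
    return variant["af"] > VALIDATION_AF[variant["impact"]]


# Hypothetical records; ref_afs = [1000G EUR, ESP EA, ExAC NFE].
example_variants = [
    {"id": "var_A", "impact": "HIGH", "af": 0.020, "ref_afs": [0.001, 0.0, 0.002]},
    {"id": "var_B", "impact": "MODERATE", "af": 0.045, "ref_afs": [0.005, 0.004, 0.006]},
    {"id": "var_C", "impact": "MODERATE", "af": 0.100, "ref_afs": [0.090, 0.080, 0.110]},
    {"id": "var_D", "impact": "LOW", "af": 0.300, "ref_afs": [0.0, 0.0, 0.0]},
]

ranked = prioritize(example_variants)
to_validate = [v["id"] for v in ranked if needs_validation(v)]
```

In this toy input, var_C is dropped because it is not enriched relative to the reference databases, and var_D because of its impact class.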
Exome-Wide Association Study With PD Endophenotypes
We tested common variants detected through WES for association with three continuous scales which assess different domains usually affected in PD (Tables S5a,b). These scales included the Movement Disorder Society revised version of the Unified Parkinson's Disease Rating Scale Part III (hereafter called UPDRS) (46), which assesses motor symptoms; the Montreal Cognitive Assessment (MoCA) (47), which measures general cognitive abilities; and a modified version of the Non-Motor Symptoms Scale for Parkinson Disease (hereafter called NMS) (48), which tests non-motor symptoms (see Supplementary Methods for details). Indeed, these scales represent useful endophenotypes that make it possible to disentangle the genetic basis of PD at a fine-grained resolution, as done elsewhere (49–51).
After genotypic and phenotypic QC (described in Supplementary Methods), 110,803 common autosomal variants (MAF > 5%) in 113 PD cases were available for association testing, which was carried out in two steps. First, we performed univariate linear mixed effect models in EMMAX [version March 2010; (52)], to identify genetic effects on each single neurological scale tested. Then, in light of the moderate correlations among these scales (Table S5b), we carried out a multivariate genetic association analysis on all the three scales together, through TATES (53), to identify relational pleiotropic effects on the domains assessed. These analyses were adjusted for different covariates, including PD familiarity, sex, age, pharmacological treatment status (ON/OFF), years of disease, daily L-Dopa dosage, and 10 genetic ancestry components. The significance threshold of the multivariate analysis was corrected for the number of LD-independent SNPs tested (α = 0.05/56,588 = 8.84 × 10−7), as computed by the Genetic Type I error calculator (GEC) (54), while for univariate tests we applied an additional Bonferroni correction for the number of scales tested (α = 2.95 × 10−7).
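The two significance thresholds reported above follow from simple arithmetic, reproduced here as a small Python sketch. The constants come from the text; the helper name `is_significant` is hypothetical and serves only to illustrate how a p-value would be compared against each threshold.

```python
N_INDEPENDENT_SNPS = 56_588   # LD-independent SNPs estimated by GEC (from the text)
N_SCALES = 3                  # UPDRS, MoCA and NMS

# Multivariate threshold: Bonferroni over independent SNPs only.
alpha_multivariate = 0.05 / N_INDEPENDENT_SNPS
# Univariate threshold: additional Bonferroni correction for the three scales.
alpha_univariate = alpha_multivariate / N_SCALES


def is_significant(p_value, univariate):
    """Compare a p-value with the threshold appropriate to the type of test."""
    alpha = alpha_univariate if univariate else alpha_multivariate
    return p_value < alpha


print(f"{alpha_multivariate:.2e}")  # 8.84e-07
print(f"{alpha_univariate:.2e}")    # 2.95e-07
```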
To follow up on the results of the exome-wide association study (EWAS), we genotyped the top hit identified in the whole Neuromed cohort (472 PD cases) and tested it for association through linear regression models with adaptive permutations in PLINK, using the same covariates as above except for genetic ancestry (see Supplementary Methods). After single univariate association tests with PD endophenotypes, we then combined the results into a multivariate association analysis through TATES.
Polygenic Risk Score (PRS) Analyses of PD Endophenotypes
We used exome-wide genetic data to build Polygenic Risk Scores (PRSs) influencing brain subcortical volumes, which were trained using summary statistics of a previous large independent GWAS (Nmax = 30,717) (55). This analysis was motivated by previous literature reporting both structural and functional alterations of several subcortical structures in PD (32–35), which are therefore often considered useful neuroimaging correlates of the disease. First we computed standardized best-fit polygenic scores through PRSice-2 (56), over varying association significance thresholds in the training GWAS (ranging from 5 × 10−8 to 1), for nucleus accumbens, caudate, putamen, pallidum, amygdala, hippocampus and thalamus volume. Then we tested the resulting scores for association with UPDRS, MoCA and NMS scales through generalized linear models [glm() function in R], in a random extraction of one individual per family in our cohort (four relatives removed). These models were adjusted for PD familiarity, sex, age, pharmacological treatment status (ON/OFF), years of disease, daily L-Dopa dosage, and 10 genetic ancestry components, as above. Since these subcortical structures are thought to be functionally connected in complex connectivity networks and the pattern of atrophy is often spread across these networks in PD pathology (32, 57), we performed multivariable models, testing all the PRSs built simultaneously, for each PD endophenotype. First, we assessed multivariate associations of each subcortical PRS to detect evidence at the specific structure level, applying a Benjamini-Hochberg correction for three different PD scales and five independent latent subcortical traits tested (58), as revealed by Matrix Spectral Decomposition applied to the phenotypic correlation matrix of these measures (59).
Then, to obtain a measure of the total variance of PD endophenotypes explained by the polygenic scores, we compared adjusted Nagelkerke's R2 values of the multivariable models including all the subcortical PRSs and covariates (hereafter called full PRS models) with the baseline models including only covariates (see above).
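To make the notion of a thresholded, standardized polygenic score concrete, a minimal Python sketch is given below. It is a simplification and not what PRSice-2 does internally: it only sums effect-allele dosages weighted by GWAS effect sizes for SNPs below a given p-value threshold and then standardizes the scores across individuals, without clumping or best-fit threshold selection. All names and numbers are hypothetical.

```python
from statistics import mean, pstdev


def polygenic_score(dosages, weights, pvalues, p_threshold):
    """Weighted sum of effect-allele dosages over SNPs with GWAS p <= p_threshold."""
    return sum(
        d * w for d, w, p in zip(dosages, weights, pvalues) if p <= p_threshold
    )


def standardize(scores):
    """Centre scores on zero and scale them to unit (population) standard deviation."""
    mu, sd = mean(scores), pstdev(scores)
    return [(s - mu) / sd for s in scores]


# Hypothetical training GWAS summary statistics for three SNPs.
gwas_weights = [0.20, -0.10, 0.05]
gwas_pvalues = [1e-8, 0.01, 0.50]

# Hypothetical dosages (0/1/2 effect alleles) for three individuals.
cohort_dosages = [[2, 1, 0], [0, 2, 2], [1, 0, 1]]

raw_scores = [
    polygenic_score(d, gwas_weights, gwas_pvalues, p_threshold=0.05)
    for d in cohort_dosages
]
standardized_scores = standardize(raw_scores)
```

With the threshold set to 0.05, the third SNP is excluded from every score; repeating the computation over a grid of thresholds would be the natural next step toward a best-fit score.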
Results
Exome-wide, we prioritized three validated variants showing the highest alternative allele frequency among those with high and moderate impact on protein function in our cohort of patients. These included rs772162369 in MFSD6L and rs56407180 in KALRN among HIGH variants (AF = 2.03%), and rs201330591 in GTF2H2 among MODERATE variants (AF = 4.66%) (Table S4). These variants were further tested for association with PD in the whole Neuromed cohort, which revealed a significant association with PD for rs201330591 [uc011crt.2:exon7:c.T217A:p.S73T; alternative T allele: OR [CI] = 8.16 [1.08; 61.52], p = 0.02, FDR < 0.05; see Table 1]. However, this association was not replicated in the IPDGC cohort [rs201330591: OR [CI] = 1.12 [0.41; 3.04], p = 0.83], nor in the subsequent meta-analysis with our study (see Table S6).
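For readers less familiar with the statistics reported here, the sketch below shows how an allelic odds ratio, a two-sided Fisher exact p-value and a Mantel-Haenszel pooled odds ratio can be computed from 2 × 2 allele-count tables. It is an illustration under stated assumptions: it uses the exact hypergeometric distribution rather than the adaptive permutations run in PLINK, it does not use the rmeta package, and the counts in the usage example are invented and are not the allele counts of this study.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """OR for a 2x2 allele table: a/b = alt/ref alleles in cases,
    c/d = alt/ref alleles in controls. Undefined if b or c is zero."""
    return (a * d) / (b * c)


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value: sum of the probabilities of all tables
    with the same margins that are no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = {
        x: comb(row1, x) * comb(row2, col1 - x) / denom
        for x in range(lo, hi + 1)
    }
    p_obs = probs[a]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9)))


def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled OR across studies; tables = [(a, b, c, d), ...]."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den


# Invented allele counts for two hypothetical studies.
study_1 = (3, 1, 1, 3)
study_2 = (3, 1, 1, 3)
pooled = mantel_haenszel_or([study_1, study_2])
```

Confidence intervals, such as those given in Table 1, would require an additional step that is not shown here.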
Table 1. Association statistics (OR and 95% Confidence Interval) of the most frequent genetic variants detected through the variant prioritization pipeline in our PD callset.
Exome-wide association analyses with different scales assessing motor and non-motor PD symptoms revealed no significant genetic associations surviving Bonferroni correction, either in a univariate (Table 2; Figures S1a–f) or in a multivariate setting (Table S7; Figures S1g,h). The most significant effect was observed on MoCA for rs3835072 [GAA/G, MAF ~ 40%; p = 6.69 × 10−7, β (SE) = 0.089 (0.017) for major allele GAA], an intronic indel located in the CCT7 gene. Although this effect met nominal exome-wide significance (α = 8.84 × 10−7), it did not survive correction for testing of multiple scales (α = 2.95 × 10−7). Other close variants showed comparable associations and were all in high LD (r2 > 0.75; see Table 2), suggesting they tagged the same genetic effect. Similarly, rs3835072 showed the most significant multivariate association with the different PD endophenotypes tested (p = 1.88 × 10−6), supported by a nominally significant effect on UPDRS [p = 0.033, β (SE) = −0.038 (0.017)], in addition to the one detected with MoCA. Again, this variant approached but did not meet exome-wide significance (α = 8.84 × 10−7; Table S7). A follow-up association test in the whole Neuromed cohort revealed only a trend of association of this variant with MoCA [p = 0.067; β (SE) = 0.018 (0.010) for major allele GAA], while no significant association was observed either with univariate UPDRS/NMS scales or in a multivariate setting (p = 0.19; Table S8).
Table 2. Most significant single variant associations (p < 10−5) detected in the univariate EWAS of three continuous scales assessing PD endophenotypes (see abbreviations below).
Multivariable association analyses with standardized subcortical polygenic scores revealed statistically significant associations of UPDRS score with amygdala- [β (SE) = −0.039 (0.013), p = 0.004, FDR = 0.039] and caudate-PRS [β (SE) = 0.043 (0.013), p = 0.001, FDR = 0.028]. Full results of the multivariable models for the three PD scales tested are reported in Table 3. Overall, the multivariable association model including all the subcortical PRSs (full PRS model) explained 38.6% of variance in UPDRS scores, vs. 20.3% in the baseline model (including only covariates). A smaller difference was observed for the other PD endophenotypes, where full PRS models explained 32.2 and 28.9% of the total variance in MoCA and NMS scores (vs. 20.3 and 20.7% in the baseline models), respectively.
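The two quantities reported in this paragraph, FDR-adjusted p-values and adjusted variance explained, can be illustrated with the following Python sketch. It implements the standard Benjamini-Hochberg step-up adjustment and the usual degrees-of-freedom adjustment of R2; whether these match exactly the computations behind Table 3 (in particular the handling of the five latent subcortical traits and of Nagelkerke's R2) is an assumption, and the input values in the usage lines are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def adjusted_r2(r2, n_samples, n_predictors):
    """Standard adjusted R2: penalizes R2 for the number of predictors."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Hypothetical p-values from four tests.
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])

# Hypothetical model: raw R2 = 0.5, 101 samples, 10 predictors.
r2_adj = adjusted_r2(0.5, n_samples=101, n_predictors=10)
```

Comparing `adjusted_r2` between a covariates-only model and a model with all PRSs added gives the kind of contrast described above, since the adjustment accounts for the extra predictors in the full model.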
Table 3. Results of multivariable regression models testing associations of PD endophenotypes with best-fit Polygenic Risk Scores (PRSs) for subcortical volumes in the Neuromed cohort.
Discussion
In this paper, we report an exploratory WES analysis of 123 PD cases from Italy. Although a previous study analyzed PD cases from an Italian genetic isolate, Sardinia (17), this represents the first WES study focused on PD patients from mainland Italy, the largest ever carried out in the country, and the richest in terms of phenotypes assessed. Indeed, in spite of the relatively small sample size sequenced and of the availability of exome (rather than whole genome) data, which represent the main limitations of the present study, we exploited the wealth of neurological scales assessed to carry out an exome-wide association study of motor and non-motor PD endophenotypes. Moreover, we tested associations of the scales available—namely UPDRS, MoCA and NMS—with PRSs known to influence subcortical volumes, which have long been considered as neuroimaging correlates of PD and neurodegeneration (32–35). To our knowledge, this study represents the first attempt to test genetic associations with neurological scales in PD at the exome-wide level.
Through a stepwise approach, we identified a genetic variant with a frequency notably higher than in published WES databases and a significant association with PD in an extended analysis of our cohort, namely rs201330591, encoding a Serine-to-Threonine change in GTF2H2 (General Transcription Factor IIH Subunit 2; 5q13). GTF2H2 has been previously implicated as a modifier gene in spinal muscular atrophy (SMA), an autosomal recessive neurodegenerative disorder characterized by progressive death of motor neurons, implying proximal muscle weakness, and wasting in the absence of sensory signs. This gene is located not far from the causative gene of SMA, SMN1 (Survival Motor Neuron 1), and deletions involving this gene have been detected in severe forms of the disease (60). It encodes a subunit of the TFIIH transcription factor, which has been also implicated in Cockayne syndrome, a rare disease characterized by progeria and nervous system abnormalities, among other signs (61). However, the association detected was not replicated in the IPDGC cohort (22), which suggests caution in the interpretation of this finding. This may be due to several reasons, including different recruitment and filtering criteria of the two studies, the extreme genetic heterogeneity of PD, the lack of power in our analyses or the possibility that false positives were detected in the discovery cohort, due to its small sample size. Further replication attempts are warranted to support this finding.
The exome-wide analysis of continuous PD endophenotypes revealed an association approaching exome-wide significance at rs3835072, both in a univariate setting with MoCA score (representing general cognitive performance) and at the multivariate level, including other motor (UPDRS) and non-motor PD endophenotypes (NMS). rs3835072 is an intronic indel predicted to alter splicing in the CCT7 gene (chaperonin containing TCP1, subunit 7; 2p13.2). This gene encodes a member of the chaperonin containing TCP1 (CCT) complex, which is impaired in severe neuropathies and in neurodegenerative disorders like Alzheimer's Disease (AD), where it is thought to promote toxic protein aggregates and cell death (62). Interestingly, the leading association signal identified at rs3835072 was with cognitive performance, which is impaired both in AD and in PD (63). However, this association only approached significance in an extended follow-up analysis of our PD cohort, which does not support a significant influence of this gene on cognitive performance.
The most interesting findings of the present work come from association analyses of polygenic risk scores (PRSs) influencing brain subcortical volumes with the continuous PD endophenotypes available. Multivariable regression models analyzing all the subcortical polygenic scores together (full PRS models) revealed a notable increase in the proportion of variance explained for the PD scales tested, compared to the baseline models including only covariates (see Methods section). In particular, the fraction of variance explained almost doubled for motor symptoms (UPDRS), increasing from ~20% in the baseline model to ~39% in the full model. For non-motor symptoms scales (MoCA and NMS), the increase in variance explained by the full PRS model was less sharp (32 and 29%, respectively), but still evident. This suggests that the genetic underpinnings of brain subcortical structures may be important in influencing PD symptoms, especially for the motor domain. Among the seven subcortical PRSs tested in the multivariable model, we observed significant associations of amygdala- and caudate-PRS with UPDRS. These findings do not support the significant bivariate genetic correlation between putamen volume and PD risk recently reported (24), a discrepancy which may be explained by the different methodologies used to investigate genetic overlap and by the low power provided by our study. Moreover, while the inverse association observed between the amygdala-PRS and motor symptoms is in line with its reported atrophy in PD (64, 65), the positive association observed for the caudate polygenic score is in contrast with previous neuroimaging observations of reduced caudate volumes in Parkinson patients (66, 67), although these associations are often localized and not always consistent (33, 67). A potential explanation for this discrepancy may again be type I error, due to the small sample size of our study.
Alternatively, since caudate hypertrophy has been associated with vascular parkinsonism (68) and compensatory hypertrophy mechanisms have been reported for some subcortical structures in PD (64, 66), we may hypothesize that some PD patients may have a genetic predisposition to atrophy/hypertrophy in different subcortical structures, each representing a unique “mosaic” in terms of liability to motor and non-motor neurological symptoms. Although we are still far from a comprehensive view of structural brain changes in PD, multi-omic studies involving neuroimaging, clinical and genetic levels may help to verify this hypothesis.
Overall, the evidence reported here suggests that low power is likely the current bottleneck in research on the genetic bases of a disorder as heterogeneous as PD (20), and underlines the need for collaborative efforts to homogenize genetic analyses and increase sample size in WES studies of the disease. In addition, studies exploiting diverse phenotypic, pharmacological and clinical information can provide clues into the neurobiological basis of the disease. This paper represents an exploratory attempt in this direction, providing insights into the shared genetic bases of PD symptoms and brain subcortical structures, in spite of the small sample size. Further collaborative investigations are needed to elucidate the genetic underpinnings of Parkinson Disease, its neurological endophenotypes and neuroimaging correlates.
Data Availability Statement
Raw WES data which were analyzed in the present manuscript will be made available upon request to the corresponding author, in a way which does not affect privacy of the patients involved in the present study.
URLs
Annovar: http://annovar.openbioinformatics.org/en/latest/
Variant Effect Predictor (VEP): https://www.ensembl.org/info/docs/tools/vep/index.html
Genome Analysis Toolkit (GATK): https://software.broadinstitute.org/gatk/
Burrows Wheeler Aligner (BWA): http://bio-bwa.sourceforge.net/
Samtools: http://samtools.sourceforge.net/
Picard: http://broadinstitute.github.io/picard
Vcftools: https://vcftools.github.io/index.html
PLINK: https://www.cog-genomics.org/plink/1.9/
1000 Genomes Project: ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/
NHLBI Exome Sequencing Project: https://evs.gs.washington.edu/EVS/
Exome Aggregation Consortium: http://exac.broadinstitute.org/
Rmeta package: https://cran.r-project.org/web/packages/rmeta/index.html
EMMAX: http://genetics.cs.ucla.edu/emmax/index.html
TATES: https://ctg.cncr.nl/software/tates
GEC: http://grass.cgs.hku.hk/gec/
Human Integrated Protein Expression Database: http://www.genecards.org/
Ethics Statement
The studies involving human participants were reviewed and approved by IRCCS Neuromed. The patients/participants provided their written informed consent to participate in this study.
Author Contributions
TE, AS, and MC designed and supervised the study. NM, SV, SP, and MR recruited the patients, carried out phenotypic assessment, and collected the data. MR and AT carried out database curation, and performed sample management and bio-banking, along with AL and CD. AG and TN performed quality control and analysis of WES and other genotype data, under the supervision of MC. The International Parkinson's Disease Genomic Consortium (IPDGC) provided the replication data set and relevant statistics. MR and AT performed wet lab experiments, under the supervision of TE. AG wrote the manuscript, with contributions and final approval by all the co-authors.
Funding
This work was supported by Italian Ministry of Economic Development (M.I.S.E.), Invitalia CDS 0031, and by the Italian Ministry of Health. We thank Dr. Stefano Goldwurm for an informal review of the manuscript and the Parkinson Institute Biobank, member of the Telethon Network of Genetic Biobank (biobanknetwork.telethon.it/) funded by TELETHON Italy (project no. GTB12001), and supported by Fondazione Grigioni per il Morbo di Parkinson.
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fneur.2019.01362/full#supplementary-material
References
1. Lill CM. Genetics of Parkinson's disease. Mol Cell Probes. (2016) 30:386–96. doi: 10.1016/j.mcp.2016.11.001
2. Kalinderi K, Bostantjopoulou S, Fidani L. The genetic background of Parkinson's disease: current progress and future prospects. Acta Neurol Scand. (2016) 134:314–26. doi: 10.1111/ane.12563
3. Farlow JL, Robak LA, Hetrick K, Bowling K, Boerwinkle E, Coban-Akdemir ZH, et al. Whole-exome sequencing in familial Parkinson disease. JAMA Neurol. (2016) 73:68–75. doi: 10.1001/jamaneurol.2015.3266
4. Shulskaya MV, Alieva AK, Vlasov IN, Zyrin VV, Fedotova EY, Abramycheva NY, et al. Whole-exome sequencing in searching for new variants associated with the development of Parkinson's disease. Front Aging Neurosci. (2018) 10:136. doi: 10.3389/fnagi.2018.00136
5. Polymeropoulos MH. Mutation in the α-synuclein gene identified in families with Parkinson's disease. Science. (1997) 276:2045–7. doi: 10.1126/science.276.5321.2045
6. Paisán-Ruíz C, Jain S, Evans EW, Gilks WP, Simón J, Van Der Brug M, et al. Cloning of the gene containing mutations that cause PARK8-linked Parkinson's disease. Neuron. (2004) 44:595–600. doi: 10.1016/j.neuron.2004.10.023
7. Kitada T, Asakawa S, Hattori N, Matsumine H, Yamamura Y, Minoshima S, et al. Mutations in the parkin gene cause autosomal recessive juvenile parkinsonism. Nature. (1998) 392:605–8. doi: 10.1038/33416
8. Ramirez A, Heimbach A, Gründemann J, Stiller B, Hampshire D, Cid LP, et al. Hereditary parkinsonism with dementia is caused by mutations in ATP13A2, encoding a lysosomal type 5 P-type ATPase. Nat Genet. (2006) 38:1184–91. doi: 10.1038/ng1884
9. Valente EM. Hereditary early-onset Parkinson's disease caused by mutations in PINK1. Science. (2004) 304:1158–60. doi: 10.1126/science.1096284
10. Bonifati V. Mutations in the DJ-1 gene associated with autosomal recessive early-onset parkinsonism. Science. (2003) 299:256–9. doi: 10.1126/science.1077209
11. Zimprich A, Benet-Pagès A, Struhal W, Graf E, Eck SH, Offman MN, et al. A mutation in VPS35, encoding a subunit of the retromer complex, causes late-onset Parkinson disease. Am J Hum Genet. (2011) 89:168–75. doi: 10.1016/j.ajhg.2011.06.008
12. Vilariño-Güell C, Rajput A, Milnerwood AJ, Shah B, Szu-Tu C, Trinh J, et al. DNAJC13 mutations in Parkinson disease. Hum Mol Genet. (2013) 23:1794–801. doi: 10.1093/hmg/ddt570
13. Aharon-Peretz J, Badarny S, Rosenbaum H, Gershoni-Baruch R. Mutations in the glucocerebrosidase gene and Parkinson disease: phenotype-genotype correlation. Neurology. (2005) 65:1460–1. doi: 10.1212/01.wnl.0000176987.47875.28
14. Verstraeten A, Theuns J, Van Broeckhoven C. Progress in unraveling the genetic etiology of Parkinson disease in a genomic era. Trends Genet. (2015) 31:140–9. doi: 10.1016/j.tig.2015.01.004
15. Kim CY, Alcalay RN. Genetic forms of Parkinson's disease. Semin Neurol. (2017) 37:135–46. doi: 10.1055/s-0037-1601567
16. Bonifati V. Genetics of Parkinson's disease - state of the art, 2013. Park Relat Disord. (2014) 20:S23–8. doi: 10.1016/S1353-8020(13)70009-9
17. Quadri M, Yang X, Cossu G, Olgiati S, Saddi VM, Breedveld GJ, et al. An exome study of Parkinson's disease in Sardinia, a Mediterranean genetic isolate. Neurogenetics. (2015) 16:55–64. doi: 10.1007/s10048-014-0425-x
18. Chang D, Nalls MA, Hallgrímsdóttir IB, Hunkapiller J, van der Brug M, Cai F, et al. A meta-analysis of genome-wide association studies identifies 17 new Parkinson's disease risk loci. Nat Genet. (2017) 49:1511–6. doi: 10.1038/ng.3955
19. Ruiz-Martínez J, Azcona LJ, Bergareche A, Martí-Massó JF, Paisán-Ruiz C. Whole-exome sequencing associates novel CSMD1 gene mutations with familial Parkinson disease. Neurol Genet. (2017) 3:1–6. doi: 10.1212/NXG.0000000000000177
20. Sandor C, Honti F, Haerty W, Szewczyk-Krolikowski K, Tomlinson P, Evetts S, et al. Whole-exome sequencing of 228 patients with sporadic Parkinson's disease. Sci Rep. (2017) 7:1–8. doi: 10.1038/srep41188
21. Siitonen A, Nalls MA, Hernández D, Gibbs JR, Ding J, Ylikotila P, et al. Genetics of early-onset Parkinson's disease in Finland: exome sequencing and genome-wide association study. Neurobiol Aging. (2017) 53:195.e7–e10. doi: 10.1016/j.neurobiolaging.2017.01.019
22. Jansen IE, Ye H, Heetveld S, Lechler MC, Michels H, Seinstra RI, et al. Discovery and functional prioritization of Parkinson's disease candidate genes from large-scale whole exome sequencing. Genome Biol. (2017) 18:1–26. doi: 10.1186/s13059-017-1147-9
23. Ylönen S, Siitonen A, Nalls MA, Ylikotila P, Autere J, Eerola-Rautio J, et al. Genetic risk factors in Finnish patients with Parkinson's disease. Park Relat Disord. (2017) 45:39–43. doi: 10.1016/j.parkreldis.2017.09.021
24. Nalls MA, Blauwendraat C, Vallerga CL, Heilbron K, Bandres-Ciga S, Chang D, et al. Expanding Parkinson's disease genetics: novel risk loci, genomic context, causal insights and heritable risk. bioRxiv. (2019) 388165. doi: 10.1101/388165
25. Blauwendraat C, Heilbron K, Vallerga CL, Bandres-Ciga S, von Coelln R, Pihlstrøm L, et al. Parkinson's disease age at onset genome-wide association study: defining heritability, genetic loci, and α-synuclein mechanisms. Mov Disord. (2019) 34:866–75. doi: 10.1002/mds.27659
26. Tan M, Hubbard L, Lawton M, Kanavou S, Wood N, Hardy J, et al. Genome-wide association studies of motor and cognitive progression in Parkinson's disease. Mov Disord. (2018) 33(Suppl. 2).
27. Chung SJ, Choi N, Kim J, Kim K, Kim MJ, Kim YJ, et al. Genomic variants associated with cognitive impairment in Parkinson's disease: ethnicity-specific GWAS. Mov Disord. (2019) 34(Suppl. 2).
28. Ibanez L, Farias FHG, Dube U, Mihindukulasuriya KA, Harari O. Polygenic risk scores in neurodegenerative diseases: a review. Curr Genet Med Rep. (2019) 7:22–9. doi: 10.1007/s40142-019-0158-0
29. Ibanez L, Dube U, Saef B, Budde J, Black K, Medvedeva A, et al. Parkinson disease polygenic risk score is associated with Parkinson disease status and age at onset but not with alpha-synuclein cerebrospinal fluid levels. BMC Neurol. (2017) 17:198. doi: 10.1186/s12883-017-0978-z
30. Escott-Price V, Nalls MA, Morris HR, Lubbe S, Brice A, Gasser T, et al. Polygenic risk of Parkinson disease is correlated with disease age at onset. Ann Neurol. (2015) 77:582–91. doi: 10.1002/ana.24335
31. Paul KC, Schulz J, Bronstein JM, Lill CM, Ritz BR. Association of polygenic risk score with cognitive decline and motor progression in Parkinson disease. JAMA Neurol. (2018) 75:360. doi: 10.1001/jamaneurol.2017.4206
32. Caligiore D, Helmich RC, Hallett M, Moustafa AA, Timmermann L, Toni I, et al. Parkinson's disease as a system-level disorder. NPJ Park Dis. (2016) 2:1–9. doi: 10.1038/npjparkd.2016.25
33. Sterling NW, Lewis MM, Du G, Huang X. Structural imaging and Parkinson's disease: moving toward quantitative markers of disease progression. J Parkinsons Dis. (2016) 6:557–67. doi: 10.3233/JPD-160824
34. Ferrazzoli D, Ortelli P, Madeo G, Giladi N, Petzinger GM, Frazzitta G. Basal ganglia and beyond: the interplay between motor and cognitive aspects in Parkinson's disease rehabilitation. Neurosci Biobehav Rev. (2018) 90:294–308. doi: 10.1016/j.neubiorev.2018.05.007
35. Prell T. Structural and functional brain patterns of non-motor syndromes in Parkinson's disease. Front Neurol. (2018) 9:138. doi: 10.3389/fneur.2018.00138
36. Postuma RB, Berg D, Stern M, Poewe W, Olanow CW, Oertel W, et al. MDS clinical diagnostic criteria for Parkinson's disease. Mov Disord. (2015) 30:1591–601. doi: 10.1002/mds.26424
37. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. (2009) 25:1754–60. doi: 10.1093/bioinformatics/btp324
38. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. (2011) 43:491. doi: 10.1038/ng.806
39. Chang CC, Chow CC, Tellier LCAM, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience. (2015) 4:7. doi: 10.1186/s13742-015-0047-8
40. Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. (2010) 38:e164. doi: 10.1093/nar/gkq603
41. McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GRS, Thormann A, et al. The Ensembl Variant Effect Predictor. Genome Biol. (2016) 17:122. doi: 10.1186/s13059-016-0974-4
42. The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature. (2015) 526:68. doi: 10.1038/nature15393
43. Auer PL, Reiner AP, Wang G, Kang HM, Abecasis GR, Altshuler D, et al. Guidelines for large-scale sequence-based complex trait association studies: lessons learned from the NHLBI exome sequencing project. Am J Hum Genet. (2016) 99:791–801. doi: 10.1016/j.ajhg.2016.08.012
44. Lek M, Karczewski KJ, Minikel EV, Samocha KE, Banks E, Fennell T, et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature. (2016) 536:285. doi: 10.1038/nature19057
45. R Core Team. R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing (2015). Available online at: http://www.r-project.org/
46. Hentz JG, Mehta SH, Shill HA, Driver-Dunckley E, Beach TG, Adler CH. Simplified conversion method for unified Parkinson's disease rating scale motor examinations. Mov Disord. (2015) 30:1967–70. doi: 10.1002/mds.26435
47. Conti S, Bonazzi S, Laiacona M, Masina M, Coralli MV. Montreal cognitive assessment (MoCA)-Italian version: regression based norms and equivalent scores. Neurol Sci. (2015) 36:209–14. doi: 10.1007/s10072-014-1921-3
48. Cova I, Di Battista ME, Vanacore N, Papi CP, Alampi G, Rubino A, et al. Validation of the Italian version of the non motor symptoms scale for Parkinson's disease. Park Relat Disord. (2017) 34:38–42. doi: 10.1016/j.parkreldis.2016.10.020
49. Mata IF, Leverenz JB, Weintraub D, Trojanowski JQ, Chen-Plotkin A, Van Deerlin VM, et al. GBA Variants are associated with a distinct pattern of cognitive deficits in Parkinson's disease. Mov Disord. (2016) 31:95–102. doi: 10.1002/mds.26359
50. Dan X, Wang C, Zhang J, Gu Z, Zhou Y, Ma J, et al. Association between common genetic risk variants and depression in Parkinson's disease: a dPD study in Chinese. Parkinsonism Relat Disord. (2016) 33:122–6. doi: 10.1016/j.parkreldis.2016.09.029
51. Cooper CA, Jain N, Gallagher MD, Weintraub D, Xie SX, Berlyand Y, et al. Common variant rs356182 near SNCA defines a Parkinson's disease endophenotype. Ann Clin Transl Neurol. (2017) 4:15–25. doi: 10.1002/acn3.371
52. Kang HM, Sul JH, Service SK, Zaitlen NA, Kong S-Y, Freimer NB, et al. Variance component model to account for sample structure in genome-wide association studies. Nat Genet. (2010) 42:348–54. doi: 10.1038/ng.548
53. van der Sluis S, Posthuma D, Dolan CV. TATES: efficient multivariate genotype-phenotype analysis for genome-wide association studies. PLOS Genet. (2013) 9:e1003235. doi: 10.1371/journal.pgen.1003235
54. Li M-X, Yeung JMY, Cherny SS, Sham PC. Evaluating the effective numbers of independent tests and significant p-value thresholds in commercial genotyping arrays and public imputation reference datasets. Hum Genet. (2012) 131:747–56. doi: 10.1007/s00439-011-1118-2
55. Hibar DP, Stein JL, Renteria ME, Arias-Vasquez A, Desrivieres S, Jahanshad N, et al. Common genetic variants influence human subcortical brain structures. Nature. (2015) 520:224–9. doi: 10.1038/nature14101
56. Choi SW, O'Reilly PF. PRSice-2: polygenic risk score software for biobank-scale data. Gigascience. (2019) 8:giz082. doi: 10.1093/gigascience/giz082
57. Zeighami Y, Ulla M, Iturria-Medina Y, Dadar M, Zhang Y, Larcher KMH, et al. Network structure of brain atrophy in de novo Parkinson's disease. eLife. (2015) 4:e08440. doi: 10.7554/eLife.08440
58. Gialluisi A, Andlauer TFM, Mirza-Schreiber N, Moll K, Becker J, Hoffmann P, et al. Genome-wide association scan identifies new variants associated with a cognitive predictor of dyslexia. Transl Psychiatry. (2019) 9:77. doi: 10.1038/s41398-019-0402-0
59. Li J, Ji L. Adjusting multiple testing in multilocus analyses using the eigenvalues of a correlation matrix. Heredity. (2005) 95:221–7. doi: 10.1038/sj.hdy.6800717
60. He J, Zhang Q-J, Lin Q-F, Chen Y-F, Lin X-Z, Lin M-T, et al. Molecular analysis of SMN1, SMN2, NAIP, GTF2H2, and H4F5 genes in 157 Chinese patients with spinal muscular atrophy. Gene. (2013) 518:325–9. doi: 10.1016/j.gene.2012.12.109
61. Iyer N, Reagan MS, Wu KJ, Canagarajah B, Friedberg EC. Interactions involving the human RNA polymerase II transcription/nucleotide excision repair complex TFIIH, the nucleotide excision repair protein XPG, and Cockayne syndrome group B (CSB) protein. Biochemistry. (1996) 35:2157–67. doi: 10.1021/bi9524124
62. Pavel M, Imarisio S, Menzies FM, Jimenez-Sanchez M, Siddiqi FH, Wu X, et al. CCT complex restricts neuropathogenic protein aggregation via autophagy. Nat Commun. (2016) 7:13821. doi: 10.1038/ncomms13821
63. Ferrari R, Wang Y, Vandrovcova J, Guelfi S, Karch CM, Schork AJ, et al. Genetic architecture of sporadic frontotemporal dementia and overlap with Alzheimer's and Parkinson's diseases. J Neurol Neurosurg Psychiatry. (2017) 88:152–64. doi: 10.1136/jnnp-2016-314411
64. Rosenberg-Katz K, Herman T, Jacob Y, Kliper E, Giladi N, Hausdorff JM. Subcortical volumes differ in Parkinson's disease motor subtypes: new insights into the pathophysiology of disparate symptoms. Front Hum Neurosci. (2016) 10:356. doi: 10.3389/fnhum.2016.00356
65. Harding AJ, Stimson E, Henderson JM, Halliday GM. Clinical correlates of selective pathology in the amygdala of patients with Parkinson's disease. Brain. (2002) 125:2431–45. doi: 10.1093/brain/awf251
66. Garg A, Appel-Cresswell S, Popuri K, McKeown MJ, Beg MF. Morphological alterations in the caudate, putamen, pallidum, and thalamus in Parkinson's disease. Front Neurosci. (2015) 9:101. doi: 10.3389/fnins.2015.00101
67. Tanner JJ, McFarland NR, Price CC. Striatal and hippocampal atrophy in idiopathic Parkinson's disease patients without dementia: a morphometric analysis. Front Neurol. (2017) 8:139. doi: 10.3389/fneur.2017.00139
Keywords: Parkinson disease, genetics, whole exome sequencing, cognitive performance, motor symptoms, non-motor symptoms, subcortical volumes, polygenic scores
Citation: Gialluisi A, Reccia MG, Tirozzi A, Nutile T, Lombardi A, De Sanctis C, International Parkinson's Disease Genomic Consortium (IPDGC), Varanese S, Pietracupa S, Modugno N, Simeone A, Ciullo M and Esposito T (2020) Whole Exome Sequencing Study of Parkinson Disease and Related Endophenotypes in the Italian Population. Front. Neurol. 10:1362. doi: 10.3389/fneur.2019.01362
Received: 27 July 2019; Accepted: 10 December 2019;
Published: 10 January 2020.
Edited by:
Jingyun Yang, Rush University Medical Center, United States
Reviewed by:
Chuntao Zhao, Cincinnati Children's Hospital Medical Center, United States
Jun Mitsui, The University of Tokyo, Japan
Copyright © 2020 Gialluisi, Reccia, Tirozzi, Nutile, Lombardi, De Sanctis, International Parkinson's Disease Genomic Consortium (IPDGC), Varanese, Pietracupa, Modugno, Simeone, Ciullo and Esposito. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Teresa Esposito, teresa.esposito@igb.cnr.it
†These authors have contributed equally to this work
‡A list of members of the IPDGC can be found in Supplementary File 1