- 1Department of Psychiatry, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, Hong Kong
- 2Department of Systems Biology, Department of Pediatrics, Columbia University Medical Center, New York, NY, United States
- 3Department of Pharmacology and Pharmacy, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, Hong Kong
- 4Li Ka Shing Faculty of Medicine, Center for Genomic Sciences, The University of Hong Kong, Hong Kong, Hong Kong
- 5Department of Orthopaedic Surgery, Faculty of Life Sciences, Kumamoto University, Kumamoto City, Japan
- 6Medical Research Center Oulu, University of Oulu and Oulu University Hospital, Oulu, Finland
- 7Department of Orthopaedics and Traumatology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, Hong Kong
- 8Hebrew SeniorLife, Institute for Aging Research, Roslindale, MA, United States
- 9Harvard Medical School, Boston, MA, United States
- 10Molecular and Integrative Physiological Sciences Program, Harvard School of Public Health, Boston, MA, United States
- 11Li Ka Shing Faculty of Medicine, School of Biomedical Science, The University of Hong Kong, Hong Kong, Hong Kong
- 12Department of Orthopedic Surgery, National Defense Medical College, Tokorozawa, Saitama, Japan
- 13Department of Orthopaedic Surgery, Toyama University, Toyama Prefecture, Japan
- 14Laboratory of Bone and Joint Diseases, Center for Integrative Medical Sciences, RIKEN, Tokyo, Japan
Lumbar disc degeneration (LDD) is age-related break-down in the fibrocartilaginous joints between lumbar vertebrae. It is a major cause of low back pain and is conventionally assessed by magnetic resonance imaging (MRI). Like most other complex traits, LDD is likely polygenic and influenced by both genetic and environmental factors. However, genome-wide association studies (GWASs) of LDD have uncovered few susceptibility loci due to the limited sample size. Previous epidemiology studies of LDD also reported multiple heritable risk factors, including height, body mass index (BMI), bone mineral density (BMD), lipid levels, etc. Genetics can help elucidate causality between traits and suggest loci with pleiotropic effects. One such approach is polygenic score (PGS) which summarizes the effect of multiple variants by the summation of alleles weighted by estimated effects from GWAS. To investigate genetic overlaps of LDD and related heritable risk factors, we calculated the PGS of height, BMI, BMD and lipid levels in a Chinese population-based cohort with spine MRI examination and a Japanese case-control cohort of lumbar disc herniation (LDH) requiring surgery. Because most large-scale GWASs were done in European populations, PGS of corresponding traits were created using weights from European GWASs. We calibrated their prediction performance in independent Chinese samples, then tested associations with MRI-derived LDD scores and LDH affection status. The PGS of height, BMI, BMD and lipid levels were strongly associated with respective phenotypes in Chinese, but phenotype variances explained were lower than in Europeans which would reduce the power to detect genetic overlaps. Despite of this, the PGS of BMI and lumbar spine BMD were significantly associated with LDD scores; and the PGS of height was associated with the increased the liability of LDH. Furthermore, linkage disequilibrium score regression suggested that, osteoarthritis, another degenerative disorder that shares common features with LDD, also showed genetic correlations with height, BMI and BMD. The findings suggest a common key contribution of biomechanical stress to the pathogenesis of LDD and will direct the future search for pleiotropic genes.
Introduction
Human intervertebral discs (IVDs) are fibrocartilaginous structures that lie between adjacent vertebrae. These IVDs hold the vertebrae together, facilitate some vertebral motion, and act as shock absorbers to accommodate biomechanical loads (Oxland, 2016). IVD is composed of a gel-like nucleus pulposus surrounded by an annulus fibrosis and separated from the vertebral body by a cartilaginous endplate (Humzah and Soames, 1988). During one's lifetime, due to excessive physical loading, occupational injuries, aging, genetics, and other factors, the IVDs may degenerate and display marked biochemical and morphological changes (Buckwalter, 1995; Urban and Roberts, 2003). Currently, magnetic resonance imaging (MRI) is the gold-standard for evaluating disc degeneration. Based on this imaging, numerous methods are available to grade and summarize different features indicative of degeneration, including signal intensity loss, bulging and herniation, as well as disc space narrowing (Battié et al., 2004; Cheung et al., 2009). Lumbar disc degeneration (LDD) is of clinical importance because it is believed to be a major cause of low back pain (Luoma et al., 2000; Livshits et al., 2011; Samartzis et al., 2011; Takatalo et al., 2011). Its severe form lumbar disc herniation (LDH), in which disc material herniates into the epidural space and compresses a lumbar nerve root, can cause neuropathic pains (sciatica) radiating to the lower extremity (Ropper and Zafonte, 2015).
Twin studies have demonstrated a strong genetic contribution to LDD (Sambrook et al., 1999; Battié et al., 2008). However, searching for genetic variants associated with LDD has been a challenge due to discrepancies and non-standardization of phenotype definitions, inconsistencies with imaging technology, and limited sample sizes in genome-wide association studies (GWASs) (Eskola et al., 2012, 2014; Williams et al., 2013). Similar to most other complex traits, LDD is likely to be polygenic with thousands of trait-associated variants each of which has tiny effect size.
In addition to age, sex, and environmental influences, LDD is also associated with several heritable risk factors including body mass index (BMI) (Liuke et al., 2005; Samartzis et al., 2011, 2012; Takatalo et al., 2013), bone mineral density (BMD) (Harada et al., 1998; Pye et al., 2006; Wang et al., 2011), and serum lipid levels (Leino-Arjas et al., 2008; Longo et al., 2011; Zhang et al., 2016). But it is not fully clear if there is a genetic basis for these phenotype associations. Identifying genetic overlaps between LDD and related traits will be useful for elucidating cause and effect because genetic markers are not subject to reverse causation or confounding and can be used as an instrument to infer causality using Mendelian randomization (Davey Smith and Hemani, 2014), and it can also suggest pleiotropic loci that reveal novel insights into biology (Solovieff et al., 2013).
Several methods have been developed to evaluate genetic overlap between traits by exploiting the polygenic architecture (Dudbridge, 2016). A polygenic score (PGS) of a trait is the summation of alleles across loci weighted by their effect sizes estimated from GWAS (Purcell et al., 2009). In its typical application, GWAS of a base phenotype is first done in a discovery sample. PGS can be calculated in an independent testing sample using single-nucleotide polymorphisms (SNPs) whose p-values are below some threshold in the GWAS of discovery sample. It can then be used as a predictor of target phenotypes in the testing sample using regression analysis. PGS has been widely used to predict disease risk (Chatterjee et al., 2016), evaluate genetic overlaps across traits (Krapohl et al., 2016), and infer genetic architectures (Stahl et al., 2012; Palla and Dudbridge, 2015). Because phenotyping of LDD by MRI is expensive and labor intensive, sample sizes are usually limited for well-phenotyped cohorts. PGS can leverage GWAS meta-analysis results from large consortia to maximize the power to detect genetic overlaps and is most suitable for the current study of LDD. Some other methods, such as bivariate linear mixed-effect model (Lee et al., 2012b; Vattikuti et al., 2012) would require genotypes of individuals of base and target phenotypes. Recently developed linkage disequilibrium score (LDSC) regression (Bulik-Sullivan et al., 2015) makes use of only summary-level association statistics and can account for sample overlaps between different studies, but it requires very large sample sizes that has not been available for LDD.
In this study, we applied PGS to investigate the genetic overlap of LDD with four related risk factors using the GWAS data of Hong Kong Disc Degeneration (HKDD) population-based cohort (Cheung et al., 2009; Samartzis et al., 2012; Li et al., 2016) and a Japanese case-control cohort of LDH that required surgery (Song et al., 2013). We selected BMI, BMD and serum lipids levels as base phenotypes, based on their previous reported associations with LDD (Pye et al., 2006; Longo et al., 2011; Samartzis et al., 2012). Height was also included because its association with chronic low back pain (Hershkovich et al., 2013; Heuch et al., 2015). Two semi-quantitative scores that summarize different aspects of LDD from lumbar spine MRI were used as target phenotypes in the HKDD cohort; LDH affection status was used as the third target phenotype in the Japanese case-control cohort. Because GWASs of base phenotypes were done in European populations whereas our testing samples were of East Asian ancestry, the performance of PGS in predicting base phenotypes was first evaluated in independent Chinese samples. Then we applied the best performing PGS of the base phenotype to test association with target phenotypes in testing samples (Figure 1). Results were then interpreted in light of previous epidemiological evidence and statistical power to detect association. To better understand the mechanism implied by the genetic overlaps and motivated by the suggestion that LDD and osteoarthritis (OA) may share common pathophysiological features (Loughlin, 2011; Ikegawa, 2013), we further tested if the base phenotypes that had genetic overlaps with LDD also showed genetic correlations with OA using the GWAS summary data of the arcOGEN study (Zeggini et al., 2012). Finally, we evaluated the predictive power of trans-ethnic PGS to aid the design of future studies.
Figure 1. The analysis framework. GWAS summary statistics of base phenotypes were obtained from published studies in European populations. The polygenic score (PGS) in a testing sample of East Asian population was calculated by weighted summation of alleles at approximately independent SNPs whose association p-values fall below some threshold in the discovery GWAS done in European populations. The performance of PGS to predict the base phenotype in East Asians was first evaluated in a validation sample. Then the best performing PGS of the base phenotype was used to test genetic overlap with lumbar disc degeneration (LDD) in the testing samples. In this study, we selected height, body mass index (BMI), bone mineral density (BMD) and lipid levels as base phenotypes. The prediction performance of PGS was evaluated in the HKDD cohort for height, BMI, and lipid levels, and in the HKOS cohort for BMD. Three LDD phenotypes were used as target phenotypes, including disc displacement and disc degeneration scores in the HKDD cohort and affection status of lumbar disc herniation (LDH) in the Japanese LDH case-control cohort.
Materials and Methods
Study Samples
HKDD Cohort
The HKDD Study was a population-based cohort of approximately 3,500 Southern Chinese initiated to assess spinal phenotypes and their risk factors. All participants underwent T2-weighted MRI examination of the lumbar spine assessed by expert physicians (JK and KMC). Sample recruitment and MRI procedures have been described in detail previously (Cheung et al., 2009; Samartzis et al., 2012; Li et al., 2016). For the current study, we focused on two major aspects of LDD captured by different MRI features (Figure 2a). The first was signal intensity loss within nucleus pulposus, which may represents loss of water content of IVD. Its presence and severity at each lumbar disc was assessed by the Schneiderman's grades (Schneiderman et al., 1987). Based on this grading scheme each disc was given a score of 0–3, whereby 0 indicated normal and higher scores indicated increased severity. A disc degeneration score for each individual was calculated by the summation of Schneiderman's grades over all five lumbar discs. We also assessed disc displacement, represented as a bulging/protrusion or extrusion of disc material. An ordinal grade from 0 to 2 was assigned to each lumbar disc to indicate normal, bulge/protrusion or extrusion of disc material; for each individual, a disc displacement score was calculated by the summation of the grades over all five lumbar discs (Cheung et al., 2009). Age, sex, physical workload based on occupation, history of smoking, and history of lumbar spine injury were obtained by a questionnaire for all participants. Body height and weight were measured at the time when each subject underwent MRI, and BMI was calculated by dividing weight by height squared (kg/m2). A subset of the cohort (N = 815) also had their blood metabolite profiles measured by quantitative serum nuclear magnetic resonance (NMR) platform (Soininen et al., 2009, 2015). Low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), triglycerides (TG) and total cholesterol (TC) were obtained as part of NMR metabolite measures. Association between LDD scores and other covariates were analyzed using multiple linear regression to account for correlations between predictor variables. The best fitting model was selected using Akaike information criterion.
Figure 2. Summary of phenotypes in the HKDD cohort. (a) Examples of magnetic resonance imaging show two major aspects of LDD. Disc displacement (left) is shown as bulging of disc material beyond confine of annulus fibrosus. The loss of proteoglycan and water content (right) within nucleus pulposus is reflected by the signal intensity loss. The lumbar spine has 5 intervertebral segments, termed L1 through L5. S1 stands for the first segment of sacral that is intermediately below the lumbar spine. (b) The prevalence of signal intensity loss and disc displacement at different levels of lumbar spine discs. Two ordinal grades (0–3 for signal intensity loss, 0–2 for disc displacement) were assigned to each lumbar disc to indicate the presence and severity of LDD, where 0 indicated normal and higher scores indicated increased severity. (c) The distribution of two LDD scores. The disc degeneration score and displacement score were defined by the summation of grades over all disc levels for signal intensity loss and disc displacement respectively. The two LDD scores are correlated in the population. The age threshold divides the HKDD cohort in two parts with roughly equal sample sizes. The older subjects tend to have higher disc degeneration scores and disc displacement scores. (d) Pairwise Pearson correlations between original phenotypes (upper triangle) and between residual phenotypes after adjusting for age and gender (lower triangle).
A total of 2,373 individuals from the HKDD cohort were genotyped by Illumina HumanOmni-ZhongHua-8 Beadchip. Basic genotyping and quality control (QC) procedures have been described in our previous study (Li et al., 2016). In this study, we used more stringent QC criteria that keep only individuals with a call rate >99% and common SNPs with minor allele frequency (MAF) >0.01. The genotypes were imputed to over 8 million common variants in Phase 3 of 1,000 Genomes reference panel using the Michigan Imputation Server (Das et al., 2016) and filtered to keep only common bi-allelic SNPs (MAF>0.01) with imputation quality metrics r2 ≥ 0.3.
LDH Case-Control Cohort
The Japanese LDH case-control cohort was part of our previous genetic study of LDD (Song et al., 2013). Hospitalized patients LDH were ascertained on the basis of sciatica or severe low back pain requiring surgical treatment and confirmed by lumbar spine MRI. The controls were unrelated individuals from general Japanese population as part of Japan Biobank Project. All individuals were genotyped by Illumina HumanHap550v3 BeadChip. A total of 366 cases and 3,331 controls passed QC and were used in the association analysis. Genotypes were imputed to 2.5 million SNPs in Phase 2 HapMap Project using IMPUTE2 (Howie et al., 2009), and association analysis at each SNP was performed by logistic regression assuming an additive model using SNPTEST (Marchini et al., 2007).
HKOS GWAS
Hong Kong Osteoporosis Study (HKOS) was a prospective cohort study of over 9,000 Southern Chinese residents in Hong Kong (Cheung et al., 2017). BMD of the lumbar spine (LS-BMD) and femoral neck (FN-BMD) were measured by dual-energy X-ray absorptiometry. The age-corrected and standardized BMD was generated for each gender. A total of 800 unrelated females with extreme BMD were selected in the previous GWAS (Kung et al., 2010). The low BMD subjects were those with BMD Z-score ≤ −1.28 at either the LS or FN; the high BMD subjects were those with BMD Z-score ≥1.0 at either of the two skeletal sites. All individuals were genotyped by Illumina HumanHap610Quad Beadchip, whereby 780 individuals passed QC. Association analysis at each SNP was performed by linear regression using PLINK (Purcell et al., 2007). Detailed genotyping, QC, and imputation procedures have been described elsewhere (Kung et al., 2010; Xiao et al., 2012).
arcOGEN Study
The arcOGEN study (http://www.arcogen.org.uk/) was a collection of unrelated, UK-based individuals of European ancestry with knee and/or hip OA from the arcOGEN Consortium (Panoutsopoulou et al., 2011; Zeggini et al., 2012). Cases were ascertained based on clinical evidence with a need of joint replacement or radiographic evidence of disease (Kellgren-Lawrence grade ≥2), controls were from ancestry-matched (UK) population. A GWAS meta-analysis that included 7,410 cases and 11,009 controls as the discovery sample has been described previously (Zeggini et al., 2012). The summary statistics of the discovery GWAS were obtained by application to the consortium.
All studies were approved by local ethical committees. Written informed consent was obtained from all participants.
Statistical Analysis
Polygenic Score Regression
We selected height, BMI, BMD, and serum lipid levels as base phenotypes, and obtained GWAS summary data (Table 1). PGS was created by the two strategies as described below and used to predict phenotypes through linear regression after accounting for other covariates. The non-genetic covariates were selected based on their association with each phenotype in the baseline multiple linear regression model (listed in footnotes of Table 3 and Table S7). We validated the prediction performance of PGS on base phenotypes in the HKDD cohort (for height, BMI, and lipid levels) and HKOS cohort (for LS- and FN-BMD). Then, the PGS with best prediction performance for each base phenotype was used to test genetic overlap with LDD by predicting two LDD scores in the HKDD cohort and the LDH affection status in the Japanese case-control cohort. When individual-level genotype data in the testing sample were not available (for the HKOS GWAS and LDH case-control cohort), PGS regression was performed using summary statistics based algorithm implemented in gtx R package (Johnson, 2012), and SNP genotypes of HapMap 3 East Asian samples were used as the reference panel for SNP clumping.
The first strategy for calculating PGS only used known trait-associated SNPs that reached genome-wide significance in previous studies (GWAS hits). Individual PGS profiles were calculated by summing up the dosage of trait-increasing alleles from imputed genotypes weighted by the reported effect sizes. This strategy has the advantage of including secondary signals within the same locus and increased accuracy of effect size estimates from a larger independent replication sample (for LS- and FN-BMD).
As a second strategy, we performed genome-wide PGS analysis using PRsice (Euesden et al., 2015). Briefly, summary statistics of base GWAS was first aligned with genotyped SNPs of the testing sample. Then SNPs were pruned based on p-value informed clumping algorithm [linkage disequilibrium (LD) r2 < 0.1 across 500 kb] that selected SNPs most associated with the base phenotype in a locus to generate sets of independent SNPs. PGS was created using clumped SNPs whose p-value in the base GWAS were below pre-specified threshold. We varied p-value thresholds (1.0E-7, 1.0E-5, and from 1.0E-4 to 0.5 with a step of 0.0001) to select the one that maximized the variance explained (R2) for the base phenotype in the validation sample.
Correcting Sample Overlap and Extreme Selection
The HKOS GWAS was part of BMD GWAS meta-analysis conducted by the GEFOS consortium. To make it an independent testing data, we inverted the fixed effect meta-analysis to subtract the contribution of HKOS from the GEFOS summary statistics (Appendix 1 in the Supplementary Material). When calculating PGS using the BMD GWAS hits, we used effect estimates from the stage II replication sample to avoid the issue of overlapping sample.
The HKOS GWAS adopted an extreme phenotype design to increase association power, which also resulted in upward biased estimates of R2 by PGS using linear regression. To get an estimate of R2 in the unselected sample () for comparing with the previous report, we corrected the R2 estimate in the selected sample () by:
where f is the increased phenotype variance due to extreme phenotype selection (= 2.739 in the HKOS GWAS sample). The derivation and validation of this approximation formula is given in Appendix 3 (Supplementary Material).
R2 for Case-Control Data on the Liability Scale
For the case-control data, it is meaningful to estimate the disease liability explained by PGS under the liability threshold model (Falconer and Mackay, 1996), so that the result can be compared to the heritability of LDH (Heikkila et al., 1989). We first converted summary statistics generated by the logistic regression to those of linear regression by first-order approximation. Then we used summary statistics based PGS regression to obtain an estimate of R2 on the observed scale. Finally, the observed R2 was converted to the liability scale using the transformation formula by Lee et al. (2012a) assuming the disease prevalence of 0.02 (Jordan et al., 2009). More details are given in Appendix 4 (Supplementary Material).
Inference of Genetic Architecture and Projecting Prediction Performance
We applied AVENGEME (Palla and Dudbridge, 2015) to estimate parameters of genetic architecture of height and BMI from the PGS results. The procedure is described in Appendix 2 (Supplementary Material). Briefly, for a presumed SNP heritability , the method estimated the fraction of markers that are null () and genetic correlation between the discovery and testing samples (). If the genetic architectures of the discovery and testing sample are the same, then the genetic correlation between two samples can be estimated as .
The same model was also applied to predict the expected R2 for height and BMI in Chinese population under different study designs. To project R2 using PGS created by weights from European GWAS, model parameters were set to the maximum likelihood estimates fitted to the observed PGS results. We also increased the discovery sample size by 500,000 to evaluate the increase of R2 in the future. To predict R2 using PGS created by weights from East Asian GWAS, we used the same set of model parameters, but set the discovery GWAS sample size to match the published study of East Asians (Wen et al., 2014; He et al., 2015) and assumed no heterogeneity of effect sizes between the discovery and testing samples (). To incorporate between-sample heterogeneity within East Asians, we changed between population genetic correlation to 0.9 (so ), which is a lower bound for height and BMI in Europeans (de Vlaming et al., 2017) and in different Chinese GWAS samples (data not shown).
SNP Heritabilities
Phenotypes analyzed in the HKDD cohort were adjusted by covariates that are associated with the phenotype (listed in Table 2) by linear regression; and residues were inverse normal transformed when necessary. SNP heritabilities of the adjusted phenotypes were estimated using GCTA v1.25 (Yang et al., 2011a) after excluding individuals so that no pair of individuals had estimated coefficient of relatedness >0.05 as recommended by the GCTA developers (Yang et al., 2017).
Estimating Genetic Correlation Between Anthropometric Traits and OA
To test the genetic overlaps of OA with height, BMI and BMD, we applied LDSC regression (Bulik-Sullivan et al., 2015) to the GWAS summary statistics following the recommended procedure. BMD summary statistics were corrected to remove the contribution from the HKOS GWAS (the only non-European study) as in PGS regression.
Due to sample overlaps in different European GWASs, PGS could not have been applied to the genome-wide summary data. But for known BMD associated SNPs whose effect size estimates from an independent replication sample were available (Estrada et al., 2012), we also used PGS regression under summary statistic mode to assess the genetic correlation between osteoarthritis and BMD.
Power Analysis
Power calculation was done assuming the test statistics follows non-central chi-squared distribution under the alternative hypothesis. The non-centrality parameter for quantitative trait is , where N is the sample size and R2 is the phenotype variance explained by PGS. For binary trait, R2 in the above formula is on the observed scale and can be converted from liability scale using Lee et al. (2012a)'s formula as described in Appendix 4 (Supplementary Material).
Results
Phenotype Summary of the HKDD Cohort
A total of 2,054 unrelated Chinese subjects in the HKDD cohort (60% were females) were included in polygenic analysis. The basic demographic and phenotype summary are shown in Table S1. Both signal intensity loss and disc displacement showed a higher prevalence and severity at lower lumbar levels (Figure 2b). The disc degeneration and disc displacement scores for each individual were calculated by the summation of grades over all levels. Consistent with a major effect of aging, older individuals tend to have higher disc degeneration and displacement scores (Figure 2c). The two LDD scores were correlated with each other (r = 0.57; Figure 2c). Both of them were also positively correlated with height, body weight, BMI, and lumbar spine injury (P < 0.001; Figure 2d, Table S2A); the correlation remained significant for all except injury after correcting for the effect of age and gender (Figure 2d, Table S2B). Multiple linear regression analysis showed that the best fitting models for both disc degeneration and displacement scores included age, sex, lumbar injury, height and BMI as covariates (Table S3), which together explained 21.5 and 9.6% phenotype variances respectively. The SNP heritability estimates for height and BMI in the HKDD cohort were 0.38 (±0.18) and 0.25 (±0.17), similar to the previous reports in Europeans (Yang et al., 2010, 2011b). For disc degeneration and displacement scores, after adjusting for known covariates, SNP heritability estimates were about 0.2~0.3 (Table 2).
Evaluating the Prediction Performance of PGS of Anthropometric Traits
We first evaluated prediction performance of PGS in Chinese samples and compared them with Europeans. PGS profiles of height and BMI were created in the HKDD cohort using known trait-associated SNPs identified by the GIANT consortium GWAS meta-analyses. They explained 5.7 and 1.2% of height and BMI variances respectively (P < 1.0E-10 for height, P = 1.6E-07 for BMI), after adjusting for age, sex and principle components. It is 2~3-fold lower than previous reports in independent European samples, which were 16% for height (Wood et al., 2014) and 2.7% for BMI (Locke et al., 2015). The prediction performance of BMD associated SNPs reported by GEFOS consortium were tested in the HKOS GWAS sample (Kung et al., 2010). After correcting for extreme phenotype selection (Appendix 3 in the Supplementary Material), the known BMD-associated SNPs explained 3.4% and 3.0% variance of LS-BMD and FN-BMD in Chinese population (P < 1.0E-10), also lower than previous reported ~5% in Europeans (Estrada et al., 2012).
Since GWAS hits may only explain a small proportion of phenotype variance, we extended PGS analysis to make use of whole-genome summary statistics (Figure 3). As the p-value threshold of the discovery GWAS increases, both true and false positive SNPs will be included in the PGS. The p-value threshold that optimized phenotype prediction depends on the discovery sample size and unknown genetic architecture (Chatterjee et al., 2013; Dudbridge, 2013), and should be determined empirically. At the optimal p-value threshold, we found the phenotype variance explained is similar to that using GWAS hits for FN-BMD, marginally improved for height, slightly worse for LS-BMD, and more than doubled for BMI (R2 = 2.6%, P < 1.0E-10). Theoretical model fitting under a range of plausible parameters (Table S4) suggested that BMD had smaller fraction of trait associated SNPs (0.5~0.6%) with larger effect sizes compared with BMI and height (estimated fraction of non-null markers: 14~17%), which explained why sparse PGS models showed better prediction performance for BMD. The trait variances explained by PGS predicted by the models generally captures the trend of empirical observations at different p-value thresholds (Figure 3). The estimate of between-population genetic covariance for each phenotype was consistently lower than the presumed heritability (Table S4), reflecting trans-ethnic heterogeneity in effect sizes.
Figure 3. Prediction performance of polygenic scores (PGS) on four base phenotypes in Chinese population. Phenotype variances explained (R2) are shown at different p-value thresholds. The gray lines are the predicted R2 based on the theoretical model of Dudbridge (2013) with parameters given in Table S4. Different parameter sets of each model give similar results. PGS of height (A) and BMI (B) were tested on HKDD cohort; PGS of bone mineral density at the lumbar spine (LS-BMD, C), and femoral neck (FN-BMD, D), were evaluated on HKOS sample.
Testing Genetic Overlap Between Anthropometric Traits and LDD
We then applied PGS to test genetic overlaps between anthropometric traits and LDD (Table 3). In the HKDD cohort, the BMI PGS at its optimal threshold was positively associated with both disc displacement score (R2 = 0.29%, P = 0.015) and disc degeneration score (R2 = 0.31%, P = 0.011) after adjusting for sex, age and lumbar injury. The results are consistent with obesity as a major risk factor for LDD development and progression (Hassett et al., 2003; Hangai et al., 2008). The associations remained significant (P < 0.05) after further adjusting for height but disappeared after adjusting for BMI or body weight (Table S5). The PGS of LS-BMD were positively associated with disc displacement score (R2 ≈ 0.2%; P < 0.05) and remained significant (P < 0.05) after further adjusting for height, BMI or weight (Table S6). The same trend was also observed for FN-BMD but did not reach significance. The finding supports the previous reported genetic correlation between BMD and disc bulge in a twin study (Livshits et al., 2010). The lack of association with disc degeneration score is also consistent with the previous study that showed a smaller effect size between BMD and disc signal intensity on MRI (Livshits et al., 2010).
In addition to LDD scores in the general population, we also applied PGS to predict case-control status of symptomatic LDH (Song et al., 2013). The height PGS was positively associated with LDH (P < 0.01) and explained 0.35% of disease liability. The association cannot be explained by body weight, because the BMI PGS is better associated with weight but does not show association with LDH (P > 0.5). The result provides a genetic basis to the previous epidemiological observation that being tall is a risk factor for hospitalization due to LDH (Wahlstrom et al., 2012) and back surgery (Coeuret-Pellicer et al., 2010).
Testing Genetic Overlap Between Lipid Levels and LDD
Previous studies also reported that increased level of LDL-C, TC, and TG were associated with increased risk of LDH (Leino-Arjas et al., 2008; Longo et al., 2011; Zhang et al., 2016). To test if serum lipid levels have genetic correlation with LDD, we did similar PGS analysis using known lipid associated SNPs and GWAS summary data from the Global Lipids Genetic Consortium (Willer et al., 2013). Prediction performance of PGS was first evaluated in a subset of the HKDD cohort (N = 620 with genotypes) whose lipid levels were measured by the high-throughput NMR approach. All PGS were significantly associated with the corresponding lipid levels. Except for LDL-C, the PGS of known lipid loci showed the best prediction performance (Figure S1). However, none of them was significantly associated with LDD scores with the expected direction in the HKDD cohort or LDH in the Japanese case-control cohort (Table S7). Directly testing the phenotype association in the HKDD cohort by multiple linear regression also showed no association (Table S8). Therefore, our data does not support the previously suggested role of atherosclerotic lipids in LDD.
Power Consideration
Given sample sizes and study designs, the two testing samples used in this study show similar profiles of statistical power (Figure 4). We have >50% power (at significance level α = 0.05) to detect genetic correlation if PGS explains >0.2% variance of adjusted LDD scores (or LDH disease liability). To achieve the same power at α = 0.01, it would require PGS to explain >0.33% phenotype (liability) variance. But the current study does not have enough power to detect genetic overlap if the PGS explain less than 0.2% variance of LDD scores (or LDH liability). Therefore, we designed this study to only test phenotypes with previous epidemiological evidence for association with LDD. To further reduce the multiple testing burden, we had only used the PGS which was optimal in predicting the corresponding base phenotype to test the genetic overlap with LDD. The results that were nominally significant and consistent with the expected phenotype correlations can be interpreted as supportive evidence of genetic overlaps.
Figure 4. Power to detect association with polygenic score (PGS) in two testing samples. (A) For the HKDD cohort (N = 2,054), given significance level (at α = 0.01 or 0.05), the power is determined by the phenotype variance explained by PGS. (B) For the Japanese case-control cohort (366 cases, 3,331 controls), assuming the disease prevalence of 0.02, the power is a function of the disease liability explained by PGS.
The Association of a Height Associated SNP rs6651255 With LDD Scores
The current study has no power to search for individual SNPs showing pleiotropic associations with LDD and related traits. But we noted that a recent GWAS of LDH with lumbar spine surgery in Iceland population (Bjornsdottir et al., 2017) identified a genome-wide significant SNP rs6651255, which was also a known height associated SNP (Wood et al., 2014). The risk allele T showed an odds ratio of 1.23 and associated with increased height. The same study also reported that increase in the genetically determined height increased the risk of LDH with surgery, but the effect of rs6651255 on LDH was not mediated by height. To replicate this finding in our cohorts, we found an LD proxy rs4733724 (LD r2 = 1 with rs6651255 in 1000 Genomes CEU population) was directly genotyped in the HKDD cohort and reliably imputed in the Japanese case-control cohort. The allele A was coupled to the LDH risk allele and significantly increased both disc displacement and disc degeneration scores (P < 0.05; Table 4). The effects remained significant after further adjusting for height, BMI or body weight. The same allele was also weakly associated with increased height and increased risk of LDH requiring surgery (odds ratio = 1.11), but the results were not significant as the sample sizes limited the power to detect associations with small effect sizes.
Table 4. Association of rs4733724-A allele with lumbar disc degeneration and height in East Asian samples.
Genetic Correlation Between Anthropometric Traits and OA
Finally, LDD has been suggested to share common features with OA which is also known as degenerative joint disease (Loughlin, 2011; Ikegawa, 2013). To test if osteoarthritis also showed genetic overlaps with the same set of traits as LDD, we assessed the genetic correlations of BMI, BMD and height with osteoarthritis using LDSC regression (Table 5). Significant positive genetic correlation was found between BMI and osteoarthritis ( = 0.255, P = 4.0E-07), which is expected given the strong evidence for a causal role of BMI (Panoutsopoulou et al., 2014). Suggestive positive genetic correlations with osteoarthritis were also observed for height (P = 9.5E-03) and LS-BMD (P = 0.012) but not for FN-BMD (P > 0.1).
The genetic correlation between osteoarthritis and LS-BMD was less significant though its effect was stronger than between osteoarthritis and height, which was possibly due to smaller sample size of the BMD GWAS. Since the genetic architecture of BMD was dominated by fewer number of causal SNPs with larger effect sizes (Table S4), it is also possible that LDSC which assumed an infinitesimal model may be less optimal to detect genetic correlations for BMD and other traits. To support this, we calculated PGS of BMD GWAS hits using weights from the second stage replication sample of the GEFOS consortium (Estrada et al., 2012) to predict OA. The PGS of both LS-BMD and FN-BMD were strongly associated with OA case-control status in the acrOGEN sample (R2 = 0.13%, P = 7.8E-07 for LS-BMD and R2 = 0.12%, P = 2.5E-06 for FN-BMD). Taken together, the results suggest that like LDD, OA also shares genetic overlaps with height, BMI and BMD.
Discussion
Between-Population Heterogeneity and Its Impact on Prediction Performance of PGS
In this study, we adopted a trans-ethnic PGS strategy to evaluate the genetic overlaps between different traits where GWAS of base phenotypes were done in Europeans and validation and testing samples were East Asians. Although most GWAS findings were generally replicated in populations different from the initial discovery, heterogeneity commonly existed in the estimated effect sizes (e.g., Carlson et al., 2013; Marigorta and Navarro, 2013), which would reduce the power of PGS to predict phenotypes in populations from a different ethnicity (e.g., Johnson et al., 2015). Consistent with this, we found in Chinese validation samples that variance of height, BMI and BMD explained by the PGS of corresponding GWAS hits were all lower than in Europeans. For height and BMI, assuming the genetic architecture is the same between European and Chinese, the observed PGS results suggest that between-population genetic correlations are about 0.4~0.6 (Table S4, Materials and Methods). The rough estimations are within the range of previous estimate for type 2 diabetes and rheumatoid arthritis between European and East Asian using a different methodology (Brown et al., 2016).
The use of European GWAS in the current study is mainly due to large sample sizes and publicly available summary statistics. GWAS meta-analyses of height and BMI were also conducted in East Asians (Wen et al., 2014; He et al., 2015). Although sample sizes are much smaller (N≈36,000 for height, 87,000 for BMI), they are expected to have more similar effect sizes to the Chinese sample. To evaluate the tradeoff between sample size and effects heterogeneity, we projected expected prediction performance (R2) of height and BMI using a theoretical model with parameters of genetic architecture compatible with the observed PGS results (Materials and Methods). Despite smaller sample sizes, using East Asian GWAS as the discovery sample is expected have comparable maximum R2 as European GWAS to predict height in the Chinese population (Figure 5). For BMI, depending on the presumed SNP heritability, using East Asian GWAS shows comparable or better maximum R2 (Figure 5, Figure S2). Further increase the European GWAS sample sizes by half a million, a scale similar to the on-going UK biobank study, the increase in R2 for height is capped at 8% but roughly doubles for BMI. Notably, when using East Asian GWAS as the discovery sample, the best prediction performance can only be achieved at p-value thresholds >0.01. However, whole-genome summary statistics of East Asian GWASs were not publicly available for us before the start of this study. Also consistent with the theoretical predictions, incorporating East Asian GWAS top hits to the PGS of GWAS hits only marginally increased R2 in predicting height and BMI, and their associations with the LDD scores remained insignificant (Table S9).
Figure 5. Comparing the prediction performance of polygenic score (PGS) in Chinese sample using GWAS of European and East Asian. Expected phenotype variances explained by PGS (R2) were calculated using the theoretical model of Dudbridge (2013) with parameters compatible with the observed PGS results of height and BMI. (A) For height, the latest European GWAS has sample size 252K. Assuming SNP heritability of 0.42 in both populations, expected R2 in Chinese population (red line) was calculated using the parameters best fit to Figure 3. In comparison, the latest East Asian GWAS has sample size only 36K, expected R2 was calculated using the same set of parameters except that we assumed no heterogeneity in effect sizes (i.e., genetic correlation = 1) between discovery and testing sample (blue line). To predict the gain in R2 when using even larger European GWAS in the future, we further increased the discovery GWAS sample size by 500K (red dashed line). We also relaxed the assumption of no heterogeneity within East Asian and calculate expected R2 assuming genetic correlation of 0.9 (blue dashed line). (B) For BMI, European GWAS has sample size 234K; East Asian GWAS has sample size 87K. Expected R2 were calculated similarly as height, assuming SNP heritability of 0.22.
Given genetic architecture and sample sizes, the power of PGS in detecting genetic overlaps is mainly determined by the performance PGS in predicting the corresponding base phenotype. Therefore, the theoretical results suggest that the use of European GWAS as discovery sample in PGS analysis can still be a favorable approach in cross-trait analysis in the East Asian population. But we caution that the trans-ethnic PGS strategy may not be suitable for other populations like African. Nevertheless, whenever possible ancestry-matched GWAS of base phenotype with large sample sizes should be used to improve the power. Since summary data from large scale GWAS in non-European populations have started to become available recently (e.g., Akiyama et al., 2017), new method will be needed to integrate GWAS data from multiple ethnicities to further improve the PGS prediction performance.
The Influence of Phenotype Definition
In this study, we analyzed three LDD phenotypes, including two semi-quantitative scores derived from MRI assessment and one clinically defined symptom. The PGS of height, BMI and BMD were associated with at least one LDD phenotype. It highlights the complexity in operationally defining LDD, as the current diagnostic approach only captures certain aspects of the degenerative process. Therefore, comparison between different studies should clarify how phenotypes are defined. And it will be fruitful to jointly evaluate multiple MRI features in future genetic studies. However, although MRI is the current gold standard that gives best resolution in defining LDD, it is too expensive to be carried out in large samples.
An alternative strategy is to use the a “proxy phenotype” such as patient-based LDH in which large number of cases can be identified based on electronic medical records. Use of proxy phenotype has been demonstrated to improve the power in GWAS (e.g., Okbay et al., 2016). Increase in sample sizes can outweigh the dilution of genetic effects, but it may also capture certain aspects of the trait that is irrelevant to the phenotype of interest (e.g., Kong et al., 2017). In the current and our previous study (Song et al., 2013), LDH requiring surgery was presumed to represent an extreme end of disc displacement in the population. In this regard, it is surprising that the PGS of height strongly associated with LDH but not LDD scores, and PGS of BMI and BMD were associated with LDD scores but not LDH. Although the lack of expected associations can be false negatives due to insufficient power, we cannot rule out the possibility that ascertainment of LDH patients based on severe low back pain or sciatica may enrich polygenic factors other than LDD.
Biological Interpretations
The observed genetic overlaps can be explained by either causality or genetic pleiotropy or both. Interestingly, BMI, BMD and height also showed suggestive evidence of positive genetic correlation with OA. It is possible that they can be explained by some common mechanisms. Although formal assessment of causality could utilize the Mendelian randomization paradigm in larger sample sizes, PGS can be used to nominate candidate phenotypes (Evans et al., 2013). Overweight or obesity has been established as one of the major risk factors for the development and progression of both LDD (Hassett et al., 2003; Hangai et al., 2008) and OA (Bierma-Zeinstra and Koes, 2007). It is commonly believed that increased body weight or BMI exerts more physical loading to the IVD and vertebral endplate (Videman et al., 2007) or joint cartilage (Guilak, 2011), and leads to increased wear and tear of the structures. For BMD, in addition to its correlation with LDD, previous studies also found the increase in BMD in OA patients and an inverse association between OA and osteoporosis (Hannan et al., 1993; Arden et al., 1996). It was postulated that increased BMD is associated with a loss of resilience of subchondral bone which may results in increased mechanical stress on joint cartilage (Foss and Byers, 1972; Radin and Rose, 1986) and similarly on IVD (Harada et al., 1998). The causal mechanism of tall stature on LDH that leads to hospitalization or surgery remains unclear. One possibility may be related to increased disc height, because a previous study using finite element modeling demonstrated that discs with taller height and smaller area were prone to larger motion, higher annular fiber stress and larger degree of disc displacement (Natarajan and Andersson, 1999). Another possibility may be altered spinal alignment in taller individuals that predispose them to lumbar spine injury. Notably, the postulated mechanisms all point to the pathophysiological role of biomechanical stress. Some other mechanisms have also been proposed (Katz et al., 2010; Samartzis et al., 2013). For example, obesity is also believed to lead to local inflammatory response of secondary mediators secreted by adipocytes known as adipokines. The causal role of adipokines and inflammatory markers can also be tested using their genetic predictors as instrumental variables in future studies.
Alternatively, the observed genetic correlations are also consistent with the genetic pleiotropy and shared pathways among skeletal phenotypes. In supporting this notion, several individual OA associated SNPs were associated with height or BMD (Reynard and Loughlin, 2013; Hackinger et al., 2017), and OA and LDD were found to share some common genetic risk factors (Song et al., 2008; Williams et al., 2011). At single SNP level, we also replicated the recent finding of Bjornsdottir et al. (2017) and showed that the height-increasing allele SNP rs6651255 was associated with the increase of two LDD scores in the HKDD cohort. The previous study did not find association of the same SNP with other related skeletal phenotypes like OA of the spine or osteoporotic vertebral fractures and suggested that the association was driven by the neuropathic pain rather than herniated lumbar discs. However, they did not examine the association of the SNP with radiologically defined LDD phenotypes. Our results in the large population-based cohort with MRI assessment suggest that the same SNP also influences the changes in composition and morphology of lumbar discs. Future genetic studies on LDD with larger sample sizes should search for additional pleiotropic SNPs to better understand bone-cartilage relationships.
In summary, the current study is the first attempt to evaluate genetic overlap between LDD and related traits using GWAS data. Our trans-ethnic polygenic analysis supports the genetic correlations of height, BMI and BMD with LDD, and sheds new light on understanding the pathological mechanism of degenerative skeletal disorders.
Data Availability
The genome-wide association summary statistics of the HKDD cohort is available at https://goo.gl/6gpt9g.
Ethics Statement
This study was carried out in accordance with the recommendations of the ethical principles and guidelines for the protection of human participants of research, Human Research Ethics Committee, The University of Hong Kong. The protocol was approved by the Human Research Ethics Committee, The University of Hong Kong. All subjects gave written informed consent in accordance with the Declaration of Helsinki.
Author Contributions
XZ and PCS conceived the study and coordinated the research. KS-EC obtained the funding. XZ developed the methods and performed analysis. JK and KM-CC evaluated MRI images of the HKDD cohort. TK, Y-QS, KC, YK, and SI contributed the Japanese LDH GWAS summary data. C-LC contributed the HKOS GWAS summary data. Y-HH contributed the GEFOS consortium GWAS summary data. DS contributed the NMR data. YL and DC applied for the arcOGEN consortium GWAS summary data. XZ drafted the manuscript with inputs from PCS, KS-EC. DS, TS-HM, C-LC, and JK reviewed and revised manuscript. PCS and KS-EC participated discussion.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Acknowledgments
This work was supported by Research Grant Council of Hong Kong Theme-based Research Scheme Functional Analyses of How Genomic Variation Affect Personal Risk for Degenerative Skeletal Disorders (T12-708/12N), and General Research Fund 776513M, 17128515 and 17124027. GEFOS study was funded by the European Commission (HEALTH-F2-2008-201865-GEFOS). arcOGEN study was funded by a special purpose grant from Arthritis Research UK (grant 18030). We thank Ms. Pei Yu for curating the HKDD phenotype database, and Dr. Eleftheria Zeggini for providing the arcOGEN GWAS summary data.
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2018.00267/full#supplementary-material
References
Akiyama, M., Okada, Y., Kanai, M., Takahashi, A., Momozawa, Y., Ikeda, M., et al. (2017). Genome-wide association study identifies 112 new loci for body mass index in the Japanese population. Nat. Genet. 49, 1458–1467. doi: 10.1038/ng.3951
Arden, N. K., Griffiths, G. O., Hart, D. J., Doyle, D. V., and Spector, T. D. (1996). The association between osteoarthritis and osteoporotic fracture: the Chingford Study. Br. J. Rheumatol. 35, 1299–1304. doi: 10.1093/rheumatology/35.12.1299
Battié, M. C., Videman, T., Levälahti, E., Gill, K., and Kaprio, J. (2008). Genetic and environmental effects on disc degeneration by phenotype and spinal level: a multivariate twin study. Spine 33, 2801–2808. doi: 10.1097/BRS.0b013e31818043b7
Battié, M. C., Videman, T., and Parent, E. (2004). Lumbar disc degeneration: epidemiology and genetic influences. Spine 29, 2679–2690. doi: 10.1097/01.brs.0000146457.83240.eb
Bierma-Zeinstra, S. M., and Koes, B. W. (2007). Risk factors and prognostic factors of hip and knee osteoarthritis. Nat. Clin. Pract. Rheumatol. 3, 78–85. doi: 10.1038/ncprheum0423
Bjornsdottir, G., Benonisdottir, S., Sveinbjornsson, G., Styrkarsdottir, U., Thorleifsson, G., Walters, G. B., et al. (2017). Sequence variant at 8q24.21 associates with sciatica caused by lumbar disc herniation. Nat. Commun. 8:14265. doi: 10.1038/ncomms14265
Brown, B. C., Consortium Asian Genetic Epidemiology Network Type 2 Diabetes, Ye, C. J., Price, A. L., and Zaitlen, N. (2016). Transethnic genetic-correlation estimates from summary statistics. Am. J. Hum. Genet. 99, 76–88. doi: 10.1016/j.ajhg.2016.05.001
Buckwalter, J. A. (1995). Aging and degeneration of the human intervertebral disc. Spine 20, 1307–1314. doi: 10.1097/00007632-199506000-00022
Bulik-Sullivan, B., Finucane, H. K., Anttila, V., Gusev, A., Day, F. R., Loh, P. R., et al. (2015). An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241. doi: 10.1038/ng.3406
Carlson, C. S., Matise, T. C., North, K. E., Haiman, C. A., Fesinmeyer, M. D., Buyske, S., et al. (2013). Generalization and dilution of association results from European GWAS in populations of non-European ancestry: the PAGE study. PLoS Biol. 11:e1001661. doi: 10.1371/journal.pbio.1001661
Chatterjee, N., Shi, J., and García-Closas, M. (2016). Developing and evaluating polygenic risk prediction models for stratified disease prevention. Nat. Rev. Genet. 17, 392–406. doi: 10.1038/nrg.2016.27
Chatterjee, N., Wheeler, B., Sampson, J., Hartge, P., Chanock, S. J., and Park, J. H. (2013). Projecting the performance of risk prediction based on polygenic analyses of genome-wide association studies. Nat. Genet. 45, 400-5–405e1-3. doi: 10.1038/ng.2579
Cheung, C. L., Tan, K. C. B., and Kung, A. W. C. (2017). Cohort profile: the Hong Kong Osteoporosis Study and the follow-up study. Int. J. Epidemiol. 47, 397–398f. doi: 10.1093/ije/dyx172
Cheung, K. M., Karppinen, J., Chan, D., Ho, D. W., Song, Y. Q., Sham, P., et al. (2009). Prevalence and pattern of lumbar magnetic resonance imaging changes in a population study of one thousand forty-three individuals. Spine 34, 934–940. doi: 10.1097/BRS.0b013e3181a01b3f
Coeuret-Pellicer, M., Descatha, A., Leclerc, A., and Zins, M. (2010). Are tall people at higher risk of low back pain surgery? A discussion on the results of a multipurpose cohort. Arthritis Care Res. 62, 125–127. doi: 10.1002/acr.20023
Das, S., Forer, L., Schönherr, S., Sidore, C., Locke, A. E., Kwong, A., et al. (2016). Next-generation genotype imputation service and methods. Nat. Genet. 48, 1284–1287. doi: 10.1038/ng.3656
Davey Smith, G., and Hemani, G. (2014). Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum. Mol. Genet. 23, R89–R98. doi: 10.1093/hmg/ddu328
de Vlaming, R., Okbay, A., Rietveld, C. A., Johannesson, M., Magnusson, P. K., Uitterlinden, A. G., et al. (2017). Meta-GWAS accuracy and power (MetaGAP) calculator shows that hiding heritability is partially due to imperfect genetic correlations across studies. PLoS Genet. 13:e1006495. doi: 10.1371/journal.pgen.1006495
Dudbridge, F. (2013). Power and predictive accuracy of polygenic risk scores. PLoS Genet. 9:e1003348. doi: 10.1371/annotation/b91ba224-10be-409d-93f4-7423d502cba0
Dudbridge, F. (2016). Polygenic epidemiology. Genet. Epidemiol. 40, 268–272. doi: 10.1002/gepi.21966
Eskola, P. J., Lemmela, S., Kjaer, P., Solovieva, S., Mannikko, M., Tommerup, N., et al. (2012). Genetic association studies in lumbar disc degeneration: a systematic review. PLoS ONE 7:e49995. doi: 10.1371/journal.pone.0049995
Eskola, P. J., Männikkö, M., Samartzis, D., and Karppinen, J. (2014). Genome-wide association studies of lumbar disc degeneration–are we there yet? Spine J. 14, 479–482. doi: 10.1016/j.spinee.2013.07.437
Estrada, K., Styrkarsdottir, U., Evangelou, E., Hsu, Y. H., Duncan, E. L., Ntzani, E. E., et al. (2012). Genome-wide meta-analysis identifies 56 bone mineral density loci and reveals 14 loci associated with risk of fracture. Nat. Genet. 44, 491–501. doi: 10.1038/ng.2249
Euesden, J., Lewis, C. M., and O'Reilly, P. F. (2015). PRSice: polygenic risk score software. Bioinformatics 31, 1466–1468. doi: 10.1093/bioinformatics/btu848
Evans, D. M., Brion, M. J., Paternoster, L., Kemp, J. P., McMahon, G., Munafò, M., et al. (2013). Mining the human phenome using allelic scores that index biological intermediates. PLoS Genet. 9:e1003919. doi: 10.1371/journal.pgen.1003919
Falconer, D. S., and Mackay, T. F. C. (1996). Introduction to Quantitative Genetics. Pearson Education.
Foss, M. V., and Byers, P. D. (1972). Bone density, osteoarthrosis of the hip, and fracture of the upper end of the femur. Ann. Rheum. Dis. 31, 259–264. doi: 10.1136/ard.31.4.259
Guilak, F. (2011). Biomechanical factors in osteoarthritis. Best Pract. Res. Clin. Rheumatol. 25, 815–823. doi: 10.1016/j.berh.2011.11.013
Hackinger, S., Trajanoska, K., Styrkarsdottir, U., Zengini, E., Steinberg, J., Ritchie, G. R. S., et al. (2017). Evaluation of shared genetic aetiology between osteoarthritis and bone mineral density identifies SMAD3 as a novel osteoarthritis risk locus. Hum Mol Genet. 26, 3850–3858. doi: 10.1093/hmg/ddx285
Hangai, M., Kaneoka, K., Kuno, S., Hinotsu, S., Sakane, M., Mamizuka, N., et al. (2008). Factors associated with lumbar intervertebral disc degeneration in the elderly. Spine J. 8, 732–740. doi: 10.1016/j.spinee.2007.07.392
Hannan, M. T., Anderson, J. J., Zhang, Y., Levy, D., and Felson, D. T. (1993). Bone mineral density and knee osteoarthritis in elderly men and women. The Framingham study. Arthritis Rheum 36, 1671–1680. doi: 10.1002/art.1780361205
Harada, A., Okuizumi, H., Miyagi, N., and Genda, E. (1998). Correlation between bone mineral density and intervertebral disc degeneration. Spine 23, 857–861. discussion: 862. doi: 10.1097/00007632-199804150-00003
Hassett, G., Hart, D. J., Manek, N. J., Doyle, D. V., and Spector, T. D. (2003). Risk factors for progression of lumbar spine disc degeneration: the Chingford Study. Arthritis Rheum. 48, 3112–3117. doi: 10.1002/art.11321
He, M., Xu, M., Zhang, B., Liang, J., Chen, P., Lee, J. Y., et al. (2015). Meta-analysis of genome-wide association studies of adult height in East Asians identifies 17 novel loci. Hum. Mol. Genet. 24, 1791–1800. doi: 10.1093/hmg/ddu583
Heikkilä, J. K., Koskenvuo, M., Heliövaara, M., Kurppa, K., Riihimäki, H., Heikkila, K., et al. (1989). Genetic and environmental factors in sciatica. Evidence from a nationwide panel of 9365 adult twin pairs. Ann. Med. 21, 393–398. doi: 10.3109/07853898909149227
Hershkovich, O., Friedlander, A., Gordon, B., Arzi, H., Derazne, E., Tzur, D., et al. (2013). Associations of body mass index and body height with low back pain in 829,791 adolescents. Am. J. Epidemiol. 178, 603–609. doi: 10.1093/aje/kwt019
Heuch, I., Heuch, I., Hagen, K., and Zwart, J. A. (2015). Association between body height and chronic low back pain: a follow-up in the Nord-Trondelag Health Study. BMJ Open 5:e006983. doi: 10.1136/bmjopen-2014-006983
Howie, B. N., Donnelly, P., and Marchini, J. (2009). A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 5:e1000529. doi: 10.1371/journal.pgen.1000529
Humzah, M. D., and Soames, R. W. (1988). Human intervertebral disc: structure and function. Anat. Rec. 220, 337–356. doi: 10.1002/ar.1092200402
Ikegawa, S. (2013). The genetics of common degenerative skeletal disorders: osteoarthritis and degenerative disc disease. Annu. Rev. Genomics Hum. Genet. 14, 245–256. doi: 10.1146/annurev-genom-091212-153427
Johnson, L., Zhu, J., Scott, E. R., and Wineinger, N. E. (2015). An examination of the relationship between lipid levels and associated genetic markers across racial/ethnic populations in the multi-ethnic study of atherosclerosis. PLoS ONE 10:e0126361. doi: 10.1371/journal.pone.0126361
Johnson, T. (2012). “Efficient calculation for multi-SNP genetic risk scores,” in American Society of Human Genetics Annual Meeting (San Francisco, CA).
Jordan, J., Konstantinou, K., and O'Dowd, J. (2009). Herniated lumbar disc. BMJ Clin. Evid. 2009:1118.
Katz, J. D., Agrawal, S., and Velasquez, M. (2010). Getting to the heart of the matter: osteoarthritis takes its place as part of the metabolic syndrome. Curr. Opin. Rheumatol. 22, 512–519. doi: 10.1097/BOR.0b013e32833bfb4b
Kong, A., Frigge, M. L., Thorleifsson, G., Stefansson, H., Young, A. I., Zink, F., et al. (2017). Selection against variants in the genome associated with educational attainment. Proc. Natl. Acad. Sci. U.S.A. 114, E727–E732. doi: 10.1073/pnas.1612113114
Krapohl, E., Euesden, J., Zabaneh, D., Pingault, J. B., Rimfeld, K., von Stumm, S., et al. (2016). Phenome-wide analysis of genome-wide polygenic scores. Mol. Psychiatry 21, 1188–1193. doi: 10.1038/mp.2015.126
Kung, A. W., Xiao, S. M., Cherny, S., Li, G. H., Gao, Y., Tso, G., et al. (2010). Association of JAG1 with bone mineral density and osteoporotic fractures: a genome-wide association study and follow-up replication studies. Am. J. Hum. Genet. 86, 229–239. doi: 10.1016/j.ajhg.2009.12.014
Lee, S. H., Goddard, M. E., Wray, N. R., and Visscher, P. M. (2012a). A better coefficient of determination for genetic profile analysis. Genet. Epidemiol. 36, 214–224. doi: 10.1002/gepi.21614
Lee, S. H., Yang, J., Goddard, M. E., Visscher, P. M., and Wray, N. R. (2012b). Estimation of pleiotropy between complex diseases using single-nucleotide polymorphism-derived genomic relationships and restricted maximum likelihood. Bioinformatics 28, 2540–2542. doi: 10.1093/bioinformatics/bts474
Leino-Arjas, P., Kauppila, L., Kaila-Kangas, L., Shiri, R., Heistaro, S., and Heliövaara, M. (2008). Serum lipids in relation to sciatica among Finns. Atherosclerosis 197, 43–49. doi: 10.1016/j.atherosclerosis.2007.07.035
Li, Y., Samartzis, D., Campbell, D. D., Cherny, S. S., Cheung, K. M., Luk, K. D., et al. (2016). Two subtypes of intervertebral disc degeneration distinguished by large-scale population-based study. Spine J. 16, 1079–1089. doi: 10.1016/j.spinee.2016.04.020
Liuke, M., Solovieva, S., Lamminen, A., Luoma, K., Leino-Arjas, P., Luukkonen, R., et al. (2005). Disc degeneration of the lumbar spine in relation to overweight. Int. J. Obes. 29, 903–908. doi: 10.1038/sj.ijo.0802974
Livshits, G., Ermakov, S., Popham, M., Macgregor, A. J., Sambrook, P. N., Spector, T. D., et al. (2010). Evidence that bone mineral density plays a role in degenerative disc disease: the UK Twin Spine study. Ann. Rheum. Dis. 69, 2102–2106. doi: 10.1136/ard.2010.131441
Livshits, G., Popham, M., Malkin, I., Sambrook, P. N., Macgregor, A. J., Spector, T., et al. (2011). Lumbar disc degeneration and genetic factors are the main risk factors for low back pain in women: the UK Twin Spine Study. Ann. Rheum. Dis. 70, 1740–1745. doi: 10.1136/ard.2010.137836
Locke, A. E., Kahali, B., Berndt, S. I., Justice, A. E., Pers, T. H., Day, F. R., et al. (2015). Genetic studies of body mass index yield new insights for obesity biology. Nature 518, 197–206. doi: 10.1038/nature14177
Longo, U. G., Denaro, L., Spiezia, F., Forriol, F., Maffulli, N., and Denaro, V. (2011). Symptomatic disc herniation and serum lipid levels. Eur. Spine J. 20, 1658–1662. doi: 10.1007/s00586-011-1737-2
Loughlin, J. (2011). Knee osteoarthritis, lumbar-disc degeneration and developmental dysplasia of the hip–an emerging genetic overlap. Arthritis Res. Ther. 13:108. doi: 10.1186/ar3291
Luoma, K., Riihimäki, H., Luukkonen, R., Raininko, R., Viikari-Juntura, E., and Lamminen, A. (2000). Low back pain in relation to lumbar disc degeneration. Spine 25, 487–492. doi: 10.1097/00007632-200002150-00016
Marchini, J., Howie, B., Myers, S., McVean, G., and Donnelly, P. (2007). A new multipoint method for genome-wide association studies by imputation of genotypes. Nat. Genet. 39, 906–913. doi: 10.1038/ng2088
Marigorta, U. M., and Navarro, A. (2013). High trans-ethnic replicability of GWAS results implies common causal variants. PLoS Genet. 9:e1003566. doi: 10.1371/journal.pgen.1003566
Natarajan, R. N., and Andersson, G. B. (1999). The influence of lumbar disc height and cross-sectional area on the mechanical response of the disc to physiologic loading. Spine 24, 1873–1881. doi: 10.1097/00007632-199909150-00003
Okbay, A., Beauchamp, J. P., Fontana, M. A., Lee, J. J., Pers, T. H., Rietveld, C. A., et al. (2016). Genome-wide association study identifies 74 loci associated with educational attainment. Nature 533, 539–542. doi: 10.1038/nature17671
Oxland, T. R. (2016). Fundamental biomechanics of the spine–what we have learned in the past 25 years and future directions. J. Biomech. 49, 817–832. doi: 10.1016/j.jbiomech.2015.10.035
Palla, L., and Dudbridge, F. (2015). A fast method that uses polygenic scores to estimate the variance explained by genome-wide marker panels and the proportion of variants affecting a trait. Am. J. Hum. Genet. 97, 250–259. doi: 10.1016/j.ajhg.2015.06.005
Panoutsopoulou, K., Metrustry, S., Doherty, S. A., Laslett, L. L., Maciewicz, R. A., Hart, D. J., et al. (2014). The effect of FTO variation on increased osteoarthritis risk is mediated through body mass index: a Mendelian randomisation study. Ann. Rheum. Dis. 73, 2082–2086. doi: 10.1136/annrheumdis-2013-203772
Panoutsopoulou, K., Southam, L., Elliott, K. S., Wrayner, N., Zhai, G., Beazley, C., et al. (2011). Insights into the genetic architecture of osteoarthritis from stage 1 of the arcOGEN study. Ann. Rheum. Dis. 70, 864–867. doi: 10.1136/ard.2010.141473
Purcell, S. M., Wray, N. R., Stone, J. L., Visscher, P. M., O'Donovan, M. C., Sullivan, P. F., et al. (2009). Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 460, 748–752. doi: 10.1038/nature08185
Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M. A., Bender, D., et al. (2007). PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575. doi: 10.1086/519795
Pye, S. R., Reid, D. M., Adams, J. E., Silman, A. J., and O'Neill, T. W. (2006). Radiographic features of lumbar disc degeneration and bone mineral density in men and women. Ann. Rheum. Dis. 65, 234–238. doi: 10.1136/ard.2005.038224
Radin, E. L., and Rose, R. M. (1986). Role of subchondral bone in the initiation and progression of cartilage damage. Clin. Orthop. Relat. Res. 213, 34–40. doi: 10.1097/00003086-198612000-00005
Reynard, L. N., and Loughlin, J. (2013). Insights from human genetic studies into the pathways involved in osteoarthritis. Nat. Rev. Rheumatol. 9, 573–583. doi: 10.1038/nrrheum.2013.121
Ropper, A. H., and Zafonte, R. D. (2015). Sciatica. N. Engl. J. Med. 372, 1240–1248. doi: 10.1056/NEJMra1410151
Samartzis, D., Karppinen, J., Chan, D., Luk, K. D., and Cheung, K. M. (2012). The association of lumbar intervertebral disc degeneration on magnetic resonance imaging with body mass index in overweight and obese adults: a population-based study. Arthritis Rheum. 64, 1488–1496. doi: 10.1002/art.33462
Samartzis, D., Karppinen, J., Cheung, J. P., and Lotz, J. (2013). Disk degeneration and low back pain: are they fat-related conditions? Global Spine J. 3, 133–144. doi: 10.1055/s-0033-1350054
Samartzis, D., Karppinen, J., Mok, F., Fong, D. Y., Luk, K. D., and Cheung, K. M. (2011). A population-based study of juvenile disc degeneration and its association with overweight and obesity, low back pain, and diminished functional status. J. Bone Joint Surg. Am. 93, 662–670. doi: 10.2106/JBJS.I.01568
Sambrook, P. N., MacGregor, A. J., and Spector, T. D. (1999). Genetic influences on cervical and lumbar disc degeneration: a magnetic resonance imaging study in twins. Arthritis Rheum. 42, 366–372.
Schneiderman, G., Flannigan, B., Kingston, S., Thomas, J., Dillin, W. H., and Watkins, R. G. (1987). Magnetic resonance imaging in the diagnosis of disc degeneration: correlation with discography. Spine 12, 276–281. doi: 10.1097/00007632-198704000-00016
Soininen, P., Kangas, A. J., Würtz, P., Suna, T., and Ala-Korpela, M. (2015). Quantitative serum nuclear magnetic resonance metabolomics in cardiovascular epidemiology and genetics. Circ. Cardiovasc. Genet. 8, 192–206. doi: 10.1161/CIRCGENETICS.114.000216
Soininen, P., Kangas, A. J., Würtz, P., Tukiainen, T., Tynkkynen, T., Laatikainen, R., et al. (2009). High-throughput serum NMR metabonomics for cost-effective holistic studies on systemic metabolism. Analyst 134, 1781–1785. doi: 10.1039/b910205a
Solovieff, N., Cotsapas, C., Lee, P. H., Purcell, S. M., and Smoller, J. W. (2013). Pleiotropy in complex traits: challenges and strategies. Nat. Rev. Genet. 14, 483–495. doi: 10.1038/nrg3461
Song, Y. Q., Cheung, K. M., Ho, D. W., Poon, S. C., Chiba, K., Kawaguchi, Y., et al. (2008). Association of the asporin D14 allele with lumbar-disc degeneration in Asians. Am. J. Hum. Genet. 82, 744–747. doi: 10.1016/j.ajhg.2007.12.017
Song, Y. Q., Karasugi, T., Cheung, K. M., Chiba, K., Ho, D. W., Miyake, A., et al. (2013). Lumbar disc degeneration is linked to a carbohydrate sulfotransferase 3 variant. J. Clin. Invest. 123, 4909–4917. doi: 10.1172/JCI69277
Stahl, E. A., Wegmann, D., Trynka, G., Gutierrez-Achury, J., Do, R., Voight, B. F., et al. (2012). Bayesian inference analyses of the polygenic architecture of rheumatoid arthritis. Nat. Genet. 44, 483–489. doi: 10.1038/ng.2232
Takatalo, J., Karppinen, J., Niinimäki, J., Taimela, S., Nayha, S., Mutanen, P., et al. (2011). Does lumbar disc degeneration on magnetic resonance imaging associate with low back symptom severity in young Finnish adults? Spine 36, 2180–2189. doi: 10.1097/BRS.0b013e3182077122
Takatalo, J., Karppinen, J., Taimela, S., Niinimäki, J., Laitinen, J., Blanco Sequeiros, R., et al. (2013). Body mass index is associated with lumbar disc degeneration in young Finnish males: subsample of Northern Finland birth cohort study 1986. BMC Musculoskelet. Disord. 14:87. doi: 10.1186/1471-2474-14-87
Urban, J. P., and Roberts, S. (2003). Degeneration of the intervertebral disc. Arthritis Res. Ther. 5, 120–130. doi: 10.1186/ar629
Vattikuti, S., Guo, J., and Chow, C. C. (2012). Heritability and genetic correlations explained by common SNPs for metabolic syndrome traits. PLoS Genet. 8:e1002637. doi: 10.1371/journal.pgen.1002637
Videman, T., Levälahti, E., and Battié, M. C. (2007). The effects of anthropometrics, lifting strength, and physical activities in disc degeneration. Spine 32, 1406–1413. doi: 10.1097/BRS.0b013e31806011fa
Wahlström, J., Burström, L., Nilsson, T., and Järvholm, B. (2012). Risk factors for hospitalization due to lumbar disc disease. Spine 37, 1334–1339. doi: 10.1097/BRS.0b013e31824b5464
Wang, Y., Boyd, S. K., Battié, M. C., Yasui, Y., and Videman, T. (2011). Is greater lumbar vertebral BMD associated with more disk degeneration? A study using microCT and discography. J. Bone Miner Res. 26, 2785–2791. doi: 10.1002/jbmr.476
Wen, W., Zheng, W., Okada, Y., Takeuchi, F., Tabara, Y., Hwang, J. Y., et al. (2014). Meta-analysis of genome-wide association studies in East Asian-ancestry populations identifies four new loci for body mass index. Hum. Mol. Genet. 23, 5492–5504. doi: 10.1093/hmg/ddu248
Willer, C. J., Schmidt, E. M., Sengupta, S., Peloso, G. M., Gustafsson, S., Kanoni, S., et al. (2013). Discovery and refinement of loci associated with lipid levels. Nat. Genet. 45, 1274–1283. doi: 10.1038/ng.2797
Williams, F. M., Bansal, A. T., van Meurs, J. B., Bell, J. T., Meulenbelt, I., Suri, P., et al. (2013). Novel genetic variants associated with lumbar disc degeneration in northern Europeans: a meta-analysis of 4600 subjects. Ann. Rheum. Dis. 72, 1141–1148. doi: 10.1136/annrheumdis-2012-201551
Williams, F. M., Popham, M., Hart, D. J., de Schepper, E., Bierma-Zeinstra, S., Hofman, A., et al. (2011). GDF5 single-nucleotide polymorphism rs143383 is associated with lumbar disc degeneration in Northern European women. Arthritis Rheum. 63, 708–712. doi: 10.1002/art.30169
Wood, A. R., Esko, T., Yang, J., Vedantam, S., Pers, T. H., Gustafsson, S., et al. (2014). Defining the role of common variation in the genomic and biological architecture of adult human height. Nat. Genet. 46, 1173–1186. doi: 10.1038/ng.3097
Xiao, S. M., Kung, A. W., Gao, Y., Lau, K. S., Ma, A., Zhang, Z. L., et al. (2012). Post-genome wide association studies and functional analyses identify association of MPP7 gene variants with site-specific bone mineral density. Hum. Mol. Genet. 21, 1648–1657. doi: 10.1093/hmg/ddr586
Yang, J., Benyamin, B., McEvoy, B. P., Gordon, S., Henders, A. K., Nyholt, D. R., et al. (2010). Common SNPs explain a large proportion of the heritability for human height. Nat. Genet. 42, 565–569. doi: 10.1038/ng.608
Yang, J., Lee, S. H., Goddard, M. E., and Visscher, P. M. (2011a). GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82. doi: 10.1016/j.ajhg.2010.11.011
Yang, J., Manolio, T. A., Pasquale, L. R., Boerwinkle, E., Caporaso, N., Cunningham, J. M., et al. (2011b). Genome partitioning of genetic variation for complex traits using common SNPs. Nat. Genet. 43, 519–525. doi: 10.1038/ng.823
Yang, J., Zeng, J., Goddard, M. E., Wray, N. R., and Visscher, P. M. (2017). Concepts, estimation and interpretation of SNP-based heritability. Nat. Genet. 49, 1304–1310. doi: 10.1038/ng.3941
Zeggini, E., Panoutsopoulou, K., Southam, L., Rayner, N. W., Day-Williams, A. G., Lopes, M. C., et al. (2012). Identification of new susceptibility loci for osteoarthritis (arcOGEN): a genome-wide association study. Lancet 380, 815–823. doi: 10.1016/S0140-6736(12)60681-3
Keywords: polygenic score, genetic correlation, causality, pleiotropy, lumbar disc degeneration, osteoarthritis
Citation: Zhou X, Cheung C-L, Karasugi T, Karppinen J, Samartzis D, Hsu Y-H, Mak TS-H, Song Y-Q, Chiba K, Kawaguchi Y, Li Y, Chan D, Cheung KM-C, Ikegawa S, Cheah KS-E and Sham PC (2018) Trans-Ethnic Polygenic Analysis Supports Genetic Overlaps of Lumbar Disc Degeneration With Height, Body Mass Index, and Bone Mineral Density. Front. Genet. 9:267. doi: 10.3389/fgene.2018.00267
Received: 13 April 2018; Accepted: 02 July 2018;
Published: 03 August 2018.
Edited by:
Dana C. Crawford, Case Western Reserve University, United StatesReviewed by:
Kenneth M. Weiss, Pennsylvania State University, United StatesJing Hua Zhao, University of Cambridge, United Kingdom
Anne E. Justice, Geisinger Health System, United States
Copyright © 2018 Zhou, Cheung, Karasugi, Karppinen, Samartzis, Hsu, Mak, Song, Chiba, Kawaguchi, Li, Chan, Cheung, Ikegawa, Cheah and Sham. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Pak Chung Sham, pcsham@hku.hk