- 1National Maize Improvement Center of China, Key Laboratory of Crop Heterosis and Utilization Ministry of Education (MOE), China Agricultural University, Beijing, China
- 2Maize Research Institute, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
Southern corn rust (SCR), caused by Puccinia polysora Underw., is a destructive disease that can severely reduce grain yield in maize (Zea mays L.). Because P. polysora comprises multiple physiological races, it is important to identify additional resistance genes and to develop more efficient selection approaches for maize breeding programs. Here, four doubled haploid (DH) populations comprising 384 lines derived from selected parents, together with their 903 testcross hybrids, were used to perform genome-wide association studies (GWAS). Three GWAS analyses were conducted: an additive model in the DH panel, and additive and dominant models in the hybrid panel. Five loci were detected on chromosomes 1, 7, 8, 8, and 10, with P-values ranging from 4.83 × 10⁻⁷ to 2.46 × 10⁻⁴¹. In all association analyses, a highly significant locus was detected on chromosome 10, which was tightly linked with the known SCR resistance genes RppC and RppK. Genomic prediction (GP) has been proven to be effective in plant breeding. In our study, several models were tested to explore the predictive ability for SCR resistance in hybrid populations, including extended GBLUP models with different genetic relationship matrices, marker-based prediction models, and mixed models with QTL as fixed factors. For GBLUP models, the prediction accuracies ranged from 0.56 to 0.60. Compared with the traditional prediction using only the additive effect, prediction ability was significantly improved by adding the additive-by-additive effect (P < 0.05). For marker-based models, the accuracy of BayesA and BayesB was 0.65, 8% higher than that of the other models (i.e., RRBLUP, BRR, BL, BayesC). Finally, adding QTL to the mixed linear prediction model further improved the accuracy to 0.67; for the G_A model in particular, the prediction performance increased by 11.67%. The prediction accuracy of the BayesB model was also significantly improved by adding QTL information (P < 0.05).
This study provides valuable information for understanding the genetic architecture of SCR resistance and for applying GP to SCR resistance in maize breeding.
1 Introduction
Southern corn rust (SCR), caused by Puccinia polysora Underw., is one of the most devastating maize diseases and is widely distributed in Asia, America, Africa and other major corn production areas (Sun et al., 2021). SCR was first reported by Underwood in 1897 in the USA (Underwood, 1897) and was observed in most tropical and temperate maize-growing areas of the world in subsequent decades (Orian, 1954; Duan and He, 1984). Infection of maize leaves and stems has resulted in yield losses of up to 50% (Rhind et al., 1952; Liu and Wang, 1999). Its wide distribution, long-distance migration, multiple physiological races and fast evolution make SCR difficult to control and cause great grain yield losses (Sun et al., 2021). With climate change, SCR is likely to increase further and to expand to higher-latitude regions (Ramirez-Cabral et al., 2017).
Breeding SCR-resistant varieties is very important for disease management, but it poses challenges for breeders. In China, several widely cultivated corn varieties, such as Zhengdan958, Xundan20 and Xianyu335, have been identified as susceptible to SCR (Yuan et al., 2010). Indeed, Wang et al. (2006) investigated the resistance of 178 corn varieties to SCR and reported that only 14% of the varieties were highly resistant. On the other hand, Zhou et al. (2017) identified several highly resistant germplasms, such as DH02, Zheng39, T2 and JH3372. In addition, some inbred lines, such as AFR024 (Storey and Howland, 1957), Qi319 (Chen et al., 2004), CML470 (Yao et al., 2013) and J2416K (Wang et al., 2020), were also found to be resistant. The discovery of these germplasms not only allowed variety resistance to be improved by breeding, but also provided the basis for gene detection.
Based on geographic distribution, more than 10 physiological races of P. polysora have been identified, including EA.1, EA.2, EA.3, and PP.3-PP.9 (Ryland and Storey, 1955; Storey and Howland, 1957; Robert, 1962; Ullstrup, 1965). Owing to rapid advances in genetics, several major, race-specific SCR resistance genes have been reported. Rpp1, a fully dominant gene, was identified as a resistance gene to P. polysora races EA.1 and EA.3; Rpp2, a partially dominant gene closely linked with Rpp1, was resistant to races EA.1, EA.2, and EA.3; Rpp9, a single dominant gene in bin 10.01, was resistant to race PP.9 (Storey and Howland, 1959; Storey and Howland, 1967). It is noteworthy that Rpp9 is closely linked, with a genetic distance of 1.5 cM, to a common rust resistance gene rp1, but its genomic location had not been confirmed (Ullstrup, 1965). In recent years, more resistance loci on chromosome 10 have been detected, including RppP25 (Liu et al., 2003), RppQ (Chen et al., 2004), RppD (Zhang et al., 2009), RppC (Yao et al., 2013), Rpp12 (Zhang, 2013), RppS (Wu et al., 2015), RppM (Wang et al., 2020), qSCR6.01 (Lu et al., 2020), RppCML496 (Lv et al., 2021), and RppK (Chen et al., 2022).
Genome-wide association study (GWAS), which exploits linkage disequilibrium (LD) in a panel of many genotypes representing broad natural variation, has been used as an alternative approach for exploring the molecular basis of complex quantitative traits and identifying associated SNPs (Yu and Buckler, 2006). In maize, GWAS has been successfully used to identify numerous candidate loci/genes controlling resistance to diseases such as head smut (Wang et al., 2012), common rust (Kibe et al., 2020; Ren et al., 2021), rough dwarf (Zhao et al., 2021), ear rot (Guo et al., 2020), and gray leaf spot (Mammadov et al., 2015). For SCR, eight significant SNPs were identified by GWAS in a panel of 164 maize inbred lines (Souza Camacho et al., 2019). The results of these studies provide valuable information for understanding the mechanisms of disease resistance and for breeding superior varieties.
Genomic prediction (GP), also known as genomic selection (GS), is a technology for predicting the performance of plants without phenotyping, and has been proven to be effective in plant breeding (Meuwissen et al., 2001; Cerrudo et al., 2018). Gowda et al. (2015) successfully modeled resistance to lethal necrosis disease in tropical maize germplasm with ridge regression best linear unbiased prediction (RRBLUP). For common rust, the GP accuracies observed in a GWAS panel and a doubled haploid (DH) population were 0.61 and 0.51, respectively (Ren et al., 2021). For Goss’s wilt, a GP model was trained with an accuracy of 0.69 (Cooper et al., 2019). However, there are few reports of genomic prediction for disease resistance in maize hybrids.
In this study, four DH populations with 384 lines and their 903 testcross hybrids were used to perform GWAS and GP analyses for SCR resistance. The objectives of the study were to (1) detect significantly associated SNPs and major QTL conferring SCR resistance; (2) predict SCR resistance with different GBLUP models; (3) test the predictive power of different marker-based models for SCR resistance; and (4) estimate the GP accuracies of models that include QTL information.
2 Materials and methods
2.1 Plant materials
A total of 384 DH lines belonging to four DH populations were developed from four elite inbred lines (Table 1) in Beijing (N40°08’ E116°10’) in 2017. The four DH populations were derived from the crosses C783 × C229, C783 × UH306, C783 × EH, and C229 × UH306, and were named POP1 to POP4, respectively. POP1 to POP4 comprised 66, 107, 127, and 77 DH lines, respectively. Each population was then testcrossed with three testers, yielding a total of 903 hybrids (Table 1). The hybrid panel was thus divided into 12 subgroups, each containing 40 to 119 hybrids.
To evaluate the SCR infection levels of the accessions, we planted the DH lines and hybrids in the Huang-Huai-Hai summer corn planting region of China for phenotyping. This region is the main area where SCR occurs in China because of its hot and rainy summers. The DHs were planted in Jinan (N37°42’ E117°27’) in 2021 and in Xinxiang (N35°9’ E113°47’) in 2018 and 2021; the hybrids were planted in Jinan in 2021, in Jining (N35°6’ E116°31’) in 2020 and 2021, and in Xinxiang in 2020. We used an augmented experimental design with blocks of 20 accessions. Each block consisted of 19 rows, and the standard accessions were planted in random order. For the DHs, the standard accession was the susceptible inbred line C116A. For the hybrids, the standard accession was the susceptible commercial hybrid ZhengDan958. In the field, each accession was planted in a one-row plot for DHs and a two-row plot for hybrids, at a spacing of 0.6 × 0.25 m (about 66,000 plants per hectare).
2.2 Southern corn rust resistance score (SCRRS) collection
SCRRS for each accession was visually collected from the leaf area covered by lesions at 4 weeks after flowering (Figure 1A). A rating scale of 1 corresponds to severe infection covering > 75% of the leaf surface, 3 corresponds to moderate-to-severe infection covering 50–75% of the leaf surface, 5 corresponds to moderate infection covering 25–50% of the leaf surface, 7 corresponds to weak to moderate infection covering 10–25% of the leaf surface, and 9 corresponds to high resistance covering 0–10% of the leaf surface (Ren et al., 2021).
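The rating scale above can be written as a simple lookup. The following is a minimal sketch, not the authors' scoring procedure (scores were assigned visually); the function name `scrrs_from_coverage` is hypothetical, and the handling of values that fall exactly on a class boundary (for example 25%) is an assumption, because the text gives ranges only.

```python
def scrrs_from_coverage(coverage_pct: float) -> int:
    """Map percent leaf area covered by lesions to the 1-9 SCRRS scale.

    Higher scores mean higher resistance. Boundary handling is an
    assumption: a value exactly on a boundary goes to the more
    resistant class.
    """
    if not 0.0 <= coverage_pct <= 100.0:
        raise ValueError("coverage_pct must be between 0 and 100")
    if coverage_pct > 75:
        return 1  # severe infection
    if coverage_pct > 50:
        return 3  # moderate-to-severe infection
    if coverage_pct > 25:
        return 5  # moderate infection
    if coverage_pct > 10:
        return 7  # weak-to-moderate infection
    return 9      # high resistance


# Hypothetical coverage values, one per class
scores = [scrrs_from_coverage(c) for c in (80, 60, 30, 15, 5)]
```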
Figure 1 Southern Corn Rust Resistance Score (SCRRS) and its distribution. (A) The manifestation of susceptible leaves, the SCRRS of leaves were 9, 7, 5, 3, 1 from left to right. (B) the SCRRSs in DH founders and testers. (C) The distribution of SCRRS in DH (top) and hybrid (bottom) populations.
2.3 Phenotypic data analysis
The raw phenotypic data were analyzed using a linear mixed model fitted with the R add-on package “lme4” (Bates et al., 2014). Best linear unbiased predictors (BLUPs) were calculated for DHs and hybrids. The model was

$$y_{ij} = \mu + g_i + l_j + (g \times l)_{ij} + \epsilon_{ij}$$

where $y_{ij}$ is the mean phenotypic value of the ith DH or hybrid in the jth environment; $\mu$ is the overall mean of the trait; $g_i$ is the random effect of the ith accession; $l_j$ is the random effect of the jth environment; $(g \times l)_{ij}$ is the random interaction effect between the ith accession and the jth environment; and $\epsilon_{ij}$ is the random error.
Heritability was calculated using the variance components estimated from the above model. The following equation was used to estimate heritability on an individual plot basis,

$$H^2 = \frac{V_g}{V_g + V_e / l}$$

where $V_g$ is the genotypic variance component, $V_e$ is the error variance, and $l$ is the number of environments.
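As a worked illustration of the heritability calculation, the sketch below assumes the form H² = Vg / (Vg + Ve / l), which is the standard expression built from the three quantities defined above; the exact equation is not legible in the text, so this form is an assumption. The variance values are hypothetical and are not taken from Table 2.

```python
def broad_sense_heritability(v_g: float, v_e: float, n_env: int) -> float:
    """H^2 = Vg / (Vg + Ve / l), with l the number of environments.

    The formula is an assumed reconstruction from the symbols defined
    in the text (Vg, Ve, l).
    """
    if n_env < 1:
        raise ValueError("n_env must be at least 1")
    return v_g / (v_g + v_e / n_env)


# Hypothetical variance components for a panel tested in three environments
h2_example = broad_sense_heritability(v_g=0.3, v_e=0.5, n_env=3)
```

With these illustrative inputs the error variance is shrunk by the number of environments before entering the denominator, which is why testing in more environments raises the estimate.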
2.4 Genotyping and genotypic data analysis
Young leaves of all the DHs and the tester lines were sampled for DNA extraction using the CTAB method (Porebski et al., 1997). Genotyping was then conducted using the Maize-6H-60K SNP chip (Tian et al., 2021). SNPs with a minor allele frequency (MAF) > 0.05 and a per-locus missing rate < 0.1 were retained using PLINK 1.90 (http://www.cog-genomics.org/plink2/). The genotypes of the hybrids were inferred from the cleaned SNPs (N = 34,037) of the DHs and testers using TASSEL V5.2 (Bradbury et al., 2007). Pairwise linkage disequilibrium (LD) was measured as the squared allele-frequency correlation coefficient (r²) between two loci using PLINK. Only SNPs with a MAF > 0.05 and less than 0.1 missing data were used to estimate LD. Principal component analysis (PCA) was used to assess the level of genetic structure using TASSEL.
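The marker quality control can be illustrated with a small NumPy sketch. It is not the PLINK command used in the study; it only shows the two criteria (MAF > 0.05 and missing rate < 0.1) applied to a genotype matrix, keeping the SNPs that pass, as in the LD filter described above. The function name `snp_qc_mask` and the toy genotypes are hypothetical, and genotypes are assumed to be coded 0/1/2 with NaN for missing calls.

```python
import numpy as np


def snp_qc_mask(genotypes, maf_min=0.05, max_missing=0.1):
    """Return a boolean mask of SNPs (columns) that pass quality control.

    genotypes: (n_lines, n_snps) array coded 0/1/2, np.nan for missing.
    """
    g = np.asarray(genotypes, dtype=float)
    n_lines = g.shape[0]
    n_called = (~np.isnan(g)).sum(axis=0)
    missing_rate = 1.0 - n_called / n_lines
    # Frequency of the counted allele among non-missing calls
    freq = np.nansum(g, axis=0) / (2.0 * np.maximum(n_called, 1))
    maf = np.minimum(freq, 1.0 - freq)
    return (maf > maf_min) & (missing_rate < max_missing)


nan = np.nan
toy = np.array([
    [0, 0, 0], [0, 0, 2], [0, 0, nan], [0, 0, nan], [0, 0, 0],
    [2, 0, 2], [2, 0, 0], [2, 0, 2], [2, 0, 0], [2, 0, 2],
])
keep = snp_qc_mask(toy)  # SNP 0 passes; SNP 1 is monomorphic; SNP 2 has 20% missing
```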
2.5 Genome wide association study
Genome-wide association analysis was performed with the BLUPs obtained from the combined analysis for the DHs and hybrids. The Fixed and Random Model Circulating Probability Unification (FarmCPU) method proposed by Liu et al. (2016) was applied in GAPIT V3 (Wang and Zhang, 2021). Two genetic models, additive and dominant, were used for the hybrid panel, and only the additive model was used for the DH panel. Under the additive model, homozygous genotypes with recessive allele combinations were coded as 0, homozygous genotypes with dominant allele combinations were coded as 2, and heterozygous genotypes were coded as 1. Under the dominant model, both types of homozygous genotypes were coded as 0 and heterozygous genotypes were coded as 1. The Bonferroni correction was used to determine the genome-wide significance threshold (0.05/34,037 = 1.47 × 10⁻⁶), where 34,037 is the total number of SNP markers (Holm, 1979). Markers whose P-values passed the threshold were identified as candidate loci. Unlike natural populations, artificial populations such as the DH and testcross hybrid populations have a high level of LD, so our candidate intervals were selected according to LD decay and LD blocks. Markers within a physical distance of < 20 Mb and in high LD (r² ≥ 0.8) were considered to mark the same genomic region (Mayer et al., 2020). The corresponding candidate region was described by the positions of its first and last markers.
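The genotype coding and the significance threshold described above can be sketched as follows. This is an illustration only, not GAPIT code; the function names are hypothetical, and the input is assumed to be the count (0, 1 or 2) of the allele labelled dominant at each SNP.

```python
import numpy as np

N_SNPS = 34037   # number of SNP markers after quality control
ALPHA = 0.05     # family-wise error rate

# Bonferroni threshold: alpha divided by the number of tests
threshold = ALPHA / N_SNPS


def additive_code(dosage):
    """Additive model: 0 and 2 for the two homozygotes, 1 for heterozygotes."""
    return np.asarray(dosage, dtype=int)


def dominant_code(dosage):
    """Dominant model: both homozygotes are 0, heterozygotes are 1."""
    return (np.asarray(dosage) == 1).astype(int)


dosage = np.array([0, 1, 2, 1, 0])
add = additive_code(dosage)
dom = dominant_code(dosage)
```

Note that DH lines are fully homozygous, so the dominant coding would be constant (all zeros) in the DH panel, which is consistent with only the additive model being used there.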
2.6 Genomic prediction
Genomic prediction was performed for the hybrid panel under three conditions: 1) extended GBLUP models, 2) marker-based prediction methods, and 3) prediction models with the QTL identified by GWAS as fixed effects.
The extended GBLUP models comprised additive (Ga), dominant (Gd) and epistatic relationship matrices. The Ga and Gd matrices were calculated using the “sommer” package in R (Covarrubias-Pazaran, 2016). The epistatic matrices were computed as Hadamard products (i.e., cell-by-cell products, denoted “#”) of the following forms: (i) additive-by-additive interactions (Ga#Ga); (ii) dominance-by-dominance interactions (Gd#Gd); and (iii) additive-by-dominance interactions (Ga#Gd) (Muñoz et al., 2014). In total, six GBLUP models were used in this study (Table 1). The models were fitted with the “BGLR” package in R (Pérez and De Los Campos, 2014). The extended GBLUP models can be described as
Model(G_A): $y = 1_n\mu + G_a u_a + \epsilon$
Model(G_D): $y = 1_n\mu + G_d u_d + \epsilon$
Model(G_A_D): $y = 1_n\mu + G_a u_a + G_d u_d + \epsilon$
Model(G_A_AA): $y = 1_n\mu + G_a u_a + G_{aa} u_{aa} + \epsilon$
Model(G_A_AD): $y = 1_n\mu + G_a u_a + G_{ad} u_{ad} + \epsilon$
Model(G_A_DD): $y = 1_n\mu + G_a u_a + G_{dd} u_{dd} + \epsilon$
Model(G_A_D_E): $y = 1_n\mu + G_a u_a + G_d u_d + G_{aa} u_{aa} + G_{ad} u_{ad} + G_{dd} u_{dd} + \epsilon$
where $y$ is the vector of phenotypic data; $1_n$ is the n-dimensional vector of ones; $\mu$ is the overall mean; $u_a$, $u_d$, $u_{aa}$, $u_{ad}$, $u_{dd}$ are the vectors of random effects for additive, dominant, additive-by-additive, additive-by-dominance and dominance-by-dominance effects, assumed to follow the normal distributions $N(0, \sigma_a^2)$, $N(0, \sigma_d^2)$, $N(0, \sigma_{aa}^2)$, $N(0, \sigma_{ad}^2)$ and $N(0, \sigma_{dd}^2)$, respectively; $G_a$, $G_d$, $G_{aa}$, $G_{ad}$ and $G_{dd}$ are the genomic relationship matrices corresponding to additive, dominance, additive-by-additive, additive-by-dominance and dominance-by-dominance genotypic values, respectively.
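To make the construction of the relationship matrices concrete, the sketch below builds additive and dominance matrices from a 0/1/2 genotype matrix and forms the epistatic matrices as Hadamard products. The centred cross-product forms used for Ga and Gd are common parameterisations chosen here as an assumption; they are not necessarily what the “sommer” package computes internally. All function names and the random toy genotypes are hypothetical.

```python
import numpy as np


def additive_matrix(m):
    """Additive relationship matrix from genotypes coded 0/1/2 (assumed form)."""
    m = np.asarray(m, dtype=float)
    p = m.mean(axis=0) / 2.0                 # allele frequency per SNP
    z = m - 2.0 * p                          # centre each SNP column
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))


def dominance_matrix(m):
    """Dominance relationship matrix from heterozygosity indicators (assumed form)."""
    m = np.asarray(m, dtype=float)
    p = m.mean(axis=0) / 2.0
    het_expected = 2.0 * p * (1.0 - p)       # expected heterozygosity per SNP
    w = (m == 1).astype(float) - het_expected
    return w @ w.T / np.sum(het_expected * (1.0 - het_expected))


def hadamard(a, b):
    """Cell-by-cell product, written '#' in the text."""
    return a * b


rng = np.random.default_rng(0)
toy_genotypes = rng.integers(0, 3, size=(6, 50))   # 6 hybrids, 50 SNPs
g_a = additive_matrix(toy_genotypes)
g_d = dominance_matrix(toy_genotypes)
g_aa = hadamard(g_a, g_a)   # additive-by-additive
g_ad = hadamard(g_a, g_d)   # additive-by-dominance
g_dd = hadamard(g_d, g_d)   # dominance-by-dominance
```

Each resulting matrix is n × n, so any combination of them can enter a multi-kernel model such as G_A_AA as a separate random term.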
We also tested marker-based prediction models, including RRBLUP (Whittaker et al., 2000), BRR (Pérez and De Los Campos, 2014), BL (Park and Casella, 2008), and BayesA, BayesB and BayesC (Meuwissen et al., 2001b). The RRBLUP method is based on a restricted maximum likelihood (REML) approach to ridge regression; we fitted it with the R package “rrBLUP” (Endelman, 2011). We also fitted Bayesian models with different prior densities, i.e., Gaussian (BRR), double exponential (BL), scaled-t (BayesA), scaled-t mixture (BayesB), and Gaussian mixture (BayesC), using the BGLR package (Pérez and De Los Campos, 2014). The basic model is

$$y = 1_n\mu + Z\alpha + \epsilon$$

where $y$ is the vector of phenotypes; $1_n$ is the n-dimensional vector of ones; $\mu$ is the overall mean; $\alpha$ is a vector of random regression coefficients of all the marker effects; $Z$ is a genotypic matrix for markers; and $\epsilon$ is a vector of residuals. The alternative methods discussed here differ primarily in the prior used for $\alpha$. For RRBLUP, $\alpha \sim N(0, \sigma_\alpha^2)$, where $\sigma_\alpha^2$ has a scaled inverse chi-square distribution. For BayesA, the unconditional distributions of the marker effects follow identical and independent univariate t distributions, each with mean zero. BayesB employs a mixture distribution that includes a point mass at zero and a univariate scaled t distribution. BayesC assumes that each marker effect is zero with probability π and follows a univariate normal distribution with probability (1 − π), with mean zero and variance $\sigma_\alpha^2$, which has a scaled inverse chi-square distribution.
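The marker-effect model y = 1μ + Zα + ϵ with a Gaussian prior on α corresponds to ridge regression. The sketch below solves it for a fixed shrinkage parameter; this is a simplification, since in RRBLUP the shrinkage is determined by variance components estimated with REML, whereas here `lam` is simply supplied by the user. The function names and the simulated data are hypothetical.

```python
import numpy as np


def fit_ridge(z, y, lam):
    """Estimate the mean and ridge-shrunken marker effects.

    Uses the dual form alpha = Zc' (Zc Zc' + lam I)^-1 (y - mu), which is
    convenient when markers outnumber individuals.
    """
    z = np.asarray(z, dtype=float)
    y = np.asarray(y, dtype=float)
    mu = y.mean()
    col_means = z.mean(axis=0)
    zc = z - col_means                        # centre marker columns
    n = zc.shape[0]
    alpha = zc.T @ np.linalg.solve(zc @ zc.T + lam * np.eye(n), y - mu)
    return mu, alpha, col_means


def predict_ridge(z_new, mu, alpha, col_means):
    """Predicted genetic values for new genotypes."""
    return mu + (np.asarray(z_new, dtype=float) - col_means) @ alpha


rng = np.random.default_rng(1)
z_toy = rng.integers(0, 3, size=(20, 8)).astype(float)   # 20 hybrids, 8 SNPs
y_toy = rng.normal(size=20)                               # simulated phenotypes
mu_hat, alpha_hat, means = fit_ridge(z_toy, y_toy, lam=2.0)
fitted = predict_ridge(z_toy, mu_hat, alpha_hat, means)
```

The Bayesian alternatives in the text keep the same linear predictor and change only the prior on the marker effects, which is what allows BayesA and BayesB to shrink small effects more strongly than large ones.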
To further improve the prediction ability, we added QTL to the mixed linear model as fixed factors. Two representative models were selected, namely G_A and BayesB. The significant markers detected by the additive GWAS and by the dominant GWAS were added to the models separately or together, giving G_A_qa (G_A with additive GWAS SNPs), G_A_qd (G_A with dominant GWAS SNPs), G_A_qad (G_A with additive and dominant GWAS SNPs), BayesB_qa (BayesB with additive GWAS SNPs), BayesB_qd (BayesB with dominant GWAS SNPs), and BayesB_qad (BayesB with additive and dominant GWAS SNPs). When the prediction was performed with additive QTL, homozygous genotypes with recessive allele combinations were coded as 0, homozygous genotypes with dominant allele combinations were coded as 2, and heterozygous genotypes were coded as 1. When the prediction was performed with dominant QTL, both types of homozygous genotypes were coded as 0 and heterozygous genotypes were coded as 1. The models can be described as
Model(G_A_qa): $y = X_{QTLa}\beta_a + G_a u_a + \epsilon$
Model(G_A_qd): $y = X_{QTLd}\beta_d + G_a u_a + \epsilon$
Model(G_A_qad): $y = X_{QTLad}\beta_{ad} + G_a u_a + \epsilon$
Model(BayesB_qa): $y = X_{QTLa}\beta_a + Z\alpha_a + \epsilon$
Model(BayesB_qd): $y = X_{QTLd}\beta_d + Z\alpha_d + \epsilon$
Model(BayesB_qad): $y = X_{QTLad}\beta_{ad} + Z\alpha_{ad} + \epsilon$
where $y$ is the vector of phenotypes; $X_{QTLa}$, $X_{QTLd}$ and $X_{QTLad}$ are incidence matrices of the markers detected by additive GWAS, by dominant GWAS, and by both, respectively; $\beta_a$, $\beta_d$ and $\beta_{ad}$ are vectors of fixed effects for $X_{QTLa}$, $X_{QTLd}$ and $X_{QTLad}$, respectively; $G_a$ is the genomic relationship matrix corresponding to additive genotypic values; $Z$ is a genotypic matrix for all markers; $\alpha_a$, $\alpha_d$ and $\alpha_{ad}$ are vectors of random regression coefficients of all the marker effects in the respective models; and $\epsilon$ is a vector of residuals.
In this study, we used five-fold cross-validation to assess the predictive ability of the tested GP models. Prediction accuracy was quantified in two ways: 1) the Pearson correlation between the input trait values and the genomic estimated breeding values (GEBVs) predicted by a given GP model in the test set, and 2) the number of accessions shared between the top 20% selected by GEBVs and the top 20% selected by true values, divided by the total number of accessions in the test set. The process was repeated 100 times to reduce the influence of random partitioning on the estimates.
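The two accuracy measures and the fold assignment can be sketched as follows. The function names are hypothetical. One point is an assumption that should be checked against the original analysis: the overlap is normalised here by the number of selected accessions (the top 20%), because an overlap divided by the full test-set size could not exceed 0.2, whereas the reported top-selection accuracies are around 0.5. Higher SCRRS means higher resistance, so the "top" accessions are those with the largest values.

```python
import numpy as np


def pearson_accuracy(observed, predicted):
    """Pearson correlation between observed values and GEBVs in the test set."""
    return float(np.corrcoef(observed, predicted)[0, 1])


def top_selection_accuracy(observed, predicted, fraction=0.2):
    """Share of the top-`fraction` accessions by GEBV that are also top by observed value.

    Normalising by the number selected is an assumption (see lead-in).
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    k = max(1, int(round(fraction * len(observed))))
    top_obs = set(np.argsort(observed)[-k:].tolist())
    top_pred = set(np.argsort(predicted)[-k:].tolist())
    return len(top_obs & top_pred) / k


def five_fold_indices(n, seed):
    """Randomly split indices 0..n-1 into five folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n), 5)


# Repeating with different seeds mimics the 100 repetitions in the text
folds = five_fold_indices(903, seed=0)
obs = np.arange(10.0)
perfect = top_selection_accuracy(obs, obs)
reversed_rank = top_selection_accuracy(obs, obs[::-1])
```

In a full run, each fold in turn would serve as the test set, a model would be trained on the other four, and both measures would be averaged over folds and repetitions.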
2.7 Statistical analysis
Data analysis was carried out with R (Version 3.6.2). Microsoft Excel for Mac (Version 16.50) was used to collate the phenotypic data. Tukey’s test and Student’s t-test were performed to assess the significance of differences between values, and P < 0.05 was considered statistically significant.
3 Results
3.1 Phenotypic variations and heritability
We evaluated the SCRRS of 384 DH lines and 903 hybrids in three and four environments, respectively. There was abundant phenotypic variation within each panel (Figures 1B, C). The descriptive statistics for each population are presented in Table 2. Among the DH founders, C229 and UH306 showed the highest and lowest SCRRSs, 5.90 and 4.89, respectively. In the DH panel, the SCRRS ranged from 4.10 to 7.06, and POP1 showed significantly higher resistance to SCR than the other populations (Tukey’s test, P < 0.05), with a mean SCRRS of 6.02. In the hybrid panel, the scores ranged from 3.65 to 6.08, with a mean of 4.97. The most resistant subgroup was POP2/C229, with a mean SCRRS of 5.39. Because the DHs and hybrids were planted at different locations, we did not compare the two panels. The broad-sense heritability (H²) was 0.64 in the DH panel and 0.54 in the hybrid panel, suggesting that the phenotypic variation in the two panels was genetically controlled.
Table 2 Descriptive statistics, variance components, and broad-sense heritability (H2) of southern corn rust resistance.
3.2 Genotype and population structure analysis
After marker quality control (see Materials and Methods), 34,037 SNP markers for the 384 DH genotypes were available for further analysis. The genotypes of the 903 hybrids were inferred from those of their parents. The molecular diversity among the DH lines and hybrids was examined by principal component analysis (Figures 2A, B). There were four subgroups in the DH panel, among which POP2 and POP3 were relatively close, possibly because they share the common parent C783 and because the other parent of POP3, EH, was closely related to UH306. In the hybrid panel, three subgroups were observed: hybrids with C116A as the tester, hybrids with Z58 as the tester, and hybrids with DH founders or J2416 as the testers. LD was estimated for the two panels using the SNPs. LD decreased rapidly with increasing physical distance between SNPs (Figure 2C), but the decay rate differed between the two panels. At r² = 0.2, the mean LD decay distance was about 20 Mb for the DH panel and 5 Mb for the hybrid panel.
Figure 2 Analysis of genetic structure in the DH and hybrid panels. (A) the principal component analysis for the DH panel. (B) the principal component analysis for the hybrid panel. (C) Linkage disequilibrium decay in the two populations.
3.3 Genome wide association study
Three GWAS analyses were performed using the FarmCPU method: additive GWAS in the DH panel, additive GWAS in the hybrid panel, and dominant GWAS in the hybrid panel. The quantile–quantile (q–q) plots implied that population structure and family relatedness were well controlled in the three analyses (Figures 3B, D, F). In the DH panel, one SNP (AX-107958879) on chromosome 10 was significantly associated with SCRRS at P < 1.47 × 10⁻⁶, with an effect of -0.25 (Figure 3A and Table 3). LD analysis suggested a candidate region of 1,150,363–3,990,150 bp, which overlapped the previously reported genes RppC and RppK (Supplementary Figure 1). For the additive GWAS in hybrids, three significant SNPs (AX-90698604, AX-108029030, AX-108089672) were detected, with -log10(P) ranging from 6.30 to 40.61. These SNPs were located on chromosomes 1, 8 and 10, with the candidate regions Chr1: 181,330,348-188,255,567, Chr8: 13,140,413-18,429,572, and Chr10: 2,656,837-4,990,741, respectively (Figure 3C and Table 3). Their effects were 0.24, 0.22 and -0.54, respectively. For the dominant GWAS in the hybrid panel, three SNPs (AX-107981937, AX-108109448, AX-108089672) on chromosomes 7, 8 and 10 were significantly associated, with -log10(P) ranging from 7.17 to 37.12 and effects of -0.16, 0.14 and -0.5, respectively (Figure 3E and Table 3). Their candidate regions were Chr7: 13,581,102-23,774,017, Chr8: 167,766,262-168,856,337, and Chr10: 2,656,837-4,990,741. LD analysis indicated that the QTL on chromosome 10 detected in the three GWAS analyses mark the same region (Supplementary Figure 1).
Figure 3 Genome-wide association study Manhattan and quantile–quantile (q–q) plots for Southern Corn Rust (SCR) resistance. (A, C, E) Manhattan plots for SCR resistance in additive GWAS in the DH panel, and additive and dominant GWAS in the hybrid panel, respectively. The dashed line corresponds to the threshold of P = 1.47 × 10⁻⁶ defined by the Bonferroni correction. (B, D, F) q–q plots for SCR resistance in additive GWAS in the DH panel, and additive and dominant GWAS in the hybrid panel, respectively.
3.4 Genomic prediction with different marker densities and training population sizes
The effect of marker density and training population size on the GP accuracy is shown in Figure 4. For marker density, the prediction accuracy increased as the number of markers increased. The prediction accuracy increased rapidly when the number of markers increased from 10 to 5,000. Then, the prediction accuracy increased slightly when the number of markers kept increasing. For training population size, prediction accuracy increased as the size increased, and no slowdown in the growth rate was observed.
Figure 4 Genomic prediction study in hybrid panel with different SNP numbers (A) and training population size (B) for Southern Corn Rust (SCR) resistance.
3.5 Genomic prediction with extended GBLUP models
To meet the breeding need for selecting SCR-resistant hybrids, different GP methods were tested to improve the prediction accuracy. First, six extended GBLUP models with combinations of additive, dominant and epistatic matrices were tested (Figure 5 and Supplementary Table 1). For test set correlation, the G_A model, which used only the additive matrix, was significantly better than the G_D model, which used only the dominant matrix, with accuracies of 0.60 and 0.57, respectively. The G_A_D_E model was also less accurate than G_A, with a mean accuracy of 0.59, although the difference was not significant. The accuracy of the G_A_AA model was higher than that of the G_A model, suggesting that the epistatic effect was beneficial to GP in this study. The other models (G_A_D, G_A_AD, G_A_DD) performed as well as or slightly better than G_A, with accuracies of 0.60, 0.61 and 0.60, respectively. For top selection, the overall accuracy was lower than the test set correlation. The two accuracy measures were significantly correlated, with R = 0.77 (P < 0.05). Interestingly, G_A was better than the other models for top selection, which differs from previous reports (Muñoz et al., 2014). The top-selection accuracy of the six models ranged from 0.45 to 0.48, indicating that further improvement is needed.
Figure 5 Genomic prediction study in the hybrid panel with extended GBLUP models for Southern Corn Rust (SCR) resistance. The left panel shows the prediction accuracy for the test set and the right panel shows the accuracy for top selection.
3.6 Genomic prediction with marker-effect-based models
Given that our hybrid panel had several significant resistance QTL, six marker-based prediction models were then tested (Figure 6 and Supplementary Table 1). RRBLUP, BRR, BL and BayesC performed at the same level, with an accuracy of 0.60 for test set correlation. BayesA and BayesB were significantly better than the other models, with an accuracy of 0.65. The top selection accuracy showed the same trend: the accuracies of BayesA and BayesB were 0.53 and 0.52, respectively, which were significantly higher than those of the other models. The correlation between the two accuracy measures (R = 0.98) was also stronger than that observed for the extended GBLUP models.
Figure 6 Genomic prediction study in the hybrid panel with marker-based models for Southern Corn Rust (SCR) resistance. The left panel shows the prediction accuracy for the test set and the right panel shows the accuracy for top selection.
3.7 Genomic prediction with QTL results
Two representative methods (G_A, BayesB) were selected to test the effect of adding QTL as fixed factors (Figure 7 and Supplementary Table 1). For both models, the test set correlations were significantly improved whether additive GWAS QTL, dominant GWAS QTL, or both were added. The test set correlations for G_A, G_A_qa, G_A_qd, G_A_qad, BayesB, BayesB_qa, BayesB_qd and BayesB_qad were 0.60, 0.66, 0.67, 0.67, 0.65, 0.66, 0.67 and 0.66, respectively. For top selection with the G_A models, adding QTL significantly improved the accuracy, and the G_A_qa model performed best, with an accuracy of 0.55. In contrast, adding QTL did not significantly change the top selection accuracy of the BayesB model: BayesB_qa (0.53) improved slightly, whereas BayesB_qd (0.51) and BayesB_qad (0.51) decreased slightly compared with BayesB (0.52).
Figure 7 Genomic prediction study in the hybrid panel with QTL added as fixed factors to the G_A and BayesB models for Southern Corn Rust (SCR) resistance. The left panel shows the prediction accuracy for the test set and the right panel shows the accuracy for top selection.
4 Discussion
SCR is a major, widespread disease of maize that can cause large yield losses and is expanding across a wider geographical range (Sun et al., 2021). It is therefore important to understand the genetic basis of rust resistance and to develop appropriate selection strategies for breeding. DH technology can shorten the breeding cycle (Ren et al., 2017), so it is popular in modern maize breeding programs. Moreover, because it yields homozygous populations quickly, it is also widely used in genetic research (Wang et al., 2012; Shen et al., 2018). Here, we phenotyped SCR resistance in 384 DH lines and 903 testcross hybrids in multi-environment trials. The wide distribution of SCRRS in the populations indicated that quantitative genes still play an important role (Figure 1B). Comparing the 12 subgroups of the hybrid panel, the SCR resistance of hybrids with C229 as the tester was significantly higher than that of hybrids with Z58 or J2416 as the tester (Table 2), which is expected because the tester contributes 50% of the genome of each hybrid. This result highlights the importance of the tester in DH-based hybrid breeding: an excellent tester can significantly alter the phenotypic outcome. The heritabilities of the DH and hybrid populations were moderate (Table 2), suggesting that SCR resistance was affected by the environment, so the selection of resistant varieties may need to take regional adaptability into account.
Unlike GWAS based on natural lines, our analysis used four bi-parental DH populations and their testcross hybrids. Previous studies have shown that the genetic background of a homozygous GWAS population assembled from multiple artificial populations is more controllable, as confirmed in the NAM population (Tian et al., 2011; Wu et al., 2016). This approach is more powerful than linkage mapping; however, it also suffers from tight linkage between SNPs, which limits mapping resolution. In our study, obvious population structure was observed in the PCA of genotypes (Figures 2A, B), but no overfitting was found in GWAS after controlling for genetic background (Figures 3B, D, F), indicating that these populations are suitable for GWAS. At the same time, LD decayed faster in the hybrid panel (Figure 2C), suggesting that, for the DH panel, further recombination of genetic backgrounds through testcrossing could improve the accuracy of GWAS.
Although previous studies found that dominant Rp genes play the main role in SCR resistance, there is evidence that quantitative genes also contribute (Souza Camacho et al., 2019). Five significant loci were detected in our research (Figures 3A, C, E and Table 3). Only one candidate interval was detected in the DH panel, fewer than in the hybrid panel, which may be due to the larger size of the hybrid panel or to the richer genetic background introduced by the testers. In the hybrid panel, three candidate loci were detected by association analysis with each of the additive and dominant codings (Figures 3C, E and Table 3), and two of them differed between the codings, suggesting that the dominant effect is also very important in breeding rust-resistant hybrids. In all association analyses of the two panels, a highly significant locus was detected on chromosome 10, tightly linked with the known SCR resistance gene RppC (Deng et al., 2022) and other reported QTL, including RppQ (Chen et al., 2004), RppD (Zhang et al., 2009), RppS/RppK (Wu et al., 2015; Chen et al., 2022), and RppM (Wang et al., 2020). The stability and large effect of this locus suggest that marker-assisted selection (MAS) can be used to fix this region in germplasm during breeding. In addition, four minor-effect loci were detected in the hybrid panel, and none overlapped with known candidate loci, indicating that further fine mapping and functional research are needed. Our GWAS analysis adds to the genetic dissection of SCR resistance and indicates that many potential SCR resistance genes exist in different maize germplasm backgrounds.
In recent years, GP has become a commonly used method to reduce costs and workload in plant and animal breeding programs; breeding efficiency can be improved further when GP is combined with DH technology (Fu et al., 2022). However, experience and reference data are lacking for the selection of SCR-resistant hybrids with GP. We performed GP analysis on the hybrid panel to explore the prediction accuracy under different GP models. GBLUP, a classical GP model, is based on the genetic relationship matrix (Crossa et al., 2017). In hybrid populations, additive, dominant and epistatic effects exist simultaneously. We fitted extended GBLUP models and found that the additive-by-additive matrix significantly improved prediction performance (Figure 5), suggesting that epistatic effects play a role in maize SCR resistance. Prediction with the dominant effect matrix alone was relatively poor, indicating that GBLUP using only the dominant matrix has low application value. Notably, when all matrices were included in the model, prediction ability was poor, indicating that redundant matrices reduce prediction accuracy.
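To make the relationship matrices in the extended GBLUP models concrete, the sketch below builds additive, dominant, and additive-by-additive matrices from a genotype matrix. It uses one common parameterization (a VanRaden-type additive matrix, a heterozygosity-based dominant matrix, and the Hadamard product for additive-by-additive epistasis); the exact construction used in this study is not restated in this section, so the formulas, the function names, and the 0/1/2 coding are assumptions of the example.

```python
import numpy as np

def additive_matrix(M):
    """Additive genomic relationship matrix from allele counts coded 0/1/2."""
    p = M.mean(axis=0) / 2.0                     # allele frequency per marker
    Z = M - 2.0 * p                              # centre by expected allele count
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def dominance_matrix(M):
    """Dominant relationship matrix based on heterozygote indicators."""
    p = M.mean(axis=0) / 2.0
    het = 2.0 * p * (1.0 - p)                    # expected heterozygosity
    W = (M == 1).astype(float) - het             # 1 - 2pq for hets, -2pq otherwise
    return W @ W.T / np.sum(het * (1.0 - het))

def additive_by_additive(G):
    """Epistatic (A x A) matrix as the Hadamard product G * G, scaled to mean diagonal 1."""
    K = G * G
    return K / (np.trace(K) / K.shape[0])

# Hypothetical genotypes for 30 hybrids and 200 markers.
rng = np.random.default_rng(0)
M = rng.integers(0, 3, size=(30, 200)).astype(float)
G_A = additive_matrix(M)
G_D = dominance_matrix(M)
G_AA = additive_by_additive(G_A)
```

Each matrix would enter the mixed model as the covariance structure of a separate random effect, which is why adding a matrix that carries little independent information can cost accuracy rather than add it.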
Since the inheritance of plant resistance seems to be controlled by dominant genes, GP models based on the genetic relationship matrix may have limited predictive power. We tested marker-based GP models and found that most of them, including RRBLUP, BRR, BL and BayesC, had predictive power comparable to G_A (Figure 6). However, the prediction ability of BayesA and BayesB was 8% higher than that of the other models, which may be due to the different prior densities of these Bayesian models. This difference provides a reference for the prediction of SCR resistance and indicates that the choice of GP model significantly affects prediction power.
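As an illustration of how a marker-based model is evaluated, the sketch below fits ridge regression on markers (the shrinkage idea behind RRBLUP) and estimates prediction accuracy as the mean Pearson correlation between predicted and observed values over cross-validation folds. It is a simplified stand-in: the penalty `lam` is fixed by hand rather than estimated from variance components, the Bayesian models are not reproduced, and the function names and fold count are assumptions of the example.

```python
import numpy as np

def ridge_marker_effects(X, y, lam):
    """Ridge estimates of marker effects, solved in the dual (lines x lines) form."""
    mu_x, mu_y = X.mean(axis=0), y.mean()
    Xc = X - mu_x
    n = Xc.shape[0]
    # beta = Xc' (Xc Xc' + lam I)^-1 (y - mean); cheap when markers outnumber lines
    alpha = np.linalg.solve(Xc @ Xc.T + lam * np.eye(n), y - mu_y)
    return Xc.T @ alpha, mu_x, mu_y

def cv_accuracy(X, y, lam, k=5, seed=0):
    """Mean Pearson correlation between predictions and observations over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accuracies = []
    for test in folds:
        train = np.setdiff1d(np.arange(len(y)), test)
        beta, mu_x, mu_y = ridge_marker_effects(X[train], y[train], lam)
        pred = (X[test] - mu_x) @ beta + mu_y
        accuracies.append(np.corrcoef(pred, y[test])[0, 1])
    return float(np.mean(accuracies))
```

Ridge regression shrinks all marker effects equally, whereas BayesA and BayesB use priors that allow a few markers to keep large effects; that difference in priors is the contrast the paragraph above refers to.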
Based on the GWAS and GP results, we further added the candidate loci from the association analysis to the prediction model as fixed factors (Figure 7). The prediction accuracies of the G_A and BayesB models improved significantly, indicating that QTL information can substantially increase prediction accuracy, which is consistent with previous studies (Jiao et al., 2020). For the G_A model in particular, prediction accuracy improved by 11.67% after all QTL information was added, which may be because the G_A_qad model accounts for the large effect of the QTL on the phenotype. In breeding applications, prediction ability can therefore be improved by adding known QTL to GP models. In addition, implementing GP in hybrids is more complex than in homozygous populations, and it may be more efficient to explore a combination of multiple approaches.
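The idea of treating QTL as fixed factors can be sketched with a mixed model in which the QTL genotypes enter as fixed-effect covariates and the remaining genetic background is modeled through the relationship matrix. This is a minimal illustration under stated assumptions: the function name `gblup_with_fixed_qtl` is hypothetical, and the ratio of residual to genetic variance `lam` is supplied by hand, whereas in practice it would be estimated (for example by REML).

```python
import numpy as np

def gblup_with_fixed_qtl(G, Q, y, train, test, lam=1.0):
    """Predict phenotypes of test individuals from y = X b + g + e.

    G     : relationship matrix for all individuals (g ~ N(0, G * var_g))
    Q     : individuals x QTL matrix of genotype codes, fitted as fixed effects
    lam   : assumed ratio var_e / var_g (not estimated in this sketch)
    """
    X = np.column_stack([np.ones(len(y)), Q])    # intercept plus QTL covariates
    Xt, yt = X[train], y[train]
    V = G[np.ix_(train, train)] + lam * np.eye(len(train))
    # Generalized least squares estimate of the fixed effects
    b = np.linalg.solve(Xt.T @ np.linalg.solve(V, Xt), Xt.T @ np.linalg.solve(V, yt))
    # BLUP of the polygenic values of the test individuals
    g_test = G[np.ix_(test, train)] @ np.linalg.solve(V, yt - Xt @ b)
    return X[test] @ b + g_test
```

Passing an empty QTL matrix (zero columns) reduces this to plain GBLUP, so the same function can be used to compare accuracy with and without the QTL. As an arithmetic check on the reported gain, and assuming the G_A baseline was 0.60 (the upper end of the reported GBLUP range), a rise to 0.67 corresponds to (0.67 - 0.60) / 0.60 ≈ 11.67%.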
5 Conclusion
SCR occurs widely in maize and causes severe yield losses. Here, we developed a DH panel of 384 lines and a hybrid panel of 903 testcross hybrids. SCRRS of the accessions were collected in multi-year, multi-location field tests. Using a GWAS analytical pipeline, five QTL were detected on chromosomes 1, 7, 8, 8, and 10, with P-values ranging from 4.83×10-7 to 2.46×10-41. To improve the selection efficiency of resistant materials in breeding, several GP methods were then used to explore predictive ability for SCRRS in hybrids, including extended GBLUP with different genetic matrices, marker-based prediction models, and mixed models with QTL as fixed factors. We found that adding the additive-by-additive effect to the GBLUP model, selecting the BayesA or BayesB model, and adding QTL to the mixed linear prediction model each improved prediction performance. These results provide valuable information for understanding the genetic architecture of SCR resistance and for applying GP to SCR resistance in maize breeding.
Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.
Author contributions
JL and DC: investigation, data processing, and writing. SG, CC, YuwW, YZ, XQ, ZL, and DW: investigation, data collection and editing. WL and YuaW: managing the project, review and editing. CL and SC: managing the project, editing the manuscript, and funding acquisition. All authors contributed to the article and approved the submitted version.
Funding
This research was funded by the National Key Research and Development Program of China (2016YFD0101201), the Modern Maize Industry Technology System (CARS-02-04), National Key Research and Development Plan (2022YFD1201001), Beijing Agricultural Reform and Development Special Transfer Payment Fund from Beijing Municipal Bureau of Agriculture and Rural Affairs to SC.
Acknowledgments
We thank Beijing Tongzhou International Seed Science and Technology Co., Ltd for providing genotyping support.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2023.1109116/full#supplementary-material
Supplementary Figure 1 | Linkage disequilibrium (LD) heatmap around the SNPs derived from genome-wide association (GWAS) analysis on chromosome 10.
References
Bates, D., Mächler, M., Bolker, B., Walker, S. (2014). Fitting linear mixed-effects models using lme4. arXiv preprint arXiv 1406, 5823. doi: 10.18637/jss.v067.i01
Bradbury, P. J., Zhang, Z., Kroon, D. E., Casstevens, T. M., Ramdoss, Y., Buckler, E. S. (2007). TASSEL: Software for association mapping of complex traits in diverse samples. Bioinformatics 23, 2633–2635. doi: 10.1093/bioinformatics/btm308
Cerrudo, D., Cao, S., Yuan, Y., Martinez, C., Suarez, E. A., Babu, R., et al. (2018). Genomic selection outperforms marker assisted selection for grain yield and physiological traits in a maize doubled haploid population across water treatments. Front. Plant Sci. 9, 366. doi: 10.3389/fpls.2018.00366
Chen, C. X., Wang, Z. L., Yang, D. E., Ye, C. J., Zhao, Y. B., Jin, D. M., et al. (2004). Molecular tagging and genetic mapping of the disease resistance gene RppQ to southern corn rust. Theor. Appl. Genet. 108, 945–950. doi: 10.1007/s00122-003-1506-7
Chen, G., Zhang, B., Ding, J., Wang, H., Deng, C., Wang, J., et al. (2022). Cloning southern corn rust resistant gene RppK and its cognate gene AvrRppK from puccinia polysora. Nat. Commun. 13, 4392. doi: 10.1038/s41467-022-32026-4
Cooper, J. S., Rice, B. R., Shenstone, E. M., Lipka, A. E., Jamann, T. M. (2019). Genome-wide analysis and prediction of resistance to goss's wilt in maize. Plant Genome 12, 180045. doi: 10.3835/plantgenome2018.06.0045
Covarrubias-Pazaran, G. (2016). Genome-assisted prediction of quantitative traits using the r package sommer. PloS One 11, e0156744. doi: 10.1371/journal.pone.0156744
Crossa, J., Pérez-Rodríguez, P., Cuevas, J., Montesinos-López, O., Jarquín, D., De Los Campos, G., et al. (2017). Genomic selection in plant breeding: methods, models, and perspectives. Trends Plant Sci. 22, 961–975. doi: 10.1016/j.tplants.2017.08.011
Deng, C., Leonard, A., Cahill, J., Lv, M., Li, Y., Thatcher, S., et al. (2022). The RppC-AvrRppC NLR-effector interaction mediates the resistance to southern corn rust in maize. Mol. Plant 15, 904–912. doi: 10.1016/j.molp.2022.01.007
Duan, D.-R., He, H. (1984). Description of a rust puccinia polysora on corn in hainan island. Chen Chun Hsueh Pao Acta Mycol Sinica 2, 63–64.
Endelman, J. B. (2011). Ridge regression and other kernels for genomic selection with r package rrBLUP. Plant Genome J. 4, 250–255. doi: 10.3835/plantgenome2011.08.0024
Fu, J., Hao, Y., Li, H., Reif, J. C., Chen, S., Huang, C., et al. (2022). Integration of genomic selection with doubled-haploid evaluation in hybrid breeding: from GS 1.0 to GS 4.0 and beyond. Mol. Plant S1674-2052 (1622), 00053–00053. doi: 10.1016/j.molp.2022.02.005
Gowda, M., Das, B., Makumbi, D., Babu, R., Semagn, K., Mahuku, G., et al. (2015). Genome-wide association and genomic prediction of resistance to maize lethal necrosis disease in tropical maize germplasm. Theor. Appl. Genet. 128, 1957–1968. doi: 10.1007/s00122-015-2559-0
Guo, Z., Zou, C., Liu, X., Wang, S., Li, W.-X., Jeffers, D., et al. (2020). Complex genetic system involved in fusarium ear rot resistance in maize as revealed by GWAS, bulked sample analysis, and genomic prediction. Plant Dis. 104, 1725–1735. doi: 10.1094/PDIS-07-19-1552-RE
Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scandinavian J. Stat 6, 65–70.
Jiao, Y., Li, J., Li, W., Chen, M., Li, M., Liu, W., et al. (2020). QTL mapping and prediction of haploid Male fertility traits in maize (Zea mays l.). Plants 9, 836. doi: 10.3390/plants9070836
Kibe, M., Nyaga, C., Nair, S. K., Beyene, Y., Das, B., Bright, J. M., et al. (2020). Combination of linkage mapping, gwas, and gp to dissect the genetic basis of common rust resistance in tropical maize germplasm. Int. J. Mol. Sci. 21, 6518. doi: 10.3390/ijms21186518
Liu, X., Huang, M., Fan, B., Buckler, E. S., Zhang, Z. (2016). Iterative usage of fixed and random effect models for powerful and efficient genome-wide association studies. PloS Genet. 12, e1005767. doi: 10.1371/journal.pgen.1005767
Liu, Z.-X., Wang, S.-C., Dai, J.-R., Huang, L.-J., Cao, H.-H. (2003). Studies of genetic analysis and SSR linked marker location of gene resistance to southern rust in inbred line P25 of maize. Yi Chuan Xue Bao Acta Genetica Sin. 30, 706–710.
Lu, L., Xu, Z., Sun, S., Du, Q., Zhu, Z., Weng, J., et al. (2020). Discovery and fine mapping of qSCR6.01, a novel major QTL conferring southern rust resistance in maize. Plant Dis. 104, 1918–1924. doi: 10.1094/PDIS-01-20-0053-RE
Lv, M., Deng, C., Li, X., Zhao, X., Li, H., Li, Z., et al. (2021). Identification and fine-mapping of RppCML496, a major QTL for resistance to puccinia polysora in maize. Plant Genome 14, e20062. doi: 10.1002/tpg2.20062
Mammadov, J., Sun, X., Gao, Y., Ochsenfeld, C., Bakker, E., Ren, R., et al. (2015). Combining powers of linkage and association mapping for precise dissection of QTL controlling resistance to gray leaf spot disease in maize (Zea mays l.). BMC Genomics 16, 1–16. doi: 10.1186/s12864-015-2171-3
Mayer, M., Hölker, A. C., González-Segovia, E., Bauer, E., Presterl, T., Ouzunova, M., et al. (2020). Discovery of beneficial haplotypes for complex traits in maize landraces. Nat. Commun. 11, 1–10. doi: 10.1038/s41467-020-18683-3
Meuwissen, T. H., Hayes, B. J., Goddard, M. E. (2001). Prediction of total genetic value using genome-wide dense marker maps. Genetics 157, 1819–1829. doi: 10.1093/genetics/157.4.1819
Muñoz, P. R., Resende, M. F., Jr., Gezan, S. A., Resende, M. D. V., De Los Campos, G., Kirst, M., et al. (2014). Unraveling additive from nonadditive effects using genomic relationship matrices. Genetics 198, 1759–1768. doi: 10.1534/genetics.114.171322
Orian, G. (1954). Occurrence of puccinia polysora Underwood in the Indian ocean area. Nature 173, 505–505. doi: 10.1038/173505a0
Park, T., Casella, G. (2008). The bayesian lasso. J. Am. Stat. Assoc. 103, 681–686. doi: 10.1198/016214508000000337
Pérez, P., De Los Campos, G. (2014). Genome-wide regression and prediction with the BGLR statistical package. Genetics 198, 483–495. doi: 10.1534/genetics.114.164442
Porebski, S., Bailey, L. G., Baum, B. R. (1997). Modification of a CTAB DNA extraction protocol for plants containing high polysaccharide and polyphenol components. Plant Mol. Biol. Rep. 15, 8–15. doi: 10.1007/BF02772108
Ramirez-Cabral, N. Y. Z., Kumar, L., Shabani, F. (2017). Global risk levels for corn rusts (Puccinia sorghi and puccinia polysora) under climate change projections. J. Phytopathol. 165, 563–574. doi: 10.1111/jph.12593
Ren, J., Li, Z., Wu, P., Zhang, A., Liu, Y., Hu, G., et al. (2021). Genetic dissection of quantitative resistance to common rust (Puccinia sorghi) in tropical maize (Zea mays l.) by combined genome-wide association study, linkage mapping, and genomic prediction. Front. Plant Sci. 12, 692205. doi: 10.21203/rs.3.rs-126178/v1
Ren, J., Wu, P., Trampe, B., Tian, X., Lubberstedt, T., Chen, S. (2017). Novel technologies in doubled haploid line development. Plant Biotechnol. J. 15, 1361–1370. doi: 10.1111/pbi.12805
Rhind, D., Waterston, J., Deighton, F. (1952). Occurrence of puccinia polysora underw. in west Africa. Nature 169, 631–631. doi: 10.1038/169631a0
Ryland, A., Storey, H. (1955). Physiological races of puccinia polysora underw. Nature 176, 655–656. doi: 10.1038/176655b0
Shen, Y., Yang, Y., Xu, E., Ge, X., Xiang, Y., Li, Z. (2018). Novel and major QTL for branch angle detected by using DH population from an exotic introgression in rapeseed (Brassica napus l.). Theor. Appl. Genet. 131, 67–78. doi: 10.1007/s00122-017-2986-1
Souza Camacho, L. R., Coan, M. M. D., Scapim, C. A., Barth Pinto, R. J., Tessmann, D. J., Contreras-Soto, R. I., et al. (2019). A genome-wide association study for partial resistance to southern corn rust in tropical maize. Plant Breed. 138, 770–780. doi: 10.1111/pbr.12718
Storey, H., Howland, A. K. (1957). Resistance in maize to the tropical American rust fungus, puccinia polysora underw. Heredity 11, 289–301. doi: 10.1038/hdy.1957.26
Storey, H., Howland, A. K. (1959). Resistance in maize to the tropical American rust fungus, puccinia polysora. Heredity 13, 61–65. doi: 10.1038/hdy.1959.4
Storey, H., Howland, A. K. (1967). Resistance in maize to a third East African race of puccinia polysora underw. Ann. Appl. Biol. 60, 297–303. doi: 10.1111/j.1744-7348.1967.tb04481.x
Sun, Q., Li, L., Guo, F., Zhang, K., Dong, J., Luo, Y., et al. (2021). Southern corn rust caused by puccinia polysora underw: a review. Phytopathol. Res. 3, 1–11. doi: 10.1186/s42483-021-00102-0
Tian, F., Bradbury, P. J., Brown, P. J., Hung, H., Sun, Q., Flint-Garcia, S., et al. (2011). Genome-wide association study of leaf architecture in the maize nested association mapping population. Nat. Genet. 43, 159–162. doi: 10.1038/ng.746
Tian, H., Yang, Y., Yi, H., Xu, L., He, H., Fan, Y., et al. (2021). New resources for genetic studies in maize (Zea mays l.): a genome-wide Maize6H-60K single nucleotide polymorphism array and its application. Plant J. 105, 1113–1122. doi: 10.1111/tpj.15089
Ullstrup, A. (1965). Inheritance and linkage of a gene determining resistance in maize to an American race of puccinia polysora. Phytopathology 55, 425–428.
Underwood, L. M. (1897). Some new fungi, chiefly from Alabama. Bull. Torrey Bot Club 24, 81–86. doi: 10.2307/2477799
Wang, X., Jin, Q., Shi, J., Wang, Z., Li, X. (2006). The status of maize diseases and the possible effect of variety resistance on disease occurrence in the future. Acta Phytopathol Sin. 1, 1–11.
Wang, M., Yan, J., Zhao, J., Song, W., Zhang, X., Xiao, Y., et al. (2012). Genome-wide association study (GWAS) of resistance to head smut in maize. Plant Sci. 196, 125–131. doi: 10.1016/j.plantsci.2012.08.004
Wang, J., Zhang, Z. (2021). GAPIT version 3: boosting power and accuracy for genomic association and prediction. Genom Proteomics Bioinf. 19, 629–640. doi: 10.1016/j.gpb.2021.08.005
Wang, S., Zhang, R., Shi, Z., Zhao, Y., Su, A., Wang, Y., et al. (2020). Identification and fine mapping of RppM, a southern corn rust resistance gene in maize. Front. Plant Sci. 11, 1057. doi: 10.3389/fpls.2020.01057
Whittaker, J. C., Thompson, R., Denham, M. C. (2000). Marker-assisted selection using ridge regression. Genet. Res. 75, 249–252. doi: 10.1017/S0016672399004462
Wu, X., Li, Y., Shi, Y., Song, Y., Zhang, D., Li, C., et al. (2016). Joint-linkage mapping and GWAS reveal extensive genetic loci that regulate male inflorescence size in maize. Plant Biotechnol. J. 14, 1551–1562. doi: 10.1111/pbi.12519
Wu, X., Li, N., Zhao, P., He, Y., Wang, S. (2015). Geographic and genetic identification of RppS, a novel locus conferring broad resistance to southern corn rust disease in China. Euphytica 205, 17–23. doi: 10.1007/s10681-015-1376-5
Yao, G.-Q., Chan, J., Cao, B., Cui, L.-G., Dou, S.-L., Han, Z.-J., et al. (2013). Mapping the maize southern rust resistance gene in inbred line CML470. J. Plant Genet. Resour. 14, 518–522.
Yuan, H., Xin, X., Li, C. (2010). Resistance comparisons to southern corn rust in different corn varieties. J. Maize Sci. 18, 107–109.
Yu, J., Buckler, E. S. (2006). Genetic association mapping and genome organization of maize. Curr. Opin. Biotechnol. 17, 155–160. doi: 10.1016/j.copbio.2006.02.003
Zhang, X. (2013). Study on the resistance of maize to northern corn leaf blight and southern corn rust. Doctorate Chin. Acad. Agric. Sci.
Zhang, Y., Xu, L., Zhang, D.-F., Dai, J.-R., Wang, S.-C. (2009). Mapping of southern corn rust-resistant genes in the W2D inbred line of maize (Zea mays l.). Mol. Breed. 25, 433–439. doi: 10.1007/s11032-009-9342-3
Zhao, M., Liu, S., Pei, Y., Jiang, X., Jaqueth, J. S., Li, B., et al. (2021). Identification of genetic loci associated with rough dwarf disease resistance in maize by integrating GWAS and linkage mapping. Plant Sci. 315, 111100. doi: 10.1016/j.plantsci.2021.111100
Keywords: maize, southern corn rust resistance, genome-wide association study, genomic prediction, models
Citation: Li J, Cheng D, Guo S, Chen C, Wang Y, Zhong Y, Qi X, Liu Z, Wang D, Wang Y, Liu W, Liu C and Chen S (2023) Genome-wide association and genomic prediction for resistance to southern corn rust in DH and testcross populations. Front. Plant Sci. 14:1109116. doi: 10.3389/fpls.2023.1109116
Received: 27 November 2022; Accepted: 13 January 2023;
Published: 26 January 2023.
Edited by:
Hongxiang Ma, Yangzhou University, China
Reviewed by:
Francisco E. Gomez, Michigan State University, United States
Hossein Sabouri, Gonbad Kavous University, Iran
Copyright © 2023 Li, Cheng, Guo, Chen, Wang, Zhong, Qi, Liu, Wang, Wang, Liu, Liu and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Chenxu Liu, liucx@cau.edu.cn; Shaojiang Chen, chen368@126.com
†These authors have contributed equally to this work