- 1School of Life Science, Liaoning University, Shenyang, China
- 2School of Life Science and Bioengineering, Shenyang University, Shenyang, China
Climate can shape plant genetic diversity and genetic structure, and genetic diversity and genetic structure can reflect the adaptation of plants to climate change. We used rbcl and trnL-trnF sequences to analyze the genetic diversity and genetic structure of C. squarrosa under the influence of different environmental factors in Inner Mongolia grassland. The results showed that the genetic diversity of this species was low. (The trnL-trnF sequences have higher genetic diversity than rbcl sequences.) C. squarrosa had low genetic diversity compared to other prairie plants, but had a more pronounced genetic structure. The haplotype network diagram of the combined sequences could be divided into two categories, and the results of the NJ, MP, and ML trees also showed that the haplotypes were divided into two branches. The results of genetic structure analysis showed that that the populations located in the desert steppe fall into exactly one cluster, and the populations located in the typical steppe fall into exactly another cluster. The neutrality tests were all negative and the mismatch distribution also showed a single peak across the population, suggesting that C. squarrosa had undergone population expansion and was well adapted to the local environment. The results of the mantel test showed that climate had a greater influence on the genetic distance of C. squarrosa, with annual precipitation having a higher influence than mean annual temperature. This study provided basic genetic information on the genetic structure of C. squarrosa and contributes to the study of genetic adaptation mechanisms in grassland plants.
1 Introduction
Genetic diversity is critical to population survival and evolution, especially in changing environments (Santelices et al., 2018). Genetic diversity arises from mutations in the DNA sequence of cell lines, and from there, it reaches the level of organisms, populations, and regions (Hughes et al., 2008; Engelhardt et al., 2014). At the organism level, genetic diversity includes genetic variation at the gene level. At the population level, genetic diversity is measured by allelic and genotypic diversity, including the number of alleles and genotypes found in a population and their relative abundance. Populations with higher genotypic diversity exhibit higher productivity, higher resistance to pathogens, and a better capacity to recover after environmental disturbances (Bell, 1991). At the regional level, genetic diversity is measured by comparisons between populations, which are expected to have gone through local adaptation and thus show genetic differentiation (Kawecki and Ebert, 2004). From an ecological perspective, genetic diversity is usually considered as an advantage since there are positive correlations between genetic diversity, fitness, and population survival (Dole and Sun, 1992; Buza et al., 2000). From an evolutionary viewpoint, genetic diversity is needed for population adaptations in changing environments (Booy et al., 2000; Reed and Frankham, 2003). Previous studies have focused on the relationship between genetic diversity and evolution, mostly on small spatial scales, but the relationship between genetic diversity and the climate on large-scale gradients has been less well investigated (Lin et al., 2006).
Environmental factors and habitat heterogeneity have a great role in shaping the genetic diversity of species (Kabiel et al., 2013). Within populations, genetic diversity is considered to be essential for adaptations to environmental change (Hamasha et al., 2013), consequently, for the long-term survival of plant populations (Słomka et al., 2011). Strong environmental gradients can affect the genetic structure of plant populations and may lead to limited gene flow between populations, ultimately enhancing genetic variation between populations (Hamasha et al., 2013). The climate is one of the most important drivers of local adaptation in plant species; climate change has the potential to alter the genetic diversity of plant populations with consequences for community dynamics and ecosystem processes (Huang et al., 2016). Changes in climate affect plant phenological patterns (e.g., flowering time and seed germination) and mating systems (e.g., pollen dispersal), thereby increasing selection pressure on populations between different regions and affecting the genetic structure of populations (Hamasha et al., 2013).
Temperature and precipitation are among the main environmental factors affecting genetic differentiation and diversity among populations (Al-Gharaibeh et al., 2017). The influence of temperature on genetic variation is mainly reflected in plant phenology (Liu et al., 2020). For example, temperature changes affect the timing of plant flowering, variation in flowering phenology can potentially increase reproductive isolation and genetic differentiation within and among populations, affecting genetic differentiation within and between populations and leading to the formation of genetic structure (Luo et al., 2021). Hence, early-flowering plants are likely to mate with other early-flowering plants, while late-flowering plants are likely to mate with other late-bloomers (Pramuk and Runkle, 2005). The effects of precipitation on genetic diversity are more often seen in the mating systems of plants. Particularly for water-borne pollinator species, the pattern of variation is determined by precipitation, which in turn affects seed dispersal (Kabiel et al., 2013). Located on the northern frontier of China, the Inner Mongolia Autonomous Region has a large area and a longitude span (Cui D. et al., 2021). It is characterized by a typical mid-temperate continental monsoon climate with dramatic intra-annual temperature variation and uneven precipitation distribution (Naugžemys et al., 2022). Differences in precipitation and temperature allow for robust analysis of the composition of genetic variation in populations across large-scale gradients (Ying et al., 2020).
Chloroplast genomes are highly conserved among species and are often used to analyze interspecific genetic diversity (Shinozaki et al., 1986). CUI et al. demonstrated high genetic diversity among melon populations from the perspective of chloroplast genes (Cui H. et al., 2021). It is also demonstrated that the decrease in genetic diversity in melon had a significant effect on the variety of the chloroplast genome (Cui et al., 2020). However, there needs to be more in-depth studies on the impact of chloroplasts on genetic diversity over large-scale gradients (Leimu et al., 2006). The insertion sequences in the chloroplast trnL-trnF sequences are highly variable and informative for inter-species differentiation studies, while the rbcl sequences are relatively conserved and suitable for low-level phylogenetic analyses and population structure studies within species. The simultaneous use of these two sequences can improve the resolution and accuracy of genetic analysis. Therefore, these two fragments were selected for analysis in this paper, so as to effectively evaluate the polymorphism level and population expansion pattern of C. squarrosa populations.
Plants cannot escape environmental stress and are fully exposed to a variety of environments during their life cycle. They have developed many adaptive mechanisms to survive in a constantly changing environment. The Cleistogenes squarrosa Keng is widely distributed in the Eurasian steppe zone. In China, it is mainly found in the typical grasslands, meadows, and desert grasslands of the Inner Mongolian Plateau. It is the dominant lower-layer species in needlegrass grasslands and sheep grass grasslands. C. squarrosa is a small perennial tufted grass of the genus Cleistogenes, highly resistant to cold, with spikes around mid-July and flowering usually in early to mid-August. Temperature affects the return of C. squarrosa, while precipitation will have a facilitative effect on the tassel and flowering stages (Zhang et al., 2014). As a significantly associated species of typical grasslands in Inner Mongolia, C. squarrosa is highly adaptable to adapt environment change. In terms of morphology, when precipitation decreases, the specific leaf area of C. squarrosa decreases. Leaf nitrogen content per unit area increases significantly, indicating that C. squarrosa adapts to drought environments by changing morphological and physiological traits when precipitation changes (Zhang et al., 2019). At the same time, the root depth of C. squarrosa is around 10 cm, which enables it to cope with environments with low precipitation. Qi Yan et al. explored the response of Cleistogenes songorica to drought stress at the gene level and found that increase, miRNAs, PCgenes, and transcription factors together constitute a complex transcriptional regulatory system to adapt to water stress (Zhang et al., 2020). As a typical C4 plant, C. squarrosa is more adaptable to environmental changes than other grassland plants and can be assessed for climate change in Inner Mongolian grasslands (Zhang et al., 2014). Therefore, using C. squarrosa as experimental material to investigate the relationship between environmental change and genetic diversity is more appropriate (McRae, 2006).
The association between genetic and climatic gradients has been well-established as evidence of natural selection (Manel et al., 2010). In the present study, 20 sites of C. squarrosa in the Inner Mongolia grassland were used as experimental samples, and these sample sites showed gradients in precipitation and temperature. Studies over such environmentally heterogeneous geographic ranges are promising to reveal the influence of climatic conditions on plant genetic structure (Nevo, 2001; Hamasha et al., 2013). In this study we will illustrate the following points: (1) the effects of climate factors on the genetic diversity and genetic structure of C. squarrosa between different geographical regions; and (2) the genetic differentiation characteristics of C. squarrosa in 20 sample sites. This study revealed the information underlying the genetic diversity and structure of C. squarrosa, and provided a theoretical basis for the genetic adaptation mechanisms of C. squarrosa.
2 Materials and methods
2.1 Study area
This study was conducted along an east-west transect of 3,000 km in northern China’s arid and semi-arid grasslands, mainly in northeastern Inner Mongolia. The sample zone has a longitude range of 110.294–118.533 ºE and a latitude range of 42.623º–49.783° N. The sample zone has a typical continental climate with limited summer precipitation. Along the sample zone, mean annual precipitation (AP) increases from 139 mm in the west to 443 mm in the east, and annual mean temperature (MAT) ranges from a maximum of 6.3°C to a minimum of −0.1°C (AP and MAT data from the worldclim website) (Table 1). Thus the main vegetation types from west to east are desert steppe and typical steppe, and the main soil types are black calcareous soil, chestnut calcareous soil, gray meadow soil, and gray forest soil. Of the C. squarrosa collected in this study, populations 1 to 8 are in the desert steppes, and 9 to 20 are in the typical steppes (Figure 1). The 20 samples selected, running through the east-west sample zone of our study area and with each population being representative of its region, provide a good reflection of the influence of climatic factors on the genetic structure of populations.
2.2 Plant sampling and DNA extraction
Samples were collected from 2012 to 2015 and frozen after collection. The experimental material for this experiment was taken from the young leaves of C. squarrosa seedlings at the above sample sites and the total DNA were extracted from the chloroplasts of these young leaves by CTAB method, followed by the purity of DNA was determined by 1% agarose gel electrophoresis, and a spectrophotometer was selected to determine its concentration, and finally placed at −20°C for backup.
The trnL-trnF primers used for PCR amplification were trnL (TabC) (5-CGAAATCGGTAGACGCTACG-3′), and trnF (TabF) (5′-ATTTGAACTGGTGACACGAG-3′), the upstream primer for rbcl was (NcoI) (5′- CATGCCATGGTTATGTCAC- CACAAACAGAGA-3′), and the downstream primer was (XhoI) (5′-CCGCTCGAGTTAG- GAAAAGATTGTGCCGAG-3′). The PCR amplification system included 1 μL of template DNA, 0.5 μL of upstream primer, 0.5 μL of downstream primer, 0.5 μL (10 mmol-L-1) of dNTP, 2.5 μL of 10×buffer, 2.0 μL (25 mmol-L-1) of Mg2+, 0.2 μL (5 U-μL-1) of Taqase, 17.8 μL of ddH2O for a total of 25 μL of amplification system. The PCR amplification system was set up as follows: pre-denaturation at 95°C for 3 min; denaturation at 95°C for 30 s, annealing at 60°C for 30 s, and extension at 72°C for 30 s for 10 cycles; denaturation at 95°C for 30 s, annealing at 55°C for 30s, and extension at 72°C for 30 s. This process was carried out for 20 cycles, and finally, extension at 72°C for 6 min. The amplified products were detected by electrophoresis using a DNA sequencer and the peak electrophoresis profiles of all amplified products were obtained for subsequent analysis. Taq enzyme from Thermo Fisher, Sequencers were performed using the Thermo Fisher Applied Biosystems TM 3730XL sequencer. The above experimental conditions are the best conditions for PCR amplification after several pre-experiment. Currently, the four sequences suitable for plant genetic diversity analysis are rbcl, ITS, trnL-F, matk, of which ITS and matk are more suitable for genetic evolution analysis, and rbcl and trnL-F are more suitable for genetic evolution analysis which is more in line with the research content of this paper.
2.3 Data analysis
Peak electrophoresis profiles were converted to amplified fragments using GeneMapper, and the data were imported into CLUSTALX for sequence comparison and manual correction. Nucleotide polymorphism (Pi), the mean number of nucleotide differences (K), haplotype diversity (Hd), number of haplotypes (h), random deviation of haplotype diversity (Sh), Tajima’s D and Fu’s Fs values were calculated using DNAsp (https://doi.org/10.1093/molbev/msx248) (Liu et al., 2020). And haplotype network structure analysis by NETWORK. The variance between and within populations was estimated using molecular analysis of variance (amova) using 1,000 bootstrap. Genetic distance and Fst values between and within populations were calculated using MEGA 11 (https://doi.org/10.1093/molbev/msab120). Genetic distances were estimated using variance estimation, selected by the p-distance method. The NJ tree was obtained by heuristic search using PAUP4.0 software, with default settings selected for the multitree parameters, while 1,000 bootstrap bootstrap tests were performed. The non-coercive topological tree shapes of equally distant trees were compared using the kishino-hasegawa likelihood test, and the best topology (min-nl) was chosen. Maximum Parsimony (MP) trees are also calculated using PAUP 4.0 (https://api.semanticscholar.org/CorpusID:90795438) software, while Maximal likelihood (ML) trees are calculated using IQtree software. ArcGIS software was used to map the haplotype distribution of the populations (Peleg et al., 2008).
To determine the population structure, Structure 2.3 was used for Bayesian clustering analysis (Cui D. et al., 2021). This model has both the analytical advantages of a Bayesian model over other models and the ability to analyze the clustering of individuals in groups. The program used a mixed model and independent allele frequencies using MCMC replication. Ten independent operations were performed for each of the classifications from 1 to 19. The burn-in length was 20,000, and run length was 1,200,000 and final k values were determined based on LnP(D) and Δk in the output of structure results (Cui D. et al., 2021).
Bioclimatic variables for each sample site were taken from the worldclim website. These data are available in four spatial resolutions, ranging from 30 s (∼1 km2) to 10 min (∼340 km2) (Fick and Hijmans, 2017). Nineteen climatic variables were extracted for each site from 1970–2000 using the R package (https://doi.org/10.1016/j.envsoft.2018.09.009). Principal component analysis (PCA) was performed on the 19 bioclimatic variables using the FactoMineR (https://doi.org/10.18637/jss.v025.i01) package (Table 2). Annual precipitation and annual mean temperature were selected as the main units of analysis for the climatic variables. The mantel and partial mantel tests were performed using the vegan package in R (https://doi.org/10.1111/j.1654-1103.2003.tb02228.x) to assess. The relationship between genetic and geographic distances and environmental distances was assessed, and 9,999 permutations were set (Haider et al., 2012). Setting the number of permutations to 9,999 ensures the accuracy of our calculations.
3 Results
3.1 Genetic diversity
The selected chloroplast gene regions of C. squarrosa differed from each other, and the levels of population genetic diversity varied considerably (Table 3). The trnL-trnF sequence had a relatively high level of polymorphism with a nucleotide polymorphism (pi) of 0.00544, and an average nucleotide difference (K) of 3.50403. The rbcl sequence had the next highest level of polymorphism with a nucleotide polymorphism (pi) of 0.00117, and a mean nucleotide difference (K) of 1.52028. The lowest level of polymorphism was observed after combining fragments from both regions, with nucleotide polymorphisms and average nucleotide differences of 0.00096 and 1.24500, respectively.
Table 3. The genetic diversity and haplotype diversity analysis in C. squarrosa based on 1 non-coding fragment trnL-trnF and a conserved coding fragment rbcl.
The Tajima’s D values for both coding regions were negative and highly significantly different (P < 0.001) (Table 3), and the Tajima’s D and Fu’s Fs values for the rbcl sequence were lower than those for the trnL-trnF sequence, at −2.65547 and −8.171, respectively. The combined sequence had the smallest Tajima’s D value at −2.75509 and the Fu’s Fs value at −6.196, which were also significantly different (P < 0.001). Tajima’s D and Fu’s Fs values were lower than those of the trnL-trnF sequence, at −2.65547 and −8.171, respectively, while Tajima’s D was the smallest at −2.75509 and Fu’s Fs at −6.196, which were also significantly different (P < 0.001). It is hypothesized that C. squarrosa may have undergone population expansion or purifying selection during its evolution, following a neutral evolutionary pattern, and the genetic differentiation was all significantly different. The above negative results indicate that the population faced environmental pressures during evolution but still had partially evolved populations, suggesting that climatic factors had an effect on the genetic structure of the population. The results of the mismatch distribution showed a unimodal distribution, which also proved that the population of Cleistogenes squarrosa (Trin.) Keng experienced population expansion under selective pressure. Combined with the results of the neutrality test and the mismatch distribution, it can be seen that the population of Cleistogenes squarrosa (Trin.) Keng has experienced a relatively smooth expansion and has not undergone rapid expansion in the recent past (Figure 2).
Figure 2. Mismatch distribution established for C. squarrosa: the red line represents the expected mismatch distribution of a stationary population. Note: The green line represents the observed mismatch distribution from segregating sites of the aligned sequences of rbcl and trnL-trnF intergenic spacers of the chloroplast DNA in C. squarrosa.
All populations’ rbcl and trnL-trnF sequences were combined and analysed for genetic diversity. The nucleotide polymorphism (pi), mean nucleotide difference (K), haplotype diversity (Hd), and haplotype distribution of the various populations are shown in Table 4. There were significant differences in polymorphism between the populations after combining sequences. Populations 7 and 9 had the highest nucleotide polymorphism of 0.00775 and 0.00728, respectively, and hence the highest mean nucleotide differences of 17.286 and 16.2222 for populations 7 and 9, respectively. Populations 2, 3, 4, 5, 6, 8, 11, 12, 13, 14, and 20 all had zero nucleotide polymorphisms.
Table 4. The genetic diversity and haplotype diversity analysis of different population in C. squarrosa based on combined sequence.
The results of AMOVA molecular variation analysis showed that the genetic variation among the 19 geographic populations of C. squarrosa was 59.17%, and within each geographic population was 40.83%, indicating that variation in C. squarrosa occurs both among and within geographic population species, and the difference between the two is negligible (Table 5).
The genetic distances of the various populations of C. squarrosa varied widely (Table 6). Populations 7 and 8 had the most significant genetic distance of 0.01. Population 7 also had a relatively large genetic distance of 0.006–0.007 from populations 10, 11, 12, 13, 14, 15, 16, 18, 19, and 20. Population 9 had a genetic distance of 0.006 from populations 1, 2, 3, 4, 5, 6, and 8, with the remaining populations having a relatively small genetic distance. The first values also differed more markedly between populations. 11, 12, 13, 14, 20 populations and 2, 3, 4, 5, 6, and 8 populations all had a genetic differentiation index of 1, indicating that these populations were completely isolated. Genetic differentiation indices between the remaining populations ranged from −0.191 to 0.954 and were more or less genetically similar.
Table 6. Genetic distance and pairwise fixation indices Fst (upper diagonal) of genetic variation of different population in C. squarrosa based on combined sequence.
3.2 Genetic structure and haplotype analysis
The haplotype diversity of the rbcl and trnL-trnF sequences was calculated by DNAsp and showed that the rbcl sequence had a high haplotype diversity with 19 haplotypes detected individually with a haplotype polymorphism of 0.561. 9 haplotypes were detected in the trnL-trnF sequence with a haplotype polymorphism of 0.514. The haplotype analysis showed that 15 haplotypes were detected in the combined sequences, with a haplotype polymorphism of 0.544, a haplotype diversity variance of 0.00035, and a standard deviation of 0.019 (Table 3).
An analysis of the haplotypes of all populations in the combined sequence shows that populations 7 and 18 have the highest number of haplotypes. In terms of haplotype diversity, population 10 is the highest at 0.6. Populations 2, 3, 4, 5, 6, 8, 11, 12, 13, 14, and 20 have a single haplotype, and therefore all have zero haplotype polymorphism (Table 4).
A haplotype network structure map of merged sequences between different geographical locations of C. squarrosa was constructed by the minimum-spanning method (Figure 3). Haplotypes are denoted by Hap. The merged sequences were divided into 15 haplotypes. Hap11, Hap6, Hap13, Hap12, Hap8, Hap10, Hap7 are directly linked to Hap4, while Hap14 is linked to Hap4 via Hap7. Hap2, Hap3, Hap6, and Hap15 are directly linked to Hap1, while Hap9 is special and may be linked to Hap4 by one or several mutations. And the NJ tree, the MP tree and the ML tree show similar results.
Figure 3. The network for haplotype based on combined sequence. (Different colors represent different haplotypes).
The structure analysis suggested K = 3 as the optimal number of clusters based on the calculation of ΔK, which indicated that the 236 individuals from the 19 populations most likely belonged to three main genetic clusters (Figure 4A). Cluster I mainly included individuals from the western group, containing 1, 2, 3, 4, 5, 6, 7 and 8 populations, whereas Cluster II consisted of most individuals from the eastern group, including 9, 10, 11, 12, 13, 14, 15, 16, 18, 19 and 20 populations. However, some individuals with admixed genotypes in both groups indicated ongoing gene flow or weak genetic differentiation. There are also a very small number of individuals clustered together in a separate category (Figure 4B).
Figure 4. Bayesian model-based clustering STRUCTURE analysis of 234 individuals of C. squarrosa. (A) K values for different numbers of clusters (K); (B) Estimated population structure of 234 C. squarrosa individuals on K = 3. (In picture b, populations of the same color are clustered into one class).
To further demonstrate gene flow among different populations, we estimated frequency changes of dominant haplotypes at the combined sequence in other locations (Figure 5). The total number of haplotypes for C. squarrosa was 5 and 10 in the cluster I and cluster II groups, respectively. Among these haplotypes, Hap1 was widely distributed in cluster I, whereas Hap4 was widely distributed in cluster II, which indicates that these two haplotypes were favored in two groups. In addition, we found some distinct differences in haplotype frequency among C. squarrosa from different locations. Some environment-specific haplotypes could be found only in one location, such as Hap5, Hap2, and Hap3 in cluster I, and Hap4, Hap9, and Hap10 in cluster II. The NJ tree showed that the C. squarrosa accessions could be differentiated according to their geographic locations, including a cluster from Hap1、Hap2、Hap3、Hap5、Hap 9、Hap15, and another cluster from Hap4、Hap6、Hap7、Hap8、Hap10、Hap11、Hap12、Hap13、Hap14 (Figure 6). This clustering result is consistent with the clustering result in Figure 3.
Figure 5. A map showing the sampled populations of C. squarrosa and the distribution of haplotypes. Pie charts show the proportions of the haplotypes within each sample. Haplotypes are indicated by different colors. (The location maps for clusterⅠ and clusterⅡ are enlarged versions of the top left image).
Figure 6. Phylogenetic relationship of the haplotype. Note: Different lowercase letters showed different haplotypes. The numbers at the nodes in the figure are confidence intervals.
To further ensure the accuracy of the clustering diagram STRUCUTRE 2.3 was used for Bayesian clustering analysis and based on the calculation of ΔK, the STRUCTURE analysis showed that K = 2 was the optimal number of clusters (Supplementary Figure S1). This suggests that the 15 haplotypes may belong to two major clusters and among them haplotype 9 and haplotype 15 clustered together which is consistent with the trend of the clustering results described above and justifies the reasonableness of our clustering results.
3.3 Correlation between genetic distance, geographic distance, and climate
The results of the principal component analysis showed that PC1 (68.7%) not only divided the 11 temperature-related variables and 8 precipitation-related climatic variables into 3 clusters, except for seasonal temperature (TS), annual temperature range (TAR) and seasonal precipitation (PS), where precipitation and temperature were more clearly differentiated. Population 1 to population 9 is mainly influenced by temperature and shows positive correlation, population 10 to population 14 is mainly influenced by precipitation and shows positive correlation, population 15 to population 19 is mainly influenced by TST, AR and PS and shows positive correlation (Figure 7).
Figure 7. Principal component analysis (PCA) of environmental variables of 19 populations. (Note: MAT, Mean Annual Temperature; MDR, Mean Diurnal Range; I, Isothermality; TS, Temperature Seasonality; MTWM, Max Temperature of Warmest Month; MTCM, Min Temperature of Coldest Month; TAR, Temperature Annual Range; MTWQ, Mean Temperature of Wettest Quarter; MTDQ, Mean Temperature Driest Quarter; MTW, Mean Temperature of Warmest Quarter; MTC, Mean Temperature of Coldest Quarter; AP, Annual Precipitation; PWM, Precipitation of Wettest Month; PDM, Precipitation of Driest Month; PS, Precipitation seasonality; PWQ, Precipitation of Wettest Quarter; PDQ, Precipitation of Driest Quarter; PW, Precipitation of Warmest Quarter; PC, Precipitation of Coldest Quarter).
According to the mantel tests, both climatic and geographic distance significantly affected genetic distance. Climatic distance had a more substantial effect on genetic distance than geographic distance. In addition, there was a significant correlation between geographic distance and ecological distance (Figure 8). The pure impact of climate on genetic distance was highly substantial according to the partial mantel test. While excluding climatic factors, the pure effect of geographical distance on genetic distance was insignificant (Figure 9).
Figure 8. Plot of mantel tests for the correlation between genetic diversity parameter (Bray–Curtis distance) and explanatory distances (geographic and environmental distance) using Spearman’s coefficient.
Figure 9. Plot of partial mantel tests for the correlation between genetic diversity parameter (Bray–Curtis distance) and explanatory distances (geographic and climatic distance) using Spearman’s coefficient. (This part of the test is a correlation test after excluding the effects of some of the certain factors such as MAT, AP, Climatic factors, and Geo factors separately).
Furthermore, the mantel tests of annual precipitation and annual mean temperature showed that annual precipitation had a more significant impact on genetic distance parameters than annual mean temperature. According to the partial mantel test, the pure effect of annual precipitation and mean annual temperature on genetic distance were all significant when geography was excluded. In addition, the geographical distance was not correlated with climatic distance when the effect of mean annual temperature was not considered. In contrast, geographical and climatic distance were significantly correlated when the effect of annual precipitation was excluded. These results suggest that climate has a more significant impact on genetic distance than geographical distance does on genetic distance and that among climatic factors, annual precipitation has a more substantial effect on genetic distance than mean annual temperature.
4 Discussion
4.1 Genetic diversity
We investigated genetic diversity using sequencing data for the rbcl and trnL-trnF chloroplast genes of C. squarrosa. Tajima’s D values and mismatch distributions suggest that C. squarrosa has undergone population expansion, meaning that populations adapt well to the environment in their area. AMOVA analysis revealed some genetic variation within C. squarrosa populations, but genetic variation was mainly between populations (FST = 0.5917). But the genetic variation was mainly between populations (FST = 0.5917). Genetic differences between the populations may have been caused by the limited gene flow due to the species’ limited seed dispersal (Liu et al., 2016).
Haplotype diversity of 0.544 and nucleotide diversity of 0.00096 in C. squarrosa indicated low genetic diversity. Specific characteristics of the species influence the genetic diversity of plants. The low level of genetic diversity may be due to the reproductive behavior of C. squarrosa is a parthenogenic pollinator, with both self-pollination and open pollination present in the same single plant, which allows C. squarrosa to exhibit high levels of inbreeding and, over time, produce too many pure congeners, leading to a loss of its genetic diversity, which may have led to low levels of genetic variation in some populations (de Groot et al., 2011).
Genetic diversity varies between populations of C. squarrosa. Populations 7 and 18, which have the highest genetic diversity, contain four haplotypes, while populations 2, 3, 4, 5, 6, 8, 11, 12, 13, 14, and 20 contain only one haplotype. In some populations, low genetic diversity at the cpDNA level is always related to demographic events. The haplotype network suggests that the frequency of haplotype 9 may indicate an expansion event. Therefore, demographic events have played an essential role in shaping the genetic diversity of C. squarrosa. It has been suggested that potential refugia or centers of diversification should have high genetic diversity, while recolonized sites exhibit lower genetic diversity. Our study found the highest genetic diversity in population 7, suggesting that this population may be a refuge or center of diversification for C. squarrosa (Liu et al., 2016).
4.2 Genetic structure
STRUCTURE analysis confirmed that the populations of C. squarrosa species clustered ecogeographical, generally in line with their bioclimatic regions, because those sharing the same climate environment grouped in the same cluster, irrespective of their geographic distances (Hamasha et al., 2013). Further discussion in conjunction with the main influences derived in PCA based on precipitation and temperature changes. Typical grasslands have an annual precipitation of 200–400 mm and an average annual temperature of −1.0°C–6.1°C, while desert steppes are the driest type of steppes in the steppe to desert steppes, with an annual precipitation of 150–280 mm and an average annual temperature of 2.6°C–7.8°C (Hamasha et al., 2013). Populations living in the same climatic environment can be divided into the same clusters. The two clusters belong to different climatic zones (Wu et al., 2010). Consequently, variability in precipitation and temperature impose fluctuating selection pressures between other bioclimatic regions, influencing phenological patterns such as flowering time and timing of seed germination. This may also increase the genetic differentiation of populations in the two climatic zones (Xie et al., 1992). Such differences in reproductive phenology may lead to partial reproductive isolation, favoring population adaptation to different habitats and neutral genetic differentiation (Ping et al., 2013). Differences in such demographic processes may produce patterns of genetic variation that resemble patterns due to natural selection (Rouger and Jump, 2015). Population genetic structure developed under restricted gene flow cannot be distinguished from population structures built by spatially heterogeneous samples (Cox et al., 2016). Thus, genetic differentiation between bioclimatic regions may have been caused by local adaptation rather than solely by demographic processes (Borokini et al., 2021). Hence, the genetic structure of plant populations is not randomly distributed but is associated with ecogeographic factors (Hamasha et al., 2013).
Variations in humidity and temperature among habitats can cause deviation in flowering phenology and consequently limit gene flow, which leads to a high probability of reproductive isolation. Temperature is an essential factor influencing population differentiation and potential adaptation, and temperature can affect populations through life history traits, seedling tolerance, and temperature requirements for initial growth (Al-Gharaibeh et al., 2017). C. squarrosa is more sensitive to environmental changes, and temperature and precipitation may affect the phenology of C. squarrosa, particularly the timing of flowering, thus affecting the exchange of genes between populations. Therefore, we assume that the differences in drought and temperature between the phytogeographic regions isolated the C. squarrosa populations’ climatic and shaped their genetic structure (Zhang et al., 2014). Low rainfall resulted in limited seed dispersal around the mother plant, creating an adaptation of the genetic system to minor population conditions and a restricted gene flow rate which increases genetic differentiation among populations (Cox et al., 2016). High levels of genetic differentiation between plant populations have shaped the genetic structure of C. squarrosa (Kabiel et al., 2013).
4.3 Climatic effects
A significant positive correlation between genetic distance and geographical distance was found in our study, with the genetic distance increasing with the geographical distance from west to east (Garot et al., 2019). Wright (Wright: Genetics 28:114–138, 1943) proposed that given limits on dispersal and in the absence of selection, drift would cause populations to become more differentiated at greater distances (Sexton et al., 2014). In the study of Volis, there was no significant correlation between the genetic distance and the geographical distance if the environmental choice caused genetic differentiation (Wu et al., 2010; Al-Gharaibeh et al., 2017). In addition, C. squarrosa is a parthenogenetic pollinator and our results suggest that selfing may predominate in the mating system of C. squarrosa. In contrast to other long-lived, wind-pollinated, predominantly heterozygous gymnosperms, C. squarrosa is more limited in its ability to spread seeds and therefore has difficulty spreading to other regions and restricted gene exchange leads to increased genetic distances with increasing geographical distance (Buer et al., 2022).
Genetic distance and precipitation were significantly and positively correlated with temperature in this study (Zhao et al., 2022). They remained significantly correlated after excluding the effect of geographical distance, and genetic distance increased with increasing temperature and precipitation (Cui D. et al., 2021). Different precipitation and temperature gradients had different effects on C. squarrosa, and this effect led to a mismatch in flowering time between individual C. squarrosa populations, i.e., non-random mating occurred, which led to an increase in the genetic distance between populations (Sexton et al., 2014). In addition, different precipitation and temperature gradients exert strong differential selection pressures on C. squarrosa (Wu et al., 2010). With limited gene flow, C. squarrosa populations can develop various adaptations, leading to differences between genetic distances (Lin et al., 2006).
5 Conclusion
The degree of genetic diversity in C. squarrosa is low and structure analysis reveals the presence of two genetic clusters. We suggest that geographical distance and environmental factors may explain the genetic differentiation between these populations, that the main factors shaping the genetic structure of the populations are annual precipitation and mean annual temperature, and that differences in precipitation and temperature may affect the adaptability of C. squarrosa and the phenological period. Exploring the effects of climatic factors on the genetic variation of C. squarrosa under a large scale gradient, exploring the effects of climatic factors on the genetic variation of C. squarrosa can not only provide information on the genetic variation of C. squarrosa, but also provide a theoretical basis for the genetic adaptation mechanism of grass plants. The gene sequences measured by our experiments can be applied to the training phase of machine learning models such as multi-information fusion techniques, and effective learning models will be identified for application in drug targeting studies. (Zhao et al., 2023).
Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.
Author contributions
RS: Conceptualization, Formal analysis, Project administration, Software, Writing–original draft. XZ: Data curation, Methodology, Writing–original draft. ZZ: Investigation, Validation, Writing–review and editing. CZ: Supervision, Resources, Funding acquisition, Writing–review and editing.
Funding
The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This study was supported by the National Natural Science Foundation of China (32071860) and consent was granted for sample acquisition at the sampling site.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fbinf.2024.1454689/full#supplementary-material
SUPPLEMENTARY FIGURE S1Bayesian model-based clustering STRUCTURE analysis of 15 haplotype of the merged sequence. (A) K values for different numbers of clusters (K); (B) Estimated clustered structure of 15 haplotype on K = 2. (In picture b, haplotype of the same color are clustered into one class.
References
Al-Gharaibeh, M., Hamasha, H., Rosche, C., Lachmuth, S., Wesche, K., and Hensen, I. (2017). Environmental gradients shape the genetic structure of two medicinal Salvia species in Jordan. Plant Biol. 19 (2), 227–238. doi:10.1111/plb.12512
Bell, G. (1991). The ecology and genetics of fitness in Chlamydomonas. IV. The properties of mixtures of genotypes of the same species. Evolution 45 (4), 1036–1046. doi:10.2307/2409707
Booy, G., Hendriks, R., Smulders, M., Van Groenendael, J., and Vosman, B. (2000). Genetic diversity and the survival of populations. Plant Biol. 2 (04), 379–395. doi:10.1055/s-2000-5958
Borokini, I. T., Klingler, K. B., and Peacock, M. M. (2021). Life in the desert: the impact of geographic and environmental gradients on genetic diversity and population structure of Ivesia webberi. Ecol. Evol. 11 (23), 17537–17556. doi:10.1002/ece3.8389
Buer, H., Rula, S., Wang, Z. Y., Fang, S., and Bai, Y. e. (2022). Analysis of genetic diversity in Prunus sibirica L. in inner Mongolia using SCoT molecular markers. Genet. Resour. Crop Evol. 69 (3), 1057–1068. doi:10.1007/s10722-021-01284-4
Buza, L., Young, A., and Thrall, P. (2000). Genetic erosion, inbreeding and reduced fitness in fragmented populations of the endangered tetraploid pea Swainsona recta. Biol. Conserv. 93 (2), 177–186. doi:10.1016/s0006-3207(99)00150-0
Cox, C. B., Moore, P. D., and Ladle, R. J. (2016). Biogeography: an ecological and evolutionary approach. John Wiley and Sons.
Cui, D., Tang, C., Lu, H., Li, J., Ma, X., Han, B., et al. (2021a). Genetic differentiation and restricted gene flow in rice landraces from Yunnan, China: effects of isolation-by-distance and isolation-by-environment. Rice 14 (1), 54–14. doi:10.1186/s12284-021-00497-6
Cui, H., Ding, Z., Zhu, Q., Wu, Y., and Gao, P. (2020). Population structure and genetic diversity of watermelon (Citrullus lanatus) based on SNP of chloroplast genome. 3 Biotech. 10, 374–378. doi:10.1007/s13205-020-02372-5
Cui, H., Ding, Z., Zhu, Q., Wu, Y., Qiu, B., and Gao, P. (2021b). Population structure, genetic diversity and fruit-related traits of wild and cultivated melons based on chloroplast genome. Genet. Resour. Crop Evol. 68, 1011–1021. doi:10.1007/s10722-020-01041-z
de Groot, G. A., During, H. J., Maas, J. W., Schneider, H., Vogel, J. C., and Erkens, R. H. (2011). Use of rbcL and trnL-F as a two-locus DNA barcode for identification of NW-European ferns: an ecological perspective. PLoS one 6 (1), e16371. doi:10.1371/journal.pone.0016371
Dole, J. A., and Sun, M. (1992). Field and genetic survey of the endangered Butte County meadowfoam—limnanthes floccosa subsp. californica (Limnanthaceae). Conserv. Biol. 6 (4), 549–558. doi:10.1046/j.1523-1739.1992.06040549.x
Engelhardt, K. A., Lloyd, M. W., and Neel, M. C. (2014). Effects of genetic diversity on conservation and restoration potential at individual, population, and regional scales. Biol. Conserv. 179, 6–16. doi:10.1016/j.biocon.2014.08.011
Fick, S. E., and Hijmans, R. J. (2017). WorldClim 2: new 1-km spatial resolution climate surfaces for global land areas. Int. J. Climatol. 37 (12), 4302–4315. doi:10.1002/joc.5086
Garot, E., Joët, T., Combes, M.-C., and Lashermes, P. (2019). Genetic diversity and population divergences of an indigenous tree (Coffea mauritiana) in Reunion Island: role of climatic and geographical factors. Heredity 122 (6), 833–847. doi:10.1038/s41437-018-0168-9
Haider, S., Kueffer, C., Edwards, P. J., and Alexander, J. M. (2012). Genetically based differentiation in growth of multiple non-native plant species along a steep environmental gradient. Oecologia 170, 89–99. doi:10.1007/s00442-012-2291-2
Hamasha, H., Schmidt-Lebuhn, A., Durka, W., Schleuning, M., and Hensen, I. (2013). Bioclimatic regions influence genetic structure of four Jordanian Stipa species. Plant Biol. 15 (5), 882–891. doi:10.1111/j.1438-8677.2012.00689.x
Huang, W.-D., Zhao, X.-Y., Zhao, X., Li, Y.-L., and Pan, C.-C. (2016). Environmental determinants of genetic diversity in Caragana microphylla (Fabaceae) in northern China. Botanical J. Linn. Soc. 181 (2), 269–278. doi:10.1111/boj.12407
Hughes, A. R., Inouye, B. D., Johnson, M. T., Underwood, N., and Vellend, M. (2008). Ecological consequences of genetic diversity. Ecol. Lett. 11 (6), 609–623. doi:10.1111/j.1461-0248.2008.01179.x
Kabiel, H., Hegazy, A., Faisal, M., and Doma, E. (2013). Genetic variations within and among populations of anastatica heirochuntica at macroscale geographical range. Appl. Ecol. Environ. Res. 11 (3), 343–354. doi:10.15666/aeer/1103_343354
Kawecki, T. J., and Ebert, D. (2004). Conceptual issues in local adaptation. Ecol. Lett. 7 (12), 1225–1241. doi:10.1111/j.1461-0248.2004.00684.x
Leimu, R., Mutikainen, P., Koricheva, J., and Fischer, M. (2006). How general are positive relationships between plant population size, fitness and genetic variation? J. Ecol. 94 (5), 942–952. doi:10.1111/j.1365-2745.2006.01150.x
Lin, H., Hai-Yan, L., Xiang-Yang, Z., Feng, Y., Xiang-Dong, B., and Chong-Bo, H. (2006). Population genetic structure and genetic differentiation of Artemia parthenogenetica in China. J. Shellfish Res. 25 (3), 999–1005. doi:10.2983/0730-8000(2006)25[999:PGSAGD]2.0.CO;2
Liu, H., Zhang, R., Geng, M., Zhu, J., An, J., and Ma, J. (2016). Chloroplast analysis of Zelkova schneideriana (Ulmaceae): genetic diversity, population structure, and conservation implications. Genet. Mol. Res. 15, 1–9. doi:10.4238/gmr.15017739
Liu, X., Ma, Y., Wan, Y., Li, Z., and Ma, H. (2020). Genetic diversity of Phyllanthus emblica from two different climate type areas. Front. plant Sci. 11, 580812. doi:10.3389/fpls.2020.580812
Luo, M.-X., Lu, H.-P., Chai, M.-W., Chang, J.-T., and Liao, P.-C. (2021). Environmental heterogeneity leads to spatial differences in genetic diversity and demographic structure of Acer caudatifolium. Plants 10 (8), 1646. doi:10.3390/plants10081646
Manel, S., Joost, S., Epperson, B. K., Holderegger, R., Storfer, A., Rosenberg, M. S., et al. (2010). Perspectives on the use of landscape genetics to detect genetic adaptive variation in the field. Mol. Ecol. 19 (17), 3760–3772. doi:10.1111/j.1365-294x.2010.04717.x
Naugžemys, D., Patamsytė, J., Žilinskaitė, S., Hoshino, Y., Skridaila, A., and Žvingila, D. (2022). Genetic structure of native blue honeysuckle populations in the western and eastern Eurasian ranges. Plants 11 (11), 1480. doi:10.3390/plants11111480
Nevo, E. (2001). Evolution of genome–phenome diversity under environmental stress. Proc. Natl. Acad. Sci. 98 (11), 6233–6240. doi:10.1073/pnas.101109298
Peleg, Z., Saranga, Y., Krugman, T., Abbo, S., Nevo, E., and Fahima, T. (2008). Allelic diversity associated with aridity gradient in wild emmer wheat populations. Plant, Cell and Environ. 31 (1), 39–49. doi:10.1111/j.1365-3040.2007.01731.x
Ping, Y., Yuan, X.-p., Qun, X., Wang, C.-h., Yu, H.-y., Wang, Y.-p., et al. (2013). Genetic structure and Indica/Japonica component changes in major inbred rice varieties in China. Rice Sci. 20 (1), 39–44. doi:10.1016/s1672-6308(13)60106-7
Pramuk, L. A., and Runkle, E. S. (2005). Photosynthetic daily light integral during the seedling stage influences subsequent growth and flowering of Celosia, Impatiens, Salvia, Tagetes, and Viola. HortScience 40 (4), 1336–1339. doi:10.21273/hortsci.40.5.1336
Reed, D. H., and Frankham, R. (2003). Correlation between fitness and genetic diversity. Conserv. Biol. 17 (1), 230–237. doi:10.1046/j.1523-1739.2003.01236.x
Rouger, R., and Jump, A. (2015). Fine-scale spatial genetic structure across a strong environmental gradient in the saltmarsh plant Puccinellia maritima. Evol. Ecol. 29, 609–623. doi:10.1007/s10682-015-9767-6
Santelices, B., Gallegos Sánchez, C., and González, A. V. (2018). Intraorganismal genetic heterogeneity as a source of genetic variation in modular macroalgae. J. Phycol. 54 (6), 767–771. doi:10.1111/jpy.12784
Sexton, J. P., Hangartner, S. B., and Hoffmann, A. A. (2014). Genetic isolation by environment or distance: which pattern of gene flow is most common? Evolution 68 (1), 1–15. doi:10.1111/evo.12258
Shinozaki, K., Ohme, M., Tanaka, M., Wakasugi, T., Hayashida, N., Matsubayashi, T., et al. (1986). The complete nucleotide sequence of the tobacco chloroplast genome: its gene organization and expression. EMBO J. 5 (9), 2043–2049. doi:10.1002/j.1460-2075.1986.tb04464.x
Słomka, A., Sutkowska, A., Szczepaniak, M., Malec, P., Mitka, J., and Kuta, E. (2011). Increased genetic diversity of Viola tricolor L.(Violaceae) in metal-polluted environments. Chemosphere 83 (4), 435–442. doi:10.1016/j.chemosphere.2010.12.081
Wu, J.-b., Gao, Y.-b., Bao, X.-y., Gao, H., Jia, M.-q., Li, J., et al. (2010). Genetic variation among Stipa grandis P. Smirn populations with different durations of fencing in the Inner Mongolian Steppe. Rangel. J. 32 (4), 427–434. doi:10.1071/rj09038
Xie, C. Y., Dancik, B. P., and Yeh, F. C. (1992). Genetic structure of Thuja orientalis. Biochem. Syst. Ecol. 20 (5), 433–441. doi:10.1016/0305-1978(92)90083-p
Ying, H., Zhang, H., Zhao, J., Shan, Y., Zhang, Z., Guo, X., et al. (2020). Effects of spring and summer extreme climate events on the autumn phenology of different vegetation types of Inner Mongolia, China, from 1982 to 2015. Ecol. Indic. 111, 105974. doi:10.1016/j.ecolind.2019.105974
Zhang, H. X., Gao, Y. Z., Tasisa, B. Y., Baskin, J. M., Baskin, C. C., Lü, X. T., et al. (2019). Divergent responses to water and nitrogen addition of three perennial bunchgrass species from variously degraded typical steppe in Inner Mongolia. Sci. Total Environ. 647, 1344–1350. doi:10.1016/j.scitotenv.2018.08.025
Zhang, J. Y., Wu, F., Yan, Q., John, U. P., Cao, M., Xu, P., et al. (2020). The genome of Cleistogenes songorica provides a blueprint for functional dissection of dimorphic flower differentiation and drought adaptability. Plant Biotechnol. J. 19 (3), 532–547. doi:10.1111/pbi.13483
Zhang, Q., Ding, Y., Ma, W., Kang, S., Li, X., Niu, J., et al. (2014). Grazing primarily drives the relative abundance change of C4 plants in the typical steppe grasslands across households at a regional scale. Rangel. J. 36 (6), 565–572. doi:10.1071/rj13050
Zhao, B. W., Wang, L., Hu, P. W., Wong, L., Su, X. R., Wang, B. Q., et al. (2023). Fusing higher and lower-order biological information for drug repositioning via graph representation learning. IEEE Trans. Emerg. Top. Comput. 12 (1), 163–176. doi:10.1109/tetc.2023.3239949
Keywords: genetic diversity, genetic structure, Cleistogenes squarrosa Keng, climatic factors, geographic distance
Citation: Song R, Zhang X, Zhang Z and Zhou C (2024) Climatic factors, but not geographic distance, promote genetic structure and differentiation of Cleistogenes squarrosa (Trin.) Keng populations. Front. Bioinform. 4:1454689. doi: 10.3389/fbinf.2024.1454689
Received: 25 June 2024; Accepted: 28 October 2024;
Published: 13 November 2024.
Edited by:
Paola Bonizzoni, University of Milano-Bicocca, ItalyReviewed by:
Bo-Wei Zhao, Chinese Academy of Sciences (CAS), ChinaNadia Pisanti, University of Pisa, Italy
Copyright © 2024 Song, Zhang, Zhang and Zhou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Chan Zhou, emhvdV95YW5oYW5AMTYzLmNvbQ==