- 1Department of Plant Biology, Michigan State University, East Lansing, MI, United States
- 2Department of Energy (DOE) Great Lakes Bioenergy Research Center, Michigan State University, East Lansing, MI, United States
- 3Kunpeng Institute of Modern Agriculture at Foshan, Foshan, China
- 4Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- 5Department of Computational Mathematics, Science, and Engineering, Michigan State University, East Lansing, MI, United States
Switchgrass low-land ecotypes have significantly higher biomass but lower cold tolerance compared to up-land ecotypes. Understanding the molecular mechanisms underlying cold response, including the ones at transcriptional level, can contribute to improving tolerance of high-yield switchgrass under chilling and freezing environmental conditions. Here, by analyzing an existing switchgrass transcriptome dataset, the temporal cis-regulatory basis of switchgrass transcriptional response to cold is dissected computationally. We found that the number of cold-responsive genes and enriched Gene Ontology terms increased as duration of cold treatment increased from 30 min to 24 hours, suggesting an amplified response/cascading effect in cold-responsive gene expression. To identify genomic sequences likely important for regulating cold response, machine learning models predictive of cold response were established using k-mer sequences enriched in the genic and flanking regions of cold-responsive genes but not non-responsive genes. These k-mers, referred to as putative cis-regulatory elements (pCREs) are likely regulatory sequences of cold response in switchgrass. There are in total 655 pCREs where 54 are important in all cold treatment time points. Consistent with this, eight of 35 known cold-responsive CREs were similar to top-ranked pCREs in the models and only these eight were important for predicting temporal cold response. More importantly, most of the top-ranked pCREs were novel sequences in cold regulation. Our findings suggest additional sequence elements important for cold-responsive regulation previously not known that warrant further studies.
Introduction
Switchgrass (Panicum virgatum L.) is a perennial C4 grass species native to North America and identified as a major lignocellulosic feedstock for biofuel production (Sanderson et al., 2006). Higher biomass production has been a major breeding target and a potent research area in switchgrass. However, high-yielding switchgrass cultivars grow in narrow climatic niches and are known to be less productive under drought, high salinity, and freezing/chilling environmental conditions (Sage et al., 2015; Zhuo et al., 2015; Lovell et al., 2021). Expanding the growing range of high-yielding switchgrass cultivars has been proposed as a way to achieve economic bioenergy production (Sanderson et al., 2006). Coupling high biomass production with low and freezing temperature tolerance can be an effective way of increasing the range expansion of high-yielding switchgrass cultivars. Thus, it is important to understand which genes and how they are responsive to cold stress in cold-resistant switchgrass cultivars.
The ability to tolerate and/or resist cold stress has been an active area of research with respect to the underlying genes, their transcriptional regulators, and signaling pathways (Thomashow, 2010; Park et al., 2018; Manasa et al., 2021). At the level of transcriptional regulation, the C-repeat-binding factor (CBF) cold response pathway is one of the best characterized. In Arabidopsis thaliana, three C-Repeat Binding Factor/Dehydration Responsive Element-Binding Protein 1 (CBF/DREB1) transcription factor (TF) genes are rapidly up-regulated in response to cold stress (Stockinger et al., 1997; Liu et al., 1998). Such rapid cold response is due to a signaling network that is active upon cold stress. During cold treatment, cellular Ca+2 is elevated and activates Calmodulin proteins (CAMs). CAMs then bind to promoters of CAM-binding Transcription Activators (CAMTAs) and up-regulate expression of CAMTAs. Finally, CAMTAs bind to the conserved CGCG-box in CBF genes and up-regulate their transcription. Another well-studied regulator of CBF expression is the Inducer of CBF Expression (ICE) (Chinnusamy et al., 2003). ICE TFs are activated through low temperature mediated sumoylation and subsequently bind to ICE-box promoters in CBF genes to activate its transcription (Chinnusamy et al., 2003; Chinnusamy et al., 2007; Chinnusamy et al., 2010). CBF TFs then up-regulate over 100 cold regulated (COR) and low-temperature induced genes by binding to C-repeat/dehydration-responsive (CRT/DRE) elements, located in promoters of COR genes (Thomashow, 2010). This regulatory hub is known as the CBF regulon which is a major mechanism of cold stress response regulation in plants.
Beyond the CBF regulatory hub, there are examples of other, non-CBF regulatory pathways important for cold stress response in plants. Studies using CBF mutants have shown that TFs rapidly responsive to cold, such as HSFC1, ZAT12, and CZF1, also regulate COR gene expression, indicating CBF-independent regulation (Park et al., 2018; Liu et al., 2019). Another example is BZR1 TFs in the brassinosteroid (BR) signaling pathway that become dephosphorylated upon exposure to cold stress and bind to BR responsive element and E-box in the promoter regions of COR genes such as WRKY6, SAG21, and SOC1 (Li et al., 2017). It is also shown that cold-induced, Abscisic Acid modulated COR gene expression works independently from CBF regulon (Liu et al., 1998). There are likely other, non-CBF regulatory mechanisms for plant cold-responsive transcription that remain to be discovered. In addition, in switchgrass, it remains unclear how temporal regulation of cold response is regulated, CBF-dependent or not.
Computational approaches are powerful tools in the identification of genome-wide regulatory patterns in plants under biotic and abiotic stress conditions. In switchgrass, co-expression analysis has been used to establish the potential transcriptional regulatory networks in heat, drought, and biotic stress conditions (Pingault et al., 2020; Hayford et al., 2022; Zhou et al., 2022). Recently, a comprehensive, transcriptomic study on several panicoid grasses, including switchgrass, revealed that machine learning approaches can be implemented to predict cold stress responses of genes within and between species based on nucleotide frequencies in promoter regions of genes, among other features (Meng et al., 2021). Beyond nucleotide frequencies, a similar approach using longer nucleotide sequences (i.e., k-mers) can identify putative cis-regulatory elements that are regulatory switches of gene expression under cold stress in switchgrass. Such approaches have been applied to identify the regulatory switches of genes under wounding (Liu et al., 2018; Moore et al., 2022), high salinity (Uygun et al., 2017), iron excess response (Kakei et al., 2021), heat, and drought stress conditions (Azodi et al., 2020).
In this study, we aim to apply a similar, machine-learning based approach in switchgrass to assess the involvement of CBF-dependent components of cold response regulation and identify other cis-regulatory mechanisms. Using existing cold stress time course transcriptomes of switchgrass (Meng et al., 2021), we first identified temporally cold-responsive genes. To test the extent to which the temporal cold transcriptional response at different cold treatment duration can be explained using potential cis-regulatory sequences, we built machine learning models to predict genes that are up- and down-regulated upon cold treatment in the time course experiment using k-mers enriched among up- or down-regulated genes. The k-mers that were the most predictive for cold-responsive genes were considered putative Cis-Regulatory Elements (pCREs) controlling the temporal transcriptional response. To further reveal the regulatory logic behind the temporal transcriptional response, we examined transcription factors that may bind to pCREs, similarity between pCREs to known CREs, as well as functions of the genes that these pCREs are located on. In addition, to understand if there are common mechanisms underlying the transcriptional response at different time points after cold treatment, we assessed if pCREs identified in one time point were similar to the regulatory elements identified in other time points.
Results and discussion
Temporal transcriptional response in switchgrass under cold stress
Switchgrass genes responsive to cold stress at different treatment time points (0.5, 1, 3, 6, 16, and 24 hrs) were identified using the transcriptome data from Meng et al. (2021) (S1 Table). We found that the number of cold-responsive genes, regardless if they were responsive to cold at multiple time points or at a specific time point, increased as the duration of cold treatments increased (S1A Figure). This observation is consistent with a cascading effect of transcriptional response over time, similar to responses to other biotic (Ren et al., 2008; Ikeuchi et al., 2017; Moore et al., 2022) and abiotic (Joshi et al., 2016; Ohama et al., 2016) stress conditions. This cascading effect could be because the key regulators are activated sequentially during the cold treatment (Ding et al., 2019a; Lamers et al., 2020). Moreover, as expected, more cold-responsive genes tend to be shared between adjacent time points compared with time points apart from each other (S1A Figure).
To understand what functions the genes that are responsive to cold stress at different time points tend to have, we conducted Gene Ontology (GO) enrichment analysis (see Methods, S1B, C Figures). GO terms relevant to signaling and activity of transcription factors, such as protein phosphorylation and regulation of transcription, were enriched for genes up-regulated at earlier time points (i.e., 0.5 - 3 hrs, S1B Figure). These early up-regulated genes may act as initial regulators of genes that are responsive to cold at later time points. Consistent with this, it is known that the accumulation of Ca+2 as a result of initial cold sensing activates the expression of calcium-dependent protein kinases (CDPKs), which in turn activate transcription factors that regulate downstream cold stress response (Chinnusamy et al., 2010; Knight and Knight, 2012). Moreover, GO terms such as glucan metabolism and trehalose biosynthesis were also found to be enriched at initial time points. These biological processes are known to be important in the initial cold acclimation in Arabidopsis (Miranda et al., 2007; Maruyama et al., 2009). The GO terms enriched in up-regulated genes at later time points (i.e., 6-24 hrs) may involve biological processes that are required to maintain the functionality of the plant under prolonged cold stress. For example, during prolonged cold stress an increase in plant respiration has been observed (Manasa et al., 2021). As a result of elevated respiration, plants tend to accumulate higher amounts of reactive oxidative species (ROS), followed by the transcription of genes that are responsive to oxidative stress (Wei et al., 2022). This is in line with the enriched GO terms for later cold-responsive genes, such as response to oxidative stress and metal ion transport. Thus, the results from GO enrichment analysis are also indicative of the cascading effect of temporal transcriptional response under cold stress in switchgrass, where initial responsive genes activate later cold-responsive genes that are involved in different physiological and metabolic processes to withstand cold stress conditions.
Putative cis-regulatory elements regulating temporal cold stress responses
The cascading effect of temporal transcriptional response that we observed, as well as the differences between GO terms enriched in genes that were up-regulated at different time points, indicates that the transcriptional regulation differs among time points after cold treatment. To understand how cold-responsive genes are regulated at the cis-regulatory level, we first identified k-mers in the promoter and gene body regions that were enriched among cold-responsive genes at each time point. Then the enriched k-mers were used to establish a predictive model to distinguish cold-responsive genes from non-responsive genes for each time point with machine learning (see Methods; Figure 1A). We calculated F-measure (F1 score) on the validation and test instances (held out before model training, see Methods). In our modeling setup, the F1 score ranges between one and zero, where one represents a model with perfect prediction, while a score ~0.5 indicates a model with predictions no better than random guesses. Among models distinguishing genes that are significantly up- or down-regulated from non-responsive genes at different time points, the F1s were all higher than random expectation (> 0.7) (Figure 1B), indicating that the sequence information (i.e., k-mers) was predictive of cold stress response at a time point.
Figure 1 Models predicting the cold responsiveness. (A) The overall procedure to model transcriptional response. Genes that are significantly up- or down-regulated at a cold treatment time point were used as positive examples, while genes not responsive to cold treatment at any time point and to other abiotic stresses (dehydration, salt, and drought) were used as negative examples. k-mers enriched in the gene body and flanking non-genic regions of the cold-responsive genes were used as predictors (features). RandomForest classifier was used to train models, and the model performance was evaluated using the F1 score. (B) Model performances (F1) on the cross-validation (CV) and test sets for each time point model distinguishing genes that were up- (top chart) or down-regulated (bottom chart) after cold treatment for a specific duration from non-responsive genes. The number of positive example genes used in each time-point model is shown in the parenthesis.
Next, we asked what features (k-mers) were most predictive of the temporal cold stress response of genes with feature selection. By assessing the model performance improvement by adding features successively from the most to the least important, the minimal number of features required to reach 95% of the optimal model performance was identified for each time point model (S2 Figure). The k-mers that met this criteria for each time point model were defined as pCREs (S2, S3 Tables). From here onwards, we focus on the pCREs predictive of up-regulated genes. Some of these pCREs were general across time points (indicating these pCREs were found in all the time point models) (Figure 2A), which may indicate: (1) the genes regulated by these pCREs are responsive to cold across time points; and/or (2) different genes that are responsive to cold stress at different time points are regulated by the same pCRE set and/or (3) these pCREs may be basal regulatory elements required for cold-stress regulation which may be working in combination with other CREs to fine-tune the temporal transcriptional response under cold stress. We should note that only 154 and 411 genes for up- and down-regulation across >4 time points, respectively. On the other hand, 16,414 and 16,911 genes are up- and down-regulated in >=1 time points. Considering that very few genes are commonly responsive across multiple time points, the first possibility is unlikely. Some other pCREs were time point-specific (indicating these pCREs were found only in a single time point model) (Figure 2A). The remaining pCREs were identified by models predicting genes up-regulated at 2~5, usually disjointed, time points (S3 Figure).
Figure 2 Interpretation of the temporal cold-responsiveness prediction models. (A) General and time point-specific pCREs and their similarities to known cold responsive CREs. Heatmap in the left panel shows the relative importance of pCREs, short sequences in the middle indicate pCREs that are similar to CREs known to regulate cold response (cold-TFBMs), which are shown in the right panel. Color scale in blue represents min-max scaled Gini index calculated for features in a time point model; color scale in pink indicates similarities between pCREs and cold-TFBMs. (B) Transcription factors (TFs) that bind to the cold-TFBMs are shown with different colors, and the sequence logos of TF binding sites are shown in the rightmost panel. PCC, pearson correlation coefficient; TFBM, transcription factor binding motifs.
GO terms enriched for down-regulated genes indicate that genes involved in essential biochemical processes such as photosynthesis and growth-related processes have been down-regulated (S1 Figure C) which has also been observed in other species such as Arabidopsis (Manasa et al., 2021). However, our focus in this study is to identify the regulatory elements that may be directly related with regulating physiological and metabolic processes to endow plant resistance to cold. Although our down-regulation models also had higher performance, the pCREs might be a mixture of regulatory elements that are responsive to cold and responsive to other physiological and biochemical processes that are affected by cold stress. Thus, we will not be focusing on regulatory elements that control the downregulation under cold stress in our downstream analysis.
Known cold response regulation transcription factors likely bound to pCRE sites
Previous studies have shown that there are some conserved CREs that control the expression of both early responsive transcription factors (TFs), such as CBF, and downstream cold-responsive genes (e.g., COR genes) that carry out the cold stress tolerance in plants (Chinnusamy et al., 2010; Thomashow, 2010; Park et al., 2018; Ding et al., 2019b). To see if our models have identified binding sites for these known regulators as well as novel CREs, we examined the similarities between the general and time point-specific pCREs and 35 known transcription factor binding motifs (TFBMs) in Arabidopsis using DAP-seq (O’Malley et al., 2016) and CIS-BP (Weirauch et al., 2014) datasets (S3 Table). In addition, we collected 35 known TFs regulating plant cold stress response that have binding site information (S4 Table). Some pCREs that are significantly more similar (see Methods) to binding sites of 11 out of 35 known TFs regulating cold response than the 95 percentile of TFBMs from TFs of the same families (Figure 2A, see Methods). Two general pCREs were similar to the binding sites of CAMTA1 and CAMTA5 (orange and yellow in Figure 2B). CAMTAs are known to be up-regulated by the activation of Ca+2-dependent Calmodulin due to cold-induced Ca+2 spike (Finkler et al., 2007; Manasa et al., 2021). In addition, CAMTAs are major regulators of CBF genes that are known regulators of cold responses, for the immediate cold stress response (Finkler et al., 2007). Consistent with the involvement of CAMTAs in early cold response pCREs, the most closely related to CAMTA binding motifs had the highest feature importance in the 30 min model (CAMTA1 and CAMTA5 ranked 17 and 6, respectively). We should point out that the CAMTA1/5 binding motif-like pCREs were also found in 1hr- and 16 hr-specific sets, indicating that, like in Arabidopsis, (Doherty et al., 2009) the CAMTAs may also be involved in maintaining CBF or other cold response gene expression that are critical for overall cold acclimation in switchgrass. Because only 11 of 35 cold CREs of known plant cold stress TFs have similar binding sites to general and specific pCREs (Figure 2A), we next examined if they could be recovered using pCREs important in >1 time points (non-specific pCREs, S3 Table). We found that no new cold CREs can be recovered. Thus, in later discussion, we mainly focus on general and time point-specific pCREs only.
Another notable finding is that pCREs are similar to ERF binding sites (gray and green in Figures 2A, B) and were identified both in the general and most of the time point-specific pCRE sets (excluding the 3 and 6-hrs). Like CBF/DREB TFs, ERF TFs are members of APETALA2/Ethylene Responsive Element Binding Protein (AP2/EREBP) gene family which are known to be involved in multiple types of stress tolerance (Dey and Corina Vlot, 2015; Park et al., 2021). ERF115 prevents water deprivation in rice under extreme temperatures and drought conditions (Park et al., 2021). Dehydration is a condition that can occur under cold stress and transgenic switchgrass with higher water retention also has an increased cold tolerance (Xie et al., 2019). Despite the lack of experimental evidence for the function of ERF TFs in switchgrass, our findings suggest that ERF TFs may play important roles in cold tolerance in switchgrass. Moreover, there were also pCREs that are similar to binding sites of TFs from other TFs families, such as WRKY, BZR and ABR. pCREs similar to binding sites for BZR1 (rank 1 to 4), WRKY24 (rank seven to eight), and WRKY 30 (rank seven) were also among the most predictive cold-CREs in cold-TFBM models (S4 Figure). These TFs are known for cold signal transduction and cold stress tolerance via CBF-independent pathways (Park et al., 2015; Ramirez and Poppenberger, 2020). BZR1 is known to be involved in cold stress tolerance through processes such as ROS scavenging (Ramirez and Poppenberger, 2020) and facilitating structural changes in cell membranes and cell walls (Benatti et al., 2012). Moreover, WRKY TFs are also known to be involved in phytohormonal-induced signal transduction for low-temperature tolerance in plants (Park et al., 2015; Park et al., 2018). ABR1 on the other hand is known to regulate multiple stress responses, including cold stress in a CBF-independent, CBL9-CIPK3-mediated, ABA-signaling cascade (Pandey et al., 2005). These findings indicate that our prediction models can not only predict cold-responsiveness for different time points after cold treatment, but also recover known plant cold-TFBMs.
Potentially novel cold cis-regulatory sequences in switchgrass
While known TFs involved in cold-responsive regulation can be identified, 45 pCREs either resembled known TFBMs but the TFs were not known to be involved in cold-regulation. Perhaps more importantly, another 598 pCREs did not have significant similarity to known TFBMs. This raises the question if these pCREs not resembling cold-TFBMs, represent novel component of switchgrass cis-regulation under cold treatment. To address this, we compared the informativeness of pCREs identified by our models and the experimentally validated cold-TFBMs for predicting cold stress response. Based on literature search, 35 TFs involved in cold response regulation with binding site information in different plant species (S4 Table) were used to build models (hereafter referred to as cold-TFBM models). We found that the cold-TFBM models had far worse prediction performance (median F1 = 0.66) than models built using all pCREs (median F1 = 0.85, Figure 3A). Since these 11 of 35 cold-TFBMs are significantly similar to top-ranked pCREs (similarity >95% of randomly expected matches, see Methods), it is not particularly surprising that the cold-TFBMs predictive of cold responsiveness at different specific time points are similar to the findings in Figure 2. By looking at the feature importance of the cold-TFBMs models built for each of the time points (S4 Figure), TFBMs of CAMTA1/5 and CBFs were among the most predictive features among the cold-TFBMs time point models.
Figure 3 (A) Model performance comparison among models built using all the pCREs (blue), top 35 most important pCREs (cyan), and 35 known cold-TFBMs (hot pink). (B) Enrichments of top 10 pCREs in 0.5, 1, 3, 6, 16, 24 hr time point models (a-f respectively).
While the all-pCRE models overall performed significantly better than cold-TFBM-based ones (T-test, p < 0.01, Figure 3A), it is possible that the all-pCRE models simply have far more features. To address this, we also built models using the top 35 most important pCREs (based on the feature importance of time point models) for comparison. We found that the cold-TFBM models remain worse than models built using the top 35 pCREs (median F1 = 0.77, p<0.01, Figure 3A). This finding, together with that based on all-pCRE models, suggests that pCREs identified in our models contain potentially novel cold-responsive CREs that may or may not be specific to switchgrass. In Figure 3B, the top 10 ranked pCREs from each of the time point models are shown with emphasis on the enrichment of novel pCREs in up-regulated genes under cold stress of different timepoints. These novel pCREs are significantly enriched (multiple testing corrected, p<0.05) in cold stress up-regulated genes at each time point (Median log odds ratio=0.55). Taken together, the comparison between cold-TFBM models and the all-pCRE or the top-35 pCRE models shows that known cold-TFBMs could not explain cold responsiveness at any particular time point as well. These findings suggest that there are novel temporal cis-regulatory components of cold transcriptional response.
Relationships between pCREs across time points
The majority of top pCREs are sequences that do not resemble TFBMs associated with cold regulation. To further understand how these pCREs we identified may be involved in temporal cold stress regulation, we examined: (1) the similarity of the pCREs across time point models (Figure 4A); (2) sequence similarities between pCREs and TFBMs (earlier the focus was only on cold-related TFs, Figure 4B) (3) importance of pCREs from different clusters in predicting cold response (Figure 4C); (4) functions carried out by the genes that the pCREs were located in (Figure 4D); and (5) expression profiles of genes that the pCREs were located in (Figure 4E). First, we categorized the pCREs into clusters by calculating the pairwise PCC distance (1-PCC) based on their sequences (see Methods; S5 Figure). The clusters were defined using the same PCC distance threshold as in (Liu et al., 2018), where pCREs with PCC distance <0.39 were considered to be bound by TFs of the same family. The pCREs were grouped into 27 clusters and pCREs in 25 clusters were shared by >1 cold treatment time points. Since pCREs in a cluster are likely bound by TFs of the same family, this finding indicates the involvement of most TF families across time points. These clusters consisted of pCREs important in >1 time points were referred to as non-specific pCRE clusters (Figure 4A).
Figure 4 Properties of pCRE clusters which were defined based on sequence similarity. (A) Heatmap showing the distribution of general and time point-specific pCREs within a cluster. Color scale represents the percentage of general and time point-specific pCREs in each pCRE cluster. (B) Potential transcription factors (TFs) that could bind to pCREs in pCRE clusters based on the similarity between pCREs and TF binding sites (TFBS) information based on in-vitro binding assays. A TF was considered to bind a pCRE only if the PCC similarity between the pCRE and its binding sites was above the 95th percentile of the background PCC distribution, which was calculated among TFs in the same TF family. TF families that don’t fall under this threshold were marked in gray. Color scale represents the percentage of pCREs within a pCRE cluster that showed significant similarity with TFBS. (C) Median importance of pCREs in a cluster. Cell color depicts median min-max scaled Gini index of the pCREs within each cluster. Gray color indicates that the pCRE is not used in the time point model in question. (D) Significantly enriched biological GO terms of genes containing pCREs in a pCRE cluster. Color scale represents the log10(odds ratio), for details, see Methods. (E) Differential expression of genes that contain pCREs in clusters 3, 16, or 23 at different time points. Each row shows the profile of a gene, and color scale indicates log2(FC).
To assess if pCREs in different clusters may regulate distinct sets of genes, we compared the differential expression profiles of genes that contain pCREs from different clusters in different time points (Figures 4E, S6). To facilitate interpretation of the differential expression profiles, we encoded the transcriptional responsiveness of a gene at a time point as U, D, N if it is significantly up-regulated, significantly down-regulated, and not differentially expressed, respectively. For example, a profile of “UUDDNN” indicates that the gene is significantly up-regulated at 30 minutes and 1 hr, down-regulated at 3 hrs and 6 hrs, and not differentially expressed at 16 hrs and 24 hrs after cold treatment. Using this strategy, we investigated the frequency of differential expression profiles of genes with pCREs in different pCRE clusters. NNNUUN, NNNUUU, and NNUUUU were the top three most frequent expression profiles found on the genes that contain pCREs in all 25 non-specific pCRE clusters (S7 Figure). Because the up-regulatory patterns were contiguous after 3hrs of cold treatment, regulatory switches common between time points may have a role in the up-regulation and maintaining the expression of genes at later time points. Similarly, previous studies also show that in both CBF-dependent and independent pathways, immediately cold-responsive TFs are responsible for up-regulating and maintaining the expression of a large number of downstream cold-responsive genes by binding to conserved regulatory sequences (Thomashow, 2010; Park et al., 2015; Li et al., 2017). Some genes harboring pCREs from non-specific pCRE clusters also had unique expression profiles (expressed in a single time point) as well as much more complex expression profiles (up- or down-regulated in multiple, non-contiguous time points) (S6 and S7 Figure).
In addition to non-specific clusters, there were two 30 min-specific pCRE clusters (clusters 23 and 25) (Figure 4A). pCREs in these clusters may regulate initial cold transcriptional response. However, these clusters were significantly enriched (q ≤ 0.05) with the genes that are up-regulated only at the 30-min time point compared to genes that contain pCREs in other clusters (S8 Figure), For example, in cluster 23, UNNNNN, UNUUNN, UNUNNU, UUUUUU, UUDDDD, and UUNNDD are among profiles with the highest degrees of enrichment. There are ~360 different gene expression profiles that contain pCREs in all 25 of the shared pCRE clusters (S7 Figure). Thus, the temporal regulation of cold transcriptional response is likely mediated through a combination of general CREs that are important for the entire duration, specific CREs that regulate response at a particular time, as well as non-specific CREs that regulate a certain duration (contiguous time points) or complicated expression profiles (e.g., UNUNNU). To assess the functions of genes that contained pCREs from pCREs clusters, we examined which GO terms were enriched with genes containing pCREs in a cluster (Figure 4D). Except for the general enriched GO terms (e.g., metabolic processes), genes containing pCREs of non-specific pCRE clusters were enriched with biosynthetic processes that are involved in cold stress responses (e.g., fatty acid biosynthetic process, lipid biosynthetic process, and trehalose biosynthetic process) and specific metabolic processes (e.g., response to oxidative stress, carbohydrate metabolic process) (Figure 4D). These GO terms are known to be enriched in late responsive genes under cold stress (Manasa et al., 2021). Our findings suggest that some genes containing pCREs from these non-specific pCRE clusters may contribute to metabolic processes crucial for cold tolerance. None of the GO terms were enriched for genes containing pCREs in the specific pCRE clusters 23 and 25, potentially due to the small sample size of these two clusters.
Cold stress regulatory pCREs that do not resemble known TFBMs
To further assess the regulatory role of the pCREs in pCRE clusters, we asked what TFs may bind to these pCREs using the in-vitro TFBM information of 344 Arabidopsis TFs. Although the Arabidopsis and the switchgrass lineages diverged ~200 million years ago (Wolfe et al., 1989), the TFBMs of dicot and monocot TFs from the same families are highly similar (Weirauch et al., 2014).A TF was considered to have the potential to bind to a pCRE if the similarity between its TFBM and the pCRE in question was above the 95th percentile of the similarity distribution calculated among TFBMs in the same TF family (see Methods). In addition to members of the AP2-EREBP family discussed previously (Figure 2, 4B), TFBMs of B3, bZIP, MYB, Trihelix, and FAR1 TF families were also found to have a significant similarity to pCREs in multiple clusters (Figure 4B). In soybean, the bZIP TFs are known to regulate cold stress in ABA-dependent pathways by inducing the expression of downstream COR and ERF type genes that help plants to resist cold stress conditions (Liao et al., 2008; Yu et al., 2020). Moreover, in tomatoes, the Trihelix type TFs are known to be up-regulated under cold stress conditions, and activate downstream genes with products that modulate stomatal conductance to prevent water loss (Liu et al., 2012; Yu et al., 2018). In apples, R2R3-MYB TFs were found to be induced by cold stress and activate ROS scavenging genes (An et al., 2018).
Aside from 19 clusters containing pCREs resembling known Arabidopsis TFBMs, eight clusters did not contain pCREs resembling TFBMs we investigated (Figure 4B). These pCREs are referred to as “unknown” pCREs (those with “between” threshold in S3 Table). In our time-point models, those unknown pCREs were also important for predicting cold responsiveness of a gene (S3 Table) as indicated by the median importance of pCREs in clusters (Figure 4C). Furthermore, the feature importance ranks of these pCREs in predicting cold transcriptional response in the time point models (median rank=0.45) are significantly similar (T-test, p-value<0.01) to those of pCREs resembling known TFBMs (median rank=0.38). Using general pCREs as examples, we built models to predict genes up-regulated at different time points using solely pCREs similar to known TFBMs (n=16), and another model with unknown pCREs (n=38). We found that the performances of models built using general pCREs similar to known TFBM (median F1 = 0.66) and general “unknown” pCREs (median F1 = 0.70) were not significantly different (T-test, p-value>0.01). This result also suggests that “unknown” pCREs have similar importance to pCREs that resemble known TFBMs in predicting temporal cold-stress response in switchgrass. The reasons we did not find similar TFBMs to these pCREs may be because the threshold we used to assign a pCRE to TFBSs was too stringent. However, the threshold used was established as the degree of similarity that allows binding motifs of a plant TF family to be identified (Azodi et al., 2020). Thus, it was not asking if a pCRE resembled a specific TFBM, but the binding motifs at the level of family. The second reason may be that Arabidopsis TFBMs were used, which may miss TFBMs specific in other species. Although there is broad conservation of TFBMs across species, even between plants and humans (Weirauch et al., 2014), this can only be assessed with additional experimental studies either through DAP-seq or one-hybrid assay. Another possibility is that the Arabidopsis TFBM data may miss binding sites due to the limitations of in vitro binding assays (Bartlett et al., 2017). Finally, it is also possible that, instead of TFBMs, a subset of pCREs may represent motifs relevant for levels of regulation beyond transcription, such as post-transcriptional or translational regulation. This possibility remains to be investigated.
Conclusion
In this study, we aimed to find DNA regulatory switches responsible for temporal transcriptional response in switchgrass under cold stress conditions. By examining the number of cold-responsive genes at different time points, and the functions these genes tend to have, we found a cascading effect of gene transcriptional responses with regards to the time the plant was exposed to cold stress. The k-mers enriched for cold-responsive genes at a particular time point were predictive of the cold responsiveness of genes at that time point. By examining the top most predictive k-mers, we were able to identify well known CREs that regulate cold stress response in plants, indicating the usefulness of our models. Based on similarity of a subset of pCREs to known cold TFs, switchgrass cold stress response is mediated through both CBF-dependent and independent pathways. Beyond the known cold-responsive CREs, additional pCREs not known to be regulating cold response were identified. Some pCREs were identified in specific time point models, while others (general and non-specific pCREs) appeared to be relevant to regulation of cold response at multiple, sometimes disjoint, time points. In the latter case, differential expression profiles of genes containing these pCREs show complex patterns throughout the time course.
A substantial fraction of the pCREs do not resemble known binding motifs of known cold response regulatory TFs or, in general, Arabidopsis TFs with in vivo binding data. However, the regulatory function of these pCREs in cold responses needs to be experimentally validated using knockout lines and additional efforts, including modeling complex expression patterns under cold stress response (i.e., non-contiguous, up-/down-regulation) to identify the pCREs responsible for complex temporal expression and modeling cold stress response using combinations of pCREs to identify complex expression patterns under cold stress are required to fully understand the cold-responsive cis-regulatory code in switchgrass. We also emphasize how building computational methods and their interpretations are important for identifying the global patterns of gene expression and their context-specific regulatory elements. This study provides sequence elements that regulate temporal cold stress response, allows a systematic understanding of the temporal cold stress regulation in switchgrass and, with subsequent validation studies, the information can be used as the bases for fine tuning switchgrass tolerance to cold stress.
Materials and methods
Transcriptome data collection, preprocessing, and gene-set enrichment analysis
The switchgrass cold response RNA-seq data were from a published study of a time course (0.5, 1, 3, 6, 16, and 24 hrs) under cold treatment (6°C) with paired control samples (29°C/23°C in a 12-h/12-h day/night cycle) (Meng et al., 2021). Switchgrass transcriptomes under three other stress conditions were from three published studies [Dehydration ( (Zhang et al., 2018)), salt ( (Zhang et al., 2021)), and drought ( (Zuo et al., 2018)]. The RNA-sequencing (RNA-seq) data of these studies were downloaded from NCBI-SRA database (https://www.ncbi.nlm.nih.gov/sra), processed, and used to generate raw counts and transcript abundance (transcripts per million, TPM) using an RNA-seq analysis pipeline (https://github.com/ShiuLab/RNA-seq_data_processing.git). For mapping RNA-seq reads, Panicum virgatum v5.1 genome and the corresponding genome annotations were downloaded from the Joint Genome Institute (JGI) database (https://jgi.doe.gov). Only reads that were uniquely mapped to the genome were used. Differential expression of genes (fold change, FC) contrasting cold stress treatment and corresponding control at each time point and false discovery rate corrected p-values were calculated using the EdgeR package implemented in R (Robinson et al., 2010).
Gene Ontology (GO) annotations of switchgrass genes were downloaded from JGI Data Portal as of 07.08.2021 (https://data.jgi.doe.gov). Fisher’s exact test was conducted to identify GO categories enriched in cold-responsive genes at each time point versus all the other genes in the genome. The resulting p-values were adjusted using the Benjamini-Hochberg method (Benjamini and Hochberg, 1995), and GO terms with adjusted p-values ≤ 0.05 were considered as enriched for cold-responsive genes (https://github.com/ShiuLab/Manuscript_Code/tree/master/2022_switchgrass_cold_pCREs). The GO enrichment analysis was also conducted for genes that contain pCREs from the same pCRE distance cluster versus all the genes in the genome (see next sections).
Identification of cold-responsive putative cis-regulatory elements
Cold-responsive genes were defined as genes that were either significantly up-regulated (Log2FC≥1 and adjusted p ≤ 0.05) or down-regulated (Log2FC≤-1 and adjusted p ≤ 0.05) upon cold treatment at each time point. Genes were defined as non-responsive to cold at any of the six time points and nonresponsive to the other three stress conditions mentioned above (|log2FC|<0.5 and/or adjusted p>0.05). Here, stress conditions other than cold treatment were considered to define non-responsive genes, because previous studies have found that stress-responsive CREs could activate genes under multiple stress conditions (Zou et al., 2011; Azodi et al., 2020). Thus, contrasting the cold-responsive genes against genes that are not responsive to combined stresses would allow us to identify the full scale of pCREs, i.e., both cold-stress-specific pCREs and pCREs responsible for multiple stress conditions including cold stress.
To identify pCREs, we applied a combination of a k-mer enrichment approach and machine learning. To avoid data leakage, for each time point, cold-responsive genes (up- or down-regulated after cold treatment) and non-responsive genes were split where 80% of the genes were used as the training set and 20% were the test set. The test set was set aside and was not used for any pCRE identification or modeling steps. For the k-mer enrichment step, genes in the training set were further split into five bins. For each bin, we first identified all possible k-mers (k=5-8 nucleotides where a forward k-mer was considered as the same as its reverse complements) from 1kb upstream, gene body including 5’ and 3’ untranslated regions, and 1kb downstream regions of both cold-responsive and non-responsive genes. K-mers enriched for cold-responsive genes (Fisher’s exact test adjusted p-value<0.05) were identified for each bin, and the k-mers commonly enriched among all five bins were used as features to establish machine learning models classifying cold-responsive genes (positive examples) and non-responsive genes (negative examples) in the training set.
To create a balanced training dataset (same numbers of positive and negative examples), genes in the minority class with fewer instances were randomly up-sampled using the Synthetic Minority Over-sampling Technique (Chawla et al., 2002). We also experimented with down-sampling where the majority class was randomly selected to match the number of minority class genes. Classification models were built for each time point to predict cold-responsive and non-cold responsive genes using the random forest algorithm (Breiman, 2001) and grid search was conducted based on 60 hyperparameter combinations (‘max_depth’: [3, 5, 10], ‘max_features’: [0.1, 0.5, ‘sqrt’, ‘log2’, None], ‘n_estimators’: [10, 100, 500, 1000]) in a five-fold cross-validation scheme where every gene was used in the validation set exactly once. The optimal hyperparameter set was selected based on F1 score of the validation set predictions. F1 measure is the harmonic mean of precision and recall. An “optimal” model for each time point was then built using all training instances with the optimal hyperparameters. The final model for each time point was then applied to predict the cold responsiveness of genes in the testing set and model performance was evaluated using F1 measure.
Selection of minimal pCRE sets as features and determining relationships between pCREs
To identify the minimal number of features (enriched k-mers) that have a similar performance as the optimal model using all features to distinguish cold-responsive from non-responsive genes, features were selected based on Gini importance defined as the impurity difference of a node in the decision tree when the feature in question is used, a measure of the contribution of a feature for distinguishing the cold-responsive and non-responsive genes. New models use the training set again by increasing the numbers of features used, starting with just the top 10 important features and, for subsequent models, increasing the number of features by 20 in order of decreasing feature importance. The trend line of the cross-validation F1 score against the number of features was fit with the Michaelis-Menten Equation. For each time point the minimal number of features was determined as where the fitted line had a near zero differential (e.g., the 30 min model, S1 Figure), or where the F1 first reached 90% of the optimal model F1 if there was no clear plateau (e.g., the 30 min model, S1 Figure). Features within the minimal set were designated as pCREs for the cold response at the time point in question. If the same pCRE (an important k-mer with identical sequence) was found among all time point models, we designated it as a general pCRE, while a pCRE that was only identified in one time point model was designated as time point-specific pCRE. To determine the similarity between pCREs, pairwise PCC distances between pCREs were calculated using the TAMO package (Gordon et al., 2005), implemented in R. The distance matrix was used to construct a UPGMA tree using average linkage in the library ‘cluster’ in R (Maechler et al., 2012). Sequence similarity of 0.39 was used as a threshold, such that pCREs with similarity >0.39 can be treated as a single pCRE (Liu et al., 2018). For each cluster of pCREs, the proportion of pCREs in different categories (general or time point-specific pCRE groups) were calculated using custom scripts (https://github.com/ShiuLab/Manuscript_Code/tree/master/2022_switchgrass_cold_pCREs).
Identification of transcription factors with binding sites similar to pCREs
The assessment of sequence similarity between pCREs and known transcription factor binding sites (TFBSs) was carried out using the Motif Discovery Pipeline (https://github.com/ShiuLab/MotifDiscovery.git) as described in (Azodi et al., 2020). For this analysis, only the pCREs responsible for up-regulation upon cold treatment were considered. Known TFBS data was retrieved from two datasets: (1) DNA Affinity Purification sequencing (DAP-seq) database, where in-vitro DNA binding assays were performed for 344 TFs in Arabidopsis thaliana; (2) Catalog of Inferred Sequence Binding Preferences (CIS-BP) database, where position frequency matrices (PFMs) for TFBS of 190 TFs (non-redundant TFBS with DAP-seq database) in A. thaliana were available (Weirauch et al., 2014). To assess the similarity between pCREs and TFBSs, the Pearson’s Correlation Coefficients (PCC) between the position weighted matrices (PWMs) of pCREs and PWMs of TFBSs were calculated as described in (Azodi et al., 2020). The top matching TFBS for each pCRE was reported in three threshold levels (same TF, same family, or significantly more similar than randomly expected) as described in (Azodi et al., 2020). To determine the similarity between pCRE and TFBMs for TFs regulating cold response, we checked if pCRE-TFBM PCC is higher than 95th percentile of the PCCs calculated among TFBMs of different transcription factors families. This is a mid-stringency threshold out of the three thresholds we used to find similarities between pCREs and TFBMs. Since we are using Arabidopsis TFBMs to identify similar binding sites of specific TFs in switchgrass, we wanted to use TFBMs with the highest similarity when compared with other families of TFs, which with a higher stringency threshold would not have been found. Using this mid-stringency threshold we will be able to say if a pCRE resembles a specific binding site of a particular TF in comparison with other TFs in different TF families.
To assess how well the binding sites of TFs known to regulate cold response might predict cold response, we collected known cold regulation TFs through a literature search (S4 Table). Using PWMs of binding sites of TFs known to regulate cold stress in plants (cold-CREs), we mapped similar binding sites in up-regulated genes in different time points. Based on absence/presence of cold-CREs in a gene we recreate feature tables for genes that are up-regulated in each time point. Using the similar machine learning methods used in the “Identification of cold-responsive putative cis-regulatory elements” section, we made models to predict cold responsiveness of a gene up-regulated in each time point using cold-TFBMs. The performance of these models were then compared to our original time point models.
Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.
Author contributions
TR, PW, and S-HS conceptualized and designed the study. TR and BNIB acquired and analyzed the data. TR, BNIB, and PW wrote the original draft of the manuscript. S-HS and PW supervised the study. All authors contributed to the article and approved the submitted version.
Funding
This work was mainly funded by the US Department of Energy Great Lakes Bioenergy Research Center (BER DE-SC0018409) and in part by the National Science Foundation (DGE-1828149; IOS-2107215; MCB-2210431).
Acknowledgment
We thank Kenia Segura Abá, Melissa Lehti-Shiu, Serena Lotreck, and Ally Schumacher for their feedback on the figures and discussion.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2022.998400/full#supplementary-material
Supplementary Figure 1 | Properties of cold-responsive genes at different time points. (A) Matrix showing the number of up-regulated (top left triangle) and down-regulated (bottom right triangle) genes at different time points after cold treatment. Color scale and number within the cell on the diagonal represent the count of time point-specific cold-responsive genes, while those in other cells indicate the number of responsive genes shared between two time points. For example, the number eight in the top left cell indicates that there are eight genes that are up-regulated at both 30 min and 24 hrs. (B, C) Biological process GO terms that are significantly enriched (q ≤ 0.05) for genes that are down-regulated (B) or up-regulated (C) at different time points. Color scale: -log10(q) for over-representative GO terms, and log10(q) for under-representative GO terms.
Supplementary Figure 2 | Feature selection. Graphs show the relationship between the F1score and the number of features in time point models distinguishing genes up-regulated (left panel) or down-regulated (right panel) after cold treatment from non-responsive genes. The data points were fitted using the Michaelis-Menten Equation.
Supplementary Figure 3 | pCREs that were identified from models predictive of genes up-regulated in >1 time points and their resemblances with known cold-CREs.
Supplementary Figure 4 | Heatmap showing feature importance in the cold-TFBM models. Color scale and numbers in the cells represent the importance rank of features that have positive Gini indexes, the darker color and smaller number, the more important a feature was. Gray color indicates that the Gini index for the feature was negative.
Supplementary Figure 5 | A dendrogram showing relationships among general and time point-specific pCREs based on pairwise PCC distances. The dendrogram is clustered based on the similarity threshold of 0.39.
Supplementary Figure 6 | Heatmaps showing the degrees of differential expression of genes that contain pCREs from different pCRE clusters at different time points after cold treatment. Color scale indicates log fold change values.
Supplementary Figure 7 | Frequency of genes showing a specific expression profiles (e.g., NNUDNN, y-axis) and containing pCREs that belong to different pCRE clusters (x-axis). Color scale indicates log2(counts) of genes showing the expression profile. U, up-regulated; D, down-regulated; N, non-responsive.
Supplementary Figure 8 | Degrees of enrichment of genes containing pCREs from a pCRE cluster (x-axis) that also have a specific expression profile (e.g., NNUDNN, y-axis). The color scale represents the log odds ratio, which was calculated as ratios between two values. The first value is the number of genes containing pCRE in cluster C and having expression profile P divided by the number of genes with C and not in P. The second value is the number of genes with no C but in P divided by the number of genes without C and not in P.
Supplementary Table 1 | Metadata of the transcriptome sequences used in this study.
Supplementary Table 2 | Number of features selected in the feature selection processes and the best threshold used in different time point models.
Supplementary Table 3 | Enrichment p-values, feature importance scores, feature importance ranks, and summary of the similarity between pCREs and in-vitro transcription factor binding site data of pCREs in different time point models.
References
An, J.-P., Li, R., Qu, F.-J., You, C.-X., Wang, X.-F., Hao, Y.-J. (2018). R2R3-MYB transcription factor MdMYB23 is involved in the cold tolerance and proanthocyanidin accumulation in apple. Plant J. 96, 562–577. doi: 10.1111/tpj.14050
Azodi, C. B., Lloyd, J. P., Shiu, S.-H. (2020). The cis-regulatory codes of response to combined heat and drought stress in arabidopsis thaliana. NAR Genomics Bioinforma 2, lqaa049. doi: 10.1093/nargab/lqaa049
Bartlett, A., O’Malley, R. C., Huang, S. C., Galli, M., Nery, J. R., Gallavotti, A., et al. (2017). Mapping genome-wide transcription factor binding sites using DAP-seq. Nat. Protoc. 12, 1659–1672. doi: 10.1038/nprot.2017.055
Benatti, M., Penning, B., Carpita, N., Mccann, M. (2012). We are good to grow: dynamic integration of cell wall architecture with the machinery of growth. Front. Plant Sci. 3. doi: 10.3389/fpls.2012.00187
Benjamini, Y., Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. R. Stat. Soc Ser. B Methodol. 57, 289–300. doi: 10.1111/j.2517-6161.1995.tb02031.x
Chawla, N. V., Bowyer, K. W., Hall, L. O., Kegelmeyer, W. P. (2002). SMOTE: synthetic minority over-sampling technique. J Artif Intell Res. 16, 321–57.
Chinnusamy, V., Ohta, M., Kanrar, S., Lee, B.-H., Hong, X., Agarwal, M., et al. (2003). ICE1: a regulator of cold-induced transcriptome and freezing tolerance in Arabidopsis. Genes Dev. 17, 1043–1054. doi: 10.1101/gad.1077503
Chinnusamy, V., Zhu, J.-K., Sunkar, R. (2010). Gene regulation during cold stress acclimation in plants. Methods Mol. Biol. Clifton NJ 639, 39–55. doi: 10.1007/978-1-60761-702-0_3
Chinnusamy, V., Zhu, J., Zhu, J.-K. (2007). Cold stress regulation of gene expression in plants. Trends Plant Sci. 12, 444–451. doi: 10.1016/j.tplants.2007.07.002
Dey, S., Corina Vlot, A. (2015). Ethylene responsive factors in the orchestration of stress responses in monocotyledonous plants. Front. Plant Sci. 6. doi: 10.3389/fpls.2015.00640
Ding, Y., Lv, J., Shi, Y., Gao, J., Hua, J., Song, C., et al. (2019a). EGR2 phosphatase regulates OST1 kinase activity and freezing tolerance in Arabidopsis. EMBO J. 38, e99819. doi: 10.15252/embj.201899819
Ding, Y., Shi, Y., Yang, S. (2019b). Advances and challenges in uncovering cold tolerance regulatory mechanisms in plants. New Phytol. 222, 1690–1704. doi: 10.1111/nph.15696
Doherty, C. J., Van Buskirk, H. A., Myers, S. J., Thomashow, M. F. (2009). Roles for Arabidopsis CAMTA transcription factors in cold-regulated gene expression and freezing tolerance. Plant Cell 21, 972–984. doi: 10.1105/tpc.108.063958
Finkler, A., Ashery-Padan, R., Fromm, H. (2007). CAMTAs: calmodulin-binding transcription activators from plants to human. FEBS Lett. 581, 3893–3898. doi: 10.1016/j.febslet.2007.07.051
Gordon, D. B., Nekludova, L., McCallum, S., Fraenkel, E. (2005). TAMO: a flexible, object-oriented framework for analyzing transcriptional regulation using DNA-sequence motifs. Bioinformatics 21, 3164–3165. doi: 10.1093/bioinformatics/bti481
Hayford, R. K., Serba, D. D., Xie, S., Ayyappan, V., Thimmapuram, J., Saha, M. C., et al. (2022). Global analysis of switchgrass (Panicum virgatum l.) transcriptomes in response to interactive effects of drought and heat stresses. BMC Plant Biol. 22, 107. doi: 10.1186/s12870-022-03477-0
Ikeuchi, M., Iwase, A., Rymen, B., Lambolez, A., Kojima, M., Takebayashi, Y., et al. (2017). Wounding triggers callus formation via dynamic hormonal and transcriptional changes. Plant Physiol. 175, 1158–1174. doi: 10.1104/pp.17.01035
Joshi, R., Wani, S. H., Singh, B., Bohra, A., Dar, Z. A., Lone, A. A., et al. (2016). Transcription factors and plants response to drought stress: Current understanding and future directions. Front. Plant Sci. 7. doi: 10.3389/fpls.2016.01029
Kakei, Y., Masuda, H., Nishizawa, N. K., Hattori, H., Aung, M. S. (2021). Elucidation of novel cis-regulatory elements and promoter structures involved in iron excess response mechanisms in rice using a bioinformatics approach. Front. Plant Sci. 12. doi: 10.3389/fpls.2021.660303
Knight, M. R., Knight, H. (2012). Low-temperature perception leading to gene expression and cold tolerance in higher plants. New Phytol. 195, 737–751. doi: 10.1111/j.1469-8137.2012.04239.x
Lamers, J., van der Meer, T., Testerink, C. (2020). How plants sense and respond to stressful Environments1 [OPEN]. Plant Physiol. 182, 1624–1635. doi: 10.1104/pp.19.01464
Liao, Y., Zou, H.-F., Wei, W., Hao, Y.-J., Tian, A.-G., Huang, J., et al. (2008). Soybean GmbZIP44, GmbZIP62 and GmbZIP78 genes function as negative regulator of ABA signaling and confer salt and freezing tolerance in transgenic arabidopsis. Planta 228, 225–240. doi: 10.1007/s00425-008-0731-3
Liu, Y., Dang, P., Liu, L., He, C. (2019). Cold acclimation by the CBF–COR pathway in a changing climate: Lessons from Arabidopsis thaliana. Plant Cell Rep. 38, 511–519. doi: 10.1007/s00299-019-02376-3
Liu, Q., Kasuga, M., Sakuma, Y., Abe, H., Miura, S., Yamaguchi-Shinozaki, K., et al. (1998). Two transcription factors, DREB1 and DREB2, with an EREBP/AP2 DNA binding domain separate two cellular signal transduction pathways in drought- and low-temperature-responsive gene expression, respectively, in arabidopsis. Plant Cell 10, 1391–1406. doi: 10.1105/tpc.10.8.1391
Liu, H., Ouyang, B., Zhang, J., Wang, T., Li, H., Zhang, Y., et al. (2012). Differential modulation of photosynthesis, signaling, and transcriptional regulation between tolerant and sensitive tomato genotypes under cold stress. PloS One 7, e50785. doi: 10.1371/journal.pone.0050785
Liu, M.-J., Sugimoto, K., Uygun, S., Panchy, N., Campbell, M. S., Yandell, M., et al. (2018). Regulatory divergence in wound-responsive gene expression between domesticated and wild tomato. Plant Cell 30, 1445–1460. doi: 10.1105/tpc.18.00194
Li, H., Ye, K., Shi, Y., Cheng, J., Zhang, X., Yang, S. (2017). BZR1 positively regulates freezing tolerance via CBF-dependent and CBF-independent pathways in Arabidopsis. mol. Plant 10, 545–559. doi: 10.1016/j.molp.2017.01.004
Lovell, J. T., MacQueen, A. H., Mamidi, S., Bonnette, J., Jenkins, J., Napier, J. D., et al. (2021). Genomic mechanisms of climate adaptation in polyploid bioenergy switchgrass. Nature 590, 438–444. doi: 10.1038/s41586-020-03127-1
Maechler, M., Rousseeuw, P., Struyf, A., Hubert, M., Hornik, K. (2012). Cluster: cluster analysis basics and extensions. R package version 1 (2), 56.
Manasa, L., Panigrahy, M., Panigrahi, K. C. S., Rout, G. R. (2021). Overview of cold stress regulation in plants. Bot. Rev 88, 359–387. doi: 10.1007/s12229-021-09267-x
Maruyama, K., Takeda, M., Kidokoro, S., Yamada, K., Sakuma, Y., Urano, K., et al. (2009). Metabolic pathways involved in cold acclimation identified by integrated analysis of metabolites and transcripts regulated by DREB1A and DREB2A. Plant Physiol. 150, 1972–1980. doi: 10.1104/pp.109.135327
Meng, X., Liang, Z., Dai, X., Zhang, Y., Mahboub, S., Ngu, D. W., et al. (2021). Predicting transcriptional responses to cold stress across plant species. Proc. Natl. Acad. Sci. 118, e2026330118. doi: 10.1073/pnas.2026330118
Miranda, J. A., Avonce, N., Suárez, R., Thevelein, J. M., Van Dijck, P., Iturriaga, G. (2007). A bifunctional TPS–TPP enzyme from yeast confers tolerance to multiple and extreme abiotic-stress conditions in transgenic arabidopsis. Planta 226, 1411–1421. doi: 10.1007/s00425-007-0579-y
Moore, B. M., Lee, Y. S., Wang, P., Azodi, C., Grotewold, E., Shiu, S.-H. (2022). Modeling temporal and hormonal regulation of plant transcriptional response to wounding. Plant Cell 34, 867–888. doi: 10.1093/plcell/koab287
Ohama, N., Kusakabe, K., Mizoi, J., Zhao, H., Kidokoro, S., Koizumi, S., et al. (2016). The transcriptional cascade in the heat stress response of arabidopsis is strictly regulated at the level of transcription factor expression. Plant Cell 28, 181–201. doi: 10.1105/tpc.15.00435
O’Malley, R. C., Huang, S. C., Song, L., Lewsey, M. G., Bartlett, A., Nery, J. R., et al. (2016). Cistrome and epicistrome features shape the regulatory DNA landscape. Cell 165, 1280–1292. doi: 10.1016/j.cell.2016.04.038
Pandey, G. K., Grant, J. J., Cheong, Y. H., Kim, B. G., Li, L., Luan, S. (2005). ABR1, an APETALA2-domain transcription factor that functions as a repressor of ABA response in arabidopsis. Plant Physiol. 139, 1185–1193. doi: 10.1104/pp.105.066324
Park, S., Gilmour, S. J., Grumet, R., Thomashow, M. F. (2018). CBF-dependent and CBF-independent regulatory pathways contribute to the differences in freezing tolerance and cold-regulated gene expression of two Arabidopsis ecotypes locally adapted to sites in Sweden and Italy. PloS One 13, e0207723. doi: 10.1371/journal.pone.0207723
Park, S.-I., Kwon, H. J., Cho, M. H., Song, J. S., Kim, B.-G., Baek, J., et al. (2021). The OsERF115/AP2EREBP110 transcription factor is involved in the multiple stress tolerance to heat and drought in rice plants. Int. J. Mol. Sci. 22, 7181. doi: 10.3390/ijms22137181
Park, S., Lee, C.-M., Doherty, C. J., Gilmour, S. J., Kim, Y., Thomashow, M. F. (2015). Regulation of the arabidopsis CBF regulon by a complex low-temperature regulatory network. Plant J. 82, 193–207. doi: 10.1111/tpj.12796
Pingault, L., Palmer, N. A., Koch, K. G., Heng-Moss, T., Bradshaw, J., Seravalli, J., et al. (2020). Differential defense responses of upland and lowland switchgrass cultivars to a cereal aphid pest 23. 21 (21), 7966. doi: 10.3390/ijms21217966
Ramirez, V. E., Poppenberger, B. (2020). Modes of brassinosteroid activity in cold stress tolerance. Front. Plant Sci. 11. doi: 10.3389/fpls.2020.583666
Ren, D., Liu, Y., Yang, K.-Y., Han, L., Mao, G., Glazebrook, J., et al. (2008). A fungal-responsive MAPK cascade regulates phytoalexin biosynthesis in Arabidopsis. Proc. Natl. Acad. Sci. 105, 5638–5643. doi: 10.1073/pnas.0711301105
Robinson, M. D., McCarthy, D. J., Smyth, G. K. (2010). edgeR: a bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140. doi: 10.1093/bioinformatics/btp616
Sage, R. F., de Melo Peixoto, M., Friesen, P., Deen, B. (2015). C 4 bioenergy crops for cool climates, with special emphasis on perennial c 4 grasses. J. Exp. Bot. 66, 4195–4212. doi: 10.1093/jxb/erv123
Sanderson, M. A., Adler, P. R., Boateng, A. A., Casler, M. D., Sarath, G. (2006). Switchgrass as a biofuels feedstock in the USA. Can. J. Plant Sci. 86, 1315–1325. doi: 10.4141/P06-136
Stockinger, E. J., Gilmour, S. J., Thomashow, M. F. (1997). Arabidopsis thaliana CBF1 encodes an AP2 domain-containing transcriptional activator that binds to the c-repeat/DRE, a cis-acting DNA regulatory element that stimulates transcription in response to low temperature and water deficit. Proc. Natl. Acad. Sci. U. S. A. 94, 1035–1040. doi: 10.1073/pnas.94.3.1035
Thomashow, M. F. (2010). Molecular basis of plant cold acclimation: Insights gained from studying the CBF cold response pathway: Figure 1. Plant Physiol. 154, 571–577. doi: 10.1104/pp.110.161794
Uygun, S., Seddon, A. E., Azodi, C. B., Shiu, S.-H. (2017). Predictive models of spatial transcriptional response to high salinity. Plant Physiol. 174, 450–464. doi: 10.1104/pp.16.01828
Wei, Y., Chen, H., Wang, L., Zhao, Q., Wang, D., Zhang, T. (2022). Cold acclimation alleviates cold stress-induced PSII inhibition and oxidative damage in tobacco leaves. Plant Signal. Behav. 17, 2013638. doi: 10.1080/15592324.2021.2013638
Weirauch, M. T., Yang, A., Albu, M., Cote, A. G., Montenegro-Montero, A., Drewe, P., et al. (2014). Determination and inference of eukaryotic transcription factor sequence specificity. Cell 158, 1431–1443. doi: 10.1016/j.cell.2014.08.009
Wolfe, K. H., Gouy, M., Yang, Y. W., Sharp, P. M., Li, W. H. (1989). Date of the monocot-dicot divergence estimated from chloroplast DNA sequence data. Proc. Natl. Acad. Sci. 86, 6201–6205. doi: 10.1073/pnas.86.16.6201
Xie, Z., Lin, W., Yu, G., Cheng, Q., Xu, B., Huang, B. (2019). Improved cold tolerance in switchgrass by a novel CCCH-type zinc finger transcription factor gene, PvC3H72, associated with ICE1–CBF–COR regulon and ABA-responsive genes. Biotechnol. Biofuels 12, 224. doi: 10.1186/s13068-019-1564-y
Yu, Y., Qian, Y., Jiang, M., Xu, J., Yang, J., Zhang, T., et al. (2020). Regulation mechanisms of plant basic leucine zippers to various abiotic stresses. Front. Plant Sci. 11. doi: 10.3389/fpls.2020.01258
Yu, C., Song, L., Song, J., Ouyang, B., Guo, L., Shang, L., et al. (2018). ShCIGT, a trihelix family gene, mediates cold and drought tolerance by interacting with SnRK1 in tomato. Plant Sci. 270, 140–149. doi: 10.1016/j.plantsci.2018.02.012
Zhang, P., Duo, T., Wang, F., Zhang, X., Yang, Z., Hu, G. (2021). De novo transcriptome in roots of switchgrass (Panicum virgatum l.) reveals gene expression dynamic and act network under alkaline salt stress. BMC Genomics 22, 82. doi: 10.1186/s12864-021-07368-w
Zhang, C., Peng, X., Guo, X., Tang, G., Sun, F., Liu, S., et al. (2018). Transcriptional and physiological data reveal the dehydration memory behavior in switchgrass (Panicum virgatum l.). Biotechnol. Biofuels 11, 91. doi: 10.1186/s13068-018-1088-x
Zhou, P., Enders, T. A., Myers, Z. A., Magnusson, E., Crisp, P. A., Noshay, J. M., et al. (2022). Prediction of conserved and variable heat and cold stress response in maize using cis-regulatory information. Plant Cell 34, 514–534. doi: 10.1093/plcell/koab267
Zhuo, Y., Zhang, Y., Xie, G., Xiong, S. (2015). Effects of salt stress on biomass and ash composition of switchgrass (Panicum virgatum). Acta Agric. Scand. Sect. B — Soil Plant Sci. 65, 300–309. doi: 10.1080/09064710.2015.1006670
Zou, C., Sun, K., Mackaluso, J. D., Seddon, A. E., Jin, R., Thomashow, M. F., et al. (2011). Cis -regulatory code of stress-responsive transcription in Arabidopsis thaliana. Proc. Natl. Acad. Sci. 108, 14992–14997. doi: 10.1073/pnas.1103202108
Keywords: Temporal transcriptional response, random forest classifier, regulation of cold stress, machine learning model interpretation, novel cis-regulatory sequences
Citation: Ranaweera T, Brown BNI, Wang P and Shiu S-H (2022) Temporal regulation of cold transcriptional response in switchgrass. Front. Plant Sci. 13:998400. doi: 10.3389/fpls.2022.998400
Received: 19 July 2022; Accepted: 16 September 2022;
Published: 10 October 2022.
Edited by:
Victoria Mironova, Radboud University, NetherlandsReviewed by:
Desalegn D. Serba, United States Arid Land Agricultural Research Center (USDA), United StatesPeng Zhou, National Engineering Laboratory for Crop Molecular Breeding (CAAS), China
Copyright © 2022 Ranaweera, Brown, Wang and Shiu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Peipei Wang, peipeiw@msu.edu; Shin-Han Shiu, shius@msu.edu