- 1Horticultural Sciences Department, IFAS, University of Florida, Gainesville, FL, United States
- 2Gulf Coast Research and Education Center, IFAS, University of Florida, Wimauma, FL, United States
Octoploid strawberry (Fragaria ×ananassa) is a major specialty crop under intense annual selection for traits relating to plant vigor and fruit quality. Most functional validation experiments rely on transgenic or transient gene expression assays in the mature receptacle. These findings are not typically translatable to breeding without identifying a natural genetic source of transcript level variation, and developing reliable markers for selection in octoploids. Expression QTL (eQTL) analysis is a genetic/transcriptomic association approach for identifying sequence variants predicting differential expression. This eQTL study analyzed a wide array of mature receptacle-expressed genes, encompassing the majority of total mature receptacle transcript accumulation and almost all strawberry genes described in the literature. These results identified segregating genetic variants associated with the differential expression of hundreds of strawberry genes, many with known interest to breeders. Several of these eQTL pertain to published genes whose expression levels have been demonstrated to influence mature receptacle phenotypes. Many include key genes of the phenylpropanoid pathway, vitamin C, carotenoid, pectin, and receptacle carbohydrate/sugar metabolism. These subgenome-specific genetic markers may allow breeders to select for desired ranges of target gene expression. These results may also guide basic research efforts and facilitate the identification of causal genes underlying trait QTL.
Introduction
Strawberry is a major specialty crop cultivated worldwide for its sweet and flavorful receptacle, which is commonly referred to as a fruit. Strawberry is under intense breeder selection for new cultivars based on diverse traits. These include receptacle color, firmness, sweetness, yield, flowering time, shipping quality, shelf life, nutrition, flavors, aromas, and disease resistance. The genomics era has provided a dense collection of phenotypically important genes that have been experimentally validated via transgenic analysis. However, this basic research often stops short of application, as genetic markers associated with traits are not concurrently developed for use in breeding. Several resource and technology advances have recently converged to enable high-quality octoploid expression quantitative trait loci (eQTL) analysis. These include an octoploid genome reference (Edger et al., 2019), high-density subgenomic genotyping via the IStraw35 platform (Verma et al., 2017), and octoploid reference-based transcriptome assembly.
eQTL analysis relates genotypic and transcriptomic data to identify segregating genomic regions influencing differential gene expression. Identifying eQTL provides major advantages over pure transcriptomic analysis. The results of an eQTL analysis identify the subset of genes whose differential expression is determined by genotype, the extent of that genetic influence, and markers that may be used for selection of desired gene expression ranges. These selectable markers are potentially useful for application where strawberry phenotypes are known to be influenced by transcript abundance, including genes that have been characterized via transgenic overexpression or silencing in the strawberry receptacle. These eQTL markers may be applied to translate transgenic discoveries into breeding tools. In addition, eQTL controlling transcripts of undetermined function in strawberry can support candidate gene evaluation and trait-based QTL cloning. In one example, simple cross-referencing of trait QTL and eQTL markers identified a causal aroma biosynthesis gene in melon (Galpaz et al., 2018). In strawberry, eQTL experiments helped identify the γ-decalactone biosynthesis gene in the octoploid mature receptacle even while limited to incomplete de novo and diploid reference-based RNAseq assemblies (Sánchez-Sevilla et al., 2014). Using the recent subgenome-scale octoploid genome for ‘Camarosa’, 76 mature receptacle-expressed disease resistance genes (R-genes) were identified to be under the control of an eQTL (Barbey et al., 2019).
Most cis-eQTL are caused by sequence variants in or near the gene promoter region (Michaelson et al., 2009). As the approximate causal locus of a cis-eQTL is known, the resolution limits of the IStraw35 genotyping array can be measured. The distance from the originating gene locus to the most-correlated subgenomic marker, when sampled across hundreds of cis-eQTL, essentially creates a probability distribution for QTL size-resolution in studies under similar conditions. This distribution can be usefully applied to octoploid QTL studies where the causal variant locus is not known a priori.
In this work, three octoploid strawberry populations were generated from cultivars varying for fruit quality attributes, such as firmness, sweetness, aroma, and flavor compounds (Vance et al., 2011; Whitaker et al., 2011). Mature receptacle transcriptomes from identical developmental stages were generated and compared against genotype. Analyzed transcripts include those with comparatively high accumulation, those representing differentially expressed genes, and a near-complete list of all published octoploid strawberry genes. Data from the octoploid ‘Camarosa’ strawberry gene expression atlas (Sánchez-Sevilla et al., 2017) were used to profile the expression of these genes throughout the plant. Genetic associations were filtered based on false-discovery rate (FDR) adjusted p-value, effect size, minor allele frequency, and other criteria. Collectively, these results specify the major genetics-based expression differences between cultivars, and the selectable genetics predicting them. These findings bridge basic and applied biology and provide a means to convert previous molecular research directly into plant breeding efforts.
Materials and Methods
Plant Materials
Three strawberry flavor and aroma populations were created from Florida cultivars and ‘Mara des Bois’, which possesses unique receptacle quality and aroma traits (Figure S1). These populations were derived from the crosses ‘Florida Elyana’ × ‘Mara des Bois’ (population 10.113), ‘Mara des Bois’ × ‘Florida Radiance’ (population 13.75), and ‘Strawberry Festival’ × ‘Winter Dawn’ (population 13.76). Mature receptacles were harvested fully ripe from the field during winter growing seasons at the Gulf Coast Research and Education Center (GCREC) in Wimauma, Florida. Populations 13.75 and 13.76 were harvested during the winter of 2014. Population 10.113 was sampled on January 20, February 11, February 25, and March 18, 2011 (Chambers, 2013). Harvest days were selected based on dry weather and moderate temperature, both on the day of harvest and for several days preceding harvest.
Genotyping of Octoploid Strawberry Lines
The Affymetrix IStraw35 Axiom® SNP array (Verma et al., 2017) was used to genotype 61 individuals consisting of parents and progeny from crosses of ‘Mara des Bois’ × ‘Florida Elyana’, ‘Mara des Bois’ × ‘Florida Radiance’, and ‘Strawberry Festival’ × ‘Winter Dawn’ (Figure S1). Sequence variants belonging to the Poly High Resolution (PHR) and No Minor Homozygote (NMH) marker classes were included for association mapping. Mono High Resolution (MHR), Off-Target Variant (OTV), Call Rate Below Threshold (CRBT), and Other marker quality classes were discarded and not used for mapping. Individual marker calls inconsistent with disomic Mendelian inheritance from parental lines were removed. Genetic relatedness was evaluated with the VanRaden method as implemented in the GAPIT v2 package (Tang et al., 2016) in R (Figure S2).
Transcriptome Analysis
Octoploid mature receptacle transcriptomes from 61 individuals were sequenced via Illumina paired-end RNAseq (avg. 65 million 2 × 100-bp reads), consisting of parents and progeny from crosses of ‘Florida Elyana’ × ‘Mara des Bois’, ‘Florida Radiance’ × ‘Mara des Bois’, and ‘Strawberry Festival’ × ‘Winter Dawn’. Reads were trimmed and mapped to the Fragaria ×ananassa octoploid ‘Camarosa’ annotated genome (Edger et al., 2019) using CLC Genomics Workbench 11 with a mismatch cost of 2, insertion cost of 3, deletion cost of 3, length fraction of 0.8, similarity fraction of 0.8, and 1 maximum hit per read. Reads which mapped equally well to multiple loci were discarded. RNAseq counts were calculated in Transcripts Per Million (TPM). Transcript levels were normalized via the Box-Cox transformation algorithm (Box and Cox, 1964) performed in R-Studio (Racine, 2011) prior to genetic correlation. The BLAST2GO pipeline (Conesa et al., 2005) was used to annotate the full ‘Camarosa’ predicted gene complement. Raw reads from the strawberry gene expression atlas study (Sánchez-Sevilla et al., 2017) were aligned to the ‘Camarosa’ genome using identical procedures, with biological replicates averaged and compared for tissue-based expression using ClustVis (Metsalu and Vilo, 2015) with default parameters.
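For readers unfamiliar with the TPM unit used throughout, the following is a minimal sketch of the standard TPM definition. It is not the CLC Genomics Workbench implementation; the function name `tpm` and the example counts and transcript lengths are hypothetical.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to Transcripts Per Million (standard definition).

    counts     -- mapped read counts per transcript (hypothetical example input)
    lengths_bp -- transcript lengths in base pairs, same order as counts
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    # Step 1: length-normalize each count (reads per kilobase of transcript).
    rates = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    total = sum(rates)
    if total == 0:
        return [0.0] * len(counts)
    # Step 2: rescale so that all transcripts in the sample sum to one million.
    return [r / total * 1e6 for r in rates]


# Made-up values: counts proportional to length give identical TPM.
values = tpm([100, 200, 300], [1000, 2000, 3000])
```

Because TPM values sum to one million within each sample, they express relative abundance within a sample, which is why a further normalization step (Box-Cox in this study) precedes association testing.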
Identification of High-Variance and Highly Expressed Genes
The 2,000 mature receptacle transcripts with the highest coefficient of variation between samples were identified via 1-Pearson correlation distance using the heat map clustering algorithm in CLC Genomics Workbench 11 (Figure S3). The 2,000 mature receptacle transcripts with the highest total expression were identified by calculating the sum total expression for each ‘Camarosa’ transcript across all samples (Figure S4).
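The two selection criteria above can be illustrated with a short sketch. It ranks transcripts directly by coefficient of variation and by summed expression. It does not reproduce the CLC heat map clustering step, and the gene names, TPM values, and helper names (`coefficient_of_variation`, `top_by`) are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (0 if the mean is 0)."""
    m = mean(values)
    if m == 0:
        return 0.0
    return stdev(values) / m


def top_by(expression, score, n):
    """Return the n transcript IDs with the highest score(values)."""
    ranked = sorted(expression, key=lambda gene: score(expression[gene]), reverse=True)
    return ranked[:n]


# Hypothetical TPM values across four samples.
expression = {
    "geneA": [10.0, 10.0, 10.0, 10.0],      # constant, low abundance
    "geneB": [1.0, 50.0, 2.0, 80.0],        # highly variable
    "geneC": [500.0, 520.0, 480.0, 510.0],  # abundant, stable
}

most_variable = top_by(expression, coefficient_of_variation, 1)
most_abundant = top_by(expression, sum, 1)
```

In the study the cutoff was n = 2,000 for each list. The toy example uses n = 1 only to show that the two criteria can select different genes.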
Retrieval of Published Strawberry Gene Sequences
All 607 non-redundant mRNA accessions under the query “Fragaria ×ananassa” were retrieved from the public databases collectively housed at NCBI. This list included all transiently modified strawberry genes compiled in a review (Carvalho et al., 2016) as well as other recently characterized genes in strawberry. Of these, 493 accessions contained an annotated coding sequence (CDS). All retrieved sequences not containing a CDS annotation were determined to be misannotated microsatellite sequences and discarded. The extracted coding sequences were compared by BLAST to identify the most similar gene in the octoploid ‘Camarosa’ reference genome, identifying 380 unique putative orthologs. This figure was somewhat smaller than the query size, as many deposited mRNA sequences represent alleles or splice isoforms corresponding to a single orthologous gene in the non-haplotype-specific cv. ‘Camarosa’ genome. Transcriptome data for these corresponding ‘Camarosa’ genes were used in eQTL analysis.
Genetic Association of Gene Expression (eQTL)
Genome-wide association was performed via a mixed linear model approach using the GAPIT v2 package (Tang et al., 2016) in R. The diploid Fragaria vesca physical map was used to orient marker positions, as current genetic maps in octoploid do not include a majority of the available IStraw35 markers. eQTL were evaluated for significance based on the presence of multiple co-locating markers of p-value < 0.05 after false discovery rate (FDR) correction for multiple comparisons (Benjamini and Hochberg, 1995). Cis vs trans eQTL determinations were made by corroborating the known ‘Camarosa’ gene position with the eQTL marker position in the physical map.
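The Benjamini-Hochberg adjustment cited above can be sketched as follows. This is a generic illustration of the published procedure, not the GAPIT code path, and the raw p-values are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg FDR-adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical single-marker p-values for one transcript.
raw = [0.001, 0.01, 0.03, 0.04, 0.5]
adj = benjamini_hochberg(raw)
```

Each adjusted value is the smallest FDR level at which that marker would be declared significant. The study additionally required multiple co-locating markers below the 0.05 threshold.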
Results
In all, 268 robust cis and trans eQTL were discovered relating to the mature receptacle expression of 224 octoploid strawberry genes (Table S1). cis-eQTL were found abundantly across all subgenomes (Figure 1). A vast majority of the identified eQTL loci were found proximal to the originating gene locus, within 0.42 Mb median distance on the same homoeologous chromosome (0.053% of the ‘Camarosa’ octoploid genome length) (Table S1). A frequency plot of cis-eQTL (N = 213) marker/gene distances is presented in Figure 2. A plurality (16%) of cis-eQTL gene/marker distances are located within 0.1 Mb of the originating gene locus. Larger gene/marker distances are progressively rarer until reaching a frequency minimum around 1 Mb. Approximately 90% of gene/marker distances are within this interval. Most eQTL display stepwise changes in transcript accumulation corresponding to allelic dosage, with many displaying near-zero transcript expression in one homozygous state (Table S1, File S1).
Figure 1 Composite Manhattan plot for octoploid fruit cis-eQTL. The ‘Camarosa’ genome position of the most-correlated marker for each cis-eQTL is shown with single-marker p-value, effect size and BLAST2GO gene annotation.
Figure 2 Subgenomic distances (Mb) between the most-correlated cis-eQTL marker and the originating gene locus. The frequency of each marker/gene distance observation is indicated (bin size = 0.1 Mb).
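A frequency distribution like the one in Figure 2 can be derived from a list of gene/marker distances with a few lines of code. The sketch below is illustrative only: the distances are made up, and the helper names `bin_distances` and `fraction_within` are hypothetical.

```python
def bin_distances(distances_mb, bin_size=0.1):
    """Count distances per bin; bin k covers [k*bin_size, (k+1)*bin_size)."""
    counts = {}
    for d in distances_mb:
        # Note: values lying exactly on a bin edge may fall either side
        # because of floating-point division; the example avoids such values.
        index = int(d / bin_size)
        counts[index] = counts.get(index, 0) + 1
    return counts


def fraction_within(distances_mb, limit_mb):
    """Fraction of gene/marker distances at or below limit_mb."""
    return sum(1 for d in distances_mb if d <= limit_mb) / len(distances_mb)


# Hypothetical distances (Mb) between a gene and its most-correlated marker.
distances = [0.02, 0.05, 0.15, 0.42, 0.95, 2.35]
histogram = bin_distances(distances)
within_1mb = fraction_within(distances, 1.0)
```

Applied to the 213 cis-eQTL distances in Table S1, the same cumulative fraction gives an empirical estimate of how far a causal variant is likely to lie from the top marker.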
Thirty-five eQTL relate to strawberry alleles known to influence fruit traits via transgenic analyses (Table 1) or which were experimentally described in strawberry literature (Table 2). The magnitude of experimental overexpression/silencing for each gene, and its biological effect in a given cultivar, is shown in comparison with eQTL-associated transcript ranges. In several cases these eQTL naturally replicate transcript accumulation levels observed after transgenic manipulation (Tables 1 and 2). These include the published strawberry mature receptacle transcription factors FanMYB10 (Medina-Puche et al., 2014; Kadomura-Ishikawa et al., 2015; Medina-Puche et al., 2015), FanEOBII (Medina-Puche et al., 2015), FanSnRK2.6 (Han et al., 2015), FanSCL8 (Pillet et al., 2015), and the phenylpropanoid-modulating genes FanCCR (Yeh et al., 2014), FanF3′H (Miyawaki et al., 2012), FanGT1 (Griesser et al., 2008), and FanFra a3 (Muñoz et al., 2010). Boxplots showing mature receptacle transcript ranges stratified by marker genotype (AA, AB, or BB) are provided for all genes, together with ANOVA omnibus p values and post hoc significances (File S1).
Table 1 eQTL pertaining to transgenically characterized F. ×ananassa genes (cv. Camarosa exact putative ortholog).
Many of the remaining eQTL-associated genes were further investigated due to their collective participation in mature receptacle quality pathways. These include key genes relevant to phenylpropanoid metabolism (Figure 3), flavonoid biosynthesis (Figure 4), monolignol biosynthesis (Figure 5), and pectin metabolism (Figure 6). For fruit phenylpropanoid metabolism, eQTL were discovered for PHENYLALANINE AMMONIA LYASE, 4-COUMARATE CoA LIGASE, CINNAMATE β-D-GLUCOSYLTRANSFERASE, and CHALCONE 2′-O-GLUCOSYLTRANSFERASE (Figure 3). Relating to fruit flavonoid biosynthesis, eQTL were discovered for FLAVONOID 3′-HYDROXYLASE, ANTHOCYANIN SYNTHASE, ANTHOCYANIN/FLAVONOL-SPECIFIC UDP-GLUCOSYLTRANSFERASE, and DIHYDROFLAVONOL 4-REDUCTASE (two homoeologs) (Figure 4). For fruit monolignol biosynthesis, eQTL were discovered for HYDROXYCINNAMOYL TRANSFERASE, CINNAMOYL CoA REDUCTASE, 4-COUMARATE CoA LIGASE, ALCOHOL DEHYDROGENASE, and CAFFEIC ACID O-METHYLTRANSFERASE (Figure 5). These eQTL-associated genes are outlined in context with their pathways, and boxplots show transcript ranges stratified by marker genotype with supporting statistics. For fruit pectin metabolism, eQTL were discovered for PECTIN ESTERASE 3 (two non-homoeologs), PECTIN METHYLESTERASE INHIBITOR (two homoeologs and one non-homoeolog), and POLYGALACTURONASE (two non-homoeologs) (Figure 6). Pectin metabolism-related mature receptacle transcript expression values are shown across parental cultivars (Figure 6A) with boxplots of TPM stratified by marker genotype (Figure 6B). Genome-wide Manhattan plots for these genes are provided in File S3, demonstrating multiple significant markers at each locus.
Figure 3 eQTL controlling transcript accumulation of key genes in the strawberry phenylpropanoid pathway (PPP). Marker effect sizes are indicated by boxplots stratified by allelic state (AA, AB, or BB) and shown with ANOVA p-values. eQTL genes based on the ‘Camarosa’ genome are indicated as either possessing the highest sequence identity to the published sequence (purple) or not (green). Letters represent statistically separable means via Tukey’s HSD post hoc test (p < 0.05).
Figure 4 eQTL controlling transcript accumulation of key genes in the flavonoid pathway. Marker effect sizes are indicated by boxplots stratified by allelic state (AA, AB, or BB) and shown with ANOVA p-values. eQTL genes based on the ‘Camarosa’ genome are indicated as either possessing the highest sequence identity to the published sequence (purple) or not (green). Letters represent statistically separable means via Tukey’s HSD post hoc test (p < 0.05).
Figure 5 eQTL controlling monolignol pathway gene expression. Marker effect sizes are indicated by box plots stratified by allelic state (AA, AB, or BB) and shown with ANOVA p-values. eQTL genes based on the ‘Camarosa’ genome are indicated as either possessing the highest sequence identity to the published sequence (purple) or not (green). Letters represent statistically separable means via Tukey’s HSD post hoc test (p < 0.05).
Figure 6 eQTL pertaining to strawberry pectin metabolism. (A) Transcript accumulation in parental lines, GWAS-derived FDR p values and narrow-sense heritability estimates, single-marker R2 and p values are shown. Phenotype distributions based on allelic state are shown for (B) pectin esterases, (C) pectin esterase inhibitors, and (D) polygalacturonases. Letters represent statistically separable means via Tukey’s HSD post hoc test (p < 0.05).
Other eQTL are highlighted for genes whose transcript levels are known in strawberry to influence sugar/carbohydrate metabolism, L-ascorbic acid content, and carotenoid metabolism (Table 3). Large mature receptacle transcript abundance differences are dependent upon allele dosage for the genes D-GALACTURONATE REDUCTASE, PHYTOENE CHLOROPLASTIC, bidirectional sugar transporter SWEET1, and many others. A complete list of all 268 eQTL and supporting statistics is presented in Table S1, including mean +/- SD transcript values for each marker genotype, minor allele and minor allele frequencies, transcript variance explained via single-marker analysis (omnibus R2), narrow-sense heritability estimates for mature receptacle transcript accumulation (h2) with FDR-adjusted p-values, phase (cis or trans), physical distance between the originating gene and the cis-eQTL, and citations for published genes. Raw transcript abundances in mature receptacles across the eQTL populations (Table S2) and in various ‘Camarosa’ tissues (Sánchez-Sevilla et al., 2017) (Table S3) are provided. Complete IStraw35 genotypes for all individuals are provided in File S2.
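As an illustration of the single-marker "variance explained" statistic, the sketch below computes the one-way ANOVA R2 (between-genotype sum of squares divided by total sum of squares). That this matches the exact computation behind Table S1 is an assumption; the TPM values and the function name `single_marker_r2` are hypothetical.

```python
def single_marker_r2(tpm_by_genotype):
    """One-way ANOVA R2: between-group sum of squares over total sum of squares.

    tpm_by_genotype -- dict mapping a genotype class ("AA", "AB", "BB")
                       to the transcript values observed in that class
    """
    groups = [values for values in tpm_by_genotype.values() if values]
    all_values = [v for values in groups for v in values]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    if ss_total == 0:
        return 0.0
    ss_between = sum(
        len(values) * (sum(values) / len(values) - grand_mean) ** 2
        for values in groups
    )
    return ss_between / ss_total


# Made-up transcript values showing a stepwise allele-dosage pattern.
example = {
    "AA": [1.0, 2.0, 3.0],
    "AB": [11.0, 12.0, 13.0],
    "BB": [21.0, 22.0, 23.0],
}
r2 = single_marker_r2(example)
```

An R2 near 1 indicates that marker genotype alone accounts for nearly all of the observed transcript variation, as in the stepwise dosage patterns described in the Results.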
Reassembly of raw RNAseq data from various tissues of octoploid ‘Camarosa’ (Sánchez-Sevilla et al., 2017) determined that a majority of eQTL-associated transcripts predominate in the mature receptacle, and are upregulated with ripening (Figure 7). As an external test of the predictive power of each eQTL, the unused ‘Camarosa’ IStraw35 genotype and six mature receptacle transcriptome replicates (Sánchez-Sevilla et al., 2017) were tested against the eQTL-population transcript distributions for each marker genotype (AA, AB, or BB). Mean ‘Camarosa’ TPM fell within the 95% prediction interval for its marker genotype in 240 of 268 cases (Table S1).
Figure 7 Transcript accumulation of fruit eQTL genes across various tissues. Scaled heatmap of gene expression reveals fruit eQTL genes are predominantly or exclusively expressed in the later fruit stages.
Discussion
This work identified eQTL genetic markers associated with differential mature receptacle transcript accumulation between strawberry genotypes. Most of the identified eQTL are cis-variants proximal to the originating gene locus in the ‘Camarosa’ genome and show stepwise increases in transcript accumulation according to allelic dosage, with many having near-zero TPM in one homozygous state. Loci demonstrating this behavior could correspond to gene presence/absence variants (PAVs) between the cultivars used as parents in this study. Gene PAVs are a major driver of agronomic trait variation in Brassica napus, with nearly 40% of genes showing PAVs in the pangenome (Hurgobin et al., 2018). Gene PAVs are caused mainly by homoeologous exchange (Hurgobin et al., 2018), which has extensively shaped the octoploid strawberry genome (Edger et al., 2019). It seems possible that PAVs could also be a major driver of diversity in octoploid strawberry. Discovery of octoploid eQTL, including those representing PAVs, is limited by the single reference genome available for octoploid transcriptome assembly. This eQTL study hints at the influence of gene PAVs in octoploid cultivar diversity. These eQTL can be immediately leveraged for basic biological investigation and, in some cases, genetic selection for several strawberry traits.
Pectin Metabolism
Pectin metabolism is a central feature of ripening-associated sweetening and softening. Pectin metabolism is mainly regulated by differential expression of pectin methylesterases and methylesterase inhibitors (Di Matteo et al., 2005). Several eQTL were identified for these genes, including three pectin methylesterase inhibitors (PMEI), two homoeologs of PECTIN METHYLESTERASE 3 (PME3), and two non-homologous POLYGALACTURONASE (PG) genes. The eQTL effect size is very strong for the PMEI gene “augustus_masked-Fvb7-3-processed-gene-41.7” (baseline average 29 TPM), where a single segregating allele leads to expression in excess of 1,500 TPM (>50-fold increase) (R2 = 0.78, single-marker p-value 2.2e-16). This large difference is present among modern cultivars. High-PMEI varieties include ‘Mara des Bois’ and ‘Florida Elyana’, while low-expression varieties include ‘Florida Radiance’, ‘Winter Dawn’, and ‘Strawberry Festival’. Transgenic analysis can be used to isolate a possible phenotypic effect of this highly variable gene and to determine an ideal allelic state for variety improvement.
Phenylpropanoid Pathway
The phenylpropanoid pathway (PPP) influences many attributes of the strawberry receptacle, and many commercial strawberry breeding priorities are related to phenylpropanoid metabolism. These attributes include firmness and texture, flavor, color, ripening, quantitative disease resistance, shelf-life, and other facets of fruit quality (Dixon et al., 1996; Vogt, 2010; Peled-Zehavi et al., 2015). Several strawberry PPP genes have been characterized using transient expression analysis in the receptacle (Carvalho et al., 2016). Previous RNAseq-based network analyses in strawberry mature receptacles identified that PPP transcripts tend to be highly abundant and broadly variable between cultivars (Pillet et al., 2015). It is expected that PPP-associated transcript levels should vary in populations arising from crosses of strawberry cultivars with contrasting fruit qualities. This eQTL analysis advances previous findings in the strawberry PPP by identifying the specific octoploid subgenomic alleles which are variably expressed due to genetics, and the selectable sequence variants which predict them.
One eQTL in this category is the major transcription factor FanMYB10 (EU15516, maker-Fvb1-3-augustus-gene-144.30) (R2 = 0.64, single-marker p-value 1.3e-14), which regulates flavonoid and phenylpropanoid metabolism (Medina-Puche et al., 2014). This gene has been studied in strawberry using transgenesis (Kadomura-Ishikawa et al., 2015), but natural variation has not previously been identified. This analysis shows that the cultivars ‘Florida Radiance’, ‘Strawberry Festival’, and ‘Winter Dawn’ have 5-to-8-fold greater FanMYB10 transcript levels compared to ‘Mara des Bois’ and ‘Florida Elyana’, and that these differences are heritable (h2 = 0.93). Relatively modest transient silencing (80-90% of normal) substantially decreased anthocyanin content, whereas relatively modest overexpression (170% of normal) increased anthocyanin content (Kadomura-Ishikawa et al., 2015). The eQTL for FanMYB10 expression naturally approximates the expression level changes achieved through transgenesis. Genetic selection for this eQTL, and others related to the PPP, could lead to more efficient breeding methods for modified anthocyanin content and other PPP-related metabolites.
A robust eQTL was found for a putative HYDROXYCINNAMOYL TRANSFERASE (FanHCT, maker-Fvb7-2-augustus-gene-207.46) (R2 = 0.79, FDR-adjusted p-value 0.000023). Hydroxycinnamoyl transferases function in the PPP to generate diverse substrates for cinnamoyl CoA reductase (CCR) proteins. The candidate FanHCT is among the most abundantly accumulating acyltransferase transcripts in the mature receptacle (averaging about 1:200 total transcripts). However, expression of this major transcript is exclusive to ‘Camarosa’, ‘Florida Radiance’, and segregating progeny (h2 = 1.0). Heterologous downregulation of HCT expression led to enrichment of H-lignins and improved cell wall saccharification in alfalfa (Jackson et al., 2008), a key process in strawberry receptacle ripening. Several other eQTL were found for other genes in the PPP, including UDP-GLUCOSE : CINNAMATE GLUCOSYLTRANSFERASE, an enzyme upstream of HCT.
Vitamin and Nutrient-Associated Transcripts
Both cis and trans eQTL were identified for D-GALACTURONIC ACID REDUCTASE (FanGalUR, AF039182; FDR-adjusted p-value 0.0007). The expression of the FanGalUR transcript is heritable (h2 = 0.71) and shows substantial expression variation (500-2,000 TPM range determined by genotype). Previous research in strawberry demonstrated that L-ascorbic acid content is limited by FanGalUR transcript abundance (Agius et al., 2003). This limitation has also been confirmed in F. chiloensis, F. virginiana, and F. moschata, and experimentally validated in species outside of the Fragaria genus. Transgenic overexpression of the strawberry FanGalUR increased L-ascorbic acid content in Arabidopsis thaliana (Agius et al., 2003), Lactuca sativa L. (Lim et al., 2008), Solanum lycopersicum (Lim et al., 2016), and Solanum tuberosum (Hemavathi et al., 2009). It is therefore likely that selecting for increased FanGalUR transcript in strawberry will lead to increased L-ascorbic acid content.
L-ascorbic acid levels are influenced by additional factors including metabolite degradation (Cruz-Rus et al., 2011). The gene MONODEHYDROASCORBATE REDUCTASE (MDAR) is involved in oxidative stress tolerance and is described as a key component of fruit L-ascorbic acid repair (Cruz-Rus et al., 2011). Both cis and trans-eQTL were discovered for a published strawberry FanMDAR allele (JQ320104) (h2 = 0.68, FDR-adjusted p-value 0.00028), of which the cis-eQTL accounts for 53% of a 3-fold transcript variation (single-marker p-value 8.87e-10). It is possible this heritable fold-change difference contributes to L-ascorbic acid maintenance. This hypothesis could be quickly tested post hoc by examining this new genetic variant in strawberry lines with previously existing L-ascorbic acid data. Additional eQTL were found for the vitamin C antioxidant-associated genes MN-SUPEROXIDE DISMUTASE (FanSODM), GLUTATHIONE PEROXIDASE (FanGPX), and GLUTATHIONE REDUCTASE (FanGR) (Erkan et al., 2008) (Table S2).
Carotenoids
Strawberry carotenoids provide fruit color and photoprotection, and are essential human nutrients (Ruiz-Sola and Rodríguez-Concepción, 2012). Cis-eQTL were discovered for published alleles of strawberry PHYTOENE SYNTHASE (FanPSY, FJ784889) (h2 = 0.60, FDR-adjusted p-value 0.0018) and Ζ-CAROTENE DESATURASE (FanZDS, FJ795343) (h2 = 0.91, FDR-adjusted p-value 0.0034). Single-marker analysis accounts for 62% and 50% of the observed mature receptacle differential expression between genotypes, respectively. PSY is a common control point for substrate flux into the carotenoid pathway in several plants (Fraser et al., 2007) and is often correlated with the upregulation of ZDS transcripts (Fanciullino et al., 2008). In strawberry fruit, Zhu et al. (2015) noted that FanPSY and FanZDS transcript accumulation varies between cultivars and is modestly correlated with carotenoid levels. To assess the impact of FanPSY and FanZDS expression on carotenoid content, these eQTL markers may be used to screen for seedlings that will abundantly express these genes in the fruit. A cis-eQTL accounting for 43% of β-carotene hydroxylase (FanBCH) transcript accumulation variance was also discovered; however, total accumulation was low (h2 = 0.92, single-marker p-value 1.2e-08).
Fruit Ripening Transcription Factors
The strawberry receptacle ripening process is mainly determined by genetic factors (Perkins‐Veazie, 2010). Several eQTL were found for fruit-based transcription factors associated with the ripening process (Figure 6), including the negative regulator FanSnRK2.6, whose natural expression decreases with fruit ripening (Han et al., 2015). The phenotypic effects of FanSnRK2.6 expression have been studied via transgenesis. Transgenic overexpression of FanSnRK2.6 (400% of normal) in octoploid receptacles arrested ripening, while silencing (10% of normal) accelerated ripening (Han et al., 2015). The cis-eQTL for FanSnRK2.6 is associated with a ~5-fold difference in mature receptacle transcript accumulation between cultivars. As this is a similar range to that demonstrated by transgenesis, it is possible that eQTL marker selection could produce phenotypes similar to those observed by transgenesis. However, the influence of FanSnRK2.6 is likely greatest in the developing receptacle, and it is unknown if this eQTL is predictive at earlier receptacle stages.
An eQTL was found for the transcription factor EMISSION OF BENZENOID II (FanEOBII, KM099230). This transcription factor has been experimentally characterized in Fragaria ×ananassa using transient overexpression (Medina-Puche et al., 2015). Transient overexpression of FanEOBII in the receptacle increased levels of eugenol, a desirable volatile organic compound (Medina-Puche et al., 2015). Ripening-related transcript accumulation of FanEOBII is elicited by FanMYB10, a phenylpropanoid pathway transcription factor whose eQTL was previously discussed.
An eQTL was also discovered for the flavonoid-associated transcription factor SCARECROW-LIKE 8 (FanSCL8, F. vesca gene13212). FanSCL8 was identified as a flavonoid pathway regulator using transcriptome network correlation analysis, and was experimentally shown to regulate accumulation of several flavonoid-associated transcripts (Pillet et al., 2015). Additional eQTL were found for one orthologous and one paralogous copy of FanSCL8. These gene copies have different expression patterns (Table S1), suggesting non-redundant functions.
Bridging to Application in Strawberry Breeding
eQTL analysis is a tool for evaluating gene expression using genetics. These concrete genetic differences can be used to explore gene function, and serve as a bridge to marker/trait association. The biochemistry and genetics underlying many important traits in strawberry have been detailed in the literature, though few of these discoveries have been translated into practical markers for breeding. It is frequently the case that informative molecular research does not describe a source of beneficial genetics which can be selected by breeders. This eQTL analysis identified natural genetic variants influencing transcript variation, analogous to transgenic expression levels. These markers may also be used for targeted basic research aimed at genes and transcripts which are highly variable between cultivars. As these eQTL markers are derived from the widely used IStraw35 SNP array platform, they can be easily cross-referenced with trait QTL experiments in silico. This approach can be used to help identify the causal basis of trait QTL in cases where differential expression contributes to traits. It can also be used post hoc to rapidly test existing trait QTL.
These eQTL results highlight the shortcomings of candidate gene discovery driven by transcriptomics alone. With RNAseq data alone, it is typically not possible to discern whether differential expression is due to genetics, environment, or stochastic effects. This genetic association study suggests that most Fragaria ×ananassa fruit transcripts are probably not influenced by segregating genetic variation under normal growth conditions. Even among the biased set of 2,000 fruit genes with the highest transcriptional variance, only 8% were rigorously associated with segregating genetics (Table S1). As myriad genetic and environmental interactions can influence transcript accumulation, further work is warranted to widen the scope of this foundational analysis. Though multiple populations from different seasons were used in this analysis, these results pertain only to mature strawberry receptacle in normal field conditions, and the modest number of transcriptomes (61) is a limitation for estimating heritability and R². Future RNAseq experiments performed in octoploid strawberry are encouraged to utilize low-cost IStraw-based genotyping to facilitate eQTL analysis.
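To illustrate why 61 transcriptomes limit the precision of R² estimates, the sketch below computes a single-marker R² from genotype dosages and expression values, then an approximate 95% interval for the underlying correlation using the Fisher z-transformation. It is an illustration under stated assumptions rather than the analysis performed here: the function names are hypothetical, the dosage coding (0/1/2) is assumed, and the Fisher interval assumes approximately bivariate normal data.

```python
import math


def single_marker_r2(dosages, expression):
    """Pearson r and R^2 between marker dosage (e.g. 0/1/2) and expression."""
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need two equal-length sequences with at least 3 samples")
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in expression)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    if sxx == 0 or syy == 0:
        raise ValueError("marker or expression has no variance")
    r = sxy / math.sqrt(sxx * syy)
    return r, r * r


def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for r via the Fisher z-transformation.

    Requires -1 < r < 1 and n > 3; z_crit=1.96 gives a ~95% interval.
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Hypothetical case: an observed correlation of 0.5 with n = 61 samples.
low, high = fisher_ci(0.5, 61)
print(f"r interval: {low:.2f} to {high:.2f}")
print(f"R^2 interval: {low ** 2:.2f} to {high ** 2:.2f}")
```

Under these assumptions, an observed R² of 0.25 at this sample size is compatible with a wide range of underlying values, which is the limitation noted above.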
Author’s Note
eQTL analysis in octoploid strawberry uncovered genetic variants determining the differential expression of key fruit genes, including published genes where transcript-level variation is known to govern important traits.
Data Availability Statement
The datasets generated for this study are publicly available. Raw short read RNAseq data from fruit transcriptomes are available from the NCBI Short Read Archive under project SRP039356 (http://www.ncbi.nlm.nih.gov/sra/?term=SRP039356). Raw short read RNAseq data from the ‘Camarosa’ gene expression atlas (Sánchez-Sevilla et al., 2017) are available at the European Nucleotide Archive (https://www.ebi.ac.uk/ena) with the study reference PRJEB12420.
Author Contributions
CB conceived and led the research experiment. MH, NM, and CB performed gDNA isolation and genotyping data filtering. AS, MH, NM, and CB evaluated eQTL candidates. AS, MH, and CB performed results collection, organization, and single-marker analysis. SV provided guidance in genotyping and QTL mapping. KF, SL, and VW contributed to project oversight and manuscript editing. CB and KF composed the manuscript. All authors read and approved the final manuscript.
Funding
This work was supported by grants from the Florida Department of Agriculture and Consumer Services and the UF/IFAS Plant Breeding Graduate Initiative (VW and KF).
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Acknowledgments
The authors acknowledge Drs. Jeremy Pillet and Alan Chambers for RNA isolation and RNAseq line selection, Ben Harrison and Aristotle Koukoulidis for assistance with gDNA isolation, Drs. Iraida Amaya and José Sánchez-Sevilla for raw RNAseq data allowing for expression profiling, and Drs. Pat Edger and Steven Knapp for access to an advanced draft of the octoploid ‘Camarosa’ genome.
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2019.01317/full#supplementary-material
References
Agius, F., González-Lamothe, R., Caballero, J. L., Muñoz-Blanco, J., Botella, M. A., Valpuesta, V. (2003). Engineering increased vitamin C levels in plants by overexpression of a D-galacturonic acid reductase. Nat. Biotechnol. 21, 177. doi: 10.1038/nbt777
Barbey, C., Lee, S., Verma, S., Bird, K. A., Yocca, A. E., Edger, P. P., et al. (2019). Disease resistance genetics and genomics in octoploid strawberry. G3: Genes Genomes Genet. doi: 10.1534/g3.119.400597
Benjamini, Y., Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B (Methodol) 57, 289–300. doi: 10.1111/j.2517-6161.1995.tb02031.x
Box, G. E. P., Cox, D. R. (1964). An analysis of transformations. J. R. Stat. Soc.: Ser. B (Methodol) 26, 211–243. doi: 10.1111/j.2517-6161.1964.tb00553.x
Carvalho, R. F., Carvalho, S. D., O’Grady, K., Folta, K. M. (2016). Agroinfiltration of strawberry fruit — a powerful transient expression system for gene validation. Curr. Plant Biol. 6, 19–37. doi: 10.1016/j.cpb.2016.09.002
Chambers, A. H. (2013). Strawberry flavor: from genomics to practical applications. Dissertation. (Gainesville, FL, USA: University of Florida).
Conesa, A., Götz, S., García-Gómez, J. M., Terol, J., Talón, M., Robles, M. (2005). Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 21, 3674–3676. doi: 10.1093/bioinformatics/bti610
Cruz-Rus, E., Amaya, I., Sanchez-Sevilla, A., Botella, M. A., Valpuesta, V. (2011). Regulation of L-ascorbic acid content in strawberry fruits. J. Exp. Bot. 62 (12), 4191–4201.
Di Matteo, A., et al. (2005). Structural basis for the interaction between pectin methylesterase and a specific inhibitor protein. Plant Cell 17 (3), 849–858.
Dixon, R. A., Lamb, C. J., Masoud, S., Sewalt, V. J. H., Paiva, N. L. (1996). Metabolic engineering: prospects for crop improvement through the genetic manipulation of phenylpropanoid biosynthesis and defense responses — a review. Gene 179, 61–71. doi: 10.1016/S0378-1119(96)00327-7
Edger, P. P., Poorten, T. J., VanBuren, R., Hardigan, M. A., Colle, M., McKain, M. R., et al. (2019). Origin and evolution of the octoploid strawberry genome. Nat. Genet. 51, 541–547. doi: 10.1038/s41588-019-0356-4
Fanciullino, A. L., Dhuique-Mayer, C., Froelicher, Y., Talón, M., Ollitrault, P., Morillon, R. (2008). Changes in carotenoid content and biosynthetic gene expression in juice sacs of four orange varieties (Citrus sinensis) differing in flesh fruit color. J. Agric. Food Chem. 56 (10), 3628–3638.
Fraser, P. D., Enfissi, E. M., Goodfellow, M., Eguchi, T., Bramley, P. M. (2007). Metabolite profiling of plant carotenoids using the matrix-assisted laser desorption ionization time-of-flight mass spectrometry. Plant J. 49 (3), 552–564.
Galpaz, N., Gonda, I., Shem-Tov, D., Barad, O., Tzuri, G., Lev, S., et al. (2018). Deciphering genetic factors that determine melon fruit-quality traits using RNA-Seq-based high-resolution QTL and eQTL mapping. Plant J. 94, 169–191. doi: 10.1111/tpj.13838
Griesser, M., Hoffmann, T., Bellido, M. L., Rosati, C., Fink, B., Kurtzer, R., et al. (2008). Redirection of flavonoid biosynthesis through the down-regulation of an anthocyanidin glucosyltransferase in ripening strawberry fruit. Plant Physiol. 146, 1528. doi: 10.1093/jxb/ern117
Han, Y., Dang, R., Li, J., Jiang, J., Zhang, N., Jia, M., et al. (2015). FaSnRK2.6, an ortholog of Open Stomata 1, is a negative regulator of strawberry fruit development and ripening. Plant Physiol. 114, 915–930. doi: 10.1104/pp.114.251314
Hemavathi, U. C. P., Young, K. E., Akula, N., Kim, H. S., Heung, J. J., Oh, O. M., et al. (2009). Over-expression of strawberry d-galacturonic acid reductase in potato leads to accumulation of vitamin C with enhanced abiotic stress tolerance. Plant Sci. 177, 659–667. doi: 10.1016/j.plantsci.2009.08.004
Hurgobin, B., Golicz, A. A., Bayer, P. E., Chan, C.-K. K., Tirnaz, S., Dolatabadian, A., et al. (2018). Homoeologous exchange is a major cause of gene presence/absence variation in the amphidiploid Brassica napus. Plant Biotechnol. J. 16, 1265–1274. doi: 10.1111/pbi.12867
Jackson, L. A., Shadle, G. L., Zhou, R., Nakashima, J., Chen, F., Dixon, R. A. (2008). Improving saccharification efficiency of alfalfa stems through modification of the terminal stages of monolignol biosynthesis. Bioenergy Res. 1, 180. doi: 10.1007/s12155-008-9020-z
Kadomura-Ishikawa, Y., Miyawaki, K., Takahashi, A., Masuda, T., Noji, S. (2015). Light and abscisic acid independently regulated FaMYB10 in Fragaria × ananassa fruit. Planta 241, 953–965. doi: 10.1007/s00425-014-2228-6
Lim, M.-Y., Cho, Y.-N., Chae, W.-K., Park, Y.-S., Min, B.-W., Harn, C.-H. (2008). Transgenic lettuce (Lactuca sativa L.) with increased vitamin C levels using GalUR gene. J. Plant Biotechnol.
Lim, M. Y., Jeong, B. R., Jung, M., Harn, C. H. (2016). Transgenic tomato plants expressing strawberry d-galacturonic acid reductase gene display enhanced tolerance to abiotic stresses. Plant Biotechnol. Rep. 10, 105–116. doi: 10.1007/s11816-016-0392-9
Medina-Puche, L., Cumplido-Laso, G., Amil-Ruiz, F., Hoffmann, T., Ring, L., Rodríguez-Franco, A., et al. (2014). MYB10 plays a major role in the regulation of flavonoid/phenylpropanoid metabolism during ripening of Fragaria × ananassa fruits. J. Exp. Bot. 65, 401–417. doi: 10.1093/jxb/ert377
Medina-Puche, L., Molina-Hidalgo, F. J., Boersma, M., Schuurink, R. C., López-Vidriero, I., Solano, R., et al. (2015). An R2R3-MYB transcription factor regulates eugenol production in ripe strawberry fruit receptacles. Plant Physiol. 168, 598. doi: 10.1104/pp.114.252908
Metsalu, T., Vilo, J. (2015). ClustVis: a web tool for visualizing clustering of multivariate data using principal component analysis and heatmap. Nucleic Acids Res. 43, W566–W570. doi: 10.1093/nar/gkv468
Michaelson, J. J., Loguercio, S., Beyer, A. (2009). Detection and interpretation of expression quantitative trait loci (eQTL). Methods 48, 265–276. doi: 10.1016/j.ymeth.2009.03.004
Miyawaki, K., Fukuoka, S., Kadomura, Y., Hamaoka, H., Mito, T., Ohuchi, H., et al. (2012). Establishment of a novel system to elucidate the mechanisms underlying light-induced ripening of strawberry fruit with an Agrobacterium-mediated RNAi technique. Plant Biotechnol. 29, 271–277. doi: 10.5511/plantbiotechnology.12.0406a
Muñoz, C., Hoffmann, T., Escobar, N. M., Ludemann, F., Botella, M. A., Valpuesta, V., et al. (2010). The Strawberry Fruit Fra a Allergen Functions in Flavonoid Biosynthesis. Mol. Plant 3, 113–124. doi: 10.1093/mp/ssp087
Peled-Zehavi, H., Oliva, M., Xie, Q., Tzin, V., Oren-Shamir, M., Aharoni, A., et al. (2015). Metabolic engineering of the phenylpropanoid and its primary, precursor pathway to enhance the flavor of fruits and the aroma of flowers. Bioeng (Basel Switzerland) 2, 204–212. doi: 10.3390/bioengineering2040204
Perkins-Veazie, P. (2010). Growth and ripening of strawberry fruit. In Horticultural Reviews, J. Janick (Ed.). doi: 10.1002/9780470650585.ch8
Pillet, J., Yu, H.-W., Chambers, A. H., Whitaker, V. M., Folta, K. M. (2015). Identification of candidate flavonoid pathway genes using transcriptome correlation network analysis in ripe strawberry (Fragaria × ananassa) fruits. J. Exp. Bot. 66, 4455–4467. doi: 10.1093/jxb/erv205
Racine, J. S. (2011). RStudio: A platform-independent IDE for R and sweave. J. Appl. Econom. 27, 167–172. doi: 10.1002/jae.1278
Sánchez-Sevilla, J. F., Cruz-Rus, E., Valpuesta, V., Botella, M. A., Amaya, I. (2014). Deciphering gamma-decalactone biosynthesis in strawberry fruit using a combination of genetic mapping, RNA-Seq and eQTL analyses. BMC Genomics 15, 218. doi: 10.1186/1471-2164-15-218
Sánchez-Sevilla, J. F., Vallarino, J. G., Osorio, S., Bombarely, A., Posé, D., Merchante, C., et al. (2017). Gene expression atlas of fruit ripening and transcriptome assembly from RNA-seq data in octoploid strawberry (Fragaria × ananassa). Sci. Rep. 7, 13737. doi: 10.1038/s41598-017-14239-6
Tang, Y., Liu, X., Wang, J., Li, M., Wang, Q., Tian, F., et al. (2016). GAPIT version 2: an enhanced integrated tool for genomic association and prediction. Plant Genome 9. doi: 10.3835/plantgenome2015.11.0120
Vance, M. W., Tomas, H., Craig, K. C., Anne, P., Elizabeth, B. (2011). Historical trends in strawberry fruit quality revealed by a trial of University of Florida cultivars and advanced selections. HortScience 46, 553–557. doi: 10.21273/HORTSCI.46.4.553
Verma, S., Bassil, N. V., van de Weg, E., Harrison, R. J., Monfort, A., Hidalgo, J. M., et al. (2017). Development and evaluation of the Axiom® IStraw35 384HT array for the allo-octoploid cultivated strawberry Fragaria ×ananassa. (Leuven, Belgium: International Society for Horticultural Science (ISHS)), 75–82.
Whitaker, V. M., Hasing, T., Chandler, C. K., Plotto, A., Baldwin, E. (2011). Historical trends in strawberry fruit quality revealed by a trial of University of Florida cultivars and advanced selections. HortScience 46, 553–557. doi: 10.21273/HORTSCI.46.4.553
Yeh, S.-Y., Huang, F.-C., Hoffmann, T., Mayershofer, M., Schwab, W. (2014). FaPOD27 functions in the metabolism of polyphenols in strawberry fruit (Fragaria sp.). Front. Plant Sci. 5, 518. doi: 10.3389/fpls.2014.00518
Keywords: eQTL analysis, pathway analysis, anthocyanins, pectin, transcriptomics, strawberry (Fragaria ×ananassa Duch.)
Citation: Barbey C, Hogshead M, Schwartz AE, Mourad N, Verma S, Lee S, Whitaker VM and Folta KM (2020) The Genetics of Differential Gene Expression Related to Fruit Traits in Strawberry (Fragaria ×ananassa). Front. Genet. 10:1317. doi: 10.3389/fgene.2019.01317
Received: 06 August 2019; Accepted: 03 December 2019; Published: 07 February 2020.
Edited by:
Ray Ming, University of Illinois at Urbana-Champaign, United States
Reviewed by:
Zhongchi Liu, University of Maryland, College Park, United States
Meiru Jia, China Agricultural University (CAU), China
Copyright © 2020 Barbey, Hogshead, Schwartz, Mourad, Verma, Lee, Whitaker and Folta. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Christopher Barbey, cbarbey@ufl.edu