ORIGINAL RESEARCH article

Front. Plant Sci., 26 January 2023

Sec. Technical Advances in Plant Science

Volume 14 - 2023 | https://doi.org/10.3389/fpls.2023.1042913

Versatile mapping-by-sequencing with Easymap v.2

  • Instituto de Bioingeniería, Universidad Miguel Hernández, Elche, Spain

Abstract

Mapping-by-sequencing combines Next Generation Sequencing (NGS) with classical genetic mapping by linkage analysis to establish gene-to-phenotype relationships. Although numerous tools have been developed to analyze NGS datasets, only a few are available for mapping-by-sequencing. One such tool is Easymap, a versatile, easy-to-use package that performs automated mapping of point mutations and large DNA insertions. Here, we describe Easymap v.2, which also maps small insertion/deletions (InDels), and includes workflows to perform QTL-seq and variant density mapping analyses. Each mapping workflow can accommodate different experimental designs, including outcrossing and backcrossing, F2, M2, and M3 mapping populations, chemically induced mutation and natural variant mapping, input files containing single-end or paired-end reads of genomic or complementary DNA sequences, and alternative control sample files in FASTQ and VCF formats. Easymap v.2 can also be used as a variant analyzer in the absence of a mapping algorithm and includes a multi-threading option.

Introduction

Identifying the causal genetic variant for a phenotype of interest is a common starting point in the genetic dissection of a biological process. Individuals exhibiting a phenotype of interest can be isolated by screening a large set of wild-type accessions or natural races or by the mutagenesis of a wild-type strain to isolate phenotypically distinct mutants among its progeny. The commonly used mutagen ethyl methanesulfonate (EMS) induces point mutations (usually G→A transitions) in random positions across the genome, some of which alter the sequence of genes and/or their transcriptional or post-transcriptional regulation (James and Dooner, 1990; Jansen et al., 1997).

A classic approach to mapping the causal mutation is linkage analysis between the mutation and molecular markers in segregating populations. This procedure has been integrated with Next Generation Sequencing (NGS): the improved technique is known as “mapping-by-sequencing” (Schneeberger and Weigel, 2011; Hartwig et al., 2012; James et al., 2013; Candela et al., 2015). In a typical mapping-by-sequencing experiment, the distribution of allele frequencies of biallelic Single Nucleotide Polymorphisms (SNPs) is studied in a mapping population: a pool of phenotypically recessive mutant individuals selected from a segregating population. The mapping population is used to identify genomic regions where SNP allele frequency is influenced by the phenotypic selection performed (Schneeberger et al., 2009; James et al., 2013; Wachsman et al., 2017). In the model plant Arabidopsis (Arabidopsis thaliana), bulked segregant analysis is usually (but not exclusively) performed using populations composed of F2 individuals generated from the selfing of an F1 progeny derived from a cross between a mutant and a wild-type strain. The mutant can be crossed to a wild-type strain genetically divergent from (and hence polymorphic with respect to) its pre-mutagenesis wild-type parent (outcross or map cross), or to the wild-type parent itself (backcross or isogenic cross).

Another common approach to uncovering gene-to-phenotype relationships is to identify genetic lesions in a population of phenotypically mutant individuals obtained from recurrent backcrosses to a reference strain (Doitsidou et al., 2016). This approach, which was first used to identify EMS-induced mutations, is called EMS variant density mapping (Zuryn et al., 2010; Minevich et al., 2012). This technique relies on the presence or absence of variants along the genome and the detection of genomic regions with a significantly higher density of variants (high-density variant peaks or clusters) compared to the rest of the genome. These regions, which show linkage disequilibrium, are expected to contain the mutation causing the phenotype of interest, along with a set of tightly linked variants selected through recurrent backcrossing. This mapping strategy is convenient when selecting numerous mutants from a segregating population is not feasible due to complex or expensive phenotyping, scarce offspring, or life cycles that hinder the isolation of recombinant individuals. This approach is, however, slower than conventional mapping-by-sequencing strategies, since several backcrosses are needed to obtain the mapping population (Table 1). There are currently no user-friendly, graphical interface-based bioinformatic tools that automate the analysis of datasets obtained from recurrent backcrossing mapping strategies.

Table 1

Approach: Linkage analysis mapping
Advantages:
  • Fast, 1-3 generations from M1 to the mapping population (F2 or M2)
  • Simultaneous identification of the region of interest and candidates
Limitations:
  • A mapping population of at least 100 individuals is required
  • A read depth of at least 25× is required for accurate sampling of allele frequencies
  • Highly sensitive to screening errors during mutant selection

Approach: Variant density mapping
Advantages:
  • Small test samples
  • The read depth can be low (at least 10×)
  • Simultaneous identification of the region of interest and candidates
  • Convenient for complex or expensive screenings
Limitations:
  • Slow, 3-6 backcrosses needed to obtain the mapping population
  • Not appropriate for strains genetically distant from the reference strain
  • Prone to artifacts (e.g., peaks around a centromere)
  • Detection of candidates is limited by read depth

Approach: QTL-seq mapping
Advantages:
  • Analysis of complex phenotypes influenced by more than one gene
  • Simultaneous detection of multiple loci contributing to a phenotype under study
Limitations:
  • A mapping population of at least 50 individuals is required
  • One or several large genomic intervals are usually selected
  • Many candidates are reported
  • Minor QTL can be overlooked

Experimental approaches for mapping-by-sequencing of SNPs and small InDels.

Most phenotypic traits are influenced by multiple genes and their interactions with the environment. Quantitative trait loci (QTL) are genomic regions containing genes that contribute to a specific quantitative phenotype, which in plants include agronomically relevant traits such as plant height, biomass production, and pathogen resistance (Kearsey, 1998; Kearsey and Farquhar, 1998; Alonso-Blanco and Koornneef, 2000). QTL were traditionally mapped by linkage analysis in the segregating progeny of a cross of two strains that genetically differ for a quantitative trait of interest (Juenger et al., 2005; Chen et al., 2021). This approach was combined with NGS to create QTL-seq, a technique involving the sequencing of two pools of individuals with opposite phenotypes selected from a population that segregates for a number of genetic variants (Takagi et al., 2013). QTL-seq can be used to identify linkage disequilibrium in genomic regions that potentially contain QTL for the trait under study. However, only a few tools have been developed for the analysis of QTL-seq datasets, and these tools require the use of additional software, thus creating complex bioinformatic pipelines (Mansfeld and Grumet, 2018; Wu et al., 2019).

Easymap was developed as a user-friendly software package to facilitate conventional mapping-by-sequencing of point mutations and tagged-sequence mapping of large insertions, both using NGS datasets (Lup et al., 2021; Lup et al., 2022). Easymap implements mapping workflows for diverse types of datasets, including DNA whole-genome resequencing and transcriptome sequencing (RNA-seq) data, mapping populations obtained by backcrossing, outcrossing or selfing of a mutant, and control samples consisting of the whole-genome sequences of any parental line of the mapping population or a pool of phenotypically wild-type siblings of the mapping population. Here, we describe Easymap v.2, an updated version of Easymap that features variant density and QTL-seq mapping workflows to detect any spontaneous or mutagen-induced SNPs and small insertion/deletions (InDels), which we refer to collectively here as variants. Easymap v.2 also includes a variant analyzer to explore the effects of a list of variants on genes that contain these variants and on their products. In addition, Easymap v.2 contains a preprocessing module for FASTQ files, supports the use of Variant Call Format (VCF) files as control samples, and allows multithreading. Easymap v.2 is open source and available for download at http://genetics.edu.umh.es/resources/easymap/. We recommend the Quickstart Installation Guide, which any person with no bioinformatics skills can follow to install a fully functional Easymap v.2 program.

Methods

Architecture

Easymap v.2 works in the Unix-based operating systems Ubuntu, Red Hat, Fedora and AMI. It can also be used in Windows 10 within the Ubuntu apps currently available at Microsoft and in virtual machines running a Unix-based operating system within macOS. Easymap v.2 can also be installed and accessed remotely (e.g., in a computational cluster or the Amazon Elastic Compute Cloud service) through its graphical and command line interfaces.

The installation of Easymap v.2 is automated, with a single script that compiles and installs all required software and third-party tools: Python2 (https://www.python.org/about/), Python Imaging Library (https://pillow.readthedocs.io/en/stable/), Virtualenv (https://virtualenv.pypa.io/en/latest/), HTSlib (http://www.htslib.org/), HISAT2 (Kim et al., 2019), Bowtie2 (Langmead and Salzberg, 2012), SAMtools (Li et al., 2009), and BCFtools (Narasimhan et al., 2016).

The installation script also launches the graphical web interface once installation is complete. The Easymap v.2 Quickstart Installation Guide (Supplementary File 1) provides detailed information about how to install Easymap v.2 without any prior bioinformatics knowledge. Advanced installation setups and usage instructions can be found in the Easymap v.2 Documentation (Supplementary File 2).

Testing

Easymap v.2 was tested on regular desktop computers and on high-performance machines; performance depends on the machine being used and the computational resources allocated to the program. For example, a typical linkage analysis from an Arabidopsis (genome size of ~135 Mb) (www.arabidopsis.org; The Arabidopsis Genome Initiative, 2000) mapping population derived from a backcross, in which test and control samples have a read depth of 50×, can take 6-8 hours using a standard computer without multi-threading. However, the same analysis involving larger genomes such as those of maize (Zea mays, ~2.4 Gb) (Haberer et al., 2005) and barley (Hordeum vulgare, ~5.3 Gb) (The International Barley Genome Sequencing Consortium, 2012) can take weeks. Therefore, multi-threading is highly recommended when working with large genomes or with experimental designs involving an outcross, and can easily be set up using the graphical interface. Easymap v.2 also allows multiple projects to be executed simultaneously, but this can reduce the overall performance of a desktop computer. A minimum of 8 GB of RAM and available disk storage at least twice the size of all input reads (or three times the size if pre-processing is enabled) should suffice for most analyses.
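The storage guideline above reduces to simple arithmetic. The following is a minimal sketch of that rule of thumb; the function name and the example file sizes are hypothetical and are not part of Easymap v.2.

```python
def min_disk_space(input_sizes, preprocessing=False):
    """Estimate the free disk space suggested for one analysis.

    Applies the rule of thumb from the text: at least twice the total size
    of the input reads, or three times when pre-processing is enabled.
    The result is in the same unit as the input sizes.
    """
    factor = 3 if preprocessing else 2
    return factor * sum(input_sizes)


# Hypothetical example: test and control read files of 12 and 10 gigabytes.
print(min_disk_space([12, 10]))                      # 44
print(min_disk_space([12, 10], preprocessing=True))  # 66
```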

Results

Variant density mapping workflow

We implemented a workflow in Easymap v.2 that performs variant density mapping in a test sample (Figure 1). The test sample consists of NGS reads obtained from a pool of individuals exhibiting a phenotype of interest that were subjected to several (usually 3 to 6) backcrosses to the reference strain. The use of a control sample is strongly advised. The control sample consists of reads obtained from an individual (or pool of individuals) that shares a considerable number of variants with the test sample. These variants are not related to the phenotype of interest and therefore must be filtered out from the test sample to aid in the identification of high-density variant peaks and candidate variants. In this manner, control reads can be obtained from strains that do not show the phenotype of interest but are genetically related to the test strain, such as the pre-mutagenesis wild-type strain, the parental reference strain, phenotypically wild-type siblings of the mapping population, or other mutant lines isolated from the same mutagenesis screen (Figure 1A).
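The filtering role of the control sample can be pictured as a set difference. This is only an illustrative sketch, assuming that a variant is identified by its chromosome, position, reference allele, and alternative allele; the names Variant and subtract_control are hypothetical, and the actual filtering performed by Easymap v.2 may apply further criteria.

```python
from typing import NamedTuple, Set


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def subtract_control(sample: Set[Variant], control: Set[Variant]) -> Set[Variant]:
    """Keep only the variants of the test sample that are absent from the control."""
    return sample - control


# Hypothetical variants: one of the three is shared with the control.
sample = {
    Variant("I", 1200, "G", "A"),
    Variant("I", 5300, "C", "T"),
    Variant("II", 880, "G", "A"),
}
control = {Variant("I", 1200, "G", "A")}

for variant in sorted(subtract_control(sample, control)):
    print(variant)
```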

Figure 1

Once the input files (comprising the test and control reads) have been loaded by the user (Figure 1B), Easymap v.2 reports the list of test sample-specific variants. This list is used to generate two sublists: one containing homozygous variants, and the other all EMS-type mutations. A third sublist that contains the homozygous EMS-type variants is created by the intersection of the first two sublists (Figure 1C). Easymap v.2 then detects high-density variant peaks along the genome of the test sample in overlapping sliding windows and establishes regions of interest according to the variant density distribution (Figure 1D.1). The variants within the regions of interest are reported as candidate mutations if they are located within a gene (Figures 1D.2, D.3). In the web interface, Easymap v.2 provides diagrams representing each gene of interest, plots of the distribution of variant density along the genome, and a table listing extensive information about each variant (refer to Supplementary Video 1 to see a tutorial on the use of the variant density mapping workflow).
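The two steps described above (building the sublists and scanning overlapping windows) can be sketched as follows. This is a simplified illustration and not the Easymap v.2 source code: the homozygosity threshold, the window size and step, and all function names are assumptions. EMS-type changes are taken to be G to A, or C to T when the mutation is read on the opposite strand.

```python
# EMS-type changes as seen on either strand.
EMS_TYPE = {("G", "A"), ("C", "T")}


def build_sublists(variants, hom_threshold=0.9):
    """Split test-specific variants into the three sublists.

    variants: list of dicts with keys "pos", "ref", "alt" and "af".
    hom_threshold is an assumed allele-frequency cutoff for homozygosity.
    """
    homozygous = [v for v in variants if v["af"] >= hom_threshold]
    ems = [v for v in variants if (v["ref"], v["alt"]) in EMS_TYPE]
    # Intersection of the two sublists above.
    hom_ems = [v for v in homozygous if (v["ref"], v["alt"]) in EMS_TYPE]
    return homozygous, ems, hom_ems


def window_counts(positions, chrom_length, size, step):
    """Count variants in overlapping windows [start, start + size)."""
    counts = []
    for start in range(0, chrom_length, step):
        end = min(start + size, chrom_length)
        n = sum(1 for p in positions if start <= p < end)
        counts.append((start, end, n))
    return counts


# Hypothetical variants on a 2,000 bp toy chromosome.
variants = [
    {"pos": 100, "ref": "G", "alt": "A", "af": 1.0},
    {"pos": 150, "ref": "C", "alt": "T", "af": 0.5},
    {"pos": 900, "ref": "A", "alt": "T", "af": 0.95},
]
homozygous, ems, hom_ems = build_sublists(variants)
print([v["pos"] for v in hom_ems])  # [100]

counts = window_counts([100, 150, 900, 950, 980, 1900], 2000, size=1000, step=500)
peak = max(counts, key=lambda w: w[2])
print(peak)  # (0, 1000, 5)
```

In this sketch the region of interest is simply the window with the most variants; a real analysis would also have to weigh neighboring windows and known artifacts such as peaks around a centromere.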

To test the functionality of the variant density mapping workflow, we reproduced results from eight previously published datasets regarding studies in the nematode Caenorhabditis elegans, and detected the already known causal mutation in all instances (Supplementary Table S1). These datasets were generated from mutants in the reference background and provided fairly clear information, as the number of background variants was limited, resulting in a generally approachable number of candidate causal mutations.

We also validated the variant density mapping workflow in a dataset generated from a maize mutant genetically distant from the reference strain (Klein et al., 2018). In addition, the mapping population was not obtained by recurrent backcrossing, but by a single outcross to the reference strain. Although this mapping strategy resulted in a large number of candidates due to the high density of natural polymorphisms between the two strains, our variant density mapping workflow was able to identify the causal mutation (Supplementary Table S1).

QTL-seq mapping workflow

Another workflow implemented in Easymap v.2 performs QTL-seq mapping analysis from two pools of individuals of a given segregating population with opposite phenotypes (Figure 2A). After loading the input files (Figure 2B), the QTL-seq mapping workflow uses SNPs common to both pools to identify the differences between the allele frequencies of each sample (dAF) in sliding windows across the genome (Figure 2C). This step allows the software to select genomic regions in which the dAF deviates from 0, i.e., there is opposite linkage disequilibrium in both samples. The selected regions are reported as potential QTL that contain candidate variants and genes, and a set of figures and tabular data is generated to allow the user to consider whether these candidates are modifiers of the phenotype under study (Figure 2D; refer to Supplementary Video 2 to see a tutorial on the use of the QTL-seq mapping workflow). As QTL-seq is a common approach for characterizing agronomically relevant traits in cultivars and species that lack a proper structural annotation of the genome, we made it possible to run the QTL-seq mapping workflow without a structural annotation file (usually in general feature format [GFF]). Without a GFF file, this workflow can identify candidate regions that might contain QTL, but gene annotations and identification will not be available in the report.
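The dAF computation can be illustrated with a short sketch. It assumes that the allele frequency of each SNP has already been obtained for both pools and that the dAF of a window is the mean of the per-SNP differences; the averaging rule, the window parameters, and the function name are assumptions made for this example, not a description of the Easymap v.2 internals.

```python
def daf_windows(snps, chrom_length, size, step):
    """Mean difference in allele frequency (dAF) per window.

    snps: list of (position, af_pool1, af_pool2) for SNPs common to both pools.
    Returns (start, end, mean_dAF) tuples; mean_dAF is None for empty windows.
    """
    result = []
    for start in range(0, chrom_length, step):
        end = min(start + size, chrom_length)
        diffs = [af1 - af2 for pos, af1, af2 in snps if start <= pos < end]
        mean = sum(diffs) / len(diffs) if diffs else None
        result.append((start, end, mean))
    return result


# Hypothetical SNPs on a 2,000 bp toy chromosome.
snps = [
    (100, 0.5, 0.5),
    (600, 0.75, 0.25),
    (1200, 1.0, 0.0),
    (1700, 0.5, 0.25),
]
for window in daf_windows(snps, 2000, size=1000, step=1000):
    print(window)
# (0, 1000, 0.25)
# (1000, 2000, 0.625)
```

Windows whose mean dAF lies far from 0 in either direction are the ones that would be inspected as potential QTL.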

Figure 2

To test the QTL-seq mapping workflow, we reproduced results from 13 different QTL-seq analyses in tomato (Solanum lycopersicum), barley, and different rice (Oryza sativa) cultivars using F2 (Takagi et al., 2013; Illa-Berenguer et al., 2015; Yang et al., 2017; Wang et al., 2018), M3 (Fekih et al., 2013), double haploid (Hisano et al., 2017), and Recombinant Inbred Line (RIL) (Fekih et al., 2013) mapping populations. These datasets included whole-genome and exome sequencing datasets, some with suboptimal average read depths (below 8×; Supplementary Table S2). Data analysis and criteria for QTL selection varied markedly among these studies. To provide robust results, Easymap v.2 performs a stringent selection of mapping variants for the detection of major QTL. However, we recommend that the user inspect the dAF plots, as well as the supporting files produced by Easymap v.2, to detect additional regions of interest that might have been overlooked, such as minor QTL. In our validation experiments, major QTL were selected correctly, but a few minor QTL were missed by Easymap v.2. These minor QTL became evident after visual inspection of the final report produced by our software. Identification of the variants that affect the phenotype under study is restricted by the availability of a structural annotation file, as well as the read depth of the dataset. Nonetheless, Easymap v.2 was successful in detecting all previously reported variants in the tested datasets (Hisano et al., 2017; Wang et al., 2018).

Additional implementations

Variant analyzer workflow

Easymap v.2 includes a variant analyzer workflow, which reports the effect of a given set of variants (SNPs and small InDels) on genes and gene products without applying any mapping algorithm. This workflow supports read (FASTQ) and variant (VCF) files as input for the test sample and an optional control sample. The variant analyzer can be used to assess the effects of a short list of mutations as well as those identified in reads from whole-genome sequencing datasets. As in the previous workflows, the report includes tabular data and diagrams describing all variants contained within the input file. The following information is provided for each variant: its position in the genome, quality value (estimated by the variant-calling pipeline), read counts, allele frequency, nucleotide and amino acid changes (if present), gene and gene elements affected by the variant, functional annotation of the gene (if the corresponding functional annotation file is available), a pair of primer sequences that can be used to genotype the variant, and sequences flanking the variant in the reference genome.
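The per-variant information listed above maps naturally onto a record type. The sketch below is only one possible representation: the class and field names are hypothetical, and the allele frequency is assumed to be the fraction of reads supporting the alternative allele.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class VariantReport:
    chrom: str
    position: int
    quality: float            # as estimated by the variant-calling pipeline
    ref_reads: int            # reads supporting the reference allele
    alt_reads: int            # reads supporting the alternative allele
    nucleotide_change: str
    amino_acid_change: Optional[str] = None
    gene: Optional[str] = None
    gene_element: Optional[str] = None
    functional_annotation: Optional[str] = None
    primers: Optional[Tuple[str, str]] = None
    flanking_sequences: Optional[Tuple[str, str]] = None

    @property
    def allele_frequency(self) -> float:
        """Fraction of reads supporting the alternative allele (assumed definition)."""
        total = self.ref_reads + self.alt_reads
        return self.alt_reads / total if total else 0.0


# Hypothetical variant supported by 38 of 40 reads.
report = VariantReport("1", 12345, 222.0, ref_reads=2, alt_reads=38,
                       nucleotide_change="G>A")
print(report.allele_frequency)
```

Optional fields default to None, which mirrors the cases in which an annotation file is not available.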

Reporting of SNPs and small InDels

While the first version of Easymap only reported EMS-type mutations, the user can now specify whether other types of SNPs or small InDels should be reported as candidates. This function allows these variants to be identified with the pre-existing linkage analysis workflow and with the newly implemented variant density mapping, QTL-seq mapping, and variant analyzer workflows.

Flexibility of control samples

Easymap v.2 supports VCF files as control samples for all mapping analyses that do not require the computation of allele frequencies of the control sample variants. The use of VCF files instead of FASTQ files enables the use of a customized control sample consisting of a compilation of variants pooled from samples of different genotypes, as long as they are not linked to the phenotype of interest. This type of control sample is useful when working with strains that carry a high number of polymorphisms relative to the reference strain. The use of VCF files as control files also saves time for mapping analyses, since Easymap v.2 skips the time-consuming alignment and variant-calling steps for the control sample. Some mapping workflows implemented in Easymap v.2 can also be executed without a control sample. While this approach is strongly discouraged for most mapping scenarios, it can be useful for previewing data in the absence of a control sample.
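A customized control of this kind can be assembled by pooling the variants of several VCF files. The following sketch reads only the CHROM, POS, REF, and ALT columns defined by the VCF format and ignores genotype information; the function names and the example records are hypothetical, and writing the pooled set back to a VCF file is left out.

```python
def read_vcf_variants(lines):
    """Yield (chrom, pos, ref, alt) tuples from the lines of a VCF file."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, meta-information and the header
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        for alt in alts.split(","):  # one tuple per alternative allele
            yield (chrom, pos, ref, alt)


def pooled_control(vcf_sources):
    """Union of the variants found in several VCF sources."""
    pooled = set()
    for lines in vcf_sources:
        pooled.update(read_vcf_variants(lines))
    return pooled


# Hypothetical VCF contents, given as lists of lines for brevity.
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"
vcf_a = ["##fileformat=VCFv4.2", header,
         "1\t1200\t.\tG\tA\t50\tPASS\t.",
         "1\t5300\t.\tC\tT,G\t60\tPASS\t."]
vcf_b = [header,
         "1\t1200\t.\tG\tA\t45\tPASS\t.",
         "2\t880\t.\tAT\tA\t70\tPASS\t."]

control = pooled_control([vcf_a, vcf_b])
print(len(control))  # 4
```

With files on disk, each source could be an open file handle, since the reader only iterates over lines.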

Multi-threading

Easymap v.2 allows the user to set the number of dedicated central processing unit (CPU) threads for each analysis. This option is particularly useful when working with large genomes or large read files, as the analysis rate is proportional to the number of threads used during the steps that are compatible with multi-threading.
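Because only some steps benefit from additional threads, the expected gain can be estimated by separating the time spent in multi-threaded steps from the rest. The sketch below assumes ideal scaling of the compatible steps, which real analyses only approximate; the function name and the example durations are hypothetical.

```python
def estimated_runtime(serial_hours, parallel_hours, threads):
    """Estimate total run time when only part of the analysis is multi-threaded.

    serial_hours: time spent in steps that use a single thread.
    parallel_hours: single-thread time of the steps compatible with multi-threading.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return serial_hours + parallel_hours / threads


# Hypothetical analysis: 1 h of single-threaded steps, 6 h of compatible steps.
for n in (1, 4, 8):
    print(n, estimated_runtime(1, 6, n))
# 1 7.0
# 4 2.5
# 8 1.75
```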

Preprocessing of reads

Preprocessing of NGS reads is a common step prior to any data analysis using FASTQ files. We incorporated the FASTQ preprocessing tool fastp (Chen et al., 2018) into Easymap v.2 as an optional step for every workflow, since it is fast, and easy to automate and implement within a bioinformatics pipeline. In Easymap v.2, fastp functions in its default configuration to perform automated quality filtering, adapter trimming, and read pruning, and can be enabled or disabled using a switch in the web interface prior to analysis.
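For readers who wish to reproduce the preprocessing step outside Easymap v.2, a default fastp run only requires input and output file names. The sketch below builds such a command for single-end or paired-end reads using the fastp options -i, -o, -I, and -O; the helper name and the file names are hypothetical, and how Easymap v.2 invokes fastp internally is not shown here.

```python
def fastp_command(in1, out1, in2=None, out2=None):
    """Build a fastp command line that relies on the default settings."""
    cmd = ["fastp", "-i", in1, "-o", out1]
    if in2 is not None:
        if out2 is None:
            raise ValueError("paired-end input needs a second output file")
        cmd += ["-I", in2, "-O", out2]
    return cmd


# Hypothetical paired-end files.
cmd = fastp_command("reads_1.fq.gz", "clean_1.fq.gz",
                    "reads_2.fq.gz", "clean_2.fq.gz")
print(" ".join(cmd))
# fastp -i reads_1.fq.gz -o clean_1.fq.gz -I reads_2.fq.gz -O clean_2.fq.gz
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where fastp is installed.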

Discussion

The increased availability and sharp decline in the cost of NGS technologies during the last decade have opened the door for researchers to use NGS on a semi-routine basis (Sarin et al., 2008; James et al., 2013; Candela et al., 2015). However, manipulating NGS reads is a complex and time-consuming endeavor. Many tools and platforms have been developed for this purpose, but most are meant for bioinformaticians, as they require the user to combine multiple unrelated tools in order to perform a complete analysis. Specifically, for mutation mapping, few tools implement workflows that use raw reads to generate a list of candidate mutations in a user-friendly manner, and most of these tools lack versatility. The first version of Easymap was designed to ease mutation mapping by linkage analysis and to map large DNA insertions, making it quite useful for identifying transgenes and characterizing insertional lines of any type (Lup et al., 2021). In Easymap v.2, we implemented additional workflows for other common mapping strategies.

Mapping approaches based on studying variant density in a pool of mutants that have been recurrently backcrossed to the reference strain are often used for Caenorhabditis elegans due to its short lifespan and the difficulty in isolating and phenotyping large subsets of individuals of the same generation (Zuryn et al., 2010; Svensk et al., 2016). These approaches can also be used with large plants such as maize due to the spatial difficulty of simultaneously working with many individual plants. We demonstrated the success of our variant density mapping workflow for datasets obtained using such approaches. In some cases, the presence of large numbers of non-causal variants hinders the identification of the causal mutation. This limitation can be addressed by using control samples that combine variants from multiple sub-samples into a VCF file, an option that is supported by Easymap v.2.

For QTL-seq mapping, automated workflows such as the one implemented in Easymap v.2 can rapidly point to genomic regions exhibiting linkage disequilibrium. Since QTL-seq relies on the use of two genetic backgrounds that are highly different from each other, no control sample can be used to filter the data, as the variants of interest can be present in either of the two sequenced pools of individuals from the mapping population. Therefore, a vast number of candidate variants is commonly identified. Furthermore, the unknown molecular nature of the causal variants impedes any filtering step based on this property. In this sense, the identification of the causal gene is often disregarded in QTL-seq approaches due to the complexity of discerning between all the variants detected. Instead, narrow chromosomal regions are often defined, which can be used for the genetic improvement of crops (Fekih et al., 2013; Illa-Berenguer et al., 2015; Yang et al., 2017). Further fine-mapping experiments such as linkage analysis to molecular markers and deep-sequencing are often required to narrow down the regions of interest or to identify the causal mutations, especially when working with large genomes and very low read depths (Wang et al., 2020; Yang et al., 2021).

Our software successfully identified the genomic regions harboring potential QTL in the tested datasets and reported all the variants, indicating those that could be of interest. Easymap v.2 provides lists of polymorphisms with detailed information to help users define narrower or alternative QTL-seq mapping intervals or to apply more stringent filters to detect candidate variants. Since the phenotype of interest could be caused by genetic variants that remain undetectable by re-aligning short reads to a reference genome, such as large InDels, microsatellites, or chromosomal rearrangements (Doitsidou et al., 2016), one list provided by Easymap v.2 contains the genes present in potential QTL to help the user identify additional candidates.

In conclusion, Easymap v.2 is a robust, versatile tool that can be used by researchers without previous experience in applying NGS strategies to gene mapping. Installing Easymap v.2 in any operating system is simple, as detailed in the one-page Quickstart Installation Guide (Supplementary File 1). Although the web interface is largely self-explanatory, comprehensive instructions and usage details can be found in the Easymap v.2 Documentation (Supplementary File 2). An interactive preview of the user interface with the mapping reports generated during the validation of all the workflows performed here is available at http://atlas.umh.es/easymapv2.

Statements

Data availability statement

Easymap v.2 is freely available at http://genetics.edu.umh.es/resources/easymap/. The sources of the datasets used in this work are detailed in Supplementary Tables 1, 2. Some data used to validate our software was provided by the authors of the original work. Requests to access these datasets should be directed to Sophie Jarriault, sophie@igbmc.fr; Marc Pilon, marc.pilon@cmb.gu.se; Esther van der Knaap, vanderknaap.1@osu.edu.

Author contributions

JLM obtained funding, provided resources, and supervised the work. SDL and JLM conceived and designed the new implementations found in Easymap v.2. SDL developed the new implementations and tested the software with real datasets. SDL, CN-Q and JLM wrote the article. All authors contributed to the article and approved the submitted version.

Funding

This work was supported by grants from the Ministerio de Ciencia e Innovación of Spain (PGC2018-093445-B-I00 and PID2021-127725NB-I00 [MCI/AEI/FEDER, UE]) and the Generalitat Valenciana (PROMETEO/2019/117) to JLM. SDL held a predoctoral fellowship (ACIF/2018/005) from the Generalitat Valenciana.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2023.1042913/full#supplementary-material

Supplementary Video 1

A brief tutorial on the use of the Easymap v.2 variant density mapping tool.

Supplementary Video 2

A brief tutorial on the use of the Easymap v.2 QTL-seq mapping tool.

Abbreviations

NGS, next generation sequencing; EMS, ethyl methanesulfonate; SNP, single-nucleotide polymorphism; QTL, quantitative trait loci; VCF, Variant Call Format.

References

  • 1

    Alonso-Blanco, C., Koornneef, M. (2000). Naturally occurring variation in Arabidopsis: an underexploited resource for plant genetics. Trends Plant Sci. 5 (1), 22-29. doi: 10.1016/s1360-1385(99)01510-1

  • 2

    Candela, H., Casanova-Sáez, R., Micol, J. L. (2015). Getting started in mapping-by-sequencing. J. Integr. Plant Biol. 57 (7), 606-612. doi: 10.1111/jipb.12305

  • 3

    Chen, H., Pan, X., Wang, F., Liu, C., Wang, X., Li, Y., et al. (2021). Novel QTL and meta-QTL mapping for major quality traits in soybean. Front. Plant Sci. 12. doi: 10.3389/fpls.2021.774270

  • 4

    Chen, S., Zhou, Y., Chen, Y., Gu, J. (2018). Fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34 (17), i884-i890. doi: 10.1093/bioinformatics/bty560

  • 5

    Doitsidou, M., Jarriault, S., Poole, R. J. (2016). Next-generation sequencing-based approaches for mutation mapping and identification in Caenorhabditis elegans. Genetics 204 (2), 451-474. doi: 10.1534/genetics.115.186197

  • 6

    Fekih, R., Takagi, H., Tamiru, M., Abe, A., Natsume, S., Yaegashi, H., et al. (2013). MutMap+: genetic mapping and mutant identification without crossing in rice. PLoS One 8 (7), e68529. doi: 10.1371/journal.pone.0068529

  • 7

    Haberer, G., Young, S., Bharti, A. K., Gundlach, H., Raymond, C., Fuks, G., et al. (2005). Structure and architecture of the maize genome. Plant Physiol. 139 (4), 1612-1624. doi: 10.1104/pp.105.068718

  • 8

    Hartwig, B., James, G. V., Konrad, K., Schneeberger, K., Turck, F. (2012). Fast isogenic mapping-by-sequencing of ethyl methanesulfonate-induced mutant bulks. Plant Physiol. 160 (2), 591-600. doi: 10.1104/pp.112.200311

  • 9

    Hisano, H., Sakamoto, K., Takagi, H., Terauchi, R., Sato, K. (2017). Exome QTL-seq maps monogenic locus and QTLs in barley. BMC Genomics 18 (1), 125. doi: 10.1186/s12864-017-3511-2

  • 10

    Illa-Berenguer, E., Van Houten, J., Huang, Z., van der Knaap, E. (2015). Rapid and reliable identification of tomato fruit weight and locule number loci by QTL-seq. Theor. Appl. Genet. 128 (7), 1329-1342. doi: 10.1007/s00122-015-2509-x

  • 11

    James, D. W., Jr., Dooner, H. K. (1990). Isolation of EMS-induced mutants in Arabidopsis altered in seed fatty acid composition. Theor. Appl. Genet. 80 (2), 241-245. doi: 10.1007/BF00224393

  • 12

    James, G. V., Patel, V., Nordström, K. J., Klasen, J. R., Salomé, P. A., Weigel, D., et al. (2013). User guide for mapping-by-sequencing in Arabidopsis. Genome Biol. 14 (6), R61. doi: 10.1186/gb-2013-14-6-r61

  • 13

    Jansen, G., Hazendonk, E., Thijssen, K. L., Plasterk, R. H. (1997). Reverse genetics by chemical mutagenesis in Caenorhabditis elegans. Nat. Genet. 17 (1), 119-121. doi: 10.1038/ng0997-119

  • 14

    Juenger, T., Pérez-Pérez, J. M., Bernal, S., Micol, J. L. (2005). Quantitative trait loci mapping of floral and leaf morphology traits in Arabidopsis thaliana: evidence for modular genetic architecture. Evol. Dev. 7 (3), 259-271. doi: 10.1111/j.1525-142X.2005.05028.x

  • 15

    Kearsey, M. J. (1998). The principles of QTL analysis (a minimal mathematics approach). J. Exp. Bot. 49, 1619-1623. doi: 10.1093/jxb/49.327.1619

  • 16

    Kearsey, M. J., Farquhar, A. G. (1998). QTL analysis in plants; where are we now? Heredity 80 (Pt 2), 137-142. doi: 10.1046/j.1365-2540.1998.00500.x

  • 17

    Kim, D., Paggi, J. M., Park, C., Bennett, C., Salzberg, S. L. (2019). Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37 (8), 907-915. doi: 10.1038/s41587-019-0201-4

  • 18

    Klein, H., Xiao, Y., Conklin, P. A., Govindarajulu, R., Kelly, J. A., Scanlon, M. J., et al. (2018). Bulked-segregant analysis coupled to whole genome sequencing (BSA-seq) for rapid gene cloning in maize. G3: Genes Genom. Genet. 8 (11), 3583-3592. doi: 10.1534/g3.118.200499

  • 19

    Langmead, B., Salzberg, S. L. (2012). Fast gapped-read alignment with Bowtie 2. Nat. Methods 9 (4), 357-359. doi: 10.1038/nmeth.1923

  • 20

    Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., et al. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics 25 (16), 2078-2079. doi: 10.1093/bioinformatics/btp352

  • 21

    Lup, S. D., Wilson-Sánchez, D., Andreu-Sánchez, S., Micol, J. L. (2021). Easymap: A user-friendly software package for rapid mapping-by-sequencing of point mutations and large insertions. Front. Plant Sci. 12. doi: 10.3389/fpls.2021.655286

  • 22

    Lup, S. D., Wilson-Sánchez, D., Micol, J. L. (2022). Mapping-by-sequencing of point and insertional mutations with Easymap. Methods Mol. Biol. 2484, 343–361. doi: 10.1007/978-1-0716-2253-7_23

  • 23

    Mansfeld, B. N., Grumet, R. (2018). QTLseqr: An R package for bulk segregant analysis with next-generation sequencing. Plant Genome 11 (2), 1–15. doi: 10.3835/plantgenome2018.01.0006

  • 24

    Minevich, G., Park, D. S., Blankenberg, D., Poole, R. J., Hobert, O. (2012). CloudMap: A cloud-based pipeline for analysis of mutant genome sequences. Genetics 192 (4), 1249–1269. doi: 10.1534/genetics.112.144204

  • 25

    Narasimhan, V., Danecek, P., Scally, A., Xue, Y., Tyler-Smith, C., Durbin, R. (2016). BCFtools/RoH: A hidden Markov model approach for detecting autozygosity from next-generation sequencing data. Bioinformatics 32 (11), 1749–1751. doi: 10.1093/bioinformatics/btw044

  • 26

    Sarin, S., Prabhu, S., O'Meara, M. M., Pe'er, I., Hobert, O. (2008). Caenorhabditis elegans mutant allele identification by whole-genome sequencing. Nat. Methods 5 (10), 865–867. doi: 10.1038/nmeth.1249

  • 27

    Schneeberger, K., Ossowski, S., Lanz, C., Juul, T., Petersen, A. H., Nielsen, K. L., et al. (2009). SHOREmap: Simultaneous mapping and mutation identification by deep sequencing. Nat. Methods 6 (8), 550–551. doi: 10.1038/nmeth0809-550

  • 28

    Schneeberger, K., Weigel, D. (2011). Fast-forward genetics enabled by new sequencing technologies. Trends Plant Sci. 16 (5), 282–288. doi: 10.1016/j.tplants.2011.02.006

  • 29

    Svensk, E., Biermann, J., Hammarsten, S., Magnusson, F., Pilon, M. (2016). Leveraging the withered tail tip phenotype in C. elegans to identify proteins that influence membrane properties. Worm 5 (3), e1206171. doi: 10.1080/21624054.2016.1206171

  • 30

    Takagi, H., Abe, A., Yoshida, K., Kosugi, S., Natsume, S., Mitsuoka, C., et al. (2013). QTL-seq: rapid mapping of quantitative trait loci in rice by whole genome resequencing of DNA from two bulked populations. Plant J. 74 (1), 174–183. doi: 10.1111/tpj.12105

  • 31

    The Arabidopsis Genome Initiative (2000). Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408 (6814), 796–815. doi: 10.1038/35048692

  • 32

    The International Barley Genome Sequencing Consortium (2012). A physical, genetic and functional sequence assembly of the barley genome. Nature 491 (7426), 711–716. doi: 10.1038/nature11543

  • 33

    Wachsman, G., Modliszewski, J. L., Valdes, M., Benfey, P. N. (2017). A SIMPLE pipeline for mapping point mutations. Plant Physiol. 174 (3), 1307–1313. doi: 10.1104/pp.17.00415

  • 34

    Wang, H., Zhang, Y., Sun, L., Xu, P., Tu, R., Meng, S., et al. (2018). WB1, a regulator of endosperm development in rice, is identified by a modified MutMap method. Int. J. Mol. Sci. 19 (8), 2159. doi: 10.3390/ijms19082159

  • 35

    Wang, G., Zhao, Y., Mao, W., Ma, X., Su, C. (2020). QTL analysis and fine mapping of a major QTL conferring kernel size in maize (Zea mays). Front. Genet. 11. doi: 10.3389/fgene.2020.603920

  • 36

    Wu, S., Qiu, J., Gao, Q. (2019). QTL-BSA: a bulked segregant analysis and visualization pipeline for QTL-seq. Interdiscip. Sci.: Comput. Life Sci. 11 (4), 730–737. doi: 10.1007/s12539-019-00344-9

  • 37

    Yang, L., Wang, J., Han, Z., Lei, L., Liu, H. L., Zheng, H., et al. (2021). Combining QTL-seq and linkage mapping to fine map a candidate gene in qCTS6 for cold tolerance at the seedling stage in rice. BMC Plant Biol. 21 (1), 278. doi: 10.1186/s12870-021-03076-5

  • 38

    Yang, X., Xia, X., Zhang, Z., Nong, B., Zeng, Y., Xiong, F., et al. (2017). QTL mapping by whole genome re-sequencing and analysis of candidate genes for nitrogen use efficiency in rice. Front. Plant Sci. 8. doi: 10.3389/fpls.2017.01634

  • 39

    Zuryn, S., Le Gras, S., Jamet, K., Jarriault, S. (2010). A strategy for direct mapping and identification of mutations by whole-genome sequencing. Genetics 186 (1), 427–430. doi: 10.1534/genetics.110.119230

  • 40

    Zuryn, S., Ahier, A., Portoso, M., White, E. R., Morin, M. C., Margueron, R. (2014). Sequential histone-modifying activities determine the robustness of transdifferentiation. Science 345 (6198), 826–829. doi: 10.1126/science.1255885

Summary

Keywords

forward genetics, next generation sequencing, mapping-by-sequencing, variant density mapping, QTL-seq

Citation

Lup SD, Navarro-Quiles C and Micol JL (2023) Versatile mapping-by-sequencing with Easymap v.2. Front. Plant Sci. 14:1042913. doi: 10.3389/fpls.2023.1042913

Received

13 September 2022

Accepted

04 January 2023

Published

26 January 2023

Volume

14 - 2023

Edited by

Marcos Egea-Cortines, Universidad Politécnica de Cartagena, Spain

Reviewed by

Feng Cheng, Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, China; Eri Ogiso-Tanaka, National Museum of Nature and Science, Japan

Copyright

*Correspondence: José Luis Micol,

This article was submitted to Technical Advances in Plant Science, a section of the journal Frontiers in Plant Science

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
