ORIGINAL RESEARCH article

Front. Plant Sci., 14 January 2015

Sec. Plant Genetics and Genomics

Volume 5 - 2014 | https://doi.org/10.3389/fpls.2014.00696

The complete chloroplast genome provides insight into the evolution and polymorphism of Panax ginseng

  • 1. CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences Beijing, China

  • 2. University of Chinese Academy of Sciences Beijing, China

  • 3. School of Pharmaceutical Sciences, Changchun University of Chinese Medicine Changchun, China

Abstract

Panax ginseng C.A. Meyer (P. ginseng) is an important medicinal plant and is often used in traditional Chinese medicine. With next generation sequencing (NGS) technology, we determined the complete chloroplast genome sequences for four Chinese P. ginseng strains, which are Damaya (DMY), Ermaya (EMY), Gaolishen (GLS), and Yeshanshen (YSS). The total chloroplast genome sequence length for DMY, EMY, and GLS was 156,354 bp, while that for YSS was 156,355 bp. Comparative genomic analysis of the chloroplast genome sequences indicate that gene content, GC content, and gene order in DMY are quite similar to its relative species, and nucleotide sequence diversity of inverted repeat region (IR) is lower than that of its counterparts, large single copy region (LSC) and small single copy region (SSC). A comparison among these four P. ginseng strains revealed that the chloroplast genome sequences of DMY, EMY, and GLS were identical and YSS had a 1-bp insertion at base 5472. To further study the heterogeneity in chloroplast genome during domestication, high-resolution reads were mapped to the genome sequences to investigate the differences at the minor allele level; 208 minor allele sites with minor allele frequencies (MAF) of ≥0.05 were identified. The polymorphism site numbers per kb of chloroplast genome sequence for DMY, EMY, GLS, and YSS were 0.74, 0.59, 0.97, and 1.23, respectively. All the minor allele sites located in LSC and IR regions, and the four strains showed the same variation types (substitution base or indel) at all identified polymorphism sites. Comparison results of heterogeneity in the chloroplast genome sequences showed that the minor allele sites on the chloroplast genome were undergoing purifying selection to adapt to changing environment during domestication process. A study of P. ginseng chloroplast genome with particular focus on minor allele sites would aid in investigating the dynamics on the chloroplast genomes and different P. ginseng strains typing.

Introduction

The chloroplast, an important plastid, plays an essential role in plant cell functions, including photosynthesis and carbon fixation (Neuhaus and Emes, 2000). In angiosperms, the chloroplast has a conserved quadripartite structure composed of two copies of inverted repeat (IR), one large single copy (LSC), and one small single copy (SSC) (Palmer, 1985). Comparative analysis indicates that most chloroplast genomes are highly conserved at the gene level (Jansen et al., 2005). Compared with the nuclear genome, the chloroplast genome has many other advantages, such as haploid and maternal inheritance (Birky, 2001). Since decoding the first chloroplast genome sequence (Nicotiana tabacum, Shinozaki et al., 1986) in 1986, over 200 complete chloroplast genomes have been sequenced and deposited in NCBI Organelle Genome Resources database. With the development of DNA sequencing technology, the chloroplast genomes have been widely used for plant identification and resolving phylogenetic relationships (Jansen et al., 2007; Moore et al., 2007, 2010; Parks et al., 2009).

Panax ginseng, a member of Araliaceae family, is native to China, Korea, and Russia. For thousands of years, traditional Chinese medicine has relied on P. ginseng to restore and enhance human health, and P. ginseng has been used as a tonic, stimulant, and agent to foster fatigue and stress-resistance for more than 2000 years. More recently, the pharmaceutical effects of P. ginseng have been shown by a host of studies, and P. ginseng has become one of the most well-known medicinal plants worldwide (Pazyar et al., 2012; Zheng et al., 2012; Gao et al., 2013).

Traditional methods to identify wild ginseng from cultivars are based on phenotypic observations, while morphological characteristics are often affected by environmental and developmental factors. Particularly during the early developmental stages, the seeds and seedlings of different ginseng cultivars are extremely morphologically similar; rendering their differentiation is quite difficult and sometimes impossible. Furthermore, morphological characteristics cannot be used for screening large numbers of ginseng samples. Therefore, a simple method of DNA analysis, rather than the traditional authentication methods, is clearly desirable. Lee et al. studied the phylogeny of Panax with chloroplast trnCtrnD intergenic region sequences (Lee and Wen, 2004). In 2004, the first P. ginseng chloroplast genome sequence, P. schinseng Nees (156,318 bp, Genbank accession number: NC_006290), was reported by Kim et al. (Kim and Lee, 2004). The general features of P. schinseng Nees chloroplast genome and genome structure dynamics compared to other chloroplast genome sequences have been well described. However, the data were insufficient for studying the genome dynamics and evolution of P. ginseng chloroplast. Herein, we present four complete Chinese P. ginseng chloroplast genome sequences based on next generation sequencing (NGS) technology. Comparative analyses of the chloroplast genomes between P. ginseng strains and other plants were conducted. At the same time, high resolution of genome sequences has made it easy to investigate genetic variations in the chloroplast genome sequence that have occurred over the course of domestication.

Materials and methods

Ethics statement

All plant samples (four kinds of Chinese ginseng cultivars) in current study were collected from E-Mu ginseng experimental base (N 43°47′45.72″, E128°6′7.88″, 46 m above sea level), Changchun University of Chinese Medicine with permission. Ginseng in current research does not involve endangered or protected species.

Plant DNA extraction and sequencing

Four P. ginseng samples, including three domestic ginseng strains (DMY, EMY, and GLS) and one wild ginseng strain (YSS), were all collected from E-Mu ginseng experimental base. Morphological differences among these four strains mainly focus on rhizome, lateral root and so on, and the detailed differences were introduced in Supplementary Table S1. After cleaning, the fresh roots were frozen in liquid nitrogen and stored at −80°C until further processing. Changchun University of Chinese Medicine confirmed the identification of the four ginseng samples through morphology. The total mixed genomic DNA was extracted from fresh roots by using a Plant Genomic DNA Kit (Tiangen Biotech Co., China) following the manufacturer's instructions.

For each sample, pair-end reads with different insertion lengths were sequenced by Illumina HiSeq2000 system. Additionally, DMY was also sequenced with Roche/454 GS-FLX (Titanium) pyrosequencing machine for single-end fragment reads. The detailed sequencing information for four strains was listed in Supplementary Table S2.

Chloroplast genome assembly and minor allele analysis

All raw HiSeq reads of four ginseng strains were filtered by quality control protocol with the following five rules: (i) there is no “N” in the entire read, (ii) the percentage of bases with quality <10 is 0, (iii) the percentage of bases with quality <15 is ≤3%, (iv) the percentage of bases with quality <20 is ≤5%, and (v) the average quality of all bases is ≥30. Only high quality reads satisfying all five rules were used in genome assembly and SNP calling (high quality reads information were listed Supplementary Table S2).

For DMY, all reads from 454 GS FLX sequences were directly assembled using Newbler (overlapMinMatchLength = 40 and overlapMinMatchIdentity = 90). According to Zhang's assembly strategy (Zhang et al., 2011), we used BLAT (-minIdentity = 90) to search large contigs (length ≥ 1 kb) with the conserved chloroplast genes (both identity and coverage on gene nucleotide sequence are no less than 90%). Those large contigs, which contain two or more conserved chloroplast genes, were selected as seed contigs in bb.454contignet (Iorizzo et al., 2012) script (set –lowlimit to 10). The chloroplast contig graph was also drawn with bb.454contignet, and low coverage contig branches (possibly including the DNA transferred from chloroplast to mitochondrial or nuclear genomes) were excluded manually (Zhang et al., 2011). Subsequently, SSC, LSC, and IR were connected into a circle. High-quality pair-end reads from three different insertion length libraries were employed to verify the contig connections and whether each base on the circle genome was supported by the primary allele base by using BWA (Li and Durbin, 2010) with default parameters and SAMTools (-Q = 20, -m = 5 on mpileup) (Li et al., 2009).

For EMY, GLS, and YSS, high-quality pair-end reads with different insertion lengths were BLAT-searched (set –minIdentity to 70 in BLAT) against all 207 chloroplast genomes (206 were downloaded from NCBI FTP, the other one is DMY chloroplast genome, and the accession numbers were listed in Supplementary Table S3). The matched reads (coverage and identity on each reads are no less than 70%) were considered as chloroplast-related reads, and 150,000 pairs of high-quality and chloroplast-related pair-end reads were selected randomly for assembly from different insertion length libraries respectively. Moreover, Velvet (Zerbino and Birney, 2008) was employed to assemble the reads into scaffolds (hash_length values are set to 63, 63, and 69 in velveth for EMY, GLS, and YSS respectively). For each of the other three ginseng strains, the order of scaffolds was determined by mapping scaffolds to DMY chloroplast genome with BLAT program (set –minIdentity to 90 in BLAT), and gaps were filled by fishing reads from high-quality pair-end reads with Gapcloser and then manually checked. The verification method for the chloroplast genome sequences of EMY, GLS, and YSS were the same as that for DMY.

In minor allele identification section, all high quality pair-end reads with an inserting length of 500 bp from all four strains were mapped onto DMY chloroplast genome using BWA with default value. The program rmdup in SAMTools was employed to remove duplicated reads and SNP calling. Finally, only alleles with MAF ≥0.05 were retained as minor allele candidates (De et al., 2008; Skoglund and Jakobsson, 2011; Iorizzo et al., 2013).

Chloroplast gene annotation

Protein-coding genes in ginseng chloroplast genome were annotated by DOGMA (set protein identity to 50, and hits number to 10) (Wyman et al., 2004); start and stop sites of these annotated protein-coding genes were corrected manually. rRNA was detected by aligning rRNA sequences from other chloroplast genomes to ginseng chloroplast genome sequence using BLAT with global coverage and identify ≥90%, and tRNA was identified by tRNAscan-SE (Lowe and Eddy, 1997) with default parameters. The gene map of ginseng DMY chloroplast was drawn by OGDRAW (Lohse et al., 2007).

Identification of repeat sequences

All repeat sequences were identified using microsatellite identification tool (MISA, http://pgrc.ipk-gatersleben.de/misa/misa.html), and each repeat sequence was ≥10 bp. Repeat sequences whose repeating sequence units were arranged from 2–6 bp and repeated not less than three times were considered as SSRs. Repeat sequences with lengths ≥20 bp were considered as large repeat sequences.

Phylogenetic analysis

A phylogeny constructed for ginseng and other plants from Asterids was based on the whole genome sequences of LSC, SSC, and IR regions, and 52 protein-coding gene sequences. As to protein-coding sequences, each gene was aligned using Clustal W with default parameters (Goujon et al., 2010), and the aligned sequences from all genomes were connected in order of their sequence. MAFFT (-maxiterate = 10) (Katoh and Toh, 2010) was used to align the sequences from LSC, SSC, and IR, and poor alignment regions were manually checked and adjusted. The phylogenetic trees were constructed using maximum likelihood (ML) algorithm in MEGA5 (Tamura et al., 2011) with default parameters, and Spinacia oleracea (Order: Caryophyllales) and Vitis vinfera (Clade: Rosid) were used as outgroups.

Phylogeny among the four newly sequenced ginseng chloroplasts and P. schinseng Nees chloroplast were constructed on the basis of the genome sequences and MAFs in the polymorphism sites. Because clear differences were observed between P. schinseng Nees chloroplast and the newly sequenced chloroplasts, P. schinseng Nees chloroplast was used as an outgroup directly. According to UPGMA algorithm, the evolutionary relationship among the four newly sequenced chloroplasts was calculated based on MAF of 208 polymorphism sites.

Genome structure analysis and nucleotide diversity

When comparing CDS, introns, and intergenic regions in LSC, SSC, and IR regions between ginseng and the other three closely related chloroplast genomes, the entire genome sequences were divided into series of small fragments by those exons of protein-coding genes. The fragment flanked by the same exons or genes would be taken as orthologous fragment. Orthologous fragments between ginseng and the other three closely related chloroplast genomes were aligned respectively, and then aligned sequences were orderly connected together according their regions. Nucleotide diversity and Ka/Ks in the different regions were calculated using DnaSP (version 5.10) (Librado and Rozas, 2009) with default parameters. The Ka/Ks value for each single protein-coding gene was calculated using the KaKs_caculator (-m MA -c 11) (Zhang et al., 2006).

Results and discussion

General features of P. ginseng chloroplast genome

Genome sequencing and assembly

The length for the whole chloroplast genome sequences for DMY, EMY, and GLS was 156,354 bp, while that for YSS was 156,355 bp. In DMY chloroplast genome sequence, the total length for LSC, SSC, and IR regions were 86,129 bp, 18,077 bp, and 26,074 bp, respectively. The accession numbers in GenBank for the four chloroplast genomes are KC686331 (DMY), KC686332 (EMY), KC686333 (GLS), and KF431956 (YSS). The clean reads for these newly sequenced chloroplast genomes were deposited in NCBI, and the accession number for SRA datasets are SRR1251992 (DMY), SRR1252006 (EMY), SRR1252007 (GLS), and SRR1252008 (YSS).

Genome annotation

In the determined nucleotide sequences of the chloroplast genome, DMY, EMY, and GLS chloroplast genome sequences were identical. Compared with these three chloroplast genome sequences, YSS chloroplast genome had only one base-pair insertion at 5472th base, which is located in the intron of rps16 in LSC region. Therefore, DMY chloroplast genome was used as the representative of these newly sequenced P. ginseng chloroplast genomes for further comparative analyses.

In DMY chloroplast genome, 128 functional genes were identified, including 86 protein-coding genes, 34 tRNA genes, and 8 rRNA genes (Table 1), and the gene map for DMY chloroplast genome was shown as Figure 1. Among 86 protein-coding genes, 74 were single copy genes, 12 were duplicates, and 13 genes were composed of two or more exons. For instance, the rps12 gene had three exons, which located in LSC region (one exon) and IR region (the other two). Among 42 RNA genes, 20 were unique and 22 were duplicates. 7 of 34 tRNA genes had one intron. A total of 26,162 codons were present in protein-coding genes of DMY chloroplast genome (Supplementary Table S4). Generally the nucleotide sequence of protein-coding gene begins with a start codon ATG, while there were some exceptions consist of the following: rps19 began with GTG and ndhD began with ACG, which was also found in the chloroplast genomes of other plants, such as Phoenix dactylifera L (Yang et al., 2010); the ycf15 gene in Eleutherococcus senticosus began with ATG (Yi et al., 2012), while that in P. ginseng began with GTG, with the first nucleotide changing from A to G.

Table 1

Category for genesGroup of genesName of genes
Self-replicationrRNA genesrrn162), rrn232), rrn4.52), rrn52)
tRNA genes34 trn genes(7 contain an intron, 12 in the IR regions)
trnA-UGC*(×2), trnC-GCA, trnD-GUC, trnE-UUC, trnF-GAA, trnG-UCC, trnH-GUG, trnI-GAU*(×2), trnK-UUU*, trnL-CAA2), trnL-UAA*, trnL-UAG, trnM-CAU2), trnN-GUU2), trnP-UGG, trnQ-UUG, trnR-ACG2), trnR-UCU, trnS-GCU, trnS-GGA, trnS-UGA, trnT-GGU, trnT-UGU, trnV-GAC2), trnV-UAC*, trnW-CCA, trnY-GUA
Small subunit of ribosomerps2, rps3, rps4, rps72), rps8, rps11, rps12**, rps14, rps15, rps16*, rps18, rps19
Large subunit of ribosomerpl2*(×2), rpl14, rpl16*, rpl20, rpl22, rpl232),rpl32, rpl33, rpl36
DNA dependent RNA polymeraserpoA, rpoB, rpoC1*, rpoC2
Genes for photosynthesisSubunits of NADH-dehydrogenasendhA*, ndhB*(×2), ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK
Subunits of photosystem IpsaA, psaB, psaC, psaI, psaJ, ycf3**
Subunits of photosystem IIpsbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT
Subunits of cytochrome b/f complexpetA, petB*, petD*, petG, petL, petN
Subunits of ATP synthaseatpA, atpB, atpE, atpF*, atpH, atpI
Large subunit of rubiscorbcL
Other genesTranslational initiation factorinfA
MaturasematK
ProteaseclpP**
Envelope membrane proteincemA
Subunit of Acetyl-CoA-carboxylaseaccD
c-type cytochrome synthesis geneccsA
Genes of unknown functionOpen Reading Frames (ORF, ycf)ycf1, ycf22), ycf4, ycf15(×2), lhbA

Gene contents in P. ginseng DMY chloroplast genome.

One and two asterisks after gene names reflect one- and two-intron containing genes, respectively. Genes located in IR regions are indicated by the (×2) symbol after the gene name. The rps12 gene is divided: the 5 '-rps12 is located in LSC region and the 3 '-rps12 in IR region.

Figure 1

Simple sequence repeats in DMY chloroplast genome

Simple sequence repeats (SSRs), which are also known as microsatellites and short tandem repeats (STRs), present high diversity in copy numbers. SSRs are important genetic molecular markers for population genetics (Doorduin et al., 2011; He et al., 2012) and are widely used for plant typing (Xue et al., 2012; Yang et al., 2011). With the microsatellite identification tool (MISA), 30 SSRs were detected in P. ginseng DMY chloroplast genome (Table 2), including 18 homopolymers, 1 dipolymer, 8 tetrapolymers, 2 pentapolymers, and 1 hexapolymer. In 19 homopolymers and dipolymers, 15 SSRs were only composed of A or T bases. In the other 11 SSRs, more than half of the bases were composed of A or T bases. Therefore, SSRs in DMY chloroplasts are AT-rich. Among these SSRs, 10 were located in intergenic regions, 2 in rRNA gene (rrn23), and 8 in protein-coding genes (rpoA, rpoB, rpoC2, atpB, psbM, and 3 in ycf1). SSRs with the length of repeat unit not less than two bases were not identified in the previously reported P. schinseng Nees chloroplast genome (Kim and Lee, 2004). Comparing with P. ginseng DMY chloroplast genome, copy number variations in P. schinseng Nees chloroplast genome have introduced some difference in homopolymer SSR. (A)13 (4823–4835 bp) and (G)11 (105,431–105,441 bp) in P. ginseng DMY chloroplast genome were designated as (A)12 (4822–4833 bp) and (G)10 (105,406–105,415 bp) in P. schinseng Nees chloroplast genome, which was caused by the deletions in P. schinseng Nees chloroplast genome.

Table 2

UnitLengthNo. SSRsPosition on genome
A10117677–17686
11123946–23956
1324823–4835, 14249–14261
C1027503–7512, 38191-38200
111137043–137053
G111105431–105441
T10727594–27603(rpoB), 56528–56537(atpB), 71553–71562, 80110–80119(rpoA), 83153–83162, 127890–127899(ycf1), 130063–130072(ycf1)
11319889–19899(rpoC2), 83064–83074, 128582–128592(ycf1)
TA14185868–85881
AAGA12130782–30793
TCTT12130804–30815
AATT12130948–30959(psbM)
ATTT12134090–34101
TATT12169890–69901
AAAG12172233–72244
AGGT121107514–107525(rrn23)
CTAC121134957–134968(rrn23)
ATTAG151100769–100783
CTAAT151141701–141715
CATAGT18174295–74312

Simple sequence repeats in P. ginseng DMY chloroplast genome.

Gene name in the bracket indicates that this SSR located on the gene.

In addition to SSR sequences, repeats with lengths ≥20 bp were designated as large repeat sequences in DMY chloroplast, and 5 large repeat sequences were detected (Table 3). Two of them were located in intergenic regions. The other three were located in protein-coding regions, one in the ycf1 gene and two in the ycf2 gene. Because ycf2 is located in IR region, each ycf2 is comprised of one large repeat sequence only, which has also been reported in Sesamum indicum chloroplast genome (Yi and Kim, 2012).

Table 3

Repeat patternSize (bp)PositionLocation
(CTACATC)3211945–1965Intergenic region
(CGATATTGATGCTAGTGA)47292801–92872ycf2
(ATATCGTCACTAGCATCA)472149606–149677ycf2
(AGAAACCCCAACAACGGAAGAAAGGGGGGAAAGTGAGGAAGAAACAGATGTAGAAAT)4228111304–111531Intergenic region
(GTTTCTATTTCTACATCTGTTTCTTCCTCACTTTCCCCCCTTTCTTCCGTTGTTGGG)4228130947–131174ycf1

Long repeat sequences in P. ginseng DMY chloroplast genome.

Comparative genomics and phylogenetic analyses across species

Phylogenetic analysis of P. ginseng chloroplast genome

To study the phylogenetic relationships of ginseng, 31 plants from Asterids clade were employed, and Spinacia oleracea and Vitis vinifera were incorporated as the outgroups (Supplementary Table S5). LSC, SSC, IR, and protein-coding regions were employed to construct the phylogenetic tree respectively. As to protein-coding region, 55 conserved protein-coding genes were employed (Supplementary Table S6), and the results were shown in Figure 2. The results for LSC, SSC, and IR regions were presented in Supplementary Figure S1A1C. The phylogenetic relationship from LSC, SSC, and protein-coding regions were almost identical, and the topological relation also agreed with the taxonomy of core eudicots (Figure 2 and Supplementary Figure S1). The phylogenetic relationships based on IR region showed high similarities with the LSC, SSC and protein-coding regions' results, while Ipomoea purpurea from Solanales and Trachelium caeruleum from Asterales did not located in their original branches as their taxonomy but clustered into a new branch, which may be caused by frequent recombination and expansion/contraction in IR region (Wicke et al., 2011). Among these four phylogenetic trees, P. ginseng DMY and other plants in Apiales, including P. schinseng Nees, Eleutherococcus senticosus (E. senticosus), Anthriscus cerefolium (A. cerefolium), and Daucus carota (D. carota), had stable, topological relationships. It is apparent that the conserved protein-coding genes of the plant chloroplast genome can adequately trace the phylogenetic relationships of core eudicot plants, which have been proven by a series of other works (Wu et al., 2007; Lin et al., 2010; Liu et al., 2013; Qian et al., 2013). Meanwhile, the LSC and SSC sequences of chloroplasts could also be used for phylogenetic analyses of core eudicot plants.

Figure 2

Dynamics in IR region of P. ginseng chloroplast genome

The contraction and expansion at the borders of IR regions were common evolutionary events, which primarily contributed to the observed variation in the size of chloroplast genomes (Goulding et al., 1996; Wang et al., 2008). To elucidate this phenomenon and mechanism in Apiales, P. ginseng DMY chloroplast genome was used as the representative for comparison to the other three closely related chloroplast genomes, namely, those of E. senticosus, A. cerefolium, and D. carota. Detailed comparisons of LSC, SSC, and IR boundaries in these four representatives from Apiales were shown in Figure 3. The rps19 and ycf1 genes crossed the LSC/IRa and SSC/IRb boundaries, respectively. At the same time, the pseudogene fragment of rps19 and ycf1rps19 and Ψ ycf1) were located at the IRb/LSC and IRa/SSC boundaries, respectively. In P. ginseng DMY, IRb carried a short Ψ rps19 fragment with 50 bp at the IRb/SSC boundary, and IRa carried a short Ψ ycf1 fragment with 1649 bp at the IRa/SSC boundary. Compared to Ψ rps19, the length of Ψ ycf1 had a wider range of variation (1478–1675 bp). The Ψ ycf1 fragment in E. senticosus is the shortest, and the length of Ψ ycf1 in P. ginseng DMY, A. cerefolium, and D. carota is almost identical. Therefore, we may infer that the expansion and contraction of IR region in P. ginseng chloroplast genome is fairly stable compared to that of the other chloroplast genomes in Apiales.

Figure 3

Selection pressure on P. ginseng chloroplast genome

To explore the selection pressures of P. ginseng chloroplast genome, DMY chloroplast genome was compared to the other three chloroplast genomes in Apiales (E. senticosus, A. cerefolium, and D. carota; Table 4). Furthermore, to clarify the variations in each part of chloroplast genomes, the sequences of LSC, SSC, and IR regions were divided into three functional regions, including a protein-coding region (CDS), intron region, and intergenic spacers (IGS) region. The polymorphic site numbers between P. ginseng and E. senticosus (1721) were less than those between P. ginseng and D. carota (9916) or A. cerefolium (9504). Nucleotide differences between P. ginseng and D. carota or A. cerefolium were significantly higher than those between P. ginseng and E. senticosus for all three functional regions. Thus, we can infer that P. ginseng is more closely related to E. senticosus than the other two, which is also reflected in the phylogenetic trees (Figure 2 and Supplementary Figure S1).

Table 4

P. ginseng/D. carotaP. ginseng/A. cerefoliumP. ginseng/E. senticosus
RegionNPNDKaKsKa/Ks (P-value)NPNDKaKsKa/Ks (P-value)NPNDKaKsKa/Ks (P-value)
CDSLSC19580.04430.02120.14840.1427 (0)20210.04580.02260.14960.1511 (0)3520.00790.0040.02290.1765 (8.5e-58)
IR4960.02580.02240.04070.5507 (8.3e-10)4700.02430.02210.03510.6276 (4.9e-6)620.00320.00370.00152.528 (0.0142)
SSC12960.09070.07210.23210.3107 (1.9e-99)12760.08950.07080.23140.306 (2e-100)2810.01930.01540.03910.3937 (1.7e-13)
TOTAL37500.04820.03050.13160.2319 (0)37670.04850.0310.13080.2371 (0)6950.00890.0060.02020.2988 (7e-54)
IntronLSC7130.09627110.09621060.0139
IR460.0174300.011340.0015
SSC1150.11361150.11301090.0282
TOTAL8740.07898560.07741300.0115
IGSLSC38980.129538710.12607190.0210
IR7680.02844130.0151680.0025
SSC6260.17975970.17221090.0282
TOTAL52920.087348810.07938960.0137
TOTAL99160.066495040.063217210.0111

Comparison of protein-coding region (CDS), intron, and intergenic spacers (IGS) at LSC, IR, and SSC regions of chloroplast genomes.

This is a summary table of each calculation from three different comparisons of P. ginseng vs. D. carota, P. ginseng vs. A. cerefolium, and P. ginseng vs. E. senticosus. Abbreviation: NP, the numbers of polymorphic sites; ND, nucleotide differences; Ks, synonymous substitution differences; and Ks, nonsysnonymous substitution differences. In the column Ka/Ks, there are two numbers, in which the above number is the Ka/Ks value and the below value in bracket is the p-value.

From Table 4, it is apparent that SSC region has the highest variation ratio for all three functional regions, and IR region has the lowest variation ratio, which was the same as the chloroplast genome of Sesamum indicum L (Yi and Kim, 2012; Zhang et al., 2013). The Ka/Ks values of IR region were higher than those of LSC and SSC regions. Moreover, this trend was consistent with all comparison results of between P. ginseng and D. carota or A. cerefolium. In addition, the Ka/Ks value of CDS IR region between P. ginseng and E. senticosus was 2.528 (p-value = 0.0142), indicating a strong positive selection on the IR region. To further investigate the evolutionary rate in the IR region, the Ka/Ks values for all protein-coding gene in IR region between P. ginseng and E. senticosus were calculated using the KaKs_Calculator 2.0 with MA method. Among the six genes (ndhB, rpl2, rpl23, rps7, ycf2, and ycf15) in IR region, the Ka/Ks values for ndhB, rpl2, rpl23, and rps7 were NA, the Ka/Ks value for ycf2 was 2.38 (Ka = 0.0055, Ks = 0.0023, p-value = 0.124), and the Ka/Ks value for ycf15 was 50 (Ka = 0.0039, Ks = 7.7e-5, p-value = 0). It's obvious that those exceptionally high Ka/Ks value were mainly caused by extremely low Ks value, which could not well reflect the real selection pressure on these genes. Even though, we could infer that ycf2 and ycf15 suffered positive selection. ycf2 and ycf15 have been predicted in many chloroplast genomes by several previous studies, and experimental evidences have proved that they are indeed functional (Drescher et al., 2000; Dong et al., 2013; Shi et al., 2013). Anja Drescher and et al have proved that ycf2 played an essential role in cell survival in tobacco chloroplast (Drescher et al., 2000), and ycf15 was also found transcribed but not spliced in spinach (Schmitz-Linneweber et al., 2001).

To make more comprehensive investigation on selection pressure of all chloroplast genes, we have calculated the Ka/Ks value for all chloroplast genes between DMY and the three closely related species (E. senticosus, A. cerefolium, and D. carota). From Supplementary Table S7, we found that, when comparing the genes between DMY and E. senticosus, many genes (such as atpE, atpF and so on) have high Ka/Ks values, which were also introduced by extremely low Ks value. When comparing DMY with the three closely related-species, besides these genes with low Ks introduced high Ka/Ks values, ycf1, ycf2, and ycf15 indeed suffered positive selection with Ka/Ks value no less than 0.5, which were also found in tobacco and Sedum sarmentosum chloroplast genomes(Drescher et al., 2000; Dong et al., 2013). Furthermore, Kapralov MV and Filatov DA found that rbcL evolved under strong positive selection in Schiedea and plants from Amaranthaceae family(Kapralov and Filatov, 2006; Kapralov et al., 2012), while rbcL did not suffered obviously positive selection in P. ginseng, which indicates that gene may suffers different selection pressure in different plants.

Polymorphism and phylogenetic analysis of ginseng chloroplast genome sequences within species

Comparisons of the five P. ginseng chloroplast genome sequences

Genetic variations in chloroplast genome sequences usually have significant value in population genetic analyses and plant domestications (Tang et al., 2004; Kawakami et al., 2007; Yang et al., 2013). To explore the evolution of P. ginseng, comparative analyses among the four newly sequenced chloroplast genomes and P. schinseng Nees chloroplast genome were performed. DMY, EMY, and GLS had identical chloroplast genome sequences, and the YSS chloroplast genome sequence had a 1 bp insertion located at base 5472 reference to DMY chloroplast genome sequence. We also compared the previously reported P. schinseng Nees chloroplast genome with P. ginseng DMY chloroplast genome. P. schinseng Nees chloroplast genome was sequenced by ABI 377 with an average coverage of 4.7×, and P. ginseng DMY chloroplast genome was sequenced by HiSeq2000 with an average coverage over 1000× and Roche/454 GS-FLX (Titanium) with an average coverage of 217×. Five sites on P. schinseng Nees chloroplast genome sequence (accession number in GenBank: NC_006290) were undefined, four were marked as R (purine, A or G) and one was Y (pyrimidine, T or C). All five sites were excluded from the differentiated sites, because those bases in DMY at the same site were also a purine or pyrimidine. Excluding the undefined sites, a total of 172 variant sites, including 88 indels and 84 substitutions, were found (Supplementary Table S8). According to the annotation of DMY chloroplast genome, 30 of 84 substitution sites were located in intergenic region, 16 of them were located in intron region, 2 were located in rRNA gene rrn23, 1 was located in tRNA gene trnV-UAC, and the remaining 35 were located in protein-coding genes (7 for ycf1; 5 for rpoC2; 4 for accD and psbD; 2 for atpB and rpoB; and 1 for rps2, rps12, lhbA, ndhD, psaA, psaB, psaC, atpF, atpI, clpP, and atpA). Regarding 88 indels, 51 were located in intergenic region, 35 were located in intron region, and 2 were located in tRNA genes (trnC-GCA and trnS-UGA). Further analyzing the relationship between repeat regions of these indel sites, the result showed that 21 sites were located in the short single nucleotide acid repeat regions, with a repeat unit occurrence of five or more times. Thus, the different origins of P. schinseng Nees and P. ginseng DMY could account for the differences in their chloroplast genome sequences, and other reasons require further investigation.

Minor allele site polymorphisms and evolution of P. ginseng chloroplast genome

Minor allele frequency (MAF) refers to the frequency of the least common allele occurs in a given population and is usually used for conducting evolutionary analyses of genetic markers (Zhang et al., 2012a; McPherson et al., 2013). Diversity in the MAF has been widely employed for evolutionary and genetic analyses between domestic and wild species in both plants and animals (Vasemagi et al., 2012; Alhaddad et al., 2013; Iorizzo et al., 2013; Petersen et al., 2013). There are a large number of chloroplast genomes in per plant cell (Pyke, 1999). In current study, the DNA sample for each strain was extracted from numerous cells. This means that we took all these chloroplast genomes from the same individual as a chloroplast population for each strain. In the assembly section, the assembled genome sequences substantially represent major alleles in the population. To study the heterogeneity of chloroplast genomes and the dynamics of minor alleles over the course of P. ginseng domestication, all minor allele sites for DMY, EMY, GLS, and YSS were identified with high-resolution reads. After mapping the high-quality reads of all four strains in DMY chloroplast genome respectively and filtering minor alleles with an MAF value of <0.01 (Kolz et al., 2009; Liu et al., 2012), a total of 312 polymorphism sites were detected (Figure 4). To avoid interference from sequencing errors and DNA migration from chloroplast genome to mitochondria and nuclear genomes (Thorsness and Weber, 1996; Birky, 2001; Zhang et al., 2012b), and to restrict analyses to common alleles (De et al., 2008; Skoglund and Jakobsson, 2011; Iorizzo et al., 2013), minor alleles sites with an MAF value of ≥0.05 were used as credible minor alleles, and the cutoff line was shown as a dashed horizontal line in Figure 4. With this MAF value, 208 minor allele sites were identified from all four strains (Supplementary Table S9). And the major types in variation of all these common SNP sites were immobile in every strain. Excluding the variant between YSS and the other three strains at base 5472, there were 5 deletions (3 for A, 1 for C, and 1 for CG) and 202 substitutions, including 41 transitions (transitions of A–G occurred 11 times, all other types transitions occurred 10 times). Among 161 transversion events, the occurrences A–C and T–G accounted for the majority part, 68 and 52 times, respectively. 136 of 208 minor alleles were located in LSC region, and the other 72 were located in IR region. According to the annotation of DMY chloroplast genome, 52 were located in intergenic region, 35 were in intron region, 1 was in tRNA gene trnD-GUC, and the other 120 were in protein-coding region (73 non-synonymous and 47 synonymous substitutions). For the protein-coding genes in LSC region, the details for minor allele number in each gene were shown as follows: 13 in rbcL; 6 in psaB; 6 in psbC; 4 in psbD, psaA, atpB, and rpl22; 3 in rps8; 2 in psbA, ycf3, petD, rpoA, rpl14, and rps3; and 1 in matK, infA, and rpl16 for each. For the protein-coding genes in IR region, the number of minor alleles would be doubled because there are two copies of IR (for each copy, 20 in ycf2, 4 in rp12, 3 in ycf15 and ndhB, and 1 in rps19 and rpl23).

Figure 4

For all 208 chloroplast minor allele sites with MAF ≥0.05, DMY, EMY, GLS, and YSS covered 116, 92, 152, and 193, respectively. The polymorphism site numbers per kb of the chloroplast genome sequences for these four strains were 0.74, 0.59, 0.97, and 1.23, respectively. The minor allele overlaps among DMY, EMY, GLS, and YSS was 76 (Figure 5). All these 76 common minor allele sites were shared by all four strains; 29 of them were located in protein-coding region (14 non-synonymous and 15 synonymous substitutions). Additionally, there were 30 specific minor allele sites in YSS, with 23 in protein-coding region (15 non-synonymous and 8 synonymous substitutions). In DMY, there were also 11 specific minor allele sits, of which 6 were in rbcL. Those specific minor allele sites in YSS and DMY could be used as genetic markers to discriminate them from others. However, the specific minor allele sites in GLS and EMY were very few, so it was very difficult to choose applicable minor allele sites. When the threshold for MAF value was increased to 0.1, the total number of minor alleles decreased to 88, among which DMY, EMY, GLS, and YSS covered 1, 11, 35, and 86, respectively. Seen from the minor allele analysis result, the wild-type P. ginseng YSS is very different from the other three cultivated strains.

Figure 5

At the same time, from Figure 4, we found that those minor alleles on chloroplast genomes were exposed to purifying selection due to domestication. Based on the information of minor allele sites and the differences of genome sequences, an evolutionary relationship among the four newly sequenced strains and the previously reported P. schinseng Nees can be inferred (Figure 6). Greater divergence in genome sequences was found between P. schinseng Nees and the four newly sequenced Chinese ginseng species; thus, P. schinseng Nees is an outgroup. Additionally, we can infer that DMY is closely related to EMY, while GLS is closely related to YSS.

Figure 6

Statements

Ethics statement

All plant samples (four kinds of Chinese ginseng cultivars) in current study were collected from E-Mu ginseng experimental base (N 43°47′45.72″, E128°6′7.88″, 46 m above sea level), Changchun University of Chinese Medicine with permission. Ginseng in current research does not involve endangered or protected species.

Author contributions

Yongbing Zhao, Jingfa Xiao, Xumin Wang, Xiaobo Qu and Jun Yu conceived and designed the experiments. Jinlong Yin, Haiyan Guo, Yongbing Zhao, Wen Xiao and Xiaobo Qu performed the experiments. Yongbing Zhao, Yuyu Zhang, Chen Sun, Jiayan Wu, Xumin Wang and Jingfa Xiao analyzed and interpreted the data. Haiyan Guo, Jinlong Yin and Xumin Wang contributed reagents/materials or analysis tools. Yongbing Zhao, Jinlong Yin, Haiyan Guo, Xumin Wang, Jingfa Xiao and Jun Yu wrote the manuscript. All authors contributed to and approved the final manuscript.

Acknowledgments

This work is supported by a grant (No. 2010CB126604, No. 2012CB721101) from the National Basic Research and Development Program (973 Program).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Supplementary material

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fpls.2014.00696/abstract

References

  • 1

    AlhaddadH.KhanR.GrahnR. A.GandolfiB.MullikinJ. C.ColeS. A.et al. (2013). Extent of linkage disequilibrium in the domestic cat, Felis silvestris catus, and its breeds. PLoS ONE8:e53537. 10.1371/journal.pone.0053537

  • 2

    BirkyC. J. (2001). The inheritance of genes in mitochondria and chloroplasts: laws, mechanisms, and models. Annu. Rev. Genet. 35, 125148. 10.1146/annurev.genet.35.102401.090231

  • 3

    DeS.Lopez-BigasN.TeichmannS. A. (2008). Patterns of evolutionary constraints on genes in humans. BMC Evol. Biol. 8:275. 10.1186/1471-2148-8-275

  • 4

    DongW.XuC.ChengT.ZhouS. (2013). Complete chloroplast genome of Sedum sarmentosum and chloroplast genome evolution in Saxifragales. PLoS ONE8:e77965. 10.1371/journal.pone.0077965

  • 5

    DoorduinL.GravendeelB.LammersY.AriyurekY.Chin-A-WoengT.VrielingK. (2011). The complete chloroplast genome of 17 individuals of pest species Jacobaea vulgaris: SNPs, microsatellites and barcoding markers for population and phylogenetic studies. DNA Res. 18, 93105. 10.1093/dnares/dsr002

  • 6

    DrescherA.RufS.CalsaT. J.CarrerH.BockR. (2000). The two largest chloroplast genome-encoded open reading frames of higher plants are essential genes. Plant J. 22, 97104. 10.1046/j.1365-313x.2000.00722.x

  • 7

    GaoY.YangM. F.SuY. P.JiangH. M.YouX. J.YangY. J.et al. (2013). Ginsenoside Re reduces insulin resistance through activation of PPAR-gamma pathway and inhibition of TNF-alpha production. J. Ethnopharmacol. 147, 509516. 10.1016/j.jep.2013.03.057

  • 8

    GoujonM.McWilliamH.LiW.ValentinF.SquizzatoS.PaernJ.et al. (2010). A new bioinformatics analysis tools framework at EMBL-EBI. Nucleic Acids Res. 38, W695W699. 10.1093/nar/gkq313

  • 9

    GouldingS. E.OlmsteadR. G.MordenC. W.WolfeK. H. (1996). Ebb and flow of the chloroplast inverted repeat. Mol. Gen. Genet. 252, 195206. 10.1007/BF02173220

  • 10

    HeS.WangY.VolisS.LiD.YiT. (2012). Genetic diversity and population structure: implications for conservation of wild soybean (Glycine soja Sieb. et Zucc) based on nuclear and chloroplast microsatellite variation. Int. J. Mol. Sci. 13, 1260812628. 10.3390/ijms131012608

  • 11

    IorizzoM.SenalikD. A.EllisonS. L.GrzebelusD.CavagnaroP. F.AllenderC.et al. (2013). Genetic structure and domestication of carrot (Daucus carota subsp. sativus) (Apiaceae). Am. J. Bot. 100, 930938. 10.3732/ajb.1300055

  • 12

    IorizzoM.SenalikD.SzklarczykM.GrzebelusD.SpoonerD.SimonP. (2012). De novo assembly of the carrot mitochondrial genome using next generation sequencing of whole genomic DNA provides first evidence of DNA transfer into an angiosperm plastid genome. BMC Plant Biol. 12:61. 10.1186/1471-2229-12-61

  • 13

    JansenR. K.CaiZ.RaubesonL. A.DaniellH.DepamphilisC. W.Leebens-MackJ.et al. (2007). Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns. Proc. Natl. Acad. Sci. U.S.A. 104, 1936919374. 10.1073/pnas.0709121104

  • 14

    JansenR. K.RaubesonL. A.BooreJ. L.DePamphilisC. W.ChumleyT. W.HaberleR. C.et al. (2005). Methods for obtaining and analyzing whole chloroplast genome sequences. Methods Enzymol. 395, 348384. 10.1016/S0076-6879(05)95020-9

  • 15

    KapralovM. V.FilatovD. A. (2006). Molecular adaptation during adaptive radiation in the Hawaiian endemic genus Schiedea. PLoS ONE1:e8. 10.1371/journal.pone.0000008

  • 16

    KapralovM. V.SmithJ. A.FilatovD. A. (2012). Rubisco evolution in C(4) eudicots: an analysis of Amaranthaceae sensu lato. PLoS ONE7:e52974. 10.1371/journal.pone.0052974

  • 17

    KatohK.TohH. (2010). Parallelization of the MAFFT multiple sequence alignment program. Bioinformatics26, 18991900. 10.1093/bioinformatics/btq224

  • 18

    KawakamiS.EbanaK.NishikawaT.SatoY.VaughanD. A.KadowakiK. (2007). Genetic variation in the chloroplast genome suggests multiple domestication of cultivated Asian rice (Oryza sativa L.). Genome50, 180187. 10.1139/g06-139

  • 19

    KimK. J.LeeH. L. (2004). Complete chloroplast genome sequences from Korean ginseng (Panax schinseng Nees) and comparative analysis of sequence evolution among 17 vascular plants. DNA Res. 11, 247261. 10.1093/dnares/11.4.247

  • 20

    KolzM.JohnsonT.SannaS.TeumerA.VitartV.PerolaM.et al. (2009). Meta-analysis of 28,141 individuals identifies common variants within five new loci that influence uric acid concentrations. PLoS Genet. 5:e1000504. 10.1371/journal.pgen.1000504

  • 21

    LeeC.WenJ. (2004). Phylogeny of Panax using chloroplast trnC-trnD intergenic region and the utility of trnC-trnD in interspecific studies of plants. Mol. Phylogenet. Evol. 31, 894903. 10.1016/j.ympev.2003.10.009

  • 22

    LiH.DurbinR. (2010). Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics26, 589595. 10.1093/bioinformatics/btp698

  • 23

    LiH.HandsakerB.WysokerA.FennellT.RuanJ.HomerN.et al. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics25, 20782079. 10.1093/bioinformatics/btp352

  • 24

    LibradoP.RozasJ. (2009). DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics25, 14511452. 10.1093/bioinformatics/btp187

  • 25

    LinC. P.HuangJ. P.WuC. S.HsuC. Y.ChawS. M. (2010). Comparative chloroplast genomics reveals the evolution of Pinaceae genera and subfamilies. Genome Biol. Evol. 2, 504517. 10.1093/gbe/evq036

  • 26

    LiuC.ZhangF.LiT.LuM.WangL.YueW.et al. (2012). MirSNP, a database of polymorphisms altering miRNA target sites, identifies miRNA-related SNPs in GWAS SNPs and eQTLs. BMC Genomics13:661. 10.1186/1471-2164-13-661

  • 27

    LiuY.HuoN.DongL.WangY.ZhangS.YoungH. A.et al. (2013). Complete chloroplast genome sequences of Mongolia medicine Artemisia frigida and phylogenetic relationships with other plants. PLoS ONE8:e57533. 10.1371/journal.pone.0057533

  • 28

    LohseM.DrechselO.BockR. (2007). OrganellarGenomeDRAW (OGDRAW): a tool for the easy generation of high-quality custom graphical maps of plastid and mitochondrial genomes. Curr. Genet. 52, 267274. 10.1007/s00294-007-0161-y

  • 29

    LoweT. M.EddyS. R. (1997). tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25, 955964. 10.1093/nar/25.5.0955

  • 30

    McPhersonH.van der MerweM.DelaneyS. K.EdwardsM. A.HenryR. J.McIntoshE.et al. (2013). Capturing chloroplast variation for molecular ecology studies: a simple next generation sequencing approach applied to a rainforest tree. BMC Ecol. 13:8. 10.1186/1472-6785-13-8

  • 31

    MooreM. J.BellC. D.SoltisP. S.SoltisD. E. (2007). Using plastid genome-scale data to resolve enigmatic relationships among basal angiosperms. Proc. Natl. Acad. Sci. U.S.A. 104, 1936319368. 10.1073/pnas.0708072104

  • 32

    MooreM. J.SoltisP. S.BellC. D.BurleighJ. G.SoltisD. E. (2010). Phylogenetic analysis of 83 plastid genes further resolves the early diversification of eudicots. Proc. Natl. Acad. Sci. U.S.A. 107, 46234628. 10.1073/pnas.0907801107

  • 33

    NeuhausH. E.EmesM. J. (2000). Nonphotosynthetic metabolism in plastids. Annu. Rev. Plant Physiol. Plant Mol. Biol. 51, 111140. 10.1146/annurev.arplant.51.1.111

  • 34

    PalmerJ. D. (1985). Comparative organization of chloroplast genomes. Annu. Rev. Genet. 19, 325354. 10.1146/annurev.ge.19.120185.001545

  • 35

    ParksM.CronnR.ListonA. (2009). Increasing phylogenetic resolution at low taxonomic levels using massively parallel sequencing of chloroplast genomes. BMC Biol. 7:84. 10.1186/1741-7007-7-84

  • 36

    PazyarN.OmidianM.JamshydianN. (2012). Ginseng as a potential novel addition to the antikeloid weaponry. Phytother. Res. 26, 15791580. 10.1002/ptr.4598

  • 37

    PetersenJ. L.MickelsonJ. R.RendahlA. K.ValbergS. J.AnderssonL. S.AxelssonJ.et al. (2013). Genome-wide analysis reveals selection for important traits in domestic horse breeds. PLoS Genet. 9:e1003211. 10.1371/journal.pgen.1003211

  • 38

    PykeK. A. (1999). Plastid division and development. Plant Cell11, 549556. 10.1105/tpc.11.4.549

  • 39

    QianJ.SongJ.GaoH.ZhuY.XuJ.PangX.et al. (2013). The complete chloroplast genome sequence of the medicinal plant Salvia miltiorrhiza. PLoS ONE8:e57607. 10.1371/journal.pone.0057607

  • 40

    Schmitz-LinneweberC.MaierR. M.AlcarazJ. P.CottetA.HerrmannR. G.MacheR. (2001). The plastid chromosome of spinach (Spinacia oleracea): complete nucleotide sequence and gene organization. Plant Mol. Biol. 45, 307315. 10.1023/A:1006478403810

  • 41

    ShiC.LiuY.HuangH.XiaE. H.ZhangH. B.GaoL. Z. (2013). Contradiction between plastid gene transcription and function due to complex posttranscriptional splicing: an exemplary study of ycf15 function and evolution in angiosperms. PLoS ONE8:e59620. 10.1371/journal.pone.0059620

  • 42

    ShinozakiK.OhmeM.TanakaM.WakasugiT.HayashidaN.MatsubayashiT.et al. (1986). The complete nucleotide sequence of the tobacco chloroplast genome: its gene organization and expression. EMBO J. 5, 20432049.

  • 43

    SkoglundP.JakobssonM. (2011). Archaic human ancestry in East Asia. Proc. Natl. Acad. Sci. U.S.A. 108, 1830118306. 10.1073/pnas.1108181108

  • 44

    TamuraK.PetersonD.PetersonN.StecherG.NeiM.KumarS. (2011). MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 28, 27312739. 10.1093/molbev/msr121

  • 45

    TangJ.XiaH.CaoM.ZhangX.ZengW.HuS.et al. (2004). A comparison of rice chloroplast genomes. Plant Physiol. 135, 412420. 10.1104/pp.103.031245

  • 46

    ThorsnessP. E.WeberE. R. (1996). Escape and migration of nucleic acids between chloroplasts, mitochondria, and the nucleus. Int. Rev. Cytol. 165, 207234. 10.1016/S0074-7696(08)62223-8

  • 47

    VasemagiA.NilssonJ.McGinnityP.CrossT.O'ReillyP.GlebeB.et al. (2012). Screen for footprints of selection during domestication/Captive Breeding of Atlantic Salmon. Comp. Funct. Genomics2012:628204. 10.1155/2012/628204

  • 48

    WangR. J.ChengC. L.ChangC. C.WuC. L.SuT. M.ChawS. M. (2008). Dynamics and evolution of the inverted repeat-large single copy junctions in the chloroplast genomes of monocots. BMC Evol. Biol. 8:36. 10.1186/1471-2148-8-36

  • 49

    WickeS.SchneeweissG. M.DePamphilisC. W.MullerK. F.QuandtD. (2011). The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Mol. Biol. 76, 273297. 10.1007/s11103-011-9762-4

  • 50

    WuC. S.WangY. N.LiuS. M.ChawS. M. (2007). Chloroplast genome (cpDNA) of Cycas taitungensis and 56 cp protein-coding genes of Gnetum parvifolium: insights into cpDNA evolution and phylogeny of extant seed plants. Mol. Biol. Evol. 24, 13661379. 10.1093/molbev/msm059

  • 51

    WymanS. K.JansenR. K.BooreJ. L. (2004). Automatic annotation of organellar genomes with DOGMA. Bioinformatics20, 32523255. 10.1093/bioinformatics/bth352

  • 52

    XueJ.WangS.ZhouS. L. (2012). Polymorphic chloroplast microsatellite loci in Nelumbo (Nelumbonaceae). Am. J. Bot. 99, e240e244. 10.3732/ajb.1100547

  • 53

    YangA. H.ZhangJ. J.YaoX. H.HuangH. W. (2011). Chloroplast microsatellite markers in Liriodendron tulipifera (Magnoliaceae) and cross-species amplification in L. chinense. Am. J. Bot. 98, e123e126. 10.3732/ajb.1000532

  • 54

    YangJ. B.TangM.LiH. T.ZhangZ. R.LiD. Z. (2013). Complete chloroplast genome of the genus Cymbidium: lights into the species identification, phylogenetic implications and population genetic analyses. BMC Evol. Biol. 13:84. 10.1186/1471-2148-13-84

  • 55

    YangM.ZhangX.LiuG.YinY.ChenK.YunQ.et al. (2010). The complete chloroplast genome sequence of date palm (Phoenix dactylifera L.). PLoS ONE5:e12762. 10.1371/journal.pone.0012762

  • 56

    YiD. K.KimK. J. (2012). Complete chloroplast genome sequences of important oilseed crop Sesamum indicum L. PLoS ONE7:e35872. 10.1371/journal.pone.0035872

  • 57

    YiD. K.LeeH. L.SunB. Y.ChungM. Y.KimK. J. (2012). The complete chloroplast DNA sequence of Eleutherococcus senticosus (Araliaceae); comparative evolutionary analyses with other three asterids. Mol. Cells33, 497508. 10.1007/s10059-012-2281-6

  • 58

    ZerbinoD. R.BirneyE. (2008). Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18, 821829. 10.1101/gr.074492.107

  • 59

    ZhangH.LiC.MiaoH.XiongS. (2013). Insights from the complete chloroplast genome into the evolution of Sesamum indicum L. PLoS ONE8:e80508. 10.1371/journal.pone.0080508

  • 60

    ZhangT.FangY.WangX.DengX.ZhangX.HuS.et al. (2012b). The complete chloroplast and mitochondrial genome sequences of Boea hygrometrica: insights into the evolution of plant organellar genomes. PLoS ONE7:e30531. 10.1371/journal.pone.0030531

  • 61

    ZhangT.HuS.ZhangG.PanL.ZhangX.Al-MssallemI. S.et al. (2012a). The organelle genomes of Hassawi rice (Oryza sativa L.) and its hybrid in saudi arabia: genome variation, rearrangement, and origins. PLoS ONE7:e42041. 10.1371/journal.pone.0042041

  • 62

    ZhangT.ZhangX.HuS.YuJ. (2011). An efficient procedure for plant organellar genome assembly, based on whole genome data from the 454 GS FLX sequencing platform. Plant Methods7:38. 10.1186/1746-4811-7-38

  • 63

    ZhangZ.LiJ.ZhaoX. Q.WangJ.WongG. K.YuJ. (2006). KaKs_Calculator: calculating Ka and Ks through model selection and model averaging. Genom. Proteom. Bioinform. 4, 259263. 10.1016/S1672-0229(07)60007-2

  • 64

    ZhengS. D.WuH. J.WuD. L. (2012). Roles and mechanisms of ginseng in protecting heart. Chin. J. Integr. Med. 18, 548555. 10.1007/s11655-012-1148-1

Summary

Keywords

comparative genomics, SNP, minor allele, chloroplast genome, Panax ginseng

Citation

Zhao Y, Yin J, Guo H, Zhang Y, Xiao W, Sun C, Wu J, Qu X, Yu J, Wang X and Xiao J (2015) The complete chloroplast genome provides insight into the evolution and polymorphism of Panax ginseng. Front. Plant Sci. 5:696. doi: 10.3389/fpls.2014.00696

Received

17 September 2014

Accepted

23 November 2014

Published

14 January 2015

Volume

5 - 2014

Edited by

Tiegang Lu, Chinese Academy of Agricultural Sciences, China

Reviewed by

Shaojie Zhang, University of Central Florida, USA; Xiao Han, Chinese Academy of Agricultural Sciences, China

Copyright

*Correspondence: Jingfa Xiao and Xumin Wang, Beijing Institute of Genomics, Chinese Academy of Sciences. NO.1 Beichen West Road, Chaoyang District, Beijing 100101, China e-mail: ;

†These authors have contributed equally to this work.

This article was submitted to Plant Genetics and Genomics, a section of the journal Frontiers in Plant Science.

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

Outline

Figures

Cite article

Copy to clipboard


Export citation file


Share article

Article metrics