This Research Topic is part of the Resolving the Complexity of Plant Genomes and Transcriptomes with Long Reads series.
Thanks to low-cost second-generation sequencing technologies, nearly six hundred plant genomes have been sequenced to a chromosome or scaffold level within the last twenty years, enabling structural and functional studies. This, in turn, triggered the development of integrative resources for plant comparative genomics, e.g. PLAZA, Gramene, or Phytozome. Subsequently, numerous large-scale projects have been launched with a focus on dissecting intra-species genomic variation within plant populations. All these efforts have led to an exponential increase in our knowledge of genome structure and of the importance of regulatory mechanisms, such as alternative transcript splicing or non-coding RNA-mediated regulation of gene expression. We have also learned that structural variation plays a prominent role in plant adaptation to new environmental conditions, stress response, and evolution.
However, all these achievements have also inevitably led to a growing appreciation of the complexity of plant genomes, which is difficult or impossible to resolve fully with short-read sequencing. Plants are extremely variable in terms of their genome size and ploidy level. Their genomes are rich in repeated sequences, including transposable elements and tandem repeat arrays. In some extreme cases, as in maize, these repetitive elements constitute as much as 80% of the genomic content. Additionally, extensive gene copy number variation has been observed in plants. Remarkably, the regions of highest structural diversity have turned out to contribute to a vast amount of phenotypic variation and to determine traits important for breeding and agriculture.
Third-generation sequencing technologies, including single-molecule real-time (SMRT) and nanopore sequencing, are capable of delivering unprecedentedly long reads, opening entirely new perspectives for genomic studies. In fact, gapless chromosome-level assemblies of complex eukaryotic genomes, such as those of human or maize, have been obtained only recently, by utilizing long-read sequencing coupled with optical maps. Additionally, more complete and more contiguous versions of a number of plant genomes, including radish, sorghum, tomato, cabbage, and banana, have been obtained. The ability to obtain full-length RNA/cDNA sequencing reads has also empowered the discovery of new transcript isoforms. Furthermore, direct identification of methylation sites in native DNA/RNA strands has been enabled for the first time, providing insights into the links between genomic variation, chromatin structure, and phenotypic diversity.
The objective of this Research Topic is to outline the current advances, trends, and perspectives in plant research in the third-generation sequencing era. We expect to gather a collection of excellent research findings from the fields of plant structural and functional genomics, enabled or supported by applying long-read sequencing technologies. We especially encourage manuscripts related to the following topics:
- DNA/RNA chemical modifications and their impact on gene expression
- Understanding gene regulation and function through full-length transcript sequencing and isoform diversity analysis
- The role of tandem duplications in the evolution of gene families
- Resolving the structure of complex genomic rearrangements
- The role of structural genomic variation (especially large-scale variants, copy number variants, and transposable elements)
- In-field plant pathogen detection and tracking species biodiversity
- Methods and computational tools for the analysis of third-generation sequencing data
We welcome Original Research, Methods, Review, Mini Review, and Perspective manuscripts in this collection.