BRIEF RESEARCH REPORT article
Front. Genet.
Sec. Computational Genomics
Volume 16 - 2025 | doi: 10.3389/fgene.2025.1498978
Characterization of Novel Human Endogenous Retrovirus Structures on Chromosomes 6 and 7
Provisionally accepted
- 1 University of Cambridge, Cambridge, England, United Kingdom
- 2 National Institute of Neurological Disorders and Stroke (NIH), Bethesda, Maryland, United States
Human endogenous retroviruses (HERVs) represent nearly 8% of the human genome. Of these, HERV-K subtype HML-2 is a transposable element that plays a critical role in embryonic development and in the pathogenesis of several diseases. Quantifying and characterizing the multiple HML-2 insertions in the human genome has been challenging because of their size, their sequence homology with each other, and their repetitive nature. Using long-read DNA sequencing, we examined a cohort of 222 individuals for the HML-2 proviruses at 6q14.1 and 7p22.1a, two loci that are capable of producing full-length viral proteins and have previously been implicated in several cancers, autoimmune disorders, and neurodegenerative diseases. Although the reference genome suggests that these two loci are structurally dissimilar, we found that at both loci about 5% of individuals carry a unique tandem repeat-like sequence (three long terminal repeat sequences sandwiching two internal, potentially protein-coding sequences), whereas most individuals have a standard proviral structure (one internal region sandwiched by two long terminal repeats). Moreover, both proviruses can produce full-length, or nearly full-length, HERV-K proteins in multiple transcription orientations. The amino acid sequences from different loci in the same transcriptional orientation share sequence homology with each other. These results demonstrate a clear, previously unreported relationship between HML-2 loci 6q14.1 and 7p22.1a and highlight the utility of long-read sequencing for studying repetitive elements. Future studies need to establish whether these polymorphisms determine genetic susceptibility to the diseases associated with these loci.
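The two structures described above differ only in the order and number of long terminal repeat (LTR) and internal segments, so the distinction can be illustrated with a minimal sketch. It assumes each allele has already been annotated as an ordered list of segment labels; the labels `"LTR"` and `"INT"` and the function name `classify_structure` are hypothetical choices for illustration and are not taken from the article's methods.

```python
from typing import List

# Hypothetical segment labels: "LTR" for a long terminal repeat,
# "INT" for an internal (potentially protein-coding) region.
STANDARD = ["LTR", "INT", "LTR"]
TANDEM_REPEAT_LIKE = ["LTR", "INT", "LTR", "INT", "LTR"]


def classify_structure(segments: List[str]) -> str:
    """Classify one allele from its ordered list of segment labels."""
    if segments == STANDARD:
        return "standard provirus"
    if segments == TANDEM_REPEAT_LIKE:
        return "tandem repeat-like"
    # Any other arrangement is left unclassified in this sketch.
    return "other"


if __name__ == "__main__":
    # Illustrative inputs only; these are not data from the study.
    print(classify_structure(["LTR", "INT", "LTR"]))
    print(classify_structure(["LTR", "INT", "LTR", "INT", "LTR"]))
```

A real pipeline would first need to derive the segment order from long-read alignments or assemblies, which is outside the scope of this sketch.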
Keywords: DNA sequencing, HERV-K, HML-2, long-read sequencing, structural variants, tandem repeat
Received: 19 Sep 2024; Accepted: 07 Jan 2025.
Copyright: © 2025 Pasternack, Nath and Paulsen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Nick Pasternack, University of Cambridge, Cambridge, CB2 1TN, England, United Kingdom