AUTHOR=Liu Zhuochong , Jiang Zhonghua , Wu Wei , Xu Xinyi , Ma Yudong , Guo Xiaomei , Zhang Senlin , Sun Qun TITLE=Identification of region of difference and H37Rv-related deletion in Mycobacterium tuberculosis complex by structural variant detection and genome assembly JOURNAL=Frontiers in Microbiology VOLUME=13 YEAR=2022 URL=https://www.frontiersin.org/journals/microbiology/articles/10.3389/fmicb.2022.984582 DOI=10.3389/fmicb.2022.984582 ISSN=1664-302X ABSTRACT=

Mycobacterium tuberculosis complex (MTBC), the main cause of TB in humans and animals, is an extreme example of genetic homogeneity, whereas it is still nevertheless separated into various lineages by numerous typing methods, which differ in phenotype, virulence, geographic distribution, and host preference. The large sequence polymorphism (LSP), incorporating region of difference (RD) and H37Rv-related deletion (RvD), is considered to be a powerful means of constructing phylogenetic relationships within MTBC. Although there have been many studies on LSP already, focusing on the distribution of RDs in MTBC and their impact on MTB phenotypes, a crumb of new lineages or sub-lineages have been excluded and RvDs have received less attention. We, therefore, sampled a dataset of 1,495 strains, containing 113 lineages from the laboratory collection, to screen for RDs and RvDs by structural variant detection and genome assembly, and examined the distribution of RvDs in MTBC, including RvD2, RvD5, and cobF region. Consistent with genealogical delineation by single nucleotide polymorphism (SNP), we identified 125 RDs and 5 RvDs at the species, lineage, or sub-lineage levels. The specificities of RDs and RvDs were further investigated in the remaining 10,218 strains, suggesting that most of them were highly specific to distinct phylogenetic groups, could be used as stable genetic markers in genotyping. More importantly, we identified 34 new lineage or evolutionary branch specific RDs and 2 RvDs, also demonstrated the distribution of known RDs and RvDs in MTBC. This study provides novel details about deletion events that have occurred in distinct phylogenetic groups and may help to understand the genealogical differentiation.