AUTHOR=Swargam Sandeep, Kumari Indu, Kumar Amit, Pradhan Dibyabhaba, Alam Anwar, Singh Harpreet, Jain Anuja, Devi Kangjam Rekha, Trivedi Vishal, Sarma Jogesh, Hanif Mahmud, Narain Kanwar, Ehtesham Nasreen Zafar, Hasnain Seyed Ehtesham, Ahmad Shandar TITLE=MycoVarP: Mycobacterium Variant and Drug Resistance Prediction Pipeline for Whole-Genome Sequence Data Analysis JOURNAL=Frontiers in Bioinformatics VOLUME=1 YEAR=2022 URL=https://www.frontiersin.org/journals/bioinformatics/articles/10.3389/fbinf.2021.805338 DOI=10.3389/fbinf.2021.805338 ISSN=2673-7647 ABSTRACT=

Whole-genome sequencing (WGS) provides a comprehensive tool for analyzing bacterial genomes for genotype–phenotype correlations, the diversity of single-nucleotide variants (SNVs), and their evolution and transmission. Several online pipelines and standalone tools are available for WGS analysis of the Mycobacterium tuberculosis (Mtb) complex (MTBC). While they facilitate the processing of WGS data with minimal user expertise, they are either too general, providing little insight into bacterium-specific issues such as gene variations, INDEL/synonymous/PE-PPE (IDP family), and drug resistance from sample data, or are limited to specific objectives, such as drug resistance. Drug resistance and lineage-specific issues require an elaborate prioritization of the identified variants to choose the best target for subsequent therapeutic intervention. The Mycobacterium variant pipeline (MycoVarP) addresses these specific issues with a flexible battery of user-defined and default filters. It provides an end-to-end solution for WGS analysis of Mtb variants from raw reads and performs two quality checks: one before trimming and one after alignment of the reads to the reference genome. MycoVarP maps the annotated variants to the drug-susceptible (DS) database, removes false-positive variants, provides lineage identification, and predicts potential drug resistance. We re-analyzed the WGS data reported by Advani et al. (2019) using MycoVarP and identified additional variants not previously reported. We conclude that MycoVarP will help identify nonsynonymous, true-positive, drug resistance–associated variants, including those within the IDP of the PE-PPE/PGRS family, more effectively and comprehensively than is possible with currently available pipelines.
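
The filtering idea described above (discarding variants that also occur in a drug-susceptible reference set, then applying default or user-defined filters) can be sketched roughly as follows. This is only an illustrative sketch and not MycoVarP's actual code: the `Variant` record, the `filter_variants` function, the `ds_keys` set, and all sample values are hypothetical and were invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical annotated variant record (not MycoVarP's own format)."""
    position: int  # 1-based coordinate on the reference genome
    ref: str       # reference allele
    alt: str       # alternate allele
    gene: str      # annotated gene name
    effect: str    # e.g. "missense", "synonymous", "frameshift"


def filter_variants(variants, ds_keys, drop_effects=frozenset({"synonymous"})):
    """Keep variants that are absent from the drug-susceptible (DS) set
    and whose annotated effect is not in `drop_effects`.

    `ds_keys` is assumed to be a set of (position, ref, alt) tuples
    collected from drug-susceptible isolates. `drop_effects` stands in
    for a default filter that the caller can override.
    """
    kept = []
    for v in variants:
        if (v.position, v.ref, v.alt) in ds_keys:
            continue  # also seen in DS isolates: treated as a false positive
        if v.effect in drop_effects:
            continue  # removed by the effect filter
        kept.append(v)
    return kept


# Toy input: positions, alleles, and gene names are made up.
calls = [
    Variant(1000, "C", "T", "geneA", "missense"),
    Variant(2000, "G", "A", "geneB", "synonymous"),
    Variant(3000, "A", "G", "geneC", "missense"),
]
ds_keys = {(1000, "C", "T")}  # hypothetical DS database content

kept = filter_variants(calls, ds_keys)
for v in kept:
    print(v.gene, v.position, v.ref, v.alt, v.effect)
```

Passing a different `drop_effects` set (for example an empty one to retain synonymous variants) mimics swapping a default filter for a user-defined one.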