AUTHOR=Swargam Sandeep , Kumari Indu , Kumar Amit , Pradhan Dibyabhaba , Alam Anwar , Singh Harpreet , Jain Anuja , Devi Kangjam Rekha , Trivedi Vishal , Sarma Jogesh , Hanif Mahmud , Narain Kanwar , Ehtesham Nasreen Zafar , Hasnain Seyed Ehtesham , Ahmad Shandar TITLE=MycoVarP: Mycobacterium Variant and Drug Resistance Prediction Pipeline for Whole-Genome Sequence Data Analysis JOURNAL=Frontiers in Bioinformatics VOLUME=Volume 1 - 2021 YEAR=2022 URL=https://www.frontiersin.org/journals/bioinformatics/articles/10.3389/fbinf.2021.805338 DOI=10.3389/fbinf.2021.805338 ISSN=2673-7647 ABSTRACT=Whole-genome sequencing (WGS) provides a comprehensive tool to analyse bacterial genomes for genotype-phenotype correlations, the diversity of single nucleotide variants (SNVs), and their evolution and transmission. Several online pipelines and standalone tools are available for WGS analysis of the Mycobacterium tuberculosis (Mtb) complex (MTBC). While they facilitate the processing of WGS data with minimal user expertise, they are either too general, providing little insight into bacterium-specific issues, or limited to specific objectives such as drug resistance. Drug resistance and lineage-specific issues require an elaborate prioritization of identified variants to choose the best target for subsequent therapeutic intervention. Mycobacterium Variant Pipeline (MycoVarP) addresses these specific issues with a flexible battery of user-defined and default filters. It provides an end-to-end solution for WGS analysis of Mtb variants starting from raw reads and performs two quality checks: one before trimming and one after alignment of the reads to the reference genome. MycoVarP maps the annotated variants to a drug-susceptible (DS) database and removes false-positive variants, provides lineage identification, and predicts potential drug resistance. We have re-analysed the WGS data reported by Advani et al. (2019) using MycoVarP and identified additional variants not previously reported. We conclude that MycoVarP will help in identifying non-synonymous, true-positive, drug resistance-associated variants, including those within the PE-PPE/PGRS family, more effectively and comprehensively than is possible with currently available pipelines. Availability: Online analysis of small data sets and the full pipeline for local use are provided at: http://www.sciwhylab.org/serves/mtb-var/