AUTHOR=Kouchaki Samaneh, Yang Yang, Lachapelle Alexander, Walker Timothy M., Walker A. Sarah, CRyPTIC Consortium, Peto Timothy E. A., Crook Derrick W., Clifton David A. TITLE=Multi-Label Random Forest Model for Tuberculosis Drug Resistance Classification and Mutation Ranking JOURNAL=Frontiers in Microbiology VOLUME=11 YEAR=2020 URL=https://www.frontiersin.org/journals/microbiology/articles/10.3389/fmicb.2020.00667 DOI=10.3389/fmicb.2020.00667 ISSN=1664-302X ABSTRACT=
Resistance prediction and mutation ranking are important tasks in the analysis of tuberculosis sequence data. Because first-line antibiotics are used in standard regimens, resistance co-occurrence, in which a sample is resistant to multiple drugs, is common. Analysing all drugs simultaneously should therefore allow patterns of resistance co-occurrence to be exploited for resistance prediction. Here, multi-label random forest (MLRF) models are compared with single-label random forest (SLRF) models, both for predicting phenotypic resistance from whole-genome sequences and for identifying important mutations for better prediction of resistance to four first-line drugs in a dataset of 13402