AUTHOR=Guang August, Howison Mark, Ledingham Lauren, D’Antuono Matthew, Chan Philip A., Lawrence Charles, Dunn Casey W., Kantor Rami
TITLE=Incorporating Within-Host Diversity in Phylogenetic Analyses for Detecting Clusters of New HIV Diagnoses
JOURNAL=Frontiers in Microbiology
VOLUME=12
YEAR=2022
URL=https://www.frontiersin.org/journals/microbiology/articles/10.3389/fmicb.2021.803190
DOI=10.3389/fmicb.2021.803190
ISSN=1664-302X
ABSTRACT=

Background

Phylogenetic analyses of HIV sequences are used to detect clusters and inform public health interventions. Conventional approaches summarize within-host HIV diversity with a single consensus sequence of the pol gene per host, obtained from Sanger or next-generation sequencing (NGS). There is growing recognition that this approach discards potentially important information about within-host sequence variation, which can affect phylogenetic inference. However, it is unknown whether alternative summary methods that incorporate intra-host variation change phylogenetic inference of transmission network features.

Methods

We introduce profile sampling, a method to incorporate within-host NGS sequence diversity into phylogenetic HIV cluster inference. We compare this approach to Sanger- and NGS-derived pol and near-whole-genome consensus sequences and evaluate its potential benefits in identifying molecular clusters among all individuals newly diagnosed with HIV over six months at the largest HIV center in Rhode Island.
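
The abstract does not spell out how profile sampling works, so the following is only an illustrative sketch under stated assumptions: that each host is represented by a per-site nucleotide frequency profile derived from NGS reads, and that sampled sequences are drawn site by site from those frequencies. The function names, the profile layout, and the toy frequencies are hypothetical and are not taken from the authors' implementation. A consensus function is included for contrast with the conventional approach.

```python
import random

# Assumed layout (hypothetical): one dict per alignment site,
# mapping each observed nucleotide to its within-host frequency.

def consensus(profile):
    """Conventional summary: keep only the most frequent base per site."""
    return "".join(max(site, key=site.get) for site in profile)

def sample_from_profile(profile, n_samples, seed=0):
    """Draw n_samples sequences, choosing each site's base in proportion
    to its frequency. Sites are treated as independent, which is a
    simplifying assumption of this sketch."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        bases = []
        for site in profile:
            nucleotides = list(site.keys())
            weights = list(site.values())
            bases.append(rng.choices(nucleotides, weights=weights, k=1)[0])
        samples.append("".join(bases))
    return samples

# Toy profile for one host: site 2 is polymorphic (70% C, 30% T).
profile = [
    {"A": 1.0},
    {"C": 0.7, "T": 0.3},
    {"G": 1.0},
]

consensus_seq = consensus(profile)
samples = sample_from_profile(profile, n_samples=5)
```

In this sketch the consensus sequence hides the minority variant at the second site, whereas the sampled sequences can carry either base there. Each set of sampled sequences could then feed a separate phylogenetic reconstruction, so that cluster support is assessed across samples rather than from one summary sequence per host.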

Results

Profile sampling cluster inference demonstrated that within-host viral diversity impacts phylogenetic inference across individuals, and that consensus sequence approaches can obscure both the magnitude and the effect of these impacts. Clustering differed between Sanger- and NGS-derived consensus and profile sampling sequences, and across gene regions.

Discussion

Profile sampling can incorporate within-host HIV diversity captured by NGS into phylogenetic analyses. This additional information can improve the robustness of cluster detection.