AUTHOR=MacGregor Barbara J.
TITLE=Visualizing Evolutionary Relationships of Multidomain Proteins: An Example from Receiver (REC) Domains of Sensor Histidine Kinases in the Candidatus Maribeggiatoa str. Orange Guaymas Draft Genome
JOURNAL=Frontiers in Microbiology
VOLUME=7
YEAR=2016
URL=https://www.frontiersin.org/journals/microbiology/articles/10.3389/fmicb.2016.01780
DOI=10.3389/fmicb.2016.01780
ISSN=1664-302X
ABSTRACT=
For multidomain proteins, evolutionary changes may occur at the domain as well as the whole-protein level. An example is presented here, with suggestions for how such complicated relationships might be visualized. Earlier analysis of the Candidatus Maribeggiatoa str. Orange Guaymas (BOGUAY; Gammaproteobacteria) single-filament draft genome found evidence of gene exchange with the phylogenetically distant Cyanobacteria, particularly for sensory and signal transduction proteins. Because these are modular proteins, known to undergo frequent duplication, domain swapping, and horizontal gene transfer, a single domain was chosen for analysis. Receiver (REC) domains are short (~125 amino acids) and well conserved, simplifying sequence alignments and phylogenetic calculations. Over 100 of these were identified in the BOGUAY genome and found to have a wide range of inferred phylogenetic relationships. Two sets were chosen here for detailed study. One set of four BOGUAY ORFs has closest relatives among other Beggiatoaceae and Cyanobacteria. A second set of four has REC domains with more mixed affiliations, including other Beggiatoaceae, several sulfate-reducing Deltaproteobacteria and Firmicutes, magnetotactic Nitrospirae, one Shewanella and one Ferrimonas strain (both Gammaproteobacteria), and numerous Vibrio vulnificus and V. navarrensis strains (also Gammaproteobacteria). For an overview of the possible origins of the whole proteins and the surrounding genomic regions, color-coded BLASTP results were produced and displayed against cartoons showing protein domain structure of predicted genes. This is suggested as a visualization method for investigation of possible horizontally transferred regions, giving more detail than scans of DNA composition and codon usage but much faster than carrying out full phylogenetic analyses for multiple proteins.
As expected, most of the predicted sensor histidine kinases investigated have two or more segments with distinct BLASTP affiliations. For the first set of BOGUAY ORFs, the flanking regions were also examined, and the results suggest they are embedded in genomic stretches with complex histories. An automated method of creating such visualizations could be generally useful; a wish list for its features is given.