AUTHOR=Janßen René , Beck Aaron J. , Werner Johannes , Dellwig Olaf , Alneberg Johannes , Kreikemeyer Bernd , Maser Edmund , Böttcher Claus , Achterberg Eric P. , Andersson Anders F. , Labrenz Matthias TITLE=Machine Learning Predicts the Presence of 2,4,6-Trinitrotoluene in Sediments of a Baltic Sea Munitions Dumpsite Using Microbial Community Compositions JOURNAL=Frontiers in Microbiology VOLUME=12 YEAR=2021 URL=https://www.frontiersin.org/journals/microbiology/articles/10.3389/fmicb.2021.626048 DOI=10.3389/fmicb.2021.626048 ISSN=1664-302X ABSTRACT=

Bacteria are ubiquitous and live in complex microbial communities. Due to differences in physiological properties and niche preferences among community members, microbial communities respond in specific ways to environmental drivers, potentially resulting in distinct microbial fingerprints for a given environmental state. As a proof of principle, our goal was to assess the opportunities and limitations of machine learning to detect microbial fingerprints indicating the presence of the munition compound 2,4,6-trinitrotoluene (TNT) in southwestern Baltic Sea sediments. Over 40 environmental variables, including grain size distribution, elemental composition, and concentration of munition compounds (mostly at pmol·g⁻¹ levels), from 150 sediments collected at the near-shore munition dumpsite Kolberger Heide, near the German city of Kiel, were combined with 16S rRNA gene amplicon sequencing libraries. Prediction was achieved using Random Forests (RFs); the robustness of predictions was validated using Artificial Neural Networks (ANNs). To facilitate machine learning with microbiome data, we developed the R package phyloseq2ML. Using exclusively the 25 most classification-relevant bacterial genera, potentially representing a TNT-indicative fingerprint, TNT was predicted correctly with up to 81.5% balanced accuracy. False positive classifications indicated that this approach also has the potential to identify samples where the original TNT contamination was no longer detectable. The fact that TNT presence was not among the main drivers of the microbial community composition demonstrates the sensitivity of the approach. Moreover, predictions based on environmental variables were poorer than those based on microbial fingerprints. Our results suggest that microbial communities can predict even minor influencing factors in complex environments, demonstrating the potential of this approach for the discovery of contamination events over an integrated period of time.
Now that the approach has been demonstrated for one distinct environment, future studies should assess its suitability for environmental monitoring in general.
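
The abstract reports performance as balanced accuracy rather than plain accuracy. As a minimal sketch of what that metric measures, assuming a binary labelling (TNT present or absent), balanced accuracy is the mean of sensitivity and specificity, so a classifier cannot score well simply by favouring the more frequent class. The function name and the ten sample labels below are hypothetical and chosen only for illustration; they are not data from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels.

    True stands for "TNT present", False for "TNT absent".
    Assumes both classes occur in y_true; otherwise a division
    by zero is raised.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    fn = sum(1 for t, p in pairs if t and not p)      # false negatives
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives
    fp = sum(1 for t, p in pairs if not t and p)      # false positives
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical labels: 2 of 10 samples contain TNT.
y_true = [True, True] + [False] * 8
y_pred = [True, False] + [False] * 7 + [True]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                     # 0.8
print(balanced_accuracy(y_true, y_pred))  # 0.6875
```

In this made-up case plain accuracy is 0.8 even though only one of the two positive samples is found, while balanced accuracy drops to 0.6875 because sensitivity is only 0.5.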