AUTHOR=Smith Anne Marie E., Lanevskij Kiril, Sazonovas Andrius, Harris Jesse TITLE=Impact of Established and Emerging Software Tools on the Metabolite Identification Landscape JOURNAL=Frontiers in Toxicology VOLUME=4 YEAR=2022 URL=https://www.frontiersin.org/journals/toxicology/articles/10.3389/ftox.2022.932445 DOI=10.3389/ftox.2022.932445 ISSN=2673-3080 ABSTRACT=

Scientists’ ability to detect drug-related metabolites at trace concentrations has improved over recent decades. High-resolution instruments enable collection of large amounts of raw experimental data. In fact, the quantity of data produced has become a challenge because of the effort required to convert raw data into useful insights. Various cheminformatics tools have been developed to address these metabolite identification challenges. This article describes the current state of these tools. They can be split into two categories: pre-experimental metabolite generation and post-experimental data analysis. The former can be subdivided into rule-based, machine learning-based, and docking-based approaches. Post-experimental tools help scientists automatically perform chromatographic deconvolution of LC/MS data and identify metabolites. They can use pre-experimental predictions to improve metabolite identification, but they are not limited to these predictions: unexpected metabolites can also be discovered through fractional mass filtering. In addition to a review of available software tools, we present a description of pre-experimental and post-experimental metabolite structure generation using MetaSense. These software tools improve upon manual techniques, increasing scientist productivity and enabling efficient handling of large datasets. However, the trend toward increasingly large datasets and highly data-driven workflows requires a more sophisticated informatics transition in metabolite identification labs. Experimental work has traditionally been separated from the information technology tools that handle our data. We argue that these IT tools can help scientists draw connections via data visualizations and preserve and share results via searchable centralized databases. In addition, data marshalling and homogenization techniques enable future data mining and machine learning.
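
The fractional mass filtering mentioned above can be illustrated with a minimal sketch. It assumes the common formulation in which a detected peak is retained when the fractional part of its m/z lies within a fixed window of the parent drug's fractional mass. The function names, the 50 mDa tolerance, and the example m/z values are all hypothetical choices made for illustration; they are not taken from MetaSense or from any tool reviewed in the article.

```python
import math


def fractional_mass(mz: float) -> float:
    """Return the fractional part of an m/z value (e.g. 310.2165 -> ~0.2165)."""
    return mz - math.floor(mz)


def fractional_mass_filter(peaks_mz, parent_mz, tolerance=0.050):
    """Keep peaks whose fractional mass is within `tolerance` (in Da) of the parent's.

    Assumption: a symmetric window around the parent's fractional mass.
    The difference is wrapped so that 0.98 and 0.01 count as 0.03 apart.
    """
    parent_frac = fractional_mass(parent_mz)
    kept = []
    for mz in peaks_mz:
        diff = abs(fractional_mass(mz) - parent_frac)
        diff = min(diff, 1.0 - diff)  # handle wrap-around at the integer boundary
        if diff <= tolerance:
            kept.append(mz)
    return kept


# Hypothetical parent ion and peak list, for illustration only.
parent = 310.2165
peaks = [
    326.2114,  # parent + 15.9949 (oxygen addition)
    296.2009,  # parent - 14.0156 (loss of CH2)
    301.5432,  # unrelated background ion
    486.2486,  # parent + 176.0321 (glucuronide conjugate)
]
candidates = fractional_mass_filter(peaks, parent)
```

Because common biotransformations shift the fractional mass only slightly, a filter of this kind can flag drug-related ions without a predicted structure list, which is why it can surface unexpected metabolites. A fixed window is a simplification; a wider or mass-dependent window may be needed for large conjugates.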