AUTHOR=Pfeifer Leah D., Patabandige Milani W., Desaire Heather TITLE=Leveraging R (LevR) for fast processing of mass spectrometry data and machine learning: Applications analyzing fingerprints and glycopeptides JOURNAL=Frontiers in Analytical Science VOLUME=2 YEAR=2022 URL=https://www.frontiersin.org/journals/analytical-science/articles/10.3389/frans.2022.961592 DOI=10.3389/frans.2022.961592 ISSN=2673-9283 ABSTRACT=
Applying machine learning strategies to interpret mass spectrometry data has the potential to revolutionize the way in which disease is diagnosed, prognosed, and treated. A persistent and tedious obstacle, however, is relaying mass spectrometry data to the machine learning algorithm. Given the native format and large size of mass spectrometry data files, preprocessing is a critical step. To address this challenge, we sought to create an easy-to-use, continuous pipeline that runs from data acquisition to the machine learning algorithm. Here, we present a start-to-finish pipeline designed to facilitate supervised and unsupervised classification of mass spectrometry data. The input can be any ESI data set collected by LC-MS or flow injection, and the output is a machine-learning-ready matrix, in which each row is a feature (an abundance of a particular