AUTHOR=Saba Federico, Mariethoz Julien, Lisacek Frederique

TITLE=What is a consistent glycan composition dataset?

JOURNAL=Frontiers in Analytical Science

VOLUME=Volume 3 - 2023

YEAR=2023

URL=https://www.frontiersin.org/journals/analytical-science/articles/10.3389/frans.2023.1073540

DOI=10.3389/frans.2023.1073540

ISSN=2673-9283

ABSTRACT=One of the main challenges in bioinformatics has been, and still is, the comparison of entities through the development of algorithms for similarity scoring and data clustering according to biologically relevant aspects. Glycoinformatics also faces this challenge, in particular regarding the automated comparison of protein and tissue glycomes, which remains relatively uncharted territory. The aim of the work presented here is therefore to lay down initial requirements for developing a software tool that compares glycomes, and to discuss the definition of similarity as well as the methods suited for its evaluation and implementation. This is partly achieved through the analysis of networks of glycan compositions generated by the Compozitor application, via another Java application specifically tailored for this purpose. The outcome of the present study is twofold. Theoretically, it is shown that the idiosyncrasies of current data limit the definition of appropriate estimates for systematically comparing N-glycomes; practically, several lines of improvement could be derived for the Compozitor application.