AUTHOR=Radomski Nicolas, Cadel-Six Sabrina, Cherchame Emeline, Felten Arnaud, Barbet Pauline, Palma Federica, Mallet Ludovic, Le Hello Simon, Weill François-Xavier, Guillier Laurent, Mistou Michel-Yves
TITLE=A Simple and Robust Statistical Method to Define Genetic Relatedness of Samples Related to Outbreaks at the Genomic Scale – Application to Retrospective Salmonella Foodborne Outbreak Investigations
JOURNAL=Frontiers in Microbiology
VOLUME=10
YEAR=2019
URL=https://www.frontiersin.org/journals/microbiology/articles/10.3389/fmicb.2019.02413
DOI=10.3389/fmicb.2019.02413
ISSN=1664-302X
ABSTRACT=
The investigation of foodborne outbreaks (FBOs) from genomic data typically relies on inspecting the relatedness of samples through a phylogenomic tree computed on SNPs, genes, kmers, or alleles (i.e., cgMLST and wgMLST). Phylogenomic reconstruction is often time-consuming and computationally intensive, and it depends on hidden assumptions, pipeline implementation, and parameterization. In the context of FBO investigations, robust links between isolates are required in a timely manner to trigger appropriate management actions. Here, we propose a non-parametric statistical method to decide whether samples are related (i.e., outbreak cases) or unrelated (i.e., non-outbreak cases). Using computations that typically run within minutes on a desktop computer, we benchmarked the ability of three non-parametric statistical tests (i.e., Wilcoxon rank-sum, Kolmogorov–Smirnov and Kruskal–Wallis) applied to six different genomic features (i.e., SNPs, SNPs excluding recombination events, genes, kmers, cgMLST alleles, and wgMLST alleles) to discriminate outbreak cases (i.e., positive control: C+) from non-outbreak cases (i.e., negative control: C−). We leveraged four well-characterized and retrospectively investigated FBOs of