Whole genome sequencing (WGS) is increasingly used to characterize foodborne pathogens and has become a standard typing technique for surveillance and research purposes. WGS data can help assess microbial risks and define risk mitigation strategies for foodborne pathogens, including
To test the hypothesis that (combinations of) different genes can predict the probability of infection [P(inf)] given exposure to a certain pathogen strain, we determined P(inf) based on the invasion potential of 87
P(inf) values ranged from 6.7E-05 to 5.2E-01 and varied both among and within serovars. P(inf) values also varied between isolation sources, but no unambiguous pattern was observed in the tested serovars. Interestingly, the serovars causing the highest number of human infections did not show a better ability to invade cells in the GIT model system; strains belonging to other serovars displayed even higher infectivity. The RF model did not identify any virulence factor as a significant predictor of P(inf). Significant associations of P(inf) with biofilm formation were found under all tested conditions, but only for a limited number of serovars, indicating that the two phenotypes are governed by different mechanisms and that the ability to form biofilm does not correlate with the ability to invade epithelial cells. Other omics techniques therefore seem more promising for identifying genes associated with P(inf), and alternative hypotheses, such as differences in gene expression rather than gene presence/absence, could be tested to explain phenotypic virulence [P(inf)].