AUTHOR=Chen Cui‐Xia , Sun Li‐Na , Hou Xue‐Xin , Du Peng‐Cheng , Wang Xiao‐Long , Du Xiao‐Chen , Yu Yu‐Fei , Cai Rui‐Kun , Yu Lei , Li Tian‐Jun , Luo Min‐Na , Shen Yue , Lu Chao , Li Qian , Zhang Chuan , Gao Hua‐Fang , Ma Xu , Lin Hao , Cao Zong‐Fu TITLE=Prevention and Control of Pathogens Based on Big-Data Mining and Visualization Analysis JOURNAL=Frontiers in Molecular Biosciences VOLUME=7 YEAR=2021 URL=https://www.frontiersin.org/journals/molecular-biosciences/articles/10.3389/fmolb.2020.626595 DOI=10.3389/fmolb.2020.626595 ISSN=2296-889X ABSTRACT=

Morbidity and mortality caused by infectious diseases rank first among all human illnesses. Many pathogenic mechanisms remain unclear, and the misuse of antibiotics has led to the emergence of drug-resistant strains. Infectious diseases spread rapidly and pathogens mutate quickly, posing new threats to human health. With the increasing use of high-throughput screening of pathogen genomes, research based on big-data mining and visualization analysis has become an active topic in infectious disease prevention and control. This paper presents a new framework for genome information visualization analysis, based on big-data mining technology, that accommodates both the depth and the breadth of molecular-level pathogen research. The framework was applied to four infectious pathogens (Fusobacterium, Streptococcus, Neisseria, and Streptococcus salivarius) through five functions: 1) genome annotation, 2) phylogenetic analysis based on the core genome, 3) analysis of structural differences between genomes, 4) prediction of virulence genes/factors and their pathogenic mechanisms, and 5) prediction of resistance genes/factors and their signaling pathways. The experiments were carried out from three angles: phylogeny (macro perspective), structural differences between genomes (micro perspective), and virulence and drug-resistance characteristics (prediction perspective). The framework can therefore provide evidence to support the rapid identification of new or unknown pathogens, contributing to the prevention and control of infectious diseases, and can also help recommend the most appropriate strains for clinical and scientific research.