The outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in Wuhan, China, rapidly grew into a global pandemic. How SARS-CoV-2 evolved remains unclear.
We performed a comprehensive analysis using the available genomes of SARS-CoV-2 and its closely related coronaviruses.
The ratio of nucleotide substitutions to amino acid substitutions in the spike gene between SARS-CoV-2 WIV04 and Bat-CoV RaTG13 (9.07) was markedly higher than that between other coronaviruses (range, 1.29–4.81). The ratio of non-synonymous to synonymous substitution rates (dN/dS) between SARS-CoV-2 WIV04 and Bat-CoV RaTG13 was the lowest among all comparisons performed, suggesting evolution under stringent selective pressure. Notably, the relative proportion of the T:C transition was markedly higher between SARS-CoV-2 WIV04 and Bat-CoV RaTG13 than between the other coronaviruses compared. Codon usage was similar across these coronaviruses and is unlikely to explain the increased number of synonymous mutations. Moreover, some sites of the spike protein might be subject to positive selection.
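To make the two counting statistics above concrete, the following is a minimal sketch of how a nucleotide-to-amino-acid substitution ratio and the breakdown of substitution types (such as T:C) could be tallied from two aligned, in-frame coding sequences. It is an illustration under stated assumptions, not the pipeline used in this study: the function name `compare_coding_sequences` and the toy sequences are hypothetical, the standard genetic code is assumed, and codons containing gaps or ambiguous bases are simply skipped. The raw ratio computed here is not dN/dS, which additionally requires normalizing by the numbers of synonymous and non-synonymous sites.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# alternative translation table is needed for these sequences).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def compare_coding_sequences(seq_a, seq_b):
    """Count nucleotide and amino acid differences between two aligned CDSs.

    Returns (nt_subs, aa_subs, ratio, pair_counts), where pair_counts maps an
    unordered base pair such as "T:C" to the number of sites that differ by it.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and equal in length")

    nt_subs = 0
    aa_subs = 0
    pair_counts = Counter()

    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        # Skip codons with gaps or ambiguous bases (a simplifying assumption).
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue
        for x, y in zip(codon_a, codon_b):
            if x != y:
                nt_subs += 1
                # Direction is ignored: T->C and C->T are both counted as "T:C".
                pair_counts[":".join(sorted((x, y), key=BASES.index))] += 1
        if CODON_TABLE[codon_a] != CODON_TABLE[codon_b]:
            aa_subs += 1

    ratio = nt_subs / aa_subs if aa_subs else float("inf")
    return nt_subs, aa_subs, ratio, pair_counts


# Toy example with invented sequences (not real viral data).
toy_a = "ATGTTTCTGAAA"
toy_b = "ATGTTCCTAAGA"
nt_subs, aa_subs, ratio, pair_counts = compare_coding_sequences(toy_a, toy_b)
print(nt_subs, aa_subs, ratio, dict(pair_counts))
```

In the toy pair, two of the three nucleotide differences are synonymous, so the ratio exceeds 1; a high ratio of this kind reflects an excess of synonymous changes relative to amino acid changes.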
Our results showed an increased proportion of synonymous substitutions and of the T:C transition between SARS-CoV-2 and RaTG13. Further investigation of the mechanism underlying this mutation pattern would contribute to understanding viral pathogenicity and adaptation to hosts.