AUTHOR=Régnier Mireille, Chassignet Philippe TITLE=Accurate Prediction of the Statistics of Repetitions in Random Sequences: A Case Study in Archaea Genomes JOURNAL=Frontiers in Bioengineering and Biotechnology VOLUME=4 YEAR=2016 URL=https://www.frontiersin.org/journals/bioengineering-and-biotechnology/articles/10.3389/fbioe.2016.00035 DOI=10.3389/fbioe.2016.00035 ISSN=2296-4185 ABSTRACT=

Repetitive patterns in genomic sequences have great biological significance as well as algorithmic implications. Analytic combinatorics makes it possible to derive formulas for the expected length of repetitions in a random sequence. The asymptotic results, which generalize previous work on a binary alphabet, are easily computable, and simulations on random sequences show their accuracy. As an application, the sample case of Archaea genomes illustrates how biological sequences may differ from random sequences.
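
To make the idea of checking a prediction against simulated random sequences concrete, the following is a minimal sketch and not the authors' method. It assumes a uniform model over the four-letter alphabet ACGT, measures the length of the longest repeated word by brute force, and compares it with the classical first-order heuristic 2 log_4(n) for a sequence of length n. The paper's own formulas are not reproduced here, and the function names `has_repeat`, `longest_repeat_length`, and `simulate` are hypothetical.

```python
import math
import random


def has_repeat(seq, length):
    """Return True if some word of the given length occurs at least twice
    in seq (overlapping occurrences count)."""
    seen = set()
    for i in range(len(seq) - length + 1):
        word = seq[i:i + length]
        if word in seen:
            return True
        seen.add(word)
    return False


def longest_repeat_length(seq):
    """Length of the longest word occurring at least twice in seq.

    If a word of length L repeats, so does its prefix of length L - 1,
    so the property is monotone and a binary search on L is valid.
    """
    lo, hi = 0, len(seq) - 1  # lo: length known to repeat, hi: upper bound
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if has_repeat(seq, mid):
            lo = mid
        else:
            hi = mid - 1
    return lo


def simulate(n, trials, alphabet="ACGT", seed=0):
    """Average longest-repeat length over random uniform sequences.

    Assumption: letters are independent and uniformly distributed,
    which is a simplification of the models used for real genomes.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        seq = "".join(rng.choices(alphabet, k=n))
        total += longest_repeat_length(seq)
    return total / trials


if __name__ == "__main__":
    n = 5000
    observed = simulate(n, trials=10)
    heuristic = 2 * math.log(n, 4)  # first-order estimate, uniform model only
    print(f"n={n}  observed mean={observed:.2f}  2*log_4(n)={heuristic:.2f}")
```

Running the same measurement on a real genome and on random sequences of the same length and composition is one simple way to see how far a biological sequence departs from the random model.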