To recognize and combat a diverse array of pathogens, the immune system maintains a large repertoire of T cells bearing unique T cell receptors (TCRs), with only a few clones specific for any given antigen. We discuss how to measure the number of different TCRs that the genome can encode (the potential repertoire) and the number of different TCRs present in an individual (the realized repertoire). One puzzle is that the potential repertoire greatly exceeds the realized diversity of naïve T cells within any individual. We show that existing hypotheses fail to explain why the immune system can generate far more diversity than any individual uses, and propose an alternative hypothesis of “evolutionary sloppiness.” Another immunological puzzle is why mice and humans have similar repertoires even though humans have over 1000-fold more T cells. We discuss how the idea of the “protecton,” the smallest unit of protection, might explain this discrepancy, and we estimate the size of the protecton from available precursor frequency data. We then consider T cell cross-reactivity, the ability of a T cell clone to respond to more than one epitope. We extend existing calculations to estimate the expected extent of cross-reactivity between the responses to different pathogens. Our results are consistent with two observations: a low probability of observing cross-reactivity between the immune responses to two randomly chosen pathogens, and an ensemble of memory cells sufficiently diverse to generate cross-reactive responses to new pathogens.
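The two cross-reactivity observations can be illustrated with a minimal sketch. It is not the authors' calculation: it assumes each clone recognizes a given epitope independently with the same probability `p` (a precursor frequency), so the chance that at least one of `n_clones` clones responds is 1 - (1 - p)^n. The function names and all numerical values below are hypothetical placeholders chosen only for illustration.

```python
import math


def prob_any_cross_reactive(p: float, n_clones: int) -> float:
    """Probability that at least one of n_clones clones recognizes an epitope.

    Assumes (illustrative simplification) that clones respond independently,
    each with probability p. Computes 1 - (1 - p)**n_clones in a way that
    stays accurate when p is very small.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    if n_clones < 0:
        raise ValueError("n_clones must be non-negative")
    return -math.expm1(n_clones * math.log1p(-p))


def expected_cross_reactive_clones(p: float, n_clones: int) -> float:
    """Expected number of responding clones under the same assumption."""
    return p * n_clones


# Placeholder values, NOT taken from the text.
p = 1e-5                 # assumed per-clone response probability
response_clones = 100    # assumed clones in the response to one pathogen
memory_clones = 10**5    # assumed clones in the whole memory pool

# Small set of clones: cross-reactivity with a second pathogen is unlikely.
print(prob_any_cross_reactive(p, response_clones))
# Large memory pool: some clone is likely to cross-react with a new epitope.
print(prob_any_cross_reactive(p, memory_clones))
```

Under these assumptions the same per-clone probability gives a low chance of cross-reactivity for the few clones raised against one pathogen and a substantial chance for the memory pool as a whole, simply because the pool contains many more clones.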
T and B cell repertoires are collections of lymphocytes, each characterized by its antigen-specific receptor. Here we review the classical technologies and analysis strategies developed to assess immunoglobulin (IG) and T cell receptor (TR) repertoire diversity, and describe recent advances in the field. First, we describe the broad range of methodological tools developed over the past decades. Each answers different questions, and together they complement one another in progressively identifying the level at which a repertoire is altered: a global overview of diversity by flow cytometry, IG repertoire descriptions at the protein level for the identification of IG reactivities, IG/TR CDR3 spectratyping strategies, and related molecular quantification of the dynamics of T/B cell differentiation. We then introduce recent advances in molecular biology tools that allow deeper analysis of IG/TR diversity by next-generation sequencing (NGS), which offers systematic and comprehensive sequencing of IG/TR transcripts in a short amount of time. NGS provides several angles of analysis, such as clonotype frequency, CDR3 diversity, CDR3 sequence analysis, and V allele identification, all with a quantitative dimension, and therefore requires the development of high-throughput analysis tools. In this vein, we discuss recent efforts toward nomenclature standardization and ontology development. We then present the variety of statistical analysis and modeling approaches developed for the various levels of diversity analysis, and show the increasing sophistication of those modeling approaches. To conclude, we provide examples of recent mathematical modeling strategies and perspectives that illustrate the active rise of a “next generation” of repertoire analysis.
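Two of the NGS readouts named above, clonotype frequency and CDR3 diversity, can be sketched as follows. This is a minimal illustration and not a method taken from the review: it assumes a clonotype is defined by its CDR3 amino acid sequence alone, and it uses the Shannon entropy and the Simpson index as two common diversity measures. The function names and the toy reads are hypothetical.

```python
import math
from collections import Counter


def clonotype_frequencies(cdr3_reads):
    """Map each distinct CDR3 sequence to its fraction of all reads."""
    counts = Counter(cdr3_reads)
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


def shannon_entropy(freqs):
    """Shannon entropy (natural log) of a clonotype frequency table."""
    return -sum(f * math.log(f) for f in freqs.values() if f > 0)


def simpson_index(freqs):
    """Probability that two reads drawn with replacement share a clonotype."""
    return sum(f * f for f in freqs.values())


# Toy reads standing in for CDR3 sequences; not real data.
reads = ["CASSA", "CASSA", "CASSB", "CASSC"]

freqs = clonotype_frequencies(reads)
print(freqs)                   # per-clonotype frequencies
print(shannon_entropy(freqs))  # higher means more even, more diverse
print(simpson_index(freqs))    # higher means more clonal dominance
```

Real pipelines typically also use V and J gene assignments to define clonotypes and correct for sequencing error and sampling depth, which this sketch deliberately ignores.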