AUTHOR=Zhou Jinghang , Liu Liyuan , Reynolds Edwardo , Huang Xixia , Garrick Dorian , Shi Yuangang TITLE=Discovering Copy Number Variation in Dual-Purpose XinJiang Brown Cattle JOURNAL=Frontiers in Genetics VOLUME=12 YEAR=2022 URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2021.747431 DOI=10.3389/fgene.2021.747431 ISSN=1664-8021 ABSTRACT=

Copy number variants (CNVs), a class of structural variant, can be important in relating genomic variation to phenotype. The primary aims of this study were to discover the common CNV regions (CNVRs) in the dual-purpose XinJiang Brown cattle population and to detect differences between CNVs inferred using the ARS-UCD 1.2 (ARS) or the UMD 3.1 (UMD) genome assemblies, based on the 150K single nucleotide polymorphism (SNP) chip. PennCNV and CNVPartition were applied to detect CNV status from deviations in the standardized signal intensity of SNP markers. Following the discovery of CNVs, we used the R package HandyCNV to generate and visualize CNVRs, compare CNVs and CNVRs between genome assemblies, and identify consensus genes using annotation resources. Using PennCNV and CNVPartition, we identified 38 consensus CNVRs with the ARS assembly, covering 1.95% of the genome, and 33 consensus CNVRs with the UMD assembly, covering 1.46% of the genome. We identified 37 genes that intersected 13 common CNVs (>5% frequency); these included functionally interesting genes such as GBP4, for which an increased copy number has been negatively associated with cattle stature, and the BoLA gene family, which has been linked to the immune response and adaptation of cattle. The ARS map file of the GGP Bovine 150K Bead Chip maps the genomic position of more SNPs, and with greater accuracy, than the UMD map file. Comparison of the CNVRs identified between the two reference assemblies suggests that the newly released ARS reference assembly is better for CNV detection. Nevertheless, different CNV detection methods can complement each other, generating more CNVRs than a single approach and highlighting more genes of interest.
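
The step from individual CNV calls to CNVRs, genome coverage, and a frequency filter can be illustrated with a minimal Python sketch. This is not the HandyCNV implementation (HandyCNV is an R package, and its internals are not described here). The sketch assumes that a CNVR is the union of overlapping CNV calls on the same chromosome, that frequency is the fraction of samples carrying a call in the region, and that coordinates are 1-based and inclusive. The function names, the tuple layout, and the toy data are all hypothetical.

```python
from collections import defaultdict


def merge_cnvs_into_cnvrs(cnvs):
    """Merge overlapping CNV calls into CNV regions (CNVRs).

    cnvs: iterable of (sample_id, chrom, start, end) tuples (assumed layout).
    Returns a list of dicts with chrom, start, end, and the set of samples.
    """
    by_chrom = defaultdict(list)
    for sample, chrom, start, end in cnvs:
        by_chrom[chrom].append((start, end, sample))

    cnvrs = []
    for chrom in sorted(by_chrom):
        current = None
        # Sorting by start lets one left-to-right pass merge every overlap.
        for start, end, sample in sorted(by_chrom[chrom]):
            if current is not None and start <= current["end"]:
                current["end"] = max(current["end"], end)
                current["samples"].add(sample)
            else:
                current = {"chrom": chrom, "start": start, "end": end,
                           "samples": {sample}}
                cnvrs.append(current)
    return cnvrs


def genome_coverage(cnvrs, genome_length):
    """Percentage of the genome covered by CNVRs (inclusive coordinates)."""
    covered = sum(r["end"] - r["start"] + 1 for r in cnvrs)
    return 100.0 * covered / genome_length


def common_cnvrs(cnvrs, n_samples, min_freq=0.05):
    """Keep CNVRs carried by more than min_freq of the samples."""
    return [r for r in cnvrs if len(r["samples"]) / n_samples > min_freq]


# Toy data, not taken from the study.
calls = [
    ("s1", "1", 100, 200),
    ("s2", "1", 150, 300),
    ("s3", "1", 500, 600),
    ("s1", "2", 10, 20),
]
regions = merge_cnvs_into_cnvrs(calls)
coverage = genome_coverage(regions, genome_length=1000)
frequent = common_cnvrs(regions, n_samples=3, min_freq=0.5)
```

In the toy data, the two overlapping calls on chromosome 1 collapse into a single region carried by two of the three samples, so it is the only region that passes the 0.5 threshold used here for illustration; the study itself uses a 5% threshold.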