DATA REPORT article

Front. Genet., 26 March 2021
Sec. Livestock Genomics
This article is part of the Research Topic Functional Annotation of Farm Animal Genomes

“Adopt-a-Tissue” Initiative Advances Efforts to Identify Tissue-Specific Histone Marks in the Mare

  • 1Veterinary Genetics Laboratory, School of Veterinary Medicine, University of California, Davis, Davis, CA, United States
  • 2Department of Population Health and Reproduction, School of Veterinary Medicine, University of California, Davis, Davis, CA, United States
  • 3Faculty of Science, School of Life and Environmental Science, University of Sydney, Camperdown, NSW, Australia
  • 4Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Uppsala, Sweden
  • 5Livestock Genetics, Department of Biosystems, KU Leuven, Leuven, Belgium
  • 6Centre d'Anthropobiologie et Génomique de Toulouse (CAGT), Faculté de Médecine Purpan, Université Toulouse III-Paul Sabatier, Toulouse, France
  • 7Maxwell H. Gluck Equine Research Center, Department of Veterinary Science, University of Kentucky, Lexington, KY, United States
  • 8Department of Animal Sciences, University of Florida, Gainesville, FL, United States
  • 9Department of Veterinary Population Medicine, College of Veterinary Medicine, University of Minnesota, St. Paul, MN, United States
  • 10Department of Animal Science, University of Nebraska-Lincoln, Lincoln, NE, United States

Introduction

The equine genetics and genomics research community has a long history of synergistic collaborations for developing tools and resources to advance equine biology. Starting in 1995 with the first International Equine Gene Mapping Workshop supported by the Dorothy Russell Havemeyer Foundation Inc. (Bailey, 2010), researchers collaborated to build comprehensive equine linkage maps (Guérin et al., 1999, 2003; Penedo et al., 2005; Swinburne et al., 2006), radiation hybrid and comparative maps (Caetano et al., 1999; Chowdhary et al., 2002), physical marker and BAC contig maps (Raudsepp et al., 2004, 2008; Leeb et al., 2006), reference genomes for the horse (Wade et al., 2009; Kalbfleisch et al., 2018), and genotyping arrays to economically map and study traits of interest for horse owners and breeders (McCue et al., 2012; McCoy and McCue, 2014; Schaefer et al., 2017). Continuing the legacy of community-based advancements, a new collective effort began in 2015 to functionally annotate DNA elements in the horse as part of the international Functional Annotation of ANimal Genomes (FAANG) Consortium (Andersson et al., 2015; Tuggle et al., 2016; Burns et al., 2018).

Reminiscent of the ENCODE project in humans and mice (Dunham et al., 2012), the ultimate goal of the FAANG consortium is to annotate the major functional elements in the genomes of domesticated animal species (Andersson et al., 2015). In particular, four histone modifications were chosen by the consortium to characterize the genomic locations of enhancers (H3K4me1), promoters and transcription start sites (H3K4me3), open chromatin with active regulatory elements (H3K27ac), and facultative heterochromatin with inaccessible or repressed regulatory elements (H3K27me3) (Andersson et al., 2015; Giuffra and Tuggle, 2019). The initial equine FAANG efforts identified putative regulatory regions in eight prioritized tissues of interest (TOI) by performing Chromatin Immuno-Precipitation Sequencing (ChIP-Seq) for the four target histone marks (Kingsley et al., 2020). In that investigation, more than one million putative regulatory sites were characterized across the equine genome. With more than 80 tissues, cell lines, and body fluids stored in the equine biobank (Burns et al., 2018), further opportunities to expand the scope of the annotation work exist. To leverage the benefits of the biobank, a collaborative sponsorship program titled “Adopt-a-Tissue” was created to enable researchers from across the globe to select and support annotation of a tissue by the equine FAANG group. Through this effort, four additional “Adopted” tissues— spleen, metacarpal 3 (MC3), sesamoid, and full thickness skin— were assayed by histone mark ChIP-Seq to expand the tissue-specific annotation resources for the entire equine research community.

Methods

All ChIP-Seq assays were performed by Diagenode ChIP-Seq Profiling Service (Diagenode, Cat# G02010000, Liège, Belgium). Summarized experimental procedures are available in more detail at the FAANG FTP portal hosted by EBI (ftp://ftp.faang.ebi.ac.uk/faang/ftp/protocols/assays/ and ftp://ftp.faang.ebi.ac.uk/faang/ftp/protocols/experiments/). Spleen samples were processed following the assay procedures outlined in UCD_SOP_ChIP-Seq_for_Histone_Marks_20191101.pdf. Skin and both bone tissues were processed following the experimental protocols outlined in UCD_SOP_ChIP-seq_for_Histone_Marks_Skin_20201218.pdf and UCD_SOP_ChIP-seq_for_Histone_Marks_Bone_20201218.pdf, respectively. “Adopted” tissues, as summarized in Supplementary Table 1, were collected from two Thoroughbred mares (denoted as ECA_UCD_AH1 for SAMEA104728862 and ECA_UCD_AH2 for SAMEA104728877) as part of the FAANG equine biobank (Burns et al., 2018) following protocols approved by the University of California, Davis Institutional Animal Care and Use Committee (Protocol #19037).

Chromatin was isolated from the two bone tissues using the TrueMicro ChIP-Seq kit (Diagenode Cat# C01010140) and from spleen and skin using the iDeal ChIP-Seq kit for Histones (Diagenode Cat# C01010059). Starting amounts for each replicate varied by tissue with ~100 mg for spleen, 375–770 mg for MC3, 445–650 mg for sesamoid, and ~125 mg for skin. After homogenization, fixed samples were sheared with the Bioruptor® Pico (Diagenode Cat# B01060001) for 12 (spleen), 10–12 (MC3 and sesamoid), and 8 (skin) cycles of 30 s on and 30 s off. The amount of chromatin yield and thus chromatin per IP varied by tissue. Spleen and skin had the greatest amounts (1.5 μg and 600 ng, respectively) per IP and MC3 and sesamoid had the least (350 ng each). The following antibody concentrations were used for MC3, sesamoid, and skin: 0.5 μg for H3K4me1, 0.5 μg for H3K4me3, 1 μg for H3K27ac, and 1 μg for H3K27me3. To account for the greater amount of chromatin from spleen, twice the amount of antibody was used for each mark compared to the other three tissues. For all tissues, 10% of the total chromatin from each replicate was saved for the input.

Libraries were prepared with the IP-Star® Compact Automated System (Diagenode Cat# B03000002) using the MicroPlex Library Preparation Kit v2 (Diagenode Cat# C05010013). Spleen, MC3, and sesamoid were sequenced as 50 base pair single-end (SE) reads on the HiSeq 4000 platform (Illumina, San Diego, CA, USA). For these tissues, the broad mark (H3K27me3) was sequenced to a minimum of 50 M raw reads while the remaining marks (H3K4me1, H3K4me3, and H3K27ac) and the input were sequenced to a minimum depth of 30 M raw reads. Due to advancements in sequencing technology, skin tissue was sequenced as 50 base pair paired-end (PE) reads on the NovaSeq 6000 (Illumina, San Diego, CA, USA). For skin, the broad mark (H3K27me3) was sequenced to a minimum of 100 M raw fragments while the remaining marks (H3K4me1, H3K4me3, and H3K27ac) and the input were sequenced to a minimum depth of 40 M raw fragments.

Methods for analyzing SE reads followed the procedures described previously (Kingsley et al., 2020), with modifications to accommodate the PE data generated from skin. After trimming with Trim-Galore version 0.4.0 (Martin, 2011; Andrews et al., 2012), reads were aligned to EquCab3.0 (Kalbfleisch et al., 2018) with BWA-MEM version 0.7.9a (Li and Durbin, 2009). Alignments in BAM format were filtered using SAMtools version 1.9 (Li et al., 2009). Reads were removed if they did not map, had secondary alignments (including split hits), failed platform/vendor quality tests, were identified as optical duplicates, or had an alignment quality score <30. PE reads were also removed if the mates did not map. PCR duplicates were marked with PicardTools version 2.7.1 (Picard toolkit, 2019) and removed with SAMtools. For peak-calling, MACS2 version 2.1.1.20160309 (Zhang et al., 2008) was used to call peaks for all marks with PE data denoted by a PE flag (-f BAMPE). SICERpy version 0.1.1 was also used to call peaks for H3K27me3 as it specializes in broad peak calling (SICERpy, GitHub Repository; Zang et al., 2009). To use SICERpy with the PE data, the second read in each pair was removed and data were processed as SE based on recommendations from the software developers. Peak-calls were combined by identifying overlapping regions of enrichment in both biological replicates where at least one replicate was significantly enriched for a given mark. Heatmaps and quality metrics were generated using deepTools 2.4.2 (Ramírez et al., 2016), SPP 1.13 (Kharchenko et al., 2008), and custom scripts. Detailed bioinformatic workflows are available at ftp://ftp.faang.ebi.ac.uk/faang/ftp/protocols/analysis/.
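The read-level filtering criteria described above can be expressed as a SAM flag mask plus a mapping-quality cutoff. The following is a minimal sketch, not the authors' script: the flag bits come from the SAM specification, while the constant and function names (EXCLUDE_SE, EXCLUDE_PE, keep_read) are hypothetical and chosen only for illustration. The exact commands used in the study are in the workflows at the FAANG FTP site and may differ.

```python
# SAM flag bits as defined in the SAM specification.
UNMAPPED = 0x4         # read did not map
MATE_UNMAPPED = 0x8    # mate did not map (paired-end only)
SECONDARY = 0x100      # secondary alignment
QC_FAIL = 0x200        # failed platform/vendor quality checks
DUPLICATE = 0x400      # PCR or optical duplicate
SUPPLEMENTARY = 0x800  # supplementary alignment (split hit)

MIN_MAPQ = 30  # reads with an alignment quality score below 30 are removed

# Hypothetical names: masks of flag bits that disqualify a read.
EXCLUDE_SE = UNMAPPED | SECONDARY | QC_FAIL | DUPLICATE | SUPPLEMENTARY
EXCLUDE_PE = EXCLUDE_SE | MATE_UNMAPPED


def keep_read(flag: int, mapq: int, paired: bool = False) -> bool:
    """Return True if an alignment passes the filters described in the text."""
    mask = EXCLUDE_PE if paired else EXCLUDE_SE
    return (flag & mask) == 0 and mapq >= MIN_MAPQ


if __name__ == "__main__":
    # The integer masks are what one would pass to a flag-based filter.
    print("single-end exclude mask:", EXCLUDE_SE)
    print("paired-end exclude mask:", EXCLUDE_PE)
```

Under these assumptions, the same filter could be applied on the command line with the `-F` (exclude flags) and `-q` (minimum mapping quality) options of `samtools view`, using the printed mask as the `-F` value.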

Quality Assessment

Library Complexity

Data were assessed for library complexity with metrics established by ENCODE and endorsed by FAANG, including nonredundant fraction (NRF), PCR bottleneck coefficient 1 (PBC1), and PCR bottleneck coefficient 2 (PBC2) (Landt et al., 2012; Kingsley et al., 2020). All of the libraries prepared surpassed the quality threshold for the PBC2 metric (PBC2 > 1); however, several marks and tissues fell below the quality threshold for NRF and PBC1 (Table 1). For example, three of the four marks for spleen passed all library complexity measures while the H3K27me3 data from both biological replicates failed NRF and PBC1. Additionally, both replicates for sesamoid and MC3 passed all three metrics for H3K4me1 and H3K27me3 but fell below threshold for H3K4me3 and H3K27ac. All skin libraries passed NRF and PBC1 thresholds with three exceptions: both replicates for H3K4me3 and the ECA_UCD_AH2 replicate for H3K4me1.
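For readers unfamiliar with these metrics, the sketch below computes NRF, PBC1, and PBC2 from a list of read alignment positions using the standard ENCODE definitions: NRF is the number of distinct positions divided by the total number of reads, PBC1 is the number of positions with exactly one read divided by the number of distinct positions, and PBC2 is the number of positions with exactly one read divided by the number with exactly two. The function name and the input representation (one hashable key per read, such as a chromosome, position, and strand tuple) are assumptions made for illustration and are not taken from the study's scripts.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def library_complexity(positions: Iterable[Hashable]) -> Dict[str, float]:
    """Compute NRF, PBC1, and PBC2 from one position key per mapped read."""
    counts = Counter(positions)
    total_reads = sum(counts.values())
    if total_reads == 0:
        raise ValueError("no reads supplied")

    distinct = len(counts)                                # positions with >= 1 read
    m1 = sum(1 for c in counts.values() if c == 1)        # positions with exactly 1 read
    m2 = sum(1 for c in counts.values() if c == 2)        # positions with exactly 2 reads

    return {
        "NRF": distinct / total_reads,
        "PBC1": m1 / distinct,
        # PBC2 is undefined when no position has exactly two reads;
        # infinity is used here as a placeholder for that case.
        "PBC2": m1 / m2 if m2 else float("inf"),
    }


if __name__ == "__main__":
    # Toy example: six reads over four distinct positions.
    reads = (
        [("chr1", 100, "+")] * 2
        + [("chr1", 200, "+"), ("chr1", 300, "-")]
        + [("chr2", 50, "+")] * 2
    )
    print(library_complexity(reads))
```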

Table 1. Quality metrics and peak-calling summary for each biological replicate.

In addition to quality metrics, sequencing data were evaluated at several processing stages of the analysis including alignment and PCR deduplication. All datasets generated high mapping quality scores (>35) and exceeded the minimum sequencing targets as described in the methods (Supplementary Table 2). Skin and spleen tissues retained a high number of reads for H3K4me1, H3K4me3, and H3K27ac after alignment, filtering, and deduplication (>20 M reads per replicate). Although all three activating marks were sequenced to the same target for both bone tissues, H3K4me1 retained more than 20 M reads per replicate while H3K4me3 and H3K27ac fell below 20 M processed reads per replicate with the majority of reads removed by deduplication. More than 40 M reads remained for each H3K27me3 replicate after processing with the exception of ECA_UCD_AH2 for sesamoid.

IP Enrichment

Data were also evaluated for IP enrichment using a variety of metrics to determine signal quality. Using normalized strand cross correlation (NSC) and relative strand cross correlation (RSC) assessments established by ENCODE (Landt et al., 2012), all marks for skin tissue exceeded the minimum quality threshold (Table 1). Additionally, the biological replicates for H3K4me3 and H3K27ac from spleen and MC3, as well as the H3K4me3 replicates for sesamoid, passed both cross-correlation measures. Similar to the library complexity metrics, several tissues fell below the quality thresholds (NSC > 1.05 and RSC > 0.8) including H3K4me1 from sesamoid and MC3; H3K27ac from ECA_UCD_AH2 sesamoid; and H3K27me3 from spleen, sesamoid, and MC3. Alignments were also assessed using the Jensen Shannon distance (JSD) to compare the distribution of reads with that of the background (input). Using JSD, H3K27me3 from both spleen replicates had values below 0.05, which is indicative of insufficient IP enrichment.
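The two cross-correlation metrics can be illustrated with a short sketch. It assumes a strand cross-correlation profile has already been computed (for example by SPP) and is supplied as a mapping from strand shift to correlation value. NSC is taken as the correlation at the estimated fragment length divided by the minimum correlation, and RSC as the fragment-length correlation minus the minimum, divided by the read-length ("phantom peak") correlation minus the minimum, following Landt et al. (2012). The function names and the toy profile are hypothetical; the thresholds are those quoted in the text.

```python
from typing import Dict, Tuple


def strand_cc_metrics(
    cc_by_shift: Dict[int, float], read_length: int, fragment_length: int
) -> Tuple[float, float]:
    """Return (NSC, RSC) from a precomputed strand cross-correlation profile.

    cc_by_shift maps a strand shift (bp) to its cross-correlation value and
    must contain entries for both read_length and fragment_length.
    """
    cc_min = min(cc_by_shift.values())        # background cross-correlation
    cc_frag = cc_by_shift[fragment_length]    # peak at the fragment length
    cc_read = cc_by_shift[read_length]        # "phantom" peak at the read length

    nsc = cc_frag / cc_min
    rsc = (cc_frag - cc_min) / (cc_read - cc_min)
    return nsc, rsc


def passes_thresholds(nsc: float, rsc: float) -> bool:
    """Apply the thresholds quoted in the text (NSC > 1.05 and RSC > 0.8)."""
    return nsc > 1.05 and rsc > 0.8


if __name__ == "__main__":
    # Toy profile with made-up values, for illustration only.
    profile = {0: 0.25, 50: 0.375, 200: 0.5, 400: 0.25}
    nsc, rsc = strand_cc_metrics(profile, read_length=50, fragment_length=200)
    print(nsc, rsc, passes_thresholds(nsc, rsc))
```

In practice the fragment length is itself estimated from the profile; it is passed in explicitly here to keep the sketch short.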

The final measure of IP enrichment evaluated the fraction of reads in peaks (FRiP) by comparing the peak calls with the read distribution for each sample. All tissues produced a high proportion of aligned reads within peaks for H3K4me3, ranging from 0.21 for sesamoid to 0.69 for skin. Similarly, MC3, skin, and spleen generated high FRiP scores for H3K27ac (0.19–0.47), and peaks from skin and spleen also scored well for H3K4me1 (0.29–0.47). Although lower than the values from skin and spleen, FRiP scores from MC3 indicated sufficient enrichment was obtained for H3K4me1 (0.07–0.09). For sesamoid tissue, the ECA_UCD_AH1 replicate generated peaks with comparable enrichment for H3K4me1, H3K27ac, and H3K27me3, while the ECA_UCD_AH2 replicate scored below threshold for both H3K4me1 and H3K27me3 (0.0005 and 0.0043, respectively). Further, H3K27me3 peaks from skin generated a substantially higher fraction of reads compared with MC3 and spleen (0.21–0.24 vs. 0.05–0.10), although all three of these tissues obtained sufficient enrichment based on this assessment.
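A FRiP calculation can be sketched in a few lines. The version below is a simplification made for illustration: each read is reduced to a single coordinate (for instance its 5' end), peaks are half-open BED-style intervals, and a read counts as "in a peak" when that coordinate falls inside any peak. The function names are hypothetical, and the study's own scripts may count overlaps differently.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def merge_intervals(intervals: Iterable[Tuple[int, int]]) -> List[List[int]]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[List[int]] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def frip(
    read_positions: Iterable[Tuple[str, int]],
    peaks: Iterable[Tuple[str, int, int]],
) -> float:
    """Fraction of reads whose position lies inside a peak [start, end)."""
    by_chrom: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))

    # Per-chromosome sorted starts and ends for binary search.
    index = {}
    for chrom, intervals in by_chrom.items():
        merged = merge_intervals(intervals)
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])

    total = in_peaks = 0
    for chrom, pos in read_positions:
        total += 1
        if chrom not in index:
            continue
        starts, ends = index[chrom]
        i = bisect_right(starts, pos) - 1  # last peak starting at or before pos
        if i >= 0 and pos < ends[i]:
            in_peaks += 1

    if total == 0:
        raise ValueError("no reads supplied")
    return in_peaks / total


if __name__ == "__main__":
    toy_peaks = [("chr1", 100, 200), ("chr1", 300, 400)]
    toy_reads = [("chr1", 150), ("chr1", 250), ("chr1", 350),
                 ("chr1", 399), ("chr1", 400), ("chr2", 150)]
    print(frip(toy_reads, toy_peaks))
```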

Replicate Comparison

In addition to quality assessments for the read alignments, peaks called from the biological replicates were compared. For most of the marks, the percentage of genome covered by peaks was consistent with previously reported values for the TOI (Table 1). For sesamoid tissue, at least one replicate for H3K4me1, H3K27ac, and H3K27me3 generated fewer peak calls than expected based on results from the other replicate and the MC3 replicates. Additionally, the initial data for H3K27me3 from both spleen replicates yielded fewer peaks in accordance with the low complexity and enrichment scores for those libraries. The Jaccard similarity coefficient identified the highest correlation between the biological replicates for H3K4me3 across all “adopted” tissues, ranging from 0.65 to 0.84 (Table 2), and data from skin also showed high correlation for all marks (0.44–0.84). Replicates for spleen and MC3 had moderate levels of similarity for H3K4me1 and H3K27ac (0.32–0.58), while the biological replicates for H3K4me1 and H3K27me3 from sesamoid had no identity detected, consistent with the low-scoring quality assessments.
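To make the replicate comparison concrete, the sketch below computes a Jaccard similarity coefficient between two sets of peak calls. The text does not state which tool or definition was used, so this example assumes a base-pair-level definition: the number of bases covered by both replicates divided by the number of bases covered by either. The function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Peak = Tuple[str, int, int]  # (chromosome, start, end), half-open


def _merged_by_chrom(peaks: Iterable[Peak]) -> Dict[str, List[List[int]]]:
    """Group peaks by chromosome and merge overlapping intervals."""
    grouped: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for chrom, start, end in peaks:
        grouped[chrom].append((start, end))
    merged: Dict[str, List[List[int]]] = {}
    for chrom, intervals in grouped.items():
        out: List[List[int]] = []
        for start, end in sorted(intervals):
            if out and start <= out[-1][1]:
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = out
    return merged


def _overlap_bp(a: List[List[int]], b: List[List[int]]) -> int:
    """Bases shared by two sorted, non-overlapping interval lists."""
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if hi > lo:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def jaccard(peaks_a: Iterable[Peak], peaks_b: Iterable[Peak]) -> float:
    """Base-pair Jaccard coefficient: intersection bp / union bp."""
    a = _merged_by_chrom(peaks_a)
    b = _merged_by_chrom(peaks_b)
    len_a = sum(e - s for ivs in a.values() for s, e in ivs)
    len_b = sum(e - s for ivs in b.values() for s, e in ivs)
    inter = sum(_overlap_bp(a[c], b[c]) for c in a.keys() & b.keys())
    union = len_a + len_b - inter
    return inter / union if union else 0.0


if __name__ == "__main__":
    rep1 = [("chr1", 0, 100), ("chr1", 200, 300)]
    rep2 = [("chr1", 50, 250)]
    print(jaccard(rep1, rep2))
```

A value of 0 under this definition corresponds to replicates with no shared bases in their peak calls, and a value of 1 to identical peak sets.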

Table 2. Summary of the combined peak calls and replicate comparison.

Additional Data Collection

Due to insufficient enrichment and replicate identity, IP and sequencing were repeated for H3K27me3 from both spleen replicates. Unfortunately, the repeated ECA_UCD_AH1 data had low library complexity and IP enrichment (Table 1 and Supplementary Table 2). To achieve sufficient data for accurate peak calling from spleen tissue, the first round of IP and sequencing from ECA_UCD_AH1 for H3K27me3 and both rounds from ECA_UCD_AH2 were used for combined peak calling. Reads from the two input files for ECA_UCD_AH2 were also merged. The number of combined peaks increased from 4,955 covering 1.98% from the first round of sequencing to 5,267 covering 2.18% of the genome when data were merged (Table 2). Similar issues with enrichment prevented sufficient signal for peak calling in sesamoid for three of the four marks, and therefore, a second round of IP and quality evaluation of ECA_UCD_AH2 sesamoid is underway for H3K4me1, H3K27ac, and H3K27me3.

Data Metrics

After combining replicates, the number of retained peaks for each mark from the SE data ranged from 4,933 to 73,528 for spleen and from 5,628 to 46,511 for MC3 (Table 2). For both tissues, H3K4me1— the mark indicative of enhancers— was found to have the highest number of peaks while the repressive mark was found to have the lowest. This pattern is also consistent with the TOI data (Kingsley et al., 2020). For PE skin data, the number of combined peaks varied from 24,353 to 92,971 regions, and H3K4me3, which denotes promoters, was the mark with the lowest number of peaks. Additionally, the amount of the genome covered by H3K27me3 peaks was substantially higher for skin compared to the other equine FAANG tissues analyzed to date (6.28 vs. 2.94%), while the number of reads retained for H3K27me3 from the PE data after filtering (42.8%) was comparable to the average retained for all of the equine H3K27me3 SE data (41.3%, PRJEB42315 and PRJEB35307).

Evaluating general enrichment patterns revealed that the “adopted” tissues showed distributions for the activating marks that were consistent with those identified previously for the TOI (Supplementary Figures 1–3). Data for H3K27me3 from skin, however, generated strong enrichment around the TSS and upstream of an average gene, while still maintaining a similar level of relative enrichment for H3K27me3 distributed throughout the rest of the gene body and downstream as seen for other tissues (Supplementary Figure 4). Evaluation of the spleen datasets detected the strongest H3K27me3 enrichment when combining the original ECA_UCD_AH1 dataset and the merged ECA_UCD_AH2 dataset (denoted as “spleen” on Supplementary Figure 4). While enrichment distributions for sesamoid tissue showed consistent patterns for H3K4me1, H3K27ac, and H3K27me3, the relative level of enrichment was lower than expected based on the other tissues. In addition to genome-wide evaluations, the replicate-combined peak calls were also manually evaluated across a small number of well-characterized regions. Consistent with expectations, activating marks were detected at the TSS and upstream of ubiquitously expressed genes such as ACTB for all tissues (Supplementary Figures 5A,B). Additionally, all “adopted” tissues lacked peaks indicative of active transcription for a liver-specific gene known as CYP2E1 (Supplementary Figures 5C,D).

Discussion

The ENCODE project profoundly impacted scientific understanding of genome function in humans by enabling researchers to explore previously impossible challenges, such as charting genomic landscape shifts during development and uncovering enhancer networks associated with disease (Nord et al., 2013; Rhie et al., 2016). The advancements made by ENCODE paved a path for the FAANG consortium to characterize genomic function in numerous agricultural species (Andersson et al., 2015; Tuggle et al., 2016; Giuffra and Tuggle, 2019), which will expand research opportunities across diverse genera. As a part of the larger consortium, the equine FAANG group established a community-based initiative to “adopt” additional tissues for annotation. As a result of that expansive collaborative effort, characterization of putative regulatory regions was performed in spleen, sesamoid, MC3, and skin. The four additional tissues are of major importance for equine health and traits of economic impact. Specifically, research on catastrophic fracture involving sesamoid and MC3 can benefit from bone-specific annotations as recent advances in treatment have focused on transgenically modified stem cell therapeutics (Ball et al., 2019). Similarly, many diseases and traits under artificial selection in horses, such as melanoma, insect bite hypersensitivity, and coat colors including Appaloosa spotting among others, involve skin tissue (Rieder et al., 2000, 2001; Bellone et al., 2008, 2013; Rosengren Pielberg et al., 2008; Curik et al., 2013; Lanz et al., 2017). Several of these characterized phenotypes have been associated with mutations affecting gene expression (Rieder et al., 2000; Rosengren Pielberg et al., 2008; Bellone et al., 2013), making regulatory regions identified from whole skin a valuable resource for equine researchers. The “Adopt-a-Tissue” effort fits into a broader legacy of collaborative resource development that has historically led to rapid advancements for equine genomics and will continue to push equine science toward new frontiers. In concordance with past community efforts, the high quality data generated from the “Adopted” tissues are publicly available to benefit all investigators and lead to further progress in equine research.

Using quality metrics first standardized by ENCODE (Dunham et al., 2012), we identified low IP enrichment for the broad mark in spleen, sesamoid, and MC3 tissues. Unlike the SE datasets, the skin replicates sequenced with PE reads generated a higher enrichment signal for H3K27me3 as determined by quality metrics and enrichment topology plots. In particular, enrichment near the TSS was more strongly detected for skin than for any of the TOI or the other “adopted” tissues, suggesting that PE reads may better evaluate the broad repressive mark than SE datasets. With only one tissue evaluated as PE, we cannot exclude the possibility that this enrichment pattern may be skin-specific rather than evidence of a better method for detecting H3K27me3. Although enrichment difficulties have been previously recognized for broad domains like those of H3K27me3 (Landt et al., 2012; Carelli et al., 2017), investigations of specific ChIP methods for broad histone marks appear to be rare. O'Geen et al. (2011) used both short and long sonication periods to account for the different rates of shearing efficiency for compact versus open chromatin. They found that the larger DNA fragments after sonication were more enriched for broad repressive histone marks while smaller fragments were more likely to contain active chromatin modifications (O'Geen et al., 2011). Their work suggests that shorter sonication times and stringent size selection may bias ChIP samples toward higher enrichment of regions containing narrow marks at the expense of more condensed areas with broad marks, yet current ChIP-Seq standards do not encourage separate protocols for the different mark topologies (Landt et al., 2012; ENCODE Guidelines for Experiments Generating ChIP-seq Data, 2017). Instead, advances in ChIP-Seq methods have focused on analysis and software development to accommodate the different enrichment levels expected from broad and narrow domains assayed with the same protocol (Zhang et al., 2008; Zang et al., 2009). Future investigations involving H3K27me3 and other broad histone modifications may benefit from developing bench protocols, including sequencing parameters, that are specific for broad marks.

To account for insufficient H3K27me3 signal from spleen tissue, IP and sequencing were repeated for both biological replicates. By combining the reads from both sets of data for ECA_UCD_AH2, we were able to obtain sufficient enrichment for peak identification. These data support that combining results from different IPs performed on the same tissue sample can be a useful approach to obtain the enrichment needed for annotation purposes. Study of the best means for combining information from biological and technical replicates for differential enrichment analyses suggests that combining ChIP datasets without accounting for enrichment levels may lead to more false negatives (Bao et al., 2013). Although our data may not have captured all possible peaks, combining data enabled detection of more H3K27me3 peak calls with higher consistency than possible with the first dataset alone. Therefore, the current peak calls can serve as the starting point for spleen-specific annotations, which can be improved upon with characterization of heterochromatin regions from additional equine spleen samples.

The low quality metrics for three of the four marks from ECA_UCD_AH1 sesamoid tissue indicated there was low IP enrichment. To the best of the authors' knowledge, the MC3 and sesamoid data generated here represent the first histone mark peak calls from healthy, whole bone tissue. The overall lower quality metrics for bone tissues reflect the difficulty of working with these tissues; however, one of the two replicates for sesamoid showed sufficient quality for all four marks, suggesting the issue may be sample specific. To determine if any issues arose during chromatin extraction or IP, further evaluation of H3K4me1, H3K27ac, and H3K27me3 marks in sesamoid tissue from ECA_UCD_AH1 is warranted. Additional data generated from ECA_UCD_AH1 sesamoid tissue will be added to PRJEB42315 when available.

Previous equine annotations were developed based on homology and transcriptomics, leaving much of the genome, especially noncoding regions, uncharacterized (Hestand et al., 2015; Aken et al., 2016; Mansour et al., 2017). While valuable, annotation of regulatory regions based solely on homology with other species is not expected to be sufficient given the evolutionary role of these elements within and among species (Schmidt et al., 2010; McLean et al., 2011; Shibata et al., 2012; Lowdon et al., 2016). With the first publication of the equine FAANG data from eight prioritized tissues (Kingsley et al., 2020) and the four “adopted” tissues presented in this manuscript, researchers can begin to interrogate the role of regulatory regions in equine traits, such as the recent investigation of a novel 16 kb deletion associated with an ocular disorder known as distichiasis (Hisey et al., 2020). Future annotations for the horse will include maps of regulatory states characteristic of healthy tissue, making them a vital resource to compare against disease states. The histone ChIP-Seq data from the horse have already been integrated into a usable annotation resource by a new project known as FAANGMine (FAANGMine). Similar to FlyMine (Lyne et al., 2007), the project aims to combine the results from all of the genomic assays used by the FAANG consortium into a single resource for easier use. Thanks to these integration efforts, additional equine FAANG datasets including the “adopted” tissue peak calls will open up opportunities for variant investigations in previously uncharacterized noncoding regions and expand research opportunities in equine omics.

Data Availability Statement

Data were submitted to the European Nucleotide Archive following the best practices established by the FAANG Metadata and Data Sharing Committee and the FAANG Data Coordination Centre (Harrison et al., 2018). All of the new data referenced in the article were submitted under project ID PRJEB42315. The following file types were submitted for all high quality data: raw fastq for each mark and input from both replicates (38 files), processed BAM files for each mark and input from both replicates (34 files), bed files with peak calls per replicate including both SICERpy and MACS2 calls for H3K27me3 (32 files), and bed files with combined peak calls including those from both SICERpy and MACS2 for H3K27me3 (16 files). All files and metadata can be accessed from the FAANG Data Portal (https://data.faang.org/home). Previously published FAANG data used in the comparisons are also available from the FAANG data portal under project ID PRJEB35307 (Kingsley et al., 2020).

Ethics Statement

The animal study was reviewed and approved by University of California, Davis Institutional Animal Care and Use Committee.

Author Contributions

CF, RB, JP, TK, JM, NH, GL, LO, SB, MM, and EB: conceptualization. NK, RB, CF, and JP: methodology. NK: formal analysis, investigation, data curation, and visualization. CF, RB, and JP: resources. NK and RB: writing—original draft preparation. CF, JP, JM, TK, NH, GL, LO, SB, MM, and EB: writing—review and editing. RB: supervision. CF, RB, and JP: project administration. CF, RB, JP, TK, JM, NH, GL, LO, EB, SB, and MM: funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

Funding for experimental materials and other resources was provided by the Grayson Jockey Club Foundation, United States Department of Agriculture (USDA) NRSP-8 equine species coordinator funds, a Priority Partnership Collaboration Award from the University of Sydney and University of California, Davis, and the Center for Equine Health (CEH) at UC Davis with funds provided by the State of California pari-mutuel fund and contributions by private donors. Support for NK was provided by the Grayson Jockey Club Foundation, USDA (2018-06530), Morris Animal Foundation (D16-EQ-028), and a CEH fellowship. Support for CF was provided by the National Institutes of Health (L40 TR001136). Publication fees were supplied by the UC Davis Open Access Publication Fund.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

The authors would like to acknowledge Drs. Catherine Creppe, Paola Genevini, and the other members of the Diagenode Epigenomics Services Team for their high-quality histone mark profiling service. The authors would also like to thank Drs. Colin Kern, Huaijun Zhou, Pablo Ross, Ying Wang, and the other members of the UC Davis FAANG group for providing technical expertise and feedback. Additionally, the authors would like to recognize the contributions from horse owners that made this work possible.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2021.649959/full#supplementary-material

References

Aken, B. L., Ayling, S., Barrell, D., Clarke, L., Curwen, V., Fairley, S., et al. (2016). The Ensembl gene annotation system. Database. 2016, 1–19. doi: 10.1093/database/baw093

Andersson, L., Archibald, A. L., Bottema, C. D., Brauning, R., Burgess, S. C., Burt, D. W., et al. (2015). Coordinated international action to accelerate genome-to-phenome with FAANG, the Functional Annotation of Animal Genomes project. Genome Biol. 16, 4–9. doi: 10.1186/s13059-015-0622-4

Andrews, S., Krueger, F., Segonds-Pichon, A., Biggins, L., Krueger, C., and Wingett, S. (2012). Trim Galore. Available online at: https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/ (accessed May 6, 2020).

Bailey, E. (2010). Horse genomics and the Dorothy Russell Havemeyer Foundation. Anim. Genet. 41:1. doi: 10.1111/j.1365-2052.2010.02136.x

Ball, A. N., Phillips, J. N., McIlwraith, C. W., Kawcak, C. E., Samulski, R. J., and Goodrich, L. R. (2019). Genetic modification of scAAV-equine-BMP-2 transduced bone-marrow-derived mesenchymal stem cells before and after cryopreservation: An “off-the-shelf” option for fracture repair. J. Orthop. Res. 37, 1310–1317. doi: 10.1002/jor.24209

Bao, Y., Vinciotti, V., Wit, E., and 't Hoen, P. A. C. (2013). Accounting for immunoprecipitation efficiencies in the statistical analysis of ChIP-seq data. BMC Bioinformatics 14:169. doi: 10.1186/1471-2105-14-169

Bellone, R. R., Brooks, S. A., Sandmeyer, L., Murphy, B. A., Forsyth, G., Archer, S., et al. (2008). Differential gene expression of TRPM1, the potential cause of congenital stationary night blindness and coat spotting patterns (LP) in the appaloosa horse (Equus caballus). Genetics 179, 1861–1870. doi: 10.1534/genetics.108.088807

Bellone, R. R., Holl, H., Setaluri, V., Devi, S., Maddodi, N., Archer, S., et al. (2013). Evidence for a retroviral insertion in TRPM1 as the cause of congenital stationary night blindness and leopard complex spotting in the horse. PLoS ONE 8, 1–14. doi: 10.1371/journal.pone.0078280

Burns, E. N., Bordbari, M. H., Mienaltowski, M. J., Affolter, V. K., Barro, M. V., Gianino, F., et al. (2018). Generation of an equine biobank to be used for functional annotation of animal genomes project. Anim. Genet. 49, 564–570. doi: 10.1111/age.12717

Caetano, A. R., Shiue, Y. L., Lyons, L. A., O'Brien, S. J., Laughlin, T. F., Bowling, A. T., et al. (1999). A comparative gene map of the horse (Equus caballus). Genome Res. 9, 1239–1249. doi: 10.1101/gr.9.12.1239

Carelli, F. N., Sharma, G., and Ahringer, J. (2017). Broad chromatin domains: an important facet of genome regulation. BioEssays 39, 1–7. doi: 10.1002/bies.201700124

Chowdhary, B. P., Raudsepp, T., Honeycutt, D., Owens, E. K., Piumi, F., Guérin, G., et al. (2002). Construction of a 5000rad whole-genome radiation hybrid panel in the horse and generation of a comprehensive and comparative map for ECA11. Mamm. Genome 13, 89–94. doi: 10.1007/s00335-001-2089-8

Curik, I., Druml, T., Seltenhammer, M., Sundström, E., Pielberg, G. R., Andersson, L., et al. (2013). Complex inheritance of melanoma and pigmentation of coat and skin in grey horses. PLoS Genet. 9:e1003248. doi: 10.1371/journal.pgen.1003248

Dunham, I., Kundaje, A., Aldred, S. F., Collins, P. J., Davis, C. A., Doyle, F., et al. (2012). An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74. doi: 10.1038/nature11247

ENCODE Guidelines for Experiments Generating ChIP-seq Data (2017). Available online at: https://www.encodeproject.org/about/experiment-guidelines/ (accessed June 30, 2020).

FAANGMine. Available online at: http://faangmine.org (accessed July 16, 2020).

Giuffra, E., and Tuggle, C. K. (2019). Functional annotation of animal genomes (FAANG): current achievements and roadmap. Annu. Rev. Anim. Biosci. 7, 65–88. doi: 10.1146/annurev-animal-020518-114913

Guérin, G., Bailey, E., Bernoco, D., Anderson, I., Antczak, D. F., Bell, K., et al. (1999). Report of the international equine gene mapping workshop: male linkage map. Anim. Genet. 30, 341–354. doi: 10.1046/j.1365-2052.1999.00510.x

Guérin, G., Bailey, E., Bernoco, D., Anderson, I., Antczak, D. F., Bell, K., et al. (2003). The second generation of the International Equine Gene Mapping Workshop half-sibling linkage map. Anim. Genet. 34, 161–168. doi: 10.1046/j.1365-2052.2003.00973.x

Harrison, P. W., Fan, J., Richardson, D., Clarke, L., Zerbino, D., Cochrane, G., et al. (2018). FAANG, establishing metadata standards, validation and best practices for the farmed and companion animal community. Anim. Genet. 49, 520–526. doi: 10.1111/age.12736

Hestand, M. S., Kalbfleisch, T. S., Coleman, S. J., Zeng, Z., Liu, J., Orlando, L., et al. (2015). Annotation of the protein coding regions of the equine genome. PLoS ONE 10, 1–13. doi: 10.1371/journal.pone.0124375

Hisey, E. A., Hermans, H., Lounsberry, Z. T., Avila, F., Grahn, R. A., Knickelbein, K. E., et al. (2020). Whole genome sequencing identified a 16 kilobase deletion on ECA13 associated with distichiasis in Friesian horses. BMC Genomics 21, 1–13. doi: 10.1186/s12864-020-07265-8

Kalbfleisch, T. S., Rice, E. S., DePriest, M. S., Walenz, B. P., Hestand, M. S., Vermeesch, J. R., et al. (2018). Improved reference genome for the domestic horse increases assembly contiguity and composition. Commun. Biol. 1, 1–8. doi: 10.1038/s42003-018-0199-z

Kharchenko, P. V., Tolstorukov, M. Y., and Park, P. J. (2008). Design and analysis of ChIP-seq experiments for DNA-binding proteins. Nat. Biotechnol. 26, 1351–1359. doi: 10.1038/nbt.1508

Kingsley, N. B., Kern, C., Creppe, C., Hales, E. N., Zhou, H., Kalbfleisch, T. S., et al. (2020). Functionally annotating regulatory elements in the equine genome using Histone Mark ChIP-Seq. Genes 11:3. doi: 10.3390/genes11010003

Landt, S. G., Marinov, G. K., Kundaje, A., Kheradpour, P., Pauli, F., Batzoglou, S., et al. (2012). ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res. 22, 1813–1831. doi: 10.1101/gr.136184.111

Lanz, S., Brunner, A., Graubner, C., Marti, E., and Gerber, V. (2017). Insect bite hypersensitivity in horses is associated with airway hyperreactivity. J. Vet. Intern. Med. 31, 1877–1883. doi: 10.1111/jvim.14817

Leeb, T., Vogl, C., Zhu, B., de Jong, P. J., Binns, M. M., Chowdhary, B. P., et al. (2006). A human-horse comparative map based on equine BAC end sequences. Genomics 87, 772–776. doi: 10.1016/j.ygeno.2006.03.002

Li, H., and Durbin, R. (2009). Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760. doi: 10.1093/bioinformatics/btp324

Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., et al. (2009). The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079. doi: 10.1093/bioinformatics/btp352

Lowdon, R. F., Jang, H. S., and Wang, T. (2016). Evolution of epigenetic regulation in vertebrate genomes. Trends Genet. 32, 269–283. doi: 10.1016/j.tig.2016.03.001

Lyne, R., Smith, R., Rutherford, K., Wakeling, M., Varley, A., Guillier, F., et al. (2007). FlyMine: an integrated database for Drosophila and Anopheles genomics. Genome Biol. 8:R129. doi: 10.1186/gb-2007-8-7-r129

Mansour, T. A., Scott, E. Y., Finno, C. J., Bellone, R. R., Mienaltowski, M. J., Penedo, M. C., et al. (2017). Tissue resolved, gene structure refined equine transcriptome. BMC Genomics 18, 1–12. doi: 10.1186/s12864-016-3451-2

Martin, M. (2011). Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 17, 10–12. doi: 10.14806/ej.17.1.200

McCoy, A. M., and McCue, M. E. (2014). Validation of imputation between equine genotyping arrays. Anim. Genet. 45:153. doi: 10.1111/age.12093

McCue, M. E., Bannasch, D. L., Petersen, J. L., Gurr, J., Bailey, E., Binns, M. M., et al. (2012). A high density SNP array for the domestic horse and extant Perissodactyla: utility for association mapping, genetic diversity, and phylogeny studies. PLoS Genet. 8:e1002451. doi: 10.1371/journal.pgen.1002451

McLean, C. Y., Reno, P. L., Pollen, A. A., Bassan, A. I., Capellini, T. D., Guenther, C., et al. (2011). Human-specific loss of regulatory DNA and the evolution of human-specific traits. Nature 471, 216–219. doi: 10.1038/nature09774

Nord, A. S., Blow, M. J., Attanasio, C., Akiyama, J. A., Holt, A., Hosseini, R., et al. (2013). Rapid and pervasive changes in genome-wide enhancer usage during mammalian development. Cell 155, 1521–1531. doi: 10.1016/j.cell.2013.11.033

O'Geen, H., Echipare, L., and Farnham, P. J. (2011). Using ChIP-seq technology to generate high-resolution profiles of histone modifications. Methods Mol. Biol. 791, 265–286. doi: 10.1007/978-1-61779-316-5_20

Penedo, M. C. T., Millon, L. V., Bernoco, D., Bailey, E., Binns, M., Cholewinski, G., et al. (2005). International equine gene mapping workshop report: a comprehensive linkage map constructed with data from new markers and by merging four mapping resources. Cytogenet. Genome Res. 111, 5–15. doi: 10.1159/000085664

Picard toolkit (2019). Broad Institute, GitHub Repos. Available online at: http://broadinstitute.github.io/picard/ (accessed July 8, 2020).

Ramírez, F., Ryan, D. P., Grüning, B., Bhardwaj, V., Kilpert, F., Richter, A. S., et al. (2016). deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 44, W160–W165. doi: 10.1093/nar/gkw257

Raudsepp, T., Gustafson-Seabury, A., Durkin, K., Wagner, M. L., Goh, G., Seabury, C. M., et al. (2008). A 4,103 marker integrated physical and comparative map of the horse genome. Cytogenet. Genome Res. 122, 28–36. doi: 10.1159/000151313

Raudsepp, T., Santani, A., Wallner, B., Kata, S. R., Ren, C., Zhang, H., et al. (2004). A detailed physical map of the horse Y chromosome. Proc. Natl. Acad. Sci. U. S. A. 101, 9321–9326. doi: 10.1073/pnas.0403011101

Rhie, S. K., Guo, Y., Tak, Y. G., Yao, L., Shen, H., Coetzee, G. A., et al. (2016). Identification of activated enhancers and linked transcription factors in breast, prostate, and kidney tumors by tracing enhancer networks using epigenetic traits. Epigenet. Chromatin 9, 1–17. doi: 10.1186/s13072-016-0102-4

Rieder, S., Stricker, C., Joerg, H., Dummer, R., and Stranzinger, G. (2000). A comparative genetic approach for the investigation of ageing grey horse melanoma. J. Anim. Breed. Genet. 117, 73–82. doi: 10.1111/j.1439-0388.2000x.00245.x

Rieder, S., Taourit, S., Mariat, D., Langlois, B., and Guérin, G. (2001). Mutations in the agouti (ASIP), the extension (MC1R), and the brown (TYRP1) loci and their association to coat color phenotypes in horses (Equus caballus). Mamm. Genome 12, 450–455. doi: 10.1007/s003350020017

Rosengren Pielberg, G., Golovko, A., Sundström, E., Curik, I., Lennartsson, J., Seltenhammer, M. H., et al. (2008). A cis-acting regulatory mutation causes premature hair graying and susceptibility to melanoma in the horse. Nat. Genet. 40, 1004–1009. doi: 10.1038/ng.185

Schaefer, R. J., Schubert, M., Bailey, E., Bannasch, D. L., Barrey, E., Bar-Gal, G. K., et al. (2017). Developing a 670k genotyping array to tag ~2M SNPs across 24 horse breeds. BMC Genomics 18, 1–18. doi: 10.1186/s12864-017-3943-8

Schmidt, D., Wilson, M. D., Ballester, B., Schwalie, P. C., Brown, G. D., Marshall, A., et al. (2010). Five-vertebrate ChIP-seq reveals the evolutionary dynamics of transcription factor binding. Science 328, 1036–1040. doi: 10.1126/science.1186176

Shibata, Y., Sheffield, N. C., Fedrigo, O., Babbitt, C. C., Wortham, M., Tewari, A. K., et al. (2012). Extensive evolutionary changes in regulatory element activity during human origins are associated with altered gene expression and positive selection. PLoS Genet. 8:e1002789. doi: 10.1371/journal.pgen.1002789

SICERpy. GitHub Repository. Available online at: https://github.com/dariober/SICERpy (accessed July 8, 2020).

Swinburne, J. E., Boursnell, M., Hill, G., Pettitt, L., Allen, T., Chowdhary, B., et al. (2006). Single linkage group per chromosome genetic linkage map for the horse, based on two three-generation, full-sibling, crossbred horse reference families. Genomics 87, 1–29. doi: 10.1016/j.ygeno.2005.09.001

Tuggle, C. K., Giuffra, E., White, S. N., Clarke, L., Zhou, H., Ross, P. J., et al. (2016). GO-FAANG meeting: a gathering on functional annotation of animal genomes. Anim. Genet. 47, 528–533. doi: 10.1111/age.12466

Wade, C. M., Giulotto, E., Sigurdsson, S., Zoli, M., Gnerre, S., Imsland, F., et al. (2009). Genome sequence, comparative analysis, and population genetics of the domestic horse. Science 326, 865–867. doi: 10.1126/science.1178158

Zang, C., Schones, D. E., Zeng, C., Cui, K., Zhao, K., and Peng, W. (2009). A clustering approach for identification of enriched domains from histone modification ChIP-Seq data. Bioinformatics 25, 1952–1958. doi: 10.1093/bioinformatics/btp340

Zhang, Y., Liu, T., Meyer, C. A., Eeckhoute, J., Johnson, D. S., Bernstein, B. E., et al. (2008). Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9:R137. doi: 10.1186/gb-2008-9-9-r137

Keywords: genome, annotation, epigenetics, horse, chromatin, consortium, collaboration, regulation

Citation: Kingsley NB, Hamilton NA, Lindgren G, Orlando L, Bailey E, Brooks S, McCue M, Kalbfleisch TS, MacLeod JN, Petersen JL, Finno CJ and Bellone RR (2021) “Adopt-a-Tissue” Initiative Advances Efforts to Identify Tissue-Specific Histone Marks in the Mare. Front. Genet. 12:649959. doi: 10.3389/fgene.2021.649959

Received: 06 January 2021; Accepted: 01 March 2021;
Published: 26 March 2021.

Edited by:

Hans Cheng, United States Department of Agriculture, United States

Reviewed by:

Cong-jun Li, Agricultural Research Service (USDA), United States
Herve Acloque, Institut National de Recherche pour l'agriculture, l'alimentation et l'environnement (INRAE), France

Copyright © 2021 Kingsley, Hamilton, Lindgren, Orlando, Bailey, Brooks, McCue, Kalbfleisch, MacLeod, Petersen, Finno and Bellone. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Rebecca R. Bellone, rbellone@ucdavis.edu
