AUTHOR=Kumar Ashwath , Hu Michael Y. , Mei Yajun , Fan Yuhong TITLE=CSSQ: a ChIP-seq signal quantifier pipeline JOURNAL=Frontiers in Cell and Developmental Biology VOLUME=11 YEAR=2023 URL=https://www.frontiersin.org/journals/cell-and-developmental-biology/articles/10.3389/fcell.2023.1167111 DOI=10.3389/fcell.2023.1167111 ISSN=2296-634X ABSTRACT=

Chromatin immunoprecipitation followed by sequencing (ChIP-seq) has revolutionized the studies of epigenomes and the massive increase in ChIP-seq datasets calls for robust and user-friendly computational tools for quantitative ChIP-seq. Quantitative ChIP-seq comparisons have been challenging due to noisiness and variations inherent to ChIP-seq and epigenomes. By employing innovative statistical approaches specially catered to ChIP-seq data distribution and sophisticated simulations along with extensive benchmarking studies, we developed and validated CSSQ as a nimble statistical analysis pipeline capable of differential binding analysis across ChIP-seq datasets with high confidence and sensitivity and low false discovery rate with any defined regions. CSSQ models ChIP-seq data as a finite mixture of Gaussians faithfully that reflects ChIP-seq data distribution. By a combination of Anscombe transformation, k-means clustering, estimated maximum normalization, CSSQ minimizes noise and bias from experimental variations. Further, CSSQ utilizes a non-parametric approach and incorporates comparisons under the null hypothesis by unaudited column permutation to perform robust statistical tests to account for fewer replicates of ChIP-seq datasets. In sum, we present CSSQ as a powerful statistical computational pipeline tailored for ChIP-seq data quantitation and a timely addition to the tool kits of differential binding analysis to decipher epigenomes.