Data from: Generative modeling of multi-mapping reads with mHi-C advances analysis of Hi-C studies

Main Authors: Zheng, Ye; Ay, Ferhat; Keles, Sunduz
Format: info dataset Journal
Published: 2019
Online Access: https://zenodo.org/record/4967533
Table of Contents:
  • Current Hi-C analysis approaches are unable to account for reads that align to multiple locations, and hence underestimate biological signal from repetitive regions of genomes. We developed and validated mHi-C, a multi-read mapping strategy to probabilistically allocate Hi-C multi-reads. On benchmarks, mHi-C outperformed both the use of uni-reads alone and heuristic approaches aimed at rescuing multi-reads. Specifically, mHi-C increased the sequencing depth by an average of 20%, resulting in higher reproducibility of contact matrices and detected interactions across biological replicates. The impact of the multi-reads on the detection of significant interactions is influenced marginally by the relative contribution of multi-reads to the sequencing depth compared to uni-reads, the cis-to-trans ratio of contacts, and the broad data quality as reflected by the proportion of mappable reads of datasets. Computational experiments highlighted that in Hi-C studies with short read lengths, mHi-C-rescued multi-reads can emulate the effect of longer reads. mHi-C also revealed biologically supported bona fide promoter-enhancer interactions and topologically associating domains involving repetitive genomic regions, thereby unlocking a previously masked portion of the genome for conformation capture studies.
  • p300 ChIP-seq peaks using uni- and multi-reads (p300.zip): ChIP-seq peaks detected following the standard ChIP-seq data processing pipeline of ENCODE (The ENCODE Project Consortium, 2012) using both uni-reads and multi-reads aligned by Permseq (Zeng et al., 2015) for p300.
  • p65 ChIP-seq peaks using uni- and multi-reads (p65.zip): ChIP-seq peaks detected following the standard ChIP-seq data processing pipeline of ENCODE (The ENCODE Project Consortium, 2012) using both uni-reads and multi-reads aligned by Permseq (Zeng et al., 2015) for P65.
  • PolII ChIP-seq peaks using uni- and multi-reads (PolII.zip): ChIP-seq peaks detected following the standard ChIP-seq data processing pipeline of ENCODE (The ENCODE Project Consortium, 2012) using both uni-reads and multi-reads aligned by Permseq (Zeng et al., 2015) for PolII.
  • H3K4me1 ChIP-seq peaks using uni- and multi-reads (H3K4me1.zip): ChIP-seq peaks detected following the standard ChIP-seq data processing pipeline of ENCODE (The ENCODE Project Consortium, 2012) using both uni-reads and multi-reads aligned by Permseq (Zeng et al., 2015) for H3K4me1.
  • CTCF ChIP-seq peaks using uni- and multi-reads (CTCF.zip): ChIP-seq peaks detected following the standard ChIP-seq data processing pipeline of ENCODE (The ENCODE Project Consortium, 2012) using both uni-reads and multi-reads aligned by Permseq (Zeng et al., 2015) for CTCF.
  • H3K4me3 ChIP-seq peaks using uni- and multi-reads (H3K4me3.zip): ChIP-seq peaks detected following the standard ChIP-seq data processing pipeline of ENCODE (The ENCODE Project Consortium, 2012) using both uni-reads and multi-reads aligned by Permseq (Zeng et al., 2015) for H3K4me3.
  • H3K27ac ChIP-seq peaks using uni- and multi-reads (H3K27ac.zip): ChIP-seq peaks detected following the standard ChIP-seq data processing pipeline of ENCODE (The ENCODE Project Consortium, 2012) using both uni-reads and multi-reads aligned by Permseq (Zeng et al., 2015) for H3K27ac.
  • H3K27me3 ChIP-seq peaks using uni- and multi-reads (H3K27me3.zip): ChIP-seq peaks detected following the standard ChIP-seq data processing pipeline of ENCODE (The ENCODE Project Consortium, 2012) using both uni-reads and multi-reads aligned by Permseq (Zeng et al., 2015) for H3K27me3.
  • H3K36me3 ChIP-seq peaks using uni- and multi-reads (H3K36me3.zip): ChIP-seq peaks detected following the standard ChIP-seq data processing pipeline of ENCODE (The ENCODE Project Consortium, 2012) using both uni-reads and multi-reads aligned by Permseq (Zeng et al., 2015) for H3K36me3.