====== Update on PLAC-seq/HiChIP protocol ======

  * QC metrics
  * Recommended antibodies
  * Crosslinking conditions

===== QC Metrics =====

  * Hi-C - digestion/ligation efficiency, noise level
  * IP - enrichment performance
  * Library preparation - complexity

All three aspects have both a qualitative detection method and a quantitative detection method (with shallow sequencing).

==== Qualitative ====

  * Hi-C - DNA fragment size before digestion (should not show a smear), after digestion (should be a smear at a lower size range; note that the expected size should be larger than the theoretical value because of potential chromatin structure and accessibility issues), and after ligation (should be a similarly shaped smear at a larger size range).
  * IP - DNA fragments after sonication should be around 100~600bp; incomplete sonication will result in larger fragments and affect IP performance. IP yield (IPed DNA / input DNA) is also a good metric. In general, H3K4me3 / H3K27ac will have <1~3% IP yield, and CTCF / PolII will have < 0.1%. However, a reasonable IP yield is necessary but not sufficient for a good IP.
  * Library preparation - Libraries with good complexity require at least 10~20ng of IPed DNA, with 11~13 PCR cycles and 20~40% duplication rate at ~250M reads. Libraries with worse complexity will need more input.
==== Quantitative ====

**Glossary:**
A - sequenced read pairs\\
B - valid read pairs\\
C - valid read pairs after PCR duplicate removal\\
D - inter-chromosomal read pairs\\
E - intra-chromosomal read pairs\\
F - short-range (<=1kb) subset of E\\
G - long-range (>1kb) subset of E\\
H - subset of F that overlaps with ChIP peaks

  * Hi-C - trans ratio (D/C) reflects noise level (reference < 20~40%); long-range cis ratio (G/E) (reference > 50~70%)
  * IP - on-target rate (H/F) (reference for histone marks > 20%, for TFs > 5~10%)
  * Library preparation - PCR duplication rate (1 - C/B, the fraction of valid read pairs removed as duplicates) (reference < 3%)

===== How to choose the best antibody =====

  * High specificity - high on-target rate
  * High affinity - large IP yield
  * Highly robust - fewer batch effects (monoclonal Ab is better than polyclonal)

Currently recommended tested antibodies (all monoclonal):

  * CTCF: Cell Signaling, 3418T
  * H3K4me3: Millipore, 04-745
  * H3K27ac: Diagenode, C15200184-50; Active Motif, 91193

Bill Noble: Will ENCODE develop QC metrics on Hi-C data? Shall we establish a data quality measurement procedure? There are several software tools that can evaluate Hi-C datasets, such as HiCRep and the other tools described in [[https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1658-7|https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1658-7]].

Miao Yu: The current QC metrics are applied before deep sequencing, and the evaluation can be done after data generation.

PLAC-seq / HiChIP has lower IP efficiency than ChIP-seq:

  * Hi-C may disrupt protein complexes
  * Biotin enrichment after IP may enrich DNA fragments without protein binding

===== Test of crosslinking conditions =====

  * Different crosslinking conditions may affect on-target rate. Results are preliminary, and higher temperature does not improve on-target rates. One DSG + HCHO test had a high on-target rate but needs further verification.

===== Discussion =====

Burak: What is the intended dissemination method for all these results?

Bing: We are currently preparing a protocol that will be circulated within 4DN and submitted to Nature Protocols, but the manuscript is still in progress.
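
For reference, the quantitative metrics from the glossary in the Quantitative section could be computed from shallow-sequencing read-pair counts roughly as follows. This is only a minimal sketch, not part of the protocol: the names ''PairCounts'' and ''qc_metrics'' and the example counts are hypothetical, and the PCR duplication rate is assumed here to mean the fraction of valid read pairs removed as duplicates (1 - C/B).

```python
from dataclasses import dataclass


@dataclass
class PairCounts:
    """Read-pair counts, named after glossary letters B-H (hypothetical container)."""
    valid: int          # B - valid read pairs
    dedup: int          # C - valid read pairs after PCR duplicate removal
    trans: int          # D - inter-chromosomal read pairs
    cis: int            # E - intra-chromosomal read pairs
    cis_short: int      # F - short-range (<=1kb) subset of E
    cis_long: int       # G - long-range (>1kb) subset of E
    short_on_peak: int  # H - subset of F overlapping ChIP peaks


def ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def qc_metrics(c: PairCounts) -> dict:
    """Compute the four ratios listed in the Quantitative section."""
    return {
        "trans_ratio": ratio(c.trans, c.dedup),                 # D/C
        "long_range_cis_ratio": ratio(c.cis_long, c.cis),       # G/E
        "on_target_rate": ratio(c.short_on_peak, c.cis_short),  # H/F
        # Assumption: duplication rate = fraction of B removed as duplicates.
        "pcr_duplication_rate": 1 - ratio(c.dedup, c.valid),    # 1 - C/B
    }


if __name__ == "__main__":
    # Made-up counts for illustration only, not real data.
    example = PairCounts(
        valid=1_000_000, dedup=980_000, trans=196_000, cis=784_000,
        cis_short=284_000, cis_long=500_000, short_on_peak=71_000,
    )
    for name, value in qc_metrics(example).items():
        print(f"{name}: {value:.1%}")
```

The resulting values can then be compared by eye against the reference ranges listed above; the thresholds are deliberately not hard-coded because the notes give them as ranges.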