4dn:phase1:working_groups:omics_data_standards:minutes-03-25-2019

Update on PLAC-seq/HiChIP protocol

QC metrics

Recommended antibodies

Crosslinking conditions

QC Metrics

Hi-C - digestion/ligation efficiency, noise level

IP - enrichment performance

Library preparation - complexity

All three aspects have both a qualitative detection method and a quantitative detection method (with shallow sequencing).

Qualitative

Hi-C - DNA fragment size before digestion (should not show a smear), after digestion (should be a smear at a lower size range; note that the expected size should be larger than the theoretical value because of potential chromatin structures and accessibility issues) and after ligation (should be a similarly shaped smear at a larger size range)

IP - DNA fragments after sonication should be around 100~600bp; incomplete sonication will result in larger fragments and affect IP performance. IP yield (IPed DNA / input DNA) is also a good metric. In general, H3K4me3 / H3K27ac will have < 1~3% IP yield, and CTCF / PolII will have < 0.1%. However, a reasonable IP yield is necessary but not sufficient for a good IP.

Library preparation - Libraries with good complexity require at least 10~20ng of IPed DNA, with 11~13 PCR cycles and 20~40% duplication rate at ~250M reads. Libraries with worse complexity will need more input.
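The IP yield mentioned above is a simple ratio of recovered to input DNA. A minimal sketch of the calculation follows; the function name and the example masses are hypothetical and not taken from the minutes.

```python
def ip_yield_percent(ip_dna_ng: float, input_dna_ng: float) -> float:
    """Return IP yield (IPed DNA / input DNA) as a percentage."""
    if input_dna_ng <= 0:
        raise ValueError("input DNA mass must be positive")
    return 100.0 * ip_dna_ng / input_dna_ng


# Hypothetical example values: 15 ng recovered from 1000 ng of input DNA.
example_yield = ip_yield_percent(15.0, 1000.0)
print(f"IP yield: {example_yield:.2f}%")
```

As the minutes note, a yield in the expected range is necessary but not sufficient, so this number should be read together with the on-target rate from shallow sequencing.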

Quantitative

Glossary:

A - sequenced read pairs

B - valid read pairs

C - valid read pairs after PCR duplicates removal

D - inter-chromosomal read pairs

E - intra-chromosomal read pairs

F - short-range (≤1kb) of E

G - long-range (>1kb) of E

H - F that overlap with ChIP peaks
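The glossary categories D, F and G can be assigned per read pair from the two mapped positions. The sketch below assumes each deduplicated valid pair is available as (chrom1, pos1, chrom2, pos2); the function name, the category labels and the example pairs are hypothetical. Category H additionally needs a ChIP peak set and is not covered here.

```python
from collections import Counter


def classify_pair(chrom1, pos1, chrom2, pos2, short_range_bp=1000):
    """Assign a read pair to a glossary category using the 1 kb cutoff."""
    if chrom1 != chrom2:
        return "D_inter"
    if abs(pos2 - pos1) <= short_range_bp:
        return "F_intra_short"
    return "G_intra_long"


# Hypothetical read pairs, for illustration only.
pairs = [
    ("chr1", 10_000, "chr1", 10_400),   # same chromosome, 400 bp apart
    ("chr1", 10_000, "chr1", 250_000),  # same chromosome, 240 kb apart
    ("chr1", 10_000, "chr7", 5_000),    # different chromosomes
]
counts = Counter(classify_pair(*p) for p in pairs)
# E (intra-chromosomal) is the sum of the short- and long-range categories.
intra_total = counts["F_intra_short"] + counts["G_intra_long"]
print(dict(counts), "E =", intra_total)
```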

Hi-C - trans ratio (D/C) reflects noise level (reference < 20~40%); long-range cis ratio (G/E) (reference > 50~70%)

IP - on-target rate (H/F): (reference for histone marks > 20%, for TFs > 5~10%)

Library preparation - PCR duplication rate (1 - C/B, the fraction of valid read pairs removed as PCR duplicates): (reference < 3%)
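The quantitative metrics can be computed directly from the glossary counts. The following is a rough sketch, not the group's pipeline: the class, the field names and the example counts are hypothetical. It also assumes the duplication rate means the fraction of valid pairs removed as duplicates, (B - C)/B, since C is defined as the pairs remaining after duplicate removal.

```python
from dataclasses import dataclass


@dataclass
class PairCounts:
    valid: int          # B - valid read pairs
    deduplicated: int   # C - valid read pairs after PCR duplicate removal
    inter: int          # D - inter-chromosomal read pairs
    intra: int          # E - intra-chromosomal read pairs
    short_range: int    # F - short-range (<= 1 kb) pairs of E
    long_range: int     # G - long-range (> 1 kb) pairs of E
    short_on_peak: int  # H - F pairs that overlap ChIP peaks


def qc_metrics(c: PairCounts) -> dict:
    """Compute the shallow-sequencing QC ratios from read-pair counts."""
    return {
        "trans_ratio": c.inter / c.deduplicated,            # D/C
        "long_range_cis_ratio": c.long_range / c.intra,     # G/E
        "on_target_rate": c.short_on_peak / c.short_range,  # H/F
        # Assumption: fraction of valid pairs removed as duplicates.
        "duplication_rate": 1 - c.deduplicated / c.valid,
    }


# Hypothetical counts from a shallow sequencing run.
example = PairCounts(
    valid=1_000_000, deduplicated=980_000, inter=196_000, intra=784_000,
    short_range=284_000, long_range=500_000, short_on_peak=71_000,
)
for name, value in qc_metrics(example).items():
    print(f"{name}: {value:.1%}")
```

Each ratio can then be compared with the reference ranges listed above, which depend on whether the target is a histone mark or a transcription factor.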

How to choose the best antibody

High specificity - high on-target rate

High affinity - large IP yield

Highly robust - fewer batch effects (monoclonal Ab is better than polyclonal)

Currently recommended tested antibodies (all monoclonal):

CTCF: Cell Signaling, 3418T

H3K4me3: Millipore, 04-745

H3K27ac: Diagenode, C15200184-50; Active Motif, 91193

Bill Noble: Will ENCODE develop QC metrics on Hi-C data? Shall we establish a data quality measurement procedure? There are several software tools that can evaluate Hi-C datasets, like HiCRep or other tools as described in https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1658-7.

Miao Yu: The current QC metrics are meant for use before deep sequencing, and that evaluation can be done after data generation.

PLAC-seq / HiChIP has lower IP efficiency than ChIP-seq

Hi-C may disrupt protein complexes

Biotin enrichment after IP may enrich DNA fragments without protein binding

Test of crosslinking conditions

Different crosslinking conditions may affect on-target rate. Results are preliminary; so far, higher temperature does not improve on-target rates. One DSG + HCHO test had a high on-target rate but needs further verification.

Discussion

Burak: What is the intended dissemination method for all these results?

Bing: We are currently preparing a protocol that will be circulated within 4DN and submitted to Nature Protocols, but the manuscript is still in progress.

4dn/phase1/working_groups/omics_data_standards/minutes-03-25-2019.1553544007.txt.gz · Last modified: 2025/04/22 16:21 (external edit)