4dn:phase1:working_groups:omics_data_standards:minutes-03-25-2019


Update on PLAC-seq/HiChIP protocol

  • QC metrics
  • Recommended antibodies
  • Crosslinking conditions

QC Metrics

  • Hi-C - digestion/ligation efficiency, noise level
  • IP - enrichment performance
  • Library preparation - complexity

All three aspects have both a qualitative detection method and a quantitative detection method (with shallow sequencing).

Qualitative

  • Hi-C - DNA fragment size before digestion (should not show a smear), after digestion (should be a smear in a lower size range; note that the observed size is expected to be larger than the theoretical value because of chromatin structure and accessibility issues), and after ligation (should be a similarly shaped smear in a larger size range)
  • IP - DNA fragments after sonication should be around 100~600 bp; incomplete sonication results in larger fragments and impairs IP performance. IP yield (IPed DNA / input DNA) is also a good metric. In general, H3K4me3 / H3K27ac will have < 1~3% IP yield, and CTCF / PolII will have < 0.1%. However, a good IP yield is necessary but not sufficient for a good IP.
  • Library preparation - Libraries with good complexity require at least 10~20ng of IPed DNA, with 11~13 PCR cycles and 20~40% duplication rate at ~250M reads. Libraries with worse complexity will need more input.
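The IP yield defined above (IPed DNA / input DNA) is a simple ratio. A minimal sketch follows, assuming both amounts are measured in nanograms; the function name and the example amounts are hypothetical and are not taken from the protocol.

```python
def ip_yield_percent(ip_dna_ng: float, input_dna_ng: float) -> float:
    """Return IP yield as a percentage: IPed DNA / input DNA * 100."""
    if input_dna_ng <= 0:
        raise ValueError("input DNA amount must be positive")
    if ip_dna_ng < 0:
        raise ValueError("IPed DNA amount cannot be negative")
    return 100.0 * ip_dna_ng / input_dna_ng


# Hypothetical example: 15 ng recovered from 1000 ng of input chromatin DNA.
example_yield = ip_yield_percent(15.0, 1000.0)
print(f"IP yield: {example_yield:.2f}%")
```

As noted above, the yield on its own does not show that the IP worked; it is only one of the checks.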

Quantitative

Glossary:

A - sequenced read pairs
B - valid read pairs
C - valid read pairs after PCR duplicates removal
D - inter-chromosomal read pairs
E - intra-chromosomal read pairs
F - short-range (≤1kb) of E
G - long-range (>1kb) of E
H - F that overlap with ChIP peaks

  • Hi-C - trans ratio (D/C) reflects noise level (reference < 20~40%); long-range cis ratio (G/E) (reference > 50~70%)
  • IP - on-target rate (H/F) (reference for histone marks > 20%, for TFs > 5~10%)
  • Library preparation - PCR duplication rate (1 - C/B, i.e. the fraction of valid read pairs removed as PCR duplicates) (reference < 3%)
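The quantitative metrics can be computed directly from the glossary counts. The sketch below is illustrative only: the class and field names are hypothetical, the counts are made up, and the duplication rate is taken to be the fraction of valid read pairs removed by deduplication, (B - C)/B.

```python
from dataclasses import dataclass


@dataclass
class PairCounts:
    """Read-pair counts, one field per glossary letter (A-H)."""
    sequenced: int      # A: sequenced read pairs
    valid: int          # B: valid read pairs
    valid_dedup: int    # C: valid read pairs after PCR duplicate removal
    trans: int          # D: inter-chromosomal read pairs
    cis: int            # E: intra-chromosomal read pairs
    cis_short: int      # F: short-range (<=1kb) subset of E
    cis_long: int       # G: long-range (>1kb) subset of E
    short_on_peak: int  # H: subset of F overlapping ChIP peaks


def qc_ratios(c: PairCounts) -> dict:
    """Compute the shallow-sequencing QC ratios as fractions (0-1)."""
    return {
        "trans_ratio": c.trans / c.valid_dedup,           # D/C
        "long_range_cis_ratio": c.cis_long / c.cis,       # G/E
        "on_target_rate": c.short_on_peak / c.cis_short,  # H/F
        "duplication_rate": 1 - c.valid_dedup / c.valid,  # (B - C)/B
    }


# Made-up counts for illustration; not from any real library.
counts = PairCounts(
    sequenced=1_000_000, valid=800_000, valid_dedup=784_000,
    trans=156_800, cis=627_200, cis_short=188_160,
    cis_long=439_040, short_on_peak=47_040,
)
for name, value in qc_ratios(counts).items():
    print(f"{name}: {value:.1%}")
```

The reference values in the minutes are given as ranges, so the sketch reports the ratios without applying pass/fail cutoffs; any cutoff would be a choice made by the user.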

How to choose the best antibody

  • High specificity - high on-target rate
  • High affinity - large IP yield
  • Highly robust - fewer batch effects (monoclonal antibodies are better than polyclonal)

Currently recommended tested antibodies (all monoclonal):

  • CTCF: Cell Signaling, 3418T
  • H3K4me3: Millipore, 04-745
  • H3K27ac: Diagenode, C15200184-50; Active Motif, 91193

Bill Noble: Will ENCODE develop QC metrics for Hi-C data? Should we establish a data quality measurement procedure? Several software tools can evaluate Hi-C datasets, such as HiCRep and the other tools described in https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1658-7.

Miao Yu: The current QC metrics are applied before deep sequencing; that kind of evaluation can be done after data generation.

PLAC-seq / HiChIP has lower IP efficiency than ChIP-seq

  • Hi-C may disrupt protein complexes
  • Biotin enrichment after IP may enrich DNA fragments without protein binding

Test of crosslinking conditions

  • Different crosslinking conditions may affect on-target rate. Results are preliminary; so far, higher temperature does not improve on-target rates. One DSG + HCHO test had a high on-target rate but needs further verification.

Discussion

Burak: What is the intended dissemination method for all these results?

Bing: We are currently preparing a protocol that will be circulated within 4DN and submitted to Nature Protocols, but the manuscript is still in preparation.

4dn/phase1/working_groups/omics_data_standards/minutes-03-25-2019.1553551600.txt.gz · Last modified: 2025/04/22 16:21 (external edit)