DAWG Meeting Notes 20170803

Comparison of Hi-C processing tools

Burak Alver
Reminder:
We are planning to run:
- fastq → pairs (juicer)
- pairs → hic (juicer)
- pairs → cool (cooler)
- export juicer normvectors and import to the cools.
- export cooler normvectors and provide in juicebox format.
4 of the 6 data sets have completed running.
The resulting files and juicebox.js links are here.
Next steps:
- finish running the largest 2 samples.
- start implementing hicrep.
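
The normvector exchange planned above comes down to a convention change, which can be sketched as follows. This is a minimal illustration, not the pipeline's actual code. It assumes that a juicer norm vector divides raw counts (raw / (n_i * n_j)) while a cooler weight multiplies them (raw * w_i * w_j); both conventions should be verified against the tools' documentation, and any overall scale factor between the two is ignored here. The function names and the toy matrix are hypothetical.

```python
import math


def juicer_norm_to_cooler_weights(norm_vector):
    """Convert a divisive norm vector into multiplicative weights.

    Assumption: normalized = raw / (n_i * n_j) on the juicer side and
    balanced = raw * w_i * w_j on the cooler side, so w_i = 1 / n_i.
    Bins with a missing or non-positive value are marked NaN (masked).
    """
    weights = []
    for n in norm_vector:
        if n is None or math.isnan(n) or n <= 0:
            weights.append(math.nan)
        else:
            weights.append(1.0 / n)
    return weights


def apply_weights(raw, weights):
    """Apply multiplicative per-bin weights to a square contact matrix."""
    size = len(raw)
    return [
        [raw[i][j] * weights[i] * weights[j] for j in range(size)]
        for i in range(size)
    ]


# Toy 2x2 contact matrix and norm vector (made-up numbers).
raw = [[10.0, 4.0], [4.0, 20.0]]
norm_vector = [2.0, 4.0]

weights = juicer_norm_to_cooler_weights(norm_vector)
balanced = apply_weights(raw, weights)
```

Going the other way (cooler weights to a juicebox-style vector) would be the same reciprocal under the same assumption.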

Visualizing TAD calls on HiGlass

Peter Kerpedjiev
HiGlass link, all calls, all replicates:
http://higlass.io/app/?config=JALHH-HzQGeJCaJaU9EwTA

HiGlass link, RepH calls:
http://higlass.io/app/?config=IPCHmdOQR4CDY2sqj5VJHQ
The RepH calls link above corresponds to Figure 3 of Forcato et al.
Easy-to-remember link:
http://higlass.io/examples

- Showcasing 8 linked views and overlaid TAD calls.

- Erez had mentioned that the TAD calls in Forcato et al. are for a single replicate, which constitutes a shallow data set.
- The RepH view corresponds to the actual matrix that Forcato et al. used.
- Erez reiterated that using a data set this shallow is not ideal.
- A caveat on the view: the matrices are in hg38, while the TAD calls were lifted over from hg19. At most 5% of the TADs were lost across 7 sets.

- Erez: a z-scale (color scale) zoom-in/out feature would be useful, and should be easy.
- - Pete: It is easy, but there is an advantage to optimizing the UX.

Domain calling with Arrowhead

Neva Durand
(See slides for details.)

Background: features at different scales are resolved with different sequencing depths:
- compartments: checkerboard pattern, extracted with eigenvectors, Lieberman-Aiden et al 2009 (~Mbs)
- TADs: seen in Dixon et al 2012, directionality index (~1Mb)
- loop domains: seen in Rao et al 2014, peak+square motif (as small as 100kb)
- exclusion domains: Sanborn et al 2015; even without CTCF, loop-domain-like structures are present.
- cohesin degradation eliminates all loop domains but not all loops, and does not eliminate compartments.
Overall: Different contact domains have different biologies; we need to define the biology we are after.
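
For reference, the directionality index mentioned for TADs above can be sketched as below. The sketch assumes the commonly quoted form from Dixon et al 2012: with A the contacts from a bin to an upstream window, B the contacts to a downstream window of the same size, and E = (A + B) / 2, DI = sign(B - A) * ((A - E)^2 / E + (B - E)^2 / E). The window size, the function name and the toy block matrix are illustrative assumptions, and windows are simply truncated at the matrix edges.

```python
def directionality_index(matrix, window):
    """Per-bin directionality index on a square contact matrix.

    Assumed form: DI = sign(B - A) * ((A - E)^2 / E + (B - E)^2 / E),
    where A is the upstream contact sum, B the downstream contact sum
    and E their mean. Windows are truncated at the matrix edges.
    """
    n = len(matrix)
    di = []
    for i in range(n):
        upstream = sum(matrix[i][j] for j in range(max(0, i - window), i))
        downstream = sum(
            matrix[i][j] for j in range(i + 1, min(n, i + window + 1))
        )
        expected = (upstream + downstream) / 2.0
        if expected == 0 or upstream == downstream:
            di.append(0.0)
            continue
        sign = 1.0 if downstream > upstream else -1.0
        chi = ((upstream - expected) ** 2 + (downstream - expected) ** 2) / expected
        di.append(sign * chi)
    return di


# Toy matrix: two 3-bin domains, 10 contacts within a domain, 1 between.
toy = [[10.0 if (i // 3) == (j // 3) else 1.0 for j in range(6)] for i in range(6)]
di = directionality_index(toy, window=2)
```

On this toy matrix the index is negative in the last bin of the first domain and positive in the first bin of the second, so the boundary shows up as a sign change from upstream bias to downstream bias.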


Arrowhead:
- similar to directionality index in principle.
- But matrix transformation makes the sought-after feature much more clearly defined.
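
A rough sketch of the kind of matrix transformation meant here, assuming the form usually given for Arrowhead in Rao et al 2014: A[i][i+d] = (M[i][i-d] - M[i][i+d]) / (M[i][i-d] + M[i][i+d]), where M is the normalized contact matrix. This covers only the transformation step, not the subsequent domain scoring, and should be checked against the paper and the slides. The function name and toy matrix are illustrative.

```python
def arrowhead_transform(matrix):
    """Compare each bin's contacts at equal distance d upstream and downstream.

    Assumed form: A[i][i+d] = (M[i][i-d] - M[i][i+d]) / (M[i][i-d] + M[i][i+d]).
    Entries whose upstream or downstream partner falls off the matrix,
    or whose denominator is zero, are left at 0.
    """
    n = len(matrix)
    result = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for d in range(1, n):
            left, right = i - d, i + d
            if left < 0 or right >= n:
                break  # no symmetric partner at this distance or beyond
            up = matrix[i][left]
            down = matrix[i][right]
            total = up + down
            if total > 0:
                result[i][right] = (up - down) / total
    return result


# Toy matrix: two 3-bin domains, 10 contacts within a domain, 1 between.
toy = [[10.0 if (i // 3) == (j // 3) else 1.0 for j in range(6)] for i in range(6)]
transformed = arrowhead_transform(toy)
```

In the toy example, entries are zero where both partners lie in the same domain as bin i, positive where only the upstream partner does, and negative where only the downstream partner does. Under this assumed form, that contrast is what sharpens domain edges relative to the raw matrix.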

Juicebox.js:
- linked views feature is now also available in juicebox.js
- showing Forcato + Arrowhead calls on the complete GM12878 data
- Also see IMR90 Arrowhead vs. directionality index results.

http://www.aidenlab.org/juicebox/HIC003_GM12878_MboI.html
http://www.aidenlab.org/juicebox/GM12878_combined.html
http://www.aidenlab.org/juicebox/IMR90_TADS.html

Discussion

There are three inter-related topics:

1. Defining different types of domains with a biological basis.
2. Resolvability of features vs. sequencing depth.
3. Assessment of different callers:
- Are they making accurate, robust calls?
- Are they making good use of the available data?

To partially separate the three points, we will proceed with a presentation of the cohesin and CTCF depletion work by Erez and Leonid at the first possible upcoming call.

4dn/phase1/data_analysis/dawg-meeting-notes-20170803.txt · Last modified: 2025/04/22 16:21 (external edit)