


Analysis products

To edit this page, use the Edit button on the right-hand side. To add a file, type its title and click the link button in the top menu bar. You may upload any file type. For some formats (such as png), the uploaded media appears directly linked after the upload step; for others (such as txt or pdf), you need a two-step process: first upload the file to the wiki, then link it in the page. The file size limit should be 256 MB.

For more detail, please see the getting started guide.

sci-Hi-C data (Shendure)

mcool files: https://noble.gs.washington.edu/~ranz0/4DN/wtc11/results/shendure-lab/mcool/. For details, please refer to the notebook at https://noble.gs.washington.edu/~ranz0/4DN/wtc11/results/ranz0/results.html#Process_sciHi-C_to_mcool_format

* 28 Mar 2020 *

Processed embeddings:

We used three different approaches to calculate the similarity/distance between Hi-C matrices:

Contact decay profile (CDP): https://data.4dnucleome.org/files-processed/4DNFIHA69V3Y/. The CDP is binned at 500 kb, and the similarity matrix is calculated as the cosine similarity of the log-scaled bins. The distance matrix is calculated as sqrt(2-2*similarity), and 2D embeddings are generated by multidimensional scaling.

HiCRep: https://data.4dnucleome.org/files-processed/4DNFIPX8D8DX/. Pairwise similarity is calculated with HiCRep (Yang et al. 2017). The distance matrix is calculated as sqrt(2-2*similarity), and 2D embeddings are generated by multidimensional scaling.

scHiCluster: https://data.4dnucleome.org/files-processed/4DNFIKTK5Z1U/
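
The similarity-to-distance conversion and embedding described for the contact decay profile and HiCRep approaches can be sketched as follows. This is a minimal illustration, not the code used to produce the files above: the toy input, the use of log1p for log-scaling, and the choice of classical (eigendecomposition-based) multidimensional scaling are all assumptions, since the page does not say which MDS variant or log transform was used. Note that sqrt(2-2*similarity) equals the Euclidean distance between unit-normalized profiles.

```python
import numpy as np


def cosine_similarity_matrix(profiles):
    """Pairwise cosine similarity between the rows of `profiles`."""
    x = np.asarray(profiles, dtype=float)
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    # Clip to guard against values slightly outside [-1, 1] from rounding.
    return np.clip(unit @ unit.T, -1.0, 1.0)


def similarity_to_distance(similarity):
    """Distance as described on this page: sqrt(2 - 2 * similarity)."""
    return np.sqrt(np.maximum(2.0 - 2.0 * similarity, 0.0))


def classical_mds(distance, n_components=2):
    """Classical MDS (an assumed variant): embed a distance matrix."""
    n = distance.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to obtain a Gram matrix.
    gram = -0.5 * centering @ (distance ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.maximum(eigvals[order], 0.0))
    return eigvecs[:, order] * scale


# Toy data (hypothetical): 6 cells, each with a 10-bin contact decay profile.
rng = np.random.default_rng(0)
counts = rng.poisson(lam=20.0, size=(6, 10))
log_profiles = np.log1p(counts)  # log-scaling; log1p is an assumption

similarity = cosine_similarity_matrix(log_profiles)
distance = similarity_to_distance(similarity)
embedding = classical_mds(distance, n_components=2)  # one 2D point per cell
```

For the HiCRep approach, the same last two steps would apply to a precomputed pairwise similarity matrix in place of the cosine similarity.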

* 07 May 2020 *


scHi-C data (Ren)

scHi-C script and processed files provided by @Yanxiao Zhang in the Ren lab:

Script: https://github.com/ren-lab/hic-pipeline/blob/master/scripts/Snakefile, using the rule scHiC. Please note that the pipeline is still under development, so the documentation is not complete.

Processed files: http://renlab.sdsc.edu/yanxiao/Miao_data/scHiC/data/Xiaomeng_data/useful_contacts/
Please use the *.valid_pairs.rm_hotspot.sorted.txt.gz files. The columns are: (1) read name, (2) chr1, (3) position 1, (4) bin 1, (5) chr2, (6) position 2, (7) bin 2.
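
A minimal sketch of reading one of these files with the seven columns listed above. It assumes the files are tab-delimited with no header line and that positions are integers; neither is stated on this page, so check a file before relying on it. The names Contact and read_valid_pairs are hypothetical, and bin values are kept as strings because their format is not documented here.

```python
import csv
import gzip
from typing import Iterator, NamedTuple


class Contact(NamedTuple):
    """One line of a valid_pairs file, in the column order given above."""
    read_name: str
    chr1: str
    pos1: int
    bin1: str
    chr2: str
    pos2: int
    bin2: str


def read_valid_pairs(path: str) -> Iterator[Contact]:
    """Yield contacts from a gzipped, tab-delimited (assumed) pairs file."""
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) < 7:
                continue  # skip blank or truncated lines
            name, chr1, pos1, bin1, chr2, pos2, bin2 = row[:7]
            yield Contact(name, chr1, int(pos1), bin1, chr2, int(pos2), bin2)


def count_intrachromosomal(path: str) -> int:
    """Count contacts whose two ends fall on the same chromosome."""
    return sum(1 for contact in read_valid_pairs(path) if contact.chr1 == contact.chr2)
```

Reading the file as a stream keeps memory use low, which helps when iterating over many single-cell files.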

28 Mar 2020


Haplotype analysis (Gilbert and Ren)

Current progress on haplotype phasing:

We are using Hi-C, 10x Genomics, and all other available sequencing reads to obtain haplotype-phased genomes with the highest possible resolution, accuracy, and coverage for the following cell lines:

WTC-11, H1, H9, HFFc6, F121-9, HCT116

In the case of F121-9, we have also performed Bionano structural variation analysis to create a more accurate castaneus build.


Haplotype analysis (Rafelski)

From Ru Gunawardane:

We have parental WTC-11 linked-read (10x) and short-read whole-genome data, available as two packages through Quilt (https://docs.quiltdata.com). These packages include both raw and aligned reads, variant calls, and phasing for the linked-read data.

https://open.quiltdata.com/b/allencell/tree/aics/wtc11_linkedread_wgs/

https://open.quiltdata.com/b/allencell/tree/aics/wtc11_short_read_genome_sequence/

28 Mar 2020


Working with mcool files:

The links below provide information about handling .mcool files.

https://github.com/hms-dbmi/hic-data-analysis-bootcamp

https://github.com/mirnylab/cooler/tree/master/cooler/cli

https://github.com/mirnylab/cooltools/tree/master/cooltools/cli

To convert an mcool file to a text-format data matrix, use cooler dump.
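
A minimal sketch of calling cooler dump on one resolution of an mcool file. An mcool file holds several resolutions, and a single one is selected with a URI of the form file.mcool::/resolutions/<bin size>. The file name, the 500000 bin size, the region, and the helper name cooler_dump_command are hypothetical; check which resolutions a given file actually contains (for example with cooler ls) before running it.

```python
import subprocess
from typing import List, Optional


def cooler_dump_command(
    mcool_path: str,
    resolution: int,
    region: Optional[str] = None,
    out: Optional[str] = None,
) -> List[str]:
    """Build a `cooler dump` command for one resolution of an mcool file."""
    # --join replaces bin IDs with chrom/start/end columns for both bins.
    command = ["cooler", "dump", "--join"]
    if region is not None:
        command += ["-r", region]  # restrict output to one genomic region
    if out is not None:
        command += ["-o", out]  # write to a file instead of stdout
    # Select a single resolution inside the multi-resolution file.
    command.append(f"{mcool_path}::/resolutions/{resolution}")
    return command


def run_cooler_dump(mcool_path: str, resolution: int, **kwargs) -> None:
    """Run the command; requires the cooler package to be installed."""
    subprocess.run(cooler_dump_command(mcool_path, resolution, **kwargs), check=True)


# Example (hypothetical file name, resolution, and region):
# run_cooler_dump("cells.mcool", 500000, region="chr1", out="chr1_500kb.tsv")
```

The output is a tab-delimited list of pixels (one row per non-zero bin pair) rather than a dense matrix.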

4dn/phase1/nofic_steering_committee/analysis_products.1588908096.txt.gz · Last modified: 2025/04/22 16:21 (external edit)