4dn:phase1:nofic_steering_committee:analysis_products

Differences

This shows you the differences between two versions of the page.

4dn:phase1:nofic_steering_committee:analysis_products [2020/05/07 20:25]
Ran
4dn:phase1:nofic_steering_committee:analysis_products [2025/04/22 16:21] (current)
Line 11: Line 11:
 *** 28 Mar 2020 ***
  
-**Processed embeddings:**
+**Processed embeddings [Noble Lab]:**
  
 Here we used three different approaches to calculate similarity/distance between Hi-C matrices:
Line 20: Line 20:
  
 scHiCluster: [[https://data.4dnucleome.org/files-processed/4DNFIKTK5Z1U/|https://data.4dnucleome.org/files-processed/4DNFIKTK5Z1U/]]
- 
-Plots of 2D embeddings with CDP, HiCRep and scHiCluster:​ 
- 
-**{{https://​lh5.googleusercontent.com/​hw9d-nP_ZHz8_jH9Pd9391JA2_-d5bLNUJNH6JnOB7niUKwa3upPT5s475YdpY33QTNdufhRjL1lH9Yh-oYhopuogxvIFyVbSmq-6ku8JHZPEMf9bshW-wsxBktxSK7yuoysKfGTP-M?​nolinkx249px;​}}** **{{https://​lh5.googleusercontent.com/​otBNSUIQZ2wPkNW6r0pZcm6e0maEZ8tRnNs3ngfRohYzM5xnhZcEc22GXIXvO7-tsH21XaPnwBYCwAuG7RV0h71eIFAnix4AVRN2uuBupQ3qHGA5ZAPAX42l9f4ZR_wqSN_KspQbmjw?​nolinkx249px;​}}{{https://​lh4.googleusercontent.com/​fpRt-BFzCEyijdP_K_ftI9JL5GXf3_nenlBvXrPW5e39dTMSd8xQCrzhmX2xJkB8Oqpww6FG1PdtpitL-Vq62yyjazoTlkxivKduRHA2041ayJ0fxSNwpLsVcIV9qB0t__RRBKDSKTM?​nolinkx249px;​}}** **{{https://​lh6.googleusercontent.com/​IMzWgPTsmbCx_yLV-fjL3gZReQJws9EbBRJPJ4WscSXes64YUsLSut5Ixalf9ZUME-Lg2XSFdc5yk_locXObKtSdcmtr7kwpjiKWYx0nRWsi1u9D_bS1RZYP2MsraQ1PLLEQxsSVHro?​nolinkx98px;​}}** 
  
 *** 07 May 2020 ***
Line 31: Line 27:
 ===== scHi-C data (Ren) =====
  
-scHi-C script and processed files provided by [[https://4dn.slack.com/team/UU1HZACHK|@Yanxiao Zhang]] in the Ren lab:
+scHi-C script and processed files provided by [[https://4dn.slack.com/team/UU1HZACHK|@Yanxiao Zhang]] **[Ren Lab]**:
  
 **Script**: [[https://github.com/ren-lab/hic-pipeline/blob/master/scripts/Snakefile|https://github.com/ren-lab/hic-pipeline/blob/master/scripts/Snakefile]] with the rule scHiC. Please note that, since the pipeline is still under development, the documentation is not complete.
Line 37: Line 33:
 **Processed files**: [[http://renlab.sdsc.edu/yanxiao/Miao_data/scHiC/data/Xiaomeng_data/useful_contacts/|http://renlab.sdsc.edu/yanxiao/Miao_data/scHiC/data/Xiaomeng_data/useful_contacts/]]\\ Please use the *.valid_pairs.rm_hotspot.sorted.txt.gz files. The columns are (1) read name, (2) chr1, (3) pos1, (4) bin1, (5) chr2, (6) pos2, (7) bin2.
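
A minimal sketch of reading these files follows. Only the column order comes from the description above; the whitespace-delimited layout, the absence of a header row, integer positions, and the names `Contact`, `read_valid_pairs`, and `open_valid_pairs` are assumptions made for illustration and are not part of the Ren lab pipeline. The example rows are made up.

```python
import gzip
import io
from typing import Iterator, NamedTuple, TextIO


class Contact(NamedTuple):
    """One row of a valid_pairs file, in the column order listed above."""
    read_name: str
    chr1: str
    pos1: int
    bin1: str  # kept as text: the bin encoding is not documented here
    chr2: str
    pos2: int
    bin2: str


def read_valid_pairs(handle: TextIO) -> Iterator[Contact]:
    """Yield one Contact per 7-field line; skip blank or malformed lines."""
    for line in handle:
        fields = line.split()  # assumption: whitespace-delimited, no header
        if len(fields) != 7:
            continue
        name, chr1, pos1, bin1, chr2, pos2, bin2 = fields
        yield Contact(name, chr1, int(pos1), bin1, chr2, int(pos2), bin2)


def open_valid_pairs(path: str) -> TextIO:
    """Open a gzip-compressed valid_pairs file in text mode."""
    return gzip.open(path, "rt")


# Made-up rows standing in for a real *.valid_pairs.rm_hotspot.sorted.txt.gz file.
example = (
    "read_001\tchr1\t1050000\t2\tchr1\t3520000\t7\n"
    "read_002\tchr2\t480000\t0\tchr5\t9990000\t19\n"
)
contacts = list(read_valid_pairs(io.StringIO(example)))
```

For a downloaded file, the same parser could be used as `read_valid_pairs(open_valid_pairs(path))`, with `path` pointing at a local copy.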
  
-28 Mar 2020
+*** 28 Mar 2020 ***
 + 
 +**Processed embeddings:** **[Noble Lab]**
 +
 +Here we used three different approaches to calculate similarity/distance between Hi-C matrices:
 +
 +Contact decay profile: [[https://data.4dnucleome.org/files-processed/4DNFIGLS3JZN/?redirected_from=%2F4DNFIGLS3JZN%2F|https://data.4dnucleome.org/files-processed/4DNFIGLS3JZN/?redirected_from=%2F4DNFIGLS3JZN%2F]]. The contact decay profile (CDP) is computed in 500 kb bins, and the similarity matrix is the cosine similarity of the log-scaled bins. The distance matrix is calculated as sqrt(2-2*similarity), and 2D embeddings are generated by multidimensional scaling.
 +
 +HiCRep: [[https://data.4dnucleome.org/files-processed/4DNFIPX8D8DX/|https://data.4dnucleome.org/files-processed/4DNFIHM5K63H/]]. Pairwise similarity is calculated with HiCRep (Yang et al. 2017). The distance matrix is calculated as sqrt(2-2*similarity), and 2D embeddings are generated by multidimensional scaling.
 +
 +scHiCluster: [[https://data.4dnucleome.org/files-processed/4DNFIKTK5Z1U/|https://data.4dnucleome.org/files-processed/4DNFI488OK8H/]]
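
The similarity-to-distance conversion described for CDP and HiCRep can be sketched as below. For unit-length vectors, sqrt(2-2*similarity) is the Euclidean distance between them when `similarity` is their cosine similarity. The toy profiles, the use of `log1p` as the log scaling, and all function names are assumptions for illustration, not the Noble lab code. The multidimensional scaling step is not shown; the resulting matrix could be passed to any MDS implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_to_distance(similarity):
    """Convert a similarity in [-1, 1] to a distance via sqrt(2 - 2*s)."""
    # max() guards against a tiny negative argument caused by rounding
    # when the similarity is marginally above 1.
    return math.sqrt(max(0.0, 2.0 - 2.0 * similarity))


def distance_matrix(profiles):
    """Pairwise distance matrix for a list of profiles."""
    n = len(profiles)
    return [
        [
            similarity_to_distance(cosine_similarity(profiles[i], profiles[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical contact counts per genomic-distance bin for three cells.
raw_profiles = [
    [120, 60, 30, 10],
    [100, 55, 28, 12],
    [10, 30, 60, 120],
]
# Assumed log scaling: log(1 + count), which stays defined for empty bins.
log_profiles = [[math.log1p(c) for c in profile] for profile in raw_profiles]
distances = distance_matrix(log_profiles)
```

In this toy input the first two cells have similar decay profiles and the third is reversed, so the first pair ends up closer together than either is to the third.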
 + 
 +*** 07 May 2020 ***
  
 ---- ----
Line 55: Line 63:
 ===== haplotype analysis (Rafelski) =====
  
-From Ru Gunawardane:
+**[Ru Gunawardane]:**
  
 We have parental WTC-11 linked read (10X) and short read whole genome data, and it is available as two packages through quilt ([[https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.quiltdata.com%2Finstallation&data=02%7C01%7C%7C0a610651a1014473e27a08d7d26adcac%7C32669cd6737f4b398bddd6951120d3fc%7C0%7C0%7C637209228714084680&sdata=6mZztNIU0nWSIgXqKJ2%2BKX%2FdMXGzbSklTTM%2Bgde%2BnSY%3D&reserved=0|https://docs.quiltdata.com]]). These packages include both raw and aligned reads, variant calls, and **phasing** for linked read data.
Line 63: Line 71:
 [[https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fopen.quiltdata.com%2Fb%2Fallencell%2Ftree%2Faics%2Fwtc11_short_read_genome_sequence%2F&data=02%7C01%7C%7C0a610651a1014473e27a08d7d26adcac%7C32669cd6737f4b398bddd6951120d3fc%7C0%7C0%7C637209228714094625&sdata=KVUFOEVY3gGPnC5RPNLHx1M%2BY7l1arLHlDIFgtlElLk%3D&reserved=0|https://open.quiltdata.com/b/allencell/tree/aics/wtc11_short_read_genome_sequence/]]
  
-Mar 28 2020
+*** Mar 28 2020 ***
  
 ---- ----
 +
  
 ===== Working with mcool files: ===== ===== Working with mcool files: =====
4dn/phase1/nofic_steering_committee/analysis_products.1588908329.txt.gz · Last modified: 2025/04/22 16:21 (external edit)