4dn:phase1:working_groups:omics_data_standards:minutes-05-08-2017

Differences

This shows you the differences between two versions of the page.

4dn:phase1:working_groups:omics_data_standards:minutes-05-08-2017 [2019/02/15 13:53]
rcalandrelli created
4dn:phase1:working_groups:omics_data_standards:minutes-05-08-2017 [2025/04/22 16:21] (current)
Line 1: Line 1:
-====  Omics Data Standards WG - Minutes 05-08-2017            ====
+==== Omics Data Standards WG - Minutes 05-08-2017 ====
  
 |~~TABLE_CELL_WRAP_START~~<​WRAP>​ |~~TABLE_CELL_WRAP_START~~<​WRAP>​
  
-======  Hi-C Data Processing and Analysis                         ======
+====== Hi-C Data Processing and Analysis ======
  
 <font 11pt/Arial;;initial;;#ffffff>At the last meeting, the OMICS group discussed the mapping of Hi-C reads. Results were largely consistent between groups; the main differences were single-end vs. paired-end mode and filtering. Burak at DCIC is still comparing different tools. The open question is whether OMICS will give specific recommendations.</font>
Line 9: Line 9:
 <font 11pt/Arial;;initial;;#ffffff>Over the last week, Soo and Neva compared different parameters for the mapping.</font>
  
-=====  Hi-C Read Alignment and Data Standards                                =====
+===== Hi-C Read Alignment and Data Standards =====
  
-<font 11pt/Arial;;initial;;#ffffff>(Please refer to Soo Lee’s slides)</font>
+ (Please refer to Soo Lee’s slides)
  
-====  Reporting of chimeric alignment (-5 and -M flags)                                ====
+==== Reporting of chimeric alignment (-5 and -M flags) ====
  
 <font 11pt/Arial;;initial;;#ffffff>A chimeric alignment will contain one soft-clipped and one hard-clipped read.</font>
Line 25: Line 25:
 <font 14.6667px/Arial;;initial;;#ffffff>The -M flag marks the</font> <font 14.6667px/Arial;;inherit;;inherit>hard-clipped read as "secondary alignment". This flag / reporting style is very widely used in the genomics community.</font>
  
-====  Single-end vs. Paired-end mode                             ====
+==== Single-end vs. Paired-end mode ====
  
 <font 13.3333px/arial;;initial;;#ffffff>We reported that paired-end mode with -SP produces equivalent results to single-end mode. We (Soo and Neva) investigated this further.</font>
Line 50: Line 50:
   * <font 13.3333px/arial;;initial;;#ffffff>DCIC will confirm that SE and PE runtimes are roughly identical.</font>
  
-=====  Hi-C Normalization Procedures                                =====
+===== Hi-C Normalization Procedures =====
  
 <font 11pt/Arial;;initial;;#ffffff>Model particular bias modalities and attempt to correct them;</font>
Line 56: Line 56:
 <font 11pt/Arial;;initial;;#ffffff>Matrix balancing methods (KR balancing, pre-filter the matrix).</font>
  
-  *
-
-<font 11pt/arial;;initial;;#ffffff>Is this the right thing to do? It assumes every bit of the genome has the same probability of contacting other bits. However, this may not be true.</font>
-
-  *
-
-<font 11pt/arial;;initial;;#ffffff>Each method has its pros and cons; the resulting chromosomal features may differ between the two approaches.</font>
-
-  *
-
-<font 11pt/arial;;initial;;#ffffff>A comparison may be presented for particular chromosomal features under different normalization methods.</font>
-
-  *
-
-<font 11pt/arial;;initial;;#ffffff>We need to agree on common criteria for choosing normalization methods; different normalizations may be used for detecting different features.</font>
+  * <font 11pt/arial;;initial;;#ffffff>Is this the right thing to do? It assumes every bit of the genome has the same probability of contacting other bits. However, this may not be true.</font>
+  * <font 11pt/arial;;initial;;#ffffff>Each method has its pros and cons; the resulting chromosomal features may differ between the two approaches.</font>
+  * <font 11pt/arial;;initial;;#ffffff>A comparison may be presented for particular chromosomal features under different normalization methods.</font>
+  * <font 11pt/arial;;initial;;#ffffff>We need to agree on common criteria for choosing normalization methods; different normalizations may be used for detecting different features.</font>
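The equal-visibility assumption questioned in the bullets above can be made concrete with a small sketch. This is an illustration only: it uses a simple damped iterative rescaling (in the spirit of Sinkhorn-Knopp or ICE-style correction), not the KR algorithm named in the minutes. The function name `balance_contact_matrix` and the toy matrix are hypothetical. The code rescales a symmetric contact matrix until every non-empty bin has the same total, which is what "every bit of the genome has the same probability of contacting other bits" amounts to in practice.

```python
def balance_contact_matrix(counts, max_iter=500, tol=1e-9):
    """Rescale a symmetric contact matrix so all non-empty rows have equal sums.

    Returns (balanced, bias), where balanced[i][j] = counts[i][j] / (bias[i] * bias[j]).
    Illustrative damped iteration only; this is NOT the KR algorithm.
    """
    n = len(counts)
    # Bins with no contacts cannot be balanced; they keep bias 1.0 and stay empty.
    keep = [sum(row) > 0 for row in counts]
    if not any(keep):
        raise ValueError("matrix has no contacts")
    bias = [1.0] * n
    for _ in range(max_iter):
        # Row sums of the matrix under the current bias estimate.
        sums = [
            sum(counts[i][j] / (bias[i] * bias[j]) for j in range(n))
            for i in range(n)
        ]
        kept_sums = [s for s, k in zip(sums, keep) if k]
        target = sum(kept_sums) / len(kept_sums)
        # Stop once every non-empty row is within tol of the common target.
        if max(abs(s / target - 1.0) for s in kept_sums) < tol:
            break
        # Square-root (damped) update keeps the symmetric iteration stable.
        for i in range(n):
            if keep[i]:
                bias[i] *= (sums[i] / target) ** 0.5
    balanced = [
        [counts[i][j] / (bias[i] * bias[j]) for j in range(n)] for i in range(n)
    ]
    return balanced, bias


if __name__ == "__main__":
    # Hypothetical 3-bin matrix; bin 0 is far more "visible" than bin 2.
    toy = [[10.0, 4.0, 2.0], [4.0, 8.0, 1.0], [2.0, 1.0, 3.0]]
    balanced, bias = balance_contact_matrix(toy)
    print([round(sum(row), 6) for row in balanced])
    print([round(b, 4) for b in bias])
```

After balancing, differences in total coverage between bins are gone by construction. Any true biological difference in how often a region makes contacts is removed along with the technical bias, which is the concern raised in the first bullet.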
  
 <font 11pt/Arial;;initial;;#ffffff>Reproducibility needs to be evaluated carefully, and different normalization methods should provide good reproducibility. Which reproducibility metric to use also needs to be determined. HiCRep may be a good candidate for evaluation.</font>
Line 76: Line 65:
 <font 11pt/Arial;;initial;;#ffffff>We can apply different normalization methods to the same dataset (same Hi-C file) and use known features (SHH, for example) as criteria.</font>
  
-  *
-
-<font 11pt/arial;;initial;;#ffffff>This comparison will be coordinated with DCIC.</font>
-
-  *
-
-<font 11pt/arial;;initial;;#ffffff>Neva can apply different normalization methods to the file and let people view the final results across the genome to check them.</font>
+  * <font 11pt/arial;;initial;;#ffffff>This comparison will be coordinated with DCIC.</font>
+  * <font 11pt/arial;;initial;;#ffffff>Neva can apply different normalization methods to the file and let people view the final results across the genome to check them.</font>
  
 <font 11pt/Arial;;initial;;#ffffff>A very high-depth data file at different resolutions would be preferred, such as the 1kb-resolution ones.</font>
  
-  *
-
-<font 11pt/arial;;initial;;#ffffff>However, those high-res maps are expensive to run (and few labs other than Erez’s lab are generating them), so focusing on 5kb or 10kb resolutions may fit actual usage better.</font>
-
-  *
-
-<font 11pt/arial;;initial;;#ffffff>The standards will be for the community and will last for a long time. Even if we choose a hi-res dataset, we can still sub-sample it to generate lower-res ones. Also, visualization may not be a very good method of inspection, and some quantitative methods may need to be agreed upon.</font>
-
-  *
-
-<font 11pt/arial;;initial;;#ffffff>The initial normalization applies to low-resolution datasets, and in theory it can be applied at every resolution (although no hi-res datasets have been tested previously).</font>
+  * <font 11pt/arial;;initial;;#ffffff>However, those high-res maps are expensive to run (and few labs other than Erez’s lab are generating them), so focusing on 5kb or 10kb resolutions may fit actual usage better.</font>
+  * <font 11pt/arial;;initial;;#ffffff>The standards will be for the community and will last for a long time. Even if we choose a hi-res dataset, we can still sub-sample it to generate lower-res ones. Also, visualization may not be a very good method of inspection, and some quantitative methods may need to be agreed upon.</font>
+  * <font 11pt/arial;;initial;;#ffffff>The initial normalization applies to low-resolution datasets, and in theory it can be applied at every resolution (although no hi-res datasets have been tested previously).</font>
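As a sketch of how a lower-resolution map can be derived from a higher-resolution one, the snippet below re-bins a contact matrix by summing blocks of adjacent bins (for example, five 1kb bins into one 5kb bin). It covers re-binning only and does not model read-level down-sampling, which is another possible reading of "sub-sample" above. The function name `coarsen_contact_matrix` and the toy input are hypothetical.

```python
def coarsen_contact_matrix(counts, factor):
    """Sum factor x factor blocks of a square contact matrix into coarser bins.

    A trailing partial block (when the size is not a multiple of factor)
    becomes its own, smaller final bin.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    n = len(counts)
    m = -(-n // factor)  # ceiling division: number of coarse bins
    coarse = [[0] * m for _ in range(m)]
    for i in range(n):
        for j in range(n):
            # Each fine bin pair contributes to exactly one coarse bin pair,
            # so the total number of contacts is preserved.
            coarse[i // factor][j // factor] += counts[i][j]
    return coarse


if __name__ == "__main__":
    # Hypothetical 5-bin symmetric matrix, merged two bins at a time.
    fine = [
        [1, 2, 0, 0, 1],
        [2, 3, 1, 0, 0],
        [0, 1, 4, 2, 0],
        [0, 0, 2, 5, 1],
        [1, 0, 0, 1, 6],
    ]
    for row in coarsen_contact_matrix(fine, 2):
        print(row)
```

Because the input here is a full symmetric matrix, each off-diagonal pair that falls inside a diagonal block is counted from both (i, j) and (j, i). A pipeline that stores only the upper triangle would need to handle the diagonal blocks accordingly.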
  
 <font 11pt/Arial;;initial;;#ffffff>Normalization methods may be tied to the type of feature calls people are using, but that would make it hard to converge on standards.</font>
Line 106: Line 82:
 <font 11pt/Arial;;initial;;#ffffff>We can use GM12878 5kb Hi-C datasets (chromosome 1 or chromosome 18) and let people send in normalization factors.</font>
  
-=====  AGENDA                                =====
+===== AGENDA =====
  
 <font 11pt/Arial;;initial;;#ffffff>Chimeric read simulation (Neva, Burak and Soo)</font>
Line 113: Line 89:
  
 <font 11pt/Arial;;initial;;#ffffff>Ask other groups about their research</font>
- 
 </​WRAP>​~~TABLE_CELL_WRAP_STOP~~| </​WRAP>​~~TABLE_CELL_WRAP_STOP~~|
  
  
4dn/phase1/working_groups/omics_data_standards/minutes-05-08-2017.1550267589.txt.gz · Last modified: 2025/04/22 16:21 (external edit)