4dn:phase1:working_groups:omics_data_standards:minutes-05-08-2017 [2019/02/15 13:53] rcalandrelli created |
4dn:phase1:working_groups:omics_data_standards:minutes-05-08-2017 [2025/04/22 16:21] (current) |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ==== Omics Data Standards WG - Minutes 05-08-2017 ==== | + | ==== Omics Data Standards WG - Minutes 05-08-2017 ==== |
|~~TABLE_CELL_WRAP_START~~<WRAP> | |~~TABLE_CELL_WRAP_START~~<WRAP> | ||
- | ====== Hi-C Data Processing and Analysis ====== | + | ====== Hi-C Data Processing and Analysis ====== |
<font 11pt/Arial;;initial;;#ffffff>At the last meeting, the OMICS group discussed the mapping of Hi-C reads. Results were largely consistent between groups; the main differences were single-end vs. paired-end mode and filtering. Burak at DCIC is still comparing different tools. The real issue is whether OMICS will give specific recommendations.</font> | <font 11pt/Arial;;initial;;#ffffff>At the last meeting, the OMICS group discussed the mapping of Hi-C reads. Results were largely consistent between groups; the main differences were single-end vs. paired-end mode and filtering. Burak at DCIC is still comparing different tools. The real issue is whether OMICS will give specific recommendations.</font> | ||
Line 9: | Line 9: | ||
<font 11pt/Arial;;initial;;#ffffff>Over the last week, Soo and Neva compared different parameters for the mapping.</font> | <font 11pt/Arial;;initial;;#ffffff>Over the last week, Soo and Neva compared different parameters for the mapping.</font> | ||
- | ===== Hi-C Read Alignment and Data Standards ===== | + | ===== Hi-C Read Alignment and Data Standards ===== |
- | <font 11pt/Arial;;initial;;#ffffff>(Please refer to Soo Lee’s slides)</font> | + | (Please refer to Soo Lee’s slides) |
- | ==== Reporting of chimeric alignment (-5 and -M flags) ==== | + | ==== Reporting of chimeric alignment (-5 and -M flags) ==== |
<font 11pt/Arial;;initial;;#ffffff>A chimeric alignment will have one soft-clipped and one hard-clipped read.</font> | <font 11pt/Arial;;initial;;#ffffff>A chimeric alignment will have one soft-clipped and one hard-clipped read.</font> | ||
Line 25: | Line 25: | ||
<font 14.6667px/Arial;;initial;;#ffffff>The -M flag marks the</font> <font 14.6667px/Arial;;inherit;;inherit>hard-clipped read as a "secondary alignment". This flag / reporting style is very widely used in the genomics community.</font> | <font 14.6667px/Arial;;initial;;#ffffff>The -M flag marks the</font> <font 14.6667px/Arial;;inherit;;inherit>hard-clipped read as a "secondary alignment". This flag / reporting style is very widely used in the genomics community.</font> | ||
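As an illustration of how these reporting styles show up in aligner output (a sketch added for clarity, not taken from Soo Lee's slides): in the SAM format, a secondary alignment has FLAG bit 0x100 set and a supplementary alignment has bit 0x800 set, while soft and hard clipping appear as S and H operations in the CIGAR string. The helper names below are hypothetical.

```python
import re

# FLAG bits defined by the SAM specification.
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def alignment_type(flag):
    """Classify a SAM record as primary, secondary, or supplementary."""
    if flag & SUPPLEMENTARY:
        return "supplementary"
    if flag & SECONDARY:
        return "secondary"
    return "primary"


def clipping(cigar):
    """Report whether a CIGAR string contains hard (H) or soft (S) clipping."""
    ops = set(re.findall(r"\d+([MIDNSHP=X])", cigar))
    if "H" in ops:
        return "hard"
    if "S" in ops:
        return "soft"
    return "none"


# Hypothetical records for the two parts of one chimeric read.
print(alignment_type(65), clipping("60M41S"))
print(alignment_type(321), clipping("60H41M"))
```

With bwa mem defaults, the shorter part of a split read is reported as supplementary (0x800); with -M it is reported as secondary (0x100) instead, which is the distinction a downstream filter would need to check.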
- | ==== Single-end vs. Paired-end mode ==== | + | ==== Single-end vs. Paired-end mode ==== |
<font 13.3333px/arial;;initial;;#ffffff>We reported that paired-end mode with -SP produces equivalent results to single-end mode. We (Soo and Neva) investigated this further.</font> | <font 13.3333px/arial;;initial;;#ffffff>We reported that paired-end mode with -SP produces equivalent results to single-end mode. We (Soo and Neva) investigated this further.</font> | ||
Line 50: | Line 50: | ||
* <font 13.3333px/arial;;initial;;#ffffff>DCIC will confirm that SE and PE runtimes are roughly identical.</font> | * <font 13.3333px/arial;;initial;;#ffffff>DCIC will confirm that SE and PE runtimes are roughly identical.</font> | ||
- | ===== Hi-C Normalization Procedures ===== | + | ===== Hi-C Normalization Procedures ===== |
<font 11pt/Arial;;initial;;#ffffff>Methods that model particular bias modalities and attempt to correct for them;</font> | <font 11pt/Arial;;initial;;#ffffff>Methods that model particular bias modalities and attempt to correct for them;</font> | ||
Line 56: | Line 56: | ||
<font 11pt/Arial;;initial;;#ffffff>Matrix balancing methods (KR balancing, pre-filter the matrix).</font> | <font 11pt/Arial;;initial;;#ffffff>Matrix balancing methods (KR balancing, pre-filter the matrix).</font> | ||
- | * | + | * <font 11pt/arial;;initial;;#ffffff>Is this the right thing to do? It assumes every bit of the genome has the same probability of contacting some other bit. However, this may not be true.</font> |
- | + | * <font 11pt/arial;;initial;;#ffffff>Each method has its pros and cons; the resulting chromosomal features may differ between the two approaches.</font> |
- | <font 11pt/arial;;initial;;#ffffff>Is this the right thing to do? It assumes every bit of the genome has the same probability of contacting some other bit. However, this may not be true.</font> | + | * <font 11pt/arial;;initial;;#ffffff>A comparison may be presented for a particular chromosomal feature under different normalization methods.</font> |
- | + | * <font 11pt/arial;;initial;;#ffffff>We need to find common criteria for choosing normalization methods; different normalizations may be used for detecting different features.</font> |
- | * | + | |
- | + | |
- | <font 11pt/arial;;initial;;#ffffff>Each method has its pros and cons; the resulting chromosomal features may differ between the two approaches.</font> | + | |
- | + | |
- | * | + | |
- | + | |
- | <font 11pt/arial;;initial;;#ffffff>A comparison may be presented for a particular chromosomal feature under different normalization methods.</font> | + | |
- | + | |
- | * | + | |
- | + | |
- | <font 11pt/arial;;initial;;#ffffff>We need to find common criteria for choosing normalization methods; different normalizations may be used for detecting different features.</font> | + | |
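To make the equal-visibility assumption behind matrix balancing concrete, here is a minimal pure-Python sketch. It uses simple iterative scaling (divide each entry by the relative coverage of its row and column until all row sums agree), not the KR algorithm itself, which reaches the same kind of balanced matrix with a faster Newton-type solver. The function name, the toy matrix, and the handling of empty bins are assumptions for illustration only.

```python
def balance(matrix, n_iter=200, tol=1e-9):
    """Iteratively scale a symmetric contact matrix so all row sums are equal.

    Returns (balanced, bias), where
    balanced[i][j] == matrix[i][j] / (bias[i] * bias[j]).
    Bins with zero coverage are left untouched (a crude stand-in for
    pre-filtering the matrix).
    """
    n = len(matrix)
    work = [[float(v) for v in row] for row in matrix]
    bias = [1.0] * n
    for _ in range(n_iter):
        sums = [sum(row) for row in work]
        covered = [s for s in sums if s > 0]
        if not covered:
            break
        mean = sum(covered) / len(covered)
        # Relative coverage of each bin in the current working matrix.
        scale = [s / mean if s > 0 else 1.0 for s in sums]
        for i in range(n):
            for j in range(n):
                work[i][j] /= scale[i] * scale[j]
        bias = [b * s for b, s in zip(bias, scale)]
        if max(abs(s - 1.0) for s in scale) < tol:
            break
    return work, bias


# Toy 3-bin symmetric matrix (made-up counts).
raw = [[10, 4, 2],
       [4, 8, 1],
       [2, 1, 6]]
balanced, bias = balance(raw)
print([round(sum(row), 6) for row in balanced])
```

After balancing, every bin has the same total signal, which is exactly the assumption questioned above: if some loci genuinely make more contacts than others, that difference is absorbed into the bias vector.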
<font 11pt/Arial;;initial;;#ffffff>Reproducibility needs to be evaluated carefully, and different normalization methods should provide good reproducibility. Which reproducibility metric to use also needs to be determined. Hi-C rep may be a good candidate for evaluation.</font> | <font 11pt/Arial;;initial;;#ffffff>Reproducibility needs to be evaluated carefully, and different normalization methods should provide good reproducibility. Which reproducibility metric to use also needs to be determined. Hi-C rep may be a good candidate for evaluation.</font> | ||
Line 76: | Line 65: | ||
<font 11pt/Arial;;initial;;#ffffff>We can apply different normalization methods on the same dataset (same Hi-C file) and use known features (SHH, for example) as criteria.</font> | <font 11pt/Arial;;initial;;#ffffff>We can apply different normalization methods on the same dataset (same Hi-C file) and use known features (SHH, for example) as criteria.</font> | ||
- | * | + | * <font 11pt/arial;;initial;;#ffffff>This comparison will be coordinated with DCIC.</font> |
- | + | * <font 11pt/arial;;initial;;#ffffff>Neva can apply different normalization methods to the file and let people view the final results across the genome to check them.</font> |
- | <font 11pt/arial;;initial;;#ffffff>This comparison will be coordinated with DCIC.</font> | + | |
- | + | |
- | * | + | |
- | + | |
- | <font 11pt/arial;;initial;;#ffffff>Neva can apply different normalization methods to the file and let people view the final results across the genome to check them.</font> | + | |
<font 11pt/Arial;;initial;;#ffffff>A very high-depth data file at different resolutions would be preferred, such as the 1kb-resolution ones.</font> | <font 11pt/Arial;;initial;;#ffffff>A very high-depth data file at different resolutions would be preferred, such as the 1kb-resolution ones.</font> | ||
- | * | + | * <font 11pt/arial;;initial;;#ffffff>However, those high-res maps are expensive to run (and few labs other than Erez’s lab are generating them), so focusing on 5kb or 10kb resolutions may fit actual usage better.</font> |
- | + | * <font 11pt/arial;;initial;;#ffffff>The standards will be for the community and will last for a long time. Even if we choose a hi-res dataset, we can still sub-sample it to generate lower-res ones. Also, visualization may not be a very good method of inspection, and some quantitative methods may need to be agreed upon.</font> |
- | <font 11pt/arial;;initial;;#ffffff>However, those high-res maps are expensive to run (and few labs other than Erez’s lab are generating them), so focusing on 5kb or 10kb resolutions may fit actual usage better.</font> | + | * <font 11pt/arial;;initial;;#ffffff>The initial normalization applies to low-resolution datasets, and theoretically it can be applied at every resolution (although no hi-res datasets have been tested so far).</font> |
- | + | |
- | * | + | |
- | + | |
- | <font 11pt/arial;;initial;;#ffffff>The standards will be for the community and will last for a long time. Even if we choose a hi-res dataset, we can still sub-sample it to generate lower-res ones. Also, visualization may not be a very good method of inspection, and some quantitative methods may need to be agreed upon.</font> | + | |
- | + | |
- | * | + | |
- | + | |
- | <font 11pt/arial;;initial;;#ffffff>The initial normalization applies to low-resolution datasets, and theoretically it can be applied at every resolution (although no hi-res datasets have been tested so far).</font> | + | |
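The point about deriving lower-res data from a hi-res dataset can be sketched as two separate operations: thinning the contacts to mimic lower sequencing depth, and merging adjacent bins to get a coarser resolution. The snippet below is only an illustration under assumed conventions: contacts are stored as an upper-triangle pixel table mapping (bin_i, bin_j) with bin_i <= bin_j to a count, and the function names and toy counts are hypothetical.

```python
import random


def subsample(pixels, fraction, seed=0):
    """Keep each individual contact independently with probability `fraction`."""
    rng = random.Random(seed)
    out = {}
    for key, count in pixels.items():
        kept = sum(1 for _ in range(count) if rng.random() < fraction)
        if kept:
            out[key] = kept
    return out


def coarsen(pixels, factor):
    """Merge `factor` adjacent bins, e.g. factor=2 turns 5kb bins into 10kb bins."""
    out = {}
    for (i, j), count in pixels.items():
        key = (i // factor, j // factor)
        out[key] = out.get(key, 0) + count
    return out


# Toy upper-triangle pixel table at the finer resolution (made-up counts).
fine = {(0, 0): 5, (0, 1): 3, (1, 1): 2, (0, 2): 4, (2, 3): 1}
print(coarsen(fine, 2))
print(subsample(fine, 0.5))
```

Coarsening preserves the total number of contacts, whereas subsampling reduces it, so the two answer different questions when comparing normalization methods across resolutions and depths.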
<font 11pt/Arial;;initial;;#ffffff>Normalization methods may be tied to the type of feature calls people are using, but this will make it hard to converge on standards.</font> | <font 11pt/Arial;;initial;;#ffffff>Normalization methods may be tied to the type of feature calls people are using, but this will make it hard to converge on standards.</font> | ||
Line 106: | Line 82: | ||
<font 11pt/Arial;;initial;;#ffffff>We can use GM12878 5kb Hi-C datasets (chromosome 1 or chromosome 18) and let people send in normalization factors.</font> | <font 11pt/Arial;;initial;;#ffffff>We can use GM12878 5kb Hi-C datasets (chromosome 1 or chromosome 18) and let people send in normalization factors.</font> | ||
- | ===== AGENDA ===== | + | ===== AGENDA ===== |
<font 11pt/Arial;;initial;;#ffffff>Chimeric reads simulation (Neva, Burak and Soo)</font> | <font 11pt/Arial;;initial;;#ffffff>Chimeric reads simulation (Neva, Burak and Soo)</font> | ||
Line 113: | Line 89: | ||
<font 11pt/Arial;;initial;;#ffffff>Ask other groups about their research</font> | <font 11pt/Arial;;initial;;#ffffff>Ask other groups about their research</font> | ||
- | |||
</WRAP>~~TABLE_CELL_WRAP_STOP~~| | </WRAP>~~TABLE_CELL_WRAP_STOP~~| | ||