OMICS WG MEETING 08-14-2017
\
Agenda:
\
Continue discussion of an updated ChIA-PET protocol and guideline (Yijun Ruan)
A proposed DCIC Hi-C processing pipeline (here). (Peter Park)
\
DISCUSSION
\
1. Continue discussion of an updated ChIA-PET protocol and guideline (Yijun Ruan)
Vote on and approve the ChIA-PET experimental protocols so that 4DN can accept ChIA-PET data.
The ChIA-PET experimental protocol contains very strict controls for library construction. MiSeq is used to generate library control data and, once the library is validated, sequencing on the HiSeq 4000 is carried out.
Key feature of ChIA-PET processing: to identify enriched chromatin interactions, singleton data will be used as additional data.
Interspecies data were used to assess noise.
ChIA-PET is able to detect haplotype-based and allele-based interactions, and provides single-nucleotide resolution of chromatin interactions and SNP-based validation.
* QC metrics: would both metrics be expected?
* Standard protocols will include both wet lab and dry lab parts.
* Separate the experimental protocol from the data analysis; the data analysis part has not yet been agreed upon within the 4DN data analysis groups.
* From the data analysis side: some aspects, such as file format and alignment parameters, have not yet been determined at DCIC.
* DCIC would prefer to use the same alignment methods for both Hi-C and ChIA-PET; the same goes for file formats.
* For example, ChIA-PET currently generates contact map files in a format different from the one DCIC uses for Hi-C. DCIC would like to adopt a format that is supported by downstream tools such as Juicer.
* Otherwise, when the data are released to the broader community, outside users may complain about the conflicting formats.
* From the data generators' side: such a solution may not be transferable, since the two technologies are fundamentally different and generate different types of reads; therefore, mapping performance results from Hi-C may not apply to ChIA-PET.
* ChIA-PET currently uses BWA and, unless DCIC thinks otherwise, it should be quite robust for the analysis.
* The current standard can still be improved and should not be set in stone.
* If no major changes are made to the algorithms or pipelines, the standard may be approved now to enable data submission.
* ChIA-PET is a fairly mature technology and a full pipeline has been developed by Yijun’s lab. The file format issue can be separated from the approval of the standards document and handled as a file format conversion task that DCIC can help solve.
* Using the current pipeline lets data flow now; improvements can be made later.
* DCIC may be in a better position to convert the files to a format that better suits the community, and data generators should support DCIC by providing the necessary information on their formats. The conversion would also need to be accepted by the data generators (they should be comfortable with the representation of their data).
\
2. A proposed DCIC Hi-C processing pipeline (here). (Peter Park)
Erez pointed out that Juicer already implements the entire pipeline, which appears to have been reimplemented by DCIC, and raised the concern that this might set a bad precedent of open-source projects being reimplemented instead of adapted.
However, DCIC will need to incorporate all the other groups’ data and has made some improvements (for example, in the programming language used: DCIC’s implementation uses Python instead of AWK and Perl as in Juicer). This working group and DCIC would be responsible for implementing anything approved in the standard. DCIC would also need to implement the parts that must conform to DCIC's infrastructure.
Another call (regular or specially scheduled) may be held to hear Erez’s proposal and resolve this issue.