2. Input format
Ming Gu edited this page Sep 24, 2023 · 5 revisions
We downloaded the GEO dataset GSE63525 from Rao et al. (2014) to provide a standalone usage example for DiffDomain.
The example data are saved in <data/>:
- GM12878 TADs.
- GM12878 combined Hi-C data on Chr1.
- K562 combined Hi-C data on Chr1.
DiffDomain supports input Hi-C data in .hic, .cool, .mcool, and .tsv formats. Specifically:
- If the name of your Hi-C file ends with '.hic', we extract its data with hicstraw from the Aiden Lab.
- Starting with version 0.2.1, DiffDomain can process input in .cool format. If the name of the input file ends with '.cool' or '.mcool', we extract its data with cooler and transform it with hicexplorer.
- If the name of your Hi-C file ends with '.tsv', we read it as a tab-separated file.
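The extension rules above can be sketched as a small lookup. This is only an illustration of the dispatch logic, not DiffDomain's actual code: the function name `detect_hic_reader`, the returned labels, and the example file names are all assumptions made for this sketch.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the tool named in the list above.
# The labels are plain strings for illustration; no library is called here.
READERS = {
    ".hic": "hicstraw",
    ".cool": "cooler + hicexplorer",
    ".mcool": "cooler + hicexplorer",
    ".tsv": "tab-separated reader",
}


def detect_hic_reader(path):
    """Return the label of the reader implied by the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"Unsupported Hi-C file extension: {suffix!r}")
    return READERS[suffix]


# Example file names are made up for this sketch.
for name in ["GM12878_chr1.hic", "K562.mcool", "matrix.tsv"]:
    print(name, "->", detect_hic_reader(name))
```

Note that '.mcool' needs its own entry: a name ending in '.mcool' does not end with the five characters '.cool', so checking for '.cool' alone would miss it.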
In 'dvsd multiple', we expect a tab-separated ('\t') BED file of TADs, the tadlist of condition 1 (hic0), like the one below. Its first column is the chromosome name, its second column is the locus where the TAD starts, and its third column is the locus where the TAD ends.
We use only the first 3 columns of a tadlist (framed in red). Whatever its column names are, the file must have a header row (framed in green).
In 'classification', we also expect the tadlist of condition 2 (hic1), in the same format.
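As a rough illustration of the tadlist layout described above, the following sketch reads a tab-separated file, skips the header row, and keeps only the first three columns. It uses Python's standard `csv` module and is not DiffDomain's own parser; the function name `read_tadlist`, the column names, and the coordinates in the sample are invented for the example.

```python
import csv
import io


def read_tadlist(handle):
    """Parse a tab-separated tadlist into (chrom, start, end) tuples."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # the header row is required, but its names are ignored
    tads = []
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        # Only the first three columns are used; extra columns are dropped.
        tads.append((row[0], int(row[1]), int(row[2])))
    return tads


# Made-up sample with a header and an extra fourth column.
sample = (
    "chr\tstart\tend\tscore\n"
    "1\t1000000\t1500000\t0.9\n"
    "1\t1500000\t2200000\t0.7\n"
)
tads = read_tadlist(io.StringIO(sample))
print(tads)
```

Because the header is skipped unconditionally, a file without one would silently lose its first TAD, which is why the header row matters.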