The Tissue specificity step (calculation + plot), as well as Neighbor distance and GC content, requires a lot of memory on large BED files (ENCFF303ZGP, at 3.7 GB, is currently the largest file in the Bedbase database).
For testing, I submitted a slurm job for each step in `bedstat` separately and collected the elapsed time and memory use with the `sacct` command; the results for different BED file sizes are in the table below.
| Step | Bed file | Size | Time | Memory (KB) |
|------|----------|------|------|-------------|
| TSS | ENCFF950CZM | 2.1 MB | 00:00:43 | 867192 |
| | ENCFF610FVD | 41.9 MB | 00:00:32 | 1198232 |
| | ENCFF349STD | 744.1 MB | 00:03:46 | 4207212 |
| | ENCFF303ZGP | 3.7 GB | 00:03:38 | 12977208 |
| chrom_bin | ENCFF950CZM | 2.1 MB | 00:00:55 | 794532 |
| | ENCFF610FVD | 41.9 MB | 00:00:50 | 1114972 |
| | ENCFF349STD | 744.1 MB | 00:04:10 | 4008000 |
| | ENCFF303ZGP | 3.7 GB | 00:03:38 | 11061980 |
| GC_content | ENCFF950CZM | 2.1 MB | 00:01:22 | 883388 |
| | ENCFF610FVD | 41.9 MB | 00:05:24 | 3893472 |
| | ENCFF349STD | 744.1 MB | 01:34:21 | 44189576 |
| | ENCFF303ZGP | 3.7 GB | 05:42:47 | 127170468 |
| partitions (+ Expected partition, Cumulative partition) | ENCFF950CZM | 2.1 MB | 00:00:46 | 861968 |
| | ENCFF610FVD | 41.9 MB | 00:00:38 | 1125128 |
| | ENCFF349STD | 744.1 MB | 00:03:56 | 9159448 |
| | ENCFF303ZGP | 3.7 GB | 00:16:29 | 54185920 |
| Qthist | ENCFF950CZM | 2.1 MB | 00:00:42 | 880560 |
| | ENCFF610FVD | 41.9 MB | 00:00:26 | - |
| | ENCFF349STD | 744.1 MB | 00:00:39 | 1542740 |
| | ENCFF303ZGP | 3.7 GB | 00:01:44 | 8402252 |
| Neighbor distance | ENCFF950CZM | 2.1 MB | 00:00:46 | 867540 |
| | ENCFF610FVD | 41.9 MB | 00:03:32 | 3529856 |
| | ENCFF349STD | 744.1 MB | 01:27:17 | 44840124 |
| | ENCFF303ZGP | 3.7 GB | 04:52:45 | 140088948 |
| Tissue specificity | ENCFF950CZM | 2.1 MB | 00:01:11 | 3229968 |
| | ENCFF610FVD | 41.9 MB | 00:02:45 | 10113384 |
| | ENCFF349STD | 744.1 MB | 00:23:21 | 134680844 |
| | ENCFF303ZGP | 3.7 GB | 00:33:05 | 148197020 |