[maker-devel] MAKER processing time in a 2Gb genome
Ganko Eric USRE
eric.ganko at syngenta.com
Mon Jul 6 09:37:28 MDT 2015
I'm hoping for some advice on an unexpectedly long processing time for a 2 Gb genome. I'm currently using an install of MAKER-P on the iForge system at NCSA, and I've successfully run ~1 Gb genomes in 2-3 hours across 20 nodes (each with 24 Intel "Haswell" cores and 64 GB of RAM) via MPICH.
I recently ran some tests on 50 Mb of corn that took ~2 hours on 2 nodes (48 cores). Based on that, I was surprised when the full 2 Gb corn genome run timed out at >24 h with 30 nodes (720 cores). In that time it hadn't processed many sequences, based on the master_datastore_index.log:
TOTAL: 25000 seqs
STARTED: 3594
FINISHED: 2979
FAILED: 10
RETRY: 9
DIED_SKIPPED_PERMANENT: 0
SKIPPED_SMALL: 7635
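For a rough sense of scale, the figures above can be turned into a back-of-the-envelope estimate. The sketch below assumes run time scales linearly with genome size and inversely with core count, which is probably optimistic for a repeat-rich genome, and it treats every sequence as equal work, which contigs of different lengths are not. The function names are illustrative and not part of MAKER.

```python
# Back-of-the-envelope estimate from the numbers in this post.
# Assumption: work scales linearly with genome size and perfectly
# across cores. Neither is guaranteed for a real MAKER run.

def expected_hours(test_mb, test_hours, test_cores, full_mb, full_cores):
    """Scale a small test run up to the full genome via core-hours."""
    core_hours_per_mb = test_hours * test_cores / test_mb
    return core_hours_per_mb * full_mb / full_cores


def progress(total, finished, skipped_small):
    """Fraction of sequences finished, ignoring those skipped as too small."""
    eligible = total - skipped_small
    return finished / eligible


# 50 Mb test: ~2 h on 48 cores; full run: ~2 Gb (taken as 2000 Mb) on 720 cores.
est = expected_hours(test_mb=50, test_hours=2, test_cores=48,
                     full_mb=2000, full_cores=720)
print(f"naive estimate for the full run: {est:.1f} h")

# Counts from the master_datastore_index.log summary above.
done = progress(total=25000, finished=2979, skipped_small=7635)
print(f"finished after >24 h: {done:.1%} of eligible sequences")
```

Under these assumptions the naive estimate is about 5.3 hours for the full genome, while the observed run had finished roughly 17% of the non-skipped sequences when it timed out.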
While I can set a longer wall clock limit, these results are several times longer than what was reported in the MAKER-P paper, i.e. running the corn B73 genome in less than 4 hours; here it is not close to done after 24 h. I don't have an enormous amount of supporting data: this trial run has ~100k transcripts and another ~100k proteins.
Corn has a very high repeat content, so my suspicion is RepeatMasker I/O. In discussions with the iForge admins I discovered that the temp space is network attached (GPFS), and they've suggested using a RAM disk (i.e. /dev/shm) as the temp directory. In tests on smaller sequences, that ran a little slower, so I'm not sure if MAKER is meant to run that way. I'd appreciate input from anyone with experience of the RAM disk approach, or any alternative thoughts or suggestions.
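One way to check the I/O suspicion before committing to another long run is to time a burst of small-file writes, reads, and deletes in each candidate temp location. The sketch below is a generic Python micro-benchmark, not part of MAKER; the function name, file count, and file size are arbitrary assumptions, and the locations in the loop at the bottom are placeholders for the real GPFS temp path and the RAM disk.

```python
import os
import tempfile
import time


def time_small_file_io(directory, n_files=2000, size_bytes=4096):
    """Return seconds taken to write, read back, and delete many small files.

    A crude stand-in for a many-small-files temp workload; it does not
    reproduce MAKER's or RepeatMasker's real access pattern.
    """
    payload = b"x" * size_bytes
    start = time.perf_counter()
    # Work inside a private subdirectory so nothing else is touched.
    with tempfile.TemporaryDirectory(dir=directory) as workdir:
        paths = []
        for i in range(n_files):
            path = os.path.join(workdir, f"chunk_{i}.tmp")
            with open(path, "wb") as handle:
                handle.write(payload)
            paths.append(path)
        for path in paths:
            with open(path, "rb") as handle:
                handle.read()
        for path in paths:
            os.remove(path)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder locations: swap in the real GPFS temp path and RAM disk.
    for candidate in (tempfile.gettempdir(), "/dev/shm"):
        if os.path.isdir(candidate) and os.access(candidate, os.W_OK):
            elapsed = time_small_file_io(candidate)
            print(f"{candidate}: {elapsed:.2f} s")
```

It is worth running this on a compute node rather than a login node, since the mounted file systems may differ between the two.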
Thanks,
Eric