[maker-devel] open3: fork failed: Cannot allocate memory
Carson Holt
carsonhh at gmail.com
Fri Sep 21 07:28:34 MDT 2012
Exclusive access doesn't necessarily mean that memory is not full,
especially if you are running via MPI. Under MPI each MAKER instance will
have a separate memory footprint (the sum of which may be too high for
your system). You can lower the memory footprint per instance by setting
all of the blast depth options in the maker_bopts.ctl file. Set them to
30, for example. Also, if you have set max_dna_len to a value greater
than 100000, bring it back down. A value of 200000, for example, will
cause MAKER to use twice as much memory as 100000.
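For reference, the relevant control file lines would look something like
the following (the exact option names, e.g. depth_blastn, depth_blastx,
and depth_tblastx, are given here from memory, so double check them
against the comments in your own maker_bopts.ctl and maker_opts.ctl):

# maker_bopts.ctl - keep only the best 30 alignments per evidence type
depth_blastn=30
depth_blastx=30
depth_tblastx=30

# maker_opts.ctl - process contigs in 100,000 bp chunks
max_dna_len=100000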
The primary cause of high memory usage is too much evidence aligning in a
single region. Setting the depth options to 30 will cause MAKER to sort
the alignments as they come in and drop all but the best 30 alignments.
Each instance of MAKER will usually take up 500 MB to 1 GB of RAM when
max_dna_len is set to 100,000, and twice that when max_dna_len is set to
200,000.
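As a rough back-of-the-envelope estimate using those figures (actual
usage will vary with how much evidence aligns to each region):

12 MAKER instances with max_dna_len=100000  ->  ~12 x 1 GB = ~12 GB total
12 MAKER instances with max_dna_len=200000  ->  ~12 x 2 GB = ~24 GB total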
Alternatively, if you are also using MPI, you can set the job up to launch
fewer instances of MAKER but allow each instance to use more CPUs. You do
this by halving the value of the -n option given to mpiexec and then
setting cpus=2 in the maker_opts.ctl file. The result is that BLAST
analyses will run on 2 CPUs each, but MAKER will only keep the results
for 6 jobs in memory at a time rather than 12 (i.e. a lower memory
footprint). When using this technique, make sure the hostfile used by
mpiexec is set to follow a round-robin distribution, or all instances
will start on half of your nodes while the other half are left idle (see
the example command after the hostfile listings below).
Example - this is correct:
host1
host2
host3
host1
host2
host3
host1
host2
host3
This is incorrect:
host1
host1
host1
host2
host2
host2
host3
host3
host3
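Putting that together, the launch command would look something like the
following (the hostfile flag is an assumption on my part, since it
depends on your MPI implementation: MPICH/Hydra uses -f while Open MPI
uses --hostfile; hostfile.txt is just a placeholder name):

mpiexec -n 6 -f hostfile.txt maker -cpus 2

(Setting cpus=2 in maker_opts.ctl instead of passing -cpus 2 on the
command line accomplishes the same thing.)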
FYI, when using MPI, the -n parameter given to mpiexec and the -cpus
parameter given to MAKER have a multiplicative effect on CPU usage.
Example:
mpiexec -n 12 maker -cpus 1 (will launch 12 maker instances and use a
total of 12 cpus)
mpiexec -n 12 maker -cpus 2 (will launch 12 maker instances but use a
total of 24 cpus)
mpiexec -n 6 maker -cpus 4 (will launch 6 maker instances but use a total
of 24 cpus)
Why is this? It's because -cpus affects how many processors are used to
solve a single 100,000 bp chunk (per instance), while mpiexec -n sets how
many simultaneous instances run, and as a result how many chunks are
being worked on together.
So mpiexec -n will cause memory usage to increase in a way that -cpus
doesn't, but mpiexec also scales across machines for parallelization
where -cpus can't (its use applies to a single machine).
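To summarize the scaling in one place (these are just restatements of
the numbers above):

total CPUs used    ~= (mpiexec -n) x (maker -cpus)
total memory used  ~= (mpiexec -n) x (RAM per MAKER instance)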
Thanks,
Carson
On 12-09-19 12:41 PM, "Yunfei Guo" <guoyunfei1989 at gmail.com> wrote:
>Hi Carson,
>
>Sorry to bother you again because I really can't find a solution to
>it. Have you seen this error before? I simply started multiple jobs in
>the same directory and got the following msg after hours of running.
>
>From Admin:
>"Your batch job 2511011 encountered the following error:
> open3: fork failed: Cannot allocate memory
> ERROR: Failed while polishing ESTs
> ERROR: Chunk failed at level:2, tier_type:2
> FAILED CONTIG:scaffold740
>It continued generating errors in a loop until the /var filesystem filled.
>I will cancel the job. You may find it useful within your job script to
>check for errors and exit if present."
>
>And this is not because of lack of memory since this job has exclusive
>access to the node. Let me know if you want the entire stderr file.
>Thanks.
>