[maker-devel] Running MAKER using MPI over SGE scheduler with multiple nodes

Carson Holt carsonhh at gmail.com
Tue Aug 21 11:14:55 MDT 2018


Hi Lior,

I know it can be done, but I have never had to do it on SGE. It requires both that SGE be set up for multi-node parallel jobs and that OpenMPI be set up to work with SGE. If this is not set up already, you may have to involve your IT manager.

There are a number of documentation sources on how to do this:
https://www.open-mpi.org/faq/?category=sge
SGE Parallel Environment - Softpanorama <http://www.softpanorama.org/HPC/Grid_engine/parallel_environment.shtml>
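
As a rough sketch of what to check before involving IT, the commands below look for SGE support in the Open MPI build and print the parallel environment (PE) definition. The PE name openmpi-x86_64 is taken from the job script quoted further down and is only an assumption for any other cluster. The function name check_sge_mpi is made up for illustration; ompi_info and qconf are the standard Open MPI and SGE tools, but their exact output varies by version.

```shell
#!/bin/bash
# Sketch: check whether Open MPI and SGE look ready for multi-node MPI jobs.
# PE_NAME is an assumption (copied from the quoted job script); adjust as needed.
PE_NAME="${PE_NAME:-openmpi-x86_64}"

check_sge_mpi() {
    # An Open MPI build with SGE support lists a "gridengine" component.
    if command -v ompi_info >/dev/null 2>&1; then
        ompi_info | grep -i gridengine \
            || echo "no gridengine component found in this Open MPI build"
    else
        echo "ompi_info not found on PATH"
    fi

    # Print the PE definition; its allocation_rule line controls how the
    # requested slots are distributed across hosts.
    if command -v qconf >/dev/null 2>&1; then
        qconf -sp "$PE_NAME" \
            || echo "parallel environment $PE_NAME not found; list PEs with: qconf -spl"
    else
        echo "qconf not found on PATH"
    fi
}

check_sge_mpi
```

If the allocation_rule shown is $pe_slots, SGE places all slots of a job on a single host, so a multi-node run would need a PE with a rule such as $fill_up or $round_robin. With a working setup, the usual pattern is to request the total slot count in the job script (for example "#$ -pe openmpi-x86_64 100") and launch with "mpiexec -n $NSLOTS maker", where NSLOTS is set by SGE. This is untested on the cluster in question.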

Alternatively, submit multiple MAKER jobs, each to a single node. All jobs can write to the same output directory while using different input FASTA files (i.e. chunks of the original FASTA): give every job the same -base value and a different -g file on the maker command line. You can chunk the FASTA using fasta_tool (bundled with MAKER) and its --chunk option.

Example:
fasta_tool --chunk 10 assembly.fasta

#job 1
mpiexec -n 20 maker -base assembly -g assembly_00.fasta

#job 2
mpiexec -n 20 maker -base assembly -g assembly_01.fasta

#job 3
mpiexec -n 20 maker -base assembly -g assembly_02.fasta

# and so on …
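
Writing the per-chunk jobs by hand gets tedious, so here is a minimal sketch that generates one SGE job script per chunk. It assumes the chunks are named assembly_NN.fasta as in the example above and sit in the submission directory. The PE name and slot count are copied from the job script quoted below, and the function name write_maker_job and the maker_*.sh file names are made up for illustration.

```shell
#!/bin/bash
# Sketch: write one SGE job script per FASTA chunk (does not submit anything).
# Assumptions: chunk naming, PE name and slot count; override via environment.
PE_NAME="${PE_NAME:-openmpi-x86_64}"
SLOTS="${SLOTS:-20}"
BASE="${BASE:-assembly}"

write_maker_job() {
    local chunk="$1"
    local name script
    name="$(basename "$chunk" .fasta)"
    script="maker_${name}.sh"
    # Every job shares the same -base, so all results land in one output directory.
    cat > "$script" <<EOF
#!/bin/bash
#\$ -N maker_${name}
#\$ -S /bin/bash
#\$ -cwd
#\$ -pe ${PE_NAME} ${SLOTS}
mpiexec -n ${SLOTS} maker -base ${BASE} -g ${chunk}
EOF
    echo "$script"
}

for chunk in "${BASE}"_*.fasta; do
    [ -e "$chunk" ] || continue   # no chunks found: nothing to do
    write_maker_job "$chunk"
done
```

Each generated script can then be submitted with qsub, for instance: for s in maker_assembly_*.sh; do qsub "$s"; done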


Thanks,
Carson


> On Aug 8, 2018, at 5:51 AM, Lior Glick <liorglic at mail.tau.ac.il> wrote:
> 
> Dear MAKER users,
> 
> I am running MAKER in order to annotate a large plant genome. To improve performance, I use the MPI option as described in the documentation (specifically openMPI). The machines I currently have access to are part of a cluster on which SGE is used as the job scheduler. There are about 15 machines, each with 20 cores. Therefore, in order to run, I create files that look something like this:
> 
> #!/bin/bash
> #$ -N try_MAKER
> #$ -S /bin/bash
> #$ -e /path/to/err
> #$ -o /path/to/out
> #$ -pe openmpi-x86_64 20
> cd /path/to/run_dir
> mpiexec -n 20 --mca btl tcp,self maker
> 
> I then just qsub the file.
> This works fine, but I'd like to use more than 20 cores, which means I need to use multiple nodes of the cluster. Simply increasing the number of requested cores (e.g. mpiexec -n 100) does not work; it keeps using 20 cores of a single node.
> I see this is possible when the scheduler used is PBS <https://www.osc.edu/supercomputing/batch-processing-at-osc/pbs-directives-summary> (using the nodes:ppn option), but couldn't find any examples/instructions regarding SGE.
> Can anyone help me figure it out? Has anyone done this on SGE?
> 
> Thanks a lot and best regards,
> Lior
