[maker-devel] PARALLELIZED DE NOVO GENOME ANNOTATION WITHOUT MPI
Carson Holt
carsonhh at gmail.com
Wed Mar 1 13:36:17 MST 2017
If you submit too many simultaneous MAKER runs, file locks will start to collide and the runs will slow each other down. You should submit fewer simultaneous jobs and use MPI instead (MAKER must be configured and compiled to use MPI).
An example MPI launch command for running on 200 CPUs on a cluster:
mpiexec -n 200 maker 2> maker_mpi1.error
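A minimal sketch of wrapping that launch line in a small shell function, so the process count and error log name are set in one place. The function name `maker_mpi_cmd` and the dry-run behavior (it prints the command rather than executing it) are illustrative assumptions and not part of MAKER; only the `mpiexec -n N maker 2> FILE` form is taken from the command above.

```shell
#!/bin/sh
# Hypothetical helper: print the MPI launch line for a given process count.
# It only echoes the command (a dry run); it does not start MAKER.
maker_mpi_cmd() {
    nprocs=$1                         # number of MPI processes, e.g. 200
    errlog=${2:-maker_mpi1.error}     # file that will receive stderr
    case $nprocs in
        ''|*[!0-9]*|0)
            # Reject empty, non-numeric, or zero process counts.
            echo "usage: maker_mpi_cmd NPROCS [ERRLOG]" >&2
            return 1
            ;;
    esac
    printf 'mpiexec -n %s maker 2> %s\n' "$nprocs" "$errlog"
}

# Example: show the command for 200 processes.
maker_mpi_cmd 200 maker_mpi1.error
```

Printing the line first makes it easy to check the process count against what the cluster scheduler actually allocated before pasting it into a job script.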
—Carson
> On Feb 27, 2017, at 8:25 AM, Quanwei Zhang <qwzhang0601 at gmail.com> wrote:
>
> Hello:
>
> I am doing genome annotation using MAKER on our high-performance computing (HPC) cluster. Due to some issues with MPI, I submitted the MAKER job several times under the same directory on the cluster. Following the example in the protocol (shown below), I run every job except the first as a background process using "&". Is this necessary when submitting jobs to an HPC cluster? The run took much longer than I expected (based on a test with a smaller data set). Could running the processes in the background be the cause?
>
> The example in the protocol
> % maker 2> maker1.error
> % maker 2> maker2.error &
> % maker 2> maker3.error &
> ......
>
> BTW, will annotating a shorter contig (e.g., 500 bp) take about 1/100 of the time needed to annotate a 50,000 bp contig? I am using SNAP for ab initio prediction, with an RNA-seq assembly and protein sequences as evidence. More than half of my contigs are shorter than 300 bp (their total length is only about 5% of the total length of all contigs). I want to know whether ignoring those short contigs would save about half of the time, or only about 5%.
>
> Thanks
>
> Best
> Quanwei
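For the short-contig question in the quoted message, a rough sketch of how one might measure what share of contigs and of total sequence falls below a length cutoff before deciding whether to drop them. The function name `short_contig_stats` and the file name `genome.fasta` are hypothetical, and the numbers it reports describe sequence content only; they are not a prediction of MAKER run time.

```shell
#!/bin/sh
# Hypothetical helper: summarize how much of a FASTA assembly lies in
# contigs shorter than a cutoff.
# Prints four numbers: short_count total_count short_bp total_bp
short_contig_stats() {
    fasta=$1            # path to the genome FASTA
    cutoff=${2:-300}    # contigs with length < cutoff count as "short"
    awk -v cutoff="$cutoff" '
        # Record the contig that just ended, if any.
        function tally() {
            if (seen) {
                total_n++; total_bp += len
                if (len < cutoff) { short_n++; short_bp += len }
            }
        }
        /^>/ { tally(); seen = 1; len = 0; next }    # header starts a new contig
        { gsub(/[ \t\r]/, ""); len += length($0) }   # sequence line: add its length
        END {
            tally()
            printf "%d %d %d %d\n", short_n, total_n, short_bp, total_bp
        }
    ' "$fasta"
}

# Example (assumed file name):
# short_contig_stats genome.fasta 300
```

Comparing the first pair of numbers (contig counts) with the second pair (base pairs) shows directly whether the short contigs are "half the contigs" but only a small fraction of the sequence, as described in the question.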