[maker-devel] master_datastore_index.log file shrinks.

Carson Holt carsonhh at gmail.com
Tue Mar 19 09:02:22 MDT 2013


Try it with the --no_locks option then.  Make sure to let one instance
finish populating the mpi_blastdb directory before starting any other
instances, as that is where most of the initial locking occurs.

I'll send you more details on how to install with OpenMPI, so you can give
that a shot while your jobs are also running serially (so you don't lose
time).  Also, instead of 50 serial instances, you could try 10 instances
with -cpus set to 5.
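
As a rough sketch of the split-input approach with 10 instances at -cpus 5:
the Python below splits a multi-FASTA into chunks balanced by total sequence
length and prints one command line per chunk. The function names, the
chunk_NN.fasta naming, and the genome.fasta path are all hypothetical, and
the use of -g to pass the genome on the command line is an assumption to
check against `maker -h` for your version. Only -cpus comes from this thread.

```python
from pathlib import Path


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, seq = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(seq)
                header, seq = line[1:], []
            elif header is not None:
                seq.append(line.strip())
    if header is not None:
        yield header, "".join(seq)


def split_fasta(path, n_chunks, out_dir):
    """Write up to n_chunks FASTA files, balancing total bases per chunk."""
    # Largest contigs first, each going to the currently smallest chunk.
    records = sorted(read_fasta(path), key=lambda r: len(r[1]), reverse=True)
    bins = [[] for _ in range(n_chunks)]
    sizes = [0] * n_chunks
    for header, seq in records:
        i = sizes.index(min(sizes))
        bins[i].append((header, seq))
        sizes[i] += len(seq)

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, chunk in enumerate(bins):
        if not chunk:  # fewer contigs than chunks: skip empty files
            continue
        chunk_path = out_dir / f"chunk_{i:02d}.fasta"  # hypothetical naming
        with open(chunk_path, "w") as out:
            for header, seq in chunk:
                out.write(f">{header}\n{seq}\n")
        paths.append(chunk_path)
    return paths


def maker_commands(chunk_paths, cpus=5):
    """Build one command line per chunk (-g is an assumption, see above)."""
    return [f"maker -g {p} -cpus {cpus}" for p in chunk_paths]


if __name__ == "__main__":
    # Hypothetical input path; replace with your own assembly.
    for cmd in maker_commands(split_fasta("genome.fasta", 10, "chunks")):
        print(cmd)
```

Each instance then has its own input fasta, so no two instances ever share
one, which is the condition --no_locks needs.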

Thanks,
Carson



On 13-03-19 11:19 AM, "Michael Nuhn" <mnuhn at ebi.ac.uk> wrote:

>Hello Carson!
>
>On 03/19/2013 02:27 PM, Carson Holt wrote:
>> Yes.  If at all possible use MPI.  It removes the overhead of locks
>> which happen per primary instance of MAKER.  So one maker job using 1000
>> cpus via MPI will have one shared set of locks.  1000 serial instances
>> of MAKER on the other hand would have 1000x the locks.
>
>I don't know a thing about MPI.
>
>I tried installing MAKER (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and Open
>MPI, and none of them worked for me. I also tried the automatic
>installation that comes with MAKER, but it didn't work for me either.
>
>If need be, I could spend time getting to the bottom of this, but there
>is no telling how long this would take me so I'd rather not, if there is
>an alternative.
>
>Would the approach I outlined before work? (Treating the split files as
>separate genomes to annotate and then combining the GFFs afterwards.)
>
>I also like this approach, because I would select a few contigs in the
>beginning which I would run on their own. They would complete early and
>this way I would get a preview of the results of the run instead of
>having to wait for everything to complete.
>
>It might also be more robust, because file locking issues would be
>confined to the instances working on a sequence chunk, but the rest of
>the instances could continue working.
>
>Cheers,
>Michael.
>
>> Alternatively, if you do need to continue without MPI for some reason, I
>> just finished a devel version of MAKER that has a --no_locks option.
>> You must never start two instances using the same input fasta when
>> --no_locks is specified, but the splitting to use different input fastas
>> I mentioned before in the example will still work fine.
>>
>> I have also updated the indexing/reindexing, so if indexing failures
>> happen, MAKER will switch between the current working directory and the
>> TMP= directory from the maker_opts.ctl file so as to try different IO
>> locations (i.e. NFS and non-NFS).  Note you should never set TMP= in the
>> control files to an NFS-mounted location (it not only makes things a lot
>> slower, but Berkeley DB and SQLite will get frequent errors on NFS).
>> TMP= defaults to /tmp when not specified.
>>
>> I'll send you download information in a separate e-mail.  Try a regular
>> MAKER run to see if the indexing/reindexing changes are sufficient
>> before attempting the --no_locks option.
>>
>> Thanks,
>> Carson
>





