[maker-devel] master_datastore_index.log file shrinks.

Tue Mar 19 09:13:51 MDT 2013

You really don't need to know anything about MPI. While MPI is itself
pretty complex, I seem to recall MAKER uses only the point-to-point subset,
mainly to send serialised Perl objects as C strings etc., for IPC across
ad hoc infrastructure - but none of that is relevant, as Carson has done all
the IPC debugging for you and its use should be transparent. If it's failing,
it's almost certainly because you've got discrepancies between the MPI
libraries visible at compile time vs. run time, and you may need to force
the dynamic linker to behave itself. The only other caveat on EBI
infrastructure I can think of off the top of my head relates to cross-node
MPI usage when going into the hundreds of processes, but I'm assuming you're
not doing that? You need to be more specific about how it's failing.
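
One way to check for the compile-time vs. run-time mismatch described above is to see which MPI shared libraries the dynamic linker actually resolves for a compiled binary. The following is only a rough sketch, assuming a Linux system with `ldd` available; the function names are made up for illustration, the "mpi" substring match is a crude heuristic, and the path of the binary or shared object to inspect depends entirely on your installation.

```python
import re
import subprocess


def mpi_libs_from_ldd(ldd_output):
    """Return (name, resolved_path) pairs for MPI-looking libraries.

    `ldd_output` is the text printed by `ldd`. A library counts as
    MPI-related if its name contains "mpi" (a heuristic only). The
    resolved path is the string "not found" when the linker could not
    locate the library.
    """
    libs = []
    for line in ldd_output.splitlines():
        match = re.match(r"\s*(\S+)\s+=>\s+(not found|\S+)", line)
        if match and "mpi" in match.group(1).lower():
            libs.append((match.group(1), match.group(2)))
    return libs


def mpi_libs_for(path):
    """Run `ldd` on `path` (a placeholder for your own binary) and filter it."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=True
    )
    return mpi_libs_from_ldd(result.stdout)
```

If the resolved paths point at a different MPI installation from the one used at build time, or show "not found", adjusting `LD_LIBRARY_PATH` so that the intended installation comes first is the usual first thing to try.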

dan

from me phone...
On Mar 19, 2013 11:55 AM, "Michael Nuhn" <mnuhn at ebi.ac.uk> wrote:

> Hello Carson!
>
> On 03/19/2013 02:27 PM, Carson Holt wrote:
>
>> Yes.  If at all possible use MPI.  It removes the overhead of locks
>> which happen per primary instance of MAKER.  So one maker job using 1000
>> cpus via MPI will have one shared set of locks.  1000 serial instances
>> of MAKER on the other hand would have 1000x the locks.
>>
>
> I don't know a thing about MPI.
>
> I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and Open
> MPI and none of them worked for me. I also tried the automatic installation
> that comes with maker, but it didn't work for me either.
>
> If need be, I could spend time getting to the bottom of this, but there is
> no telling how long this would take me so I'd rather not, if there is an
> alternative.
>
> Would the approach I outlined before work? (Treating the split files as
> separate genomes to annotate and then combining the GFFs afterwards)
>
> I also like this approach, because I would select a few contigs in the
> beginning which I would run on their own. They would complete early and
> this way I would get a preview of the results of the run instead of having
> to wait for everything to complete.
>
> It might also be more robust, because file locking issues would be
> confined to the instances working on a sequence chunk, but the rest of the
> instances could continue working.
>
> Cheers,
> Michael.
>
>> Alternatively, if you do need to continue without MPI for some reason, I
>> just finished a devel version of MAKER that has a --no_locks option.
>> You can never start two instances using the same input fasta when
>> --no_locks is specified, but the splitting to use different input fastas
>> I mentioned before in the example will still work fine.
>>
>> I have also updated the indexing/reindexing, so if indexing failures
>> happen, MAKER will switch between the current working directory and the
>> TMP= directory from the maker_opts.ctl file so as to try different IO
>> locations (i.e. NFS and non-NFS).  Note you should never set TMP= in the
>> control files to an NFS mounted location (it not only makes things a lot
>> slower, but Berkeley DB and SQLite will get frequent errors on NFS).
>> TMP= defaults to /tmp when not specified.
>>
>> I'll send you download information in a separate e-mail.  Try a regular
>> MAKER run to see if the indexing/reindexing changes are sufficient
>> before attempting the --no_locks option.
>>
>> Thanks,
>> Carson
>>
>
>
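
The splitting approach discussed in the quoted messages (one input fasta per MAKER instance, GFFs combined afterwards) could be prepared along these lines. This is a minimal sketch, not anything shipped with MAKER: the function name, the chunk file naming and the round-robin distribution are all assumptions made for illustration.

```python
import os


def split_fasta(fasta_path, out_dir, n_chunks):
    """Distribute the records of a multi-FASTA file over n_chunks files.

    Records are assigned round-robin in input order, so record 1 goes to
    chunk 0, record 2 to chunk 1, and so on. Returns the chunk file paths.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = [
        os.path.join(out_dir, f"chunk_{i:03d}.fasta") for i in range(n_chunks)
    ]
    handles = [open(p, "w") for p in paths]
    try:
        current = None
        index = -1
        with open(fasta_path) as src:
            for line in src:
                if line.startswith(">"):
                    # A header line starts a new record: pick the next chunk.
                    index += 1
                    current = handles[index % n_chunks]
                if current is not None:
                    current.write(line)
    finally:
        for handle in handles:
            handle.close()
    return paths
```

Each chunk would then be given to its own MAKER instance, which keeps to the rule that no two instances share the same input fasta when --no_locks is used. Putting a few contigs of interest into their own small chunk would give the early preview Michael mentions.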