[maker-devel] Incomplete/Missing lines in datastore index log under openMPI

Carson Holt carsonhh at gmail.com
Tue Apr 10 08:26:40 MDT 2012


Depending on whether you're using NFS and on other details of your
architecture, you can get race conditions with the datastore log file.  This
primarily happens when you have multiple instances of MAKER running at the
same time, or thousands of short contigs running in parallel so that many
finish at the same time.  In a future release, I plan on having the last
MAKER job to exit rebuild the log at the end of a run to ensure it is
complete.  For now though, just run 'maker -dsindex' at the end of a run when
it happens.  It will rebuild the log and only takes a few seconds.
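
If you want to check whether a given log needs rebuilding, something along
these lines might help. This is only a rough sketch and not part of MAKER: it
assumes each entry is three whitespace-separated fields (name, datastore path,
status) and that the last directory of the path equals the name, as in the
examples quoted below. The function name find_malformed_lines is made up for
illustration.

```python
def find_malformed_lines(lines):
    """Return (line_number, text) pairs for entries that do not look like
    'name  path/ending/in/name/  STATUS'.

    Assumption (from the log excerpts in this thread, not from the MAKER
    source): a well-formed entry has exactly three whitespace-separated
    fields, and the last directory of the path equals the name.
    """
    bad = []
    for number, raw in enumerate(lines, start=1):
        text = raw.rstrip("\n")
        if not text.strip():
            continue  # blank lines carry no entry, so they are skipped
        fields = text.split()
        if len(fields) != 3:
            # Truncated or fragmentary line, e.g. a stray "D"
            bad.append((number, text))
            continue
        name, path, _status = fields
        # Catches fused lines such as "FINISHEscaffold10170.1 ...",
        # where the name no longer matches the directory in the path.
        if path.rstrip("/").split("/")[-1] != name:
            bad.append((number, text))
    return bad
```

Pass it the lines of the index log (for example from open(path) on whatever
your log file is called); if it returns anything, that is a sign the log was
hit by the race and 'maker -dsindex' is worth running.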

Thanks,
Carson




From:  Evan Ernst <eernst at cshl.edu>
Date:  Sun, 8 Apr 2012 18:09:22 -0400
To:  <maker-devel at yandell-lab.org>
Subject:  [maker-devel] Incomplete/Missing lines in datastore index log
under openMPI

Hi Carson,

It looks like there may be a locking issue with the datastore index log in
MAKER 2.25/openmpi 1.4.5. I noticed this when running 8 MPI maker instances,
each with 32 nodes. Examples from the log:

scaffold1001.1  genome_datastore/93/A6/scaffold1001.1/  FINISHED
scaffold1002.1  genome_datastore/72/43/scaffold1002.1/  FINISHED

scaffold1003.1  genome_datastore/B8/05/scaffold1003.1/  FINISHED

...

scaffold10085.1 genome_datastore/1C/7E/scaffold10085.1/ FINISHED
scaffold8265.1  genome_datastore/01/E4/scaffold8265.1/  FINISHED
D
scaffold8295.1  genome_datastore/63/13/scaffold8295.1/  FINISHED

...

scaffold8351.1  genome_datastore/27/52/scaffold8351.1/  FINISHED
scaffold8343.1  genome_datastore/BF/31/scaffold8343.1/  FINISHED
scaffold10167.1 genome_datastore/0B/9A/scaffold10167.1/
FINISHEscaffold10170.1  genome_datastore/F4/FF/scaffold10170.1/ FINISHED
scaffold10209.1 genome_datastore/2D/AA/scaffold10209.1/
FINISHEscaffold10072.1  genome_datastore/E0/A5/scaffold10072.1/ FINISHED
scaffold10113.1 genome_datastore/00/23/scaffold10113.1/ FINISHED

I see this even when running a single MPI instance, 32 nodes, when no actual
processing is required apart from marking the scaffolds FINISHED. Comparing
the result to a single, non-MPI maker instance running on the same completed
hierarchy reveals that many entries aren't being written to the log at all
when running under MPI. The single process instance runs just fine,
generating a complete log that can be used for the downstream scripts.
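
For reference, the comparison between the two logs can be sketched roughly as
follows. This is an illustrative reconstruction rather than the exact
procedure used: it assumes entries are three whitespace-separated fields with
the sequence name first, and the helper names entry_names and missing_entries
are hypothetical.

```python
def entry_names(lines):
    """Collect the first field of every line that has exactly three
    whitespace-separated fields (assumed layout: name, path, status)."""
    names = set()
    for raw in lines:
        fields = raw.split()
        if len(fields) == 3:
            names.add(fields[0])
    return names


def missing_entries(reference_lines, candidate_lines):
    """Return, sorted, the names present in the reference log (e.g. from a
    single-process run) but absent from the candidate log (e.g. from MPI)."""
    return sorted(entry_names(reference_lines) - entry_names(candidate_lines))
```

Truncated lines are ignored and fused lines yield a name that matches nothing
in the reference log, so entries damaged in either way show up as missing.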

Between runs, I execute a

find genome.maker.output/ -name '.NFSLock*' -type f -print0 | xargs -0 rm &

to be sure lingering lock files from badly exiting processes weren't
interfering.

This looks like the sort of thing that may be difficult to track down, and
there's a clear workaround, but I'm happy to provide more information if
you'd like to debug it.

Thanks,
Evan