[maker-devel] maker hung on a contig

Jeremy Semeiks jeremy.semeiks at utsw.edu
Thu Oct 4 13:39:25 MDT 2012


Hi,

I ran maker 2.26 beta on a mammalian assembly containing ~200,000
scaffolds. I ran like this:

$ /usr/bin/time mpiexec -n 30 maker -q < /dev/null > maker.oe 2>&1 & disown -h

After 1-2 weeks, all but one contig had finished successfully, although
maker needed a retry on ~80 others. The one contig that failed is small
(1,146 nt) and not obviously weird. maker hung on this one contig,
outputting the following type of stack-trace-looking thing to maker.oe,
over and over:

"""
    eval {...} called at /home/jrs/maker-2.26-beta/bin/../lib/Error.pm line
329
    Error::subs::run_clauses('HASH(0xdd9d480)',
'Error::Simple=HASH(0xa9a56f8)', undef, 'ARRAY(0x9ecc710)') called at
/home/jrs/maker-2.26-beta/bin/../lib/Error.pm line 426
    Error::subs::try('CODE(0xf1b6a40)', 'HASH(0xdd9d480)') called at
/home/jrs/maker-2.26-beta/bin/../lib/Process/MpiTiers.pm line 337
    Process::MpiTiers::_next_level('Process::MpiTiers=HASH(0xc10a7e0)')
called at /home/jrs/maker-2.26-beta/bin/../lib/Process/MpiTiers.pm line 179
    Process::MpiTiers::next_chunk('Process::MpiTiers=HASH(0xc10a7e0)')
called at /home/jrs/maker-2.26-beta/bin/../lib/Process/MpiTiers.pm line 815
    Process::MpiTiers::_handler('Process::MpiTiers=HASH(0xc10a7e0)',
'Error::Simple=HASH(0xab33f30)', 'Can not get next level') called at
/home/jrs/maker-2.26-beta/bin/../lib/Process/MpiTiers.pm line 336
    Process::MpiTiers::__ANON__('Error::Simple=HASH(0xab33f30)',
'SCALAR(0x17cd3a90)') called at
/home/jrs/maker-2.26-beta/bin/../lib/Error.pm line 339
    eval {...} called at /home/jrs/maker-2.26-beta/bin/../lib/Error.pm line
329
    Error::subs::run_clauses('HASH(0xb3a58e0)',
'Error::Simple=HASH(0xab33f30)', undef, 'ARRAY(0x77ffae0)') called at
/home/jrs/maker-2.26-beta/bin/../lib/Error.pm line 426
    Error::subs::try('CODE(0xbe47098)', 'HASH(0xb3a58e0)') called at
/home/jrs/maker-2.26-beta/bin/../lib/Process/MpiTiers.pm line 337
    Process::MpiTiers::_next_level('Process::MpiTiers=HASH(0xc10a7e0)')
called at /home/jrs/maker-2.26-beta/bin/../lib/Process/MpiTiers.pm line 179
    Process::MpiTiers::next_chunk('Process::MpiTiers=HASH(0xc10a7e0)')
called at /home/jrs/maker-2.26-beta/bin/../lib/Process/MpiTiers.pm line 815
    Process::MpiTiers::_handler('Process::MpiTiers=HASH(0xc10a7e0)',
'Error::Simple=HASH(0xc86fb88)', 'Can not get next level') called at
/home/jrs/maker-2.26-beta/bin/../lib/Process/MpiTiers.pm line 336
    Process::MpiTiers::__ANON__('Error::Simple=HASH(0xc86fb88)',
'SCALAR(0xb504490)') called at
/home/jrs/maker-2.26-beta/bin/../lib/Error.pm line 339
    eval {...} called at /home/jrs/maker-2.26-beta/bin/../lib/Error.pm line
329
    Error::subs::run_clauses('HASH(0xc2d7d00)',
'Error::Simple=HASH(0xc86fb88)', undef, 'ARRAY(0x767a290)') called at
/home/jrs/maker-2.26-beta/bin/../lib/Error.pm line 426
    Error::subs::try('CODE(0x5ab13f0)', 'HASH(0xc2d7d00)') called at
/home/jrs/maker-2.26-beta/bin/../lib/Process/MpiTiers.pm line 337
    Process::MpiTiers::_next_level('Process::MpiTiers=HASH(0xc10a7e0)')
called at /home/jrs/maker-2.26-beta/bin/../lib/Process/MpiTiers.pm line 179
    Process::MpiTiers::next_chunk('Process::MpiTiers=HASH(0xc10a7e0)')
called at /home/jrs/maker-2.26-beta/bin/../lib/Process/MpiTiers.pm line 815
    Process::MpiTiers::_handler('Process::MpiTiers=HASH(0xc10a7e0)',
'Error::Simple=HASH(0x18943010)', 'Can not get next level') called at
/home/jrs/maker-2.26-beta/bin/../lib/Process/MpiTiers.pm line 336
    Process::MpiTiers::__ANON__('Error::Simple=HASH(0x18943010)',
'SCALAR(0xa6e7cd0)') called at
/home/jrs/maker-2.26-beta/bin/../lib/Error.pm line 339
"""

Every 100,000 lines or so, that sort of pattern is interrupted by this sort
of pattern:

"""
    Process::MpiTiers::__ANON__('Error::Simple=HASH(0xd9816c8)',
'SCALAR(0x2558c48)') called at
/home/jrs/maker-2.26-beta/bin/../lib/Error.pm line 339
    eval {...} called at /home/jrs/maker-2.26-beta/bin/../lib/Error.pm line
329
    Error::subs::run_clauses('HASH(0x1bd77ab8)',
'Error::Simple=HASH(0xd9816c8)', undef, 'ARRAY(0x2561948)') called at
/home/jrs/maker-2.26-beta/bin/../lib/Error.pm line 426
    Error::subs::try('CODE(0xf48b0d8)', 'HASH(0x1bd77ab8)') called at
/home/jrs/maker-2.26-beta/bin/../lib/Process/MpiTiers.pm line 337
    Process::MpiTiers::_next_level('Process::MpiTiers=HASH(0xc10a7e0)')
called at /home/jrs/maker-2.26-beta/bin/../lib/Process/MpiTiers.pm line 179
    Process::MpiTiers::next_chunk('Process::MpiTiers=HASH(0xc10a7e0)')
called at /home/jrs/maker-2.26-beta/bin/../lib/Process/MpiTiers.pm line 285
    Process::MpiTiers::run_all('Process::MpiTiers=HASH(0xc10a7e0)', 0)
calledERROR: Failed while builing masking tiers
ERROR: Can not get next level
ERROR: Can't open seq file:
/export/home9/jrs/bmy/maker02/bmy.min1e3.maker.output/bmy.min1e3_datastore/E0/1F/C16738816//theVoid.C16738816/query.masked.gff.seq
No such file or directory

 at /home/jrs/maker-2.26-beta/bin/../lib/Dumper/GFF/GFFV3.pm line 173.
    Dumper::GFF::GFFV3::finalize('Dumper::GFF::GFFV3=HASH(0x4c328c8)')
called at /home/jrs/maker-2.26-beta/bin/../lib/Process/MpiChunk.pm line 700
    Process::MpiChunk::__ANON__() called at
/home/jrs/maker-2.26-beta/bin/../lib/Error.pm line 415
    eval {...} called at /home/jrs/maker-2.26-beta/bin/../lib/Error.pm line
407
    Error::subs::try('CODE(0xfc1d208)', 'HASH(0xfc1fbd8)') called at
/home/jrs/maker-2.26-beta/bin/../lib/Process/MpiChunk.pm line 3768
    Process::MpiChunk::_go('Process::MpiChunk=HASH(0x47162a0)', 'flow',
'HASH(0x93ba680)', 2, 0) called at
/home/jrs/maker-2.26-beta/bin/../lib/Process/MpiChunk.pm line 369
"""

The makers kept outputting these things until I killed them all, and also
mpiexec, with kill -s SIGKILL.

These stack traces go on for at least the last million lines of maker.oe,
which is ~100 GB. Because the log is so big, without knowing what to grep
for it's hard for me to suggest what originally went wrong.
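
One way to narrow this down without knowing the exact message might be to
stream the log once and keep only the first few lines that contain "ERROR:",
along with their line numbers. The sketch below is only an illustration:
first_error_lines is a hypothetical helper name, and it assumes that maker's
own messages contain the literal text "ERROR:" as in the excerpt above
(possibly mid-line, since the output is interleaved). It reads line by line,
so the ~100 GB file is never loaded into memory.

```python
def first_error_lines(path, limit=20, marker="ERROR:"):
    """Return up to `limit` (line_number, text) pairs for lines containing `marker`.

    Assumption: the interesting messages contain the literal text "ERROR:".
    The file is read as a stream and reading stops at the `limit`-th match.
    """
    hits = []
    # errors="replace" keeps the scan going if the log contains stray bytes.
    with open(path, "r", errors="replace") as log:
        for lineno, line in enumerate(log, start=1):
            if marker in line:
                hits.append((lineno, line.rstrip("\n")))
                if len(hits) >= limit:
                    break
    return hits
```

Calling first_error_lines("maker.oe") would give the earliest matching lines,
which are probably closer to the original failure than the repeated traces at
the end of the file.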

The contig's datastore directory does not contain anything obvious. Its
run.log indicates only that it was the second try for this contig. None of
the files in this directory (including theVoid) had been updated in the
last 3 days. So this contig itself may even be a red herring.
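
For reference, a check like the one described above (when were the files in
a directory tree last modified?) could be sketched as follows. newest_mtimes
is a hypothetical helper, not part of maker, and the directory passed to it
is whatever datastore path is being inspected.

```python
import os
import time

def newest_mtimes(directory, top=5):
    """Return the `top` most recently modified files under `directory`.

    Each entry is an (age_in_days, path) pair, newest first.
    """
    now = time.time()
    entries = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                mtime = os.stat(path).st_mtime
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
            entries.append(((now - mtime) / 86400.0, path))
    entries.sort()  # smallest age (most recent modification) first
    return entries[:top]
```

If the smallest age reported is several days while the processes are still
writing to the log, that would be consistent with the run no longer making
progress on that contig.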

maker's protein output for all the other contigs seems sane, and this one
contig is probably too small to contain any proteins, so it doesn't matter
to me if maker finishes it successfully. I just want to figure out how I
can keep maker from hanging.

Suggestions?

Thanks,
Jeremy