[maker-devel] Maker crash on increasingly small contigs

Daniel Ence dence at genetics.utah.edu
Wed Jan 28 09:22:09 MST 2015


Hi Marc, so a few things on the maker side to check out. 

Did you have min_contig set to 1000, to set the lower limit on contig size?
Did maker do anything with the 1kb contigs? Or did it just skip them? 
You can check that in the master_datastore_index.log or in the void (theVoid.*) directories for the small contigs. 
That will tell us whether maker is functioning correctly, even though it’s giving those messages. 
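
A minimal sketch of that check in Python, in case it is useful. It assumes the index log is tab-separated with three columns (contig ID, datastore path, status) and that a contig can appear on several lines, with the last line giving its final state. The function names are made up for illustration, and the status values it reports are whatever your log actually contains.

```python
import sys
from collections import Counter


def last_status_per_contig(lines):
    """Map each contig ID to the last status recorded for it.

    Assumes tab-separated lines of the form: contig_id, path, status.
    """
    status = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        contig_id, state = fields[0], fields[2]
        status[contig_id] = state  # later lines override earlier ones
    return status


def summarize(status):
    """Count how many contigs ended in each status."""
    return Counter(status.values())


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_index.py path/to/master_datastore_index.log
    with open(sys.argv[1]) as handle:
        final = last_status_per_contig(handle)
    for state, count in summarize(final).most_common():
        print(f"{state}\t{count}")
```

If the small contigs all end in a finished or skipped state, maker is most likely doing its job in spite of the messages.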

With the newer versions of maker, I get messages identical to what you sent as part of the normal thread termination, even when maker is functioning normally. 

Thanks,
Daniel



> On Jan 28, 2015, at 12:01 AM, Marc Höppner <marc.hoeppner at imbim.uu.se> wrote:
> 
> Hi,
> 
> this is probably a long shot, but I was hoping that someone on the list may have some advice as to how to debug an error that has been popping up when running Maker on our 10 node cluster. So, what is the issue?
> 
> Maker runs fine on several assemblies that we have processed in the past, but I recently started on a fairly fragmented (low N50) mammalian assembly, and the collaborator was keen to have all contigs annotated, down to 1kb (I guess it is more about the repeats and blast matches in those small bits). As the contigs get smaller, Maker starts crashing in MPI mode with the following error (no other message given prior to that):
> 
> perl:13424 terminated with signal 11 at PC=3d47095012 SP=7f8ac076e530.  Backtrace:
> /usr/lib64/perl5/CORE/libperl.so(Perl_csighandler+0x22)[0x3d47095012]
> /lib64/libpthread.so.0[0x358ae0f710]
> /usr/lib64/perl5/CORE/libperl.so(Perl_csighandler+0x0)[0x3d47094ff0]
> /lib64/libpthread.so.0[0x358ae0f710]
> /lib64/libc.so.6(__poll+0x53)[0x358aadf343]
> /sw/openmpi/1.8.3/lib/libopen-pal.so.6(+0x6af4a)[0x7f8ac0a29f4a]
> /sw/openmpi/1.8.3/lib/libopen-pal.so.6(opal_libevent2021_event_base_loop+0x221)[0x7f8ac0a21961]
> /sw/openmpi/1.8.3/lib/libopen-rte.so.7(+0x52f8e)[0x7f8ac0ce5f8e]
> /lib64/libpthread.so.0[0x358ae079d1]
> /lib64/libc.so.6(clone+0x6d)[0x358aae8b6d]
> SIGTERM received
> 
> A few words about the setup:
> 
> We have 10 nodes, 160 cores and the shared file system is exported via Infiniband from a ‘standard’ NFS server. As OS we run Scientific Linux 6.5. Tests so far don’t point to congestion issues or anything like that; the bandwidth usage is actually fairly low.
> 
> So far I tried:
> 
> - running the MPI processes through both the ethernet network and IPoIB, same problem
> - installing a more recent version of perl through perlbrew, with all the required modules, and re-compiling Maker
> - running some (albeit simple) network checks for retransmissions, lost packets etc - nothing popped up
> - running Maker on a subset of nodes to eliminate the possibility of a bad node
> 
> The error message is a bit cryptic to me and it would be very helpful to know if Maker has a problem with accessing a file, or whether OpenMPI has a communication problem etc - but I am not able to tell from the information I have been able to extract so far. Any ideas?
> 
> Cheers,
> 
> Marc
> 
> 
> Marc P. Hoeppner, PhD
> Team Leader
> BILS Genome Annotation Platform
> Department for Medical Biochemistry and Microbiology
> Uppsala University, Sweden
> marc.hoeppner at imbim.uu.se
> 
