[maker-devel] SegFault with MPI

Matthew MacManes macmanes at gmail.com
Mon Jun 20 06:34:49 MDT 2016


Hello, I am seeing a strange segfault that is tied to MPI use: running maker without MPI works fine, though it is obviously slow. The segfault occurs seemingly non-deterministically after 5–10 contigs have been processed. If I restart after a failure, another 5–10 contigs are processed before it crashes again, and so on. No specific contig seems to cause the failure. I’m fairly confident that I’ve followed the installation directions, and I’m not sure how to troubleshoot from here, so any advice would be helpful.
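
The restart-after-failure pattern above can be scripted as a stopgap while the underlying crash is being tracked down. This is only a minimal sketch: `retry_until_success` is a hypothetical helper name, the attempt limit of 50 is arbitrary, and it assumes that rerunning the same command in the same directory picks up where the previous run stopped, which is what the behavior described above suggests. It does not address the segfault itself.

```shell
# Rerun a command until it exits with status 0 or the attempt budget runs out.
# Usage: retry_until_success MAX_ATTEMPTS command [args...]
# (hypothetical helper, not part of Maker or Open MPI)
retry_until_success() {
    retry_max=$1
    shift
    retry_attempt=1
    while [ "$retry_attempt" -le "$retry_max" ]; do
        if "$@"; then
            echo "succeeded on attempt $retry_attempt" >&2
            return 0
        fi
        echo "attempt $retry_attempt of $retry_max failed" >&2
        retry_attempt=$((retry_attempt + 1))
    done
    return 1
}

# Example with the invocation from this message (left commented out):
# retry_until_success 50 /openmpi/bin/mpirun -np 30 /share/maker3/maker/bin/maker -fix_nucleotides -base peer
```

The function returns non-zero if every attempt fails, so a calling script can still tell a finished run from one that simply ran out of attempts.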

I’m using Ubuntu 14.04, Maker 3 (Maker 2.31.8 behaves similarly), OpenMPI 1.10.3 (I’ve tried a few different versions), Augustus 3.2, and EVM 1.1.1 on a large system with lots of RAM. Before installing Maker, I set

export LD_PRELOAD=/openmpi/lib/libmpi.so
export OMPI_MCA_mpi_warn_on_fork=0

I have tried running maker in the two ways below, and both show the same behavior.

/openmpi/bin/mpirun -np 30 /share/maker3/maker/bin/maker -fix_nucleotides -base peer

or  

/openmpi/bin/mpirun -mca btl ^openib -np 30 /share/maker3/maker/bin/maker -fix_nucleotides -base peer
lots of output
…
choosing best annotation set
Choosing best annotations
deleted:30 hits
doing blastn of ESTs
formating database...
#--------- command -------------#
Widget::formater:
/usr/bin/makeblastdb -dbtype nucl -in /tmp/maker_dFW8I4/28/blastprep/Pero%2EBLTK%2Efasta.mpi.10.2
#-------------------------------#
processing chunk output
running  blast search.
#--------- command -------------#
Widget::blastn:
/usr/bin/blastn -db /tmp/maker_dFW8I4/Pero%2EBLTK%2Efasta.mpi.10.2 -query /tmp/maker_dFW8I4/28/scaffold_46.0 -num_alignments 10000 -num_descriptions 10000 -evalue 1e-10 -word_size 28 -reward 1 -penalty -5 -gapopen 5 -gapextend 5 -dbsize 1000 -searchsp 500000000 -num_threads 1 -lcase_masking -dust yes -soft_masking true -show_gis -out /mouse/pero_genome/maker/peer.maker.output/peer_datastore/B4/81/scaffold_46//theVoid.scaffold_46/0/scaffold_46.0.Pero%2EBLTK%2Efasta.blastn.temp_dir/Pero%2EBLTK%2Efasta.mpi.10.2.blastn
#-------------------------------#
processing contig output
examining contents of the fasta file and run log
[davinci:09219] *** Process received signal ***
[davinci:09219] Signal: Segmentation fault (11)
[davinci:09219] Signal code: Address not mapped (1)
[davinci:09219] Failing at address: 0x50c
[davinci:09219] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x36d40)[0x7f3ba992ed40]
[davinci:09219] [ 1] /usr/lib/libperl.so.5.18(Perl_csighandler+0x22)[0x7f3ba9d5d982]
[davinci:09219] [ 2] /lib/x86_64-linux-gnu/libc.so.6(+0x36d40)[0x7f3ba992ed40]
[davinci:09219] [ 3]


--Next Contig--

/lib/x86_64-linux-gnu/libc.so.6(__poll+0x2d)[0x7f3ba99e512d]
[davinci:09219] [ 4] /openmpi/lib/libopen-pal.so.13(+0x6b738)[0x7f3ba9403738]
[davinci:09219] [ 5] /openmpi/lib/libopen-pal.so.13(opal_libevent2021_event_base_loop+0x1b2)[0x7f3ba93fa862]
[davinci:09219] [ 6] /openmpi/lib/libopen-rte.so.12(+0x381fe)[0x7f3ba96b01fe]
[davinci:09219] [ 7] /lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x7f3ba9180182]
[davinci:09219] [ 8] /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f3ba99f247d]
[davinci:09219] *** End of error message ***
SIGTERM received
SIGTERM received
Perl exited with active threads:
        1 running and unjoined
        0 finished and unjoined
        0 running and detached
SIGTERM received
SIGTERM received
SIGTERM received
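
Since the crash does not seem tied to a particular contig, one way to look for a pattern is to compare the reported failing address across several failed runs. A rough sketch, assuming the output of each run was saved to a file (the `run*.log` names are hypothetical) and that the signal report has the same `[host:pid] Failing at address: ...` form as in the trace above; `count_failing_addresses` is an illustrative name, not part of Maker or Open MPI:

```shell
# Count how often each "Failing at address" value appears across saved logs.
# Usage: count_failing_addresses LOGFILE [LOGFILE...]
count_failing_addresses() {
    # Strip the "[host:pid] Failing at address: " prefix, keep only the address,
    # then tally identical addresses.
    sed -n 's/^\[[^]]*:[0-9]*] Failing at address: //p' "$@" | sort | uniq -c
}

# Example (log file names are placeholders):
# count_failing_addresses run1.log run2.log run3.log
```

If the same address shows up in every failed run, that points toward one consistent faulting access rather than random memory corruption, which may help narrow down where to look.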
Thanks for any advice,

Matt



Matthew MacManes, Ph.D.
University of New Hampshire  I  Assistant Professor of Genome Enabled Biology
Department of Molecular, Cellular, & Biomedical Sciences
Durham, NH  03824
Phone: 603-862-4052  | Twitter: @macmanes | Web: genomebio.org
Office: 189 Rudman Hall | Laboratory: 145 Rudman Hall