[maker-devel] SegFault with MPI
Carson Holt
carsonhh at gmail.com
Wed Jun 22 08:52:34 MDT 2016
You may have more than one MPI flavor installed. For example, the mpicc used to compile, or the mpi.h picked up during MAKER setup and compile, may come from MPICH2 while you're running with mpiexec from OpenMPI. Or they may come from a different OpenMPI version than the one you are running with. Also make sure LD_PRELOAD gets exported in your bash_profile, or each time before running MAKER (not just during install).

Finally, if all else fails, you can try installing Perl without pthread support and running MAKER with that Perl instead of /usr/bin/perl. This is very rarely needed, but when /lib/x86_64-linux-gnu/libpthread.so shows up in a seg fault it can be the source of all the grief. It usually indicates that something is broken with your package manager's install of MPI, Perl, or the GNU C library.
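
The checks above can be sketched roughly as the following shell snippet. It only reports what the current shell would pick up: which mpicc, mpiexec, mpirun and perl come first on PATH, whether LD_PRELOAD is actually exported, and whether that perl was built with thread support. The function names report_tool and report_env are made up for illustration, and the sketch assumes a POSIX shell with env and grep available.

```shell
#!/bin/sh
# Sketch only: report which MPI tools and which perl the current shell resolves.
# report_tool and report_env are hypothetical helper names, not part of MAKER.

report_tool() {
    # Print the resolved path of a command, or note that it is missing.
    tool_path=$(command -v "$1" 2>/dev/null || true)
    if [ -n "$tool_path" ]; then
        printf '%-8s %s\n' "$1" "$tool_path"
    else
        printf '%-8s not found on PATH\n' "$1"
    fi
}

report_env() {
    # The compiler wrapper and the launcher should come from the same MPI install.
    for tool in mpicc mpiexec mpirun perl; do
        report_tool "$tool"
    done

    # LD_PRELOAD has to be exported (not just set) to reach child processes.
    if env | grep -q '^LD_PRELOAD='; then
        printf 'LD_PRELOAD is exported: %s\n' "$LD_PRELOAD"
    else
        printf 'LD_PRELOAD is not exported in this shell\n'
    fi

    # usethreads='define' means this perl was built with thread support.
    if command -v perl >/dev/null 2>&1; then
        perl -V:usethreads
    fi
}

report_env
```

If mpicc and mpirun resolve to different install prefixes (for example one under /openmpi/bin and the other under /usr/bin), that is a hint that two MPI installs are being mixed. Run it from the same shell, or the same job script, that launches MAKER, since that is the environment that matters.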
—Carson
> On Jun 20, 2016, at 6:34 AM, Matthew MacManes <macmanes at gmail.com> wrote:
>
> Hello, I am having a strange SegFault that is related to MPI use. Invoking maker without MPI seems to be fine, though obviously slow. The segfault occurs seemingly non-deterministically after 5–10 contigs are processed. If I restart after a failure, another 5–10 contigs are processed before it fails again, and so on. It does not seem to be a specific contig that is causing the failure. I’m pretty confident that I’ve followed the directions, and I'm not quite sure how to troubleshoot from here, so any advice would be helpful.
>
> I’m using Ubuntu 14.04, Maker 3 (similar behavior with Maker 2.31.8), OpenMPI 1.10.3 (I’ve tried a few different versions), augustus–3.2, EVM 1.1.1 on a large system with lots of RAM. I set
>
> export LD_PRELOAD=/openmpi/lib/libmpi.so
> export OMPI_MCA_mpi_warn_on_fork=0
> and did this before installing Maker. I attempt to run maker like this, with both showing the same behavior.
>
> /openmpi/bin/mpirun -np 30 /share/maker3/maker/bin/maker -fix_nucleotides -base peer
>
> or
>
> /openmpi/bin/mpirun -mca btl ^openib -np 30 /share/maker3/maker/bin/maker -fix_nucleotides -base peer
> lots of output
> …
> choosing best annotation set
> Choosing best annotations
> deleted:30 hits
> doing blastn of ESTs
> formating database...
> #--------- command -------------#
> Widget::formater:
> /usr/bin/makeblastdb -dbtype nucl -in /tmp/maker_dFW8I4/28/blastprep/Pero%2EBLTK%2Efasta.mpi.10.2
> #-------------------------------#
> processing chunk output
> running blast search.
> #--------- command -------------#
> Widget::blastn:
> /usr/bin/blastn -db /tmp/maker_dFW8I4/Pero%2EBLTK%2Efasta.mpi.10.2 -query /tmp/maker_dFW8I4/28/scaffold_46.0 -num_alignments 10000 -num_descriptions 10000 -evalue 1e-10 -word_size 28 -reward 1 -penalty -5 -gapopen 5 -gapextend 5 -dbsize 1000 -searchsp 500000000 -num_threads 1 -lcase_masking -dust yes -soft_masking true -show_gis -out /mouse/pero_genome/maker/peer.maker.output/peer_datastore/B4/81/scaffold_46//theVoid.scaffold_46/0/scaffold_46.0.Pero%2EBLTK%2Efasta.blastn.temp_dir/Pero%2EBLTK%2Efasta.mpi.10.2.blastn
> #-------------------------------#
> processing contig output
> examining contents of the fasta file and run log
> [davinci:09219] *** Process received signal ***
> [davinci:09219] Signal: Segmentation fault (11)
> [davinci:09219] Signal code: Address not mapped (1)
> [davinci:09219] Failing at address: 0x50c
> [davinci:09219] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x36d40)[0x7f3ba992ed40]
> [davinci:09219] [ 1] /usr/lib/libperl.so.5.18(Perl_csighandler+0x22)[0x7f3ba9d5d982]
> [davinci:09219] [ 2] /lib/x86_64-linux-gnu/libc.so.6(+0x36d40)[0x7f3ba992ed40]
> [davinci:09219] [ 3]
>
>
> --Next Contig--
>
> /lib/x86_64-linux-gnu/libc.so.6(__poll+0x2d)[0x7f3ba99e512d]
> [davinci:09219] [ 4] /openmpi/lib/libopen-pal.so.13(+0x6b738)[0x7f3ba9403738]
> [davinci:09219] [ 5] /openmpi/lib/libopen-pal.so.13(opal_libevent2021_event_base_loop+0x1b2)[0x7f3ba93fa862]
> [davinci:09219] [ 6] /openmpi/lib/libopen-rte.so.12(+0x381fe)[0x7f3ba96b01fe]
> [davinci:09219] [ 7] /lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x7f3ba9180182]
> [davinci:09219] [ 8] /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f3ba99f247d]
> [davinci:09219] *** End of error message ***
> SIGTERM received
> SIGTERM received
> Perl exited with active threads:
> 1 running and unjoined
> 0 finished and unjoined
> 0 running and detached
> SIGTERM received
> SIGTERM received
> SIGTERM received
> Thanks for any advice,
>
> Matt
>
>
>
>
> Matthew MacManes, Ph.D.
> University of New Hampshire I Assistant Professor of Genome Enabled Biology
> Department of Molecular, Cellular, & Biomedical Sciences
> Durham, NH 03824
> Phone: 603-862-4052 | Twitter: @macmanes | Web: genomebio.org
> Office: 189 Rudman Hall | Laboratory: 145 Rudman Hall