[maker-devel] Problem with OpenFabrics and infiniband

UMD Bioinformatics bioinformatics.umd at gmail.com
Thu Feb 27 09:46:44 MST 2014


Hello,

I’ve had my IT folks install maker on our cluster at UMD. I’m getting a segfault when running maker on InfiniBand nodes, but not on gigE nodes. According to the logs this appears to be an issue with fork() calls, but I’m not sure how to fix it. I would simply use the gigE nodes, but we are in the process of moving everything to InfiniBand, so I’ll need to address this issue at some point. I’ve attached the error log from the MPI run as well as commentary from my HPCC team.

IT suggestions

If you look at the top of the error log for the problematic job, it clearly
warns of an issue with doing 'fork's within openmpi/openfabrics framework.

In particular, the use of the fork system call is only partially supported
in the OpenFabrics software (i.e. the drivers, etc. for the infiniband
connections). See e.g.
http://www.open-mpi.org/faq/?category=openfabrics#ofa-fork
for more information, especially the paragraphs starting with the sentence
containing the red-highlighted "it does not mean that your fork()-calling
application is safe". (The kernel, openMPI version, and OFED version are
sufficiently recent to mean that there is _some_ fork support).

The fact that the job runs over gigE but not IB, in conjunction with the
warning from openmpi, strongly suggests that this is the issue that you are
encountering. I suspect that maker touches registered memory in the forked
child before it calls exec, which would result in a segfault (matching what
was observed).
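
One way to check this suspicion against the attached log is to count how many of the segfaulting processes have a backtrace that passes through Perl_my_popen, the interpreter routine behind piped open() and backticks, which forks a child. The following is only a rough sketch: the function name and the sample lines are illustrative, and the regular expressions assume the "[host:pid] [ N] frame" layout seen in this particular log.

```python
import re
from collections import defaultdict

# Matches Open MPI signal-handler backtrace lines such as
#   [compute-g20-8:09545] [ 6] perl(Perl_my_popen+0x403) [0x484d63]
FRAME_RE = re.compile(r"^\[(?P<host>[^:\]]+):(?P<pid>\d+)\] \[\s*\d+\] (?P<frame>.+)$")
# Matches the line announcing the signal for a given process.
SEGV_RE = re.compile(r"^\[(?P<host>[^:\]]+):(?P<pid>\d+)\] Signal: Segmentation fault")


def summarize_segfaults(log_text):
    """Return (processes that segfaulted, those with Perl_my_popen in a backtrace)."""
    crashed = set()
    frames = defaultdict(list)
    for line in log_text.splitlines():
        m = SEGV_RE.match(line)
        if m:
            crashed.add((m["host"], m["pid"]))
            continue
        m = FRAME_RE.match(line)
        if m:
            frames[(m["host"], m["pid"])].append(m["frame"])
    in_popen = {
        key for key, fs in frames.items()
        if any("Perl_my_popen" in f for f in fs)
    }
    return crashed, in_popen


# A few lines copied from the attached log, used here as sample input.
sample = """\
[compute-g20-8:09545] *** Process received signal ***
[compute-g20-8:09545] Signal: Segmentation fault (11)
[compute-g20-8:09545] [ 3] perl(Perl_safesysmalloc+0x12) [0x481602]
[compute-g20-8:09545] [ 6] perl(Perl_my_popen+0x403) [0x484d63]
[compute-g20-10:07107] Signal: Segmentation fault (11)
[compute-g20-10:07107] [ 2] perl(Perl_hv_common+0xe67) [0x499dd7]
"""

crashed, in_popen = summarize_segfaults(sample)
print(len(crashed), "segfaulting processes;", len(in_popen), "inside Perl_my_popen")
```

Many processes in the log crash without printing a backtrace at all, so a count like this can only support the fork hypothesis, not prove it.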

You can try adding the arguments
--mca mpi_warn_on_fork 0 
to the mpirun command, just in case the crash was somehow caused by openmpi's
warning, but I would not hold out much hope for that.

###UPDATE### This does not fix the problem.
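
For anyone reproducing this, the invocation described above can be written out as a small helper that assembles the mpirun argument list. This is a sketch under assumptions: the rank count of 20 and the bare "maker" program name are placeholders, and the helper name is made up for illustration. The second variant, which excludes Open MPI's openib transport with "--mca btl ^openib", is not something suggested in this thread; it is included only as a possible way to test whether the InfiniBand transport is involved while staying on the same nodes, and it has not been tried here.

```python
import shlex


def mpirun_command(num_procs, program, *, warn_on_fork=True, exclude_openib=False):
    """Build an mpirun argument list with optional MCA settings."""
    cmd = ["mpirun", "-np", str(num_procs)]
    if not warn_on_fork:
        # From the suggestion above: silence Open MPI's fork() warning.
        cmd += ["--mca", "mpi_warn_on_fork", "0"]
    if exclude_openib:
        # Assumption, not from this thread: skip the openib (InfiniBand)
        # transport so that Open MPI uses its remaining transports instead.
        cmd += ["--mca", "btl", "^openib"]
    cmd += list(program)
    return cmd


# Hypothetical job: 20 ranks running "maker".
quiet = mpirun_command(20, ["maker"], warn_on_fork=False)
no_ib = mpirun_command(20, ["maker"], exclude_openib=True)

# shlex.join quotes arguments such as ^openib so they are safe to paste into a shell.
print(shlex.join(quiet))
print(shlex.join(no_ib))
```

Building the command as a list keeps each argument separate, so it can be passed to subprocess.run without going through a shell.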


Basically, it looks like maker uses some system calls like fork in a manner
which is incompatible with the current OpenFabrics software, and thus will
not work with infiniband. This situation is likely to remain until either
maker changes to be compatible with OFED, or OFED's support for the fork
system call is broadened.


-------------- next part --------------
STATUS: Parsing control files...
--------------------------------------------------------------------------
An MPI process has executed an operation involving a call to the
"fork()" system call to create a child process.  Open MPI is currently
operating in a condition that could result in memory corruption or
other system errors; your MPI job may hang, crash, or produce silent
data corruption.  The use of fork() (or system() or other calls that
create child processes) is strongly discouraged.  

The process that invoked fork was:

  Local host:          compute-g20-7.deepthought.umd.edu (PID 28015)
  MPI_COMM_WORLD rank: 0

If you are *absolutely sure* that your application will successfully
and correctly survive a call to fork(), you may disable this warning
by setting the mpi_warn_on_fork MCA parameter to 0.
--------------------------------------------------------------------------
[compute-g20-8:09542] *** Process received signal ***
[compute-g20-8:09542] Signal: Segmentation fault (11)
[compute-g20-8:09542] Signal code: Address not mapped (1)
[compute-g20-8:09542] Failing at address: 0xee00350
[compute-g20-8:09543] *** Process received signal ***
[compute-g20-8:09543] Signal: Segmentation fault (11)
[compute-g20-8:09543] Signal code: Address not mapped (1)
[compute-g20-8:09543] Failing at address: 0xf020c90
[compute-g20-8:09544] *** Process received signal ***
[compute-g20-8:09544] Signal: Segmentation fault (11)
[compute-g20-8:09544] Signal code: Address not mapped (1)
[compute-g20-8:09544] Failing at address: 0x1ad68f10
[compute-g20-8:09545] *** Process received signal ***
[compute-g20-8:09545] Signal: Segmentation fault (11)
[compute-g20-8:09545] Signal code: Address not mapped (1)
[compute-g20-8:09545] Failing at address: 0x84a3188
[compute-g20-8:09545] [ 0] /lib64/libpthread.so.0 [0x2b98fac5eca0]
[compute-g20-8:09545] [ 1] /cell_root/software/openmpi/1.6/gnu/sys/lib/libmpi.so(opal_memory_ptmalloc2_int_malloc+0x530) [0x2b98f9ea4ec0]
[compute-g20-8:09545] [ 2] /cell_root/software/openmpi/1.6/gnu/sys/lib/libmpi.so(opal_memory_ptmalloc2_malloc+0x4a) [0x2b98f9ea60ca]
[compute-g20-8:09545] [ 3] perl(Perl_safesysmalloc+0x12) [0x481602]
[compute-g20-8:09545] [ 4] perl(Perl_savepvn+0x26) [0x4816b6]
[compute-g20-8:09545] [ 5] perl(Perl_do_exec3+0x31e) [0x4f715e]
[compute-g20-8:09545] [ 6] perl(Perl_my_popen+0x403) [0x484d63]
[compute-g20-8:09545] [ 7] perl(Perl_do_openn+0x1696) [0x4f9536]
[compute-g20-8:09545] [ 8] perl(Perl_pp_open+0x184) [0x4efc44]
[compute-g20-8:09545] [ 9] perl(Perl_runops_standard+0xe) [0x49d5ce]
[compute-g20-8:09545] [10] perl(perl_run+0x243) [0x4340f3]
[compute-g20-8:09545] [11] perl(main+0x135) [0x41b485]
[compute-g20-8:09545] [12] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2b98fae899c4]
[compute-g20-8:09545] [13] perl [0x41b299]
[compute-g20-8:09545] *** End of error message ***
[compute-g20-8:09546] *** Process received signal ***
[compute-g20-8:09546] Signal: Segmentation fault (11)
[compute-g20-8:09546] Signal code: Address not mapped (1)
[compute-g20-8:09546] Failing at address: 0x8240850
[compute-g20-8:09547] *** Process received signal ***
[compute-g20-8:09547] Signal: Segmentation fault (11)
[compute-g20-8:09547] Signal code: Address not mapped (1)
[compute-g20-8:09547] Failing at address: 0xd5c8850
[compute-g20-8:09548] *** Process received signal ***
[compute-g20-8:09548] Signal: Segmentation fault (11)
[compute-g20-8:09548] Signal code: Address not mapped (1)
[compute-g20-8:09548] Failing at address: 0x8c80850
[compute-g20-8:09549] *** Process received signal ***
[compute-g20-8:09549] Signal: Segmentation fault (11)
[compute-g20-8:09549] Signal code: Address not mapped (1)
[compute-g20-8:09549] Failing at address: 0x18d72850
[compute-g20-10:07087] *** Process received signal ***
[compute-g20-10:07087] Signal: Segmentation fault (11)
[compute-g20-10:07087] Signal code: Address not mapped (1)
[compute-g20-10:07087] Failing at address: 0x6659f10
[compute-g20-10:07088] *** Process received signal ***
[compute-g20-10:07088] Signal: Segmentation fault (11)
[compute-g20-10:07088] Signal code: Address not mapped (1)
[compute-g20-10:07088] Failing at address: 0x1fe3b5d0
[compute-g20-10:07089] *** Process received signal ***
[compute-g20-10:07089] Signal: Segmentation fault (11)
[compute-g20-10:07089] Signal code: Address not mapped (1)
[compute-g20-10:07089] Failing at address: 0x9870350
[compute-g20-10:07090] *** Process received signal ***
[compute-g20-10:07090] Signal: Segmentation fault (11)
[compute-g20-10:07090] Signal code: Address not mapped (1)
[compute-g20-10:07090] Failing at address: 0x17bad350
STATUS: Processing and indexing input FASTA files...
[compute-g20-8:09567] *** Process received signal ***
[compute-g20-8:09567] Signal: Segmentation fault (11)
[compute-g20-8:09567] Signal code: Address not mapped (1)
[compute-g20-8:09567] Failing at address: 0x1ad5aa10
[compute-g20-8:09567] [ 0] /lib64/libpthread.so.0 [0x2b6de3ce1ca0]
[compute-g20-8:09567] [ 1] /lib64/libc.so.6(strlen+0x30) [0x2b6de3f67f40]
[compute-g20-8:09567] [ 2] perl(Perl_do_exec3+0x3a) [0x4f6e7a]
[compute-g20-8:09567] [ 3] perl(Perl_my_popen+0x403) [0x484d63]
[compute-g20-8:09567] [ 4] perl(Perl_do_openn+0x1696) [0x4f9536]
[compute-g20-8:09567] [ 5] perl(Perl_pp_open+0x184) [0x4efc44]
[compute-g20-8:09567] [ 6] perl(Perl_runops_standard+0xe) [0x49d5ce]
[compute-g20-8:09567] [ 7] perl(perl_run+0x243) [0x4340f3]
[compute-g20-8:09567] [ 8] perl(main+0x135) [0x41b485]
[compute-g20-8:09567] [ 9] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2b6de3f0c9c4]
[compute-g20-8:09567] [10] perl [0x41b299]
[compute-g20-8:09567] *** End of error message ***
[compute-g20-7:28123] *** Process received signal ***
[compute-g20-7:28123] Signal: Segmentation fault (11)
[compute-g20-7:28123] Signal code: Address not mapped (1)
[compute-g20-7:28123] Failing at address: 0x19ad9f10
STATUS: Setting up database for any GFF3 input...
A data structure will be created for you at:
/export/lustre_1/imisner/Maker/UCSC1_CLC_de_novo.maker.output/UCSC1_CLC_de_novo_datastore

To access files for individual sequences use the datastore index:
/export/lustre_1/imisner/Maker/UCSC1_CLC_de_novo.maker.output/UCSC1_CLC_de_novo_master_datastore_index.log

STATUS: Now running MAKER...
examining contents of the fasta file and run log
[compute-g20-10:07107] *** Process received signal ***
[compute-g20-10:07107] Signal: Segmentation fault (11)
[compute-g20-10:07107] Signal code: Address not mapped (1)
[compute-g20-10:07107] Failing at address: 0x9870362
[compute-g20-10:07107] [ 0] /lib64/libpthread.so.0 [0x2b50c5c8cca0]
[compute-g20-10:07107] [ 1] perl [0x487218]
[compute-g20-10:07107] [ 2] perl(Perl_hv_common+0xe67) [0x499dd7]
[compute-g20-10:07107] [ 3] perl [0x49d9dc]
[compute-g20-10:07107] [ 4] perl(Perl_pp_method_named+0x6e) [0x49dd4e]
[compute-g20-10:07107] [ 5] perl(Perl_runops_standard+0xe) [0x49d5ce]
[compute-g20-10:07107] [ 6] perl(perl_run+0x243) [0x4340f3]
[compute-g20-10:07107] [ 7] perl(main+0x135) [0x41b485]
[compute-g20-10:07107] [ 8] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2b50c5eb79c4]
[compute-g20-10:07107] [ 9] perl [0x41b299]
[compute-g20-10:07107] *** End of error message ***
examining contents of the fasta file and run log
examining contents of the fasta file and run log
[compute-g20-10:07108] *** Process received signal ***
[compute-g20-10:07108] Signal: Segmentation fault (11)
[compute-g20-10:07108] Signal code: Address not mapped (1)
[compute-g20-10:07108] Failing at address: 0x1fe3b5c8
examining contents of the fasta file and run log
[compute-g20-10:07108] [ 0] /lib64/libpthread.so.0 [0x2b88f6f8dca0]
[compute-g20-10:07108] [ 1] /cell_root/software/openmpi/1.6/gnu/sys/lib/libmpi.so(opal_memory_ptmalloc2_free+0x22) [0x2b88f61d55b2]
[compute-g20-10:07108] [ 2] /lib64/libc.so.6(cfree+0xd1) [0x2b88f7210ad1]
[compute-g20-10:07108] [ 3] perl(Perl_sv_setsv_flags+0xb49) [0x4ad919]
[compute-g20-10:07108] [ 4] perl(Perl_pp_aassign+0x209) [0x4a3a19]
[compute-g20-10:07108] [ 5] perl(Perl_runops_standard+0xe) [0x49d5ce]
[compute-g20-10:07108] [ 6] perl(perl_run+0x243) [0x4340f3]
[compute-g20-10:07108] [ 7] perl(main+0x135) [0x41b485]
[compute-g20-10:07108] [ 8] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2b88f71b89c4]
[compute-g20-10:07108] [ 9] perl [0x41b299]
[compute-g20-10:07108] *** End of error message ***
examining contents of the fasta file and run log
[compute-g20-10:07109] *** Process received signal ***
[compute-g20-10:07109] Signal: Segmentation fault (11)
[compute-g20-10:07109] Signal code: Address not mapped (1)
[compute-g20-10:07109] Failing at address: 0x6664ad0
[compute-g20-10:07109] [ 0] /lib64/libpthread.so.0 [0x2b0809664ca0]
[compute-g20-10:07109] [ 1] /lib64/libc.so.6 [0x2b08098edada]
[compute-g20-10:07109] [ 2] /lib64/libc.so.6(memmove+0x75) [0x2b08098ec095]
[compute-g20-10:07109] [ 3] perl(Perl_sv_setpvn+0x7a) [0x4b775a]
[compute-g20-10:07109] [ 4] perl(Perl_pp_concat+0xc9) [0x4a5739]
[compute-g20-10:07109] [ 5] perl(Perl_runops_standard+0xe) [0x49d5ce]
[compute-g20-10:07109] [ 6] perl(Perl_call_sv+0x160) [0x4333a0]
[compute-g20-10:07109] [ 7] perl(Perl_magic_methcall+0x182) [0x488c22]
[compute-g20-10:07109] [ 8] perl(Perl_magic_setpack+0x52) [0x489292]
[compute-g20-10:07109] [ 9] perl(Perl_mg_set+0x66) [0x48aca6]
[compute-g20-10:07109] [10] perl(Perl_pp_sassign+0x19c) [0x4a5c8c]
[compute-g20-10:07109] [11] perl(Perl_runops_standard+0xe) [0x49d5ce]
[compute-g20-10:07109] [12] perl(perl_run+0x243) [0x4340f3]
[compute-g20-10:07109] [13] perl(main+0x135) [0x41b485]
[compute-g20-10:07109] [14] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2b080988f9c4]
[compute-g20-10:07109] [15] perl [0x41b299]
[compute-g20-10:07109] *** End of error message ***
examining contents of the fasta file and run log
examining contents of the fasta file and run log
examining contents of the fasta file and run log
examining contents of the fasta file and run log



--Next Contig--




--Next Contig--




--Next Contig--

examining contents of the fasta file and run log



--Next Contig--

Processing run.log file...
Processing run.log file...
examining contents of the fasta file and run log
Processing run.log file...
Processing run.log file...



--Next Contig--




--Next Contig--




--Next Contig--




--Next Contig--




--Next Contig--

Processing run.log file...
Processing run.log file...



--Next Contig--




--Next Contig--

Processing run.log file...
#---------------------------------------------------------------------
Now starting the contig!!
SeqID: Gc_UCSC1_contig_2
Length: 2857
#---------------------------------------------------------------------


Processing run.log file...
MAKER WARNING: The file UCSC1_CLC_de_novo.maker.output/UCSC1_CLC_de_novo_datastore/D5/5A/Gc_UCSC1_contig_17//theVoid.Gc_UCSC1_contig_17/0/Gc_UCSC1_contig_17.0.all.rb.out
did not finish on the last run and must be erased
Processing run.log file...
setting up GFF3 output and fasta chunks
Processing run.log file...
#---------------------------------------------------------------------
Now starting the contig!!
SeqID: Gc_UCSC1_contig_7
Length: 972
#---------------------------------------------------------------------


[compute-g20-8:09576] *** Process received signal ***
[compute-g20-8:09576] Signal: Segmentation fault (11)
[compute-g20-8:09576] Signal code: Address not mapped (1)
[compute-g20-8:09576] Failing at address: 0x1ad68f08
examining contents of the fasta file and run log
#---------------------------------------------------------------------
Now starting the contig!!
SeqID: Gc_UCSC1_contig_3
Length: 2316
#---------------------------------------------------------------------


[compute-g20-8:09576] [ 0] /lib64/libpthread.so.0 [0x2b6de3ce1ca0]
[compute-g20-8:09576] [ 1] /cell_root/software/openmpi/1.6/gnu/sys/lib/libmpi.so(opal_memory_ptmalloc2_free+0x22) [0x2b6de2f295b2]
[compute-g20-8:09576] [ 2] /lib64/libc.so.6(cfree+0xd1) [0x2b6de3f64ad1]
[compute-g20-8:09576] [ 3] perl(Perl_sv_setsv_flags+0xb49) [0x4ad919]
[compute-g20-8:09576] [ 4] perl(Perl_pp_aassign+0x209) [0x4a3a19]
[compute-g20-8:09576] [ 5] perl(Perl_runops_standard+0xe) [0x49d5ce]
[compute-g20-8:09576] [ 6] perl(perl_run+0x243) [0x4340f3]
[compute-g20-8:09576] [ 7] perl(main+0x135) [0x41b485]
[compute-g20-8:09576] [ 8] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2b6de3f0c9c4]
[compute-g20-8:09576] [ 9] perl [0x41b299]
[compute-g20-8:09576] *** End of error message ***
#---------------------------------------------------------------------
Now starting the contig!!
SeqID: Gc_UCSC1_contig_4
Length: 1230
#---------------------------------------------------------------------


examining contents of the fasta file and run log
examining contents of the fasta file and run log
examining contents of the fasta file and run log
examining contents of the fasta file and run log
examining contents of the fasta file and run log
[compute-g20-8:09578] *** Process received signal ***
[compute-g20-8:09578] Signal: Segmentation fault (11)
[compute-g20-8:09578] Signal code: Address not mapped (1)
[compute-g20-8:09578] Failing at address: 0xee0af18
[compute-g20-8:09578] [ 0] /lib64/libpthread.so.0 [0x2b03d0637ca0]
[compute-g20-8:09578] [ 1] perl(Perl_av_fetch+0x5b) [0x49cf8b]
[compute-g20-8:09578] [ 2] perl(Perl_pp_aelem+0x26e) [0x49e48e]
[compute-g20-8:09578] [ 3] perl(Perl_runops_standard+0xe) [0x49d5ce]
[compute-g20-8:09578] [ 4] perl(perl_run+0x243) [0x4340f3]
[compute-g20-8:09578] [ 5] perl(main+0x135) [0x41b485]
[compute-g20-8:09578] [ 6] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2b03d08629c4]
[compute-g20-8:09578] [ 7] perl [0x41b299]
[compute-g20-8:09578] *** End of error message ***
setting up GFF3 output and fasta chunks
Processing run.log file...
[compute-g20-8:09583] *** Process received signal ***
[compute-g20-8:09583] Signal: Segmentation fault (11)
[compute-g20-8:09583] Signal code: Address not mapped (1)
[compute-g20-8:09583] Failing at address: 0x822b0e2
[compute-g20-8:09582] *** Process received signal ***
[compute-g20-8:09582] Signal: Segmentation fault (11)
[compute-g20-8:09582] Signal code: Address not mapped (1)
[compute-g20-8:09582] Failing at address: 0x8c6b0e2
[compute-g20-8:09583] [ 0] /lib64/libpthread.so.0 [0x2ab7f114dca0]
[compute-g20-8:09583] [ 1] perl [0x487218]
[compute-g20-8:09583] [ 2] perl(Perl_hv_common+0xe67) [0x499dd7]
[compute-g20-8:09583] [ 3] perl [0x49d9dc]
[compute-g20-8:09583] [ 4] perl(Perl_pp_method_named+0x6e) [0x49dd4e]
[compute-g20-8:09583] [ 5] perl(Perl_runops_standard+0xe) [0x49d5ce]
[compute-g20-8:09583] [ 6] perl(perl_run+0x243) [0x4340f3]
[compute-g20-8:09583] [ 7] perl(main+0x135) [0x41b485]
[compute-g20-8:09583] [ 8] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2ab7f13789c4]
[compute-g20-8:09583] [ 9] perl [0x41b299]
[compute-g20-8:09583] *** End of error message ***
[compute-g20-8:09582] [ 0] /lib64/libpthread.so.0 [0x2b4eace23ca0]
[compute-g20-8:09582] [ 1] perl [0x487218]
[compute-g20-8:09582] [ 2] perl(Perl_hv_common+0xe67) [0x499dd7]
[compute-g20-8:09582] [ 3] perl [0x49d9dc]
[compute-g20-8:09582] [ 4] perl(Perl_pp_method_named+0x6e) [0x49dd4e]
[compute-g20-8:09582] [ 5] perl(Perl_runops_standard+0xe) [0x49d5ce]
[compute-g20-8:09582] [ 6] perl(perl_run+0x243) [0x4340f3]
[compute-g20-8:09582] [ 7] perl(main+0x135) [0x41b485]
[compute-g20-8:09582] [ 8] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2b4ead04e9c4]
[compute-g20-8:09582] [ 9] perl [0x41b299]
[compute-g20-8:09582] *** End of error message ***
examining contents of the fasta file and run log
[compute-g20-8:09581] *** Process received signal ***
[compute-g20-8:09581] Signal: Segmentation fault (11)
[compute-g20-8:09581] Signal code: Address not mapped (1)
[compute-g20-8:09581] Failing at address: 0x848da08
#---------------------------------------------------------------------
Now starting the contig!!
SeqID: Gc_UCSC1_contig_17
Length: 1413
#---------------------------------------------------------------------


#---------------------------------------------------------------------
Now starting the contig!!
SeqID: Gc_UCSC1_contig_13
Length: 2019
#---------------------------------------------------------------------


[compute-g20-8:09581] [ 0] /lib64/libpthread.so.0 [0x2b98fac5eca0]
[compute-g20-8:09581] [ 1] /cell_root/software/openmpi/1.6/gnu/sys/lib/libmpi.so(opal_memory_ptmalloc2_free+0x22) [0x2b98f9ea65b2]
[compute-g20-8:09581] [ 2] /lib64/libc.so.6(cfree+0xd1) [0x2b98faee1ad1]
[compute-g20-8:09581] [ 3] perl(Perl_sv_setsv_flags+0xb49) [0x4ad919]
[compute-g20-8:09581] [ 4] perl(Perl_pp_aassign+0x209) [0x4a3a19]
[compute-g20-8:09581] [ 5] perl(Perl_runops_standard+0xe) [0x49d5ce]
[compute-g20-8:09581] [ 6] perl(perl_run+0x243) [0x4340f3]
[compute-g20-8:09581] [ 7] perl(main+0x135) [0x41b485]
[compute-g20-8:09581] [ 8] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2b98fae899c4]
[compute-g20-8:09581] [ 9] perl [0x41b299]
[compute-g20-8:09577] *** Process received signal ***
[compute-g20-8:09581] *** End of error message ***
[compute-g20-8:09577] Signal: Segmentation fault (11)
[compute-g20-8:09577] Signal code: Address not mapped (1)
[compute-g20-8:09577] Failing at address: 0xd5b30e2
[compute-g20-8:09577] [ 0] /lib64/libpthread.so.0 [0x2b79d382aca0]
[compute-g20-8:09577] [ 1] perl [0x487218]
[compute-g20-8:09577] [ 2] perl(Perl_hv_common+0xe67) [0x499dd7]
[compute-g20-8:09577] [ 3] perl [0x49d9dc]
[compute-g20-8:09577] [ 4] perl(Perl_pp_method_named+0x6e) [0x49dd4e]
[compute-g20-8:09577] [ 5] perl(Perl_runops_standard+0xe) [0x49d5ce]
[compute-g20-8:09577] [ 6] perl(perl_run+0x243) [0x4340f3]
[compute-g20-8:09577] [ 7] perl(main+0x135) [0x41b485]
[compute-g20-8:09577] [ 8] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2b79d3a559c4]
[compute-g20-8:09577] [ 9] perl [0x41b299]
[compute-g20-8:09577] *** End of error message ***
#---------------------------------------------------------------------
Now starting the contig!!
SeqID: Gc_UCSC1_contig_1
Length: 1446
#---------------------------------------------------------------------


setting up GFF3 output and fasta chunks
[compute-g20-8:09579] *** Process received signal ***
[compute-g20-8:09579] Signal: Segmentation fault (11)
[compute-g20-8:09579] Signal code: Address not mapped (1)
[compute-g20-8:09579] Failing at address: 0x18d64350
examining contents of the fasta file and run log
[compute-g20-8:09579] [ 0] /lib64/libpthread.so.0 [0x2b31b670fca0]
[compute-g20-8:09579] [ 1] /usr/local/BerkeleyDB/lib/libdb-4.7.so(__ham_get_meta+0x4c) [0x2b31bbd1bccc]
[compute-g20-8:09579] [ 2] /usr/local/BerkeleyDB/lib/libdb-4.7.so [0x2b31bbd103fb]
[compute-g20-8:09579] [ 3] /usr/local/BerkeleyDB/lib/libdb-4.7.so(__dbc_get+0x1fa) [0x2b31bbd81f3a]
[compute-g20-8:09579] [ 4] /usr/local/BerkeleyDB/lib/libdb-4.7.so(__dbc_get_pp+0xb4) [0x2b31bbd8db04]
[compute-g20-8:09579] [ 5] /usr/local/BerkeleyDB/lib/libdb-4.7.so [0x2b31bbce4b85]
[compute-g20-8:09579] [ 6] /usr/local/perl/5.16.3-threaded/lib/site_perl/5.16.3/x86_64-linux-thread-multi/auto/DB_File/DB_File.so [0x2b31bbabafc9]
[compute-g20-8:09579] [ 7] perl(Perl_pp_entersub+0x58f) [0x49ee4f]
[compute-g20-8:09579] [ 8] perl(Perl_runops_standard+0xe) [0x49d5ce]
[compute-g20-8:09579] [ 9] perl(perl_run+0x243) [0x4340f3]
[compute-g20-8:09579] [10] perl(main+0x135) [0x41b485]
[compute-g20-8:09579] [11] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2b31b693a9c4]
[compute-g20-8:09579] [12] perl [0x41b299]
[compute-g20-8:09579] *** End of error message ***



--Next Contig--

setting up GFF3 output and fasta chunks
setting up GFF3 output and fasta chunks



--Next Contig--

setting up GFF3 output and fasta chunks



--Next Contig--




--Next Contig--

Processing run.log file...
MAKER WARNING: The file UCSC1_CLC_de_novo.maker.output/UCSC1_CLC_de_novo_datastore/3B/F3/Gc_UCSC1_contig_26//theVoid.Gc_UCSC1_contig_26/0/Gc_UCSC1_contig_26.0.all.rb.out
did not finish on the last run and must be erased



--Next Contig--




--Next Contig--




--Next Contig--

#---------------------------------------------------------------------
Now starting the contig!!
SeqID: Gc_UCSC1_contig_18
Length: 937
#---------------------------------------------------------------------


Processing run.log file...
Processing run.log file...
Processing run.log file...



--Next Contig--

FATAL: Thread terminated, causing all processes to fail
--> rank=17, hostname=compute-g20-10.deepthought.umd.edu
setting up GFF3 output and fasta chunks
Processing run.log file...
Processing run.log file...
#---------------------------------------------------------------------
Now starting the contig!!
SeqID: Gc_UCSC1_contig_14
Length: 6745
#---------------------------------------------------------------------


#---------------------------------------------------------------------
Now starting the contig!!
SeqID: Gc_UCSC1_contig_9
Length: 554
#---------------------------------------------------------------------


MAKER WARNING: The file UCSC1_CLC_de_novo.maker.output/UCSC1_CLC_de_novo_datastore/FB/E4/Gc_UCSC1_contig_22//theVoid.Gc_UCSC1_contig_22/0/Gc_UCSC1_contig_22.0.all.rb.out
did not finish on the last run and must be erased
setting up GFF3 output and fasta chunks
Processing run.log file...
setting up GFF3 output and fasta chunks
#---------------------------------------------------------------------
Now starting the contig!!
SeqID: Gc_UCSC1_contig_16
Length: 995
#---------------------------------------------------------------------


setting up GFF3 output and fasta chunks
#---------------------------------------------------------------------
Now starting the contig!!
SeqID: Gc_UCSC1_contig_26
Length: 1895
#---------------------------------------------------------------------


FATAL: Thread terminated, causing all processes to fail
--> rank=16, hostname=compute-g20-10.deepthought.umd.edu
#---------------------------------------------------------------------
Now starting the contig!!
SeqID: Gc_UCSC1_contig_23
Length: 618
#---------------------------------------------------------------------


#---------------------------------------------------------------------
Now starting the contig!!
SeqID: Gc_UCSC1_contig_31
Length: 506
#---------------------------------------------------------------------


setting up GFF3 output and fasta chunks
setting up GFF3 output and fasta chunks
#---------------------------------------------------------------------
Now starting the contig!!
SeqID: Gc_UCSC1_contig_28
Length: 5246
#---------------------------------------------------------------------


MAKER WARNING: The file UCSC1_CLC_de_novo.maker.output/UCSC1_CLC_de_novo_datastore/E5/53/Gc_UCSC1_contig_29//theVoid.Gc_UCSC1_contig_29/0/Gc_UCSC1_contig_29.0.all.rb.out
did not finish on the last run and must be erased
setting up GFF3 output and fasta chunks
setting up GFF3 output and fasta chunks
setting up GFF3 output and fasta chunks
#---------------------------------------------------------------------
Now starting the contig!!
SeqID: Gc_UCSC1_contig_19
Length: 880
#---------------------------------------------------------------------


#---------------------------------------------------------------------
Now starting the contig!!
SeqID: Gc_UCSC1_contig_22
Length: 831
#---------------------------------------------------------------------


#---------------------------------------------------------------------
Now starting the contig!!
SeqID: Gc_UCSC1_contig_21
Length: 12421
#---------------------------------------------------------------------


doing repeat masking
FATAL: Thread terminated, causing all processes to fail
--> rank=18, hostname=compute-g20-10.deepthought.umd.edu
#---------------------------------------------------------------------
Now starting the contig!!
SeqID: Gc_UCSC1_contig_29
Length: 1161
#---------------------------------------------------------------------


doing repeat masking
DBD::SQLite::db do failed: disk I/O error at /export/rel50_shadow/glue.umd.edu/software/maker/2.28/.amd64_rel50/bin/../lib/GFFDB.pm line 105.
DBD::SQLite::db do failed: disk I/O error at /export/rel50_shadow/glue.umd.edu/software/maker/2.28/.amd64_rel50/bin/../lib/GFFDB.pm line 106.
DBD::SQLite::db selectcol_arrayref failed: disk I/O error at /export/rel50_shadow/glue.umd.edu/software/maker/2.28/.amd64_rel50/bin/../lib/GFFDB.pm line 108.
DBD::SQLite::db do failed: disk I/O error at /export/rel50_shadow/glue.umd.edu/software/maker/2.28/.amd64_rel50/bin/../lib/GFFDB.pm line 110.
[compute-g20-7.deepthought.umd.edu:28014] 19 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork
[compute-g20-7.deepthought.umd.edu:28014] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
doing repeat masking
running  repeat masker.
#--------- command -------------#
Widget::RepeatMasker:
cd /tmp/maker_amJ13c; /a/g20-fs1/software/dt-sw0/RepeatMasker/4.0.3/RepeatMasker /export/lustre_1/imisner/Maker/UCSC1_CLC_de_novo.maker.output/UCSC1_CLC_de_novo_datastore/0F/67/Gc_UCSC1_contig_9//theVoid.Gc_UCSC1_contig_9/0/Gc_UCSC1_contig_9.0.all.rb -species all -dir /export/lustre_1/imisner/Maker/UCSC1_CLC_de_novo.maker.output/UCSC1_CLC_de_novo_datastore/0F/67/Gc_UCSC1_contig_9//theVoid.Gc_UCSC1_contig_9/0 -pa 1
#-------------------------------#
SIGTERM received
doing repeat masking
running  repeat masker.
#--------- command -------------#
Widget::RepeatMasker:
cd /tmp/maker_amJ13c; /a/g20-fs1/software/dt-sw0/RepeatMasker/4.0.3/RepeatMasker /export/lustre_1/imisner/Maker/UCSC1_CLC_de_novo.maker.output/UCSC1_CLC_de_novo_datastore/D5/5A/Gc_UCSC1_contig_17//theVoid.Gc_UCSC1_contig_17/0/Gc_UCSC1_contig_17.0.all.rb -species all -dir /export/lustre_1/imisner/Maker/UCSC1_CLC_de_novo.maker.output/UCSC1_CLC_de_novo_datastore/D5/5A/Gc_UCSC1_contig_17//theVoid.Gc_UCSC1_contig_17/0 -pa 1
#-------------------------------#
SIGTERM received
SIGTERM received
[compute-g20-7:28161] *** Process received signal ***
[compute-g20-7:28161] Signal: Segmentation fault (11)
[compute-g20-7:28161] Signal code: Address not mapped (1)
[compute-g20-7:28161] Failing at address: 0x19a33ad0
[compute-g20-7:28161] [ 0] /lib64/libpthread.so.0 [0x2b9e1cd6bca0]
[compute-g20-7:28161] [ 1] /cell_root/software/openmpi/1.6/gnu/sys/lib/libmpi.so(opal_memory_ptmalloc2_int_malloc+0xb0) [0x2b9e1bfb1a40]
[compute-g20-7:28161] [ 2] /cell_root/software/openmpi/1.6/gnu/sys/lib/libmpi.so(opal_memory_ptmalloc2_malloc+0x4a) [0x2b9e1bfb30ca]
[compute-g20-7:28161] [ 3] perl(Perl_safesysmalloc+0x12) [0x481602]
[compute-g20-7:28161] [ 4] perl(Perl_do_exec3+0x46) [0x4f6e86]
[compute-g20-7:28161] [ 5] perl(Perl_my_popen+0x403) [0x484d63]
[compute-g20-7:28161] [ 6] perl(Perl_pp_backtick+0xc2) [0x4f0752]
[compute-g20-7:28161] [ 7] perl(Perl_runops_standard+0xe) [0x49d5ce]
[compute-g20-7:28161] [ 8] perl(Perl_call_sv+0x4d1) [0x433711]
[compute-g20-7:28161] [ 9] perl(Perl_sighandler+0x208) [0x4876c8]
[compute-g20-7:28161] [10] /lib64/libpthread.so.0 [0x2b9e1cd6bca0]
[compute-g20-7:28161] [11] /usr/local/ofed/1.5.4/lib64/libmthca-rdmav2.so [0x2b9e29187bbc]
[compute-g20-7:28161] [12] /cell_root/software/openmpi/1.6/gnu/sys/lib/openmpi/mca_btl_openib.so [0x2b9e2686a8dd]
[compute-g20-7:28161] [13] /cell_root/software/openmpi/1.6/gnu/sys/lib/libmpi.so(opal_progress+0x5b) [0x2b9e1bfc93cb]
[compute-g20-7:28161] [14] /cell_root/software/openmpi/1.6/gnu/sys/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_recv+0x205) [0x2b9e25e22005]
[compute-g20-7:28161] [15] /cell_root/software/openmpi/1.6/gnu/sys/lib/libmpi.so(PMPI_Recv+0x14f) [0x2b9e1bf2927f]
[compute-g20-7:28161] [16] /export/rel50_shadow/glue.umd.edu/software/maker/2.28/.amd64_rel50/perl/lib/auto/Parallel/Application/MPI/MPI.so(_MPI_Recv+0x59) [0x2b9e23ba8d69]
[compute-g20-7:28161] [17] /export/rel50_shadow/glue.umd.edu/software/maker/2.28/.amd64_rel50/perl/lib/auto/Parallel/Application/MPI/MPI.so [0x2b9e23ba8f58]
[compute-g20-7:28161] [18] perl(Perl_pp_entersub+0x58f) [0x49ee4f]
[compute-g20-7:28161] [19] perl(Perl_runops_standard+0xe) [0x49d5ce]
[compute-g20-7:28161] [20] perl(perl_run+0x243) [0x4340f3]
[compute-g20-7:28161] [21] perl(main+0x135) [0x41b485]
[compute-g20-7:28161] [22] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2b9e1cf969c4]
[compute-g20-7:28161] [23] perl [0x41b299]
[compute-g20-7:28161] *** End of error message ***
running  repeat masker.
#--------- command -------------#
Widget::RepeatMasker:
cd /tmp/maker_amJ13c; /a/g20-fs1/software/dt-sw0/RepeatMasker/4.0.3/RepeatMasker /export/lustre_1/imisner/Maker/UCSC1_CLC_de_novo.maker.output/UCSC1_CLC_de_novo_datastore/DC/D5/Gc_UCSC1_contig_18//theVoid.Gc_UCSC1_contig_18/0/Gc_UCSC1_contig_18.0.all.rb -species all -dir /export/lustre_1/imisner/Maker/UCSC1_CLC_de_novo.maker.output/UCSC1_CLC_de_novo_datastore/DC/D5/Gc_UCSC1_contig_18//theVoid.Gc_UCSC1_contig_18/0 -pa 1
#-------------------------------#
SIGTERM received
doing repeat masking
running  repeat masker.
#--------- command -------------#
Widget::RepeatMasker:
cd /tmp/maker_amJ13c; /a/g20-fs1/software/dt-sw0/RepeatMasker/4.0.3/RepeatMasker /export/lustre_1/imisner/Maker/UCSC1_CLC_de_novo.maker.output/UCSC1_CLC_de_novo_datastore/BE/77/Gc_UCSC1_contig_16//theVoid.Gc_UCSC1_contig_16/0/Gc_UCSC1_contig_16.0.all.rb -species all -dir /export/lustre_1/imisner/Maker/UCSC1_CLC_de_novo.maker.output/UCSC1_CLC_de_novo_datastore/BE/77/Gc_UCSC1_contig_16//theVoid.Gc_UCSC1_contig_16/0 -pa 1
#-------------------------------#
SIGTERM received
doing repeat masking
running  repeat masker.
#--------- command -------------#
Widget::RepeatMasker:
cd /tmp/maker_amJ13c; /a/g20-fs1/software/dt-sw0/RepeatMasker/4.0.3/RepeatMasker /export/lustre_1/imisner/Maker/UCSC1_CLC_de_novo.maker.output/UCSC1_CLC_de_novo_datastore/1C/8A/Gc_UCSC1_contig_14//theVoid.Gc_UCSC1_contig_14/0/Gc_UCSC1_contig_14.0.all.rb -species all -dir /export/lustre_1/imisner/Maker/UCSC1_CLC_de_novo.maker.output/UCSC1_CLC_de_novo_datastore/1C/8A/Gc_UCSC1_contig_14//theVoid.Gc_UCSC1_contig_14/0 -pa 1
#-------------------------------#
SIGTERM received
doing repeat masking
running  repeat masker.
#--------- command -------------#
Widget::RepeatMasker:
cd /tmp/maker_amJ13c; /a/g20-fs1/software/dt-sw0/RepeatMasker/4.0.3/RepeatMasker /export/lustre_1/imisner/Maker/UCSC1_CLC_de_novo.maker.output/UCSC1_CLC_de_novo_datastore/CB/E5/Gc_UCSC1_contig_13//theVoid.Gc_UCSC1_contig_13/0/Gc_UCSC1_contig_13.0.all.rb -species all -dir /export/lustre_1/imisner/Maker/UCSC1_CLC_de_novo.maker.output/UCSC1_CLC_de_novo_datastore/CB/E5/Gc_UCSC1_contig_13//theVoid.Gc_UCSC1_contig_13/0 -pa 1
#-------------------------------#
SIGTERM received
Perl exited with active threads:
	1 running and unjoined
	0 finished and unjoined
	0 running and detached
doing repeat masking
running  repeat masker.
#--------- command -------------#
Widget::RepeatMasker:
cd /tmp/maker_amJ13c; /a/g20-fs1/software/dt-sw0/RepeatMasker/4.0.3/RepeatMasker /export/lustre_1/imisner/Maker/UCSC1_CLC_de_novo.maker.output/UCSC1_CLC_de_novo_datastore/AA/A6/Gc_UCSC1_contig_1//theVoid.Gc_UCSC1_contig_1/0/Gc_UCSC1_contig_1.0.all.rb -species all -dir /export/lustre_1/imisner/Maker/UCSC1_CLC_de_novo.maker.output/UCSC1_CLC_de_novo_datastore/AA/A6/Gc_UCSC1_contig_1//theVoid.Gc_UCSC1_contig_1/0 -pa 1
#-------------------------------#
SIGTERM received
--------------------------------------------------------------------------
mpirun has exited due to process rank 17 with PID 7052 on
node compute-g20-10.deepthought.umd.edu exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
SIGTERM received
SIGTERM received
Perl exited with active threads:
	0 running and unjoined
	1 finished and unjoined
	0 running and detached
FATAL: Thread terminated, causing all processes to fail
--> rank=14, hostname=compute-g20-8.deepthought.umd.edu
Perl exited with active threads:
	0 running and unjoined
	1 finished and unjoined
	0 running and detached
FATAL: Thread terminated, causing all processes to fail
--> rank=12, hostname=compute-g20-8.deepthought.umd.edu
[compute-g20-8:09470] *** Process received signal ***
[compute-g20-8:09470] Signal: Segmentation fault (11)
[compute-g20-8:09470] Signal code: Address not mapped (1)
[compute-g20-8:09470] Failing at address: 0x4b0
[compute-g20-8:09470] [ 0] /lib64/libpthread.so.0 [0x2b03d0637ca0]
[compute-g20-8:09470] [ 1] perl(Perl_csighandler+0x23) [0x488103]
[compute-g20-8:09470] [ 2] /lib64/libpthread.so.0 [0x2b03d0637ca0]
[compute-g20-8:09470] [ 3] /lib64/libc.so.6(__select+0x62) [0x2b03d0913402]
[compute-g20-8:09470] [ 4] /cell_root/software/openmpi/1.6/gnu/sys/lib/openmpi/mca_btl_openib.so [0x2b03da142ff3]
[compute-g20-8:09470] [ 5] /lib64/libpthread.so.0 [0x2b03d062f83d]
[compute-g20-8:09470] [ 6] /lib64/libc.so.6(clone+0x6d) [0x2b03d091a26d]
[compute-g20-8:09470] *** End of error message ***
Perl exited with active threads:
	0 running and unjoined
	1 finished and unjoined
	0 running and detached
FATAL: Thread terminated, causing all processes to fail
--> rank=11, hostname=compute-g20-8.deepthought.umd.edu
setting up GFF3 output and fasta chunks
FATAL: Thread terminated, causing all processes to fail
--> rank=10, hostname=compute-g20-8.deepthought.umd.edu
setting up GFF3 output and fasta chunks
FATAL: Thread terminated, causing all processes to fail
--> rank=13, hostname=compute-g20-8.deepthought.umd.edu
FATAL: Thread terminated, causing all processes to fail
--> rank=15, hostname=compute-g20-8.deepthought.umd.edu