[maker-devel] Maker-Error when started with OpenMPI

Rainer Rutka rainer.rutka at uni-konstanz.de
Thu Feb 16 03:44:39 MST 2017


Hi!

Unfortunately, all of the options failed on our cluster.

The most recent Maker test was run with
--mca btl vader,tcp,self --mca btl_tcp_if_include eth0
Error:
--> rank=2, hostname=uc1n518.localdomain
[uc1n518:67009] *** Process received signal ***
[uc1n518:67009] Signal: Segmentation fault (11)
[uc1n518:67009] Signal code: Address not mapped (1)
[uc1n518:67009] Failing at address: 0x4b0
--------------------------------------------------------------------------
mpiexec noticed that process rank 1 with PID 67009 on node uc1n518 
exited on signal 11 (Segmentation fault).
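
For reference, Open MPI also reads MCA parameters from environment variables named OMPI_MCA_<name>, which is how the job script quoted below already sets mpi_warn_on_fork. The following is only a minimal sketch of how the command-line options used in the test above could be rewritten as exports; the helper name mca_to_env is made up for illustration and is not part of Open MPI or Maker.

```shell
#!/bin/sh
# Sketch (hypothetical helper): print "export OMPI_MCA_<name>='<value>'"
# for every "--mca <name> <value>" pair given on the command line.
mca_to_env() {
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --mca|-mca)
                # Each --mca flag must be followed by a name and a value.
                [ "$#" -ge 3 ] || { echo "missing name/value after $1" >&2; return 1; }
                printf "export OMPI_MCA_%s='%s'\n" "$2" "$3"
                shift 3
                ;;
            *)
                echo "unexpected argument: $1" >&2
                return 1
                ;;
        esac
    done
}

# Example: the option set from the test above.
mca_to_env --mca btl vader,tcp,self --mca btl_tcp_if_include eth0
```

The printed lines could be pasted into the job script before the mpiexec call. Options given on the mpiexec command line take precedence over the environment, so both should not be mixed while testing.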


The same happens with
--mca btl ^openib
and also with
--mca btl vader,tcp,self --mca btl_tcp_if_include ib0
Error:
### Runing Maker example

STATUS: Parsing control files...
STATUS: Processing and indexing input FASTA files...
[uc1n514:59985] *** Process received signal ***
[uc1n514:59985] Signal: Segmentation fault (11)
[uc1n514:59985] Signal code: Address not mapped (1)
[uc1n514:59985] Failing at address: 0x4b0
--------------------------------------------------------------------------
mpiexec noticed that process rank 10 with PID 59985 on node uc1n514 
exited on signal 11 (Segmentation fault).
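
To make the comparison repeatable, the three option sets tried above could be run one after another from a single script. This is only a sketch under assumptions: the process count of 16 is taken from the original job script, while the DRY_RUN switch, the function names and the log file names (maker_btl_test_N.log) are made up for illustration. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: try each BTL option set from this thread in turn.
# DRY_RUN=1 (default) only prints the commands; DRY_RUN=0 executes them.
set -u

NP="${NP:-16}"            # number of MPI processes (16 as in the job script)
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch, not an Open MPI feature

# The three option sets suggested on the list, one per line.
btl_option_sets() {
    cat <<'EOF'
--mca btl ^openib
--mca btl vader,tcp,self --mca btl_tcp_if_include ib0
--mca btl vader,tcp,self --mca btl_tcp_if_include eth0
EOF
}

# Build the full mpiexec command line for one option set.
build_cmd() {
    printf 'mpiexec %s -n %s maker' "$1" "$NP"
}

run_all() {
    i=0
    btl_option_sets | while IFS= read -r opts; do
        i=$((i + 1))
        cmd="$(build_cmd "$opts")"
        if [ "$DRY_RUN" = "1" ]; then
            printf '[%s] %s\n' "$i" "$cmd"
        else
            printf '[%s] running: %s\n' "$i" "$cmd"
            # $cmd is split into words on purpose. stdin comes from
            # /dev/null so mpiexec cannot swallow the remaining option lines.
            $cmd < /dev/null > "maker_btl_test_$i.log" 2>&1
            printf '[%s] exit status: %s\n' "$i" "$?"
        fi
    done
}

run_all
```

Running it with DRY_RUN=0 inside the job would leave one log per option set, which makes it easier to see whether the segfault appears at the same point in each case.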


On 28.01.2017 at 21:53, Carson Holt wrote:
> Try adding one of the following to your mpiexec command:
>
> 1. --mca btl ^openib
> 2. --mca btl vader,tcp,self --mca btl_tcp_if_include ib0
> 3. --mca btl vader,tcp,self --mca btl_tcp_if_include eth0
>
> One or the other may fix your issue. The first causes OpenMPI to not use the InfiniBand communication option (InfiniBand libraries use registered memory in a way that causes system calls to generate segfaults). It will usually force communication to go over another adapter. The second tries to use the InfiniBand adapter, but uses TCP over InfiniBand (a way to indirectly bypass the problem-causing libraries). The third specifically forces the use of the Ethernet adapter instead of the InfiniBand adapter.
>
> --Carson
>
>
>> On Jan 27, 2017, at 3:31 AM, Rainer Rutka <rainer.rutka at uni-konstanz.de> wrote:
>>
>> Hi everybody.
>>
>> My name is Rainer. I am an administrator for the HPC systems at our
>> university in Konstanz, Baden-Wuerttemberg, Germany.
>> The project is called bwHPC-C5.
>>
>> See: https://www.bwhpc-c5.de/en/index.php
>>
>> I have been trying to get Maker running on our bwUniCluster for weeks.
>> Unfortunately, I get errors while running a Maker job in the MPI environment.
>>
>> BUILD STATUS
>>
>> ==============================================================================
>> STATUS MAKER v2.31.9
>> ==============================================================================
>> PERL Dependencies: VERIFIED
>> External Programs: VERIFIED
>> External C Libraries: VERIFIED
>> MPI SUPPORT: ENABLED
>> MWAS Web Interface: DISABLED
>> MAKER PACKAGE: CONFIGURATION OK
>>
>> MODULES / INCLUDES / COMPILERS
>>
>> # knbw03 20170117 r.rutka Initial revision knbw02 of module version 2.31.9
>> #
>> ##### (B) Dependencies:
>> #
>> # conflict: any other maker version
>> # module load compiler/gnu/5.2
>> # module load mpi/openmpi/2.0-gnu-5.2
>> [...]
>>
>> MPI/MOAB SUBMIT
>>
>> [...]
>> ### Queues ###
>> #MSUB -q fat
>> #MSUB -l nodes=1:ppn=16
>> #MSUB -l mem=20gb
>> #MSUB -l walltime=50:00:00
>> #
>> [...]
>> echo " "
>> echo "### Loading MAKER module:"
>> echo " "
>> module load bio/maker/2.31.9
>> [ "$MAKER_VERSION" ] || { echo "ERROR: Failed to load module 'bio/maker/2.31.9'."; exit 1; }
>> echo "MAKER_VERSION = $MAKER_VERSION"
>> module list
>> [...]
>> echo " "
>> echo "### Runing Maker example"
>> echo " "
>> export LD_PRELOAD=${MPI_LIB_DIR}/libmpi.so
>> export OMPI_MCA_mpi_warn_on_fork=0
>>
>> echo "LD_PRELOAD=${LD_PRELOAD}"
>> #
>> # "STATUS: Processing and indexing input FASTA files..."
>> #
>> mpiexec -mca btl ^openib -n 16 maker
>> [...]
>>
>>
>> E R R O R S
>> =======
>> [...]
>> LD_PRELOAD=/opt/bwhpc/common/mpi/openmpi/2.0.1-gnu-5.2/lib/libmpi.so
>> STATUS: Parsing control files...
>> STATUS: Processing and indexing input FASTA files...
>> [uc1n338:113607] *** Process received signal ***
>> [uc1n338:113607] Signal: Segmentation fault (11)
>> [uc1n338:113607] Signal code: Address not mapped (1)
>> [uc1n338:113607] Failing at address: 0x4b0
>> [uc1n338:113608] *** Process received signal ***
>> [uc1n338:113608] Signal: Segmentation fault (11)
>> [uc1n338:113608] Signal code: Address not mapped (1)
>> [uc1n338:113608] Failing at address: 0x4b0
>> [uc1n338:113621] *** Process received signal ***
>> [uc1n338:113621] Signal: Segmentation fault (11)
>> [uc1n338:113621] Signal code: Address not mapped (1)
>> [uc1n338:113621] Failing at address: 0x4b0
>> --------------------------------------------------------------------------
>> mpiexec noticed that process rank 2 with PID 113608 on node uc1n338 exited on signal 11 (Segmentation fault).
>> --------------------------------------------------------------------------
>> [...]
>>
>> WHAT'S WRONG HERE!?
>>
>> Thank you for your help!
>>
>> All the best,
>>
>> Rainer
>>
>> --
>> Rainer Rutka
>> University of Konstanz
>> Communication, Information, Media Centre (KIM)
>> * High-Performance-Computing (HPC)
>> * KIM-Support and -Base-Services
>> Room: V511
>> 78457 Konstanz, Germany
>> +49 7531 88-5413
>>
>

-- 
Rainer Rutka
University of Konstanz
Communication, Information, Media Centre (KIM)
Room: V511, Tel: 54 13
