[maker-devel] Maker-Error when started with IMPI : CORRECTED MAIL : SEE THIS ONE
Carson Holt
carsonhh at gmail.com
Wed Mar 1 17:43:30 MST 2017
Try this command --> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec -n 2 echo Hello
Then this command --> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec -n 2 /opt/bwhpc/common/bio/maker/2.31.8_impi/bin/maker -h
If both of these fail, the Intel MPI you are using may have been built for a different architecture than the one you are launching it on. In that case Intel MPI needs to be reinstalled for that architecture.
If the first two fail, the following may or may not work:
First this command --> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec.hydra -n 2 echo Hello
Then this command --> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec.hydra -n 2 /opt/bwhpc/common/bio/maker/2.31.8_impi/bin/maker -h
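The four checks above can also be run in one pass from a small script. This is only a sketch: run_check is a hypothetical helper (not part of MAKER or Intel MPI), and the paths are copied from the commands above, so adjust them if your installation differs.

```shell
#!/bin/sh
# Sketch: run the launcher checks in order and report which ones fail.
# run_check is a hypothetical helper, not part of MAKER or Intel MPI.
# Paths are copied from the commands above; adjust them for your site.
MPI_BIN=/opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin
MAKER=/opt/bwhpc/common/bio/maker/2.31.8_impi/bin/maker

# run_check LABEL COMMAND [ARGS...]
# Runs COMMAND, lets its output through, then prints PASS or FAIL
# together with the exit status of COMMAND.
run_check() {
    label=$1
    shift
    if "$@"; then
        rc=0
        echo "PASS: $label"
    else
        rc=$?
        echo "FAIL: $label (exit status $rc)"
    fi
    return "$rc"
}

failed=0
run_check "mpiexec echo"        "$MPI_BIN/mpiexec"       -n 2 echo Hello  || failed=1
run_check "mpiexec maker"       "$MPI_BIN/mpiexec"       -n 2 "$MAKER" -h || failed=1
run_check "mpiexec.hydra echo"  "$MPI_BIN/mpiexec.hydra" -n 2 echo Hello  || failed=1
run_check "mpiexec.hydra maker" "$MPI_BIN/mpiexec.hydra" -n 2 "$MAKER" -h || failed=1
echo "any check failed: $failed"
```

The PASS/FAIL label only reflects the exit status of each command, so the printed output of each step is still worth reading and sending along.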
Also send me this file --> perl/lib/MAKER/ConfigData.pm
Thanks,
Carson
> On Mar 1, 2017, at 5:51 AM, Rainer Rutka <rainer.rutka at uni-konstanz.de> wrote:
>
>
> Sorry, sent wrong e-mail :-(
>
> IGNORE THE FIRST MAIL I SENT!
>
> On 01.03.2017 at 13:30, Rainer Rutka wrote:
> Hi Carson.
> Again THANK YOU for your efforts :-)
> On 24.02.2017 at 18:30, Carson Holt wrote:
>> Specific things.
>>
>> 1. Do not set LD_PRELOAD. That is only for OpenMPI, and it will cause
>> problems with other MPI implementations.
>
> OK, I removed this environment variable. It is no longer set.
>
>> 2. Make sure you recompiled MAKER for Intel MPI (MPI code always has
>> to be compiled for the flavor you are using, so make sure you have a
>> separate installation of MAKER for Intel MPI). Also validate that the
>> mpicc and libmpi.h listed during the MAKER install belong to Intel
>> MPI. Don’t just assume they do because you loaded the module. Manually
>> verify the paths during MAKER’s setup.
>
> I validated:
> UC:[kn at uc1n996 bwhpc-examples]$ module list
> Currently Loaded Modulefiles:
> 1) compiler/intel/16.0(default)
> 2) mpi/impi/5.1.3-intel-16.0(default)
> FOR MPICC:
> UC:[kn at uc1n996 bwhpc-examples]$ type mpicc
> mpicc is
> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpicc
> FOR LIBMPI:
> UC:[kn at uc1n996 bwhpc-examples]$ echo $MPIDIR
> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64
> UC:[kn at uc1n996 bwhpc-examples]$ find $MPIDIR -name '*'mpi.h -print
> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/include/mpi.h
> Here I can find an mpi.h but not a libmpi.h. I think this is OK,
> because the software compiled and linked without any errors or missing libs.
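The manual check above can be scripted so that it is repeatable inside a job. A minimal sketch, assuming MPIDIR points at the Intel MPI root as in the output above; check_under_prefix is a hypothetical helper name, not something shipped with MAKER or Intel MPI.

```shell
#!/bin/sh
# check_under_prefix TOOL PREFIX
# Resolve TOOL through PATH and report whether it lives under PREFIX.
# Hypothetical helper for illustration only.
check_under_prefix() {
    tool=$1
    prefix=$2
    resolved=$(command -v "$tool" 2>/dev/null) || resolved=
    case $resolved in
        "")
            echo "MISSING: $tool not found in PATH"
            return 2 ;;
        "$prefix"/*)
            echo "OK: $tool -> $resolved"
            return 0 ;;
        *)
            echo "MISMATCH: $tool -> $resolved (expected under $prefix)"
            return 1 ;;
    esac
}

# Only run the checks when MPIDIR is set (assumed to be set by the MPI module).
if [ -n "${MPIDIR:-}" ]; then
    for tool in mpicc mpiexec; do
        check_under_prefix "$tool" "$MPIDIR" || true
    done
    # The header used at build time should come from the same tree.
    ls -l "$MPIDIR/include/mpi.h" || echo "mpi.h not found under $MPIDIR/include"
fi
```

Running the same lines at the top of the job script shows whether the job sees the same mpicc and mpiexec as the interactive shell.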
>
>> 3. The error you got previously should not even be possible with the
>> current version of Intel MPI, which is why I say that when you called
>> mpiexec, something else (that was not Intel MPI) was launched.
>> The easy solution is to give the full path of mpiexec in your job, so
>> you are not relying on PATH being unaltered in your job.
>
> mpiexec is in the PATH and the right one is/was used, too:
> MPIEXEC:
> UC:[kn at uc1n996 bwhpc-examples]$ type mpiexec
> mpiexec is
> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec
>
>> Do not do --> mpiexec -nc 1 maker
>> Do this for example -->
>> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec
>> -nc maker
> OK, so I did:
> [...]
> #MSUB -l nodes=1:ppn=1
> #MSUB -l mem=20gb
> [...]
> echo " "
> echo "### Runing Maker example"
> echo " "
> export OMPI_MCA_mpi_warn_on_fork=0
> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec
> -nc maker
> [...]
>
>> 4. Build and run on the same node for your test. If you build on one
>> node and run on another, you may be changing your environment in ways
>> you don’t realize that break things. If building and testing on the
>> same node works but the test fails elsewhere, then you have to track
>> down how your environment is changing.
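One way to track down such a change is to record the environment where the build works and compare it with the environment inside the failing job. A sketch only: snapshot_env, compare_env and the file names are made up for illustration.

```shell
#!/bin/sh
# snapshot_env FILE: write the current environment, sorted, to FILE.
# (Variables with multi-line values will be split across lines.)
snapshot_env() {
    env | sort > "$1"
}

# compare_env GOOD BAD: show the lines that differ between two snapshots.
# Lines starting with "<" come from GOOD, lines starting with ">" from BAD.
compare_env() {
    if diff "$1" "$2"; then
        echo "environments are identical"
    fi
}

# In the session where MAKER was built and works:
#   snapshot_env "$HOME/env.build.txt"
# Inside the job script, just before the mpiexec line:
#   snapshot_env "$HOME/env.job.txt"
# Afterwards:
#   compare_env "$HOME/env.build.txt" "$HOME/env.job.txt"
```

Differences in PATH, LD_LIBRARY_PATH, LD_PRELOAD or any I_MPI_* variable are the first places to look.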
>
> OK I did. Same node: uc1n996
> UNFORTUNATELY I GOT THE SAME ERROR:
> [...]
> Currently Loaded Modulefiles:
> 1) compiler/intel/16.0(default)
> 2) mpi/impi/5.1.3-intel-16.0(default)
> 3) bio/maker/2.31.8_impi
>
>
> ### Display internal Maker/bwHPC environments...
>
> MAKER_BIN_DIR = /opt/bwhpc/common/bio/maker/2.31.8_impi/bin
> MAKER_EXA_DIR = /opt/bwhpc/common/bio/maker/2.31.8_impi/bwhpc-examples
>
>
> ### Runing Maker example
> OMPI_MCA_mpi_warn_on_fork=0
> I_MPI_CPUINFO=proc
> I_MPI_PMI_LIBRARY=/opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/lib/libmpi.so
> I_MPI_PIN_DOMAIN=node
> I_MPI_FABRICS=shm:tcp
> I_MPI_HYDRA_IFACE=ib0
> mpiexec_uc1n326.localdomain: cannot connect to local mpd (/scratch/mpd2.console_uc1n326.localdomain_kn_pop235844); possible causes:
> 1. no mpd is running on this host
> 2. an mpd is running but was started without a "console" (-n option)
> ### Cleaning up files ... removing unnecessary scratch files ...
> [...]
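The message above refers to mpd, so the launcher that ran apparently expected an MPD daemon on the node. A small sketch for spotting this message in a job log; check_mpd_error is a hypothetical helper, and the log file name is whatever your job writes.

```shell
#!/bin/sh
# check_mpd_error LOGFILE
# Report whether LOGFILE contains the "cannot connect to local mpd" message.
# Hypothetical helper for illustration only.
check_mpd_error() {
    if grep -q 'cannot connect to local mpd' "$1"; then
        echo "mpd error found in $1"
        return 0
    fi
    echo "no mpd error in $1"
    return 1
}

# Example (the log name is an assumption):
#   check_mpd_error maker_job.out && type mpiexec
```

If the message shows up, the mpiexec.hydra commands at the top of this mail are the next thing to try.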
>
>> —Carson
> tbc. ? :-)
> THANX
>
> --
> Rainer Rutka
> Universität Konstanz
> Kommunikations-, Informations-, Medienzentrum (KIM)
> * KIM Ausbildung
> * Wissenschaftliches Rechnen/bwHPC-C5
> * KIM Basisdienste, KIM Support
> Raum: V511
> 78457 Konstanz
> +49 7531 88-5413
>