[maker-devel] Help with MPI installation and configuration
Carson Holt
carsonhh at gmail.com
Tue Apr 18 09:23:04 MDT 2017
MPICH3 works without any special configuration. OpenMPI is built to support InfiniBand, which internally can cause other problems related to how registered memory is used when the InfiniBand libraries are loaded, so you have to set more parameters.
Both will use shared memory for communication when running on a single machine, but you will still need to set parameters for OpenMPI.
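To make the difference concrete, here is a minimal sketch of how the launch command differs between the two flavors. This is not from the INSTALL file; the LD_PRELOAD path is the same placeholder as below, and the helper function is just for illustration:

```shell
# Sketch only: build the maker launch command for a given MPI flavor.
# The LD_PRELOAD path is a placeholder; substitute your install location.
maker_cmd() {
    if [ "$1" = "mpich" ]; then
        # MPICH3: no extra flags; shared memory is used on a single node.
        echo "mpiexec -n 20 maker"
    else
        # OpenMPI: preload libmpi.so and disable the openib transport.
        echo "LD_PRELOAD=/location/of/openmpi/lib/libmpi.so mpiexec -mca btl ^openib -n 20 maker"
    fi
}

maker_cmd mpich      # prints: mpiexec -n 20 maker
maker_cmd openmpi
```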
—Carson
> On Apr 18, 2017, at 9:19 AM, Tim Fallon <tfallon at mit.edu> wrote:
>
> Hi Carson,
>
> Thanks for the feedback, I’ll look into the maker INSTALL file to ensure things are set up correctly.
>
> Should I be looking into MPICH3 instead? I’m trying to run maker on a single node (~20 CPUs), rather than distribute it across our cluster. I assume that using MPICH3 changes the mpiexec / mpirun executable, but not the commands otherwise? I ask as I may have to configure this myself one day, so any tips appreciated. In the meantime your comments seem straightforward, will give it a shot.
>
> All the best,
> -Tim
>
>> On Apr 18, 2017, at 11:10 AM, Carson Holt <carsonhh at gmail.com <mailto:carsonhh at gmail.com>> wrote:
>>
>> I agree that just using MPICH3 is the easiest solution. But to use OpenMPI you will need to export the environment variables and add the command-line flags to mpiexec as explained in the ../maker/INSTALL file.
>>
>> # this must be done every time you run it
>> export LD_PRELOAD=/location/of/openmpi/lib/libmpi.so
>>
>> # -mca btl ^openib must be on the command line
>> mpiexec -mca btl ^openib -n 20 maker
>>
>> # alternatively, add --mca btl vader,tcp,self --mca btl_tcp_if_include ib0 (use ib0 or whatever adapter interface exists on your cluster)
>> mpiexec --mca btl vader,tcp,self --mca btl_tcp_if_include ib0 -n 20 maker
>>
>> —Carson
>>
>>> On Apr 17, 2017, at 9:35 AM, Tim Fallon <tfallon at mit.edu <mailto:tfallon at mit.edu>> wrote:
>>>
>>> Hi there,
>>>
>>> I’ve been troubleshooting the maker 3.0.0 beta plus MPI installation and configuration with our institute’s IT department. Despite a lot of tries and recompilations from scratch, we haven’t been able to get it to work (non-MPI maker works fine, albeit a bit slow). I’ve attached our most recent crash log from trying to run maker with MPI. Any tips? I’d never used MPI before maker, so I’m not that familiar with troubleshooting it.
>>>
>>> See below for my most recent contact with IT on it. I’ve also attached the most recent log file. We are trying to execute it through the LSF cluster submission tool, but I don’t think that is related to the problem. Any ideas?
>>>
>>> "
>>> I think one option is to try a build on a host without the full suite of cluster software. It may actually help, but if nothing else we can say that our problem exists with a (relatively) vanilla Ubuntu Linux host.
>>>
>>> Peter Macfarlane / WIBR IT
>>>
>>>
>>> Description : Hi Peter,
>>>
>>> Hoping to hear back from you on this. It does seem like we’ve exhausted the reasonable troubleshooting for this maker MPI execution, but maybe worth another shot before I ask the developers?
>>>
>>> All the best,
>>> -Tim
>>>
>>> On Apr 7, 2017, at 9:11 AM, Tim Fallon <tfallon at mit.edu <mailto:tfallon at mit.edu>> wrote:
>>>
>>> Hi Peter,
>>>
>>> I’ve tested with the following command:
>>> bsub -o bsub_maker_with_mpi_log.txt "/nfs/apps/test/openmpi/bin/mpirun -n 8 /nfs/apps/test/maker/bin/maker"
>>>
>>> Unfortunately it still seems to be giving errors and not running, with things like:
>>> "mca_base_component_repository_open: unable to open mca_patcher_overwrite: /nfs/apps/test/openmpi/lib/openmpi/mca_patcher_overwrite.so: undefined symbol: mca_patcher_base_patch_t_class (ignored)"
>>>
>>> It also doesn’t work if I try to run it directly on Tak4.
>>>
>>> I’ve attached the relevant error log file. Any more solutions?
>>>
>>> All the best,
>>> -Tim
>>>
>>> On Apr 5, 2017, at 3:50 PM, IT Systems Group <unix-help at wi.mit.edu <mailto:unix-help at wi.mit.edu>> wrote:
>>>
>>> I tried rebuilding maker 3 with its own also-built-from-source openmpi instance. So with that:
>>>
>>> - /nfs/apps/test/maker is now maker 3.00
>>> - /nfs/apps/test/openmpi was used to make the aforementioned maker
>>>
>>> I think that all you'll need to do differently here is invoke the version of mpirun in /nfs/apps/test/openmpi/bin
>>> "
>>> All the best,
>>> -Tim
>>>
>>> Timothy R. Fallon
>>> PhD candidate
>>> Laboratory of Jing-Ke Weng
>>> Department of Biology
>>> MIT
>>>
>>> tfallon at mit.edu <mailto:tfallon at mit.edu>
>>> <bsub_maker_with_mpi_log.txt>
>>> _______________________________________________
>>> maker-devel mailing list
>>> maker-devel at box290.bluehost.com <mailto:maker-devel at box290.bluehost.com>
>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org <http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org>
>>
>
> Timothy R. Fallon
> PhD candidate
> Laboratory of Jing-Ke Weng
> Department of Biology
> MIT
>
> tfallon at mit.edu <mailto:tfallon at mit.edu>