<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto"><div>Just forwarding this to the list.<br><br><div><span style="font-size: 13pt;">--Carson</span></div></div><div><br></div><blockquote type="cite"><div><b>From:</b> Carson Holt <<a href="mailto:carsonhh@gmail.com">carsonhh@gmail.com</a>><br><b>Date:</b> August 14, 2017 at 2:00:11 PM MDT<br><b>To:</b> zl c <<a href="mailto:chzelin@gmail.com">chzelin@gmail.com</a>><br><b>Subject:</b> <b>Re: [maker-devel] maker MPI problem</b><br><br></div></blockquote><blockquote type="cite"><div><meta http-equiv="Content-Type" content="text/html charset=utf-8"><meta http-equiv="Content-Type" content="text/html charset=utf-8" class=""><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Yes. You can delete them.<div class=""><br class=""></div><div class="">Also I notice this library being mentioned in the segfault —> libpthread.so</div><div class=""><br class=""></div><div class="">MAKER doesn’t use pthreads, so I’m surprised it’s showing up in an error. You could try installing a separate version of perl without pthread support and running MAKER with that (pthreads is optional for perl). 
It may remove an OpenMPI/perl incompatibility happening on your system.</div><div class=""><br class=""></div><div class="">—Carson</div><div class=""><br class=""><div class=""><br class=""><div class=""><blockquote type="cite" class=""><div class="">On Aug 14, 2017, at 1:50 PM, zl c <<a href="mailto:chzelin@gmail.com" class="">chzelin@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">Maker dies.<div class=""><br class=""></div><div class="">I've set LD_PRELOAD before install.</div><div class=""><br class=""></div><div class="">I'll try the option.<div class=""><br class=""></div><div class="">Can I remove the .NFS files before rerunning?</div><div class=""><br class=""></div><div class="">Thanks,</div><div class="">Zelin</div><div class="gmail_extra"><br clear="all" class=""><div class=""><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr" class=""><div dir="ltr" class=""><br class=""></div></div></div></div>
<br class=""><div class="gmail_quote">On Mon, Aug 14, 2017 at 3:35 PM, Carson Holt <span dir="ltr" class=""><<a href="mailto:carsonhh@gmail.com" target="_blank" class="">carsonhh@gmail.com</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word" class="">Is the issue that your cluster dies or that MAKER dies? (i.e. I want to know if this is an issue with your cluster or just an issue running MAKER)<div class=""><br class=""></div><div class="">I see in the file that you are getting segfaults, which should not crash the cluster but would kill MAKER. They would indicate either an installation problem or just a command configuration issue.</div><div class=""><br class=""></div><div class="">You may need to recompile while the LD_PRELOAD value is set (it must be set during MAKER install and whenever you run with OpenMPI). Or you may still have the native InfiniBand communication active (causes segfaults with system calls).</div><div class=""><br class=""></div><div class="">You can try this (to use IP over InfiniBand instead; this works only if ib0 exists, or set it to eth0 if eth0 exists) —> '--mca btl vader,tcp,self --mca btl_tcp_if_include ib0'</div><div class=""><br class=""></div><div class="">That would replace the '-mca btl ^openib'</div><div class=""><br class=""></div><div class="">Also make sure you can run MAKER on a single node under MPI before trying to work across nodes, then try on two nodes for your first test.</div><div class=""><br class=""></div><div class="">The NFSLock files are file locks that are not cleaned up on a hard failure.<br class=""><div class=""><br class=""></div><div class="">—Carson</div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""><div class=""><blockquote type="cite" class=""><div class="">On Aug 14, 2017, at 1:22 
PM, zl c <<a href="mailto:chzelin@gmail.com" target="_blank" class="">chzelin@gmail.com</a>> wrote:</div><br class="m_3595073103290944534Apple-interchange-newline"><div class=""><div dir="ltr" class="">It's in the attached file.<div class=""><br class=""></div><div class="">Besides, I see there are lots of .NFS... files, like:</div><div class=""><div style="margin:0px;font-size:12px;line-height:normal;font-family:Menlo;background-color:rgb(255,222,150)" class=""><span style="font-variant-ligatures:no-common-ligatures" class="">.NFSLock..NFSLock.genomedb.<wbr class="">NFSLock.share.tmp.2247.26272.<wbr class="">7466.34069868502337</span></div></div><div class="gmail_extra"><br clear="all" class=""><div class=""><div class="m_3595073103290944534gmail_signature" data-smartmail="gmail_signature"><div dir="ltr" class=""><div class=""><div dir="ltr" class=""><div class=""><div class="">------------------------------<wbr class="">--------------</div>Zelin Chen [<a href="mailto:chzelin@gmail.com" target="_blank" class="">chzelin@gmail.com</a>]<br class=""></div><div class=""><br class=""></div><div class=""><div class="">NIH/NHGRI</div><div class=""><span style="font-family:Verdana,Geneva,sans-serif;font-size:12px" class="">Building 50, Room 5531</span><br style="font-family:Verdana,Geneva,sans-serif;font-size:12px" class=""><span style="font-family:Verdana,Geneva,sans-serif;font-size:12px" class="">50 SOUTH DR, MSC 8004 </span><br style="font-family:Verdana,Geneva,sans-serif;font-size:12px" class=""><span style="font-family:Verdana,Geneva,sans-serif;font-size:12px" class="">BETHESDA, MD 20892-8004</span></div></div></div></div></div></div></div>
<br class=""><div class="gmail_quote">On Mon, Aug 14, 2017 at 3:18 PM, Carson Holt <span dir="ltr" class=""><<a href="mailto:carsonhh@gmail.com" target="_blank" class="">carsonhh@gmail.com</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word" class="">This is rather vague —> “<span style="font-size:14.666666984558105px" class="">crashed the computer cluster</span>”<div class=""><br class=""></div><div class="">Do you have a specific error?</div><div class=""><br class=""></div><div class="">—Carson</div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""><div class=""><blockquote type="cite" class=""><div class="">On Aug 14, 2017, at 12:59 PM, zl c <<a href="mailto:chzelin@gmail.com" target="_blank" class="">chzelin@gmail.com</a>> wrote:</div><br class="m_3595073103290944534m_602814503278033286Apple-interchange-newline"><div class=""><div dir="ltr" class=""><p class="MsoNormal"><span style="font-size:11pt" class="">Hello,<span class=""></span></span></p><div class=""><span style="font-size:11pt" class=""><span class=""> </span></span><br class="m_3595073103290944534m_602814503278033286webkit-block-placeholder"></div><p class="MsoNormal"><span style="font-size:11pt" class="">I ran maker 3.0 with openmpi 2.0.2
and it crashed the computer cluster. I attached the log file. Could you help me
to solve the problem?<span class=""></span></span></p><div class=""><span style="font-size:11pt" class=""><span class=""> </span></span><br class="m_3595073103290944534m_602814503278033286webkit-block-placeholder"></div><p class="MsoNormal"><span style="font-size:11pt" class="">CMD:<span class=""></span></span></p><p class="MsoNormal"><span style="font-size:11pt" class="">export
LD_PRELOAD=/usr/local/OpenMPI/<wbr class="">2.0.2/gcc-6.3.0/lib/libmpi.so<span class=""></span></span></p><p class="MsoNormal"><span style="font-size:11pt" class="">export
OMPI_MCA_mpi_warn_on_fork=0<span class=""></span></span></p><p class="MsoNormal"><span style="font-size:11pt" class="">mpiexec -mca btl ^openib -n
$SLURM_NTASKS maker -c 1 -base genome -g
genome.fasta<span class=""></span></span></p><div class=""><span style="font-size:11pt" class=""><span class=""> </span></span><br class="m_3595073103290944534m_602814503278033286webkit-block-placeholder"></div><p class="MsoNormal"><span style="font-size:11pt" class="">Thanks,<span class=""></span></span></p><p class="MsoNormal"><span style="font-size:11pt" class="">Zelin Chen<span class=""></span></span></p><div class=""><span style="font-size:11pt" class=""><span class=""> </span></span><br class="m_3595073103290944534m_602814503278033286webkit-block-placeholder"></div>
<div class=""><div class="m_3595073103290944534m_602814503278033286gmail_signature"><div dir="ltr" class=""><div dir="ltr" class=""><div class=""><div class="">------------------------------<wbr class="">--------------</div>Zelin Chen [<a href="mailto:chzelin@gmail.com" target="_blank" class="">chzelin@gmail.com</a>] Ph.D.</div><div class=""><br class=""></div><div class="">NIH/NHGRI<br class=""></div><div class=""><div class=""><span style="font-family:Verdana,Geneva,sans-serif;font-size:12px" class="">Building 50, Room 5531</span><br style="font-family:Verdana,Geneva,sans-serif;font-size:12px" class=""><span style="font-family:Verdana,Geneva,sans-serif;font-size:12px" class="">50 SOUTH DR, MSC 8004 </span><br style="font-family:Verdana,Geneva,sans-serif;font-size:12px" class=""><span style="font-family:Verdana,Geneva,sans-serif;font-size:12px" class="">BETHESDA, MD 20892-8004</span></div></div></div></div></div></div>
</div>
<span id="m_3595073103290944534m_602814503278033286cid:8EC7467F-C13C-403B-AE2E-97762F28B1B0@hsd1.ut.comcast.net." class=""><run05.mpi.o47346077></span>_________<wbr class="">______________________________<wbr class="">________<br class="">maker-devel mailing list<br class=""><a href="mailto:maker-devel@box290.bluehost.com" target="_blank" class="">maker-devel@box290.bluehost.co<wbr class="">m</a><br class=""><a href="http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org" target="_blank" class="">http://box290.bluehost.com/mai<wbr class="">lman/listinfo/maker-devel_yand<wbr class="">ell-lab.org</a><br class=""></div></blockquote></div><br class=""></div></div></blockquote></div><br class=""></div></div>
<span id="m_3595073103290944534cid:5556DE63-FD27-467B-B7E9-4E2FE483095C@hsd1.ut.comcast.net." class=""><run05.mpi.o47346077></span></div></blockquote></div><br class=""></div></div></div></blockquote></div><br class=""></div></div></div>
</div></blockquote></div><br class=""></div></div></div></div></blockquote></body></html>