<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">MAKER uses locks to divide up work between simultaneously running jobs, so submitting five 200-CPU jobs will give you the same throughput as a single 1000-CPU job and will be more stable. The jobs will probably move through the queue faster as well.<div class=""><br class=""></div><div class="">—Carson</div><div class=""><br class=""><div class=""><br class=""></div><div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Sep 1, 2016, at 10:00 AM, Mark Ebbert <<a href="mailto:me.mark@gmail.com" class="">me.mark@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""></div><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; 
widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">Bummer. It worked at 720 the first time. Thanks again!</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><div id="psignature" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><div class="">Mark T. W. Ebbert</div></div><div class="gmail_extra" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br class=""><div class="gmail_quote"><div dir="ltr" class="">On Thu, Sep 01, 2016 at 9:57 AM Carson Holt<span class="Apple-converted-space"> </span>&lt;carsonhh@gmail.com&gt; wrote:<br class=""></div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;">-n 1000 is probably too high for MPICH3. Its communication manager is not that robust. 
You can go that high with OpenMPI or MVAPICH2, but I’ve found that MPICH3 tops out at 100-200. Just submit multiple jobs at the lower count.<div class=""><br class=""></div><div class="">—Carson</div><div class=""><br class=""><div class=""><br class=""></div><div class=""><br class=""><div class=""><blockquote type="cite" class=""><div class="">On Sep 1, 2016, at 8:47 AM, Mark Ebbert <<a href="mailto:me.mark@gmail.com" class="" style="word-wrap: break-word !important;">me.mark@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"></div><span class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;">Thanks Carson! The help message only printed once, so everything seemed fine. 
I deleted all of the lock files with the following command: “find . -name *.NFSLock* -exec rm {} \;”</span><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br class=""></div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">I restarted the job and got the following segfault:</div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br class=""></div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">“Module mpi/mpich-3.1.4_intel-15.0.3 requires compiler_intel/15.0.3. Loading it now.</div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">Module compiler_intel/15.0.3 requires mkl/11.2.0. 
Loading it now.</div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">mpdboot_m7-1-2 (handle_mpd_output 1000): from mpd on m7-1-2, invalid port info:</div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">mpd_uncaught_except_tb handling:</div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">&lt;type 'exceptions.IndexError'&gt;: list index out of range</div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">/apps/intel_parallel_studio_xe/2015_update3/mpirt/bin/intel64/mpd.py 264 pin_Uni_num</div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">if list.index(list[i]) == i:</div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; 
font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">/apps/intel_parallel_studio_xe/2015_update3/mpirt/bin/intel64/mpd.py 1449 pin_Cpuinfo</div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">info['cache1'] = pin_Uni_num(info['cache1_id'], info['lcpu'])</div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">/apps/intel_parallel_studio_xe/2015_update3/mpirt/bin/intel64/mpd.py 1658 run</div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">self.CpuInfo = pin_Cpuinfo(self.PinCase,self.Arch)</div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">/apps/intel_parallel_studio_xe/2015_update3/mpirt/bin/intel64/mpd.py 3676 &lt;module&gt;</div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 
normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">mpd.run()</div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">/var/spool/slurmd/job11326444/slurm_script: line 27: 29365 Segmentation fault mpiexec -n 1000 maker”</div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br class=""></div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">Any ideas?<br class=""><br class=""><div id="psignature" class=""><div class="">Mark T. W. 
Ebbert</div></div></div><div class="gmail_extra" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br class=""><div class="gmail_quote"><div dir="ltr" class="">On Tue, Aug 30, 2016 at 10:54 AM Carson Holt<span class="Apple-converted-space"> </span>&lt;carsonhh&gt; wrote:<br class=""></div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;">Run 'maker -help' with mpiexec.<div class=""><br class=""></div><div class="">Example:</div><div class="">mpiexec -n 10 maker -help</div><div class=""><br class=""></div><div class="">If the MPI communication ring is working correctly, the help message will be printed only once (from the root process). If it is not working, the help message will be printed 10 times, because each of the 10 MPI processes will think it is the root process. 
It is a simple test that can identify whether or not the problem is an MPI issue.</div><div class=""><br class=""></div><div class="">If it is not an MPI issue, you can just search for the NFSLock files using find and delete them.</div><div class=""><br class=""></div><div class="">—Carson</div><div class=""><br class=""></div><div class=""><br class=""><div class=""><blockquote type="cite" class=""><div class="">On Aug 30, 2016, at 10:10 AM, Mark Ebbert <<a href="mailto:me.mark@gmail.com" class="" style="word-wrap: break-word !important;">me.mark@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div class=""></div><div class="">Good day everyone!</div><div class=""><br class=""></div><div class="">I’m getting an error stating: “WARNING: Multiple MAKER processes have been started in the same directory.” Everything I’ve seen mentions version issues with MPICH. The difference in my situation is that my initial run ran just fine, but died because of the cluster’s time constraints. We’re only allowed 3 days. </div><div class=""><br class=""></div><div class="">There are a bunch of .NFSLock files in the output directory. I’m guessing MAKER wasn’t able to clear the locks when the jobs died? Can I safely delete those lock files? What’s the best way to handle this going forward, since I can only run jobs for 3 days at a time?</div><div class=""><br class=""></div><div class="">Thanks!<br class=""></div><br class=""><div id="psignature" class=""><div class="">Mark T. W. 
Ebbert</div></div>_______________________________________________<br class="">maker-devel mailing list<br class=""><a href="mailto:maker-devel@box290.bluehost.com" class="" style="word-wrap: break-word !important;">maker-devel@box290.bluehost.com</a><br class=""><a href="http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org" class="" style="word-wrap: break-word !important;">http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org</a></div></blockquote></div></div></blockquote></div></div></div></blockquote></div></div></div></blockquote></div></div></div></blockquote></div><br class=""></div></div></body></html>