Hi Carson,<div><br></div><div>Regarding rev 995: on a simplified version of our data set, a sequential run succeeded, and even an "mpiexec -n 4" run completed.</div><div><br></div><div>In any case, many thanks for the new version 996. I did have a problem with the build, namely the new line:</div>
<div>'bin/TACC.PL' => ['bin/ibrun'],</div><div><br></div><div>I could not find TACC.PL anywhere, so I removed this new line, and then it built fine.</div>
<div><br></div><div>I have started one or two tests and will let you know how they go. I must admit I am using a rather large EST FASTA file, which is not practical for testing. I will try to cut it down on Monday or Tuesday so that tests run faster.</div>
<div><br></div><div>Many thanks / Ramón.</div><div><br></div><div><br><div class="gmail_quote">On Fri, Mar 8, 2013 at 9:28 PM, Carson Holt <span dir="ltr"><<a href="mailto:carsonhh@gmail.com" target="_blank">carsonhh@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="font-size:14px;font-family:Calibri,sans-serif;word-wrap:break-word"><div>Also delete mpi_blastdb before retrying with the new svn repository.</div>
<div><br></div><div>Thanks,</div><div>Carson</div><div><br></div><div><br></div><span><div style="border-right:medium none;padding-right:0in;padding-left:0in;padding-top:3pt;text-align:left;font-size:11pt;border-bottom:medium none;font-family:Calibri;border-top:#b5c4df 1pt solid;padding-bottom:0in;border-left:medium none">
<span style="font-weight:bold">From: </span> Carson Holt <<a href="mailto:carsonhh@gmail.com" target="_blank">carsonhh@gmail.com</a>><br><span style="font-weight:bold">Date: </span> Friday, 8 March, 2013 3:20 PM<br>
<span style="font-weight:bold">To: </span> Ramón Fallon <<a href="mailto:ramonfallon@gmail.com" target="_blank">ramonfallon@gmail.com</a>><div><div class="h5"><br><span style="font-weight:bold">Cc: </span> "<a href="mailto:maker-devel@yandell-lab.org" target="_blank">maker-devel@yandell-lab.org</a>" <<a href="mailto:maker-devel@yandell-lab.org" target="_blank">maker-devel@yandell-lab.org</a>><br>
<span style="font-weight:bold">Subject: </span> Re: [maker-devel] thread terminated, causing all processes to fail<br></div></div></div><div><div class="h5"><div><br></div><div><div style="font-size:14px;font-family:Calibri,sans-serif;word-wrap:break-word">
<div>I think I've found the potential cause and committed the necessary changes to fix it.</div><div><br></div><div>Thanks,</div><div>Carson</div><div><br></div><div><br></div><span><div style="border-right:medium none;padding-right:0in;padding-left:0in;padding-top:3pt;text-align:left;font-size:11pt;border-bottom:medium none;font-family:Calibri;border-top:#b5c4df 1pt solid;padding-bottom:0in;border-left:medium none">
<span style="font-weight:bold">From: </span> Ramón Fallon <<a href="mailto:ramonfallon@gmail.com" target="_blank">ramonfallon@gmail.com</a>><br><span style="font-weight:bold">Date: </span> Thursday, 7 March, 2013 12:47 PM<br>
<span style="font-weight:bold">To: </span> Carson Holt <<a href="mailto:carsonhh@gmail.com" target="_blank">carsonhh@gmail.com</a>><br><span style="font-weight:bold">Cc: </span> "<a href="mailto:maker-devel@yandell-lab.org" target="_blank">maker-devel@yandell-lab.org</a>" <<a href="mailto:maker-devel@yandell-lab.org" target="_blank">maker-devel@yandell-lab.org</a>><br>
<span style="font-weight:bold">Subject: </span> Re: [maker-devel] thread terminated, causing all processes to fail<br></div><div><br></div>This is a standalone machine with no NFS at all. "df" reports a healthy amount of disk space, so there should be no problem there.<div>
<br></div><div>Yes, that file does exist, although it has the nominal size of 12288 bytes, which appears to be the minimum for a DB_File tie.</div><div><br></div><div>As I mentioned, the dpp_contig.fa example set does work, so part of my investigation is looking at how.</div>
<div><br></div><div>I can do some trivial unit tests on the BioPerl stat-before-tied-hash situation and see what comes up.</div><div><br></div><div>So I'll attempt to clear that up and then get back to you.</div><div><br></div>
<div>Many thanks! / Ramón.</div><div><br></div><div><br></div><div><div class="gmail_quote">On Thu, Mar 7, 2013 at 5:44 PM, Carson Holt <span dir="ltr"><<a href="mailto:carsonhh@gmail.com" target="_blank">carsonhh@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="font-size:14px;font-family:Calibri,sans-serif;word-wrap:break-word"><div>That is extremely odd. It fails to even generate the indexes. Could you check the drive space of your working directory and your /tmp directory?</div>
<div><br></div><div>It is odd because BioPerl uses the stat command to check on the file right before making a tied hash. So the file was there for the stat but not for the tie, which immediately follows.</div><div><br></div>
<div>If you check manually does it exist now? --> /home/ramonf/makertrials/mgallocut7/sca29310_8.maker.output/mpi_blastdb/sca29310_8%2Efa.mpi.1/sca29310_8%2Efa.mpi.1.0.index</div><div><br></div><div>Are you running in an NFS mounted directory?</div>
<div><br></div><div>--Carson</div><div><br></div><div><br></div><span><div style="border-right:medium none;padding-right:0in;padding-left:0in;padding-top:3pt;text-align:left;font-size:11pt;border-bottom:medium none;font-family:Calibri;border-top:#b5c4df 1pt solid;padding-bottom:0in;border-left:medium none">
<span style="font-weight:bold">From: </span> Ramón Fallon <<a href="mailto:ramonfallon@gmail.com" target="_blank">ramonfallon@gmail.com</a>><br><span style="font-weight:bold">Date: </span> Thursday, 7 March, 2013 9:40 AM<div>
<br><span style="font-weight:bold">To: </span> Carson Holt <<a href="mailto:carson.holt@oicr.on.ca" target="_blank">carson.holt@oicr.on.ca</a>><br><span style="font-weight:bold">Cc: </span> "<a href="mailto:maker-devel@yandell-lab.org" target="_blank">maker-devel@yandell-lab.org</a>" <<a href="mailto:maker-devel@yandell-lab.org" target="_blank">maker-devel@yandell-lab.org</a>><br>
</div><span style="font-weight:bold">Subject: </span> Re: [maker-devel] thread terminated, causing all processes to fail<br></div><div><div><div><br></div>Hi Carson,<div><br></div><div>I am sending you a zip of the text file from my repeated maker session, this time having deleted the mpi_blastdb dir and with the -a flag added to the "mpiexec -n 8 maker -debug" command line.</div>
<div><br></div><div>Cheers / Ramón.</div><div><br><br><div class="gmail_quote">On Wed, Mar 6, 2013 at 7:49 PM, Ramón Fallon <span dir="ltr"><<a href="mailto:ramonfallon@gmail.com" target="_blank">ramonfallon@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">OK, will do. <div><br></div><div>Will get back to you tomorrow on it.<div><br></div><div>Many thanks!</div><div><div><div>
<br><br><div class="gmail_quote">On Wed, Mar 6, 2013 at 7:22 PM, Carson Holt <span dir="ltr"><<a href="mailto:Carson.Holt@oicr.on.ca" target="_blank">Carson.Holt@oicr.on.ca</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="font-size:14px;font-family:Calibri,sans-serif;word-wrap:break-word"><div><div><div>Could you delete your ../*maker.output/mpi_blastdb directory and then, when rerunning maker, run with the -a flag?</div><div><br>
</div><div>Thanks,</div><div>Carson</div><div><br></div><div><div><div style="font-size:14px;font-family:Calibri,sans-serif"></div></div></div></div></div><div><br></div><span><div style="border-right:medium none;padding-right:0in;padding-left:0in;padding-top:3pt;text-align:left;font-size:11pt;border-bottom:medium none;font-family:Calibri;border-top:#b5c4df 1pt solid;padding-bottom:0in;border-left:medium none">
<span style="font-weight:bold">From: </span>Ramón Fallon <<a href="mailto:ramonfallon@gmail.com" target="_blank">ramonfallon@gmail.com</a>><br><span style="font-weight:bold">Date: </span>Wednesday, 6 March, 2013 1:15 PM<br>
<span style="font-weight:bold">To: </span>Carson Holt <<a href="mailto:carson.holt@oicr.on.ca" target="_blank">carson.holt@oicr.on.ca</a>><br><span style="font-weight:bold">Cc: </span>"<a href="mailto:maker-devel@yandell-lab.org" target="_blank">maker-devel@yandell-lab.org</a>" <<a href="mailto:maker-devel@yandell-lab.org" target="_blank">maker-devel@yandell-lab.org</a>><div>
<div><br><span style="font-weight:bold">Subject: </span>Re: thread terminated, causing all processes to fail<br></div></div></div><div><div><div><br></div><div><div>OK great, here goes .. many thanks!
<div><br></div><div><br><br><div class="gmail_quote">On Wed, Mar 6, 2013 at 7:04 PM, Carson Holt <span dir="ltr">
<<a href="mailto:Carson.Holt@oicr.on.ca" target="_blank">Carson.Holt@oicr.on.ca</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="font-size:14px;font-family:Calibri,sans-serif;word-wrap:break-word">
<div><div><div>If you do reply all to this message, I should get the attachment. It will be stripped from the one going to the list though.</div><div><br></div><div>Thanks,</div><div>Carson</div><div><br></div><div><br></div>
<div><div><div style="font-size:14px;font-family:Calibri,sans-serif"></div></div></div></div></div><div><br></div><span><div style="border-right:medium none;padding-right:0in;padding-left:0in;padding-top:3pt;text-align:left;font-size:11pt;border-bottom:medium none;font-family:Calibri;border-top:#b5c4df 1pt solid;padding-bottom:0in;border-left:medium none">
<span style="font-weight:bold">From: </span>Ramón Fallon <<a href="mailto:ramonfallon@gmail.com" target="_blank">ramonfallon@gmail.com</a>><br><span style="font-weight:bold">Date: </span>Wednesday, 6 March, 2013 12:57 PM<br>
<span style="font-weight:bold">To: </span><<a href="mailto:maker-devel@yandell-lab.org" target="_blank">maker-devel@yandell-lab.org</a>><br><span style="font-weight:bold">Subject: </span>Re: thread terminated, causing all processes to fail<br>
</div><div><div><div><br></div>
Hi,
<div><br></div><div>Many thanks for your quick reply and hint.</div><div><br></div><div>Yes, you're right .. further up there is indeed</div><div><br></div><div>Calling FastaDB::new at /opt/src/maker_svn/bin/../lib/FastaSeq.pm line 148 thread 1.</div>
<div><div>Thread 1 terminated abnormally: ERROR: Could not reestablish DB to thaw FastaSeq for Storable</div><div>--> rank=5, hostname=fatnode, at /opt/src/maker_svn/bin/maker line 1457 thread 1.</div><div><br></div><div>
I ran maker with -debug inside a "script" session, so I have everything in one file. Would you prefer that I attach it to a post to this mailing list (if it accepts txt attachments)?</div><div><br></div>
Cheers.</div><div><br></div><div><br><div class="gmail_quote">On Wed, Mar 6, 2013 at 6:34 PM, Ramón Fallon <span dir="ltr">
<<a href="mailto:ramonfallon@gmail.com" target="_blank">ramonfallon@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi,
<div><br></div><div>I'm using the maker_svn rev 995 version and hand-compiled MPICH2 on a single multicore machine.</div><div><br></div><div>I've successfully run the dpp_contig.fasta example (MPI, 8 processes) but am having trouble with larger contig FASTA files of my own, which are well formed.</div>
<div><br></div><div>I've run into a problem whereby an mpiexec run of 8 processes stops due to a Perl-thread-related problem, which reports</div><div><br></div><div>FATAL: Thread terminated, causing all processes to fail</div>
<div> </div><div>This corresponds to line 924 in the maker executable (which is for the secondary/worker threads), and is the result of a test on !$thr OR'd with !$thr->is_running, so clearly either the thread object is missing or the thread is no longer running. </div>
<div><br></div><div>$thr itself is a threads->new(\&$node_thread, $gdbfile). Despite being a programmer, I've only recently started to look at the code and have not got the hang of the parallelisation setup here, though I gather the master must use threads to initially generate
the parallel instances, which then use message passing. Of course threads don't have message-passing ability, so I guess something clever is going on that will take me some time to understand. </div><div><br>
</div><div>Clearly, however, it has worked before on dpp_contig.fasta, so there may be something wrong with my data file or the way I am carrying out the analysis.</div><div><br></div><div>Any clues that can be put my way are welcome.</div>
<div><br></div><div>Thank you!</div></blockquote></div><br></div></div></div></span></div></blockquote></div><br></div></div></div></div></div></span></div></blockquote></div><br></div></div></div></div></blockquote></div>
<br></div></div></div><div>
_______________________________________________
maker-devel mailing list
<a href="mailto:maker-devel@box290.bluehost.com" target="_blank">maker-devel@box290.bluehost.com</a><br><a href="http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org" target="_blank">http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org</a></div>
</span></div></blockquote></div><br></div></span></div></div></div></div></span></div>
</blockquote></div><br></div>