[maker-devel] thread terminated, causing all processes to fail

Carson Holt carsonhh at gmail.com
Sun Mar 10 10:31:27 MDT 2013


I've fixed the missing script issue.

Thanks,
Carson


From:  Ramón Fallon <ramonfallon at gmail.com>
Date:  Sunday, 10 March, 2013 10:45 AM
To:  Carson Holt <carsonhh at gmail.com>
Cc:  "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
Subject:  Re: [maker-devel] thread terminated, causing all processes to fail

Hi Carson,

In terms of rev 995, on a simplified version of our data set, I tried a
sequential run successfully, and even a "mpiexec -n 4" which ran to
completion.

In any case, many thanks for the new version 996. I did have a problem with
the build, namely the new line:
'bin/TACC.PL' => ['bin/ibrun'],

I tried unsuccessfully to find TACC.PL, so I decided to dispense with this
new line, and then it compiled fine.

I started one or two tests and will inform you later about them. I must
admit I am using a rather large EST fasta file, which is not useful for
testing. I will try to cut it down on Monday or Tuesday so that tests can be
more agile.
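
One way to cut a large FASTA file down for quicker tests is to keep only its
first N records. The following is a minimal sketch, not part of MAKER: the
function name fasta_head and the file names in the usage line are
hypothetical, and it assumes every record starts with a ">" header line.

```shell
#!/bin/sh
# fasta_head: print the first N records of a FASTA file.
# Hypothetical helper for illustration only; not a MAKER script.
# Usage: fasta_head N input.fa > subset.fa
fasta_head() {
    n="$1"
    infile="$2"
    # Each '>' header starts a new record. Count headers and stop
    # as soon as header number n+1 is reached.
    awk -v n="$n" '/^>/ { count++ } count > n { exit } { print }' "$infile"
}
```

For example, "fasta_head 1000 est.fa > est_small.fa" would keep the first
1000 sequences. Taking the first records is the simplest choice; it does not
give a random or representative sample of the ESTs.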

Many thanks / Ramón.


On Fri, Mar 8, 2013 at 9:28 PM, Carson Holt <carsonhh at gmail.com> wrote:
> Also delete mpi_blastdb before retrying with the new svn repository.
> 
> Thanks,
> Carson
> 
> 
> From:  Carson Holt <carsonhh at gmail.com>
> Date:  Friday, 8 March, 2013 3:20 PM
> To:  Ramón Fallon <ramonfallon at gmail.com>
> 
> Cc:  "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
> Subject:  Re: [maker-devel] thread terminated, causing all processes to fail
> 
> I think I've found the potential cause and committed the necessary changes to
> fix it.
> 
> Thanks,
> Carson
> 
> 
> From:  Ramón Fallon <ramonfallon at gmail.com>
> Date:  Thursday, 7 March, 2013 12:47 PM
> To:  Carson Holt <carsonhh at gmail.com>
> Cc:  "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
> Subject:  Re: [maker-devel] thread terminated, causing all processes to fail
> 
> This is a standalone machine and no NFS at all. "df" gives a healthy amount of
> disk space, so there should be no problem there.
> 
> Yes that file does exist although it has the nominal 12288 bytes size, which
> appears to be the minimum for a DB_file tie.
> 
> As I mentioned the dpp_contig.fa example set does work so part of my
> investigation is looking at how.
> 
> I can do some trivial unit tests on the Bioperl stat-before-tied-hashes
> situation and see what comes up.
> 
> So I'll attempt to clear that up and then revert.
> 
> Many thanks! / Ramón.
> 
> 
> On Thu, Mar 7, 2013 at 5:44 PM, Carson Holt <carsonhh at gmail.com> wrote:
>> That is extremely odd.  It fails to even generate the indexes. Could you
>> check the drive space of your working directory and your /tmp directory?
>> 
>> It is odd because Bioperl uses the stat command to check on the file right
>> before making a tied hash.  So it was there for the stat but not the tie,
>> which is immediately following.
>> 
>> If you check manually does it exist now? -->
>> /home/ramonf/makertrials/mgallocut7/sca29310_8.maker.output/mpi_blastdb/sca29310_8%2Efa.mpi.1/sca29310_8%2Efa.mpi.1.0.index
>> 
>> Are you running in an NFS mounted directory?
>> 
>> --Carson
>> 
>> 
>> From:  Ramón Fallon <ramonfallon at gmail.com>
>> Date:  Thursday, 7 March, 2013 9:40 AM
>> 
>> To:  Carson Holt <carson.holt at oicr.on.ca>
>> Cc:  "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
>> Subject:  Re: [maker-devel] thread terminated, causing all processes to fail
>> 
>> Hi Carson,
>> 
>> I am sending you a zip of the text file from my repeated maker session, this
>> time having deleted the mpi_blastdb dir and with the -a flag added to the
>> "mpiexec -n 8 maker -debug" command line.
>> 
>> Cheers / Ramón.
>> 
>> 
>> On Wed, Mar 6, 2013 at 7:49 PM, Ramón Fallon <ramonfallon at gmail.com> wrote:
>>> OK, will do. 
>>> 
>>> Will get back to you tomorrow on it.
>>> 
>>> Many thanks!
>>>  
>>> 
>>> On Wed, Mar 6, 2013 at 7:22 PM, Carson Holt <Carson.Holt at oicr.on.ca> wrote:
>>>> Could you delete your ../*maker.output/mpi_blastdb directory, and then when
>>>> rerunning maker, run with the -a flag.
>>>> 
>>>> Thanks,
>>>> Carson
>>>> 
>>>> 
>>>> From: Ramón Fallon <ramonfallon at gmail.com>
>>>> Date: Wednesday, 6 March, 2013 1:15 PM
>>>> To: Carson Holt <carson.holt at oicr.on.ca>
>>>> Cc: "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
>>>> 
>>>> Subject: Re: thread terminated, causing all processes to fail
>>>> 
>>>> OK great, here goes .. many thanks!
>>>> 
>>>> 
>>>> 
>>>> On Wed, Mar 6, 2013 at 7:04 PM, Carson Holt <Carson.Holt at oicr.on.ca> wrote:
>>>>> If you do reply all to this message, I should get the attachment.  It will
>>>>> be stripped from the one going to the list though.
>>>>> 
>>>>> Thanks,
>>>>> Carson
>>>>> 
>>>>> 
>>>>> 
>>>>> From: Ramón Fallon <ramonfallon at gmail.com>
>>>>> Date: Wednesday, 6 March, 2013 12:57 PM
>>>>> To: <maker-devel at yandell-lab.org>
>>>>> Subject: Re: thread terminated, causing all processes to fail
>>>>> 
>>>>> Hi, 
>>>>> 
>>>>> Many thanks for your quick reply and hint.
>>>>> 
>>>>> Yes, you're right: further up there is indeed
>>>>> 
>>>>> Calling FastaDB::new at /opt/src/maker_svn/bin/../lib/FastaSeq.pm line 148
>>>>> thread 1.
>>>>> Thread 1 terminated abnormally: ERROR: Could not reestablish DB to thaw
>>>>> FastaSeq for Storable
>>>>> --> rank=5, hostname=fatnode, at /opt/src/maker_svn/bin/maker line 1457
>>>>> thread 1.
>>>>> 
>>>>> I ran a "script" session with maker on -debug, so I have everything in
>>>>> one file. Would you prefer to have it attached to a post to this mailing
>>>>> list (if it accepts txt attachments)?
>>>>> 
>>>>> Cheers.
>>>>> 
>>>>> 
>>>>> On Wed, Mar 6, 2013 at 6:34 PM, Ramón Fallon <ramonfallon at gmail.com>
>>>>> wrote:
>>>>>> Hi, 
>>>>>> 
>>>>>> I'm using the maker_svn rev 995 version and hand-compiled MPICH2 on a
>>>>>> single multicore machine.
>>>>>> 
>>>>>> I've successfully run the dpp_contig.fasta (MPI/8 processes) example but
>>>>>> am having trouble with larger contigs fasta files of my own, which are
>>>>>> well formed.
>>>>>> 
>>>>>> I've run into a problem whereby an mpiexec run of 8 processes will stop
>>>>>> due to a perl-thread related problem which says
>>>>>> 
>>>>>> FATAL: Thread terminated, causing all processes to fail
>>>>>>  
>>>>>> this corresponds to line 924 in the maker executable (which is for the
>>>>>> secondary/worker threads), and is the result of a test on !$thr OR'd with
>>>>>> !$thr->is_running, so clearly one of these is failing.
>>>>>> 
>>>>>> $thr itself is a threads->new(\&$node_thread, $gdbfile). Despite being a
>>>>>> programmer, I've only recently started to look at the code and have not
>>>>>> got the hang of the parallelisation setup here, though I gather the
>>>>>> master must use threads to initially generate the parallel instances
>>>>>> which then use the message passing. Of course threads don't have message
>>>>>> passing ability, so I guess something clever is going on and will take
>>>>>> some time for me to understand.
>>>>>> 
>>>>>> Clearly, however, it has worked before on dpp_contigs, so there may be
>>>>>> something wrong with my datafile or the way I am carrying out the
>>>>>> analysis.
>>>>>> 
>>>>>> Any clues that can be put my way are welcome.
>>>>>> 
>>>>>> Thank you!
>>>>> 
>>>> 
>>> 
>> 
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
> 
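
The checks suggested in the quoted messages above (free space in the working
directory and in /tmp, whether the index file exists, and deleting
mpi_blastdb before a retry) could be scripted roughly as follows. This is
only a sketch, not part of MAKER: the function names check_maker_run and
reset_mpi_blastdb are hypothetical, and it assumes the index files end in
".index" as in the path quoted in the thread.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not MAKER scripts.

# Report free space for the output directory and /tmp, then list any
# index files found under its mpi_blastdb directory.
check_maker_run() {
    outdir="$1"    # a *.maker.output directory
    df -k "$outdir" /tmp
    if [ -d "$outdir/mpi_blastdb" ]; then
        # The thread reports a nominal size of 12288 bytes for the
        # index file; ls -l shows the size for comparison.
        find "$outdir/mpi_blastdb" -name '*.index' -exec ls -l {} +
    else
        echo "no mpi_blastdb directory under $outdir"
    fi
}

# Remove only the mpi_blastdb directory so the next run rebuilds it.
reset_mpi_blastdb() {
    outdir="$1"
    rm -rf -- "$outdir/mpi_blastdb"
}
```

After reset_mpi_blastdb, the rerun described in the thread was
"mpiexec -n 8 maker -debug" with the -a flag added.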


