[maker-devel] thread terminated, causing all processes to fail
Carson Holt
carsonhh at gmail.com
Sun Mar 10 10:31:27 MDT 2013
I've fixed the missing script issue.
Thanks,
Carson
From: Ramón Fallon <ramonfallon at gmail.com>
Date: Sunday, 10 March, 2013 10:45 AM
To: Carson Holt <carsonhh at gmail.com>
Cc: "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
Subject: Re: [maker-devel] thread terminated, causing all processes to fail
Hi Carson,
As for rev 995: on a simplified version of our data set, a sequential run
succeeded, and even an "mpiexec -n 4" run went to completion.
In any case, many thanks for the new version 996. I did have a problem with
the build, namely the new line:
'bin/TACC.PL' => ['bin/ibrun'],
I tried to find TACC.PL unsuccessfully, so I decided to
dispense with this new line and then it compiled fine.
I started one or two tests and will let you know how they go. I must admit
that on my end I am using a rather large EST fasta file, which is not useful
for testing .. I will try to cut it down on Monday or Tuesday so that tests
can be more agile.
Many thanks / Ramón.
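
One way to cut a large FASTA file down for quicker tests is to keep only its first N records. The following is a minimal sketch, not anything shipped with MAKER: the function name fasta_head and the file names in the usage comment are made up for illustration. Note that it takes the leading records rather than a random sample.

```shell
# Hypothetical helper (not part of MAKER): print only the first N records
# of a FASTA file. A record starts at a line beginning with '>'.
fasta_head() {
    n="$1"
    file="$2"
    # Count header lines; stop as soon as record N+1 begins.
    awk -v n="$n" '/^>/ { count++ } count > n { exit } { print }' "$file"
}

# Example usage (file names are placeholders):
# fasta_head 1000 all_ests.fasta > ests_subset.fasta
```

Sequence lines are passed through unchanged, so multi-line records stay intact.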
On Fri, Mar 8, 2013 at 9:28 PM, Carson Holt <carsonhh at gmail.com> wrote:
> Also delete mpi_blastdb before retrying with the new svn repository.
>
> Thanks,
> Carson
>
>
> From: Carson Holt <carsonhh at gmail.com>
> Date: Friday, 8 March, 2013 3:20 PM
> To: Ramón Fallon <ramonfallon at gmail.com>
>
> Cc: "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
> Subject: Re: [maker-devel] thread terminated, causing all processes to fail
>
> I think I've found the potential cause and committed the necessary changes to
> fix it.
>
> Thanks,
> Carson
>
>
> From: Ramón Fallon <ramonfallon at gmail.com>
> Date: Thursday, 7 March, 2013 12:47 PM
> To: Carson Holt <carsonhh at gmail.com>
> Cc: "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
> Subject: Re: [maker-devel] thread terminated, causing all processes to fail
>
> This is a standalone machine and no NFS at all. "df" gives a healthy amount of
> disk space, so there should be no problem there.
>
> Yes, that file does exist, although it has the nominal 12288-byte size, which
> appears to be the minimum for a DB_File tie.
>
> As I mentioned the dpp_contig.fa example set does work so part of my
> investigation is looking at how.
>
> I can do some trivial unit tests on the Bioperl stat-before-tied-hashes
> situation and see what comes up.
>
> So I'll attempt to clear that up and then revert.
>
> Many thanks! / Ramón.
>
>
> On Thu, Mar 7, 2013 at 5:44 PM, Carson Holt <carsonhh at gmail.com> wrote:
>> That is extremely odd. It fails to even generate the indexes. Could you
>> check the drive space of your working directory and your /tmp directory?
>>
>> It is odd because Bioperl uses the stat command to check on the file right
>> before making a tied hash. So it was there for the stat but not the tie,
>> which is immediately following.
>>
>> If you check manually does it exist now? -->
>> /home/ramonf/makertrials/mgallocut7/sca29310_8.maker.output/mpi_blastdb/sca29310_8%2Efa.mpi.1/sca29310_8%2Efa.mpi.1.0.index
>>
>> Are you running in an NFS mounted directory?
>>
>> --Carson
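
The checks asked for above (free space in the working directory and /tmp, and whether the index file exists) can be sketched in shell roughly as follows. The helper names check_index and show_space are hypothetical and not part of MAKER or Bioperl. For the NFS question, on Linux with GNU coreutils "df -T ." prints the filesystem type of the current directory; that is an assumption about the platform.

```shell
# Hypothetical helper (not part of MAKER): report whether a file exists
# and how large it is.
check_index() {
    f="$1"
    if [ ! -e "$f" ]; then
        echo "missing: $f"
        return 1
    fi
    # wc -c counts bytes; tr strips the padding some wc versions add.
    size=$(wc -c < "$f" | tr -d ' ')
    echo "present ($size bytes): $f"
}

# Hypothetical helper: show free space (in 1K blocks) for the given
# directories, e.g. the working directory and /tmp.
show_space() {
    df -k "$@"
}

# Example usage (substitute the real index path from the error output):
# show_space . /tmp
# check_index path/to/file.index
```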
>>
>>
>> From: Ramón Fallon <ramonfallon at gmail.com>
>> Date: Thursday, 7 March, 2013 9:40 AM
>>
>> To: Carson Holt <carson.holt at oicr.on.ca>
>> Cc: "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
>> Subject: Re: [maker-devel] thread terminated, causing all processes to fail
>>
>> Hi Carson,
>>
>> I'm sending you a zip of the text file of my repeated maker session, this
>> time having deleted the mpi_blastdb dir and with the -a flag added to the
>> "mpiexec -n 8 maker -debug" command line.
>>
>> Cheers / Ramón.
>>
>>
>> On Wed, Mar 6, 2013 at 7:49 PM, Ramón Fallon <ramonfallon at gmail.com> wrote:
>>> OK, will do.
>>>
>>> Will get back to you tomorrow on it.
>>>
>>> Many thanks!
>>>
>>>
>>> On Wed, Mar 6, 2013 at 7:22 PM, Carson Holt <Carson.Holt at oicr.on.ca> wrote:
>>>> Could you delete your ../*maker.output/mpi_blastdb directory, and then when
>>>> rerunning maker, run with the -a flag?
>>>>
>>>> Thanks,
>>>> Carson
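
The cleanup step requested above could look like the sketch below. The function clear_mpi_blastdb is a hypothetical wrapper, not a MAKER command; it refuses to touch any directory whose name does not end in .maker.output, as a guard against a mistyped path. The commented rerun line reuses the "mpiexec -n 8 maker -debug" command quoted elsewhere in this thread, and the position of -a on it is an assumption.

```shell
# Hypothetical helper (not part of MAKER): remove the mpi_blastdb directory
# inside a MAKER output directory so it is rebuilt on the next run.
clear_mpi_blastdb() {
    outdir="${1%/}"    # drop one trailing slash, if present
    case "$outdir" in
        *.maker.output) ;;
        *)
            echo "refusing: $outdir does not end in .maker.output" >&2
            return 1
            ;;
    esac
    rm -rf -- "$outdir/mpi_blastdb"
}

# Example usage (directory name taken from the path quoted in this thread):
# clear_mpi_blastdb sca29310_8.maker.output
# mpiexec -n 8 maker -debug -a
```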
>>>>
>>>>
>>>> From: Ramón Fallon <ramonfallon at gmail.com>
>>>> Date: Wednesday, 6 March, 2013 1:15 PM
>>>> To: Carson Holt <carson.holt at oicr.on.ca>
>>>> Cc: "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
>>>>
>>>> Subject: Re: thread terminated, causing all processes to fail
>>>>
>>>> OK great, here goes .. many thanks!
>>>>
>>>>
>>>>
>>>> On Wed, Mar 6, 2013 at 7:04 PM, Carson Holt <Carson.Holt at oicr.on.ca> wrote:
>>>>> If you do reply all to this message, I should get the attachment. It will
>>>>> be stripped from the one going to the list though.
>>>>>
>>>>> Thanks,
>>>>> Carson
>>>>>
>>>>>
>>>>>
>>>>> From: Ramón Fallon <ramonfallon at gmail.com>
>>>>> Date: Wednesday, 6 March, 2013 12:57 PM
>>>>> To: <maker-devel at yandell-lab.org>
>>>>> Subject: Re: thread terminated, causing all processes to fail
>>>>>
>>>>> Hi,
>>>>>
>>>>> Many thanks for your quick reply and hint.
>>>>>
>>>>> Yes, you're right .. further up there is indeed
>>>>>
>>>>> Calling FastaDB::new at /opt/src/maker_svn/bin/../lib/FastaSeq.pm line 148
>>>>> thread 1.
>>>>> Thread 1 terminated abnormally: ERROR: Could not reestablish DB to thaw
>>>>> FastaSeq for Storable
>>>>> --> rank=5, hostname=fatnode, at /opt/src/maker_svn/bin/maker line 1457
>>>>> thread 1.
>>>>>
>>>>> I run a "script" session and have maker on -debug so I have everything in
>>>>> one file. Would you prefer to have it attached to a post to this mailing
>>>>> list (if it accepts txt attachments)?
>>>>>
>>>>> Cheers.
>>>>>
>>>>>
>>>>> On Wed, Mar 6, 2013 at 6:34 PM, Ramón Fallon <ramonfallon at gmail.com>
>>>>> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I'm using the maker_svn rev 995 version and hand-compiled MPICH2 on a
>>>>>> single multicore machine.
>>>>>>
>>>>>> I've successfully run the dpp_contig.fasta (MPI/8 processes) example but
>>>>>> am having trouble with larger contigs fasta files of my own, which are
>>>>>> well formed.
>>>>>>
>>>>>> I've run into a problem whereby an mpiexec run of 8 processes will stop
>>>>>> due to a perl-thread related problem which says
>>>>>>
>>>>>> FATAL: Thread terminated, causing all processes to fail
>>>>>>
>>>>>> this corresponds to line 924 in the maker executable (which is for the
>>>>>> secondary/worker threads), and is the result of a test on !$thr OR'd with
>>>>>> !$thr->is_running, so clearly one of these is failing.
>>>>>>
>>>>>> $thr itself is a threads->new(\&$node_thread, $gdbfile). Despite being a
>>>>>> programmer, I've only recently started to look at the code and have not
>>>>>> got the hang of the parallelisation setup here, though I gather the
>>>>>> master must use threads to initially generate the parallel instances
>>>>>> which then use the message passing. Of course threads don't have message
>>>>>> passing ability, so I guess something clever is going on and will take
>>>>>> some time for me to understand.
>>>>>>
>>>>>> Clearly, however, it has worked before on dpp_contig, so it may be that
>>>>>> something is wrong with my datafile or the way I am carrying out the
>>>>>> analysis.
>>>>>>
>>>>>> Any clues that can be put my way are welcome.
>>>>>>
>>>>>> Thank you!
>>>>>
>>>>
>>>
>>
>