[maker-devel] Further split genome questions
Jeanne Wilbrandt
j.wilbrandt at zfmk.de
Wed Aug 6 09:33:07 MDT 2014
We are using MPI as well; each of the 20 parts gets assigned 4 threads. Our admin reports,
however, that the processes seem to spawn more threads than they are allowed. It is
not BLAST (which is set to 1 cpu in the opts.ctl). Do you have a suggestion as to why?
If I start the jobs in the same directory, how can I make sure they write to the same
directory (which, I think, is required to put the pieces together in the end)? Does
-basename take paths?
On Wed, 6 Aug 2014 15:12:50 +0000
Carson Holt <carsonhh at gmail.com> wrote:
>I think the freezing is because you are starting too many simultaneous jobs. You should
>try and use MPI to parallelize instead. The concurrent job way of doing things can
>start to cause problems if you are running 10 or more jobs in the same directory. You
>could try splitting them into different directories.
>
>--Carson
>
>Sent from my iPhone
>
>> On Aug 6, 2014, at 9:01 AM, "Jeanne Wilbrandt" <j.wilbrandt at zfmk.de> wrote:
>>
>>
>> aha, so this explains that.
>> Daniel, the average is 5930.37 bp, but ranging from ~ 50 to more than 60,000, roughly
>> half of the sequences being shorter than 3,000 bp.
>>
>> What do you think about this weird 'I am running but not really doing
>> anything'-behavior?
>>
>>
>> Thanks a lot!
>> Jeanne
>>
>>
>>
>> On Wed, 6 Aug 2014 14:16:52 +0000
>> Carson Holt <carsonhh at gmail.com> wrote:
>>> If you are starting and restarting, or running multiple jobs then the log can be
>>> partially rebuilt. On rebuild only the FINISHED entries are added. If there is a GFF3
>>> result file for the contig, then it is FINISHED. FASTA files will only exist for the
>>> contigs that have gene models. Small contigs will rarely contain models.
>>>
>>> --Carson
>>>
>>> Sent from my iPhone
>>>
>>>> On Aug 6, 2014, at 6:40 AM, "Jeanne Wilbrandt" <j.wilbrandt at zfmk.de> wrote:
>>>>
>>>>
>>>> Hi Carson,
>>>>
>>>> I ran into more conspicuous behavior running maker 2.31 on a genome which is split into
>>>> 20 parts, using the -g flag and the same basename.
>>>> Most of the jobs ran simultaneously on the same node, 17 seemed to finish normally, while
>>>> the remaining three seemed to be stalled and produced 0B of output. Do you have any
>>>> suggestion why this is happening?
>>>>
>>>> After I stopped these stalled jobs, I checked the index.log and found that of 38,384
>>>> mentioned scaffolds, 154 appear only once in the log. The surprise is that 2/3 of these
>>>> only appear as FINISHED (the rest only started). There are no models for these 'finished'
>>>> scaffolds stored in the .db and they are distributed over all parts of the genome (i.e.,
>>>> each of the 20 jobs contained scaffolds that 'did not start' but 'finished').
>>>> Should this be an issue of concern?
>>>> It might be an NFS lock problem, as NFS is heavily loaded, but the NFS files look good, so
>>>> we suspect something fishy going on...
>>>>
>>>> Hope you can help,
>>>> best wishes,
>>>> Jeanne Wilbrandt
>>>>
>>>> zmb // ZFMK // University of Bonn
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> maker-devel mailing list
>>>> maker-devel at box290.bluehost.com
>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>
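
The check described in the earliest quoted message (which scaffolds appear only once in the index.log, and with which status) could be scripted roughly as follows. This is only a sketch: it assumes the log is tab-separated with the scaffold name in the first column and the status (e.g. STARTED, FINISHED) in the last column, and the function names are hypothetical helpers, not part of MAKER.

```python
from collections import defaultdict


def statuses_by_scaffold(lines):
    """Map each scaffold name to the statuses logged for it, in file order.

    Assumption: tab-separated lines, name in the first field, status in the last.
    """
    seen = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        seen[fields[0]].append(fields[-1])
    return seen


def single_entry_scaffolds(lines):
    """Return {status: [scaffold, ...]} for scaffolds that appear exactly once."""
    result = defaultdict(list)
    for name, statuses in statuses_by_scaffold(lines).items():
        if len(statuses) == 1:
            result[statuses[0]].append(name)
    return dict(result)
```

Passing an open file handle for the index log to `single_entry_scaffolds` gives the single-entry scaffolds grouped by status. Given the explanation quoted above (only FINISHED entries are re-added when the log is rebuilt), a group of FINISHED-only scaffolds would be expected after restarts, whereas STARTED-only scaffolds are the ones that may need a rerun.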