[maker-devel] Further split genome questions

Jeanne Wilbrandt j.wilbrandt at zfmk.de
Wed Aug 6 09:33:07 MDT 2014



We are using MPI as well; each of the 20 parts is assigned 4 threads. Our admin reports,
however, that the processes seem to spawn more threads than they are allowed. It is
not BLAST (which is set to 1 CPU in the opts.ctl). Do you have a suggestion why?

If I start the jobs in the same directory, how can I make sure they write to the same
directory (which, I think, is required to put the pieces together in the end)? Does
-basename take paths?


On Wed, 6 Aug 2014 15:12:50 +0000
 Carson Holt <carsonhh at gmail.com> wrote:
>I think the freezing is because you are starting too many simultaneous jobs.  You should
>try and use MPI to parallelize instead.  The concurrent job way of doing things can
>start to cause problems if you are running 10 or more jobs in the same directory. You
>could try splitting them into different directories.
>
>--Carson
>
>Sent from my iPhone
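
For the "different directories" suggestion above, here is a minimal sketch of how the per-part jobs could be laid out. It only builds a plan and runs nothing. The directory names (part_01, ...) and the FASTA file names are hypothetical, and only the -g flag mentioned in this thread is used. It assumes the control files would also have to be available in each working directory.

```python
from pathlib import Path


def plan_jobs(parts, work_root):
    """Return (working_directory, command) pairs, one directory per genome part."""
    jobs = []
    for i, fasta in enumerate(parts, start=1):
        # Hypothetical naming scheme: one directory per part.
        job_dir = Path(work_root) / f"part_{i:02d}"
        # -g is the genome flag mentioned in this thread; the absolute path
        # keeps the command valid when it is started from job_dir.
        cmd = ["maker", "-g", str(Path(fasta).resolve())]
        jobs.append((job_dir, cmd))
    return jobs


if __name__ == "__main__":
    # Hypothetical input names; this only prints the plan.
    parts = [f"genome_part{i}.fa" for i in range(1, 21)]
    for job_dir, cmd in plan_jobs(parts, "maker_runs"):
        print(job_dir, " ".join(cmd))
```

Each job would then be started with its own directory as the working directory, so that concurrent jobs do not share lock or log files. The results end up in separate output directories and would have to be combined in a later step.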
>
>> On Aug 6, 2014, at 9:01 AM, "Jeanne Wilbrandt" <j.wilbrandt at zfmk.de> wrote:
>> 
>> 
>> Aha, so this explains that.
>> Daniel, the average length is 5930.37 bp, ranging from ~50 to more than 60,000 bp;
>> roughly half of the sequences are shorter than 3,000 bp.
>> 
>> What do you think about this weird 'I am running but not really doing
>> anything' behavior?
>> 
>> 
>> Thanks a lot!
>> Jeanne
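
The length figures quoted above (mean, range, share of sequences under 3,000 bp) can be recomputed with a short script. This is a minimal sketch assuming a plain multi-FASTA input. The function names and the toy records in the test data are made up for illustration.

```python
def fasta_lengths(lines):
    """Yield the sequence length of each record in FASTA-formatted lines."""
    length = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def summarize(lengths, cutoff=3000):
    """Return count, mean, min, max and the fraction of lengths below cutoff."""
    lengths = list(lengths)
    if not lengths:
        raise ValueError("no sequences found")
    n = len(lengths)
    return {
        "count": n,
        "mean": sum(lengths) / n,
        "min": min(lengths),
        "max": max(lengths),
        "fraction_below_cutoff": sum(1 for x in lengths if x < cutoff) / n,
    }
```

For a real file, something like `summarize(fasta_lengths(open("genome.fa")))` would do, where genome.fa is a placeholder name.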
>> 
>> 
>> 
>> On Wed, 6 Aug 2014 14:16:52 +0000
>> Carson Holt <carsonhh at gmail.com> wrote:
>>> If you are starting and restarting, or running multiple jobs then the log can be
>>> partially rebuilt.  On rebuild only the FINISHED entries are added.  If there is a
>>> GFF3 result file for the contig, then it is FINISHED. FASTA files will only exist
>>> for the contigs that have gene models. Small contigs will rarely contain models.
>>> 
>>> --Carson
>>> 
>>> Sent from my iPhone
>>> 
>>>> On Aug 6, 2014, at 6:40 AM, "Jeanne Wilbrandt" <j.wilbrandt at zfmk.de> wrote:
>>>> 
>>>> 
>>>> Hi Carson, 
>>>> 
>>>> I ran into more odd behavior running maker 2.31 on a genome which is split into
>>>> 20 parts, using the -g flag and the same basename.
>>>> Most of the jobs ran simultaneously on the same node; 17 seemed to finish normally,
>>>> while the remaining three seemed to be stalled and produced 0 B of output. Do you
>>>> have any suggestion why this is happening?
>>>> 
>>>> After I stopped these stalled jobs, I checked the index.log and found that of 38,384
>>>> mentioned scaffolds, 154 appear only once in the log. The surprise is that 2/3 of
>>>> these appear only as FINISHED (the rest only as started). There are no models for
>>>> these 'finished' scaffolds stored in the .db and they are distributed over all parts
>>>> of the genome (i.e., each of the 20 jobs contained scaffolds that 'did not start'
>>>> but 'finished').
>>>> Should this be an issue of concern?
>>>> It might be an NFS lock problem, as NFS is heavily loaded, but the NFS files look
>>>> good, so we suspect something fishy going on...
>>>> 
>>>> Hope you can help,
>>>> best wishes,
>>>> Jeanne Wilbrandt
>>>> 
>>>> zmb // ZFMK // University of Bonn
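
The check described in this message (scaffolds that appear only once in the index log, grouped by the status they appear with) could be scripted roughly as follows. This is a sketch under an assumption that should be verified against the actual file: that each log line is tab-separated, with the scaffold name in the first column and the status in the last. Apart from FINISHED, the status strings in the test data are assumptions too.

```python
from collections import defaultdict


def statuses_by_scaffold(lines):
    """Map each scaffold name to the list of statuses logged for it."""
    seen = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Assumed layout: name <TAB> path <TAB> status; skip anything else.
        if len(fields) < 3:
            continue
        seen[fields[0]].append(fields[-1])
    return seen


def single_entry_scaffolds(seen):
    """Group scaffolds that appear exactly once by their only status."""
    groups = defaultdict(list)
    for name, statuses in seen.items():
        if len(statuses) == 1:
            groups[statuses[0]].append(name)
    return dict(groups)
```

Calling `single_entry_scaffolds(statuses_by_scaffold(open(path)))` on the log would then list the scaffolds that have only a FINISHED entry separately from those with only a start entry.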
>>>> 
>>>> 
>>>> 
>> 




