[maker-devel] Some errors reported by Maker2

Carson Holt carsonhh at gmail.com
Tue Sep 5 16:08:28 MDT 2017


max_dna_len is the window size for keeping data in RAM. Smaller values do not split genes, but values lower than 100kb can cause problems (if a single gene model spans 3 or more windows, it triggers a weird failure).
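
To see how window size relates to gene length, here is a minimal Python sketch that counts how many windows a gene would touch. It assumes the sequence is cut into non-overlapping, fixed-size windows of max_dna_len bases, which is a simplification; MAKER's actual chunking logic is not described in this thread. The function name and the gene coordinates are hypothetical.

```python
def windows_spanned(start, end, window):
    """Count the fixed-size windows touched by a feature.

    Coordinates are 0-based and inclusive. Assumes non-overlapping
    windows of `window` bases; this is a simplification and not
    MAKER's own chunking code.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if start < 0 or end < start:
        raise ValueError("need 0 <= start <= end")
    # Index of the window holding each endpoint; the gene touches
    # every window between them, inclusive.
    return end // window - start // window + 1


if __name__ == "__main__":
    # Hypothetical gene of about 215 kb on a long scaffold.
    gene_start, gene_end = 95_000, 310_000
    for max_dna_len in (50_000, 100_000, 1_000_000):
        n = windows_spanned(gene_start, gene_end, max_dna_len)
        print(f"max_dna_len={max_dna_len}: gene touches {n} window(s)")
```

Under these assumptions, a long gene touches 3 or more windows once the window is small relative to the gene, which is the case described above for values below 100kb.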

—Carson



> On Sep 5, 2017, at 4:04 PM, Quanwei Zhang <qwzhang0601 at gmail.com> wrote:
> 
> Dear Carson:
> 
> Thanks. I wonder whether a smaller "max_dna_len" will split longer scaffolds. I set max_dna_len to 1Mb because there are quite a few long scaffolds (e.g., the longest one is about 100Mb). Could you explain whether a smaller "max_dna_len" will decrease the quality of the annotation (e.g., by splitting genes within the same scaffold)?
> 
> 
> Best
> Quanwei  
> 
> 2017-09-05 17:48 GMT-04:00 Carson Holt <carsonhh at gmail.com>:
> You ran out of memory. You probably set max_dna_len too high for the machines you are using. There is a note in the maker_opts.ctl file that tells you that this value affects memory usage.
> 
> So you can either set it lower, or if running under MPI, use fewer CPUs per node (how you do this is MPI flavor dependent, but some flavors let you do this by setting process count lower combined with the round robin option).
> 
> —Carson
> 
> 
> 
>> On Sep 5, 2017, at 2:24 PM, Quanwei Zhang <qwzhang0601 at gmail.com> wrote:
>> 
>> Hello:
>> 
>> We are doing genome annotation for a new rodent species. We have successfully finished training the ab initio gene predictors with the following parameters: split_hit=40000, max_dna_len=1000000, and 99k mammalian Swiss-Prot protein sequences as evidence.
>> 
>> But when I used the trained models to annotate the genome, I got the following kinds of errors (shown below). I used the same parameters as for training, except that I added 340k rodent TrEMBL protein sequences as protein evidence (i.e., I used both the 99k mammalian Swiss-Prot protein sequences and the 340k rodent TrEMBL protein sequences).
>> 
>> I am running the annotation on a cluster and started multiple MAKER processes in the same directory (I tried to use MPI but ran into some problems).
>> 
>> Do you have any suggestions? Many thanks
>> #some kinds of errors
>> open3: fork failed: Cannot allocate memory at /gs/gsfs0/hpc01/apps/MAKER/2.31.9/bin/../lib/Widget/blastx.pm line 40.
>> --> rank=NA, hostname=n520
>> ERROR: Failed while doing blastx of proteins
>> ERROR: Chunk failed at level:8, tier_type:3
>> FAILED CONTIG:Contig2
>> 
>> 
>> setting up GFF3 output and fasta chunks
>> doing repeat masking
>> Can't kill a non-numeric process ID at /gs/gsfs0/hpc01/apps/MAKER/2.31.9/bin/../lib/File/NFSLock.pm line 1050.
>> --> rank=NA, hostname=n513
>> ERROR: Failed while doing repeat masking
>> ERROR: Chunk failed at level:0, tier_type:1
>> FAILED CONTIG:Contig12378
>> 
>> 
>> Best
>> Quanwei
> 
> 
