<div dir="ltr"><div><div><div>Dear Carson:<br><br></div>Thanks. I wonder whether a smaller "max_dna_len" will split longer scaffolds. I set max_dna_len to 1 Mb because there are quite a few long scaffolds (e.g., the longest one is about 100 Mb). Could you explain whether a smaller "max_dna_len" will decrease the quality of the annotation (e.g., by splitting genes that lie on the same scaffold)? <br></div><div><br></div><div><br></div>Best<br></div>Quanwei </div><div class="gmail_extra"><br><div class="gmail_quote">2017-09-05 17:48 GMT-04:00 Carson Holt <span dir="ltr">&lt;<a href="mailto:carsonhh@gmail.com" target="_blank">carsonhh@gmail.com</a>&gt;</span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word">You ran out of memory. You probably set max_dna_len too high for the machines you are using. There is a note in the maker_opts.ctl file that tells you that this value affects memory usage.<div><br></div><div>So you can either set it lower or, if running under MPI, use fewer CPUs per node (how you do this depends on the MPI flavor, but some flavors let you do it by setting a lower process count combined with the round-robin option).</div><span class="HOEnZb"><font color="#888888"><div><br></div></font></span><div><span class="HOEnZb"><font color="#888888">—Carson</font></span><div><div class="h5"><br><div><br></div><div><br><div><blockquote type="cite"><div>On Sep 5, 2017, at 2:24 PM, Quanwei Zhang &lt;<a href="mailto:qwzhang0601@gmail.com" target="_blank">qwzhang0601@gmail.com</a>&gt; wrote:</div><br class="m_-7513552749084071906Apple-interchange-newline"><div><div dir="ltr"><div>Hello:<br></div><div><br></div><div>We are doing genome annotation for a new rodent species. We have successfully finished training the ab initio gene predictors with the following settings: split_hit=40000, max_dna_len=1000000, and 99k mammalian Swiss-Prot protein sequences as evidence. 
<br></div><div><br></div><div>But when I used the trained model for the genome annotation, I got the following kinds of errors (shown in red). I used the same parameters as for training, except that I added 340k rodent TrEMBL protein sequences as protein evidence (i.e., I used both the 99k mammalian Swiss-Prot protein sequences and the 340k rodent TrEMBL protein sequences). <br></div><div><br></div><div>I am running the annotation on a cluster and started multiple MAKER processes in the same directory (I had tried to use MPI but ran into some problems). <br></div><div><br></div><div>Do you have any suggestions? Many thanks.<br></div><div>
#some kinds of errors<br></div><div><span style="color:rgb(255,0,0)">open3: fork failed: Cannot allocate memory at /gs/gsfs0/hpc01/apps/MAKER/2.31.9/bin/../lib/Widget/blastx.pm line 40.<br>--&gt; rank=NA, hostname=n520<br>ERROR: Failed while doing blastx of proteins<br>ERROR: Chunk failed at level:8, tier_type:3<br>FAILED CONTIG:Contig2<br><br><br>setting up GFF3 output and fasta chunks<br>doing repeat masking<br>Can't kill a non-numeric process ID at /gs/gsfs0/hpc01/apps/MAKER/2.31.9/bin/../lib/File/NFSLock.pm line 1050.<br>--&gt; rank=NA, hostname=n513<br>ERROR: Failed while doing repeat masking<br>ERROR: Chunk failed at level:0, tier_type:1<br>FAILED CONTIG:Contig12378</span></div><div><span style="color:rgb(255,0,0)"><br></span></div><div><br></div><div><span>Best</span></div><div><span style="color:rgb(255,0,0)"><span>Quanwei</span><br></span></div></div>
</div></blockquote></div><br></div></div></div></div></div></blockquote></div><br></div>
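<div dir="ltr">As a rough illustration of the max_dna_len trade-off discussed in this thread, the sketch below counts how many pieces a scaffold would be processed in for a few candidate settings. It assumes that a contig is simply cut into consecutive windows of at most max_dna_len bases, which is how the maker_opts.ctl comment describes the option (a length for dividing up contigs into chunks); MAKER's actual handling of chunk boundaries may differ, and the sketch says nothing about annotation quality. The function name chunk_count and the candidate values 300000 and 100000 are hypothetical and not taken from MAKER or from the messages above.</div>

```python
def chunk_count(scaffold_len: int, max_dna_len: int) -> int:
    """Number of windows of at most max_dna_len bases needed to cover a scaffold.

    Assumption: plain ceiling division, with no overlap between windows.
    """
    if not (scaffold_len > 0 and max_dna_len > 0):
        raise ValueError("lengths must be positive")
    # Integer ceiling division avoids floating-point rounding on large genomes.
    return -(-scaffold_len // max_dna_len)


if __name__ == "__main__":
    longest_scaffold = 100_000_000  # about 100 Mb, as mentioned in the thread
    for max_dna_len in (1_000_000, 300_000, 100_000):
        pieces = chunk_count(longest_scaffold, max_dna_len)
        print(f"max_dna_len={max_dna_len}: {pieces} chunks")
```

<div dir="ltr">Under this assumption, a smaller max_dna_len means more, smaller chunks per scaffold, which matches the advice to lower the value when a node runs out of memory.</div>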