<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">max_dna_len is the window size for keeping data in RAM. Smaller values do not split genes, but values lower than 100kb can create issues (if a single gene model spans 3 or more windows, it causes a weird failure).<div class=""><br class=""></div><div class="">—Carson<br class=""><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Sep 5, 2017, at 4:04 PM, Quanwei Zhang &lt;<a href="mailto:qwzhang0601@gmail.com" class="">qwzhang0601@gmail.com</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><div class=""><div class=""><div class="">Dear Carson:<br class=""><br class=""></div>Thanks. I wonder whether a smaller "max_dna_len" will split longer scaffolds. I set max_dna_len to 1Mb because there are quite a few long scaffolds (e.g., the longest one is about 100Mb). Could you explain whether a smaller "max_dna_len" will decrease the quality of the annotation (e.g., split some genes in the same scaffold)? <br class=""></div><div class=""><br class=""></div><div class=""><br class=""></div>Best<br class=""></div>Quanwei   </div><div class="gmail_extra"><br class=""><div class="gmail_quote">2017-09-05 17:48 GMT-04:00 Carson Holt <span dir="ltr" class="">&lt;<a href="mailto:carsonhh@gmail.com" target="_blank" class="">carsonhh@gmail.com</a>&gt;</span>:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word" class="">You ran out of memory. You probably set max_dna_len too high for the machines you are using. 
There is a note in the maker_opts.ctl file that tells you that this value affects memory usage.<div class=""><br class=""></div><div class="">So you can either set it lower or, if running under MPI, use fewer CPUs per node (how you do this depends on the MPI flavor, but some flavors let you do it by setting a lower process count combined with the round-robin option).</div><span class="HOEnZb"><font color="#888888" class=""><div class=""><br class=""></div></font></span><div class=""><span class="HOEnZb"><font color="#888888" class="">—Carson</font></span><div class=""><div class="h5"><br class=""><div class=""><br class=""></div><div class=""><br class=""><div class=""><blockquote type="cite" class=""><div class="">On Sep 5, 2017, at 2:24 PM, Quanwei Zhang &lt;<a href="mailto:qwzhang0601@gmail.com" target="_blank" class="">qwzhang0601@gmail.com</a>&gt; wrote:</div><br class="m_-7513552749084071906Apple-interchange-newline"><div class=""><div dir="ltr" class=""><div class="">Hello:<br class=""></div><div class=""><br class=""></div><div class="">We are doing genome annotation for a new rodent species. We have successfully finished training the ab initio gene predictors with the following settings: split_hit=40000, max_dna_len=1000000, and 99k mammalian Swiss-Prot protein sequences as evidence. <br class=""></div><div class=""><br class=""></div><div class="">But when I used the trained models to annotate the genome, I got the following kinds of errors (shown in red). I used the same parameters as for training, except for the addition of 340k rodent TrEMBL protein sequences as protein evidence (i.e., I use both the 99k mammalian Swiss-Prot protein sequences and the 340k rodent TrEMBL protein sequences). <br class=""></div><div class=""><br class=""></div><div class="">I am doing the annotation on a cluster and started multiple MAKER instances in the same directory (I had tried to use MPI but ran into some problems).  
<br class=""></div><div class=""><br class=""></div><div class="">Do you have any suggestions? Many thanks<br class=""></div><div class="">


#some kinds of errors<br class=""></div><div class=""><span style="color:rgb(255,0,0)" class="">open3: fork failed: Cannot allocate memory at /gs/gsfs0/hpc01/apps/MAKER/2.31.9/bin/../lib/Widget/blastx.pm line 40.<br class="">--&gt; rank=NA, hostname=n520<br class="">ERROR: Failed while doing blastx of proteins<br class="">ERROR: Chunk failed at level:8, tier_type:3<br class="">FAILED CONTIG:Contig2<br class=""><br class=""><br class="">setting up GFF3 output and fasta chunks<br class="">doing repeat masking<br class="">Can't kill a non-numeric process ID at /gs/gsfs0/hpc01/apps/MAKER/2.31.9/bin/../lib/File/NFSLock.pm line 1050.<br class="">--&gt; rank=NA, hostname=n513<br class="">ERROR: Failed while doing repeat masking<br class="">ERROR: Chunk failed at level:0, tier_type:1<br class="">FAILED CONTIG:Contig12378</span></div><div class=""><span style="color:rgb(255,0,0)" class=""><br class=""></span></div><div class=""><br class=""></div><div class=""><span class="">Best</span></div><div class=""><span style="color:rgb(255,0,0)" class=""><span class="">Quanwei</span><br class=""></span></div></div>

</div></blockquote></div><br class=""></div></div></div></div></div></blockquote></div><br class=""></div>

</div></blockquote></div><br class=""></div></div></body></html>