<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class="">Hi Panos,</div><div class=""><br class=""></div>ESTs and mRNA-seq assemblies will by their nature be partial. After a first round of training, you can run MAKER together with protein and EST evidence and the newly trained Augustus species file. Because MAKER gives hints to Augustus as it runs, the models it produces will be better than what you would get from running Augustus on its own. Then take these gene models and use them to retrain Augustus. This is the standard bootstrap retraining procedure, and it can be repeated as needed.<div class=""><br class=""></div><div class="">More info on bootstrap training is here (the info is for SNAP, but the procedure is similar for Augustus) -&gt; <span style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: 13px; background-color: rgb(255, 255, 255);" class=""> </span><a href="http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_GMOD_Online_Training_2014#Training_ab_initio_Gene_Predictors" class=""><font color="#6611cc" face="Arial, Helvetica, sans-serif" size="2" class=""><span style="cursor: pointer;" class="">http://weatherby.genetics.</span></font><font color="#6611cc" face="Arial, Helvetica, sans-serif" size="2" class=""><span style="cursor: pointer;" class=""><wbr class="">utah.edu/MAKER/wiki/index.php/<wbr class="">MAKER_Tutorial_for_GMOD_<wbr class="">Online_Training_2014#Training_<wbr class="">ab_initio_Gene_Predictors</span></font></a></div><div class="">Here is an excellent explanation of Augustus training -&gt; <a href="http://brie4.cshl.edu/pipermail/gmod-help/2012-June/001724.html" class=""><font color="#6611cc" face="Arial, Helvetica, sans-serif" size="2" class=""><span style="cursor: pointer;" class="">http://brie4.cshl.edu/</span></font><font color="#6611cc" 
face="Arial, Helvetica, sans-serif" size="2" class=""><span style="cursor: pointer;" class=""><wbr class="">pipermail/gmod-help/2012-June/<wbr class="">001724.html</span></font></a></div><div class="">And here is a script to convert SNAP training files to Augustus training files (MAKER comes with a tool that converts GFF3 for SNAP training, so just take its output and convert it for Augustus) -&gt; <a href="https://github.com/hyphaltip/genome-scripts/blob/master/gene_prediction/zff2augustus_gbk.pl" class=""><font color="#6611cc" face="Arial, Helvetica, sans-serif" size="2" class=""><span style="cursor: pointer;" class="">https://github.com/</span></font><font color="#6611cc" face="Arial, Helvetica, sans-serif" size="2" class=""><span style="cursor: pointer;" class=""><wbr class="">hyphaltip/genome-scripts/blob/<wbr class="">master/gene_prediction/<wbr class="">zff2augustus_gbk.pl</span></font></a></div><div class=""><br class=""></div><div class="">Finally, you can also manually edit the GFF3 file in Apollo (the legacy standalone version is easier to use), and then convert that file for bootstrap training.</div><div class=""><br class=""></div><div class="">--Carson</div><div class=""><br class=""></div><div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Mar 24, 2015, at 6:24 AM, Panos Ioannidis &lt;<a href="mailto:panos.ioannidis@gmail.com" class="">panos.ioannidis@gmail.com</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><div class=""><div class=""><div class="">Hi Xabier,<br class=""><br class=""></div><div class="">Thanks for your quick reply!<br class=""><br class="">No, I haven't used WebAugustus, but I just checked it out and it looks like my training set is too big (~300 Mbp), so I can't even upload it!<br class=""></div><br class=""></div>Anyway, I prefer to train it locally because I have better control over each step. 
Also, I have already done the entire training procedure with fewer genes, but didn't get good gene-level sensitivity (~5%). So now I'm trying to repeat it using more of my scaffolds, but it appears that I get a lot more incomplete models from exonerate (run through MAKER).<br class=""><br class=""></div>P<br class=""><br class=""><br class=""></div><div class="gmail_extra"><br class=""><div class="gmail_quote">On Tue, Mar 24, 2015 at 1:06 PM, Xabier Vázquez Campos <span dir="ltr" class="">&lt;<a href="mailto:xvazquezc@gmail.com" target="_blank" class="">xvazquezc@gmail.com</a>&gt;</span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class=""><div class=""><div class="">Hi Panos,<br class=""><br class=""></div>Have you tried using WebAugustus for the (re)training? I found it very convenient for generating the models for Augustus.<br class=""><br class=""></div>Cheers,<br class=""></div><div class="gmail_extra"><br class=""><div class="gmail_quote"><div class=""><div class="h5">2015-03-24 19:29 GMT+11:00 Panos Ioannidis <span dir="ltr" class="">&lt;<a href="mailto:panos.ioannidis@gmail.com" target="_blank" class="">panos.ioannidis@gmail.com</a>&gt;</span>:<br class=""></div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class=""><div class="h5"><div dir="ltr" class=""><div class=""><div class=""><div class=""><div class="">Hello All,<br class=""><br class=""></div>I'm trying to retrain Augustus using EST data from the same species and realized that quite a few of the gene models I get based on EST data are incomplete (i.e. 
no start and/or stop codon).<br class=""><br class=""></div>Now, when I get to the "etraining" step in Augustus retraining (right after the time-consuming "optimize_augustus.pl" step), I get a warning for each gene that doesn't contain a start or stop codon.<br class=""><br class=""><span style="font-family:monospace,monospace" class="">.....<br class="">gene maker-scaffold4|size2210279-exonerate_est2genome-gene-20.1-mRNA-1 transcr. 1 in sequence scaffold4|size2210279_2021791-2044735: Initial exon does not begin with start codon but with acg<br class="">gene maker-scaffold4|size2210279-exonerate_est2genome-gene-20.2-mRNA-1 transcr. 1 in sequence scaffold4|size2210279_2045713-2064983: Terminal exon doesn't end in stop codon. Variable stopCodonExcludedFromCDS set right?<br class="">....</span><br class=""><br class=""></div>Does anyone know whether training is compromised by such incomplete gene models? Do you usually exclude them from the training set?<br class=""><br class=""></div>Oh, and by the way, the best guide to retraining Augustus is <a href="http://avrilomics.blogspot.ch/2013/04/training-augustus-gene-finding-software.html" target="_blank" class="">here</a>. The <a href="http://bioinf.uni-greifswald.de/augustus/binaries/retraining.html" target="_blank" class="">official</a> web page isn't bad, but it doesn't explain certain things in detail.<br class=""><div class=""><div class=""><br class=""></div><div class="">Thanks,<br class=""></div><div class="">Panos<br class=""><br class=""></div></div></div>
<br class=""></div></div>_______________________________________________<br class="">
maker-devel mailing list<br class="">
<a href="mailto:maker-devel@box290.bluehost.com" target="_blank" class="">maker-devel@box290.bluehost.com</a><br class="">
<a href="http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org" target="_blank" class="">http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org</a><br class="">
<br class=""></blockquote></div><span class="HOEnZb"><font color="#888888" class=""><br class=""><br clear="all" class=""><br class="">-- <br class=""><div class="">Xabier Vázquez Campos<br class=""><i class="">PhD Candidate</i><br class="">Water Research Centre<br class="">School of Civil and Environmental Engineering<br class="">
The University of New South Wales<br class="">Sydney NSW 2052 AUSTRALIA<br class=""></div>
</font></span></div>
</blockquote></div><br class=""></div>
</div></blockquote></div><br class=""></div></body></html>