Yes, thanks for re-sharing. <div><br></div><div>Maybe we should write this up as a clearer tutorial. I go back and forth on how to make this easier and more automated.</div><div><br></div><div>Jason <br><br>On Sunday, October 23, 2016, Xabier Vázquez-Campos <<a href="mailto:xvazquezc@gmail.com">xvazquezc@gmail.com</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">If it's of any help, I had these notes on my old protocol (before I started doing the training with BUSCO):<br><br><blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex" class="gmail_quote">For Augustus, we need the script "zff2augustus_gbk.pl". It takes the export.dna generated by fathom and produces a *.gb file to be used as the "training gene structure file" in a new training submission in WebAugustus. Remember to give the submission a new name, e.g. MYGENOME_v2, or Maker won't see the difference (same name):<br>    perl PATH/TO/SCRIPT/zff2augustus_gbk.pl > MYGENOME.train.gb<br></blockquote><div><br></div><div>As said, you could also do the training with BUSCO using the --long option. It has a dataset specific to arthropods. 
But if you have EST data, you'll probably do better with the other method, as it lets you supply the ESTs for more accurate training.<br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On 24 October 2016 at 10:25, Carson Holt <span dir="ltr"><<a href="javascript:_e(%7B%7D,'cvml','carsonhh@gmail.com');" target="_blank">carsonhh@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div><div>It’s unfortunate the archived GMOD post is gone, because I always used it for my own reference. If I remember right, the main point was that Jason Stajich wrote a tool to convert SNAP’s ZFF format to a GenBank format suitable for Augustus training. This meant you could run the maker2zff script that comes with MAKER, then use Jason’s tool to convert the result for Augustus training.</div><div><br></div><div>Tool to convert SNAP training ZFF to an Augustus training input file:</div><a href="https://github.com/hyphaltip/genome-scripts/blob/master/gene_prediction/zff2augustus_gbk.pl" target="_blank">https://github.com/hyphaltip/g<wbr>enome-scripts/blob/master/gene<wbr>_prediction/zff2augustus_gbk.<wbr>pl</a></div><div><br></div><div><br></div><div>Since the post is gone, you could use the documentation provided with his tool, together with a generic Augustus training guide like the following, to design a path forward:</div><div><a href="http://www.molecularevolution.org/molevolfiles/exercises/augustus/training.html" target="_blank">http://www.molecularevolution.<wbr>org/molevolfiles/exercises/aug<wbr>ustus/training.html</a></div><span><font color="#888888"><div><br></div><div>-- Carson</div></font></span><div><div><div><br></div><div><br><div><blockquote type="cite"><div>On Oct 12, 2016, at 3:44 AM, chebbi mohamed amine <<a href="javascript:_e(%7B%7D,'cvml','mohamed.amine.chebbi@univ-poitiers.fr');" target="_blank">mohamed.amine.chebbi@univ-poi<wbr>tiers.fr</a>> 
wrote:</div><br><div><div><div style="font-family:arial,helvetica,sans-serif;font-size:12pt"><div><div style="margin:0px"><span style="font-size:12.0pt;line-height:115%;font-family:'Times New Roman','serif';background:white">Thank you, Carson, for your quick response. Sorry, I have another question, this time about Augustus training. You previously posted on the mailing list a link to an explanation of the Augustus training steps: </span><span style="font-size:12.0pt;line-height:115%;font-family:'Times New Roman','serif';color:#7030a0"><a href="http://brie4.cshl.edu/pipermail/gmod-help/2012-June/001724.html" style="text-align:start;word-spacing:0px" target="_blank"><span style="color:#7030a0;border:none windowtext 1.0pt;padding:0cm;background:white;text-decoration:none">http://brie4.cshl.edu/piperma<wbr>il/gmod-help/2012-June/001724.<wbr>html</span></a></span><span style="font-size:12.0pt;line-height:115%;font-family:'Times New Roman','serif';background:white">. Unfortunately, the link doesn't work anymore. 
Otherwise, could you explain how to filter the GFF file produced by the first run of Maker to get the best full-length ORFs as a set of gene models for training Augustus?</span></div><div style="margin:0px"><span style="font-size:12.0pt;line-height:115%;font-family:'Times New Roman','serif';background:white"><br></span></div><div style="margin:0px"><span style="font-size:12.0pt;line-height:115%;font-family:'Times New Roman','serif';background:white">Best,</span></div><div style="margin:0px"><span style="font-size:12.0pt;line-height:115%;font-family:'Times New Roman','serif';background:white">Amine</span></div></div><div><br></div><hr><div><b>From: </b>"chebbi mohamed amine" <<a href="javascript:_e(%7B%7D,'cvml','mohamed.amine.chebbi@univ-poitiers.fr');" target="_blank">mohamed.amine.chebbi@univ-poi<wbr>tiers.fr</a>><br><b>To: </b>"Carson Holt" <<a href="javascript:_e(%7B%7D,'cvml','carsonhh@gmail.com');" target="_blank">carsonhh@gmail.com</a>><br><b>Cc: </b><a href="javascript:_e(%7B%7D,'cvml','maker-devel@yandell-lab.org');" target="_blank">maker-devel@yandell-lab.org</a><br><b>Sent: </b>Wednesday, 12 October 2016 11:44:21<br><b>Subject: </b>Re: [maker-devel] Combining and merging two Maker annotation gff files ?<br></div><div><br></div><div><div style="font-family:arial,helvetica,sans-serif;font-size:12pt"><div><div style="margin:0px"><span style="font-size:12.0pt;line-height:115%;font-family:'Times New Roman','serif';background:white">Thank you, Carson, for your quick response. Sorry, I have another question, this time about Augustus training. 
You previously posted on the mailing list a link to an explanation of the Augustus training steps: </span><span style="font-size:12.0pt;line-height:115%;font-family:'Times New Roman','serif';color:#7030a0"><a href="http://brie4.cshl.edu/pipermail/gmod-help/2012-June/001724.html" style="text-align:start;word-spacing:0px" target="_blank"><span style="color:#7030a0;border:none windowtext 1.0pt;padding:0cm;background:white;text-decoration:none">http://brie4.cshl.edu/piperma<wbr>il/gmod-help/2012-June/001724.<wbr>html</span></a></span><span style="font-size:12.0pt;line-height:115%;font-family:'Times New Roman','serif';background:white">. Unfortunately, the link doesn't work anymore. Otherwise, could you explain how to filter the GFF file produced by the first run of Maker to get the best full-length ORFs as a set of gene models for training Augustus?</span></div><div style="margin:0px"><span style="font-size:12.0pt;line-height:115%;font-family:'Times New Roman','serif';background:white"><br></span></div></div><br><hr><div><b>From: </b>"Carson Holt" <<a href="javascript:_e(%7B%7D,'cvml','carsonhh@gmail.com');" target="_blank">carsonhh@gmail.com</a>><br><b>To: </b>"Mohamed Amine CHEBBI" <<a href="javascript:_e(%7B%7D,'cvml','mohamed.amine.chebbi@univ-poitiers.fr');" target="_blank">mohamed.amine.chebbi@univ-poi<wbr>tiers.fr</a>><br><b>Cc: </b><a href="javascript:_e(%7B%7D,'cvml','maker-devel@yandell-lab.org');" target="_blank">maker-devel@yandell-lab.org</a><br><b>Sent: </b>Tuesday, 11 October 2016 22:05:50<br><b>Subject: </b>Re: [maker-devel] Combining and merging two Maker annotation gff files ?<br></div><br><div>Masking affects not just the gene models but also evidence alignment, and thus scoring. 
So merging in this way would not make much sense: the second, less-masked set would always score better because the reduced masking permits more evidence alignments (not necessarily real ones, but drawn in by repeats).<div><br></div><div>The result would be that, in any attempted merge, genes from the second set would almost always score higher.</div><div><br></div><div>-- Carson</div><div><br><div><br></div><div><br><div><blockquote><div>On Oct 10, 2016, at 3:43 AM, Mohamed Amine CHEBBI <<a href="javascript:_e(%7B%7D,'cvml','mohamed.amine.chebbi@univ-poitiers.fr');" target="_blank">mohamed.amine.chebbi@univ-poi<wbr>tiers.fr</a>> wrote:</div><div><p class="MsoNormal" style="margin:0cm 0cm 10pt;line-height:normal;font-size:11pt;font-family:Calibri,sans-serif;font-style:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:#ffffff"><span style="font-size:12pt;color:#222222;background-color:white;background-position:initial initial;background-repeat:initial initial" lang="EN-US">Hi!<br><br>I’m using the latest version of Maker2 to annotate an arthropod genome. First, I ran RepeatModeler to create the rmlib for Maker, then I followed two independent annotation strategies on the same assembly:<br>1- Passing to Maker all the repeats collected by RepeatModeler (repeats identified in Repbase + unknown models).<br>2- </span><span style="font-size:12pt" lang="EN-US">Passing to Maker only the identified repeats.</span></p><p class="MsoNormal" style="margin:0cm 0cm 10pt;line-height:normal;font-size:11pt;font-family:Calibri,sans-serif;font-style:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:#ffffff"><span style="font-size:12pt" lang="EN-US">Both annotations ran successfully. 
The first annotation gives me 19048 genes versus 22931 from the second one. Now I'm looking for a way to merge the two annotation GFF files without <span style="text-decoration:underline">doing a re-annotation</span>, keeping the best-supported, non-redundant gene models.</span></p><p class="MsoNormal" style="margin:0cm 0cm 10pt;line-height:normal;font-size:11pt;font-family:Calibri,sans-serif;font-style:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:#ffffff"><span style="font-size:12pt" lang="EN-US">So, do you think that configuring the Maker options as below could resolve this issue?<br>maker_gff=1-mask-all.gff,2-mask-onlyKnown.gff #MAKER derived GFF3 file<br>est_pass=1 #use ESTs in maker_gff: 1 = yes, 0 = no<br>altest_pass=0 #use alternate organism ESTs in maker_gff: 1 = yes, 0 = no<br>protein_pass=1 #use protein alignments in maker_gff: 1 = yes, 0 = no<br>rm_pass=1 #use repeats in maker_gff: 1 = yes, 0 = no<br>model_pass=0 #use gene models in maker_gff: 1 = yes, 0 = no<br>pred_pass=1 #use ab-initio predictions in maker_gff: 1 = yes, 0 = no<br>other_pass=0 #passthrough anything else in maker_gff: 1 = yes, 0 = no</span></p><span 
style="font-family:Helvetica;font-size:12px;font-style:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:#ffffff;float:none;display:inline!important"></span><pre style="font-size:12px;font-style:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;background-color:#ffffff">-- 
Mohamed Amine CHEBBI, PhD Student
Université de Poitiers
</pre><span style="font-family:Helvetica;font-size:12px;font-style:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:#ffffff;float:none;display:inline!important">______________________________<wbr>_________________</span><br style="font-family:Helvetica;font-size:12px;font-style:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:#ffffff"><span style="font-family:Helvetica;font-size:12px;font-style:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:#ffffff;float:none;display:inline!important">maker-devel mailing list</span><br style="font-family:Helvetica;font-size:12px;font-style:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:#ffffff"><a href="javascript:_e(%7B%7D,'cvml','maker-devel@box290.bluehost.com');" style="font-family:Helvetica;font-size:12px;font-style:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:#ffffff" target="_blank">maker-devel@box290.bluehost.co<wbr>m</a><br style="font-family:Helvetica;font-size:12px;font-style:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:#ffffff"><a href="http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org" style="font-family:Helvetica;font-size:12px;font-style:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:#ffffff" 
target="_blank">http://box290.bluehost.com/mai<wbr>lman/listinfo/maker-devel_yand<wbr>ell-lab.org</a><br style="font-family:Helvetica;font-size:12px;font-style:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:#ffffff"></div></blockquote></div><br></div></div></div></div><br></div></div></div></div></blockquote></div><br></div></div></div></div><br>
<br></blockquote></div><br><br clear="all"><br>-- <br><div data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div>Xabier Vázquez-Campos, <i>PhD</i><br><i>Research Associate</i><br>Water Research Centre<br>School of Civil and Environmental Engineering<br>
The University of New South Wales<br>Sydney NSW 2052 AUSTRALIA<br></div></div></div></div></div></div></div>
</div>
</blockquote></div><br><br>-- <br><div dir="ltr">Jason Stajich<br><a href="mailto:jason.stajich@gmail.com" target="_blank">jason.stajich@gmail.com</a><br></div><br>
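<div><br></div><div>For the filtering question quoted above (picking confident gene models from the first Maker run), here is a minimal Python sketch that is not taken from the thread. It assumes the GFF3 is standard MAKER output in which mRNA features carry an <code>_AED</code> attribute, and the 0.25 cutoff is only an illustrative assumption. The function names are hypothetical. It looks at AED only and does not check that an ORF is full length, so maker2zff followed by zff2augustus_gbk.pl remains the route described in the thread.</div>
<pre>
```python
from typing import Iterable, Iterator


def parse_attributes(column9: str) -> dict:
    """Split a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key] = value
    return attrs


def confident_mrna_ids(gff_lines: Iterable[str], max_aed: float = 0.25) -> Iterator[str]:
    """Yield IDs of MAKER mRNA features whose _AED is at most max_aed.

    max_aed=0.25 is an assumed, illustrative cutoff, not a recommendation
    from the thread.
    """
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence section: no more features follow
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        cols = line.rstrip("\n").split("\t")
        # Keep only final MAKER annotations, not raw evidence alignments.
        if len(cols) != 9 or cols[1] != "maker" or cols[2] != "mRNA":
            continue
        attrs = parse_attributes(cols[8])
        if "_AED" in attrs and max_aed >= float(attrs["_AED"]):
            yield attrs.get("ID", "")


# Tiny made-up example records (not real data from either annotation run).
example = [
    "##gff-version 3\n",
    "scaffold_1\tmaker\tgene\t100\t900\t.\t+\t.\tID=gene1;Name=gene1\n",
    "scaffold_1\tmaker\tmRNA\t100\t900\t.\t+\t.\tID=gene1-mRNA-1;Parent=gene1;_AED=0.05;_eAED=0.05\n",
    "scaffold_1\tmaker\tmRNA\t2000\t2900\t.\t-\t.\tID=gene2-mRNA-1;Parent=gene2;_AED=0.60;_eAED=0.62\n",
    "scaffold_1\test2genome\texpressed_sequence_match\t100\t900\t50\t+\t.\tID=est1\n",
]
print(list(confident_mrna_ids(example)))
```
</pre>
<div>The returned IDs could then be used to subset the GFF3 before converting it for training; the cutoff is worth tuning for each genome.</div>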