<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class="">I would recommend using just the Trinity assembly. The Cufflinks results tend to be messy.</div><div class=""><br class=""></div><div class="">You shouldn’t need the est2genome or protein2genome results if you have already trained using the CEGMA results. You can then do one MAKER run (it can be on just part of the genome) with both SNAP and Augustus as the predictors and with est2genome and protein2genome turned off, and then give these results back to SNAP to train with again. This second round of bootstrap training is usually beneficial to SNAP (going beyond two rounds doesn’t really help). Also, don’t concatenate the new genes with previous training sets for the second round of bootstrap training. The idea is that the second-round training genes will be more correct than the first-round ones, so you want to use them instead.</div><div class=""><br class=""></div><div class="">When you are done, look at one of the larger contigs in a viewer like Apollo and compare the raw Augustus calls, the raw SNAP calls, and the evidence-aware Augustus and SNAP calls produced by MAKER. If SNAP and Augustus are properly trained, they will produce similar calls, and those calls will also be similar to the evidence-aware calls from MAKER (this convergence is the result of the training). If one predictor still produces very divergent calls, just drop that predictor from the analysis. 
A bad predictor will make all results worse.</div><div class=""><br class=""></div><div class="">--Carson</div><br class=""><div><blockquote type="cite" class=""><div class="">On Feb 18, 2015, at 8:30 AM, Kai Kamm <<a href="mailto:kai.kamm@ecolevol.de" class="">kai.kamm@ecolevol.de</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div class=""><div style="font-family: Verdana;font-size: 12.0px;" class=""><div class="">
<div class="">Hello</div>
<div class="">I have just started in this field of research and I want to annotate my assembled non-bilaterian invertebrate genome (100 Mb in 7000 scaffolds) with MAKER.</div>
<div class=""> </div>
<div class="">I have read the MAKER tutorials but I am still a little uncertain about the iterative procedure. What I have already done is:</div>
<div class=""> </div>
<div class="">- trained Augustus (using the web service) on the reference genome of a closely related species and its published dataset of "best transcripts", which are mainly based on gene prediction and some EST evidence. The published ESTs themselves were rejected by Augustus as insufficient for training (too few long transcripts).<br class="">
- trained SNAP with the CEGMA-output of my genome<br class="">
- assembled RNA-seq data with TopHat/Cufflinks and generated a GFF file with cufflinks2gff<br class="">
- de novo assembled RNA-seq data with Trinity</div>
<div class=""><br class="">
I have already done some preliminary MAKER runs with the initially trained Augustus and SNAP plus some protein evidence, and these gave good results.</div>
<div class=""> </div>
<div class="">Now my strategy is:</div>
<div class=""> </div>
<div class="">to run MAKER with</div>
<div class="">- the est2genome option using the cufflinks gff and the Trinity transcripts as EST evidence</div>
<div class=""> </div>
<div class="">- the protein2genome option using a protein file including all proteins of the closely related species, a less related non-bilaterian species and a collection of reviewed Swiss-Prot entries from one representative mammal and all protostomes</div>
<div class=""> </div>
<div class="">- Augustus and SNAP for gene prediction</div>
<div class=""> </div>
<div class="">When this is done I want to:</div>
<div class=""> </div>
<div class="">- create a second training set for SNAP from the merged GFFs with maker2zff<br class="">
- train Augustus again with the Maker transcripts using the Augustus web service</div>
<div class=""><br class="">
And run MAKER again.</div>
<div class=""><br class="">
Is this a reasonable procedure? Or am I missing some important aspects here?</div>
<div class="">Thanks in advance.</div>
</div></div></div>
</div></blockquote></div><br class=""></body></html>