<html><head></head><body><div style="color:#000; background-color:#fff; font-family:HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif;font-size:16px"><div id="yui_3_16_0_1_1450388290921_8749" dir="ltr">Hi Daniel,</div><div id="yui_3_16_0_1_1450388290921_10695" dir="ltr"><br></div><div id="yui_3_16_0_1_1450388290921_10696" dir="ltr">I used the pre-trained models of Arabidopsis from SNAP and Augustus for this first run of maker. Do you think it would be wise to use the run I used previously (shown at the start of the topic) or should I make a new run with the following parameters to use for training? <br></div><div id="yui_3_16_0_1_1450388290921_8748" dir="ltr">genome=CAB_assembly.fasta</div><div id="yui_3_16_0_1_1450388290921_8952" dir="ltr">est=RTLs.fa</div><div id="yui_3_16_0_1_1450388290921_9347" dir="ltr">altest=Brassica_oleracea.fasta<br></div><div id="yui_3_16_0_1_1450388290921_9032" dir="ltr">protein=Arabidopsis_proteins.fasta</div><div id="yui_3_16_0_1_1450388290921_9164" dir="ltr">est2genome=0</div><div id="yui_3_16_0_1_1450388290921_8989" dir="ltr">protein2genome=0</div><div id="yui_3_16_0_1_1450388290921_9165" dir="ltr">SNAP=A.thaliana</div><div id="yui_3_16_0_1_1450388290921_9166" dir="ltr">Augustus=arabidopsis</div><div id="yui_3_16_0_1_1450388290921_9167" dir="ltr">model_org=arabidopsis</div><div id="yui_3_16_0_1_1450388290921_9442" dir="ltr">rmlib=Brassicaceae_repeats.fasta</div><div id="yui_3_16_0_1_1450388290921_9615" dir="ltr">repeat_protein=te_proteins.fasta<br></div><div id="yui_3_16_0_1_1450388290921_8086"><br><span></span></div><div dir="ltr" id="yui_3_16_0_1_1450388290921_9626"><span id="yui_3_16_0_1_1450388290921_9796">At what point would I use est2genome=1? Also for this plant genome, is it better to use model_org=arabidopsis or model_org=all? 
I am also considering using RepeatModeler to create a custom repeat library, but I am not sure it is necessary with all of the repeat information I am putting in already.</span></div><div id="yui_3_16_0_1_1450388290921_10866" dir="ltr"><br><span id="yui_3_16_0_1_1450388290921_9796"></span></div><div id="yui_3_16_0_1_1450388290921_10867" dir="ltr"><span id="yui_3_16_0_1_1450388290921_9796">Any advice is helpful.</span></div><div id="yui_3_16_0_1_1450388290921_10870" dir="ltr"><span id="yui_3_16_0_1_1450388290921_9796">Thanks,</span></div><div id="yui_3_16_0_1_1450388290921_10871" dir="ltr"><span id="yui_3_16_0_1_1450388290921_9796">-Elyssa</span></div> <div class="qtdSeparateBR"><br><br></div><div style="display: block;" class="yahoo_quoted"> <div style="font-family: HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif; font-size: 16px;"> <div style="font-family: HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif; font-size: 16px;"> <div dir="ltr"><font face="Arial" size="2"> On Wednesday, December 16, 2015 12:07 PM, Daniel Ence <dence@genetics.utah.edu> wrote:<br></font></div> <br><br> <div class="y_msg_container"><div id="yiv3335730612">
<div>
Hi Elyssa,
<div class="yiv3335730612"><br class="yiv3335730612">
</div>
<div class="yiv3335730612">Setting est2genome=1 tells MAKER to promote every est2genome alignment directly to a gene model, which is not what you want for a final gene set. That said, since your gene models are essentially the unmodified alignments, I’m surprised that
all of them have an AED of 1, because an AED of 1 means a model is not supported by any of the evidence (either EST or protein). </div>
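<div class="yiv3335730612"><br class="yiv3335730612">
</div>
<div class="yiv3335730612">To make the AED scale concrete, here is a minimal Python sketch of the published definition, AED = 1 - (sensitivity + specificity) / 2, computed per nucleotide between a gene model and its overlapping evidence. This is a simplification for illustration only, not MAKER's own code: the function names covered_bases and aed are hypothetical, and coordinates are assumed to be 1-based inclusive (start, end) pairs on a single strand.</div>

```python
def covered_bases(intervals):
    """Return the set of positions covered by inclusive (start, end) intervals."""
    bases = set()
    for start, end in intervals:
        bases.update(range(start, end + 1))
    return bases


def aed(annotation_exons, evidence_blocks):
    """Simplified annotation edit distance between a model and its evidence.

    0.0 means the model and the evidence cover exactly the same bases;
    1.0 means they share no bases at all (no evidence support).
    """
    ann = covered_bases(annotation_exons)
    evi = covered_bases(evidence_blocks)
    shared = len(ann & evi)
    if shared == 0:
        # No overlap: the model is unsupported by this evidence.
        return 1.0
    sensitivity = shared / len(evi)  # fraction of evidence bases in the model
    specificity = shared / len(ann)  # fraction of model bases backed by evidence
    return 1.0 - (sensitivity + specificity) / 2.0


if __name__ == "__main__":
    print(aed([(1, 100)], [(1, 100)]))    # identical spans
    print(aed([(1, 100)], [(51, 150)]))   # half of each span overlaps
    print(aed([(1, 100)], [(200, 300)]))  # no overlap
```

<div class="yiv3335730612">Under this definition, a model built directly from an EST alignment should overlap that alignment almost completely, so its score would be expected to sit near 0 rather than 1.</div>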
<div class="yiv3335730612"><br class="yiv3335730612">
</div>
<div class="yiv3335730612">Did you get gene models from SNAP or Augustus? You can gather those with the fasta_merge script. Those should be a good starting place for training ab initio predictors. Instructions for training SNAP can be found here:</div>
<div class="yiv3335730612"><a rel="nofollow" target="_blank" href="http://gmod.org/wiki/MAKER_Tutorial#Training_ab_initio_Gene_Predictors" class="yiv3335730612">http://gmod.org/wiki/MAKER_Tutorial#Training_ab_initio_Gene_Predictors</a></div>
<div class="yiv3335730612"><br class="yiv3335730612">
</div>
<div class="yiv3335730612">Augustus can also be trained, but the process is much more involved.</div>
<div class="yiv3335730612"><br class="yiv3335730612">
</div>
<div class="yiv3335730612">~Daniel</div>
<div class="yiv3335730612"><br class="yiv3335730612">
</div>
<div class="yiv3335730612"><br class="yiv3335730612">
</div>
<div class="yiv3335730612">
<div class="yiv3335730612">Daniel Ence<br class="yiv3335730612">
Graduate Student<br class="yiv3335730612">
Eccles Institute of Human Genetics<br class="yiv3335730612">
University of Utah<br class="yiv3335730612">
15 North 2030 East, Room 2100<br class="yiv3335730612">
Salt Lake City, UT 84112-5330 </div>
<br class="yiv3335730612">
<div>
<blockquote type="cite" class="yiv3335730612">
<div class="yiv3335730612">On Dec 11, 2015, at 10:43 AM, Elyssa Garza <<a rel="nofollow" ymailto="mailto:elyssa_garza@yahoo.com" target="_blank" href="mailto:elyssa_garza@yahoo.com" class="yiv3335730612">elyssa_garza@yahoo.com</a>> wrote:</div>
<br class="yiv3335730612Apple-interchange-newline">
<div class="yiv3335730612">
<div style="word-wrap:break-word;" class="yiv3335730612">
Hello,
<div class="yiv3335730612"><br class="yiv3335730612">
</div>
<div class="yiv3335730612">I have recently begun running MAKER. I am currently trying to annotate my Caulanthus genome (~372 Mb), a relative of Arabidopsis. I am unsure about the parameters I chose for my first MAKER run, which include:</div>
<div class="yiv3335730612"><br class="yiv3335730612">
</div>
<div class="yiv3335730612">genome=CAB_assembly.fasta (1044 contigs)</div>
<div class="yiv3335730612">est=Representative_transcript_loci.fasta (assembled transcripts between 200 and 20,000 bp long)</div>
<div class="yiv3335730612">protein=TAIR10pep.fasta (Arabidopsis proteins)</div>
<div class="yiv3335730612">—</div>
<div class="yiv3335730612"><u class="yiv3335730612">Repeat masking</u></div>
<div class="yiv3335730612">model_org=arabidopsis</div>
<div class="yiv3335730612">rmlib=list of Brassicaceae and common plant repeats</div>
<div class="yiv3335730612">repeat_protein=te_proteins.fasta</div>
<div class="yiv3335730612"><u class="yiv3335730612">Gene Prediction</u></div>
<div class="yiv3335730612">snaphmm=A.thaliana.hmm</div>
<div class="yiv3335730612">augustus_species=arabidopsis</div>
<div class="yiv3335730612">est2genome=1</div>
<div class="yiv3335730612"><br class="yiv3335730612">
</div>
<div class="yiv3335730612">I have run a sample file of scaffolds, as well as the entire genome.</div>
<div class="yiv3335730612">In the sample file of scaffolds, I merged the GFF files with gff3_merge and then ran evaluator. I noticed that my AED values are all 1. Is this bad? What should I try next?</div>
<div class="yiv3335730612"><br class="yiv3335730612">
</div>
<div class="yiv3335730612">I am also unsure how to train the gene predictors and whether this should be done in my case.</div>
<div class="yiv3335730612"><br class="yiv3335730612">
</div>
<div class="yiv3335730612">Can anyone advise me on these issues?</div>
<div class="yiv3335730612"><br class="yiv3335730612">
</div>
<div class="yiv3335730612">-Elyssa</div>
</div>
_______________________________________________<br class="yiv3335730612">
maker-devel mailing list<br class="yiv3335730612">
<a rel="nofollow" ymailto="mailto:maker-devel@box290.bluehost.com" target="_blank" href="mailto:maker-devel@box290.bluehost.com" class="yiv3335730612">maker-devel@box290.bluehost.com</a><br class="yiv3335730612">
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org<br class="yiv3335730612">
</div>
</blockquote>
</div>
<br class="yiv3335730612">
</div>
</div>
</div><br></div> </div> </div> </div></div></body></html>