<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class="">A couple of corrections to the reply below. SNAP doesn’t work well on primates, so you probably don’t want to use it (the mammal HMM is not a good replacement). This advice comes directly from the author of SNAP. There are ways to make it work by splitting the genome into isotigs, but it’s a little messy and technical, so just don’t use it on primates.</div><div class=""><br class="">Here’s a good website on training Augustus (<a href="http://www.molecularevolution.org/molevolfiles/exercises/augustus/training.html" class="">http://www.molecularevolution.org/molevolfiles/exercises/augustus/training.html</a>). You need some sort of results to train with. You can use results either from a protein2genome run of MAKER or from a run where you use human as your species together with other evidence in MAKER (the models won’t be perfect, but they will be enough to get training going).</div><div class=""><br class=""></div><div class="">Unless your species is evolutionarily very close to human, you probably don’t want to just stick with the human species file (because you’re not going to use SNAP, you will need to optimize the one gene predictor you do get to use as much as possible). </div><div class=""><br class=""></div><div class="">You need models to be in GenBank format for training. There is a roundabout way to do this with GFF3 models. First use the script that comes with MAKER for training SNAP (maker2zff). 
Then follow the training instructions in SNAP’s README.</div><div class=""><br class=""></div><div class="">Basically, run the following commands (where the first two files come from maker2zff):</div><div class=""><div class="">fathom genome.ann genome.dna -categorize 1000</div><div class="">fathom uni.ann uni.dna -export 1000 -plus</div></div><div class=""><br class=""></div><div class="">Then, using this script from Jason Stajich, you can convert the resulting export.ann and export.dna files to a GenBank-format file:</div><div class=""><a href="https://github.com/hyphaltip/genome-scripts/blob/master/gene_prediction/zff2augustus_gbk.pl" class="">https://github.com/hyphaltip/genome-scripts/blob/master/gene_prediction/zff2augustus_gbk.pl</a></div><div class=""><br class=""></div><div class=""><br class=""></div><div class="">Go ahead and run with human as your species first, so you can review the models and see how models and evidence correlate in a viewer like Apollo or IGV. But I would still recommend training Augustus for your species.</div><div class=""><br class=""></div><div class="">--Carson</div><div class=""><br class=""></div><div class=""><br class=""><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On May 19, 2015, at 3:18 PM, Michael Campbell <<a href="mailto:michael.s.campbell1@gmail.com" class="">michael.s.campbell1@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">Hi Julian,<div class=""><br class=""></div><div class="">Since you are annotating a primate, I would use the pre-trained human parameters for Augustus. 
Here is what I would try first:</div><div class=""><br class=""></div><div class=""> <span style="font-size: 13px; font-family: Tahoma;" class="">genome=data/hsap_contig.fasta </span><span style="font-size: 13px; font-family: Tahoma;" class=""> # contig file from example data</span></div><span style="font-family: Tahoma; font-size: 13px;" class="">est=data/mRNAs.fa # RNAs filtered to just mRNAs</span><br style="font-family: Tahoma; font-size: 13px;" class=""><span style="font-family: Tahoma; font-size: 13px;" class="">protein=data/protein.fa</span><br style="font-family: Tahoma; font-size: 13px;" class=""><span style="font-family: Tahoma; font-size: 13px;" class="">est2genome=0</span><br style="font-family: Tahoma; font-size: 13px;" class=""><span style="font-family: Tahoma; font-size: 13px;" class="">protein2genome=0</span><div class=""><span style="font-family: Tahoma;" class="">augustus_species=human</span><br class=""></div><div class=""><font face="Tahoma" class=""><br class=""></font></div><div class=""><font face="Tahoma" class="">You could also use one of the mammal HMMs packaged with SNAP, or use the output from the above to train SNAP. 
There are tutorials that walk through these steps here: </font></div><div class=""><font face="Tahoma" class=""><br class=""></font></div><div class=""><font face="Tahoma" class=""><a href="http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Main_Page" class="">http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Main_Page</a></font></div><div class=""><font face="Tahoma" class=""><br class=""></font></div><div class=""><font face="Tahoma" class="">There is also a Current Protocols in Bioinformatics article on using MAKER that may help you get started.</font></div><div class=""><font face="Tahoma" class=""><br class=""></font></div><div class=""><font face="Tahoma" class=""><a href="http://onlinelibrary.wiley.com/doi/10.1002/0471250953.bi0411s48/abstract" class="">http://onlinelibrary.wiley.com/doi/10.1002/0471250953.bi0411s48/abstract</a><br class=""></font></div><div class=""><font face="Tahoma" class=""><br class=""></font></div><div class=""><font face="Tahoma" class="">Good luck,</font></div><div class=""><font face="Tahoma" class="">Mike</font></div></div><div class="gmail_extra"><br class=""><div class="gmail_quote">On Tue, May 19, 2015 at 1:51 PM, Julian Egger <span dir="ltr" class=""><<a href="mailto:julian.egger@omahazoo.com" target="_blank" class="">julian.egger@omahazoo.com</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="">
<div style="direction: ltr; font-family: Tahoma; font-size: 13px;" class="">
<div class=""> </div>
I am trying to use Augustus in MAKER to help with annotating as many genes as possible from genomic reads of a primate sample. I am new to using gene prediction tools such as SNAP and Augustus, but was told Augustus would be better for primates. I tried using
reference mRNAs and protein sequences from NCBI on the sample contig file included with the MAKER software and it ran ok. My question is how do I now use the output to train Augustus iteratively and thus create a file set of annotations from my original input?
<br class="">
<br class="">
After creating the control files with maker -CTL, the only configurations I made to maker_opts.ctl were:<br class="">
genome=data/hsap_contig.fasta # contig file from example data<br class="">
est=data/mRNAs.fa # RNAs filtered to just mRNAs<br class="">
protein=data/protein.fa<br class="">
est2genome=1<br class="">
protein2genome=1<br class="">
<br class="">
I will eventually replace the contig file with our scaffolds file from the assembly. I know the output created a GFF file along with protein and mRNA files. Do I then need to change the maker_opts file to account for the new files, and if so, how, and what should
the maker_opts file look like now? Was Augustus supposed to be set up on the initial MAKER run, or do I wait until the second run, after est2genome and protein2genome were used to initialize training for Augustus? And how do the configurations change between
multiple iterations once I have a solid annotation set? <br class="">
<br class="">
Sorry for all the questions, newbie here with a lot of data to work with. <br class="">
<br class="">
Thanks<br class="">
</div>
</div>
<br class="">_______________________________________________<br class="">
maker-devel mailing list<br class="">
<a href="mailto:maker-devel@box290.bluehost.com" class="">maker-devel@box290.bluehost.com</a><br class="">
<a href="http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org" target="_blank" class="">http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org</a><br class="">
<br class=""></blockquote></div><br class=""><br clear="all" class=""><div class=""><br class=""></div>-- <br class=""><div class="gmail_signature"><div dir="ltr" class="">Michael Campbell MS, RD.<br class="">Doctoral Candidate<br class="">Eccles Institute of Human Genetics<br class="">
University of Utah<br class="">
15 North 2030 East, Room 2100<br class="">
Salt Lake City, UT 84112-5330<br class="">ph:585-3543<br class=""><br class=""></div></div>
</div>
</div></blockquote></div><br class=""></div></div></body></html>