<html>

<head>

<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">

</head>

<body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">

Hi Saad, 

<div><br>

</div>

<div>MAKER doesn't treat EST or protein evidence as gene models in themselves. There's a good reason for this: aligners like BLAST don't guarantee complete gene models with accurate start codons, stop codons, and splice sites. With its default settings, MAKER won't make a gene model unless there's evidence that overlaps an ab-initio prediction (or something from the pred_gff option).</div>

<div><br>

</div>

<div>You can set est2genome=1 to promote everything from the est_gff option to a gene model, but this will probably give you many spurious results. What you're saying with est2genome is, "Everything that this tool found is a complete gene model." I don't think that's true even for Cufflinks output.</div>
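<div><br>
</div>
<div>For reference, the est2genome route would look roughly like this in maker_opts.ctl. This is only a sketch: the file name is copied from your message, and every other option is assumed to stay at whatever your control file already has.</div>
<div>
<pre>
est_gff=cufflinksout.GFF  # aligned transcripts from Cufflinks, used as EST evidence
est2genome=1              # 1 = infer gene models directly from EST evidence, 0 = don't
</pre>
</div>
<div>Setting est2genome back to 0 restores the default behavior, where evidence only supports models that come from a predictor or from pred_gff.</div>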

<div><br>

</div>

<div>One of the gene predictors that MAKER can run internally is SNAP. It's really easy to train; here's a link to a tutorial for training it: <a href="http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_GMOD_Online_Training_2014#Training_ab_initio_Gene_Predictors">http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_GMOD_Online_Training_2014#Training_ab_initio_Gene_Predictors</a></div>
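<div><br>
</div>
<div>Once SNAP is trained, you point MAKER at the resulting HMM with the snaphmm option. Here is a rough sketch of the relevant maker_opts.ctl lines. Both my_species.hmm and ensembl_reference.gff are made-up placeholder names, and the est_gff name is the one from your message.</div>
<div>
<pre>
est_gff=cufflinksout.GFF         # Cufflinks transcripts, used as evidence
model_gff=ensembl_reference.gff  # placeholder: your existing Ensembl models
snaphmm=my_species.hmm           # placeholder: HMM file produced by SNAP training
est2genome=0                     # don't promote evidence alone to gene models
</pre>
</div>
<div>With settings along these lines, SNAP predictions at novel loci that overlap the Cufflinks evidence should be able to become new gene models, while the model_gff entries are kept unless something better is found.</div>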

<div><br>

</div>

<div>Let me know if that helps, or if you have more questions.</div>

<div><br>

</div>

<div><br>

</div>

<div>~Daniel</div>

<div><br>

</div>

<div>

<div apple-content-edited="true">

<div><span style="font-family: Tahoma; font-size: small; ">Daniel Ence</span></div>

<div><span style="font-family: Tahoma; font-size: small; ">Graduate Student</span></div>

<div><a href="mailto:dence@genetics.utah.edu">dence@genetics.utah.edu</a><br style="font-family: Tahoma; font-size: small; ">

<span style="font-family: Tahoma; font-size: small; ">Eccles Institute of Human Genetics</span><br style="font-family: Tahoma; font-size: small; ">

<span style="font-family: Tahoma; font-size: small; ">University of Utah</span><br style="font-family: Tahoma; font-size: small; ">

<span style="font-family: Tahoma; font-size: small; ">15 North 2030 East, Room 2100</span><br style="font-family: Tahoma; font-size: small; ">

<span style="font-family: Tahoma; font-size: small; ">Salt Lake City, UT 84112-5330</span></div>

</div>

<br>

<div>

<div>On Jun 18, 2014, at 5:09 AM, Saad Arif &lt;<a href="mailto:saad.arif@tuebingen.mpg.de">saad.arif@tuebingen.mpg.de</a>&gt;</div>

<div> wrote:</div>

<br class="Apple-interchange-newline">

<blockquote type="cite">Thank you for the response. I still have one question though, with these options:<br>

<br>

est_GFF=cufflinksout.GFF<br>

<br>

model_GFF= ensembl reference.GFF<br>

<br>

What happens to Cufflinks-assembled transcripts that are not confined to current gene loci (i.e. novel genes in the Cufflinks output)? Would I have to prepare ab initio gene predictions for each of these predicted 'new' genes?

<br>

Is there a simple way to combine adding (new genes) and improving of an existing annotation?<br>

<br>

Any feedback on this would be greatly appreciated.<br>

<br>

saad<br>

<br>

On 13 Jun 2014, at 17:59, Carson Holt wrote:<br>

<br>

<blockquote type="cite">Use the cufflinks instead of the tophat features (tophat tends to be<br>

really noisy).  Give the existing models to model_gff (they will then<br>

always be kept unless something better is found).  There is no option to<br>

keep models and then just add isoforms.  The model_gff input will either<br>

be kept as is (unchanged), or replaced with an updated model suggested by<br>

the evidence (the updated model may contain multiple isoforms though), and<br>

map_forward=1 can be used to pull names forward from the old model onto<br>

the new models.<br>

<br>

Thanks,<br>

Carson<br>

<br>

<br>

On 6/13/14, 5:03 AM, "Saad Arif" &lt;<a href="mailto:saad.arif@tuebingen.mpg.de">saad.arif@tuebingen.mpg.de</a>&gt; wrote:<br>

<br>

<blockquote type="cite">Dear All,<br>

<br>

I would like to use the MAKER pipeline to expand a current annotation (new<br>

isoforms and novel genes with respect to the current annotation) and was<br>

wondering if anyone had experience with this and/or suggestions regarding my<br>

questions.<br>

<br>

Briefly:<br>

<br>

I have TopHat splice junctions from RNA-seq data or, alternatively,<br>

Cufflinks-generated transcript models (FASTA format) that I want to use<br>

as my new data (est_gff or est).<br>

<br>

I want to provide the current Ensembl annotation for gene prediction but<br>

i want this annotation to remain unchanged. Hence, i’m not sure if i<br>

should provide this annotation as pred_gff<br>

or model_gff. Can the model_gff be used for gene prediction or is this<br>

just a subset of pred_gff that remain unaltered? Can we provide the same<br>

annotation for both options (pred_gff and model_gff)?<br>

<br>

<br>

<br>

Importantly, my main goal is to use the new RNAseq data to add more<br>

isoforms and (any) novel genes to the existing Ensembl annotation. Any<br>

thoughts or suggestions on how to go about this would be sincerely<br>

appreciated.<br>

<br>

<br>

Thanks in advance,<br>

saad<br>

<br>

<br>

<br>

<br>

_______________________________________________<br>

maker-devel mailing list<br>

<a href="mailto:maker-devel@box290.bluehost.com">maker-devel@box290.bluehost.com</a><br>

http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org<br>

</blockquote>

<br>

<br>

</blockquote>

<br>

<br>


</blockquote>

</div>

<br>

</div>

</body>

</html>