<html dir="ltr">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style id="owaParaStyle" type="text/css">
<!--
p
{margin-top:0;
margin-bottom:0}
-->
</style>
</head>
<body fpstyle="1" ocsi="0">
<div style="direction: ltr;font-family: Tahoma;color: #000000;font-size: 10pt;">So, MAKER itself isn't probabilistic. If you give it the same data and the same options, it will give you the same outputs. The iterative approach with MAKER is to get gene models
in the first round using the est2genome option and the Augustus model that you mentioned. After that first round, you train the ab initio predictors on those gene models and tell MAKER to use the newly trained gene predictors in the second round.
<div><br>
</div>
<div>Regarding the set of manually annotated genes, I think you should put those in the model_gff option.</div>
<div><br>
</div>
<div>Thanks,</div>
<div>Daniel<br>
<div><br>
<div class="BodyFragment"><font size="2">
<div class="PlainText">Daniel Ence<br>
Graduate Student<br>
Eccles Institute of Human Genetics<br>
University of Utah<br>
15 North 2030 East, Room 2100<br>
Salt Lake City, UT 84112-5330</div>
</font></div>
</div>
<div style="font-family: Times New Roman; color: #000000; font-size: 16px">
<hr tabindex="-1">
<div id="divRpF91787" style="direction: ltr; "><font face="Tahoma" size="2" color="#000000"><b>From:</b> Sivaranjani Namasivayam [ranjani@uga.edu]<br>
<b>Sent:</b> Tuesday, September 11, 2012 9:57 AM<br>
<b>To:</b> Daniel Ence; maker-devel@yandell-lab.org<br>
<b>Subject:</b> RE: MAKER training<br>
</font><br>
</div>
<div></div>
<div>
<div style="direction:ltr; font-family:Tahoma; color:#000000; font-size:10pt">Hey,<br>
<br>
I used the MAKER model to retrain MAKER itself. I read somewhere it improves MAKER's predictions.
<br>
<br>
I did train ab initio gene predictors using MAKER's output, but I wanted to identify the best prediction before using it to train other gene predictors.<br>
<br>
Thanks,<br>
Ranjani<br>
<div style="font-family:Times New Roman; color:#000000; font-size:16px">
<hr tabindex="-1">
<div id="divRpF742812" style="direction:ltr"><font color="#000000" face="Tahoma" size="2"><b>From:</b> Daniel Ence [dence@genetics.utah.edu]<br>
<b>Sent:</b> Tuesday, September 11, 2012 11:46 AM<br>
<b>To:</b> Sivaranjani Namasivayam; maker-devel@yandell-lab.org<br>
<b>Subject:</b> RE: MAKER training<br>
</font><br>
</div>
<div></div>
<div>
<div style="direction:ltr; font-family:Tahoma; color:#000000; font-size:10pt">Hi Ranjani,
<div><br>
</div>
<div>It is fine to include all three of those transcriptome datasets. The more (relevant) evidence, the better. </div>
<div><br>
</div>
<div>I'm not certain what you mean when you say "you used the above model to retrain". Did you train an ab initio gene predictor using the results from your first MAKER run?<br>
<div><br>
<div class="BodyFragment"><font size="2">
<div class="PlainText">Daniel Ence<br>
Graduate Student<br>
Eccles Institute of Human Genetics<br>
University of Utah<br>
15 North 2030 East, Room 2100<br>
Salt Lake City, UT 84112-5330</div>
</font></div>
</div>
<div style="font-family:Times New Roman; color:#000000; font-size:16px">
<hr tabindex="-1">
<div id="divRpF441968" style="direction:ltr"><font color="#000000" face="Tahoma" size="2"><b>From:</b> maker-devel-bounces@yandell-lab.org [maker-devel-bounces@yandell-lab.org] on behalf of Sivaranjani Namasivayam [ranjani@uga.edu]<br>
<b>Sent:</b> Tuesday, September 11, 2012 9:38 AM<br>
<b>To:</b> maker-devel@yandell-lab.org<br>
<b>Subject:</b> [maker-devel] MAKER training<br>
</font><br>
</div>
<div></div>
<div>
<div style="direction:ltr; font-family:Tahoma; color:#000000; font-size:10pt">Hi,<br>
<br>
I am using MAKER to annotate a newly sequenced genome. I have trained and retrained with datasets but I would like some advice on assessing the output and how this is affected by the input provided.<br>
<br>
- I have transcriptome data from the 454 and Illumina platforms. The Illumina data is from a single time point and the 454 data is from multiple time points. The 454 data was assembled using Newbler (dataset 1), and the Illumina data using TopHat-Cufflinks (dataset 2) and the de novo Trinity pipeline (dataset
3). I now have 3 assemblies: 454 and Illumina will have some redundant transcripts (because of one overlapping time point); TopHat-Cufflinks and Trinity will have highly redundant transcripts (because they use the same raw reads). Is it OK to provide all 3 datasets
as EST evidence, and how does it affect the quality of the annotation? (For now I have used dataset 1 and dataset 2 as EST evidence.)<br>
<br>
- I used the above model to retrain; I passed through everything except the ab initio gene predictions. I also provided a set of manually annotated genes, many of which have EST evidence. Is this OK to do? [For protein evidence, I gave a set from related organisms,
same as above.]<br>
<br>
- In my third retraining, I used the above retrained model, but this time I provided only the genome_gff and did not pass through any other data. However, I did provide the manually annotated genes as EST evidence and related proteins as protein_evidence.
<br>
<br>
Can you please give me some advice on which of these could give me the best prediction, or whether I can alter something to get a better prediction?<br>
<br>
- A quick question about Augustus: I used an Augustus model (trained for a closely related organism) for ab initio prediction. Does MAKER adjust this model based on the evidence provided, or does it use the model as is for prediction?<br>
<br>
Greatly appreciate your help!<br>
Thanks!<br>
Ranjani<br>
<br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</body>
</html>