[maker-devel] MAKER training

Tue Sep 11 09:38:33 MDT 2012

Hi,

I am using MAKER to annotate a newly sequenced genome. I have trained and retrained with different datasets, but I would like some advice on assessing the output and on how it is affected by the input provided.

- I have transcriptome data from the 454 and Illumina platforms. The Illumina data are from a single time point and the 454 data from multiple time points. The 454 reads were assembled using Newbler (dataset 1), and the Illumina reads using TopHat-Cufflinks (dataset 2) and the de novo Trinity pipeline (dataset 3). I now have 3 assemblies: 454 and Illumina will have some redundant transcripts (because of one overlapping time point), and TopHat-Cufflinks and Trinity will have highly redundant transcripts (because they use the same raw reads). Is it OK to provide all 3 datasets as EST evidence? How does it affect the quality of the annotation? (For now I have used dataset 1 and dataset 2 as EST evidence.)

- I used the above model to retrain and passed through everything except the ab initio gene predictions. I also provided a set of manually annotated genes, many of which have EST evidence. Is this OK to do? [For protein evidence, I gave a set from related organisms, same as above.]

- In my third retraining, I used the above retrained model, but this time I provided only the genome_gff and did not pass through any other data. However, I did provide the manually annotated genes as EST evidence and related proteins as protein_evidence.

Can you please give me some advice on which of these could give me the best prediction, or whether I can alter something to get a better prediction?

- A quick question about Augustus: I used an Augustus model (trained for a closely related organism) for ab initio prediction. Does MAKER adjust this model based on the evidence provided, or does it use the model as is for prediction?
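
On the redundancy between the assemblies mentioned above, one rough way to gauge it before deciding which datasets to pass as EST evidence is to count sequences that occur in more than one assembly. The following is only a minimal Python sketch under stated assumptions: the function names (read_fasta, canonical, shared_transcripts) and the assembly labels are illustrative and not part of MAKER, and it only detects identical sequences (in either orientation), so overlapping or partially assembled transcripts from the same reads would not be counted.

```python
from collections import defaultdict
from io import StringIO

# Complement table for reverse-complementing; other symbols are left unchanged.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def canonical(seq):
    """Return the smaller of a sequence and its reverse complement,
    so that both orientations of a transcript map to the same key."""
    reverse_complement = seq.translate(_COMPLEMENT)[::-1]
    return min(seq, reverse_complement)


def shared_transcripts(assemblies):
    """Map each canonical sequence found in more than one assembly
    to the set of assembly labels it was found in.

    assemblies: dict of label -> open FASTA file handle (labels are hypothetical).
    """
    seen = defaultdict(set)
    for label, handle in assemblies.items():
        for _, seq in read_fasta(handle):
            seen[canonical(seq)].add(label)
    return {seq: labels for seq, labels in seen.items() if len(labels) > 1}


# Toy example with in-memory FASTA text standing in for real assembly files.
cufflinks = StringIO(">t1\nACGTAC\n>t2\nGGGTTT\n")
trinity = StringIO(">c1\nGTACGT\n>c2\nAAACCC\n>c3\nTTTT\n")
shared = shared_transcripts({"cufflinks": cufflinks, "trinity": trinity})
print(len(shared), "sequences occur in both assemblies")
```

With real data, the StringIO objects would be replaced by open file handles for the assembly FASTA files. Exact matching will underestimate the true redundancy, so the count is best read as a lower bound.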

Greatly appreciate your help!
Thanks!
Ranjani
