[maker-devel] evidence-only gene annotation
nellerk at yorku.ca
Thu Apr 12 11:12:12 MDT 2018
Hello,
I am using Maker to annotate a novel, non-model plant genome.
Following the published protocol, I have run one evidence-only round
(est2genome = 1, protein2genome = 1) followed by two iterative rounds,
re-training Snap and Augustus each time.
I have a curious result in that the gene predictors do not seem to be
finding many genes, but instead creating gene fusions. As such, my
evidence-only round resulted in 29,773 genes (mean length=5071 bp), and my
final round yielded 29,845 genes (mean length=6530 bp). If I am
interpreting this correctly, the predictors found only 72 new genes but
greatly increased the mean length of all genes. I have inspected the
results visually in a genome viewer and it seems that the predictors often
create fusions with nearby pseudogenes. I attempted to reduce this by
changing pred_flank from 200 (default) to 100, but it didn't seem to make a
difference (at least for the genes I was looking at).
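For anyone wanting to reproduce this kind of round-to-round comparison, here is a minimal Python sketch that counts gene features and averages their lengths from GFF3 lines. It assumes the annotations were exported as GFF3 with gene models in features of type "gene"; the function name gene_stats is hypothetical, and the per-round figures plugged in at the end are the ones quoted above, not recomputed values.

```python
def gene_stats(gff3_lines):
    """Return (gene count, mean gene length in bp) for 'gene' features."""
    lengths = []
    for line in gff3_lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence section follows; no more features
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue  # assumption: gene models have type 'gene' in column 3
        start, end = int(cols[3]), int(cols[4])
        lengths.append(end - start + 1)  # GFF3 coordinates are 1-based, inclusive
    mean = sum(lengths) / len(lengths) if lengths else 0.0
    return len(lengths), mean


# Figures quoted in this message: (gene count, mean gene length in bp).
round1 = (29773, 5071)  # evidence-only round
round3 = (29845, 6530)  # final round with trained predictors

extra_genes = round3[0] - round1[0]   # net change in gene count
length_ratio = round3[1] / round1[1]  # relative change in mean gene length
```

With these numbers the net gain is 72 genes while the mean gene length grows by a little under 30%. Note that a net count cannot distinguish newly predicted genes from merged ones, so fusions could be masking a larger number of new predictions.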
So although my final Maker round looks good (~30,000 genes, 95% of genes
have AED < 0.5), I have greater confidence in the models created by the
evidence-only round.
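The AED summary can be checked in a similar way. The sketch below assumes the GFF3 stores each transcript's score as an _AED=... attribute on mRNA features, so it reports the fraction of transcripts (not genes) below a cutoff; the function name fraction_below_aed is hypothetical.

```python
import re

# Matches the _AED attribute in column 9 without also matching _eAED.
AED_RE = re.compile(r"(?:^|;)_AED=([0-9.]+)")


def fraction_below_aed(gff3_lines, cutoff=0.5):
    """Return the fraction of mRNA features whose _AED is below cutoff."""
    aeds = []
    for line in gff3_lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence section follows; no more features
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "mRNA":
            continue  # assumption: AED is recorded on mRNA features
        match = AED_RE.search(cols[8])
        if match:
            aeds.append(float(match.group(1)))
    if not aeds:
        return 0.0
    return sum(a < cutoff for a in aeds) / len(aeds)
```

Running this on both the evidence-only and the final-round GFF3 would show whether the trained predictors actually improved agreement with the evidence, which may be more informative than the final-round figure alone.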
I have two questions:
1) In this case, would it be acceptable to use evidence-only gene models
(from Round 1), rather than those from Round 3 (which incorporated trained
gene predictors)? I ask because I haven't seen reports of Maker being used
in this way.
2) Do you have any suggestions to improve my ab initio training or
prediction? Please note, I have already repeat-masked the genome with a
species-specific repeat library.
Thank you for any assistance!
Kira