[maker-devel] exon/intron boundaries
Carson Holt
carsonhh at gmail.com
Mon Aug 26 13:21:27 MDT 2013
Are you getting gene fusions or just more exons? Gene fusions can be
reduced by setting correct_est_fusion=1, or by reducing pred_flank, although
reducing pred_flank can cause other issues (but those generally only appear
if setting the value below 150). Also, if you have the maximum intron
size set too high (split_hit option), you may be generating bridging
alignments that make evidence align across distant paralogous genes
(this can result in gene merging).
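As a rough illustration of the settings mentioned above, here is a minimal Python sketch that reads the text of a maker_opts.ctl file and flags values that could contribute to fusions. The function names and the max_intron cutoff are hypothetical and should be adjusted to whatever intron sizes you expect for your species; only the option names and the 150 bp pred_flank floor come from the discussion above.

```python
# Rough check of fusion-related settings in a MAKER control file.
# parse_ctl, fusion_warnings and max_intron are illustrative names, not part of MAKER.

def parse_ctl(text):
    """Parse 'key=value #comment' lines into a dict of strings."""
    opts = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if "=" in line:
            key, value = line.split("=", 1)
            opts[key.strip()] = value.strip()
    return opts


def as_int(opts, key):
    """Return the option as an int, or None if it is missing or not a number."""
    value = opts.get(key, "")
    return int(value) if value.isdigit() else None


def fusion_warnings(opts, max_intron):
    """Return warnings for settings that may encourage merged gene models.

    max_intron is an assumed, species-specific cutoff supplied by the caller.
    """
    warnings = []
    if opts.get("correct_est_fusion") != "1":
        warnings.append("correct_est_fusion is not set to 1")
    pred_flank = as_int(opts, "pred_flank")
    if pred_flank is not None and pred_flank < 150:
        warnings.append("pred_flank is below 150, which can cause other issues")
    split_hit = as_int(opts, "split_hit")
    if split_hit is not None and split_hit > max_intron:
        warnings.append("split_hit is larger than the expected maximum intron size")
    return warnings
```

For example, calling fusion_warnings(parse_ctl(open("maker_opts.ctl").read()), max_intron=20000) would list any of the three settings worth revisiting; the 20000 here is only a placeholder.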
You should also look at your results manually in a viewer like Apollo. Then
see if the extra exons are supported by something such as protein alignments
from another species. If this is the case, you may have a poorly annotated
protein set that is being used as evidence and is carrying over its
erroneous exons into the species you are annotating. If the extra exons
are supported by EST evidence, then perhaps you should try to rebuild the
EST assembly (for example, Trinity has an option to use a Jaccard
similarity coefficient to avoid fusing transcripts).
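If you want a quick tabulation before opening the viewer, something like the following Python sketch can list which evidence sources overlap each annotated exon in the GFF3 output. It assumes that gene models carry the source "maker" in column 2 and that aligned evidence blocks appear as match_part features with sources such as protein2genome, est2genome or repeatmasker; check this against your own file. The function names are made up for illustration.

```python
# List the evidence sources that overlap each model exon in a GFF3 file.
# Assumes models have source "maker" and evidence blocks are "match_part" features.
from collections import defaultdict


def parse_gff3(text):
    """Yield (seqid, source, type, start, end) for each 9-column feature line."""
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 9:
            continue  # skips FASTA or other non-feature lines
        yield cols[0], cols[1], cols[2], int(cols[3]), int(cols[4])


def exon_support(text, model_source="maker"):
    """Map each model exon (seqid, start, end) to a sorted list of overlapping sources."""
    exons = []
    evidence = defaultdict(list)  # seqid -> [(start, end, source)]
    for seqid, source, ftype, start, end in parse_gff3(text):
        if source == model_source and ftype == "exon":
            exons.append((seqid, start, end))
        elif source != model_source and ftype == "match_part":
            evidence[seqid].append((start, end, source))

    support = {}
    for seqid, start, end in exons:
        # Two closed intervals overlap when each starts before the other ends.
        sources = {src for s, e, src in evidence[seqid] if s <= end and e >= start}
        support[(seqid, start, end)] = sorted(sources)
    return support
```

Exons whose list contains only a protein source point toward the protein set, exons with only EST support point toward the transcript assembly, and exons with an empty list are purely ab initio calls.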
Another option is to retrain SNAP or Augustus. MAKER does not actually
produce any of the models itself (it is a pipeline, not a predictor). The
models are all generated by these other algorithms; MAKER just feeds them
hints based on protein and transcript alignments, so making sure training is
sufficient is important for those programs to produce their best models.
Finally, make sure your repeat database is sufficient. You may need to
generate a species-specific repeat library using something like
RepeatModeler. Repeats can end up being included as extra exons in gene
models because they may contain reading frames that do code for proteins
(e.g. reverse transcriptases).
If you have any questions on any of the above, just let us know.
Thanks,
Carson
From: Janna Fierst <jfierst at uoregon.edu>
Date: Monday, August 26, 2013 2:54 PM
To: <maker-devel at yandell-lab.org>
Subject: [maker-devel] exon/intron boundaries
Hi,
I am using MAKER 2.28 to annotate a Caenorhabditid worm genome, and the
initial results appear fairly good but we seem to be annotating too many
exons for multiple genes. I was wondering which parameters should be tuned
to change the threshold for exon/intron boundaries? Thanks for your help.
-Janna Fierst