[maker-devel] est2genome=1 for est and altest
Carson Holt
carsonhh at gmail.com
Tue Sep 2 10:57:56 MDT 2014
There is a reason why no altest2genome option exists in the maker_opts.ctl
file. The est2genome and protein2genome options are meant only for
generating rough partial models that can be used for training gene finders
(they should not be used for generating final models). If you are thinking
of using ESTs from another species (altest) to generate initial models for
training, that is actually an analysis error. First, altest
alignments will be far less accurate than EST or protein alignments (so
they will hurt your training). Second, they are slower to generate than EST
or protein alignments (by as much as 10-20 fold, because they are translated
into all 6 reading frames). Third, there will be far fewer of them (6 frames
of translation make spurious alignments more likely, so they require higher
thresholds of significance). So if you are using a species for initial
training that is distant enough that it must be aligned as altest via
tblastx, then you should be using proteins instead, which will be
widely available and more accurately aligned. Note that both proteins and
altests are aligned in amino acid space, so the alignments can span anywhere
from several million to hundreds of millions of years of divergence, and the
species you use does not need to be closely related (so whole proteomes
will be available from a number of sources, and they will align far more
accurately than any altest).
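As a rough sketch of what this means for an initial training run in
maker_opts.ctl (the FASTA file names are made-up placeholders, and only the
relevant options are shown; everything else in the file stays as it is):

```
#-----EST Evidence
est=my_species_ests.fasta  #ESTs from the species being annotated (placeholder name)
altest=  #left empty for the initial training run

#-----Protein Homology Evidence
protein=related_proteomes.fasta  #proteins from other species (placeholder name)

#-----Gene Prediction
est2genome=1  #rough models from same-species ESTs, for training only
protein2genome=1  #rough models from protein alignments, for training only
```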
The only real benefit of altest is to provide evidence of lineage specific
genes for organisms where there are no species in the same branch or
phylum to get protein evidence from. There will only be a handful
of these genes, and they can be obtained in any later bootstrap training
steps, which will not involve est2genome or protein2genome models. You
should use protein2genome models instead for the initial training and only
use altest for any bootstrap training or for your final models.
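For a later bootstrap or final run, the relevant settings might look roughly
like this. This assumes a SNAP HMM has already been trained from the initial
models; all file names are placeholders, and other gene finders would be set
through their own options:

```
#-----EST Evidence
est=my_species_ests.fasta  #placeholder name
altest=other_species_ests.fasta  #placeholder name; used as evidence only

#-----Protein Homology Evidence
protein=related_proteomes.fasta  #placeholder name

#-----Gene Prediction
snaphmm=my_species.hmm  #placeholder name; HMM trained on the first-pass models
est2genome=0  #no longer building models directly from ESTs
protein2genome=0  #no longer building models directly from proteins
```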
Thanks,
Carson
On 9/1/14, 7:07 AM, "Marc Höppner" <mphoeppner at gmail.com> wrote:
>Hi,
>
>I may be wrong about this, but it seems to me that Maker will never build
>a gene model from EST evidence if the data set is provided as 'altest'
>rather than 'est'. In my case, I am annotating a plant for which there is
>a closely related reference genome + annotation, as well as pretty good
>EST data. So I supplied the EST data as 'altest', assuming that the only
>difference would be that the alignment parameters would be slightly more
>relaxed. But I found that Maker never made any gene models from that
>data. When moving the EST data to 'est', it worked.
>
>So I am not sure whether this is an intended behaviour, but in my case it
>caught me a bit by surprise…
>
>Regards,
>
>Marc