[maker-devel] question regarding MAKER determination of CDS boundaries
Andrew Farmer
adf at ncgr.org
Thu Oct 2 13:28:16 MDT 2014
Hi all-
several months ago, our group used MAKER-P (version 2.30) to annotate
some draft genome assemblies,
and have since been working a bit more closely evaluating the predicted
gene models in an effort to get them
ready for public release. One of the things that we recently noticed
during this process is that a considerable proportion
(~10%) of the predicted peptides do not begin with a start codon.
Initially, my guess was that this was simply due
to assembly gaps causing truncations (and this may be a partial
explanation) but I was surprised to see many of
them with 5' UTRs reported: about half of the proteins beginning without
a start codon report a 5' UTR of length 0,
while the rest have 5' UTR lengths ranging from a few bp
to several kb.
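
For anyone who wants to run the same check on their own annotation, the count
above can be reproduced with something like the following. This is only a
minimal sketch: it assumes the predicted proteins are in an ordinary FASTA file,
and the function names and the example records are made up for illustration.

```python
from io import StringIO

def read_fasta(handle):
    """Yield (id, sequence) pairs from a FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first whitespace-delimited token as the ID.
            name, chunks = (line[1:].split() or [""])[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)

def missing_start(handle):
    """Return IDs of proteins whose first residue is not methionine."""
    return [pid for pid, seq in read_fasta(handle)
            if not seq.upper().startswith("M")]

# Hypothetical records, not taken from a real annotation.
example = StringIO(
    ">gene1-mRNA-1 protein\nMKTAYIAK\n"
    ">gene2-mRNA-1 protein\nLSPADKTN\nVKAAW\n"
)
print(missing_start(example))
```

The resulting ID list can then be compared against the 5' UTR features in the
GFF3 output to separate the zero-length cases from the rest.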
Having dug in a little deeper on the supporting evidence for one
example, one plausible explanation seems
to be that the choice of CDS start has been influenced by an outlier in
the protein alignments (i.e. one protein whose
alignment start extends a little further upstream than all of the
others). Before I spend more time trying
to reverse engineer the diagnosis of other examples, it seemed worth
sending the list a message to see if this
seems plausible, or maybe there is a simpler explanation for it that
I've overlooked. I can send more specific
details on my example case if it would be helpful.
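
To screen other gene models for the same pattern, one rough diagnostic is to
flag protein alignments whose start lies well upstream of the median start for
that locus. The sketch below is not how MAKER itself chooses the CDS start; the
function name, the 30 bp threshold, and the coordinates are all assumptions made
for illustration, and it only handles plus-strand loci.

```python
from statistics import median

def upstream_outliers(starts, max_offset=30):
    """Return alignment starts lying more than max_offset bp upstream
    of the median start (plus-strand coordinates assumed)."""
    mid = median(starts)
    return [s for s in starts if mid - s > max_offset]

# Hypothetical alignment start coordinates for one locus.
starts = [1510, 1507, 1510, 1513, 1420, 1510]
print(upstream_outliers(starts))  # [1420]
```

A gene model whose CDS start coincides with a flagged coordinate would be a
candidate for the outlier explanation described above.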
thanks in advance for your insights/suggestions
Andrew Farmer
--
...all concepts in which an entire process is semiotically concentrated
elude definition; only that which has no history is definable.
Friedrich Nietzsche