[maker-devel] Annotations from proteins, follow-up
Carson Holt
carsonhh at gmail.com
Mon Mar 24 11:05:15 MDT 2014
It's not so much intentional as it is a limitation of the information in
GFF3 format alignments. Right now protein2genome for eukaryotes will only
try to make exonerate-derived alignments work, because they have been
polished around splice sites and MAKER still has access to the original
protein sequence and alignment CIGAR string for additional filtering, etc.
With GFF3 pass-through, the algorithm doesn't know nearly as much about
what is passed in. For example, the protein sequence is gone, CIGAR
alignment strings are rarely included (the Gap= attribute in GFF3), and
it's not always clear whether the alignment was polished for splice sites.
Also, since protein2genome=1 is expected to be used only to generate an
initial training set, and not for final annotations, this is considered a
reasonable restriction.
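
To see how much alignment detail a given GFF3 file actually carries, a quick
check along these lines may help. This is only a sketch, not part of MAKER:
the function names and the example records are made up for illustration, and
it simply counts how many features have the standard GFF3 Gap= and Target=
attributes.

```python
from urllib.parse import unquote


def parse_attributes(column9):
    """Split a GFF3 attribute column (column 9) into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = unquote(value.strip())
    return attrs


def summarize_alignment_info(gff3_lines):
    """Count features and how many carry Gap= and Target= attributes."""
    counts = {"features": 0, "with_gap": 0, "with_target": 0}
    for line in gff3_lines:
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; no more features
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature line
        attrs = parse_attributes(cols[8])
        counts["features"] += 1
        counts["with_gap"] += "Gap" in attrs
        counts["with_target"] += "Target" in attrs
    return counts


# Hypothetical records, for illustration only.
example = [
    "##gff-version 3\n",
    "chr1\tprotein2genome\tprotein_match\t100\t400\t.\t+\t.\tID=m1;Name=P1\n",
    "chr1\tprotein2genome\tmatch_part\t100\t200\t.\t+\t.\t"
    "ID=m1.1;Parent=m1;Target=P1 1 33;Gap=M33\n",
    "chr1\tprotein2genome\tmatch_part\t300\t400\t.\t+\t.\t"
    "ID=m1.2;Parent=m1;Target=P1 34 67\n",
]
counts = summarize_alignment_info(example)
print(counts)
```

For a real file, pass an open file handle instead of the example list, e.g.
summarize_alignment_info(open("protein2genome.gff")).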
If you still really want to force protein alignments from a GFF3 to be
considered as potential models, you could put them in as pred_gff, in
which case they will always be considered as potential models. Of course,
the result will be relatively ugly because you lack the things I mentioned
before, such as the alignment CIGAR string and original protein sequence,
that are normally used to filter protein2genome results for inclusion as
models.
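
If you keep several control files for parallel runs, the pred_gff setting
could be filled in with a small script such as the one below. Treat it as a
rough sketch: set_ctl_option is a made-up helper, the two control-file lines
are an illustrative excerpt rather than a full maker_opts.ctl, and the file
name is the one from your message. Whether the features in that file need
any conversion before MAKER accepts them through pred_gff is not checked
here.

```python
def set_ctl_option(ctl_text, key, value):
    """Return ctl_text with `key=` set to value, keeping any #comment."""
    out = []
    found = False
    for line in ctl_text.splitlines():
        setting, hash_mark, comment = line.partition("#")
        name, equals, _old_value = setting.partition("=")
        if equals and name.strip() == key:
            # Rebuild the line with the new value and the original comment.
            line = f"{key}={value}" + (f" #{comment}" if hash_mark else "")
            found = True
        out.append(line)
    if not found:
        raise KeyError(f"option not found in control file text: {key}")
    return "\n".join(out) + "\n"


# Illustrative excerpt of a control file (assumed layout: key=value #comment).
ctl = (
    "protein_gff= #aligned protein homology evidence from an external GFF3 file\n"
    "pred_gff= #ab-initio predictions from an external GFF3 file\n"
)
updated = set_ctl_option(ctl, "pred_gff", "protein2genome.gff")
print(updated)
```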
--Carson
On 3/24/14, 4:08 AM, "Marc Höppner" <marc.hoeppner at imbim.uu.se> wrote:
>Hi,
>
>I had previously inquired about protein-based gene building (for example
>to create a training set for SNAP). This is currently possible with Maker
>(2.31), but I noticed a limitation. Specifically, I tend to run Maker
>once to generate all the raw computes (protein and EST alignments,
>mostly). I then separate these out into GFF files that I can store away
>and use in various combinations of settings and data in parallel.
>
>However, the protein2genome option does not seem to work off pre-aligned
>protein data (e.g. protein2genome.gff produced with Maker). Is that
>intentional and is there a work-around? Or is the only option to run this
>with fasta files?
>
>Cheers,
>
>Marc
>
>
>Marc P. Hoeppner, PhD
>
>Department for Medical Biochemistry and Microbiology
>Uppsala University, Sweden
>marc.hoeppner at imbim.uu.se