[maker-devel] model_gff not in output
Michael Thon
mike.thon at gmail.com
Tue Sep 4 02:59:06 MDT 2012
I'm using maker to update a legacy annotation. As input I'm using RNA-Seq transcripts assembled with cufflinks, ESTs provided in FASTA format, and proteins downloaded from UniProt/Swiss-Prot. I have done two runs of maker so far. In the first run I used the legacy annotations in both the model_gff and pred_gff parameters. In the second run I used the legacy annotations in model_gff, and in pred_gff I included gene models created with GeneMark-ES.
In both runs 1 and 2 I have found two genes (so far) that exist in the legacy annotations but are not in the final gene models output by maker. Both genes have overlapping cufflinks annotations, in addition to having annotations in model_gff. I thought maker was supposed to keep all the annotations in model_gff, only replacing those for which it could find an alternative model with better support. Is there any case in which it will remove a model?
Another discrepancy I found in run 1 is a gene that maker 'moved' upstream by approx. 150 bases. The gene locus annotated by maker covers the original annotation, but the CDS does not. The site of the original CDS is covered by an annotation in model_gff, one in pred_gff, two ESTs and a cufflinks annotation. Maker still seems to have moved it upstream, where it only has an overlapping cufflinks annotation. The 3' UTR annotated by cufflinks still covers the legacy annotation, though.
Here's a link to download the maker gff file I'm looking at:
https://dl.dropbox.com/u/320712/supercont1%252E1.gff.zip
The genes that are in the legacy annotation but missing in the maker annotation are:
GLRG_00074 and GLRG_00092
The 'moved' gene model I described is GLRG_00081. They are all within the first 350 kb of sequence.
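
For anyone who wants to look for further cases like the two missing genes, here is a minimal Python sketch that lists gene identifiers present in a legacy GFF3 file but absent from another GFF3 file. It rests on assumptions that are not confirmed anywhere in this thread: that both files carry 'gene' features in column 3, and that the legacy identifier survives in the same attribute (Name here) in the output being compared. The function names and the toy records are hypothetical and exist only for illustration.

```python
import io


def gene_names(handle, attr="Name", source=None):
    """Collect one attribute value from every 'gene' feature in a GFF3 stream.

    attr and source are assumptions: adjust them to whatever the files
    being compared actually use (for example attr="ID").
    """
    names = set()
    for line in handle:
        if line.startswith("##FASTA"):
            break  # an embedded sequence section ends the feature table
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "gene":
            continue
        if source is not None and cols[1] != source:
            continue
        # Column 9 holds semicolon-separated key=value pairs
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        if attr in attrs:
            names.add(attrs[attr])
    return names


def missing_genes(legacy_handle, new_handle, attr="Name", source=None):
    """Return sorted gene names found in the legacy file but not in the new one."""
    legacy = gene_names(legacy_handle, attr)
    new = gene_names(new_handle, attr, source)
    return sorted(legacy - new)


# Toy records (made up, not taken from the real annotation)
legacy_gff = io.StringIO(
    "##gff-version 3\n"
    "chr1\tlegacy\tgene\t100\t900\t.\t+\t.\tID=g1;Name=geneA\n"
    "chr1\tlegacy\tgene\t1500\t2500\t.\t-\t.\tID=g2;Name=geneB\n"
)
new_gff = io.StringIO(
    "##gff-version 3\n"
    "chr1\tmaker\tgene\t100\t900\t.\t+\t.\tID=m1;Name=geneA\n"
)
print(missing_genes(legacy_gff, new_gff))
```

With real files, the two StringIO objects would be replaced by open file handles. If the identifiers are carried in a different attribute in either file, the comparison will report every gene as missing, so it is worth checking a few records by eye first.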
mike