[maker-devel] Use pass-through system to add missing genes
Carson Holt
carsonhh at gmail.com
Wed Apr 25 08:29:01 MDT 2012
The way you proceed depends on why the genes are not there to begin with.
Are they not there because of a lack of evidence? If that's the case, just
adding the new FASTA file should do the trick. Or are they not there
because an assembly error makes it impossible to get a logical model for the
region (i.e. reading frame breaks)? Are there ab initio models already
called in those regions that could just be promoted to the annotation tier?
You can test that one by blasting against the nonoverlaping_abinits.fasta
files.
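One way to turn such a BLAST search into a list of candidate prediction names is sketched below. This is only an illustration under assumptions not stated above: the missing proteins are the queries, the ab initio FASTA is the database, and BLAST was run with the standard 12-column tabular output (-outfmt 6). The function name hit_names and the e-value cutoff are hypothetical placeholders, not MAKER settings.

```python
def hit_names(blast_tab_lines, max_evalue=1e-10):
    """Collect unique subject IDs (ab initio prediction names) from BLAST
    tabular output, keeping hits at or below an e-value cutoff.

    Assumes the standard 12 columns: qseqid sseqid pident length mismatch
    gapopen qstart qend sstart send evalue bitscore.
    The default cutoff is an arbitrary placeholder.
    """
    names = []
    seen = set()
    for line in blast_tab_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        if len(cols) < 12:
            continue  # not a standard tabular hit line
        subject = cols[1]
        evalue = float(cols[10])
        if evalue <= max_evalue and subject not in seen:
            seen.add(subject)
            names.append(subject)  # keep first-seen order
    return names


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python hit_names.py hits.tsv > keep.txt
    with open(sys.argv[1]) as handle:
        for name in hit_names(handle):
            print(name)
```

The output is one name per line, which is the kind of list the selection step described further down expects.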
For any of the cases described, you can provide the existing annotation set
as the input in GFF3 format, and previous models will be maintained
preferentially. If you know which ab initio predictions you want to add
(i.e. the ab initio promoting scenario I described), you can provide those
predictions using the pred_gff option and then set keep_preds=1, and
they will be maintained even without evidence. Attached is a script that
would make selecting those easier. It takes the MAKER-generated GFF3 and a
list of predictions to keep (one name per line). These might be the results
of a BLAST analysis, for example. It will then return the GFF3 entries for
just those models selected.
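For readers without access to the attachment, the selection step can be sketched roughly as follows. This is not the attached gff3_select script, just a minimal illustration of the same idea. It assumes that a listed name matches either the ID or the Name attribute of a top-level feature, and that child features point to it through Parent. The function name select_gff3 is hypothetical.

```python
def select_gff3(gff3_lines, keep_names):
    """Return the GFF3 feature lines for the named models and their children.

    Assumption: a name in keep_names equals the ID or Name attribute of a
    feature; descendants are found by following Parent attributes.
    """
    keep = set(keep_names)
    rows = []
    for line in gff3_lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence section, no more features
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue
        # Parse column 9 (key=value pairs separated by ';')
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        rows.append((line, attrs))

    # Seed with features whose ID or Name was requested
    selected = {
        a["ID"] for _, a in rows
        if "ID" in a and (a["ID"] in keep or a.get("Name") in keep)
    }

    # Pull in descendants until no new IDs are added
    changed = True
    while changed:
        changed = False
        for _, a in rows:
            fid = a.get("ID")
            parents = a.get("Parent", "").split(",")
            if fid and fid not in selected and any(p in selected for p in parents):
                selected.add(fid)
                changed = True

    out = []
    for line, a in rows:
        parents = a.get("Parent", "").split(",")
        if a.get("ID") in selected or any(p in selected for p in parents):
            out.append(line)
    return out
```

The returned lines, written to a file, would then be what gets passed to pred_gff with keep_preds=1 as described above.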
If the situation is more complex, just provide more detail, and I am sure we
can help you come up with a plan.
Thanks,
Carson
From: Anastasia Gioti <anastasia.gioti at scilifelab.se>
Date: Wed, 25 Apr 2012 11:09:36 +0200
To: <maker-devel at yandell-lab.org>
Subject: [maker-devel] Use pass-through system to add missing genes
Hi,
I have a set of predicted proteins from the genome of a fungus annotated by
MAKER using EST data from a closely related species and 3 ab initio
predictors (snap iteratively trained 3 times, genemark trained directly on
the assembly and augustus with a model from a less closely related species),
along with a set of fungal proteins. I am missing ~1000 proteins when I
compare to the species I used EST data from, and there is good evidence from
alignments that these genes exist. The question is how to proceed from BLAST
hits to actual gene models here. The idea would be to add these genes to the
existing dataset, rather than reannotate the genome. I believe that
reannotating it without any further evidence such as RNA-seq from the
species itself would not change much, and I'd rather stick with actual
predictions that I trust and have used in subsequent analyses. I can accept
annotating the 1000 genes in a less stringent and less reliable way than
MAKER; I just want to add them so that the difference in gene count gets
corrected.
I was reading the MAKER2 paper and I was wondering if I can use the legacy
annotations scheme to do it, by providing GFF3 of the alignments between the
two species in the regions where genes were missed, but as I said, I would
not like to reannotate the whole genome, and running MAKER2 might cause
slight changes that I'd like to avoid. Is this possible? First, is it
possible to provide a GFF3 file of specific locations and not the entire
genome alignment? (I guess so.) Second, how can I tag the existing
annotations as 'not to be changed' or, alternatively, tag the new models
only? How should I run MAKER2, with which predictors on and which off?
Thanks,
Anastasia
Anastasia Gioti
Post-doctoral Researcher
anastasia.gioti at scilifelab.se
anastasia.gioti at ebc.uu.se
http://www.ebc.uu.se/Research/IEG/evbiol/people/pages/Gioti_Anastasia/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gff3_select
Type: application/octet-stream
Size: 3067 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20120425/99e4eb79/attachment-0003.obj>