[maker-devel] Combining and merging two Maker annotation gff files ?

Carson Holt carsonhh at gmail.com
Sun Oct 23 17:25:34 MDT 2016


It’s unfortunate that the archived GMOD post is gone, because I always used it for my own reference. If I remember right, the main point was that Jason Stajich wrote a tool to convert SNAP’s ZFF format to a GenBank format suitable for Augustus training. This means you can use the maker2zff script that comes with MAKER to export gene models, then use Jason’s tool to convert them for Augustus training.

Tool to convert SNAP training ZFF to an Augustus training input file:
https://github.com/hyphaltip/genome-scripts/blob/master/gene_prediction/zff2augustus_gbk.pl


Since the post is gone, you could use the documentation provided with his tool, together with a generic Augustus training guide like the following, to design a path forward:
http://www.molecularevolution.org/molevolfiles/exercises/augustus/training.html
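
On the question of filtering the first-round GFF, here is a minimal sketch of one possible approach. It is not taken from the lost GMOD post. It assumes the GFF3 was written by MAKER, so mRNA features carry an _AED attribute in column 9, and it uses an arbitrary cutoff of 0.25. The function names and the cutoff are illustrative only, and the sketch does not check whether a model is full length (start and stop codon present).

```python
def parse_attributes(column9):
    """Split a GFF3 attribute column (key=value pairs joined by ';') into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key] = value
    return attrs


def low_aed_mrna_ids(gff_lines, max_aed=0.25):
    """Return IDs of MAKER mRNA features whose _AED is at or below max_aed.

    max_aed=0.25 is an assumed cutoff, not a recommendation from this thread.
    """
    keep = []
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # the sequence section follows; no more features
        if line.startswith("#") or not line.strip():
            continue  # skip comments, directives and blank lines
        cols = line.rstrip("\n").split("\t")
        # Only consider final MAKER annotations, not evidence alignments
        if len(cols) != 9 or cols[1] != "maker" or cols[2] != "mRNA":
            continue
        attrs = parse_attributes(cols[8])
        if "ID" in attrs and "_AED" in attrs and float(attrs["_AED"]) <= max_aed:
            keep.append(attrs["ID"])
    return keep


if __name__ == "__main__":
    # Hypothetical two-transcript example, made up for illustration
    sample = [
        "##gff-version 3\n",
        "scaf1\tmaker\tmRNA\t100\t900\t.\t+\t.\tID=g1-mRNA-1;Parent=g1;_AED=0.10;_eAED=0.10\n",
        "scaf1\tmaker\tmRNA\t2000\t2900\t.\t-\t.\tID=g2-mRNA-1;Parent=g2;_AED=0.60;_eAED=0.62\n",
    ]
    print(low_aed_mrna_ids(sample))
```

The returned IDs could be used to subset the GFF3 before exporting to ZFF. maker2zff applies its own quality filters, so check its usage message first; this extra step may not be needed.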

—Carson


> On Oct 12, 2016, at 3:44 AM, chebbi mohamed amine <mohamed.amine.chebbi at univ-poitiers.fr> wrote:
> 
> Thank you Carson for your quick response. Sorry, I have another question concerning Augustus training. You previously posted on the mailing list a link to an explanation of the Augustus training steps: http://brie4.cshl.edu/pipermail/gmod-help/2012-June/001724.html. Unfortunately, the link doesn't work anymore. Otherwise, could you explain how to filter the GFF file produced by the first MAKER run to get the best full-length ORFs as a set of gene models to train Augustus?
> 
> Best,
> Amine
> 
> From: "Carson Holt" <carsonhh at gmail.com>
> To: "Mohamed Amine CHEBBI" <mohamed.amine.chebbi at univ-poitiers.fr>
> Cc: maker-devel at yandell-lab.org
> Sent: Tuesday, October 11, 2016 22:05:50
> Subject: Re: [maker-devel] Combining and merging two Maker annotation gff files ?
> 
> Masking doesn’t just affect the gene models; it also affects evidence alignment and thus scoring. So merging in this way would not make much sense: the second, less masked set would always score better because the lack of masking permits more evidence alignments (not necessarily real ones, but ones drawn in by repeats).
> 
> As a result, any attempted merge would almost exclusively select genes from the second set.
> 
> —Carson
> 
> 
> 
> On Oct 10, 2016, at 3:43 AM, Mohamed Amine CHEBBI <mohamed.amine.chebbi at univ-poitiers.fr> wrote:
> Hi! 
> 
> I’m using the latest version of MAKER2 to annotate an arthropod genome. First, I ran RepeatModeler to create the rmlib for MAKER, then I followed two independent annotation strategies on the same assembly:
> 1- Passing to MAKER all the repeats collected by RepeatModeler (repeats identified in Repbase + unknown models).
> 2- Passing to MAKER only the identified repeats.
> 
> Both annotations ran successfully. The first annotation gives 19048 genes versus 22931 for the second. Now I'm looking for a way to merge the two annotation GFF files without re-annotating, keeping the best-supported, non-redundant gene models.
> 
> So, do you think that configuring the MAKER options as below could resolve this issue:
> maker_gff=1-mask-all.gff,2-mask-onlyKnown.gff #MAKER derived GFF3 file
> est_pass=1 #use ESTs in maker_gff: 1 = yes, 0 = no
> altest_pass=0 #use alternate organism ESTs in maker_gff: 1 = yes, 0 = no
> protein_pass=1 #use protein alignments in maker_gff: 1 = yes, 0 = no
> rm_pass=1 #use repeats in maker_gff: 1 = yes, 0 = no
> model_pass=0 #use gene models in maker_gff: 1 = yes, 0 = no
> pred_pass=1 #use ab-initio predictions in maker_gff: 1 = yes, 0 = no
> other_pass=0 #passthrough anything else in maker_gff: 1 = yes, 0 = no
> 
> -- 
> Mohamed Amine CHEBBI, PhD Student
> Université de Poitiers
