[maker-devel] Combining and merging two Maker annotation gff files ?

chebbi mohamed amine mohamed.amine.chebbi at univ-poitiers.fr
Wed Oct 26 02:32:52 MDT 2016


Thank you very much for your help. 

Best, 

Mohamed 


From: "Xabier Vázquez-Campos" <xvazquezc at gmail.com> 
To: "Carson Holt" <carsonhh at gmail.com> 
Cc: "chebbi mohamed amine" <mohamed.amine.chebbi at univ-poitiers.fr>, "Maker Mailing List" <maker-devel at yandell-lab.org> 
Sent: Monday 24 October 2016 01:49:53 
Subject: Re: [maker-devel] Combining and merging two Maker annotation gff files ? 

If it's of any help, I had these notes in my old protocol (before I started to do the training with BUSCO): 



For Augustus, we need the script "zff2augustus_gbk.pl". It takes the export.dna generated by fathom and produces a *.gb file to use as the "training gene structure file" in a new WebAugustus training submission. Remember to give the submission a new name, e.g. MYGENOME_v2; if you reuse the old name, Maker won't see the difference: 
perl PATH/TO/SCRIPT/zff2augustus_gbk.pl > MYGENOME.train.gb 
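
Before uploading the *.gb file, it can be worth checking how many gene records it actually contains. The following is a minimal sketch, not part of the original protocol: it assumes the script writes standard GenBank flat-file output, in which every record starts with a LOCUS line. The function name count_genbank_records is made up for this illustration.

```python
import sys


def count_genbank_records(lines):
    """Count GenBank records by counting lines that start with 'LOCUS'.

    Assumption: the training file is standard GenBank flat-file format,
    where each record begins with exactly one LOCUS header line.
    """
    return sum(1 for line in lines if line.startswith("LOCUS"))


if __name__ == "__main__":
    # Usage (file name taken from the command above):
    #   python count_records.py MYGENOME.train.gb
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            print(count_genbank_records(handle))
```

A count of zero, or one far below the number of genes exported by fathom, would suggest the conversion did not run as expected.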



As said, you could also do the training with BUSCO using the --long option. It has a dataset specific to arthropods. But if you have EST data you'll probably do better with the other method, as it lets you supply the ESTs for a more accurate training. 

On 24 October 2016 at 10:25, Carson Holt < carsonhh at gmail.com > wrote: 

BQ_BEGIN

It’s unfortunate the archived GMOD post is gone, because I always used it for my own reference. If I remember right, the main point was that Jason Stajich wrote a tool to convert Snap’s ZFF format to a Genbank format suitable for Augustus training. This meant you could use the maker2zff script that came with MAKER, then use Jason’s tool to convert for Augustus training. 

Tool to convert SNAP training ZFF to an Augustus training input file: 
https://github.com/hyphaltip/genome-scripts/blob/master/gene_prediction/zff2augustus_gbk.pl 


Since the post is gone, you could use the documentation provided with his tool, and then maybe a generic Augustus training guide like the following, to design a path forward: 
http://www.molecularevolution.org/molevolfiles/exercises/augustus/training.html 
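
For the question of picking well-supported models out of the first MAKER run, one option is to look at the _AED attribute that MAKER writes on its mRNA features (lower values mean better agreement with the evidence). The following is only a sketch under stated assumptions: the 0.25 cutoff is an arbitrary illustrative value, the function names are made up here, and the code does not check whether a model is a full-length ORF. It is not a replacement for maker2zff.

```python
def parse_attributes(column9):
    """Turn a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key] = value
    return attrs


def confident_mrna_ids(gff_lines, max_aed=0.25):
    """Return IDs of MAKER mRNA features with _AED <= max_aed.

    max_aed=0.25 is an assumed example cutoff, not a recommended value.
    """
    keep = []
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence section; no features after this
        if line.startswith("#"):
            continue  # header or comment line
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[1] != "maker" or cols[2] != "mRNA":
            continue
        attrs = parse_attributes(cols[8])
        aed = attrs.get("_AED")
        if aed is not None and float(aed) <= max_aed:
            keep.append(attrs.get("ID"))
    return keep
```

The returned IDs could then be used to subset the GFF before converting it for training.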

—Carson 



BQ_BEGIN

On Oct 12, 2016, at 3:44 AM, chebbi mohamed amine < mohamed.amine.chebbi at univ-poitiers.fr > wrote: 

Thank you Carson for your quick response. Sorry, I have another question concerning Augustus training. You previously posted on the mailing list a link to an explanation of the Augustus training steps: http://brie4.cshl.edu/pipermail/gmod-help/2012-June/001724.html . Unfortunately the link doesn't work anymore. Otherwise, could you explain how to filter the gff file produced by the first run of Maker to get the best full-length ORFs as a set of gene models to train Augustus? 

Best, 
Amine 





From: "Carson Holt" < carsonhh at gmail.com > 
To: "Mohamed Amine CHEBBI" < mohamed.amine.chebbi at univ-poitiers.fr > 
Cc: maker-devel at yandell-lab.org 
Sent: Tuesday 11 October 2016 22:05:50 
Subject: Re: [maker-devel] Combining and merging two Maker annotation gff files ? 

Masking doesn’t just affect the gene models; it also affects evidence alignment and thus scoring. So merging in this way would not make much sense: the second, less-masked set would always score better because the lack of masking permits more evidence alignments (not necessarily real ones, but drawn in by repeats). 
The result would be that almost any attempted merge would end up choosing the genes from the second set, because they always score higher. 

—Carson 




BQ_BEGIN

On Oct 10, 2016, at 3:43 AM, Mohamed Amine CHEBBI < mohamed.amine.chebbi at univ-poitiers.fr > wrote: 


Hi! 

I’m using the latest version of Maker2 to annotate an arthropod genome. First, I ran RepeatModeler to create the rmlib for Maker, then I followed two independent annotation strategies on the same assembly: 
1- Passing all the repeats collected by RepeatModeler through Maker (repeats identified in Repbase + unknown models). 
2- Passing only the identified repeats through Maker. 

Both annotations ran successfully. The first annotation gives me 19048 genes, against 22931 from the second one. Now, I'm looking for a way to merge the two annotation gff files without re-annotating, keeping the best-supported, non-redundant gene models. 
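
Gene totals like the two above can be re-derived directly from each gff file. The following is a minimal sketch, assuming MAKER's usual GFF3 output in which annotated genes carry "maker" in the source column and "gene" in the type column; the function name count_maker_genes is made up for this illustration.

```python
import sys


def count_maker_genes(gff_lines):
    """Count GFF3 features with source 'maker' and type 'gene'."""
    n = 0
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence section; no features after this
        if line.startswith("#"):
            continue  # header or comment line
        cols = line.split("\t")
        if len(cols) >= 3 and cols[1] == "maker" and cols[2] == "gene":
            n += 1
    return n


if __name__ == "__main__":
    # Usage: python count_genes.py first.gff second.gff
    for path in sys.argv[1:]:
        with open(path) as handle:
            print(path, count_maker_genes(handle))
```

Running it on both annotation files gives the per-strategy gene counts side by side.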

So, do you think that configuring the maker options as below, could resolve this issue : 
maker_gff=1-mask-all.gff,2-mask-onlyKnown.gff #MAKER derived GFF3 file 
est_pass=1 #use ESTs in maker_gff: 1 = yes, 0 = no 
altest_pass=0 #use alternate organism ESTs in maker_gff: 1 = yes, 0 = no 
protein_pass=1 #use protein alignments in maker_gff: 1 = yes, 0 = no 
rm_pass=1 #use repeats in maker_gff: 1 = yes, 0 = no 
model_pass=0 #use gene models in maker_gff: 1 = yes, 0 = no 
pred_pass=1 #use ab-initio predictions in maker_gff: 1 = yes, 0 = no 
other_pass=0 #passthrough anyything else in maker_gff: 1 = yes, 0 = no 
-- 
Mohamed Amine CHEBBI, PhD Student
Université de Poitiers 
_______________________________________________ 
maker-devel mailing list 
maker-devel at box290.bluehost.com 
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org 

BQ_END




BQ_END





BQ_END




-- 
Xabier Vázquez-Campos, PhD 
Research Associate 
Water Research Centre 
School of Civil and Environmental Engineering 
The University of New South Wales 
Sydney NSW 2052 AUSTRALIA 
