[maker-devel] FW: Problem training Augustus
p sz
seoanezonjic at hotmail.com
Tue May 21 05:25:33 MDT 2019
Hi Xabier,
I've changed from zff2genbank.pl to zff2augustus_gbk.pl and the problem is fixed. I used zff2genbank.pl because it is packaged with the MAKER suite. I think the MAKER authors should include zff2augustus_gbk.pl in the main suite; I didn't know about the genome-scripts repository.
By the way, I would like to show you my training steps with Augustus, to check whether they are correct:
zff2augustus_gbk.pl export.ann export.dna > final_genes.gb
randomSplit.pl final_genes.gb 500
new_species.pl --species=Demo
etraining --species=Demo final_genes.gb
optimize_augustus.pl --species=Demo --onlytrain=final_genes.gb.train final_genes.gb.test
etraining --species=Demo final_genes.gb
I took the training parameters (except the value 500) from the train_augustus.pl script included in the MAKER suite.
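For reference, the steps above can be written out as a small dry-run script. This is only a sketch that restates the commands from this message in the same order; it prints them rather than running them, and it does not judge whether the procedure is the recommended one. The function names `augustus_training_commands` and `render` are hypothetical.

```python
import shlex


def augustus_training_commands(species="Demo", ann="export.ann", dna="export.dna",
                               genbank="final_genes.gb", test_size=500):
    """Return the steps from this message as (argv, stdout_file) pairs."""
    return [
        # Convert the SNAP ZFF export to GenBank; stdout is redirected to `genbank`.
        (["zff2augustus_gbk.pl", ann, dna], genbank),
        # Split into <genbank>.test (test_size entries) and <genbank>.train.
        (["randomSplit.pl", genbank, str(test_size)], None),
        (["new_species.pl", f"--species={species}"], None),
        (["etraining", f"--species={species}", genbank], None),
        (["optimize_augustus.pl", f"--species={species}",
          f"--onlytrain={genbank}.train", f"{genbank}.test"], None),
        (["etraining", f"--species={species}", genbank], None),
    ]


def render(steps):
    """Format each step as a shell command line, adding '> file' where needed."""
    lines = []
    for argv, out in steps:
        line = shlex.join(argv)
        if out is not None:
            line += " > " + shlex.quote(out)
        lines.append(line)
    return lines


if __name__ == "__main__":
    print("\n".join(render(augustus_training_commands())))
```

Printing the commands first makes it easy to compare them with train_augustus.pl before running anything.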
Thank you in advance
Pedro Seoane
________________________________
From: Xabier Vázquez-Campos <xvazquezc at gmail.com>
Sent: Thursday, 16 May 2019 4:42
To: p sz
Cc: maker-devel at yandell-lab.org
Subject: Re: [maker-devel] FW: Problem training Augustus
Hi Pedro,
I checked some of my files, and a model listed in inverse order causes no problem; the gb files generated look fine.
You need to use zff2augustus_gbk.pl, not zff2genbank.pl. I don't remember the differences, but I know that zff2augustus_gbk.pl works for sure.
Link: https://github.com/hyphaltip/genome-scripts/blob/master/gene_prediction/zff2augustus_gbk.pl
Cheers,
Xabi
On Tue, 14 May 2019 at 18:12, p sz <seoanezonjic at hotmail.com> wrote:
Hi MAKER authors,
I have been using MAKER for many years, and recently I tried to train Augustus using the SNAP training files. To do this, I used the train_augustus.pl script as follows:
zff2genbank.pl export.ann export.dna > final_genes.gb
train_augustus.pl final_genes.gb MyOrg
For each of my gene models the error is the following:
Constructing GenBank feature: Feature begins after it ends: 1006..1001,2051..1917,7791..7689,7993..7880,8485..8374,8775..8628,9050..8873,10467..10459,13315..11920,13598..13511,14971..14945,18637..18471,18898..18821,20558..20389,21067..20923,23249..23004,23549..23354,23647..23624
GBProcessor::getGeneList(): GBFeature constructor:Format error when reading genbank format.
Encountered error after reading 0 annotations.
The export files were generated with SNAP as described in your reference guides (two MAKER rounds). The issue seems to be related to the sense (strand) of the gene model, which can be inspected here:
(export.ann file)
>MODEL236
Eterm 23624 23647 MODEL236
Exon 23354 23549 MODEL236
Exon 23004 23249 MODEL236
Exon 20923 21067 MODEL236
Exon 20389 20558 MODEL236
Exon 18821 18898 MODEL236
Exon 18471 18637 MODEL236
Exon 14945 14971 MODEL236
Exon 13511 13598 MODEL236
Exon 11920 13315 MODEL236
Exon 10459 10467 MODEL236
Exon 8873 9050 MODEL236
Exon 8628 8775 MODEL236
Exon 8374 8485 MODEL236
Exon 7880 7993 MODEL236
Exon 7689 7791 MODEL236
Exon 1917 2051 MODEL236
Einit 1001 1006 MODEL236
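A record like this one can also be inspected programmatically. The following is a minimal Python sketch, assuming the four-column layout shown above (label, begin, end, model name) under a `>` header line; the names `parse_zff` and `describe` are hypothetical and are not part of MAKER or SNAP.

```python
def parse_zff(text):
    """Parse ZFF text into {sequence_name: [(label, begin, end, model), ...]}."""
    records = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            records[current] = []
            continue
        if current is None:
            raise ValueError("feature line before any '>' header")
        label, begin, end, model = line.split()[:4]
        records[current].append((label, int(begin), int(end), model))
    return records


def describe(features):
    """Summarise coordinate direction and listing order of one record."""
    begin_after_end = sum(1 for f in features if f[1] > f[2])
    starts = [min(f[1], f[2]) for f in features]
    if starts == sorted(starts):
        order = "ascending"
    elif starts == sorted(starts, reverse=True):
        order = "descending"
    else:
        order = "mixed"
    return {"begin_after_end": begin_after_end, "listing_order": order}


if __name__ == "__main__":
    sample = (
        ">MODEL236\n"
        "Eterm 23624 23647 MODEL236\n"
        "Exon 23354 23549 MODEL236\n"
        "Exon 1917 2051 MODEL236\n"
        "Einit 1001 1006 MODEL236\n"
    )
    for name, feats in parse_zff(sample).items():
        print(name, describe(feats))
```

For the excerpt above, every feature has its begin before its end, and the features are listed from the highest coordinate to the lowest.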
(GenBank file)
LOCUS MODEL236 24647 bp dna linear UNK
ACCESSION unknown
FEATURES Location/Qualifiers
source 1..24647
CDS complement(join(1006..1001,2051..1917,7791..7689,
7993..7880,8485..8374,8775..8628,9050..8873,10467..10459,
13315..11920,13598..13511,14971..14945,18637..18471,
18898..18821,20558..20389,21067..20923,23249..23004,
23549..23354,23647..23624))
It seems that Augustus needs each segment of the gene model written in the direct sense (start before end) in order to read the gb file and perform the training. How can I fix the problem?
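The error message points at segments such as 1006..1001, where the start is larger than the end. As an illustration of what a consistently ordered location string looks like, here is a minimal Python sketch that rewrites each segment as low..high and sorts the segments in ascending order. The function name `normalise_location` is hypothetical, and whether the `complement(...)` wrapper is right for this gene depends on the real strand, which the sketch does not decide. In this thread the practical fix was to generate the file with zff2augustus_gbk.pl instead.

```python
import re

# One 'start..end' segment inside a GenBank-style location string.
SEGMENT = re.compile(r"(\d+)\.\.(\d+)")


def normalise_location(location, complement=False):
    """Rewrite 'a..b,c..d,...' so every segment is low..high, in ascending order."""
    segments = [tuple(sorted((int(a), int(b)))) for a, b in SEGMENT.findall(location)]
    if not segments:
        raise ValueError("no 'start..end' segments found")
    segments.sort()
    body = ",".join(f"{lo}..{hi}" for lo, hi in segments)
    if len(segments) > 1:
        body = f"join({body})"
    return f"complement({body})" if complement else body


if __name__ == "__main__":
    bad = "1006..1001,2051..1917,7791..7689"
    print(normalise_location(bad, complement=True))
```

A check like this is mainly useful for spotting which records in a gb file would trigger the "Feature begins after it ends" error before handing the file to the training scripts.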
Thank you in advance
Pedro Seoane
--
Xabier Vázquez-Campos, PhD
Research Associate
NSW Systems Biology Initiative
School of Biotechnology and Biomolecular Sciences
The University of New South Wales
Sydney NSW 2052 AUSTRALIA