[maker-devel] RV: Problem training AUGUSTUS

p sz seoanezonjic at hotmail.com
Tue May 14 02:12:09 MDT 2019


Hi MAKER authors,
I have been using MAKER for many years, and I recently tried to train AUGUSTUS using the SNAP training files. To do this, I used the train_augustus.pl script as follows:

zff2genbank.pl export.ann export.dna > final_genes.gb
train_augustus.pl final_genes.gb MyOrg

For each of my gene models the error is the following:

Constructing GenBank feature: Feature begins after it ends: 1006..1001,2051..1917,7791..7689,7993..7880,8485..8374,8775..8628,9050..8873,10467..10459,13315..11920,13598..13511,14971..14945,18637..18471,18898..18821,20558..20389,21067..20923,23249..23004,23549..23354,23647..23624
GBProcessor::getGeneList(): GBFeature constructor:Format error when reading genbank format.
Encountered error after reading 0 annotations.

The export files were generated with SNAP as described in your reference guides (two MAKER rounds). The issue seems to be related to the strand (sense) of the gene model, which can be inspected here:
(export.ann file)
>MODEL236
Eterm   23624   23647   MODEL236
Exon    23354   23549   MODEL236
Exon    23004   23249   MODEL236
Exon    20923   21067   MODEL236
Exon    20389   20558   MODEL236
Exon    18821   18898   MODEL236
Exon    18471   18637   MODEL236
Exon    14945   14971   MODEL236
Exon    13511   13598   MODEL236
Exon    11920   13315   MODEL236
Exon    10459   10467   MODEL236
Exon    8873    9050    MODEL236
Exon    8628    8775    MODEL236
Exon    8374    8485    MODEL236
Exon    7880    7993    MODEL236
Exon    7689    7791    MODEL236
Exon    1917    2051    MODEL236
Einit   1001    1006    MODEL236

(GenBank file)
LOCUS       MODEL236               24647 bp    dna     linear   UNK
ACCESSION   unknown
FEATURES             Location/Qualifiers
     source          1..24647
     CDS             complement(join(1006..1001,2051..1917,7791..7689,
                     7993..7880,8485..8374,8775..8628,9050..8873,10467..10459,
                     13315..11920,13598..13511,14971..14945,18637..18471,
                     18898..18821,20558..20389,21067..20923,23249..23004,
                     23549..23354,23647..23624))
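
To make the symptom concrete, here is a minimal Python sketch (not part of MAKER, SNAP, or AUGUSTUS) that lists the intervals in a location string whose begin is greater than their end, and rewrites each one as start..end. The function names are hypothetical. It is only an assumption, not a confirmed fix, that AUGUSTUS accepts the location once every interval is ordered this way.

```python
import re

# Matches one GenBank-style interval such as "1006..1001".
RANGE = re.compile(r"(\d+)\.\.(\d+)")


def reversed_ranges(location):
    """Return the (begin, end) pairs in `location` where begin > end."""
    return [
        (int(a), int(b))
        for a, b in RANGE.findall(location)
        if int(a) > int(b)
    ]


def normalize_ranges(location):
    """Rewrite every interval so the smaller coordinate comes first.

    Only the numbers inside each "a..b" pair are swapped; the
    complement(...) and join(...) wrappers are left untouched.
    """
    def fix(match):
        lo, hi = sorted((int(match.group(1)), int(match.group(2))))
        return f"{lo}..{hi}"

    return RANGE.sub(fix, location)


if __name__ == "__main__":
    # First, second and last intervals of the CDS line shown above.
    loc = "complement(join(1006..1001,2051..1917,23647..23624))"
    print(reversed_ranges(loc))
    print(normalize_ranges(loc))
```

Applied to the shortened location in the example, the first call reports all three intervals as reversed, and the second prints complement(join(1001..1006,1917..2051,23624..23647)).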

It seems that AUGUSTUS needs each interval of the gene model to be written in forward order (begin before end) in order to read the GenBank file and perform the training. How can I fix this problem?
Thank you in advance
Pedro Seoane