[maker-devel] RV: Problem training agustus

Xabier Vázquez-Campos xvazquezc at gmail.com
Wed May 15 22:42:20 MDT 2019


Hi Pedro,
I checked some of my files and a model listed in inverse order causes no
issues; the gb files generated look fine.
You need to use zff2augustus_gbk.pl, not zff2genbank.pl. I don't remember
the differences, but I know that zff2augustus_gbk.pl works for sure.
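
A quick way to confirm whether a generated gb file has this problem is to
scan its location strings for ranges whose start is greater than their end,
which is what the "Feature begins after it ends" message complains about.
The following is only a minimal Python sketch: the name reversed_ranges is
made up for illustration and is not part of MAKER, SNAP or AUGUSTUS.

```python
import re

# Matches one "begin..end" range inside a GenBank location string.
RANGE_RE = re.compile(r"(\d+)\.\.(\d+)")


def reversed_ranges(location):
    """Return the (begin, end) pairs in `location` where begin > end."""
    pairs = [(int(b), int(e)) for b, e in RANGE_RE.findall(location)]
    return [(b, e) for b, e in pairs if b > e]


if __name__ == "__main__":
    # Start of the CDS location from the failing file quoted in this thread.
    bad = "complement(join(1006..1001,2051..1917,7791..7689))"
    print(reversed_ranges(bad))
```

An empty list means every range is written low..high; any pair it returns
is a range that a strict GenBank reader is likely to reject.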

Link:
https://github.com/hyphaltip/genome-scripts/blob/master/gene_prediction/zff2augustus_gbk.pl
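
For reference, a well-formed GenBank location writes every exon as
low..high, lists the exons in ascending order, and expresses the strand
only through complement(). The Python sketch below illustrates that idea
under stated assumptions; it is not the logic of either Perl script.
zff_to_location is a hypothetical helper, and it assumes the ZFF convention
that a feature with begin > end lies on the minus strand.

```python
def zff_to_location(zff_lines):
    """Build a GenBank location string from the lines of one ZFF model.

    Each feature line is "<label> <begin> <end> <group>".
    Assumption: begin > end marks a minus-strand feature.
    """
    exons = []
    minus = False
    for line in zff_lines:
        fields = line.split()
        if len(fields) < 3 or line.startswith(">"):
            continue  # skip the ">MODEL" header and blank lines
        begin, end = int(fields[1]), int(fields[2])
        if begin > end:
            minus = True
        # Always store the range as (low, high), whatever the input order.
        exons.append((min(begin, end), max(begin, end)))
    if not exons:
        raise ValueError("no exon features found")
    exons.sort()  # ascending order along the sequence
    ranges = ",".join(f"{lo}..{hi}" for lo, hi in exons)
    location = f"join({ranges})" if len(exons) > 1 else ranges
    return f"complement({location})" if minus else location


if __name__ == "__main__":
    # Three of the exons of MODEL236, in the order they appear in export.ann.
    model = [
        ">MODEL236",
        "Eterm 23624 23647 MODEL236",
        "Exon 1917 2051 MODEL236",
        "Einit 1001 1006 MODEL236",
    ]
    print(zff_to_location(model))
```

For the three exons shown, this prints
join(1001..1006,1917..2051,23624..23647): the exon order in the input does
not matter, and no range ends up with its start after its end.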
Cheers,
Xabi

On Tue, 14 May 2019 at 18:12, p sz <seoanezonjic at hotmail.com> wrote:

> Hi Maker author,
> I have been using Maker for many years and recently I tried to train
> AUGUSTUS using the SNAP training files. To do this, I used the
> train_augustus.pl script as follows:
>
> zff2genbank.pl export.ann export.dna > final_genes.gb
> train_augustus.pl final_genes.gb MyOrg
>
> For each of my gene models the error is the following:
>
> Constructing GenBank feature: Feature begins after it ends:
> 1006..1001,2051..1917,7791..7689,7993..7880,8485..8374,8775..8628,9050..8873,10467..10459,13315..11920,13598..13511,14971..14945,18637..18471,18898..18821,20558..20389,21067..20923,23249..23004,23549..23354,23647..23624
> GBProcessor::getGeneList(): GBFeature constructor:Format error when
> reading genbank format.
> Encountered error after reading 0 annotations.
>
> The export files were generated with SNAP as described in your reference
> guides (two Maker rounds). The issue seems related to the sense of the
> gene model, which can be inspected here:
> (export.ann file)
> >MODEL236
> Eterm   23624   23647   MODEL236
> Exon    23354   23549   MODEL236
> Exon    23004   23249   MODEL236
> Exon    20923   21067   MODEL236
> Exon    20389   20558   MODEL236
> Exon    18821   18898   MODEL236
> Exon    18471   18637   MODEL236
> Exon    14945   14971   MODEL236
> Exon    13511   13598   MODEL236
> Exon    11920   13315   MODEL236
> Exon    10459   10467   MODEL236
> Exon    8873    9050    MODEL236
> Exon    8628    8775    MODEL236
> Exon    8374    8485    MODEL236
> Exon    7880    7993    MODEL236
> Exon    7689    7791    MODEL236
> Exon    1917    2051    MODEL236
> Einit   1001    1006    MODEL236
>
> (GenBank file)
> LOCUS       MODEL236               24647 bp    dna     linear   UNK
> ACCESSION   unknown
> FEATURES             Location/Qualifiers
>      source          1..24647
>      CDS             complement(join(1006..1001,2051..1917,7791..7689,
>                      7993..7880,8485..8374,8775..8628,9050..8873,10467..10459,
>                      13315..11920,13598..13511,14971..14945,18637..18471,
>                      18898..18821,20558..20389,21067..20923,23249..23004,
>                      23549..23354,23647..23624))
>
> It seems that AUGUSTUS needs the direct-sense description of the gene
> model in order to read the gb file and perform the training. How could I
> fix the problem?
> Thank you in advance
> Pedro Seoane


-- 
Xabier Vázquez-Campos, *PhD*
*Research Associate*
NSW Systems Biology Initiative
School of Biotechnology and Biomolecular Sciences
The University of New South Wales
Sydney NSW 2052 AUSTRALIA