[maker-devel] duplicate CDS in annotation
Carson Holt
carsonhh at gmail.com
Mon Mar 11 07:02:13 MDT 2013
I think the issue is that you are getting a match feature that is being
printed with the same ID as the mRNA feature. Correct?
What version of MAKER are you using, and what does the file you are giving
to pred_gff or model_gff look like? Could you send them?
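To check whether that is what is happening, something like the following could list every ID that shows up on more than one feature type. This is only a rough sketch and not part of MAKER: the function names are made up for illustration, it assumes a tab-delimited GFF3 file with nine columns, and the example rows use shortened stand-in names (`scf1`, `gene-5.29`) rather than the real ones.

```python
from collections import defaultdict

def parse_attributes(column9):
    """Split a GFF3 attribute column into a dict of key -> value."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs

def ids_shared_across_types(gff_lines):
    """Return {ID: set of feature types} for IDs seen on more than one type."""
    types_by_id = defaultdict(set)
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence section, no more feature rows
        cols = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(cols) != 9:
            continue  # skip comments, directives, and malformed rows
        feature_id = parse_attributes(cols[8]).get("ID")
        if feature_id:
            types_by_id[feature_id].add(cols[2])
    return {fid: types for fid, types in types_by_id.items() if len(types) > 1}

# Shortened, hypothetical rows modelled on the ones in the post.
example = [
    "scf1\tprotein2genome\tprotein_match\t172004\t172162\t150\t-\t.\tID=scf1:hit:2506;Name=Hsal|HS9704;",
    "scf1\tmaker\tmRNA\t538308\t558769\t.\t+\t.\tID=scf1:hit:2506;Parent=gene-5.29;",
    "scf1\tmaker\texon\t538308\t538334\t0.01\t+\t.\tID=scf1:hit:2506:exon:305;Parent=scf1:hit:2506;",
]
for feature_id, types in ids_shared_across_types(example).items():
    print(feature_id, sorted(types))
```

For a real file, pass an open file handle instead of the list, e.g. `ids_shared_across_types(open("Pbar.2.0.gff"))`.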
Thanks,
Carson
From: Barry Moore <barry.moore at genetics.utah.edu>
Date: Monday, 11 March, 2013 7:32 AM
To: Sasha Mikheyev <mikheyev at gmail.com>
Cc: <maker-devel at yandell-lab.org>
Subject: Re: [maker-devel] duplicate CDS in annotation
Hi Sasha,
This gene model appears to be correctly formatted to me. In GFF3 format the
CDS features are allowed to span multiple lines, and they share the same ID
to indicate that they are all parts of the same feature. See the GFF3 specification on
the Sequence Ontology website
(http://www.sequenceontology.org/resources/gff3.html), and in particular the
description of the ID attribute specifies:
> ID Indicates the ID of the feature. IDs for each feature must be unique
> within the scope of the GFF file. In the case of discontinuous features (i.e.
> a single feature that exists over multiple genomic locations) the same ID may
> appear on multiple lines. All lines that share an ID collectively represent a
> single feature.
So each of those CDS lines forms one part of the single CDS feature for this
gene.
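When extracting sequence, the CDS rows therefore need to be collected and joined rather than written out one by one. Here is a minimal sketch of that idea, under a few assumptions: the GFF3 is tab-delimited, the genome is already loaded as a dict of sequence strings, and the function names are hypothetical. It groups on the Parent attribute by default, which is my assumption about what is convenient here; pass `key="ID"` for files where the segments share an ID. Comma-separated multi-parent values are not handled.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def get_attribute(column9, key):
    """Return the value of one attribute from a GFF3 attribute column, or None."""
    for field in column9.strip().split(";"):
        name, sep, value = field.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None

def cds_segments(gff_lines, key="Parent"):
    """Group CDS rows by the chosen attribute: {value: [(seqid, start, end, strand)]}."""
    groups = defaultdict(list)
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence section, no more feature rows
        cols = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(cols) != 9 or cols[2] != "CDS":
            continue
        group = get_attribute(cols[8], key)
        if group is not None:
            groups[group].append((cols[0], int(cols[3]), int(cols[4]), cols[6]))
    return groups

def spliced_cds(segments, genome):
    """Join the segments of one CDS in genomic order; reverse-complement if on '-'."""
    strand = segments[0][3]
    ordered = sorted(segments, key=lambda seg: seg[1])
    # GFF3 coordinates are 1-based and inclusive.
    seq = "".join(genome[seqid][start - 1:end] for seqid, start, end, _ in ordered)
    if strand == "-":
        seq = seq.translate(_COMPLEMENT)[::-1]
    return seq

# Toy data, not taken from the real assembly.
toy_genome = {"chr": "AAACCCGGGTTT"}
toy_gff = [
    "chr\tmaker\tCDS\t1\t3\t.\t+\t0\tID=m1:cds:1;Parent=m1;",
    "chr\tmaker\tCDS\t7\t9\t.\t+\t0\tID=m1:cds:2;Parent=m1;",
]
for parent, segments in cds_segments(toy_gff).items():
    print(parent, spliced_cds(segments, toy_genome))  # prints: m1 AAAGGG
```

This yields one sequence per transcript, so the extracted records no longer repeat the same name.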
B
On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote:
> Dear Yandell lab,
>
> I am re-annotating the harvester ant genome using protein and RNA-seq data.
> However, I get many artifacts like the one below. It seems that there are
> several CDS records that should tie in to the same mRNA, but they are really
> hanging out separately, and produce several nucleotide sequences with the same
> name when extracted from the gff. I would appreciate any guidance about how to
> fix this!
>
> Thank you,
>
> Sasha
>
> grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff
> pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150;
> pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159;
> pbar_scf7180000350377 maker mRNA 538308 558769 . + . ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01;
> pbar_scf7180000350377 maker exon 538308 538334 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker exon 538748 538968 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker exon 539842 540242 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker exon 542624 542798 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker exon 555823 556025 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker exon 558609 558769 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker CDS 538308 538334 . + 0 ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker CDS 538748 538968 . + 0 ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker CDS 539842 540242 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker CDS 542624 542798 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker CDS 555823 556025 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker CDS 558609 558769 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506;
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543