[maker-devel] duplicate CDS in annotation
Carson Holt
carsonhh at gmail.com
Tue Mar 12 07:37:35 MDT 2013
Yes. Try the newer version and see if you still have the issue.
Thanks,
Carson
From: Sasha Mikheyev <mikheyev at gmail.com>
Date: Tuesday, 12 March, 2013 1:26 AM
To: Carson Holt <carsonhh at gmail.com>
Cc: Barry Moore <barry.moore at genetics.utah.edu>,
<maker-devel at yandell-lab.org>
Subject: Re: [maker-devel] duplicate CDS in annotation
Hi Carson,
I have been using version 2.10. Is it worth trying with a newer version?
You can find the model file here:
<https://dl.dropbox.com/u/5275622/all.gff.gz>. It is rather large, as it
includes all of the output from the first maker run.
Yours,
Sasha
On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt <carsonhh at gmail.com> wrote:
> I think the issue is that you are getting a match feature that is being
> printed with the same ID as the mRNA feature. Correct?
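One way to check for this kind of collision is to scan the GFF3 file for IDs that are used by more than one feature type. The following is a minimal sketch, not part of MAKER: the function names are hypothetical, and it assumes the file is tab-separated with nine columns (the tabs in the listing quoted further down were flattened to spaces by the mail archive).

```python
from collections import defaultdict


def parse_attributes(column9):
    """Split a GFF3 attribute column into a dict of key -> value."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key] = value
    return attrs


def ids_shared_across_types(gff_lines):
    """Return {ID: sorted feature types} for IDs used by more than one type."""
    types_by_id = defaultdict(set)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # skip malformed lines and any embedded FASTA
        feature_id = parse_attributes(cols[8]).get("ID")
        if feature_id:
            types_by_id[feature_id].add(cols[2])
    return {fid: sorted(t) for fid, t in types_by_id.items() if len(t) > 1}


# Three records modelled on the listing in this thread (tabs restored by hand).
example = [
    "pbar_scf7180000350377\tprotein2genome\tprotein_match\t172004\t172162\t150\t-\t.\t"
    "ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150;",
    "pbar_scf7180000350377\tmaker\tmRNA\t538308\t558769\t.\t+\t.\t"
    "ID=pbar_scf7180000350377:hit:2506;"
    "Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;",
    "pbar_scf7180000350377\tmaker\texon\t538308\t538334\t0.01\t+\t.\t"
    "ID=pbar_scf7180000350377:hit:2506:exon:305;"
    "Parent=pbar_scf7180000350377:hit:2506;",
]

collisions = ids_shared_across_types(example)
for feature_id, types in collisions.items():
    print(feature_id, "is used by:", ", ".join(types))
```

To run it on a real file, pass an open file handle instead of the list, for example `ids_shared_across_types(open("Pbar.2.0.gff"))`.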
>
> What version of MAKER are you using, and what does the file you are giving to
> pred_gff or model_gff look like? Could you send them?
>
> Thanks,
> Carson
>
>
> From: Barry Moore <barry.moore at genetics.utah.edu>
> Date: Monday, 11 March, 2013 7:32 AM
> To: Sasha Mikheyev <mikheyev at gmail.com>
> Cc: <maker-devel at yandell-lab.org>
> Subject: Re: [maker-devel] duplicate CDS in annotation
>
> Hi Sasha,
>
> This gene model appears to be correctly formatted to me. In GFF3 format, a CDS
> feature is allowed to span multiple lines, and those lines share the same ID to
> indicate that they are all parts of the same feature. See the GFF3 specification on the
> Sequence Ontology website
> (http://www.sequenceontology.org/resources/gff3.html), and in particular the
> description of the ID attribute specifies:
>
>> ID Indicates the ID of the feature. IDs for each feature must be unique
>> within the scope of the GFF file. In the case of discontinuous features
>> (i.e. a single feature that exists over multiple genomic locations) the same
>> ID may appear on multiple lines. All lines that share an ID collectively
>> represent a single feature.
>
> So each of those CDS lines forms one part of the single CDS feature for this
> gene.
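To make the "one CDS, many lines" idea concrete, here is a minimal sketch that collects CDS lines into one coding region per parent mRNA and reports the spliced length. It is only an illustration under stated assumptions: the function names are hypothetical, the input is assumed to be tab-separated, and the lines are grouped by the Parent attribute because in the listing quoted below each CDS line carries its own ID but the same Parent. A tool that emits one sequence per CDS line instead of one per parent would be expected to produce several records with the same name.

```python
from collections import defaultdict


def cds_segments_by_parent(gff_lines):
    """Map each parent mRNA ID to its CDS segments, sorted by start."""
    segments = defaultdict(list)
    for line in gff_lines:
        cols = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(cols) != 9 or cols[2] != "CDS":
            continue
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        # GFF3 allows a comma-separated list of parents.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                segments[parent].append((int(cols[3]), int(cols[4])))
    return {parent: sorted(segs) for parent, segs in segments.items()}


def spliced_length(segs):
    """Total length of the joined segments (GFF3 coordinates are inclusive)."""
    return sum(end - start + 1 for start, end in segs)


# (start, end, phase) for the six CDS lines in this thread.
cds_rows = [
    (538308, 538334, 0), (538748, 538968, 0), (539842, 540242, 1),
    (542624, 542798, 2), (555823, 556025, 1), (558609, 558769, 2),
]
example = [
    f"pbar_scf7180000350377\tmaker\tCDS\t{start}\t{end}\t.\t+\t{phase}\t"
    f"ID=pbar_scf7180000350377:hit:2506:cds:{n};"
    f"Parent=pbar_scf7180000350377:hit:2506;"
    for n, (start, end, phase) in enumerate(cds_rows, start=305)
]

by_parent = cds_segments_by_parent(example)
for parent, segs in by_parent.items():
    print(parent, len(segs), "segments,", spliced_length(segs), "bp spliced")
```

A spliced length that is a multiple of three is a quick sanity check that the segments belong together as a single coding sequence.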
>
> B
>
> On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote:
>
>> Dear Yandell lab,
>>
>> I am re-annotating the harvester ant genome using protein and RNA-seq data.
>> However, I get many artifacts like the one below. It seems that there are
>> several CDS records that should tie in to the same mRNA, but they are really
>> hanging out separately, and produce several nucleotide sequences with the
>> same name when extracted from the gff. I would appreciate any guidance about
>> how to fix this!
>>
>> Thank you,
>>
>> Sasha
>>
>> grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff
>> pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150;
>> pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159;
>> pbar_scf7180000350377 maker mRNA 538308 558769 . + . ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01;
>> pbar_scf7180000350377 maker exon 538308 538334 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker exon 538748 538968 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker exon 539842 540242 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker exon 542624 542798 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker exon 555823 556025 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker exon 558609 558769 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker CDS 538308 538334 . + 0 ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker CDS 538748 538968 . + 0 ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker CDS 539842 540242 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker CDS 542624 542798 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker CDS 555823 556025 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker CDS 558609 558769 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506;
>>
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
> Barry Moore
> Research Scientist
> Dept. of Human Genetics
> University of Utah
> Salt Lake City, UT 84112
> --------------------------------------------
> (801) 585-3543
>
>
>
>