[maker-devel] duplicate CDS in annotation

Carson Holt carsonhh at gmail.com
Wed Mar 13 15:47:06 MDT 2013


The output shows that the original model was
Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1 and the new model
replacing it is 
Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1.

So it is really a completely different model (one derived from SNAP and
the other from GeneMark).  I'm guessing you have map_forward=1 set and are
using the GFF3 passthrough options, correct?

Thanks,
Carson
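
For anyone who wants to run the same comparison on their own files, here is
a minimal Python sketch of the Alias check described above. It is only an
illustration: the function names (parse_attributes, aliases, shares_alias)
are made up for this example and are not part of MAKER, and it assumes the
nine standard GFF3 columns with the attributes in the last one.

```python
def parse_attributes(column9):
    """Turn a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def aliases(gff_line):
    """Return the Alias values of one GFF3 feature line as a list."""
    # split(None, 8) tolerates tabs or spaces and keeps column 9 intact.
    columns = gff_line.split(None, 8)
    attrs = parse_attributes(columns[8])
    return [a for a in attrs.get("Alias", "").split(",") if a]


def shares_alias(old_line, new_line):
    """True if the two lines have at least one Alias value in common."""
    return bool(set(aliases(old_line)) & set(aliases(new_line)))


# The two mRNA lines from this thread (input model, then output model).
old_mrna = (
    "pbar_scf7180000349951 maker mRNA 98033 98530 . - . "
    "ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;"
    "Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1;"
    "_AED=1.00;_QI=0|0|0|0|0|0|2|0|81;"
)
new_mrna = (
    "pbar_scf7180000349951 maker mRNA 98033 98530 . - . "
    "ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;_AED=0.38;_eAED=0.38;"
    "_QI=0|0|0.33|1|0.5|1|3|246|165;"
    "Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1,PB12301-RA"
)

print(aliases(old_mrna))
print(aliases(new_mrna))
print(shares_alias(old_mrna, new_mrna))
```

If shares_alias reports no common Alias for an mRNA that kept its ID, the
name was probably carried forward onto a model from a different predictor,
which is what appears to have happened here.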



From:  Sasha Mikheyev <mikheyev at gmail.com>
Date:  Wednesday, 13 March, 2013 3:23 AM
To:  Carson Holt <carsonhh at gmail.com>
Cc:  Barry Moore <barry.moore at genetics.utah.edu>,
<maker-devel at yandell-lab.org>
Subject:  Re: [maker-devel] duplicate CDS in annotation

Dear Carson,

The new version does indeed fix the problem!

However, I noticed that some of the CDS annotations were swallowed. This
seems to affect ~600 genes.

e.g. input:

pbar_scf7180000349951 maker mRNA 98033 98530 . - . ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1;_AED=1.00;_QI=0|0|0|0|0|0|2|0|81;
pbar_scf7180000349951 maker exon 98393 98530 . - . ID=PB12301-RA:exon:10283;Parent=PB12301-RA;
pbar_scf7180000349951 maker exon 98033 98140 . - . ID=PB12301-RA:exon:10284;Parent=PB12301-RA;
pbar_scf7180000349951 maker CDS 98033 98140 . - 0 ID=PB12301-RA:cds:10114;Parent=PB12301-RA;
pbar_scf7180000349951 maker CDS 98393 98530 . - 0 ID=PB12301-RA:cds:10113;Parent=PB12301-RA;

output:

pbar_scf7180000349951 maker mRNA 98033 98530 . - . ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;_AED=0.38;_eAED=0.38;_QI=0|0|0.33|1|0.5|1|3|246|165;Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1,PB12301-RA
pbar_scf7180000349951 maker exon 98033 98530 . - . ID=PB12301-RA:exon:134;Parent=PB12301-RA
pbar_scf7180000349951 maker exon 98033 98140 . - . ID=PB12301-RA:exon:133;Parent=PB12301-RA
pbar_scf7180000349951 maker exon 98393 98530 . - . ID=PB12301-RA:exon:132;Parent=PB12301-RA
pbar_scf7180000349951 maker three_prime_UTR 98393 98530 . - . ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA
pbar_scf7180000349951 maker three_prime_UTR 98033 98140 . - . ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA
pbar_scf7180000349951 maker CDS 98033 98530 . - 0 ID=PB12301-RA:cds;Parent=PB12301-RA

Thank you,

Sasha
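
To list which transcripts are affected, something along these lines might
help. It is a rough Python sketch, not a MAKER tool: the function names
(cds_segments_by_parent, changed_cds) are invented for this example, and it
assumes each CDS line carries a Parent attribute pointing at its mRNA. It
groups the CDS lines by Parent, treating all of them as segments of one
coding region, and reports the parents whose segments differ between the
input and the output.

```python
from collections import defaultdict


def cds_segments_by_parent(gff_lines):
    """Map each Parent ID to the sorted list of (start, end) CDS segments."""
    segments = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        # split(None, 8) tolerates tabs or spaces and keeps column 9 intact.
        cols = line.split(None, 8)
        if len(cols) < 9 or cols[2] != "CDS":
            continue
        attrs = dict(
            field.strip().split("=", 1)
            for field in cols[8].split(";")
            if "=" in field
        )
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                segments[parent].append((int(cols[3]), int(cols[4])))
    return {parent: sorted(segs) for parent, segs in segments.items()}


def changed_cds(before, after):
    """Parents present in both mappings whose CDS segments differ."""
    return sorted(p for p in before if p in after and before[p] != after[p])


# CDS lines for PB12301-RA from the input and output shown above.
input_lines = [
    "pbar_scf7180000349951 maker CDS 98033 98140 . - 0 "
    "ID=PB12301-RA:cds:10114;Parent=PB12301-RA;",
    "pbar_scf7180000349951 maker CDS 98393 98530 . - 0 "
    "ID=PB12301-RA:cds:10113;Parent=PB12301-RA;",
]
output_lines = [
    "pbar_scf7180000349951 maker CDS 98033 98530 . - 0 "
    "ID=PB12301-RA:cds;Parent=PB12301-RA",
]

before = cds_segments_by_parent(input_lines)
after = cds_segments_by_parent(output_lines)
print(before)
print(after)
print(changed_cds(before, after))
```

For whole files, an open file handle can be passed in place of the lists,
since the function only iterates over lines. Lines with fewer than nine
columns (for example a trailing FASTA section) are skipped.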

On Tue, Mar 12, 2013 at 10:37 PM, Carson Holt <carsonhh at gmail.com> wrote:
> Yes.  Try the newer version and see if you still have the issue.
> 
> Thanks,
> Carson
> 
> 
> From:  Sasha Mikheyev <mikheyev at gmail.com>
> Date:  Tuesday, 12 March, 2013 1:26 AM
> To:  Carson Holt <carsonhh at gmail.com>
> Cc:  Barry Moore <barry.moore at genetics.utah.edu>,
> <maker-devel at yandell-lab.org>
> 
> Subject:  Re: [maker-devel] duplicate CDS in annotation
> 
> Hi Carson,
> 
> I have been using version 2.10. Is it worth trying with a newer version?
> 
> You can find the model file here <https://dl.dropbox.com/u/5275622/all.gff.gz>.
> It is rather large, as it includes all of the output from the first maker
> run.
> 
> Yours,
> 
> Sasha
> 
> 
> On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt <carsonhh at gmail.com> wrote:
>> I think the issue is that you are getting a match feature that is being
>> printed with the same ID as the mRNA feature. Correct?
>> 
>> What version of MAKER are you using, and what does the file you are giving to
>> pred_gff or model_gff look like?  Could you send them?
>> 
>> Thanks,
>> Carson
>> 
>> 
>> From:  Barry Moore <barry.moore at genetics.utah.edu>
>> Date:  Monday, 11 March, 2013 7:32 AM
>> To:  Sasha Mikheyev <mikheyev at gmail.com>
>> Cc:  <maker-devel at yandell-lab.org>
>> Subject:  Re: [maker-devel] duplicate CDS in annotation
>> 
>> Hi Sasha,
>> 
>> This gene model appears to be correctly formatted to me.  In GFF3 format,
>> CDS features are allowed to span multiple lines, and they share the same ID
>> to indicate that they are all the same feature.  See the GFF3 specification
>> on the Sequence Ontology website
>> (http://www.sequenceontology.org/resources/gff3.html); in particular, the
>> description of the ID attribute specifies:
>> 
>>> ID Indicates the ID of the feature.  IDs for each feature must be unique
>>> within the scope of the GFF file.  In the case of discontinuous features
>>> (i.e. a single feature that exists over multiple genomic locations) the same
>>> ID may appear on multiple lines.  All lines that share an ID collectively
>>> represent a single feature.
>> 
>> So each of those CDS lines forms one part of the single CDS feature for this
>> gene.
>> 
>> B
>>  
>> On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote:
>> 
>>> Dear Yandell lab,
>>> 
>>> I am re-annotating the harvester ant genome using protein and RNA-seq data.
>>> However, I get many artifacts like the one below. It seems that there are
>>> several CDS records that should tie in to the same mRNA, but they are really
>>> hanging out separately, and produce several nucleotide sequences with the
>>> same name when extracted from the gff. I would appreciate any guidance about
>>> how to fix this!
>>> 
>>> Thank you,
>>> 
>>> Sasha
>>> 
>>> grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff
>>> pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150;
>>> pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159;
>>> pbar_scf7180000350377 maker mRNA 538308 558769 . + . ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01;
>>> pbar_scf7180000350377 maker exon 538308 538334 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506;
>>> pbar_scf7180000350377 maker exon 538748 538968 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506;
>>> pbar_scf7180000350377 maker exon 539842 540242 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506;
>>> pbar_scf7180000350377 maker exon 542624 542798 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506;
>>> pbar_scf7180000350377 maker exon 555823 556025 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506;
>>> pbar_scf7180000350377 maker exon 558609 558769 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506;
>>> pbar_scf7180000350377 maker CDS 538308 538334 . + 0 ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506;
>>> pbar_scf7180000350377 maker CDS 538748 538968 . + 0 ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506;
>>> pbar_scf7180000350377 maker CDS 539842 540242 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506;
>>> pbar_scf7180000350377 maker CDS 542624 542798 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506;
>>> pbar_scf7180000350377 maker CDS 555823 556025 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506;
>>> pbar_scf7180000350377 maker CDS 558609 558769 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506;
>>> 
>>> _______________________________________________
>>> maker-devel mailing list
>>> maker-devel at box290.bluehost.com
>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>> 
>> Barry Moore
>> Research Scientist
>> Dept. of Human Genetics
>> University of Utah
>> Salt Lake City, UT 84112
>> --------------------------------------------
>> (801) 585-3543
>> 
>> 
>> 
>> 
> 


