[maker-devel] est_forward and conflicting names

Carson Holt carsonhh at gmail.com
Thu May 8 16:43:40 MDT 2014


Only if you were to remove the brackets around gene=.

--Carson
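
For anyone with GenBank-style headers like the ones quoted below, removing the brackets can be scripted. The following is a minimal sketch, not part of MAKER: the function name `unbracket_gene` and the regular expression are illustrative assumptions. It also assumes each FASTA header sits on a single line (the headers quoted below are wrapped only by the mail client) and that gene names contain no whitespace. It rewrites `[gene=cox1]` as `gene=cox1` on header lines and leaves sequence lines and other bracketed fields untouched.

```python
import re

# Matches a bracketed GenBank-style qualifier such as "[gene=cox1]".
# Assumption: gene names contain neither whitespace nor "]".
BRACKETED_GENE = re.compile(r"\[gene=([^\]\s]+)\]")


def unbracket_gene(line: str) -> str:
    """Rewrite "[gene=X]" as "gene=X" on FASTA header lines only."""
    if not line.startswith(">"):
        # Sequence lines are returned unchanged.
        return line
    return BRACKETED_GENE.sub(r"gene=\1", line)


def convert(lines):
    """Apply unbracket_gene to every line of an iterable of FASTA lines."""
    return [unbracket_gene(line) for line in lines]
```

To convert a file, read it line by line, pass the lines through `convert`, and write the result to a new file, for example `out.writelines(convert(open("proteins.fa")))`, where `proteins.fa` is a placeholder name.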

From:  Shaun Jackman <sjackman at gmail.com>
Reply-To:  Shaun Jackman <sjackman at gmail.com>
Date:  Thursday, May 8, 2014 at 4:41 PM
To:  Carson Holt <carsonhh at gmail.com>
Cc:  "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
Subject:  Re: [maker-devel] est_forward and conflicting names

Interesting. Thanks for the clarification. I’m working on a plant
mitochondrion, and so as far as I know, there’s no alternative splicing. My
protein FASTA file is composed of the protein sequences of ~100 species
downloaded from GenBank. It looks like this:
>cox1|lcl|KJ461445.1_cdsid_AHY20320.1 [gene=cox1] [protein=cytochrome c oxidase subunit 1] [protein_id=AHY20320.1] [location=complement(59212..60795)]
…
>cox1|lcl|EU534409.1_cdsid_ACA62629.1 [gene=cox1] [protein=cox1] [protein_id=ACA62629.1] [location=245282..246856]
…
>cox1|lcl|NC_023103.1_cdsid_YP_008964124.1 [gene=cox1] [protein=cytochrome c oxidase subunit 1] [protein_id=YP_008964124.1] [location=join(317824..318438,319511..320368)]
…
I’m not sure that I actually want the fancy behaviour that you describe,
though it probably wouldn’t hurt anything. Will this FASTA format trigger
the fancy behaviour?

Cheers,
Shaun


http://sjackman.ca


On 8 May 2014 15:33, Carson Holt <carsonhh at gmail.com> wrote:
> When moving transcripts onto a new assembly, you may have multiple transcripts
> of the same gene. Because your transcript name should be your FASTA ID, there
> is no way for MAKER to know that they go together when moving the models
> forward, so you can use the gene= option to make MAKER aware that these belong
> to the same gene. They will be grouped, and you will recover all splice forms
> as a group.
> 
> Example:
> 
> >SMEDT_00004 gene=dpp
> AAAAAAA
> 
> >SMEDT_00005 gene=dpp
> AAAAAAA
> 
> --Carson
> 
> 
> 
> From:  Shaun Jackman <sjackman at gmail.com>
> Reply-To:  Shaun Jackman <sjackman at gmail.com>
> Date:  Thursday, May 8, 2014 at 4:26 PM
> To:  Carson Holt <carsonhh at gmail.com>
> Cc:  "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
> Subject:  Re: [maker-devel] est_forward and conflicting names
> 
> Hi, Carson. Could you give an example of how to add gene_id= to the header of
> the FASTA file? I’m not clear on what you mean by this. In the FASTA header,
> what portion is the transcript name, and what portion is the gene name?
> 
> Cheers,
> Shaun
> 
> 
> http://sjackman.ca
> 
> 
> On 2 May 2014 11:55, Carson Holt <carsonhh at gmail.com> wrote:
>> Whichever has the best AED score, I believe, but you can add gene_id= to the
>> header of each FASTA file to ensure MAKER doesn't try to cluster unrelated
>> transcripts into a single gene. Then the transcript name and gene name will
>> be guaranteed to match up.
>> 
>> --Carson
>> 
>> 
>> From:  Shaun Jackman <sjackman at gmail.com>
>> Date:  Wednesday, April 30, 2014 at 5:25 PM
>> To:  "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
>> Subject:  [maker-devel] est_forward and conflicting names
>> 
>> Hi, Carson.
>> 
>> I’ve downloaded a number of genes from GenBank using Entrez Direct, which I’m
>> using with est and protein to annotate a plant mitochondrion. Most of these
>> reference sequences have sensible and consistent gene names, and so I’m using
>> est_forward to retain the gene names. This workflow is working well for me.
>> Some of the genes pulled in from GenBank have less useful names like orf1234
>> or other numeric IDs. When multiple evidence sequences map to the same
>> location, how does est_forward choose which name to use? If it’s chosen
>> arbitrarily, could it be possible to choose the most common name instead?
>> 
>> Thanks,
>> Shaun
>> 
>> 
>> 
> 


