[maker-devel] est_forward and conflicting names
Carson Holt
carsonhh at gmail.com
Thu May 8 16:33:36 MDT 2014
When moving transcripts onto a new assembly, you may have multiple
transcripts of the same gene. Because the transcript name should be the
FASTA ID, MAKER has no way to know that they belong together when moving
the models forward. You can add the gene= option to each FASTA header to
tell MAKER which transcripts belong to the same gene. They will then be
grouped, and you recover all splice forms as a group.
Example:
>SMEDT_00004 gene=dpp
AAAAAAA
>SMEDT_00005 gene=dpp
AAAAAAA
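If you have many transcripts, the tags can be added with a short script
rather than by hand. The following is only a rough Python sketch and is not
part of MAKER: the function name add_gene_tags and the transcript-to-gene
dictionary are hypothetical, and it assumes the FASTA ID is the first
whitespace-delimited token of the header, as in the example above.

```python
def add_gene_tags(fasta_lines, transcript_to_gene):
    """Append gene=<name> to each FASTA header whose ID is in the mapping.

    fasta_lines: iterable of lines from a FASTA file.
    transcript_to_gene: hypothetical dict of FASTA ID -> gene name.
    """
    tagged = []
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            fields = line[1:].split()
            # Assumption: the FASTA ID is the first token of the header.
            seq_id = fields[0] if fields else ""
            gene = transcript_to_gene.get(seq_id)
            # Leave the header alone if it is unmapped or already tagged.
            already_tagged = any(f.startswith("gene=") for f in fields[1:])
            if gene and not already_tagged:
                line = f"{line} gene={gene}"
        tagged.append(line)
    return tagged


if __name__ == "__main__":
    mapping = {"SMEDT_00004": "dpp", "SMEDT_00005": "dpp"}
    records = [">SMEDT_00004", "AAAAAAA", ">SMEDT_00005", "AAAAAAA"]
    print("\n".join(add_gene_tags(records, mapping)))
```

Headers that are not in the mapping, or that already carry a gene= field,
are passed through unchanged.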
--Carson
From: Shaun Jackman <sjackman at gmail.com>
Reply-To: Shaun Jackman <sjackman at gmail.com>
Date: Thursday, May 8, 2014 at 4:26 PM
To: Carson Holt <carsonhh at gmail.com>
Cc: "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
Subject: Re: [maker-devel] est_forward and conflicting names
Hi, Carson. Could you give an example of how to add gene_id= to the header
of the FASTA file? I’m not clear on what you mean by this. In the FASTA
header, what portion is the transcript name, and what portion is the gene
name?
Cheers,
Shaun
http://sjackman.ca
On 2 May 2014 11:55, Carson Holt <carsonhh at gmail.com> wrote:
> Whichever has the best AED score I believe, but you can add gene_id= to the
> header of each fasta file to ensure MAKER doesn't try and cluster unrelated
> transcripts into a single gene. Then the transcript name and gene name will
> be guaranteed to match up.
>
> --Carson
>
>
> From: Shaun Jackman <sjackman at gmail.com>
> Date: Wednesday, April 30, 2014 at 5:25 PM
> To: "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
> Subject: [maker-devel] est_forward and conflicting names
>
> Hi, Carson.
>
> I’ve downloaded a number of genes from GenBank using Entrez Direct, which I’m
> using with est and protein to annotate a plant mitochondrion. Most of these
> reference sequences have sensible and consistent gene names, and so I’m using
> est_forward to retain the gene names. This workflow is working well for me.
> Some of the genes pulled in from GenBank have less useful names like orf1234
> or other numeric IDs. When multiple evidence sequences map to the same
> location, how does est_forward choose which name to use? If it’s chosen
> arbitrarily, could it be possible to choose the most common name instead?
>
> Thanks,
> Shaun