[maker-devel] maker snap output files
Carson Holt
carsonhh at gmail.com
Thu Mar 20 11:24:41 MDT 2014
The MAKER transcripts are the gene/mRNA/exon/CDS features in the GFF3.
All other transcripts (from SNAP, etc.) are match/match_part features in
the GFF3.
When you look at these in something like Apollo, they will be placed in
different viewing panels based on their type.
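
To check this split outside of a viewer, the feature types can be tallied
straight from the GFF3. The following is only a rough sketch in Python, not
part of MAKER: the function name summarize_gff3 is made up, and the sample
lines (contig name, coordinates, IDs) are invented for illustration. It
assumes standard 9-column tab-delimited GFF3 and only looks at the source
and type columns.

```python
from collections import Counter

# Feature types that carry final gene models vs. evidence/prediction alignments.
GENE_MODEL_TYPES = {"gene", "mRNA", "exon", "CDS"}
EVIDENCE_TYPES = {"match", "match_part"}


def summarize_gff3(lines):
    """Count (source, type) pairs, split into gene models and evidence."""
    models = Counter()
    evidence = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; no more features
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature line
        source, ftype = cols[1], cols[2]
        if ftype in GENE_MODEL_TYPES:
            models[(source, ftype)] += 1
        elif ftype in EVIDENCE_TYPES:
            evidence[(source, ftype)] += 1
    return models, evidence


# Invented example lines, only to show the shape of the input.
sample = [
    "##gff-version 3",
    "ctg1\tmaker\tgene\t100\t900\t.\t+\t.\tID=g1",
    "ctg1\tmaker\tmRNA\t100\t900\t.\t+\t.\tID=g1-mRNA-1;Parent=g1",
    "ctg1\tsnap_masked\tmatch\t100\t950\t.\t+\t.\tID=m1",
    "ctg1\tsnap_masked\tmatch_part\t100\t400\t.\t+\t.\tID=m1:hsp1;Parent=m1",
]

models, evidence = summarize_gff3(sample)
print("gene-model features:", dict(models))
print("evidence features:", dict(evidence))
```

To run it on a real file, pass an open file handle for the merged GFF3 in
place of the sample list.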
Thanks,
Carson
From: dhivya arasappan <darasappan at gmail.com>
Date: Thursday, March 20, 2014 at 11:22 AM
To: Carson Holt <carsonhh at gmail.com>
Cc: <maker-devel at yandell-lab.org>
Subject: Re: maker snap output files
Hi Carson,
Given that I now have maker transcripts, ab initio predicted transcripts and
transcripts that don’t overlap, which ones are reflected in the gff file?
The ids in the gff file (for exons, genes, mrna) all say something like
‘*snap-gene’ so does this mean these are the genes from the snap prediction
tool?
Thanks
dhivya
On Mar 18, 2014, at 3:09 PM, Carson Holt <carsonhh at gmail.com> wrote:
> There can also be hint-based predictions. They may be similar in size, but
> there is no rule. Generally maker.snap_masked.proteins.fasta will be larger,
> as gene predictors tend to overpredict (by as much as 10-fold). You should
> always review your annotations in something like Apollo to see how the models
> compare to the evidence. Counts alone don’t really mean anything.
>
> Thanks,
> Carson
>
> From: dhivya arasappan <darasappan at gmail.com>
> Date: Tuesday, March 18, 2014 at 2:05 PM
> To: Carson Holt <carsonhh at gmail.com>
> Cc: <maker-devel at yandell-lab.org>
> Subject: Re: maker snap output files
>
> Thanks Carson.
>
> Is it normal that in my maker results after running snap, the number of
> proteins (in *maker.proteins.fasta) is actually lower than the number of
> proteins in my pre-snap maker results? I assumed that annotation through
> alignment plus annotation through prediction would yield more annotations?
>
> The unfiltered proteins file has more proteins though.
>
> Thanks
> Dhivya
>
>
>
> On Mar 18, 2014, at 2:34 PM, Carson Holt <carsonhh at gmail.com> wrote:
>
>> maker.proteins.fasta - these are the final filtered and modified protein
>> models (this is what you want)
>> maker.snap_masked.proteins.fasta - these are the raw unfiltered snap ab
>> initio predictions (for reference purposes)
>> maker.non_overlapping_ab_initio.proteins.fasta - these are non-redundant
>> rejected models that do not overlap the maker.proteins.fasta entries. If you
>> think you are missing a gene, look for it here. Sometimes people use
>> interproscan (very slow) to analyze this file for false negatives.
>>
>>
>> These files are also described in the README distributed with MAKER in the
>> “MAKER OUTPUT” section.
>>
>> Thanks,
>> Carson
>>
>>
>>
>>
>> From: dhivya arasappan <darasappan at gmail.com>
>> Date: Tuesday, March 18, 2014 at 1:27 PM
>> To: Carson Holt <carsonhh at gmail.com>, <maker-devel at yandell-lab.org>
>> Subject: maker snap output files
>>
>> Hello,
>>
>> I ran maker after running SNAP ab initio prediction (following instructions
>> from the maker tutorial). It ran successfully and when I ran fasta_merge, I
>> got several output fasta files. I’m unable to find information in the
>> tutorial about interpreting these different files. I’m hoping one of you can
>> help.
>>
>> *maker.proteins.fasta
>> *maker.snap_masked.proteins.fasta
>> *maker.non_overlapping_ab_initio.proteins.fasta
>>
>> What is the difference among these? They all have different numbers of
>> sequences.
>>
>> Similarly, with transcripts:
>>
>> maker.non_overlapping_ab_initio.transcripts.fasta
>> maker.snap_masked.transcripts.fasta
>> maker.transcripts.fasta
>>
>> Thanks
>> Dhivya
>>
>>
>