[maker-devel] maker snap output files
dhivya arasappan
darasappan at gmail.com
Thu Mar 20 11:22:47 MDT 2014
Hi Carson,
Given that I now have MAKER transcripts, ab initio predicted transcripts, and non-overlapping ab initio transcripts, which ones are reflected in the GFF file?
The IDs in the GFF file (for exons, genes, and mRNAs) all look something like ‘*snap-gene’. Does this mean these are the genes from the SNAP prediction tool?
Thanks
dhivya
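
One way to look into the question above is to tally the source column (column 2) and feature type (column 3) of the GFF3 file, since these two columns record which program or pipeline step emitted each feature. The following is only a minimal sketch: the function name `tally_gff_sources` and the example lines are made up for illustration and are not real MAKER output, so the source labels in an actual run may differ.

```python
from collections import Counter

def tally_gff_sources(lines):
    """Count (source, type) pairs over GFF3 feature lines."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence section; no feature lines follow
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed GFF3 feature line
        counts[(cols[1], cols[2])] += 1
    return counts

# Made-up example lines (placeholders, NOT real MAKER output).
example = [
    "##gff-version 3",
    "contig1\tmaker\tgene\t100\t900\t.\t+\t.\tID=example-snap-gene-0.1",
    "contig1\tmaker\tmRNA\t100\t900\t.\t+\t.\tID=example-mRNA-1;Parent=example-snap-gene-0.1",
    "contig1\tsnap_masked\tmatch\t100\t900\t.\t+\t.\tID=example-match-1",
]

for (source, ftype), n in sorted(tally_gff_sources(example).items()):
    print(source, ftype, n)
```

An open file handle can be passed in place of the list, for example `tally_gff_sources(open(path))` with `path` pointing at the merged GFF file.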
On Mar 18, 2014, at 3:09 PM, Carson Holt <carsonhh at gmail.com> wrote:
> There can also be hint-based predictions. The files may be similar in size, but there is no rule. Generally maker.snap_masked.proteins.fasta will be larger, as gene predictors tend to overpredict (by as much as 10-fold). You should always review your annotations in something like Apollo to see how the models compare to the evidence. Counts alone don’t really mean anything.
>
> Thanks,
> Carson
>
> From: dhivya arasappan <darasappan at gmail.com>
> Date: Tuesday, March 18, 2014 at 2:05 PM
> To: Carson Holt <carsonhh at gmail.com>
> Cc: <maker-devel at yandell-lab.org>
> Subject: Re: maker snap output files
>
> Thanks Carson.
>
> Is it normal that in my maker results after running snap, the number of proteins (in *maker.proteins.fasta) is actually lower than the number of proteins in my pre-snap maker results? I assumed that annotation through alignment plus annotation through prediction would yield more annotations.
>
> The unfiltered proteins file has more proteins though.
>
> Thanks
> Dhivya
>
>
>
> On Mar 18, 2014, at 2:34 PM, Carson Holt <carsonhh at gmail.com> wrote:
>
>> maker.proteins.fasta - these are the final filtered and modified protein models (this is what you want)
>> maker.snap_masked.proteins.fasta - these are the raw unfiltered snap ab initio predictions (for reference purposes)
>> maker.non_overlapping_ab_initio.proteins.fasta - these are non-redundant rejected models that do not overlap the maker.proteins.fasta entries. If you think you are missing a gene, look for it here. Sometimes people use interproscan (very slow) to analyze this file for false negatives.
>>
>>
>> These files are also described in the README distributed with MAKER in the “MAKER OUTPUT” section.
>>
>> Thanks,
>> Carson
>>
>>
>>
>>
>> From: dhivya arasappan <darasappan at gmail.com>
>> Date: Tuesday, March 18, 2014 at 1:27 PM
>> To: Carson Holt <carsonhh at gmail.com>, <maker-devel at yandell-lab.org>
>> Subject: maker snap output files
>>
>> Hello,
>>
>> I ran maker after running SNAP ab initio prediction (following instructions from the maker tutorial). It ran successfully, and when I ran fasta_merge, I got several output fasta files. I’m unable to find information in the tutorial about interpreting these different files. I’m hoping one of you can help.
>>
>> *maker.proteins.fasta
>> *maker.snap_masked.proteins.fasta
>> *maker.non_overlapping_ab_initio.proteins.fasta
>>
>> What is the difference among these? They all have different numbers of sequences.
>>
>> Similarly, with transcripts:
>>
>> maker.non_overlapping_ab_initio.transcripts.fasta
>> maker.snap_masked.transcripts.fasta
>> maker.transcripts.fasta
>>
>> Thanks
>> Dhivya
>>
>>
>
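
The sequence counts discussed in this thread can be obtained by counting FASTA header lines (those starting with `>`) in each output file. The sketch below is a minimal illustration under stated assumptions: the helper names `count_fasta_records` and `fasta_ids` are hypothetical, and the example records are invented placeholders rather than real MAKER output.

```python
def count_fasta_records(lines):
    """Return the number of FASTA records (header lines starting with '>')."""
    return sum(1 for line in lines if line.startswith(">"))

def fasta_ids(lines):
    """Return the set of record IDs (first word after '>' on each header)."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids

# Invented placeholder records standing in for two of the output files.
final_models = [
    ">model_A some description",
    "MKT",
    ">model_B",
    "MLL",
]
raw_predictions = [
    ">pred_1",
    "MKT",
    ">pred_2",
    "MAA",
    ">pred_3",
    "MLL",
]

print(count_fasta_records(final_models))     # number of final models
print(count_fasta_records(raw_predictions))  # number of raw predictions
print(sorted(fasta_ids(final_models)))
```

Both helpers accept any iterable of lines, so an open file handle works as well, for example `count_fasta_records(open(path))`. As noted in the thread, such counts only summarize file contents; they do not replace reviewing the models against the evidence.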