[maker-devel] model_gff question
Carson Holt
carsonhh at gmail.com
Tue Oct 2 12:05:04 MDT 2012
If it overlaps the UTR of some other chosen model it might have been
excluded for that reason as well. Sometimes this can happen when you have
RNA-seq and high gene density (so models wander into each other). Try
setting the correct_est_fusion option to 1. This will take steps to
trim UTR that might cause neighboring models to be left out because of UTR
overlap (it also helps with false fusions caused by cufflinks results).
I would not be surprised if the model being left out was because of UTR
overlap.
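
One way to check this by hand is to look for neighboring models whose transcripts overlap while their coding regions do not, meaning the overlap sits entirely in UTR. The following is only a rough sketch of that check and is not MAKER's internal logic. The `Model` tuple, the function names, and the coordinates are hypothetical, and all models are assumed to be on the same contig and strand.

```python
from typing import NamedTuple


class Model(NamedTuple):
    """Hypothetical, simplified gene model (1-based inclusive coordinates)."""
    name: str
    start: int      # transcript start, UTR included
    end: int        # transcript end, UTR included
    cds_start: int  # first coding base
    cds_end: int    # last coding base


def overlaps(a_start, a_end, b_start, b_end):
    """True if two closed intervals share at least one base."""
    return a_start <= b_end and b_start <= a_end


def utr_only_overlaps(models):
    """Return name pairs whose transcripts overlap but whose CDS spans do not."""
    ordered = sorted(models, key=lambda m: m.start)
    pairs = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start > a.end:
                break  # sorted by start, so no later model can overlap a
            if not overlaps(a.cds_start, a.cds_end, b.cds_start, b.cds_end):
                pairs.append((a.name, b.name))
    return pairs


# Made-up example: A's 3' UTR runs into B's 5' UTR; C is well separated.
example_models = [
    Model("A", 100, 900, 200, 700),
    Model("B", 850, 1600, 1000, 1500),
    Model("C", 2000, 2500, 2100, 2400),
]
print(utr_only_overlaps(example_models))  # [('A', 'B')]
```

Pairs reported this way are the ones worth inspecting in a genome browser before and after turning on correct_est_fusion.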
I recommend using the cufflinks data and leaving tophat out. Tophat results
tend to be very noisy and can span large, rather weird regions.
Thanks,
Carson
From: Michael Thon <mike.thon at gmail.com>
Date: Tuesday, 2 October, 2012 3:50 AM
To: Daniel Hughes <dsthughes at gmail.com>
Cc: Michael Thon <mike.thon at gmail.com>, <maker-devel at yandell-lab.org>,
Carson Holt <carsonhh at gmail.com>
Subject: Re: [maker-devel] model_gff question
It seems to have disappeared completely. I'm running MAKER again now using
the tophat alignments that I fed to cufflinks, instead of the cufflinks
data. So far the two models visually checked as missing with the cufflinks
data are present as they should be. I have to wait for the run to finish to
get a whole genome count though. Maybe I need to look more closely at the
cufflinks run that I did.
The RNA-Seq data are from the NCBI SRA and I didn't do anything to clean
them up before I ran tophat.
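
For the whole genome count, a small script can list which of the original models have no overlapping gene in the MAKER output instead of checking them one by one. This is a minimal sketch, assuming both files are GFF3, that gene models appear as `gene` features in column 3, and that a same-strand coordinate overlap is a good enough test for "kept or replaced". The function names and the inline example records are made up for illustration.

```python
from collections import defaultdict


def read_genes(gff_lines):
    """Return {seqid: [(start, end, strand, gene_id), ...]} for 'gene' features."""
    genes = defaultdict(list)
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # sequence section follows; no more features
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "gene":
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        genes[cols[0]].append(
            (int(cols[3]), int(cols[4]), cols[6], attrs.get("ID", "?"))
        )
    return genes


def missing_models(input_genes, output_genes):
    """IDs of input genes with no same-strand overlapping gene in the output."""
    missing = []
    for seqid, models in input_genes.items():
        for start, end, strand, gene_id in models:
            hit = any(
                s <= end and start <= e and st == strand
                for s, e, st, _ in output_genes.get(seqid, [])
            )
            if not hit:
                missing.append(gene_id)
    return missing


# Made-up records standing in for the model_gff input and the MAKER output.
original_gff = [
    "##gff-version 3\n",
    "chr1\tcenter\tgene\t100\t500\t.\t+\t.\tID=g1\n",
    "chr1\tcenter\tgene\t900\t1500\t.\t-\t.\tID=g2\n",
    "chr1\tcenter\tgene\t2000\t2600\t.\t+\t.\tID=g3\n",
]
maker_gff = [
    "chr1\tmaker\tgene\t90\t520\t.\t+\t.\tID=m1;Name=m1\n",
    "chr1\tmaker\tgene\t1900\t2700\t.\t+\t.\tID=m2;Name=m2\n",
]

original_genes = read_genes(original_gff)
maker_genes = read_genes(maker_gff)
print(sum(len(v) for v in original_genes.values()))   # 3
print(missing_models(original_genes, maker_genes))    # ['g2']
```

With real data, the two lists would be replaced by open file handles, since `read_genes` only needs an iterable of lines.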
On Oct 2, 2012, at 9:10 AM, Daniel Hughes <dsthughes at gmail.com> wrote:
>
> Did the whole model vanish, or just the protein product? Contaminated RNA-seq
> that hasn't been cleaned up enough will regularly cause the latter to become
> part of a bad UTR.
>
> Dan
>
> On Oct 2, 2012 6:01 AM, "Michael Thon" <mike.thon at gmail.com> wrote:
>> I looked at two cases in which the model_gff disappeared and they occurred in
>> regions where there are multiple overlapping cufflinks features. One model
>> that I'm looking at right now has overlapping protein2genome and a SNAP
>> feature overlapping it but it was still not included in the output. It could
>> be a problem in MAKER or it could be a problem with my RNA Seq data. I
>> aligned the RNA Seq data using tophat/cufflinks and converted the
>> transcripts.gtf file to gff using cufflinks2gff3 script.
>>
>> Is it better to use RNA Seq feature from tophat or cufflinks?
>>
>>
>> On Oct 1, 2012, at 4:01 PM, Carson Holt <carsonhh at gmail.com> wrote:
>>
>>> They can be replaced under two circumstances.
>>> 1. If you provide two model_gff files (comma separated list), in which case
>>> MAKER thinks it is merging legacy annotations and will only keep one or the
>>> other if models overlap.
>>> 2. If you turn snap, augustus, genemark, or est2genome on. MAKER sees this
>>> as a cue that if these other programs produce a better model, it can replace
>>> the current model. If you set map_forward=1, MAKER will conserve the name
>>> of the previous model (so models change structure but names are conserved);
>>> otherwise, it gets a new name. Sometimes groups like to rename models every
>>> time there is a structural change. I think you are supposed to get the
>>> Alias attribute set when you don't get names mapped forward though (I can't
>>> remember if I added this or just planned on adding the Alias mapping
>>> though).
>>>
>>> MAKER should never drop a model_gff model. It can only replace it if
>>> something better comes along, but it should not disappear.
>>>
>>> Thanks,
>>> Carson
>>>
>>>
>>> From: Michael Thon <mike.thon at gmail.com>
>>> Date: Monday, 1 October, 2012 1:53 AM
>>> To: <maker-devel at yandell-lab.org>
>>> Subject: [maker-devel] model_gff question
>>>
>>> Under what circumstances will maker not include a gene model from the
>>> model_gff file in its final output? It was my understanding from this post:
>>> https://groups.google.com/d/topic/maker-devel/Y5jSdZ1Olcc/discussion
>>>
>>> That maker will keep or replace models in model_gff and never remove them.
>>> I'm reannotating a fungal genome and in model_gff I'm providing the gene
>>> models originally made by the sequencing center. I have 12006 models in the
>>> file I specify in model_gff but maker's final annotation has only 10727
>>> models in it.
>>> -Mike
>>>
>>> _______________________________________________
>>> maker-devel mailing list
>>> maker-devel at box290.bluehost.com
>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>
>>