[maker-devel] question on gene numbers with quality_filter.pl
Michael Campbell
michael.s.campbell1 at gmail.com
Mon Oct 2 13:19:51 MDT 2017
Hi Chris,
Yeah, by default MAKER shouldn’t keep any annotation with an AED of 1. I’ve cc’d the dev list on this to see if anyone else has an idea why you might get AED 1 genes with keep_preds=0. Could you send me the maker_opts.ctl file for the run? There may be something informative in there.
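In the meantime, a quick way to check both points (genes vs. transcripts, and whether any AED 1 models are actually present) might look like the sketch below. It assumes the usual MAKER GFF3 layout, where each mRNA line carries an `_AED=` attribute and a `Parent=` pointing at its gene. The function names `parse_attributes` and `summarize_aed` are made up for illustration and are not part of MAKER or quality_filter.pl.

```python
from collections import defaultdict


def parse_attributes(column9):
    """Turn a GFF3 attribute column ('ID=x;Parent=y') into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key] = value
    return attrs


def summarize_aed(gff_lines, threshold=1.0):
    """Count genes and mRNAs, and flag models whose AED is >= threshold.

    Assumption: mRNA features carry '_AED=<float>' and 'Parent=<gene ID>'.
    """
    genes = set()
    aed_by_gene = defaultdict(list)  # gene ID -> [(mRNA ID, AED), ...]

    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more features
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue
        attrs = parse_attributes(cols[8])
        if cols[2] == "gene":
            genes.add(attrs.get("ID"))
        elif cols[2] == "mRNA" and "_AED" in attrs:
            aed_by_gene[attrs.get("Parent")].append(
                (attrs.get("ID"), float(attrs["_AED"]))
            )

    # Transcripts at or above the threshold.
    flagged_mrnas = sorted(
        mrna_id
        for models in aed_by_gene.values()
        for mrna_id, aed in models
        if aed >= threshold
    )
    # Genes where every transcript is at or above the threshold.
    flagged_genes = sorted(
        gene_id
        for gene_id, models in aed_by_gene.items()
        if all(aed >= threshold for _, aed in models)
    )
    return {
        "genes": len(genes),
        "mrnas": sum(len(models) for models in aed_by_gene.values()),
        "mrnas_at_threshold": flagged_mrnas,
        "genes_all_at_threshold": flagged_genes,
    }
```

To run it on a file, pass the open file handle, e.g. `summarize_aed(open("your_maker_output.gff"))` (the filename is a placeholder). If `genes_all_at_threshold` is non-empty for the unfiltered MAKER output, those would be the genes to look at more closely.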
Thanks,
Mike
> On Oct 2, 2017, at 2:32 PM, Willett, Christopher S <willett4 at email.unc.edu> wrote:
>
> Hi Mike-
>
> I was looking at the lists of mRNAs, and I think what is happening is that genes with AED=1 are still retained in our initial output from MAKER and are then getting trimmed out of the filtered file. If I set the AED threshold to 1 in the control file for the MAKER run, are models retained when their AED is less than 1, or when it is less than or equal to 1? Should these AED=1 genes be making it into the gene and mRNA pools if we have the keep predictions parameter set to 0?
>
> Thanks for your help,
>
> Best,
>
> Chris
>
>
>
>> On Oct 2, 2017, at 9:30 AM, Michael Campbell <michael.s.campbell1 at gmail.com> wrote:
>>
>> Hi Chris,
>>
>> This is interesting. The -d option in quality_filter.pl should only filter out genes based on AED. Is there a chance that you counted transcripts instead of genes? If there is a transcript with an AED of 1, then quality_filter.pl should remove it but leave the gene and the transcripts with AEDs less than 1. I can have a look at it if you send me one of the genes (in GFF3 format) that was filtered out by quality_filter.pl even though it had an AED less than 1.
>>
>> Thanks,
>> Mike
>>
>>
>>> On Sep 29, 2017, at 1:20 PM, Willett, Christopher S <willett4 at email.unc.edu> wrote:
>>>
>>> Hello-
>>>
>>> We are getting to the final stages (hopefully) of a reannotation of a new assembly of a copepod genome using MAKER, and we had some questions about which set of genes to use. Our latest runs used Pfam domains to define the default vs. standard sets with the quality_filter.pl script, and I had a question about the stringency of the filters in this script. It appears that the default set is more stringent than the output we get from MAKER without using this script (all with AED max set to 1). Are there additional filters in this script beyond AED that would cause this?
>>>
>>> Here is what we are seeing, in case more details would be helpful. Our final MAKER run gives ~21500 predicted genes with keep predictions turned on and ~15200 with it turned off. What I was wondering about is why this 15200 is higher than the default set (~14500 genes) that we get after filtering the GFF with the -d setting in quality_filter.pl. For completeness, the standard set (-s setting) retains ~14800 genes, and if I filter the 15200-gene GFF file with the default parameters, that yields ~14100 genes. So I was curious what else in the filter script, beyond AED, would trim out genes?
>>>
>>> The gene sets look pretty good overall and the numbers seem reasonable, so we were debating which set to use as our final set. I am also trying a few other analyses in InterProScan to see if that identifies additional genes beyond Pfam for retention, but that seems a bit independent from the question above.
>>>
>>> Thanks for your help,
>>>
>>> Best,
>>>
>>> Chris Willett
>>>
>>>
>>>
>>>
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> Research Associate Professor
>>> Department of Biology
>>> CB#3280 Coker Hall
>>> University of North Carolina, Chapel Hill
>>> Chapel Hill, NC, 27599-3280
>>>
>>> Office: 2252 Genome Science Building
>>> phone: 919-843-8663
>>> fax: 919-962-1625
>>>
>>> http://labs.bio.unc.edu/Willett/
>>> _______________________________________________
>>> maker-devel mailing list
>>> maker-devel at box290.bluehost.com
>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>
>