[maker-devel] Filtering gene models based on eAED scores

Carson Holt carsonhh at gmail.com
Wed Jun 13 13:57:46 MDT 2018


AED is documented in the 2011 MAKER2 paper, but eAED (extended AED) is not currently documented in a publication and is not used by any of the scripts that come with MAKER (it’s just there for reference right now).

Basically AED is calculated with evidence overlap, but eAED will not count protein overlap unless it occurs in the same codon reading frame as the model (so evidence may count for a stretch, then stop counting for a few codons, then count again if there is an insertion in the alignment). Also, eAED will infer support for an exon if both flanking introns are validated by evidence and the region in between is all ORF (this allows joint intron support to infer support for an internal exon).

99% of the time AED and eAED are the same, but eAED can be useful in identifying edge cases. Much of the time, if AED and eAED are very different, it’s because there is a single base pair insertion or deletion in the assembly. The predictors still find the locus the best they can, but protein evidence and alignments will be out of sync with the reading frame on one of the exons. BLAST can’t really handle single bp INDELs in its alignments, but Exonerate can do mid-alignment reading frame shifts to capture the assembly INDEL (and eAED is an attempt to use the extra Exonerate info in the score).
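To find the cases where AED and eAED disagree, something like the following could be used to list candidates before opening them in a browser. This is only a rough sketch and not part of MAKER: it assumes the scores appear as `_AED` and `_eAED` tags in the attributes column of the mRNA lines of the GFF3 output, and the function names, the `min_gap` threshold, and the example records are all made up for illustration.

```python
def parse_attributes(column9):
    """Split a GFF3 attributes column into a dict of tag -> value."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def flag_divergent_models(gff_lines, min_gap=0.5):
    """Return (ID, AED, eAED) for mRNA features where eAED - AED >= min_gap.

    min_gap is an arbitrary illustrative threshold, not a MAKER default.
    """
    flagged = []
    for line in gff_lines:
        # Skip blank lines and GFF3 directives/comments.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        # Only 9-column mRNA feature lines are considered; FASTA lines fall out here.
        if len(cols) != 9 or cols[2] != "mRNA":
            continue
        attrs = parse_attributes(cols[8])
        # Assumption: scores are stored under the _AED and _eAED tags.
        if "_AED" not in attrs or "_eAED" not in attrs:
            continue
        aed = float(attrs["_AED"])
        eaed = float(attrs["_eAED"])
        if eaed - aed >= min_gap:
            flagged.append((attrs.get("ID", ""), aed, eaed))
    return flagged


# Invented records for illustration only (IDs, coordinates, and scores are made up).
example = [
    "##gff-version 3",
    "chr1\tmaker\tmRNA\t100\t900\t.\t+\t.\tID=m1;Parent=g1;_AED=0.10;_eAED=0.10",
    "chr1\tmaker\tmRNA\t2000\t2900\t.\t+\t.\tID=m2;Parent=g2;_AED=0.25;_eAED=1.00",
]

for model_id, aed, eaed in flag_divergent_models(example):
    print(model_id, aed, eaed)
```

On a real annotation the same function can be fed an open file handle instead of the example list. Models it reports are candidates for inspection rather than automatic removal, since a large gap may point to an assembly INDEL and not to a bad locus.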

—Carson


> On Jun 13, 2018, at 1:34 PM, Surya Saha <ss2489 at cornell.edu> wrote:
> 
> Hi Carson,
> 
> We have been using AED as a primary metric for evaluating predictions in our group but it sounds like we should be using both eAED and AED. Is there a detailed explanation of how exactly eAED and AED are computed besides Table 2 in the Cantarel 2008 paper? Thanks
> 
> -Surya
> 
> On Wed, Jun 13, 2018 at 2:03 PM Carson Holt <carsonhh at gmail.com> wrote:
> The eAED score also takes protein reading frame into account, and it can infer support for exons when both introns are validated (i.e. it can be lower than AED in some cases). In your case, an eAED of 1 with an AED less than 1 means that your evidence support is from an overlapping protein, but it is never in the same reading frame as the gene model. So the positive evidence support may be suspect, or it may be real and the model is poor because of the assembly, gaps, etc. To use eAED instead in the quality_filter.pl script, you would have to manually edit the script and replace ‘_AED’ with ‘_eAED’. Using eAED instead will greatly drop sensitivity on lower quality assemblies (places where the predictors make the best model they can and not the correct model, because the assembly won’t allow for the correct model but there is evidence that there is a gene locus). So make sure to always view suspect regions in a browser first.
> 
> —Carson
> 
> 
> 
>> On Jun 9, 2018, at 2:06 PM, Federico López <flopezo84 at gmail.com> wrote:
>> 
>> Hello,
>> 
>> I'm using MAKER's "quality_filter.pl" with the default option (AED<1). However, I have noticed cases in which models have low AED scores and high eAED scores (1.00), so presumably the good AED scores are the result of spurious evidence alignments. Is there a way to filter models based on eAED scores too?
>> 
>> Thank you.
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com <mailto:maker-devel at box290.bluehost.com>
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org <http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org>
> 
> 
> 
> -- 
> 
> Surya Saha
> Sol Genomics Network
> Boyce Thompson Institute, Ithaca, NY, USA
> https://citrusgreening.org/
> http://www.linkedin.com/in/suryasaha
> https://twitter.com/SahaSurya
