<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">One note. When I say single-exon blastx hit, I mean that the evidence is single exon, not that the gene model is single exon. What I think you are seeing is an effect that seems to be partially related to under-masking, i.e. a spurious partial blastx alignment to a low-complexity repeat (which is why the blastx protein alignment refuses to polish with exonerate). That is why the filter was added. So if a model (single or multi-exon) has no additional ab initio prediction support, no EST support, and no exonerate-polished protein support, but does have a single-exon/single-HSP blastx overlap, it gets filtered out at 0.5. That threshold is based on trial and error on a couple of genomes where we saw this occur, but your graph suggests the filter might be too loose and that 0.4 or 0.45 might be a better value.
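<div class=""><br class=""></div><div class="">To make the rule above concrete, here is a minimal Python sketch of the decision as described. It is only an illustration, not MAKER's actual code: the ModelEvidence class, its field names, and the is_filtered function are hypothetical, and it assumes that "filtered out at 0.5" means models scoring at or above the threshold are dropped.</div>

```python
from dataclasses import dataclass


@dataclass
class ModelEvidence:
    """Hypothetical summary of the evidence overlapping one gene model."""
    score: float                # the per-model score plotted in the histogram
    has_ab_initio: bool         # additional ab initio prediction support
    has_est: bool               # EST support
    has_polished_protein: bool  # exonerate-polished protein support
    blastx_single_hsp: bool     # single-exon/single-HSP blastx overlap


def is_filtered(model: ModelEvidence, threshold: float = 0.5) -> bool:
    """Return True if the model would be dropped under the rule described above."""
    # The model's only support is a single-HSP blastx overlap.
    only_weak_blastx = (
        model.blastx_single_hsp
        and not model.has_ab_initio
        and not model.has_est
        and not model.has_polished_protein
    )
    # Assumption: scores at or above the threshold are the ones removed.
    return only_weak_blastx and model.score >= threshold


# A model supported only by a single-HSP blastx hit, scoring just under 0.5.
weak = ModelEvidence(0.45, False, False, False, True)
print(is_filtered(weak))        # kept at the default threshold of 0.5
print(is_filtered(weak, 0.4))   # dropped if the threshold is tightened to 0.4
```

<div class="">Under these assumptions, tightening the threshold from 0.5 to 0.4 or 0.45 only affects models whose sole support is the single-HSP blastx overlap; any model with EST, ab initio, or polished protein support is untouched.</div>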
So the spike is caused by poor blastx alignments and under-masking (this may be explained if you are using models in pred_gff that were generated on an unmasked assembly outside of MAKER), and the drop around 0.5 is caused by MAKER filtering out models supported only by what appear to be spurious blastx alignments.<div class=""><br class=""></div><div class="">—Carson</div><div class=""><br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Apr 8, 2019, at 3:10 AM, Lior Glick <<a href="mailto:liorglic@mail.tau.ac.il" class="">liorglic@mail.tau.ac.il</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="rtl" class=""><div style="" dir="ltr" class="">Hi again - quick update:</div><div style="" dir="ltr" class="">I made a plot comparing the histograms of single-exon genes to multi-exon genes:</div><div style="" dir="ltr" class=""><div class=""><span id="cid:ii_ju84rhn00"><newplot (5).png></span><br class=""></div></div><div style="" dir="ltr" class="">It definitely looks like single-exon genes are <b class="">enriched</b> for the 0.5 score, but it does not account for the entire surge, as there also seem to be lots of multi-exon genes involved. This may suggest that the 0.5 peak is a result of multiple effects buried within the software.</div><div style="" dir="ltr" class="">Any other thoughts/suggestions?</div><div style="" dir="ltr" class=""><br class=""></div><div style="" dir="ltr" class="">Thanks again,</div><div style="" dir="ltr" class=""><br class=""></div></div>
</div></blockquote></div><br class=""></div></body></html>