<div><font face="'Trebuchet MS'">Hi Carson,</font></div><div><font face="'Trebuchet MS'"><br></font></div><div><font face="'Trebuchet MS'">Thanks a lot. This is helpful.</font></div><div><font face="'Trebuchet MS'"><br></font></div><div><font face="'Trebuchet MS'">Also, I wanted to ask how I can follow this up with exonerate polishing. Is there a module in MAKER that can run this separately on my BLAST results?</font></div><div><font face="'Trebuchet MS'"><br></font></div><div><font face="'Trebuchet MS'">-Yogesh</font></div><div><font face="'Trebuchet MS'"><br></font></div>
<p style="color: #A0A0A8;"><font face="'Trebuchet MS'">On Friday, May 18, 2012 at 9:22 AM, Carson Holt wrote:</font></p><blockquote type="cite"><div>
<span><div><div><font face="'Trebuchet MS'"><div>There are several things. I set several filtering options directly on the BLAST command line. These are things like maximum intron length, an e-value filter, and simple repeat filtering (called the dust filter in NCBI BLAST and the seg filter in WU-BLAST).</div><div><br></div><div>I also run RepeatMasker over the genome first. This allows simple and complex repeats to be removed before running BLAST (otherwise you get many false alignments).</div><div><br></div><div>Last, I filter the results based on percent identity and on percent coverage of the hit relative to the original database sequence. I think you can set percent identity as a flag in BLAST, but the percent coverage filter is calculated by MAKER, so to do this outside of MAKER you would need to write your own filtering script that compares the length of the alignment to the length of the sequence in the database.</div><div><br></div><div>I also have an HSP depth overlap filter. This removes weird low-complexity hits that escape repeat masking. They show up as multiple HSPs overlapping multiple times in the same region (usually very high numbers, like 90 HSPs, all 100 bp long, in the same region). I calculate the number of base pairs in the alignment on the hit, then divide by the number of base pairs in the query alignment. 
If that ratio is greater than 3, I throw the hit out.</div><div><br></div><div>Thanks,</div><div>Carson</div><div><br></div><div><br></div><div><br></div><span><div style="text-align: left; color: black; border-bottom-width: medium; border-bottom-style: none; border-bottom-color: initial; border-left-width: medium; border-left-style: none; border-left-color: initial; padding-bottom: 0in; padding-left: 0in; padding-right: 0in; border-top-color: rgb(181, 196, 223); border-top-width: 1pt; border-top-style: solid; border-right-width: medium; border-right-style: none; border-right-color: initial; padding-top: 3pt; "><span style="font-weight:bold">From: </span> Yogesh &lt;<a href="mailto:yogeshp08@gmail.com">yogeshp08@gmail.com</a>&gt;<br><span style="font-weight:bold">Date: </span> Tuesday, 15 May, 2012 12:07 PM<br><span style="font-weight:bold">To: </span> &lt;<a href="mailto:maker-devel@yandell-lab.org">maker-devel@yandell-lab.org</a>&gt;<br><span style="font-weight:bold">Subject: </span> [maker-devel] tblastn Cleanup?<br></div><div><br></div>
<div>Hello,</div><div><br></div><div>I have a few tblastn alignments with a lot of low-quality hits that I need to clean up. Can you please suggest how the MAKER pipeline does this? Also, can I run that cleanup directly on my data without having to go through the whole pipeline?</div><div><br></div><div>Thanks,</div><div><br></div><div>-Yogesh</div></span></font></div></div></span></div></blockquote>
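<div>For anyone wanting to try the coverage and HSP depth filters outside of MAKER, here is a minimal Python sketch of the two checks described in the reply above. It is not MAKER's code. The <code>HSP</code> class, its field names, and the 0.5 coverage cutoff are hypothetical placeholders; only the depth cutoff of 3 comes from the thread. The depth ratio is one possible reading of the description (summed HSP lengths on the hit divided by the distinct positions covered on the query), and it assumes both coordinate sets have already been parsed from the BLAST report and are in the same units.</div>

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class HSP:
    # Hypothetical record: 1-based inclusive coordinates for one HSP.
    q_start: int
    q_end: int
    h_start: int
    h_end: int


def union_length(intervals: Iterable[Tuple[int, int]]) -> int:
    """Count distinct positions covered by a set of inclusive intervals."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted((min(a, b), max(a, b)) for a, b in intervals):
        if cur_end is None or start > cur_end + 1:
            # Gap before this interval: close the previous merged block.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def hit_coverage(hsps: List[HSP], hit_length: int) -> float:
    """Fraction of the database sequence covered by the alignment."""
    return union_length((h.h_start, h.h_end) for h in hsps) / hit_length


def hsp_depth(hsps: List[HSP]) -> float:
    """Assumed reading: summed HSP length on the hit over distinct query positions."""
    summed = sum(abs(h.h_end - h.h_start) + 1 for h in hsps)
    covered = union_length((h.q_start, h.q_end) for h in hsps)
    return summed / covered


def keep_hit(hsps: List[HSP], hit_length: int,
             min_coverage: float = 0.5,   # placeholder cutoff, not from the thread
             max_depth: float = 3.0) -> bool:  # cutoff of 3 mentioned in the thread
    return hit_coverage(hsps, hit_length) >= min_coverage and hsp_depth(hsps) <= max_depth
```

<div>Calling <code>keep_hit</code> once per hit, with all of that hit's HSPs and the length of the database sequence, gives a keep/discard decision. Percent identity and e-value are left to the BLAST command line, as suggested in the reply.</div>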