[maker-devel] tblastn Cleanup?

Yogesh yogeshp08 at gmail.com
Fri Jun 1 10:56:13 MDT 2012


Hi Carson,

Thanks a lot. This is helpful.

I also wanted to ask how I can follow this up with exonerate polishing. Is there a module in MAKER that can run exonerate separately on my BLAST results?

-Yogesh


On Friday, May 18, 2012 at 9:22 AM, Carson Holt wrote:

> There are several things.  I set several filtering options directly on the BLAST command line.  These are things like maximum intron length, an e-value filter, and low-complexity (simple repeat) filtering (the DUST filter for nucleotide queries and the SEG filter for protein queries).
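
The command-line filters described above might look like the following with NCBI BLAST+ tblastn. This is only a sketch: the function name, file names, and threshold values are placeholders made up for illustration, not MAKER's actual settings. The flags themselves (`-evalue`, `-seg`, `-max_intron_length`, `-outfmt`) are standard BLAST+ tblastn options.

```python
def tblastn_command(query_faa, db, out_path, evalue=1e-6, max_intron=10000):
    """Build a tblastn argument list with the filters set on the command line.

    evalue and max_intron are placeholder defaults, not MAKER's values.
    """
    return [
        "tblastn",
        "-query", query_faa,                    # protein FASTA file
        "-db", db,                              # nucleotide BLAST database (ideally repeat-masked)
        "-evalue", str(evalue),                 # e-value cutoff
        "-seg", "yes",                          # low-complexity filter for the protein query
        "-max_intron_length", str(max_intron),  # limit on introns when linking HSPs
        # Tabular output with the columns a later coverage filter would need.
        "-outfmt", "6 qseqid sseqid pident length qstart qend qlen",
        "-out", out_path,
    ]


if __name__ == "__main__":
    # Print the command; pass the list to subprocess.run() to execute it.
    print(" ".join(tblastn_command("proteins.faa", "genome_db", "hits.tsv")))
```
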
> 
> I also run RepeatMasker over the genome first.  This allows simple and complex repeats to be masked before running BLAST (otherwise you get many false alignments).
> 
> Last, I filter the results based on percent identity and on percent coverage of the hit relative to the original database sequence.  I think you can set percent identity as a flag in BLAST, but the percent coverage filter is calculated by MAKER.  To do this outside of MAKER, you would have to write your own filtering script that compares the length of the alignment to the length of the sequence in the database.
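
A standalone coverage and identity filter along these lines could be sketched as below. It assumes tabular tblastn output with the columns `qseqid sseqid pident length qstart qend qlen`, where the protein is the query, so coverage is measured on the protein. The function names and both thresholds are illustrative assumptions, not MAKER's code or defaults.

```python
from collections import defaultdict


def merged_length(intervals):
    """Length covered by 1-based inclusive intervals, counting overlaps once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted((min(s, e), max(s, e)) for s, e in intervals):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def parse_tabular(lines):
    """Parse lines with columns: qseqid sseqid pident length qstart qend qlen."""
    for line in lines:
        q, s, pident, length, qstart, qend, qlen = line.rstrip("\n").split("\t")
        yield (q, s, float(pident), int(length), int(qstart), int(qend), int(qlen))


def filter_hits(rows, min_coverage=0.5, min_identity=40.0):
    """Return (protein, contig) pairs that pass both placeholder thresholds."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row[0], row[1])].append(row)
    kept = []
    for key, hsps in groups.items():
        protein_len = hsps[0][6]
        # Fraction of the protein covered by any HSP of this hit.
        coverage = merged_length([(h[4], h[5]) for h in hsps]) / protein_len
        # Identity averaged over HSPs, weighted by alignment length.
        identity = sum(h[2] * h[3] for h in hsps) / sum(h[3] for h in hsps)
        if coverage >= min_coverage and identity >= min_identity:
            kept.append(key)
    return kept
```

Grouping by (protein, contig) treats all HSPs between a protein and a contig as one hit. That is a simplification: a real script would probably need to split distant HSP clusters on the same contig into separate hits.
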
> 
> I also have an HSP depth overlap filter.  This removes weird low-complexity hits that escape repeat masking.  They show up as multiple HSPs overlapping multiple times in the same region (usually very high numbers, like 90 HSPs, all 100 bp long, in the same region).  I calculate the number of base pairs in the alignment on the hit, then divide by the number of base pairs in the query alignment.  If the ratio is greater than 3, I throw the hit out.
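
One possible reading of this depth filter, sketched below, is to sum the lengths of all HSPs in a hit and divide by the length of the region they jointly cover. That is an interpretation of the description, not MAKER's actual implementation. The function names are hypothetical; the cutoff of 3 is the one quoted above.

```python
def footprint(intervals):
    """Length of the region covered by the intervals, counting overlaps once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted((min(s, e), max(s, e)) for s, e in intervals):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def hsp_depth(intervals):
    """Average number of HSPs stacked over each covered base."""
    summed = sum(abs(end - start) + 1 for start, end in intervals)
    return summed / footprint(intervals)


def is_stacked_hit(intervals, max_depth=3.0):
    """True if the HSPs pile up more deeply than max_depth (hit should be dropped)."""
    return hsp_depth(intervals) > max_depth


if __name__ == "__main__":
    # The case from the text: 90 HSPs, each 100 bp, all in the same region.
    stacked = [(1, 100)] * 90
    print(hsp_depth(stacked), is_stacked_hit(stacked))
```
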
> 
> Thanks,
> Carson
> 
> 
> 
> From: Yogesh <yogeshp08 at gmail.com>
> Date: Tuesday, 15 May, 2012 12:07 PM
> To: <maker-devel at yandell-lab.org>
> Subject: [maker-devel] tblastn Cleanup?
> 
> Hello,
> 
> I have a few tblastn alignments with a lot of low-quality hits that I need to clean up. Can you please explain how the MAKER pipeline does this? Also, can I run that cleanup directly on my data without having to go through the whole pipeline?
> 
> Thanks,
> 
> -Yogesh