[maker-devel] The origin of te_proteins.fasta

Carson Holt carsonhh at gmail.com
Wed Sep 30 13:18:07 MDT 2015


It’s from a tool called RepeatRunner. Here is the paper: https://publications.mpi-cbg.de/Smith_2007_5404.pdf

After RepeatRunner was developed, RepeatMasker also started checking against repeats to get better performance. So nowadays the file may be somewhat redundant with what RepeatMasker does, but it does add a little. It’s not updated regularly, but since RepBase started adding proteins, that should not be an issue.

In addition to a number of protein repeats, te_proteins also contains a number of low complexity entries from NCBI’s NR database that tend to align falsely and frequently to many genomes. All te_proteins matches generate soft masking in the genome, whereas RepeatMasker results will be hard masked.
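
For readers unfamiliar with the distinction, here is a minimal sketch of the two masking styles as they are conventionally defined: soft masking lowercases the matched bases, hard masking replaces them with N. The function names and the 0-based, half-open interval format are assumptions made for illustration; this is not MAKER's or RepeatMasker's actual code.

```python
def soft_mask(seq, intervals):
    """Lowercase each (start, end) interval; the bases stay readable."""
    chars = list(seq)
    for start, end in intervals:
        # Slicing clamps to the sequence length, so the length never changes.
        chars[start:end] = [c.lower() for c in chars[start:end]]
    return "".join(chars)


def hard_mask(seq, intervals, mask_char="N"):
    """Replace each (start, end) interval with mask_char; the bases are lost."""
    chars = list(seq)
    for start, end in intervals:
        # Clamp explicitly so an out-of-range interval cannot lengthen the sequence.
        start = max(start, 0)
        end = min(end, len(chars))
        if end > start:
            chars[start:end] = [mask_char] * (end - start)
    return "".join(chars)


if __name__ == "__main__":
    toy = "ACGTACGT"          # hypothetical sequence
    hit = [(2, 5)]            # hypothetical repeat match
    print(soft_mask(toy, hit))
    print(hard_mask(toy, hit))
```

The practical difference is that a soft-masked region can still be read by downstream steps that choose to ignore case, while a hard-masked region cannot be recovered.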

—Carson


> On Sep 30, 2015, at 1:00 PM, Ole Kristian Tørresen <ole.toerresen at gmail.com> wrote:
> 
> Hi,
> the file te_proteins.fasta is distributed with MAKER and is suggested as a way to find more divergent transposable elements by searching at the protein level instead of at the nucleotide level. I've been unable to find any information about its creation, or whether it has been kept current. There is a file of mobile-element-derived proteins distributed with RepBase, called RepeatPeps.lib, which seems to contain the same amount of sequence (about 9.4 Mbp in both) but half the number of entries (10500 vs 25000).
> 
> Does anyone know how these two files compare? Could I use RepeatPeps.lib instead, or combine them (with some clustering maybe?)?
> 
> Thank you.
> 
> Sincerely,
> Ole Kristian Tørresen
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
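
The comparison asked about in the quoted message (how many entries and how much sequence each file holds, and what a combined set might look like) can be sketched roughly as follows. This is only an illustration: the function names are hypothetical, and removing exact duplicate sequences is a simple stand-in for real clustering, which would also need to collapse near-identical entries.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize(records):
    """Return (number of entries, total residues) for a list of records."""
    return len(records), sum(len(seq) for _, seq in records)


def merge_unique(*record_sets):
    """Concatenate record sets, keeping the first copy of each exact sequence."""
    seen, merged = set(), []
    for records in record_sets:
        for header, seq in records:
            key = seq.upper()  # compare case-insensitively
            if key not in seen:
                seen.add(key)
                merged.append((header, seq))
    return merged


if __name__ == "__main__":
    # Toy in-memory data; the real inputs would be the two library files.
    a = list(read_fasta([">a1", "MKV", "LL", ">a2", "PPP"]))
    b = list(read_fasta([">b1", "mkvll", ">b2", "QQ"]))
    print(summarize(a), summarize(b))
    print(merge_unique(a, b))
```

To run it on the actual files, open each one and pass the file handle to read_fasta, for example `with open("te_proteins.fasta") as fh: records = list(read_fasta(fh))`, and likewise for RepeatPeps.lib.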
