[maker-devel] repeats masking

Quanwei Zhang qwzhang0601 at gmail.com
Wed Aug 30 08:01:02 MDT 2017


Dear Carson:

Thank you again for all your valuable suggestions. I am now generating the
species-specific repeat library. Before I run RepeatModeler, do I need to
remove the regions already masked by the existing RepeatMasker library? I
think there may be some redundancy if I run RepeatModeler directly on the
genome and then use both the existing RepeatMasker library and the
RepeatModeler library to mask the genome. Does such redundancy matter?

Thanks

Best
Quanwei

2017-08-23 14:10 GMT-04:00 Carson Holt <carsonhh at gmail.com>:

>
> (1) For the predicted unknown (unclassified) repeat sequences (those in
> Modelerunknown.lib), it mentioned "Sequences in Modelerunknown.lib were
> searched against a transposase database (derived from RepeatMasker
> <http://www.repeatmasker.org/>) and sequences matching transposase were
> considered as transposons belonging to the relevant superfamily".
> I wonder how to do this search. Should I annotate the "unknown" repeat
> sequences using RepeatMasker? And what should I do if only part of an
> "unknown" repeat sequence matches the known repeat elements?
>
>
> You can use a RepBase match, I guess, but I would not be overly worried
> about classification. MAKER won’t use any classification info you give it.
>
>
> (2) To exclude gene fragments, I need to map the predicted repeat sequences
> against a protein database and then run the package "ProtExcluder",
> right? I wonder how to get such a protein database. Since I am working on
> a new rodent species, can I use all the rodent proteins from UniProt (both
> Swiss-Prot and TrEMBL)?
>
>
> Try Swiss-Prot. That is a well-curated cross-species set.
>
>
> (3) After I generate the species-specific repeat library, do I still need
> to select a model organism for RepBase masking (as shown below)?
>
> In the file "maker_opts.ctl"
> #-----Repeat Masking (leave values blank to skip repeat masking)
> model_org=Mammalia #select a model organism for RepBase masking in RepeatMasker
> rmlib=myRepeat.fa #provide an organism specific repeat library in fasta format for RepeatMasker
>
>
> Yes. Supply both.
>
>
> —Carson
>
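
For point (2) above, a minimal Python sketch of the general idea: given the repeat library in FASTA format and a tabular BLAST report (-outfmt 6) of that library searched against a protein set such as Swiss-Prot, drop every repeat sequence that has a significant protein hit. This is a cruder stand-in for a dedicated tool, not a reproduction of one. The function names, the 1e-10 E-value cutoff, and the example sequence and protein IDs are all assumptions made for illustration.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def ids_with_protein_hits(blast_tab, max_evalue=1e-10):
    """Return query IDs that have a hit at or below max_evalue.

    Expects the 12 default columns of BLAST tabular output (-outfmt 6):
    column 1 is the query ID and column 11 is the E-value.
    The 1e-10 default is an arbitrary illustrative cutoff.
    """
    hits = set()
    for line in blast_tab:
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(fields) < 12:
            continue  # skip comment lines and malformed rows
        if float(fields[10]) <= max_evalue:
            hits.add(fields[0])
    return hits


def drop_protein_like(fasta_handle, blast_tab, max_evalue=1e-10):
    """Return the (header, sequence) pairs that have no protein hit."""
    hits = ids_with_protein_hits(blast_tab, max_evalue)
    kept = []
    for header, seq in read_fasta(fasta_handle):
        parts = header.split()
        seq_id = parts[0] if parts else ""  # the ID is the first word of the header
        if seq_id not in hits:
            kept.append((header, seq))
    return kept


if __name__ == "__main__":
    # Hypothetical two-sequence library and a single made-up BLAST hit.
    fasta = io.StringIO(
        ">rnd-1_family-1#Unknown\nACGTACGT\n"
        ">rnd-1_family-2#Unknown\nTTTTGGGG\n"
    )
    blast = io.StringIO(
        "rnd-1_family-2#Unknown\tsp|P00000|HYPO_MOUSE\t85.0\t40\t6\t0"
        "\t1\t120\t10\t49\t1e-20\t95.0\n"
    )
    for header, seq in drop_protein_like(fasta, blast):
        print(f">{header}\n{seq}")
```

Dropping whole sequences is the simplest possible policy. It discards a repeat family even when only a short stretch resembles a protein, so review the removed IDs before using the filtered library as rmlib.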