[maker-devel] repeats masking
Quanwei Zhang
qwzhang0601 at gmail.com
Wed Aug 30 08:21:15 MDT 2017
Dear Daniel:
Thank you! I am running it on an unmasked genome. I just wanted to make sure
that is the correct approach.
Have a nice day!
Best
Quanwei
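
On the redundancy question in the thread quoted below: for anyone who wants to see how much two repeat libraries overlap before masking, here is a minimal sketch in Python that lists entries of one FASTA library whose sequence also appears verbatim in another. It only catches exact duplicates (no similarity clustering), and per Daniel's reply this check is not needed for masking. The sequence names and toy sequences are made up for illustration.

```python
def read_fasta(text):
    """Parse FASTA text into a dict of {name: uppercase sequence}."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the record name.
            header = line[1:].split()[0]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {name: "".join(parts).upper() for name, parts in records.items()}


def shared_sequences(lib_a, lib_b):
    """Return names from lib_b whose sequence also occurs verbatim in lib_a."""
    seen = set(lib_a.values())
    return sorted(name for name, seq in lib_b.items() if seq in seen)


# Hypothetical toy libraries; real libraries would be read from files.
existing = read_fasta(">L1_known\nACGTACGT\nACGT\n>SINE_known\nTTTTGGGG\n")
modeled = read_fasta(">rnd-1_family-1\nacgtacgtacgt\n>rnd-1_family-2\nCCCCAAAA\n")
duplicates = shared_sequences(existing, modeled)
print(duplicates)
```

Comparison is case-insensitive because soft-masked libraries may mix upper and lower case.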
2017-08-30 10:19 GMT-04:00 Daniel Ence <dandence at gmail.com>:
> Hi Quanwei,
>
> I think you should run it on an unmasked genome. I don’t think that
> redundancy between repeat libraries will be an issue.
>
> Thanks,
> Daniel
>
>
>
> On Aug 30, 2017, at 10:01 AM, Quanwei Zhang <qwzhang0601 at gmail.com> wrote:
>
> Dear Carson:
>
> Thank you again for all your valuable suggestions. Now I am generating the
> species-specific repeat library. I wonder whether I need to remove the
> regions already masked by the existing RepeatMasker library before I run
> RepeatModeler. I think there may be some redundancy if I run RepeatModeler
> directly on the genome and then use both the existing RepeatMasker library
> and the RepeatModeler library to mask the genome. Does it matter if there is
> such redundancy?
>
> Thanks
>
> Best
> Quanwei
>
> 2017-08-23 14:10 GMT-04:00 Carson Holt <carsonhh at gmail.com>:
>
>>
>> (1) For the predicted unknown (unclassified) repeat sequences (those in
>> Modelerunknown.lib), it mentioned "Sequences in Modelerunknown.lib were
>> searched against a transposase database (derived from RepeatMasker
>> <http://www.repeatmasker.org/>) and sequences matching transposase were
>> considered as transposons belonging to the relevant superfamily".
>> I wonder how to do this search. Should I annotate the "unknown" repeat
>> sequences using RepeatMasker? And what should I do if, for an "unknown"
>> repeat sequence, only part of the sequence matches known repeat elements?
>>
>>
>> You can use a RepBase match, I guess, but I would not be overly worried
>> about classification. MAKER won’t use any classification info you give it.
>>
>>
>> (2) To exclude gene fragments, I need to map the predicted repeat sequences
>> against a protein database, and then run the package "ProtExcluder",
>> right? I wonder how to get such a protein database. Since I am working on
>> a new rodent species, can I use all the rodent proteins from UniProt (both
>> Swiss-Prot and TrEMBL)?
>>
>>
>> Try Swiss-Prot. That is a well curated cross species set.
>>
>>
>> (3) After I generate the species-specific repeat library, do I still need
>> to select a model organism for RepBase masking (as shown below)?
>>
>> In the file "maker_opts.ctl"
>> #-----Repeat Masking (leave values blank to skip repeat masking)
>> model_org=Mammalia #select a model organism for RepBase masking in RepeatMasker
>> rmlib=myRepeat.fa #provide an organism specific repeat library in fasta format for RepeatMasker
>>
>>
>> Yes. Supply both.
>>
>>
>> —Carson
>>
>
>
>
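
As a rough illustration of the first half of question (2) above, the sketch below (Python) flags repeat-library entries that have a strong protein hit in BLAST tabular output (-outfmt 6), so they can be reviewed as possible gene fragments. It is not a substitute for the dedicated package mentioned in the question. The 1e-10 e-value cutoff, the function name, and the example hit lines (including the accessions) are assumptions made up for illustration.

```python
def repeats_with_protein_hits(blast_tabular, max_evalue=1e-10):
    """Return IDs of query sequences with at least one hit at or below max_evalue.

    Expects the standard 12-column BLAST tabular layout (-outfmt 6), where
    column 1 is the query ID and column 11 is the e-value.
    """
    flagged = set()
    for line in blast_tabular.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip malformed rows rather than guess at their layout
        if float(fields[10]) <= max_evalue:
            flagged.add(fields[0])
    return sorted(flagged)


# Made-up hits: rep1 has two strong protein matches, rep2 only a weak one.
example = "\n".join([
    "rep1\tsp|P00001|X_MOUSE\t85.0\t120\t18\t0\t1\t360\t10\t129\t1e-50\t180",
    "rep2\tsp|P00002|Y_RAT\t40.0\t30\t18\t0\t1\t90\t5\t34\t0.5\t25.0",
    "rep1\tsp|P00003|Z_MOUSE\t70.0\t100\t30\t0\t1\t300\t1\t100\t1e-20\t95.0",
])
flagged = repeats_with_protein_hits(example)
print(flagged)
```

Each flagged ID is reported once even if it has several qualifying hits; a stricter or looser cutoff can be passed as max_evalue.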