[maker-devel] Repeats annotation

Quanwei Zhang qwzhang0601 at gmail.com
Wed Sep 13 08:51:32 MDT 2017


Dear Carson:

We have generated a species-specific repeat library following your pipeline (
http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Repeat_Library_Construction--Basic)
and annotated the genome with maker2 using both the species-specific repeat
library and the mammalian repeat library.

Now we want to compare the repeat content among different species. So I
want to generate species-specific repeat libraries for the other species
and mask each genome with both its species-specific library and the
mammalian repeat library. But I found that I can provide only one of the
two to RepeatMasker, either the species-specific library or the mammalian
library, not both. I wonder whether I can run maker2 on those genomes for
repeat masking only.
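
One workaround for the single-library limit, sketched here only as an
assumption and not as a MAKER-recommended method, would be to merge the two
libraries into one FASTA file before masking. The sketch assumes the
mammalian library has already been exported to FASTA; the function name and
file names are hypothetical.

```python
from pathlib import Path


def merge_repeat_libraries(library_paths, merged_path):
    """Concatenate FASTA repeat libraries into one file.

    A record is skipped when its header ID (the first word after '>')
    was already written, so duplicated entries are kept only once.
    Returns the number of records written.
    """
    seen_ids = set()
    written = 0
    with open(merged_path, "w") as out:
        for path in library_paths:
            keep = False  # whether the current record is being copied
            for line in Path(path).read_text().splitlines():
                if line.startswith(">"):
                    fields = line[1:].split()
                    seq_id = fields[0] if fields else ""
                    keep = seq_id not in seen_ids
                    if keep:
                        seen_ids.add(seq_id)
                        written += 1
                if keep and line.strip():
                    out.write(line + "\n")
    return written


# Hypothetical usage:
# merge_repeat_libraries(["species_specific.fa", "mammalia.fa"], "combined.fa")
```

The merged file could then be given to RepeatMasker through its -lib
option. Whether a merged library gives the same result as the two separate
masking passes in maker2 is not something this sketch checks.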

BTW, running RepeatMasker gives a summary report (as below). I wonder
whether maker2 provides any script to analyze repeat elements, or whether
there are other tools to process the maker2 output.

Many thanks


file name: test_scaffold31.fasta
sequences:             1
total length:     863590 bp  (858757 bp excl N/X-runs)
GC level:         37.02 %
bases masked:     301634 bp ( 34.93 %)
==================================================
               number of      length   percentage
               elements*    occupied  of sequence
--------------------------------------------------
SINEs:               134        14362 bp    1.66 %
      Alu/B1          28         2183 bp    0.25 %
      MIRs            21         2860 bp    0.33 %

LINEs:               188       129104 bp   14.95 %
      LINE1          168       124633 bp   14.43 %
      LINE2           16         4266 bp    0.49 %
      L3/CR1           4          205 bp    0.02 %
      RTE              0            0 bp    0.00 %

LTR elements:        127       101129 bp   11.71 %
      ERVL            10         3057 bp    0.35 %
      ERVL-MaLRs      22         6902 bp    0.80 %
      ERV_classI      66        80258 bp    9.29 %
      ERV_classII     29        10912 bp    1.26 %

DNA elements:         27         4402 bp    0.51 %
      hAT-Charlie     13         1836 bp    0.21 %
      TcMar-Tigger     8         1651 bp    0.19 %

Unclassified:          4         1590 bp    0.18 %

Total interspersed repeats:    250587 bp   29.02 %


Small RNA:             9          616 bp    0.07 %

Satellites:           66        40820 bp    4.73 %
Simple repeats:      159         7235 bp    0.84 %
Low complexity:       50         2766 bp    0.32 %
==================================================

* most repeats fragmented by insertions or deletions
  have been counted as one element


The query species was assumed to be mammalia
RepeatMasker Combined Database: Dfam_Consensus-20170127, RepBase-20170127

run with rmblastn version 2.2.27+
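
For comparing such reports across species, the element rows of the summary
above could be read into a table with a short script. This is a minimal
sketch that assumes the report layout shown above (name, element count,
length in bp, percentage); the function name is hypothetical and it is not a
script shipped with maker2 or RepeatMasker.

```python
import re

# Matches rows such as "SINEs:   134   14362 bp   1.66 %" and
# sub-class rows such as "      LINE1   168   124633 bp   14.43 %".
ROW = re.compile(r"^\s*(.+?):?\s+(\d+)\s+(\d+)\s+bp\s+([\d.]+)\s*%\s*$")


def parse_repeat_summary(text):
    """Return {element name: {"count", "bp", "percent"}} for each element row.

    Lines without an element count (header lines, "bases masked",
    "Total interspersed repeats") do not match the pattern and are skipped.
    """
    table = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            name, count, length, percent = match.groups()
            table[name.strip()] = {
                "count": int(count),
                "bp": int(length),
                "percent": float(percent),
            }
    return table


# Hypothetical usage, one summary file per species:
# with open("species_A.tbl") as handle:
#     species_a = parse_repeat_summary(handle.read())
# print(species_a["LINEs"]["percent"])
```

Parsing one report per species into such a dictionary would allow the
percentages for each repeat class to be put side by side.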