[maker-devel] About loss of Histone H2A, H2B, H4

Quanwei Zhang qwzhang0601 at gmail.com
Tue Nov 28 06:39:52 MST 2017


Dear Carson:

Thank you!

Best
Quanwei

2017-11-27 14:56 GMT-05:00 Carson Holt <carsonhh at gmail.com>:

> You should not have to train SNAP separately on unmasked sequence, and I
> do believe that adding back genes that were rejected for lack of evidence
> support but contain an identifiable domain may help. These will be in the
> FASTA files labeled non-overlapping in the datastore.
>
> —Carson
>
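The "add back" step described above can be sketched in Python as follows. This is only a minimal sketch under assumptions: it supposes InterProScan was run with TSV output on the non-overlapping protein FASTA, that the protein ID is in column 1 and the InterPro accession in column 12 of that TSV, and that the caller supplies the set of accessions of interest. The function names, the demo records and the accession used (IPR009072, which I believe is the histone-fold entry and should be checked against InterPro) are illustrative, not part of MAKER or InterProScan.

```python
import csv
import io


def ids_with_domains(interpro_tsv, wanted):
    """Return protein IDs that have at least one InterPro accession in `wanted`."""
    hits = set()
    reader = csv.reader(interpro_tsv, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Assumed layout: column 1 = protein ID, column 12 = InterPro accession
        # (missing or "-" when the signature has no integrated InterPro entry).
        if len(row) > 11 and row[11] in wanted:
            hits.add(row[0])
    return hits


def filter_fasta(fasta, keep_ids):
    """Yield FASTA records (as text) whose first header token is in keep_ids."""
    record, keep = [], False
    for line in fasta:
        if line.startswith(">"):
            if keep and record:
                yield "".join(record)
            tokens = line[1:].split()
            keep = bool(tokens) and tokens[0] in keep_ids
            record = [line]
        else:
            record.append(line)
    if keep and record:
        yield "".join(record)


# Made-up demo data, for illustration only.
demo_tsv = io.StringIO(
    "\t".join(["geneA-RA", "md5", "120", "Pfam", "PF00125", "Core histone",
               "1", "100", "1e-20", "T", "01-01-2017", "IPR009072",
               "Histone-fold"]) + "\n"
    + "\t".join(["geneB-RA", "md5", "90", "Pfam", "PF99999", "Other",
                 "1", "80", "1e-5", "T", "01-01-2017", "-", "-"]) + "\n"
)
demo_fasta = io.StringIO(">geneA-RA\nMSGR\n>geneB-RA\nMKLV\n")

rescued_ids = ids_with_domains(demo_tsv, {"IPR009072"})
rescued_records = list(filter_fasta(demo_fasta, rescued_ids))
```

In real use the two file-like objects would be the opened InterProScan TSV and the non-overlapping protein FASTA from the datastore; the rescued records still need manual review before being merged into the gene set.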
> On Nov 21, 2017, at 10:42 AM, Quanwei Zhang <qwzhang0601 at gmail.com> wrote:
>
> Dear Carson:
>
> Thank you for your comments and suggestions. SNAP was trained on the
> repeat-masked genome; is it necessary to retrain the predictor without
> repeat masking?
> The BUSCO analysis of the genome gives the completeness shown below. I am
> currently doing the analysis with the default output of Maker2 (i.e., gene
> models with evidence support, the default build). For the gene loss,
> besides your suggestions, I am also considering repeating the analysis with
> the gene models with evidence support plus those with scanned domains
> (i.e., the standard build). What do you think?
>
>
> C:95.0%[S:92.7%,D:2.3%],F:2.2%,M:2.8%,n:4104
>   3902  Complete BUSCOs (C)
>   3806  Complete and single-copy BUSCOs (S)
>   96  Complete and duplicated BUSCOs (D)
>   92  Fragmented BUSCOs (F)
>   110  Missing BUSCOs (M)
>
> Thanks
> Best
> Quanwei
>
>
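The BUSCO summary quoted above can be sanity-checked with a few lines of Python. This is a rough sketch, assuming the one-line summary keeps the `C:..%[S:..%,D:..%],F:..%,M:..%,n:..` layout shown in the message; the function names are made up for illustration. Because the printed percentages are rounded, the sketch only checks that the category counts add up to `n` and does not try to recompute the percentages.

```python
import re

# Pattern for the one-line BUSCO summary as it appears in the message above.
SUMMARY_RE = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def parse_busco_line(line):
    """Parse the summary line into a dict of percentages plus the total n."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("not a BUSCO summary line: %r" % line)
    parsed = {k: float(v) for k, v in match.groupdict().items() if k != "n"}
    parsed["n"] = int(match.group("n"))
    return parsed


def counts_consistent(single, duplicated, fragmented, missing, total):
    """True if the four category counts add up to the total number of BUSCOs."""
    return single + duplicated + fragmented + missing == total


summary = parse_busco_line("C:95.0%[S:92.7%,D:2.3%],F:2.2%,M:2.8%,n:4104")
# Counts copied from the table in the message: S, D, F, M and n.
counts_ok = counts_consistent(3806, 96, 92, 110, summary["n"])
```

With the numbers from the message the counts are consistent, so the 110 missing BUSCOs are the part of the summary to look at when judging whether the assembly could explain an apparent gene family contraction.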
> 2017-11-21 11:19 GMT-05:00 Carson Holt <carsonhh at gmail.com>:
>
>> No known biases, but if you are concerned, you can collect known Histone
>> H2A, H2B, H4 proteins and transcripts from other species (protein= and
>> altest= options), then run MAKER with no masking to see if you gain any
>> models that may have been overlooked because of over-masking of repeats.
>> Make sure to evaluate whether any models you find are pseudogenes. Run
>> InterProScan on the results to make sure they contain known InterPro domains
>> for that gene family as well. Running without repeat masking will increase
>> sensitivity, but it will also increase false positives derived from
>> low-homology alignments to simple repeats, which is why you need to evaluate
>> the results using something like InterProScan.
>>
>> Also run BUSCO to evaluate the completeness of the genome. Make sure that
>> the observed contraction is not just a result of an incomplete assembly.
>>
>> —Carson
>>
>>
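The unmasked rerun suggested above amounts to a few changes in maker_opts.ctl, which could be scripted along these lines. This is a sketch under assumptions: `protein=` and `altest=` come from the message, while blanking `model_org=`, `rmlib=` and `repeat_protein=` is my understanding of how masking is switched off and should be checked against your own control file and the MAKER documentation. The helper name, the demo control file text and the FASTA file names are hypothetical.

```python
import re


def set_ctl_options(ctl_text, updates):
    """Rewrite `key=value #comment` lines for the keys in `updates`.

    Trailing comments are preserved; a KeyError is raised if a requested
    key does not occur in the control file text.
    """
    out, seen = [], set()
    for line in ctl_text.splitlines():
        match = re.match(r"^(\w+)=([^#]*)(#.*)?$", line)
        if match and match.group(1) in updates:
            key = match.group(1)
            comment = match.group(3) or ""
            out.append(("%s=%s %s" % (key, updates[key], comment)).rstrip())
            seen.add(key)
        else:
            out.append(line)
    missing = set(updates) - seen
    if missing:
        raise KeyError("keys not found in control file: %s" % sorted(missing))
    return "\n".join(out) + "\n"


# Shortened, made-up stand-in for a real maker_opts.ctl.
demo_ctl = (
    "protein= #protein evidence\n"
    "altest= #ESTs from other species\n"
    "model_org=all #RepeatMasker model organism\n"
    "rmlib=my_repeats.fa #custom repeat library\n"
    "repeat_protein=te_proteins.fa #TE proteins\n"
)

updated_ctl = set_ctl_options(demo_ctl, {
    "protein": "histone_proteins.fa",      # hypothetical file name
    "altest": "histone_transcripts.fa",    # hypothetical file name
    "model_org": "",                       # assumption: blank disables masking
    "rmlib": "",
    "repeat_protein": "",
})
```

Writing the result to a separate control file and output directory keeps the unmasked run apart from the original annotation, so the two gene sets can be compared afterwards.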
>> On Nov 16, 2017, at 12:46 PM, Quanwei Zhang <qwzhang0601 at gmail.com>
>> wrote:
>>
>> Hello:
>>
>> We have annotated a new rodent genome using Maker2. Based on the
>> annotated Maker2 gene sets, we did a gene family expansion/contraction
>> analysis using CAFE. We found that the Histone H2A, H2B, and H4 gene
>> families are under contraction. I wonder whether there are known biases in
>> predicting those gene families with Maker2? For example, could this be due
>> to repeat masking of the genome? I used RepeatMasker and generated
>> species-specific repeat libraries following
>> http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Repeat_Library_Construction--Basic.
>>
>> Thanks
>>
>> Best
>> Quanwei
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>
>>
>>
>
>