[maker-devel] training of gene finders using whole assembly or longest contigs?

Quanwei Zhang qwzhang0601 at gmail.com
Fri Feb 10 09:03:41 MST 2017


Hello:

I am training the gene finders on the whole assembly, but it is very
time consuming, and I have to repeat the training process several
times. Even running on 25 nodes of a server, the training may still take
3 weeks or more. I wonder how you train SNAP: do you use the whole
assembly, or do you select only the longest contigs? If I use only the
longest contigs (say, the top 20% by length), will the result be as good
as what I would get from the whole assembly? Or should I randomly select
20% of the contigs, so that the training set has a length distribution
similar to that of the whole assembly?
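
To make the two options concrete, here is a minimal sketch of how each
subset might be chosen from a table of contig lengths. The function names
and the example contig names and lengths are hypothetical, used for
illustration only, and the sketch does not show which subset trains SNAP
better.

```python
import random


def longest_contigs(lengths, fraction=0.2):
    """Return the names of the longest `fraction` of contigs (at least one)."""
    n = max(1, int(len(lengths) * fraction))
    # Rank contig names by length, longest first.
    ranked = sorted(lengths, key=lambda name: lengths[name], reverse=True)
    return ranked[:n]


def random_contigs(lengths, fraction=0.2, seed=0):
    """Return a random `fraction` of contig names (at least one).

    A uniform random sample roughly preserves the length distribution
    of the whole assembly; the seed makes the choice reproducible.
    """
    n = max(1, int(len(lengths) * fraction))
    rng = random.Random(seed)
    # Sort names first so the sample does not depend on dict ordering.
    return rng.sample(sorted(lengths), n)


# Hypothetical contig lengths in base pairs.
example_lengths = {
    "ctg01": 950_000, "ctg02": 720_000, "ctg03": 410_000, "ctg04": 300_000,
    "ctg05": 150_000, "ctg06": 90_000, "ctg07": 42_000, "ctg08": 18_000,
    "ctg09": 7_500, "ctg10": 2_100,
}

top = longest_contigs(example_lengths)      # the 2 longest of 10 contigs
sampled = random_contigs(example_lengths)   # 2 contigs chosen at random
```

The selected names could then be used to extract the matching sequences
from the assembly FASTA before running the training.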

Thanks

Best
Quanwei