[maker-devel] Question about Maker2

Daniel Ence dence at genetics.utah.edu
Thu Mar 31 11:35:36 MDT 2016


Hi Jose, the time MAKER takes to annotate a genome depends heavily on the hardware setup (as you pointed out: processors, memory, etc.), on the size of the genome, and on the size and type of the datasets you use to annotate it (for example, a project with numerous RNAseq datasets will take longer than one without any RNAseq data).

However, with the MPI parallelization implemented in MAKER, the runtime should scale almost linearly with the number of processors allotted to the MAKER run. This is explained in the MAKER2 paper (Holt and Yandell), which I’m going to quote:
MAKER2 was used to annotate a 10 megabase section of the C. elegans genome
(NGASP dataset). The algorithm was parallelized using MPI on an increasing number
of CPU cores. The results demonstrate how MAKER2 scales almost linearly with
CPU number (with a slope of near 1). If we project our results forward to the entire C.
elegans genome (~100 megabases), MAKER2 should take under 10 hours on 32
CPUs to complete; similarly, the human genome (~3 gigabases) would require fewer
than 24 hours on 400 CPUs
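
To make that projection concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the cost is linear in genome size and inversely proportional to CPU count, and it takes the "under 10 hours on 32 CPUs for ~100 megabases" figure from the quote as an upper-bound reference point. The function names are just illustrative, and a real run with different evidence datasets could land well away from these numbers.

```python
# Back-of-the-envelope projection based on the figures quoted above.
# Assumption: runtime is linear in genome size and inversely
# proportional to the number of CPUs (the "almost linear" scaling).

REF_GENOME_MB = 100.0  # C. elegans, ~100 megabases (from the quote)
REF_HOURS = 10.0       # "under 10 hours", so treat this as an upper bound
REF_CPUS = 32          # CPUs used for the reference projection


def cpu_hours_per_mb():
    """CPU-hours needed per megabase, derived from the reference point."""
    return REF_HOURS * REF_CPUS / REF_GENOME_MB


def projected_hours(genome_mb, cpus):
    """Projected wall-clock hours for a genome of genome_mb on cpus CPUs."""
    return genome_mb * cpu_hours_per_mb() / cpus


if __name__ == "__main__":
    # A 2.2 Gb genome (2200 megabases) on a few different CPU counts.
    for cpus in (32, 100, 400):
        hours = projected_hours(2200.0, cpus)
        print(f"{cpus:>4} CPUs: ~{hours:.0f} hours (~{hours / 24:.1f} days)")
```

Under those assumptions the same rate reproduces the paper's human genome figure (3000 megabases on 400 CPUs comes out at 24 hours), and a 2.2 Gb genome on 32 CPUs comes out at roughly 220 hours, or a bit over 9 days. Treat that as an order-of-magnitude guide only.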

I’m also not sure what you mean by the first run taking less time than the second run. By the first run, do you mean running with est2genome turned on to create models for training ab-initio predictors? In that case, I would guess that the second run would take longer, but it shouldn't be too big of a difference.

~Daniel

Daniel Ence
Graduate Student
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330

On Mar 31, 2016, at 6:57 AM, José Mª G. Perez-Silva <ereboperezsilva at gmail.com> wrote:

Hello,

We are using Maker for the first time, and we are a little concerned about the time it takes the program to finish a whole genome (2.2Gb) ab-initio annotation.

In a month we have annotated nearly half of the genome (let's say around 40% of it).
I'd like to know how much time it takes, and under which technical specifications (processors, memory, ...), to annotate a complete genome for the first time.
Is the second round of annotation (in which we use the results from the first round as extra data) faster?

Thank you in advance.

---

Jose Maria G. Perez-Silva.
Departamento de Biologia Molecular y Bioquimica.
Universidad de Oviedo.
Spain.
_______________________________________________
maker-devel mailing list
maker-devel at yandell-lab.org
http://yandell-lab.org/mailman/listinfo/maker-devel_yandell-lab.org
