[maker-devel] Question about Maker2

Carson Holt carsonhh at gmail.com
Thu Mar 31 11:38:14 MDT 2016


If you provide all evidence on the first run, the second run will be faster because MAKER will be able to reuse alignments from the previous run. Since 90% of runtime is BLAST, being able to just reuse the BLAST reports really improves runtime.
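
As a rough illustration of that point, here is a minimal sketch of the arithmetic. It assumes the 90% BLAST share mentioned above and treats reading a cached report as free, which is optimistic. The function name `rerun_hours` and its parameters are hypothetical and are not part of MAKER.

```python
def rerun_hours(first_run_hours: float,
                blast_fraction: float = 0.9,
                reused_fraction: float = 1.0) -> float:
    """Estimate the runtime of a second run that reuses BLAST reports.

    blast_fraction  -- share of the first run spent in BLAST (assumed 0.9)
    reused_fraction -- share of those BLAST reports that can be reused
    Assumption: a reused report costs no time at all.
    """
    for name, value in (("blast_fraction", blast_fraction),
                        ("reused_fraction", reused_fraction)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    # Time saved is the BLAST share multiplied by the share that is reused.
    saved = blast_fraction * reused_fraction
    return first_run_hours * (1.0 - saved)


if __name__ == "__main__":
    # A 30-day first run with every BLAST report reused.
    print(rerun_hours(30 * 24))
    # The same run when only half of the reports can be reused.
    print(rerun_hours(30 * 24, reused_fraction=0.5))
```

If new evidence is added for the second run, only part of the alignments can be reused, so `reused_fraction` would be below 1 and the saving smaller.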

—Carson

> On Mar 31, 2016, at 11:35 AM, Daniel Ence <dence at genetics.utah.edu> wrote:
> 
> Hi Jose, the time it takes MAKER to annotate a genome depends greatly on the hardware setup (as you pointed out: processors, memory, etc.), on the size of the genome, and on the size and type of the datasets you use as evidence (numerous RNA-seq datasets, for example, will take longer than a project without any RNA-seq data).
> 
> However, the MPI parallelization implemented in MAKER means that the runtime should scale almost linearly with the number of processors allotted to the MAKER run. This is explained in the MAKER2 paper (Holt and Yandell), which I’m going to quote:
> MAKER2 was used to annotate a 10 megabase section of the C. elegans genome
> (NGASP dataset). The algorithm was parallelized using MPI on an increasing number
> of CPU cores. The results demonstrate how MAKER2 scales almost linearly with
> CPU number (with a slope of near 1). If we project our results forward to the entire C.
> elegans genome (~100 megabases), MAKER2 should take under 10 hours on 32
> CPUs to complete; similarly, the human genome (~3 gigabases) would require fewer
> than 24 hours on 400 CPUs
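
The projection quoted above can be reproduced with a back-of-envelope calculation. This is only a sketch: it assumes perfectly linear scaling in both genome size and CPU count (the paper says "almost" linear), and it uses the quoted "under 10 hours on 32 CPUs for ~100 megabases" as the reference point, so the results are upper bounds. The names below are hypothetical and not part of MAKER.

```python
# Reference point taken from the MAKER2 paper passage quoted above.
REF_MEGABASES = 100.0  # approximate C. elegans genome size
REF_HOURS = 10.0       # "under 10 hours", so treat as an upper bound
REF_CPUS = 32


def projected_hours(megabases: float, cpus: int) -> float:
    """Project wall-clock hours assuming ideal linear scaling."""
    if megabases <= 0 or cpus <= 0:
        raise ValueError("megabases and cpus must be positive")
    # CPU-hours needed per megabase at the reference point.
    cpu_hours_per_mb = REF_HOURS * REF_CPUS / REF_MEGABASES
    return cpu_hours_per_mb * megabases / cpus


if __name__ == "__main__":
    # Human genome (~3 gigabases) on 400 CPUs: about 24 hours,
    # consistent with the figure in the quoted passage.
    print(round(projected_hours(3000, 400), 1))
    # A 2.2 Gb genome, as in the original question, on 400 CPUs.
    print(round(projected_hours(2200, 400), 1))
```

Actual runtimes also depend on the amount and type of evidence, so numbers like these are useful only as an order-of-magnitude check.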
> 
> I’m also not sure what you mean by the first run taking less time than the second run. By the first run, do you mean running with est2genome turned on to create models for training ab initio predictors? In that case, I would guess that the second run would take longer, but it should not be too big of a difference.
> 
> ~Daniel
> 
> Daniel Ence
> Graduate Student
> Eccles Institute of Human Genetics
> University of Utah
> 15 North 2030 East, Room 2100
> Salt Lake City, UT 84112-5330
> 
>> On Mar 31, 2016, at 6:57 AM, José Mª G. Perez-Silva <ereboperezsilva at gmail.com> wrote:
>> 
>> Hello,
>> 
>> We are using MAKER for the first time, and we are a little concerned about how long the program takes to finish an ab initio annotation of a whole genome (2.2 Gb).
>> 
>> In a month we have annotated a little under half of the genome (around 40% of it).
>> I'd like to know how long it takes to annotate a complete genome for the first time, and under which technical specifications (processors, memory, ...).
>> Is the second round of annotation (in which we use the results from the first round as extra data) faster?
>> 
>> Thank you in advance.
>> 
>> ---
>> 
>> Jose Maria G. Perez-Silva.
>> Departamento de Biologia Molecular y Bioquimica.
>> Universidad de Oviedo.
>> Spain.
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at yandell-lab.org
>> http://yandell-lab.org/mailman/listinfo/maker-devel_yandell-lab.org
