[maker-devel] annotation of birch genome
Carson Holt
carsonhh at gmail.com
Fri May 10 11:44:01 MDT 2013
Also, if you will be annotating more genomes, you should look into getting
an allocation on your university's cluster. Queen Mary University has a
2000-CPU cluster. Most cluster managers bend over backwards to help
biologists use their systems, because it looks good on progress reports and
funding requests when they can show a broader user base (i.e. departments
other than physics) :-)
--Carson
From: Carson Holt <carsonhh at gmail.com>
Date: Friday, 10 May, 2013 1:25 PM
To: Jasmin Zohren <j.zohren at qmul.ac.uk>, <maker-devel at yandell-lab.org>
Subject: Re: [maker-devel] annotation of birch genome
Really only 560 Mb (Pine is 20 Gb by comparison). The single longest step
for MAKER is alignment, which is done via BLAST. So the evidence dataset
is usually the thing to filter down to a reasonable size. Protein
alignments take longer because the proteins must be aligned against the 3
translated reading frames of the genome (so a minimum of 3x longer than
DNA2DNA alignment, but in practice much, much more). Alt_EST is even worse,
as BLAST must translate all 3 reading frames of the genome and all 3 of the
data to be aligned (TBLASTX-type alignment). So that is a minimum of 3x
longer than protein alignment, or 9x longer than DNA2DNA alignment (but in
practice much more). So the single best thing to do to reduce run time is
to use protein evidence where possible instead of alt_EST evidence, or to
use ESTs from the same species and limit the use of proteins (ESTs from the
same species are aligned as DNA2DNA, so it is very fast). Also, set all the
BLAST depth parameters in the maker_bopts.ctl file to 20 or 30. This will
help if you have a very deep evidence dataset, by trimming overly deep
alignment regions (which means less exonerate polishing).
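The depth settings mentioned above might look like the following in
maker_bopts.ctl. This is only a sketch: it assumes the keys are named
depth_blastn, depth_blastx and depth_tblastx, as in the control files that
"maker -CTL" generates in recent MAKER releases. The names are not spelled
out in this thread, so check them against your own file. A value of 0
normally means the depth cutoff is disabled.

```
#-----excerpt from maker_bopts.ctl (only the depth settings are shown)
depth_blastn=20 #Blastn depth cutoff (0 to disable cutoff)
depth_blastx=20 #Blastx depth cutoff (0 to disable cutoff)
depth_tblastx=20 #tBlastx depth cutoff (0 to disable cutoff)
```

All other lines in the file can stay at their defaults for a first attempt.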
Also, you can try running MAKER on 40 CPUs rather than 20 (basically doubling
up even though you only have 20). This can work because, even though you
gave MAKER 20 CPUs to use, all 20 processes will rarely be using 100% of
their CPUs simultaneously. So launching 40 processes will give a slight
boost in many instances by filling in the gaps when "wait" operations let
CPUs idle for a fraction of a second.
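As a small illustration of the doubling-up idea, the snippet below builds
the launch line from the core count. The variable names (CORES, PROCS, CMD)
are made up for this example, and the command is only printed, not run, so
it can be checked before a long job is started.

```shell
#!/bin/sh
# Physical cores on the node (20 in the setup described in this thread).
CORES=20
# Start twice as many MAKER processes as cores, as suggested above.
PROCS=$((CORES * 2))
# Build the launch line; print it instead of executing it.
CMD="mpiexec -n ${PROCS} maker"
echo "$CMD"
```

Once the printed line looks right, run it in the directory that holds the
control files. With Open MPI, starting more processes than cores may require
the --oversubscribe option; other MPI implementations may behave differently.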
One good thing, though, is that you only pay the price for data generation
once. If you ever rerun with slightly modified parameters, MAKER is smart
enough to reuse old results, so BLAST won't have to rerun.
Thanks,
Carson
From: Jasmin Zohren <j.zohren at qmul.ac.uk>
Date: Friday, 10 May, 2013 1:07 PM
To: <maker-devel at yandell-lab.org>
Subject: [maker-devel] annotation of birch genome
Dear Maker developers,
I am a PhD student at Queen Mary University in London working on tree
genomics. I recently attended the GMOD conference in Cambridge and it was a
pity that no one from the Maker side was there. But the two days were
interesting anyway.
My current project is about birch which has just been sequenced and I now
want to annotate it. Here are the details:
- Genome size: 560 Mb
- Size of EST file (from a related species): 28 Mb
- I am running it on a single node with 20 cores and 512 GB RAM
(using "mpiexec -n 20 maker")
I've also attached my maker_opts file with the parameters I am using. I
assume the maker_bopts and maker_exe file are of minor importance for now.
My problem is that the analysis is taking very long. It's been running for
weeks already and has only processed about 65 % of the scaffolds/contigs.
So I was wondering whether you have any suggestions how to speed things up.
Especially as I intend to use Maker for other projects, too, and will also
come back to the birch annotation once I have mRNA data for it.
Many thanks in advance and kind regards,
Jasmin
-----------------------------
Jasmin Zohren
PhD student in the INTERCROSSING ITN
Queen Mary University of London
intercrossing.wikispaces.com <http://intercrossing.wikispaces.com/>
evolve.sbcs.qmul.ac.uk <http://evolve.sbcs.qmul.ac.uk/buggs/jasmin-zohren/>