[maker-devel] TACC lonestar and N50 value
Stein, Joshua
steinj at cshl.edu
Mon Mar 16 07:29:36 MDT 2015
Hi Arne,
I have experience with iPlant resources and with MAKER-P. I would encourage you to try the Atmosphere image (7888b8e1-c006-4794-82d9-4c940ddbf4c6). You can request a large instance (up to 16 CPUs and 128 GB memory) and run in MPI mode to distribute the work. Please see this tutorial, which includes information on running in MPI mode: https://pods.iplantcollaborative.org/wiki/display/sciplant/MAKER-P+Atmosphere+Tutorial.
You can also access the TACC Lonestar installation using the iPlant Discovery Environment. There is an app called "MAKER-P-Lonestar-Small-Genomes 2.3". Although it is advertised as appropriate for "small" genomes, I think there is a good chance that it will work for 450 Mb. This is a new resource and the iPlant team would value any feedback and benchmarks on how the system is working. Depending on how this goes, there are plans to roll out additional apps intended for larger genomes. Here is a tutorial: https://pods.iplantcollaborative.org/wiki/display/sciplant/Tutorial+for+running+MAKER-P+on+TACC-Lonestar+from+iPlant+Discovery+Environment
Regarding contig sizes: although it is not ideal, you can include contigs smaller than 10 kbp in your run. Plant genes tend to be more compact than vertebrate genes, so you ought to be able to recover annotations on the smaller contigs, though keep an eye out for truncated genes.
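To see how much of the assembly sits in contigs below a given cutoff before deciding what to include, a minimal Python sketch along these lines may help. It is not part of MAKER or iPlant; the function names are made up for illustration, and the 10 kbp cutoff is simply the figure discussed in this thread.

```python
from typing import Iterable, List


def contig_lengths(fasta_lines: Iterable[str]) -> List[int]:
    """Return the length of each sequence found in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # length of the record being read; None before the first header
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths: List[int]) -> int:
    """Length of the contig at which the running total, taken from the
    longest contig downwards, first reaches half of the assembly size."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


def fraction_in_long_contigs(lengths: List[int], min_len: int = 10_000) -> float:
    """Fraction of all bases that lie in contigs of at least min_len bases."""
    total = sum(lengths)
    if total == 0:
        return 0.0
    return sum(length for length in lengths if length >= min_len) / total
```

For a real assembly, the lengths could be read with something like `contig_lengths(open("assembly.fasta"))`, where the file name is a placeholder, and then passed to `n50` and `fraction_in_long_contigs`.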
Best,
Josh
On Mar 13, 2015, at 6:06 PM, Van Hoeck Arne <avhoeck at SCKCEN.BE> wrote:
Dear MAKER developer,
We have a plant genome of about 450 Mbp with an N50 value of 20 kbp, of which only about 3/4 (333 Mbp) is in contigs longer than 10 kbp. CEGMA reported that 87% of the genes were found, whereas 94% were partially identified. You said last time that contigs smaller than 10 kbp are not ideal for annotation and that it is preferable to discard them. Does this mean that I lose all genes present in the small contigs? Or is there another way to annotate them? (Is concatenating all the small contigs together with 500 N's between each contig an option?)
Besides, I was able to run MAKER successfully via iPlant's Atmosphere. However, for my large genomes I registered at the TACC Lonestar cluster, but Dave C. replied that I won't be able to run on the TACC supercomputers without an allocation. He said that I need to contact my PI. With my login ID, I have no access to the cluster via ssh, since permission is denied. Is it therefore possible to use the TACC supercomputers to run MAKER?
Best regards
Arne
Joshua Stein, PhD
Manager, Sci. Informatics III
Cold Spring Harbor Laboratory
steinj at cshl.edu
http://ware.cshl.org/