[maker-devel] Adapting MAKER pipeline for my HPC

Jason Stajich jason.stajich at gmail.com
Wed Oct 6 18:53:42 MDT 2021


It’s just a genome assembly as input. Nothing more.
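
Before submitting a job, it can help to confirm that the assembly FASTA arrived on the HPC intact. The following is a minimal sketch, not part of funannotate or MAKER, that counts the sequences and total bases in a FASTA file so the numbers can be compared with what the assembler reported. The function name `summarize_fasta` and the demo file are hypothetical, made up for illustration.

```python
import gzip
import tempfile
from pathlib import Path


def summarize_fasta(path):
    """Return (number of sequences, total bases) for a plain or gzipped FASTA file."""
    # Assumption: gzipped files are recognised by a ".gz" suffix.
    opener = gzip.open if str(path).endswith(".gz") else open
    n_seqs = 0
    total_bases = 0
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                n_seqs += 1  # each header line starts a new sequence
            else:
                total_bases += len(line)
    return n_seqs, total_bases


if __name__ == "__main__":
    # Tiny made-up example so the sketch runs without a real assembly.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "demo_assembly.fasta"
        demo.write_text(">unitig_1\nACGTACGT\nACGT\n>unitig_2\nGGCC\n")
        n, bases = summarize_fasta(demo)
        print(f"{n} sequences, {bases} bases")  # prints "2 sequences, 16 bases"
```

For a real assembly, replace the demo path with the path to your own FASTA file; the file is read line by line, so memory use stays small even for a large genome.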

On Wed, Oct 6, 2021 at 1:28 PM Steven L. Miller <Fungi at uwyo.edu> wrote:

> Yes, thank you, Jason, for responding!  I have tried the Galaxy approach
> and, as you predicted, my dataset is too large to annotate locally
> on my laptop: about 157x too large by my calculations.  I would be happy
> to try your funannotate tool, but my dataset is in FASTA, and I would have
> to run BUSCO to get my FASTA data into a different format?  Because of the
> size of my WGS, that would likely have to be done on the HPC?
>
> Although I do not have all the experience necessary to pull this off
> quickly, I am thinking about my experimental design and the steps that I
> will have to accomplish.
> My experimental design is this:
> I have a non-photosynthetic plant.  I want to eventually find out how this
> plant functions differently from its fully photosynthetic cousin and if all
> related non-photosynthetic plants function in the same way.  There are two
> annotated genomes of closely related plants in NCBI: 1.  another fairly
> closely related non-photosynthetic plant and 2. a fairly closely related
> fully photosynthetic plant.  I do not have a clue as to how good either
> of these annotations is.
>
> For the annotation portion of my WGS, I can understand using the fairly
> closely related fully photosynthetic plant to train my annotation gene
> prediction, assuming that all functional genes required for photosynthesis
> and metabolism are present in this plant.  Once my WGS is annotated, I can
> then move on to compare the functional genes in both non-photosynthetic
> plants.
>
> So, these are the steps I must figure out to run this analysis on my HPC:
> 1.  installation of necessary MAKER related bits of software?  MAKER version
> 2.31.10 is installed on my HPC, but I don’t know if all the associated
> tools are also installed (e.g. Augustus)
> 2.  upload my assembled WGS in FASTA format (from my limited knowledge I
> would need to use Globus FTP?)
> 3.  upload the annotated WGS of the fully photosynthetic plant from NCBI
> (again in FASTA using Globus FTP?)
> 4.  upload the annotated WGS of the other non-photosynthetic plant (again
> in FASTA using Globus FTP?)
> 5.  Run MAKER, using one or the other fully annotated WGS to train MAKER
> (or Augustus)  to predict the genes from my non-photosynthetic plant
> 6.  Use the ugly MAKER output data in GFF3 along with reports from
> InterProScan and a BLAST report of homology and script maker_map_ids to
> make pretty graphics
>
> Thanks again for any help you can provide.  I wish there was an upcoming
> workshop I could attend to get into this process in a hurry!
>
>
>
> On Oct 6, 2021, at 11:11 AM, Mark Yandell <myandell at genetics.utah.edu>
> wrote:
>
> Thanks, Jason!
>
> *From: *maker-devel <maker-devel-bounces at yandell-lab.org> on behalf of
> Jason Stajich <jason.stajich at gmail.com>
> *Date: *Wednesday, October 6, 2021 at 10:35 AM
> *To: *"Steven L. Miller" <Fungi at uwyo.edu>
> *Cc: *"maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
> *Subject: *Re: [maker-devel] Adapting MAKER pipeline for my HPC
>
> Steven -
>
> You might start with the Galaxy install and tutorial or the CyVerse one.
>
> https://galaxyproject.github.io/training-material/topics/genome-annotation/tutorials/annotation-with-maker/tutorial.html
>
> https://learning.cyverse.org/projects/sciapps_guide/en/latest/annotation.html
> http://gmod.org/wiki/MAKER_Tutorial
> Not sure if the space needed will be too much for your large genome.  To
> gain better help you might specify the problems you are having in running
> or installing?
>
> I'll also point out a parallel genome annotation tool we have built called
> funannotate, which does the prediction training automatically from BUSCO or
> RNA-seq datasets.
> https://funannotate.readthedocs.io/en/latest/
>
> Installation with conda:
> https://funannotate.readthedocs.io/en/latest/install.html
> There is also a Docker image:
> https://hub.docker.com/r/nextgenusfs/funannotate
> https://github.com/reslp/funannotate-docker
>
> Jason Stajich
>
>
> On Tue, Oct 5, 2021 at 6:17 PM Steven L. Miller <Fungi at uwyo.edu> wrote:
>
> Sorry to say probably not a new issue.  I saw that MAKER is an easy-to-use
> genome annotation pipeline designed to be usable by small research groups
> with little bioinformatics experience.  That statement describes me to a T.
> I am new to Whole Genome Sequencing and associated analysis, and have a
> limited skill set with computers.  I have a 1.52 GB, 75k-unitig,
> non-photosynthetic vascular plant assembly for which I would like to
> predict functional genes.  I have tried to follow the tutorial that you
> have posted for Winter School 2018 to see if I can duplicate your
> installation and implementation for my HPC.  I doubt that you will be
> impressed, but I recently received a “certificate” for completing our HPC
> Linux course path at the University of Wyoming.  I realize now how much I
> don’t know and how much work it will take to complete an annotation, but I
> need to generate at least some preliminary data to include in an
> upcoming grant submittal (~2 months).
>
>  I read your MAKER wiki page, but it seems that it hasn’t been completed
> in many cases, with many important parts (important parts to me, in any
> case) under construction.
>
> I am writing to get any help I can to find a way forward in my project.
> Any suggestions and any help you can provide would be gratefully accepted!
>
>
>
> Sincerely,
> Steve Miller
> Botany
> University of Wyoming
>
>
> --
Jason Stajich
jason.stajich at gmail.com