[maker-devel] Maker on Amazon EC2 Using Starcluster

Carson Holt carsonhh at gmail.com
Fri Jan 23 13:00:56 MST 2015


MAKER needs a global storage location, i.e. one that every node can read and write. You probably need to set up one of your instances to act as a shared storage server. AWS has Lustre implementations for the cloud, so perhaps you can try that. Also, use OpenMPI instead of MPICH2; it’s more stable.
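
As a rough way to test the storage point above, here is a minimal Python sketch that could be run on each node to see whether that node can create a directory and a file under the intended output location. The function name check_shared_dir and the probe file name are made up for illustration and are not part of MAKER. The check only shows that the node can write to the path; it does not prove that all nodes see the same filesystem.

```python
import os
import socket
import tempfile


def check_shared_dir(path):
    """Try to create a directory and a file under `path`.

    Returns (ok, message). `check_shared_dir` is a hypothetical helper,
    not something MAKER provides.
    """
    host = socket.gethostname()
    try:
        # Create the target directory if it does not exist yet.
        os.makedirs(path, exist_ok=True)
        # mkdtemp gives every caller its own subdirectory, so checks
        # started from many nodes at once do not collide.
        probe = tempfile.mkdtemp(prefix="probe_", dir=path)
        marker = os.path.join(probe, "marker.txt")
        with open(marker, "w") as fh:
            fh.write("ok\n")
        with open(marker) as fh:
            ok = fh.read().strip() == "ok"
        # Clean up so the check leaves nothing behind.
        os.remove(marker)
        os.rmdir(probe)
    except OSError as exc:
        return False, f"{host}: cannot write under {path}: {exc}"
    return ok, f"{host}: {path} is writable"
```

Calling check_shared_dir on the output directory from the log (the /mnt/data/... path) on node067 or node137 should show whether the "could not make datastore directory" message comes down to that node being unable to write there.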

I look forward to seeing how your experiment with AWS, MPI, and MAKER works out.

—Carson



> On Jan 21, 2015, at 6:56 AM, Jason Gallant <jgallant at msu.edu> wrote:
> 
> Hi Everyone,
> 
> I’m attempting to run MAKER on Amazon EC2 using MIT’s StarCluster. I’ve started a 200-node cluster and enabled MPICH2 (StarCluster uses OpenMPI by default). I plan on documenting this setup once I’ve figured out how to run things reliably.
> 
> I’m having a persistent issue where something fails on one of the nodes, and stderr is flooded with:
> 
> examining contents of the fasta file and run log
> [67] ERROR: could not make datastore directory
> [67] --> rank=67, hostname=node067
> [67] ERROR: Failed while examining contents of the fasta file and run log
> [67] ERROR: Chunk failed at level:0, tier_type:0
> [67] FAILED CONTIG:Scaffold261
> 
> This error repeats for each “next” scaffold for some time.  When I go back to find the “source” of the error in the log, the following is the first error message on that node:
> 
> [67] #-------------------------------#
> [67] deleted:-60 hits
> [67] collecting blastx reports
> [67] ERROR: Could not colapse BLAST reports
> [67]  at /root/maker/bin/../lib/GI.pm line 2524 thread 1.
> [67] GI::combine_blast_report(FastaChunk=HASH(0x108e1a90), ARRAY(0x1b874938), ARRAY(0xf127ad8), runlog=HASH(0x4d54ed8)) called at /root/maker/bin/../lib/Process/MpiChunk.pm line 2760 thread 1
> [67] Process::MpiChunk::__ANON__() called at /root/maker/bin/../lib/Error.pm line 415 thread 1
> [67] eval {...} called at /root/maker/bin/../lib/Error.pm line 407 thread 1
> [67] Error::subs::try(CODE(0x1514eb00), HASH(0x9cbeb568)) called at /root/maker/bin/../lib/Process/MpiChunk.pm line 4215 thread 1
> [67] Process::MpiChunk::_go(Process::MpiChunk=HASH(0x13976308), "run", HASH(0x12e04268), 9, 3) called at /root/maker/bin/../lib/Process/MpiChunk.pm line 341 thread 1
> [67] Process::MpiChunk::run(Process::MpiChunk=HASH(0x13976308), 67) called at /root/maker/bin/maker line 1457 thread 1
> [67] main::node_thread("/mnt/data/paramormyrops_new_annotation/supercontigs.maker.out"...) called at /usr/local/lib/perl/5.14.2/forks.pm line 799 thread 1
> [67] eval {...} called at /usr/local/lib/perl/5.14.2/forks.pm line 799 thread 1
> [67] threads::new("threads", CODE(0x3dc5b38), "/mnt/data/paramormyrops_new_annotation/supercontigs.maker.out"...) called at /root/maker/bin/maker line 917 thread 1
> [67] --> rank=67, hostname=node067
> [67] ERROR: Failed while collecting blastx reports
> [67] ERROR: Chunk failed at level:9, tier_type:3
> [67] FAILED CONTIG:Scaffold66
> [67] 
> [67] ERROR: Chunk failed at level:4, tier_type:0
> [67] FAILED CONTIG:Scaffold66
> 
> 
> I tried ignoring the error to see whether things would proceed on the other 199 processors. When I returned to the “master” node after the evening, MAKER was still repeating the same error over and over (same scaffold):
> [67] examining contents of the fasta file and run log
> [67] ERROR: could not make datastore directory
> [67] --> rank=67, hostname=node067
> [67] ERROR: Failed while examining contents of the fasta file and run log
> [67] ERROR: Chunk failed at level:0, tier_type:0
> [67] FAILED CONTIG:Scaffold1589
> 
> I stopped the job and restarted it, and after only a few minutes of running the same error was reported, this time on a new scaffold. Strangely, the error is reported under the MPI tag of node001, but it originates at node137:
> 
> ERROR: Could not colapse BLAST reports
> [1]  at /root/maker/bin/../lib/GI.pm line 2524.
> [1]     GI::combine_blast_report(FastaChunk=HASH(0xf4aa9b8), ARRAY(0xf628f90), ARRAY(0x325fea78), runlog=HASH(0x133cc8e8)) called at /root/maker/bin/../lib/Process/MpiChunk.pm line 2760
> [1]     Process::MpiChunk::__ANON__() called at /root/maker/bin/../lib/Error.pm line 415
> [1]     eval {...} called at /root/maker/bin/../lib/Error.pm line 407
> [1]     Error::subs::try(CODE(0x352c9b8), HASH(0xdab3b690)) called at /root/maker/bin/../lib/Process/MpiChunk.pm line 4215
> [1]     Process::MpiChunk::_go(Process::MpiChunk=HASH(0x3545d90), "run", HASH(0x30aa710), 9, 3) called at /root/maker/bin/../lib/Process/MpiChunk.pm line 341
> [1]     Process::MpiChunk::run(Process::MpiChunk=HASH(0x3545d90), 137) called at /root/maker/bin/maker line 979
> [1] --> rank=137, hostname=node137
> [1] ERROR: Failed while collecting blastx reports
> [1] ERROR: Chunk failed at level:9, tier_type:3
> [1] FAILED CONTIG:Scaffold249
> [1]
> [1] ERROR: Chunk failed at level:4, tier_type:0
> [1] FAILED CONTIG:Scaffold249
> [1]
> [1] examining contents of the fasta file and run log
> [1] ERROR: could not make datastore directory
> [1] --> rank=1, hostname=node001
> [1] ERROR: Failed while examining contents of the fasta file and run log
> [1] ERROR: Chunk failed at level:0, tier_type:0
> [1] FAILED CONTIG:Scaffold249
> 
> I’d appreciate any guidance on how best to diagnose this error!
> 
> Many thanks,
> Jason Gallant
> 
> 
> 
> 
> Dr. Jason R. Gallant
> Assistant Professor
> Room 38 Natural Sciences
> Department of Zoology
> Michigan State University
> East Lansing, MI 48824
> jgallant at msu.edu
> office: 517-884-7756