[maker-devel] Early obstacle with SplitDB

Carson Holt carsonhh at gmail.com
Wed Aug 13 12:19:50 MDT 2014


The Berkeley DB/NFS issues happen more often with large index files or on NFS
systems that respond slowly.  They also happen almost exclusively
during index creation.  If you suspect that's the issue, you can tell MAKER
to have BioPerl use something other than Berkeley DB for indexing by passing
a flag during the initial MAKER setup and installation.

#use GDBM library
cd .../maker/src
perl Build.PL --AnyDBM_ISA GDBM_File
./Build install

#use SDBM files
cd .../maker/src
perl Build.PL --AnyDBM_ISA SDBM_File
./Build install

#use Berkeley DB (default)
cd .../maker/src
perl Build.PL --AnyDBM_ISA DB_File
./Build install
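
Before rebuilding, it may be worth checking which of these DBM modules your
Perl can actually load.  This is only a rough sketch: dbm_available is a
made-up helper name, not part of MAKER or BioPerl, and it only tests that the
module loads, not that it behaves well on your filesystem.

```shell
# Hypothetical helper (not part of MAKER): succeeds if the named
# Perl module can be loaded by the perl found on PATH.
dbm_available() {
    perl -M"$1" -e 1 2>/dev/null
}

# Report on the three backends named in the Build.PL examples above.
for mod in DB_File GDBM_File SDBM_File; do
    if dbm_available "$mod"; then
        echo "$mod: available"
    else
        echo "$mod: not available"
    fi
done
```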

However, I find that the alternatives to Berkeley DB can be flakier. Also
make sure /tmp is not tmpfs (which it may be on some systems); I've seen
weird behavior when indexing files on tmpfs storage.
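
One way to check what /tmp is sitting on is sketched below.  It assumes a
Linux box with GNU coreutils stat (the flags differ on BSD/macOS), and
fs_type and is_risky_fs are hypothetical helper names I made up for the
example, not anything MAKER provides.

```shell
# Hypothetical helper: print the filesystem type backing a path.
# Assumes GNU coreutils stat; BSD/macOS stat takes different flags.
fs_type() {
    stat -f -c %T "$1"
}

# Hypothetical helper: succeed if the type is one discussed in this
# thread as a poor place for index files (tmpfs or any NFS flavor).
is_risky_fs() {
    case "$1" in
        tmpfs|nfs*) return 0 ;;
        *) return 1 ;;
    esac
}

tmp_dir=/tmp   # where MAKER puts temp files when TMP= is left empty
fstype=$(fs_type "$tmp_dir" 2>/dev/null) || fstype=unknown
if is_risky_fs "$fstype"; then
    echo "$tmp_dir is on $fstype: consider pointing TMP= at a local disk"
else
    echo "$tmp_dir is on $fstype"
fi
```

If you are running across several nodes, the check would need to be run on
each node, since /tmp is local to each one.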

Thanks,
Carson


From:  "Fields, Christopher J" <cjfields at illinois.edu>
Date:  Wednesday, August 13, 2014 at 11:14 AM
To:  Carson Holt <carsonhh at gmail.com>
Cc:  Kevin Tsai <kevintsai at iis.sinica.edu.tw>, "maker-devel at yandell-lab.org"
<maker-devel at yandell-lab.org>
Subject:  Re: [maker-devel] Early obstacle with SplitDB

On Aug 11, 2014, at 11:11 AM, Carson Holt <carsonhh at gmail.com> wrote:

> If you are updating every month to BioPerl live, don't.  You should use the
> CPAN version of BioPerl or even the stable download.  BioPerl live has
> actually broken several components MAKER uses at different times and depending
> on which version you currently have, may be broken now.  Could you send me the
> Bio::Root::Version line from the initial debug output?

Exactly.  Just a note, but the CPAN releases (now at 1.6.924) merge over all
changes from the master branch on a regular basis.  The key parts that will
not work when running off master (such as Bio::Root, Bio::FeatureIO, etc)
have been split out into separate repos; it’s entirely possible to add these
separately to a PERL5LIB but the intent is that we will release Bio-Root and
others to CPAN separately.

> Also could you send me this file --> /home/keceltes/maker2/final.fasta
> 
> The point of failure is actually very simple.  At that point in the code,
> MAKER opens a file, reads it in one line at a time, writes it out to a new
> file, and then indexes it with BioPerl (BioPerl indexing won't work on NFS
> drives because it uses Berkeley DB).  For that reason, whenever it fails at
> that point, it is either a drive space issue, NFS issue, BioPerl issue, or
> file format issue.

Re: Berkeley_DB, if you have a need to push this in a more NFS-portable
direction we are more than happy to let you experiment on what works best.
Mark Jensen actually started on this a while back but ran into problems.

I personally haven’t had problems with Bio::DB::Fasta on our local GPFS to
be frank, but I’m sure that isn’t working for everyone.

> Also, are you running via MPI? I ask because if you are using multiple nodes
> you will have to check the size of /tmp independently on each node (since the
> values will be different).
> 
> Thanks,
> Carson

chris


> From: Kevin Tsai <kevintsai at iis.sinica.edu.tw>
> Date: Monday, August 11, 2014 at 5:11 AM
> To: Carson Holt <carsonhh at gmail.com>
> Cc: <maker-devel at yandell-lab.org>
> Subject: Re: [maker-devel] Early obstacle with SplitDB
> 
> Hi Carson, 
> Thanks for the suggestions.
> 
> I left TMP= empty, which as you mentioned defaults to /tmp.  There seems
> to be a different error when using an NFS mounted directory (as I manually
> verified).  My /tmp is also not full or nearly full, and I have verified proper
> fasta formatting, as I have run the fasta file through other statistics
> generating tools (e.g. Quast).  We also update BioPerl monthly.
> 
> Do you think it could be anything else?  Is there any further information
> I could provide that would help?
> 
> 
> On Tue, Aug 5, 2014 at 1:26 PM, Carson Holt <carsonhh at gmail.com> wrote:
>> Either you specified TMP= in your maker_opts.ctl file to be an NFS mounted
>> directory (must be locally mounted), the drive containing the directory
>> specified by TMP= (defaults to /tmp) is full or nearly full, your input file
>> is not proper fasta format, or you are using an out of date version of BioPerl.
>> 
>> Try the first three in the list, then look at BioPerl.  The BioPerl version
>> should be printed as part of the debug output.
>> 
>> --Carson
>> 
>> 
>> From: Kevin Tsai <kevintsai at iis.sinica.edu.tw>
>> Date: Tuesday, August 5, 2014 at 4:59 AM
>> To: <maker-devel at yandell-lab.org>
>> Subject: [maker-devel] Early obstacle with SplitDB
>> 
>> Hello, 
>> I'm a new user to Maker so I suspect this will be a simple question, but I am
>> having trouble finding documentation on SplitDB.  Our IT admin set up the
>> application and I'm running into the following issue about 30 seconds after
>> kickoff.  Below is the debugged output:
>> 
>> STATUS: Parsing control files...
>> Calling GI::load_control_files at /usr/bin/maker line 452.
>> Calling GI::new_instance_temp at /usr/bin/maker line 463.
>> Calling GI::mount_check at /usr/bin/maker line 465.
>> Calling GI::set_global_temp at /usr/bin/maker line 483.
>> STATUS: Processing and indexing input FASTA files...
>> Calling GI::s_abs_path at /usr/bin/maker line 519.
>> Calling GI::s_abs_path at /usr/bin/maker line 519.
>> Calling GI::s_abs_path at /usr/bin/maker line 519.
>> Calling GI::s_abs_path at /usr/bin/maker line 519.
>> Calling GI::s_abs_path at /usr/bin/maker line 519.
>> Calling List::Util::shuffle at /usr/bin/maker line 529.
>> Calling GI::split_db at /usr/bin/maker line 536.
>> Calling File::Path::rmtree at /usr/bin/maker line 537.
>> Calling Iterator::Any::new at /usr/bin/maker line 537.
>> Calling Iterator::Any::nextDef at /usr/bin/maker line 537.
>> Calling Iterator::Any::new at /usr/bin/maker line 537.
>> Calling mkdir at /usr/bin/maker line 537.
>> Calling Iterator::Any::nextFastaRef at /usr/bin/maker line 537.
>> Calling system at /usr/bin/maker line 537.
>> ERROR: SplitDB not created correctly
>> 
>>  at /usr/local/share/perl5/GI.pm line 1144.
>>         GI::split_db("/home/keceltes/maker2/final.fasta", "nucleotide", 1,
>> "/home/keceltes/maker2/final.maker.output/mpi_blastdb", "C") called at
>> /usr/bin/maker line 537
>> --> rank=NA, hostname=Za2.cglab
>> 
>> Any suggestions?  Thank you in advance!
>> -- 
>> Kevin Tsai
>> www.linkedin.com/in/kevinjtsai/ <http://www.linkedin.com/in/kevinjtsai/>
>> Ph.D. Candidate, Bioinformatics
>> Institute of Information Science, Academia Sinica
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org


