[maker-devel] master_datastore_index.log file shrinks.

Michael Nuhn mnuhn at ebi.ac.uk
Tue Mar 19 06:12:32 MDT 2013


Hello Carson!

On 03/14/2013 04:49 PM, Michael Nuhn wrote:
>> Try dialling back on the number of simultaneous instances you start and
>> instead use MPI or the -cpus option to get the parallelization boost.
>> Alternatively you can also split up the input file and use the -base
>> option so everything gets written to the same place (then you never have
>> to worry about locks affecting individual contigs - as no single instance
>> has access to all the contigs)
>>
>> Example:
>> fasta_tool --chunks 5 maize_assembly.fasta
>> maker -g maize_assembly_0.fasta -base maize_assembly
>> maker -g maize_assembly_1.fasta -base maize_assembly
>> maker -g maize_assembly_2.fasta -base maize_assembly
>> maker -g maize_assembly_3.fasta -base maize_assembly
>> maker -g maize_assembly_4.fasta -base maize_assembly
>> maker -dsindex
>>
>> Everything then gets written to maize_assembly.maker.output for all
>> results.  The last call to maker with the -dsindex flag then rebuilds the
>> datastore_index.log file to match the original maize_assembly.fasta file

I tried this: I split my genome into 50 files and ran them as you 
suggested above.

This worked well most of the time, but now I am getting locking issues 
again. The working directory gets flooded with STACK.STACK.STACK.STACK 
... files.

What I think is happening is that, for some reason, the maker instances 
decide to rebuild the index. The rebuild takes a long time, and while it 
runs it blocks the other instances that are trying to lock the index 
files. In the end most of the maker instances are left waiting.

I would like to try the following, but I don't know whether it might 
cause problems later on:

I would like to run all of the split sequence files as separate maker 
projects as if they were independent genomes. In the end I'd merge all 
the individual gff files using the gff3_merge script.
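
A minimal sketch of that plan, written as a dry run that only prints the 
commands rather than running them. It assumes the chunk files are named 
<prefix>_<i>.fasta as in the example above, that each run's index ends up 
at <base>.maker.output/<base>_master_datastore_index.log, and that 
gff3_merge accepts -d (index file) and -o (output name) as shown. The 
function names plan_chunk_runs and plan_final_merge are made up for 
illustration, and the paths and flags should be checked against the 
installed MAKER version.

```shell
#!/bin/sh
# Dry run: print the commands only; nothing here invokes maker itself.

# One independent MAKER project per chunk, each with its own -base, so each
# gets its own output directory and index (no shared lock between instances).
plan_chunk_runs() {
    prefix=$1
    chunks=$2
    i=0
    while [ "$i" -lt "$chunks" ]; do
        base="${prefix}_${i}"
        echo "maker -g ${base}.fasta -base ${base}"
        # Assumed index path and gff3_merge flags (-d index, -o output name).
        echo "gff3_merge -d ${base}.maker.output/${base}_master_datastore_index.log -o ${base}.gff"
        i=$((i + 1))
    done
}

# Final step: combine the per-chunk GFF files into a single file.
plan_final_merge() {
    prefix=$1
    chunks=$2
    files=""
    i=0
    while [ "$i" -lt "$chunks" ]; do
        files="${files} ${prefix}_${i}.gff"
        i=$((i + 1))
    done
    echo "gff3_merge -o ${prefix}.all.gff${files}"
}

plan_chunk_runs maize_assembly 5
plan_final_merge maize_assembly 5
```

Because every chunk has its own base name, no two instances would touch 
the same index file; the price is one output directory per chunk and an 
extra merge at the end.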

Do you see any reason why this wouldn't work?

Cheers,
Michael.





