[maker-devel] problem with dsindex

Carson Holt carson.holt at genetics.utah.edu
Wed Apr 23 08:51:59 MDT 2014


Either not all of your contigs have finished, or you did not supply the
-base option when running -dsindex.  If the index log says STARTED rather
than FINISHED for a contig, then the output files for that contig are
missing from the directory -dsindex is looking in.
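
A quick way to see which contigs are affected is to scan the index log for
sequences that have a STARTED entry but no FINISHED entry. The sketch below
assumes the log is tab-separated with the sequence name in column 1 and the
status in column 3; that layout is an assumption, so check a few lines of
your own log first. The function name unfinished_contigs is made up for
illustration.

```shell
# Print the names of sequences that were STARTED but never FINISHED.
# Assumed log layout (not confirmed in this thread):
#   <sequence name> TAB <output path> TAB <status>
unfinished_contigs() {
    awk -F'\t' '
        $3 == "STARTED"  { started[$1] = 1 }
        $3 == "FINISHED" { finished[$1] = 1 }
        END {
            for (name in started)
                if (!(name in finished)) print name
        }
    ' "$1" | sort
}
```

For example: unfinished_contigs placed.maker.output/placed_master_datastore_index.log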

For example, this is how you should run everything:
/maker/bin/fasta_tool --split placed.fasta

mpiexec -n 4 /maker/bin/maker -base placed -g 1.fasta -fix_nucleotides
mpiexec -n 4 /maker/bin/maker -base placed -g 2.fasta -fix_nucleotides
mpiexec -n 4 /maker/bin/maker -base placed -g 3.fasta -fix_nucleotides
mpiexec -n 4 /maker/bin/maker -base placed -g 4.fasta -fix_nucleotides
mpiexec -n 4 /maker/bin/maker -base placed -g 5.fasta -fix_nucleotides


Now all of the runs will write to placed.maker.output
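
If there are many split files, the per-file commands can be generated
rather than typed out. This is only a sketch: the function name
print_maker_cmds is made up for illustration, and it prints the commands
instead of running them so you can inspect them first. The maker options
are the same ones used in the commands in this message, and the point is
that every run shares one -base value.

```shell
# Print one maker command per split FASTA file, all sharing the same -base.
# Usage: print_maker_cmds <base> <fasta> [<fasta> ...]
print_maker_cmds() {
    base=$1
    shift
    for fasta in "$@"; do
        printf 'mpiexec -n 4 /maker/bin/maker -base %s -g %s -fix_nucleotides\n' \
            "$base" "$fasta"
    done
}
```

For example, print_maker_cmds placed 1.fasta 2.fasta 3.fasta prints three
commands; pipe the output to sh once it looks right.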

Then you need to run this:
maker/bin/maker -dsindex -base placed -g placed.fasta


Then it will rebuild the index for
placed.maker.output/placed_master_datastore_index.log

Thanks,
Carson



On 4/22/14, 10:48 PM, "kdelmore at zoology.ubc.ca" <kdelmore at zoology.ubc.ca>
wrote:

>I am having some trouble with the dsindex tool. I used the fasta_tool to
>split my original multifasta file and ran maker with the -base and -g
>flags. I then used the dsindex tool to summarize results from each fasta.
>The tool finished without an error message and pointed me to where the
>files should be but when I went to that directory there was no datastore
>and the index.log said that it had started on each of the fastas but not
>finished. I got around this problem using gff3_merge by using the -o
>option and providing paths to the gff files but this is not working with
>the fasta_merge tool. I don’t want to just cat the files together because
>I want to be sure the merged gff and protein.fasta files are the same for
>downstream annotation steps. I’ve included examples of the commands I used
>below and the output from dsindex. Note that the individual fastas
>finished without errors and produced datastores.
>
>I would really appreciate any input you might have with this problem and
>THANK YOU for developing such a user friendly pipeline.
>
>/maker/bin/fasta_tool --split placed.fasta
>
>mpiexec -n 4 /maker/bin/maker -base 1 -g 1.fasta -fix_nucleotides
>
>maker/bin/maker -dsindex -fix_nucleotides
>STATUS: Parsing control files...
>STATUS: Processing and indexing input FASTA files...
>STATUS: Setting up database for any GFF3 input...
>A data structure will be created for you at:
>/placed.maker.output/placed_datastore ##this directory was not generated
>To access files for individual sequences use the datastore index:
>/placed.maker.output/placed_master_datastore_index.log
>
>/maker/bin/gff3_merge -o placed.gff *
>
>/maker/bin/fasta_merge -o placed.all 1.maker.proteins.fasta
>2.maker.proteins.fasta ##this did not work
>
>
>


