[maker-devel] Does maker support multi-processing for a single long fasta sequence using openMPI?

Xu, taosheng taosheng.x at gmail.com
Tue Oct 20 10:23:25 MDT 2020


Thank you very much, Carson, for your timely response.
Yes, I think so. MAKER with MPI should support parallel processing of a
single ultra-long sequence, but I cannot get it to run successfully for a
single sequence.
First, I made sure that MAKER with OpenMPI was installed properly. It
works well in parallel for multiple DNA sequences.
However, when I submit a MAKER job for a single ultra-long sequence (mpiexec
-mca btl ^openib -n 40 maker -g scaffold1.fasta -fix_nucleotides, with
max_dna_len set to 100000), only one maker process is ever left running.
The other maker processes exit and are reported as finished in the output.
The output is included below; could you please help me check it?
Thanks for your kind help and your time.

#-----MAKER Behavior Options
max_dna_len=100000 #length for dividing up contigs into chunks (increases/decreases memory usage)
min_contig=1 #skip genome contigs below this length (under 10kb are often useless)
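
For a rough sense of scale, here is a minimal sketch of how many chunks this setting would imply for scaffold#1 (73580997 bp, as reported in the output below). It assumes simple non-overlapping chunks of exactly max_dna_len bases; MAKER's real chunk boundaries may differ, and `chunk_spans` is a hypothetical helper written for illustration, not part of MAKER.

```python
import math


def chunk_spans(seq_len, max_dna_len):
    """Yield 1-based inclusive (start, end) spans of at most max_dna_len bases.

    Assumption: plain non-overlapping chunks. This is an illustration only,
    not MAKER's actual chunking code.
    """
    for start in range(1, seq_len + 1, max_dna_len):
        yield start, min(start + max_dna_len - 1, seq_len)


SEQ_LEN = 73580997    # length of scaffold#1 reported in the MAKER output
MAX_DNA_LEN = 100000  # max_dna_len from the control file excerpt

n_chunks = math.ceil(SEQ_LEN / MAX_DNA_LEN)
spans = list(chunk_spans(SEQ_LEN, MAX_DNA_LEN))

print(n_chunks)             # 736
print(spans[0], spans[-1])  # (1, 100000) (73500001, 73580997)
```

Under that assumption there would be 736 chunks for 40 processes, so the chunk count itself does not look like the limiting factor.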

Best regards,
Taosheng



OUTPUT Information
STATUS: Processing and indexing input FASTA files...
STATUS: Setting up database for any GFF3 input...
examining contents of the fasta file and run log
A data structure will be created for you at:
/data/splitgenome/top1000/splitGenome/part1/plant.maker.output/plant_datastore

To access files for individual sequences use the datastore index:
/data/splitgenome/top1000/splitGenome/part1/plant.maker.output/plant_master_datastore_index.log

STATUS: Now running MAKER...



--Next Contig--

Processing run.log file...
examining contents of the fasta file and run log



--Next Contig--

#---------------------------------------------------------------------
Another instance of maker is processing this contig!!
SeqID: scaffold#1
Length: 73580997
#---------------------------------------------------------------------


examining contents of the fasta file and run log



--Next Contig--

#---------------------------------------------------------------------
Another instance of maker is processing this contig!!
SeqID: scaffold#1
Length: 73580997
#---------------------------------------------------------------------




Maker is now finished!!!



Start_time: 1603000030
End_time:   1603000033
Elapsed:    3
A data structure will be created for you at:
/data/splitgenome/top1000/splitGenome/part1/plant.maker.output/plant_datastore

To access files for individual sequences use the datastore index:
/data/splitgenome/top1000/splitGenome/part1/plant.maker.output/plant_master_datastore_index.log

STATUS: Now running MAKER...
MAKER WARNING: The file
plant.maker.output/plant_datastore/B5/CD/scaffold#1//theVoid.scaffold#1/28/scaffold#1.289.plant_repeatFinal%2Elib%2Empi%2E10%2E2.specific.out
did not finish on the last run and must be erased
WARNING: Multiple MAKER processes have been started in the
same directory.

STATUS: Processing and indexing input FASTA files...
examining contents of the fasta file and run log



--Next Contig--

#---------------------------------------------------------------------
Another instance of maker is processing this contig!!
SeqID: scaffold#1
Length: 73580997
#---------------------------------------------------------------------


WARNING: Multiple MAKER processes have been started in the
same directory.

STATUS: Processing and indexing input FASTA files...
STATUS: Setting up database for any GFF3 input...
STATUS: Setting up database for any GFF3 input...
A data structure will be created for you at:
/data/splitgenome/top1000/splitGenome/part1/plant.maker.output/plant_datastore

To access files for individual sequences use the datastore index:
/data/splitgenome/top1000/splitGenome/part1/plant.maker.output/plant_master_datastore_index.log

STATUS: Now running MAKER...
examining contents of the fasta file and run log



--Next Contig--

#---------------------------------------------------------------------
Another instance of maker is processing this contig!!
SeqID: scaffold#1
Length: 73580997
#---------------------------------------------------------------------




*Maker is now finished!!!*



Start_time: 1603000030
End_time:   1603000034
Elapsed:    4
#---------------------------------------------------------------------
Now starting the contig!!
SeqID: scaffold#1
Length: 73580997
#---------------------------------------------------------------------


examining contents of the fasta file and run log



--Next Contig--

#---------------------------------------------------------------------
Another instance of maker is processing this contig!!
SeqID: scaffold#1
Length: 73580997
#---------------------------------------------------------------------


examining contents of the fasta file and run log



--Next Contig--

#---------------------------------------------------------------------
Another instance of maker is processing this contig!!
SeqID: scaffold#1
Length: 73580997
#---------------------------------------------------------------------




Start_time: 1603000030
End_time:   1603000034
Elapsed:    4


*Maker is now finished!!!*

A data structure will be created for you at:
/data/splitgenome/top1000/splitGenome/part1/plant.maker.output/plant_datastore

To access files for individual sequences use the datastore index:
/data/splitgenome/top1000/splitGenome/part1/plant.maker.output/plant_master_datastore_index.log

STATUS: Now running MAKER...
examining contents of the fasta file and run log
..........

Maker is now finished!!!



Start_time: 1602920407
End_time:   1602920580
Elapsed:    173
running  repeat masker.
#--------- command -------------#
Widget::RepeatMasker:
cd /tmp/maker_VflRYH; ./maker3.01/exe/RepeatMasker/RepeatMasker
/data/splitgenome/top1000/splitGenome/part1/plant.maker.output/plant_datastore/B5/CD/scaffold#1//theVoid.scaffold#1/0/scaffold#1.0.plant_repeatFinal%2Elib%2Empi%2E10%2E0.specific
-dir
/data/splitgenome/top1000/splitGenome/part1/plant.maker.output/plant_datastore/B5/CD/scaffold#1//theVoid.scaffold#1/0
-pa 1 -lib
/data/splitgenome/top1000/splitGenome/part1/plant.maker.output/mpi_blastdb/plant_repeatFinal%2Elib.mpi.10/plant_repeatFinal%2Elib.mpi.10.0
#-------------------------------#
running  repeat masker.
#--------- command -------------#
Widget::RepeatMasker:
cd /tmp/maker_VflRYH; ./maker3.01/exe/RepeatMasker/RepeatMasker
/data/splitgenome/top1000/splitGenome/part1/plant.maker.output/plant_datastore/B5/CD/scaffold#1//theVoid.scaffold#1/0/scaffold#1.0.plant_repeatFinal%2Elib%2Empi%2E10%2E1.specific
-dir
/data/splitgenome/top1000/splitGenome/part1/plant.maker.output/plant_datastore/B5/CD/scaffold#1//theVoid.scaffold#1/0
-pa 1 -lib
/data/splitgenome/top1000/splitGenome/part1/plant.maker.output/mpi_blastdb/plant_repeatFinal%2Elib.mpi.10/plant_repeatFinal%2Elib.mpi.10.1
#-------------------------------#
running  repeat masker.
#--------- command -------------#
Widget::RepeatMasker:
cd /tmp/maker_VflRYH; ./maker3.01/exe/RepeatMasker/RepeatMasker
/data/splitgenome/top1000/splitGenome/part1/plant.maker.output/plant_datastore/B5/CD/scaffold#1//theVoid.scaffold#1/0/scaffold#1.0.plant_repeatFinal%2Elib%2Empi%2E10%2E2.specific
-dir
/data/splitgenome/top1000/splitGenome/part1/plant.maker.output/plant_datastore/B5/CD/scaffold#1//theVoid.scaffold#1/0
-pa 1 -lib
/data/splitgenome/top1000/splitGenome/part1/plant.maker.output/mpi_blastdb/plant_repeatFinal%2Elib.mpi.10/plant_repeatFinal%2Elib.mpi.10.2
#-------------------------------#
running  repeat masker.
#--------- command -------------#
Widget::RepeatMasker:
cd /tmp/maker_VflRYH; ./maker3.01/exe/RepeatMasker/RepeatMasker
/data/splitgenome/top1000/splitGenome/part1/plant.maker.output/plant_datastore/B5/CD/scaffold#1//theVoid.scaffold#1/0/scaffold#1.0.plant_repeatFinal%2Elib%2Empi%2E10%2E3.specific
-dir
/data/splitgenome/top1000/splitGenome/part1/plant.maker.output/plant_datastore/B5/CD/scaffold#1//theVoid.scaffold#1/0
-pa 1 -lib
/data/splitgenome/top1000/splitGenome/part1/plant.maker.output/mpi_blastdb/plant_repeatFinal%2Elib.mpi.10/plant_repeatFinal%2Elib.mpi.10.3
#-------------------------------#
running  repeat masker.
#--------- command -------------#

.....
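
Because the output of all the processes is interleaved, it is hard to see at a glance how many of them started the contig and how many skipped it. The following is a minimal sketch for tallying that, assuming the combined output has been saved as plain text. `summarize_maker_log` is a hypothetical helper, the two message strings are copied from the output above, and the sketch assumes each banner is followed by its own SeqID line before the next banner appears.

```python
from collections import Counter

STARTED = "Now starting the contig!!"
SKIPPED = "Another instance of maker is processing this contig!!"


def summarize_maker_log(text):
    """Count (status, SeqID) pairs found in MAKER console output."""
    counts = Counter()
    pending = None  # status of the last banner still waiting for a SeqID line
    for raw in text.splitlines():
        line = raw.strip()
        if line == STARTED:
            pending = "started"
        elif line == SKIPPED:
            pending = "skipped"
        elif pending and line.startswith("SeqID:"):
            seq_id = line.split(":", 1)[1].strip()
            counts[(pending, seq_id)] += 1
            pending = None
    return counts


# Small sample copied from the output above.
sample = """\
#---------------------------------------------------------------------
Now starting the contig!!
SeqID: scaffold#1
Length: 73580997
#---------------------------------------------------------------------
#---------------------------------------------------------------------
Another instance of maker is processing this contig!!
SeqID: scaffold#1
Length: 73580997
#---------------------------------------------------------------------
"""

for (status, seq_id), n in sorted(summarize_maker_log(sample).items()):
    print(status, seq_id, n)
```

On a full log, one "started" entry against many "skipped" entries for the same SeqID would match what I see: a single process working on scaffold#1 while the others move on and finish.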

On Mon, Oct 19, 2020 at 10:32 PM Carson Holt <carsonhh at gmail.com> wrote:

> Yes. It will divide contigs into chunks the same size as the max_dna_len
> parameter.
>
> —Carson
>
> Sent from my iPhone
>
> > On Oct 17, 2020, at 12:48 AM, Xu, taosheng <taosheng.x at gmail.com> wrote:
> >
> >
> > Dear Maker Development Team,
> > I wonder whether MAKER supports parallel processing of a single long
> > genome sequence?
> >
> > When I submit my MAKER task using OpenMPI with multiple CPUs (e.g.,
> > mpiexec -n 40 maker) to annotate a single long genome sequence, only one
> > maker process with 4 rmblast processes ever runs. The other CPUs sit idle.
> >
> > I want to use MAKER's parallel processing with OpenMPI to speed up the
> > annotation of a single ultra-long genome sequence.
> >
> > Best regards,
> > Taosheng
> > _______________________________________________
> > maker-devel mailing list
> > maker-devel at yandell-lab.org
> > http://yandell-lab.org/mailman/listinfo/maker-devel_yandell-lab.org
>