[maker-devel] Some errors reported by Maker2
Quanwei Zhang
qwzhang0601 at gmail.com
Wed Sep 13 09:32:34 CDT 2017
Dear Carson:
I ran more tests on one of the contigs (length 863 kb) that failed during
repeat masking. It fails only when I add the species-specific repeat
library; it is annotated successfully when only the mammalian repeat
library is used. For this test I ran MAKER on this contig alone with 64 GB
of memory, so I do not think the failure is a memory or IO problem: even
contigs of length 98 Mb can be annotated with 32 GB of memory.
I also ran RepeatMasker on this contig with the mammalian and the
species-specific repeat libraries separately. With the mammalian library
about 35% of the sequence was masked as repeats, while with the
species-specific library it was 65% (as shown below). I wonder whether the
high level of repeats can lead to the failure of this contig. Do you have
any ideas about this? Thanks
file name: test_scaffold31.fasta
sequences:             1
total length:     863590 bp  (858757 bp excl N/X-runs)
GC level:          37.02 %
bases masked:     562909 bp ( 65.18 %)
==================================================
                  number of      length percentage
                  elements*    occupied of sequence
--------------------------------------------------
SINEs:               113       16134 bp    1.87 %
      ALUs            71       12479 bp    1.45 %
      MIRs             1         133 bp    0.02 %
LINEs:               251      380142 bp   44.02 %
      LINE1          211      210623 bp   24.39 %
      LINE2            1          86 bp    0.01 %
      L3/CR1           0           0 bp    0.00 %
LTR elements:        246      101221 bp   11.72 %
      ERVL             5        1037 bp    0.12 %
      ERVL-MaLRs      18        2744 bp    0.32 %
      ERV_classI     201       90942 bp   10.53 %
      ERV_classII     18        5964 bp    0.69 %
DNA elements:         39       14177 bp    1.64 %
      hAT-Charlie      7        3864 bp    0.45 %
      TcMar-Tigger     7        1706 bp    0.20 %
Unclassified:        196       45831 bp    5.31 %
Total interspersed repeats:   557505 bp   64.56 %
Small RNA:             3         823 bp    0.10 %
Satellites:            2         237 bp    0.03 %
Simple repeats:       94        4472 bp    0.52 %
Low complexity:       18         766 bp    0.09 %
==================================================
* most repeats fragmented by insertions or deletions
  have been counted as one element
The query species was assumed to be homo
RepeatMasker Combined Database: Dfam_Consensus-20170127, RepBase-20170127
run with rmblastn version 2.2.27+
The query was compared to classified sequences in
".../consensi.fa.classifiednoProtFinal"
Best
Quanwei
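
To compare the two RepeatMasker runs described above without reading the
tables by eye, the masked fraction can be pulled out of each summary
programmatically. The following is only a minimal sketch: the function name
masked_summary is made up for illustration, and it assumes the summary text
contains the "total length:" and "bases masked:" lines in the form pasted in
this message.

```python
import re


def masked_summary(tbl_text):
    """Return (total_bp, masked_bp, reported_pct) from RepeatMasker summary text.

    Assumes the "total length:" and "bases masked:" lines look like the ones
    pasted in this message; raises ValueError if either line is missing.
    """
    total = re.search(r"total length:\s+(\d+)\s+bp", tbl_text)
    masked = re.search(r"bases masked:\s+(\d+)\s+bp\s+\(\s*([\d.]+)\s*%\)", tbl_text)
    if total is None or masked is None:
        raise ValueError("text does not look like a RepeatMasker summary table")
    return int(total.group(1)), int(masked.group(1)), float(masked.group(2))


# Two lines copied from the summary above, used as a self-contained example.
example = """\
total length: 863590 bp (858757 bp excl N/X-runs)
bases masked: 562909 bp ( 65.18 %)
"""

total_bp, masked_bp, reported_pct = masked_summary(example)
recomputed_pct = 100 * masked_bp / total_bp
print(f"{masked_bp} of {total_bp} bp masked "
      f"(reported {reported_pct:.2f} %, recomputed {recomputed_pct:.2f} %)")
```

Calling the same function on the summary text of the mammalian-library run
and of the species-specific run gives the two percentages side by side.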
2017-09-11 14:33 GMT-04:00 Quanwei Zhang <qwzhang0601 at gmail.com>:
> Dear Carson:
>
> I see. Thank you. I will try it.
>
> Best
> Quanwei
>
> 2017-09-11 13:46 GMT-04:00 Carson Holt <carsonhh at gmail.com>:
>
>> Each node is a single machine. Because you currently run without MPI,
>> each MAKER job you submit runs on a single machine. So you are either
>> running multiple times on the same node, or you submitted 5 separate batch
>> jobs in which case you may have a single maker process on each of 5 nodes.
>>
>> MPI can parallelize on the same node or across nodes. If you request 10
>> nodes, then it can communicate across nodes to run the job on all hardware.
>> Or you can run MPI on a single node and ask for all CPUs on that node. In
>> that case it will split up work within a single node and use all resources
>> just on that node. So if you can’t get MPI to work across nodes, you can
>> just submit a job that goes to a single node and ask for all CPUs on that
>> node (multinode jobs may be hard to configure, but single node jobs are
>> very easy). Just set the -n parameter of mpiexec to the CPU count of that
>> node, and it will parallelize within the node.
>>
>> Example command for a 20 CPU node —> mpiexec -n 20 maker
>>
>> —Carson
>>
>>
>>
>>
>>
>> On Sep 11, 2017, at 11:27 AM, Quanwei Zhang <qwzhang0601 at gmail.com>
>> wrote:
>>
>> Dear Carson:
>>
>> Would you please explain what you mean by "a single machine"? I am
>> running maker2 on our high performance cluster. The cluster has more than
>> 1,620-core compute nodes with 128 GB RAM each. Univa Grid Engine was used
>> as the scheduler. Can I use MPICH3?
>>
>> Thanks
>>
>> Best
>> Quanwei
>>
>> 2017-09-11 13:18 GMT-04:00 Carson Holt <carsonhh at gmail.com>:
>>
>>> If you are just using a single machine (and not cross machine MPI), use
>>> MPICH3 —> https://www.mpich.org
>>>
>>> It’s easy to install yourself, and tends to be very robust to failure.
>>>
>>> —Carson
>>>
>>>
>>>
>>> On Sep 11, 2017, at 11:16 AM, Quanwei Zhang <qwzhang0601 at gmail.com>
>>> wrote:
>>>
>>> Dear Carson:
>>>
>>> I ran into some problems using MPI. I will give it another try.
>>> Thank you!
>>>
>>> Best
>>> Quanwei
>>>
>>> 2017-09-11 13:14 GMT-04:00 Carson Holt <carsonhh at gmail.com>:
>>>
>>>> It could be either. Please use MPI instead of starting multiple
>>>> instances. It will greatly reduce both IO and RAM usage.
>>>>
>>>> —Carson
>>>>
>>>>
>>>>
>>>> On Sep 11, 2017, at 11:12 AM, Quanwei Zhang <qwzhang0601 at gmail.com>
>>>> wrote:
>>>>
>>>> Dear Carson:
>>>>
>>>> I only run 5 MAKER instances in each directory (with cpus=2). If this
>>>> is a memory issue or an IO issue, I am not sure why the much longer
>>>> scaffolds were all annotated successfully while the relatively shorter
>>>> ones failed.
>>>>
>>>> I have set "tries=5" (#number of times to try a contig if there is a
>>>> failure for some reason). I will try "clean_try=1" and test on the failed
>>>> scaffolds individually with larger memory to see whether they can be
>>>> annotated.
>>>>
>>>> Thank you!
>>>>
>>>> Best
>>>> Quanwei
>>>>
>>>> 2017-09-11 13:07 GMT-04:00 Carson Holt <carsonhh at gmail.com>:
>>>>
>>>>> I think the cause of the error may have been a little further upstream
>>>>> from what you pasted in the e-mail. One thing that may be happening is that
>>>>> you are taxing resources (like IO) if running MAKER multiple times or on
>>>>> too many CPUs. That can lead to failures because of truncated BLAST reports
>>>>> etc. In which case you can just retry and that will get around those types
>>>>> of IO derived errors. MAKER can generate a lot of IO, and if you are
>>>>> working on network mounted locations (i.e. the storage being used is
>>>>> actually across the network), then they can be less robust than local
>>>>> storage (when under heavy load NFS can falsely report success on read/write
>>>>> operations that actually failed). It’s the reason we built in the retry
>>>>> capabilities of MAKER.
>>>>>
>>>>> For contigs that continuously fail, you may need to set clean_try=1.
>>>>> That will cause failures to start from scratch (i.e. delete all old reports
>>>>> on failure rather than just those suspected of being truncated).
>>>>>
>>>>> —Carson
>>>>>
>>>>>
>>>>>
>>>>> On Sep 11, 2017, at 10:19 AM, Quanwei Zhang <qwzhang0601 at gmail.com>
>>>>> wrote:
>>>>>
>>>>> Dear Carson:
>>>>>
>>>>> About the error in my previous email: I found that the contig was
>>>>> annotated correctly on the second retry, so please ignore that email.
>>>>> But now, for a small number of scaffolds, I ran into problems while
>>>>> processing the repeats (as shown below). I used both the Mammalia
>>>>> repeat library and a species-specific repeat library (generated with
>>>>> your pipeline,
>>>>> http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Repeat_Library_Construction--Basic).
>>>>> There were no such problems when I used only the Mammalia repeat
>>>>> library. Do you have any ideas about this? What could be the reason? Or
>>>>> do you have any suggestions for how I can find the reason? Many thanks
>>>>>
>>>>> Here are some parameters I used
>>>>>
>>>>> #-----Repeat Masking (leave values blank to skip repeat masking)
>>>>> model_org=Mammalia #select a model organism for RepBase masking in RepeatMasker
>>>>> rmlib=../consensi.fa.classifiednoProtFinal #provide an organism specific repeat library in fasta format for Repe
>>>>>
>>>>> max_dna_len=300000
>>>>> split_hit=40000
>>>>> depth_blastn=30 #Blastn depth cutoff (0 to disable cutoff)
>>>>> depth_blastx=30 #Blastx depth cutoff (0 to disable cutoff)
>>>>> depth_tblastx=30 #tBlastx depth cutoff (0 to disable cutoff)
>>>>> bit_rm_blastx=30 #Blastx bit cutoff for transposable element masking
>>>>>
>>>>>
>>>>> Died at /gs/gsfs0/hpc01/apps/MAKER/2.31.9/bin/../lib/Bio/Search/Hit/PhatHit/Base.pm
>>>>> line 188.
>>>>> --> rank=NA, hostname=n409
>>>>> ERROR: Failed while processing all repeats
>>>>> ERROR: Chunk failed at level:3, tier_type:1
>>>>> FAILED CONTIG:Contig31
>>>>>
>>>>>
>>>>> Best
>>>>> Quanwei
>>>>>
>>>>> 2017-09-08 23:25 GMT-04:00 Quanwei Zhang <qwzhang0601 at gmail.com>:
>>>>>
>>>>>> Dear Carson:
>>>>>>
>>>>>> I got the following error again. Is this still related to memory
>>>>>> issues, or could there be other reasons for this error? This time I got
>>>>>> the error while training the SNAP model. Previously, even with
>>>>>> max_dna_len set to 1 Mb, I could train the model successfully. In the
>>>>>> current training (where I get the following error), I have decreased
>>>>>> max_dna_len to 300 kb and requested the same amount of memory as before.
>>>>>> The only difference is that I am now using both the mammalian repeat
>>>>>> library and the species-specific repeat library, while previously I used
>>>>>> only the mammalian repeat library. Does using both repeat libraries
>>>>>> greatly increase the memory requirement (even when I decrease
>>>>>> max_dna_len from 1 Mb to 300 kb)? I have also set depth_blast to 30 in
>>>>>> the current training.
>>>>>>
>>>>>> Thank you! Have a nice weekend!
>>>>>>
>>>>>>
>>>>>>
>>>>>> #---------------------------------------------------------------------
>>>>>> Now starting the contig!!
>>>>>> SeqID: Contig10
>>>>>> Length: 18773588
>>>>>> #---------------------------------------------------------------------
>>>>>>
>>>>>>
>>>>>> setting up GFF3 output and fasta chunks
>>>>>> doing repeat masking
>>>>>> doing blastx repeats
>>>>>> doing blastx repeats
>>>>>> doing blastx repeats
>>>>>> doing blastx repeats
>>>>>> doing blastx repeats
>>>>>> doing blastx repeats
>>>>>> doing blastx repeats
>>>>>> doing blastx repeats
>>>>>> doing blastx repeats
>>>>>> doing blastx repeats
>>>>>> collecting blastx repeatmasking
>>>>>> processing all repeats
>>>>>> doing repeat masking
>>>>>> Can't kill a non-numeric process ID at /gs/gsfs0/hpc01/apps/MAKER/2.31.9/bin/../lib/File/NFSLock.pm
>>>>>> line 1050.
>>>>>> --> rank=NA, hostname=n224
>>>>>> ERROR: Failed while doing repeat masking
>>>>>> ERROR: Chunk failed at level:0, tier_type:1
>>>>>> FAILED CONTIG:Contig10
>>>>>>
>>>>>> ERROR: Chunk failed at level:2, tier_type:0
>>>>>> FAILED CONTIG:Contig10
>>>>>>
>>>>>> Best
>>>>>> Quanwei
>>>>>>
>>>>>> 2017-09-06 12:06 GMT-04:00 Carson Holt <carsonhh at gmail.com>:
>>>>>>
>>>>>>>
>>>>>>> (2) Reading some of your replies in the MAKER Google group, I noticed
>>>>>>> that setting depth_blast to a certain number can reduce memory use and
>>>>>>> save annotation time, so I changed the following parameters. But I
>>>>>>> wonder whether this will decrease the quality of the annotation. If it
>>>>>>> won't affect the quality, can I use an even smaller number (e.g., 20)
>>>>>>> to save more memory and time?
>>>>>>>
>>>>>>> depth_blastn=30 #Blastn depth cutoff (0 to disable cutoff)
>>>>>>> depth_blastx=30 #Blastx depth cutoff (0 to disable cutoff)
>>>>>>> depth_tblastx=30 #tBlastx depth cutoff (0 to disable cutoff)
>>>>>>> bit_rm_blastx=30 #Blastx bit cutoff for transposable element masking
>>>>>>>
>>>>>>>
>>>>>>> These values really only affect the final evidence kept in the GFF3
>>>>>>> when you look at it in a browser. They have no effect on the annotation.
>>>>>>> This is because internally MAKER already collapses evidence down to the
>>>>>>> 10 best non-redundant features per evidence set per locus. The rest are
>>>>>>> put in the GFF3 just for reference. By setting them lower, you are just
>>>>>>> letting MAKER know it can throw things away even sooner since you don’t
>>>>>>> want them in the GFF3. It provides a minor improvement for memory use,
>>>>>>> but max_dna_len is the big one that has the greatest effect.
>>>>>>>
>>>>>>>
>>>>>>> (3) I also have some concerns about speed, especially for the long
>>>>>>> scaffolds (around 100 Mb). I wonder which part of genome annotation is
>>>>>>> the most time consuming (repeat masking, BLAST, or polishing?). In
>>>>>>> particular, I wonder whether the blastx of protein evidence takes the
>>>>>>> majority of the time. I have prepared 99k mammalian Swiss-Prot protein
>>>>>>> sequences and 340k rodent TrEMBL protein sequences as protein evidence.
>>>>>>> I am considering whether I can save much time if I use only the 99k
>>>>>>> mammalian Swiss-Prot protein sequences as evidence.
>>>>>>>
>>>>>>>
>>>>>>> BLASTN (ESTs) -> fastest as it is searching nucleotide space
>>>>>>> BLASTX (proteins) -> must search 6 reading frames so will be at
>>>>>>> least 6 times slower than BLASTN
>>>>>>> TBLASTX (alt-ESTs) -> must search 12 reading frames so will be at
>>>>>>> least 12 times slower than BLASTN and twice as slow as BLASTX
>>>>>>>
>>>>>>> Also, doubling the dataset size doubles the runtime. Larger window
>>>>>>> sizes via max_dna_len will also increase runtimes.
>>>>>>>
>>>>>>>
>>>>>>> (4) For some reason, I cannot run MAKER through MPI on our cluster,
>>>>>>> so I can only start multiple MAKER instances. I wonder whether it is
>>>>>>> possible to let multiple MAKER instances annotate the same long
>>>>>>> scaffold (i.e., start multiple MAKER instances for a single sequence,
>>>>>>> without splitting the long sequence into shorter ones).
>>>>>>>
>>>>>>>
>>>>>>> Without MPI you won’t be able to split up large contigs. At the very
>>>>>>> least you can try and run on a single node and set MPI to use all CPUs on
>>>>>>> that node. It’s less difficult to set up compared to cross node jobs via
>>>>>>> MPI.
>>>>>>>
>>>>>>>
>>>>>>> (5) Still about the speed issue. I read some of your comments about
>>>>>>> the "cpus" parameter in the maker_opts file (
>>>>>>> http://gmod.827538.n3.nabble.com/open3-fork-failed-Cannot-allocate-memory-td4025117.html).
>>>>>>> I understand that it indicates the number of CPUs for a single chunk.
>>>>>>> So if I set "cpus=2" in the maker_opts file, then I can use the
>>>>>>> following command to submit the job, right?
>>>>>>>
>>>>>>>
>>>>>>> The cpus parameter only affects how many CPUs are given to the BLAST
>>>>>>> command line, so only the BLAST step will speed up. I recommend using
>>>>>>> MPI to get all steps to speed up. Even if you are only running on a
>>>>>>> single node, you can give all CPUs to the mpiexec command.
>>>>>>>
>>>>>>>
>>>>>>> —Carson
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
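
The MAKER log excerpts quoted in this thread mark each failure with a line
such as "FAILED CONTIG:Contig10". When deciding which scaffolds to re-test
individually, it can help to list those names from a saved log. The
following is only a minimal sketch: the function name failed_contigs is made
up for illustration, and it assumes the log text was captured to a string
(for example by redirecting MAKER's output to a file and reading it back).

```python
import re
from collections import Counter

# Matches the failure marker seen in the quoted logs, e.g. "FAILED CONTIG:Contig10".
FAILED_RE = re.compile(r"FAILED CONTIG:\s*(\S+)")


def failed_contigs(log_text):
    """Count how many times each contig is reported as failed in the log text."""
    return Counter(FAILED_RE.findall(log_text))


# Lines copied from the log excerpts quoted in this thread.
sample_log = """\
ERROR: Failed while doing repeat masking
ERROR: Chunk failed at level:0, tier_type:1
FAILED CONTIG:Contig10

ERROR: Chunk failed at level:2, tier_type:0
FAILED CONTIG:Contig10
ERROR: Failed while processing all repeats
ERROR: Chunk failed at level:3, tier_type:1
FAILED CONTIG:Contig31
"""

for name, count in sorted(failed_contigs(sample_log).items()):
    print(name, count)
```

A contig that appears several times was reported at more than one level or
on more than one try, which makes it a candidate for the individual re-test
discussed in the thread.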