[maker-devel] failed contigs in master_datastore_index.log but no errors in screen output

Carson Holt carsonhh at gmail.com
Tue Feb 20 16:03:11 MST 2018


On two of the scaffolds (unclustered_scaffold_3148 and unclustered_scaffold_3490), there is a FINISHED entry, but it is out of order: FINISHED appears before FAILED. If you ran multiple jobs at the same time, there may be a race condition in which one process breaks the other's file lock. In that case, one process failed and the other finished at nearly the same time. The failure from a broken lock may not say "ERROR" in the STDERR, but may have another tag. Search for "unclustered_scaffold_3148" in the STDERR, then look at the entries above and below it.
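To illustrate the out-of-order pattern, here is a minimal sketch (not part of MAKER) that reads the index log and lists every contig whose last entry is not FINISHED, together with its full status history. It assumes the three tab-separated columns shown in the grep output quoted below. The function name status_history and the labels FINISHED_OUT_OF_ORDER and NOT_FINISHED are made up for this example.

```shell
# status_history FILE
# Hypothetical helper: summarize a MAKER master datastore index log.
# Assumed columns (tab-separated): contig ID, datastore path, status.
status_history() {
    awk -F'\t' '
        NF < 3 { next }                       # skip blank or malformed lines
        {
            status = $3
            sub(/[ \t\r]+$/, "", status)      # drop trailing whitespace
            if (!($1 in hist)) {
                order[++n] = $1               # remember first-seen order
                hist[$1] = status
            } else {
                hist[$1] = hist[$1] "," status
            }
            last[$1] = status
            if (status == "FINISHED") done[$1] = 1
        }
        END {
            for (i = 1; i <= n; i++) {
                id = order[i]
                if (last[id] == "FINISHED") continue   # ended cleanly
                # Labels below are invented for this sketch.
                note = (id in done) ? "FINISHED_OUT_OF_ORDER" : "NOT_FINISHED"
                print id "\t" hist[id] "\t" note
            }
        }
    ' "$1"
}
```

Calling status_history Rwill4_master_datastore_index.log should then report unclustered_scaffold_3148 and unclustered_scaffold_3490 with the history STARTED,STARTED,FINISHED,FAILED, along with any other contig that did not end on FINISHED.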

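As a sketch of the STDERR search, assuming the job's STDERR was captured to a file (log.err is a placeholder name here), grep can print the lines around each mention of a scaffold. The helper name stderr_context and the default of 20 context lines are arbitrary choices for illustration, not MAKER settings.

```shell
# stderr_context CONTIG LOGFILE [CONTEXT_LINES]
# Hypothetical helper: show each mention of a contig in a captured STDERR
# log together with the surrounding lines, so that tags other than "ERROR"
# near the contig become visible.
stderr_context() {
    contig=$1
    logfile=$2
    context=${3:-20}    # assumed default; adjust as needed
    # -n: prefix output with line numbers
    # -F: treat the contig ID as a fixed string, not a regex
    # -w: whole-word match, so ..._3148 does not also match ..._31480
    # -C: print that many lines before and after each match
    grep -n -F -w -C "$context" -- "$contig" "$logfile"
}
```

For example: stderr_context unclustered_scaffold_3148 log.err 30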
—Carson


> On Feb 20, 2018, at 3:58 PM, Valerie Soza <vsoza at uw.edu> wrote:
> 
> Hi Carson
> 
> No worries. Thanks for your response. I am using qsub on an SGE cluster and get logs of the jobs, so I had the entire screen output (STDERR) when I searched for errors and for the 4 scaffolds that failed. I also searched for the 4 failed scaffolds in the master_datastore_index.log, and this is what I got:
> 
> $ grep LG12_ordered_scaffold_101 Rwill4_master_datastore_index.log
> LG12_ordered_scaffold_101	Rwill4_datastore/B7/F2/LG12_ordered_scaffold_101/	STARTED
> LG12_ordered_scaffold_101	Rwill4_datastore/B7/F2/LG12_ordered_scaffold_101/	FAILED
> LG12_ordered_scaffold_101	Rwill4_datastore/B7/F2/LG12_ordered_scaffold_101/	RETRY
> LG12_ordered_scaffold_101	Rwill4_datastore/B7/F2/LG12_ordered_scaffold_101/	FINISHED
> 
> $ grep unclustered_scaffold_3148 Rwill4_master_datastore_index.log
> unclustered_scaffold_3148	Rwill4_datastore/62/82/unclustered_scaffold_3148/	STARTED
> unclustered_scaffold_3148	Rwill4_datastore/62/82/unclustered_scaffold_3148/	STARTED
> unclustered_scaffold_3148	Rwill4_datastore/62/82/unclustered_scaffold_3148/	FINISHED
> unclustered_scaffold_3148	Rwill4_datastore/62/82/unclustered_scaffold_3148/	FAILED
> 
> $ grep unclustered_scaffold_3490 Rwill4_master_datastore_index.log
> unclustered_scaffold_3490	Rwill4_datastore/1D/F9/unclustered_scaffold_3490/	STARTED
> unclustered_scaffold_3490	Rwill4_datastore/1D/F9/unclustered_scaffold_3490/	STARTED
> unclustered_scaffold_3490	Rwill4_datastore/1D/F9/unclustered_scaffold_3490/	FINISHED
> unclustered_scaffold_3490	Rwill4_datastore/1D/F9/unclustered_scaffold_3490/	FAILED
> 
> $ grep unclustered_scaffold_7506 Rwill4_master_datastore_index.log
> unclustered_scaffold_7506	Rwill4_datastore/69/B8/unclustered_scaffold_7506/	STARTED
> unclustered_scaffold_7506	Rwill4_datastore/69/B8/unclustered_scaffold_7506/	FAILED
> unclustered_scaffold_7506	Rwill4_datastore/69/B8/unclustered_scaffold_7506/	RETRY
> unclustered_scaffold_7506	Rwill4_datastore/69/B8/unclustered_scaffold_7506/	FINISHED
> 
> Based on this, it seems like 2 of the scaffolds were retried and finished successfully, while the other 2 were retried but failed for some reason. I am now rerunning this with 4 retries instead of the default of 2, but it is strange that I did not get any errors in the STDERR.
> 
> -Valerie
> 
> 
>> On Feb 20, 2018, at 8:23 AM, Carson Holt <carsonhh at gmail.com> wrote:
>> 
>> Hi Valerie,
>> 
>> Sorry for the slow reply. If you are running in a screen session, try redirecting STDERR to a file instead so you can capture all errors. Example: maker &> log.err (in bash, this sends both STDOUT and STDERR to log.err)
>> 
>> Also, the datastore_index.log is a cumulative file. Rather than just grepping for FAILED, grep for the contig of interest. Example: grep "unclustered_scaffold_3490" Rwill4_master_datastore_index.log
>> 
>> You may get something like this:
>> 
>> unclustered_scaffold_3490	Rwill4_datastore/1D/F9/unclustered_scaffold_3490/	STARTED
>> unclustered_scaffold_3490	Rwill4_datastore/1D/F9/unclustered_scaffold_3490/	FAILED
>> unclustered_scaffold_3490	Rwill4_datastore/1D/F9/unclustered_scaffold_3490/	RETRY
>> unclustered_scaffold_3490	Rwill4_datastore/1D/F9/unclustered_scaffold_3490/	FINISHED
>> 
>> If it shows DIED_SKIPPED_PERMANENT rather than FINISHED, then increase the maker retry count on the next run (in the maker_opts file or with the command line flag). By capturing all STDERR to a file on that run, you can then see why it fails.
>> 
>> Thanks,
>> Carson
>> 
>> 
>> 
>>> On Feb 12, 2018, at 12:41 PM, Valerie Soza <vsoza at uw.edu> wrote:
>>> 
>>> Hi all
>>> 
>>> I ran 3 instances of Maker 2.31.9 on a genome assembly using a SNAP training parameters file and my training parameters from running BUSCO on our genome. The job completed on our departmental computing cluster but when I looked at the master_datastore_index.log, 4 scaffolds had FAILED and 2 were indicated as RETRY.
>>> 
>>> I previously ran Maker on this same genome just using the SNAP training parameters file and it worked fine so I am perplexed.
>>> 
>>> $ grep FAILED Rwill4_master_datastore_index.log 
>>> LG12_ordered_scaffold_101	Rwill4_datastore/B7/F2/LG12_ordered_scaffold_101/	FAILED 
>>> unclustered_scaffold_3148	Rwill4_datastore/62/82/unclustered_scaffold_3148/	FAILED 
>>> unclustered_scaffold_3490	Rwill4_datastore/1D/F9/unclustered_scaffold_3490/	FAILED 
>>> unclustered_scaffold_7506	Rwill4_datastore/69/B8/unclustered_scaffold_7506/	FAILED
>>> 
>>> $ grep RETRY Rwill4_master_datastore_index.log 
>>> LG12_ordered_scaffold_101	Rwill4_datastore/B7/F2/LG12_ordered_scaffold_101/	RETRY 
>>> unclustered_scaffold_7506	Rwill4_datastore/69/B8/unclustered_scaffold_7506/	RETRY
>>> 
>>> When I looked at the screen output from all 3 instances, there are no errors when I grep for Error, error, or ERROR.
>>> None of the 3 screen outputs indicated that Maker had finished, so I restarted the job and then immediately got "Maker is now finished!!!"
>>> However, when I grep for the 4 failed scaffolds above in the 3 screen outputs, I only get something for 1 of the scaffolds, which also happens to be in the last lines of one of the screens' outputs:
>>> 
>>> #---------------------------------------------------------------------
>>> Now starting the contig!!
>>> SeqID: LG12_ordered_scaffold_101
>>> Length: 89169
>>> #---------------------------------------------------------------------
>>> 
>>> 
>>> setting up GFF3 output and fasta chunks
>>> doing repeat masking
>>> running  repeat masker.
>>> #--------- command -------------#
>>> Widget::RepeatMasker:
>>> cd /tmp/935482.1.ravana.q/maker_ul9sWE; /net/gs/vol3/software/modules-sw/RepeatMasker/4.0.7/Linux/RHEL6/x86_64/RepeatMasker /net/shendure/vol8/projects/R.williamsianum.annotation/Maker_analyses/SNAP_training/Rwill4.maker.output/Rwill4_datastore/B7/F2/LG12_ordered_scaffold_101//theVoid.LG12_ordered_scaffold_101/0/LG12_ordered_scaffold_101.0.all.rb -species all -dir /net/shendure/vol8/projects/R.williamsianum.annotation/Maker_analyses/SNAP_training/Rwill4.maker.output/Rwill4_datastore/B7/F2/LG12_ordered_scaffold_101//theVoid.LG12_ordered_scaffold_101/0 -pa 10
>>> #-------------------------------#
>>> 
>>> It seems like the run did not finish properly. Am I interpreting this correctly? Does anyone have suggestions on what I should do or how to troubleshoot? 
>>> 
>>> Thanks.
>>> 
>>> -Valerie
>>> 	
>>> Valerie Soza, Ph.D.
>>> c/o Hall Lab
>>> Department of Biology
>>> University of Washington
>>> Johnson Hall 202A
>>> Box 351800
>>> Seattle, WA 98195-1800
>>> 206-543-6740
>>> http://staff.washington.edu/vsoza/
>>> 
>>> 
>> 
> 
> 




