[maker-devel] Maker actions when using imported rm_gff file

Mon Mar 16 10:48:13 MDT 2020

In the log I see these:

ERROR: Could not open file: /scratch/dro49/myluwork/annotation/test9/lu.maker.output/lu_datastore/DC/34/scaffold1731//theVoid.scaffold1731/scaffold1731.gff.seq.tmp
Cannot send after transport endpoint shutdown

You are having IO timeout issues. It can be an issue with a single node on your cluster, or an issue with the object storage servers on the Lustre network storage. Or your job may just be too big for the network storage to handle. You will likely need to run on fewer nodes, or you can have your system admin increase the timeout options for Lustre to see if that helps.
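
To see which contigs are hit by these errors, something like the following Python sketch could tally them from the log. It is only a rough illustration: it assumes the error lines look exactly like the one quoted above and that the contig name can be read from the theVoid.<name> directory in the path. The function name tally_failed_contigs is made up for this example and is not part of Maker.

```python
import re
from collections import Counter

# Assumption: error lines have the same shape as the one quoted above.
ERROR_RE = re.compile(r"^ERROR: Could not open file: (\S+)")
# Assumption: the contig name appears as ".../theVoid.<contig>/..." in the path.
VOID_RE = re.compile(r"/theVoid\.([^/]+)/")


def tally_failed_contigs(log_lines):
    """Count 'Could not open file' errors per contig name."""
    counts = Counter()
    for line in log_lines:
        match = ERROR_RE.match(line.strip())
        if not match:
            continue  # not a file-open error line
        void = VOID_RE.search(match.group(1))
        counts[void.group(1) if void else "(unknown)"] += 1
    return counts


# The two log lines quoted above, used here as sample input.
sample = [
    "ERROR: Could not open file: /scratch/dro49/myluwork/annotation/test9/"
    "lu.maker.output/lu_datastore/DC/34/scaffold1731//theVoid.scaffold1731/"
    "scaffold1731.gff.seq.tmp",
    "Cannot send after transport endpoint shutdown",
]

for contig, n in tally_failed_contigs(sample).most_common():
    print(f"{contig}\t{n}")
```

To run it on a real log, pass an open file handle instead of the sample list. If the errors spread across many unrelated contigs, that is more consistent with a storage-wide timeout than with a problem in any one contig's data.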

—Carson

> On Mar 9, 2020, at 7:24 AM, Devon O'Rourke <devon.orourke at gmail.com> wrote:
> 
> Hi Carson,
> 
> I recently completed one round of Maker annotation successfully thanks to your expert advice on resetting MPI parameters. Because earlier tests (prior to this successful run) indicated that other dependency programs might also be contributing to failed Maker jobs, this first successful run used only GFF data as input for the est, altest, and protein evidence, along with a custom rm_gff file for complex repeats (I was following the strategy posted in an earlier thread in this forum: https://groups.google.com/forum/#!topic/maker-devel/patU-l_TQUM).
> 
> The good news is that using GFF files only will get the job to finish. The bad news is that if I try to input the original fasta files instead of the resulting GFFs for the evidence data, Maker gets close but fails to finish the job at the stage where (I think) the per-scaffold chunks of "evidence_*.gff", "scaffold*.*.pred.raw.section", and "scaffold*.*.final.section" are collapsed into a set of "scaffold*.gff", "scaffold*.maker.transcripts.fasta" and "scaffold*.maker.proteins.fasta" files. The behavior is not entirely consistent across all scaffolds: most scaffolds do in fact produce finished files (the "scaffold*.gff", "transcripts.fasta", etc.). However, the majority of the failed scaffolds are the longest ones (though at least a handful of longer scaffolds do finish!).
> 
> The initial errors in the run.log.child.* files for these failed scaffolds aren't always the same. Here are a few:
> 
> ```
> DIED    RANK    4:6:0:4
> DIED    RANK    5:6:0:4
> DIED    RANK    6:6:0:53
> ```
> 
> The second error is always:
> ```
> DIED    COUNT   1
> ```
> 
> You can view the .log file here: https://osf.io/4wn6h/download. I've attached the .opts file to this message.
> 
> Maybe, again, something about our MPI parameters is not optimized for these jobs. I could certainly re-run the same data on a machine without MPI at this point, because all the jobs are basically completed (no more blasting or repeat masking is needed). So I think the question is: should I just restart the run without MPI and see if it finishes? Or are there alternative Maker scripts I could test directly (even on a single scaffold subdirectory) to see whether the cases where Maker doesn't quite finish would finish otherwise?
> 
> Thank you once more for your help with troubleshooting,
> Devon
> 
> -- 
> Devon O'Rourke
> Postdoctoral researcher, Northern Arizona University
> Lab of Jeffrey T. Foster - https://fozlab.weebly.com/ <https://fozlab.weebly.com/>
> twitter: @thesciencedork
> <makerRun2_opts.ctl>
