[maker-devel] MAKER RepeatRunner error on long scaffolds only

Carson Holt carsonhh at gmail.com
Sun Oct 8 18:37:12 MDT 2017


MAKER will use whichever BLAST is indicated in maker_exe.ctl, so make sure the new installation is the one listed there. RepeatRunner is not part of RepeatMasker; it is a separate step that is essentially just a modified BLASTX against a protein database. So the standard NCBI BLAST+ installation is what gets used for that (not RMBLAST).
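
One way to confirm which BLAST executables a control file points at is a small script like the one below. It is only a sketch: it assumes maker_exe.ctl uses `key=path #comment` lines with keys such as `blastx` and `makeblastdb`, and the function name `configured_blast_paths` is made up for illustration, not part of MAKER.

```python
import sys

# Keys assumed to name the BLAST+ executables in maker_exe.ctl.
BLAST_KEYS = ("makeblastdb", "blastn", "blastx", "tblastx")


def configured_blast_paths(ctl_text, keys=BLAST_KEYS):
    """Return {key: path} for the BLAST entries found in control-file text."""
    found = {}
    for line in ctl_text.splitlines():
        # Drop the trailing '#...' comment, then surrounding whitespace.
        entry = line.split("#", 1)[0].strip()
        if "=" not in entry:
            continue
        key, value = entry.split("=", 1)
        if key.strip() in keys:
            found[key.strip()] = value.strip()
    return found


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_ctl.py maker_exe.ctl
    with open(sys.argv[1]) as handle:
        for name, path in configured_blast_paths(handle.read()).items():
            print(f"{name}\t{path or '(not set)'}")
```

Running the listed `blastx` path with `-version` then shows whether it is the updated BLAST+ installation.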

The error you get is because the BLAST report is truncated. At the top of a BLAST report there is a summary of results, and below it there are details about each result. What is happening is that there are results in the top summary that are not being found in the bottom detail section. If updating to BLAST+ 2.6 does not fix it for you, you may need to drop to legacy NCBI BLAST (i.e. the one that is not the BLAST+ rewrite), available here: ftp://ftp.ncbi.nlm.nih.gov/blast/executables/legacy/2.2.26/
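
To check a particular .repeatrunner report for the truncation described above, something along these lines may help. It is a rough sketch, not MAKER's own parser: it assumes the default BLAST+ pairwise text layout (a "Sequences producing significant alignments" table followed by ">"-prefixed detail records), and the function name `missing_hits` is hypothetical.

```python
import sys


def missing_hits(report_text):
    """Return hit IDs listed in a summary table but absent from the details."""
    missing = []
    summary, details = [], set()
    in_summary = rows_seen = False

    def flush():
        # Record summary IDs that never showed up as a '>' detail record.
        missing.extend(hit for hit in summary if hit not in details)

    for line in report_text.splitlines():
        if line.startswith("Query="):
            # A new query starts: settle the previous one and reset state.
            flush()
            summary, details = [], set()
            in_summary = rows_seen = False
        elif line.startswith("Sequences producing significant alignments"):
            in_summary, rows_seen = True, False
        elif line.startswith(">"):
            in_summary = False
            tokens = line[1:].split()
            if tokens:
                details.add(tokens[0])
        elif in_summary:
            if line.strip():
                summary.append(line.split()[0])
                rows_seen = True
            elif rows_seen:
                # The blank line after the last row closes the table.
                in_summary = False
    flush()
    return missing


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_report.py some_chunk.repeatrunner
    with open(sys.argv[1]) as handle:
        for hit in missing_hits(handle.read()):
            print("in summary but not in details:", hit)
```

Any ID it prints is a hit that the summary promises but the detail section never delivers, which is the situation that leaves an alignment without start/end coordinates.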

—Carson





> On Oct 6, 2017, at 6:23 AM, Daren C. Card <daren.card at gmail.com> wrote:
> 
> Dear Carson,
> 
> Thanks so much for the quick reply. I updated BLAST to v2.6 and reran the configure script for RepeatMasker. Looks like MAKER should natively work with the BLAST that is available in the $PATH.
> 
> Unfortunately, I’m still getting the same error at what appears to be roughly the same spot (~child 226). I’ve copied the stderr below. I checked my GFF file and I don’t see any issues with coordinates. I’m going to try running without a GFF of repeat annotations to see what that does, but in the meantime I wanted to send an update and see if there is anything else I should look into.
> 
> Thank you,
> Daren Card
> 
> 
> ################################################
> doing repeat masking
> re reading repeat masker report.
> /home/castoelab/Desktop/daren/cvv_annotation/chr1-8/CroVir_rnd1_chr1.maker.output/CroVir_rnd1_chr1_datastore/51/66/scaffold-1//theVoid.scaffold-1/68/scaffold-1.227.simple.rb.out
> doing blastx repeats
> re reading blast report.
> /home/castoelab/Desktop/daren/cvv_annotation/chr1-8/CroVir_rnd1_chr1.maker.output/CroVir_rnd1_chr1_datastore/51/66/scaffold-1//theVoid.scaffold-1/68/scaffold-1.227.te_proteins%2Efasta.repeatrunner
> deleted:2 hits
> doing blastx repeats
> doing blastx repeats
> doing blastx repeats
> doing blastx repeats
> doing blastx repeats
> doing blastx repeats
> doing blastx repeats
> doing blastx repeats
> doing blastx repeats
> collecting blastx repeatmasking
> processing all repeats
> in cluster::shadow_cluster...
> Died at /opt/maker/bin/../lib/Bio/Search/Hit/PhatHit/Base.pm line 188.
> --> rank=NA, hostname=moonunit0
> ERROR: Failed while processing all repeats
> ERROR: Chunk failed at level:3, tier_type:1
> FAILED CONTIG:scaffold-1
> 
> ERROR: Chunk failed at level:2, tier_type:0
> FAILED CONTIG:scaffold-1
> 
> examining contents of the fasta file and run log
> ################################################
> 
> 
> 
>> On Oct 4, 2017, at 11:03 AM, Carson Holt <carsonhh at gmail.com> wrote:
>> 
>> It dies at that point because there is no start/end coordinate for one of the alignments. The issue is either with the GFF3 you gave it or with a truncated BLAST report. Recently there have been a number of weird BLAST+ issues related to truncated reports. Updating to 2.6+ seems to solve it for most people. There is also a 2.6 update for rmblast inside RepeatMasker. I submitted a bug report and example set to BLAST a few months ago.
>> 
>> —Carson
>> 
>> 
>>> On Oct 4, 2017, at 9:53 AM, Daren C. Card <daren.card at gmail.com> wrote:
>>> 
>>> Hi all,
>>> 
>>> I’ve been having an issue with MAKER (v. 2.31.8) that I haven’t been able to overcome, and no former questions have really addressed or helped fix the problem. I’ve run MAKER on a vertebrate genome and it runs fine and finishes all but the 8 longest scaffolds. These are all above 65Mb (others are below 5Mb) and most are around 20% Ns (one is 35%). The 9th longest sequence, which is just above 60Mb and 27% Ns finished fine too, which is strange because it is the only really long scaffold to run to completion. The fact that MAKER works fine on all but a few scaffolds indicates to me that the issue is those scaffolds and not MAKER/my settings, but the only difference is the length of the sequences. Is there an upper limit on scaffold size?
>>> 
>>> I originally ran the whole genome with MPI, but have since tried to rerun individual scaffolds using a single core and still get issues. The error I get is below, but I can’t find any additional info in the program-specific logs to help figure this out. MAKER actually runs a little bit longer after this error before stalling and trying again. It seems to have something to do with RepeatRunner. For repeats I’m providing a GFF of complex repeats obtained from custom RepeatMasker annotations (using the rm_gff option) and letting MAKER handle simple repeats (model_org=simple) and protein-based annotation with RepeatRunner (with the default library).
>>> 
>>> Any help would be greatly appreciated.
>>> Daren Card
>>> 
>>> University of Texas Arlington
>>> 
>>> ###################################################
>>> doing blastx repeats
>>> running  blast search.
>>> #--------- command -------------#
>>> Widget::blastx:
>>> /usr/bin/blastx -db /tmp/maker_xiChvf/te_proteins%2Efasta.mpi.10.6 -query /tmp/maker_xiChvf/1/scaffold-1.226 -num_alignments 10000 -num_descriptions 10000 -evalue 1e-06 -dbsize 300 -searchsp 500000000 -num_threads 1 -seg yes -soft_masking true -lcase_masking -show_gis -out /home/castoelab/Desktop/daren/cvv_annotation/chr1-8/CroVir_rnd1_chr1.maker.output/CroVir_rnd1_chr1_datastore/51/66/scaffold-1//theVoid.scaffold-1/67/scaffold-1.226.te_proteins%2Efasta.repeatrunner.temp_dir/te_proteins%2Efasta.mpi.10.6.repeatrunner
>>> #-------------------------------#
>>> deleted:0 hits
>>> collecting blastx repeatmasking
>>> processing all repeats
>>> in cluster::shadow_cluster...
>>> Died at /opt/maker/bin/../lib/Bio/Search/Hit/PhatHit/Base.pm line 188.
>>> --> rank=3, hostname=moonunit0
>>> ERROR: Failed while processing all repeats
>>> ERROR: Chunk failed at level:3, tier_type:1
>>> FAILED CONTIG:scaffold-1
>>> 
>>> doing blastx repeats
>>> running  blast search.
>>> #--------- command -------------#
>>> Widget::blastx:
>>> /usr/bin/blastx -db /tmp/maker_xiChvf/te_proteins%2Efasta.mpi.10.3 -query /tmp/maker_xiChvf/3/scaffold-1.225 -num_alignments 10000 -num_descriptions 10000 -evalue 1e-06 -dbsize 300 -searchsp 500000000 -num_threads 1 -seg yes -soft_masking true -lcase_masking -show_gis -out /home/castoelab/Desktop/daren/cvv_annotation/chr1-8/CroVir_rnd1_chr1.maker.output/CroVir_rnd1_chr1_datastore/51/66/scaffold-1//theVoid.scaffold-1/67/scaffold-1.225.te_proteins%2Efasta.repeatrunner.temp_dir/te_proteins%2Efasta.mpi.10.3.repeatrunner
>>> #-------------------------------#
>>> ERROR: Chunk failed at level:2, tier_type:0
>>> FAILED CONTIG:scaffold-1
>>> 
>>> deleted:0 hits
>>> deleted:0 hits
>>> ###################################################
>>> 
>>> 
>> 
> 
