[maker-devel] Patch for a bug with repeat gff

Anthony Bretaudeau anthony.bretaudeau at rennes.inra.fr
Mon Jun 10 09:48:13 MDT 2013


Hello,
I am running Maker 2.27b on an insect genome, and I use a GFF file 
containing some repeat positions (rm_gff option in maker_opts.ctl).

I encountered an error on 10 scaffolds (the genome contains ~40000 
scaffolds): "substr outside of string" (similar to this post: 
http://gmod.827538.n3.nabble.com/substr-outside-of-string-td4031889.html).

After a lot of debugging, it turns out the problem comes from the 
"phathits_on_chunk" function in lib/GFFDB.pm, near line 539: there is an 
SQL query that fetches features that overlap the border of the 
sequence chunk.
The problem is that it also fetches features in the same region that lie 
completely outside the chunk. This produces an error when Maker tries 
to mask the sequence, because it then calls substr outside the string.
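
To illustrate the distinction described above, here is a minimal sketch of 
an overlap test that rejects features lying entirely outside a chunk. The 
helper name overlaps_chunk and the coordinates are hypothetical and are not 
taken from lib/GFFDB.pm; the sketch assumes 1-based, inclusive coordinates.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of MAKER: true if the feature [$fs, $fe]
# shares at least one base with the chunk [$cs, $ce] (1-based, inclusive).
sub overlaps_chunk {
    my ($fs, $fe, $cs, $ce) = @_;
    return ($fs <= $ce && $fe >= $cs) ? 1 : 0;
}

# Made-up features as [start, end]; the chunk spans 100..200.
my @features = ([90, 120], [150, 180], [300, 340]);
my @kept = grep { overlaps_chunk($_->[0], $_->[1], 100, 200) } @features;

# The first two features touch the chunk; the third lies outside it.
printf "%d-%d\n", @$_ for @kept;
```

A feature that passes this test can still extend past a chunk border, so 
the masking step would still need to clip its coordinates.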

I fixed it by patching lib/repeat_mask_seq.pm, near line 138:
I replaced:
           substr($$seq, $b -1 , $l, "$replace"x$l);
By:
       if ($b < length($$seq)) {
           substr($$seq, $b -1 , $l, "$replace"x$l);
       }

I don't know if there is a more elegant solution, but this seems to 
solve the problem.
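
One possible alternative, sketched here under assumptions, is to clip the 
masked range to the sequence before calling substr. Two details of the 
guard above may be worth checking: it also skips a feature that starts 
exactly on the last base ($b equal to length($$seq)), and for a feature 
that runs past the end, four-argument substr inserts all $l replacement 
characters, which lengthens the string. The helper mask_clipped below is 
hypothetical and is not code from lib/repeat_mask_seq.pm; $begin and $span 
stand in for $b and $l, and coordinates are assumed to be 1-based.

```perl
use strict;
use warnings;

# Hypothetical helper, not MAKER code: mask bases $begin .. $begin+$span-1
# (1-based) of the string referenced by $seq, clipping the range to the
# string first. Returns the number of bases actually masked.
sub mask_clipped {
    my ($seq, $begin, $span, $replace) = @_;
    my $len   = length($$seq);
    my $start = $begin < 1 ? 1 : $begin;     # clip the left edge
    my $end   = $begin + $span - 1;
    $end = $len if $end > $len;              # clip the right edge
    return 0 if $start > $end;               # entirely outside: nothing to do
    my $n = $end - $start + 1;
    substr($$seq, $start - 1, $n, $replace x $n);
    return $n;
}

my $s = "ACGTACGTAC";
mask_clipped(\$s, 8, 5, "N");     # runs past the end: only 3 bases masked
mask_clipped(\$s, 50, 5, "N");    # entirely outside: ignored
print "$s\n";                     # prints ACGTACGNNN
```

Because the range is clipped on both sides, the sequence length never 
changes, whichever features the query returns.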

Cheers
Anthony



