[maker-devel] Error with maker_functional_gff
Ole Kristian Tørresen
ole.toerresen at gmail.com
Mon Jan 4 12:08:43 MST 2016
I found the mistake: I used one version of SwissProt/UniProt for BLASTing
and a different version as an option for maker_functional_gff. When I
changed to the same version for both, the error went away.
Sad to say, but things like different versions of SwissProt/UniProt do
accumulate over time...
Thank you.
Ole
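
For anyone who hits the same thing, here is a minimal sketch of a consistency check, written in Python rather than Perl and not taken from maker_functional_gff. It assumes tabular BLAST output with the subject ID in column 2, and it assumes that a version mismatch would show up as hit IDs that are missing from the FASTA file; whether that was the exact mechanism in this case was not confirmed. The helper names fasta_ids and missing_subjects are made up for illustration.

```python
def fasta_ids(fasta_lines):
    """Collect the first whitespace-delimited token of every FASTA header."""
    ids = set()
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def missing_subjects(blast_lines, known_ids):
    """Return subject IDs (column 2) that are absent from known_ids, in first-seen order."""
    missing = []
    seen = set()
    for line in blast_lines:
        cols = line.split()
        if len(cols) < 2:
            continue  # blank or malformed line
        subject = cols[1]
        if subject not in known_ids and subject not in seen:
            seen.add(subject)
            missing.append(subject)
    return missing


# Tiny in-memory example; with real data, pass open file handles instead of lists.
fasta = [
    ">sp|Q2KIK0|SGT1_BOVIN Protein SGT1 homolog OS=Bos taurus GN=SUGT1 PE=2 SV=1",
    "MAAA",  # placeholder sequence, not the real protein
]
blast = [
    "GAMO_00029233-RA sp|Q2KIK0|SGT1_BOVIN 53.18 299 100 3 5 268 17 310 5e-89 275",
    "GAMO_00029233-RA sp|B0BN85|SGT1_RAT 51.51 299 104 3 5 268 16 308 5e-86 268",
]
print(missing_subjects(blast, fasta_ids(fasta)))
```

An empty list means every BLAST subject has a header in the FASTA file; anything else is worth a look before running the annotation step.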
On 4 January 2016 at 17:04, Carson Holt <carsonhh at gmail.com> wrote:
> Perhaps the easiest way to look at this is if you send us the files. I’m
> still leaning towards a format error. But it’s the kind of thing where I
> would need the files to find the specific entry.
>
> —Carson
>
>
>
> On Dec 16, 2015, at 11:32 PM, Ole Kristian Tørresen <
> ole.toerresen at gmail.com> wrote:
>
> Here's the hits for GAMO_00029233
> >sp|Q9SUR9|SGT1A_ARATH Protein SGT1 homolog A OS=Arabidopsis thaliana GN=SGT1A PE=1 SV=1
> >sp|Q9SUT5|SGT1B_ARATH Protein SGT1 homolog B OS=Arabidopsis thaliana GN=SGT1B PE=1 SV=1
> >sp|Q2KIK0|SGT1_BOVIN Protein SGT1 homolog OS=Bos taurus GN=SUGT1 PE=2 SV=1
> >sp|Q55ED0|SGT1_DICDI Protein SGT1 homolog OS=Dictyostelium discoideum GN=sugt1 PE=2 SV=1
> >sp|Q9Y2Z0|SGT1_HUMAN Protein SGT1 homolog OS=Homo sapiens GN=SUGT1 PE=1 SV=3
> >sp|Q9CX34|SGT1_MOUSE Protein SGT1 homolog OS=Mus musculus GN=Sugt1 PE=1 SV=3
> >sp|Q0JL44|SGT1_ORYSJ Protein SGT1 homolog OS=Oryza sativa subsp. japonica GN=SGT1 PE=1 SV=1
> >sp|B0BN85|SGT1_RAT Protein SGT1 homolog OS=Rattus norvegicus GN=Sugt1 PE=2 SV=1
>
> The BOVIN entry is the first hit. I can't really see anything different
> about its header.
>
> I don't know Perl that well. Do you have some code which I can use to
> debug this? In line 58 it tries to access the blast hash with the ID as a
> key, if I understand this correctly. Either the hash is empty where the key
> tries to access, or the key is empty. If I could print each ID as it is
> found, maybe I can find a pattern. And/or print each blast entry when the
> blast hash is created.
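
The debugging idea in the paragraph above can be sketched outside the script. This is a minimal Python sketch, not the Perl code from maker_functional_gff: it assumes 12-column tabular BLAST output and assumes that the first line listed for each query is its best hit. The names load_best_hits and lookup are hypothetical.

```python
import sys


def load_best_hits(blast_lines, debug=False):
    """Map each query ID to the subject ID of its first-listed hit."""
    best = {}
    for line in blast_lines:
        cols = line.split()
        if len(cols) < 12:
            if debug and line.strip():
                print("skipping malformed line:", line.rstrip(), file=sys.stderr)
            continue
        query, subject = cols[0], cols[1]
        if query not in best:
            best[query] = subject
            if debug:
                # Print each entry as the hash is built, as suggested above.
                print(f"stored {query} -> {subject}", file=sys.stderr)
    return best


def lookup(best, query):
    """Return the stored hit, or None rather than failing on a missing key."""
    return best.get(query)


blast = [
    "GAMO_00029233-RA sp|Q2KIK0|SGT1_BOVIN 53.18 299 100 3 5 268 17 310 5e-89 275",
    "GAMO_00029233-RA sp|B0BN85|SGT1_RAT 51.51 299 104 3 5 268 16 308 5e-86 268",
]
best = load_best_hits(blast, debug=True)
for query in ("GAMO_00029233-RA", "GAMO_00029212-RA"):
    print(query, "->", lookup(best, query))
```

Running the same kind of loop over every mRNA ID in the GFF shows which keys come back empty, which is the pattern being looked for here.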
>
> Thank you.
>
> Ole
>
> On 16 December 2015 at 21:55, Carson Holt <carsonhh at gmail.com> wrote:
>
>> Find the hit for GAMO_00029233 and then pull its header line out of the
>> Uniprot fasta file. There may be an unexpected formatting difference in
>> that header.
>>
>> —Carson
>>
>>
>>
>> On Dec 16, 2015, at 1:53 PM, Ole Kristian Tørresen <
>> ole.toerresen at gmail.com> wrote:
>>
>> Daniel,
>> this is the previous gene, before maker_functional_gff:
>> LG08 maker gene 13648888 13656687 . - .
>> ID=GAMO_00029212;Name=GAMO_00029212;Alias=maker-LG08-snap-gene-46.325;
>> LG08 maker mRNA 13648888 13656687 . - .
>>
>> ID=GAMO_00029212-RA;Parent=GAMO_00029212;Name=GAMO_00029212-RA;Alias=maker-LG08-snap-gene-46.325-mRNA-1;_AED=0.45;_QI=0|0.83|0.84|1|0.5|0.61|13|1843|351;_eAED=0.45;
>> LG08 maker exon 13648888 13648944 . - .
>> ID=GAMO_00029212-RA:exon:9363;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13649295 13649577 . - .
>> ID=GAMO_00029212-RA:exon:9362;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13649816 13651468 . - .
>> ID=GAMO_00029212-RA:exon:9361;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13651736 13651789 . - .
>> ID=GAMO_00029212-RA:exon:9360;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13652270 13652365 . - .
>> ID=GAMO_00029212-RA:exon:9359;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13652643 13652730 . - .
>> ID=GAMO_00029212-RA:exon:9358;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13653175 13653212 . - .
>> ID=GAMO_00029212-RA:exon:9357;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13653587 13653641 . - .
>> ID=GAMO_00029212-RA:exon:9356;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13653764 13653817 . - .
>> ID=GAMO_00029212-RA:exon:9355;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13653910 13653974 . - .
>> ID=GAMO_00029212-RA:exon:9354;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13654085 13654164 . - .
>> ID=GAMO_00029212-RA:exon:9353;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13654474 13654828 . - .
>> ID=GAMO_00029212-RA:exon:9352;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13656667 13656687 . - .
>> ID=GAMO_00029212-RA:exon:9351;Parent=GAMO_00029212-RA;
>> LG08 maker CDS 13656667 13656687 . - 0
>> ID=GAMO_00029212-RA:cds;Parent=GAMO_00029212-RA;
>> LG08 maker CDS 13654474 13654828 . - 0
>> ID=GAMO_00029212-RA:cds;Parent=GAMO_00029212-RA;
>> LG08 maker CDS 13654085 13654164 . - 2
>> ID=GAMO_00029212-RA:cds;Parent=GAMO_00029212-RA;
>> LG08 maker CDS 13653910 13653974 . - 0
>> ID=GAMO_00029212-RA:cds;Parent=GAMO_00029212-RA;
>> LG08 maker CDS 13653764 13653817 . - 1
>> ID=GAMO_00029212-RA:cds;Parent=GAMO_00029212-RA;
>> LG08 maker CDS 13653587 13653641 . - 1
>> ID=GAMO_00029212-RA:cds;Parent=GAMO_00029212-RA;
>> LG08 maker CDS 13653175 13653212 . - 0
>> ID=GAMO_00029212-RA:cds;Parent=GAMO_00029212-RA;
>> LG08 maker CDS 13652643 13652730 . - 1
>> ID=GAMO_00029212-RA:cds;Parent=GAMO_00029212-RA;
>> LG08 maker CDS 13652270 13652365 . - 0
>> ID=GAMO_00029212-RA:cds;Parent=GAMO_00029212-RA;
>> LG08 maker CDS 13651736 13651789 . - 0
>> ID=GAMO_00029212-RA:cds;Parent=GAMO_00029212-RA;
>> LG08 maker CDS 13651319 13651468 . - 0
>> ID=GAMO_00029212-RA:cds;Parent=GAMO_00029212-RA;
>> LG08 maker three_prime_UTR 13649816 13651318 . -
>> . ID=GAMO_00029212-RA:three_prime_utr;Parent=GAMO_00029212-RA;
>> LG08 maker three_prime_UTR 13649295 13649577 . -
>> . ID=GAMO_00029212-RA:three_prime_utr;Parent=GAMO_00029212-RA;
>> LG08 maker three_prime_UTR 13648888 13648944 . -
>> . ID=GAMO_00029212-RA:three_prime_utr;Parent=GAMO_00029212-RA;
>> LG08 maker gene 13786695 13806565 . - .
>> ID=GAMO_00029233;Name=GAMO_00029233;Alias=maker-LG08-snap-gene-46.343;
>> LG08 maker mRNA 13786695 13806565 . - .
>>
>> ID=GAMO_00029233-RA;Parent=GAMO_00029233;Name=GAMO_00029233-RA;Alias=maker-LG08-snap-gene-46.343-mRNA-1;_AED=0.47;_QI=173|0.78|0.66|1|0.21|0.26|15|0|301;_eAED=0.47;
>>
>> After:
>> LG08 maker gene 13648888 13656687 . - .
>>
>> ID=GAMO_00029212;Name=GAMO_00029212;Alias=maker-LG08-snap-gene-46.325;Note=Similar
>> to Tmbim1: Protein lifeguard 3 (Mus musculus);
>> LG08 maker mRNA 13648888 13656687 . - .
>>
>> ID=GAMO_00029212-RA;Parent=GAMO_00029212;Name=GAMO_00029212-RA;Alias=maker-LG08-snap-gene-46.325-mRNA-1;_AED=0.45;_QI=0|0.83|0.84|1|0.5|0.61|13|1843|351;_eAED=0.45;Note=Similar
>> to Tmbim1: Protein lifeguard 3 (Mus musculus);
>> LG08 maker exon 13648888 13648944 . - .
>> ID=GAMO_00029212-RA:exon:9363;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13649295 13649577 . - .
>> ID=GAMO_00029212-RA:exon:9362;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13649816 13651468 . - .
>> ID=GAMO_00029212-RA:exon:9361;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13651736 13651789 . - .
>> ID=GAMO_00029212-RA:exon:9360;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13652270 13652365 . - .
>> ID=GAMO_00029212-RA:exon:9359;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13652643 13652730 . - .
>> ID=GAMO_00029212-RA:exon:9358;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13653175 13653212 . - .
>> ID=GAMO_00029212-RA:exon:9357;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13653587 13653641 . - .
>> ID=GAMO_00029212-RA:exon:9356;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13653764 13653817 . - .
>> ID=GAMO_00029212-RA:exon:9355;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13653910 13653974 . - .
>> ID=GAMO_00029212-RA:exon:9354;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13654085 13654164 . - .
>> ID=GAMO_00029212-RA:exon:9353;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13654474 13654828 . - .
>> ID=GAMO_00029212-RA:exon:9352;Parent=GAMO_00029212-RA;
>> LG08 maker exon 13656667 13656687 . - .
>> ID=GAMO_00029212-RA:exon:9351;Parent=GAMO_00029212-RA;
>> LG08 maker CDS 13656667 13656687 . - 0
>> ID=GAMO_00029212-RA:cds;Parent=GAMO_00029212-RA;
>> LG08 maker CDS 13654474 13654828 . - 0
>> ID=GAMO_00029212-RA:cds;Parent=GAMO_00029212-RA;
>> LG08 maker CDS 13654085 13654164 . - 2
>> ID=GAMO_00029212-RA:cds;Parent=GAMO_00029212-RA;
>> LG08 maker CDS 13653910 13653974 . - 0
>> ID=GAMO_00029212-RA:cds;Parent=GAMO_00029212-RA;
>> LG08 maker CDS 13653764 13653817 . - 1
>> ID=GAMO_00029212-RA:cds;Parent=GAMO_00029212-RA;
>> LG08 maker CDS 13653587 13653641 . - 1
>> ID=GAMO_00029212-RA:cds;Parent=GAMO_00029212-RA;
>> LG08 maker CDS 13653175 13653212 . - 0
>> ID=GAMO_00029212-RA:cds;Parent=GAMO_00029212-RA;
>> LG08 maker CDS 13652643 13652730 . - 1
>> ID=GAMO_00029212-RA:cds;Parent=GAMO_00029212-RA;
>> LG08 maker CDS 13652270 13652365 . - 0
>> ID=GAMO_00029212-RA:cds;Parent=GAMO_00029212-RA;
>> LG08 maker CDS 13651736 13651789 . - 0
>> ID=GAMO_00029212-RA:cds;Parent=GAMO_00029212-RA;
>> LG08 maker CDS 13651319 13651468 . - 0
>> ID=GAMO_00029212-RA:cds;Parent=GAMO_00029212-RA;
>> LG08 maker three_prime_UTR 13649816 13651318 . -
>> . ID=GAMO_00029212-RA:three_prime_utr;Parent=GAMO_00029212-RA;
>> LG08 maker three_prime_UTR 13649295 13649577 . -
>> . ID=GAMO_00029212-RA:three_prime_utr;Parent=GAMO_00029212-RA;
>> LG08 maker three_prime_UTR 13648888 13648944 . -
>> . ID=GAMO_00029212-RA:three_prime_utr;Parent=GAMO_00029212-RA;
>>
>> Carson, I saw that, but I did use Uniprot/Swiss-prot. A snippet of the
>> BLAST output used as input here:
>> GAMO_00029212-RA sp|Q8BJZ3|LFG3_MOUSE 53.93 280 112 3 81 348 33 307 2e-92 285
>> GAMO_00029212-RA sp|Q969X1|LFG3_HUMAN 54.51 288 103 5 76 347 33 308 4e-92 284
>> GAMO_00029212-RA sp|Q9BWQ8|LFG2_HUMAN 45.73 328 134 6 44 351 13 316 2e-86 270
>> GAMO_00029212-RA sp|Q5R4I4|LFG2_PONAB 45.73 328 134 6 44 351 13 316 3e-86 269
>> GAMO_00029212-RA sp|Q1LZ71|LFG2_BOVIN 45.03 322 145 5 44 351 13 316 5e-84 264
>> GAMO_00029212-RA sp|O88407|LFG2_RAT 44.65 327 139 6 44 351 13 316 8e-83 261
>> GAMO_00029212-RA sp|Q8K097|LFG2_MOUSE 45.16 310 129 5 60 351 31 317 1e-80 255
>> GAMO_00029212-RA sp|Q7Z429|LFG1_HUMAN 39.32 351 164 9 32 351 39 371 6e-69 226
>> GAMO_00029212-RA sp|Q32L53|LFG1_BOVIN 41.69 343 158 8 29 351 46 366 8e-66 218
>> GAMO_00029212-RA sp|Q9ESF4|LFG1_MOUSE 40.43 324 156 8 53 351 34 345 2e-59 201
>> GAMO_00029212-RA sp|Q6P6R0|LFG1_RAT 39.71 345 165 11 34 351 20 348 2e-59 201
>> GAMO_00029212-RA sp|Q9DA39|LFG4_MOUSE 35.59 222 120 7 142 351 27 237 3e-24 103
>> GAMO_00029212-RA sp|Q49P94|GAAP_VACCL 33.47 239 128 9 113 337 1 222 5e-22 97.1
>> GAMO_00029233-RA sp|Q2KIK0|SGT1_BOVIN 53.18 299 100 3 5 268 17 310 5e-89 275
>> GAMO_00029233-RA sp|B0BN85|SGT1_RAT 51.51 299 104 3 5 268 16 308 5e-86 268
>> GAMO_00029233-RA sp|Q9CX34|SGT1_MOUSE 51.51 299 104 3 5 268 16 308 8e-86 267
>> GAMO_00029233-RA sp|Q9Y2Z0|SGT1_HUMAN 46.83 331 100 5 5 268 16 337 1e-80 254
>> GAMO_00029233-RA sp|Q0JL44|SGT1_ORYSJ 30.75 322 160 4 10 268 16 337 5e-36 137
>> GAMO_00029233-RA sp|Q9SUT5|SGT1B_ARATH 27.99 318 171 4 9 268 11 328 3e-35 135
>> GAMO_00029233-RA sp|Q9SUR9|SGT1A_ARATH 28.28 297 159 5 24 268 26 320 7e-35 134
>> GAMO_00029233-RA sp|Q55ED0|SGT1_DICDI 37.72 167 63 3 138 268 196 357 5e-25 107
>>
>> 521 genes had a function added before maker_functional_gff choked on this
>> particular gene, GAMO_00029233.
>>
>> Thank you.
>>
>> Ole
>>
>>
>> On 16 December 2015 at 20:37, Carson Holt <carsonhh at gmail.com> wrote:
>>
>>> I’ve seen this exact same error before (
>>> https://groups.google.com/forum/#!searchin/maker-devel/$2Fmaker_functional_gff$20line$2058/maker-devel/cBuQMKTJj2M/aXGnARZ7JhsJ
>>> ).
>>>
>>> It is caused by the format of the IDs in the blast report and the input
>>> protein fasta. maker_functional_gff is not a generic script that can work on any
>>> input, it only works on blast results against Uniprot/Swiss-prot. The
>>> script is expecting a very specific header format in both the report and
>>> the protein fasta and if it doesn’t see it, then it is missing certain
>>> pieces of needed information.
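
To make the "very specific header format" concrete, here is a minimal Python sketch that pulls the fields out of a Swiss-Prot style header. The regular expression is an assumption modelled on the headers quoted in this thread, not the pattern maker_functional_gff actually uses, and the note layout simply imitates the "Note=Similar to ..." attribute shown in the output earlier in the thread. HEADER_RE and similarity_note are hypothetical names.

```python
import re

# Modelled on headers such as:
# >sp|Q9CX34|SGT1_MOUSE Protein SGT1 homolog OS=Mus musculus GN=Sugt1 PE=1 SV=3
HEADER_RE = re.compile(
    r"^>?(?P<db>sp|tr)\|(?P<accession>[^|]+)\|(?P<entry>\S+)\s+"
    r"(?P<description>.+?)\s+OS=(?P<organism>.+?)"
    r"(?:\s+OX=\d+)?"              # taxonomy ID field found in newer UniProt releases
    r"(?:\s+GN=(?P<gene>\S+))?"    # the gene name is not always present
    r"(?:\s+PE=(?P<pe>\d+))?"
    r"(?:\s+SV=(?P<sv>\d+))?\s*$"
)


def similarity_note(header):
    """Build a 'Similar to ...' note, or return None if the header does not fit."""
    match = HEADER_RE.match(header.strip())
    if match is None:
        return None
    label = match.group("gene") or match.group("entry")
    return f"Similar to {label}: {match.group('description')} ({match.group('organism')})"


header = ">sp|Q9CX34|SGT1_MOUSE Protein SGT1 homolog OS=Mus musculus GN=Sugt1 PE=1 SV=3"
print(similarity_note(header))
print(similarity_note(">GAMO_00029233-RA some custom header"))
```

A header that does not fit the pattern yields None here instead of an empty value that fails later, which makes the offending entry easier to spot.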
>>>
>>> Thanks,
>>> Carson
>>>
>>> On Dec 16, 2015, at 12:27 PM, Daniel Ence <dence at genetics.utah.edu>
>>> wrote:
>>>
>>> Hi Ole, can you send a line for a gene feature that does work?
>>>
>>>
>>> Daniel Ence
>>> Graduate Student
>>> Eccles Institute of Human Genetics
>>> University of Utah
>>> 15 North 2030 East, Room 2100
>>> Salt Lake City, UT 84112-5330
>>>
>>> On Dec 14, 2015, at 12:21 PM, Ole Kristian Tørresen <
>>> ole.toerresen at gmail.com> wrote:
>>>
>>> Hi,
>>> I'm trying to update my annotation with some functional annotations
>>> with maker_functional_gff, but get this annoying error:
>>> Can't use string ("") as a HASH ref while "strict refs" in use at
>>> /cluster/software/VERSIONS/maker-2.31.8/bin/maker_functional_gff line 58,
>>> <$IN> line 108947.
>>>
>>> Line 108947 in the input gff is this:
>>>
>>> LG08 maker gene 13786695 13806565 . - . ID=GAMO_00029233;Name=GAMO_00029233;Alias=maker-LG08-snap-gene-46.343;
>>>
>>> It seems like the regexp in line 55 in the maker_functional_gff script
>>> doesn't pick up the ID, but I can't see any difference between that line
>>> and other similar lines.
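
One way to test whether the ID can be picked up from that line is to parse the attribute column independently of the script. This is a minimal Python sketch, not the regexp from maker_functional_gff; it assumes the real file is tab-separated with nine columns, as GFF3 requires (the mail shows the columns separated by spaces). parse_attributes and feature_id are hypothetical names.

```python
def parse_attributes(column9):
    """Split a GFF3 attribute column into a dict of key/value pairs."""
    attrs = {}
    for pair in column9.strip().split(";"):
        if not pair:
            continue  # a trailing ';' leaves an empty item
        key, sep, value = pair.partition("=")
        if sep:
            attrs[key.strip()] = value.strip()
    return attrs


def feature_id(gff_line):
    """Return the ID attribute of a GFF3 feature line, or None if it cannot be found."""
    cols = gff_line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return None
    return parse_attributes(cols[8]).get("ID")


line = (
    "LG08\tmaker\tgene\t13786695\t13806565\t.\t-\t.\t"
    "ID=GAMO_00029233;Name=GAMO_00029233;Alias=maker-LG08-snap-gene-46.343;"
)
print(feature_id(line))
```

If the ID comes out cleanly, as it does for this line, the ID extraction is probably not the culprit and the lookup that follows it deserves the attention.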
>>>
>>> Any help to trace down this is really appreciated. Do you need any other
>>> information?
>>>
>>> Thank you.
>>>
>>> Sincerely,
>>>
>>> Ole Kristian Tørresen
>>>
>>>
>>> _______________________________________________
>>> maker-devel mailing list
>>> maker-devel at box290.bluehost.com
>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>>
>>>
>>>
>>>
>>
>>
>
>