[maker-devel] iprscan and ipr_update_gff

Carson Holt carsonhh at gmail.com
Tue May 6 09:47:23 MDT 2014


Ok. With the full file I can see what was causing the message.  It is
a parsing bug that occurred in a few cases, and I've now fixed it.
You can ignore the warning, though, because it has no effect on the output.

It would only be an issue if the ID= and Name= tags were different in the
GFF3 for the gene feature lines (which is never true for MAKER's
output).  The script was correctly parsing the 'mRNA' Name= and ID= tags,
but was sometimes having issues with the Name= tags on the 'gene' lines
(because they are redundant with the ID= tag, the script still finds what
it needs to add the Dbxref= tags).
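
To confirm on a given file that the gene Name= and ID= tags really do agree, a quick check along the following lines may help. This is only a minimal sketch, not part of MAKER and not the fix applied to ipr_update_gff; the function names and the example lines are hypothetical, and it assumes standard nine-column, tab-delimited GFF3.

```python
from urllib.parse import unquote


def parse_attributes(column9):
    """Split a GFF3 attributes column into a dict of tag -> value."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            tag, value = field.split("=", 1)
            attrs[tag.strip()] = unquote(value.strip())
    return attrs


def gene_name_mismatches(lines):
    """Return (line_number, ID, Name) for every 'gene' feature whose Name=
    tag is missing or differs from its ID= tag.

    Reading stops at the ##FASTA directive, so embedded sequence is skipped.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        if line.startswith("##FASTA"):
            break
        if line.startswith("#") or not line.strip():
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9 or columns[2] != "gene":
            continue
        attrs = parse_attributes(columns[8])
        if attrs.get("Name") != attrs.get("ID"):
            problems.append((number, attrs.get("ID"), attrs.get("Name")))
    return problems


# Hypothetical example lines, not taken from the file discussed in this thread.
example = [
    "##gff-version 3\n",
    "scaffold_1\tmaker\tgene\t100\t900\t.\t+\t.\tID=gene-0.1;Name=gene-0.1\n",
    "scaffold_1\tmaker\tmRNA\t100\t900\t.\t+\t.\t"
    "ID=gene-0.1-mRNA-1;Parent=gene-0.1;Name=gene-0.1-mRNA-1\n",
    "scaffold_1\tmaker\tgene\t2000\t2900\t.\t-\t.\tID=gene-0.2\n",
]
print(gene_name_mismatches(example))  # [(4, 'gene-0.2', None)]
```

To run it on a real file, pass an open file handle instead of the example list. An empty result means every gene line has matching ID= and Name= tags, which is the case described above as safe.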

--Carson


On 5/6/14, 9:26 AM, "kdelmore at zoology.ubc.ca" <kdelmore at zoology.ubc.ca>
wrote:

>I just printed the first 20000 lines of the gff to send to you because it
>was too large to send through email. I've included a dropbox link to the
>full file below. I've also included a link to the final gff with dbx refs;
>as I mentioned, it does seem to add them even with the error. If I run
>ipr_update_gff twice, I get the warnings on the first run but not on the
>second. Does that help diagnose the problem?
>
>The only other red flag I've encountered with maker was when including
>external gff3 from geneid and sgp2. These gff3s failed validation at the
>website suggested in the README file, with the warning message "cds:
>non-unique id" for all cds, but maker didn't give me a warning and they
>seem to be incorporated into the annotation fine.
>
>original gff
>https://www.dropbox.com/s/nimoh605jdk9myx/6.gff
>
>final gff
>https://www.dropbox.com/s/3m2vwscjnz1y3o9/6.final_gff.fasta
>
>Thanks again for getting back to me.
>
>> The file you sent was missing the ##FASTA entry and all sequence at the
>> bottom for example. Is that the way it is in the datastore?
>>
>> --Carson
>>
>>
>> On 5/6/14, 9:06 AM, "kdelmore at zoology.ubc.ca" <kdelmore at zoology.ubc.ca>
>> wrote:
>>
>>>Thanks for your reply. I have not truncated the gff3. I'm using files
>>>from the datastore that were written at the same time so I'm not sure
>>>how that would happen. I split my multifasta before running it through
>>>maker and have not merged the gff or protein.fasta for iprscan. That
>>>wouldn't be the problem would it?
>>>
>>>> You have entries in your interproscan output that aren't in your GFF3.
>>>> Is your GFF3 file truncated?
>>>>
>>>> --Carson
>>>>
>>>>
>>>> On 5/5/14, 10:36 PM, "kdelmore at zoology.ubc.ca"
>>>> <kdelmore at zoology.ubc.ca>
>>>> wrote:
>>>>
>>>>>Hi, I have a question about the interproscan scripts available with
>>>>>maker.
>>>>>
>>>>>I'm following the recommendations posted by Carson in Aug 2011 to
>>>>>incorporate results from iprscan. I'm getting quite a few warning
>>>>>messages with ipr_update_gff; they're all the same and suggest that
>>>>>there's no value for $name. When I look through the updated gff,
>>>>>however, the dbxrefs have been added. Is this something I should be
>>>>>worried about?
>>>>>
>>>>>I'm using iprscan version 5 and actually get some warning messages
>>>>>there as well but again, the output looks alright. In addition, some of
>>>>>my fastas don't get these warnings in iprscan and they still give me
>>>>>the error with ipr_update_gff so I don't think that's the problem. I'm
>>>>>using proteins from UniProt. My commands and errors are below. I've
>>>>>also attached the first 20000 lines from my initial gff and raw file
>>>>>from iprscan.
>>>>>
>>>>>Thanks, I really appreciate your continued support.
>>>>>Kira
>>>>>
>>>>>###
>>>>>
>>>>>commands for interproscan scripts available in maker
>>>>>iprscan2gff3 6.maker.proteins.fasta.xml.raw 6.gff  > 6.domains.gff
>>>>>gff3_merge 6.gff 6.domains.gff -o 6_w_domains.all.gff
>>>>>ipr_update_gff 6_w_domains.all.gff 6.maker.proteins.fasta.xml.raw -inplace
>>>>>
>>>>>error after last step (just an example, a ton of similar lines):
>>>>>Use of uninitialized value $name in hash element at /home/kdelmore/tools/maker/bin/ipr_update_gff line 107, <$IN> line 15242.
>>>>>Use of uninitialized value $name in hash element at /home/kdelmore/tools/maker/bin/ipr_update_gff line 107, <$IN> line 15353.
>>>>>Use of uninitialized value $name in hash element at /home/kdelmore/tools/maker/bin/ipr_update_gff line 107, <$IN> line 15674.
>>>>>Use of uninitialized value $name in hash element at /home/kdelmore/tools/maker/bin/ipr_update_gff line 107, <$IN> line 15776.
>>>>>
>>>>>
>>>>>###
>>>>>
>>>>>commands for interproscan 5
>>>>>interproscan.sh -i 6.maker.proteins.fasta -f xml -goterms -iprlookup > interpro_6.out 2>&1
>>>>>interproscan.sh -mode convert -f raw -i 6.maker.proteins.fasta.xml
>>>>>
>>>>>error after first step:
>>>>>04/05/2014 19:22:09:269 25% completed
>>>>>04/05/2014 21:27:36:305 50% completed
>>>>>04/05/2014 21:32:34:236 75% completed
>>>>>04/05/2014 21:38:01:379 90% completed
>>>>>2014-05-04 21:50:22,761 [uk.ac.ebi.interpro.scan.management.model.implementations.WriteOutputStep:248]
>>>>>WARN - At run completion, unable to delete temporary directory
>>>>>/lustre/home/kdelmore/interpro_good/interpro_6/temp/cl2n116_20140504_174837921_l959/jobPIRSF-2.84
>>>>>2014-05-04 21:50:22,908 [uk.ac.ebi.interpro.scan.management.model.implementations.WriteOutputStep:253]
>>>>>WARN - At run completion, unable to delete temporary directory
>>>>>/lustre/home/kdelmore/interpro_good/interpro_6/temp/cl2n116_20140504_174837921_l959
>>>>>04/05/2014 21:50:23:380 100% done:  InterProScan analyses completed
>>>>>
>>>>>error after second step:
>>>>>interproscan.sh -mode convert -f raw -i 6.maker.proteins.fasta.xml
>>>>>05/05/2014 21:03:40:457 Welcome to InterProScan-5.3-46.0
>>>>>05/05/2014 21:03:53:292 Running InterProScan v5 in CONVERT mode...
>>>>>2014-05-05 21:04:00,603 [uk.ac.ebi.interpro.scan.jms.converter.Converter:277]
>>>>>WARN - At run completion, unable to delete temporary directory
>>>>>/home/kdelmore/interpro_good/interpro_6/temp/jasper.westgrid.ca_20140505_210353293_gsjh
