<div dir="ltr">Awesome! Thanks Carson.<div><br></div><div>Dhivya</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, Mar 17, 2014 at 4:14 PM, Carson Holt <span dir="ltr">&lt;<a href="mailto:carsonhh@gmail.com" target="_blank">carsonhh@gmail.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Just an update on this. I’ve fixed the maker2zff script to handle the<br>
issues seen. Looking at this actually brought to light another issue.<br>
There is inconsistent escape character specification for GFF3 in column 1<br>
(the seqid), column 9 (the attributes ID and Target_ID), as well as<br>
the FASTA ID for internal sequence. We’re updating the GFF3 spec to<br>
clarify this so that the same ID is escaped the same way everywhere it<br>
appears.<br>
<br>
To be safe though, only use these characters in your contig IDs for the<br>
assembly when using any tool that reads or outputs GFF3:<br>
a-zA-Z0-9.:^*$@!+_?-|<br>
<br>
Any character not in that set has a high chance of breaking some<br>
downstream tool. For now, just assume that the strict interpretation from<br>
the GFF3 spec for column 1 must be used on all IDs everywhere (see below).<br>
<br>
>>Column 1: "seqid"<br>
>>The ID of the landmark used to establish the coordinate system for the<br>
>>current feature.<br>
>>IDs may contain any characters, but must escape any characters not in<br>
>>the set [a-zA-Z0-9.:^*$@!+_?-|].<br>
>>In particular, IDs may not contain unescaped whitespace and must not<br>
>>begin with an unescaped ">".<br>
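<br>
As a rough illustration of the rule quoted above, here is a minimal Python sketch<br>
that checks an ID against the safe set and percent-escapes anything outside it.<br>
It is not part of MAKER or any GFF3 tool; the names SAFE_CHARS, is_safe_id and<br>
escape_id are hypothetical, and the set is read literally (the "-" is taken as a<br>
plain hyphen, not as a range between "?" and "|"), which is an assumption.<br>
<pre>
```python
import string

# Literal reading of the set quoted from the GFF3 spec (an assumption):
# letters, digits, and the punctuation . : ^ * $ @ ! + _ ? - |
SAFE_CHARS = set(string.ascii_letters + string.digits + ".:^*$@!+_?-|")


def is_safe_id(seq_id):
    """Return True if seq_id is non-empty and uses only characters from the safe set."""
    return bool(seq_id) and all(ch in SAFE_CHARS for ch in seq_id)


def escape_id(seq_id):
    """Percent-encode every character outside the safe set (UTF-8 bytes as %XX)."""
    parts = []
    for ch in seq_id:
        if ch in SAFE_CHARS:
            parts.append(ch)
        else:
            parts.extend("%{:02X}".format(byte) for byte in ch.encode("utf-8"))
    return "".join(parts)


if __name__ == "__main__":
    # IDs taken from the thread: one safe, one raw, one already escaped.
    for contig in ["SCAFFOLD3_873", "scaffold3/0", "scaffold3%2F0"]:
        print(contig, is_safe_id(contig), escape_id(contig))
```
</pre>
Note that "%" itself is outside the safe set, so an already escaped ID such as<br>
scaffold3%2F0 fails the check and would be escaped a second time. That is one way<br>
the same contig can end up with different names in different columns.<br>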
<br>
<br>
Thanks,<br>
Carson<br>
<div class="HOEnZb"><div class="h5"><br>
<br>
<br>
On 3/13/14, 7:35 PM, "dhivya arasappan" &lt;<a href="mailto:darasappan@gmail.com">darasappan@gmail.com</a>&gt; wrote:<br>
<br>
>Hi Carson,<br>
><br>
>I used gff3_merge to create my gff file from maker output. I've<br>
>attached it here. But when I run maker2zff on it, I get the following<br>
>error:<br>
><br>
>Can't use an undefined value as an ARRAY reference at /opt/apps/maker/<br>
>2.30/bin/maker2zff line 177, &lt;GFF&gt; line 7294251.<br>
><br>
>It produces an incomplete output file and it looks like it may be<br>
>running into problems when it encounters scaffold3%2F0. I'm wondering<br>
>if it's having problems with my scaffold names. There seem to be some<br>
>inconsistencies because it's referred to as scaffold3%2F0 and<br>
>scaffold3/0 in the gff file. It goes through other scaffolds like<br>
>SCAFFOLD3_873, SCAFFOLD3_95 etc just fine. I did try replacing the<br>
>scaffold names in the gff file, but still get the same error. Any<br>
>ideas?<br>
><br>
>Substitution command I used, for your reference: sed 's/3\%2F/3_/g'<br>
>gfffile| sed 's/\//\_/' > mod.gfffile<br>
><br>
>Thanks<br>
>Dhivya<br>
><br>
<br>
<br>
</div></div></blockquote></div><br></div>