[maker-devel] GFF no longer valid after renaming genes
Carson Holt
carsonhh at gmail.com
Tue Mar 21 10:15:20 MDT 2017
The problem appears to be the multiple ‘|’ characters in your contig names (ChromoV|quiver|quiver). They end up in the gene IDs, and since ‘|’ has a special meaning in Perl regular expressions (it separates alternatives), the renaming step matches and replaces only part of each ID. I’ve attached two scripts that fix that.
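To illustrate the failure mode, here is a minimal sketch in Python rather than Perl (alternation with ‘|’ behaves the same way in both regex dialects). It is not the code from map_gff_ids; the function names are hypothetical, and the IDs are taken from the example gene in the quoted message. In Perl, the counterpart of re.escape is quotemeta, or \Q...\E inside the pattern.

```python
import re


def rename_unescaped(line, old_id, new_id):
    """Replace old_id with new_id, using old_id directly as a regex (buggy)."""
    # '|' splits the pattern into alternatives, so only the first
    # alternative that matches ("augustus_masked-ChromoV") is replaced.
    return re.sub(old_id, new_id, line, count=1)


def rename_escaped(line, old_id, new_id):
    """Replace old_id with new_id, treating old_id as a literal string."""
    # re.escape neutralises '|', '.', and other metacharacters.
    return re.sub(re.escape(old_id), new_id, line, count=1)


old_id = "augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9"
new_id = "A9K44_2555"
line = "ID=" + old_id

print(rename_unescaped(line, old_id, new_id))
# ID=A9K44_2555|quiver|quiver-processed-gene-0.9
print(rename_escaped(line, old_id, new_id))
# ID=A9K44_2555
```

The unescaped result has the same shape as the broken gene ID in the renamed file: the new name followed by a leftover ‘|quiver|quiver-processed-gene-0.9’ tail.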
Use them to replace their counterparts in the …/maker/bin/ and .../maker/src/bin/ directories, then rerun all renaming steps on a fresh copy of the GFF3 (not the one you already tried to rename). You may also want to change the IDs in the assembly itself before you release it or use it for analysis, by removing the ‘|quiver|quiver’ tail from every contig name. That tail can trigger hidden errors in other downstream tools for the same reason, since ‘|’ characters often have special meaning.
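For the contig names, something along these lines could strip the tail from the FASTA headers. This is only a sketch under stated assumptions: the function names are hypothetical, the sequence ID is assumed to be the first space-delimited token of each header, and everything from the first ‘|’ onward in that ID is dropped. It is not part of MAKER. Any GFF3 that already refers to the old contig names would need the same change in its first column.

```python
def clean_fasta_header(line):
    """Drop everything from the first '|' in the sequence ID of a header line."""
    if not line.startswith(">"):
        return line  # sequence lines pass through unchanged
    newline = "\n" if line.endswith("\n") else ""
    body = line[1:].rstrip("\n")
    # The ID is the first space-delimited token; the rest is the description.
    seq_id, sep, desc = body.partition(" ")
    seq_id = seq_id.split("|", 1)[0]
    return ">" + seq_id + sep + desc + newline


def clean_fasta(lines):
    """Yield cleaned FASTA lines, refusing to create duplicate IDs."""
    seen = set()
    for line in lines:
        if line.startswith(">"):
            line = clean_fasta_header(line)
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            if seq_id in seen:
                raise ValueError("duplicate contig ID after cleaning: " + seq_id)
            seen.add(seq_id)
        yield line


records = [">ChromoV|quiver|quiver\n", "ACGT\n"]
print("".join(clean_fasta(records)), end="")
# >ChromoV
# ACGT
```

The duplicate check is there because truncating IDs can make two contigs collide; it is safer to stop than to write an assembly with ambiguous names.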
Thanks,
Carson
> On Mar 20, 2017, at 7:37 PM, Glenna Kramer <glenna.kramer at utoronto.ca> wrote:
>
> Hi there,
>
> I am hoping that you can give me some assistance with finishing up my MAKER-annotated genome for submission. I have been able to rename the genes for GenBank submission using Support Protocol 2 in the paper by Campbell et al., "Genome Annotation and Curation Using MAKER and MAKER-P", Curr Protoc Bioinformatics. 2014; 48: 4.11.1–4.11.39. <https://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&retmode=ref&cmd=prlinks&id=25501943> (PMC4286374). I have also been able to use Support Protocol 3 from the same paper to assign putative gene functions. However, I am running into problems when I try to convert the GFF file to the tbl format for submission. I have tried scripts from GAG (Genome Annotation Generator) and from MAKER (gff32table). Both work wonderfully on the GFF originally output by MAKER, but neither works once I rename the genes for GenBank submission. When I feed my file into a GFF validator, the GFF is valid before renaming but no longer valid afterwards. I have been trying to troubleshoot what happens to my GFF when I rename as in Support Protocol 2, but I am stumped. Has anyone else out there had a similar issue? I would be very thankful for any insight that you can provide!
>
> Best,
> Glenna
>
> Not sure if this will be helpful, but here is an example gene from prior to renaming:
>
> ##gff-version 3
> ChromoV|quiver|quiver maker gene 62081 62650 . + . ID=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9;Name=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9
> ChromoV|quiver|quiver maker mRNA 62081 62650 . + . ID=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9-mRNA-1;Parent=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9;Name=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9-mRNA-1;_AED=0.00;_eAED=0.00;_QI=0|-1|0|1|-1|1|1|0|189
> ChromoV|quiver|quiver maker exon 62081 62650 . + . ID=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9-mRNA-1:exon:11978;Parent=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9-mRNA-1
> ChromoV|quiver|quiver maker CDS 62081 62650 . + 0 ID=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9-mRNA-1:cds;Parent=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9-mRNA-1
>
> And after renaming:
>
> ##gff-version 3
> ChromoV|quiver|quiver maker gene 62081 62650 . + . ID=A9K44_2555|quiver|quiver-processed-gene-0.9;Name=A9K55_2555|quiver|quiver-processed-gene-0.9;Alias=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9;
> ChromoV|quiver|quiver maker mRNA 62081 62650 . + . ID=A9K44_2555|A9K55_2555-RA|quiver-processed-gene-0.9-mRNA-1;Parent=A9K55_2555|A9K55_2555-RA|quiver-processed-gene-0.9;Name=A9K55_2555|A9K55_2555-RA|quiver-processed-gene-0.9-mRNA-1;Alias=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9-mRNA-1;_AED=0.00;_QI=0|-1|0|1|-1|1|1|0|189;_eAED=0.00;
> ChromoV|quiver|quiver maker exon 62081 62650 . + . ID=A9K44_2555-RA|quiver|quiver-processed-gene-0.9-mRNA-1:exon:11978;Parent=A9K55_2555-RA|quiver|quiver-processed-gene-0.9-mRNA-1;
> ChromoV|quiver|quiver maker CDS 62081 62650 . + 0 ID=A9K44_2555-RA|quiver|quiver-processed-gene-0.9-mRNA-1:cds;Parent=A9K55_2555-RA|quiver|quiver-processed-gene-0.9-mRNA-1;
>
> The commands I used were:
>
> % maker_map_ids --prefix A9K44_ --justify 4 myfilename.gff > myfilename.map
>
> % map_gff_ids myfilename.map myfilename.gff
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: map_fasta_ids
Type: application/octet-stream
Size: 1676 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20170321/3a22661c/attachment-0006.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: map_gff_ids
Type: application/octet-stream
Size: 5048 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20170321/3a22661c/attachment-0007.obj>