[maker-devel] failed to assign putative gene function
Carson Holt
carsonhh at gmail.com
Mon Feb 20 16:56:02 MST 2017
You used either TrEMBL or the UniProtKB isoform sequence set. Their FASTA headers are slightly different and will not be parsed correctly.
For example, here is the header as formatted for the same sequence in the Swiss-Prot dataset download:
>sp|Q7Z5M8|AB12B_HUMAN Protein ABHD12B OS=Homo sapiens GN=ABHD12B PE=2 SV=1
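The headers in your error messages carry an isoform suffix on the accession (Q7Z5M8-2) and lack the PE= and SV= fields, so I think you used the UniProtKB isoform sequence dataset instead.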
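
One way to check this before rebuilding the database is to scan the FASTA headers and keep only the records that look like canonical Swiss-Prot entries. The following is a minimal Python sketch under stated assumptions: the regular expressions are my own guess at the header layout based on the two examples in this thread, not the parser that maker_functional_gff actually uses, and the function names classify_header and keep_canonical are hypothetical.

```python
import re

# Assumed layout of a canonical Swiss-Prot header, inferred from the example
# in this thread. This is NOT MAKER's own parsing code.
CANONICAL = re.compile(r"^>sp\|([A-Z0-9]+)\|(\S+) .* PE=\d+ SV=\d+\s*$")

# Assumed layout of an isoform header: accession followed by "-<number>".
ISOFORM = re.compile(r"^>sp\|[A-Z0-9]+-\d+\|")


def classify_header(header):
    """Return 'canonical', 'isoform', 'trembl', or 'other' for a header line."""
    if header.startswith(">tr|"):
        return "trembl"
    if CANONICAL.match(header):
        return "canonical"
    if ISOFORM.match(header):
        return "isoform"
    return "other"


def keep_canonical(lines):
    """Yield only the FASTA records whose header looks canonical."""
    keep = False
    for line in lines:
        if line.startswith(">"):
            # Decide once per record; sequence lines inherit the decision.
            keep = classify_header(line.rstrip("\n")) == "canonical"
        if keep:
            yield line


if __name__ == "__main__":
    import sys

    # Usage: python filter_headers.py < uniprot.fasta > canonical_only.fasta
    sys.stdout.writelines(keep_canonical(sys.stdin))
```

If the filtered file is what you want, the BLAST database and the blastp report would need to be regenerated from it, so that every hit ID in the report has a parseable header in the FASTA file passed to the maker_functional_* scripts.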
—Carson
> On Feb 13, 2017, at 8:16 AM, Quanwei Zhang <qwzhang0601 at gmail.com> wrote:
>
> Hello:
>
> I am trying to add putative gene functions to the predicted gene models. First, I used UniProt/Swiss-Prot protein sequences to build the database: I ran "makeblastdb" on the canonical and isoform proteins of human, mouse and rat. I then used "blastp" to generate "maker2uni.blastp", whose content is shown below.
> maker-CasCan_contig_64815-snap-gene-0.0-mRNA-1 sp|Q6P5S2|LEG1H_HUMAN 69.97 303 91 0 1 303 1 303 7e-164 464
> snap_masked-CasCan_contig_14203-processed-gene-0.10-mRNA-1 sp|Q91ZA8|NRARP_MOUSE 99.12 114 1 0 1 114 1 114 3e-80 236
>
> After that, I tried to add the protein homology data to the MAKER GFF3 and FASTA files with maker_functional_gff and maker_functional_fasta, but got the messages below.
>
> Can't parse details from FASTA header: >sp|Q7Z5M8-2|AB12B_HUMAN Isoform 2 of Protein ABHD12B OS=Homo sapiens GN=ABHD12B
>
> Use of uninitialized value $id in hash element at /public/apps/MAKER/2.31.9/bin/maker_functional_gff line 139, <$IN> line 39.
> Use of uninitialized value $id in hash element at /public/apps/MAKER/2.31.9/bin/maker_functional_gff line 141, <$IN> line 39.
> Can't parse details from FASTA header: >sp|Q7Z5M8-4|AB12B_HUMAN Isoform 4 of Protein ABHD12B OS=Homo sapiens GN=ABHD12B
>
> Use of uninitialized value $id in hash element at /public/apps/MAKER/2.31.9/bin/maker_functional_gff line 139, <$IN> line 45.
> Use of uninitialized value $id in hash element at /public/apps/MAKER/2.31.9/bin/maker_functional_gff line 141, <$IN> line 45.
> Can't parse details from FASTA header: >sp|Q7Z5M8-5|AB12B_HUMAN Isoform 5 of Protein ABHD12B OS=Homo sapiens GN=ABHD12B
> .....
>
> I am not sure how to deal with this. I followed the command given in the protocol. Any suggestions?
>
> Thanks
>
> Best
> Quanwei
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org