[maker-devel] maker_functional_gff Error

Paul Sheridan paul at tupac.bio
Sun Apr 28 20:40:12 CDT 2019


Hi Carson,

Thanks, your suggestions got me sorted out.

Best,

Paul

On Tue, Apr 23, 2019 at 2:50 AM Carson Holt <carsonhh at gmail.com> wrote:

> This “WARNING: No mapping available for ThuMac01937-RA” means you are
> running map_data_ids on a file that has already been renamed. An unrenamed
> file has names like maker-SDFGDG-gene-0.1-mRNA-1, for example, but the
> script is finding the name ThuMac01937-RA, which is not in the first column
> of the map file, so it throws a warning.
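
One way to confirm this diagnosis before re-running anything is to compare the IDs in a data file against both columns of the map file. The following is a minimal Python sketch, not part of MAKER. It assumes the map file written by maker_map_ids holds two whitespace-separated columns (original ID, then new ID). The function name classify_ids and the sample ID pairing are hypothetical and only for illustration.

```python
def classify_ids(map_lines, data_lines):
    """Sort the first-column IDs of a data file into three groups.

    pending: ID is in column 1 of the map (not renamed yet)
    renamed: ID is in column 2 of the map (already renamed)
    unknown: ID appears in neither column
    """
    old_ids, new_ids = set(), set()
    for line in map_lines:
        fields = line.split()
        if len(fields) >= 2:  # assumed layout: old ID, new ID
            old_ids.add(fields[0])
            new_ids.add(fields[1])

    result = {"pending": set(), "renamed": set(), "unknown": set()}
    for line in data_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        seq_id = line.split()[0]
        if seq_id in old_ids:
            result["pending"].add(seq_id)
        elif seq_id in new_ids:
            result["renamed"].add(seq_id)
        else:
            result["unknown"].add(seq_id)
    return result


# Hypothetical sample data; the pairing of these two IDs is made up.
demo_map = ["maker-SDFGDG-gene-0.1-mRNA-1\tThuMac01937-RA\n"]
demo_data = [
    "ThuMac01937-RA\tP20036\t41.791\n",
    "maker-SDFGDG-gene-0.1-mRNA-1\tP81018\t35.714\n",
    "somethingelse-RA\tQ66I51\t68.939\n",
]
demo_result = classify_ids(demo_map, demo_data)

# With real files, pass open file handles instead of lists, for example:
#   with open("genome.all.map") as m, open("output.renamed.blastp") as d:
#       print({k: len(v) for k, v in classify_ids(m, d).items()})
```

If nearly every ID lands in the "renamed" group, the file has most likely been through the mapping step already, which would match the thousands of warnings reported.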
>
> The second one: Can't use string ("") as a HASH ref while "strict refs"
> in use at /root/maker/bin/maker_functional_gff line 55, <$IN> line 3.
>
>
> You likely have a truncated line in the GFF3. It’s missing an ID= tag. This
> can sometimes happen when writing to network-mounted (NFS) file systems
> because of an asynchronous IO error. NFS file systems have a performance
> enhancement where they return SUCCESS on IO operations immediately and then
> complete the IO operation later in the background. This improves speed by
> letting the program advance without blocking on the IO operation, but it
> reduces reliability: if the later operation does not actually succeed, it
> can’t go back and tell the program “never mind, it failed.” The result is
> a silent truncation of data. Not super common, but not all that rare
> either, depending on IO load (i.e. heavy MPI with lots of writes). Find the
> line that’s truncated, then rerun just that contig before building the
> merged GFF3 for everything.
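
A rough way to locate such a line is to scan the GFF3 for feature rows that do not have nine tab-separated columns or whose ninth column lacks an ID= attribute. The Python sketch below is an illustration under assumptions, not a MAKER utility: the function name find_suspect_lines is hypothetical, and the check relies on every feature in the file carrying an ID, which the GFF3 specification itself does not require.

```python
def find_suspect_lines(gff_lines):
    """Return (line_number, seqid, reason) for feature lines that look truncated."""
    suspects = []
    for lineno, raw in enumerate(gff_lines, start=1):
        line = raw.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more feature lines
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.split("\t")
        if len(cols) != 9:
            suspects.append((lineno, cols[0], f"{len(cols)} columns instead of 9"))
            continue
        attributes = cols[8].split(";")
        if not any(attr.startswith("ID=") for attr in attributes):
            suspects.append((lineno, cols[0], "no ID= attribute"))
    return suspects


# Hypothetical sample lines: one good, one without ID=, one cut short.
demo_gff = [
    "##gff-version 3\n",
    "contig1\t.\tcontig\t1\t49996\t.\t.\t.\tID=contig1;Name=contig1\n",
    "contig1\tmaker\tgene\t100\t900\t.\t+\t.\tName=gene1\n",
    "contig2\tmaker\tmRNA\t100\t9\n",
]
demo_suspects = find_suspect_lines(demo_gff)

# With a real file, pass the open handle, for example:
#   with open("genome.all.renamed.gff") as fh:
#       for lineno, seqid, reason in find_suspect_lines(fh):
#           print(lineno, seqid, reason)
```

The second field of each reported tuple is the sequence name from column 1, which indicates the contig to rerun.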
>
> —Carson
>
>
>
> On Apr 18, 2019, at 3:23 AM, Paul Sheridan <paul at tupac.bio> wrote:
>
> Dear MAKER Team,
>
> I am running MAKER 2.31.10 on a 32-core instance. I followed the Post
> Processing of Annotations steps as described in the MAKER Tutorial for GMOD
> Online Training 2014 as best I could, but I get an error when I run
> maker_functional_gff. The commands in the order of execution and relevant
> output are shown below.
>
> Where did I go wrong?
>
> # run blastp command
> blastp -query genome.all.maker.proteins.fasta -db uniprot_sprot.fasta \
>   -num_threads 32 -evalue 1e-6 -max_hsps 1 -max_target_seqs 1 -outfmt 6 \
>   -out output.blastp
>
> # run interproscan command
> interproscan.sh -appl pfam -dp -f TSV -goterms -iprlookup -pa -t p \
>   -i genome.all.maker.proteins.fasta -o output.iprscan
>
> # create naming table
> maker_map_ids --prefix ThuMac --justify 5  genome.all.gff > genome.all.map
>
> # copy files for safe keeping
> cp genome.all.gff genome.all.renamed.gff
> cp genome.all.noseq.gff genome.all.noseq.renamed.gff
> cp genome.all.maker.proteins.fasta genome.all.maker.proteins.renamed.fasta
> cp genome.all.maker.proteins.aed.0.50.fasta \
>   genome.all.maker.proteins.aed.0.50.renamed.fasta
> cp genome.all.maker.unique.proteins.aed.0.50.fasta \
>   genome.all.maker.unique.proteins.aed.0.50.renamed.fasta
> cp genome.all.maker.transcripts.fasta \
>   genome.all.maker.transcripts.renamed.fasta
> cp genome.all.maker.transcripts.aed.0.50.fasta \
>   genome.all.maker.transcripts.aed.0.50.renamed.fasta
> cp output.iprscan output.renamed.iprscan
> cp output.blastp output.renamed.blastp
>
> # replace uninformative MAKER protein/transcript names with useful ones
> map_gff_ids genome.all.map genome.all.renamed.gff
> map_gff_ids genome.all.map genome.all.noseq.renamed.gff
> map_fasta_ids genome.all.map genome.all.maker.proteins.renamed.fasta
> map_fasta_ids genome.all.map \
>   genome.all.maker.proteins.aed.0.50.renamed.fasta
> map_fasta_ids genome.all.map \
>   genome.all.maker.unique.proteins.aed.0.50.renamed.fasta
> map_fasta_ids genome.all.map genome.all.maker.transcripts.renamed.fasta
> map_fasta_ids genome.all.map \
>   genome.all.maker.transcripts.aed.0.50.renamed.fasta
> map_data_ids genome.all.map output.renamed.iprscan
> map_data_ids genome.all.map output.renamed.blastp
>
> # assign annotations
> maker_functional_gff uniprot_sprot.db output.renamed.blastp \
>   genome.all.renamed.gff > genome.all.renamed.putative_function.gff
>
> > head output.renamed.blastp
> ThuMac30929-RA P20036 41.791 134 77 1 326 458 113 246 9.51e-28 114
> ThuMac19623-RA P81018 35.714 168 87 2 1 147 1 168 8.40e-33 117
> ThuMac19629-RA Q66I51 68.939 264 79 2 1 263 1 262 1.48e-130 372
> ThuMac19628-RA Q61464 55.172 87 37 1 766 852 382 466 4.42e-25 119
> ThuMac19627-RA P07898 48.276 58 29 1 13 69 1962 2019 3.60e-13 65.9
> ThuMac19626-RA P81018 36.782 174 96 2 21 180 1 174 5.75e-36 127
> ThuMac19624-RA P81018 35.057 174 99 2 21 180 1 174 2.19e-33 120
> ThuMac19625-RA Q28343 32.520 123 43 2 35 117 2123 2245 7.57e-17 78.6
> ThuMac19636-RA Q9QX29 90.909 110 10 0 5 114 458 567 6.45e-65 216
> ThuMac19638-RA Q9QX29 57.391 115 35 3 5 114 703 808 3.06e-28 120
>
> > head output.renamed.iprscan
> ThuMac08407-RA f1e60af0e3add9ce493bd7a78114da1e 631 Pfam PF00520 Ion
> transport protein 154 413 3.8E-21 T 18-04-2019 IPR005821 Ion transport
> domain GO:0005216|GO:0006811|GO:0016020|GO:0055085
> ThuMac08407-RA f1e60af0e3add9ce493bd7a78114da1e 631 Pfam PF08412 Ion
> transport protein N-terminal 109 152 5.1E-19 T 18-04-2019 IPR013621 Ion
> transport N-terminal Reactome: R-HSA-1296061
> ThuMac08407-RA f1e60af0e3add9ce493bd7a78114da1e 631 Pfam PF00027 Cyclic
> nucleotide-binding domain 519 601 1.0E-17 T 18-04-2019 IPR000595 Cyclic
> nucleotide-binding domain
> ThuMac24094-RA f3c3ae9be61177558ac12f745bd0dd8e 414 Pfam PF13765 SPRY-associated
> domain 235 283 8.9E-23 T 18-04-2019 IPR006574 SPRY-associated
> ThuMac24094-RA f3c3ae9be61177558ac12f745bd0dd8e 414 Pfam PF00643 B-box
> zinc finger 18 56 5.2E-12 T 18-04-2019 IPR000315 B-box-type zinc finger
> GO:0008270
> ThuMac24094-RA f3c3ae9be61177558ac12f745bd0dd8e 414 Pfam PF00622 SPRY
> domain 287 391 2.2E-14 T 18-04-2019 IPR003877 SPRY domain GO:0005515
> ThuMac08369-RA 7aee1da5a47975ab8e43b68bfd1a117c 139 Pfam PF00076 RNA
> recognition motif. (a.k.a. RRM, RBD, or RNP domain) 22 87 1.6E-15 T
> 18-04-2019 IPR000504 RNA recognition motif domain GO:0003676
> ThuMac26054-RA 8f4119609312bd6442f8bb094c104231 462 Pfam PF07565 Band 3
> cytoplasmic domain 173 443 7.3E-100 T 18-04-2019 IPR013769 Band 3
> cytoplasmic domain GO:0006820|GO:0008509|GO:0016021 Reactome: R-HSA-425381
> ThuMac07958-RA d2b749fa573a5e452cadee56090c9588 804 Pfam PF03372 Endonuclease/Exonuclease/phosphatase
> family 235 535 7.0E-11 T 18-04-2019 IPR005135
> Endonuclease/exonuclease/phosphatase
> ThuMac07958-RA d2b749fa573a5e452cadee56090c9588 804 Pfam PF17751 SKICH
> domain 555 649 9.8E-23 T 18-04-2019 IPR041611 SKICH domain
>
> > map_data_ids genome.all.map output.renamed.iprscan
> WARNING: No mapping available for ThuMac01937-RA
> WARNING: No mapping available for ThuMac02226-RA
> WARNING: No mapping available for ThuMac20730-RA
> WARNING: No mapping available for ThuMac20730-RA
> WARNING: No mapping available for ThuMac14750-RA
> (Thousands of warnings like these were returned)
>
> > maker_functional_gff uniprot_sprot.db output.renamed.blastp
> genome.all.renamed.gff > genome.all.renamed.putative_function.gff
> Can't use string ("") as a HASH ref while "strict refs" in use at
> /root/maker/bin/maker_functional_gff line 55, <$IN> line 3.
>
> > head genome.all.renamed.putative_function.gff
> ##gff-version 3
> scf7180000008677_pilon_pilon . contig 1 49996 . . .
> ID=scf7180000008677_pilon_pilon;Name=scf7180000008677_pilon_pilon
>
> Thanks in Advance,
>
> Paul Sheridan
>
> --
> CSO at Tupac Bio
> Email: paul at tupac.bio <paul at tupac.bio>
> Homepage: www.paulsheridan.net
> Mobile: +81 80 7889 0859
