[maker-devel] maker_functional_gff Error

Carson Holt carsonhh at gmail.com
Mon Apr 22 12:50:27 CDT 2019


The warning “WARNING: No mapping available for ThuMac01937-RA” means you are running map_data_ids on a file that has already been renamed. A file that has not been renamed yet will have names like maker-SDFGDG-gene-0.1-mRNA-1, for example. Instead the script is finding the name ThuMac01937-RA, which is not in the first column of the map file, so it throws a warning.
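
One way to confirm this before rerunning anything is to compare the first column of the data file against both columns of the map file. The sketch below is only an illustration: it assumes the map file is tab-delimited with the original name in column 1 and the new name in column 2, and the helper names `load_map` and `classify_ids` are made up here, not part of MAKER. The single map entry pairing maker-SDFGDG-gene-0.1-mRNA-1 with ThuMac01937-RA is invented for the example.

```python
import io


def load_map(handle):
    """Read a two-column, tab-delimited map: original name -> new name."""
    mapping = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            mapping[fields[0]] = fields[1]
    return mapping


def classify_ids(data_handle, mapping):
    """Count first-column names that are original, already renamed, or unknown."""
    new_names = set(mapping.values())
    counts = {"original": 0, "renamed": 0, "unknown": 0}
    for line in data_handle:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        name = line.split("\t", 1)[0].strip()
        if name in mapping:
            counts["original"] += 1
        elif name in new_names:
            counts["renamed"] += 1
        else:
            counts["unknown"] += 1
    return counts


# In-memory stand-ins for genome.all.map and output.renamed.blastp
# (the pairing below is hypothetical, for illustration only).
map_text = "maker-SDFGDG-gene-0.1-mRNA-1\tThuMac01937-RA\n"
data_text = "ThuMac01937-RA\tP20036\t41.791\n"

counts = classify_ids(io.StringIO(data_text), load_map(io.StringIO(map_text)))
print(counts)  # {'original': 0, 'renamed': 1, 'unknown': 0}
```

If nearly every name lands in "renamed", the file has already been mapped and the warnings from a second run can be ignored. For real files, replace the `io.StringIO` objects with `open("genome.all.map")` and `open("output.renamed.blastp")`.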

The second one: Can't use string ("") as a HASH ref while "strict refs" in use at /root/maker/bin/maker_functional_gff line 55, <$IN> line 3.


You likely have a truncated line in the GFF3. It’s missing an ID= tag. This can sometimes happen when writing to network-mounted (NFS) file systems because of an asynchronous IO error. NFS file systems have a performance enhancement where they return SUCCESS on IO operations right away and then complete the IO operation later in the background. This improves speed by letting the program advance without blocking on the IO operation, but it reduces reliability: if the later operation is not really successful, the file system can’t go back and tell the program “never mind, it failed.” The result is a silent truncation of data. It is not super common, but not all that rare either, depending on IO load (i.e. heavy MPI with lots of writes). Find the line that’s truncated, then rerun just that contig before building the merged GFF3 for everything.
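
A rough way to locate such a line is to scan the GFF3 for feature lines that do not have nine tab-separated columns or that lack an ID= attribute in the ninth column. The sketch below assumes every MAKER feature line should carry an ID= tag (the GFF3 format itself does not require one) and skips blank lines and comments; `find_suspect_lines` is a hypothetical helper, not a MAKER tool. In Perl messages, the trailing `<$IN> line 3` reports the last line read from that filehandle, so the top of the input is a sensible place to look first, although the thread does not say which file `$IN` refers to. The truncated third line of the sample is invented for illustration.

```python
import io


def find_suspect_lines(handle):
    """Yield (line_number, reason, text) for feature lines that look truncated."""
    for number, line in enumerate(handle, start=1):
        text = line.rstrip("\n")
        if text.startswith("##FASTA"):
            break  # sequence section follows; no more feature lines
        if not text or text.startswith("#"):
            continue  # skip blank lines, directives, and comments
        fields = text.split("\t")
        if len(fields) != 9:
            yield number, f"{len(fields)} columns instead of 9", text
        elif not any(attr.startswith("ID=") for attr in fields[8].split(";")):
            yield number, "no ID= attribute", text


# In-memory stand-in for genome.all.renamed.gff; the last line is a
# made-up example of a truncated feature line.
sample = (
    "##gff-version 3\n"
    "scf7180000008677_pilon_pilon\t.\tcontig\t1\t49996\t.\t.\t.\t"
    "ID=scf7180000008677_pilon_pilon;Name=scf7180000008677_pilon_pilon\n"
    "scf7180000008677_pilon_pilon\tmaker\tgene\t100\n"
)

suspects = list(find_suspect_lines(io.StringIO(sample)))
for number, reason, text in suspects:
    print(f"line {number}: {reason}")  # line 3: 4 columns instead of 9
```

For a real file, pass `open("genome.all.renamed.gff")` instead of the `io.StringIO` object. The first column of each reported line names the contig to rerun.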

—Carson
 


> On Apr 18, 2019, at 3:23 AM, Paul Sheridan <paul at tupac.bio> wrote:
> 
> Dear MAKER Team,
> 
> I am running MAKER 2.31.10 on a 32-core instance. I followed the Post Processing of Annotations steps as described in the MAKER Tutorial for GMOD Online Training 2014 as best I could, but I get an error when I run maker_functional_gff. The commands in the order of execution and relevant output are shown below.
> 
> Where did I go wrong?
> 
> # run blastp command
> blastp -query genome.all.maker.proteins.fasta -db uniprot_sprot.fasta -num_threads 32 -evalue 1e-6 -max_hsps 1 -max_target_seqs 1 -outfmt 6 -out output.blastp
> 
> # run interproscan command
> interproscan.sh -appl pfam -dp -f TSV -goterms -iprlookup -pa -t p -i genome.all.maker.proteins.fasta -o output.iprscan
> 
> # create naming table
> maker_map_ids --prefix ThuMac --justify 5  genome.all.gff > genome.all.map
> 
> # copy files for safe keeping
> cp genome.all.gff genome.all.renamed.gff
> cp genome.all.noseq.gff genome.all.noseq.renamed.gff
> cp genome.all.maker.proteins.fasta genome.all.maker.proteins.renamed.fasta
> cp genome.all.maker.proteins.aed.0.50.fasta genome.all.maker.proteins.aed.0.50.renamed.fasta
> cp genome.all.maker.unique.proteins.aed.0.50.fasta genome.all.maker.unique.proteins.aed.0.50.renamed.fasta
> cp genome.all.maker.transcripts.fasta genome.all.maker.transcripts.renamed.fasta
> cp genome.all.maker.transcripts.aed.0.50.fasta genome.all.maker.transcripts.aed.0.50.renamed.fasta
> cp output.iprscan output.renamed.iprscan 
> cp output.blastp output.renamed.blastp
> 
> # replace uninformative MAKER protein/transcript names with useful ones
> map_gff_ids genome.all.map genome.all.renamed.gff
> map_gff_ids genome.all.map genome.all.noseq.renamed.gff
> map_fasta_ids genome.all.map genome.all.maker.proteins.renamed.fasta
> map_fasta_ids genome.all.map genome.all.maker.proteins.aed.0.50.renamed.fasta
> map_fasta_ids genome.all.map genome.all.maker.unique.proteins.aed.0.50.renamed.fasta
> map_fasta_ids genome.all.map genome.all.maker.transcripts.renamed.fasta
> map_fasta_ids genome.all.map genome.all.maker.transcripts.aed.0.50.renamed.fasta
> map_data_ids genome.all.map output.renamed.iprscan
> map_data_ids genome.all.map output.renamed.blastp
> 
> # assign annotations
> maker_functional_gff uniprot_sprot.db output.renamed.blastp genome.all.renamed.gff > genome.all.renamed.putative_function.gff
> 
> > head output.renamed.blastp
> ThuMac30929-RA	P20036	41.791	134	77	1	326	458	113	246	9.51e-28	114
> ThuMac19623-RA	P81018	35.714	168	87	2	1	147	1	168	8.40e-33	117
> ThuMac19629-RA	Q66I51	68.939	264	79	2	1	263	1	262	1.48e-130	372
> ThuMac19628-RA	Q61464	55.172	87	37	1	766	852	382	466	4.42e-25	119
> ThuMac19627-RA	P07898	48.276	58	29	1	13	69	1962	2019	3.60e-13	65.9
> ThuMac19626-RA	P81018	36.782	174	96	2	21	180	1	174	5.75e-36	127
> ThuMac19624-RA	P81018	35.057	174	99	2	21	180	1	174	2.19e-33	120
> ThuMac19625-RA	Q28343	32.520	123	43	2	35	117	2123	2245	7.57e-17	78.6
> ThuMac19636-RA	Q9QX29	90.909	110	10	0	5	114	458	567	6.45e-65	216
> ThuMac19638-RA	Q9QX29	57.391	115	35	3	5	114	703	808	3.06e-28	120
> 
> > head output.renamed.iprscan 
> ThuMac08407-RA	f1e60af0e3add9ce493bd7a78114da1e	631	Pfam	PF00520	Ion transport protein	154	413	3.8E-21	T	18-04-2019	IPR005821	Ion transport domain	GO:0005216|GO:0006811|GO:0016020|GO:0055085
> ThuMac08407-RA	f1e60af0e3add9ce493bd7a78114da1e	631	Pfam	PF08412	Ion transport protein N-terminal	109	152	5.1E-19	T	18-04-2019	IPR013621	Ion transport N-terminal		Reactome: R-HSA-1296061
> ThuMac08407-RA	f1e60af0e3add9ce493bd7a78114da1e	631	Pfam	PF00027	Cyclic nucleotide-binding domain	519	601	1.0E-17	T	18-04-2019	IPR000595	Cyclic nucleotide-binding domain
> ThuMac24094-RA	f3c3ae9be61177558ac12f745bd0dd8e	414	Pfam	PF13765	SPRY-associated domain	235	283	8.9E-23	T	18-04-2019	IPR006574	SPRY-associated
> ThuMac24094-RA	f3c3ae9be61177558ac12f745bd0dd8e	414	Pfam	PF00643	B-box zinc finger	18	56	5.2E-12	T	18-04-2019	IPR000315	B-box-type zinc finger	GO:0008270
> ThuMac24094-RA	f3c3ae9be61177558ac12f745bd0dd8e	414	Pfam	PF00622	SPRY domain	287	391	2.2E-14	T	18-04-2019	IPR003877	SPRY domain	GO:0005515
> ThuMac08369-RA	7aee1da5a47975ab8e43b68bfd1a117c	139	Pfam	PF00076	RNA recognition motif. (a.k.a. RRM, RBD, or RNP domain)	22	87	1.6E-15	T	18-04-2019	IPR000504	RNA recognition motif domain	GO:0003676
> ThuMac26054-RA	8f4119609312bd6442f8bb094c104231	462	Pfam	PF07565	Band 3 cytoplasmic domain	173	443	7.3E-100	T	18-04-2019	IPR013769	Band 3 cytoplasmic domain	GO:0006820|GO:0008509|GO:0016021	Reactome: R-HSA-425381
> ThuMac07958-RA	d2b749fa573a5e452cadee56090c9588	804	Pfam	PF03372	Endonuclease/Exonuclease/phosphatase family	235	535	7.0E-11	T	18-04-2019	IPR005135	Endonuclease/exonuclease/phosphatase
> ThuMac07958-RA	d2b749fa573a5e452cadee56090c9588	804	Pfam	PF17751	SKICH domain	555	649	9.8E-23	T	18-04-2019	IPR041611	SKICH domain
> 
> > map_data_ids genome.all.map output.renamed.iprscan
> WARNING: No mapping available for ThuMac01937-RA
> WARNING: No mapping available for ThuMac02226-RA
> WARNING: No mapping available for ThuMac20730-RA
> WARNING: No mapping available for ThuMac20730-RA
> WARNING: No mapping available for ThuMac14750-RA
> (Thousands of warnings like these were returned)
> 
> > maker_functional_gff uniprot_sprot.db output.renamed.blastp genome.all.renamed.gff > genome.all.renamed.putative_function.gff
> Can't use string ("") as a HASH ref while "strict refs" in use at /root/maker/bin/maker_functional_gff line 55, <$IN> line 3.
> 
> > head genome.all.renamed.putative_function.gff
> ##gff-version 3
> scf7180000008677_pilon_pilon	.	contig	1	49996	.	.	.	ID=scf7180000008677_pilon_pilon;Name=scf7180000008677_pilon_pilon
> 
> Thanks in Advance,
> 
> Paul Sheridan
> 
> -- 
> CSO at Tupac Bio
> Email: paul at tupac.bio
> Homepage: www.paulsheridan.net <http://www.paulsheridan.net/>
> Mobile: +81 80 7889 0859
