From psbpedrobarbosa at gmail.com Mon Nov 2 08:57:08 2015
From: psbpedrobarbosa at gmail.com (Pedro Barbosa)
Date: Mon, 2 Nov 2015 14:57:08 +0000
Subject: [maker-devel] Repeat runner issue
Message-ID:

Hello,

I'm unable to run MAKER properly: the datastore_index.log file shows a FAILED run for all the scaffolds.

Inspecting the log files in the output directory, I see that it always dies when repeatrunner is invoked.

STARTED scaff-1k.maker.output/scaff-1k_datastore/3A/5C/scaffold_13//theVoid.scaffold_13/0/scaffold_13.0.consensi%2Efa%2Eclassified%2Empi%2E10%2E9.specific.out
FINISHED scaff-1k.maker.output/scaff-1k_datastore/3A/5C/scaffold_13//theVoid.scaffold_13/0/scaffold_13.0.consensi%2Efa%2Eclassified%2Empi%2E10%2E9.specific.out
STARTED scaff-1k.maker.output/scaff-1k_datastore/3A/5C/scaffold_13//theVoid.scaffold_13/0/scaffold_13.0.te_proteins%2Efasta.repeatrunner
DIED RANK 0:2:0:0
DIED COUNT 1

Afterwards, I realized that repeatrunner is no longer included internally in MAKER, so I installed it and added it to the system path. The error remained.
I tried removing the 'repeat_protein' parameter from the opts control file, because apparently repeatrunner is called when we provide a set of transposable elements. No success again.

I tested with both MAKER v2.31.8 and v3.00.0 beta, so the problem doesn't seem to be version related. Please find attached the maker_opts file that I used for the run.

Could you help me with this?

Best regards,
Pedro Barbosa
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_opts.ctl
Type: application/octet-stream
Size: 840 bytes
Desc: not available
URL:

From carsonhh at gmail.com Mon Nov 2 10:16:53 2015
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 2 Nov 2015 09:16:53 -0700
Subject: [maker-devel] Repeat runner issue
In-Reply-To:
References:
Message-ID: <1617178F-2F83-4F54-B6F5-D47A32766BC2@gmail.com>

You need to look at your STDERR, as the datastore log just gives you a summary of what failed but not why. The cause of the error will be somewhere in your job's STDERR. You can capture the STDERR to a file by redirecting it.

For example, in a bash shell:

maker 2> stderr.log

--Carson

From carsonhh at gmail.com Mon Nov 2 10:36:11 2015
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 2 Nov 2015 09:36:11 -0700
Subject: [maker-devel] Repeat runner issue
In-Reply-To:
References: <1617178F-2F83-4F54-B6F5-D47A32766BC2@gmail.com>
Message-ID:

RepeatRunner requires BLAST regardless. RapSearch on the beta version will run for the protein alignment (even then I've found it prone to failure, and it actually runs slower than BLAST on some longer sequences). If there is an issue with your own BLAST installation, you can let MAKER install its own BLAST (some versions of BLAST+ do have issues). Go to the maker source directory and run './Build blast'. It will install BLAST+ in .../maker/exe/blast. It will be version 2.2.28. This is because a couple of the newer updates to BLAST+ actually have bugs and spurious warnings/errors.

--Carson

> On Nov 2, 2015, at 9:28 AM, Pedro Barbosa wrote:
>
> OK, it seems to be related to BLAST then? I turned the rapsearch flag on (in the 3.0 beta version), expecting rapsearch to be run instead of BLAST.
>
> BLAST options error: File tmp-dir/maker_k2Axy1/0/blastprep/te_proteins%2Efasta.mpi.10.0 does not exist
> ERROR: /opt/tools/ncbi-blast-2.2.31+/bin/makeblastdb failed in Widget::formater
> --> rank=NA, hostname=cebal.example.com
> ERROR: Failed while doing blastx repeats
> ERROR: Chunk failed at level:1, tier_type:1
> FAILED CONTIG:scaffold_1
>
> ERROR: Chunk failed at level:2, tier_type:0
> FAILED CONTIG:scaffold_1
>
> Pedro
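The advice in the replies above (capture STDERR, then look for the real cause) can be sketched as a small shell helper. This is only an illustration: `summarize_maker_errors` is a hypothetical name, not part of MAKER, and the search pattern is an assumption based on the kinds of lines shown in the STDERR excerpt in this thread.

```shell
# Hypothetical helper (not part of MAKER): print, with line numbers, the
# lines of a captured STDERR log that mention an error or a failure.
summarize_maker_errors() {
    log="$1"
    # Case-insensitive so that "ERROR:", "BLAST options error:" and
    # "FAILED CONTIG:" lines are all reported.
    grep -n -i -E 'error|failed' "$log"
}

# Typical use, assuming `maker` is on PATH and the control files are in
# the current directory:
#   maker 2> stderr.log
#   summarize_maker_errors stderr.log
```

The first reported line is usually the most informative one; in the excerpt quoted in this thread it would be the "BLAST options error" line that precedes the makeblastdb failure.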
From psbpedrobarbosa at gmail.com Mon Nov 2 10:28:20 2015
From: psbpedrobarbosa at gmail.com (Pedro Barbosa)
Date: Mon, 2 Nov 2015 16:28:20 +0000
Subject: [maker-devel] Repeat runner issue
In-Reply-To: <1617178F-2F83-4F54-B6F5-D47A32766BC2@gmail.com>
References: <1617178F-2F83-4F54-B6F5-D47A32766BC2@gmail.com>
Message-ID:

OK, it seems to be related to BLAST then? I turned the rapsearch flag on (in the 3.0 beta version), expecting rapsearch to be run instead of BLAST.

BLAST options error: File tmp-dir/maker_k2Axy1/0/blastprep/te_proteins%2Efasta.mpi.10.0 does not exist
ERROR: /opt/tools/ncbi-blast-2.2.31+/bin/makeblastdb failed in Widget::formater
--> rank=NA, hostname=cebal.example.com
ERROR: Failed while doing blastx repeats
ERROR: Chunk failed at level:1, tier_type:1
FAILED CONTIG:scaffold_1

ERROR: Chunk failed at level:2, tier_type:0
FAILED CONTIG:scaffold_1

Pedro

From jason.r.gallant at gmail.com Mon Nov 2 13:03:55 2015
From: jason.r.gallant at gmail.com (Jason Gallant)
Date: Mon, 02 Nov 2015 19:03:55 +0000
Subject: [maker-devel] AUGUSTUS Training and "Off the Shelf" HMMs
Message-ID:

Hi Everyone,

I've been experimenting with optimizing Amazon to perform the HMM training of Augustus more speedily, based on a procedure that Kevin Childs has written for "speedy" Augustus training. The procedure essentially comes from taking a subset of the genes predicted by SNAP, rather than the whole genome, and constructing the training set,
a good idea that undoubtedly saves a lot of time. I've written some modifications to the Augustus scripts and dependencies to try to speed this process up on Amazon, and I'd be happy to share my notes with anyone who is interested. I've gotten it to the point where the whole AutoAug procedure can be accomplished in a day on a small cluster. I think that, working with the Augustus authors, more improvements could be made, but the whole experience with Augustus has led me to some more general questions...

1) One of the things I noticed while monkeying around with this reduced gene set procedure is that you are unable to do UTR training with Augustus: the AutoAug script complains that there aren't enough genes left to make an adequate training set. Has anyone else noticed this? I haven't seen much discussion of how important it is that the Augustus HMM be trained for UTRs when used in the Maker2 pipeline.

2) I've been trying to evaluate how good my AUGUSTUS HMM is based on the training set. Running the newly trained species file, I see that performance on the "exon level" is low (around 5-6%), while on the nucleotide level sensitivity is in the 89-95% range and specificity is in the 50-60% range, which seems consistent with what other users report on this list and the Augustus listserv. This is assessed on a training set of approximately 200 genes selected from the output of multiple iterative runs of the SNAP program, as documented in the MAKER tutorial. This is all based on data and genes selected from a "to be published" genome of an electric fish I'm working on.

3) Just for laughs, I tried the HMM trained for zebrafish on the same training set and found that its performance was slightly better than that of the species-specific one I've been working so hard on (a few percentage points on both nucleotide-level sensitivity and specificity).
I've reasoned that it might be best, in terms of reproducibility, to run Maker one last time with my multiple rounds of SNAP HMM together with the Augustus zebrafish species file, rather than using my own custom species training. Can anyone think of a good reason not to do this? Are there qualities or benefits, not captured by these sensitivity/specificity measures, that I would gain from using my own custom-trained species file? What are folks' experiences with AUGUSTUS in this regard?

Many thanks in advance for any advice!

Jason Gallant

--
----
Dr. Jason R. Gallant
Assistant Professor
Room 38 Natural Sciences
Department of Integrative Biology
Michigan State University
East Lansing, MI 48824
jgallant at msu.edu
office: 517-884-7756
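For readers unfamiliar with the accuracy figures quoted in the AUGUSTUS message above, the usual gene-prediction definitions are sketched below. This assumes the evaluation follows the common convention in which "specificity" means the fraction of predictions that are correct (what other fields call precision). The symbols TP, FP and FN (true positives, false positives, false negatives, counted per nucleotide or per exon) are introduced here for illustration and do not appear in the thread.

```latex
\[
\mathrm{Sn} = \frac{TP}{TP + FN},
\qquad
\mathrm{Sp} = \frac{TP}{TP + FP}
\]
```

Under these definitions, a nucleotide-level sensitivity of 89-95% with a specificity of 50-60% would mean that most annotated coding bases are recovered, while roughly 40-50% of the predicted coding bases lie outside the genes used for evaluation.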
From scott at scottcain.net Tue Nov 3 08:28:07 2015
From: scott at scottcain.net (Scott Cain)
Date: Tue, 3 Nov 2015 09:28:07 -0500
Subject: [maker-devel] Time is short: GMOD related talks at PAG
Message-ID:

Hi All,

I realize time is short for this: there is a hard deadline of November 6 for entering speakers for the Plant and Animal Genomes meeting in January 2016. If you would like to give a talk in the GMOD section of the meeting, please get me a title and brief description (an abstract would be good but is not required) by November 5. The talk can take the form of a project update, user "story" or interpretive dance. Speakers get the early registration discount no matter when they register, so if you haven't registered yet, it would be like getting $100 off.

Sorry for the short notice, and I look forward to seeing you in San Diego!

Scott

--
------------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research

From jgallant at msu.edu Tue Nov 3 13:48:10 2015
From: jgallant at msu.edu (Jason Gallant)
Date: Tue, 03 Nov 2015 19:48:10 +0000
Subject: [maker-devel] quality_filter.pl script -question for Mike Campbell
Message-ID:

Hi Mike (list copied for future reference),

I found your very nice protocols paper on using Maker from 2014. I've been following it to the letter as I'm wrapping up my annotation project.

I've located your quality_filter.pl script and am using it on my GFF files to create maker standard and maker default data sets from my maker-max GFF file. I'm noticing that perl complains a lot while this is running about "use of uninitialized value". This occurs on two separate passes as far as I can tell. When generating the "maker standard"
file, it occurs for many lines in my GFF file as:

Use of uninitialized value $array[2] in pattern match (m//) at /mnt/home/jgallant/quality_filter.pl line 50, line Y.

And then later it complains again with a similar message:

Use of uninitialized value $array[2] in string eq at /mnt/home/jgallant/quality_filter.pl line 92, line X.
Use of uninitialized value $array[2] in string eq at /mnt/home/jgallant/quality_filter.pl line 96, line X.
Use of uninitialized value $array[2] in string eq at /mnt/home/jgallant/quality_filter.pl line 100, line X

Any insights as to what causes this? I seem to get a fully formed GFF3 file out the other side, but the command line fills with these messages, which makes me nervous that something isn't right.

I'd appreciate any thoughts!

Best,
Jason Gallant

From michael.s.campbell1 at gmail.com Tue Nov 3 15:06:32 2015
From: michael.s.campbell1 at gmail.com (Michael Campbell)
Date: Tue, 3 Nov 2015 16:06:32 -0500
Subject: [maker-devel] quality_filter.pl script -question for Mike Campbell
In-Reply-To:
References:
Message-ID:

Hi Jason,

It could be a couple of things. If you have a cut-down version of your gff3 that I can use to recreate the error, I can debug it. The quality_filter.pl script is still a pretty young accessory script, so you may have something in your file that it wasn't tested against.

Thanks,
Mike

From jgallant at msu.edu Tue Nov 3 21:20:45 2015
From: jgallant at msu.edu (Jason Gallant)
Date: Wed, 04 Nov 2015 03:20:45 +0000
Subject: [maker-devel] quality_filter.pl script -question for Mike Campbell
In-Reply-To:
References:
Message-ID:

Hi Mike,

I've done a little digging on my own, and I think I have traced the issue: the warning is first thrown when the script encounters the line containing the first FASTA sequence header in the GFF file. For example:

Scaffold435 blastx match_part 102606 102851 201 - . ID=Scaffold435:hsp:71210:3.10.1.2;Parent=Scaffold435:hit:24267:3.10.1.2;Target=gi|37594442|ref|NP_003431.2| 161 242;Gap=M82
Scaffold435 blastx match_part 102528 102851 173 - . ID=Scaffold435:hsp:71211:3.10.1.2;Pare##FASTA
>Scaffold3579
GCAGTAGGCTGTGATACGTTTGCACCCGGGGACTAAGGGGAGATGTGTACAGGATGGGGA
GATGTGTACAGGATGGGGAGATGTGTACAGGATGGAGGGGTCCGTGCGAGAGCGTACCAC
GTGTCTCCCGTGCAGTGGTGCGGCGTGTACTTGATGCGATAAGCCACAGGGTCCTGCTTC
CCATCCTCCATGATCATTTTCATCCGCAGCGTGACCTCCCCCTCAGAAAAGACCCCCTTC
CTCATCCTCTCAAAGAGCACCAGAGATTCCTCAATCGGCCGATCCCTCCAAGGGGACAGG
GGAGGGCTGTGGCCCTTCAGCTCCTCCACACGCTGGTGACACACATAGGCGAGGCCCCTG
GATGCAGAACAGTGCAGACAGTGACATACCATTCACATGACACTGATCCGGTTAAGCCAC
ACGCGATACAATACAGTGTCATCATCAGGAAGAAGGGGAAACAGAGGCGTCAAAACGCCC
TATGAAGAGAGGAGTCTGCTTGCACTCACCGGCGAATAAGATCCACTGCGAGGTCGTACA
GCTTTTGGAAGTGGTCAGACGCGTGGGTCACTGCATAGGGCGTGTACCCTGTTTTACAGA

Perl complains when it encounters the line containing >Scaffold3579 and pretty much every line thereafter. Intriguingly, the preceding line appears to be truncated compared to those before it. I can trace this all the way back to the output of gff3_merge for several files. Not sure what to do here (or if you can help!).

Best,
Jason

From carsonhh at gmail.com Wed Nov 4 10:09:55 2015
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 4 Nov 2015 09:09:55 -0700
Subject: [maker-devel] quality_filter.pl script -question for Mike Campbell
In-Reply-To:
References:
Message-ID: <26CDF518-B664-4AD2-9C1F-6844AD9764E5@gmail.com>

If you have a truncated result, then you should look for truncation in one of the pre-merged files (this usually indicates a broken file lock if you started multiple instances of MAKER simultaneously). Also make sure your /tmp, or whatever your system default TMPDIR is, has not become full. gff3_merge uses that directory to store temporary files.

--Carson
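The two checks suggested above (look for truncation in the GFF3, and make sure the temporary directory is not full) can be sketched in shell. This is a minimal illustration under stated assumptions: `check_gff3_truncation` is a hypothetical helper, not part of MAKER, and it assumes tab-separated GFF3 feature lines followed by an optional `##FASTA` section.

```shell
# Hypothetical helper (not part of MAKER): flag signs of truncation in a
# merged GFF3 file.
check_gff3_truncation() {
    gff="$1"
    # A "##FASTA" directive that does not start its own line suggests the
    # preceding feature line was cut short (as in the example above).
    grep -n '.##FASTA' "$gff" || true
    # Feature lines before the sequence section should have exactly nine
    # tab-separated columns.
    awk -F '\t' '
        /^##FASTA/ || /^>/ { exit }      # stop at the sequence section
        /^#/ || /^$/       { next }      # skip directives and blank lines
        NF != 9 { printf "%d: expected 9 columns, found %d\n", NR, NF }
    ' "$gff"
}

# Typical use (file name is a placeholder), followed by a look at the
# free space in the temporary directory:
#   check_gff3_truncation merged.gff
#   df -h "${TMPDIR:-/tmp}"
```

No output from the helper means neither symptom was found; it does not prove the file is complete, so comparing record counts against the pre-merged files is still worthwhile.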
URL: From jgallant at msu.edu Wed Nov 4 11:14:57 2015 From: jgallant at msu.edu (Jason Gallant) Date: Wed, 04 Nov 2015 17:14:57 +0000 Subject: [maker-devel] quality_filter.pl script -question for Mike Campbell In-Reply-To: <26CDF518-B664-4AD2-9C1F-6844AD9764E5@gmail.com> References: <26CDF518-B664-4AD2-9C1F-6844AD9764E5@gmail.com> Message-ID: Hi Carson, Great? the full temporary directory was indeed the issue! On amazon the AMI images are so small that a couple of big files will fill this up, and of course because the script cleans up after itself, I was never the wiser. Thanks for the insight. I was missing about 1GB of data! Doh! Mike, I was able to regenerate my GFF file with this in mind, and no more complaining. It was indeed the truncated file that was the culprit. Thanks for your insights as well. Best, Jason Gallant On Wed, Nov 4, 2015 at 11:10 AM Carson Holt wrote: > If you have a truncated result, then you should look for truncation in one > of the pre-merged files (usually indicates a broken file lock if you > started multiple instances of MAKER simultaneously). Also make sure your > /tmp or whatever your system default TMPDIR is has not become full. > gff3_merge uses that directory to store temporary files. > > ?Carson > > > On Nov 3, 2015, at 8:20 PM, Jason Gallant wrote: > > Hi Mike, > > I?ve done a little digging on my own, and I think I have traced the issue? > it appears that the script is being thrown for the first time when it > encounters the line containing the first FASTA sequence header in the GFF > file. For example: > > Scaffold435 blastx match_part 102606 102851 201 - . ID=Scaffold435:hsp:71210:3.10.1.2;Parent=Scaffold435:hit:24267:3.10.1.2;Target=gi|37594442|ref|NP_003431.2| > 161 242;Gap=M82 > Scaffold435 blastx match_part 102528 102851 173 - . 
> ID=Scaffold435:hsp:71211:3.10.1.2;Pare##FASTA > >Scaffold3579 > GCAGTAGGCTGTGATACGTTTGCACCCGGGGACTAAGGGGAGATGTGTACAGGATGGGGA > GATGTGTACAGGATGGGGAGATGTGTACAGGATGGAGGGGTCCGTGCGAGAGCGTACCAC > GTGTCTCCCGTGCAGTGGTGCGGCGTGTACTTGATGCGATAAGCCACAGGGTCCTGCTTC > CCATCCTCCATGATCATTTTCATCCGCAGCGTGACCTCCCCCTCAGAAAAGACCCCCTTC > CTCATCCTCTCAAAGAGCACCAGAGATTCCTCAATCGGCCGATCCCTCCAAGGGGACAGG > GGAGGGCTGTGGCCCTTCAGCTCCTCCACACGCTGGTGACACACATAGGCGAGGCCCCTG > GATGCAGAACAGTGCAGACAGTGACATACCATTCACATGACACTGATCCGGTTAAGCCAC > ACGCGATACAATACAGTGTCATCATCAGGAAGAAGGGGAAACAGAGGCGTCAAAACGCCC > TATGAAGAGAGGAGTCTGCTTGCACTCACCGGCGAATAAGATCCACTGCGAGGTCGTACA > GCTTTTGGAAGTGGTCAGACGCGTGGGTCACTGCATAGGGCGTGTACCCTGTTTTACAGA > > Perl complains when it encounters the line containing >Scaffold3579 and > pretty much every line thereafter. Intriguingly, it would appear that the > preceding line appears to be truncated compared to those before it. I can > trace this all the way back to the output of gff3_merge for several files. > Not sure what to do here (or if you can help!). > > Best, > Jason > > > > > On Tue, Nov 3, 2015 at 4:06 PM Michael Campbell < > michael.s.campbell1 at gmail.com> wrote: > >> Hi Jason, >> >> It could be a couple of things. If you have a cut down version of your >> gff3 that I can use to recreate the error I can debug it. The >> quality_filter.pl script is still a pretty young accessory script, so >> you may have something in your file that It wasn?t tested against. >> >> Thanks, >> Mike >> >> >> >> On Nov 3, 2015, at 2:48 PM, Jason Gallant wrote: >> >> Hi Mike (list copied for future reference), >> >> I found your very nice protocols paper on using Maker from 2014. I?ve >> been following it to the letter as I?m wrapping up my annotation project. >> >> I?ve located your quality_filter.pl script and am using it on my GFF >> files to create maker standard and maker default data sets from my >> maker-max GFF file. 
I?m noticing that perl complains a lot while this is >> running about ?use of uninitialized value?. This occurs on two separate >> passes as far as I can tell. When generating the ?maker standard? file, it >> occurs for many lines in my GFF file as: >> >> Use of uninitialized value $array[2] in pattern match (m//) at >> /mnt/home/jgallant/quality_filter.pl line 50, line Y. >> >> And then later it complains again with a similar message >> >> Use of uninitialized value $array[2] in string eq at /mnt/home/jgallant/ >> quality_filter.pl line 92, line X. >> Use of uninitialized value $array[2] in string eq at /mnt/home/jgallant/ >> quality_filter.pl line 96, line X. >> Use of uninitialized value $array[2] in string eq at /mnt/home/jgallant/ >> quality_filter.pl line 100, line X >> >> Any insights as to what causes this? I seem to get a fully formed GFF3 >> file out the other side, but the command line fills with these messages and >> makes me nervous that something isn?t right. >> >> I?d appreciate any thoughts! >> >> Best, >> Jason Gallant >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >> >> _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dancsi90 at gmail.com Thu Nov 5 01:16:19 2015 From: dancsi90 at gmail.com (Anna Nyiri) Date: Thu, 5 Nov 2015 07:16:19 +0000 Subject: [maker-devel] MPI Message-ID: Hi, I tried to use MAKER with MPICH2, but I got an error message: "/molbio/bin/danna/mpich2-install/bin/hydra_pmi_proxy: error while loading shared libraries: libtorque.so.2: cannot open shared object file: No such file or directory" The attached file contains the shell sctipt, which I used. 
Is this script correct?

The script should contain MPICH module. But I can't find it on my computer. Where can I find this module?

Thanks for your help,
Anna Nyiri
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_mpi.sh
Type: application/x-sh
Size: 1295 bytes
Desc: not available
URL:

From carsonhh at gmail.com Thu Nov 5 10:57:21 2015
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 5 Nov 2015 09:57:21 -0700
Subject: [maker-devel] MPI
In-Reply-To:
References:
Message-ID:

The problem is the actual MPICH2 installation. You may be missing prerequisites or you may not have compiled with the necessary shared library flags (--enable-shared). You may also be compiling on one machine that has certain libraries installed then running on another that doesn't have access to those libraries (this can happen if running on a cluster).

Try reinstalling MPICH2 or switching to OpenMPI. If you decide to use OpenMPI, the following is from the INSTALL file that should be included with MAKER -->

If using OpenMPI, make sure to set LD_PRELOAD to the location of libmpi.so before even trying to install MAKER. It must also be set before running MAKER (or any program that uses OpenMPI's shared libraries), so it's best just to add it to your ~/.bash_profile. (i.e. export LD_PRELOAD=/location/of/openmpi/lib/libmpi.so).

1. Say yes to the 'configure for MPI' question when running 'perl Build.PL' in step 1 of the EASY INSTALL.

2. Give path to 'mpicc'. Note to make sure you do not give the path to 'mpicc' from another MPI flavor that might be installed on your system.

3. Give path to the folder containing 'mpi.h'. Note to make sure you do not give the path to a folder from another MPI flavor that might be installed on your system. Mixing MPI flavors for 'mpicc' and 'mpi.h' will cause failures. Make sure to read and confirm the auto-detected paths.

4. Finish installation according to steps 2-4 of the EASY INSTALL

Note: For OpenMPI you may also want to set OMPI_MCA_mpi_warn_on_fork=0 in your ~/.bash_profile to turn off certain nonfatal warnings.

Note: If jobs hang or freeze when using mpiexec under OpenMPI try adding the '-mca btl ^openib' flag to mpiexec command when running MAKER.

Example: mpiexec -mca btl ^openib -n 20 maker

Thanks,
Carson

> On Nov 5, 2015, at 12:16 AM, Anna Nyiri wrote:
>
> Hi,
>
> I tried to use MAKER with MPICH2, but I got an error message:
> "/molbio/bin/danna/mpich2-install/bin/hydra_pmi_proxy: error while loading shared libraries: libtorque.so.2: cannot open shared object file: No such file or directory"
>
> The attached file contains the shell sctipt, which I used. Is this script correct?
>
> The script should contain MPICH module. But I can't find it on my computer. Where can I find this module?
>
> Thanks for your help,
> Anna Nyiri
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From psh65 at cornell.edu Tue Nov 10 16:17:09 2015
From: psh65 at cornell.edu (Prashant S Hosmani)
Date: Tue, 10 Nov 2015 22:17:09 +0000
Subject: [maker-devel] Maker beta 3.0 version
Message-ID: <830525A3-D0E9-4373-A526-3C5C89F391A2@cornell.edu>

Hi All,

I am currently annotating a plant genome and was curious about the new version of Maker. I would like to know what's new in the 3.00 beta version of Maker.

Thank you in Advance for help,
Prashant

From carsonhh at gmail.com Wed Nov 11 18:30:57 2015
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 11 Nov 2015 17:30:57 -0700
Subject: [maker-devel] Maker beta 3.0 version
In-Reply-To: <830525A3-D0E9-4373-A526-3C5C89F391A2@cornell.edu>
References: <830525A3-D0E9-4373-A526-3C5C89F391A2@cornell.edu>
Message-ID: <551B16F5-A4A7-42BE-A509-9AB235CE518D@gmail.com>

Primarily EVidenceModeler (EVM) integration.

--Carson

> On Nov 10, 2015, at 3:17 PM, Prashant S Hosmani wrote:
>
> Hi All,
>
> I am currently annotating a plant genome and was curious about new version of Maker. I would like to know what's new in the 3.00 beta version of Maker.
>
> Thank you in Advance for help,
> Prashant
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From jgallant at msu.edu Fri Nov 13 14:03:30 2015
From: jgallant at msu.edu (Jason Gallant)
Date: Fri, 13 Nov 2015 20:03:30 +0000
Subject: [maker-devel] Alternative Splicing and ipr_update_gff
Message-ID:

Hi Everyone,

Another nitty gritty question, probably directed at Carson once more.

I decided to make one more go at my maker annotation, this time turning on alt_splice=1. I have been keeping keep_preds=1 on to export the "max" dataset as detailed in Campbell et al (2014), with the hopes of "rescuing" genes that have IPR domains but do not have evidence.

Everything works swimmingly when alt_splice=0, but when activated, the run behaved normally... I ran gff3_merge and fasta_merge to obtain proteins and transcripts, and found that as predicted the resulting fasta files contained more proteins than the initial run.

I ran IPR scan and now am trying to update my GFF3 file to obtain the "maker standard" dataset... what I am noticing is a sudden complaint by the ipr_update_gff script about use of an uninitialized value.

This appears to have happened to others: https://groups.google.com/d/msg/maker-devel/dM4WvyghYks/BboRZQLmEF8J And indeed, I can verify that the first protein listed in my fasta file is only listed as ?augustus_masked? match/match part in the original GFF3 file. If I understand the ipr_update_gff script correctly, this transcript will be ignored because it lacks the mRNA type. Is this expected naming behavior for the alternative splicing? I would have expected the alternative splice variants to be listed as alternative mRNAs under the same parent gene? Is there some sort of misconfiguration or am I expecting incorrectly? Hopes for any help you all can provide in diagnosing Best, Jason Gallant -------------- next part -------------- An HTML attachment was scrubbed... URL: From jgallant at msu.edu Fri Nov 13 14:55:47 2015 From: jgallant at msu.edu (Jason Gallant) Date: Fri, 13 Nov 2015 20:55:47 +0000 Subject: [maker-devel] Alternative Splicing and ipr_update_gff In-Reply-To: References: Message-ID: Hi Everyone, Doh! Very stupid error? I ran my interpro scan on the augustus masked proteins file instead of the maker masked proteins file by mistake. Apologies! Jason Gallant On Fri, Nov 13, 2015 at 3:03 PM Jason Gallant wrote: > Hi Everyone, > > Another nitty gritty question, probably directed at Carson once more. > > I decided to make one more go at my maker annotation, this time turning on > alt_splice=1. I have been keeping keep_preds=1 on to export the ?max? > dataset as detailed in Campbell et al (2014), with the hopes of ?rescuing? > genes that have IPR domains do not have evidence. > > Everything works swimmingly when alt_splice=0, but when activated, the run > behaved normally? I ran gff3_merge and fast_merge to obtain proteins and > transcripts, and found that as predicted the resulting fasta files > contained more proteins than the initial run. > > I ran IPR scan and now am trying to update my GFF3 file to obtain the > ?maker standard? dataset? 
what I am noticing is a sudden complaint by the > ipr_update_gff script about use of an uninitialized value. > > This appears to have happened to others: > https://groups.google.com/d/msg/maker-devel/dM4WvyghYks/BboRZQLmEF8J > > And indeed, I can verify that the first protein listed in my fasta file is > only listed as ?augustus_masked? match/match part in the original GFF3 > file. > > If I understand the ipr_update_gff script correctly, this transcript will > be ignored because it lacks the mRNA type. Is this expected naming > behavior for the alternative splicing? I would have expected the > alternative splice variants to be listed as alternative mRNAs under the > same parent gene? Is there some sort of misconfiguration or am I expecting > incorrectly? > > Hopes for any help you all can provide in diagnosing > > Best, > Jason Gallant > -------------- next part -------------- An HTML attachment was scrubbed... URL: From psbpedrobarbosa at gmail.com Mon Nov 2 07:57:08 2015 From: psbpedrobarbosa at gmail.com (Pedro Barbosa) Date: Mon, 2 Nov 2015 14:57:08 +0000 Subject: [maker-devel] Repeat runner issue Message-ID: Hello, I'm unable to run MAKER properly as the datastore_index.log file shows a FAILED run for all the scaffolds. Inspecting the log files from the output directory i see that it always dies when repeatrunner is invoked. *STARTED scaff-1k.maker.output/scaff-1k_datastore/3A/5C/scaffold_13//theVoid.scaffold_13/0/scaffold_13.0.consensi%2Efa%2Eclassified%2Empi%2E10%2E9.specific.out* *FINISHED scaff-1k.maker.output/scaff-1k_datastore/3A/5C/scaffold_13//theVoid.scaffold_13/0/scaffold_13.0.consensi%2Efa%2Eclassified%2Empi%2E10%2E9.specific.out* *STARTED scaff-1k.maker.output/scaff-1k_datastore/3A/5C/scaffold_13//theVoid.scaffold_13/0/scaffold_13.0.te_proteins%2Efasta.repeatrunner* *DIED RANK 0:2:0:0* *DIED COUNT 1* Afterwards, I realized that repeatrunner is no longer included internally in MAKER, so I installed and included it in the system path. 
The error remained. I tried to remove the parameter 'repeat_protein' from the opts control file, because apparently repeatrunner is called when we provide a set of transposable elements. No success again. I tested both with the MAKER v2.31.8 and v3.00.0 beta, but the problem doesn't seem to be version related. Please find attached the maker_opts file that i used to run. Could you help me with this ? Best regards, Pedro Barbosa -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: maker_opts.ctl Type: application/octet-stream Size: 840 bytes Desc: not available URL: From carsonhh at gmail.com Mon Nov 2 09:16:53 2015 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 2 Nov 2015 09:16:53 -0700 Subject: [maker-devel] Repeat runner issue In-Reply-To: References: Message-ID: <1617178F-2F83-4F54-B6F5-D47A32766BC2@gmail.com> You need to look at your STDERR as the datastorem log just gives you a summary of what failed but not why. The cause of the error will be somewhere in your job's STDERR. You can capture the STDERR to a file by redirecting it. For example, in bash shell ?> maker 2> stderr.log ?Carson > On Nov 2, 2015, at 7:57 AM, Pedro Barbosa wrote: > > Hello, > > I'm unable to run MAKER properly as the datastore_index.log file shows a FAILED run for all the scaffolds. > > Inspecting the log files from the output directory i see that it always dies when repeatrunner is invoked. 
>
> STARTED scaff-1k.maker.output/scaff-1k_datastore/3A/5C/scaffold_13//theVoid.scaffold_13/0/scaffold_13.0.consensi%2Efa%2Eclassified%2Empi%2E10%2E9.specific.out
> FINISHED scaff-1k.maker.output/scaff-1k_datastore/3A/5C/scaffold_13//theVoid.scaffold_13/0/scaffold_13.0.consensi%2Efa%2Eclassified%2Empi%2E10%2E9.specific.out
> STARTED scaff-1k.maker.output/scaff-1k_datastore/3A/5C/scaffold_13//theVoid.scaffold_13/0/scaffold_13.0.te_proteins%2Efasta.repeatrunner
> DIED RANK 0:2:0:0
> DIED COUNT 1
>
> Afterwards, I realized that repeatrunner is no longer included internally in MAKER, so I installed and included it in the system path. The error remained.
> I tried to remove the parameter 'repeat_protein' from the opts control file, because apparently repeatrunner is called when we provide a set of transposable elements. No success again.
>
> I tested both with the MAKER v2.31.8 and v3.00.0 beta, but the problem doesn't seem to be version related. Please find attached the maker_opts file that i used to run.
>
> Could you help me with this ?
>
> Best regards,
> Pedro Barbosa
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Mon Nov 2 09:36:11 2015
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 2 Nov 2015 09:36:11 -0700
Subject: [maker-devel] Repeat runner issue
In-Reply-To:
References: <1617178F-2F83-4F54-B6F5-D47A32766BC2@gmail.com>
Message-ID:

RepeatRunner requires BLAST regardless. RapSearch on the beta version will run for the protein alignment (even then I've found it prone to failure and it actually runs slower than BLAST on some longer sequences).

If there is an issue with your own BLAST installation you can let MAKER install its own BLAST (some versions of BLAST+ do have issues). Go to the maker source directory and run './Build blast'. It will install BLAST+ in .../maker/exe/blast. It will be version 2.2.28. This is because a couple of the newer updates to BLAST+ actually have bugs and spurious warnings/errors.

--Carson

> On Nov 2, 2015, at 9:28 AM, Pedro Barbosa wrote:
>
> Ok. seems to be related with BLAST then ? I turned the rapsearch flag on (in the 3.0 beta version) expecting to rapsearch being run over blast.
>
> BLAST options error: File tmp-dir/maker_k2Axy1/0/blastprep/te_proteins%2Efasta.mpi.10.0 does not exist
> ERROR: /opt/tools/ncbi-blast-2.2.31+/bin/makeblastdb failed in Widget::formater
> --> rank=NA, hostname=cebal.example.com
> ERROR: Failed while doing blastx repeats
> ERROR: Chunk failed at level:1, tier_type:1
> FAILED CONTIG:scaffold_1
>
> ERROR: Chunk failed at level:2, tier_type:0
> FAILED CONTIG:scaffold_1
>
>
> Pedro
>
> 2015-11-02 16:16 GMT+00:00 Carson Holt:
> You need to look at your STDERR as the datastorem log just gives you a summary of what failed but not why. The cause of the error will be somewhere in your job's STDERR. You can capture the STDERR to a file by redirecting it.
>
> For example, in bash shell --> maker 2> stderr.log
>
> --Carson
>
>
>
>> On Nov 2, 2015, at 7:57 AM, Pedro Barbosa wrote:
>>
>> Hello,
>>
>> I'm unable to run MAKER properly as the datastore_index.log file shows a FAILED run for all the scaffolds.
>>
>> Inspecting the log files from the output directory i see that it always dies when repeatrunner is invoked.
>> >> STARTED scaff-1k.maker.output/scaff-1k_datastore/3A/5C/scaffold_13//theVoid.scaffold_13/0/scaffold_13.0.consensi%2Efa%2Eclassified%2Empi%2E10%2E9.specific.out >> FINISHED scaff-1k.maker.output/scaff-1k_datastore/3A/5C/scaffold_13//theVoid.scaffold_13/0/scaffold_13.0.consensi%2Efa%2Eclassified%2Empi%2E10%2E9.specific.out >> STARTED scaff-1k.maker.output/scaff-1k_datastore/3A/5C/scaffold_13//theVoid.scaffold_13/0/scaffold_13.0.te_proteins%2Efasta.repeatrunner >> DIED RANK 0:2:0:0 >> DIED COUNT 1 >> >> Afterwards, I realized that repeatrunner is no longer included internally in MAKER, so I installed and included it in the system path. The error remained. >> I tried to remove the parameter 'repeat_protein' from the opts control file, because apparently repeatrunner is called when we provide a set of transposable elements. No success again. >> >> I tested both with the MAKER v2.31.8 and v3.00.0 beta, but the problem doesn't seem to be version related. Please find attached the maker_opts file that i used to run. >> >> Could you help me with this ? >> >> Best regards, >> Pedro Barbosa >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From psbpedrobarbosa at gmail.com Mon Nov 2 09:28:20 2015 From: psbpedrobarbosa at gmail.com (Pedro Barbosa) Date: Mon, 2 Nov 2015 16:28:20 +0000 Subject: [maker-devel] Repeat runner issue In-Reply-To: <1617178F-2F83-4F54-B6F5-D47A32766BC2@gmail.com> References: <1617178F-2F83-4F54-B6F5-D47A32766BC2@gmail.com> Message-ID: Ok. seems to be related with BLAST then ? I turned the rapsearch flag on (in the 3.0 beta version) expecting to rapsearch being run over blast. 
BLAST options error: File tmp-dir/maker_k2Axy1/0/blastprep/te_proteins%2Efasta.mpi.10.0 does not exist ERROR: /opt/tools/ncbi-blast-2.2.31+/bin/makeblastdb failed in Widget::formater --> rank=NA, hostname=cebal.example.com ERROR: Failed while doing blastx repeats ERROR: Chunk failed at level:1, tier_type:1 FAILED CONTIG:scaffold_1 ERROR: Chunk failed at level:2, tier_type:0 FAILED CONTIG:scaffold_1 Pedro 2015-11-02 16:16 GMT+00:00 Carson Holt : > You need to look at your STDERR as the datastorem log just gives you a > summary of what failed but not why. The cause of the error will be > somewhere in your job's STDERR. You can capture the STDERR to a file by > redirecting it. > > For example, in bash shell ?> maker 2> stderr.log > > ?Carson > > > > On Nov 2, 2015, at 7:57 AM, Pedro Barbosa > wrote: > > Hello, > > I'm unable to run MAKER properly as the datastore_index.log file shows a > FAILED run for all the scaffolds. > > Inspecting the log files from the output directory i see that it always > dies when repeatrunner is invoked. > > *STARTED > scaff-1k.maker.output/scaff-1k_datastore/3A/5C/scaffold_13//theVoid.scaffold_13/0/scaffold_13.0.consensi%2Efa%2Eclassified%2Empi%2E10%2E9.specific.out* > *FINISHED > scaff-1k.maker.output/scaff-1k_datastore/3A/5C/scaffold_13//theVoid.scaffold_13/0/scaffold_13.0.consensi%2Efa%2Eclassified%2Empi%2E10%2E9.specific.out* > *STARTED > scaff-1k.maker.output/scaff-1k_datastore/3A/5C/scaffold_13//theVoid.scaffold_13/0/scaffold_13.0.te_proteins%2Efasta.repeatrunner* > *DIED RANK 0:2:0:0* > *DIED COUNT 1* > > Afterwards, I realized that repeatrunner is no longer included internally > in MAKER, so I installed and included it in the system path. The error > remained. > I tried to remove the parameter 'repeat_protein' from the opts control > file, because apparently repeatrunner is called when we provide a set of > transposable elements. No success again. 
>
> I tested both with the MAKER v2.31.8 and v3.00.0 beta, but the problem
> doesn't seem to be version related. Please find attached the maker_opts
> file that i used to run.
>
> Could you help me with this ?
>
> Best regards,
> Pedro Barbosa
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jason.r.gallant at gmail.com Mon Nov 2 12:03:55 2015
From: jason.r.gallant at gmail.com (Jason Gallant)
Date: Mon, 02 Nov 2015 19:03:55 +0000
Subject: [maker-devel] AUGUSTUS Training and "Off the Shelf" HMMs
Message-ID:

Hi Everyone,

I've been experimenting with optimizing Amazon to perform the HMM training of augustus more speedily, based on a procedure that Kevin Childs has written for "speedy" Augustus training. The procedure essentially comes from taking a subset of the genes predicted by SNAP, rather than the whole genome and constructing the training set... a good idea that undoubtedly saves a lot of time.

I've written some modifications to the Augustus scripts and dependencies to try to speed this process up on Amazon, and I'd be happy to share my notes with anyone that is interested. I've gotten it to the point where the whole AutoAug procedure can be accomplished in a day on a small cluster. I think that working with the Augustus authors, more improvements could be made, but the whole experience with Augustus has led me to some questions more generally...

1) One of the things noted in monkeying around with this reduced gene set procedure is that you are unable to do UTR training with Augustus... the AutoAug script complains that there aren't enough genes left to make an adequate training set. Has anyone noted this, because I haven't seen much discussion of how important it is that the Augustus HMM is trained for UTRs when used in the Maker2 pipeline.

2) I've been trying to evaluate how good my AUGUSTUS HMM is based on the training set. Running the newly trained species file, I see that the performance on the "exon level" is low (around 5-6%) but sensitivity on the nucleotide level is in the 89-95% range, whereas the specificity is in the 50-60% range, which seems consistent with other users on this and the Augustus list serve. This is assessed based on a training set of approximately 200 genes selected from the output generated by multiple iterative runs using the SNAP program, documented in the MAKER tutorial. This is all based on data & genes selected from a "to be published" genome of an electric fish I'm working on.

3) Just for laughs, I tried the HMM trained for zebrafish on the same training set and found that the performance was slightly better than my species-specific one that I've been working so hard on (a few percentage points on both nucleotide level sensitivity and specificity). I've reasoned that it might be best in terms of reproducibility to run Maker one last time with my multiple rounds of SNAP hmm together with the augustus zebrafish species file, rather than using my own custom species training. Can anyone think of a good reason why not to do this? Are there qualities or benefits, not captured by these sensitivity/specificity measures, that I would gain by using my own custom species-trained file? What are folks' experiences with AUGUSTUS in this regard?

Many thanks for any advice in advance!

Jason Gallant
--
----
Dr. Jason R. Gallant
Assistant Professor
Room 38 Natural Sciences
Department of Integrative Biology
Michigan State University
East Lansing, MI 48824
jgallant at msu.edu
office: 517-884-7756
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From jgallant at msu.edu Mon Nov 2 12:05:22 2015 From: jgallant at msu.edu (Jason Gallant) Date: Mon, 02 Nov 2015 19:05:22 +0000 Subject: [maker-devel] AUGUSTUS Training and "Off the Shelf" HMMs Message-ID: Hi Everyone, I?ve been experimenting with optimizing Amazon to perform the HMM training of augustus more speedily, based on a procedure that Kevin Childs has written for ?speedy? Augustus training. The procedure essentially comes from taking a subset of the genes predicted by SNAP, rather than the whole genome and constructing the training set? a good idea that undoubtedly saves a lot of time. I?ve written some modifications to the Augustus scripts and dependencies to try to speed this process up on Amazon, and I?d be happy to share my notes with anyone that is interested. I?ve gotten it to the point where the whole AutoAug procedure can be accomplished in a day on a small cluster. I think that working with the Augustus authors, more improvements could be made, but the whole experience with Augustus has lead me to some questions more generally... 1) One of the things noted in monkeying around with this reduced gene set procedure is that you are unable to do UTR training with Augustus? the AutoAug script complains that there aren?t enough genes left to make an adequate training set. Has anyone noted this, because I haven?t seen much discussion of how important that the Augustus HMM is trained for UTRs when used in the Maker2 pipeline. 2) I?ve been trying to evaluate how good my AUGUSTUS HMM is based on the training set. Running the newly trained species file, I see that the performance on the ?exon level? is low (around 5-6%) but sensitivity on the nucleotide level is in the 89-95%, where the specificity is in the 50-60% range, which seems consistent with other users on this and the Augustus list serve. 
This is assessed based on a training set of approximately 200 genes selected from the output generated by multiple iterative runs using the SNAP program, documented in the MAKER tutorial. This is all based on data & genes selected from a ?to be published? genome of an electric fish I?m working on. 3) Just for laughs, I tried the HMM trained for zebrafish on the same training set and found that the performance was slightly better than my species-specific one that I?ve been working so hard on (a few percentage points on both nucleotide level sensitivity and specificity). I?ve reasoned that it might be best in terms of reproducibility to run Maker one last time with my multiple rounds of SNAP hmm together with the augustus zebrafish species file, rather than using my own custom species training. Can anyone think of a good reason why not to do this? Are there qualities/benefits not expressed by these sensitivity/specificity measures not captured that I would benefit using my own custom species trained file for? What are folks? experiences with AUGUSTUS in this regard? Many thanks for any advise in advance! Jason Gallant -------------- next part -------------- An HTML attachment was scrubbed... URL: From scott at scottcain.net Tue Nov 3 07:28:07 2015 From: scott at scottcain.net (Scott Cain) Date: Tue, 3 Nov 2015 09:28:07 -0500 Subject: [maker-devel] Time is short: GMOD related talks at PAG Message-ID: Hi All, I realize time is short for this: there is a hard deadline of November 6 for entering speakers for the Plant and Animal Genomes meeting in January 2016. If you would like to give a talk in the GMOD section of the meeting, please get me a title and brief description (an abstract would be good but not required) by November 5. The talk can take the form of a project update, user "story" or interpretive dance. Speakers get early registration discount no matter when they register, so if you haven't registered yet, it would be like getting $100 off. 
Sorry for the short notice, and I look forward to seeing in San Diego! Scott -- ------------------------------------------------------------------------ Scott Cain, Ph. D. scott at scottcain dot net GMOD Coordinator (http://gmod.org/) 216-392-3087 Ontario Institute for Cancer Research -------------- next part -------------- An HTML attachment was scrubbed... URL: From jgallant at msu.edu Tue Nov 3 12:48:10 2015 From: jgallant at msu.edu (Jason Gallant) Date: Tue, 03 Nov 2015 19:48:10 +0000 Subject: [maker-devel] quality_filter.pl script -question for Mike Campbell Message-ID: Hi Mike (list copied for future reference), I found your very nice protocols paper on using Maker from 2014. I?ve been following it to the letter as I?m wrapping up my annotation project. I?ve located your quality_filter.pl script and am using it on my GFF files to create maker standard and maker default data sets from my maker-max GFF file. I?m noticing that perl complains a lot while this is running about ?use of uninitialized value?. This occurs on two separate passes as far as I can tell. When generating the ?maker standard? file, it occurs for many lines in my GFF file as: Use of uninitialized value $array[2] in pattern match (m//) at /mnt/home/jgallant/quality_filter.pl line 50, line Y. And then later it complains again with a similar message Use of uninitialized value $array[2] in string eq at /mnt/home/jgallant/ quality_filter.pl line 92, line X. Use of uninitialized value $array[2] in string eq at /mnt/home/jgallant/ quality_filter.pl line 96, line X. Use of uninitialized value $array[2] in string eq at /mnt/home/jgallant/ quality_filter.pl line 100, line X Any insights as to what causes this? I seem to get a fully formed GFF3 file out the other side, but the command line fills with these messages and makes me nervous that something isn?t right. I?d appreciate any thoughts! Best, Jason Gallant -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From michael.s.campbell1 at gmail.com Tue Nov 3 14:06:32 2015 From: michael.s.campbell1 at gmail.com (Michael Campbell) Date: Tue, 3 Nov 2015 16:06:32 -0500 Subject: [maker-devel] quality_filter.pl script -question for Mike Campbell In-Reply-To: References: Message-ID: Hi Jason, It could be a couple of things. If you have a cut down version of your gff3 that I can use to recreate the error I can debug it. The quality_filter.pl script is still a pretty young accessory script, so you may have something in your file that It wasn?t tested against. Thanks, Mike > On Nov 3, 2015, at 2:48 PM, Jason Gallant wrote: > > Hi Mike (list copied for future reference), > > I found your very nice protocols paper on using Maker from 2014. I?ve been following it to the letter as I?m wrapping up my annotation project. > > I?ve located your quality_filter.pl script and am using it on my GFF files to create maker standard and maker default data sets from my maker-max GFF file. I?m noticing that perl complains a lot while this is running about ?use of uninitialized value?. This occurs on two separate passes as far as I can tell. When generating the ?maker standard? file, it occurs for many lines in my GFF file as: > > Use of uninitialized value $array[2] in pattern match (m//) at /mnt/home/jgallant/quality_filter.pl line 50, line Y. > > And then later it complains again with a similar message > > Use of uninitialized value $array[2] in string eq at /mnt/home/jgallant/quality_filter.pl line 92, line X. > Use of uninitialized value $array[2] in string eq at /mnt/home/jgallant/quality_filter.pl line 96, line X. > Use of uninitialized value $array[2] in string eq at /mnt/home/jgallant/quality_filter.pl line 100, line X > > Any insights as to what causes this? I seem to get a fully formed GFF3 file out the other side, but the command line fills with these messages and makes me nervous that something isn?t right. > > I?d appreciate any thoughts! 
> > Best, > Jason Gallant > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From jgallant at msu.edu Tue Nov 3 20:20:45 2015 From: jgallant at msu.edu (Jason Gallant) Date: Wed, 04 Nov 2015 03:20:45 +0000 Subject: [maker-devel] quality_filter.pl script -question for Mike Campbell In-Reply-To: References: Message-ID: Hi Mike, I?ve done a little digging on my own, and I think I have traced the issue? it appears that the script is being thrown for the first time when it encounters the line containing the first FASTA sequence header in the GFF file. For example: Scaffold435 blastx match_part 102606 102851 201 - . ID=Scaffold435:hsp:71210:3.10.1.2;Parent=Scaffold435:hit:24267:3.10.1.2;Target=gi|37594442|ref|NP_003431.2| 161 242;Gap=M82 Scaffold435 blastx match_part 102528 102851 173 - . ID=Scaffold435:hsp:71211:3.10.1.2;Pare##FASTA >Scaffold3579 GCAGTAGGCTGTGATACGTTTGCACCCGGGGACTAAGGGGAGATGTGTACAGGATGGGGA GATGTGTACAGGATGGGGAGATGTGTACAGGATGGAGGGGTCCGTGCGAGAGCGTACCAC GTGTCTCCCGTGCAGTGGTGCGGCGTGTACTTGATGCGATAAGCCACAGGGTCCTGCTTC CCATCCTCCATGATCATTTTCATCCGCAGCGTGACCTCCCCCTCAGAAAAGACCCCCTTC CTCATCCTCTCAAAGAGCACCAGAGATTCCTCAATCGGCCGATCCCTCCAAGGGGACAGG GGAGGGCTGTGGCCCTTCAGCTCCTCCACACGCTGGTGACACACATAGGCGAGGCCCCTG GATGCAGAACAGTGCAGACAGTGACATACCATTCACATGACACTGATCCGGTTAAGCCAC ACGCGATACAATACAGTGTCATCATCAGGAAGAAGGGGAAACAGAGGCGTCAAAACGCCC TATGAAGAGAGGAGTCTGCTTGCACTCACCGGCGAATAAGATCCACTGCGAGGTCGTACA GCTTTTGGAAGTGGTCAGACGCGTGGGTCACTGCATAGGGCGTGTACCCTGTTTTACAGA Perl complains when it encounters the line containing >Scaffold3579 and pretty much every line thereafter. Intriguingly, it would appear that the preceding line appears to be truncated compared to those before it. I can trace this all the way back to the output of gff3_merge for several files. 
Not sure what to do here (or if you can help!).

Best,
Jason

From carsonhh at gmail.com Wed Nov 4 09:09:55 2015
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 4 Nov 2015 09:09:55 -0700
Subject: [maker-devel] quality_filter.pl script -question for Mike Campbell
In-Reply-To:
References:
Message-ID: <26CDF518-B664-4AD2-9C1F-6844AD9764E5@gmail.com>

If you have a truncated result, then you should look for truncation in one of the pre-merged files (this usually indicates a broken file lock if you started multiple instances of MAKER simultaneously). Also make sure that /tmp, or whatever your system default TMPDIR is, has not become full. gff3_merge uses that directory to store temporary files.

--Carson

From jgallant at msu.edu Wed Nov 4 10:14:57 2015
From: jgallant at msu.edu (Jason Gallant)
Date: Wed, 04 Nov 2015 17:14:57 +0000
Subject: [maker-devel] quality_filter.pl script -question for Mike Campbell
In-Reply-To: <26CDF518-B664-4AD2-9C1F-6844AD9764E5@gmail.com>
References: <26CDF518-B664-4AD2-9C1F-6844AD9764E5@gmail.com>
Message-ID:

Hi Carson,

Great... the full temporary directory was indeed the issue! On Amazon the AMI images are so small that a couple of big files will fill this up, and of course because the script cleans up after itself, I was never the wiser. Thanks for the insight.
I was missing about 1GB of data! Doh!

Mike, I was able to regenerate my GFF file with this in mind, and no more complaining. It was indeed the truncated file that was the culprit. Thanks for your insights as well.

Best,
Jason Gallant

From dancsi90 at gmail.com Thu Nov 5 00:16:19 2015
From: dancsi90 at gmail.com (Anna Nyiri)
Date: Thu, 5 Nov 2015 07:16:19 +0000
Subject: [maker-devel] MPI
Message-ID:

Hi,

I tried to use MAKER with MPICH2, but I got an error message:

"/molbio/bin/danna/mpich2-install/bin/hydra_pmi_proxy: error while loading shared libraries: libtorque.so.2: cannot open shared object file: No such file or directory"

The attached file contains the shell script that I used.
Is this script correct?

The script should contain the MPICH module, but I can't find it on my computer. Where can I find this module?

Thanks for your help,
Anna Nyiri
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_mpi.sh
Type: application/x-sh
Size: 1295 bytes
Desc: not available
URL:

From carsonhh at gmail.com Thu Nov 5 09:57:21 2015
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 5 Nov 2015 09:57:21 -0700
Subject: [maker-devel] MPI
In-Reply-To:
References:
Message-ID:

The problem is the actual MPICH2 installation. You may be missing prerequisites or you may not have compiled with the necessary shared library flags (-enable-shared). You may also be compiling on one machine that has certain libraries installed and then running on another that doesn't have access to those libraries (this can happen if running on a cluster).

Try reinstalling MPICH2 or switching to OpenMPI. If you decide to use OpenMPI, the following is from the INSTALL file that should be included with MAKER:

--> If using OpenMPI, make sure to set LD_PRELOAD to the location of libmpi.so before even trying to install MAKER. It must also be set before running MAKER (or any program that uses OpenMPI's shared libraries), so it's best just to add it to your ~/.bash_profile. (i.e. export LD_PRELOAD=/location/of/openmpi/lib/libmpi.so).

1. Say yes to the 'configure for MPI' question when running 'perl Build.PL' in step 1 of the EASY INSTALL.
2. Give path to 'mpicc'. Note to make sure you do not give the path to 'mpicc' from another MPI flavor that might be installed on your system.
3. Give path to the folder containing 'mpi.h'. Note to make sure you do not give the path to a folder from another MPI flavor that might be installed on your system. Mixing MPI flavors for 'mpicc' and 'mpi.h' will cause failures. Make sure to read and confirm the auto-detected paths.
4.
Finish installation according to steps 2-4 of the EASY INSTALL

Note: For OpenMPI you may also want to set OMPI_MCA_mpi_warn_on_fork=0 in your ~/.bash_profile to turn off certain nonfatal warnings.

Note: If jobs hang or freeze when using mpiexec under OpenMPI, try adding the '-mca btl ^openib' flag to the mpiexec command when running MAKER.

Example: mpiexec -mca btl ^openib -n 20 maker

Thanks,
Carson

From psh65 at cornell.edu Tue Nov 10 15:17:09 2015
From: psh65 at cornell.edu (Prashant S Hosmani)
Date: Tue, 10 Nov 2015 22:17:09 +0000
Subject: [maker-devel] Maker beta 3.0 version
Message-ID: <830525A3-D0E9-4373-A526-3C5C89F391A2@cornell.edu>

Hi All,

I am currently annotating a plant genome and was curious about the new version of Maker. I would like to know what's new in the 3.00 beta version of Maker.

Thank you in advance for your help,
Prashant

From carsonhh at gmail.com Wed Nov 11 17:30:57 2015
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 11 Nov 2015 17:30:57 -0700
Subject: [maker-devel] Maker beta 3.0 version
In-Reply-To: <830525A3-D0E9-4373-A526-3C5C89F391A2@cornell.edu>
References: <830525A3-D0E9-4373-A526-3C5C89F391A2@cornell.edu>
Message-ID: <551B16F5-A4A7-42BE-A509-9AB235CE518D@gmail.com>

Primarily EVidenceModeler (EVM) integration.

--Carson

From jgallant at msu.edu Fri Nov 13 13:03:30 2015
From: jgallant at msu.edu (Jason Gallant)
Date: Fri, 13 Nov 2015 20:03:30 +0000
Subject: [maker-devel] Alternative Splicing and ipr_update_gff
Message-ID:

Hi Everyone,

Another nitty gritty question, probably directed at Carson once more.

I decided to make one more go at my maker annotation, this time turning on alt_splice=1. I have been keeping keep_preds=1 on to export the "max" dataset as detailed in Campbell et al (2014), with the hopes of "rescuing" genes that have IPR domains but do not have evidence.

Everything works swimmingly when alt_splice=0, but when activated, the run behaved normally... I ran gff3_merge and fasta_merge to obtain proteins and transcripts, and found that as predicted the resulting fasta files contained more proteins than the initial run.

I ran IPR scan and now am trying to update my GFF3 file to obtain the "maker standard" dataset... what I am noticing is a sudden complaint by the ipr_update_gff script about use of an uninitialized value.

This appears to have happened to others:
https://groups.google.com/d/msg/maker-devel/dM4WvyghYks/BboRZQLmEF8J

And indeed, I can verify that the first protein listed in my fasta file is only listed as an "augustus_masked" match/match_part in the original GFF3 file.

If I understand the ipr_update_gff script correctly, this transcript will be ignored because it lacks the mRNA type. Is this expected naming behavior for the alternative splicing? I would have expected the alternative splice variants to be listed as alternative mRNAs under the same parent gene. Is there some sort of misconfiguration, or am I expecting incorrectly?

Hoping for any help you all can provide in diagnosing this.

Best,
Jason Gallant

From jgallant at msu.edu Fri Nov 13 13:55:47 2015
From: jgallant at msu.edu (Jason Gallant)
Date: Fri, 13 Nov 2015 20:55:47 +0000
Subject: [maker-devel] Alternative Splicing and ipr_update_gff
In-Reply-To:
References:
Message-ID:

Hi Everyone,

Doh! Very stupid error... I ran my interpro scan on the augustus masked proteins file instead of the maker masked proteins file by mistake. Apologies!

Jason Gallant

From carsonhh at gmail.com Mon Nov 2 09:36:11 2015
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 2 Nov 2015 09:36:11 -0700
Subject: [maker-devel] Repeat runner issue
In-Reply-To:
References: <1617178F-2F83-4F54-B6F5-D47A32766BC2@gmail.com>
Message-ID:

RepeatRunner requires BLAST regardless. RapSearch on the beta version will run for the protein alignment (even then I've found it prone to failure and it actually runs slower than BLAST on some longer sequences). If there is an issue with your own BLAST installation you can let MAKER install its own BLAST (some versions of BLAST+ do have issues).
Go to the maker source directory and run './Build blast'. It will install BLAST+ in .../maker/exe/blast. It will be version 2.2.28. This is because a couple of the newer updates to BLAST+ actually have bugs and spurious warnings/errors.

--Carson

From psbpedrobarbosa at gmail.com Mon Nov 2 09:28:20 2015
From: psbpedrobarbosa at gmail.com (Pedro Barbosa)
Date: Mon, 2 Nov 2015 16:28:20 +0000
Subject: [maker-devel] Repeat runner issue
In-Reply-To: <1617178F-2F83-4F54-B6F5-D47A32766BC2@gmail.com>
References: <1617178F-2F83-4F54-B6F5-D47A32766BC2@gmail.com>
Message-ID:

Ok. It seems to be related to BLAST then? I turned the rapsearch flag on (in the 3.0 beta version), expecting rapsearch to be run instead of BLAST.

BLAST options error: File tmp-dir/maker_k2Axy1/0/blastprep/te_proteins%2Efasta.mpi.10.0 does not exist
ERROR: /opt/tools/ncbi-blast-2.2.31+/bin/makeblastdb failed in Widget::formater
--> rank=NA, hostname=cebal.example.com
ERROR: Failed while doing blastx repeats
ERROR: Chunk failed at level:1, tier_type:1
FAILED CONTIG:scaffold_1

ERROR: Chunk failed at level:2, tier_type:0
FAILED CONTIG:scaffold_1

Pedro

From jason.r.gallant at gmail.com Mon Nov 2 12:03:55 2015
From: jason.r.gallant at gmail.com (Jason Gallant)
Date: Mon, 02 Nov 2015 19:03:55 +0000
Subject: [maker-devel] AUGUSTUS Training and "Off the Shelf" HMMs
Message-ID:

Hi Everyone,

I've been experimenting with optimizing Amazon to perform the HMM training of augustus more speedily, based on a procedure that Kevin Childs has written for "speedy" Augustus training. The procedure essentially comes from taking a subset of the genes predicted by SNAP, rather than the whole genome, and constructing the training set from it... a good idea that undoubtedly saves a lot of time.

I've written some modifications to the Augustus scripts and dependencies to try to speed this process up on Amazon, and I'd be happy to share my notes with anyone that is interested. I've gotten it to the point where the whole AutoAug procedure can be accomplished in a day on a small cluster. I think that working with the Augustus authors, more improvements could be made, but the whole experience with Augustus has led me to some questions more generally...

1) One of the things I noted in monkeying around with this reduced gene set procedure is that you are unable to do UTR training with Augustus... the AutoAug script complains that there aren't enough genes left to make an adequate training set. Has anyone else noted this? I haven't seen much discussion of how important it is that the Augustus HMM is trained for UTRs when used in the Maker2 pipeline.
2) I?ve been trying to evaluate how good my AUGUSTUS HMM is based on the training set. Running the newly trained species file, I see that the performance on the ?exon level? is low (around 5-6%) but sensitivity on the nucleotide level is in the 89-95%, where the specificity is in the 50-60% range, which seems consistent with other users on this and the Augustus list serve. This is assessed based on a training set of approximately 200 genes selected from the output generated by multiple iterative runs using the SNAP program, documented in the MAKER tutorial. This is all based on data & genes selected from a ?to be published? genome of an electric fish I?m working on. 3) Just for laughs, I tried the HMM trained for zebrafish on the same training set and found that the performance was slightly better than my species-specific one that I?ve been working so hard on (a few percentage points on both nucleotide level sensitivity and specificity). I?ve reasoned that it might be best in terms of reproducibility to run Maker one last time with my multiple rounds of SNAP hmm together with the augustus zebrafish species file, rather than using my own custom species training. Can anyone think of a good reason why not to do this? Are there qualities/benefits not expressed by these sensitivity/specificity measures not captured that I would benefit using my own custom species trained file for? What are folks? experiences with AUGUSTUS in this regard? Many thanks for any advise in advance! Jason Gallant -- ---- Dr. Jason R. Gallant Assistant Professor Room 38 Natural Sciences Department of Integrative Biology Michigan State University East Lansing, MI 48824 jgallant at msu.edu office: 517-884-7756 -------------- next part -------------- An HTML attachment was scrubbed... 

From scott at scottcain.net Tue Nov 3 07:28:07 2015
From: scott at scottcain.net (Scott Cain)
Date: Tue, 3 Nov 2015 09:28:07 -0500
Subject: [maker-devel] Time is short: GMOD related talks at PAG
Message-ID:

Hi All,

I realize time is short for this: there is a hard deadline of November 6 for entering speakers for the Plant and Animal Genomes meeting in January 2016. If you would like to give a talk in the GMOD section of the meeting, please get me a title and brief description (an abstract would be good but is not required) by November 5. The talk can take the form of a project update, user "story" or interpretive dance. Speakers get the early registration discount no matter when they register, so if you haven't registered yet, it would be like getting $100 off.

Sorry for the short notice, and I look forward to seeing you in San Diego!

Scott

--
Scott Cain, Ph. D.
scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)
216-392-3087
Ontario Institute for Cancer Research

From jgallant at msu.edu Tue Nov 3 12:48:10 2015
From: jgallant at msu.edu (Jason Gallant)
Date: Tue, 03 Nov 2015 19:48:10 +0000
Subject: [maker-devel] quality_filter.pl script -question for Mike Campbell
Message-ID:

Hi Mike (list copied for future reference),

I found your very nice protocols paper on using Maker from 2014. I've been following it to the letter as I'm wrapping up my annotation project.

I've located your quality_filter.pl script and am using it on my GFF files to create maker standard and maker default data sets from my maker-max GFF file. I'm noticing that perl complains a lot while this is running about "use of uninitialized value". This occurs on two separate passes as far as I can tell. When generating the "maker standard" file, it occurs for many lines in my GFF file as:

Use of uninitialized value $array[2] in pattern match (m//) at /mnt/home/jgallant/quality_filter.pl line 50, line Y.

And then later it complains again with a similar message:

Use of uninitialized value $array[2] in string eq at /mnt/home/jgallant/quality_filter.pl line 92, line X.
Use of uninitialized value $array[2] in string eq at /mnt/home/jgallant/quality_filter.pl line 96, line X.
Use of uninitialized value $array[2] in string eq at /mnt/home/jgallant/quality_filter.pl line 100, line X

Any insights as to what causes this? I seem to get a fully formed GFF3 file out the other side, but the command line fills with these messages and makes me nervous that something isn't right.

I'd appreciate any thoughts!

Best,
Jason Gallant
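
A quick sanity check for warnings like the ones quoted above could look like the sketch below. It assumes (this is not confirmed anywhere in the thread) that `$array[2]` in the warnings is the third tab-separated column of a GFF3 feature line, so a line with fewer columns would leave it undefined. The function name `check_gff3_columns` is hypothetical and not part of MAKER.

```shell
#!/bin/sh
# Hypothetical helper (not part of MAKER): list annotation lines that do not
# have the nine tab-separated columns GFF3 requires. Scanning stops at a
# ##FASTA directive at the start of a line, because the sequence lines that
# follow it are not feature lines.
check_gff3_columns() {
    awk -F '\t' '
        /^##FASTA/ { exit }          # embedded sequence section starts here
        /^#/ || /^$/ { next }        # skip directives, comments, blank lines
        NF != 9 { printf "line %d: %d column(s)\n", NR, NF; bad++ }
        END { exit (bad > 0) }       # non-zero exit status if any line is malformed
    ' "$1"
}
```

Calling `check_gff3_columns merged.gff` (the file name is only an example) prints nothing and returns 0 for a well-formed file. If the `##FASTA` directive never appears at the start of a line, the FASTA header and sequence lines are reported as one-column lines.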

From michael.s.campbell1 at gmail.com Tue Nov 3 14:06:32 2015
From: michael.s.campbell1 at gmail.com (Michael Campbell)
Date: Tue, 3 Nov 2015 16:06:32 -0500
Subject: [maker-devel] quality_filter.pl script -question for Mike Campbell
In-Reply-To:
References:
Message-ID:

Hi Jason,

It could be a couple of things. If you have a cut-down version of your gff3 that I can use to recreate the error, I can debug it. The quality_filter.pl script is still a pretty young accessory script, so you may have something in your file that it wasn't tested against.

Thanks,
Mike

From jgallant at msu.edu Tue Nov 3 20:20:45 2015
From: jgallant at msu.edu (Jason Gallant)
Date: Wed, 04 Nov 2015 03:20:45 +0000
Subject: [maker-devel] quality_filter.pl script -question for Mike Campbell
In-Reply-To:
References:
Message-ID:

Hi Mike,

I've done a little digging on my own, and I think I have traced the issue: it appears that the warning is thrown for the first time when the script encounters the line containing the first FASTA sequence header in the GFF file. For example:

Scaffold435 blastx match_part 102606 102851 201 - . ID=Scaffold435:hsp:71210:3.10.1.2;Parent=Scaffold435:hit:24267:3.10.1.2;Target=gi|37594442|ref|NP_003431.2| 161 242;Gap=M82
Scaffold435 blastx match_part 102528 102851 173 - . ID=Scaffold435:hsp:71211:3.10.1.2;Pare##FASTA
>Scaffold3579
GCAGTAGGCTGTGATACGTTTGCACCCGGGGACTAAGGGGAGATGTGTACAGGATGGGGA
GATGTGTACAGGATGGGGAGATGTGTACAGGATGGAGGGGTCCGTGCGAGAGCGTACCAC
GTGTCTCCCGTGCAGTGGTGCGGCGTGTACTTGATGCGATAAGCCACAGGGTCCTGCTTC
CCATCCTCCATGATCATTTTCATCCGCAGCGTGACCTCCCCCTCAGAAAAGACCCCCTTC
CTCATCCTCTCAAAGAGCACCAGAGATTCCTCAATCGGCCGATCCCTCCAAGGGGACAGG
GGAGGGCTGTGGCCCTTCAGCTCCTCCACACGCTGGTGACACACATAGGCGAGGCCCCTG
GATGCAGAACAGTGCAGACAGTGACATACCATTCACATGACACTGATCCGGTTAAGCCAC
ACGCGATACAATACAGTGTCATCATCAGGAAGAAGGGGAAACAGAGGCGTCAAAACGCCC
TATGAAGAGAGGAGTCTGCTTGCACTCACCGGCGAATAAGATCCACTGCGAGGTCGTACA
GCTTTTGGAAGTGGTCAGACGCGTGGGTCACTGCATAGGGCGTGTACCCTGTTTTACAGA

Perl complains when it encounters the line containing >Scaffold3579 and pretty much every line thereafter. Intriguingly, the preceding line appears to be truncated compared to those before it. I can trace this all the way back to the output of gff3_merge for several files. Not sure what to do here (or if you can help!).

Best,
Jason

From carsonhh at gmail.com Wed Nov 4 09:09:55 2015
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 4 Nov 2015 09:09:55 -0700
Subject: [maker-devel] quality_filter.pl script -question for Mike Campbell
In-Reply-To:
References:
Message-ID: <26CDF518-B664-4AD2-9C1F-6844AD9764E5@gmail.com>

If you have a truncated result, then you should look for truncation in one of the pre-merged files (this usually indicates a broken file lock if you started multiple instances of MAKER simultaneously). Also make sure that /tmp, or whatever your system default TMPDIR is, has not become full. gff3_merge uses that directory to store temporary files.

--Carson

From jgallant at msu.edu Wed Nov 4 10:14:57 2015
From: jgallant at msu.edu (Jason Gallant)
Date: Wed, 04 Nov 2015 17:14:57 +0000
Subject: [maker-devel] quality_filter.pl script -question for Mike Campbell
In-Reply-To: <26CDF518-B664-4AD2-9C1F-6844AD9764E5@gmail.com>
References: <26CDF518-B664-4AD2-9C1F-6844AD9764E5@gmail.com>
Message-ID:

Hi Carson,

Great, the full temporary directory was indeed the issue! On Amazon the AMI images are so small that a couple of big files will fill this up, and of course because the script cleans up after itself, I was never the wiser. Thanks for the insight. I was missing about 1GB of data! Doh!

Mike, I was able to regenerate my GFF file with this in mind, and no more complaining. It was indeed the truncated file that was the culprit. Thanks for your insights as well.

Best,
Jason Gallant

From dancsi90 at gmail.com Thu Nov 5 00:16:19 2015
From: dancsi90 at gmail.com (Anna Nyiri)
Date: Thu, 5 Nov 2015 07:16:19 +0000
Subject: [maker-devel] MPI
Message-ID:

Hi,

I tried to use MAKER with MPICH2, but I got an error message:
"/molbio/bin/danna/mpich2-install/bin/hydra_pmi_proxy: error while loading shared libraries: libtorque.so.2: cannot open shared object file: No such file or directory"

The attached file contains the shell script that I used. Is this script correct?

The script should contain an MPICH module, but I can't find it on my computer. Where can I find this module?

Thanks for your help,
Anna Nyiri
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_mpi.sh
Type: application/x-sh
Size: 1296 bytes
Desc: not available

From carsonhh at gmail.com Thu Nov 5 09:57:21 2015
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 5 Nov 2015 09:57:21 -0700
Subject: [maker-devel] MPI
In-Reply-To:
References:
Message-ID:

The problem is the actual MPICH2 installation. You may be missing prerequisites or you may not have compiled with the necessary shared library flags (--enable-shared). You may also be compiling on one machine that has certain libraries installed and then running on another that doesn't have access to those libraries (this can happen if running on a cluster). Try reinstalling MPICH2 or switching to OpenMPI.

If you decide to use OpenMPI, the following is from the INSTALL file that should be included with MAKER:

If using OpenMPI, make sure to set LD_PRELOAD to the location of libmpi.so before even trying to install MAKER. It must also be set before running MAKER (or any program that uses OpenMPI's shared libraries), so it's best just to add it to your ~/.bash_profile. (i.e. export LD_PRELOAD=/location/of/openmpi/lib/libmpi.so).

1. Say yes to the 'configure for MPI' question when running 'perl Build.PL' in step 1 of the EASY INSTALL.

2. Give path to 'mpicc'. Note to make sure you do not give the path to 'mpicc' from another MPI flavor that might be installed on your system.

3. Give path to the folder containing 'mpi.h'. Note to make sure you do not give the path to a folder from another MPI flavor that might be installed on your system. Mixing MPI flavors for 'mpicc' and 'mpi.h' will cause failures. Make sure to read and confirm the auto-detected paths.

4. Finish installation according to steps 2-4 of the EASY INSTALL

Note: For OpenMPI you may also want to set OMPI_MCA_mpi_warn_on_fork=0 in your ~/.bash_profile to turn off certain nonfatal warnings.

Note: If jobs hang or freeze when using mpiexec under OpenMPI try adding the '-mca btl ^openib' flag to mpiexec command when running MAKER.
Example: mpiexec -mca btl ^openib -n 20 maker

Thanks,
Carson

From psh65 at cornell.edu Tue Nov 10 15:17:09 2015
From: psh65 at cornell.edu (Prashant S Hosmani)
Date: Tue, 10 Nov 2015 22:17:09 +0000
Subject: [maker-devel] Maker beta 3.0 version
Message-ID: <830525A3-D0E9-4373-A526-3C5C89F391A2@cornell.edu>

Hi All,

I am currently annotating a plant genome and was curious about the new version of Maker. I would like to know what's new in the 3.00 beta version of Maker.

Thank you in advance for your help,
Prashant

From carsonhh at gmail.com Wed Nov 11 17:30:57 2015
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 11 Nov 2015 17:30:57 -0700
Subject: [maker-devel] Maker beta 3.0 version
In-Reply-To: <830525A3-D0E9-4373-A526-3C5C89F391A2@cornell.edu>
References: <830525A3-D0E9-4373-A526-3C5C89F391A2@cornell.edu>
Message-ID: <551B16F5-A4A7-42BE-A509-9AB235CE518D@gmail.com>

Primarily EVidenceModeler (EVM) integration.

--Carson

From jgallant at msu.edu Fri Nov 13 13:03:30 2015
From: jgallant at msu.edu (Jason Gallant)
Date: Fri, 13 Nov 2015 20:03:30 +0000
Subject: [maker-devel] Alternative Splicing and ipr_update_gff
Message-ID:

Hi Everyone,

Another nitty-gritty question, probably directed at Carson once more.

I decided to make one more go at my maker annotation, this time turning on alt_splice=1. I have been keeping keep_preds=1 on to export the "max" dataset as detailed in Campbell et al (2014), with the hopes of "rescuing" genes that have IPR domains but do not have evidence.

Everything works swimmingly when alt_splice=0. When it was activated, the run behaved normally: I ran gff3_merge and fasta_merge to obtain proteins and transcripts, and found that, as predicted, the resulting fasta files contained more proteins than the initial run.

I ran IPR scan and now am trying to update my GFF3 file to obtain the "maker standard" dataset. What I am noticing is a sudden complaint by the ipr_update_gff script about use of an uninitialized value.

This appears to have happened to others:
https://groups.google.com/d/msg/maker-devel/dM4WvyghYks/BboRZQLmEF8J

And indeed, I can verify that the first protein listed in my fasta file is only listed as an "augustus_masked" match/match_part in the original GFF3 file.

If I understand the ipr_update_gff script correctly, this transcript will be ignored because it lacks the mRNA type. Is this expected naming behavior for the alternative splicing? I would have expected the alternative splice variants to be listed as alternative mRNAs under the same parent gene. Is there some sort of misconfiguration, or are my expectations wrong?

Hoping for any help you all can provide in diagnosing this.

Best,
Jason Gallant

From jgallant at msu.edu Fri Nov 13 13:55:47 2015
From: jgallant at msu.edu (Jason Gallant)
Date: Fri, 13 Nov 2015 20:55:47 +0000
Subject: [maker-devel] Alternative Splicing and ipr_update_gff
In-Reply-To:
References:
Message-ID:

Hi Everyone,

Doh! Very stupid error: I ran my InterProScan on the augustus masked proteins file instead of the maker masked proteins file by mistake.

Apologies!
Jason Gallant
> > STARTED scaff-1k.maker.output/scaff-1k_datastore/3A/5C/scaffold_13//theVoid.scaffold_13/0/scaffold_13.0.consensi%2Efa%2Eclassified%2Empi%2E10%2E9.specific.out > FINISHED scaff-1k.maker.output/scaff-1k_datastore/3A/5C/scaffold_13//theVoid.scaffold_13/0/scaffold_13.0.consensi%2Efa%2Eclassified%2Empi%2E10%2E9.specific.out > STARTED scaff-1k.maker.output/scaff-1k_datastore/3A/5C/scaffold_13//theVoid.scaffold_13/0/scaffold_13.0.te_proteins%2Efasta.repeatrunner > DIED RANK 0:2:0:0 > DIED COUNT 1 > > Afterwards, I realized that repeatrunner is no longer included internally in MAKER, so I installed and included it in the system path. The error remained. > I tried to remove the parameter 'repeat_protein' from the opts control file, because apparently repeatrunner is called when we provide a set of transposable elements. No success again. > > I tested both with the MAKER v2.31.8 and v3.00.0 beta, but the problem doesn't seem to be version related. Please find attached the maker_opts file that i used to run. > > Could you help me with this ? > > Best regards, > Pedro Barbosa > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon Nov 2 09:36:11 2015 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 2 Nov 2015 09:36:11 -0700 Subject: [maker-devel] Repeat runner issue In-Reply-To: References: <1617178F-2F83-4F54-B6F5-D47A32766BC2@gmail.com> Message-ID: RepeatRunner requires BLAST regardless. RapSearch on the beta version will run for the protein alignment (even then I?ve found it prone to failure and it actually runs slower than BLAST on some longer sequences). If there is an issue with your own BLAST installation you can let MAKER install it?s own BLAST (some versions of BLAST+ do have issues). 
Go to the maker source directory and run './Build blast'. It will install BLAST+ in .../maker/exe/blast. It will be version 2.2.28. This is because a couple of the newer updates to BLAST+ actually have bugs and spurious warnings/errors.

--Carson

From psbpedrobarbosa at gmail.com Mon Nov 2 09:28:20 2015
From: psbpedrobarbosa at gmail.com (Pedro Barbosa)
Date: Mon, 2 Nov 2015 16:28:20 +0000
Subject: [maker-devel] Repeat runner issue
In-Reply-To: <1617178F-2F83-4F54-B6F5-D47A32766BC2@gmail.com>
References: <1617178F-2F83-4F54-B6F5-D47A32766BC2@gmail.com>

Ok, it seems to be related to BLAST then? I turned the rapsearch flag on (in the 3.0 beta version), expecting RapSearch to be run instead of BLAST.
BLAST options error: File tmp-dir/maker_k2Axy1/0/blastprep/te_proteins%2Efasta.mpi.10.0 does not exist
ERROR: /opt/tools/ncbi-blast-2.2.31+/bin/makeblastdb failed in Widget::formater
--> rank=NA, hostname=cebal.example.com
ERROR: Failed while doing blastx repeats
ERROR: Chunk failed at level:1, tier_type:1
FAILED CONTIG:scaffold_1

ERROR: Chunk failed at level:2, tier_type:0
FAILED CONTIG:scaffold_1

Pedro

From jason.r.gallant at gmail.com Mon Nov 2 12:03:55 2015
From: jason.r.gallant at gmail.com (Jason Gallant)
Date: Mon, 02 Nov 2015 19:03:55 +0000
Subject: [maker-devel] AUGUSTUS Training and "Off the Shelf" HMMs

Hi Everyone,

I've been experimenting with optimizing Amazon to perform the HMM training of Augustus more speedily, based on a procedure that Kevin Childs has written for "speedy" Augustus training. The procedure essentially comes from taking a subset of the genes predicted by SNAP, rather than the whole genome, and constructing the training set: a good idea that undoubtedly saves a lot of time.

I've written some modifications to the Augustus scripts and dependencies to try to speed this process up on Amazon, and I'd be happy to share my notes with anyone who is interested. I've gotten it to the point where the whole AutoAug procedure can be accomplished in a day on a small cluster. I think that, working with the Augustus authors, more improvements could be made, but the whole experience with Augustus has led me to some more general questions...

1) One of the things I noted in monkeying around with this reduced gene set procedure is that you are unable to do UTR training with Augustus: the AutoAug script complains that there aren't enough genes left to make an adequate training set. Has anyone else noted this? I ask because I haven't seen much discussion of how important it is that the Augustus HMM is trained for UTRs when used in the Maker2 pipeline.
2) I've been trying to evaluate how good my AUGUSTUS HMM is based on the training set. Running the newly trained species file, I see that the performance on the "exon level" is low (around 5-6%), but sensitivity on the nucleotide level is in the 89-95% range, whereas the specificity is in the 50-60% range, which seems consistent with what other users report on this and the Augustus list serve. This is assessed on a training set of approximately 200 genes selected from the output generated by multiple iterative runs of the SNAP program, as documented in the MAKER tutorial. This is all based on data and genes selected from a "to be published" genome of an electric fish I'm working on.

3) Just for laughs, I tried the HMM trained for zebrafish on the same training set and found that the performance was slightly better than that of my species-specific one that I've been working so hard on (a few percentage points on both nucleotide-level sensitivity and specificity). I've reasoned that it might be best in terms of reproducibility to run Maker one last time with my multiple rounds of SNAP HMM together with the Augustus zebrafish species file, rather than using my own custom species training. Can anyone think of a good reason not to do this? Are there qualities or benefits, not captured by these sensitivity/specificity measures, for which I would benefit from using my own custom-trained species file? What are folks' experiences with AUGUSTUS in this regard?

Many thanks in advance for any advice!

Jason Gallant

--
----
Dr. Jason R. Gallant
Assistant Professor
Room 38 Natural Sciences
Department of Integrative Biology
Michigan State University
East Lansing, MI 48824
jgallant at msu.edu
office: 517-884-7756
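For readers unfamiliar with the nucleotide-level figures quoted above: in gene-prediction evaluation, sensitivity is usually the fraction of annotated coding bases that are also predicted, and "specificity" is usually the fraction of predicted coding bases that are also annotated (what other fields call precision). The following is a minimal Python sketch under that assumption. The function names and interval data are made up for illustration, and this is not the code that AUGUSTUS itself uses.

```python
def covered_positions(intervals):
    """Expand 1-based inclusive (start, end) intervals into a set of positions."""
    positions = set()
    for start, end in intervals:
        positions.update(range(start, end + 1))
    return positions


def nucleotide_accuracy(annotated, predicted):
    """Return (sensitivity, specificity) at the nucleotide level.

    sensitivity = shared bases / annotated bases
    specificity = shared bases / predicted bases (gene-finding usage of the term)
    """
    truth = covered_positions(annotated)
    guess = covered_positions(predicted)
    true_positives = len(truth & guess)
    sensitivity = true_positives / len(truth) if truth else 0.0
    specificity = true_positives / len(guess) if guess else 0.0
    return sensitivity, specificity


# Hypothetical coding intervals on a single scaffold.
annotated = [(101, 200), (301, 400)]  # 200 annotated coding bases
predicted = [(151, 200), (301, 500)]  # 250 predicted coding bases, 150 shared

sn, sp = nucleotide_accuracy(annotated, predicted)
print(f"sensitivity={sn:.2f} specificity={sp:.2f}")
```

With these made-up intervals, 150 of the 200 annotated bases are recovered (sensitivity 0.75) and 150 of the 250 predicted bases are correct (specificity 0.60). Expanding intervals into sets is only practical for small test sets; it is chosen here for readability, not speed.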
From scott at scottcain.net Tue Nov 3 07:28:07 2015
From: scott at scottcain.net (Scott Cain)
Date: Tue, 3 Nov 2015 09:28:07 -0500
Subject: [maker-devel] Time is short: GMOD related talks at PAG

Hi All,

I realize time is short for this: there is a hard deadline of November 6 for entering speakers for the Plant and Animal Genomes meeting in January 2016. If you would like to give a talk in the GMOD section of the meeting, please get me a title and brief description (an abstract would be good but not required) by November 5. The talk can take the form of a project update, user "story" or interpretive dance. Speakers get the early registration discount no matter when they register, so if you haven't registered yet, it would be like getting $100 off.
Sorry for the short notice, and I look forward to seeing you in San Diego!

Scott

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                        scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)       216-392-3087
Ontario Institute for Cancer Research

From jgallant at msu.edu Tue Nov 3 12:48:10 2015
From: jgallant at msu.edu (Jason Gallant)
Date: Tue, 03 Nov 2015 19:48:10 +0000
Subject: [maker-devel] quality_filter.pl script -question for Mike Campbell

Hi Mike (list copied for future reference),

I found your very nice protocols paper on using Maker from 2014. I've been following it to the letter as I'm wrapping up my annotation project.

I've located your quality_filter.pl script and am using it on my GFF files to create maker standard and maker default data sets from my maker-max GFF file. I'm noticing that perl complains a lot while this is running about "use of uninitialized value". This occurs on two separate passes as far as I can tell. When generating the "maker standard" file, it occurs for many lines in my GFF file as:

Use of uninitialized value $array[2] in pattern match (m//) at /mnt/home/jgallant/quality_filter.pl line 50, line Y.

And then later it complains again with a similar message:

Use of uninitialized value $array[2] in string eq at /mnt/home/jgallant/quality_filter.pl line 92, line X.
Use of uninitialized value $array[2] in string eq at /mnt/home/jgallant/quality_filter.pl line 96, line X.
Use of uninitialized value $array[2] in string eq at /mnt/home/jgallant/quality_filter.pl line 100, line X

Any insights as to what causes this? I seem to get a fully formed GFF3 file out the other side, but the command line fills with these messages, which makes me nervous that something isn't right.

I'd appreciate any thoughts!

Best,
Jason Gallant
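An undefined third array element in warnings like these suggests that some input lines did not split into the expected GFF3 columns, assuming the script splits each line on tabs. The following is a minimal Python sketch of one way to look for such lines. It is not part of MAKER or quality_filter.pl; the function name, reason strings and sample records are hypothetical.

```python
def find_malformed_gff3_lines(lines):
    """Return (line_number, reason) pairs for suspicious lines in the feature section."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if line == "##FASTA":
            break  # everything after this directive is sequence, not features
        if not line or line.startswith("#"):
            continue  # blank lines, comments and other directives
        if "##FASTA" in line:
            problems.append((number, "feature line fused with ##FASTA directive"))
        elif line.startswith(">"):
            problems.append((number, "FASTA header inside feature section"))
        elif len(line.split("\t")) != 9:
            problems.append((number, "expected 9 tab-separated columns"))
    return problems


# Hypothetical records: one good feature line, one cut short so that the
# ##FASTA directive landed on the same line, then sequence data.
sample = [
    "##gff-version 3\n",
    "Scaffold1\tblastx\tmatch_part\t100\t200\t201\t-\t.\tID=hsp1;Parent=hit1\n",
    "Scaffold1\tblastx\tmatch_part\t150\t200\t173\t-\t.\tID=hsp2;Pare##FASTA\n",
    ">Scaffold2\n",
    "GCAGTAGGCTGTGATACG\n",
]

for number, reason in find_malformed_gff3_lines(sample):
    print(number, reason)
```

The function accepts any iterable of lines, so an open file handle can be passed in place of the sample list. Because a fused directive is never seen as a standalone ##FASTA line, every sequence line after it is also reported, which matches the pattern of one warning per remaining line.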
From michael.s.campbell1 at gmail.com Tue Nov 3 14:06:32 2015
From: michael.s.campbell1 at gmail.com (Michael Campbell)
Date: Tue, 3 Nov 2015 16:06:32 -0500
Subject: [maker-devel] quality_filter.pl script -question for Mike Campbell

Hi Jason,

It could be a couple of things. If you have a cut-down version of your GFF3 that I can use to recreate the error, I can debug it. The quality_filter.pl script is still a pretty young accessory script, so you may have something in your file that it wasn't tested against.

Thanks,
Mike

From jgallant at msu.edu Tue Nov 3 20:20:45 2015
From: jgallant at msu.edu (Jason Gallant)
Date: Wed, 04 Nov 2015 03:20:45 +0000
Subject: [maker-devel] quality_filter.pl script -question for Mike Campbell

Hi Mike,

I've done a little digging on my own, and I think I have traced the issue: it appears that the warning is thrown for the first time when the script encounters the line containing the first FASTA sequence header in the GFF file. For example:

Scaffold435 blastx match_part 102606 102851 201 - . ID=Scaffold435:hsp:71210:3.10.1.2;Parent=Scaffold435:hit:24267:3.10.1.2;Target=gi|37594442|ref|NP_003431.2| 161 242;Gap=M82
Scaffold435 blastx match_part 102528 102851 173 - . ID=Scaffold435:hsp:71211:3.10.1.2;Pare##FASTA
>Scaffold3579
GCAGTAGGCTGTGATACGTTTGCACCCGGGGACTAAGGGGAGATGTGTACAGGATGGGGA
GATGTGTACAGGATGGGGAGATGTGTACAGGATGGAGGGGTCCGTGCGAGAGCGTACCAC
GTGTCTCCCGTGCAGTGGTGCGGCGTGTACTTGATGCGATAAGCCACAGGGTCCTGCTTC
CCATCCTCCATGATCATTTTCATCCGCAGCGTGACCTCCCCCTCAGAAAAGACCCCCTTC
CTCATCCTCTCAAAGAGCACCAGAGATTCCTCAATCGGCCGATCCCTCCAAGGGGACAGG
GGAGGGCTGTGGCCCTTCAGCTCCTCCACACGCTGGTGACACACATAGGCGAGGCCCCTG
GATGCAGAACAGTGCAGACAGTGACATACCATTCACATGACACTGATCCGGTTAAGCCAC
ACGCGATACAATACAGTGTCATCATCAGGAAGAAGGGGAAACAGAGGCGTCAAAACGCCC
TATGAAGAGAGGAGTCTGCTTGCACTCACCGGCGAATAAGATCCACTGCGAGGTCGTACA
GCTTTTGGAAGTGGTCAGACGCGTGGGTCACTGCATAGGGCGTGTACCCTGTTTTACAGA

Perl complains when it encounters the line containing >Scaffold3579 and pretty much every line thereafter. Intriguingly, the preceding line appears to be truncated compared to those before it. I can trace this all the way back to the output of gff3_merge for several files.
Not sure what to do here (or if you can help!).

Best,
Jason

From carsonhh at gmail.com Wed Nov 4 09:09:55 2015
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 4 Nov 2015 09:09:55 -0700
Subject: [maker-devel] quality_filter.pl script -question for Mike Campbell
Message-ID: <26CDF518-B664-4AD2-9C1F-6844AD9764E5@gmail.com>

If you have a truncated result, then you should look for truncation in one of the pre-merged files (this usually indicates a broken file lock if you started multiple instances of MAKER simultaneously). Also make sure your /tmp, or whatever your system default TMPDIR is, has not become full. gff3_merge uses that directory to store temporary files.

--Carson

From jgallant at msu.edu Wed Nov 4 10:14:57 2015
From: jgallant at msu.edu (Jason Gallant)
Date: Wed, 04 Nov 2015 17:14:57 +0000
Subject: [maker-devel] quality_filter.pl script -question for Mike Campbell
In-Reply-To: <26CDF518-B664-4AD2-9C1F-6844AD9764E5@gmail.com>
References: <26CDF518-B664-4AD2-9C1F-6844AD9764E5@gmail.com>

Hi Carson,

Great: the full temporary directory was indeed the issue! On Amazon the AMI images are so small that a couple of big files will fill this up, and of course, because the script cleans up after itself, I was never the wiser. Thanks for the insight.
I was missing about 1GB of data! Doh! Mike, I was able to regenerate my GFF file with this in mind, and no more complaining. It was indeed the truncated file that was the culprit. Thanks for your insights as well. Best, Jason Gallant On Wed, Nov 4, 2015 at 11:10 AM Carson Holt wrote: > If you have a truncated result, then you should look for truncation in one > of the pre-merged files (usually indicates a broken file lock if you > started multiple instances of MAKER simultaneously). Also make sure your > /tmp or whatever your system default TMPDIR is has not become full. > gff3_merge uses that directory to store temporary files. > > ?Carson > > > On Nov 3, 2015, at 8:20 PM, Jason Gallant wrote: > > Hi Mike, > > I?ve done a little digging on my own, and I think I have traced the issue? > it appears that the script is being thrown for the first time when it > encounters the line containing the first FASTA sequence header in the GFF > file. For example: > > Scaffold435 blastx match_part 102606 102851 201 - . ID=Scaffold435:hsp:71210:3.10.1.2;Parent=Scaffold435:hit:24267:3.10.1.2;Target=gi|37594442|ref|NP_003431.2| > 161 242;Gap=M82 > Scaffold435 blastx match_part 102528 102851 173 - . 
> ID=Scaffold435:hsp:71211:3.10.1.2;Pare##FASTA > >Scaffold3579 > GCAGTAGGCTGTGATACGTTTGCACCCGGGGACTAAGGGGAGATGTGTACAGGATGGGGA > GATGTGTACAGGATGGGGAGATGTGTACAGGATGGAGGGGTCCGTGCGAGAGCGTACCAC > GTGTCTCCCGTGCAGTGGTGCGGCGTGTACTTGATGCGATAAGCCACAGGGTCCTGCTTC > CCATCCTCCATGATCATTTTCATCCGCAGCGTGACCTCCCCCTCAGAAAAGACCCCCTTC > CTCATCCTCTCAAAGAGCACCAGAGATTCCTCAATCGGCCGATCCCTCCAAGGGGACAGG > GGAGGGCTGTGGCCCTTCAGCTCCTCCACACGCTGGTGACACACATAGGCGAGGCCCCTG > GATGCAGAACAGTGCAGACAGTGACATACCATTCACATGACACTGATCCGGTTAAGCCAC > ACGCGATACAATACAGTGTCATCATCAGGAAGAAGGGGAAACAGAGGCGTCAAAACGCCC > TATGAAGAGAGGAGTCTGCTTGCACTCACCGGCGAATAAGATCCACTGCGAGGTCGTACA > GCTTTTGGAAGTGGTCAGACGCGTGGGTCACTGCATAGGGCGTGTACCCTGTTTTACAGA > > Perl complains when it encounters the line containing >Scaffold3579 and > pretty much every line thereafter. Intriguingly, it would appear that the > preceding line appears to be truncated compared to those before it. I can > trace this all the way back to the output of gff3_merge for several files. > Not sure what to do here (or if you can help!). > > Best, > Jason > > > > > On Tue, Nov 3, 2015 at 4:06 PM Michael Campbell < > michael.s.campbell1 at gmail.com> wrote: > >> Hi Jason, >> >> It could be a couple of things. If you have a cut down version of your >> gff3 that I can use to recreate the error I can debug it. The >> quality_filter.pl script is still a pretty young accessory script, so >> you may have something in your file that It wasn?t tested against. >> >> Thanks, >> Mike >> >> >> >> On Nov 3, 2015, at 2:48 PM, Jason Gallant wrote: >> >> Hi Mike (list copied for future reference), >> >> I found your very nice protocols paper on using Maker from 2014. I?ve >> been following it to the letter as I?m wrapping up my annotation project. >> >> I?ve located your quality_filter.pl script and am using it on my GFF >> files to create maker standard and maker default data sets from my >> maker-max GFF file. 
>> I'm noticing that perl complains a lot while this is
>> running about "use of uninitialized value". This occurs on two separate
>> passes as far as I can tell. When generating the "maker standard" file, it
>> occurs for many lines in my GFF file as:
>>
>> Use of uninitialized value $array[2] in pattern match (m//) at /mnt/home/jgallant/quality_filter.pl line 50, line Y.
>>
>> And then later it complains again with a similar message:
>>
>> Use of uninitialized value $array[2] in string eq at /mnt/home/jgallant/quality_filter.pl line 92, line X.
>> Use of uninitialized value $array[2] in string eq at /mnt/home/jgallant/quality_filter.pl line 96, line X.
>> Use of uninitialized value $array[2] in string eq at /mnt/home/jgallant/quality_filter.pl line 100, line X
>>
>> Any insights as to what causes this? I seem to get a fully formed GFF3
>> file out the other side, but the command line fills with these messages and
>> makes me nervous that something isn't right.
>>
>> I'd appreciate any thoughts!
>>
>> Best,
>> Jason Gallant
>>
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From dancsi90 at gmail.com Thu Nov 5 00:16:19 2015
From: dancsi90 at gmail.com (Anna Nyiri)
Date: Thu, 5 Nov 2015 07:16:19 +0000
Subject: [maker-devel] MPI
Message-ID:

Hi,

I tried to use MAKER with MPICH2, but I got an error message:
"/molbio/bin/danna/mpich2-install/bin/hydra_pmi_proxy: error while loading shared libraries: libtorque.so.2: cannot open shared object file: No such file or directory"

The attached file contains the shell script, which I used.
Is this script correct?

The script should contain the MPICH module, but I can't find it on my computer. Where can I find this module?

Thanks for your help,
Anna Nyiri

-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_mpi.sh
Type: application/x-sh
Size: 1296 bytes
Desc: not available
URL:

From carsonhh at gmail.com Thu Nov 5 09:57:21 2015
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 5 Nov 2015 09:57:21 -0700
Subject: [maker-devel] MPI
In-Reply-To:
References:
Message-ID:

The problem is the actual MPICH2 installation. You may be missing prerequisites, or you may not have compiled with the necessary shared library flags (--enable-shared). You may also be compiling on one machine that has certain libraries installed and then running on another that doesn't have access to those libraries (this can happen if running on a cluster). Try reinstalling MPICH2 or switching to OpenMPI.

If you decide to use OpenMPI, the following is from the INSTALL file that should be included with MAKER:

If using OpenMPI, make sure to set LD_PRELOAD to the location of libmpi.so before even trying to install MAKER. It must also be set before running MAKER (or any program that uses OpenMPI's shared libraries), so it's best just to add it to your ~/.bash_profile (i.e. export LD_PRELOAD=/location/of/openmpi/lib/libmpi.so).

1. Say yes to the 'configure for MPI' question when running 'perl Build.PL' in step 1 of the EASY INSTALL.
2. Give path to 'mpicc'. Note to make sure you do not give the path to 'mpicc' from another MPI flavor that might be installed on your system.
3. Give path to the folder containing 'mpi.h'. Note to make sure you do not give the path to a folder from another MPI flavor that might be installed on your system. Mixing MPI flavors for 'mpicc' and 'mpi.h' will cause failures. Make sure to read and confirm the auto-detected paths.
4. Finish installation according to steps 2-4 of the EASY INSTALL.

Note: For OpenMPI you may also want to set OMPI_MCA_mpi_warn_on_fork=0 in your ~/.bash_profile to turn off certain nonfatal warnings.

Note: If jobs hang or freeze when using mpiexec under OpenMPI, try adding the '-mca btl ^openib' flag to the mpiexec command when running MAKER.

Example: mpiexec -mca btl ^openib -n 20 maker

Thanks,
Carson

From psh65 at cornell.edu Tue Nov 10 15:17:09 2015
From: psh65 at cornell.edu (Prashant S Hosmani)
Date: Tue, 10 Nov 2015 22:17:09 +0000
Subject: [maker-devel] Maker beta 3.0 version
Message-ID: <830525A3-D0E9-4373-A526-3C5C89F391A2@cornell.edu>

Hi All,

I am currently annotating a plant genome and was curious about the new version of Maker. I would like to know what's new in the 3.00 beta version of Maker.
Thank you in advance for your help,
Prashant

From carsonhh at gmail.com Wed Nov 11 17:30:57 2015
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 11 Nov 2015 17:30:57 -0700
Subject: [maker-devel] Maker beta 3.0 version
In-Reply-To: <830525A3-D0E9-4373-A526-3C5C89F391A2@cornell.edu>
References: <830525A3-D0E9-4373-A526-3C5C89F391A2@cornell.edu>
Message-ID: <551B16F5-A4A7-42BE-A509-9AB235CE518D@gmail.com>

Primarily EVidenceModeler (EVM) integration.

--Carson

From jgallant at msu.edu Fri Nov 13 13:03:30 2015
From: jgallant at msu.edu (Jason Gallant)
Date: Fri, 13 Nov 2015 20:03:30 +0000
Subject: [maker-devel] Alternative Splicing and ipr_update_gff
Message-ID:

Hi Everyone,

Another nitty gritty question, probably directed at Carson once more.

I decided to make one more go at my maker annotation, this time turning on alt_splice=1. I have been keeping keep_preds=1 on to export the "max" dataset as detailed in Campbell et al (2014), with the hopes of "rescuing" genes that have IPR domains but do not have evidence.

Everything works swimmingly when alt_splice=0, but when activated, the run behaved normally... I ran gff3_merge and fasta_merge to obtain proteins and transcripts, and found that, as predicted, the resulting fasta files contained more proteins than the initial run.

I ran IPR scan and now am trying to update my GFF3 file to obtain the "maker standard" dataset... what I am noticing is a sudden complaint by the ipr_update_gff script about use of an uninitialized value.
This appears to have happened to others:
https://groups.google.com/d/msg/maker-devel/dM4WvyghYks/BboRZQLmEF8J

And indeed, I can verify that the first protein listed in my fasta file is only listed as an "augustus_masked" match/match_part in the original GFF3 file.

If I understand the ipr_update_gff script correctly, this transcript will be ignored because it lacks the mRNA type. Is this expected naming behavior for the alternative splicing? I would have expected the alternative splice variants to be listed as alternative mRNAs under the same parent gene? Is there some sort of misconfiguration or am I expecting incorrectly?

Hoping for any help you all can provide in diagnosing this.

Best,
Jason Gallant

From jgallant at msu.edu Fri Nov 13 13:55:47 2015
From: jgallant at msu.edu (Jason Gallant)
Date: Fri, 13 Nov 2015 20:55:47 +0000
Subject: [maker-devel] Alternative Splicing and ipr_update_gff
In-Reply-To:
References:
Message-ID:

Hi Everyone,

Doh! Very stupid error... I ran my interpro scan on the augustus masked proteins file instead of the maker masked proteins file by mistake.

Apologies!

Jason Gallant
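
A mix-up like the one in this thread (InterProScan run on a proteins file whose IDs only appear as match/match_part features) could be caught before running ipr_update_gff by checking that every protein FASTA ID belongs to an mRNA feature in the GFF3. The Python below is a minimal sketch, not part of MAKER. It assumes that the first token of each FASTA header equals the ID or Name attribute of the corresponding GFF3 feature, which may not hold for every MAKER output; the function names are hypothetical.

```python
def fasta_ids(fasta_text):
    """Return the first whitespace-delimited token of every FASTA header."""
    ids = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.append(tokens[0])
    return ids


def feature_types_by_id(gff3_text):
    """Map each ID/Name attribute value to the set of feature types using it."""
    types = {}
    for line in gff3_text.splitlines():
        if line.startswith("##FASTA"):
            break  # embedded sequence section: no more feature records
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # malformed or truncated record, skipped here
        for attr in cols[8].split(";"):
            key, sep, value = attr.partition("=")
            if sep and key in ("ID", "Name"):
                types.setdefault(value, set()).add(cols[2])
    return types


def ids_without_mrna(fasta_text, gff3_text):
    """List FASTA IDs that never appear on an mRNA feature (assumed ID/Name match)."""
    types = feature_types_by_id(gff3_text)
    return [i for i in fasta_ids(fasta_text) if "mRNA" not in types.get(i, set())]
```

If ids_without_mrna returns a non-empty list, the proteins file probably does not correspond to the gene models in the GFF3, and the InterProScan input is worth double-checking before updating the annotation.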