From carsonhh at gmail.com  Tue May  1 17:07:47 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 01 May 2012 18:07:47 -0400
Subject: [maker-devel] gff3_preds2models usage question
In-Reply-To:
Message-ID:

Sorry for the slow response. The gff3_preds2models script has been deprecated for some time now (it isn't even in the release code anymore), and the old one won't work with the new library.

I've attached a made-from-scratch drop-in replacement that you can use to do what the old script would have done. In the current release of MAKER, instead of using the gff3_preds2models script, users can just give MAKER a set of predictions in GFF3 format (pred_gff option) and set keep_preds=1 (then leave all other options blank). The predictions given will then be converted into gene models.

Thanks,
Carson

From: Walter Eckalbar
Date: Tuesday, 3 April, 2012 7:28 PM
To:
Subject: [maker-devel] gff3_preds2models usage question

Hello maker developers and users,

I am attempting to use the gff3_preds2models script, but I am running into a few issues.

Initially, I hit errors that seemed to be fixed by installing CGI and its dependencies. However, during that installation a few tests did fail. I can provide error logs if that would be helpful; however, I went on to install and attempt gff3_preds2models anyway.

What I am currently doing is running gff3_merge first, to gather the maker outputs. I am doing so with the -n option both on and off. When providing the gff3 file with the sequence I get the following error from gff3_preds2models:

Undefined subroutine &maker::auto_annotator::annotate called at /Users/Walter/Bioinformatics/Tools/maker/bin/gff3_preds2models line 97, line 992291.

This seemed to be the same error that someone else saw on these boards, but I did not see a later email resolving the issue.
I also tried giving it just the gff3 without the sequences at the bottom of the file, and then I get this error:

ERROR: There was a problem in the writing the fasta entry
Either no sequence was given, or there was an error in writing

This leads me to believe I should be using the one with the sequence, but I am not certain of that.

I see it might be possible to go from maker outputs to a chado database and then to gene->mRNA->exon gff3s, but I have not set up my machine for XML or chado yet, and it does not appear trivial.

Thanks for the help,

Walter
_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gff3_preds2models
Type: application/octet-stream
Size: 4777 bytes
Desc: not available
URL:

From weckalba at asu.edu  Tue May  1 20:33:00 2012
From: weckalba at asu.edu (Walter Eckalbar)
Date: Tue, 1 May 2012 18:33:00 -0700
Subject: [maker-devel] gff3_preds2models usage question
In-Reply-To:
References:
Message-ID:

Hi Carson,

Thanks for the response, even a late one, and thanks for the script. I'll certainly be giving that a try.

Walter

On 1 May 2012 15:07, Carson Holt wrote:
> Sorry for the slow response. The gff3_preds2models script has been
> deprecated for some time now (isn't even in the release code anymore), and
> the old one won't work with the new library.
>
> I've attached a made from scratch drop-in replacement that you can use to
> do what the old script would have done. In the current release of MAKER,
> instead of the gff3_preds2models script users can just give MAKER a set of
> predictions in GFF3 format (pred_gff option) and set keep_preds=1 (then
> leave all other options blank). The predictions given will then be
> converted into gene models.
From qwang at uwyo.edu  Thu May  3 11:23:16 2012
From: qwang at uwyo.edu (Qiurong Wang)
Date: Thu, 3 May 2012 10:23:16 -0600
Subject: [maker-devel] MAKER download problem
Message-ID: <8754B55D-C119-4A7C-9594-BEAEAD3BB939@uwyo.edu>

Hi,

I was trying to download MAKER, but I couldn't open the download page. Could you please help me figure out the problem? Thanks a lot!

Qiurong Wang
PhD candidate
Department of Botany
University of Wyoming
Department 3165, 1000 E University Ave.
Laramie, Wyoming 82071, USA
Phone: 307-766-2634
Email: qwang at uwyo.edu

From barry.moore at genetics.utah.edu  Thu May  3 12:51:48 2012
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Thu, 3 May 2012 11:51:48 -0600
Subject: [maker-devel] MAKER download problem
In-Reply-To: <8754B55D-C119-4A7C-9594-BEAEAD3BB939@uwyo.edu>
References: <8754B55D-C119-4A7C-9594-BEAEAD3BB939@uwyo.edu>
Message-ID: <6B24C00B-6C70-4A40-BA7C-E3AC0C7F5E25@genetics.utah.edu>

Hi all,

The web server hosting the MAKER licensing application and MAKER code distribution was attacked this week. The University of Utah is currently blocking access to that server from outside University IP space. We are working hard to move all of the content and web applications from that machine to a new server, and I expect to have the MAKER services restored over the weekend.

This is affecting the ability to submit new licenses for MAKER and to download the MAKER code. In addition, those with existing licenses will need to update their code links for future code updates. For those of you on campus in Utah or with campus VPN access, the server is running and available.

Updates on the status of the server and details about new links to the MAKER code will be posted to the mailing list soon.

Barry

On May 3, 2012, at 10:23 AM, Qiurong Wang wrote:

> Hi,
>
> I was trying to download MAKER, but I couldn't open the download page. Could you please help me to figure out the problem? Thanks a lot!

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543

From carsonhh at gmail.com  Thu May  3 13:00:01 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 03 May 2012 14:00:01 -0400
Subject: [maker-devel] MAKER download problem
In-Reply-To: <8754B55D-C119-4A7C-9594-BEAEAD3BB939@uwyo.edu>
Message-ID:

The server hosting the file to download is down temporarily. I'll put a copy of MAKER on a separate server and e-mail the link to you.

--Carson

From cjfields at illinois.edu  Tue May 15 10:01:31 2012
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 15 May 2012 15:01:31 +0000
Subject: [maker-devel] mail list and Trac
Message-ID: <64A6759A-9FD1-4AEE-BCEA-151B0D791ADD@illinois.edu>

Just wanted to point out, I noticed the mail list is not being indexed on Google Groups any more (nothing in May). Also, any status on Trac?

chris

From barry.moore at genetics.utah.edu  Tue May 15 11:34:29 2012
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Tue, 15 May 2012 10:34:29 -0600
Subject: [maker-devel] mail list and Trac
In-Reply-To: <64A6759A-9FD1-4AEE-BCEA-151B0D791ADD@illinois.edu>
References: <64A6759A-9FD1-4AEE-BCEA-151B0D791ADD@illinois.edu>
Message-ID:

Thanks Chris,

I'll check on the Google Groups indexing. The MAKER Trac instance was on a server that we had to shut down a couple of weeks ago, and I had made moving it a lower priority since I didn't think it was getting much use. I'll get it moved over and bump up the priority on that move, since I now know someone is looking at it.

B

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543

From anastasia.gioti at scilifelab.se  Thu May 17 06:27:04 2012
From: anastasia.gioti at scilifelab.se (Anastasia Gioti)
Date: Thu, 17 May 2012 13:27:04 +0200
Subject: [maker-devel] Use pass-through system to add missing genes
In-Reply-To: <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu>
References: <4FE7CD5B-FC1C-43E7-AC41-A05823348B99@scilifelab.se> <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu>
Message-ID: <4D9922AF-A917-4747-9B7C-CFE9F142D51C@scilifelab.se>

Hi Barry,

Thanks for your detailed instructions. You understood correctly that I have already included the proteins of the closely related species in my protein evidence dataset, but still did not get the genes.
I have now run BLASTP with the missing 949 proteins from this species against my nonoverlaping_abinits.fasta proteins and have found 618 good hits, which I guess I can promote to models using routine no. 2 of your last email and Carson's script gff3_select. I have also looked at the rest of the proteins (331), for which there was no model in nonoverlaping_abinits.fasta. I will try to describe two examples I looked at in Apollo:

1) The ab initio models predicted a ~7.5 kb gene covering 3 genes (as predicted in the closely related species). Blastx+protein2genome similarities were reported for two of these genes, but not for the third (the one in the middle). MAKER finally decided to call two genes, respecting the blastx+protein2genome evidence, but the third was lost. I have previously reported here that MAKER tends to fuse multi-exonic genes, and others have reported that too; I remember you proposed changing a parameter to alter this. I will keep that in mind for the final strategy I am trying to decide on (for the moment I have not rerun MAKER). In this case, ab initio models do not exist for the gene (in the sense that the existing models overlap many genes), and the similarity to the protein of the closely related species was not judged sufficient, although a TBLASTN alignment for this area looks fine to me.

2) Only the 3' end of the gene was called by MAKER, despite blastx+protein2genome evidence from the closely related species for the entire region. Ab initio models existed as 2 separate genes, one for the 3' end region (finally retained by MAKER in a consensus decision, I guess) and one for the 5' region, but here not all predictors called an ORF, and in the end nothing was called in this region. This case is a misannotation rather than a missing gene, but one that misses a very important part of the gene.

I hope my descriptions are clear; otherwise I can provide you with the gff file of these 2 examples to look at yourself.
I am not very clear about what to do with these 331 cases (nor do I know how to examine them, other than viewing random examples in Apollo). I feel that a second MAKER run would probably be the solution, this time providing as pred_gff the result of a blast against the 331. But the existing annotations would then still have to be updated somehow, as the new predictions conflict with them (see example 2). I am a bit confused. To recap: what would you suggest for the 331 still-missing proteins, in terms of assessing their profiles in a rather automatic way and including them in my annotations without going deep into manual gene curation?

Many thanks,
Anastasia

>
> Let me just restate what you've said so that I can be sure that I am
> correct about what you've already done. You have run Maker with
> SNAP, Genemark and Augustus using EST from a closely related species
> (passed to altest) and protein evidence from other fungi. You are
> missing about 1,000 genes compared to the species that provided the
> EST alignments. You say there is good evidence that these genes
> exist from the alignments and I assume by this that you mean the
> EST/protein alignments that Maker produced.
>
> 1) Is the closely related fungus annotated, and if so have you
> included its proteins in the evidence set that you provided to
> Maker? If you haven't provided these proteins as evidence to Maker
> then you should do this. You can re-run Maker, passing your original
> models back through like this:
>
> #-----Re-annotation Using MAKER Derived GFF3
> genome_gff=original_maker_annotations.gff3
> est_pass=1
> altest_pass=1
> protein_pass=1
> rm_pass=1
> model_pass=1
> pred_pass=1
> other_pass=1
>
> #-----Protein Homology Evidence (for best results provide a file for at least one)
> protein=proteins_from_closely_related.fasta
> ## OR it sounds like you've already aligned these with exonerate?
> protein_gff=proteins_from_closely_related_already_aligned.gff
>
> 2) If you've already included those closely related species proteins
> but still didn't get the 1,000 genes, then take your
> nonoverlaping_abinits.fasta and blast them directly against your
> closely related proteins. Presumably they don't hit too well,
> because if they did they should have been promoted to predictions by
> Maker the first time, but here you can decide yourself what
> thresholds to allow to keep the abinit predictions that hit the
> closely related species proteins. If you filter your blast hits the
> way you want and keep the names of the abinit predictions that pass
> your filter, then the script Carson attached will generate
> an abinit prediction GFF file with only the predictions you
> selected. You can then pass those predictions back to Maker and
> force it to keep them, and Maker will turn them from predictions
> (match/match_part) into gene models.
>
> #-----Re-annotation Using MAKER Derived GFF3
> genome_gff=original_maker_annotations.gff3
> est_pass=1
> altest_pass=1
> protein_pass=1
> rm_pass=1
> model_pass=1
> pred_pass=0
> other_pass=1
>
> #-----Gene Prediction
> snaphmm=
> gmhmm=
> augustus_species=
> fgenesh_par_file=
> pred_gff=ab_init_predictions_rescued_by_blast.gff
>
> keep_preds=1
>
> Barry
>
>>> Thanks,
>>> Carson
>>>
>>> From: Anastasia Gioti
>>> Date: Wed, 25 Apr 2012 11:09:36 +0200
>>> To:
>>> Subject: [maker-devel] Use pass-through system to add missing genes
>>>
>>> Hi,
>>> I have a set of predicted proteins from the genome of a fungus
>>> annotated by MAKER using EST data from a closely related species
>>> and 3 ab initio predictors (snap iteratively trained 3 times,
>>> genemark trained directly on the assembly and augustus with a
>>> model from a less closely related species), along with a set of
>>> fungal proteins.
>>> I am missing ~1000 proteins when I compare to
>>> the species I used EST data from, and there is good evidence from
>>> alignments that these genes exist. The question is how to proceed
>>> from Blast hits to actual gene models here. The idea would be to
>>> add these genes to the existing dataset, rather than reannotate
>>> the genome. I believe that reannotating it without any further
>>> evidence such as RNA-seq from the species itself would not change
>>> much, and I'd rather stick with actual predictions that I trust and
>>> have used in subsequent analyses. I can accept annotating the 1000
>>> genes in a less stringent and reliable way than MAKER; I just
>>> want to add them so that the difference in gene count gets
>>> corrected.
>>> I was reading the MAKER 2 paper and I was wondering if I can use
>>> the legacy annotations scheme to do it, by providing GFF3 of the
>>> alignments between the two species in the regions where genes were
>>> missed, but as I said, I would not like to reannotate the whole
>>> genome, and running MAKER2 might cause slight changes that I'd
>>> like to avoid. Is this possible? First, is it possible to provide
>>> a GFF3 file of specific locations and not the entire genome
>>> alignment? (I guess so.) Second, how can I tag the existing
>>> annotations as 'not to be changed' or, alternatively, tag the new
>>> models only? How should I run MAKER2, with which predictors on and
>>> which off?
>>> Thanks,
>>> Anastasia

Anastasia (Natassa) Gioti
Post-Doc Researcher
Evolutionary Biology Department
Uppsala University - Science for Life lab, Karolinska Institute Stockholm
anastasia.gioti at ebc.uu.se
anastasia.gioti at scilifelab.se
http://www.ebc.uu.se/Research/IEG/evbiol/people/pages/Gioti_Anastasia/

From yogeshp08 at gmail.com  Tue May 15 11:07:57 2012
From: yogeshp08 at gmail.com (Yogesh)
Date: Tue, 15 May 2012 11:07:57 -0500
Subject: [maker-devel] tblastn Cleanup?
Message-ID: <4478F0B20ED84A85B3C4FE4154F8FAD1@gmail.com>

Hello,

I have a few tblastn alignments with a lot of low-quality hits. I have to clean that up. Can you please suggest how the Maker pipeline does it? Also, can I run it directly on my data without having to go through the whole pipeline?

Thanks,
-Yogesh

From carsonhh at gmail.com  Fri May 18 09:22:50 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 18 May 2012 10:22:50 -0400
Subject: [maker-devel] tblastn Cleanup?
In-Reply-To: <4478F0B20ED84A85B3C4FE4154F8FAD1@gmail.com>
Message-ID:

There are several things.

I set several filtering options directly on the BLAST command line. These are things like maximum intron length, an e-value filter, and simple repeat filtering (called dust filter in NCBI blast and seg filter in WUBLAST). I also run RepeatMasker over the genome first. This allows simple and complex repeats to be removed before running BLAST (otherwise you get many false alignments).

Lastly, I filter the results based on percent coverage of the hit relative to the original database sequence, and on percent identity. I think you can set percent identity as a flag in BLAST, but the percent coverage filter is calculated by MAKER, so to do this outside of MAKER you would have to write your own filtering script that compares the length of the alignment to the length of the sequence in the database.

I also have an HSP depth overlap filter. This removes weird low-complexity hits that escape repeat masking. They show up as multiple HSPs overlapping multiple times in the same region (usually very high numbers, like 90 HSPs all 100 bp long in the same region). I calculate the number of base pairs in the alignment on the hit, then divide by the number of base pairs in the query alignment. If it's greater than 3, I throw the hit out.

Thanks,
Carson

From: Yogesh
Date: Tuesday, 15 May, 2012 12:07 PM
To:
Subject: [maker-devel] tblastn Cleanup?

Hello,

I have a few tblastn alignments with a lot of low quality hits. I have to clean that up. Can you please suggest how Maker pipeline does it? Also can I run it directly on my data without having to go through the whole pipeline?
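
For anyone wanting to apply the post-BLAST filters Carson describes in this thread outside of MAKER, the following is a minimal Python sketch of one way it could be done. It is not MAKER's code. The names HSP, union_length and keep_hit are hypothetical, the default thresholds are illustrative placeholders, and the depth calculation (total aligned genome bases divided by distinct genome bases covered) is only one reading of the description above. Parsing of the BLAST output into HSP records is left out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class HSP:
    """One BLAST HSP; coordinates are 1-based and inclusive (hypothetical record)."""
    genome_start: int     # position on the genomic sequence
    genome_end: int
    db_start: int         # position on the database (protein/EST) sequence
    db_end: int
    pct_identity: float   # percent identity reported for this HSP


def hsp_len(start: int, end: int) -> int:
    """Length of an inclusive interval, regardless of strand orientation."""
    return abs(end - start) + 1


def union_length(intervals: List[Tuple[int, int]]) -> int:
    """Count the positions covered by at least one interval."""
    covered = 0
    current_end = 0  # last position already counted (coordinates start at 1)
    for start, end in sorted((min(a, b), max(a, b)) for a, b in intervals):
        if end <= current_end:
            continue  # interval lies entirely inside what was already counted
        covered += end - max(start, current_end + 1) + 1
        current_end = end
    return covered


def keep_hit(hsps: List[HSP], db_length: int,
             min_pct_cov: float = 50.0,   # illustrative threshold (assumption)
             min_pct_id: float = 40.0,    # illustrative threshold (assumption)
             max_depth: float = 3.0) -> bool:
    """Return True if a hit (all HSPs for one database sequence) passes the filters."""
    if not hsps or db_length <= 0:
        return False

    # Percent coverage: distinct database positions aligned / database length.
    db_covered = union_length([(h.db_start, h.db_end) for h in hsps])
    pct_cov = 100.0 * db_covered / db_length

    # Percent identity, weighted by the aligned length of each HSP.
    db_total = sum(hsp_len(h.db_start, h.db_end) for h in hsps)
    pct_id = sum(h.pct_identity * hsp_len(h.db_start, h.db_end)
                 for h in hsps) / db_total

    # HSP depth: how many times, on average, each covered genome base is aligned.
    genome_total = sum(hsp_len(h.genome_start, h.genome_end) for h in hsps)
    genome_covered = union_length([(h.genome_start, h.genome_end) for h in hsps])
    depth = genome_total / genome_covered

    return pct_cov >= min_pct_cov and pct_id >= min_pct_id and depth <= max_depth
```

A hit made of many HSPs stacked on the same 100 bp of genome gets a depth well above 3 and is rejected, while a hit whose HSPs tile the database sequence without overlapping has a depth of 1 and passes if its coverage and identity are high enough.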

From smg283 at gmail.com  Tue May 22 01:42:51 2012
From: smg283 at gmail.com (Scott Geib)
Date: Mon, 21 May 2012 20:42:51 -1000
Subject: [maker-devel] can't call method strand on an undefined value ERROR: Failed while flattening protein clusters
Message-ID:

Hi,

Using maker 2.24, I am getting the following error (see below) in the protein2genome widget. I also get the same error with est2genome. This happens with my own data (testing on a single scaffold), but not with the test data supplied with maker (the dpp files in the data folder).

Scott

Widget::exonerate::protein2genome:
/data0/opt/AlignmentSoftware/exonerate/exonerate-2.2.0-x86_64/bin/exonerate -q /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaffold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/UniRef90_UPI000194D3FC.for.1588546-1589203.9.fasta -t /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaffold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/scaffold00001.1588546-1589203.9.fasta -Q protein -T dna -m protein2genome --softmasktarget --percent 20 --showcigar > /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaffold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/scaffold00001.1588546-1589203.UniRef90_UPI000194D3FC.p_exonerate.9
#-------------------------------#
cleaning blastx...
in cluster::shadow_cluster...
...finished clustering.
cleaning clusters....
total clusters:2 now processing 0
...processing 0 of 23
...processing 1 of 23
...processing 2 of 23
...processing 3 of 23
...processing 4 of 23
...processing 5 of 23
...processing 6 of 23
...processing 7 of 23
...processing 8 of 23
...processing 9 of 23
...processing 10 of 23
...processing 11 of 23
...processing 12 of 23
...processing 13 of 23
...processing 14 of 23
...processing 15 of 23
...processing 16 of 23
...processing 17 of 23
...processing 18 of 23
...processing 19 of 23
...processing 20 of 23
...processing 21 of 23
total clusters:2 now processing 0
in cluster::shadow_cluster...
...finished clustering.
cleaning clusters....
total clusters:2 now processing 0
...processing 0 of 20
...processing 1 of 20
...processing 2 of 20
...processing 3 of 20
...processing 4 of 20
...processing 5 of 20
...processing 6 of 20
...processing 7 of 20
...processing 8 of 20
...processing 9 of 20
...processing 10 of 20
...processing 11 of 20
...processing 12 of 20
...processing 13 of 20
...processing 14 of 20
...processing 15 of 20
...processing 16 of 20
...processing 17 of 20
...processing 18 of 20
total clusters:2 now processing 0
Can't call method "strand" on an undefined value
ERROR: Failed while flattening protein clusters
ERROR: Chunk failed at level:11, tier_type:2
FAILED CONTIG:scaffold00001

From anastasia.gioti at scilifelab.se  Tue May 22 08:14:17 2012
From: anastasia.gioti at scilifelab.se (Anastasia Gioti)
Date: Tue, 22 May 2012 15:14:17 +0200
Subject: [maker-devel] Use pass-through system to add missing genes
In-Reply-To: <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu>
References: <4FE7CD5B-FC1C-43E7-AC41-A05823348B99@scilifelab.se> <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu>
Message-ID: <19E36E3B-6A82-49D5-B0AC-5E521F3E8999@scilifelab.se>

Hi again,

I sent an email a few days ago about this thread, and I am not sure whether you have received it or have not yet had time to look at it. In any case, that email dealt with the fact that some proteins were not retrieved in the ab initio models, and how to deal with it. What I would like to ask for here is a few confirmations on how to rerun MAKER for the proteins that were retrieved in the ab initio models. I have looked at the Blast results and have done a series of check-ups, so now I am ready to run MAKER again with a list of models that I want to retain. Regarding the following parameters:

1. Do I set genome= to nothing here, i.e. comment it out? This is at the beginning of the control file:

#-----Genome (Required for De-Novo Annotation)
genome= #genome sequence file in fasta format
organism_type= #eukaryotic or prokaryotic. Default is eukaryotic

>
> #-----Re-annotation Using MAKER Derived GFF3
> genome_gff=original_maker_annotations.gff3
> est_pass=1
> altest_pass=1
> protein_pass=1
> rm_pass=1
> model_pass=1
> pred_pass=0
> other_pass=1
>
> #-----Gene Prediction

2. Do I provide the snap etc. models again? I am not sure, because I thought MAKER would not run ab initio predictors this time (this is why I would also comment out the genome file above, as this is not a de novo annotation). But if it will, I will provide the previous models I used, except for snap, for which I will generate a new model from the gff3 file of the last run (according to the snap documentation).
Am I correct?

> snaphmm=
> gmhmm=
> augustus_species=
> fgenesh_par_file=
> pred_gff=ab_init_predictions_rescued_by_blast.gff
>
> keep_preds=1

Likewise, what do I do with repeat masking etc.?

Thanks in advance,
Anastasia

Anastasia (Natassa) Gioti
Post-Doc Researcher
Evolutionary Biology Department
Uppsala University - Science for Life lab, Karolinska Institute Stockholm
anastasia.gioti at ebc.uu.se
anastasia.gioti at scilifelab.se
http://www.ebc.uu.se/Research/IEG/evbiol/people/pages/Gioti_Anastasia/

From anastasia.gioti at scilifelab.se  Wed May 23 04:07:12 2012
From: anastasia.gioti at scilifelab.se (Anastasia Gioti)
Date: Wed, 23 May 2012 11:07:12 +0200
Subject: [maker-devel] Use pass-through system to add missing genes
In-Reply-To: <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu>
References: <4FE7CD5B-FC1C-43E7-AC41-A05823348B99@scilifelab.se> <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu>
Message-ID: <1B0770A0-6D14-4336-BC3A-DC24619BC3FE@scilifelab.se>

Hi, and sorry for the multiple postings.

I have a list of models rescued from the nonoverlaping_abinits.fasta files (against which I blasted my missing proteins from the closely related species, then filtered out the dubious hits) and a maker gff3 file, but Carson's script gff3_select won't work. The reason is that these ab initio models were not promoted into the maker gff3 file, so they are not there. I am referring to the gff3 file generated by the gff3_merge script. Am I missing something?

Thank you,
Anastasia

> >>> If you know which ab initio predictions you want to add (i.e. the ab initio promoting scenario I described), you can provide those predictions using the pred_gff option and then set keep_preds=1, and they will be maintained even without evidence. Attached is a script that would make selecting those easier.
It takes the MAKER-generated GFF3 and a list of predictions to keep (one name per line). These might be the results of a BLAST analysis, for example. It will then return the GFF3 entries for just the selected models.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From thomas.hackl at uni-wuerzburg.de Wed May 23 07:01:55 2012
From: thomas.hackl at uni-wuerzburg.de (Thomas Hackl)
Date: Wed, 23 May 2012 14:01:55 +0200
Subject: [maker-devel] missing est2genome annotation
Message-ID: <4FBCD1B3.8000102@uni-wuerzburg.de>

Hi,

I used MAKER to annotate genomic contigs and, among other things, provided transcripts from the transcriptome as EST evidence. BLAST and exonerate work fine and produce valid alignments; the alignment files exist in theVoid and look very good. Unfortunately, neither the evidence_0.gff nor the final .gff carries the corresponding feature annotations.

Any ideas why?

Regards
Thomas

--
Thomas Hackl
Julius-Maximilians-Universität
Department of Bioinformatics
97074 Würzburg, Germany
Phone: +49 931 - 31 86883
Mail: thomas.hackl at uni-wuerzburg.de

From thomas.hackl at uni-wuerzburg.de Wed May 23 12:14:27 2012
From: thomas.hackl at uni-wuerzburg.de (Thomas Hackl)
Date: Wed, 23 May 2012 19:14:27 +0200
Subject: [maker-devel] missing est2genome annotation
In-Reply-To: <4FBCD1B3.8000102@uni-wuerzburg.de>
References: <4FBCD1B3.8000102@uni-wuerzburg.de>
Message-ID: <4FBD1AF3.8020904@uni-wuerzburg.de>

Hi again,

I did some digging in the source code and found the following line, which discards my exonerate alignments. I suspect it does so for a very good reason, so it would help me a lot if someone could explain what is going on there.

/lib/GI.pm l.1473
next if $e->pAh< $pcov;

Regards
Thomas

On 23.05.2012 14:01, Thomas Hackl wrote:
> Hi,
>
> I used maker to annotate genomic contigs and among other stuff
> provided transcripts from the transcriptome as est evidence.
Blast and
> exonerate work fine and produce valid alignments, the alignment files
> exist in theVoid and look very good. Unfortunately neither the
> evidence_0.gff nor the final .gff carry the corresponding feature
> annotations.
>
> Any ideas why?
>
> Regards
> Thomas
>

--
Thomas Hackl
Julius-Maximilians-Universität
Department of Bioinformatics
97074 Würzburg, Germany
Phone: +49 931 - 31 86883
Mail: thomas.hackl at uni-wuerzburg.de

From gowthaman.ramasamy at seattlebiomed.org Thu May 24 14:30:09 2012
From: gowthaman.ramasamy at seattlebiomed.org (Gowthaman Ramasamy)
Date: Thu, 24 May 2012 12:30:09 -0700
Subject: [maker-devel] Merging gene predictions....
Message-ID: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D6@mail02.sbri.org>

Hi Carson and others,

I am wondering if I can use Maker to merge gene predictions from three GFF files. One of the algorithms is 'augustus', which of course I can run inside Maker. The other two are not part of Maker. Is it possible to pass only the GFFs to Maker and ask it to merge the annotations (where they overlap) and keep only annotations that are predicted in 2 out of 3 GFFs?

I prefer this approach because we need to run a BLAST-based validation step on the predicted features before we even try to merge them. That's another reason why we prefer not to run augustus inside Maker.

Thanks,
Gowthaman

From gowthaman.ramasamy at seattlebiomed.org Thu May 24 16:02:55 2012
From: gowthaman.ramasamy at seattlebiomed.org (Gowthaman Ramasamy)
Date: Thu, 24 May 2012 14:02:55 -0700
Subject: [maker-devel] Merging gene predictions....
In-Reply-To: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D6@mail02.sbri.org>
References: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D6@mail02.sbri.org>
Message-ID: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D8@mail02.sbri.org>

Can I use the following approach, Carson?
This is your reply to one of the earlier questions:

I've attached a made-from-scratch drop-in replacement that you can use to do what the old script would have done. In the current release of MAKER, instead of using the gff3_preds2models script, users can just give MAKER a set of predictions in GFF3 format (pred_gff option) and set keep_preds=1 (then leave all other options blank). The predictions given will then be converted into gene models.

Thanks,
Carson

________________________________________
From: maker-devel-bounces at yandell-lab.org [maker-devel-bounces at yandell-lab.org] On Behalf Of Gowthaman Ramasamy [gowthaman.ramasamy at seattlebiomed.org]
Sent: Thursday, May 24, 2012 12:30 PM
To: Carson Holt; maker-devel at yandell-lab.org
Subject: [maker-devel] Merging gene predictions....

Hi Carson and others,

I am wondering if I can use Maker to merge gene predictions from three GFF files. One of the algorithms is 'augustus', which of course I can run inside Maker. The other two are not part of Maker. Is it possible to pass only the GFFs to Maker and ask it to merge the annotations (where they overlap) and keep only annotations that are predicted in 2 out of 3 GFFs?

I prefer this approach because we need to run a BLAST-based validation step on the predicted features before we even try to merge them. That's another reason why we prefer not to run augustus inside Maker.

Thanks,
Gowthaman

_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From gowthaman.ramasamy at seattlebiomed.org Thu May 24 16:21:30 2012
From: gowthaman.ramasamy at seattlebiomed.org (Gowthaman Ramasamy)
Date: Thu, 24 May 2012 14:21:30 -0700
Subject: [maker-devel] Merging gene predictions....
In-Reply-To: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D8@mail02.sbri.org>
References: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D6@mail02.sbri.org>, <89080953C3D300419AACB6E63A7EEFBA5C8409F8D8@mail02.sbri.org>
Message-ID: <89080953C3D300419AACB6E63A7EEFBA5C8409F8DA@mail02.sbri.org>

Hi,

I am trying to install MAKER on CentOS. I was able to install all the Perl dependencies and external programs. perl Build.PL and ./Build install ran without errors or warnings. I did not enable MPI, though.

But when I start MAKER it returns "segmentation fault". I have no clue what is going wrong, or where to check for error logs.

Any help would be appreciated,

Thanks,
gowthaman
_________

From gowthaman.ramasamy at seattlebiomed.org Thu May 24 16:22:16 2012
From: gowthaman.ramasamy at seattlebiomed.org (Gowthaman Ramasamy)
Date: Thu, 24 May 2012 14:22:16 -0700
Subject: [maker-devel] MAKER installation problem
Message-ID: <89080953C3D300419AACB6E63A7EEFBA5C8409F8DB@mail02.sbri.org>

Hi,

I am trying to install MAKER on CentOS. I was able to install all the Perl dependencies and external programs. perl Build.PL and ./Build install ran without errors or warnings. I did not enable MPI, though.

But when I start MAKER it returns "segmentation fault". I have no clue what is going wrong, or where to check for error logs.

Any help would be appreciated,

Thanks,
gowthaman
_________
________________________________________

From bob_freeman at hms.harvard.edu Fri May 25 11:23:22 2012
From: bob_freeman at hms.harvard.edu (Bob Freeman)
Date: Fri, 25 May 2012 12:23:22 -0400
Subject: [maker-devel] Alternate translation table?
Message-ID: <454CA235-0DB6-451F-97C4-83D32E2E805A@hms.harvard.edu>

Hello all!

Unusual question here: I am running MAKER on a ciliate that uses a non-standard translation table for its translation products. I haven't found an option in the control files that I can change for the translation of the predicted transcripts. How or where can I go about this?
Tx,
Bob

-----------------------------------------------------
Bob Freeman, Ph.D.
Acorn Worm Informatics, Kirschner lab
Dept of Systems Biology, Alpert 524
Harvard Medical School
200 Longwood Avenue
Boston, MA 02115
617/432.2294, vox

"Sorry I'm late. Oh, God, that sounded insincere. I'm late."
-- Karen Walker, from Will and Grace
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Mon May 28 07:43:34 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 28 May 2012 08:43:34 -0400
Subject: [maker-devel] Alternate translation table?
In-Reply-To: <454CA235-0DB6-451F-97C4-83D32E2E805A@hms.harvard.edu>
Message-ID:

The alternate translation table is not currently an option. It's one of those things that needs to be implemented, but has not been yet. It's also not supported by many of the eukaryotic gene predictors MAKER uses.

I could probably get something implemented for you to test in two to three weeks though (there are a lot of places where the translation table comes into play). Let me know.

--Carson

From: Bob Freeman
Date: Friday, 25 May, 2012 12:23 PM
To:
Subject: [maker-devel] Alternate translation table?

Hello all!

Unusual question here: I am running MAKER on a ciliate that uses a non-standard translation table for its translation products. I haven't found an option in the control files that I can change for the translation of the predicted transcripts. How or where can I go about this?

Tx,
Bob

-----------------------------------------------------
Bob Freeman, Ph.D.
Acorn Worm Informatics, Kirschner lab
Dept of Systems Biology, Alpert 524
Harvard Medical School
200 Longwood Avenue
Boston, MA 02115
617/432.2294, vox

"Sorry I'm late. Oh, God, that sounded insincere. I'm late."
-- Karen Walker, from Will and Grace
_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Mon May 28 08:38:25 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 28 May 2012 09:38:25 -0400
Subject: [maker-devel] missing est2genome annotation
In-Reply-To: <4FBD1AF3.8020904@uni-wuerzburg.de>
Message-ID:

Sorry for the slow reply. I'm just getting back after traveling.

That line is a percent coverage filter: alignments whose coverage falls below the threshold are skipped. You can set the percent coverage threshold in the maker_bopts.ctl file. Partial high-scoring alignments can be common. If you filter only by expect score, you would be surprised how ugly and confusing the alignments become.

Thanks,
Carson

On 12-05-23 1:14 PM, "Thomas Hackl" wrote:

>Hi again,
>
>I did some source code digging and caught the following line burying my
>exonerate alignments. I suspect it does so for a very good reason,
>therefore it would help me a lot if someone could explain to me, what is
>going on there.
>
>
>/lib/GI.pm l.1473
>next if $e->pAh< $pcov;
>
>
>Regards
>Thomas
>
>
>On 23.05.2012 14:01, Thomas Hackl wrote:
>> Hi,
>>
>> I used maker to annotate genomic contigs and among other stuff
>> provided transcripts from the transcriptome as est evidence. Blast and
>> exonerate work fine and produce valid alignments, the alignment files
>> exist in theVoid and look very good. Unfortunately neither the
>> evidence_0.gff nor the final .gff carry the corresponding feature
>> annotations.
>>
>> Any ideas why?
>>
>> Regards
>> Thomas
>>
>
>
>--
>Thomas Hackl
>Julius-Maximilians-Universität
>Department of Bioinformatics
>97074 Würzburg, Germany
>Phone: +49 931 - 31 86883
>Mail: thomas.hackl at uni-wuerzburg.de
>
>
>_______________________________________________
>maker-devel mailing list
>maker-devel at box290.bluehost.com
>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From seoanezonjic at hotmail.com Tue May 22 07:55:12 2012
From: seoanezonjic at hotmail.com (p sz)
Date: Tue, 22 May 2012 12:55:12 +0000
Subject: [maker-devel] ipr_update_gff ERROR
Message-ID:

First, thanks for helping me with the previous error that I submitted. I'm still working on the same project and I get a new error. I ran InterProScan with this command line:

iprscan_wrap -i parsed_input.all.maker.proteins.fasta -email seoanezonjic at hotmail.com -format raw

parsed_input.all.maker.proteins.fasta was generated with the fasta_merge tool. I used the output (attached to this email) and a GFF file (generated by a normal run of MAKER, also attached) with the ipr_update_gff script in this way:

ipr_update_gff BAC12_Clone_Pt314B2_Lib_Pt_7Ba__organism_Pinus_taeda__0.gff sc_interpro.sh.o109053

And I get this error:

Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 1.
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 1.
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 2.
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 2.
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 3.
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 3. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 8. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 8. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 9. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 9. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 10. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 10. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 11. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 11. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 12. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 12. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 13. 
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 13. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 14. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 14. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 15. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 15. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 18. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 18. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 18. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 18. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 19. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 19. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 19. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 19. 
Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 48. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 48. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 48. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 48. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 49. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 49. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 49. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 49. The gff file seems updated but i don't know if it works fine or is corrupt Thanks in advance -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: sc_interpro.sh.o109053 Type: application/octet-stream Size: 2542 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: BAC12_Clone_Pt314B2_Lib_Pt_7Ba__organism_Pinus_taeda__0.gff
Type: application/octet-stream
Size: 220243 bytes
Desc: not available
URL:

From larriba.ed at gmail.com Fri May 25 11:01:41 2012
From: larriba.ed at gmail.com (Eduardo Larriba)
Date: Fri, 25 May 2012 18:01:41 +0200
Subject: [maker-devel] Consensus gene models
Message-ID:

Hi Carson and everyone,

I am working on the structural annotation of a filamentous fungus for which there is little EST or protein evidence. To generate consensus gene models from this limited evidence I used MAKER. For this I created the files GeneMark prediction-is and SNAP. I ran MAKER using the ESTs of my organism (85), along with 5700 ESTs from the closest related organism. I made predictions with Augustus, SNAP and GeneMark, using the training files for my organism, within the MAKER pipeline. Everything works fine.

My problem is that when I collect the consensus sequences for all my contigs with the fasta_merge script (included in MAKER), I get a different list for each predictor, and the same happens when I try to get the GFF for all of them.

Could you tell me how I can get a MAKER consensus list of genes? Or do I have to do it manually? Is it possible for MAKER to evaluate the accuracy of each prediction and confirm it, so that I just get one list from the different predictions?

Thank you very much.

--
Eduardo Larriba Tornel
Universidad de Alicante.
Lab. Fitopatología
Dept. Ciencias del Mar y Biología Aplicada
Pabellón 13. San Vicente del Raspeig
Tel. 96 590 3400 ext 3280
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Mon May 28 09:14:22 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 28 May 2012 10:14:22 -0400
Subject: [maker-devel] Consensus gene models
In-Reply-To:
Message-ID:

The consensus list is in the maker.proteins.fasta and maker.transcripts.fasta files. The predictor-specific lists are just for reference purposes (in case you want to see what the other predictors produced on their own, i.e.
without MAKER's intervention). The non-overlapping.fasta file in the same directory will contain consensus entries for models that were not supported by any evidence and don't overlap any gene models in the maker.transcripts.fasta file (think of these as the maybe genes and those in maker.transcripts.fasta as the very likely genes). You can set keep_preds=1 if you just want MAKER to keep everything, with or without support, and just produce a consensus (probably OK on a fungus, but I wouldn't recommend it on other eukaryotes because false positive rates will be very high).

Thanks,
Carson

From: Eduardo Larriba
Date: Friday, 25 May, 2012 12:01 PM
To:
Subject: [maker-devel] Consensus gene models

Hi Carson and everyone,

I am working on the structural annotation of a filamentous fungus for which there is little EST or protein evidence. To generate consensus gene models from this limited evidence I used MAKER. For this I created the files GeneMark prediction-is and SNAP. I ran MAKER using the ESTs of my organism (85), along with 5700 ESTs from the closest related organism. I made predictions with Augustus, SNAP and GeneMark, using the training files for my organism, within the MAKER pipeline. Everything works fine.

My problem is that when I collect the consensus sequences for all my contigs with the fasta_merge script (included in MAKER), I get a different list for each predictor, and the same happens when I try to get the GFF for all of them.

Could you tell me how I can get a MAKER consensus list of genes? Or do I have to do it manually? Is it possible for MAKER to evaluate the accuracy of each prediction and confirm it, so that I just get one list from the different predictions?

Thank you very much.

--
Eduardo Larriba Tornel
Universidad de Alicante.
Lab. Fitopatología
Dept. Ciencias del Mar y Biología Aplicada
Pabellón 13. San Vicente del Raspeig
Tel.
96 590 3400 ext 3280
_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Mon May 28 09:18:26 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 28 May 2012 10:18:26 -0400
Subject: [maker-devel] ipr_update_gff ERROR
In-Reply-To:
Message-ID:

This error would happen if some results exist in the iprscan output but don't match gene entries in the GFF3 file. If I could see the original files, I could tell you which ones. This can happen, for example, if you combine results from the non-overlapping.fasta files with the maker.proteins.fasta files. Models in the non-overlapping.fasta file are not genes in the GFF3 (they are match/match_part entries), so errors happen.

Thanks,
Carson

From: p sz
Date: Tuesday, 22 May, 2012 8:55 AM
To:
Subject: [maker-devel] ipr_update_gff ERROR

First, thanks for helping me with the previous error that I submitted. I'm still working on the same project and I get a new error. I ran InterProScan with this command line:

iprscan_wrap -i parsed_input.all.maker.proteins.fasta -email seoanezonjic at hotmail.com -format raw

parsed_input.all.maker.proteins.fasta was generated with the fasta_merge tool. I used the output (attached to this email) and a GFF file (generated by a normal run of MAKER, also attached) with the ipr_update_gff script in this way:

ipr_update_gff BAC12_Clone_Pt314B2_Lib_Pt_7Ba__organism_Pinus_taeda__0.gff sc_interpro.sh.o109053

And I get this error:

Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 1.
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 1.
The GFF file seems to be updated, but I don't know whether it is fine or corrupt.

Thanks in advance
_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Tue May 29 07:37:55 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 29 May 2012 08:37:55 -0400
Subject: [maker-devel] can't call method strand on an undefined value ERROR: Failed while flattening protein clusters
In-Reply-To:
Message-ID:

Use this command to check out the latest unreleased test version, and let me know if you still get the error.

Command --> svn co svn://malachite.genetics.utah.edu/maker/trunk maker
User: yandell_guest
Password: y at ndell_Gu3st

Thanks,
Carson

From: Scott Geib
Date: Tuesday, 22 May, 2012 2:42 AM
To:
Subject: [maker-devel] can't call method strand on an undefined value ERROR: Failed while flattening protein clusters

Hi,

Using MAKER 2.24, I am getting the following error (see below) in the protein2genome widget. I also get the same error with est2genome. This happens with my own data (testing on a single scaffold), but not with the test data supplied with MAKER (dpp files in data folder).
Scott

Widget::exonerate::protein2genome:
/data0/opt/AlignmentSoftware/exonerate/exonerate-2.2.0-x86_64/bin/exonerate -q /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaffold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/UniRef90_UPI000194D3FC.for.1588546-1589203.9.fasta -t /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaffold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/scaffold00001.1588546-1589203.9.fasta -Q protein -T dna -m protein2genome --softmasktarget --percent 20 --showcigar > /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaffold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/scaffold00001.1588546-1589203.UniRef90_UPI000194D3FC.p_exonerate.9
#-------------------------------#
cleaning blastx...
in cluster::shadow_cluster...
...finished clustering.
cleaning clusters....
total clusters:2 now processing 0
...processing 0 of 23
...processing 1 of 23
...processing 2 of 23
...processing 3 of 23
...processing 4 of 23
...processing 5 of 23
...processing 6 of 23
...processing 7 of 23
...processing 8 of 23
...processing 9 of 23
...processing 10 of 23
...processing 11 of 23
...processing 12 of 23
...processing 13 of 23
...processing 14 of 23
...processing 15 of 23
...processing 16 of 23
...processing 17 of 23
...processing 18 of 23
...processing 19 of 23
...processing 20 of 23
...processing 21 of 23
total clusters:2 now processing 0
in cluster::shadow_cluster...
...finished clustering.
cleaning clusters....
total clusters:2 now processing 0
...processing 0 of 20
...processing 1 of 20
...processing 2 of 20
...processing 3 of 20
...processing 4 of 20
...processing 5 of 20
...processing 6 of 20
...processing 7 of 20
...processing 8 of 20
...processing 9 of 20
...processing 10 of 20
...processing 11 of 20
...processing 12 of 20
...processing 13 of 20
...processing 14 of 20
...processing 15 of 20
...processing 16 of 20
...processing 17 of 20
...processing 18 of 20
total clusters:2 now processing 0
Can't call method "strand" on an undefined valueERROR: Failed while flattening protein clusters
ERROR: Chunk failed at level:11, tier_type:2
FAILED CONTIG:scaffold00001

From bob_freeman at hms.harvard.edu Tue May 29 11:30:23 2012
From: bob_freeman at hms.harvard.edu (Bob Freeman)
Date: Tue, 29 May 2012 12:30:23 -0400
Subject: [maker-devel] Alternate translation table?
In-Reply-To:
References:
Message-ID:

Thanks, Carson, for the update on this. No need to implement something. I'll keep it simple and translate the collected transcripts using an appropriate translation table.

-Bob

On May 28, 2012, at 8:43 AM, Carson Holt wrote:

> The alternate translation table is not currently an option. It's one of those things that needs to be implemented, but has not been yet. It's also not supported by many of the eukaryotic gene predictors MAKER uses.
>
> I could probably get something implemented for you to test in two to three weeks though (there are a lot of places where the translation table comes into play). Let me know.
>
> --Carson
>
> From: Bob Freeman
> Date: Friday, 25 May, 2012 12:23 PM
> To:
> Subject: [maker-devel] Alternate translation table?
>
> Hello all!
>
> Unusual question here: I am running MAKER on a ciliate that uses a non-standard translation table for its translation products. I haven't found an option in the control files that I can change for the translation of the predicted transcripts. How or where can I go about this?
>
> Tx,
> Bob

-----------------------------------------------------
Bob Freeman, Ph.D.
Acorn Worm Informatics, Kirschner lab
Dept of Systems Biology, Alpert 524
Harvard Medical School
200 Longwood Avenue
Boston, MA 02115
617/432.2294, vox

"Sorry I'm late. Oh, God, that sounded insincere. I'm late."
-- Karen Walker, from Will and Grace

From gowthaman.ramasamy at seattlebiomed.org Tue May 29 16:54:33 2012
From: gowthaman.ramasamy at seattlebiomed.org (Gowthaman Ramasamy)
Date: Tue, 29 May 2012 14:54:33 -0700
Subject: [maker-devel] Can maker select a gene model based on #algoritham predicted it
Message-ID: <89080953C3D300419AACB6E63A7EEFBA5C8409F8EB@mail02.sbri.org>

Hi Carson,

Thanks for all the help during the long weekend, in spite of that long drive. I am still trying to imagine that.

I now have maker considering our own predictions via pred_gff, and using augustus and GeneMark (with our training model). I was also able to use altest and protein evidence. Maker happily picks one gene model when there is an overlap between three different predictions. But when I look at the gff, it seems to pick a gene model only when there is est/protein evidence. It leaves out some genes even though they are predicted by all three algorithms. Of course, keep_preds=1 helps to keep all the models, but this leads to over-prediction.

I am looking for something in between, and would like to know if that is possible:
1) Pick a gene model if it has evidence (est/prot etc.), irrespective of how many algorithms predicted it.
2) In the absence of extrinsic evidence (est/prot etc.), pick a gene model if it is predicted by at least two algorithms.

Or even simpler: I have ab initio predictions from three algorithms. Can I output those genes that are supported by at least two of them? I care less about the exactness of gene boundaries.

Thanks,
Gowthaman

PS: With my recent attempts, I learned a couple of things about maker and other associated tools that are not documented in the gmod-maker wiki. Is it possible/OK if I add content to it? I am okay with running it by you before making it public.

From carsonhh at gmail.com Wed May 30 07:54:32 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 30 May 2012 08:54:32 -0400
Subject: [maker-devel] Can maker select a gene model based on #algoritham predicted it
In-Reply-To: <89080953C3D300419AACB6E63A7EEFBA5C8409F8EB@mail02.sbri.org>
Message-ID:

It's not an option in exactly the way you are specifying, but there is something I usually do for annotation that works well. I run interproscan or rpsblast on the non_overlapping.proteins.fasta file and select just those non-overlapping models that have a recognizable protein domain (just searching the Pfam domain space is more than sufficient). Then I provide the selected results to model_gff, and provide the previous maker results to the maker_gff option (with all reannotation pass options set to 1 and all analysis options turned off). This adds models with at least recognizable domains (as even multiple gene predictors can overpredict in a similar way).

Attached is a script to help select predictions and upgrade them to models in GFF3 format. If you have questions, let me know.

Thanks,
Carson

On 12-05-29 5:54 PM, "Gowthaman Ramasamy" wrote:

>Hi Carson,
>Thanks for all the help during the long weekend, in spite of that long
>drive. I am still trying to imagine that.
>
>I now have maker to consider our own prediction via pred_gff, and use
>augustus and gene mark (with our training model). And i was able to use
>altest and protein evidences. Maker happily picks one gene model when
>there is a overlap between three different predictions. But, when I look
>at the gff, it seems like it picks a gene model only when there is an
>est/protein evidence. It leaves out some genes even though, they are
>predicted by all three algorithms. Of course, keep_pred=1 helps to keep
>all the models. This kind of leads to over prediction.
>
>But, I am looking for something in between. And would like to know if
>that is possible?
>1) Pick a gene model if it has an evidence from (est/prot etc...)
>irrespective of how many algorithms predicted it
>2) In the absence of extrinsic evidence (est/prot etc), pick a gene model
>if that is predicted by at least two algorithms.
>
>Or even simpler:
>I have ab-initio predictions from three algorithms, Can I output, those
>genes that is supported by at least two of them. I care less about
>exactness of gene boundaries.
>
>Thanks,
>Gowthaman
>
>PS: With my recent attempts, i learned couple things about maker/other
>associated tools that is not documented in gmod-maker wiki. Is it
>possible/ok if I add contents to it. I am okay with running it by you
>before making it public.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: gff3_preds2models
Type: application/octet-stream
Size: 4777 bytes
Desc: not available
URL:

From mikael.durling at slu.se Thu May 31 07:25:31 2012
From: mikael.durling at slu.se (Mikael Brandström Durling)
Date: Thu, 31 May 2012 14:25:31 +0200
Subject: [maker-devel] maker leaving large numbers of defunct zombies
Message-ID:

Hello,

I've been working lately to set up maker for annotation work on a few fungal genomes. I've got mpi maker up and running now; however, I notice that maker is leaving a lot of defunct perl processes behind. This happens to the extent that the process table on the system fills up after a few hours of run time. Right now, after three hours of running, the process tree looks like this:

 |-sge_execd-+-sge_shepherd---bash---mpirun-+-maker-+-maker
 |           |                              |       `-perl
 |           |                              |-maker-+-maker
 |           |                              |       |-maker---1371*[perl]
 |           |                              |       `-perl
 |           |                              |-maker-+-maker
 |           |                              |       |-maker---1348*[perl]
 |           |                              |       `-perl
 |           |                              |-maker-+-maker
 |           |                              |       |-maker---1384*[perl]
 |           |                              |       `-perl

...and so on for all mpi processes, except for the controlling processes.

What perl programs is maker calling that might end up as zombies? I've had a brief look at the source to no avail, but would be happy to dig further with some pointers for where to look.

This is run with the 2.25-beta from the web page, perl 5.16.0 and openmpi 1.4.5.

Thanks,
Mikael

-------------------------------------
Mikael Brandström Durling, PhD
Assistant Professor

Sveriges lantbruksuniversitet
Swedish University of Agricultural Sciences

Uppsala BioCenter
Dept of Forest Mycology and Plant Pathology
Box 7026, 75007 Uppsala
Visiting address: Almas Allé 5
Telefon: 018-671512
mikael.durling at slu.se, www.slu.se/mykopat

From mikael.durling at slu.se Thu May 31 07:34:30 2012
From: mikael.durling at slu.se (Mikael Brandström Durling)
Date: Thu, 31 May 2012 14:34:30 +0200
Subject: [maker-devel] Using GDBM_File instead of DB_File
Message-ID:

Hello,

I've been struggling for a few days to get maker up and running with MPI on a debian squeeze system. After compiling a new perl 5.16 exclusively for maker, I narrowed the segfaults down to DB_File. Even recompiling and updating that module did not help. After checking for dependencies on DB_File in maker, I concluded that the only dependency was through GI::localize_file, which expects the FastaDB to be instantiated with DB_File. However, FastaDB can run on GDBM_File too. I patched the calls to GI::localize_file in maker to handle the .pag/.dir extensions used by GDBM (below).
With this patch applied, maker runs for me even after I deleted the DB_File module from the perl path and made sure that GDBM_File is installed. My basic question is whether there is any other dependency on DB_File that I have missed and that may break things.

cheers,
Mikael

--- maker.orig 2012-03-30 15:48:05.000000000 +0200
+++ maker 2012-05-31 10:35:30.253022648 +0200
@@ -512,7 +515,12 @@
     }
     if($size > 1){
         carp "Calling GI::localize_file" if($main::debug);
-        GI::localize_file("$gdbfile.index");
+        if( -f "$gdbfile.index.dir" ){
+            GI::localize_file("$gdbfile.index.dir");
+            GI::localize_file("$gdbfile.index.pag");
+        }else{
+            GI::localize_file("$gdbfile.index");
+        }
         carp "Calling GI::localize_file" if($main::debug);
         $gdbfile = GI::localize_file($gdbfile);
     }

From carsonhh at gmail.com Thu May 31 08:04:58 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 31 May 2012 09:04:58 -0400
Subject: [maker-devel] Using GDBM_File instead of DB_File
In-Reply-To:
Message-ID:

DB_File is being called by Bio::DB::Fasta. I can check the object returned when using GDBM_File instead to see if the index file names are contained on the object, as I'm just assuming an extension of '.index'. I'll look around to see if the extension name is assumed anywhere else.

Thanks,
Carson

On 12-05-31 8:34 AM, "Mikael Brandström Durling" wrote:

>Hello,
>
>I've been struggling for a few days to get maker up and running with MPI
>on a debian squeeze system. Compiling a new perl 5.16 exclusively for
>maker I wound down to that the segfaults came from DB_File. Even by
>recompiling and updating that module, nothing worked. After checking for
>dependencies on DB_File in maker, I concluded that the only dependency
>was through the GI::localize_file, which expects the FastaDB to be
>instantiated with DB_File. However, FastaDB can run on GDBM_File too. I
>patched the calls to GI::localize_file in maker to handle the .pag/.dir
>extensions used by GDBM (below). With this patch applied maker is running
>for me, even when I have deleted the DB_File module from the perl path
>and made sure that GDBM_File is installed. My basic question is if there
>is any other dependency for DB_File which I have missed which may break
>things?
>
>cheers,
>Mikael
>
>--- maker.orig 2012-03-30 15:48:05.000000000 +0200
>+++ maker 2012-05-31 10:35:30.253022648 +0200
>@@ -512,7 +515,12 @@
>     }
>     if($size > 1){
>         carp "Calling GI::localize_file" if($main::debug);
>-        GI::localize_file("$gdbfile.index");
>+        if( -f "$gdbfile.index.dir" ){
>+            GI::localize_file("$gdbfile.index.dir");
>+            GI::localize_file("$gdbfile.index.pag");
>+        }else{
>+            GI::localize_file("$gdbfile.index");
>+        }
>         carp "Calling GI::localize_file" if($main::debug);
>         $gdbfile = GI::localize_file($gdbfile);
>     }

From carsonhh at gmail.com Thu May 31 08:17:20 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 31 May 2012 09:17:20 -0400
Subject: [maker-devel] maker leaving large numbers of defunct zombies
In-Reply-To:
Message-ID:

MAKER uses IPC::Open3 to open almost all external applications, including a helper script called every once in a while that helps check file locks on NFS.

MAKER then calls waitpid to reap the processes, as IPC::Open3 doesn't auto-reap. The only time previously I've seen issues with zombie accumulation was with MPICH2 when it moved from the MPD process manager to Hydra. Hydra had certain broken signal handling issues that I had to bug the MPICH2 developers about, and they fixed it. It is possible that the issue you are having may be with OpenMPI or with perl 5.16. I currently use perl 5.12. Perl instituted something called safe signals in either 5.6 or 5.8, and there may be some updates in 5.16 where they've been changing those around again.
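
To make the reaping step described above concrete, here is a minimal sketch. It is written in Python rather than MAKER's Perl and is not MAKER's code: Popen.wait() stands in for the waitpid call made after IPC::Open3, and a small helper counts defunct entries in ps-style output. The function names run_and_reap and count_defunct are hypothetical, and the assumed ps column layout (pid, ppid, stat, comm) is an illustration only.

```python
import subprocess
import sys


def run_and_reap(cmd):
    """Start cmd as a child process, wait for it, and return its exit status.

    Waiting on the child is what removes it from the process table. A child
    that has exited but has not been waited on stays listed as defunct.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return proc.wait()


def count_defunct(ps_lines):
    """Count zombie entries in lines shaped like `ps -eo pid,ppid,stat,comm`.

    Assumption: the third whitespace-separated field is the process state,
    and zombie states start with the letter Z.
    """
    count = 0
    for line in ps_lines:
        fields = line.split()
        if len(fields) >= 3 and fields[2].startswith("Z"):
            count += 1
    return count


if __name__ == "__main__":
    # The child exits with status 3; wait() collects that status.
    status = run_and_reap([sys.executable, "-c", "import sys; sys.exit(3)"])
    print("child exit status:", status)
```

If the count of defunct children keeps growing during a run, as in the process tree reported in this thread, that suggests the wait step is not completing for those children, whatever the underlying cause turns out to be.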
I can try installing a copy of 5.16 to test with and OpenMPI to see if I can replicate the error.

Thanks,
Carson

On 12-05-31 8:25 AM, "Mikael Brandström Durling" wrote:

>Hello,
>
>I've been working lately to set up maker for annotation work on a few
>fungal genomes. I've got mpi maker up and running now, however, I notice
>that maker is leaving a lot of perl processes behind. This
>happens to the extent that the process table on the system gets filled up
>after a few hours run time. Right now the process tree after three hours
>running looks like this:
>
> |-sge_execd-+-sge_shepherd---bash---mpirun-+-maker-+-maker
> |           |                              |       `-perl
> |           |                              |-maker-+-maker
> |           |                              |       |-maker---1371*[perl]
> |           |                              |       `-perl
> |           |                              |-maker-+-maker
> |           |                              |       |-maker---1348*[perl]
> |           |                              |       `-perl
> |           |                              |-maker-+-maker
> |           |                              |       |-maker---1384*[perl]
> |           |                              |       `-perl
>
>...and so on for all mpi processes, except for the controlling processes.
>
>What perl programs is maker calling, that might end up as zombies? I've
>had a brief look at the source to no avail, but would be happy to dig
>further with some pointers for where to look.
>
>This is run with the 2.25-beta from the web page, perl 5.16.0 and openmpi
>1.4.5.
>
>Thanks,
>Mikael
>
>-------------------------------------
>Mikael Brandström Durling, PhD
>Assistant Professor
>
>Sveriges lantbruksuniversitet
>Swedish University of Agricultural Sciences
>
>Uppsala BioCenter
>Dept of Forest Mycology and Plant Pathology
>Box 7026, 75007 Uppsala
>Visiting address: Almas Allé 5
>Telefon: 018-671512
>mikael.durling at slu.se, www.slu.se/mykopat

From mikael.durling at slu.se Thu May 31 08:57:06 2012
From: mikael.durling at slu.se (Mikael Brandström Durling)
Date: Thu, 31 May 2012 15:57:06 +0200
Subject: [maker-devel] maker leaving large numbers of defunct zombies
In-Reply-To:
References:
Message-ID:

I saw the same problem with the latest MPICH2 using hydra too, so it might boil down to perl/openmpi interactions. I didn't see this problem with the debian-supplied perl 5.10, but then I had intermittent segfaults in DB_File and libpthread. That seemed to be some interaction with the LD_PRELOADed libmpi. The requirement for preloading libmpi was most easily solved by compiling openmpi with --disable-dlopen.

Thanks,
Mikael

On 31 May 2012, at 15:17, Carson Holt wrote:

> MAKER uses IPC::Open3 to open almost all external applications, including
> a helper script called every once in a while that helps check file locks
> on NFS.
>
> MAKER then calls waitpid to reap the processes, as IPC::Open3 doesn't
> auto-reap. The only time previously I've seen issues with zombie
> accumulation was with MPICH2 when it moved from the MPD process manager to
> Hydra. Hydra had certain broken signal handling issues that I had to bug
> the MPICH2 developers about and they fixed it. It is possible that the
> issue you are having may be with OpenMPI or with perl 5.16. I currently
> use perl 5.12. Perl instituted something called safe signals in either
> 5.6 or 5.8 and there may be some updates in 5.16 where they've been
> changing those around again.
>
> I can try installing a copy of 5.16 to test with and OpenMPI to see if I
> can replicate the error.
>
> Thanks,
> Carson
>
> On 12-05-31 8:25 AM, "Mikael Brandström Durling"
> wrote:
>
>> Hello,
>>
>> I've been working lately to set up maker for annotation work on a few
>> fungal genomes. I've got mpi maker up and running now, however, I notice
>> that maker is leaving a lot of perl processes behind. This
>> happens to the extent that the process table on the system gets filled up
>> after a few hours run time. Right now the process tree after three hours
>> running looks like this:
>>
>> |-sge_execd-+-sge_shepherd---bash---mpirun-+-maker-+-maker
>> |           |                              |       `-perl
>> |           |                              |-maker-+-maker
>> |           |                              |       |-maker---1371*[perl]
>> |           |                              |       `-perl
>> |           |                              |-maker-+-maker
>> |           |                              |       |-maker---1348*[perl]
>> |           |                              |       `-perl
>> |           |                              |-maker-+-maker
>> |           |                              |       |-maker---1384*[perl]
>> |           |                              |       `-perl
>>
>> ...and so on for all mpi processes, except for the controlling processes.
>>
>> What perl programs is maker calling, that might end up as zombies? I've
>> had a brief look at the source to no avail, but would be happy to dig
>> further with some pointers for where to look.
>>
>> This is run with the 2.25-beta from the web page, perl 5.16.0 and openmpi
>> 1.4.5.
>>
>> Thanks,
>> Mikael
>>
>> -------------------------------------
>> Mikael Brandström Durling, PhD
>> Assistant Professor
>>
>> Sveriges lantbruksuniversitet
>> Swedish University of Agricultural Sciences
>>
>> Uppsala BioCenter
>> Dept of Forest Mycology and Plant Pathology
>> Box 7026, 75007 Uppsala
>> Visiting address: Almas Allé
5
>> Telefon: 018-671512
>> mikael.durling at slu.se, www.slu.se/mykopat

From qwang at uwyo.edu Thu May 3 10:23:16 2012
From: qwang at uwyo.edu (Qiurong Wang)
Date: Thu, 3 May 2012 10:23:16 -0600
Subject: [maker-devel] MAKER download problem
Message-ID: <8754B55D-C119-4A7C-9594-BEAEAD3BB939@uwyo.edu>

Hi,

I was trying to download MAKER, but I couldn't open the download page. Could you please help me to figure out the problem? Thanks a lot!

Qiurong Wang
PhD candidate
Department of Botany
University of Wyoming
Department 3165, 1000 E University Ave. Laramie, Wyoming 82071, USA
Phone: 307-766-2634
Email: qwang at uwyo.edu

From barry.moore at genetics.utah.edu Thu May 3 11:51:48 2012
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Thu, 3 May 2012 11:51:48 -0600
Subject: [maker-devel] MAKER download problem
In-Reply-To: <8754B55D-C119-4A7C-9594-BEAEAD3BB939@uwyo.edu>
References: <8754B55D-C119-4A7C-9594-BEAEAD3BB939@uwyo.edu>
Message-ID: <6B24C00B-6C70-4A40-BA7C-E3AC0C7F5E25@genetics.utah.edu>

Hi all,

The web server hosting the MAKER licensing application and MAKER code distribution was attacked this week. The University of Utah is currently blocking access to that server from outside University IP space. We are working hard to move all of the content and web-applications from that machine to a new server and I expect to have the MAKER services restored over the weekend. This is affecting the ability to submit new licenses for MAKER and to download the MAKER code. In addition those with existing licenses will need to update their code links for future code updates. For those of you on campus in Utah or with campus VPN access, the server is running and available.
Updates on the status of the server and details about new links to the MAKER code will be posted to the mailing list soon.

Barry

On May 3, 2012, at 10:23 AM, Qiurong Wang wrote:

> Hi,
>
> I was trying to download MAKER, but I couldn't open the download page. Could you please help me to figure out the problem? Thanks a lot!
>
>
> Qiurong Wang
> PhD candidate
> Department of Botany
> University of Wyoming
> Department 3165, 1000 E University Ave. Laramie, Wyoming 82071, USA
> Phone: 307-766-2634
> Email: qwang at uwyo.edu

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543

From carsonhh at gmail.com Thu May 3 12:00:01 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 03 May 2012 14:00:01 -0400
Subject: [maker-devel] MAKER download problem
In-Reply-To: <8754B55D-C119-4A7C-9594-BEAEAD3BB939@uwyo.edu>
Message-ID:

The server hosting the file to download is down temporarily. I'll put a copy of MAKER on a separate server and e-mail the link to you.

--Carson

On 12-05-03 12:23 PM, "Qiurong Wang" wrote:

>Hi,
>
>I was trying to download MAKER, but I couldn't open the download page.
>Could you please help me to figure out the problem? Thanks a lot!
>
>
>Qiurong Wang
>PhD candidate
>Department of Botany
>University of Wyoming
>Department 3165, 1000 E University Ave. Laramie, Wyoming 82071, USA
>Phone: 307-766-2634
>Email: qwang at uwyo.edu

From cjfields at illinois.edu Tue May 15 09:01:31 2012
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 15 May 2012 15:01:31 +0000
Subject: [maker-devel] mail list and Trac
Message-ID: <64A6759A-9FD1-4AEE-BCEA-151B0D791ADD@illinois.edu>

Just wanted to point out, I noticed the mail list is not being indexed on Google Groups any more (nothing in May). Also, any status on Trac?

chris

From barry.moore at genetics.utah.edu Tue May 15 10:34:29 2012
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Tue, 15 May 2012 10:34:29 -0600
Subject: [maker-devel] mail list and Trac
In-Reply-To: <64A6759A-9FD1-4AEE-BCEA-151B0D791ADD@illinois.edu>
References: <64A6759A-9FD1-4AEE-BCEA-151B0D791ADD@illinois.edu>
Message-ID:

Thanks Chris,

I'll check on the Google groups. The MAKER Trac server was on a server that we had to shut down a couple weeks ago and I had made moving it a lower priority since I didn't think it was getting much use. I'll get it moved over and bump up the priority on that move since I know someone is looking at it.

B

On May 15, 2012, at 9:01 AM, Fields, Christopher J wrote:

> Just wanted to point out, I noticed the mail list is not being indexed on Google Groups any more (nothing in May). Also, any status on Trac?
>
> chris

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543
URL: From anastasia.gioti at scilifelab.se Thu May 17 05:27:04 2012 From: anastasia.gioti at scilifelab.se (Anastasia Gioti) Date: Thu, 17 May 2012 13:27:04 +0200 Subject: [maker-devel] Use pass-through system to add missing genes In-Reply-To: <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu> References: <4FE7CD5B-FC1C-43E7-AC41-A05823348B99@scilifelab.se> <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu> Message-ID: <4D9922AF-A917-4747-9B7C-CFE9F142D51C@scilifelab.se> Hi Barry, Thanks for your detailed instructions. You well understood that I have already included the proteins of the closely related species in my protein evidence dataset, but still did not get the genes. I have now blasted (P) the missing 949 proteins from this species against my nonoverlaping_abinits.fasta proteins and have found 618 good hits, which i guess I can promote to models using the routine no 2 of your last email and Carson's script gff3_select. I have also looked at the rest of the proteins (331) for which there was no model in the nonoverlaping_abinits.fasta. I will try to describe 2 examples I looked at in apollo: 1) ab initio models predicted a ~7.5 kb gene covering 3 genes (as predicted in the closely related species). Blastx+protein2genome similarities were reported for two of these genes, but not for the 3rd (the one in the middle). MAKER finally decided to call two genes, respecting the blastx+protein2genome evidence, but the 3rd was lost. I have previously reported here that MAKER tends to fuse genes in multi-exonic genes and others reported that too, I remember you proposed changing a papameter to alter this. To keep in mind for my final strategy that i am trying to decide on (for the moment i have not rerun MAKER). 
For this case, abinitio models do not exist for the gene (in the sense that the existing models overlap many genes) and the similarity to the protein of the closely related species was not judged sufficient, although when i look at a TblastN alignment for this area it looks fine to me. 2) Only the 3' end of the gene was called by MAKER, despite blastx +protein2genome evidence from the closely related species for the entire region. Abinitio models existed as 2 separate genes , one for the 3' end region (finally retained by MAKER in a consensus decision I guess) and one for the 5' region, but here not all predictors called an orf, and finally nothing was called in this region. In this case, it is a misannotation rather, but which misses a very important part of the gene. I hope my descriptions are clear, otherwise I can provide you the gff file of these 2 examples to look by yourself. I am not very clear about what to do about these 331 cases (which I do not know how to look at as well, except for random examples' viwing in Apollo). I feel that a second MAKER run would be probably the solution, this time providing as pred_gff the result of a blast against the 331. But still, the existing annotations would then have to be somehow updated as the new predictions are in conflict with them (see example 2). I am a bit confused. to recap, what would you suggest for the 331 still-missing proteins in terms of asessing their profiles n a rather automatic way and in inluding them in my annotations without going deep into manual gene curation? Many thnks, Anastasia > > Let me just restate what you've said so that I can be sure that I am > correct about what you've already done. You have run Maker with > SNAP, Genemark and Augustus using EST from a closely related species > (passed to altest) and protein evidence from other fungi. You are > missing about 1,000 genes compared to the species that provided the > EST alignments. 
> You say there is good evidence that these genes
> exist from the alignments and I assume by this that you mean the EST/
> protein alignments that Maker produced.
>
> 1) Is the closely related fungus annotated, and if so have you
> included its proteins in the evidence set that you provided to
> Maker? If you haven't provided these proteins as evidence to Maker
> then you should do this. You can re-run Maker passing your original
> models back through like this:
>
> #-----Re-annotation Using MAKER Derived GFF3
> genome_gff=original_maker_annotations.gff3
> est_pass=1
> altest_pass=1
> protein_pass=1
> rm_pass=1
> model_pass=1
> pred_pass=1
> other_pass=1
>
> #-----Protein Homology Evidence (for best results provide a file for at least one)
> protein=proteins_from_closely_related.fasta
> ## OR it sounds like you've already aligned these with exonerate?
> protein_gff=proteins_from_closely_related_already_aligned.gff
>
> 2) If you've already included those closely related species proteins
> but still didn't get the 1,000 genes, then take your
> nonoverlaping_abinits.fasta and blast them directly against your
> closely related proteins. Presumably they don't hit too well,
> because if they did they should have been promoted to predictions by
> Maker the first time, but here you can decide yourself what
> thresholds to allow to keep the abinit predictions that hit the
> closely related species proteins. If you filter your blast hits the
> way you want and keep the names of the abinit predictions that pass
> your filter, then use the script Carson attached; it will generate
> an abinit prediction GFF file with only the predictions you
> selected. You can then pass those predictions back to Maker and
> force it to keep them, and Maker will turn them from predictions
> (match/match_part) into gene models.
>
> #-----Re-annotation Using MAKER Derived GFF3
> genome_gff=original_maker_annotations.gff3
> est_pass=1
> altest_pass=1
> protein_pass=1
> rm_pass=1
> model_pass=1
> pred_pass=0
> other_pass=1
>
> #-----Gene Prediction
> snaphmm=
> gmhmm=
> augustus_species=
> fgenesh_par_file=
> pred_gff=ab_init_predictions_rescued_by_blast.gff
>
> keep_preds=1
>
> Barry
>
>>> Thanks,
>>> Carson
>>>
>>> From: Anastasia Gioti
>>> Date: Wed, 25 Apr 2012 11:09:36 +0200
>>> To:
>>> Subject: [maker-devel] Use pass-through system to add missing genes
>>>
>>> Hi,
>>> I have a set of predicted proteins from the genome of a fungus
>>> annotated by MAKER using EST data from a closely related species
>>> and 3 ab initio predictors (SNAP iteratively trained 3 times,
>>> GeneMark trained directly on the assembly and Augustus with a
>>> model from a less closely related species), along with a set of
>>> fungal proteins. I am missing ~1000 proteins when I compare to
>>> the species I used EST data from, and there is good evidence from
>>> alignments that these genes exist. The question is how to proceed
>>> from BLAST hits to actual gene models here. The idea would be to
>>> add these genes to the existing dataset, rather than reannotate
>>> the genome. I believe that reannotating it without any further
>>> evidence such as RNA-seq from the species itself would not change
>>> much, and I'd rather stick with actual predictions that I trust and
>>> have used in subsequent analyses. The 1000 genes I can accept to
>>> annotate in a less stringent and reliable way than MAKER; I just
>>> want to add them so that the difference in gene count gets
>>> corrected.
>>> I was reading the MAKER 2 paper and i was wondering if I can use >>> the legacy annotations scheme to do it, by providing GFF3 of the >>> alignments between the two species in the regions where genes were >>> missed, but as i said, I would not like to reannotate the whole >>> genome, and running MAKER2 might cause slight changes that i d >>> like to avoid. Is this possible? First, is it possible to provide >>> a Gff3 file of specific locations and not the entire genome >>> alignment? (I guess so..) Second, how can I tag the existing >>> annotations as 'not to be changed' or alternatively, tag the new >>> models only? How should I run maker2, with which predictors on and >>> which off? >>> Thanks, >>> Anastasia >>> >>> Anastasia Gioti >>> Post-doctoral Researcher >>> >>> anastasia.gioti at scilifelab.se >>> anastasia.gioti at ebc.uu.se >>> >>> http://www.ebc.uu.se/Research/IEG/evbiol/people/pages/Gioti_Anastasia/ >>> >>> >>> >>> _______________________________________________ maker-devel >>> mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >> >> Anastasia Gioti >> Post-doctoral Researcher >> >> anastasia.gioti at scilifelab.se >> anastasia.gioti at ebc.uu.se >> >> http://www.ebc.uu.se/Research/IEG/evbiol/people/pages/ >> Gioti_Anastasia/ >> >> >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > Barry Moore > Research Scientist > Dept. 
of Human Genetics
> University of Utah
> Salt Lake City, UT 84112
> --------------------------------------------
> (801) 585-3543
>

Anastasia (Natassa) Gioti
Post-Doc Researcher
Evolutionary Biology Department
Uppsala University - Science for Life lab, Karolinska Institute Stockholm
anastasia.gioti at ebc.uu.se
anastasia.gioti at scilifelab.se
http://www.ebc.uu.se/Research/IEG/evbiol/people/pages/Gioti_Anastasia/

From yogeshp08 at gmail.com Tue May 15 10:07:57 2012
From: yogeshp08 at gmail.com (Yogesh)
Date: Tue, 15 May 2012 11:07:57 -0500
Subject: [maker-devel] tblastn Cleanup?
Message-ID: <4478F0B20ED84A85B3C4FE4154F8FAD1@gmail.com>

Hello,

I have a few tblastn alignments with a lot of low-quality hits, and I have to clean them up. Can you please explain how the MAKER pipeline does it? Also, can I run that step directly on my data without having to go through the whole pipeline?

Thanks,
-Yogesh

From carsonhh at gmail.com Fri May 18 08:22:50 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 18 May 2012 10:22:50 -0400
Subject: [maker-devel] tblastn Cleanup?
In-Reply-To: <4478F0B20ED84A85B3C4FE4154F8FAD1@gmail.com>
Message-ID:

There are several things.

I set several filtering options directly on the BLAST command line. These are things like maximum intron length, an e-value filter, and simple repeat filtering (called dust filter in NCBI blast and seg filter in WUBLAST).

I also run RepeatMasker over the genome first. This allows simple and complex repeats to be removed before running BLAST (otherwise you get many false alignments).

Last, I filter the results based on percent coverage of the hit to the original database sequence and percent identity.
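As a rough illustration of that last filtering step, here is a minimal Python sketch. It is not MAKER's code: the function names, the input tuples, and the thresholds (50% coverage, 40% identity) are assumptions chosen only for the example. It merges the HSP intervals reported for each database sequence, computes the fraction of that sequence they cover, and keeps only the sequences that pass both cutoffs.

```python
from collections import defaultdict


def covered_length(intervals):
    """Count positions covered by 1-based inclusive (start, end) intervals.

    Overlapping or directly adjacent intervals are merged so that no
    position is counted twice.
    """
    total = 0
    cur_start = cur_end = None
    for start, end in sorted((min(s, e), max(s, e)) for s, e in intervals):
        if cur_end is None or start > cur_end + 1:
            # Close the previous merged block and start a new one.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def filter_hits(hsps, seq_lengths, min_cov=0.5, min_ident=40.0):
    """Return the IDs of sequences whose HSPs pass both cutoffs.

    hsps        -- iterable of (seq_id, start, end, percent_identity),
                   coordinates given on the database sequence
    seq_lengths -- dict mapping seq_id to the full sequence length
    min_cov     -- assumed threshold: minimum fraction of the sequence covered
    min_ident   -- assumed threshold: minimum percent identity per HSP
    """
    by_seq = defaultdict(list)
    for seq_id, start, end, pident in hsps:
        if pident >= min_ident:
            by_seq[seq_id].append((start, end))
    kept = []
    for seq_id, intervals in by_seq.items():
        coverage = covered_length(intervals) / seq_lengths[seq_id]
        if coverage >= min_cov:
            kept.append(seq_id)
    return sorted(kept)


if __name__ == "__main__":
    example_hsps = [
        ("p1", 1, 50, 80.0),
        ("p1", 40, 90, 75.0),   # overlaps the first HSP
        ("p2", 1, 20, 90.0),    # good identity, poor coverage
        ("p3", 1, 100, 30.0),   # full coverage, poor identity
    ]
    example_lengths = {"p1": 100, "p2": 100, "p3": 100}
    print(filter_hits(example_hsps, example_lengths))
```

The interval merge matters because several HSPs from one hit often overlap; summing their raw lengths would overstate coverage.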
I think you can set percent identity as a flag in BLAST, but the percent coverage filter is calculated by MAKER, so to do this outside of MAKER you would have to write your own filtering script that compares the length of the alignment to the length of the sequence in the database.

I also have an HSP depth overlap filter. This removes weird low-complexity hits that escape repeat masking. They show up as multiple HSPs overlapping multiple times in the same region (usually very high numbers, like 90 HSPs, all 100 bp long, in the same region). I calculate the number of base pairs in the alignment on the hit, then divide by the number of base pairs in the query alignment. If it's greater than 3, I throw the hit out.

Thanks,
Carson

From: Yogesh
Date: Tuesday, 15 May, 2012 12:07 PM
To:
Subject: [maker-devel] tblastn Cleanup?

Hello,

I have a few tblastn alignments with a lot of low quality hits. I have to clean that up. Can you please suggest how Maker pipeline does it? Also can I run it directly on my data without having to go through the whole pipeline?

Thanks,
-Yogesh

From smg283 at gmail.com Tue May 22 00:42:51 2012
From: smg283 at gmail.com (Scott Geib)
Date: Mon, 21 May 2012 20:42:51 -1000
Subject: [maker-devel] can't call method strand on an undefined value ERROR: Failed while flattening protein clusters
Message-ID:

Hi,

Using MAKER 2.24, I am getting the following error (see below) in the protein2genome widget. I also get the same error with est2genome. This happens with my own data (testing on a single scaffold), but not with the test data supplied with MAKER (dpp files in data folder).
Scott Widget::exonerate::protein2genome: /data0/opt/AlignmentSoftware/exonerate/exonerate-2.2.0-x86_64/bin/exonerate -q /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaffold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/UniRef90_UPI000194D3FC.for.1588546-1589203.9.fasta -t /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaffold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/scaffold00001.1588546-1589203.9.fasta -Q protein -T dna -m protein2genome --softmasktarget --percent 20 --showcigar > /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaffold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/scaffold00001.1588546-1589203.UniRef90_UPI000194D3FC.p_exonerate.9 #-------------------------------# cleaning blastx... in cluster::shadow_cluster... ...finished clustering. cleaning clusters.... total clusters:2 now processing 0 ...processing 0 of 23 ...processing 1 of 23 ...processing 2 of 23 ...processing 3 of 23 ...processing 4 of 23 ...processing 5 of 23 ...processing 6 of 23 ...processing 7 of 23 ...processing 8 of 23 ...processing 9 of 23 ...processing 10 of 23 ...processing 11 of 23 ...processing 12 of 23 ...processing 13 of 23 ...processing 14 of 23 ...processing 15 of 23 ...processing 16 of 23 ...processing 17 of 23 ...processing 18 of 23 ...processing 19 of 23 ...processing 20 of 23 ...processing 21 of 23 total clusters:2 now processing 0 in cluster::shadow_cluster... ...finished clustering. cleaning clusters.... 
total clusters:2
now processing 0
...processing 0 of 20
...processing 1 of 20
...processing 2 of 20
...processing 3 of 20
...processing 4 of 20
...processing 5 of 20
...processing 6 of 20
...processing 7 of 20
...processing 8 of 20
...processing 9 of 20
...processing 10 of 20
...processing 11 of 20
...processing 12 of 20
...processing 13 of 20
...processing 14 of 20
...processing 15 of 20
...processing 16 of 20
...processing 17 of 20
...processing 18 of 20
total clusters:2
now processing 0
Can't call method "strand" on an undefined value
ERROR: Failed while flattening protein clusters
ERROR: Chunk failed at level:11, tier_type:2
FAILED CONTIG:scaffold00001

From anastasia.gioti at scilifelab.se Tue May 22 07:14:17 2012
From: anastasia.gioti at scilifelab.se (Anastasia Gioti)
Date: Tue, 22 May 2012 15:14:17 +0200
Subject: [maker-devel] Use pass-through system to add missing genes
In-Reply-To: <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu>
References: <4FE7CD5B-FC1C-43E7-AC41-A05823348B99@scilifelab.se> <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu>
Message-ID: <19E36E3B-6A82-49D5-B0AC-5E521F3E8999@scilifelab.se>

Hi again,

I sent an email a few days ago in this thread, and I am not sure whether you have received it or have simply not had time to look at it yet. In any case, that email dealt with the fact that some proteins were not retrieved in the ab initio models, and how to handle that.

What I would like to ask here is a few confirmations on how to rerun MAKER for the proteins that were retrieved in the ab initio models. I have looked at the BLAST results and have done a series of checks, so now I am ready to run MAKER again with a list of models that I want to retain. Regarding the following parameters:

1. Do I set genome= to nothing here, i.e. comment it out?
This is at the beginning of the control file:

#-----Genome (Required for De-Novo Annotation)
genome= #genome sequence file in fasta format
organism_type= #eukaryotic or prokaryotic. Default is eukaryotic

>
> #-----Re-annotation Using MAKER Derived GFF3
> genome_gff=original_maker_annotations.gff3
> est_pass=1
> altest_pass=1
> protein_pass=1
> rm_pass=1
> model_pass=1
> pred_pass=0
> other_pass=1
>
> #-----Gene Prediction

2. Do I provide the SNAP etc. models again? I am not sure, because I thought MAKER would not run ab initio predictors this time (this is why I would also comment out the genome file above, as this is not a de novo annotation). But if it will, I will then provide the previous models I used, except for SNAP, for which I will generate a new model from the GFF3 file of the last run (according to the SNAP documentation). Am I correct?

> snaphmm=
> gmhmm=
> augustus_species=
> fgenesh_par_file=
> pred_gff=ab_init_predictions_rescued_by_blast.gff
>
> keep_preds=1

Similarly, what do I do with repeat masking etc.?

Thanks in advance,
Anastasia

>
> Barry
>
>>> Thanks,
>>> Carson
>>>
>>> From: Anastasia Gioti
>>> Date: Wed, 25 Apr 2012 11:09:36 +0200
>>> To:
>>> Subject: [maker-devel] Use pass-through system to add missing genes
>>>
>>> Hi,
>>> I have a set of predicted proteins from the genome of a fungus
>>> annotated by MAKER using EST data from a closely related species
>>> and 3 ab initio predictors (SNAP iteratively trained 3 times,
>>> GeneMark trained directly on the assembly and Augustus with a
>>> model from a less closely related species), along with a set of
>>> fungal proteins. I am missing ~1000 proteins when I compare to
>>> the species I used EST data from, and there is good evidence from
>>> alignments that these genes exist. The question is how to proceed
>>> from BLAST hits to actual gene models here. The idea would be to
>>> add these genes to the existing dataset, rather than reannotate
>>> the genome.
I believe that reannotating it without any further >>> evidence such as RNA-seq from the species itself would not change >>> much,and i d rather stick with actual predictions that i trust and >>> have used in subsequent analyses. The 1000 genes I can accept to >>> annotate with a less stringent and reliable way than MAKER, I just >>> want to add them so that the difference in gene count gets >>> corrected. >>> I was reading the MAKER 2 paper and i was wondering if I can use >>> the legacy annotations scheme to do it, by providing GFF3 of the >>> alignments between the two species in the regions where genes were >>> missed, but as i said, I would not like to reannotate the whole >>> genome, and running MAKER2 might cause slight changes that i d >>> like to avoid. Is this possible? First, is it possible to provide >>> a Gff3 file of specific locations and not the entire genome >>> alignment? (I guess so..) Second, how can I tag the existing >>> annotations as 'not to be changed' or alternatively, tag the new >>> models only? How should I run maker2, with which predictors on and >>> which off? >>> Thanks, >>> Anastasia >>> >>> Anastasia Gioti >>> Post-doctoral Researcher >>> >>> anastasia.gioti at scilifelab.se >>> anastasia.gioti at ebc.uu.se >>> >>> http://www.ebc.uu.se/Research/IEG/evbiol/people/pages/Gioti_Anastasia/ >>> >>> >>> >>> _______________________________________________ maker-devel >>> mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >> >> Anastasia Gioti >> Post-doctoral Researcher >> >> anastasia.gioti at scilifelab.se >> anastasia.gioti at ebc.uu.se >> >> http://www.ebc.uu.se/Research/IEG/evbiol/people/pages/ >> Gioti_Anastasia/ >> >> >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > Barry Moore > Research Scientist > Dept. 
of Human Genetics
> University of Utah
> Salt Lake City, UT 84112
> --------------------------------------------
> (801) 585-3543
>

Anastasia (Natassa) Gioti
Post-Doc Researcher
Evolutionary Biology Department
Uppsala University - Science for Life lab, Karolinska Institute Stockholm
anastasia.gioti at ebc.uu.se
anastasia.gioti at scilifelab.se
http://www.ebc.uu.se/Research/IEG/evbiol/people/pages/Gioti_Anastasia/

From anastasia.gioti at scilifelab.se Wed May 23 03:07:12 2012
From: anastasia.gioti at scilifelab.se (Anastasia Gioti)
Date: Wed, 23 May 2012 11:07:12 +0200
Subject: [maker-devel] Use pass-through system to add missing genes
In-Reply-To: <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu>
References: <4FE7CD5B-FC1C-43E7-AC41-A05823348B99@scilifelab.se> <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu>
Message-ID: <1B0770A0-6D14-4336-BC3A-DC24619BC3FE@scilifelab.se>

Hi, and sorry for the multiple postings.

I have a list of models rescued from the nonoverlaping_abinits.fasta files (against which I blasted my missing proteins from the closely related species, and further filtered out the dubious hits) and a MAKER GFF3 file, but Carson's script gff3_select won't work. The reason is that these ab initio models were not promoted into the MAKER GFF3 file, so they are not there. I refer to the GFF3 file generated by the gff3_merge script. Am I missing something?

Thank you,
Anastasia

>
>>> If you know which ab initio predictions you want to add (i.e. the ab initio promoting scenario I described), you can provide those predictions to MAKER using the pred_gff option and then set keep_preds=1, and they will be maintained even without evidence. Attached is a script that would make selecting those easier.
>>> It takes the MAKER-generated GFF3 and a list of predictions to keep (one name per line). These might be the results of a BLAST analysis, for example. It will then return the GFF3 entries for just those models selected.

From thomas.hackl at uni-wuerzburg.de Wed May 23 06:01:55 2012
From: thomas.hackl at uni-wuerzburg.de (Thomas Hackl)
Date: Wed, 23 May 2012 14:01:55 +0200
Subject: [maker-devel] missing est2genome annotation
Message-ID: <4FBCD1B3.8000102@uni-wuerzburg.de>

Hi,

I used MAKER to annotate genomic contigs and, among other things, provided transcripts from the transcriptome as EST evidence. BLAST and exonerate work fine and produce valid alignments; the alignment files exist in theVoid and look very good. Unfortunately, neither the evidence_0.gff nor the final .gff carries the corresponding feature annotations.

Any ideas why?

Regards
Thomas

--
Thomas Hackl
Julius-Maximilians-Universität
Department of Bioinformatics
97074 Würzburg, Germany
Fon: +49 931 - 31 86883
Mail: thomas.hackl at uni-wuerzburg.de

From thomas.hackl at uni-wuerzburg.de Wed May 23 11:14:27 2012
From: thomas.hackl at uni-wuerzburg.de (Thomas Hackl)
Date: Wed, 23 May 2012 19:14:27 +0200
Subject: [maker-devel] missing est2genome annotation
In-Reply-To: <4FBCD1B3.8000102@uni-wuerzburg.de>
References: <4FBCD1B3.8000102@uni-wuerzburg.de>
Message-ID: <4FBD1AF3.8020904@uni-wuerzburg.de>

Hi again,

I did some source code digging and found that the following line is what discards my exonerate alignments. I suspect it does so for a very good reason, so it would help me a lot if someone could explain what is going on there.

/lib/GI.pm l.1473
next if $e->pAh< $pcov;

Regards
Thomas

On 23.05.2012 14:01, Thomas Hackl wrote:
> Hi,
>
> I used maker to annotate genomic contigs and among other stuff
> provided transcripts from the transcriptome as est evidence. Blast and
Blast and > exonerate work fine and produce valid alignments, the alignment files > exist in theVoid and look very good. Unfortunatly neither the > evidence_0.gff nor the final .gff carry the corresponding feature > annotations. > > Any ideas why? > > Regards > Thomas > -- Thomas Hackl Julius-Maximilians-Universit?t Department of Bioinformatics 97074 W?rzburg, Germany Fon: +49 931 - 31 86883 Mail: thomas.hackl at uni-wuerzburg.de From gowthaman.ramasamy at seattlebiomed.org Thu May 24 13:30:09 2012 From: gowthaman.ramasamy at seattlebiomed.org (Gowthaman Ramasamy) Date: Thu, 24 May 2012 12:30:09 -0700 Subject: [maker-devel] Merging gene predictions.... Message-ID: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D6@mail02.sbri.org> Hi Carson and others, I am wondering if I can use Maker to merge gene predictions from three gff files. One of the algorithm is 'augustus' which of course i can use it inside Maker. Other two are not part of Maker. But, in case if I want to pass only the GFFs to Maker and ask it to merge the annotations (when overlap) and pick only annotation that are predicted in 2 out of 3 gffs. Is it possible? I prefer this approach, as we need to run a blast based validation step on predicted features before even try to merge them. Thats another reason why we dont prefer to use the augustus inside maker. Thanks, Gowthaman From gowthaman.ramasamy at seattlebiomed.org Thu May 24 15:02:55 2012 From: gowthaman.ramasamy at seattlebiomed.org (Gowthaman Ramasamy) Date: Thu, 24 May 2012 14:02:55 -0700 Subject: [maker-devel] Merging gene predictions.... In-Reply-To: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D6@mail02.sbri.org> References: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D6@mail02.sbri.org> Message-ID: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D8@mail02.sbri.org> Can i use the following approach Carson? 
This is your reply to one of the earlier question: I've attached a made from scratch drop-in replacement that you can use to do what the old script would have done. In the current release of MAKER, instead of the gff3_preds2models script users can just give MAKER a set of predictions in GFF3 format (pred_gff option) and set keep_preds=1 (then leave all other options blank). The predictions given will the be converted into gene models. Thanks, Carson ________________________________________ From: maker-devel-bounces at yandell-lab.org [maker-devel-bounces at yandell-lab.org] On Behalf Of Gowthaman Ramasamy [gowthaman.ramasamy at seattlebiomed.org] Sent: Thursday, May 24, 2012 12:30 PM To: Carson Holt; maker-devel at yandell-lab.org Subject: [maker-devel] Merging gene predictions.... Hi Carson and others, I am wondering if I can use Maker to merge gene predictions from three gff files. One of the algorithm is 'augustus' which of course i can use it inside Maker. Other two are not part of Maker. But, in case if I want to pass only the GFFs to Maker and ask it to merge the annotations (when overlap) and pick only annotation that are predicted in 2 out of 3 gffs. Is it possible? I prefer this approach, as we need to run a blast based validation step on predicted features before even try to merge them. Thats another reason why we dont prefer to use the augustus inside maker. Thanks, Gowthaman _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From gowthaman.ramasamy at seattlebiomed.org Thu May 24 15:21:30 2012 From: gowthaman.ramasamy at seattlebiomed.org (Gowthaman Ramasamy) Date: Thu, 24 May 2012 14:21:30 -0700 Subject: [maker-devel] Merging gene predictions.... 
In-Reply-To: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D8@mail02.sbri.org>
References: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D6@mail02.sbri.org>, <89080953C3D300419AACB6E63A7EEFBA5C8409F8D8@mail02.sbri.org>
Message-ID: <89080953C3D300419AACB6E63A7EEFBA5C8409F8DA@mail02.sbri.org>

Hi,

I am trying to install MAKER on CentOS. I was able to install all the Perl dependencies and external programs. perl Build.PL and ./Build install ran without errors or warnings. I did not enable MPI, though.

But when I start MAKER it returns "segmentation fault". I have no clue what is going wrong, or where to check for error logs.

Any help would be appreciated.

Thanks,
gowthaman

From gowthaman.ramasamy at seattlebiomed.org Thu May 24 15:22:16 2012
From: gowthaman.ramasamy at seattlebiomed.org (Gowthaman Ramasamy)
Date: Thu, 24 May 2012 14:22:16 -0700
Subject: [maker-devel] MAKER installation problem
Message-ID: <89080953C3D300419AACB6E63A7EEFBA5C8409F8DB@mail02.sbri.org>

Hi,

I am trying to install MAKER on CentOS. I was able to install all the Perl dependencies and external programs. perl Build.PL and ./Build install ran without errors or warnings. I did not enable MPI, though.

But when I start MAKER it returns "segmentation fault". I have no clue what is going wrong, or where to check for error logs.

Any help would be appreciated.

Thanks,
gowthaman

From bob_freeman at hms.harvard.edu Fri May 25 10:23:22 2012
From: bob_freeman at hms.harvard.edu (Bob Freeman)
Date: Fri, 25 May 2012 12:23:22 -0400
Subject: [maker-devel] Alternate translation table?
Message-ID: <454CA235-0DB6-451F-97C4-83D32E2E805A@hms.harvard.edu>

Hello all!

Unusual question here: I am running MAKER on a ciliate that uses a non-standard translation table for its translation products. I haven't found an option in the control files that I can change for the translation of the predicted transcripts. How or where can I go about this?
Tx,
Bob

-----------------------------------------------------
Bob Freeman, Ph.D.
Acorn Worm Informatics, Kirschner lab
Dept of Systems Biology, Alpert 524
Harvard Medical School
200 Longwood Avenue
Boston, MA 02115
617/432.2294, vox

"Sorry I'm late. Oh, God, that sounded insincere. I'm late."
-- Karen Walker, from Will and Grace

From carsonhh at gmail.com Mon May 28 06:43:34 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 28 May 2012 08:43:34 -0400
Subject: [maker-devel] Alternate translation table?
In-Reply-To: <454CA235-0DB6-451F-97C4-83D32E2E805A@hms.harvard.edu>
Message-ID:

An alternate translation table is not currently an option. It's one of those things that needs to be implemented but has not been yet. It's also not supported by many of the eukaryotic gene predictors MAKER uses. I could probably get something implemented for you to test in two to three weeks, though (there are a lot of places where the translation table comes into play). Let me know.

--Carson

From: Bob Freeman
Date: Friday, 25 May, 2012 12:23 PM
To:
Subject: [maker-devel] Alternate translation table?

Hello all!

Unusual question here: I am running MAKER on a ciliate that uses a non-standard translation table for its translation products. I haven't found an option in the control files that I can change for the translation of the predicted transcripts. How or where can I go about this?

Tx,
Bob

-----------------------------------------------------
Bob Freeman, Ph.D.
Acorn Worm Informatics, Kirschner lab
Dept of Systems Biology, Alpert 524
Harvard Medical School
200 Longwood Avenue
Boston, MA 02115
617/432.2294, vox

"Sorry I'm late. Oh, God, that sounded insincere. I'm late."
-- Karen Walker, from Will and Grace

From carsonhh at gmail.com Mon May 28 07:38:25 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 28 May 2012 09:38:25 -0400
Subject: [maker-devel] missing est2genome annotation
In-Reply-To: <4FBD1AF3.8020904@uni-wuerzburg.de>
Message-ID:

Sorry for the slow reply. I'm just getting back after traveling.

That's a percent coverage filter. You can set a percent coverage threshold in the maker_bopts.ctl file. Partial high-scoring alignments can be common. If you only filter by expect score, you would be surprised to see how ugly and confusing the alignments become.

Thanks,
Carson

On 12-05-23 1:14 PM, "Thomas Hackl" wrote:

>Hi again,
>
>I did some source code digging and caught the following line burying my
>exonerate alignments. I suspect it does so for a very good reason,
>therefore it would help me a lot if someone could explain to me, what is
>going on there.
>
>
>/lib/GI.pm l.1473
>next if $e->pAh< $pcov;
>
>
>Regards
>Thomas
>
>
>On 23.05.2012 14:01, Thomas Hackl wrote:
>> Hi,
>>
>> I used maker to annotate genomic contigs and among other stuff
>> provided transcripts from the transcriptome as est evidence. Blast and
>> exonerate work fine and produce valid alignments, the alignment files
>> exist in theVoid and look very good. Unfortunately neither the
>> evidence_0.gff nor the final .gff carry the corresponding feature
>> annotations.
>>
>> Any ideas why?
>>
>> Regards
>> Thomas
>>
>
>
>--
>Thomas Hackl
>Julius-Maximilians-Universität
>Department of Bioinformatics
>97074 Würzburg, Germany
>Fon: +49 931 - 31 86883
>Mail: thomas.hackl at uni-wuerzburg.de

From seoanezonjic at hotmail.com Tue May 22 06:55:12 2012
From: seoanezonjic at hotmail.com (p sz)
Date: Tue, 22 May 2012 12:55:12 +0000
Subject: [maker-devel] ipr_update_gff ERROR
Message-ID:

First, thanks for helping me with the previous error that I submitted. I'm still working on the same project and I have hit a new error.

I ran InterProScan with this command line:

iprscan_wrap -i parsed_input.all.maker.proteins.fasta -email seoanezonjic at hotmail.com -format raw

parsed_input.all.maker.proteins.fasta was generated with the tool fasta_merge. I used the output (attached to this email) and a GFF file (generated by a normal run of MAKER, attached to this email) with the ipr_update_gff script in this way:

ipr_update_gff BAC12_Clone_Pt314B2_Lib_Pt_7Ba__organism_Pinus_taeda__0.gff sc_interpro.sh.o109053

And I get this error:

Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 1.
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 1.
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 2.
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 2.
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 3.
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 3. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 8. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 8. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 9. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 9. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 10. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 10. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 11. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 11. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 12. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 12. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 13. 
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 13. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 14. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 14. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 15. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 15. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 18. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 18. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 18. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 18. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 19. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 19. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 19. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 19. 
Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 48. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 48. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 48. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 48. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 49. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 49. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 49. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 49. The gff file seems updated but i don't know if it works fine or is corrupt Thanks in advance -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: sc_interpro.sh.o109053 Type: application/octet-stream Size: 2542 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: BAC12_Clone_Pt314B2_Lib_Pt_7Ba__organism_Pinus_taeda__0.gff Type: application/octet-stream Size: 220243 bytes Desc: not available URL: From larriba.ed at gmail.com Fri May 25 10:01:41 2012 From: larriba.ed at gmail.com (Eduardo Larriba) Date: Fri, 25 May 2012 18:01:41 +0200 Subject: [maker-devel] Consensus gene models Message-ID: Hi Carson and everyone, I am working on the structural annotation of a filamentous fungus for which there is little EST or protein evidence. To generate consensus gene models from this limited evidence I used MAKER. For this I created the prediction files for GeneMark and SNAP. I ran MAKER using the ESTs of my organism (85), along with 5700 ESTs from the closest related organism. I made predictions with Augustus, SNAP, and GeneMark, using the training files for my organism, in the MAKER pipeline. Everything works fine. My problem is that when I collect the consensus sequences of all my contigs with the fasta_merge script (included in MAKER), I get a different list for each predictor, and the same happens when I try to get the GFF of all of them. Could you tell me how I can use the MAKER consensus to get a single list of genes? Or do I have to do it manually? Is there a possibility that MAKER evaluates the accuracy of each prediction and confirms it, rather than just producing a list of the different predictions? Thank you very much. -- Eduardo Larriba Tornel Universidad de Alicante. Lab. Fitopatología Dept. Ciencias del Mar y Biología Aplicada Pabellón 13. San Vicente del Raspeig Tel. 96 590 3400 ext 3280 -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon May 28 08:14:22 2012 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 28 May 2012 10:14:22 -0400 Subject: [maker-devel] Consensus gene models In-Reply-To: Message-ID: The consensus list is in the maker.proteins.fasta and maker.transcripts.fasta files. The predictor-specific lists are just for reference purposes (in case you want to see what the other predictors produced on their own, i.e.
without MAKER's intervention). The non-overlapping.fasta file in the same directory will contain consensus entries for models that were not supported by any evidence and don't overlap any gene models in maker.transcripts.fasta (think of these as the maybe genes and of maker.transcripts.fasta as the very likely genes). You can set keep_preds=1 if you just want MAKER to keep everything with or without support and just produce consensus models (probably OK on a fungus, but I wouldn't recommend it on other eukaryotes because false positive rates will be very high). Thanks, Carson From: Eduardo Larriba Date: Friday, 25 May, 2012 12:01 PM To: Subject: [maker-devel] Consensus gene models Hi Carson and everyone, I am working on the structural annotation of a filamentous fungus for which there is little EST or protein evidence. To generate consensus gene models from this limited evidence I used MAKER. For this I created the prediction files for GeneMark and SNAP. I ran MAKER using the ESTs of my organism (85), along with 5700 ESTs from the closest related organism. I made predictions with Augustus, SNAP, and GeneMark, using the training files for my organism, in the MAKER pipeline. Everything works fine. My problem is that when I collect the consensus sequences of all my contigs with the fasta_merge script (included in MAKER), I get a different list for each predictor, and the same happens when I try to get the GFF of all of them. Could you tell me how I can use the MAKER consensus to get a single list of genes? Or do I have to do it manually? Is there a possibility that MAKER evaluates the accuracy of each prediction and confirms it, rather than just producing a list of the different predictions? Thank you very much. -- Eduardo Larriba Tornel Universidad de Alicante. Lab. Fitopatología Dept. Ciencias del Mar y Biología Aplicada Pabellón 13. San Vicente del Raspeig Tel.
96 590 3400 ext 3280 _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon May 28 08:18:26 2012 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 28 May 2012 10:18:26 -0400 Subject: [maker-devel] ipr_update_gff ERROR In-Reply-To: Message-ID: This error would happen if some results exist in the iprscan output but don't match gene entries in the GFF3 file. If I could see the original files I could tell you which ones. This can happen, for example, if you combine results from the non-overlapping.fasta files with the maker.proteins.fasta files. Models in the non-overlapping.fasta file are not genes in the GFF3 (they are match/match_part entries), so errors happen. Thanks, Carson From: p sz Date: Tuesday, 22 May, 2012 8:55 AM To: Subject: [maker-devel] ipr_update_gff ERROR First, thanks for helping me with the previous error that I submitted. I'm still working on the same project and I get a new error. I ran interproscan with this command line: iprscan_wrap -i parsed_input.all.maker.proteins.fasta -email seoanezonjic at hotmail.com -format raw The file parsed_input.all.maker.proteins.fasta was generated with the tool fasta_merge. I used the output (attached to this email) and a gff file (generated by a normal run of maker, attached to this email) with the ipr_update_gff script in this way: ipr_update_gff BAC12_Clone_Pt314B2_Lib_Pt_7Ba__organism_Pinus_taeda__0.gff sc_interpro.sh.o109053 And I get this error: Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 1. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 1.
[...]
The gff file seems updated but I don't know if it works fine or is corrupt. Thanks in advance _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Tue May 29 06:37:55 2012 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 29 May 2012 08:37:55 -0400 Subject: [maker-devel] can't call method strand on an undefined value ERROR: Failed while flattening protein clusters In-Reply-To: Message-ID: Use this command to check out the latest unreleased test version, and let me know if you still get the error. Command --> svn co svn://malachite.genetics.utah.edu/maker/trunk maker User: yandell_guest Password: y at ndell_Gu3st Thanks, Carson From: Scott Geib Date: Tuesday, 22 May, 2012 2:42 AM To: Subject: [maker-devel] can't call method strand on an undefined value ERROR: Failed while flattening protein clusters Hi, Using maker 2.24, I am getting the following error (see below) in the protein2genome widget. I also get the same error with est2genome. This happens with my own data (testing on a single scaffold), but not with the test data supplied with maker (dpp files in data folder).
Scott Widget::exonerate::protein2genome: /data0/opt/AlignmentSoftware/exonerate/exonerate-2.2.0-x86_64/bin/exonerate -q /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaf fold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/UniRef90_UPI00019 4D3FC.for.1588546-1589203.9.fasta -t /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaf fold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/scaffold00001.158 8546-1589203.9.fasta -Q protein -T dna -m protein2genome --softmasktarget --percent 20 --showcigar > /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaf fold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/scaffold00001.158 8546-1589203.UniRef90_UPI000194D3FC.p_exonerate.9 #-------------------------------# cleaning blastx... in cluster::shadow_cluster... ...finished clustering. cleaning clusters.... total clusters:2 now processing 0 ...processing 0 of 23 ...processing 1 of 23 ...processing 2 of 23 ...processing 3 of 23 ...processing 4 of 23 ...processing 5 of 23 ...processing 6 of 23 ...processing 7 of 23 ...processing 8 of 23 ...processing 9 of 23 ...processing 10 of 23 ...processing 11 of 23 ...processing 12 of 23 ...processing 13 of 23 ...processing 14 of 23 ...processing 15 of 23 ...processing 16 of 23 ...processing 17 of 23 ...processing 18 of 23 ...processing 19 of 23 ...processing 20 of 23 ...processing 21 of 23 total clusters:2 now processing 0 in cluster::shadow_cluster... ...finished clustering. cleaning clusters.... 
total clusters:2 now processing 0 ...processing 0 of 20 ...processing 1 of 20 ...processing 2 of 20 ...processing 3 of 20 ...processing 4 of 20 ...processing 5 of 20 ...processing 6 of 20 ...processing 7 of 20 ...processing 8 of 20 ...processing 9 of 20 ...processing 10 of 20 ...processing 11 of 20 ...processing 12 of 20 ...processing 13 of 20 ...processing 14 of 20 ...processing 15 of 20 ...processing 16 of 20 ...processing 17 of 20 ...processing 18 of 20 total clusters:2 now processing 0 Can't call method "strand" on an undefined valueERROR: Failed while flattening protein clusters ERROR: Chunk failed at level:11, tier_type:2 FAILED CONTIG:scaffold00001 _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From bob_freeman at hms.harvard.edu Tue May 29 10:30:23 2012 From: bob_freeman at hms.harvard.edu (Bob Freeman) Date: Tue, 29 May 2012 12:30:23 -0400 Subject: [maker-devel] Alternate translation table? In-Reply-To: References: Message-ID: Thanks, Carson, for the update on this. No need to implement something. I'll keep it simple and translate the collected transcripts using an appropriate translation table. -Bob On May 28, 2012, at 8:43 AM, Carson Holt wrote: > The alternate translation table is not currently an option. It's one of those things that needs to be implemented, but has not been yet. It's also not supported by many of the eukaryotic gene predictors MAKER uses. > > I could probably get something implemented for you to test in two to three weeks though (there are a lot of places where the translation table comes into play). Let me know. > > --Carson > > > > From: Bob Freeman > Date: Friday, 25 May, 2012 12:23 PM > To: > Subject: [maker-devel] Alternate translation table? > > Hello all! 
> > Unusual question here: I am running MAKER on a ciliate that uses a non-standard translation table for its translation products. I haven't found an option in the control files that I can change for the translation of the predicted transcripts. How or where can I go about this? > > Tx, > Bob > > > _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org ----------------------------------------------------- Bob Freeman, Ph.D. Acorn Worm Informatics, Kirschner lab Dept of Systems Biology, Alpert 524 Harvard Medical School 200 Longwood Avenue Boston, MA 02115 617/432.2294, vox "Sorry I'm late. Oh, God, that sounded insincere. I'm late." -- Karen Walker, from Will and Grace -------------- next part -------------- An HTML attachment was scrubbed... URL: From gowthaman.ramasamy at seattlebiomed.org Tue May 29 15:54:33 2012 From: gowthaman.ramasamy at seattlebiomed.org (Gowthaman Ramasamy) Date: Tue, 29 May 2012 14:54:33 -0700 Subject: [maker-devel] Can maker select a gene model based on #algoritham predicted it Message-ID: <89080953C3D300419AACB6E63A7EEFBA5C8409F8EB@mail02.sbri.org> Hi Carson, Thanks for all the help during the long weekend, in spite of that long drive. I am still trying to imagine that. I now have maker considering our own predictions via pred_gff, and using augustus and genemark (with our training model). And I was able to use altest and protein evidence. Maker happily picks one gene model when there is an overlap between three different predictions. But when I look at the gff, it seems to pick a gene model only when there is est/protein evidence. It leaves out some genes even though they are predicted by all three algorithms. Of course, keep_preds=1 helps to keep all the models, but this leads to overprediction. I am looking for something in between, and would like to know if that is possible?
1) Pick a gene model if it has evidence (est/prot etc.) irrespective of how many algorithms predicted it 2) In the absence of extrinsic evidence (est/prot etc.), pick a gene model if it is predicted by at least two algorithms. Or even simpler: I have ab initio predictions from three algorithms. Can I output those genes that are supported by at least two of them? I care less about the exactness of gene boundaries. Thanks, Gowthaman PS: With my recent attempts, I learned a couple of things about maker and other associated tools that are not documented in the gmod-maker wiki. Is it possible/OK if I add content to it? I am okay with running it by you before making it public. From carsonhh at gmail.com Wed May 30 06:54:32 2012 From: carsonhh at gmail.com (Carson Holt) Date: Wed, 30 May 2012 08:54:32 -0400 Subject: [maker-devel] Can maker select a gene model based on #algoritham predicted it In-Reply-To: <89080953C3D300419AACB6E63A7EEFBA5C8409F8EB@mail02.sbri.org> Message-ID: It's not an option in exactly the way you are specifying, but there is something I usually do for annotation that works well. I run interproscan or rpsblast on the non_overlapping.proteins.fasta file and select just those non-overlapping models that have a recognizable protein domain (just searching the Pfam domain space is more than sufficient). Then I provide the selected results to model_gff, and provide the previous maker results to the maker_gff option (with all reannotation pass options set to 1 and all analysis options turned off). This adds models with at least recognizable domains (as even multiple gene predictors can overpredict in a similar way). Attached is a script to help select predictions and upgrade them to models in GFF3 format. If you have questions let me know. Thanks, Carson On 12-05-29 5:54 PM, "Gowthaman Ramasamy" wrote: >Hi Carson, >Thanks for all the help during the long weekend, in spite of that long >drive. I am still trying to imagine that.
> >I now have maker to consider our own prediction via pred_gff, and use >augustus and gene mark (with our training model). And i was able to use >altest and protein evidences. Maker happily picks one gene model when >there is a overlap between three different predictions. But, when I look >at the gff, it seems like it picks a gene model only when there is an >est/protein evidence. It leaves out some genes even though, they are >predicted by all three algorithms. Of course, keep_pred=1 helps to keep >all the models. This kind of leads to over prediction. > >But, I am looking for something in between. And would like to know if >that is possible? >1) Pick a gene model if it has an evidence from (est/prot etc...) >irrespective of how many algorithms predicted it >2) In the absence of extrinsic evidence (est/prot etc), pick a gene model >if that is predicted by at least two algorithms. > >Or even simpler: >I have ab-initio predictions from three algorithms, Can I output, those >genes that is supported by at least two of them. I care less about >exactness of gene boundaries. > >Thanks, >Gowthaman > >PS: With my recent attempts, i learned couple things about maker/other >associated tools that is not documented in gmod-maker wiki. Is it >possible/ok if I add contents to it. I am okay with running it by you >before making it public. -------------- next part -------------- A non-text attachment was scrubbed... Name: gff3_preds2models Type: application/octet-stream Size: 4777 bytes Desc: not available URL: From mikael.durling at slu.se Thu May 31 06:25:31 2012 From: mikael.durling at slu.se (=?iso-8859-1?Q?Mikael_Brandstr=F6m_Durling?=) Date: Thu, 31 May 2012 14:25:31 +0200 Subject: [maker-devel] maker leaving large numbers of defunct zombies Message-ID: Hello, I've been working lately to set up maker for annotation work on a few fungal genomes. I've got mpi maker up and running now, however, I notice that maker is leaving a lot of perl processes behind. 
This happens to the extent that the process table on the system gets filled up after a few hours of run time. Right now the process tree after three hours of running looks like this:

|-sge_execd-+-sge_shepherd---bash---mpirun-+-maker-+-maker
|           |                              |       `-perl
|           |                              |-maker-+-maker
|           |                              |       |-maker---1371*[perl]
|           |                              |       `-perl
|           |                              |-maker-+-maker
|           |                              |       |-maker---1348*[perl]
|           |                              |       `-perl
|           |                              |-maker-+-maker
|           |                              |       |-maker---1384*[perl]
|           |                              |       `-perl

...and so on for all mpi processes, except for the controlling processes. What perl programs is maker calling that might end up as zombies? I've had a brief look at the source to no avail, but would be happy to dig further with some pointers for where to look. This is run with the 2.25-beta from the web page, perl 5.16.0 and openmpi 1.4.5. Thanks, Mikael ------------------------------------- Mikael Brandström Durling, PhD Assistant Professor Sveriges lantbruksuniversitet Swedish University of Agricultural Sciences Uppsala BioCenter Dept of Forest Mycology and Plant Pathology Box 7026, 75007 Uppsala Visiting address: Almas Allé 5 Telefon: 018-671512 mikael.durling at slu.se, www.slu.se/mykopat From mikael.durling at slu.se Thu May 31 06:34:30 2012 From: mikael.durling at slu.se (=?iso-8859-1?Q?Mikael_Brandstr=F6m_Durling?=) Date: Thu, 31 May 2012 14:34:30 +0200 Subject: [maker-devel] Using GDBM_File instead of DB_File Message-ID: Hello, I've been struggling for a few days to get maker up and running with MPI on a debian squeeze system. After compiling a new perl 5.16 exclusively for maker, I narrowed the segfaults down to DB_File. Even recompiling and updating that module did not help. After checking for dependencies on DB_File in maker, I concluded that the only dependency was through GI::localize_file, which expects the FastaDB to be instantiated with DB_File. However, FastaDB can run on GDBM_File too. I patched the calls to GI::localize_file in maker to handle the .pag/.dir extensions used by GDBM (below).
With this patch applied maker is running for me, even when I have deleted the DB_File module from the perl path and made sure that GDBM_File is installed. My basic question is whether there is any other dependency on DB_File that I have missed and that may break things. cheers, Mikael

--- maker.orig 2012-03-30 15:48:05.000000000 +0200
+++ maker 2012-05-31 10:35:30.253022648 +0200
@@ -512,7 +515,12 @@
     }
     if($size > 1){
         carp "Calling GI::localize_file" if($main::debug);
-        GI::localize_file("$gdbfile.index");
+        if( -f "$gdbfile.index.dir" ){
+            GI::localize_file("$gdbfile.index.dir");
+            GI::localize_file("$gdbfile.index.pag");
+        }else{
+            GI::localize_file("$gdbfile.index");
+        }
         carp "Calling GI::localize_file" if($main::debug);
         $gdbfile = GI::localize_file($gdbfile);
     }

From carsonhh at gmail.com Thu May 31 07:04:58 2012 From: carsonhh at gmail.com (Carson Holt) Date: Thu, 31 May 2012 09:04:58 -0400 Subject: [maker-devel] Using GDBM_File instead of DB_File In-Reply-To: Message-ID: DB_File is being called by Bio::DB::Fasta. I can check the object returned when using GDBM_File instead to see if the index file names are stored on the object, as I'm just assuming an extension of '.index'. I'll look around to see if the extension name is assumed anywhere else. Thanks, Carson On 12-05-31 8:34 AM, "Mikael Brandström Durling" wrote: >Hello, > >I've been struggling for a few days to get maker up and running with MPI >on a debian squeeze system. Compiling a new perl 5.16 exclusively for >maker I wound down to that the segfaults came from DB_File. Even by >recompiling and updating that module, nothing worked. After checking for >dependencies on DB_File in maker, I concluded that the only dependency >was through the GI::localize_file, which expects the FastaDB to be >instantiated with DB_File. However, FastaDB can run on GDBM_File too. I >patched the calls to GI::localize_file in maker to handle the .pag/.dir >extensions used by GDBM (below).
With this patch applied maker is running >for me, even when I have deleted the DB_File module from the perl path >and made sure that GDBM_File is installed. My basic question is if there >is any other dependency for DB_File which I have missed which may break >things? > >cheers, >Mikael > >--- maker.orig 2012-03-30 15:48:05.000000000 +0200 >+++ maker 2012-05-31 10:35:30.253022648 +0200 >@@ -512,7 +515,12 @@ > } > if($size > 1){ > carp "Calling GI::localize_file" if($main::debug); >- GI::localize_file("$gdbfile.index"); >+ if( -f "$gdbfile.index.dir" ){ >+ GI::localize_file("$gdbfile.index.dir"); >+ GI::localize_file("$gdbfile.index.pag"); >+ }else{ >+ GI::localize_file("$gdbfile.index"); >+ } > carp "Calling GI::localize_file" if($main::debug); > $gdbfile = GI::localize_file($gdbfile); > } > > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From carsonhh at gmail.com Thu May 31 07:17:20 2012 From: carsonhh at gmail.com (Carson Holt) Date: Thu, 31 May 2012 09:17:20 -0400 Subject: [maker-devel] maker leaving large numbers of defunct zombies In-Reply-To: Message-ID: MAKER uses IPC::Open3 to open almost all external applications, including a helper script called every once in a while that helps check file locks on NFS. MAKER then calls waitpid to reap the processes, as IPC::Open3 doesn't auto-reap. The only time previously I've seen issues with zombie accumulation was with MPICH2 when it moved from the MPD process manager to Hydra. Hydra had certain broken signal handling issues that I had to bug the MPICH2 developers about and they fixed it. It is possible that the issue you are having may be with OpenMPI or with perl 5.16. I currently use perl 5.12. Perl instituted something called safe signals in either 5.6 or 5.8 and there may be some updates in 5.16 where they've been changing those around again. 
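For readers following the thread, here is a minimal sketch of the spawn-then-reap pattern described above. It is written in Python rather than Perl, with subprocess.Popen standing in for IPC::Open3 and Popen.wait() standing in for waitpid. It is not MAKER's actual code, and the function name run_and_reap is made up for illustration. A child that has exited but has not yet been waited on stays in the process table as a defunct entry, which is the symptom reported in this thread.

```python
import subprocess
import sys


def run_and_reap(commands):
    """Start every command, then wait on each child and return the exit codes.

    Hypothetical helper for illustration only; not part of MAKER.
    """
    children = [
        subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        for cmd in commands
    ]
    # Until wait() is called, a finished child lingers as a defunct (zombie)
    # process-table entry. wait() collects its exit status, which plays the
    # role that waitpid plays after IPC::Open3 in Perl.
    return [child.wait() for child in children]


if __name__ == "__main__":
    demo = [
        [sys.executable, "-c", "pass"],
        [sys.executable, "-c", "raise SystemExit(3)"],
    ]
    print(run_and_reap(demo))
```

If the wait step is skipped or never reached (for example because a signal handler misbehaves, which is one of the possibilities raised above), the children pile up exactly as in the process tree posted earlier.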
I can try installing a copy of 5.16 to test with and OpenMPI to see if I can replicate the error. Thanks, Carson On 12-05-31 8:25 AM, "Mikael Brandström Durling" wrote: >Hello, > >I've been working lately to set up maker for annotation work on a few >fungal genomes. I've got mpi maker up and running now, however, I notice >that maker is leaving a lot of perl processes behind. This >happens to the extent that the process table on the system gets filled up >after a few hours run time. Right now the process tree after three hours >running looks like this:
>
> |-sge_execd-+-sge_shepherd---bash---mpirun-+-maker-+-maker
> |           |                              |       `-perl
> |           |                              |-maker-+-maker
> |           |                              |       |-maker---1371*[perl]
> |           |                              |       `-perl
> |           |                              |-maker-+-maker
> |           |                              |       |-maker---1348*[perl]
> |           |                              |       `-perl
> |           |                              |-maker-+-maker
> |           |                              |       |-maker---1384*[perl]
> |           |                              |       `-perl
>
>...and so on for all mpi processes, except for the controlling processes. > >What perl programs is maker calling, that might end up as zombies? I've >had a brief look at the source to no avail, but would be happy to dig >further with some pointers for where to look. > >This is run with the 2.25-beta from the web page, perl 5.16.0 and openmpi >1.4.5. > >Thanks, >Mikael > > > > > > > > > > >------------------------------------- >Mikael Brandström Durling, PhD >Assistant Professor > >Sveriges lantbruksuniversitet >Swedish University of Agricultural Sciences > >Uppsala BioCenter >Dept of Forest Mycology and Plant Pathology >Box 7026, 75007 Uppsala >Visiting address: Almas Allé
5 >Telefon: 018-671512 >mikael.durling at slu.se, www.slu.se/mykopat > > > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From mikael.durling at slu.se Thu May 31 07:57:06 2012 From: mikael.durling at slu.se (=?iso-8859-1?Q?Mikael_Brandstr=F6m_Durling?=) Date: Thu, 31 May 2012 15:57:06 +0200 Subject: [maker-devel] maker leaving large numbers of defunct zombies In-Reply-To: References: Message-ID: I saw the same problem with the latest MPICH2 using hydra too, so it might boil down to perl/openmpi interactions. I didn't see this problem with the debian supplied perl 5.10, but then I had intermittent segfaults with in DB_File and libpthread. Seemed to be some interaction with the LD_PRELOADed libmpi. That requirement for preloading libmpi was easiest solved by compiling openmpi with --disable-dlopen. Thanks, Mikael 31 maj 2012 kl. 15:17 skrev Carson Holt: > MAKER uses IPC::Open3 to open almost all external applications, including > a helper script called every once in a while that helps check file locks > on NFS. > > MAKER then calls waitpid to reap the processes, as IPC::Open3 doesn't > auto-reap. The only time previously I've seen issues with zombie > accumulation was with MPICH2 when it moved from the MPD process manager to > Hydra. Hydra had certain broken signal handling issues that I had to bug > the MPICH2 developers about and they fixed it. It is possible that the > issue you are having may be with OpenMPI or with perl 5.16. I currently > use perl 5.12. Perl instituted something called safe signals in either > 5.6 or 5.8 and there may be some updates in 5.16 where they've been > changing those around again. > > I can try installing a copy of 5.16 to test with and OpenMPI to see if I > can replicate the error. 
> > Thanks, > Carson > > > > On 12-05-31 8:25 AM, "Mikael Brandstr?m Durling" > wrote: > >> Hello, >> >> I've been working lately to set up maker for annotation work on a few >> fungal genomes. I've got mpi maker up and running now, however, I notice >> that maker is leaving a lot of perl processes behind. This >> happens to the extent that the process table on the system gets filled up >> after a few hours run time. Right now the process tree after three hours >> running looks like this: >> >> |-sge_execd-+-sge_shepherd---bash---mpirun-+-maker-+-maker >> | | | `-perl >> | | |-maker-+-maker >> | | | >> |-maker---1371*[perl] >> | | | `-perl >> | | |-maker-+-maker >> | | | >> |-maker---1348*[perl] >> | | | `-perl >> | | |-maker-+-maker >> | | | >> |-maker---1384*[perl] >> | | | `-perl >> >> ...and so on for all mpi processes, except for the controlling processes. >> >> What perl programs is maker calling, that might end up as zombies? I've >> had a brief look at the source to no avail, but would be happy to dig >> further with some pointers for where to look. >> >> This is run with the 2.25-beta from the web page, perl 5.16.0 and openmpi >> 1.4.5. >> >> Thanks, >> Mikael >> >> >> >> >> >> >> >> >> >> >> ------------------------------------- >> Mikael Brandstr?m Durling, PhD >> Assistant Professor >> >> Sveriges lantbruksuniversitet >> Swedish University of Agricultural Sciences >> >> Uppsala BioCenter >> Dept of Forest Mycology and Plant Pathology >> Box 7026, 75007 Uppsala >> Visiting address: Almas All? 
5 >> Telefon: 018-671512 >> mikael.durling at slu.se, www.slu.se/mykopat >> >> >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > From carsonhh at gmail.com Tue May 1 16:07:47 2012 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 01 May 2012 18:07:47 -0400 Subject: [maker-devel] gff3_preds2models usage question In-Reply-To: Message-ID: Sorry for the slow response. The gff3_preds2models script has been deprecated for some time now (isn't even in the release code anymore), and the old one won't work with the new library. I've attached a made from scratch drop-in replacement that you can use to do what the old script would have done. In the current release of MAKER, instead of the gff3_preds2models script users can just give MAKER a set of predictions in GFF3 format (pred_gff option) and set keep_preds=1 (then leave all other options blank). The predictions given will the be converted into gene models. Thanks, Carson From: Walter Eckalbar Date: Tuesday, 3 April, 2012 7:28 PM To: Subject: [maker-devel] gff3_preds2models usage question Hello maker developers and users, I am attempting to use the gff3_preds2models scripts, but running into a few issues. Initially, I hit errors that seemed to be fixed by installing CGI and its dependancies. However, that during that installation a few tests did fail. I can provide error logs if that would be helpful, however, I went on to install and attempt gff3_preds2models anyway. What I am currently doing is running gff3_merge first, to gather the maker outputs. I am doing so with both the -n option on and off. When providing the gff3 file with the sequence I get the following error from gff3_preds2models: Undefined subroutine &maker::auto_annotator::annotate called at /Users/Walter/Bioinformatics/Tools/maker/bin/gff3_preds2models line 97, line 992291. 
This seemed to be the same error as that of what someone else saw on these boards, but I did not see a later email resolving the issue. I also tried giving it just the gff3 without the sequences at the bottom of the file and then I get this error: ERROR: There was a problem in the writing the fasta entry Either no sequence was given, or there was an error in writing This leads me to believe I should be using the one with the sequence, but I am not certain of that. I see it might be possible to go from maker outputs to chado database then to gene->mRNA->exon gff3s, but I have not set up my machine for XML or chado yet, and it does not appear trivial. Thanks for the help, Walter _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: gff3_preds2models Type: application/octet-stream Size: 4778 bytes Desc: not available URL: From weckalba at asu.edu Tue May 1 19:33:00 2012 From: weckalba at asu.edu (Walter Eckalbar) Date: Tue, 1 May 2012 18:33:00 -0700 Subject: [maker-devel] gff3_preds2models usage question In-Reply-To: References: Message-ID: Hi Carson, Thanks for the response, even a late one, and thanks for the script. I'll certainly be giving that a try. Walter On 1 May 2012 15:07, Carson Holt wrote: > Sorry for the slow response. The gff3_preds2models script has been > deprecated for some time now (isn't even in the release code anymore), and > the old one won't work with the new library. > > I've attached a made from scratch drop-in replacement that you can use to > do what the old script would have done. 
In the current release of MAKER, > instead of the gff3_preds2models script users can just give MAKER a set of > predictions in GFF3 format (pred_gff option) and set keep_preds=1 (then > leave all other options blank). The predictions given will the be > converted into gene models. > > Thanks, > Carson > > > > From: Walter Eckalbar > Date: Tuesday, 3 April, 2012 7:28 PM > To: > Subject: [maker-devel] gff3_preds2models usage question > > Hello maker developers and users, > > I am attempting to use the gff3_preds2models scripts, but running into a > few issues. > > Initially, I hit errors that seemed to be fixed by installing CGI and its > dependancies. However, that during that installation a few tests did fail. > I can provide error logs if that would be helpful, however, I went on to > install and attempt gff3_preds2models anyway. > > What I am currently doing is running gff3_merge first, to gather the maker > outputs. I am doing so with both the -n option on and off. When providing > the gff3 file with the sequence I get the following error from > gff3_preds2models: > > Undefined subroutine &maker::auto_annotator::annotate called at > /Users/Walter/Bioinformatics/Tools/maker/bin/gff3_preds2models line 97, > line 992291. > > This seemed to be the same error as that of what someone else saw on these > boards, but I did not see a later email resolving the issue. > > I also tried giving it just the gff3 without the sequences at the bottom > of the file and then I get this error: > > ERROR: There was a problem in the writing the fasta entry > Either no sequence was given, or there was an error in writing > > This leads me to believe I should be using the one with the sequence, but > I am not certain of that. > > I see it might be possible to go from maker outputs to chado database then > to gene->mRNA->exon gff3s, but I have not set up my machine for XML or > chado yet, and it does not appear trivial. 
> > Thanks for the help, > > Walter > _______________________________________________ maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From qwang at uwyo.edu Thu May 3 10:23:16 2012 From: qwang at uwyo.edu (Qiurong Wang) Date: Thu, 3 May 2012 10:23:16 -0600 Subject: [maker-devel] MAKER download problem Message-ID: <8754B55D-C119-4A7C-9594-BEAEAD3BB939@uwyo.edu> Hi, I was trying to download MAKER, but I couldn't open the download page. Could you please help me to figure out the problem? Thanks a lot! Qiurong Wang PhD candidate Department of Botany University of Wyoming Department 3165, 1000 E University Ave. Laramie, Wyoming 82071, USA Phone: 307-766-2634 Email: qwang at uwyo.edu From barry.moore at genetics.utah.edu Thu May 3 11:51:48 2012 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Thu, 3 May 2012 11:51:48 -0600 Subject: [maker-devel] MAKER download problem In-Reply-To: <8754B55D-C119-4A7C-9594-BEAEAD3BB939@uwyo.edu> References: <8754B55D-C119-4A7C-9594-BEAEAD3BB939@uwyo.edu> Message-ID: <6B24C00B-6C70-4A40-BA7C-E3AC0C7F5E25@genetics.utah.edu> Hi all, The web server hosting the MAKER licensing application and MAKER code distribution was attacked this week. The University of Utah is currently blocking access to that server from outside University IP space. We are working hard to move all of the content and web-applications from that machine to a new server and I expect to have the MAKER services restored over the weekend. This is affecting the ability to submit new licenses for MAKER and to download the MAKER code. In addition those with existing licenses will need to update their code links for future code updates. For those of you on campus in Utah or with campus VPN access, the server is running and available. 
Updates on the status of the server and details about new links to the MAKER code will be posted to the mailing list soon. Barry On May 3, 2012, at 10:23 AM, Qiurong Wang wrote: > Hi, > > I was trying to download MAKER, but I couldn't open the download page. Could you please help me to figure out the problem? Thanks a lot! > > > Qiurong Wang > PhD candidate > Department of Botany > University of Wyoming > Department 3165, 1000 E University Ave. Laramie, Wyoming 82071, USA > Phone: 307-766-2634 > Email: qwang at uwyo.edu > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org Barry Moore Research Scientist Dept. of Human Genetics University of Utah Salt Lake City, UT 84112 -------------------------------------------- (801) 585-3543 -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Thu May 3 12:00:01 2012 From: carsonhh at gmail.com (Carson Holt) Date: Thu, 03 May 2012 14:00:01 -0400 Subject: [maker-devel] MAKER download problem In-Reply-To: <8754B55D-C119-4A7C-9594-BEAEAD3BB939@uwyo.edu> Message-ID: The server hosting the file to download is down temporarily. I'll put a copy of MAKER on a separate server and e-mail the link to you. --Carson On 12-05-03 12:23 PM, "Qiurong Wang" wrote: >Hi, > >I was trying to download MAKER, but I couldn't open the download page. >Could you please help me to figure out the problem? Thanks a lot! > > >Qiurong Wang >PhD candidate >Department of Botany >University of Wyoming >Department 3165, 1000 E University Ave. 
Laramie, Wyoming 82071, USA >Phone: 307-766-2634 >Email: qwang at uwyo.edu > > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From cjfields at illinois.edu Tue May 15 09:01:31 2012 From: cjfields at illinois.edu (Fields, Christopher J) Date: Tue, 15 May 2012 15:01:31 +0000 Subject: [maker-devel] mail list and Trac Message-ID: <64A6759A-9FD1-4AEE-BCEA-151B0D791ADD@illinois.edu> Just wanted to point out, I noticed the mail list is not being indexed on Google Groups any more (nothing in May). Also, any status on Trac? chris From barry.moore at genetics.utah.edu Tue May 15 10:34:29 2012 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Tue, 15 May 2012 10:34:29 -0600 Subject: [maker-devel] mail list and Trac In-Reply-To: <64A6759A-9FD1-4AEE-BCEA-151B0D791ADD@illinois.edu> References: <64A6759A-9FD1-4AEE-BCEA-151B0D791ADD@illinois.edu> Message-ID: Thanks Chris, I'll check on the Google groups. The MAKER Trac server was on a server that we had to shut down a couple weeks ago and I had made moving it a lower priority since I didn't think it was getting much use. I'll get it moved over and bump up the priority on that move since I know someone is looking at it. B On May 15, 2012, at 9:01 AM, Fields, Christopher J wrote: > Just wanted to point out, I noticed the mail list is not being indexed on Google Groups any more (nothing in May). Also, any status on Trac? > > chris > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org Barry Moore Research Scientist Dept. of Human Genetics University of Utah Salt Lake City, UT 84112 -------------------------------------------- (801) 585-3543 -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From anastasia.gioti at scilifelab.se Thu May 17 05:27:04 2012
From: anastasia.gioti at scilifelab.se (Anastasia Gioti)
Date: Thu, 17 May 2012 13:27:04 +0200
Subject: [maker-devel] Use pass-through system to add missing genes
In-Reply-To: <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu>
References: <4FE7CD5B-FC1C-43E7-AC41-A05823348B99@scilifelab.se> <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu>
Message-ID: <4D9922AF-A917-4747-9B7C-CFE9F142D51C@scilifelab.se>

Hi Barry,

Thanks for your detailed instructions. You understood correctly that I have already included the proteins of the closely related species in my protein evidence dataset, but still did not get the genes. I have now blasted (BLASTP) the 949 missing proteins from this species against my nonoverlaping_abinits.fasta proteins and have found 618 good hits, which I guess I can promote to models using routine no. 2 of your last email and Carson's script gff3_select.

I have also looked at the rest of the proteins (331), for which there was no model in nonoverlaping_abinits.fasta. I will try to describe 2 examples I looked at in Apollo:

1) The ab initio models predicted a ~7.5 kb gene covering 3 genes (as predicted in the closely related species). Blastx+protein2genome similarities were reported for two of these genes, but not for the third (the one in the middle). MAKER finally decided to call two genes, respecting the blastx+protein2genome evidence, but the third was lost. I have previously reported here that MAKER tends to fuse genes in multi-exonic genes, and others reported that too; I remember you proposed changing a parameter to alter this. This is something to keep in mind for the final strategy that I am trying to decide on (for the moment I have not rerun MAKER). For this case, ab initio models do not exist for the gene (in the sense that the existing models overlap many genes) and the similarity to the protein of the closely related species was not judged sufficient, although when I look at a TBLASTN alignment for this area it looks fine to me.

2) Only the 3' end of the gene was called by MAKER, despite blastx+protein2genome evidence from the closely related species for the entire region. Ab initio models existed as 2 separate genes, one for the 3' end region (finally retained by MAKER in a consensus decision, I guess) and one for the 5' region, but here not all predictors called an ORF, and finally nothing was called in this region. This case is rather a misannotation, but one that misses a very important part of the gene.

I hope my descriptions are clear; otherwise I can provide you the GFF file of these 2 examples to look at yourself. I am not very clear about what to do about these 331 cases (which I also do not know how to examine, except by viewing random examples in Apollo). I feel that a second MAKER run would probably be the solution, this time providing as pred_gff the result of a BLAST against the 331. But the existing annotations would then still have to be updated somehow, as the new predictions are in conflict with them (see example 2). I am a bit confused. To recap, what would you suggest for the 331 still-missing proteins, in terms of assessing their profiles in a rather automatic way and including them in my annotations without going deep into manual gene curation?

Many thanks,
Anastasia

>
> Let me just restate what you've said so that I can be sure that I am
> correct about what you've already done. You have run Maker with
> SNAP, Genemark and Augustus using EST from a closely related species
> (passed to altest) and protein evidence from other fungi. You are
> missing about 1,000 genes compared to the species that provided the
> EST alignments.
You say there is good evidence that these genes
> exist from the alignments and I assume by this that you mean the
> EST/protein alignments that Maker produced.
>
> 1) Is the closely related fungus annotated and if so have you
> included its proteins in the evidence set that you provided to
> Maker? If you haven't provided these proteins as evidence to maker
> then you should do this. You can re-run maker passing your original
> models back through like this:
>
> #-----Re-annotation Using MAKER Derived GFF3
> genome_gff=original_maker_annotations.gff3
> est_pass=1
> altest_pass=1
> protein_pass=1
> rm_pass=1
> model_pass=1
> pred_pass=1
> other_pass=1
>
> #-----Protein Homology Evidence (for best results provide a file for at least one)
> protein=proteins_from_closely_related.fasta
> ## OR it sounds like you've already aligned these with exonerate?
> protein_gff=proteins_from_closely_related_already_aligned.gff
>
> 2) If you've already included those closely related species proteins
> but still didn't get the 1,000 genes, then take your
> nonoverlaping_abinits.fasta and blast them directly against your
> closely related proteins. Presumably they don't hit too well
> because if they did they should have been promoted to predictions by
> Maker the first time, but here you can decide yourself what
> thresholds to allow to keep the abinit predictions that hit the
> closely related species proteins. If you filter your blast hits the
> way you want and keep the names of the abinit predictions that pass
> your filter, then use the script Carson attached; it will generate
> an abinit prediction GFF file with only the predictions you
> selected. You can then pass those predictions back to Maker and
> force it to keep them and Maker will turn them from predictions
> (match/match_part) into gene models.
>
> #-----Re-annotation Using MAKER Derived GFF3
> genome_gff=original_maker_annotations.gff3
> est_pass=1
> altest_pass=1
> protein_pass=1
> rm_pass=1
> model_pass=1
> pred_pass=0
> other_pass=1
>
> #-----Gene Prediction
> snaphmm=
> gmhmm=
> augustus_species=
> fgenesh_par_file=
> pred_gff=ab_init_predictions_rescued_by_blast.gff
>
> keep_preds=1
>
> Barry
>
>>> Thanks,
>>> Carson
>>>
>>> From: Anastasia Gioti
>>> Date: Wed, 25 Apr 2012 11:09:36 +0200
>>> To:
>>> Subject: [maker-devel] Use pass-through system to add missing genes
>>>
>>> Hi,
>>> I have a set of predicted proteins from the genome of a fungus
>>> annotated by MAKER using EST data from a closely related species
>>> and 3 ab initio predictors (snap iteratively trained 3 times,
>>> genemark trained directly on the assembly and augustus with a
>>> model from a less closely related species), along with a set of
>>> fungal proteins. I am missing ~ 1000 proteins when I compare to
>>> the species I used EST data from, and there is good evidence from
>>> alignments that these genes exist. The question is how to proceed
>>> from Blast hits to actual gene models here. The idea would be to
>>> add these genes to the existing dataset, rather than reannotate
>>> the genome. I believe that reannotating it without any further
>>> evidence such as RNA-seq from the species itself would not change
>>> much, and I'd rather stick with actual predictions that I trust and
>>> have used in subsequent analyses. The 1000 genes I can accept to
>>> annotate with a less stringent and reliable way than MAKER, I just
>>> want to add them so that the difference in gene count gets
>>> corrected.
>>> I was reading the MAKER 2 paper and i was wondering if I can use >>> the legacy annotations scheme to do it, by providing GFF3 of the >>> alignments between the two species in the regions where genes were >>> missed, but as i said, I would not like to reannotate the whole >>> genome, and running MAKER2 might cause slight changes that i d >>> like to avoid. Is this possible? First, is it possible to provide >>> a Gff3 file of specific locations and not the entire genome >>> alignment? (I guess so..) Second, how can I tag the existing >>> annotations as 'not to be changed' or alternatively, tag the new >>> models only? How should I run maker2, with which predictors on and >>> which off? >>> Thanks, >>> Anastasia >>> >>> Anastasia Gioti >>> Post-doctoral Researcher >>> >>> anastasia.gioti at scilifelab.se >>> anastasia.gioti at ebc.uu.se >>> >>> http://www.ebc.uu.se/Research/IEG/evbiol/people/pages/Gioti_Anastasia/ >>> >>> >>> >>> _______________________________________________ maker-devel >>> mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >> >> Anastasia Gioti >> Post-doctoral Researcher >> >> anastasia.gioti at scilifelab.se >> anastasia.gioti at ebc.uu.se >> >> http://www.ebc.uu.se/Research/IEG/evbiol/people/pages/ >> Gioti_Anastasia/ >> >> >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > Barry Moore > Research Scientist > Dept. 
of Human Genetics
> University of Utah
> Salt Lake City, UT 84112
> --------------------------------------------
> (801) 585-3543
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

Anastasia (Natassa) Gioti
Post-Doc Researcher
Evolutionary Biology Department
Uppsala University -Science for Life lab,
Karolinska Institute Stockholm
anastasia.gioti at ebc.uu.se
anastasia.gioti at scilifelab.se
http://www.ebc.uu.se/Research/IEG/evbiol/people/pages/Gioti_Anastasia/

From yogeshp08 at gmail.com Tue May 15 10:07:57 2012
From: yogeshp08 at gmail.com (Yogesh)
Date: Tue, 15 May 2012 11:07:57 -0500
Subject: [maker-devel] tblastn Cleanup?
Message-ID: <4478F0B20ED84A85B3C4FE4154F8FAD1@gmail.com>

Hello,

I have a few tblastn alignments with a lot of low quality hits. I have to clean that up. Can you please suggest how Maker pipeline does it? Also can I run it directly on my data without having to go through the whole pipeline?

Thanks,
-Yogesh
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Fri May 18 08:22:50 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 18 May 2012 10:22:50 -0400
Subject: [maker-devel] tblastn Cleanup?
In-Reply-To: <4478F0B20ED84A85B3C4FE4154F8FAD1@gmail.com>
Message-ID:

There are several things. I set several filtering options directly on the BLAST command line. These are things like maximum intron length, an e-value filter, and simple repeat filtering (called dust filter in NCBI blast and seg filter in WUBLAST).

I also run repeat masker over the genome first. This allows simple and complex repeats to be removed before running BLAST (otherwise you get many false alignments).

Last I filter the results based on percent coverage of the hit to the original database sequence and percent identity.
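
The coverage and identity filter mentioned here could be approximated outside MAKER with a small script along the following lines. This is a hypothetical Python sketch, not MAKER's implementation: the Hit record, its field names, and the threshold values are all assumptions chosen for illustration, and the hits would first have to be parsed from whatever BLAST output format is in use.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One BLAST hit, reduced to the fields the filter needs (assumed layout)."""
    name: str            # identifier of the database sequence
    pct_identity: float  # percent identity of the alignment, 0-100
    aligned_len: int     # residues of the database sequence covered by the alignment
    db_seq_len: int      # full length of the database sequence


def passes(hit, min_identity=40.0, min_coverage=50.0):
    """Keep a hit only if both identity and coverage reach the thresholds.

    The default thresholds are placeholders, not values taken from MAKER.
    """
    coverage = 100.0 * hit.aligned_len / hit.db_seq_len
    return hit.pct_identity >= min_identity and coverage >= min_coverage


def filter_hits(hits, **thresholds):
    """Return the subset of hits that pass the identity and coverage filter."""
    return [h for h in hits if passes(h, **thresholds)]


if __name__ == "__main__":
    example = [
        Hit("good", 90.0, 80, 100),   # high identity, 80% coverage
        Hit("short", 90.0, 10, 100),  # high identity, only 10% coverage
        Hit("weak", 20.0, 100, 100),  # full coverage, low identity
    ]
    for h in filter_hits(example):
        print(h.name)
```

Coverage is computed against the length of the database sequence, so the same aligned length counts for less on a long sequence than on a short one.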
I think you can set percent identity as a flag in BLAST, but the percent coverage filter is being calculated by MAKER, so to do this outside of MAKER would require that you write your own filtering script to compare the length of the alignment to the length of the sequence in the database.

I also have an HSP depth overlap filter. This removes weird low complexity hits that escape repeatmasking. They show up as multiple HSPs overlapping multiple times in the same region (usually very high numbers like 90 HSPs all 100 bp long in the same region). I calculate the number of base pairs in the alignment on the hit then divide by the number of base pairs in the query alignment. If it's greater than 3, I throw the hit out.

Thanks,
Carson

From: Yogesh
Date: Tuesday, 15 May, 2012 12:07 PM
To:
Subject: [maker-devel] tblastn Cleanup?

Hello,

I have a few tblastn alignments with a lot of low quality hits. I have to clean that up. Can you please suggest how Maker pipeline does it? Also can I run it directly on my data without having to go through the whole pipeline?

Thanks,
-Yogesh
_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From smg283 at gmail.com Tue May 22 00:42:51 2012
From: smg283 at gmail.com (Scott Geib)
Date: Mon, 21 May 2012 20:42:51 -1000
Subject: [maker-devel] can't call method strand on an undefined value ERROR: Failed while flattening protein clusters
Message-ID:

Hi,

Using maker 2.24, I am getting the following error (see below) in protein2genome widget. I also get the same error with est2genome. This happens with my own data (testing on a single scaffold), but not with the test data supplied with maker (dpp files in data folder).
Scott Widget::exonerate::protein2genome: /data0/opt/AlignmentSoftware/exonerate/exonerate-2.2.0-x86_64/bin/exonerate -q /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaffold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/UniRef90_UPI000194D3FC.for.1588546-1589203.9.fasta -t /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaffold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/scaffold00001.1588546-1589203.9.fasta -Q protein -T dna -m protein2genome --softmasktarget --percent 20 --showcigar > /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaffold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/scaffold00001.1588546-1589203.UniRef90_UPI000194D3FC.p_exonerate.9 #-------------------------------# cleaning blastx... in cluster::shadow_cluster... ...finished clustering. cleaning clusters.... total clusters:2 now processing 0 ...processing 0 of 23 ...processing 1 of 23 ...processing 2 of 23 ...processing 3 of 23 ...processing 4 of 23 ...processing 5 of 23 ...processing 6 of 23 ...processing 7 of 23 ...processing 8 of 23 ...processing 9 of 23 ...processing 10 of 23 ...processing 11 of 23 ...processing 12 of 23 ...processing 13 of 23 ...processing 14 of 23 ...processing 15 of 23 ...processing 16 of 23 ...processing 17 of 23 ...processing 18 of 23 ...processing 19 of 23 ...processing 20 of 23 ...processing 21 of 23 total clusters:2 now processing 0 in cluster::shadow_cluster... ...finished clustering. cleaning clusters.... 
total clusters:2 now processing 0 ...processing 0 of 20 ...processing 1 of 20 ...processing 2 of 20 ...processing 3 of 20 ...processing 4 of 20 ...processing 5 of 20 ...processing 6 of 20 ...processing 7 of 20 ...processing 8 of 20 ...processing 9 of 20 ...processing 10 of 20 ...processing 11 of 20 ...processing 12 of 20 ...processing 13 of 20 ...processing 14 of 20 ...processing 15 of 20 ...processing 16 of 20 ...processing 17 of 20 ...processing 18 of 20 total clusters:2 now processing 0 Can't call method "strand" on an undefined valueERROR: Failed while flattening protein clusters ERROR: Chunk failed at level:11, tier_type:2 FAILED CONTIG:scaffold00001 -------------- next part -------------- An HTML attachment was scrubbed... URL: From anastasia.gioti at scilifelab.se Tue May 22 07:14:17 2012 From: anastasia.gioti at scilifelab.se (Anastasia Gioti) Date: Tue, 22 May 2012 15:14:17 +0200 Subject: [maker-devel] Use pass-through system to add missing genes In-Reply-To: <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu> References: <4FE7CD5B-FC1C-43E7-AC41-A05823348B99@scilifelab.se> <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu> Message-ID: <19E36E3B-6A82-49D5-B0AC-5E521F3E8999@scilifelab.se> Hi again, I hav sent an email a few days ago about this thread, and i am not sure if you have received it or you still did not have time to look at it. In any case, this email was dealing with the fact that some proteins were not retrieved in the abinitio models and how to deal with it. What I would like to ask here is a few confirmations on how to rerun maker for the proteins that were retrieved in the abinitio models. i have looked at the Blast results, and have done a series of check-ups, so now I am ready to run MAKER again with a list of models that I want to retain. Regarding the following parameters: 1. Do I set the genome= to nothing here? i.e quote it out? 
This is in the beginning of the control file #-----Genome (Required for De-Novo Annotation) genome=#genome sequence file in fasta format organism_type= #eukaryotic or prokaryotic. Default is eukaryotic > > #-----Re-annotation Using MAKER Derived GFF3 > genome_gff=original_maker_annotations.gff3 > est_pass=1 > altest_pass=1 > protein_pass=1 > rm_pass=1 > model_pass=1 > pred_pass=0 > other_pass=1 > > #-----Gene Prediction 2. Do i provide again the snap etc models? I am not sure, because i thought MAKER would not run ab initio predictors this time (this is why I would also quote out the genome file above, as this is not a de novo annotation). but if it will, i will then provide the previous models i used, except for snap, for which I will generate a new model from the gff3 file of the last run (according to snap documentation). Am i correct? > snaphmm= > gmhmm= > augustus_species= > fgenesh_par_file= > pred_gff=ab_init_predictions_rescued_by_blast.gff > > keep_preds=1 Samely, what do i do with repeatmasking etc? Thanks in adavance, Anastasia > > Barry > >>> Thanks, >>> Carson >>> >>> From: Anastasia Gioti >>> Date: Wed, 25 Apr 2012 11:09:36 +0200 >>> To: >>> Subject: [maker-devel] Use pass-through system to add missing genes >>> >>> Hi, >>> I have a set of predicted proteins from the genome of a fungus >>> annotated by MAKER using EST data from a closely related species >>> and 3 ab initio predictors (snap iterativelly trained 3 times, >>> genemark trained directly on the assembly and augustus with a >>> model from a less closely related species), along with a set of >>> fungal proteins. I am missing ~ 1000 proteins when I compare to >>> the species i used EST data from, and there is good evidence from >>> alignments that these genes exist. The question is how to proceed >>> from Blast hits to actual gene models here. The idea would be to >>> add these genes to the existing dataset, rather than reannotate >>> the genome. 
I believe that reannotating it without any further
>>> evidence such as RNA-seq from the species itself would not change
>>> much, and I'd rather stick with actual predictions that I trust and
>>> have used in subsequent analyses. The 1000 genes I can accept to
>>> annotate in a less stringent and reliable way than MAKER; I just
>>> want to add them so that the difference in gene count gets
>>> corrected.
>>> I was reading the MAKER2 paper and I was wondering if I can use
>>> the legacy annotations scheme to do it, by providing GFF3 of the
>>> alignments between the two species in the regions where genes were
>>> missed, but as I said, I would not like to reannotate the whole
>>> genome, and running MAKER2 might cause slight changes that I'd
>>> like to avoid. Is this possible? First, is it possible to provide
>>> a GFF3 file of specific locations and not the entire genome
>>> alignment? (I guess so.) Second, how can I tag the existing
>>> annotations as 'not to be changed' or alternatively, tag the new
>>> models only? How should I run MAKER2, with which predictors on and
>>> which off?
>>> Thanks,
>>> Anastasia
>>>
>>> Anastasia Gioti
>>> Post-doctoral Researcher
>>>
>>> anastasia.gioti at scilifelab.se
>>> anastasia.gioti at ebc.uu.se
>>>
>>> http://www.ebc.uu.se/Research/IEG/evbiol/people/pages/Gioti_Anastasia/
>>>
>>
>> Anastasia Gioti
>> Post-doctoral Researcher
>>
>> anastasia.gioti at scilifelab.se
>> anastasia.gioti at ebc.uu.se
>>
>> http://www.ebc.uu.se/Research/IEG/evbiol/people/pages/Gioti_Anastasia/
>>
>
> Barry Moore
> Research Scientist
> Dept.
of Human Genetics
> University of Utah
> Salt Lake City, UT 84112
> --------------------------------------------
> (801) 585-3543
>

Anastasia (Natassa) Gioti
Post-Doc Researcher
Evolutionary Biology Department
Uppsala University -Science for Life lab, Karolinska Institute Stockholm
anastasia.gioti at ebc.uu.se
anastasia.gioti at scilifelab.se
http://www.ebc.uu.se/Research/IEG/evbiol/people/pages/Gioti_Anastasia/

From anastasia.gioti at scilifelab.se Wed May 23 03:07:12 2012
From: anastasia.gioti at scilifelab.se (Anastasia Gioti)
Date: Wed, 23 May 2012 11:07:12 +0200
Subject: [maker-devel] Use pass-through system to add missing genes
In-Reply-To: <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu>
References: <4FE7CD5B-FC1C-43E7-AC41-A05823348B99@scilifelab.se> <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu>
Message-ID: <1B0770A0-6D14-4336-BC3A-DC24619BC3FE@scilifelab.se>

Hi, and sorry for the multiple postings.

I have a list of models rescued from the nonoverlaping_abinits.fasta files (against which I blasted my missing proteins from the closely related species, and then filtered out the dubious hits) and a MAKER GFF3 file, but Carson's script gff3_select won't work. The reason is that these ab initio models were not promoted into the MAKER GFF3 file, so they are not there. I am referring to the GFF3 file generated by the gff3_merge script. Am I missing something?

Thank you,
Anastasia

>
>>> If you know which ab initio predictions you want to add (i.e. the ab initio promoting scenario I described), you can provide those predictions using the pred_gff option and then set keep_preds=1, and they will be maintained even without evidence. Attached is a script that would make selecting those easier.
It takes the MAKER-generated GFF3 and a list of predictions to keep (one name per line). These might be the results of a BLAST analysis, for example. It will then return the GFF3 entries for just those models selected.

From thomas.hackl at uni-wuerzburg.de Wed May 23 06:01:55 2012
From: thomas.hackl at uni-wuerzburg.de (Thomas Hackl)
Date: Wed, 23 May 2012 14:01:55 +0200
Subject: [maker-devel] missing est2genome annotation
Message-ID: <4FBCD1B3.8000102@uni-wuerzburg.de>

Hi,

I used MAKER to annotate genomic contigs and, among other things, provided transcripts from the transcriptome as EST evidence. Blast and exonerate work fine and produce valid alignments; the alignment files exist in theVoid and look very good. Unfortunately, neither the evidence_0.gff nor the final .gff carries the corresponding feature annotations.

Any ideas why?

Regards
Thomas

--
Thomas Hackl
Julius-Maximilians-Universität
Department of Bioinformatics
97074 Würzburg, Germany
Fon: +49 931 - 31 86883
Mail: thomas.hackl at uni-wuerzburg.de

From thomas.hackl at uni-wuerzburg.de Wed May 23 11:14:27 2012
From: thomas.hackl at uni-wuerzburg.de (Thomas Hackl)
Date: Wed, 23 May 2012 19:14:27 +0200
Subject: [maker-devel] missing est2genome annotation
In-Reply-To: <4FBCD1B3.8000102@uni-wuerzburg.de>
References: <4FBCD1B3.8000102@uni-wuerzburg.de>
Message-ID: <4FBD1AF3.8020904@uni-wuerzburg.de>

Hi again,

I did some source code digging and found the following line, which discards my exonerate alignments. I suspect it does so for a very good reason, so it would help me a lot if someone could explain what is going on there.

/lib/GI.pm l.1473
next if $e->pAh < $pcov;

Regards
Thomas

On 23.05.2012 14:01, Thomas Hackl wrote:
> Hi,
>
> I used maker to annotate genomic contigs and among other stuff
> provided transcripts from the transcriptome as est evidence.
Blast and
> exonerate work fine and produce valid alignments, the alignment files
> exist in theVoid and look very good. Unfortunately neither the
> evidence_0.gff nor the final .gff carry the corresponding feature
> annotations.
>
> Any ideas why?
>
> Regards
> Thomas
>

--
Thomas Hackl
Julius-Maximilians-Universität
Department of Bioinformatics
97074 Würzburg, Germany
Fon: +49 931 - 31 86883
Mail: thomas.hackl at uni-wuerzburg.de

From gowthaman.ramasamy at seattlebiomed.org Thu May 24 13:30:09 2012
From: gowthaman.ramasamy at seattlebiomed.org (Gowthaman Ramasamy)
Date: Thu, 24 May 2012 12:30:09 -0700
Subject: [maker-devel] Merging gene predictions....
Message-ID: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D6@mail02.sbri.org>

Hi Carson and others,

I am wondering if I can use MAKER to merge gene predictions from three GFF files. One of the algorithms is 'augustus', which of course I can use inside MAKER. The other two are not part of MAKER. But suppose I want to pass only the GFFs to MAKER and ask it to merge the annotations (when they overlap) and pick only annotations that are predicted in 2 out of 3 GFFs. Is that possible?

I prefer this approach, as we need to run a BLAST-based validation step on the predicted features before we even try to merge them. That's another reason why we prefer not to use the augustus inside MAKER.

Thanks,
Gowthaman

From gowthaman.ramasamy at seattlebiomed.org Thu May 24 15:02:55 2012
From: gowthaman.ramasamy at seattlebiomed.org (Gowthaman Ramasamy)
Date: Thu, 24 May 2012 14:02:55 -0700
Subject: [maker-devel] Merging gene predictions....
In-Reply-To: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D6@mail02.sbri.org>
References: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D6@mail02.sbri.org>
Message-ID: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D8@mail02.sbri.org>

Can I use the following approach, Carson?
This is your reply to one of the earlier questions:

I've attached a made from scratch drop-in replacement that you can use to do what the old script would have done. In the current release of MAKER, instead of the gff3_preds2models script users can just give MAKER a set of predictions in GFF3 format (pred_gff option) and set keep_preds=1 (then leave all other options blank). The predictions given will then be converted into gene models.

Thanks,
Carson

________________________________________
From: maker-devel-bounces at yandell-lab.org [maker-devel-bounces at yandell-lab.org] On Behalf Of Gowthaman Ramasamy [gowthaman.ramasamy at seattlebiomed.org]
Sent: Thursday, May 24, 2012 12:30 PM
To: Carson Holt; maker-devel at yandell-lab.org
Subject: [maker-devel] Merging gene predictions....

Hi Carson and others,

I am wondering if I can use MAKER to merge gene predictions from three GFF files. One of the algorithms is 'augustus', which of course I can use inside MAKER. The other two are not part of MAKER. But suppose I want to pass only the GFFs to MAKER and ask it to merge the annotations (when they overlap) and pick only annotations that are predicted in 2 out of 3 GFFs. Is that possible?

I prefer this approach, as we need to run a BLAST-based validation step on the predicted features before we even try to merge them. That's another reason why we prefer not to use the augustus inside MAKER.

Thanks,
Gowthaman

From gowthaman.ramasamy at seattlebiomed.org Thu May 24 15:21:30 2012
From: gowthaman.ramasamy at seattlebiomed.org (Gowthaman Ramasamy)
Date: Thu, 24 May 2012 14:21:30 -0700
Subject: [maker-devel] Merging gene predictions....
In-Reply-To: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D8@mail02.sbri.org>
References: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D6@mail02.sbri.org>, <89080953C3D300419AACB6E63A7EEFBA5C8409F8D8@mail02.sbri.org>
Message-ID: <89080953C3D300419AACB6E63A7EEFBA5C8409F8DA@mail02.sbri.org>

Hi,

I am trying to install MAKER on CentOS. I was able to install all the Perl dependencies and external programs. perl Build.PL and ./Build install ran without errors or warnings. I did not enable MPI, though. But when I start MAKER it returns "segmentation fault". I have no clue what is going wrong, or where to check for error logs.

Any help would be appreciated.

Thanks,
gowthaman

From gowthaman.ramasamy at seattlebiomed.org Thu May 24 15:22:16 2012
From: gowthaman.ramasamy at seattlebiomed.org (Gowthaman Ramasamy)
Date: Thu, 24 May 2012 14:22:16 -0700
Subject: [maker-devel] MAKER installation problem
Message-ID: <89080953C3D300419AACB6E63A7EEFBA5C8409F8DB@mail02.sbri.org>

Hi,

I am trying to install MAKER on CentOS. I was able to install all the Perl dependencies and external programs. perl Build.PL and ./Build install ran without errors or warnings. I did not enable MPI, though. But when I start MAKER it returns "segmentation fault". I have no clue what is going wrong, or where to check for error logs.

Any help would be appreciated.

Thanks,
gowthaman

From bob_freeman at hms.harvard.edu Fri May 25 10:23:22 2012
From: bob_freeman at hms.harvard.edu (Bob Freeman)
Date: Fri, 25 May 2012 12:23:22 -0400
Subject: [maker-devel] Alternate translation table?
Message-ID: <454CA235-0DB6-451F-97C4-83D32E2E805A@hms.harvard.edu>

Hello all!

Unusual question here: I am running MAKER on a ciliate that uses a non-standard translation table for its translation products. I haven't found an option in the control files that I can change for the translation of the predicted transcripts. How or where can I go about this?
Tx,
Bob

-----------------------------------------------------
Bob Freeman, Ph.D.
Acorn Worm Informatics, Kirschner lab
Dept of Systems Biology, Alpert 524
Harvard Medical School
200 Longwood Avenue
Boston, MA 02115
617/432.2294, vox

"Sorry I'm late. Oh, God, that sounded insincere. I'm late."
-- Karen Walker, from Will and Grace

From carsonhh at gmail.com Mon May 28 06:43:34 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 28 May 2012 08:43:34 -0400
Subject: [maker-devel] Alternate translation table?
In-Reply-To: <454CA235-0DB6-451F-97C4-83D32E2E805A@hms.harvard.edu>
Message-ID:

The alternate translation table is not currently an option. It's one of those things that needs to be implemented, but has not been yet. It's also not supported by many of the eukaryotic gene predictors MAKER uses. I could probably get something implemented for you to test in two to three weeks though (there are a lot of places where the translation table comes into play). Let me know.

--Carson

From: Bob Freeman
Date: Friday, 25 May, 2012 12:23 PM
To:
Subject: [maker-devel] Alternate translation table?

Hello all!

Unusual question here: I am running MAKER on a ciliate that uses a non-standard translation table for its translation products. I haven't found an option in the control files that I can change for the translation of the predicted transcripts. How or where can I go about this?

Tx,
Bob

-----------------------------------------------------
Bob Freeman, Ph.D.
Acorn Worm Informatics, Kirschner lab
Dept of Systems Biology, Alpert 524
Harvard Medical School
200 Longwood Avenue
Boston, MA 02115
617/432.2294, vox

"Sorry I'm late. Oh, God, that sounded insincere. I'm late."
-- Karen Walker, from Will and Grace

From carsonhh at gmail.com Mon May 28 07:38:25 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 28 May 2012 09:38:25 -0400
Subject: [maker-devel] missing est2genome annotation
In-Reply-To: <4FBD1AF3.8020904@uni-wuerzburg.de>
Message-ID:

Sorry for the slow reply. I'm just getting back after traveling.

That's a percent coverage flag. You can set a percent coverage threshold in the maker_bopts.ctl file. Partial high-scoring alignments can be common. If you only filter by expect score, you would be surprised to see how ugly and confusing the alignments become.

Thanks,
Carson

On 12-05-23 1:14 PM, "Thomas Hackl" wrote:

>Hi again,
>
>I did some source code digging and caught the following line burying my
>exonerate alignments. I suspect it does so for a very good reason,
>therefore it would help me a lot if someone could explain to me, what is
>going on there.
>
>
>/lib/GI.pm l.1473
>next if $e->pAh < $pcov;
>
>
>Regards
>Thomas
>
>
>On 23.05.2012 14:01, Thomas Hackl wrote:
>> Hi,
>>
>> I used maker to annotate genomic contigs and among other stuff
>> provided transcripts from the transcriptome as est evidence. Blast and
>> exonerate work fine and produce valid alignments, the alignment files
>> exist in theVoid and look very good. Unfortunately neither the
>> evidence_0.gff nor the final .gff carry the corresponding feature
>> annotations.
>>
>> Any ideas why?
>>
>> Regards
>> Thomas
>>
>
>
>--
>Thomas Hackl
>Julius-Maximilians-Universität
>Department of Bioinformatics
>97074 Würzburg, Germany
>Fon: +49 931 - 31 86883
>Mail: thomas.hackl at uni-wuerzburg.de

From seoanezonjic at hotmail.com Tue May 22 06:55:12 2012
From: seoanezonjic at hotmail.com (p sz)
Date: Tue, 22 May 2012 12:55:12 +0000
Subject: [maker-devel] ipr_update_gff ERROR
Message-ID:

First, thanks for helping me with the previous error that I submitted. I'm still working on the same project and I have hit a new error.

I ran InterProScan with this command line:

iprscan_wrap -i parsed_input.all.maker.proteins.fasta -email seoanezonjic at hotmail.com -format raw

parsed_input.all.maker.proteins.fasta was generated with the fasta_merge tool.

I used the output (attached to this email) and a GFF file (generated by a normal run of MAKER, attached to this email) with the ipr_update_gff script in this way:

ipr_update_gff BAC12_Clone_Pt314B2_Lib_Pt_7Ba__organism_Pinus_taeda__0.gff sc_interpro.sh.o109053

And I get this error:

Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 1.
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 1.
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 2.
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 2.
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 3.
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 3. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 8. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 8. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 9. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 9. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 10. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 10. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 11. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 11. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 12. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 12. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 13. 
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 13. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 14. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 14. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 15. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 15. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 18. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 18. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 18. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 18. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 19. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 19. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 19. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 19. 
Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 48. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 48. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 48. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 48. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 49. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 49. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 49. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 49. The gff file seems updated but i don't know if it works fine or is corrupt Thanks in advance -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: sc_interpro.sh.o109053 Type: application/octet-stream Size: 2542 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: BAC12_Clone_Pt314B2_Lib_Pt_7Ba__organism_Pinus_taeda__0.gff
Type: application/octet-stream
Size: 220243 bytes
Desc: not available
URL:

From larriba.ed at gmail.com Fri May 25 10:01:41 2012
From: larriba.ed at gmail.com (Eduardo Larriba)
Date: Fri, 25 May 2012 18:01:41 +0200
Subject: [maker-devel] Consensus gene models
Message-ID:

Hi Carson and people,

I am working on the structural annotation of a filamentous fungus for which there is little evidence such as ESTs or proteins. To generate consensus genes from this limited evidence I used MAKER. For this I created the GeneMark and SNAP files. I ran MAKER using the ESTs of my organism (85), along with 5700 ESTs from a closely related organism. I made predictions with Augustus, SNAP and GeneMark, using the training files for my organism, within the MAKER pipeline. Everything works fine.

My problem is that when I collect the consensus sequences of all my contigs with the fasta_merge script (included in MAKER), I get a different list for each predictor, and the same happens when I try to get the GFF of all of them.

Could you tell me how I can get a MAKER consensus list of genes? Or do I have to do it manually? Is it possible for MAKER to evaluate the accuracy of each prediction and confirm it, so that I get just one list from the different predictions?

Thank you very much.

--
Eduardo Larriba Tornel
Universidad de Alicante.
Lab. Fitopatología
Dept. Ciencias del Mar y Biología Aplicada
Pabellón 13. San Vicente del Raspeig
Tel. 96 590 3400 ext 3280

From carsonhh at gmail.com Mon May 28 08:14:22 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 28 May 2012 10:14:22 -0400
Subject: [maker-devel] Consensus gene models
In-Reply-To:
Message-ID:

The consensus list is in the maker.proteins.fasta and maker.transcripts.fasta files. The predictor-specific lists are just for reference purposes (in case you want to see what the other predictors produced on their own, i.e.
without MAKER's intervention). The non-overlapping.fasta file in the same directory will contain consensus entries for models that were not supported by any evidence and don't overlap any gene models in the maker.transcripts.fasta (think of these as the "maybe" genes and the maker.transcripts.fasta as the very likely genes). You can set keep_preds=1 if you just want MAKER to keep everything with or without support and just produce a consensus (probably OK on a fungus, but I wouldn't recommend it on other eukaryotes because false positive rates will be very high).

Thanks,
Carson

From: Eduardo Larriba
Date: Friday, 25 May, 2012 12:01 PM
To:
Subject: [maker-devel] Consensus gene models

Hi Carson and people,

I am working on the structural annotation of a filamentous fungus for which there is little evidence such as ESTs or proteins. To generate consensus genes from this limited evidence I used MAKER. For this I created the GeneMark and SNAP files. I ran MAKER using the ESTs of my organism (85), along with 5700 ESTs from a closely related organism. I made predictions with Augustus, SNAP and GeneMark, using the training files for my organism, within the MAKER pipeline. Everything works fine.

My problem is that when I collect the consensus sequences of all my contigs with the fasta_merge script (included in MAKER), I get a different list for each predictor, and the same happens when I try to get the GFF of all of them.

Could you tell me how I can get a MAKER consensus list of genes? Or do I have to do it manually? Is it possible for MAKER to evaluate the accuracy of each prediction and confirm it, so that I get just one list from the different predictions?

Thank you very much.

--
Eduardo Larriba Tornel
Universidad de Alicante.
Lab. Fitopatología
Dept. Ciencias del Mar y Biología Aplicada
Pabellón 13. San Vicente del Raspeig
Tel.
96 590 3400 ext 3280

From carsonhh at gmail.com Mon May 28 08:18:26 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 28 May 2012 10:18:26 -0400
Subject: [maker-devel] ipr_update_gff ERROR
In-Reply-To:
Message-ID:

This error would happen if some results exist in the iprscan output but don't match gene entries in the GFF3 file. If I could see the original files, I could tell you which ones. This can happen if you combine results from the non-overlapping.fasta files with the maker.proteins.fasta files, for example. Models in the non-overlapping.fasta file are not genes in the GFF3 (they are match/match_part entries), so errors happen.

Thanks,
Carson

From: p sz
Date: Tuesday, 22 May, 2012 8:55 AM
To:
Subject: [maker-devel] ipr_update_gff ERROR

First, thanks for helping me with the previous error that I submitted. I'm still working on the same project and I have hit a new error.

I ran InterProScan with this command line:

iprscan_wrap -i parsed_input.all.maker.proteins.fasta -email seoanezonjic at hotmail.com -format raw

parsed_input.all.maker.proteins.fasta was generated with the fasta_merge tool.

I used the output (attached to this email) and a GFF file (generated by a normal run of MAKER, attached to this email) with the ipr_update_gff script in this way:

ipr_update_gff BAC12_Clone_Pt314B2_Lib_Pt_7Ba__organism_Pinus_taeda__0.gff sc_interpro.sh.o109053

And I get this error:

Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 1.
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 1.

The gff file seems to be updated, but I don't know if it works fine or is corrupt.

Thanks in advance

From carsonhh at gmail.com Tue May 29 06:37:55 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 29 May 2012 08:37:55 -0400
Subject: [maker-devel] can't call method strand on an undefined value ERROR: Failed while flattening protein clusters
In-Reply-To:
Message-ID:

Use this command to check out the latest unreleased test version, and let me know if you still get the error.

Command --> svn co svn://malachite.genetics.utah.edu/maker/trunk maker
User: yandell_guest
Password: y at ndell_Gu3st

Thanks,
Carson

From: Scott Geib
Date: Tuesday, 22 May, 2012 2:42 AM
To:
Subject: [maker-devel] can't call method strand on an undefined value ERROR: Failed while flattening protein clusters

Hi,

Using maker 2.24, I am getting the following error (see below) in the protein2genome widget. I also get the same error with est2genome. This happens with my own data (testing on a single scaffold), but not with the test data supplied with maker (dpp files in data folder).

Scott

Widget::exonerate::protein2genome:
/data0/opt/AlignmentSoftware/exonerate/exonerate-2.2.0-x86_64/bin/exonerate -q /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaffold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/UniRef90_UPI000194D3FC.for.1588546-1589203.9.fasta -t /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaffold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/scaffold00001.1588546-1589203.9.fasta -Q protein -T dna -m protein2genome --softmasktarget --percent 20 --showcigar > /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaffold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/scaffold00001.1588546-1589203.UniRef90_UPI000194D3FC.p_exonerate.9
#-------------------------------#
cleaning blastx...
in cluster::shadow_cluster...
...finished clustering.
cleaning clusters....
total clusters:2 now processing 0
...processing 0 of 23
...processing 1 of 23
...processing 2 of 23
...processing 3 of 23
...processing 4 of 23
...processing 5 of 23
...processing 6 of 23
...processing 7 of 23
...processing 8 of 23
...processing 9 of 23
...processing 10 of 23
...processing 11 of 23
...processing 12 of 23
...processing 13 of 23
...processing 14 of 23
...processing 15 of 23
...processing 16 of 23
...processing 17 of 23
...processing 18 of 23
...processing 19 of 23
...processing 20 of 23
...processing 21 of 23
total clusters:2 now processing 0
in cluster::shadow_cluster...
...finished clustering.
cleaning clusters....
total clusters:2 now processing 0
...processing 0 of 20
...processing 1 of 20
...processing 2 of 20
...processing 3 of 20
...processing 4 of 20
...processing 5 of 20
...processing 6 of 20
...processing 7 of 20
...processing 8 of 20
...processing 9 of 20
...processing 10 of 20
...processing 11 of 20
...processing 12 of 20
...processing 13 of 20
...processing 14 of 20
...processing 15 of 20
...processing 16 of 20
...processing 17 of 20
...processing 18 of 20
total clusters:2 now processing 0
Can't call method "strand" on an undefined value
ERROR: Failed while flattening protein clusters
ERROR: Chunk failed at level:11, tier_type:2
FAILED CONTIG:scaffold00001

From bob_freeman at hms.harvard.edu Tue May 29 10:30:23 2012
From: bob_freeman at hms.harvard.edu (Bob Freeman)
Date: Tue, 29 May 2012 12:30:23 -0400
Subject: [maker-devel] Alternate translation table?
In-Reply-To:
References:
Message-ID:

Thanks, Carson, for the update on this. No need to implement something. I'll keep it simple and translate the collected transcripts using an appropriate translation table.

-Bob

On May 28, 2012, at 8:43 AM, Carson Holt wrote:

> The alternate translation table is not currently an option. It's one of those things that needs to be implemented, but has not been yet. It's also not supported by many of the eukaryotic gene predictors MAKER uses.
>
> I could probably get something implemented for you to test in two to three weeks though (there are a lot of places where the translation table comes into play). Let me know.
>
> --Carson
>
>
>
> From: Bob Freeman
> Date: Friday, 25 May, 2012 12:23 PM
> To:
> Subject: [maker-devel] Alternate translation table?
>
> Hello all!
>
> Unusual question here: I am running MAKER on a ciliate that uses a non-standard translation table for its translation products. I haven't found an option in the control files that I can change for the translation of the predicted transcripts. How or where can I go about this?
>
> Tx,
> Bob

-----------------------------------------------------
Bob Freeman, Ph.D.
Acorn Worm Informatics, Kirschner lab
Dept of Systems Biology, Alpert 524
Harvard Medical School
200 Longwood Avenue
Boston, MA 02115
617/432.2294, vox

"Sorry I'm late. Oh, God, that sounded insincere. I'm late."
-- Karen Walker, from Will and Grace

From gowthaman.ramasamy at seattlebiomed.org Tue May 29 15:54:33 2012
From: gowthaman.ramasamy at seattlebiomed.org (Gowthaman Ramasamy)
Date: Tue, 29 May 2012 14:54:33 -0700
Subject: [maker-devel] Can maker select a gene model based on #algoritham predicted it
Message-ID: <89080953C3D300419AACB6E63A7EEFBA5C8409F8EB@mail02.sbri.org>

Hi Carson,
Thanks for all the help during the long weekend, in spite of that long drive. I am still trying to imagine that.

I now have maker considering our own predictions via pred_gff, and using augustus and gene mark (with our training model). And I was able to use altest and protein evidence. Maker happily picks one gene model when there is an overlap between three different predictions. But when I look at the gff, it seems like it picks a gene model only when there is est/protein evidence. It leaves out some genes even though they are predicted by all three algorithms. Of course, keep_preds=1 helps to keep all the models, but this kind of leads to over-prediction.

I am looking for something in between, and would like to know if that is possible:
1) Pick a gene model if it has evidence (est/prot etc...), irrespective of how many algorithms predicted it.
2) In the absence of extrinsic evidence (est/prot etc), pick a gene model if it is predicted by at least two algorithms.

Or even simpler:
I have ab-initio predictions from three algorithms. Can I output those genes that are supported by at least two of them? I care less about exactness of gene boundaries.

Thanks,
Gowthaman

PS: With my recent attempts, I learned a couple of things about maker/other associated tools that are not documented in the gmod-maker wiki. Is it possible/ok if I add content to it? I am okay with running it by you before making it public.

From carsonhh at gmail.com Wed May 30 06:54:32 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 30 May 2012 08:54:32 -0400
Subject: [maker-devel] Can maker select a gene model based on #algoritham predicted it
In-Reply-To: <89080953C3D300419AACB6E63A7EEFBA5C8409F8EB@mail02.sbri.org>
Message-ID:

It's not an option in exactly the way you are specifying, but there is something I usually do for annotation that works well. I run interproscan or rpsblast on the non_overlapping.proteins.fasta file and select just those non-overlapping models that have a recognizable protein domain (just searching the Pfam domain space is more than sufficient). Then I provide the selected results to model_gff, and provide the previous maker results to the maker_gff option (with all reannotation pass options set to 1 and all analysis options turned off). This adds models with at least recognizable domains (as even multiple gene predictors can overpredict in a similar way).

Attached is a script to help select predictions and upgrade them to models in GFF3 format. If you have questions, let me know.

Thanks,
Carson

On 12-05-29 5:54 PM, "Gowthaman Ramasamy" wrote:

>Hi Carson,
>Thanks for all the help during the long weekend, in spite of that long
>drive. I am still trying to imagine that.
>
>I now have maker to consider our own prediction via pred_gff, and use
>augustus and gene mark (with our training model). And i was able to use
>altest and protein evidences. Maker happily picks one gene model when
>there is a overlap between three different predictions. But, when I look
>at the gff, it seems like it picks a gene model only when there is an
>est/protein evidence. It leaves out some genes even though, they are
>predicted by all three algorithms. Of course, keep_pred=1 helps to keep
>all the models. This kind of leads to over prediction.
>
>But, I am looking for something in between. And would like to know if
>that is possible?
>1) Pick a gene model if it has an evidence from (est/prot etc...)
>irrespective of how many algorithms predicted it
>2) In the absence of extrinsic evidence (est/prot etc), pick a gene model
>if that is predicted by at least two algorithms.
>
>Or even simpler:
>I have ab-initio predictions from three algorithms, Can I output, those
>genes that is supported by at least two of them. I care less about
>exactness of gene boundaries.
>
>Thanks,
>Gowthaman
>
>PS: With my recent attempts, i learned couple things about maker/other
>associated tools that is not documented in gmod-maker wiki. Is it
>possible/ok if I add contents to it. I am okay with running it by you
>before making it public.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: gff3_preds2models
Type: application/octet-stream
Size: 4778 bytes
Desc: not available
URL:

From mikael.durling at slu.se Thu May 31 06:25:31 2012
From: mikael.durling at slu.se (Mikael Brandström Durling)
Date: Thu, 31 May 2012 14:25:31 +0200
Subject: [maker-devel] maker leaving large numbers of defunct zombies
Message-ID:

Hello,

I've been working lately to set up maker for annotation work on a few fungal genomes. I've got mpi maker up and running now; however, I notice that maker is leaving a lot of perl processes behind.
This happens to the extent that the process table on the system gets filled up after a few hours of run time. Right now the process tree after three hours running looks like this:

|-sge_execd-+-sge_shepherd---bash---mpirun-+-maker-+-maker
|           |                              |       `-perl
|           |                              |-maker-+-maker
|           |                              |       |-maker---1371*[perl]
|           |                              |       `-perl
|           |                              |-maker-+-maker
|           |                              |       |-maker---1348*[perl]
|           |                              |       `-perl
|           |                              |-maker-+-maker
|           |                              |       |-maker---1384*[perl]
|           |                              |       `-perl

...and so on for all mpi processes, except for the controlling processes.

What perl programs is maker calling that might end up as zombies? I've had a brief look at the source to no avail, but would be happy to dig further with some pointers for where to look.

This is run with the 2.25-beta from the web page, perl 5.16.0 and openmpi 1.4.5.

Thanks,
Mikael

-------------------------------------
Mikael Brandström Durling, PhD
Assistant Professor

Sveriges lantbruksuniversitet
Swedish University of Agricultural Sciences

Uppsala BioCenter
Dept of Forest Mycology and Plant Pathology
Box 7026, 75007 Uppsala
Visiting address: Almas Allé 5
Telefon: 018-671512
mikael.durling at slu.se, www.slu.se/mykopat

From mikael.durling at slu.se Thu May 31 06:34:30 2012
From: mikael.durling at slu.se (Mikael Brandström Durling)
Date: Thu, 31 May 2012 14:34:30 +0200
Subject: [maker-devel] Using GDBM_File instead of DB_File
Message-ID:

Hello,

I've been struggling for a few days to get maker up and running with MPI on a debian squeeze system. After compiling a new perl 5.16 exclusively for maker, I narrowed it down: the segfaults came from DB_File. Even after recompiling and updating that module, nothing worked. After checking for dependencies on DB_File in maker, I concluded that the only dependency was through GI::localize_file, which expects the FastaDB to be instantiated with DB_File. However, FastaDB can run on GDBM_File too. I patched the calls to GI::localize_file in maker to handle the .pag/.dir extensions used by GDBM (below).
With this patch applied maker is running for me, even when I have deleted the DB_File module from the perl path and made sure that GDBM_File is installed. My basic question is if there is any other dependency for DB_File which I have missed which may break things?

cheers,
Mikael

--- maker.orig 2012-03-30 15:48:05.000000000 +0200
+++ maker 2012-05-31 10:35:30.253022648 +0200
@@ -512,7 +515,12 @@
     }
     if($size > 1){
         carp "Calling GI::localize_file" if($main::debug);
-        GI::localize_file("$gdbfile.index");
+        if( -f "$gdbfile.index.dir" ){
+            GI::localize_file("$gdbfile.index.dir");
+            GI::localize_file("$gdbfile.index.pag");
+        }else{
+            GI::localize_file("$gdbfile.index");
+        }
         carp "Calling GI::localize_file" if($main::debug);
         $gdbfile = GI::localize_file($gdbfile);
     }

From carsonhh at gmail.com Thu May 31 07:04:58 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 31 May 2012 09:04:58 -0400
Subject: [maker-devel] Using GDBM_File instead of DB_File
In-Reply-To:
Message-ID:

DB_File is being called by Bio::DB::Fasta. I can check the object returned when using GDBM_File instead to see if the index file names are contained on the object, as I'm just assuming an extension of '.index'. I'll look around to see if the extension name is assumed anywhere else.

Thanks,
Carson

On 12-05-31 8:34 AM, "Mikael Brandström Durling" wrote:

>Hello,
>
>I've been struggling for a few days to get maker up and running with MPI
>on a debian squeeze system. Compiling a new perl 5.16 exclusively for
>maker I wound down to that the segfaults came from DB_File. Even by
>recomiling and updating that module, nothing worked. After checking for
>dependencies on DB_File in maker, I concluded that the only dependency
>was through the GI::localize_file, which expects the FastaDB to be
>instantiated with DB_File. However, FastaDB can run on GDBM_File too. I
>patched the calls to GI::localize_file in maker to handle the .pag/.dir
>extensions used by GDBM (below).
With this patch applied maker is running
>for me, even when I have deleted the DB_File module from the perl path
>and made sure that GDBM_File is installed. My basic question is if there
>is any other dependency for DB_File which I have missed which may break
>things?
>
>cheers,
>Mikael
>
>--- maker.orig 2012-03-30 15:48:05.000000000 +0200
>+++ maker 2012-05-31 10:35:30.253022648 +0200
>@@ -512,7 +515,12 @@
>     }
>     if($size > 1){
>         carp "Calling GI::localize_file" if($main::debug);
>-        GI::localize_file("$gdbfile.index");
>+        if( -f "$gdbfile.index.dir" ){
>+            GI::localize_file("$gdbfile.index.dir");
>+            GI::localize_file("$gdbfile.index.pag");
>+        }else{
>+            GI::localize_file("$gdbfile.index");
>+        }
>         carp "Calling GI::localize_file" if($main::debug);
>         $gdbfile = GI::localize_file($gdbfile);
>     }

From carsonhh at gmail.com Thu May 31 07:17:20 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 31 May 2012 09:17:20 -0400
Subject: [maker-devel] maker leaving large numbers of defunct zombies
In-Reply-To:
Message-ID:

MAKER uses IPC::Open3 to open almost all external applications, including a helper script called every once in a while that helps check file locks on NFS.

MAKER then calls waitpid to reap the processes, as IPC::Open3 doesn't auto-reap. The only time previously I've seen issues with zombie accumulation was with MPICH2 when it moved from the MPD process manager to Hydra. Hydra had certain broken signal handling issues that I had to bug the MPICH2 developers about and they fixed it. It is possible that the issue you are having may be with OpenMPI or with perl 5.16. I currently use perl 5.12. Perl instituted something called safe signals in either 5.6 or 5.8 and there may be some updates in 5.16 where they've been changing those around again.
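
The reaping pattern described above can be sketched as follows. This is not MAKER's code (MAKER is Perl and, per the message, uses IPC::Open3 plus waitpid); it is only a minimal Python analogue of the same idea, and the helper names run_and_reap and reap_finished are made up for illustration.

```python
import subprocess
import sys


def run_and_reap(cmd):
    """Start an external command, collect its output, and reap it.

    communicate() waits for the child to exit, so the child does not
    linger in the process table as a defunct (zombie) entry.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()
    return proc.returncode, out, err


def reap_finished(children):
    """Non-blocking sweep over previously started children.

    poll() collects the exit status of any child that has already
    finished; the children that are still running are returned.
    """
    return [proc for proc in children if proc.poll() is None]


if __name__ == "__main__":
    code, out, _ = run_and_reap([sys.executable, "-c", "print('ok')"])
    print(code, out.strip())
```

A parent that starts children and never waits on them (or never gets the chance to, for example because of signal-handling problems in the layer underneath) accumulates defunct entries, which may be what the process tree earlier in this thread shows.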

I can try installing a copy of 5.16 to test with and OpenMPI to see if I can replicate the error.

Thanks,
Carson

On 12-05-31 8:25 AM, "Mikael Brandström Durling" wrote:

>Hello,
>
>I've been working lately to set up maker for annotation work on a few
>fungal genomes. I've got mpi maker up and running now, however, I notice
>that maker is leaving a lot of perl processes behind. This
>happens to the extent that the process table on the system gets filled up
>after a few hours run time. Right now the process tree after three hours
>running looks like this:
>
> |-sge_execd-+-sge_shepherd---bash---mpirun-+-maker-+-maker
> |           |                              |       `-perl
> |           |                              |-maker-+-maker
> |           |                              |       |-maker---1371*[perl]
> |           |                              |       `-perl
> |           |                              |-maker-+-maker
> |           |                              |       |-maker---1348*[perl]
> |           |                              |       `-perl
> |           |                              |-maker-+-maker
> |           |                              |       |-maker---1384*[perl]
> |           |                              |       `-perl
>
>...and so on for all mpi processes, except for the controlling processes.
>
>What perl programs is maker calling, that might end up as zombies? I've
>had a brief look at the source to no avail, but would be happy to dig
>further with some pointers for where to look.
>
>This is run with the 2.25-beta from the web page, perl 5.16.0 and openmpi
>1.4.5.
>
>Thanks,
>Mikael
>
>-------------------------------------
>Mikael Brandström Durling, PhD
>Assistant Professor
>
>Sveriges lantbruksuniversitet
>Swedish University of Agricultural Sciences
>
>Uppsala BioCenter
>Dept of Forest Mycology and Plant Pathology
>Box 7026, 75007 Uppsala
>Visiting address: Almas Allé 5
>Telefon: 018-671512
>mikael.durling at slu.se, www.slu.se/mykopat

From mikael.durling at slu.se Thu May 31 07:57:06 2012
From: mikael.durling at slu.se (Mikael Brandström Durling)
Date: Thu, 31 May 2012 15:57:06 +0200
Subject: [maker-devel] maker leaving large numbers of defunct zombies
In-Reply-To:
References:
Message-ID:

I saw the same problem with the latest MPICH2 using hydra too, so it might boil down to perl/openmpi interactions. I didn't see this problem with the debian supplied perl 5.10, but then I had intermittent segfaults within DB_File and libpthread. It seemed to be some interaction with the LD_PRELOADed libmpi. That requirement for preloading libmpi was most easily solved by compiling openmpi with --disable-dlopen.

Thanks,
Mikael

On 31 May 2012, at 15:17, Carson Holt wrote:

> MAKER uses IPC::Open3 to open almost all external applications, including
> a helper script called every once in a while that helps check file locks
> on NFS.
>
> MAKER then calls waitpid to reap the processes, as IPC::Open3 doesn't
> auto-reap. The only time previously I've seen issues with zombie
> accumulation was with MPICH2 when it moved from the MPD process manager to
> Hydra. Hydra had certain broken signal handling issues that I had to bug
> the MPICH2 developers about and they fixed it. It is possible that the
> issue you are having may be with OpenMPI or with perl 5.16. I currently
> use perl 5.12. Perl instituted something called safe signals in either
> 5.6 or 5.8 and there may be some updates in 5.16 where they've been
> changing those around again.
>
> I can try installing a copy of 5.16 to test with and OpenMPI to see if I
> can replicate the error.
>
> Thanks,
> Carson
>
> On 12-05-31 8:25 AM, "Mikael Brandström Durling"
> wrote:
>
>> Hello,
>>
>> I've been working lately to set up maker for annotation work on a few
>> fungal genomes. I've got mpi maker up and running now, however, I notice
>> that maker is leaving a lot of perl processes behind. This
>> happens to the extent that the process table on the system gets filled up
>> after a few hours run time. Right now the process tree after three hours
>> running looks like this:
>>
>> |-sge_execd-+-sge_shepherd---bash---mpirun-+-maker-+-maker
>> |           |                              |       `-perl
>> |           |                              |-maker-+-maker
>> |           |                              |       |-maker---1371*[perl]
>> |           |                              |       `-perl
>> |           |                              |-maker-+-maker
>> |           |                              |       |-maker---1348*[perl]
>> |           |                              |       `-perl
>> |           |                              |-maker-+-maker
>> |           |                              |       |-maker---1384*[perl]
>> |           |                              |       `-perl
>>
>> ...and so on for all mpi processes, except for the controlling processes.
>>
>> What perl programs is maker calling, that might end up as zombies? I've
>> had a brief look at the source to no avail, but would be happy to dig
>> further with some pointers for where to look.
>>
>> This is run with the 2.25-beta from the web page, perl 5.16.0 and openmpi
>> 1.4.5.
>>
>> Thanks,
>> Mikael
>>
>> -------------------------------------
>> Mikael Brandström Durling, PhD
>> Assistant Professor
>>
>> Sveriges lantbruksuniversitet
>> Swedish University of Agricultural Sciences
>>
>> Uppsala BioCenter
>> Dept of Forest Mycology and Plant Pathology
>> Box 7026, 75007 Uppsala
>> Visiting address: Almas Allé
5
>> Telefon: 018-671512
>> mikael.durling at slu.se, www.slu.se/mykopat

From qwang at uwyo.edu Thu May 3 10:23:16 2012
From: qwang at uwyo.edu (Qiurong Wang)
Date: Thu, 3 May 2012 10:23:16 -0600
Subject: [maker-devel] MAKER download problem
Message-ID: <8754B55D-C119-4A7C-9594-BEAEAD3BB939@uwyo.edu>

Hi,

I was trying to download MAKER, but I couldn't open the download page. Could you please help me to figure out the problem? Thanks a lot!

Qiurong Wang
PhD candidate
Department of Botany
University of Wyoming
Department 3165, 1000 E University Ave. Laramie, Wyoming 82071, USA
Phone: 307-766-2634
Email: qwang at uwyo.edu

From barry.moore at genetics.utah.edu Thu May 3 11:51:48 2012
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Thu, 3 May 2012 11:51:48 -0600
Subject: [maker-devel] MAKER download problem
In-Reply-To: <8754B55D-C119-4A7C-9594-BEAEAD3BB939@uwyo.edu>
References: <8754B55D-C119-4A7C-9594-BEAEAD3BB939@uwyo.edu>
Message-ID: <6B24C00B-6C70-4A40-BA7C-E3AC0C7F5E25@genetics.utah.edu>

Hi all,

The web server hosting the MAKER licensing application and MAKER code distribution was attacked this week. The University of Utah is currently blocking access to that server from outside University IP space. We are working hard to move all of the content and web-applications from that machine to a new server and I expect to have the MAKER services restored over the weekend.

This is affecting the ability to submit new licenses for MAKER and to download the MAKER code. In addition those with existing licenses will need to update their code links for future code updates. For those of you on campus in Utah or with campus VPN access, the server is running and available.
Updates on the status of the server and details about new links to the MAKER code will be posted to the mailing list soon.

Barry

On May 3, 2012, at 10:23 AM, Qiurong Wang wrote:

> Hi,
>
> I was trying to download MAKER, but I couldn't open the download page. Could you please help me to figure out the problem? Thanks a lot!
>
>
> Qiurong Wang
> PhD candidate
> Department of Botany
> University of Wyoming
> Department 3165, 1000 E University Ave. Laramie, Wyoming 82071, USA
> Phone: 307-766-2634
> Email: qwang at uwyo.edu

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543

From carsonhh at gmail.com Thu May 3 12:00:01 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 03 May 2012 14:00:01 -0400
Subject: [maker-devel] MAKER download problem
In-Reply-To: <8754B55D-C119-4A7C-9594-BEAEAD3BB939@uwyo.edu>
Message-ID:

The server hosting the file to download is down temporarily. I'll put a copy of MAKER on a separate server and e-mail the link to you.

--Carson

On 12-05-03 12:23 PM, "Qiurong Wang" wrote:

>Hi,
>
>I was trying to download MAKER, but I couldn't open the download page.
>Could you please help me to figure out the problem? Thanks a lot!
>
>
>Qiurong Wang
>PhD candidate
>Department of Botany
>University of Wyoming
>Department 3165, 1000 E University Ave.
Laramie, Wyoming 82071, USA >Phone: 307-766-2634 >Email: qwang at uwyo.edu > > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From cjfields at illinois.edu Tue May 15 09:01:31 2012 From: cjfields at illinois.edu (Fields, Christopher J) Date: Tue, 15 May 2012 15:01:31 +0000 Subject: [maker-devel] mail list and Trac Message-ID: <64A6759A-9FD1-4AEE-BCEA-151B0D791ADD@illinois.edu> Just wanted to point out, I noticed the mail list is not being indexed on Google Groups any more (nothing in May). Also, any status on Trac? chris From barry.moore at genetics.utah.edu Tue May 15 10:34:29 2012 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Tue, 15 May 2012 10:34:29 -0600 Subject: [maker-devel] mail list and Trac In-Reply-To: <64A6759A-9FD1-4AEE-BCEA-151B0D791ADD@illinois.edu> References: <64A6759A-9FD1-4AEE-BCEA-151B0D791ADD@illinois.edu> Message-ID: Thanks Chris, I'll check on the Google groups. The MAKER Trac server was on a server that we had to shut down a couple weeks ago and I had made moving it a lower priority since I didn't think it was getting much use. I'll get it moved over and bump up the priority on that move since I know someone is looking at it. B On May 15, 2012, at 9:01 AM, Fields, Christopher J wrote: > Just wanted to point out, I noticed the mail list is not being indexed on Google Groups any more (nothing in May). Also, any status on Trac? > > chris > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org Barry Moore Research Scientist Dept. of Human Genetics University of Utah Salt Lake City, UT 84112 -------------------------------------------- (801) 585-3543 -------------- next part -------------- An HTML attachment was scrubbed... 
From anastasia.gioti at scilifelab.se Thu May 17 05:27:04 2012
From: anastasia.gioti at scilifelab.se (Anastasia Gioti)
Date: Thu, 17 May 2012 13:27:04 +0200
Subject: [maker-devel] Use pass-through system to add missing genes
In-Reply-To: <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu>
References: <4FE7CD5B-FC1C-43E7-AC41-A05823348B99@scilifelab.se> <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu>
Message-ID: <4D9922AF-A917-4747-9B7C-CFE9F142D51C@scilifelab.se>

Hi Barry,

Thanks for your detailed instructions. You understood correctly: I have already included the proteins of the closely related species in my protein evidence dataset, but still did not get the genes. I have now blasted (blastp) the 949 missing proteins from this species against my nonoverlaping_abinits.fasta proteins and found 618 good hits, which I guess I can promote to models using routine no. 2 of your last email and Carson's script gff3_select.

I have also looked at the rest of the proteins (331), for which there was no model in nonoverlaping_abinits.fasta. I will try to describe 2 examples I looked at in Apollo:

1) The ab initio models predicted a ~7.5 kb gene covering 3 genes (as predicted in the closely related species). Blastx+protein2genome similarities were reported for two of these genes, but not for the 3rd (the one in the middle). MAKER finally decided to call two genes, respecting the blastx+protein2genome evidence, but the 3rd was lost. I have previously reported here that MAKER tends to fuse genes in multi-exonic genes, and others reported that too; I remember you proposed changing a parameter to alter this. That is something to keep in mind for the final strategy I am trying to decide on (for the moment I have not rerun MAKER).
For this case, ab initio models do not exist for the gene (in the sense that the existing models overlap many genes) and the similarity to the protein of the closely related species was not judged sufficient, although when I look at a TblastN alignment for this area it looks fine to me.

2) Only the 3' end of the gene was called by MAKER, despite blastx+protein2genome evidence from the closely related species for the entire region. Ab initio models existed as 2 separate genes, one for the 3' end region (finally retained by MAKER in a consensus decision, I guess) and one for the 5' region, but here not all predictors called an ORF, and finally nothing was called in this region. In this case it is rather a misannotation, but one that misses a very important part of the gene.

I hope my descriptions are clear; otherwise I can provide you with the GFF file of these 2 examples so you can look for yourself.

I am not very clear about what to do with these 331 cases (which I also do not know how to examine, except by viewing random examples in Apollo). I feel that a second MAKER run would probably be the solution, this time providing as pred_gff the result of a blast against the 331. But the existing annotations would then still have to be updated somehow, as the new predictions are in conflict with them (see example 2). I am a bit confused. To recap: what would you suggest for the 331 still-missing proteins, in terms of assessing their profiles in a rather automatic way and including them in my annotations without going deep into manual gene curation?

Many thanks,
Anastasia

>
> Let me just restate what you've said so that I can be sure that I am
> correct about what you've already done. You have run Maker with
> SNAP, Genemark and Augustus using EST from a closely related species
> (passed to altest) and protein evidence from other fungi. You are
> missing about 1,000 genes compared to the species that provided the
> EST alignments.
You say their is good evidence that these genes > exist from the alignments and I assume by this that you mean the EST/ > protein alignments that Maker produced. > > 1) Is the closely related fungus annotated and if so have you > included it's proteins in the evidence set that you provided to > Maker. If you haven't provided these proteins as evidence to maker > then you should do this. You can re-run maker passing your original > models back through like this: > > #-----Re-annotation Using MAKER Derived GFF3 > genome_gff=original_maker_annotations.gff3 > est_pass=1 > altest_pass=1 > protein_pass=1 > rm_pass=1 > model_pass=1 > pred_pass=1 > other_pass=1 > > #-----Protein Homology Evidence (for best results provide a file for > at least one) > protein=proteins_from_closely_related.fasta > ## OR it sounds like you've already aligned these with exonerate? > protein_gff=proteins_from_closely_related_already_aligned.gff > > 2) If you've already included those closely related species proteins > but still didn't get the 1,000 genes, then take your > nonoverlaping_abinits.fasta and blast them directly against your > closely related proteins. Presumably they don't hit too well > because if they did they should have been promoted to predictions by > Maker the first time, but here you can decide yourself what > thresholds to allow to keep the abinit predictions that hit the > closely related species proteins. If you filter you blast hits the > way you want and keep the names of the abinit predictions that pass > your filter, then use the script Carson attached it it will generate > a abinit precidtion GFF file with only the predictions you > selected. You can then pass those predictions back to Maker and > force it to keep them and Maker will turn them from predictions > (match/match_part) into gene models. 
> > #-----Re-annotation Using MAKER Derived GFF3 > genome_gff=original_maker_annotations.gff3 > est_pass=1 > altest_pass=1 > protein_pass=1 > rm_pass=1 > model_pass=1 > pred_pass=0 > other_pass=1 > > #-----Gene Prediction > snaphmm= > gmhmm= > augustus_species= > fgenesh_par_file= > pred_gff=ab_init_predictions_rescued_by_blast.gff > > keep_preds=1 > > Barry > >>> Thanks, >>> Carson >>> >>> From: Anastasia Gioti >>> Date: Wed, 25 Apr 2012 11:09:36 +0200 >>> To: >>> Subject: [maker-devel] Use pass-through system to add missing genes >>> >>> Hi, >>> I have a set of predicted proteins from the genome of a fungus >>> annotated by MAKER using EST data from a closely related species >>> and 3 ab initio predictors (snap iterativelly trained 3 times, >>> genemark trained directly on the assembly and augustus with a >>> model from a less closely related species), along with a set of >>> fungal proteins. I am missing ~ 1000 proteins when I compare to >>> the species i used EST data from, and there is good evidence from >>> alignments that these genes exist. The question is how to proceed >>> from Blast hits to actual gene models here. The idea would be to >>> add these genes to the existing dataset, rather than reannotate >>> the genome. I believe that reannotating it without any further >>> evidence such as RNA-seq from the species itself would not change >>> much,and i d rather stick with actual predictions that i trust and >>> have used in subsequent analyses. The 1000 genes I can accept to >>> annotate with a less stringent and reliable way than MAKER, I just >>> want to add them so that the difference in gene count gets >>> corrected. 
>>> I was reading the MAKER 2 paper and i was wondering if I can use >>> the legacy annotations scheme to do it, by providing GFF3 of the >>> alignments between the two species in the regions where genes were >>> missed, but as i said, I would not like to reannotate the whole >>> genome, and running MAKER2 might cause slight changes that i d >>> like to avoid. Is this possible? First, is it possible to provide >>> a Gff3 file of specific locations and not the entire genome >>> alignment? (I guess so..) Second, how can I tag the existing >>> annotations as 'not to be changed' or alternatively, tag the new >>> models only? How should I run maker2, with which predictors on and >>> which off? >>> Thanks, >>> Anastasia >>> >>> Anastasia Gioti >>> Post-doctoral Researcher >>> >>> anastasia.gioti at scilifelab.se >>> anastasia.gioti at ebc.uu.se >>> >>> http://www.ebc.uu.se/Research/IEG/evbiol/people/pages/Gioti_Anastasia/ >>> >>> >>> >>> _______________________________________________ maker-devel >>> mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >> >> Anastasia Gioti >> Post-doctoral Researcher >> >> anastasia.gioti at scilifelab.se >> anastasia.gioti at ebc.uu.se >> >> http://www.ebc.uu.se/Research/IEG/evbiol/people/pages/ >> Gioti_Anastasia/ >> >> >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > Barry Moore > Research Scientist > Dept. 
of Human Genetics
> University of Utah
> Salt Lake City, UT 84112
> --------------------------------------------
> (801) 585-3543

Anastasia (Natassa) Gioti
Post-Doc Researcher
Evolutionary Biology Department
Uppsala University
-Science for Life lab, Karolinska Institute Stockholm
anastasia.gioti at ebc.uu.se
anastasia.gioti at scilifelab.se
http://www.ebc.uu.se/Research/IEG/evbiol/people/pages/Gioti_Anastasia/

From yogeshp08 at gmail.com Tue May 15 10:07:57 2012
From: yogeshp08 at gmail.com (Yogesh)
Date: Tue, 15 May 2012 11:07:57 -0500
Subject: [maker-devel] tblastn Cleanup?
Message-ID: <4478F0B20ED84A85B3C4FE4154F8FAD1@gmail.com>

Hello,

I have a few tblastn alignments with a lot of low quality hits. I have to clean that up. Can you please suggest how the Maker pipeline does it? Also, can I run it directly on my data without having to go through the whole pipeline?

Thanks,
-Yogesh

From carsonhh at gmail.com Fri May 18 08:22:50 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 18 May 2012 10:22:50 -0400
Subject: [maker-devel] tblastn Cleanup?
In-Reply-To: <4478F0B20ED84A85B3C4FE4154F8FAD1@gmail.com>
Message-ID:

There are several things. I set several filtering options directly on the BLAST command line. These are things like maximum intron length, an e-value filter, and simple repeat filtering (called dust filter in NCBI blast and seg filter in WUBLAST). I also run RepeatMasker over the genome first. This allows simple and complex repeats to be removed before running BLAST (otherwise you get many false alignments). Last, I filter the results based on percent coverage of the hit to the original database sequence and percent identity.
I think you can set percent identity as a flag in BLAST, but the percent coverage filter is calculated by MAKER, so doing this outside of MAKER would require writing your own filtering script to compare the length of the alignment to the length of the sequence in the database.

I also have an HSP depth overlap filter. This removes weird low complexity hits that escape repeat masking. They show up as multiple HSPs overlapping multiple times in the same region (usually very high numbers, like 90 HSPs all 100 bp long in the same region). I calculate the number of base pairs in the alignment on the hit, then divide by the number of base pairs in the query alignment. If it's greater than 3, I throw the hit out.

Thanks,
Carson

From: Yogesh
Date: Tuesday, 15 May, 2012 12:07 PM
To:
Subject: [maker-devel] tblastn Cleanup?

Hello,

I have a few tblastn alignments with a lot of low quality hits. I have to clean that up. Can you please suggest how Maker pipeline does it? Also can I run it directly on my data without having to go through the whole pipeline?

Thanks,
-Yogesh

From smg283 at gmail.com Tue May 22 00:42:51 2012
From: smg283 at gmail.com (Scott Geib)
Date: Mon, 21 May 2012 20:42:51 -1000
Subject: [maker-devel] can't call method strand on an undefined value ERROR: Failed while flattening protein clusters
Message-ID:

Hi,

Using MAKER 2.24, I am getting the following error (see below) in the protein2genome widget. I also get the same error with est2genome. This happens with my own data (testing on a single scaffold), but not with the test data supplied with MAKER (dpp files in the data folder).
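A rough sketch of what a coverage and identity filter of this kind could look like outside MAKER. This is not MAKER's own code: the function names, the HSP dictionary layout, and the default thresholds below are all made up for illustration, and the evidence sequence length has to be supplied by the caller.

```python
def covered_length(intervals):
    """Count positions covered by 1-based inclusive intervals, overlaps counted once."""
    total = 0
    current_end = 0
    for start, end in sorted(intervals):
        if end <= current_end:
            continue  # interval lies entirely inside what is already counted
        start = max(start, current_end + 1)
        total += end - start + 1
        current_end = end
    return total


def passes_filter(hsps, evidence_length, min_coverage=50.0, min_identity=40.0):
    """Keep a hit only if it covers enough of the evidence sequence at enough identity.

    hsps: list of dicts with 'start' and 'end' (coordinates on the evidence
    sequence) and 'pident' (percent identity of that HSP). Hypothetical layout.
    Thresholds are placeholders, not MAKER defaults.
    """
    if evidence_length <= 0 or not hsps:
        return False
    spans = [(h["start"], h["end"]) for h in hsps]
    coverage = 100.0 * covered_length(spans) / evidence_length
    # Length-weighted mean identity across the HSPs of the hit.
    aligned = sum(end - start + 1 for start, end in spans)
    identity = sum(h["pident"] * (h["end"] - h["start"] + 1) for h in hsps) / aligned
    return coverage >= min_coverage and identity >= min_identity
```

The coordinates and percent identity could be taken from tabular BLAST output; which columns hold the evidence-sequence coordinates depends on which sequence was the query and which was the database.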
Scott Widget::exonerate::protein2genome: /data0/opt/AlignmentSoftware/exonerate/exonerate-2.2.0-x86_64/bin/exonerate -q /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaffold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/UniRef90_UPI000194D3FC.for.1588546-1589203.9.fasta -t /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaffold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/scaffold00001.1588546-1589203.9.fasta -Q protein -T dna -m protein2genome --softmasktarget --percent 20 --showcigar > /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaffold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/scaffold00001.1588546-1589203.UniRef90_UPI000194D3FC.p_exonerate.9 #-------------------------------# cleaning blastx... in cluster::shadow_cluster... ...finished clustering. cleaning clusters.... total clusters:2 now processing 0 ...processing 0 of 23 ...processing 1 of 23 ...processing 2 of 23 ...processing 3 of 23 ...processing 4 of 23 ...processing 5 of 23 ...processing 6 of 23 ...processing 7 of 23 ...processing 8 of 23 ...processing 9 of 23 ...processing 10 of 23 ...processing 11 of 23 ...processing 12 of 23 ...processing 13 of 23 ...processing 14 of 23 ...processing 15 of 23 ...processing 16 of 23 ...processing 17 of 23 ...processing 18 of 23 ...processing 19 of 23 ...processing 20 of 23 ...processing 21 of 23 total clusters:2 now processing 0 in cluster::shadow_cluster... ...finished clustering. cleaning clusters.... 
total clusters:2 now processing 0 ...processing 0 of 20 ...processing 1 of 20 ...processing 2 of 20 ...processing 3 of 20 ...processing 4 of 20 ...processing 5 of 20 ...processing 6 of 20 ...processing 7 of 20 ...processing 8 of 20 ...processing 9 of 20 ...processing 10 of 20 ...processing 11 of 20 ...processing 12 of 20 ...processing 13 of 20 ...processing 14 of 20 ...processing 15 of 20 ...processing 16 of 20 ...processing 17 of 20 ...processing 18 of 20 total clusters:2 now processing 0 Can't call method "strand" on an undefined valueERROR: Failed while flattening protein clusters ERROR: Chunk failed at level:11, tier_type:2 FAILED CONTIG:scaffold00001 -------------- next part -------------- An HTML attachment was scrubbed... URL: From anastasia.gioti at scilifelab.se Tue May 22 07:14:17 2012 From: anastasia.gioti at scilifelab.se (Anastasia Gioti) Date: Tue, 22 May 2012 15:14:17 +0200 Subject: [maker-devel] Use pass-through system to add missing genes In-Reply-To: <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu> References: <4FE7CD5B-FC1C-43E7-AC41-A05823348B99@scilifelab.se> <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu> Message-ID: <19E36E3B-6A82-49D5-B0AC-5E521F3E8999@scilifelab.se> Hi again, I hav sent an email a few days ago about this thread, and i am not sure if you have received it or you still did not have time to look at it. In any case, this email was dealing with the fact that some proteins were not retrieved in the abinitio models and how to deal with it. What I would like to ask here is a few confirmations on how to rerun maker for the proteins that were retrieved in the abinitio models. i have looked at the Blast results, and have done a series of check-ups, so now I am ready to run MAKER again with a list of models that I want to retain. Regarding the following parameters: 1. Do I set the genome= to nothing here? i.e quote it out? 
This is in the beginning of the control file #-----Genome (Required for De-Novo Annotation) genome=#genome sequence file in fasta format organism_type= #eukaryotic or prokaryotic. Default is eukaryotic > > #-----Re-annotation Using MAKER Derived GFF3 > genome_gff=original_maker_annotations.gff3 > est_pass=1 > altest_pass=1 > protein_pass=1 > rm_pass=1 > model_pass=1 > pred_pass=0 > other_pass=1 > > #-----Gene Prediction 2. Do i provide again the snap etc models? I am not sure, because i thought MAKER would not run ab initio predictors this time (this is why I would also quote out the genome file above, as this is not a de novo annotation). but if it will, i will then provide the previous models i used, except for snap, for which I will generate a new model from the gff3 file of the last run (according to snap documentation). Am i correct? > snaphmm= > gmhmm= > augustus_species= > fgenesh_par_file= > pred_gff=ab_init_predictions_rescued_by_blast.gff > > keep_preds=1 Samely, what do i do with repeatmasking etc? Thanks in adavance, Anastasia > > Barry > >>> Thanks, >>> Carson >>> >>> From: Anastasia Gioti >>> Date: Wed, 25 Apr 2012 11:09:36 +0200 >>> To: >>> Subject: [maker-devel] Use pass-through system to add missing genes >>> >>> Hi, >>> I have a set of predicted proteins from the genome of a fungus >>> annotated by MAKER using EST data from a closely related species >>> and 3 ab initio predictors (snap iterativelly trained 3 times, >>> genemark trained directly on the assembly and augustus with a >>> model from a less closely related species), along with a set of >>> fungal proteins. I am missing ~ 1000 proteins when I compare to >>> the species i used EST data from, and there is good evidence from >>> alignments that these genes exist. The question is how to proceed >>> from Blast hits to actual gene models here. The idea would be to >>> add these genes to the existing dataset, rather than reannotate >>> the genome. 
I believe that reannotating it without any further >>> evidence such as RNA-seq from the species itself would not change >>> much,and i d rather stick with actual predictions that i trust and >>> have used in subsequent analyses. The 1000 genes I can accept to >>> annotate with a less stringent and reliable way than MAKER, I just >>> want to add them so that the difference in gene count gets >>> corrected. >>> I was reading the MAKER 2 paper and i was wondering if I can use >>> the legacy annotations scheme to do it, by providing GFF3 of the >>> alignments between the two species in the regions where genes were >>> missed, but as i said, I would not like to reannotate the whole >>> genome, and running MAKER2 might cause slight changes that i d >>> like to avoid. Is this possible? First, is it possible to provide >>> a Gff3 file of specific locations and not the entire genome >>> alignment? (I guess so..) Second, how can I tag the existing >>> annotations as 'not to be changed' or alternatively, tag the new >>> models only? How should I run maker2, with which predictors on and >>> which off? >>> Thanks, >>> Anastasia >>> >>> Anastasia Gioti >>> Post-doctoral Researcher >>> >>> anastasia.gioti at scilifelab.se >>> anastasia.gioti at ebc.uu.se >>> >>> http://www.ebc.uu.se/Research/IEG/evbiol/people/pages/Gioti_Anastasia/ >>> >>> >>> >>> _______________________________________________ maker-devel >>> mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >> >> Anastasia Gioti >> Post-doctoral Researcher >> >> anastasia.gioti at scilifelab.se >> anastasia.gioti at ebc.uu.se >> >> http://www.ebc.uu.se/Research/IEG/evbiol/people/pages/ >> Gioti_Anastasia/ >> >> >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > Barry Moore > Research Scientist > Dept. 
of Human Genetics > University of Utah > Salt Lake City, UT 84112 > -------------------------------------------- > (801) 585-3543 > > > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell- > lab.org Anastasia (Natassa) Gioti Post-Doc Researcher Evolutionary Biology Department Uppsala University -Science for Life lab, Karolinska Institute Stockholm anastasia.gioti at ebc.uu.se anastasia.gioti at scilifelab.se http://www.ebc.uu.se/Research/IEG/evbiol/people/pages/Gioti_Anastasia/ From anastasia.gioti at scilifelab.se Wed May 23 03:07:12 2012 From: anastasia.gioti at scilifelab.se (Anastasia Gioti) Date: Wed, 23 May 2012 11:07:12 +0200 Subject: [maker-devel] Use pass-through system to add missing genes In-Reply-To: <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu> References: <4FE7CD5B-FC1C-43E7-AC41-A05823348B99@scilifelab.se> <03439C8F-75B0-42FE-894C-CC564AEB73E9@genetics.utah.edu> Message-ID: <1B0770A0-6D14-4336-BC3A-DC24619BC3FE@scilifelab.se> Hi and sorry for the multiple postings. I have a list of models rescued by the nonoverlaping_abinits.fasta fles (against which i blasted my missing proteins from the closely related species and further filtered out the dubious hits) and a maker gff3 file, but Carson's script gff3_select won't work, and the reason is that these abinitio models were not promoted into the maker gff3 file, thus they are not there. I refer to the gff3 file generated by gff3_merge script. Am i missing something? Thank you, Anastasia > >>> If you know which ab initio predictions you want to add (I.e. the ab initio promoting scenario I descibed), you can provide those predictions to the use the pred_gff option and then set keep_preds=1 and they will be maintained even without evidence. Attached is a script that would make selecting those easier. 
It takes the MAKER-generated GFF3 and a list of predictions to keep (one name per line). These might be the results of a BLAST analysis, for example. It will then return the GFF3 entries for just those models selected.

From thomas.hackl at uni-wuerzburg.de Wed May 23 06:01:55 2012
From: thomas.hackl at uni-wuerzburg.de (Thomas Hackl)
Date: Wed, 23 May 2012 14:01:55 +0200
Subject: [maker-devel] missing est2genome annotation
Message-ID: <4FBCD1B3.8000102@uni-wuerzburg.de>

Hi,

I used MAKER to annotate genomic contigs and, among other things, provided transcripts from the transcriptome as EST evidence. BLAST and exonerate work fine and produce valid alignments; the alignment files exist in theVoid and look very good. Unfortunately, neither the evidence_0.gff nor the final .gff carries the corresponding feature annotations.

Any ideas why?

Regards
Thomas

--
Thomas Hackl
Julius-Maximilians-Universität
Department of Bioinformatics
97074 Würzburg, Germany
Phone: +49 931 - 31 86883
Mail: thomas.hackl at uni-wuerzburg.de

From thomas.hackl at uni-wuerzburg.de Wed May 23 11:14:27 2012
From: thomas.hackl at uni-wuerzburg.de (Thomas Hackl)
Date: Wed, 23 May 2012 19:14:27 +0200
Subject: [maker-devel] missing est2genome annotation
In-Reply-To: <4FBCD1B3.8000102@uni-wuerzburg.de>
References: <4FBCD1B3.8000102@uni-wuerzburg.de>
Message-ID: <4FBD1AF3.8020904@uni-wuerzburg.de>

Hi again,

I did some source code digging and found the following line, which is discarding my exonerate alignments. I suspect it does so for a very good reason, so it would help me a lot if someone could explain what is going on there.

/lib/GI.pm l.1473
next if $e->pAh< $pcov;

Regards
Thomas

On 23.05.2012 14:01, Thomas Hackl wrote:
> Hi,
>
> I used maker to annotate genomic contigs and among other stuff
> provided transcripts from the transcriptome as est evidence.
Blast and
> exonerate work fine and produce valid alignments, the alignment files
> exist in theVoid and look very good. Unfortunatly neither the
> evidence_0.gff nor the final .gff carry the corresponding feature
> annotations.
>
> Any ideas why?
>
> Regards
> Thomas
>

--
Thomas Hackl
Julius-Maximilians-Universität
Department of Bioinformatics
97074 Würzburg, Germany
Phone: +49 931 - 31 86883
Mail: thomas.hackl at uni-wuerzburg.de

From gowthaman.ramasamy at seattlebiomed.org Thu May 24 13:30:09 2012
From: gowthaman.ramasamy at seattlebiomed.org (Gowthaman Ramasamy)
Date: Thu, 24 May 2012 12:30:09 -0700
Subject: [maker-devel] Merging gene predictions....
Message-ID: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D6@mail02.sbri.org>

Hi Carson and others,

I am wondering if I can use Maker to merge gene predictions from three GFF files. One of the algorithms is 'augustus', which of course I can use inside Maker. The other two are not part of Maker. But what if I want to pass only the GFFs to Maker and ask it to merge the annotations (when they overlap) and pick only annotations that are predicted in 2 out of 3 GFFs? Is that possible?

I prefer this approach, as we need to run a BLAST-based validation step on the predicted features before we even try to merge them. That's another reason why we prefer not to use the Augustus inside Maker.

Thanks,
Gowthaman

From gowthaman.ramasamy at seattlebiomed.org Thu May 24 15:02:55 2012
From: gowthaman.ramasamy at seattlebiomed.org (Gowthaman Ramasamy)
Date: Thu, 24 May 2012 14:02:55 -0700
Subject: [maker-devel] Merging gene predictions....
In-Reply-To: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D6@mail02.sbri.org>
References: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D6@mail02.sbri.org>
Message-ID: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D8@mail02.sbri.org>

Can I use the following approach, Carson?
This is your reply to one of the earlier question: I've attached a made from scratch drop-in replacement that you can use to do what the old script would have done. In the current release of MAKER, instead of the gff3_preds2models script users can just give MAKER a set of predictions in GFF3 format (pred_gff option) and set keep_preds=1 (then leave all other options blank). The predictions given will the be converted into gene models. Thanks, Carson ________________________________________ From: maker-devel-bounces at yandell-lab.org [maker-devel-bounces at yandell-lab.org] On Behalf Of Gowthaman Ramasamy [gowthaman.ramasamy at seattlebiomed.org] Sent: Thursday, May 24, 2012 12:30 PM To: Carson Holt; maker-devel at yandell-lab.org Subject: [maker-devel] Merging gene predictions.... Hi Carson and others, I am wondering if I can use Maker to merge gene predictions from three gff files. One of the algorithm is 'augustus' which of course i can use it inside Maker. Other two are not part of Maker. But, in case if I want to pass only the GFFs to Maker and ask it to merge the annotations (when overlap) and pick only annotation that are predicted in 2 out of 3 gffs. Is it possible? I prefer this approach, as we need to run a blast based validation step on predicted features before even try to merge them. Thats another reason why we dont prefer to use the augustus inside maker. Thanks, Gowthaman _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From gowthaman.ramasamy at seattlebiomed.org Thu May 24 15:21:30 2012 From: gowthaman.ramasamy at seattlebiomed.org (Gowthaman Ramasamy) Date: Thu, 24 May 2012 14:21:30 -0700 Subject: [maker-devel] Merging gene predictions.... 
In-Reply-To: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D8@mail02.sbri.org>
References: <89080953C3D300419AACB6E63A7EEFBA5C8409F8D6@mail02.sbri.org>, <89080953C3D300419AACB6E63A7EEFBA5C8409F8D8@mail02.sbri.org>
Message-ID: <89080953C3D300419AACB6E63A7EEFBA5C8409F8DA@mail02.sbri.org>

Hi,

I am trying to install MAKER on CentOS. I was able to install all the Perl dependencies and external programs. perl Build.PL and ./Build install ran without errors or warnings. I did not enable MPI, though.

But when I start MAKER it returns "segmentation fault". I have no clue what is going wrong, or where to check for error logs.

Any help would be appreciated,
Thanks,
gowthaman

From gowthaman.ramasamy at seattlebiomed.org Thu May 24 15:22:16 2012
From: gowthaman.ramasamy at seattlebiomed.org (Gowthaman Ramasamy)
Date: Thu, 24 May 2012 14:22:16 -0700
Subject: [maker-devel] MAKER installation problem
Message-ID: <89080953C3D300419AACB6E63A7EEFBA5C8409F8DB@mail02.sbri.org>

Hi,

I am trying to install MAKER on CentOS. I was able to install all the Perl dependencies and external programs. perl Build.PL and ./Build install ran without errors or warnings. I did not enable MPI, though.

But when I start MAKER it returns "segmentation fault". I have no clue what is going wrong, or where to check for error logs.

Any help would be appreciated,
Thanks,
gowthaman

From bob_freeman at hms.harvard.edu Fri May 25 10:23:22 2012
From: bob_freeman at hms.harvard.edu (Bob Freeman)
Date: Fri, 25 May 2012 12:23:22 -0400
Subject: [maker-devel] Alternate translation table?
Message-ID: <454CA235-0DB6-451F-97C4-83D32E2E805A@hms.harvard.edu>

Hello all!

Unusual question here: I am running MAKER on a ciliate that uses a non-standard translation table for its translation products. I haven't found an option in the control files that I can change for the translation of the predicted transcripts. How or where can I go about this?
Tx,
Bob

-----------------------------------------------------
Bob Freeman, Ph.D.
Acorn Worm Informatics, Kirschner lab
Dept of Systems Biology, Alpert 524
Harvard Medical School
200 Longwood Avenue
Boston, MA 02115
617/432.2294, vox

"Sorry I'm late. Oh, God, that sounded insincere. I'm late."
-- Karen Walker, from Will and Grace

From carsonhh at gmail.com Mon May 28 06:43:34 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 28 May 2012 08:43:34 -0400
Subject: [maker-devel] Alternate translation table?
In-Reply-To: <454CA235-0DB6-451F-97C4-83D32E2E805A@hms.harvard.edu>
Message-ID:

The alternate translation table is not currently an option. It's one of those things that needs to be implemented, but has not been yet. It's also not supported by many of the eukaryotic gene predictors MAKER uses.

I could probably get something implemented for you to test in two to three weeks though (there are a lot of places where the translation table comes into play). Let me know.

--Carson

From carsonhh at gmail.com Mon May 28 07:38:25 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 28 May 2012 09:38:25 -0400
Subject: [maker-devel] missing est2genome annotation
In-Reply-To: <4FBD1AF3.8020904@uni-wuerzburg.de>
Message-ID:

Sorry for the slow reply. I'm just getting back after traveling.

That's a percent coverage flag. You can set a percent coverage threshold in the maker_bopts.ctl file. Partial high-scoring alignments can be common. If you filter only by expect score, you would be surprised how ugly and confusing the alignments become.

Thanks,
Carson

On 12-05-23 1:14 PM, "Thomas Hackl" wrote:

>Hi again,
>
>I did some source code digging and caught the following line burying my
>exonerate alignments. I suspect it does so for a very good reason,
>so it would help me a lot if someone could explain to me what is
>going on there.
>
>
>/lib/GI.pm l.1473
>next if $e->pAh< $pcov;
>
>
>Regards
>Thomas
>
>
>On 23.05.2012 14:01, Thomas Hackl wrote:
>> Hi,
>>
>> I used maker to annotate genomic contigs and among other stuff
>> provided transcripts from the transcriptome as est evidence. Blast and
>> exonerate work fine and produce valid alignments, the alignment files
>> exist in theVoid and look very good. Unfortunately neither the
>> evidence_0.gff nor the final .gff carry the corresponding feature
>> annotations.
>>
>> Any ideas why?
>>
>> Regards
>> Thomas
>>
>
>
>--
>Thomas Hackl
>Julius-Maximilians-Universität
>Department of Bioinformatics
>97074 Würzburg, Germany
>Fon: +49 931 - 31 86883
>Mail: thomas.hackl at uni-wuerzburg.de

From seoanezonjic at hotmail.com Tue May 22 06:55:12 2012
From: seoanezonjic at hotmail.com (p sz)
Date: Tue, 22 May 2012 12:55:12 +0000
Subject: [maker-devel] ipr_update_gff ERROR
Message-ID:

First, thanks for helping me with the previous error that I submitted. I'm still working on the same project and I have hit a new error.

I ran InterProScan with this command line:

iprscan_wrap -i parsed_input.all.maker.proteins.fasta -email seoanezonjic at hotmail.com -format raw

parsed_input.all.maker.proteins.fasta was generated with the fasta_merge tool. I passed the output (attached to this email) and a GFF file (generated by a normal MAKER run, also attached) to the ipr_update_gff script like this:

ipr_update_gff BAC12_Clone_Pt314B2_Lib_Pt_7Ba__organism_Pinus_taeda__0.gff sc_interpro.sh.o109053

And I get this error:

Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 1.
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 1.
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 2.
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 2.
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 3.
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 3. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 8. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 8. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 9. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 9. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 10. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 10. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 11. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 11. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 12. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 12. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 13. 
Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 13. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 14. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 14. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 179, <$IN> line 15. Use of uninitialized value $gene_id in hash element at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 181, <$IN> line 15. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 18. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 18. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 18. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 18. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 19. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 19. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 19. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 19. 
Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 48. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 48. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 48. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 48. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 49. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 49. Use of uninitialized value in string eq at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 69, <$IN> line 49. Use of uninitialized value in concatenation (.) or string at /export/home_users/home/soft/maker/programs/x86_64/maker/bin/ipr_update_gff line 70, <$IN> line 49. The gff file seems updated but i don't know if it works fine or is corrupt Thanks in advance -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: sc_interpro.sh.o109053 Type: application/octet-stream Size: 2542 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: BAC12_Clone_Pt314B2_Lib_Pt_7Ba__organism_Pinus_taeda__0.gff Type: application/octet-stream Size: 220243 bytes Desc: not available URL:

From larriba.ed at gmail.com Fri May 25 10:01:41 2012
From: larriba.ed at gmail.com (Eduardo Larriba)
Date: Fri, 25 May 2012 18:01:41 +0200
Subject: [maker-devel] Consensus gene models
Message-ID:

Hi Carson and people,

I am working on the structural annotation of a filamentous fungus for which there is little EST or protein evidence. To generate consensus gene models from this limited evidence I used MAKER. For this I created the GeneMark and SNAP files. I ran MAKER using the ESTs of my organism (85), along with 5700 ESTs from a closely related organism. I made predictions with Augustus, SNAP and GeneMark inside the MAKER pipeline, using the training files for my organism. Everything works fine.

My problem is that when I collect the sequences from all my contigs with the fasta_merge script (included in MAKER), I get a different list for each predictor, and the same happens when I try to get the GFF of everything. Could you tell me how I can get MAKER's consensus as a single list of genes? Or do I have to do it manually? Is it possible for MAKER to evaluate the accuracy of each prediction and confirm it, so that I just get one list from the different predictions?

Thank you very much.

--
Eduardo Larriba Tornel
Universidad de Alicante.
Lab. Fitopatología
Dept. Ciencias del Mar y Biología Aplicada
Pabellón 13. San Vicente del Raspeig
Tel. 96 590 3400 ext 3280

From carsonhh at gmail.com Mon May 28 08:14:22 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 28 May 2012 10:14:22 -0400
Subject: [maker-devel] Consensus gene models
In-Reply-To:
Message-ID:

The consensus list is in the maker.proteins.fasta and maker.transcripts.fasta files. The predictor-specific lists are just for reference (in case you want to see what the other predictors produced on their own, i.e. without MAKER's intervention). The non-overlapping.fasta file in the same directory will contain consensus entries for models that were not supported by any evidence and don't overlap any gene models in maker.transcripts.fasta (think of these as the "maybe" genes and of maker.transcripts.fasta as the very likely genes).

You can set keep_preds=1 if you just want MAKER to keep everything, with or without support, and just produce consensus (probably OK on a fungus, but I wouldn't recommend it on other eukaryotes because false positive rates will be very high).

Thanks,
Carson

From carsonhh at gmail.com Mon May 28 08:18:26 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 28 May 2012 10:18:26 -0400
Subject: [maker-devel] ipr_update_gff ERROR
In-Reply-To:
Message-ID:

This error would happen if some results exist in the iprscan output but don't match gene entries in the GFF3 file. If I could see the original files I could tell you which ones. This can happen, for example, if you combine results from the non-overlapping.fasta files with the maker.proteins.fasta files. Models in the non-overlapping.fasta file are not genes in the GFF3 (they are match/match_part entries), so errors happen.

Thanks,
Carson

From carsonhh at gmail.com Tue May 29 06:37:55 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 29 May 2012 08:37:55 -0400
Subject: [maker-devel] can't call method strand on an undefined value ERROR: Failed while flattening protein clusters
In-Reply-To:
Message-ID:

Use this command to check out the latest unreleased test version, and let me know if you still get the error.

Command --> svn co svn://malachite.genetics.utah.edu/maker/trunk maker
User: yandell_guest
Password: y at ndell_Gu3st

Thanks,
Carson

From: Scott Geib
Date: Tuesday, 22 May, 2012 2:42 AM
To:
Subject: [maker-devel] can't call method strand on an undefined value ERROR: Failed while flattening protein clusters

Hi,

Using maker 2.24, I am getting the following error (see below) in the protein2genome widget. I also get the same error with est2genome. This happens with my own data (testing on a single scaffold), but not with the test data supplied with maker (dpp files in data folder).
Scott Widget::exonerate::protein2genome: /data0/opt/AlignmentSoftware/exonerate/exonerate-2.2.0-x86_64/bin/exonerate -q /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaf fold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/UniRef90_UPI00019 4D3FC.for.1588546-1589203.9.fasta -t /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaf fold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/scaffold00001.158 8546-1589203.9.fasta -Q protein -T dna -m protein2genome --softmasktarget --percent 20 --showcigar > /data0/opt/GenePrediction/maker/bdor/makercustom/scaffold1.maker.output/scaf fold1_datastore/B8/E3/scaffold00001//theVoid.scaffold00001/scaffold00001.158 8546-1589203.UniRef90_UPI000194D3FC.p_exonerate.9 #-------------------------------# cleaning blastx... in cluster::shadow_cluster... ...finished clustering. cleaning clusters.... total clusters:2 now processing 0 ...processing 0 of 23 ...processing 1 of 23 ...processing 2 of 23 ...processing 3 of 23 ...processing 4 of 23 ...processing 5 of 23 ...processing 6 of 23 ...processing 7 of 23 ...processing 8 of 23 ...processing 9 of 23 ...processing 10 of 23 ...processing 11 of 23 ...processing 12 of 23 ...processing 13 of 23 ...processing 14 of 23 ...processing 15 of 23 ...processing 16 of 23 ...processing 17 of 23 ...processing 18 of 23 ...processing 19 of 23 ...processing 20 of 23 ...processing 21 of 23 total clusters:2 now processing 0 in cluster::shadow_cluster... ...finished clustering. cleaning clusters.... 
total clusters:2 now processing 0 ...processing 0 of 20 ...processing 1 of 20 ...processing 2 of 20 ...processing 3 of 20 ...processing 4 of 20 ...processing 5 of 20 ...processing 6 of 20 ...processing 7 of 20 ...processing 8 of 20 ...processing 9 of 20 ...processing 10 of 20 ...processing 11 of 20 ...processing 12 of 20 ...processing 13 of 20 ...processing 14 of 20 ...processing 15 of 20 ...processing 16 of 20 ...processing 17 of 20 ...processing 18 of 20 total clusters:2 now processing 0 Can't call method "strand" on an undefined valueERROR: Failed while flattening protein clusters ERROR: Chunk failed at level:11, tier_type:2 FAILED CONTIG:scaffold00001 _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From bob_freeman at hms.harvard.edu Tue May 29 10:30:23 2012 From: bob_freeman at hms.harvard.edu (Bob Freeman) Date: Tue, 29 May 2012 12:30:23 -0400 Subject: [maker-devel] Alternate translation table? In-Reply-To: References: Message-ID: Thanks, Carson, for the update on this. No need to implement something. I'll keep it simple and translate the collected transcripts using an appropriate translation table. -Bob On May 28, 2012, at 8:43 AM, Carson Holt wrote: > The alternate translation table is not currently an option. It's one of those things that needs to be implemented, but has not been yet. It's also not supported by many of the eukaryotic gene predictors MAKER uses. > > I could probably get something implemented for you to test in two to three weeks though (there are a lot of places where the translation table comes into play). Let me know. > > --Carson > > > > From: Bob Freeman > Date: Friday, 25 May, 2012 12:23 PM > To: > Subject: [maker-devel] Alternate translation table? > > Hello all! 
> > Unusual question here: I am running MAKER on a ciliate that uses a non-standard translation table for its translation products. I haven't found an option in the control files that I can change for the for translation of the predicted transcripts. How or where can I go about this? > > Tx, > Bob > > > _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org ----------------------------------------------------- Bob Freeman, Ph.D. Acorn Worm Informatics, Kirschner lab Dept of Systems Biology, Alpert 524 Harvard Medical School 200 Longwood Avenue Boston, MA 02115 617/432.2294, vox "Sorry I'm late. Oh, God, that sounded insincere. I'm late." -- Karen Walker, from Will and Grace -------------- next part -------------- An HTML attachment was scrubbed... URL: From gowthaman.ramasamy at seattlebiomed.org Tue May 29 15:54:33 2012 From: gowthaman.ramasamy at seattlebiomed.org (Gowthaman Ramasamy) Date: Tue, 29 May 2012 14:54:33 -0700 Subject: [maker-devel] Can maker select a gene model based on #algoritham predicted it Message-ID: <89080953C3D300419AACB6E63A7EEFBA5C8409F8EB@mail02.sbri.org> Hi Carson, Thanks for all the help during the long weekend, in spite of that long drive. I am still trying to imagine that. I now have maker to consider our own prediction via pred_gff, and use augustus and gene mark (with our training model). And i was able to use altest and protein evidences. Maker happily picks one gene model when there is a overlap between three different predictions. But, when I look at the gff, it seems like it picks a gene model only when there is an est/protein evidence. It leaves out some genes even though, they are predicted by all three algorithms. Of course, keep_pred=1 helps to keep all the models. This kind of leads to over prediction. But, I am looking for something in between. And would like to know if that is possible? 
1) Pick a gene model if it has evidence (EST/protein etc.), irrespective of how many algorithms predicted it.
2) In the absence of extrinsic evidence (EST/protein etc.), pick a gene model if it is predicted by at least two algorithms.

Or even simpler: I have ab initio predictions from three algorithms. Can I output those genes that are supported by at least two of them? I care less about the exactness of gene boundaries.

Thanks,
Gowthaman

PS: With my recent attempts, I learned a couple of things about MAKER and associated tools that are not documented in the gmod-maker wiki. Is it possible/OK for me to add content to it? I am happy to run it by you before making it public.

From carsonhh at gmail.com Wed May 30 06:54:32 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 30 May 2012 08:54:32 -0400
Subject: [maker-devel] Can maker select a gene model based on #algoritham predicted it
In-Reply-To: <89080953C3D300419AACB6E63A7EEFBA5C8409F8EB@mail02.sbri.org>
Message-ID:

It's not an option in exactly the way you are specifying, but there is something I usually do for annotation that works well. I run interproscan or rpsblast on the non_overlapping.proteins.fasta file and select just those non-overlapping models that have a recognizable protein domain (just searching the Pfam domain space is more than sufficient). Then I provide the selected results to model_gff, and provide the previous MAKER results to the maker_gff option (with all reannotation pass options set to 1 and all analysis options turned off). This adds models with at least recognizable domains (as even multiple gene predictors can overpredict in a similar way).

Attached is a script to help select predictions and upgrade them to models in GFF3 format. If you have questions, let me know.

Thanks,
Carson

-------------- next part --------------
A non-text attachment was scrubbed...
Name: gff3_preds2models
Type: application/octet-stream
Size: 4778 bytes
Desc: not available

From mikael.durling at slu.se Thu May 31 06:25:31 2012
From: mikael.durling at slu.se (Mikael Brandström Durling)
Date: Thu, 31 May 2012 14:25:31 +0200
Subject: [maker-devel] maker leaving large numbers of defunct zombies
Message-ID:

Hello,

I've been working lately to set up MAKER for annotation work on a few fungal genomes. I've got MPI MAKER up and running now; however, I notice that MAKER is leaving a lot of perl processes behind.
This happens to the extent that the process table on the system gets filled up after a few hours run time. Right now the process tree after three hours running looks like this: |-sge_execd-+-sge_shepherd---bash---mpirun-+-maker-+-maker | | | `-perl | | |-maker-+-maker | | | |-maker---1371*[perl] | | | `-perl | | |-maker-+-maker | | | |-maker---1348*[perl] | | | `-perl | | |-maker-+-maker | | | |-maker---1384*[perl] | | | `-perl ...and so on for all mpi processes, except for the controlling processes. What perl programs is maker calling, that might end up as zombies? I've had a brief look at the source to no avail, but would be happy to dig further with some pointers for where to look. This is run with the 2.25-beta from the web page, perl 5.16.0 and openmpi 1.4.5. Thanks, Mikael ------------------------------------- Mikael Brandstr?m Durling, PhD Assistant Professor Sveriges lantbruksuniversitet Swedish University of Agricultural Sciences Uppsala BioCenter Dept of Forest Mycology and Plant Pathology Box 7026, 75007 Uppsala Visiting address: Almas All? 5 Telefon: 018-671512 mikael.durling at slu.se, www.slu.se/mykopat From mikael.durling at slu.se Thu May 31 06:34:30 2012 From: mikael.durling at slu.se (=?iso-8859-1?Q?Mikael_Brandstr=F6m_Durling?=) Date: Thu, 31 May 2012 14:34:30 +0200 Subject: [maker-devel] Using GDBM_File instead of DB_File Message-ID: Hello, I've been struggling for a few days to get maker up and running with MPI on a debian squeeze system. Compiling a new perl 5.16 exclusively for maker I wound down to that the segfaults came from DB_File. Even by recomiling and updating that module, nothing worked. After checking for dependencies on DB_File in maker, I concluded that the only dependency was through the GI::localize_file, which expects the FastaDB to be instantiated with DB_File. However, FastaDB can run on GDBM_File too. I patched the calls to GI::localize_file in maker to handle the .pag/.dir extensions used by GDBM (below). 
With this patch applied, maker is running for me, even after I deleted
the DB_File module from the perl path and made sure that GDBM_File is
installed. My basic question is whether there is any other dependency on
DB_File that I have missed and that might break things.

cheers,
Mikael

--- maker.orig  2012-03-30 15:48:05.000000000 +0200
+++ maker  2012-05-31 10:35:30.253022648 +0200
@@ -512,7 +515,12 @@
        }
        if($size > 1){
            carp "Calling GI::localize_file" if($main::debug);
-           GI::localize_file("$gdbfile.index");
+           if( -f "$gdbfile.index.dir" ){
+               GI::localize_file("$gdbfile.index.dir");
+               GI::localize_file("$gdbfile.index.pag");
+           }else{
+               GI::localize_file("$gdbfile.index");
+           }
            carp "Calling GI::localize_file" if($main::debug);
            $gdbfile = GI::localize_file($gdbfile);
        }

From carsonhh at gmail.com  Thu May 31 07:04:58 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 31 May 2012 09:04:58 -0400
Subject: [maker-devel] Using GDBM_File instead of DB_File
In-Reply-To:
Message-ID:

DB_File is being called by Bio::DB::Fasta. I can check the object
returned when using GDBM_File instead to see whether the index file names
are stored on the object, since at the moment I'm just assuming an
extension of '.index'. I'll also look around to see if the extension name
is assumed anywhere else.

Thanks,
Carson

On 12-05-31 8:34 AM, "Mikael Brandström Durling" wrote:

>Hello,
>
>I've been struggling for a few days to get maker up and running with MPI
>on a debian squeeze system. Compiling a new perl 5.16 exclusively for
>maker I wound down to that the segfaults came from DB_File. Even by
>recomiling and updating that module, nothing worked. After checking for
>dependencies on DB_File in maker, I concluded that the only dependency
>was through the GI::localize_file, which expects the FastaDB to be
>instantiated with DB_File. However, FastaDB can run on GDBM_File too. I
>patched the calls to GI::localize_file in maker to handle the .pag/.dir
>extensions used by GDBM (below).
With this patch applied maker is running
>for me, even when I have deleted the DB_File module from the perl path
>and made sure that GDBM_File is installed. My basic question is if there
>is any other dependency for DB_File which I have missed which may break
>things?
>
>cheers,
>Mikael
>
>--- maker.orig  2012-03-30 15:48:05.000000000 +0200
>+++ maker  2012-05-31 10:35:30.253022648 +0200
>@@ -512,7 +515,12 @@
>        }
>        if($size > 1){
>            carp "Calling GI::localize_file" if($main::debug);
>-           GI::localize_file("$gdbfile.index");
>+           if( -f "$gdbfile.index.dir" ){
>+               GI::localize_file("$gdbfile.index.dir");
>+               GI::localize_file("$gdbfile.index.pag");
>+           }else{
>+               GI::localize_file("$gdbfile.index");
>+           }
>            carp "Calling GI::localize_file" if($main::debug);
>            $gdbfile = GI::localize_file($gdbfile);
>        }

From carsonhh at gmail.com  Thu May 31 07:17:20 2012
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 31 May 2012 09:17:20 -0400
Subject: [maker-devel] maker leaving large numbers of defunct zombies
In-Reply-To:
Message-ID:

MAKER uses IPC::Open3 to open almost all external applications, including
a helper script called every once in a while that helps check file locks
on NFS.

MAKER then calls waitpid to reap the processes, as IPC::Open3 doesn't
auto-reap. The only time previously I've seen issues with zombie
accumulation was with MPICH2 when it moved from the MPD process manager to
Hydra. Hydra had certain broken signal handling issues that I had to bug
the MPICH2 developers about and they fixed it. It is possible that the
issue you are having may be with OpenMPI or with perl 5.16. I currently
use perl 5.12. Perl instituted something called safe signals in either
5.6 or 5.8 and there may be some updates in 5.16 where they've been
changing those around again.
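
For anyone who wants to test the reaping behaviour outside of MAKER, the open3-then-waitpid pattern described above can be sketched roughly as follows. This is only an illustration and not MAKER's actual code: the helper name run_and_reap is hypothetical, and the child here is just the running perl interpreter ($^X) executing a one-liner.

```perl
use strict;
use warnings;
use IPC::Open3;
use Symbol 'gensym';

# Hypothetical helper (not part of MAKER): run a command via open3 and
# reap it explicitly, since open3 does not reap the child on its own.
sub run_and_reap {
    my @cmd = @_;
    my ($in, $out);
    my $err = gensym;                 # separate handle for the child's STDERR
    my $pid = open3($in, $out, $err, @cmd);
    close $in;                        # nothing to send; child sees EOF on STDIN

    # Reading STDOUT fully and then STDERR is fine for small outputs;
    # a child that writes a lot to STDERR could block with this simple approach.
    my $stdout = join '', <$out>;
    my $stderr = join '', <$err>;

    waitpid($pid, 0);                 # without this call the child stays a zombie
    my $exit = $? >> 8;               # exit status of the reaped child
    return ($exit, $stdout, $stderr);
}

# Example: run a trivial child process and show what came back.
my ($exit, $stdout) = run_and_reap($^X, '-e', 'print "done\n"; exit 3');
print "exit=$exit stdout=$stdout";
```

If the waitpid line is commented out and the helper is called in a loop, defunct perl entries should accumulate under the parent in the process table until the parent exits.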

I can try installing copies of perl 5.16 and OpenMPI to test with, to see
if I can replicate the error.

Thanks,
Carson

On 12-05-31 8:25 AM, "Mikael Brandström Durling" wrote:

>Hello,
>
>I've been working lately to set up maker for annotation work on a few
>fungal genomes. I've got mpi maker up and running now, however, I notice
>that maker is leaving a lot of perl processes behind. This
>happens to the extent that the process table on the system gets filled up
>after a few hours run time. Right now the process tree after three hours
>running looks like this:
>
>  |-sge_execd-+-sge_shepherd---bash---mpirun-+-maker-+-maker
>  |           |                              |       `-perl
>  |           |                              |-maker-+-maker
>  |           |                              |       |-maker---1371*[perl]
>  |           |                              |       `-perl
>  |           |                              |-maker-+-maker
>  |           |                              |       |-maker---1348*[perl]
>  |           |                              |       `-perl
>  |           |                              |-maker-+-maker
>  |           |                              |       |-maker---1384*[perl]
>  |           |                              |       `-perl
>
>...and so on for all mpi processes, except for the controlling processes.
>
>What perl programs is maker calling, that might end up as zombies? I've
>had a brief look at the source to no avail, but would be happy to dig
>further with some pointers for where to look.
>
>This is run with the 2.25-beta from the web page, perl 5.16.0 and openmpi
>1.4.5.
>
>Thanks,
>Mikael
>
>-------------------------------------
>Mikael Brandström Durling, PhD
>Assistant Professor
>
>Sveriges lantbruksuniversitet
>Swedish University of Agricultural Sciences
>
>Uppsala BioCenter
>Dept of Forest Mycology and Plant Pathology
>Box 7026, 75007 Uppsala
>Visiting address: Almas Allé 5
>Telefon: 018-671512
>mikael.durling at slu.se, www.slu.se/mykopat

From mikael.durling at slu.se  Thu May 31 07:57:06 2012
From: mikael.durling at slu.se (Mikael Brandström Durling)
Date: Thu, 31 May 2012 15:57:06 +0200
Subject: [maker-devel] maker leaving large numbers of defunct zombies
In-Reply-To:
References:
Message-ID:

I saw the same problem with the latest MPICH2 using Hydra too, so it might
boil down to perl/openmpi interactions. I didn't see this problem with
the Debian-supplied perl 5.10, but with that perl I had intermittent
segfaults in DB_File and libpthread. Those seemed to come from some
interaction with the LD_PRELOADed libmpi. The requirement for preloading
libmpi was most easily solved by compiling openmpi with --disable-dlopen.

Thanks,
Mikael

On 31 May 2012, at 15:17, Carson Holt wrote:

> MAKER uses IPC::Open3 to open almost all external applications, including
> a helper script called every once in a while that helps check file locks
> on NFS.
>
> MAKER then calls waitpid to reap the processes, as IPC::Open3 doesn't
> auto-reap. The only time previously I've seen issues with zombie
> accumulation was with MPICH2 when it moved from the MPD process manager to
> Hydra. Hydra had certain broken signal handling issues that I had to bug
> the MPICH2 developers about and they fixed it. It is possible that the
> issue you are having may be with OpenMPI or with perl 5.16. I currently
> use perl 5.12. Perl instituted something called safe signals in either
> 5.6 or 5.8 and there may be some updates in 5.16 where they've been
> changing those around again.
>
> I can try installing a copy of 5.16 to test with and OpenMPI to see if I
> can replicate the error.
>
> Thanks,
> Carson
>
> On 12-05-31 8:25 AM, "Mikael Brandström Durling"
> wrote:
>
>> Hello,
>>
>> I've been working lately to set up maker for annotation work on a few
>> fungal genomes. I've got mpi maker up and running now, however, I notice
>> that maker is leaving a lot of perl processes behind. This
>> happens to the extent that the process table on the system gets filled up
>> after a few hours run time. Right now the process tree after three hours
>> running looks like this:
>>
>>  |-sge_execd-+-sge_shepherd---bash---mpirun-+-maker-+-maker
>>  |           |                              |       `-perl
>>  |           |                              |-maker-+-maker
>>  |           |                              |       |-maker---1371*[perl]
>>  |           |                              |       `-perl
>>  |           |                              |-maker-+-maker
>>  |           |                              |       |-maker---1348*[perl]
>>  |           |                              |       `-perl
>>  |           |                              |-maker-+-maker
>>  |           |                              |       |-maker---1384*[perl]
>>  |           |                              |       `-perl
>>
>> ...and so on for all mpi processes, except for the controlling processes.
>>
>> What perl programs is maker calling, that might end up as zombies? I've
>> had a brief look at the source to no avail, but would be happy to dig
>> further with some pointers for where to look.
>>
>> This is run with the 2.25-beta from the web page, perl 5.16.0 and openmpi
>> 1.4.5.
>>
>> Thanks,
>> Mikael
>>
>> -------------------------------------
>> Mikael Brandström Durling, PhD
>> Assistant Professor
>>
>> Sveriges lantbruksuniversitet
>> Swedish University of Agricultural Sciences
>>
>> Uppsala BioCenter
>> Dept of Forest Mycology and Plant Pathology
>> Box 7026, 75007 Uppsala
>> Visiting address: Almas Allé 5
>> Telefon: 018-671512
>> mikael.durling at slu.se, www.slu.se/mykopat