From carsonhh at gmail.com Wed Oct 2 09:14:44 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 02 Oct 2013 10:14:44 -0400
Subject: [maker-devel] maker2 scripts for functional annotation
In-Reply-To:
Message-ID:

It still works, but performance will be reduced. Without ESTs you won't get
any UTR prediction, and you will be limited by how well the protein set
matches the genes. If you use the protein set of a close relative you will
capture most things, probably 80-95% of what you would capture by also
including ESTs. It all depends on how different the species in the protein
evidence file are from the species being annotated.

--Carson

On 10/1/13 3:00 PM, "Xia.Cao at dupont.com" wrote:

>Hi Carson,
>
>Thank you for your message and your kind help. Now conversion from
>match/match_part to gene/mRNA/exons/CDS works well after blanking out
>some options in control file.
>
>I have one more question about maker2. In case we don't have EST evidence
>(set of ESTs or assembled mRNA-seq in fasta format) for a genome, does
>maker2 function? If it does function, could you please let me know the
>performance of maker2 without providing EST evidence compared to the one
>with EST evidence?
>
>Thank you again and I look forward to hearing from you.
>
>Best,
>Xia
>
>-----Original Message-----
>From: Carson Holt [mailto:carsonhh at gmail.com]
>Sent: Wednesday, September 25, 2013 10:36 AM
>To: CAO, XIA; myandell at genetics.utah.edu; RIVERA, CORBAN GREGORY;
>maker-devel at yandell-lab.org
>Subject: Re: [maker-devel] maker2 scripts for functional annotation
>
>If it is launching predictors then you have snaphmm or augustus_species
>set. You need to blank out all other options in the control files
>(including repeat masking options, proteins, ESTs, etc.) when trying to
>convert match/match_part to gene/mRNA/exons/CDS, or else those other
>programs will run.
>
>--Carson
>
>
>On 9/25/13 10:31 AM, "Xia.Cao at dupont.com" wrote:
>
>>Hi Carson,
>>
>>Thank you for the message and your kind help. We tested maker2 by
>>setting keep_preds=1, pred_gff=generated_gff_file_from_first_makerRun.
>>But it seemed maker2 started to launch all predictors again and it took
>>a long time to finish. I wonder if there is any way that we can directly
>>get a gene/mRNA/exons/CDS gff file without re-running maker2 to convert
>>match/match_part features into gene/mRNA/exons/CDS.
>>
>>Thanks,
>>Xia
>>
>>-----Original Message-----
>>From: Carson Holt [mailto:carsonhh at gmail.com]
>>Sent: Thursday, September 19, 2013 5:58 PM
>>To: Mark Yandell; CAO, XIA; RIVERA, CORBAN GREGORY;
>>maker-devel at yandell-lab.org
>>Subject: Re: [maker-devel] maker2 scripts for functional annotation
>>
>>Hello Corban & Xia,
>>
>>Some scripts like gff3_preds2models are deprecated. To get the same
>>result as was offered by gff3_preds2models, just give your
>>match/match_part features to pred_gff= in the maker_opts.ctl file, set
>>keep_preds=1, and run with all other options and predictors turned off.
>>The final MAKER result will be your match/match_part features converted
>>into gene/mRNA/exons/CDS.
>>
>>For functional annotation, you can use InterProScan, BLASTP against
>>UniProt, or BLAST2GO. My preference is to use InterProScan to add GO
>>terms and protein domains via the ipr_update_gff and iprscan2gff3
>>scripts. Then add putative gene functions via BLASTP to UniProt and the
>>maker_functional_fasta and maker_functional_gff scripts.
>>
>>Go ahead and take a look at those tools and let me know if you
>>have any questions or need help configuring them.
>>
>>Thanks,
>>Carson
>>
>>
>>On 9/19/13 11:53 AM, "Mark Yandell" wrote:
>>
>>>Hi Corban & Xia,
>>>
>>>
>>>I've forwarded your question along to the MAKER_dev list, where you can
>>>get speedy answers to your maker related questions. Thanks for using
>>>MAKER.
>>>
>>>--mark
>>>
>>>
>>>Mark Yandell
>>>Professor of Human Genetics
>>>H.A. & Edna Benning Presidential Endowed Chair Eccles Institute of
>>>Human Genetics University of Utah
>>>15 North 2030 East, Room 2100
>>>Salt Lake City, UT 84112-5330
>>>ph:801-587-7707
>>>
>>>________________________________________
>>>From: Xia.Cao at dupont.com [Xia.Cao at dupont.com]
>>>Sent: Thursday, September 19, 2013 11:49 AM
>>>To: Mark Yandell; Corban-Gregory.Rivera at dupont.com
>>>Subject: maker2 scripts for functional annotation
>>>
>>>Dr. Yandell,
>>>
>>>We were recently evaluating maker2 for annotation and going through
>>>the maker tutorial from 2012.
>>>
>>>http://gmod.org/wiki/MAKER_Tutorial_2012
>>>
>>>The tutorial makes references to some scripts that we couldn't find in
>>>the current release. We were looking for scripts like
>>>gff3_preds2models to convert match/match_part format into annotations
>>>with gene/mRNA/exons/CDS and others. I was wondering if maybe we did
>>>not have the most up to date version.
>>>
>>>In addition to getting accurate gene annotations, I was looking for a
>>>solution to get functional assignments. I see that there are some
>>>scripts like maker_functional_fasta that may help, but I was wondering
>>>what you would recommend.
>>>
>>>Thanks,
>>>
>>>Corban & Xia
>>>
>>>This communication is for use by the intended recipient and contains
>>>information that may be Privileged, confidential or copyrighted under
>>>applicable law. If you are not the intended recipient, you are hereby
>>>formally notified that any use, copying or distribution of this
>>>e-mail, in whole or in part, is strictly prohibited. Please notify the
>>>sender by return e-mail and delete this e-mail from your system.
>>>Unless explicitly and conspicuously designated as "E-Contract
>>>Intended", this e-mail does not constitute a contract offer, a
>>>contract amendment, or an acceptance of a contract offer.
>>>This e-mail does not constitute a consent to the use of sender's contact
>>>information for direct marketing purposes or for transfers of data to
>>>third parties.
>>>
>>>The dupont.com web address will continue in use for a transitional
>>>period for communications sent or received on behalf of DuPont
>>>Performance Coatings, which is not affiliated in any way with the
>>>DuPont Company.
>>>
>>>Francais Deutsch Italiano Espanol Portugues Japanese Chinese Korean
>>>
>>> http://www.DuPont.com/corp/email_disclaimer.html
>>>
>>>_______________________________________________
>>>maker-devel mailing list
>>>maker-devel at box290.bluehost.com
>>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
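
The conversion Carson describes in the quoted thread above (features passed to pred_gff=, keep_preds=1, everything else blank) might look roughly like this in maker_opts.ctl. This is only a sketch: the two file names are hypothetical placeholders, and the list of options to blank follows Carson's description (predictors, repeat masking, proteins, ESTs) rather than an exhaustive reading of the control file.

```
# maker_opts.ctl fragment (sketch; file names are hypothetical)
genome=my_genome.fasta        # assembly the features were called on
pred_gff=first_maker_run.gff  # match/match_part features to convert
keep_preds=1                  # keep predictions even without supporting evidence

# Everything below is left blank/off so no aligners, maskers or predictors launch
est=
altest=
protein=
model_org=
rmlib=
repeat_protein=
snaphmm=
augustus_species=
est2genome=0
protein2genome=0
```

If predictors still start, the thread suggests checking first that snaphmm= and augustus_species= really are empty.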

From Xia.Cao at dupont.com Tue Oct 1 14:00:42 2013
From: Xia.Cao at dupont.com (Xia.Cao at dupont.com)
Date: Tue, 1 Oct 2013 19:00:42 +0000
Subject: [maker-devel] maker2 scripts for functional annotation
In-Reply-To:
References:
Message-ID:

Hi Carson,

Thank you for your message and your kind help. Now conversion from
match/match_part to gene/mRNA/exons/CDS works well after blanking out
some options in control file.

I have one more question about maker2. In case we don't have EST evidence
(set of ESTs or assembled mRNA-seq in fasta format) for a genome, does
maker2 function? If it does function, could you please let me know the
performance of maker2 without providing EST evidence compared to the one
with EST evidence?

Thank you again and I look forward to hearing from you.

Best,
Xia

From carsonhh at gmail.com Wed Oct 2 09:43:29 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 02 Oct 2013 10:43:29 -0400
Subject: [maker-devel] maker2 scripts for functional annotation
In-Reply-To:
Message-ID:

That's ok. If it is very closely related, provide it directly to the est=
option; otherwise provide it to the altest= option. The difference
between the two options is how they are aligned.
The altest= alignments take about 10x longer than the est= alignments.

--Carson

On 10/2/13 10:40 AM, "Xia.Cao at dupont.com" wrote:

>Hi Carson,
>
>Thank you for your quick and kind reply. One more question here. If we
>don't have EST evidence for the strain we sequenced/assembled, will it be
>ok to provide some EST evidence that is from some other strains which are
>close to the one we are trying to annotate? Could you please let us know
>how this will affect performance of maker2?
>
>Thank you again.
>
>Best,
>Xia

From Carson.Holt at oicr.on.ca Wed Oct 2 10:24:42 2013
From: Carson.Holt at oicr.on.ca (Carson Holt)
Date: Wed, 2 Oct 2013 15:24:42 +0000
Subject: [maker-devel] Incorrect gene model, antisense transcripts
In-Reply-To:
Message-ID:

SNAP appears to be making small ab initio calls all over the place. You
may need to retrain it, or just drop it completely from your analysis and
use only Augustus. Try running with protein2genome and then training SNAP
off of those results. Also make sure your repeat masking is sufficient;
if it is not, you may be getting too many SNAP predictions.

Thanks,
Carson

From: Sivaranjani Namasivayam
Date: Tuesday, October 1, 2013 5:01 PM
To: "maker-devel-bounces at yandell-lab.org"
Subject: Incorrect gene model, antisense transcripts

Hello,

I am working on annotating a genome. I have RNAseq data assembled with
Cufflinks and Trinity. I am using MAKER to predict gene models. Below is
the procedure I used:

- For the first round of prediction I provided the following:
  EST evidence: assembled RNAseq,
  Protein evidence: proteins from a related organism,
  Gene predictor: augustus trained on a related organism.
- The number of genes I got was much less than expected (expected at
  least 10K genes, got ~4K genes).
- Used the gene models from the first round of annotations to train SNAP
  (default parameters) and provided the HMM for the second round of
  annotation (used augustus also). All other data I passed as such in the
  gff3, except the gene models. I also included some manually curated
  genes in the EST evidence.
- The number of genes was much higher, ~11K, but many seemed incorrect
  and appear to have been generated from the SNAP models. I am attaching
  a screen shot from Apollo showing this.

Would you have any suggestions on why this might be happening and how I
can fix it? I am essentially looking for gene models for my assembled
RNAseq; can I achieve this by setting est2genome to 1? Would I be
losing/misrepresenting any data if I do this?

Also, I recently noticed my RNAseq data has assembled antisense
transcripts. MAKER seems to be predicting a coding region for these
antisense transcripts in some cases (although the coding region predicted
is much smaller than that of the sense gene model). Is there a way to
tell MAKER not to predict gene models on both strands at the same locus?

Appreciate your help.

Thanks,
Ranjani
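
Carson's suggestion in the message above (run with protein2genome, then train SNAP from those results) could be expressed as a maker_opts.ctl fragment along the following lines. It is a sketch under assumptions: the file names are hypothetical, and whether to also enable est2genome for the assembled RNAseq is a judgment call the thread leaves open.

```
# maker_opts.ctl fragment for an evidence-based training round (sketch)
est=assembled_rnaseq.fasta      # hypothetical file name
protein=related_proteins.fasta  # hypothetical file name
snaphmm=                        # blank: no SNAP predictions in this round
protein2genome=1                # build gene models directly from protein alignments
est2genome=0                    # could be set to 1 to also build models from transcripts
```

The models from such a round would then serve as the training set for a new SNAP HMM, which is supplied via snaphmm= in a later round with protein2genome set back to 0.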

From Xia.Cao at dupont.com Wed Oct 2 09:40:12 2013
From: Xia.Cao at dupont.com (Xia.Cao at dupont.com)
Date: Wed, 2 Oct 2013 14:40:12 +0000
Subject: [maker-devel] maker2 scripts for functional annotation
In-Reply-To:
References:
Message-ID:

Hi Carson,

Thank you for your quick and kind reply. One more question here. If we
don't have EST evidence for the strain we sequenced/assembled, will it be
ok to provide some EST evidence that is from some other strains which are
close to the one we are trying to annotate? Could you please let us know
how this will affect performance of maker2?

Thank you again.

Best,
Xia
> >Best, >Xia > > > > >-----Original Message----- >From: Carson Holt [mailto:carsonhh at gmail.com] >Sent: Wednesday, September 25, 2013 10:36 AM >To: CAO, XIA; myandell at genetics.utah.edu; RIVERA, CORBAN GREGORY; >maker-devel at yandell-lab.org >Subject: Re: [maker-devel] maker2 scripts for functional annotation > >If it is launching predictors then you have snap hmm or >augustus_species set. You ned to blank out all other options in the >control files (including repeat masking options, proteins, ESTs, etc.) >when trying to convert mathc/match_part to gene/mRNA/exons/CDS, or else >those other programs will run. > >--Carson > > >On 9/25/13 10:31 AM, "Xia.Cao at dupont.com" wrote: > >>Hi Carson, >> >>Thank you for the message and your kind help. We tested maker2 by >>setting keep_preds=1, pred_gff=generated_gff_file_from_first_makerRun . >>But it seemed maker2 started to launch all predictors again and it >>took long time to finish. I wonder if there is any way that we can >>directly get gene/mRNA/exons/CDS gff file without re-running maker2 to >>convert match/match_part features into gene/mRNA/exons/CDS. >> >>Thanks, >>Xia >> >>-----Original Message----- >>From: Carson Holt [mailto:carsonhh at gmail.com] >>Sent: Thursday, September 19, 2013 5:58 PM >>To: Mark Yandell; CAO, XIA; RIVERA, CORBAN GREGORY; >>maker-devel at yandell-lab.org >>Subject: Re: [maker-devel] maker2 scripts for functional annotation >> >>Hello Corban & Xia, >> >>Some scripts like gff3_preds2models are deprecated. To get the same >>result as was offered by gff3_preds2models, just give your >>match/match_part features to pref_gff= in the maker_opts.ctl file, set >>keep_preds=1, and run with all other options and predictors turned off. >>The final MAKER result will be your match/match_part features >>converted into gene/mRNA/exons/CDS. >> >>For functional annotation, you can use Interproscan, BLASTP against >>UniProt, or BALST2GO. 
My preference is to use InterProScan to add GO >>terms and proteins domains via the ipr_update_gff and iprscan2gff3 >>scripts. Then add putative gene functions via BLASTP to UniProt and >>maker_functional_fasta and maker_functional_gff scripts. >> >>Go ahead and take a look and that those tools and let me know if you >>have any questions or need help you configuring them. >> >>Thanks, >>Carson >> >> >>On 9/19/13 11:53 AM, "Mark Yandell" wrote: >> >>>Hi Corban & Xia, >>> >>> >>>I've forwarded your question along to the MAKER_dev list, were you >>>can get speedy answers to your maker related questions. Thanks for >>>using MAKER. >>> >>>--mark >>> >>> >>>Mark Yandell >>>Professor of Human Genetics >>>H.A. & Edna Benning Presidential Endowed Chair Eccles Institute of >>>Human Genetics University of Utah >>>15 North 2030 East, Room 2100 >>>Salt Lake City, UT 84112-5330 >>>ph:801-587-7707 >>> >>>________________________________________ >>>From: Xia.Cao at dupont.com [Xia.Cao at dupont.com] >>>Sent: Thursday, September 19, 2013 11:49 AM >>>To: Mark Yandell; Corban-Gregory.Rivera at dupont.com >>>Subject: maker2 scripts for functional annotation >>> >>>Dr. Yandell, >>> >>>We were recently evaluating maker2 for annotation and going through >>>the maker tutorial from 2012. >>> >>>http://gmod.org/wiki/MAKER_Tutorial_2012 >>> >>>The tutorial makes references to some scripts that we couldn?t find >>>in the current release. We were looking for scripts like >>>gff3_preds2models to convert match/match_part format into annotations >>>with gene/mRNA/exons/CDS and others. I was wondering if maybe we did >>>not have the most up to date version. >>> >>>In addition to getting accurate gene annotations, I was looking for a >>>solution to get functional assignments. I see that there are some >>>scripts like maker_functional_fasta that may help, but I was >>>wondering what you would recommend. 
>>> >>>Thanks, >>> >>>Corban & Xia >>> >>>This communication is for use by the intended recipient and contains >>>information that may be Privileged, confidential or copyrighted under >>>applicable law. If you are not the intended recipient, you are hereby >>>formally notified that any use, copying or distribution of this >>>e-mail, in whole or in part, is strictly prohibited. Please notify >>>the sender by return e-mail and delete this e-mail from your system. >>>Unless explicitly and conspicuously designated as "E-Contract >>>Intended", this e-mail does not constitute a contract offer, a >>>contract amendment, or an acceptance of a contract offer. This e-mail >>>does not constitute a consent to the use of sender's contact >>>information for direct marketing purposes or for transfers of data to >>>third parties. >>> >>>The dupont.com web address will continue in use for a transitional >>>period for communications sent or received on behalf of DuPont >>>Performance Coatings., which is not affiliated in any way with the >>>DuPont Company. >>> >>>Francais Deutsch Italiano Espanol Portugues Japanese Chinese >>>Korean >>> >>> http://www.DuPont.com/corp/email_disclaimer.html >>> >>>_______________________________________________ >>>maker-devel mailing list >>>maker-devel at box290.bluehost.com >>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.o >>>r >>>g >> >> >> >>This communication is for use by the intended recipient and contains >>information that may be Privileged, confidential or copyrighted under >>applicable law. If you are not the intended recipient, you are hereby >>formally notified that any use, copying or distribution of this >>e-mail, in whole or in part, is strictly prohibited. Please notify the >>sender by return e-mail and delete this e-mail from your system. 
>>Unless explicitly and conspicuously designated as "E-Contract >>Intended", this e-mail does not constitute a contract offer, a >>contract amendment, or an acceptance of a contract offer. This e-mail >>does not constitute a consent to the use of sender's contact >>information for direct marketing purposes or for transfers of data to third parties. >> >>The dupont.com web address will continue in use >>for a transitional period for communications sent or received on >>behalf of DuPont Performance Coatings., which is not affiliated in any >>way with the DuPont Company. >> >>Francais Deutsch Italiano Espanol Portugues Japanese Chinese >>Korean >> >> http://www.DuPont.com/corp/email_disclaimer.html >> > > > >This communication is for use by the intended recipient and contains >information that may be Privileged, confidential or copyrighted under >applicable law. If you are not the intended recipient, you are hereby >formally notified that any use, copying or distribution of this e-mail, >in whole or in part, is strictly prohibited. Please notify the sender >by return e-mail and delete this e-mail from your system. Unless >explicitly and conspicuously designated as "E-Contract Intended", this >e-mail does not constitute a contract offer, a contract amendment, or >an acceptance of a contract offer. This e-mail does not constitute a >consent to the use of sender's contact information for direct marketing >purposes or for transfers of data to third parties. > >The dupont.com web address will continue in use for >a transitional period for communications sent or received on behalf of >DuPont Performance Coatings., which is not affiliated in any way with >the DuPont Company. > >Francais Deutsch Italiano Espanol Portugues Japanese Chinese >Korean > > http://www.DuPont.com/corp/email_disclaimer.html > This communication is for use by the intended recipient and contains information that may be Privileged, confidential or copyrighted under applicable law. 
If you are not the intended recipient, you are hereby formally notified that any use, copying or distribution of this e-mail, in whole or in part, is strictly prohibited. Please notify the sender by return e-mail and delete this e-mail from your system. Unless explicitly and conspicuously designated as "E-Contract Intended", this e-mail does not constitute a contract offer, a contract amendment, or an acceptance of a contract offer. This e-mail does not constitute a consent to the use of sender's contact information for direct marketing purposes or for transfers of data to third parties.

The dupont.com web address will continue in use for a transitional period for communications sent or received on behalf of DuPont Performance Coatings, which is not affiliated in any way with the DuPont Company.

Francais Deutsch Italiano Espanol Portugues Japanese Chinese Korean

http://www.DuPont.com/corp/email_disclaimer.html

From markwiz at us.ibm.com Wed Oct 2 11:52:55 2013
From: markwiz at us.ibm.com (Mark Wisner)
Date: Wed, 2 Oct 2013 12:52:55 -0400
Subject: [maker-devel] Maker2 on IBM PowerLinux?
Message-ID:

IBM's PowerLinux is tuned for Big Data workloads. We have had queries for Maker2 on PowerLinux. How do we work with Maker2 development to create a PowerLinux port?

Thanks,

Mark K. Wisner
Linux On Power ISV PM
IBM
3039 Cornwallis Rd
RTP, NC 27709
Cell 919-649-5813

From barry.moore at genetics.utah.edu Wed Oct 2 12:29:36 2013
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Wed, 2 Oct 2013 11:29:36 -0600
Subject: [maker-devel] Maker2 on IBM PowerLinux?
In-Reply-To:
References:
Message-ID:

Hi Mark,

Carson Holt is the primary MAKER developer, but others of us may be able to help as well. I've added e-mails for Carson, Michael Campbell, Mark Yandell and myself, who are currently developers on the project.
You can feel free to contact this mailing list, or, if the conversation is getting too far afield for general interest, you can contact us off list and we will do whatever we can to help. If you feel you need to modify MAKER code to optimize it for PowerLinux, I'm happy to give you a branch on the Subversion repo to work with, and we can then work with you to merge these changes back to the trunk once you have improvements that benefit PowerLinux (and all other) users.

Thanks very much for your interest in MAKER - your participation in the community is most welcome!

Barry

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543

From carsonhh at gmail.com Wed Oct 2 12:29:50 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 02 Oct 2013 13:29:50 -0400
Subject: [maker-devel] Maker2 on IBM PowerLinux?
In-Reply-To:
Message-ID:

MAKER is written in Perl, so it can work pretty much anywhere you have Perl. It is more an issue of the tools MAKER uses. If you can get SNAP, BLAST, exonerate, etc. to run, then MAKER should be no problem.

You can let MAKER try to install and configure any missing Perl libraries as well as external tools for you (this is by far the easiest way). You will need an internet connection.

    cd .../maker/src/
    perl Build.PL        #say 'Y' to local installation of libraries and 'N' to MPI for now
    ./Build installdeps  #grabs missing perl libraries from the ether, and installs them in the .../maker/perl/ directory
    ./Build installexes  #grabs missing tools from the ether, installs, and configures them in the .../maker/exe/ directory
    ./Build install      #just installs maker

If you want to run via MPI, I'd recommend OpenMPI, and you will need to rerun the 'perl Build.PL' and './Build install' steps (say yes to MPI this time around). Here are some variables you will likely have to add to your .bash_profile to get OpenMPI to run with MAKER. You must add these and source the profile before attempting to install with OpenMPI; otherwise compilation will not work correctly (OpenMPI shared library issue).

    #you need to set $OMPIHOME to OpenMPI's location here
    export LD_PRELOAD=$OMPIHOME/lib/libmpi.so:$LD_PRELOAD

    #this one just suppresses a warning
    export OMPI_MCA_mpi_warn_on_fork=0

On some systems you also have to add ' -mca btl ^openib' to the mpiexec command when running MAKER. Example:

    mpiexec -mca btl ^openib -n 20 maker

Thanks,
Carson
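The point above, that a port mostly comes down to whether MAKER's external tools run on the platform, can be checked roughly before building. The following is only a sketch: the function name check_tools is made up for illustration, and the executable names passed to it (snap, blastn, exonerate, RepeatMasker) are assumptions about a typical setup, not a list defined by MAKER.

```shell
#!/bin/sh
# Hypothetical pre-flight check: report which executables are on PATH.
# The tool names passed in are assumptions; adjust them for your system.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Returns 0 only if every tool was found.
    return "$missing"
}

check_tools perl snap blastn exonerate RepeatMasker \
    || echo "some tools are missing; install them before running MAKER"
```

A non-zero return only says that a name was not found on PATH; it says nothing about whether the tool was built correctly for the architecture, so each tool still needs a real test run.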
From barry.moore at genetics.utah.edu Wed Oct 2 12:21:29 2013
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Wed, 2 Oct 2013 11:21:29 -0600
Subject: [maker-devel] maker2 scripts for functional annotation
In-Reply-To:
References:
Message-ID:

Hi Xia,

You can use EST evidence from other strains/species. The considerations are as follows:

When you pass EST (or mRNA-seq based transcript) evidence, MAKER uses blastn to align it, with alignment parameters that assume the sequences will align easily with almost perfect identity. This is fast. If you use EST/mRNA evidence from another species, MAKER uses tblastx to compare the RNA evidence to the assembly, both translated into protein space. This works fine, but it is slow and often doesn't get you much more evidence than you could have gotten from proteins as evidence.

If you believe that your other strain should have very little sequence divergence at the nucleotide level, then you could experiment with passing the EST/mRNA evidence as est in the control file. Consider this an experiment: do it on a few contigs and examine the results carefully and skeptically. If it works well, that would be the way to go because you'd get better UTRs. If it doesn't, you'll see that you get little support for models because sequence divergence was too great for good alignment. In that case you'll need to use alt_est, which will be slower and will likely give you little in the way of support for UTRs. In this case as well, do a few contigs and examine the output carefully to see whether the additional alignment time with tblastx is worth it in your case.

Barry
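The two experiments described above correspond to two different evidence slots in maker_opts.ctl. The fragment below is a sketch under stated assumptions: the FASTA file name is hypothetical, and the option referred to above as alt_est appears in the control file, as far as I know, under the key altest=. Only one of the two variants would be used in a given run.

```
#-----Variant 1: transcripts from a very close strain, aligned with blastn (fast)
est=other_strain_transcripts.fasta     #hypothetical file name
altest=                                #left blank

#-----Variant 2: transcripts from a more divergent source, aligned with tblastx (slow)
est=                                   #left blank
altest=other_strain_transcripts.fasta  #hypothetical file name
```

Running each variant on the same few contigs and comparing the evidence support and UTRs in the output is one way to carry out the side-by-side check suggested above.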
From Xia.Cao at dupont.com Wed Oct 2 12:51:26 2013
From: Xia.Cao at dupont.com (Xia.Cao at dupont.com)
Date: Wed, 2 Oct 2013 17:51:26 +0000
Subject: [maker-devel] maker2 scripts for functional annotation
In-Reply-To:
References:
Message-ID:

Hi Barry,

Thank you very much for the message and suggestions. It's really helpful to us. We will do some tests to see how we can take advantage of Maker2 to produce high-quality gene models from our assembled genome.
Thank you again, Xia From: Barry Moore [mailto:barry.moore at genetics.utah.edu] Sent: Wednesday, October 02, 2013 1:21 PM To: CAO, XIA Cc: carsonhh at gmail.com; maker-devel at yandell-lab.org; RIVERA, CORBAN GREGORY Subject: Re: [maker-devel] maker2 scripts for functional annotation Hi Xia, You can use EST evidence from other strains/species. The considerations are as follows: When you pass EST (or mRNASeq based transcripts) evidence MAKER uses blastn to align those with alignment parameters that assume they sequences will align easily with almost perfect identity - this is fast. If you use EST/mRNA evidence from another species MAKER uses tblastx to compare the RNA evidence to the assembly, both translated into protein space - this works fine, but it is slow and often doesn't get you much more evidence than you could have gotten from proteins as evidence. If you believe that you're other strain should have very little divergence in sequence at the nt level then you could experiment with passing EST/mRNA evidence as est in the control file. Consider this an experiment, do it on a few contigs and examine the results carefully and skeptically. If it works well, that would be the way to go because you've get better UTRs, but if it doesn't you'll see that you get little support for models because sequence divergence was too much for good alignment. In that case you'll need to use alt_est and this will be slower and you'll likely get little in the way of support for UTRs - in this case as well, do a few contigs and examine the output carefully to see if the additional alignment time with tblastx is worth it in your case. Barry On Oct 2, 2013, at 8:40 AM, > wrote: Hi Carson, Thank you for your quick and kind reply. One more question here. If we don't have EST evidence for the strain we sequenced/assembled, will it be ok to provide some EST evidence that is from some other strains which are close to the one we are trying to annotate? 
Could you please let us know how this will affect performance of maker2? Thank you again. Best, Xia -----Original Message----- From: Carson Holt [mailto:carsonhh at gmail.com] Sent: Wednesday, October 02, 2013 10:15 AM To: CAO, XIA Cc: myandell at genetics.utah.edu; RIVERA, CORBAN GREGORY; maker-devel at yandell-lab.org Subject: Re: [maker-devel] maker2 scripts for functional annotation It still works, but it will be reduced. Without ESTs you won't get any UTR prediction, also you will be limited by how well the protein set matches to the genes. If you use the protein set of a close relative you will capture most things. You will probably capture 80-95% of what you would by also including ESTs. It all depending on how different the species in the proteins evidence file are compared to the the species being annotated. --Carson On 10/1/13 3:00 PM, "Xia.Cao at dupont.com" > wrote: Hi Carson, Thank you for your message and your kind help. Now conversion from match/match_part to gene/mRNA/exons/CDS works well after blanking out some options in control file. I have one more question about maker2. In case we don't have EST evidence (set of ESTs or assembled mRNA-seq in fasta format) for a genome, does maker2 function? If it does function, could you please let me know the performance of maker2 without providing EST evidence compared to the one with EST evidence? Thank you again and I look forward to hearing from you. Best, Xia -----Original Message----- From: Carson Holt [mailto:carsonhh at gmail.com] Sent: Wednesday, September 25, 2013 10:36 AM To: CAO, XIA; myandell at genetics.utah.edu; RIVERA, CORBAN GREGORY; maker-devel at yandell-lab.org Subject: Re: [maker-devel] maker2 scripts for functional annotation If it is launching predictors then you have snap hmm or augustus_species set. You ned to blank out all other options in the control files (including repeat masking options, proteins, ESTs, etc.) 
when trying to convert mathc/match_part to gene/mRNA/exons/CDS, or else those other programs will run. --Carson On 9/25/13 10:31 AM, "Xia.Cao at dupont.com" > wrote: Hi Carson, Thank you for the message and your kind help. We tested maker2 by setting keep_preds=1, pred_gff=generated_gff_file_from_first_makerRun . But it seemed maker2 started to launch all predictors again and it took long time to finish. I wonder if there is any way that we can directly get gene/mRNA/exons/CDS gff file without re-running maker2 to convert match/match_part features into gene/mRNA/exons/CDS. Thanks, Xia -----Original Message----- From: Carson Holt [mailto:carsonhh at gmail.com] Sent: Thursday, September 19, 2013 5:58 PM To: Mark Yandell; CAO, XIA; RIVERA, CORBAN GREGORY; maker-devel at yandell-lab.org Subject: Re: [maker-devel] maker2 scripts for functional annotation Hello Corban & Xia, Some scripts like gff3_preds2models are deprecated. To get the same result as was offered by gff3_preds2models, just give your match/match_part features to pref_gff= in the maker_opts.ctl file, set keep_preds=1, and run with all other options and predictors turned off. The final MAKER result will be your match/match_part features converted into gene/mRNA/exons/CDS. For functional annotation, you can use Interproscan, BLASTP against UniProt, or BALST2GO. My preference is to use InterProScan to add GO terms and proteins domains via the ipr_update_gff and iprscan2gff3 scripts. Then add putative gene functions via BLASTP to UniProt and maker_functional_fasta and maker_functional_gff scripts. Go ahead and take a look and that those tools and let me know if you have any questions or need help you configuring them. Thanks, Carson On 9/19/13 11:53 AM, "Mark Yandell" > wrote: Hi Corban & Xia, I've forwarded your question along to the MAKER_dev list, were you can get speedy answers to your maker related questions. Thanks for using MAKER. --mark Mark Yandell Professor of Human Genetics H.A. 
& Edna Benning Presidential Endowed Chair
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
ph:801-587-7707
________________________________________
From: Xia.Cao at dupont.com [Xia.Cao at dupont.com]
Sent: Thursday, September 19, 2013 11:49 AM
To: Mark Yandell; Corban-Gregory.Rivera at dupont.com
Subject: maker2 scripts for functional annotation

Dr. Yandell,

We were recently evaluating maker2 for annotation and going through the maker tutorial from 2012.

http://gmod.org/wiki/MAKER_Tutorial_2012

The tutorial makes references to some scripts that we couldn't find in the current release. We were looking for scripts like gff3_preds2models to convert match/match_part format into annotations with gene/mRNA/exons/CDS and others. I was wondering if maybe we did not have the most up to date version.

In addition to getting accurate gene annotations, I was looking for a solution to get functional assignments. I see that there are some scripts like maker_functional_fasta that may help, but I was wondering what you would recommend.

Thanks,
Corban & Xia

This communication is for use by the intended recipient and contains information that may be Privileged, confidential or copyrighted under applicable law. If you are not the intended recipient, you are hereby formally notified that any use, copying or distribution of this e-mail, in whole or in part, is strictly prohibited. Please notify the sender by return e-mail and delete this e-mail from your system. Unless explicitly and conspicuously designated as "E-Contract Intended", this e-mail does not constitute a contract offer, a contract amendment, or an acceptance of a contract offer. This e-mail does not constitute a consent to the use of sender's contact information for direct marketing purposes or for transfers of data to third parties.
The dupont.com web address will continue in use for a transitional period for communications sent or received on behalf of DuPont Performance Coatings., which is not affiliated in any way with the DuPont Company.

Francais Deutsch Italiano Espanol Portugues Japanese Chinese Korean
http://www.DuPont.com/corp/email_disclaimer.html

_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543

From cjfields at illinois.edu Thu Oct 3 13:44:07 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 3 Oct 2013 18:44:07 +0000
Subject: [maker-devel] NFSLock problem
Message-ID:

I have a MAKER job running that seems to be stalled on a failed scaffold. It's running via MPI (MAKER v2.28, openMPI 1.6.3), that appears to have worked successfully for the most part.
This is a run that only uses transcriptome and protein information in order to get a decent dataset of

The failed scaffold seems to be holding the job from completion. There do seem to be changes, but mainly they are on the NFSLock files:

./.NFSLock.gi_lock.NFSLock
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.281.282.junction.blastn.holdover.NFSLock
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.282.start.blastn.holdover.NFSLock
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.281.end.blastn.holdover.NFSLock
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.282.start.blastn.holdover.NFSLock.STACK
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.281.end.blastn.holdover.NFSLock.STACK

Everything else seems to have completed.

I have seen a few issues re: NFS locking problems; would this be related? Should I stop the job?

We're running GPFS for our NFS.
Here's 'mount':

-system-specific-4.1$ mount
/dev/sda5 on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda1 on /boot type ext4 (rw)
/dev/sdb1 on /export type ext4 (rw)
/dev/sda2 on /var type ext4 (rw)
tmpfs on /var/lib/ganglia/rrds type tmpfs (rw,size=6180842000,gid=99,uid=99)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)
/dev/IGBHOME0 on /home type gpfs (rw,mtime,dev=IGBHOME0)
128.174.124.79:/shares/group on /archive/group type nfs (rw,sync,hard,intr,retrans=10,timeo=300,rsize=65536,wsize=1048576,vers=3,proto=tcp,mountproto=tcp,addr=128.174.124.79)
128.174.124.79:/shares/CBC on /archive/CBC type nfs (rw,sync,hard,intr,retrans=10,timeo=300,rsize=65536,wsize=1048576,vers=3,proto=tcp,mountproto=tcp,addr=128.174.124.79)

chris

From carsonhh at gmail.com Thu Oct 3 13:45:37 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 03 Oct 2013 14:45:37 -0400
Subject: [maker-devel] NFSLock problem
In-Reply-To:
Message-ID:

Stop the job and restart it. No need to delete anything. Just restart it.

Thanks,
Carson

From cjfields at illinois.edu Thu Oct 3 21:45:29 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Fri, 4 Oct 2013 02:45:29 +0000
Subject: [maker-devel] NFSLock problem
In-Reply-To:
References:
Message-ID:

Carson,

Took a couple of restarts but finally processed through. I saw a prior email on the mail list where you mentioned this could be due to failed jobs hanging things up; I may set up my next job to allow one processor for logging into the nodes in case there is a similar problem, maybe try to track this down.

chris

From felix.bemm at uni-wuerzburg.de Fri Oct 4 02:10:08 2013
From: felix.bemm at uni-wuerzburg.de (Felix Bemm)
Date: Fri, 04 Oct 2013 09:10:08 +0200
Subject: [maker-devel] Contigs failing - could not make datastore directory
Message-ID: <524E69D0.1090905@uni-wuerzburg.de>

Hi,

I have a problem with Contigs failing like this:

examining contents of the fasta file and run log
ERROR: could not make datastore directory
--> rank=11, hostname=r5n04
ERROR: Failed while examining contents of the fasta file and run log
ERROR: Chunk failed at level:0, tier_type:0
FAILED CONTIG:MA_gen_v8.05_c31675

There is enough space available on both the storage device and the tmp device that is being used.
The datastore log file looks like this:

MA_gen_v8.05_c1149 FAILED
MA_gen_v8.05_c1153 FAILED
MA_gen_v8.05_c1154 FAILED
MA_gen_v8.05_c1155 FAILED
MA_gen_v8.05_c1156 FAILED

Since there is no datastore for these contigs, I cannot provide any more information. Approx. 1/3 of the contigs were annotated; after that maker started to fail. It kept running for a while, finishing some of the contigs on the second or third try like this:

MA_gen_v8.05_c4443 FAILED
MA_gen_v8.05_c4443 FAILED
MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ STARTED
MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ FINISHED

Any ideas what can cause these problems?

Best regards
Felix

--
Felix Bemm
Department of Bioinformatics
University of Würzburg, Germany
Tel: +49 931 - 31 83696
Fax: +49 931 - 31 84552
felix.bemm at uni-wuerzburg.de

From carsonhh at gmail.com Fri Oct 4 13:15:35 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 04 Oct 2013 14:15:35 -0400
Subject: [maker-devel] Contigs failing - could not make datastore directory
In-Reply-To: <524E69D0.1090905@uni-wuerzburg.de>
Message-ID:

Let me know what you find. With some failures it's hard to identify the problem, especially with long running processes, which is why I made MAKER restartable, so you can at least get completion by just launching again.

--Carson

From carsonhh at gmail.com Fri Oct 4 13:21:42 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 04 Oct 2013 14:21:42 -0400
Subject: [maker-devel] Contigs failing - could not make datastore directory
In-Reply-To: <524E69D0.1090905@uni-wuerzburg.de>
Message-ID:

Hi Felix,

There are other potential issues as well, for example a file quota limit set by your administrator, or a full local /tmp, which is unique to each node and cannot be queried directly from the root node on a cluster. On a cluster one node may also not have the NFS location mounted. Or it could be system related; for example, on ibrix NFS mounted systems you can get a weird filesystem issue when using MAKER, with ghost files and directories that can neither be created nor deleted. Try running the failed contigs in a different directory (even if it's on the same disk). Also what version of MAKER are you using?
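
The per-node conditions listed above (a full local /tmp, a location that is not mounted, a directory that cannot be written) can be probed with a small script run on each compute node. This is only a minimal sketch, assuming a POSIX shell with the standard df, mkdir, rmdir and tail utilities; the function name check_dir and the example paths are made up for illustration and are not part of MAKER.

```shell
#!/bin/sh
# Hypothetical helper, not part of MAKER: check that a directory exists,
# that a subdirectory can be created in it, and show the remaining space.
check_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "MISSING $dir"
        return 1
    fi
    # Creating a directory is the operation that failed in the error above
    # ("could not make datastore directory"), so test exactly that.
    probe="$dir/.probe.$$"
    if mkdir "$probe" 2>/dev/null; then
        rmdir "$probe"
        echo "WRITABLE $dir"
    else
        echo "NOT_WRITABLE $dir"
        return 1
    fi
    df -P "$dir" | tail -n 1               # free blocks on that filesystem
    df -i "$dir" 2>/dev/null | tail -n 1   # free inodes, where df supports -i
    return 0
}

# Check every directory named on the command line, e.g. the local tmp
# directory and the MAKER output directory.
for d in "$@"; do
    check_dir "$d"
done
```

Saved under a name of your choosing (say check_dirs.sh), it could be run once per node as `sh check_dirs.sh /tmp /path/to/maker/output`. A node that prints MISSING or NOT_WRITABLE, or that shows no free blocks or inodes, would be the first place to look.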
--Carson

From felix.bemm at uni-wuerzburg.de Sat Oct 5 09:09:15 2013
From: felix.bemm at uni-wuerzburg.de (Felix Bemm)
Date: Sat, 05 Oct 2013 16:09:15 +0200
Subject: [maker-devel] Contigs failing - could not make datastore directory
In-Reply-To:
References:
Message-ID: <52501D8B.2060301@uni-wuerzburg.de>

Hi Carson,

there is definitely no quota, nor a full local tmp device. I changed to another tmp device, which unfortunately resides on NFS, but now it seems to work. I am using the latest version of maker (2.28). The complete job is running on a single machine with 32 MPI threads.

Bests
Felix

--
Felix Bemm
Department of Bioinformatics
University of Würzburg, Germany
Tel: +49 931 - 31 83696
Fax: +49 931 - 31 84552
felix.bemm at uni-wuerzburg.de

From jfierst at uoregon.edu Fri Oct 4 12:06:36 2013
From: jfierst at uoregon.edu (Janna Fierst)
Date: Fri, 4 Oct 2013 10:06:36 -0700
Subject: [maker-devel] Fwd: exon/intron boundaries
In-Reply-To:
References:
Message-ID:

Hi, thanks for your reply-

I have been going through our annotations in detail and trying different parameter sets, and I think I have identified what is going on but I'm not sure how to set the MAKER2 parameters for our situation. We are working with a species of Caenorhabditid worm and there are long gene-dense blocks that are being incorrectly annotated as large single genes instead of several smaller closely spaced genes. The protein alignments (tblastx and protein2genome) show very clearly where the exon/intron boundaries are and in most cases agree with the augustus predictions. The assembled cufflinks output (through blastn, est2genome and est_gff:cufflinks) does not agree in some locations; I think this may be because in some cases the UTR nearly overlaps adjacent genes.
I have included a screenshot of an annotated region viewed in apollo to try to show this. The large gene in the middle is actually 7 different genes that are extremely close together and MAKER2 is collapsing them into a single gene. I tried running without any RNASeq/cufflinks data and MAKER2 annotates the region as two genes instead of one, but I can't get it to recognize the 7 as different genes.

I have retrained SNAP but we have not been able to successfully train Augustus; we are currently using the default caenorhabditis species model. I also included a species specific repeat library. I tried setting correct_est_fusion=1 and reducing pred_flank but these changes appear to really alter the annotations and we end up annotating almost nothing. I also tried setting est2genome=0 to decrease the influence of the cufflinks assembly but it didn't appear to help. There are some very large introns in these genes so I haven't yet tried decreasing the maximum intron size because I'm concerned this may generate too many split genes instead of our current merged gene problem.

Thanks for your help, any advice is greatly appreciated!
-Janna Fierst

On Mon, Aug 26, 2013 at 12:21 PM, Carson Holt wrote:

> Are you getting gene fusions or just more exons? Gene fusions can be
> reduced by setting correct_est_fusion=1, or reducing pred_flank, although
> reducing pred_flank can cause other issues (but those generally only appear
> if setting the value below 150). Also if you have the maximum intron
> size set too high (split_hit option), you may also be generating bridging
> alignments that make evidence align across distant paralogous genes as well
> (this can result in gene merging).
>
> You should also look at your results manually in a viewer like Apollo.
> Then see if the extra exons are supported by something such as protein
> alignments from another species.
> If this is the case, you may have a
> poorly annotated protein set that is being used as evidence that is
> carrying over its erroneous exons into the species you are annotating. If
> the extra exons are supported by EST evidence, then perhaps you should try
> and rebuild the EST assembly (for example Trinity has an option to use a
> Jaccard similarity coefficient to avoid fusing transcripts).
>
> Another option is to retrain SNAP or Augustus. MAKER does not actually
> produce any of the models itself (it is a pipeline not a predictor). The
> models are all generated using these other algorithms; MAKER just feeds
> them hints based on protein and transcript alignments, so making sure
> training is sufficient is important for those programs to produce their
> best models.
>
> Finally make sure your repeat database is sufficient; you may need to
> generate a species specific repeat library using something like
> RepeatModeler. Repeats can end up being included as extra exons in gene
> models because they may contain reading frames that do code for proteins
> (e.g. reverse transcriptases).
>
> If you have any questions on any of the above, just let us know.
>
> Thanks,
> Carson
>
>
> From: Janna Fierst
> Date: Monday, August 26, 2013 2:54 PM
> To:
> Subject: [maker-devel] exon/intron boundaries
>
> Hi,
>
> I am using MAKER 2.28 to annotate a Caenorhabditid worm genome, and the
> initial results appear fairly good but we seem to be annotating too many
> exons for multiple genes. I was wondering which parameters should be tuned
> to change the threshold for exon/intron boundaries? Thanks for your help
> -Janna Fierst
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 1.tiff Type: image/tiff Size: 115362 bytes Desc: not available URL: From jfierst at uoregon.edu Sat Oct 5 17:18:52 2013 From: jfierst at uoregon.edu (Janna Fierst) Date: Sat, 5 Oct 2013 15:18:52 -0700 Subject: [maker-devel] Fwd: exon/intron boundaries In-Reply-To: References: Message-ID: Hi, thanks for your reply- I have been going through our annotations in detail and trying different parameter sets, and I think I have identified what is going on but I'm not sure how to set the MAKER2 parameters for our situation. We are working with a species of Caenorhabditid worm and there are long gene-dense blocks that are being incorrectly annotated as large single genes instead of several smaller closely spaced genes. The protein alignments (tblastx and protein2genome) show very clearly where the exon/intron boundaries are and in most cases agree with the augustus predictions. The assembled cufflinks output (through blastn, est2genome and est_gff:cufflinks) does not agree in some locations; I think this may be because in some cases the UTR nearly overlaps adjacent genes. I have included a screenshot of an annotated region viewed in apollo to try to show this. The large gene in the middle is actually 7 different genes that are extremely close together and MAKER2 is collapsing them into a single gene. I tried running without any RNASeq/cufflinks data and MAKER2 annotates the region as two genes instead of one, but I can't get it to recognize the 7 as different genes. I have retrained SNAP but we have not been able to successfully train Augustus, we are currently using the default caenhorhabditis species model. I also included a species specific repeat library. I tried setting correct_est_fusion=1 and reducing pred_flank but these changes appear to really alter the annotations and we end up annotating almost nothing. I also tried setting est2genome=0 to decrease the influence of the cufflinks assembly but it didn't appear to help. 
There are some very large introns in these genes so I haven't tried yet decreasing the maximum intron size because I'm concerned this may generate too many split genes instead of our current merged gene problem. Thanks for your help, any advice is greatly appreciated! -Janna Fierst On Mon, Aug 26, 2013 at 12:21 PM, Carson Holt wrote: > Are you getting gene fusions or just more exons? Gene fusions can be > reduced by setting correct_est_fusion=1, or reducing pred_flank, although > reducing pred_flank can cause other issues (but those generally only appear > if setting the value below below 150). Also if you have the maximum intron > size set to high (split_hit option), you may also be generating bridging > alignments that make evidence align across distant paralogous genes as well > (this can result in gene merging) > > You should also look at your results manually in a viewer like Apollo. > Then see if the extra exons are supported by something such as protein > alignments from another species. If this is the case, you may have a > poorly annotated protein set that is being used as evidence that is > carrying over it's erroneous exons into the species you are annotating. If > the extra exons are supported by EST evidence, then perhaps you should try > and rebuild the EST assembly (for example trinity has an option to use a > Jarccardian similarity coefficient to avoid fusing transcripts). > > Another option, is to retrain SNAP or Augustus. MAKER does not actually > produce any of the models itself (it is a pipeline not a predictor). The > models are all generated using these other algorithms, MAKER just feeds > them hints based on protein and transcript alignments, so making sure > training is sufficient is important for those programs to produce their > best models. > > Finally make sure your repeat database is sufficient, you may need to > generate a species specific repeat library using something like > RepeatModeler. 
> Repeats can end up being included as extra exons in gene
> models because they may contain reading frames that do code for proteins
> (e.g. reverse transcriptases).
>
> If you have any questions on any of the above, just let us know.
>
> Thanks,
> Carson
>
>
> From: Janna Fierst
> Date: Monday, August 26, 2013 2:54 PM
> To:
> Subject: [maker-devel] exon/intron boundaries
>
> Hi,
>
> I am using MAKER 2.28 to annotate a Caenorhabditid worm genome, and the
> initial results appear fairly good, but we seem to be annotating too many
> exons for multiple genes. I was wondering which parameters should be tuned
> to change the threshold for exon/intron boundaries? Thanks for your help
> -Janna Fierst
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 1.tiff
Type: image/tiff
Size: 115362 bytes
Desc: not available
URL: 

From myandell at genetics.utah.edu Mon Oct 7 06:36:53 2013
From: myandell at genetics.utah.edu (Mark Yandell)
Date: Mon, 7 Oct 2013 11:36:53 +0000
Subject: [maker-devel] maker apollo error
In-Reply-To: <8687323557146056ef885dd8a1bf2ea4@buzonweb.us.es>
References: <8687323557146056ef885dd8a1bf2ea4@buzonweb.us.es>
Message-ID: <7A60AB257EFF2B48B1F4C814817EA05365E66B3C@mxb2.hg.genetics.utah.edu>

Hi Gabriel,

I've forwarded your request on to the MAKER email list...

cheers,
--mark

Mark Yandell
Professor of Human Genetics
H.A. & Edna Benning Presidential Endowed Chair
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
ph:801-587-7707

________________________________________
From: ggpozo at us.es [ggpozo at us.es]
Sent: Monday, October 07, 2013 2:49 AM
To: Mark Yandell
Subject: maker apollo error

Dear Dr.
Yandell,

First, thank you very much for providing us with such a great tool as MAKER. I am trying to install it on my Linux system and everything is OK; however, when I try to install Apollo with ./Build apollo I get:

Downloading apollo...
--2013-10-07 10:45:37-- http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar
Resolving gmod.svn.sourceforge.net... 216.34.181.65, 216.34.181.177
Connecting to gmod.svn.sourceforge.net|216.34.181.65|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://sourceforge.net/p/gmod/svn/ [following]
--2013-10-07 10:45:38-- http://sourceforge.net/p/gmod/svn/
Resolving sourceforge.net... 216.34.181.60
Connecting to sourceforge.net|216.34.181.60|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://sourceforge.net/p/gmod/svn/HEAD/tree/ [following]
--2013-10-07 10:45:38-- http://sourceforge.net/p/gmod/svn/HEAD/tree/
Connecting to sourceforge.net|216.34.181.60|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 37730 (37K) [text/html]
Saving to: `/home/maker/src/../exe/apollo.tar.gz'

100%[==================================================================================================================================>] 37.730 92,1K/s in 0,4s

2013-10-07 10:45:40 (92,1 KB/s) - `/home/maker/src/../exe/apollo.tar.gz' saved [37730/37730]

Unpacking apollo tarball...

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now

ERROR: Failed installing apollo, now cleaning installation path...
You may need to install apollo manually.

It seems that the HTTP address in the locations file is wrong. Apollo has changed its version and is now integrated in Apolloweb; maybe this is the problem.

I would greatly appreciate any help you can give me.

Thank you very much.

Dr.
Gabriel Gutierrez
Department of Genetics
University of Seville
Spain

From carsonhh at gmail.com Mon Oct 7 08:20:17 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 07 Oct 2013 09:20:17 -0400
Subject: [maker-devel] Fwd: exon/intron boundaries
In-Reply-To: 
Message-ID: 

Hi Janna,

There are a couple of things to do. Download maker-2.29p-beta from the lab website. It includes some changes made to improve the performance of correct_est_fusion. Using the correct_est_fusion option also reduces the influence of ESTs on annotation, so normally you would need to increase your protein dataset, but given that you have already supplied 3 nematode species and all of UniProt you should probably be fine.

There is one more thing you can do. Instead of using cufflinks, try using trinity to assemble the ESTs. It has a Jaccard clip option that reduces merging caused by overlapping UTRs. Between trinity and correct_est_fusion you should be able to really reduce the effect of those ESTs.

If those changes don't work, there is one last option. If you take the MAKER results and filter them for the SNAP and Augustus ab initio results (match/match_part in the GFF3), then you can pass those in to the pred_gff option. Then turn snaphmm and augustus_species off in the control files. Basically, this turns MAKER's hint-based prediction off and forces it to filter the ab initio results and select models directly from there. Since the merging is being caused by bad hints (merged transcripts from mRNAseq), this would reduce that effect. You will still need correct_est_fusion=1, though, to trim UTR coming from the merged transcripts, because even though MAKER can't process hints to rerun SNAP and Augustus this way, it will still try to add UTR using the EST evidence.
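The filtering step in that last option could be sketched roughly as below, in Python. This is only an illustration under stated assumptions: it assumes the ab initio results in the MAKER GFF3 carry a source label (column 2) that begins with "snap" or "augustus" (for example snap_masked or augustus_masked), so check column 2 of your own file before relying on it. The function name filter_ab_initio and the file names are hypothetical.

```python
"""Sketch: keep only ab initio match/match_part features from a MAKER GFF3."""
import sys

# ASSUMPTION: source-column prefixes used for SNAP and Augustus ab initio
# results. Inspect column 2 of your own GFF3 and adjust if they differ.
AB_INITIO_SOURCES = ("snap", "augustus")
KEEP_TYPES = {"match", "match_part"}


def filter_ab_initio(lines, sources=AB_INITIO_SOURCES):
    """Yield GFF3 lines whose source starts with one of `sources` and whose
    feature type is match or match_part. Reading stops at ##FASTA."""
    prefixes = tuple(sources)
    for line in lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; nothing more to filter
        if line.startswith("#"):
            # Keep only the version pragma; drop other comments and pragmas.
            if line.startswith("##gff-version"):
                yield line
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature line
        source, feature_type = cols[1], cols[2]
        if feature_type in KEEP_TYPES and source.lower().startswith(prefixes):
            yield line


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python filter_ab_initio.py maker_all.gff > ab_initio.gff
    with open(sys.argv[1]) as handle:
        sys.stdout.writelines(filter_ab_initio(handle))
```

The filtered file would then be given to pred_gff in the control file, with snaphmm and augustus_species left blank and correct_est_fusion=1 kept on, as described above.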
--Carson

From carsonhh at gmail.com Mon Oct 7 09:12:24 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 07 Oct 2013 10:12:24 -0400
Subject: [maker-devel] maker apollo error
In-Reply-To: <7A60AB257EFF2B48B1F4C814817EA05365E66B3C@mxb2.hg.genetics.utah.edu>
Message-ID: 

This was a location from GMOD for the older Apollo. It looks like they've dropped it, so it now points to nothing. Perhaps there is another location that still has the old code that I can get from Ed.

--Carson

From carsonhh at gmail.com Mon Oct 7 09:33:44 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 07 Oct 2013 10:33:44 -0400
Subject: [maker-devel] Contigs failing - could not make datastore directory
In-Reply-To: <52501D8B.2060301@uni-wuerzburg.de>
Message-ID: 

You can also try the 2.29 beta if you want (on the website). Also, when MAKER exits, it tries to delete anything left behind in temporary folders, so is it possible that it is freeing up space on exit after filling up /tmp? Given that the issue seems to go away when you switch to another directory, this is probably specific to the current state of your system.

Thanks,
Carson

On 10/5/13 10:09 AM, "Felix Bemm" wrote:

>Hi Carson,
>
>there is definitely no quota, nor is the local tmp device full. I changed
>to another tmp device, which unfortunately resides on NFS, but now it
>seems to work. I am using the latest version of maker (2.28). The
>complete job is running on a single machine with 32 MPI threads.
>
>Bests
>Felix
>
>On 04.10.2013 20:21, Carson Holt wrote:
>> Hi Felix,
>>
>> There are other potential issues as well, for example a file quota limit
>> set by your administrator, or a full local /tmp, which is unique to each
>> node and cannot be queried directly from the root node on a cluster. On a
>> cluster, one node may also not have the NFS location mounted. Or it could
>> be system related; for example, on ibrix NFS mounted systems you can get a
>> weird filesystem issue with ghost files and directories when using MAKER
>> that can neither be created nor deleted. Try running the failed contigs
>> in a different directory (even if it's on the same disk).
>> Also, what
>> version of MAKER are you using?
>>
>> --Carson
>>
>>
>> On 10/4/13 3:10 AM, "Felix Bemm" wrote:
>>
>>> Hi,
>>>
>>> I have a problem with contigs failing like this:
>>>
>>> examining contents of the fasta file and run log
>>> ERROR: could not make datastore directory
>>> --> rank=11, hostname=r5n04
>>> ERROR: Failed while examining contents of the fasta file and run log
>>> ERROR: Chunk failed at level:0, tier_type:0
>>> FAILED CONTIG:MA_gen_v8.05_c31675
>>>
>>> There is enough space available on both the storage device and the tmp
>>> device that is being used. The datastore log file looks like this:
>>>
>>> MA_gen_v8.05_c1149 FAILED
>>> MA_gen_v8.05_c1153 FAILED
>>> MA_gen_v8.05_c1154 FAILED
>>> MA_gen_v8.05_c1155 FAILED
>>> MA_gen_v8.05_c1156 FAILED
>>>
>>> Since there is no datastore for these contigs, I cannot provide any more
>>> information. Approx. 1/3 of the contigs were annotated; after that
>>> maker started to fail. It kept running for a while, finishing some of
>>> the contigs on the second or third try like this:
>>>
>>> MA_gen_v8.05_c4443 FAILED
>>> MA_gen_v8.05_c4443 FAILED
>>> MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ STARTED
>>> MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ FINISHED
>>>
>>> Any ideas what can cause this problem?
>>>
>>> Best regards
>>> Felix
>>>
>>> --
>>> Felix Bemm
>>> Department of Bioinformatics
>>> University of Würzburg, Germany
>>> Tel: +49 931 - 31 83696
>>> Fax: +49 931 - 31 84552
>>> felix.bemm at uni-wuerzburg.de
>>>
>>> _______________________________________________
>>> maker-devel mailing list
>>> maker-devel at box290.bluehost.com
>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From felix.bemm at uni-wuerzburg.de Mon Oct 7 11:40:12 2013
From: felix.bemm at uni-wuerzburg.de (Felix Bemm)
Date: Mon, 07 Oct 2013 18:40:12 +0200
Subject: [maker-devel] Contigs failing - could not make datastore directory
In-Reply-To: 
References: 
Message-ID: <5252E3EC.6050908@uni-wuerzburg.de>

Do you have a changelog for 2.29b?

On 07.10.2013 16:33, Carson Holt wrote:
> You can also try the 2.29 beta if you want (on the website). Also when
> MAKER exits, it tries to delete anything left behind in temporary folders,
> so is it possible that it is creating space on exit after filling up /tmp?

I will try to monitor that!

> Given that the issue seems to go away when you switch to another
> directory, it suggests that this might be the kind of thing that is
> specific to something about the current state of your system.

I agree with that. Interestingly, it happens with the only directory where I would not expect it.

--
Felix Bemm
Department of Bioinformatics
University of Würzburg, Germany
Tel: +49 931 - 31 83696
Fax: +49 931 - 31 84552
felix.bemm at uni-wuerzburg.de

From carsonhh at gmail.com Mon Oct 7 12:25:34 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 07 Oct 2013 13:25:34 -0400
Subject: [maker-devel] maker apollo error
In-Reply-To: <14a642f3dc9cb615f048b70228bad3be@buzonweb.us.es>
Message-ID: 

Here is a new location for the source -->
http://sourceforge.net/code-snapshots/svn/g/gm/gmod/svn/gmod-svn-25291-apollo-trunk.zip

--Carson

From ggpozo at us.es Mon Oct 7 11:47:13 2013
From: ggpozo at us.es (ggpozo at us.es)
Date: Mon, 07 Oct 2013 18:47:13 +0200
Subject: [maker-devel] maker apollo error
In-Reply-To: 
References: 
Message-ID: <14a642f3dc9cb615f048b70228bad3be@buzonweb.us.es>

Thanks, I have modified the locations file to point to a personal server where I downloaded an old version of Apollo, but it did not work. If you can get a new one, I will try again.

Thanks
Gabriel

From felix.bemm at uni-wuerzburg.de Sat Oct 12 10:06:52 2013
From: felix.bemm at uni-wuerzburg.de (Felix Bemm)
Date: Sat, 12 Oct 2013 17:06:52 +0200
Subject: [maker-devel] Serious Start Codon Problem
Message-ID: <5259658C.4000102@uni-wuerzburg.de>

Hi,

I have a serious problem with maker annotations, especially the start codon placement. It started happening with maker version 2.27! About 1/3 of my genes are missing a proper start codon. When reviewing the GFF files, the gene predictors and the EST evidence on their own would predict the right gene structure, including the start codon. I was thinking that this problem was somehow linked to my gene models, but after looking at their standalone predictions I discarded that theory.
Just some stats about proteins with proper start codon (first column) and missing start codon (second column). Maker 9268 4215 SNAP 16577 896 Genemark 18764 290 Numbers look the same when Augustus is used (currently retraining). Manually using EST's in Apollo often fixes the problems but I don't want to check > 4000 genes for this. Do you have any idea what could cause such a problem? If you need some example gff files I can send you. Best regards Felix -- Felix Bemm Department of Bioinformatics University of W?rzburg, Germany Tel: +49 931 - 31 83696 Fax: +49 931 - 31 84552 felix.bemm at uni-wuerzburg.de From carsonhh at gmail.com Sat Oct 12 10:48:29 2013 From: carsonhh at gmail.com (Carson Holt) Date: Sat, 12 Oct 2013 11:48:29 -0400 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: <5259658C.4000102@uni-wuerzburg.de> Message-ID: This issue just came up this week on another e-mail as well. One way to fix it, is to edit the Bio::Tools::CodonTable module from BioPerl (use maker --debug to have maker print out the locations of all modules it uses). Such a hack should only be done as a temporary work around. You change line 256 from --> ---M---------------M---------------M---------------------------- To--> -----------------------------------M---------------------------- Maker uses the BioPerl is_start_codon function to determine if the codon used in a result it receives is acceptable or if it should look upstream or downstream for a different one. Apparently there are actually 3 acceptable start codons in the standard codon table (2 rare ones and one common), so making the edit removes the two rare ones from the list. I'm also preparing a 2.30 release that exports a "strict" codon table to the BioPerl module, that will be used by default and should fix the issue. Thanks, Carson On 10/12/13 11:06 AM, "Felix Bemm" wrote: >Hi, > >I have a serious problems with maker annotations especially the start >codon placement. 
From felix.bemm at uni-wuerzburg.de Sat Oct 12 11:16:06 2013
From: felix.bemm at uni-wuerzburg.de (Felix Bemm)
Date: Sat, 12 Oct 2013 18:16:06 +0200
Subject: [maker-devel] Serious Start Codon Problem
In-Reply-To:
References:
Message-ID: <525975C6.4020403@uni-wuerzburg.de>

Thx Carson! Do you know when 2.30 will be available? Bests

On 12.10.2013 17:48, Carson Holt wrote:
> This issue just came up this week in another e-mail as well.
>
> One way to fix it is to edit the Bio::Tools::CodonTable module from
> BioPerl (use maker --debug to have maker print out the locations of all
> modules it uses). Such a hack should only be used as a temporary
> workaround.
-- 
Felix Bemm
Department of Bioinformatics
University of Würzburg, Germany
Tel: +49 931 - 31 83696
Fax: +49 931 - 31 84552
felix.bemm at uni-wuerzburg.de

From carsonhh at gmail.com Sun Oct 13 00:26:17 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 13 Oct 2013 01:26:17 -0400
Subject: [maker-devel] Serious Start Codon Problem
In-Reply-To: <525975C6.4020403@uni-wuerzburg.de>
Message-ID:

Before Wednesday, because after that I'm on a 6 day road trip, so I have
to have it wrapped up before I leave.

--Carson

On 10/12/13 12:16 PM, "Felix Bemm" wrote:

>Thx Carson! Do you know when 2.30 will be available? Bests
>
>On 12.10.2013 17:48, Carson Holt wrote:
>> This issue just came up this week in another e-mail as well.
>>
>> One way to fix it is to edit the Bio::Tools::CodonTable module from
>> BioPerl (use maker --debug to have maker print out the locations of all
>> modules it uses). Such a hack should only be used as a temporary
>> workaround.
>>
>> You change line 256 from -->
>> ---M---------------M---------------M----------------------------
>>
>> To -->
>> -----------------------------------M----------------------------
>>
>> MAKER uses the BioPerl is_start_codon function to determine whether the
>> codon used in a result it receives is acceptable or whether it should
>> look upstream or downstream for a different one. Apparently there are
>> actually 3 acceptable start codons in the standard codon table (2 rare
>> ones and one common), so making the edit removes the two rare ones from
>> the list.
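
As a rough illustration of the codon-table edit discussed in this thread (a sketch only, not MAKER's or BioPerl's actual code): the start-codon line can be read as a 64-character mask over all codons, assuming the usual NCBI ordering in which each base position cycles through T, C, A, G. Under that assumption, the original line flags TTG, CTG and ATG as starts, and the edited line flags only ATG. The function and variable names below are made up for this example.

```python
from itertools import product

# Codons in the order assumed here (NCBI-style genetic code tables):
# first base varies slowest, third base fastest, each cycling T, C, A, G.
BASES = "TCAG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]


def starts_to_mask(start_codons):
    """Build a 64-character mask with 'M' at each start codon, '-' elsewhere."""
    wanted = {c.upper() for c in start_codons}
    return "".join("M" if codon in wanted else "-" for codon in CODONS)


def mask_to_starts(mask):
    """List the codons flagged with 'M' in a 64-character mask."""
    if len(mask) != len(CODONS):
        raise ValueError("mask must have exactly 64 characters")
    return [codon for codon, flag in zip(CODONS, mask) if flag == "M"]


# Hypothetical names for the two variants discussed in the thread.
standard_mask = starts_to_mask(["TTG", "CTG", "ATG"])  # three start codons
strict_mask = starts_to_mask(["ATG"])                  # ATG only

if __name__ == "__main__":
    print(standard_mask, mask_to_starts(standard_mask))
    print(strict_mask, mask_to_starts(strict_mask))
```

Building the mask from a list of codons, rather than typing the dashes by hand, avoids off-by-one mistakes when a line like this is edited manually.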
From carsonhh at gmail.com Mon Oct 14 09:45:25 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 14 Oct 2013 10:45:25 -0400
Subject: [maker-devel] Serious Start Codon Problem
In-Reply-To: <525975C6.4020403@uni-wuerzburg.de>
Message-ID:

Probably Tuesday or Wednesday.

--Carson

On 10/12/13 12:16 PM, "Felix Bemm" wrote:

>Thx Carson! Do you know when 2.30 will be available? Bests
From cjfields at illinois.edu Mon Oct 14 10:07:43 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Mon, 14 Oct 2013 15:07:43 +0000
Subject: [maker-devel] Serious Start Codon Problem
In-Reply-To:
References:
Message-ID:

Carson,

Regarding the CodonTable change below, should this be something that
needs to be fixed, or made more flexible, in BioPerl? I could see this
being a problem if using MAKER for bacterial annotation (where the
alternative start codons are actually more common).

chris

On Oct 14, 2013, at 9:45 AM, Carson Holt wrote:

> Probably Tuesday or Wednesday.
>
> --Carson
From carsonhh at gmail.com Mon Oct 14 12:58:24 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 14 Oct 2013 13:58:24 -0400
Subject: [maker-devel] Serious Start Codon Problem
In-Reply-To:
Message-ID:

I think adding one more codon table to the set of available tables would
be sufficient. Call it the 'Strict' table or 'Canonical' table. It
doesn't need to be the default, but should be a selectable option just as
the other codon tables are. There is a way to export codon tables to the
module, which is what I've now added to MAKER, but I'm sure other BioPerl
users would find a strictly canonical codon table useful as well.

Thanks,
Carson

On 10/14/13 11:07 AM, "Fields, Christopher J" wrote:

>Carson,
>
>Regarding the CodonTable change below, should this be something that
>needs to be fixed, or made more flexible, in BioPerl? I could see this
>being a problem if using MAKER for bacterial annotation (where the
>alternative start codons are actually more common).
>
>chris
From cjfields at illinois.edu Mon Oct 14 13:51:28 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Mon, 14 Oct 2013 18:51:28 +0000
Subject: [maker-devel] Serious Start Codon Problem
In-Reply-To:
References:
Message-ID: <160BC401-0471-4285-8712-21440628A5DA@illinois.edu>

Done, with some added tests. I labeled it 'Strict' ('Canonical' can be a
bit of a loaded term). It's currently set as ID 24.

I will likely push out a new 1.6 point release this week or next; I'll
merge these in for that release.

chris

On Oct 14, 2013, at 12:58 PM, Carson Holt wrote:

> I think adding one more codon table to the set of available tables would
> be sufficient. Call it the 'Strict' table or 'Canonical' table. It
> doesn't need to be the default, but should be a selectable option just as
> the other codon tables are. There is a way to export codon tables to the
> module, which is what I've now added to MAKER, but I'm sure other BioPerl
> users would find a strictly canonical codon table useful as well.
>
> Thanks,
> Carson
From carsonhh at gmail.com Mon Oct 14 19:59:45 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 14 Oct 2013 20:59:45 -0400
Subject: [maker-devel] Serious Start Codon Problem
In-Reply-To: <160BC401-0471-4285-8712-21440628A5DA@illinois.edu>
Message-ID:

Thanks!!

--Carson

On 10/14/13 2:51 PM, "Fields, Christopher J" wrote:

>Done, with some added tests. I labeled it 'Strict' ('Canonical' can be a
>bit of a loaded term). It's currently set as ID 24.
>
>I will likely push out a new 1.6 point release this week or next; I'll
>merge these in for that release.
>
>chris
From jfierst at uoregon.edu Mon Oct 21 12:40:06 2013
From: jfierst at uoregon.edu (Janna Fierst)
Date: Mon, 21 Oct 2013 10:40:06 -0700
Subject: [maker-devel] Fwd: exon/intron boundaries
In-Reply-To:
References:
Message-ID:

Hi,

thanks for your help. I have used Trinity to assemble the ESTs but am
having a problem with the 2.29-beta release. Instead of starting and
finishing a set of scaffolds, it appears to restart over and over again.
I am running it in parallel across 32 cores and I'm wondering if this is
a problem with the parallel implementation. I've gone through it several
times and have not been able to find any errors or output that would
suggest what the problem is.

Thanks
-Janna

On Mon, Oct 7, 2013 at 6:20 AM, Carson Holt wrote:

> Hi Janna,
>
> There are a couple of things to do. Download the maker-2.29p-beta from
> the lab website. This includes some changes made to improve the
> performance of correct_est_fusion. Whenever using the correct_est_fusion
> option, this also reduces the influence of ESTs on annotation. So normally
> you would need to increase your protein dataset, but given that you have
> supplied 3 nematode species and all of UniProt already you should probably
> be fine. But there is one last thing you can do. Instead of using
> cufflinks, try using trinity to assemble the ESTs. There is a Jaccard clip
> option that reduces merging caused by overlapping UTRs. Between
> trinity and correct_est_fusion you should be able to really reduce the
> effect of those ESTs.
>
> If those changes don't work there is one last option. If you take the
> MAKER results and filter them for SNAP and Augustus ab initio results
> (match/match_part in the GFF3), then you can pass those in to the pred_gff
> option. Then turn snaphmm and augustus_species off in the control files.
> Basically what this will do is turn MAKER's hint-based prediction off and
> force it to filter the ab initio results and select models directly from
> there. Since the merging is being caused by bad hints (merged transcripts
> from mRNA-seq) this would reduce that effect. You will still need
> correct_est_fusion=1 though to trim UTR coming from the merged transcripts,
> because even though MAKER can't process hints to rerun SNAP and Augustus
> this way, it will try to add UTR using the EST evidence.
>
> --Carson
>
>
> From: Janna Fierst
> Date: Friday, October 4, 2013 1:06 PM
> To:
> Subject: [maker-devel] Fwd: exon/intron boundaries
>
> Hi,
>
> thanks for your reply. I have been going through our annotations in detail
> and trying different parameter sets, and I think I have identified what is
> going on, but I'm not sure how to set the MAKER2 parameters for our
> situation. We are working with a species of Caenorhabditid worm and there
> are long gene-dense blocks that are being incorrectly annotated as large
> single genes instead of several smaller closely spaced genes. The protein
> alignments (tblastx and protein2genome) show very clearly where the
> exon/intron boundaries are and in most cases agree with the augustus
> predictions. The assembled cufflinks output (through blastn, est2genome and
> est_gff:cufflinks) does not agree in some locations; I think this may be
> because in some cases the UTR nearly overlaps adjacent genes.
>
> I have included a screenshot of an annotated region viewed in apollo to
The large gene in the middle is actually 7 different > genes that are extremely close together and MAKER2 is collapsing them into > a single gene. I tried running without any RNASeq/cufflinks data and MAKER2 > annotates the region as two genes instead of one, but I can't get it to > recognize the 7 as different genes. I have retrained SNAP but we have not > been able to successfully train Augustus, we are currently using the > default caenhorhabditis species model. I also included a species specific > repeat library. I tried setting correct_est_fusion=1 and reducing > pred_flank but these changes appear to really alter the annotations and we > end up annotating almost nothing. I also tried setting est2genome=0 to > decrease the influence of the cufflinks assembly but it didn't appear to > help. There are some very large introns in these genes so I haven't tried > yet decreasing the maximum intron size because I'm concerned this may > generate too many split genes instead of our current merged gene problem. > Thanks for your help, any advice is greatly appreciated! -Janna Fierst > > > On Mon, Aug 26, 2013 at 12:21 PM, Carson Holt wrote: > >> Are you getting gene fusions or just more exons? Gene fusions can be >> reduced by setting correct_est_fusion=1, or reducing pred_flank, although >> reducing pred_flank can cause other issues (but those generally only appear >> if setting the value below below 150). Also if you have the maximum intron >> size set to high (split_hit option), you may also be generating bridging >> alignments that make evidence align across distant paralogous genes as well >> (this can result in gene merging) >> >> You should also look at your results manually in a viewer like Apollo. >> Then see if the extra exons are supported by something such as protein >> alignments from another species. 
If this is the case, you may have a >> poorly annotated protein set that is being used as evidence that is >> carrying over its erroneous exons into the species you are annotating. If >> the extra exons are supported by EST evidence, then perhaps you should try >> to rebuild the EST assembly (for example Trinity has an option to use a >> Jaccard similarity coefficient to avoid fusing transcripts). >> >> Another option is to retrain SNAP or Augustus. MAKER does not actually >> produce any of the models itself (it is a pipeline, not a predictor). The >> models are all generated using these other algorithms; MAKER just feeds >> them hints based on protein and transcript alignments, so making sure >> training is sufficient is important for those programs to produce their >> best models. >> >> Finally make sure your repeat database is sufficient; you may need to >> generate a species-specific repeat library using something like >> RepeatModeler. Repeats can end up being included as extra exons in gene >> models because they may contain reading frames that do code for proteins >> (e.g. reverse transcriptases). >> >> If you have any questions on any of the above, just let us know. >> >> Thanks, >> Carson >> >> >> From: Janna Fierst >> Date: Monday, August 26, 2013 2:54 PM >> To: >> Subject: [maker-devel] exon/intron boundaries >> >> Hi, >> >> I am using MAKER 2.28 to annotate a Caenorhabditid worm genome, and the >> initial results appear fairly good but we seem to be annotating too many >> exons for multiple genes. I was wondering which parameters should be tuned >> to change the threshold for exon/intron boundaries?
Thanks for your help >> -Janna Fierst >> _______________________________________________ maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>
From graham.etherington at sainsbury-laboratory.ac.uk Thu Oct 24 08:42:51 2013 From: graham.etherington at sainsbury-laboratory.ac.uk (graham etherington (TSL)) Date: Thu, 24 Oct 2013 13:42:51 +0000 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: Message-ID: Hi, I notice that the latest version of Maker available from http://www.yandell-lab.org/software/maker.html is still v2.29b (4 Oct 2013). As we're also having the start codon problem we'd like to get hold of v2.30. Do you know when this will be released? Many thanks, Graham Dr. Graham Etherington Bioinformatics Support Officer, The Sainsbury Laboratory, Norwich Research Park, Norwich NR4 7UH. UK Tel: +44 (0)1603 450601 On 13/10/2013 06:26, "Carson Holt" wrote: >Before Wednesday, because after that I'm on a 6-day road trip, so I have >to have it wrapped up before I leave. > >--Carson > > > > >On 10/12/13 12:16 PM, "Felix Bemm" wrote: > >>Thx Carson! Do you know when 2.30 will be available? Bests >> >>On 12.10.2013 17:48, Carson Holt wrote: >>> This issue just came up this week on another e-mail as well. >>> >>> One way to fix it is to edit the Bio::Tools::CodonTable module from >>> BioPerl (use maker --debug to have maker print out the locations of all >>> modules it uses). Such a hack should only be done as a temporary >>> workaround.
>>> >>> You change line 256 from --> >>> ---M---------------M---------------M---------------------------- >>> >>> To--> >>> -----------------------------------M---------------------------- >>> >>> >>> Maker uses the BioPerl is_start_codon function to determine if the codon >>> used in a result it receives is acceptable or if it should look upstream >>> or downstream for a different one. Apparently there are actually 3 >>> acceptable start codons in the standard codon table (2 rare ones and one >>> common), so making the edit removes the two rare ones from the list. >>> >>> I'm also preparing a 2.30 release that exports a "strict" codon table to >>> the BioPerl module, which will be used by default and should fix the issue. >>> >>> Thanks, >>> Carson >>> >>> >>> >>> On 10/12/13 11:06 AM, "Felix Bemm" wrote: >>> >>>> Hi, >>>> >>>> I have a serious problem with maker annotations, especially the start >>>> codon placement. It started happening with maker version 2.27! About 1/3 >>>> of my genes are missing a proper start codon. When reviewing the GFF >>>> files, the gene predictors and EST evidence standalone would predict the >>>> right gene structure including the start codon. I was thinking that this >>>> problem was somehow linked to my gene models but looking at their >>>> standalone prediction I discarded that theory. Just some stats about >>>> proteins with proper start codon (first column) and missing start codon >>>> (second column). >>>> >>>> Maker 9268 4215 >>>> SNAP 16577 896 >>>> Genemark 18764 290 >>>> >>>> Numbers look the same when Augustus is used (currently retraining). >>>> Manually using ESTs in Apollo often fixes the problems but I don't want >>>> to check > 4000 genes for this. >>>> >>>> Do you have any idea what could cause such a problem? If you need some >>>> example GFF files, I can send them.
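A minimal sketch of what the start-mask edit quoted above does, for readers who want to see which codons are affected. It is not MAKER or BioPerl code. It assumes the 64-character string follows the NCBI genetic-code convention, where position i flags the i-th codon in T, C, A, G order; the function names start_codons and is_start_codon here are hypothetical stand-ins, not the BioPerl method.

```python
# Illustrative sketch only; this is not MAKER or BioPerl code.
# Assumption: the mask uses NCBI-style ordering, bases in T, C, A, G order.
BASES = "TCAG"

# All 64 codons in table order: TTT, TTC, TTA, TTG, TCT, ...
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]

# Masks built programmatically so the dash counts cannot be mistyped.
# Standard table: three positions flagged as possible starts.
STANDARD_STARTS = "-" * 3 + "M" + "-" * 15 + "M" + "-" * 15 + "M" + "-" * 28
# Edited ("strict") mask from the workaround: a single flagged position.
STRICT_STARTS = "-" * 35 + "M" + "-" * 28


def start_codons(mask):
    """Return the codons flagged with 'M' in a 64-character start mask."""
    if len(mask) != len(CODONS):
        raise ValueError("start mask must have 64 characters")
    return [codon for codon, flag in zip(CODONS, mask) if flag == "M"]


def is_start_codon(codon, mask):
    """Hypothetical stand-in for a start-codon check against a mask."""
    return codon.upper().replace("U", "T") in start_codons(mask)


if __name__ == "__main__":
    print("standard mask:", start_codons(STANDARD_STARTS))
    print("strict mask:  ", start_codons(STRICT_STARTS))
```

Under these assumptions the standard mask flags TTG, CTG and ATG, and the edited mask leaves only ATG, which matches the description of removing the two rare start codons.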
>>>> >>>> Best regards >>>> Felix >>>> >>>> -- >>>> Felix Bemm >>>> Department of Bioinformatics >>>> University of Würzburg, Germany >>>> Tel: +49 931 - 31 83696 >>>> Fax: +49 931 - 31 84552 >>>> felix.bemm at uni-wuerzburg.de >> >>-- >>Felix Bemm >>Department of Bioinformatics >>University of Würzburg, Germany >>Tel: +49 931 - 31 83696 >>Fax: +49 931 - 31 84552 >>felix.bemm at uni-wuerzburg.de
From myandell at genetics.utah.edu Thu Oct 24 08:55:05 2013 From: myandell at genetics.utah.edu (Mark Yandell) Date: Thu, 24 Oct 2013 13:55:05 +0000 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: References: , Message-ID: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> Hi Graham, Carson is currently on the road, moving his family from Canada back to the States. He should be back online sometime next week. Sorry for the delay. --mark Mark Yandell Professor of Human Genetics H.A. & Edna Benning Presidential Endowed Chair Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:801-587-7707
From felix.bemm at uni-wuerzburg.de Thu Oct 24 09:03:27 2013 From: felix.bemm at uni-wuerzburg.de (Felix Bemm) Date: Thu, 24 Oct 2013 16:03:27 +0200 Subject: [maker-devel] Serious Start Codon Problem
In-Reply-To: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> References: , <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> Message-ID: <526928AF.90108@uni-wuerzburg.de> Hi, I used the temporary fix to Bio::Tools::CodonTable. It's working pretty nicely. Almost every predicted protein now starts with an M ;) Bests Felix -- Felix Bemm Department of Bioinformatics University of Würzburg, Germany Tel: +49 931 - 31 83696 Fax: +49 931 - 31 84552 felix.bemm at uni-wuerzburg.de
From jim.hu.biobio at gmail.com Thu Oct 24 09:32:01 2013 From: jim.hu.biobio at gmail.com (JH Gmail) Date: Thu, 24 Oct 2013 09:32:01 -0500 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> References: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> Message-ID:
<69247335-1227-41CB-94EA-BFEEE5207D51@gmail.com> Slightly off-topic questions: What is the assessment of right and wrong based on? Is there a training set based on experiments? In bacteria, the use of the minor starts is significant, but the mechanism is so different I would expect eukaryotes to be more strict. Does maker handle ribosome profiling yet? jim
From amelia.ireland at gmod.org Thu Oct 24 13:52:17 2013 From: amelia.ireland at gmod.org (Amelia Ireland) Date: Thu, 24 Oct 2013 11:52:17 -0700 Subject: [maker-devel] GMOD 2014: San Diego Message-ID: Greetings GMOD community citizens!
Online registration is now open for the 2014 GMOD meeting, to be held at the Best Western Seven Seas, San Diego, CA, on January 16-17, 2014 (after PAG). The meeting is open to GMOD users, developers, and anyone interested in the project and the software we provide. Please see the meeting page, http://gmod.org/wiki/Jan_2014_GMOD_Meeting, for more information. Early bird registration rates apply until Dec. 14th. We have secured a discount on a block of rooms at the Best Western Seven Seas; please see the wiki for more information on booking. Please feel free to contact the GMOD helpdesk on help at gmod.org if you have any questions. Thanks! -- Amelia Ireland GMOD Community Support http://gmod.org || @gmodproject
From barry.moore at genetics.utah.edu Thu Oct 24 16:09:20 2013 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Thu, 24 Oct 2013 15:09:20 -0600 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: <69247335-1227-41CB-94EA-BFEEE5207D51@gmail.com> References: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> <69247335-1227-41CB-94EA-BFEEE5207D51@gmail.com> Message-ID: <1F2D4C5A-1B72-409A-8F7D-4793F9BCEE88@genetics.utah.edu> Hi Jim, You can add any kind of evidence you want to MAKER, including ribosomal profiling data, as long as you format it as GFF3. You could pass in your GFF3-formatted ribosomal profiling data in maker_opts.ctl with the protein_gff option and it will be treated as protein evidence. However, if you are wondering whether anyone has coded a specific addition to MAKER to accept ribosomal profiling data and treat it as a different kind of evidence, then no, this hasn't been done. B Barry Moore Research Scientist Dept.
of Human Genetics University of Utah Salt Lake City, UT 84112 -------------------------------------------- (801) 585-3543
From marc.hoeppner at imbim.uu.se Fri Oct 25 09:20:04 2013
From: marc.hoeppner at imbim.uu.se (Marc P. Hoeppner)
Date: Fri, 25 Oct 2013 16:20:04 +0200
Subject: [maker-devel] Maker MPICH2 issue (Multiple MAKER processes have been started in the same directory)
Message-ID: <526A7E14.9010906@imbim.uu.se>

Dear list,

I know that MPICH2-related issues have been discussed previously, but the suggested solutions don't seem to work for me. Briefly, I have set up a new cluster, with a fresh install of mpich2 (SL 6.4, from the yum repos) and a newly compiled build of Maker 2.28. When I start maker with mpiexec, I get spammed with 'WARNING: Multiple MAKER processes have been started in the same directory.'

Here is what I have tried so far - still getting the error :-< Any suggestions would be appreciated.

-------- MPICH2 ---------

which mpiexec:
/usr/lib64/mpich2/bin/mpiexec

Started the mpd ring as normal user across 5 nodes (bnode-01 to bnode-04, and the head node bhead)

mpdtrace:
bhead
bnode-02
bnode-03
bnode-04
bnode-01

mpdringtest:
time for 1 loops = 0.00128293037415 seconds

mpiexec -n 32 -machinefile hostfile hostname:

I also ran the SKaMPI MPI benchmark, which ran through just fine.

----------- MAKER -----------

./Build status

PERL Dependencies: VERIFIED
External Programs: VERIFIED
External C Libraries: VERIFIED
MPI SUPPORT: ENABLED
MWAS Web Interface: DISABLED
MAKER PACKAGE: CONFIGURATION OK

During configuration/installation, Build.PL automatically detected all MPI data and never complained, so I have to assume that everything went fine there.

--
----------
Marc P.
Hoeppner, PhD Department of Microbiology and Medical Biochemistry Uppsala University, Sweden marc.hoeppner at imbim.uu.se From jim.hu.biobio at gmail.com Fri Oct 25 09:51:19 2013 From: jim.hu.biobio at gmail.com (JH Gmail) Date: Fri, 25 Oct 2013 09:51:19 -0500 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: <1F2D4C5A-1B72-409A-8F7D-4793F9BCEE88@genetics.utah.edu> References: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> <69247335-1227-41CB-94EA-BFEEE5207D51@gmail.com> <1F2D4C5A-1B72-409A-8F7D-4793F9BCEE88@genetics.utah.edu> Message-ID: <596400F6-3215-4F71-A896-AF52CEC8D77D@gmail.com> Thanks Barry, I'm thinking about this more theoretically, as we don't have immediate plans, but... This would work for gff derived from ribosome profiling analysis, but it would be nice to be able to start with bam files of mapped reads, or coverage scores as bigwig or gff. RNA-seq has similar issues, but IIRC Maker expects RNA-seq data to be assembled first. It may not be possible to do de novo cds assembly from ribosome profiling because I think the nuclease step may kill the overlaps. I'd have to look at some of the available data. Jim Sent from my iPad > On Oct 24, 2013, at 4:09 PM, Barry Moore wrote: > > Hi Jim, > > You can add any kind of evidence you want to MAKER, including ribosomal profiling data, as long as you format it as GFF3. You could pass in your GFF3-formatted ribosomal profiling data in maker_opts.ctl with the protein_gff option and it will be treated as protein evidence. However, if you are wondering whether anyone has coded a specific addition to MAKER to accept ribosomal profiling data and treat it as a different kind of evidence, then no, this hasn't been done. > > B > >> On Oct 24, 2013, at 8:32 AM, JH Gmail wrote: >> >> Slightly off-topic questions: What is the assessment of right and wrong based on? Is there a training set based on experiments?
In bacteria, the use of the minor starts is significant, but the mechanism is so different I would expect eukaryotes to be more strict. >> >> Does maker handle ribosome profiling yet? >> >> jim >> >> Sent from my iPad >> > > Barry Moore > Research Scientist > Dept. of Human Genetics > University of Utah > Salt Lake City, UT 84112 > -------------------------------------------- > (801) 585-3543 > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry.moore at genetics.utah.edu Fri Oct 25 10:44:55 2013 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Fri, 25 Oct 2013 09:44:55 -0600 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: <596400F6-3215-4F71-A896-AF52CEC8D77D@gmail.com> References: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> <69247335-1227-41CB-94EA-BFEEE5207D51@gmail.com> <1F2D4C5A-1B72-409A-8F7D-4793F9BCEE88@genetics.utah.edu> <596400F6-3215-4F71-A896-AF52CEC8D77D@gmail.com> Message-ID: Thanks Jim. B On Oct 25, 2013, at 8:51 AM, JH Gmail wrote: > Thanks Barry, > > I'm thinking about this more theoretically, as we don't have immediate plans, but... > > This would work for gff derived from ribosome profiling analysis, but it would be nice to be able to start with bam files of mapped reads, or coverage scores as bigwig or gff. RNA-seq has similar issues, but IIRC Maker expects RNA-seq data to be assembled first. It may not be possible to do de novo cds assembly from ribosome profiling because I think the nuclease step may kill the overlaps. I'd have to look at some of the available data. > > Jim > > Sent from my iPad > Barry Moore Research Scientist Dept. of Human Genetics University of Utah Salt Lake City, UT 84112 -------------------------------------------- (801) 585-3543 -------------- next part -------------- An HTML attachment was scrubbed...
URL: From marc.hoeppner at imbim.uu.se Sat Oct 26 06:54:07 2013 From: marc.hoeppner at imbim.uu.se (Marc P. Hoeppner) Date: Sat, 26 Oct 2013 13:54:07 +0200 Subject: [maker-devel] Maker MPICH2 issue (Multiple MAKER processes have been started in the same directory) In-Reply-To: <526A7E14.9010906@imbim.uu.se> References: <526A7E14.9010906@imbim.uu.se> Message-ID: <526BAD5F.2000800@imbim.uu.se> Dear List, after much more debugging, I found the solution myself. Here are the details: The Maker installation instructions refer to 'mpich2' in multiple places, including weblinks (which are outdated btw). Some digging revealed that mpich2 has since been renamed and merged back into mpich. So the website is www.mpich.org and the current stable version is mpich-3.0.4. Closely following the installation instructions yielded a working copy of mpiexec that Maker was willing to accept. Good news - but I suggest updating the documentation. Most importantly for SL/CentOS users, mpich2 as packaged with e.g. Scientific Linux *does not* work with Maker. You will need to download the latest stable build and compile it yourself (the mpich2 version bundled with Maker doesn't work either btw). Cheers, Marc > Dear list, > > I know that MPICH2 related issues have been discussed previously, but > the suggested solutions don't seem to work for me. > Briefly, I have set up a new cluster, with a fresh install of mpich2 > (SL 6.4, from the yum repos) and a newly compiled build of Maker 2.28. > > When I start maker with mpiexec, I get spammed with > 'WARNING: Multiple MAKER processes have been started in the same > directory.'
> > Here is what I have tried so far - still getting the error :-< Any > suggestions would be appreciated > > -------- > MPICH2 > --------- > > which mpiexec: > /usr/lib64/mpich2/bin/mpiexec > > Started the mpd ring as normal user across 5 nodes (bnode-01 to > bnode-04, and the head node bhead) > > mpdtrace: > bhead > bnode-02 > bnode-03 > bnode-04 > bnode-01 > > mpdringtest: > time for 1 loops = 0.00128293037415 seconds > > mpixec -n 32 -machinefile hostfile hostname: > > > > I also ran the Skampi Mpi Benchmark, which ran through just fine. > > ----------- > MAKER > ----------- > > ./Build status > > PERL Dependencies: VERIFIED > External Programs: VERIFIED > External C Libraries: VERIFIED > MPI SUPPORT: ENABLED > MWAS Web Interface: DISABLED > MAKER PACKAGE: CONFIGURATION OK > > During configuration/installation, Build.PL automatically detected all > MPI data and never complained, so I have to assume that everything > went fine there. > -- ---------- Marc P. Hoeppner, PhD Department of Microbiology and Medical Biochemistry Uppsala University, Sweden marc.hoeppner at imbim.uu.se From barry.utah at gmail.com Sat Oct 26 09:41:24 2013 From: barry.utah at gmail.com (Barry Moore) Date: Sat, 26 Oct 2013 08:41:24 -0600 Subject: [maker-devel] Maker MPICH2 issue (Multiple MAKER processes have been started in the same directory) In-Reply-To: <526BAD5F.2000800@imbim.uu.se> References: <526A7E14.9010906@imbim.uu.se> <526BAD5F.2000800@imbim.uu.se> Message-ID: Hi Marc, Thanks so much for sharing the results of all your hard work! We will follow up on your feedback and incorporate the necessary updates to MAKER docs and installation procedures. B On Oct 26, 2013, at 5:54 AM, Marc P. Hoeppner wrote: > Dear List, > > after much more debugging, I found the solution myself. Here are the details: > > The Maker installation instructions refer to 'mpich2' in multiple places, including weblinks (whchi are outdated btw). 
Some digging revealed that mpich2 has since been renamed and merged back into mpich. So the website is www.mpich.org and the current stable version is mpich-3.0.4. Closely following the installation instruction yielded a working copy of mpiexec that Maker was willing to accept. Good news - but I suggest to update the documentation. Most importantly for either SL/CentOS users, mpich2 as packaged with e.g. Scientific Linux *does not* work with Maker. You will need to download the latest stable build and compile it yourself (the mpich2 version bundled with Maker doesn't work either btw). > > Cheers, > > Marc > > > > -- > ---------- > Marc P. Hoeppner, PhD > Department of Microbiology and Medical Biochemistry > Uppsala University, Sweden > marc.hoeppner at imbim.uu.se > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org Barry Moore Research Scientist Dept. of Human Genetics University of Utah Salt Lake City, UT 84112 -------------------------------------------- (801) 585-3543 -------------- next part -------------- An HTML attachment was scrubbed... URL: From wholtz at lygos.com Wed Oct 30 13:21:52 2013 From: wholtz at lygos.com (William Holtz) Date: Wed, 30 Oct 2013 11:21:52 -0700 (PDT) Subject: [maker-devel] maker apollo error Message-ID: <1383157312.95043.YahooMailNeo@web5706.biz.mail.ne1.yahoo.com> Hi, I'm attempting to reply to a thread on this list from before I was a member. My apologies if this gets messed up. I've tried installing maker v2.28 and v2.29p-beta and both of them fail due to the URL of the apollo tar file being outdated as previously reported on this list. The updated URL that Carson Holt posted, http://sourceforge.net/code-snapshots/svn/g/gm/gmod/svn/gmod-svn-25291-apollo-trunk.zip, is also no longer valid. I was able to find a copy of the apollo source at http://downloads.sourceforge.net/project/gmod/Apollo/1.11.1/apollo-1.11.1.tgz.
However, my attempts to integrate this into the maker build process have been unsuccessful, though it could likely be done by someone more skilled than I. Any pointers on how to get past this hurdle would be appreciated. Thanks in advance for your help. -Will From barry.utah at gmail.com Wed Oct 30 17:31:58 2013 From: barry.utah at gmail.com (Barry Moore) Date: Wed, 30 Oct 2013 16:31:58 -0600 Subject: [maker-devel] maker apollo error In-Reply-To: <1383157312.95043.YahooMailNeo@web5706.biz.mail.ne1.yahoo.com> References: <1383157312.95043.YahooMailNeo@web5706.biz.mail.ne1.yahoo.com> Message-ID: Hi Will, You can edit these URLs in the following file: maker/src/locations In a section that looks like this: 'apollo' => {'Linux_x86_64' => 'http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar', 'Linux_i386' => 'http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar', 'Darwin_i386' => 'http://weatherby.genetics.utah.edu/apollo/apollo.tar.gz', 'Darwin_x86_64' => 'http://weatherby.genetics.utah.edu/apollo/apollo.tar.gz', 'src_' => 'http://weatherby.genetics.utah.edu/apollo/apollo.tar.gz', }, Carson may have more feedback for you as well, but he is right in the middle of a move, so it may be a few days before he can reply; modifying this file should work for you in the meantime. For any executables that are missing, you can always install them manually and add them to your path, and maker will find them. Also, Apollo shouldn't be required for a base maker install, so you could ignore that dependency as well. Thanks, B On Oct 30, 2013, at 12:21 PM, William Holtz wrote: > Hi, > > I'm attempting to reply to a thread on this list from before I was a member. My apologies if this gets messed up. > > I've tried installing maker v2.28 and v2.29p-beta and both of them fail due to the URL of the apollo tar file being outdated as previously reported on this list.
The updated URL that Carson Holt posted, http://sourceforge.net/code-snapshots/svn/g/gm/gmod/svn/gmod-svn-25291-apollo-trunk.zip, is also no longer valid. I was able to find a copy of the apollo source at http://downloads.sourceforge.net/project/gmod/Apollo/1.11.1/apollo-1.11.1.tgz. However my attempts to integrate this into the maker build process have been unsuccessful, but likely could be done by someone more skilled than I. Any pointers on how to get past this hurdle would be appreciated. > > Thanks in advance for your help. > > -Will > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org Barry Moore Research Scientist Dept. of Human Genetics University of Utah Salt Lake City, UT 84112 -------------------------------------------- (801) 585-3543 -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Wed Oct 2 08:14:44 2013 From: carsonhh at gmail.com (Carson Holt) Date: Wed, 02 Oct 2013 10:14:44 -0400 Subject: [maker-devel] maker2 scripts for functional annotation In-Reply-To: Message-ID: It still works, but it will be reduced. Without ESTs you won't get any UTR prediction, also you will be limited by how well the protein set matches to the genes. If you use the protein set of a close relative you will capture most things. You will probably capture 80-95% of what you would by also including ESTs. It all depending on how different the species in the proteins evidence file are compared to the the species being annotated. --Carson On 10/1/13 3:00 PM, "Xia.Cao at dupont.com" wrote: >Hi Carson, > >Thank you for your message and your kind help. Now conversion from >match/match_part to gene/mRNA/exons/CDS works well after blanking out >some options in control file. > >I have one more question about maker2. 
In case we don't have EST evidence >(set of ESTs or assembled mRNA-seq in fasta format) for a genome, does >maker2 function? If it does function, could you please let me know the >performance of maker2 without providing EST evidence compared to the one >with EST evidence? > >Thank you again and I look forward to hearing from you. > >Best, >Xia > > > > >-----Original Message----- >From: Carson Holt [mailto:carsonhh at gmail.com] >Sent: Wednesday, September 25, 2013 10:36 AM >To: CAO, XIA; myandell at genetics.utah.edu; RIVERA, CORBAN GREGORY; >maker-devel at yandell-lab.org >Subject: Re: [maker-devel] maker2 scripts for functional annotation > >If it is launching predictors then you have snap hmm or augustus_species >set. You ned to blank out all other options in the control files >(including repeat masking options, proteins, ESTs, etc.) when trying to >convert mathc/match_part to gene/mRNA/exons/CDS, or else those other >programs will run. > >--Carson > > >On 9/25/13 10:31 AM, "Xia.Cao at dupont.com" wrote: > >>Hi Carson, >> >>Thank you for the message and your kind help. We tested maker2 by >>setting keep_preds=1, pred_gff=generated_gff_file_from_first_makerRun . >>But it seemed maker2 started to launch all predictors again and it took >>long time to finish. I wonder if there is any way that we can directly >>get gene/mRNA/exons/CDS gff file without re-running maker2 to convert >>match/match_part features into gene/mRNA/exons/CDS. >> >>Thanks, >>Xia >> >>-----Original Message----- >>From: Carson Holt [mailto:carsonhh at gmail.com] >>Sent: Thursday, September 19, 2013 5:58 PM >>To: Mark Yandell; CAO, XIA; RIVERA, CORBAN GREGORY; >>maker-devel at yandell-lab.org >>Subject: Re: [maker-devel] maker2 scripts for functional annotation >> >>Hello Corban & Xia, >> >>Some scripts like gff3_preds2models are deprecated. 
To get the same >>result as was offered by gff3_preds2models, just give your >>match/match_part features to pref_gff= in the maker_opts.ctl file, set >>keep_preds=1, and run with all other options and predictors turned off. >>The final MAKER result will be your match/match_part features converted >>into gene/mRNA/exons/CDS. >> >>For functional annotation, you can use Interproscan, BLASTP against >>UniProt, or BALST2GO. My preference is to use InterProScan to add GO >>terms and proteins domains via the ipr_update_gff and iprscan2gff3 >>scripts. Then add putative gene functions via BLASTP to UniProt and >>maker_functional_fasta and maker_functional_gff scripts. >> >>Go ahead and take a look and that those tools and let me know if you >>have any questions or need help you configuring them. >> >>Thanks, >>Carson >> >> >>On 9/19/13 11:53 AM, "Mark Yandell" wrote: >> >>>Hi Corban & Xia, >>> >>> >>>I've forwarded your question along to the MAKER_dev list, were you can >>>get speedy answers to your maker related questions. Thanks for using >>>MAKER. >>> >>>--mark >>> >>> >>>Mark Yandell >>>Professor of Human Genetics >>>H.A. & Edna Benning Presidential Endowed Chair Eccles Institute of >>>Human Genetics University of Utah >>>15 North 2030 East, Room 2100 >>>Salt Lake City, UT 84112-5330 >>>ph:801-587-7707 >>> >>>________________________________________ >>>From: Xia.Cao at dupont.com [Xia.Cao at dupont.com] >>>Sent: Thursday, September 19, 2013 11:49 AM >>>To: Mark Yandell; Corban-Gregory.Rivera at dupont.com >>>Subject: maker2 scripts for functional annotation >>> >>>Dr. Yandell, >>> >>>We were recently evaluating maker2 for annotation and going through >>>the maker tutorial from 2012. >>> >>>http://gmod.org/wiki/MAKER_Tutorial_2012 >>> >>>The tutorial makes references to some scripts that we couldn?t find in >>>the current release. 
We were looking for scripts like >>>gff3_preds2models to convert match/match_part format into annotations >>>with gene/mRNA/exons/CDS and others. I was wondering if maybe we did >>>not have the most up to date version. >>> >>>In addition to getting accurate gene annotations, I was looking for a >>>solution to get functional assignments. I see that there are some >>>scripts like maker_functional_fasta that may help, but I was wondering >>>what you would recommend. >>> >>>Thanks, >>> >>>Corban & Xia >>> >>>This communication is for use by the intended recipient and contains >>>information that may be Privileged, confidential or copyrighted under >>>applicable law. If you are not the intended recipient, you are hereby >>>formally notified that any use, copying or distribution of this >>>e-mail, in whole or in part, is strictly prohibited. Please notify the >>>sender by return e-mail and delete this e-mail from your system. >>>Unless explicitly and conspicuously designated as "E-Contract >>>Intended", this e-mail does not constitute a contract offer, a >>>contract amendment, or an acceptance of a contract offer. This e-mail >>>does not constitute a consent to the use of sender's contact >>>information for direct marketing purposes or for transfers of data to >>>third parties. >>> >>>The dupont.com web address will continue in use for a transitional >>>period for communications sent or received on behalf of DuPont >>>Performance Coatings., which is not affiliated in any way with the >>>DuPont Company. 
From Xia.Cao at dupont.com Tue Oct 1 13:00:42 2013
From: Xia.Cao at dupont.com (Xia.Cao at dupont.com)
Date: Tue, 1 Oct 2013 19:00:42 +0000
Subject: [maker-devel] maker2 scripts for functional annotation
In-Reply-To:
References:
Message-ID:

Hi Carson,

Thank you for your message and your kind help. Conversion from
match/match_part to gene/mRNA/exons/CDS now works well after blanking out
some options in the control file.

I have one more question about maker2. If we don't have EST evidence
(a set of ESTs or assembled mRNA-seq in FASTA format) for a genome, does
maker2 still function? If it does, could you please let me know how
maker2 performs without EST evidence compared with a run that includes it?

Thank you again and I look forward to hearing from you.

Best,
Xia

-----Original Message-----
From: Carson Holt [mailto:carsonhh at gmail.com]
Sent: Wednesday, September 25, 2013 10:36 AM
To: CAO, XIA; myandell at genetics.utah.edu; RIVERA, CORBAN GREGORY; maker-devel at yandell-lab.org
Subject: Re: [maker-devel] maker2 scripts for functional annotation

If it is launching predictors then you have snaphmm or augustus_species set. You need to blank out all other options in the control files (including repeat masking options, proteins, ESTs, etc.)
when trying to convert match/match_part to gene/mRNA/exons/CDS, or else those other programs will run.

--Carson
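The "blank out all other options" step discussed above can be sketched as a small script. This is only a rough illustration, assuming the control file uses `key=value #comment` lines: the function name `blank_for_conversion`, the sample file contents, and the exact list of options to clear are assumptions made for the example, not something stated in the thread.

```python
import re

# Options cleared for a conversion-only run. This list is an assumption
# based on the thread (predictors, repeat masking, proteins, ESTs).
CLEAR = ("est", "altest", "protein", "model_org", "rmlib",
         "repeat_protein", "snaphmm", "gmhmm", "augustus_species")

# Matches "key=value #comment" lines; section/comment lines start with "#".
OPTION_LINE = re.compile(r"^(?P<key>\w+)=(?P<value>[^#]*?)(?P<rest>\s*#.*)?$")


def blank_for_conversion(ctl_text, pred_gff):
    """Return control-file text that passes `pred_gff` through with
    keep_preds=1 and clears the evidence/predictor options in CLEAR."""
    overrides = {key: "" for key in CLEAR}
    overrides["pred_gff"] = pred_gff
    overrides["keep_preds"] = "1"

    out = []
    for line in ctl_text.splitlines():
        m = OPTION_LINE.match(line)
        if m and m.group("key") in overrides:
            rest = m.group("rest") or ""
            line = f"{m.group('key')}={overrides[m.group('key')]}{rest}"
        out.append(line)
    return "\n".join(out)


# Made-up sample; a real maker_opts.ctl has many more options.
sample = """\
#-----Genome
genome=assembly.fasta #genome sequence
est=ests.fasta #ESTs
protein=proteins.fasta #protein evidence
snaphmm=my.hmm #SNAP HMM file
augustus_species=arabidopsis #Augustus species model
pred_gff= #ab-initio predictions from an external GFF3 file
keep_preds=0 #keep predictions lacking evidence support
"""

print(blank_for_conversion(sample, "first_run.gff"))
```

The genome entry and any option not named in `CLEAR` are left untouched, so it is worth comparing the result against the original control file before launching a run.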
From carsonhh at gmail.com Wed Oct 2 08:43:29 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 02 Oct 2013 10:43:29 -0400
Subject: [maker-devel] maker2 scripts for functional annotation
In-Reply-To:
Message-ID:

That's ok. If the EST evidence comes from a very closely related strain,
provide it directly to the est= option, otherwise provide it to the
altest= option. The difference between the two options is how the
sequences are aligned. The altest= alignments take about 10x longer than
the est= alignments.

--Carson

On 10/2/13 10:40 AM, "Xia.Cao at dupont.com" wrote:

>Hi Carson,
>
>Thank you for your quick and kind reply. One more question here. If we
>don't have EST evidence for the strain we sequenced/assembled, will it be
>ok to provide some EST evidence that is from some other strains which are
>close to the one we are trying to annotate? Could you please let us know
>how this will affect performance of maker2?
>
>Thank you again.
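The est= versus altest= choice described above can be pictured with a minimal sketch. The function name `transcript_evidence_lines` is hypothetical, and whether a source counts as "very closely related" is left to the user as a boolean, since the thread gives no threshold.

```python
def transcript_evidence_lines(fasta_path, closely_related):
    """Return maker_opts.ctl lines that assign a transcript FASTA file
    to est= (same or very closely related source) or altest= (otherwise)."""
    if closely_related:
        return [f"est={fasta_path}", "altest="]
    # More distant evidence goes to altest=, which the thread says
    # aligns roughly 10x slower than est=.
    return ["est=", f"altest={fasta_path}"]


for line in transcript_evidence_lines("related_strain_ests.fasta", True):
    print(line)
```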
From Carson.Holt at oicr.on.ca Wed Oct 2 09:24:42 2013
From: Carson.Holt at oicr.on.ca (Carson Holt)
Date: Wed, 2 Oct 2013 15:24:42 +0000
Subject: [maker-devel] Incorrect gene model, antisense transcripts
In-Reply-To:
Message-ID:

SNAP appears to be making small ab initio calls all over the place. You may need to retrain it, or drop it completely from your analysis and just use Augustus. Try running with protein2genome and then training SNAP off of those results. Also make sure your repeat masking is sufficient. If it is not, you may be getting too many SNAP predictions.

Thanks,
Carson

From: Sivaranjani Namasivayam
Date: Tuesday, October 1, 2013 5:01 PM
To: "maker-devel-bounces at yandell-lab.org"
Subject: Incorrect gene model, antisense transcripts

Hello,

I am working on annotating a genome. I have RNAseq data assembled with Cufflinks and Trinity. I am using MAKER to predict gene models. Below is the procedure I used:

- For the first round of prediction I provided the following: EST evidence: assembled RNAseq, Protein evidence: proteins from related organism, Gene predictor: augustus trained on a related organism.
- The number of genes I got was much lower than expected (expected at least 10K genes, got ~4K genes).
- Used the gene models from the first round of annotation to train SNAP (default parameters) and provided the HMM for the second round of annotation (also used Augustus). I passed all other data as-is in the GFF3, except the gene models. I also included some manually curated genes in the EST evidence.
- The number of genes was much higher, ~11K, but many seemed incorrect and appear to have been generated from the SNAP models. I am attaching a screenshot from Apollo showing this.

Would you have any suggestions on why this might be happening and how I can fix it? I am essentially looking for gene models for my assembled RNAseq; can I achieve this by setting est2genome to 1? Would I be losing or misrepresenting any data if I do this?

Also, I recently noticed my RNAseq data has assembled antisense transcripts. MAKER seems to be predicting a coding region for these antisense transcripts in some cases (although the coding region predicted is much smaller than that of the sense gene model). Is there a way to tell MAKER not to predict gene models on both strands at the same locus?

Appreciate your help.

Thanks,
Ranjani

From Xia.Cao at dupont.com Wed Oct 2 08:40:12 2013
From: Xia.Cao at dupont.com (Xia.Cao at dupont.com)
Date: Wed, 2 Oct 2013 14:40:12 +0000
Subject: [maker-devel] maker2 scripts for functional annotation
In-Reply-To:
References:
Message-ID:

Hi Carson,

Thank you for your quick and kind reply. One more question here. If we don't have EST evidence for the strain we sequenced/assembled, will it be ok to provide some EST evidence that is from some other strains which are close to the one we are trying to annotate? Could you please let us know how this will affect performance of maker2?

Thank you again.
Best,
Xia

From markwiz at us.ibm.com Wed Oct 2 10:52:55 2013
From: markwiz at us.ibm.com (Mark Wisner)
Date: Wed, 2 Oct 2013 12:52:55 -0400
Subject: [maker-devel] Maker2 on IBM PowerLinux?
Message-ID:

IBM's PowerLinux is tuned for Big Data workloads. We have had queries for Maker2 on PowerLinux.
How do we work with Maker2 development to create a PowerLinux port?
Thanks,

Mark K. Wisner
Linux On Power ISV PM
IBM
3039 Cornwallis Rd
RTP, NC 27709
Cell 919-649-5813

From barry.moore at genetics.utah.edu Wed Oct 2 11:29:36 2013
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Wed, 2 Oct 2013 11:29:36 -0600
Subject: [maker-devel] Maker2 on IBM PowerLinux?
In-Reply-To:
References:
Message-ID:

Hi Mark,

Carson Holt is the primary MAKER developer, but others of us may be able to help as well. I've added e-mails for Carson, Michael Campbell, Mark Yandell and myself, who are currently the developers on the project. Feel free to contact this mailing list - or, if the conversation is getting too far afield for general interest, you can contact us off list and we will do whatever we can to help. If you feel you need to modify MAKER code to optimize it for PowerLinux, I'm happy to give you a branch on the subversion repo to work with, and then we can work with you to merge these changes back to the trunk when you have improvements that benefit PowerLinux (and all other) users.

Thanks very much for your interest in MAKER - your participation in the community is most welcome!

Barry

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543

From carsonhh at gmail.com Wed Oct 2 11:29:50 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 02 Oct 2013 13:29:50 -0400
Subject: [maker-devel] Maker2 on IBM PowerLinux?
In-Reply-To:
Message-ID:

MAKER is written in Perl, so it can work pretty much anywhere you have Perl. It is more an issue of the tools MAKER uses. If you can get SNAP, BLAST, exonerate, etc. to run, then MAKER should be no problem.

You can let MAKER try to install and configure any missing Perl libraries as well as external tools for you (this is by far the easiest way). You will need an internet connection.

  cd .../maker/src/
  perl Build.PL          #say 'Y' to local installation of libraries and 'N' to MPI for now
  ./Build installdeps    #grabs missing perl libraries from the ether, and installs them in the .../maker/perl/ directory
  ./Build installexes    #grabs missing tools from the ether, installs, and configures them in the .../maker/exe/ directory
  ./Build install        #just installs maker

If you want to run via MPI, I'd recommend OpenMPI, and you will need to rerun the 'perl Build.PL' and './Build install' steps (say yes to MPI this time around).
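
Before running the Build steps above on a new platform, a quick check of which external tools are already on the PATH can show what still needs porting. The following is only a sketch: the check_tools function is a hypothetical helper, and the executable names passed to it (snap, blastn, tblastx, exonerate, RepeatMasker) are assumptions about a typical setup, so adjust them to match your installation.

```shell
#!/bin/sh
# Sketch: report which of the named executables can be found on the PATH.
# check_tools is a hypothetical helper, not part of MAKER.

check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf 'found    %s\n' "$tool"
        else
            printf 'MISSING  %s\n' "$tool"
            missing=$((missing + 1))
        fi
    done
    # Succeed only if every tool was found
    [ "$missing" -eq 0 ]
}

# Assumed tool list; edit to match the programs your MAKER run will use.
check_tools perl snap blastn tblastx exonerate RepeatMasker ||
    echo "Some tools were not found; they would need to be built for this platform."
```

Anything reported as MISSING is a candidate for the automatic installer step or for a manual build on the target architecture.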

Here are some variables you will likely have to add to your .bash_profile to get OpenMPI to run with MAKER. You must add these and source the profile before attempting to install with OpenMPI, otherwise compilation will not work correctly (OpenMPI shared library issue).

  #you need to set $OMPIHOME to OpenMPI's location here -->
  export LD_PRELOAD=$OMPIHOME/lib/libmpi.so:$LD_PRELOAD
  #this one just suppresses a warning -->
  export OMPI_MCA_mpi_warn_on_fork=0

And on some systems you have to add ' -mca btl ^openib' to the mpiexec command when running MAKER.

Example-->
  mpiexec -mca btl ^openib -n 20 maker

Thanks,
Carson

From barry.moore at genetics.utah.edu Wed Oct 2 11:21:29 2013
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Wed, 2 Oct 2013 11:21:29 -0600
Subject: [maker-devel] maker2 scripts for functional annotation
In-Reply-To:
References:
Message-ID:

Hi Xia,

You can use EST evidence from other strains/species. The considerations are as follows:

When you pass EST (or mRNA-seq based transcript) evidence, MAKER uses blastn to align it with alignment parameters that assume the sequences will align easily with almost perfect identity - this is fast.
If you use EST/mRNA evidence from another species MAKER uses tblastx to compare the RNA evidence to the assembly, both translated into protein space - this works fine, but it is slow and often doesn't get you much more evidence than you could have gotten from proteins as evidence. If you believe that you're other strain should have very little divergence in sequence at the nt level then you could experiment with passing EST/mRNA evidence as est in the control file. Consider this an experiment, do it on a few contigs and examine the results carefully and skeptically. If it works well, that would be the way to go because you've get better UTRs, but if it doesn't you'll see that you get little support for models because sequence divergence was too much for good alignment. In that case you'll need to use alt_est and this will be slower and you'll likely get little in the way of support for UTRs - in this case as well, do a few contigs and examine the output carefully to see if the additional alignment time with tblastx is worth it in your case. Barry On Oct 2, 2013, at 8:40 AM, wrote: > Hi Carson, > > Thank you for your quick and kind reply. One more question here. If we don't have EST evidence for the strain we sequenced/assembled, will it be ok to provide some EST evidence that is from some other strains which are close to the one we are trying to annotate? Could you please let us know how this will affect performance of maker2? > > Thank you again. > > Best, > Xia > > -----Original Message----- > From: Carson Holt [mailto:carsonhh at gmail.com] > Sent: Wednesday, October 02, 2013 10:15 AM > To: CAO, XIA > Cc: myandell at genetics.utah.edu; RIVERA, CORBAN GREGORY; maker-devel at yandell-lab.org > Subject: Re: [maker-devel] maker2 scripts for functional annotation > > It still works, but it will be reduced. Without ESTs you won't get any UTR prediction, also you will be limited by how well the protein set matches to the genes. 
If you use the protein set of a close relative you will capture most things. You will probably capture 80-95% of what you would by also including ESTs. It all depending on how different the species in the proteins evidence file are compared to the the species being annotated. > > --Carson > > On 10/1/13 3:00 PM, "Xia.Cao at dupont.com" wrote: > >> Hi Carson, >> >> Thank you for your message and your kind help. Now conversion from >> match/match_part to gene/mRNA/exons/CDS works well after blanking out >> some options in control file. >> >> I have one more question about maker2. In case we don't have EST >> evidence (set of ESTs or assembled mRNA-seq in fasta format) for a >> genome, does >> maker2 function? If it does function, could you please let me know the >> performance of maker2 without providing EST evidence compared to the >> one with EST evidence? >> >> Thank you again and I look forward to hearing from you. >> >> Best, >> Xia >> >> >> >> >> -----Original Message----- >> From: Carson Holt [mailto:carsonhh at gmail.com] >> Sent: Wednesday, September 25, 2013 10:36 AM >> To: CAO, XIA; myandell at genetics.utah.edu; RIVERA, CORBAN GREGORY; >> maker-devel at yandell-lab.org >> Subject: Re: [maker-devel] maker2 scripts for functional annotation >> >> If it is launching predictors then you have snap hmm or >> augustus_species set. You ned to blank out all other options in the >> control files (including repeat masking options, proteins, ESTs, etc.) >> when trying to convert mathc/match_part to gene/mRNA/exons/CDS, or else >> those other programs will run. >> >> --Carson >> >> >> On 9/25/13 10:31 AM, "Xia.Cao at dupont.com" wrote: >> >>> Hi Carson, >>> >>> Thank you for the message and your kind help. We tested maker2 by >>> setting keep_preds=1, pred_gff=generated_gff_file_from_first_makerRun . >>> But it seemed maker2 started to launch all predictors again and it >>> took long time to finish. 
I wonder if there is any way that we can >>> directly get gene/mRNA/exons/CDS gff file without re-running maker2 to >>> convert match/match_part features into gene/mRNA/exons/CDS. >>> >>> Thanks, >>> Xia >>> >>> -----Original Message----- >>> From: Carson Holt [mailto:carsonhh at gmail.com] >>> Sent: Thursday, September 19, 2013 5:58 PM >>> To: Mark Yandell; CAO, XIA; RIVERA, CORBAN GREGORY; >>> maker-devel at yandell-lab.org >>> Subject: Re: [maker-devel] maker2 scripts for functional annotation >>> >>> Hello Corban & Xia, >>> >>> Some scripts like gff3_preds2models are deprecated. To get the same >>> result as was offered by gff3_preds2models, just give your >>> match/match_part features to pref_gff= in the maker_opts.ctl file, set >>> keep_preds=1, and run with all other options and predictors turned off. >>> The final MAKER result will be your match/match_part features >>> converted into gene/mRNA/exons/CDS. >>> >>> For functional annotation, you can use Interproscan, BLASTP against >>> UniProt, or BALST2GO. My preference is to use InterProScan to add GO >>> terms and proteins domains via the ipr_update_gff and iprscan2gff3 >>> scripts. Then add putative gene functions via BLASTP to UniProt and >>> maker_functional_fasta and maker_functional_gff scripts. >>> >>> Go ahead and take a look and that those tools and let me know if you >>> have any questions or need help you configuring them. >>> >>> Thanks, >>> Carson >>> >>> >>> On 9/19/13 11:53 AM, "Mark Yandell" wrote: >>> >>>> Hi Corban & Xia, >>>> >>>> >>>> I've forwarded your question along to the MAKER_dev list, were you >>>> can get speedy answers to your maker related questions. Thanks for >>>> using MAKER. >>>> >>>> --mark >>>> >>>> >>>> Mark Yandell >>>> Professor of Human Genetics >>>> H.A. 
& Edna Benning Presidential Endowed Chair Eccles Institute of >>>> Human Genetics University of Utah >>>> 15 North 2030 East, Room 2100 >>>> Salt Lake City, UT 84112-5330 >>>> ph:801-587-7707 >>>> >>>> ________________________________________ >>>> From: Xia.Cao at dupont.com [Xia.Cao at dupont.com] >>>> Sent: Thursday, September 19, 2013 11:49 AM >>>> To: Mark Yandell; Corban-Gregory.Rivera at dupont.com >>>> Subject: maker2 scripts for functional annotation >>>> >>>> Dr. Yandell, >>>> >>>> We were recently evaluating maker2 for annotation and going through >>>> the maker tutorial from 2012. >>>> >>>> http://gmod.org/wiki/MAKER_Tutorial_2012 >>>> >>>> The tutorial makes references to some scripts that we couldn?t find >>>> in the current release. We were looking for scripts like >>>> gff3_preds2models to convert match/match_part format into annotations >>>> with gene/mRNA/exons/CDS and others. I was wondering if maybe we did >>>> not have the most up to date version. >>>> >>>> In addition to getting accurate gene annotations, I was looking for a >>>> solution to get functional assignments. I see that there are some >>>> scripts like maker_functional_fasta that may help, but I was >>>> wondering what you would recommend. >>>> >>>> Thanks, >>>> >>>> Corban & Xia >>>> >>>> This communication is for use by the intended recipient and contains >>>> information that may be Privileged, confidential or copyrighted under >>>> applicable law. If you are not the intended recipient, you are hereby >>>> formally notified that any use, copying or distribution of this >>>> e-mail, in whole or in part, is strictly prohibited. Please notify >>>> the sender by return e-mail and delete this e-mail from your system. >>>> Unless explicitly and conspicuously designated as "E-Contract >>>> Intended", this e-mail does not constitute a contract offer, a >>>> contract amendment, or an acceptance of a contract offer. 
This e-mail >>>> does not constitute a consent to the use of sender's contact >>>> information for direct marketing purposes or for transfers of data to >>>> third parties. >>>> >>>> The dupont.com web address will continue in use for a transitional >>>> period for communications sent or received on behalf of DuPont >>>> Performance Coatings., which is not affiliated in any way with the >>>> DuPont Company. >>>> >>>> Francais Deutsch Italiano Espanol Portugues Japanese Chinese >>>> Korean >>>> >>>> http://www.DuPont.com/corp/email_disclaimer.html >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.o >>>> r >>>> g >>> >>> >>> >>> This communication is for use by the intended recipient and contains >>> information that may be Privileged, confidential or copyrighted under >>> applicable law. If you are not the intended recipient, you are hereby >>> formally notified that any use, copying or distribution of this >>> e-mail, in whole or in part, is strictly prohibited. Please notify the >>> sender by return e-mail and delete this e-mail from your system. >>> Unless explicitly and conspicuously designated as "E-Contract >>> Intended", this e-mail does not constitute a contract offer, a >>> contract amendment, or an acceptance of a contract offer. This e-mail >>> does not constitute a consent to the use of sender's contact >>> information for direct marketing purposes or for transfers of data to third parties. >>> >>> The dupont.com web address will continue in use >>> for a transitional period for communications sent or received on >>> behalf of DuPont Performance Coatings., which is not affiliated in any >>> way with the DuPont Company. 
>>> >>> Francais Deutsch Italiano Espanol Portugues Japanese Chinese >>> Korean >>> >>> http://www.DuPont.com/corp/email_disclaimer.html >>> >> >> >> >> This communication is for use by the intended recipient and contains >> information that may be Privileged, confidential or copyrighted under >> applicable law. If you are not the intended recipient, you are hereby >> formally notified that any use, copying or distribution of this e-mail, >> in whole or in part, is strictly prohibited. Please notify the sender >> by return e-mail and delete this e-mail from your system. Unless >> explicitly and conspicuously designated as "E-Contract Intended", this >> e-mail does not constitute a contract offer, a contract amendment, or >> an acceptance of a contract offer. This e-mail does not constitute a >> consent to the use of sender's contact information for direct marketing >> purposes or for transfers of data to third parties. >> >> The dupont.com web address will continue in use for >> a transitional period for communications sent or received on behalf of >> DuPont Performance Coatings., which is not affiliated in any way with >> the DuPont Company. >> >> Francais Deutsch Italiano Espanol Portugues Japanese Chinese >> Korean >> >> http://www.DuPont.com/corp/email_disclaimer.html >> > > > > This communication is for use by the intended recipient and contains > information that may be Privileged, confidential or copyrighted under > applicable law. If you are not the intended recipient, you are hereby > formally notified that any use, copying or distribution of this e-mail, > in whole or in part, is strictly prohibited. Please notify the sender by > return e-mail and delete this e-mail from your system. Unless explicitly > and conspicuously designated as "E-Contract Intended", this e-mail does > not constitute a contract offer, a contract amendment, or an acceptance > of a contract offer. 
This e-mail does not constitute a consent to the > use of sender's contact information for direct marketing purposes or for > transfers of data to third parties. > > The dupont.com web address will continue in use for a > transitional period for communications sent or received on behalf of DuPont > Performance Coatings., which is not affiliated in any way with the DuPont Company. > > Francais Deutsch Italiano Espanol Portugues Japanese Chinese Korean > > http://www.DuPont.com/corp/email_disclaimer.html > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org Barry Moore Research Scientist Dept. of Human Genetics University of Utah Salt Lake City, UT 84112 -------------------------------------------- (801) 585-3543 -------------- next part -------------- An HTML attachment was scrubbed... URL: From Xia.Cao at dupont.com Wed Oct 2 11:51:26 2013 From: Xia.Cao at dupont.com (Xia.Cao at dupont.com) Date: Wed, 2 Oct 2013 17:51:26 +0000 Subject: [maker-devel] maker2 scripts for functional annotation In-Reply-To: References: Message-ID: Hi Barry, Thank you very much for the message and suggestions. It's really helpful to us. We will do some tests to see how we can take advantage of Maker2 to produce high-quality gene models from our assembled genome. Thank you again, Xia From: Barry Moore [mailto:barry.moore at genetics.utah.edu] Sent: Wednesday, October 02, 2013 1:21 PM To: CAO, XIA Cc: carsonhh at gmail.com; maker-devel at yandell-lab.org; RIVERA, CORBAN GREGORY Subject: Re: [maker-devel] maker2 scripts for functional annotation Hi Xia, You can use EST evidence from other strains/species. The considerations are as follows: When you pass EST (or mRNASeq based transcripts) evidence MAKER uses blastn to align those with alignment parameters that assume they sequences will align easily with almost perfect identity - this is fast. 
If you use EST/mRNA evidence from another species MAKER uses tblastx to compare the RNA evidence to the assembly, both translated into protein space - this works fine, but it is slow and often doesn't get you much more evidence than you could have gotten from proteins as evidence. If you believe that you're other strain should have very little divergence in sequence at the nt level then you could experiment with passing EST/mRNA evidence as est in the control file. Consider this an experiment, do it on a few contigs and examine the results carefully and skeptically. If it works well, that would be the way to go because you've get better UTRs, but if it doesn't you'll see that you get little support for models because sequence divergence was too much for good alignment. In that case you'll need to use alt_est and this will be slower and you'll likely get little in the way of support for UTRs - in this case as well, do a few contigs and examine the output carefully to see if the additional alignment time with tblastx is worth it in your case. Barry On Oct 2, 2013, at 8:40 AM, > wrote: Hi Carson, Thank you for your quick and kind reply. One more question here. If we don't have EST evidence for the strain we sequenced/assembled, will it be ok to provide some EST evidence that is from some other strains which are close to the one we are trying to annotate? Could you please let us know how this will affect performance of maker2? Thank you again. Best, Xia -----Original Message----- From: Carson Holt [mailto:carsonhh at gmail.com] Sent: Wednesday, October 02, 2013 10:15 AM To: CAO, XIA Cc: myandell at genetics.utah.edu; RIVERA, CORBAN GREGORY; maker-devel at yandell-lab.org Subject: Re: [maker-devel] maker2 scripts for functional annotation It still works, but it will be reduced. Without ESTs you won't get any UTR prediction, also you will be limited by how well the protein set matches to the genes. 
If you use the protein set of a close relative you will capture most things. You will probably capture 80-95% of what you would by also including ESTs. It all depending on how different the species in the proteins evidence file are compared to the the species being annotated. --Carson On 10/1/13 3:00 PM, "Xia.Cao at dupont.com" > wrote: Hi Carson, Thank you for your message and your kind help. Now conversion from match/match_part to gene/mRNA/exons/CDS works well after blanking out some options in control file. I have one more question about maker2. In case we don't have EST evidence (set of ESTs or assembled mRNA-seq in fasta format) for a genome, does maker2 function? If it does function, could you please let me know the performance of maker2 without providing EST evidence compared to the one with EST evidence? Thank you again and I look forward to hearing from you. Best, Xia -----Original Message----- From: Carson Holt [mailto:carsonhh at gmail.com] Sent: Wednesday, September 25, 2013 10:36 AM To: CAO, XIA; myandell at genetics.utah.edu; RIVERA, CORBAN GREGORY; maker-devel at yandell-lab.org Subject: Re: [maker-devel] maker2 scripts for functional annotation If it is launching predictors then you have snap hmm or augustus_species set. You ned to blank out all other options in the control files (including repeat masking options, proteins, ESTs, etc.) when trying to convert mathc/match_part to gene/mRNA/exons/CDS, or else those other programs will run. --Carson On 9/25/13 10:31 AM, "Xia.Cao at dupont.com" > wrote: Hi Carson, Thank you for the message and your kind help. We tested maker2 by setting keep_preds=1, pred_gff=generated_gff_file_from_first_makerRun . But it seemed maker2 started to launch all predictors again and it took long time to finish. I wonder if there is any way that we can directly get gene/mRNA/exons/CDS gff file without re-running maker2 to convert match/match_part features into gene/mRNA/exons/CDS. 
Thanks, Xia -----Original Message----- From: Carson Holt [mailto:carsonhh at gmail.com] Sent: Thursday, September 19, 2013 5:58 PM To: Mark Yandell; CAO, XIA; RIVERA, CORBAN GREGORY; maker-devel at yandell-lab.org Subject: Re: [maker-devel] maker2 scripts for functional annotation Hello Corban & Xia, Some scripts like gff3_preds2models are deprecated. To get the same result as was offered by gff3_preds2models, just give your match/match_part features to pref_gff= in the maker_opts.ctl file, set keep_preds=1, and run with all other options and predictors turned off. The final MAKER result will be your match/match_part features converted into gene/mRNA/exons/CDS. For functional annotation, you can use Interproscan, BLASTP against UniProt, or BALST2GO. My preference is to use InterProScan to add GO terms and proteins domains via the ipr_update_gff and iprscan2gff3 scripts. Then add putative gene functions via BLASTP to UniProt and maker_functional_fasta and maker_functional_gff scripts. Go ahead and take a look and that those tools and let me know if you have any questions or need help you configuring them. Thanks, Carson On 9/19/13 11:53 AM, "Mark Yandell" > wrote: Hi Corban & Xia, I've forwarded your question along to the MAKER_dev list, were you can get speedy answers to your maker related questions. Thanks for using MAKER. --mark Mark Yandell Professor of Human Genetics H.A. & Edna Benning Presidential Endowed Chair Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:801-587-7707 ________________________________________ From: Xia.Cao at dupont.com [Xia.Cao at dupont.com] Sent: Thursday, September 19, 2013 11:49 AM To: Mark Yandell; Corban-Gregory.Rivera at dupont.com Subject: maker2 scripts for functional annotation Dr. Yandell, We were recently evaluating maker2 for annotation and going through the maker tutorial from 2012. 
http://gmod.org/wiki/MAKER_Tutorial_2012 The tutorial makes references to some scripts that we couldn?t find in the current release. We were looking for scripts like gff3_preds2models to convert match/match_part format into annotations with gene/mRNA/exons/CDS and others. I was wondering if maybe we did not have the most up to date version. In addition to getting accurate gene annotations, I was looking for a solution to get functional assignments. I see that there are some scripts like maker_functional_fasta that may help, but I was wondering what you would recommend. Thanks, Corban & Xia This communication is for use by the intended recipient and contains information that may be Privileged, confidential or copyrighted under applicable law. If you are not the intended recipient, you are hereby formally notified that any use, copying or distribution of this e-mail, in whole or in part, is strictly prohibited. Please notify the sender by return e-mail and delete this e-mail from your system. Unless explicitly and conspicuously designated as "E-Contract Intended", this e-mail does not constitute a contract offer, a contract amendment, or an acceptance of a contract offer. This e-mail does not constitute a consent to the use of sender's contact information for direct marketing purposes or for transfers of data to third parties. The dupont.com web address will continue in use for a transitional period for communications sent or received on behalf of DuPont Performance Coatings., which is not affiliated in any way with the DuPont Company. 
Francais Deutsch Italiano Espanol Portugues Japanese Chinese Korean http://www.DuPont.com/corp/email_disclaimer.html _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.o r g This communication is for use by the intended recipient and contains information that may be Privileged, confidential or copyrighted under applicable law. If you are not the intended recipient, you are hereby formally notified that any use, copying or distribution of this e-mail, in whole or in part, is strictly prohibited. Please notify the sender by return e-mail and delete this e-mail from your system. Unless explicitly and conspicuously designated as "E-Contract Intended", this e-mail does not constitute a contract offer, a contract amendment, or an acceptance of a contract offer. This e-mail does not constitute a consent to the use of sender's contact information for direct marketing purposes or for transfers of data to third parties. The dupont.com web address will continue in use for a transitional period for communications sent or received on behalf of DuPont Performance Coatings., which is not affiliated in any way with the DuPont Company. Francais Deutsch Italiano Espanol Portugues Japanese Chinese Korean http://www.DuPont.com/corp/email_disclaimer.html This communication is for use by the intended recipient and contains information that may be Privileged, confidential or copyrighted under applicable law. If you are not the intended recipient, you are hereby formally notified that any use, copying or distribution of this e-mail, in whole or in part, is strictly prohibited. Please notify the sender by return e-mail and delete this e-mail from your system. Unless explicitly and conspicuously designated as "E-Contract Intended", this e-mail does not constitute a contract offer, a contract amendment, or an acceptance of a contract offer. 
This e-mail does not constitute a consent to the use of sender's contact information for direct marketing purposes or for transfers of data to third parties. The dupont.com web address will continue in use for a transitional period for communications sent or received on behalf of DuPont Performance Coatings., which is not affiliated in any way with the DuPont Company. Francais Deutsch Italiano Espanol Portugues Japanese Chinese Korean http://www.DuPont.com/corp/email_disclaimer.html This communication is for use by the intended recipient and contains information that may be Privileged, confidential or copyrighted under applicable law. If you are not the intended recipient, you are hereby formally notified that any use, copying or distribution of this e-mail, in whole or in part, is strictly prohibited. Please notify the sender by return e-mail and delete this e-mail from your system. Unless explicitly and conspicuously designated as "E-Contract Intended", this e-mail does not constitute a contract offer, a contract amendment, or an acceptance of a contract offer. This e-mail does not constitute a consent to the use of sender's contact information for direct marketing purposes or for transfers of data to third parties. The dupont.com web address will continue in use for a transitional period for communications sent or received on behalf of DuPont Performance Coatings., which is not affiliated in any way with the DuPont Company. Francais Deutsch Italiano Espanol Portugues Japanese Chinese Korean http://www.DuPont.com/corp/email_disclaimer.html _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org Barry Moore Research Scientist Dept. 
of Human Genetics University of Utah Salt Lake City, UT 84112 -------------------------------------------- (801) 585-3543

From cjfields at illinois.edu Thu Oct 3 12:44:07 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 3 Oct 2013 18:44:07 +0000
Subject: [maker-devel] NFSLock problem
Message-ID:

I have a MAKER job running that seems to be stalled on a failed scaffold. It's running via MPI (MAKER v2.28, openMPI 1.6.3), which appears to have worked successfully for the most part. This is a run that only uses transcriptome and protein information in order to get a decent dataset of

The failed scaffold seems to be holding the job from completion.
There do seem to be changes, but mainly they are to the NFSLock files:

./.NFSLock.gi_lock.NFSLock
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.281.282.junction.blastn.holdover.NFSLock
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.282.start.blastn.holdover.NFSLock
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.281.end.blastn.holdover.NFSLock
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.282.start.blastn.holdover.NFSLock.STACK
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.281.end.blastn.holdover.NFSLock.STACK

Everything else seems to have completed.

I have seen a few issues re: NFS locking problems; would this be related? Should I stop the job?

We're running GPFS for our NFS. Here's 'mount':

-system-specific-4.1$ mount
/dev/sda5 on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda1 on /boot type ext4 (rw)
/dev/sdb1 on /export type ext4 (rw)
/dev/sda2 on /var type ext4 (rw)
tmpfs on /var/lib/ganglia/rrds type tmpfs (rw,size=6180842000,gid=99,uid=99)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)
/dev/IGBHOME0 on /home type gpfs (rw,mtime,dev=IGBHOME0)
128.174.124.79:/shares/group on /archive/group type nfs (rw,sync,hard,intr,retrans=10,timeo=300,rsize=65536,wsize=1048576,vers=3,proto=tcp,mountproto=tcp,addr=128.174.124.79)
128.174.124.79:/shares/CBC on /archive/CBC type nfs (rw,sync,hard,intr,retrans=10,timeo=300,rsize=65536,wsize=1048576,vers=3,proto=tcp,mountproto=tcp,addr=128.174.124.79)

chris

From carsonhh at gmail.com Thu Oct 3 12:45:37 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 03 Oct 2013 14:45:37 -0400
Subject: [maker-devel] NFSLock problem
In-Reply-To:
Message-ID:

Stop the job and restart it. No need to delete anything. Just restart it.

Thanks,
Carson

From cjfields at illinois.edu Thu Oct 3 20:45:29 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Fri, 4 Oct 2013 02:45:29 +0000
Subject: [maker-devel] NFSLock problem
In-Reply-To:
References:
Message-ID:

Carson,

It took a couple of restarts, but the job finally processed through. I saw a prior email on the mailing list where you mentioned this could be due to failed jobs hanging things up; I may set up my next job to leave one processor free for logging into the nodes in case there is a similar problem, and maybe try to track this down.

chris

From felix.bemm at uni-wuerzburg.de Fri Oct 4 01:10:08 2013
From: felix.bemm at uni-wuerzburg.de (Felix Bemm)
Date: Fri, 04 Oct 2013 09:10:08 +0200
Subject: [maker-devel] Contigs failing - could not make datastore directory
Message-ID: <524E69D0.1090905@uni-wuerzburg.de>

Hi,

I have a problem with contigs failing like this:

examining contents of the fasta file and run log
ERROR: could not make datastore directory
--> rank=11, hostname=r5n04
ERROR: Failed while examining contents of the fasta file and run log
ERROR: Chunk failed at level:0, tier_type:0
FAILED CONTIG:MA_gen_v8.05_c31675

There is enough space available on both the storage device and the tmp device that is being used.
The datastore log file looks like this:

MA_gen_v8.05_c1149 FAILED
MA_gen_v8.05_c1153 FAILED
MA_gen_v8.05_c1154 FAILED
MA_gen_v8.05_c1155 FAILED
MA_gen_v8.05_c1156 FAILED

Since there is no datastore for these contigs, I cannot provide any more information. Approx. 1/3 of the contigs were annotated; after that MAKER started to fail. It kept running for a while and finished some of the contigs on the second or third try, like this:

MA_gen_v8.05_c4443 FAILED
MA_gen_v8.05_c4443 FAILED
MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ STARTED
MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ FINISHED

Any ideas what could cause these problems?

Best regards
Felix

--
Felix Bemm
Department of Bioinformatics
University of Würzburg, Germany
Tel: +49 931 - 31 83696
Fax: +49 931 - 31 84552
felix.bemm at uni-wuerzburg.de

From carsonhh at gmail.com Fri Oct 4 12:15:35 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 04 Oct 2013 14:15:35 -0400
Subject: [maker-devel] Contigs failing - could not make datastore directory
In-Reply-To: <524E69D0.1090905@uni-wuerzburg.de>
Message-ID:

Let me know what you find. With some failures it's hard to identify the problem, especially with long-running processes, which is why I made MAKER restartable, so you can at least get completion by just launching again.

--Carson

From carsonhh at gmail.com Fri Oct 4 12:21:42 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 04 Oct 2013 14:21:42 -0400
Subject: [maker-devel] Contigs failing - could not make datastore directory
In-Reply-To: <524E69D0.1090905@uni-wuerzburg.de>
Message-ID:

Hi Felix,

There are other potential issues as well, for example a file quota limit set by your administrator, or a full local /tmp, which is unique to each node and cannot be queried directly from the root node on a cluster. On a cluster, one node may also not have the NFS location mounted. Or it could be system related; for example, on Ibrix NFS-mounted systems you can get a weird filesystem issue when using MAKER, with ghost files and directories that can neither be created nor deleted. Try running the failed contigs in a different directory (even if it's on the same disk). Also, what version of MAKER are you using?
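
The per-node checks listed above (free space, the ability to create directories, and whether the location exists on that node at all) can be sketched roughly as follows. This is a minimal sketch, not part of MAKER: `check_dir` and the `maker_probe_` prefix are hypothetical names, and the script assumes a POSIX system. It would need to be run on each compute node, since a local /tmp differs from node to node.

```python
import os
import shutil
import socket
import tempfile


def check_dir(path):
    """Report free space, free inodes, and whether a directory can be created in `path`.

    Hypothetical helper for diagnosing "could not make datastore directory";
    it only inspects the node it runs on.
    """
    report = {"host": socket.gethostname(), "path": path, "exists": os.path.isdir(path)}
    if not report["exists"]:
        # e.g. the NFS location is not mounted on this node
        report.update(free_bytes=0, free_inodes=0, writable=False)
        return report
    report["free_bytes"] = shutil.disk_usage(path).free
    # POSIX only; some filesystems report 0 here even when inodes are available
    report["free_inodes"] = os.statvfs(path).f_favail
    try:
        # Creating and removing a directory also trips over quota limits
        probe = tempfile.mkdtemp(prefix="maker_probe_", dir=path)
        os.rmdir(probe)
        report["writable"] = True
    except OSError:
        report["writable"] = False
    return report


print(check_dir(tempfile.gettempdir()))
```

Passing the MAKER output directory and the temporary directory in turn, on every node in the job, should show whether one host differs from the rest.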
--Carson

From felix.bemm at uni-wuerzburg.de Sat Oct 5 08:09:15 2013
From: felix.bemm at uni-wuerzburg.de (Felix Bemm)
Date: Sat, 05 Oct 2013 16:09:15 +0200
Subject: [maker-devel] Contigs failing - could not make datastore directory
In-Reply-To:
References:
Message-ID: <52501D8B.2060301@uni-wuerzburg.de>

Hi Carson,

there is definitely no quota, nor a full local tmp device. I changed to another tmp device, which unfortunately resides on NFS, but now it seems to work. I am using the latest version of MAKER (2.28). The complete job is running on a single machine with 32 MPI threads.

Bests
Felix

--
Felix Bemm
Department of Bioinformatics
University of Würzburg, Germany
Tel: +49 931 - 31 83696
Fax: +49 931 - 31 84552
felix.bemm at uni-wuerzburg.de

From jfierst at uoregon.edu Fri Oct 4 11:06:36 2013
From: jfierst at uoregon.edu (Janna Fierst)
Date: Fri, 4 Oct 2013 10:06:36 -0700
Subject: [maker-devel] Fwd: exon/intron boundaries
In-Reply-To:
References:
Message-ID:

Hi, thanks for your reply.

I have been going through our annotations in detail and trying different parameter sets, and I think I have identified what is going on, but I'm not sure how to set the MAKER2 parameters for our situation. We are working with a species of Caenorhabditid worm, and there are long gene-dense blocks that are being incorrectly annotated as large single genes instead of several smaller, closely spaced genes. The protein alignments (tblastx and protein2genome) show very clearly where the exon/intron boundaries are and in most cases agree with the Augustus predictions. The assembled Cufflinks output (through blastn, est2genome and est_gff:cufflinks) does not agree in some locations; I think this may be because in some cases the UTR nearly overlaps adjacent genes.
I have included a screenshot of an annotated region viewed in Apollo to try to show this. The large gene in the middle is actually 7 different genes that are extremely close together, and MAKER2 is collapsing them into a single gene. I tried running without any RNA-seq/Cufflinks data, and MAKER2 annotates the region as two genes instead of one, but I can't get it to recognize the 7 as different genes.

I have retrained SNAP, but we have not been able to successfully train Augustus; we are currently using the default caenorhabditis species model. I also included a species-specific repeat library. I tried setting correct_est_fusion=1 and reducing pred_flank, but these changes appear to really alter the annotations and we end up annotating almost nothing. I also tried setting est2genome=0 to decrease the influence of the Cufflinks assembly, but it didn't appear to help. There are some very large introns in these genes, so I haven't yet tried decreasing the maximum intron size because I'm concerned this may generate too many split genes instead of our current merged gene problem.

Thanks for your help, any advice is greatly appreciated!
-Janna Fierst

On Mon, Aug 26, 2013 at 12:21 PM, Carson Holt wrote:

> Are you getting gene fusions or just more exons? Gene fusions can be
> reduced by setting correct_est_fusion=1, or reducing pred_flank, although
> reducing pred_flank can cause other issues (but those generally only appear
> if setting the value below 150). Also, if you have the maximum intron
> size set too high (split_hit option), you may also be generating bridging
> alignments that make evidence align across distant paralogous genes as well
> (this can result in gene merging).
>
> You should also look at your results manually in a viewer like Apollo.
> Then see if the extra exons are supported by something such as protein
> alignments from another species. If this is the case, you may have a
> poorly annotated protein set that is being used as evidence that is
> carrying over its erroneous exons into the species you are annotating. If
> the extra exons are supported by EST evidence, then perhaps you should try
> and rebuild the EST assembly (for example, Trinity has an option to use a
> Jaccard similarity coefficient to avoid fusing transcripts).
>
> Another option is to retrain SNAP or Augustus. MAKER does not actually
> produce any of the models itself (it is a pipeline, not a predictor). The
> models are all generated using these other algorithms; MAKER just feeds
> them hints based on protein and transcript alignments, so making sure
> training is sufficient is important for those programs to produce their
> best models.
>
> Finally, make sure your repeat database is sufficient; you may need to
> generate a species-specific repeat library using something like
> RepeatModeler. Repeats can end up being included as extra exons in gene
> models because they may contain reading frames that do code for proteins
> (i.e. reverse transcriptases).
>
> If you have any questions on any of the above, just let us know.
>
> Thanks,
> Carson
>
> From: Janna Fierst
> Date: Monday, August 26, 2013 2:54 PM
> To:
> Subject: [maker-devel] exon/intron boundaries
>
> Hi,
>
> I am using MAKER 2.28 to annotate a Caenorhabditid worm genome, and the
> initial results appear fairly good, but we seem to be annotating too many
> exons for multiple genes. I was wondering which parameters should be tuned
> to change the threshold for exon/intron boundaries? Thanks for your help
> -Janna Fierst

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 1.tiff
Type: image/tiff
Size: 115362 bytes
Desc: not available
URL:

From myandell at genetics.utah.edu Mon Oct 7 05:36:53 2013
From: myandell at genetics.utah.edu (Mark Yandell)
Date: Mon, 7 Oct 2013 11:36:53 +0000
Subject: [maker-devel] maker apollo error
In-Reply-To: <8687323557146056ef885dd8a1bf2ea4@buzonweb.us.es>
References: <8687323557146056ef885dd8a1bf2ea4@buzonweb.us.es>
Message-ID: <7A60AB257EFF2B48B1F4C814817EA05365E66B3C@mxb2.hg.genetics.utah.edu>

Hi Gabriel,

I've forwarded your request on to the MAKER email list...

cheers,
--mark

Mark Yandell
Professor of Human Genetics
H.A. & Edna Benning Presidential Endowed Chair
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
ph:801-587-7707
________________________________________
From: ggpozo at us.es [ggpozo at us.es]
Sent: Monday, October 07, 2013 2:49 AM
To: Mark Yandell
Subject: maker apollo error

Dear Dr.
Yandell,

First, thank you very much for providing us with such a great tool as MAKER. I am trying to install it on my Linux system. Everything is OK; however, when I try to install Apollo with ./Build apollo:

Downloading apollo...
--2013-10-07 10:45:37-- http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar
Resolving gmod.svn.sourceforge.net... 216.34.181.65, 216.34.181.177
Connecting to gmod.svn.sourceforge.net|216.34.181.65|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://sourceforge.net/p/gmod/svn/ [following]
--2013-10-07 10:45:38-- http://sourceforge.net/p/gmod/svn/
Resolving sourceforge.net... 216.34.181.60
Connecting to sourceforge.net|216.34.181.60|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://sourceforge.net/p/gmod/svn/HEAD/tree/ [following]
--2013-10-07 10:45:38-- http://sourceforge.net/p/gmod/svn/HEAD/tree/
Connecting to sourceforge.net|216.34.181.60|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 37730 (37K) [text/html]
Saving to: `/home/maker/src/../exe/apollo.tar.gz'

100%[======================================>] 37,730 92.1K/s in 0.4s

2013-10-07 10:45:40 (92.1 KB/s) - `/home/maker/src/../exe/apollo.tar.gz' saved [37730/37730]

Unpacking apollo tarball...

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now

ERROR: Failed installing apollo, now cleaning installation path...
You may need to install apollo manually.

It seems that the URL in the locations file is wrong. Apollo has changed its version and is now integrated into WebApollo; maybe this is the problem.

I would greatly appreciate any help you can give me. Thank you very much.

Dr. Gabriel Gutierrez
Department of Genetics
University of Seville
Spain

From carsonhh at gmail.com Mon Oct 7 07:20:17 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 07 Oct 2013 09:20:17 -0400
Subject: [maker-devel] Fwd: exon/intron boundaries
In-Reply-To:
Message-ID:

Hi Janna,

There are a couple of things to do. Download maker-2.29p-beta from the lab website. This includes some changes made to improve the performance of correct_est_fusion. Using the correct_est_fusion option also reduces the influence of ESTs on annotation, so normally you would need to increase your protein dataset, but given that you have already supplied 3 nematode species and all of UniProt, you should probably be fine.

There is one more thing you can do. Instead of using Cufflinks, try using Trinity to assemble the ESTs. There is a Jaccard clip option that reduces merging caused by overlapping UTR. Between Trinity and correct_est_fusion, you should be able to really reduce the effect of those ESTs.

If those changes don't work, there is one last option. If you take the MAKER results and filter them for the SNAP and Augustus ab initio results (match/match_part in the GFF3), then you can pass those in to the pred_gff option. Then turn snaphmm and augustus_species off in the control files. Basically, this turns MAKER's hint-based prediction off and forces it to filter the ab initio results and select models directly from there. Since the merging is being caused by bad hints (merged transcripts from mRNA-seq), this would reduce that effect. You will still need correct_est_fusion=1, though, to trim UTR coming from the merged transcripts, because even though MAKER can't process hints to rerun SNAP and Augustus this way, it will try to add UTR using the EST evidence.
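
The filtering step described above (keeping only the SNAP and Augustus ab initio match/match_part features from the MAKER GFF3) can be sketched as below. This is a rough illustration, not a MAKER script: `filter_ab_initio` is a hypothetical name, and it assumes the ab initio predictions carry a source column (column 2) beginning with "snap" or "augustus", which should be checked against the actual output file.

```python
def filter_ab_initio(lines, sources=("snap", "augustus")):
    """Yield GFF3 match/match_part lines whose source column names an ab initio predictor.

    Assumption: the source column starts with one of `sources`
    (for example "snap_masked"); adjust to match the real file.
    """
    for line in lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more feature lines
        if line.startswith("#") or not line.strip():
            continue  # skip comments, directives, and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed GFF3 feature line
        source, feature_type = cols[1].lower(), cols[2]
        if feature_type in ("match", "match_part") and source.startswith(tuple(sources)):
            yield line


# Small made-up example; the IDs and coordinates are illustrative only.
demo = [
    "##gff-version 3\n",
    "\t".join(["ctg1", "snap_masked", "match", "100", "900", ".", "+", ".", "ID=m1"]) + "\n",
    "\t".join(["ctg1", "snap_masked", "match_part", "100", "400", ".", "+", ".", "ID=m1:p1;Parent=m1"]) + "\n",
    "\t".join(["ctg1", "est2genome", "expressed_sequence_match", "90", "950", ".", "+", ".", "ID=e1"]) + "\n",
    "\t".join(["ctg1", "maker", "gene", "90", "950", ".", "+", ".", "ID=g1"]) + "\n",
]
kept = list(filter_ab_initio(demo))
print("".join(kept), end="")
```

For a real run, the same function could read the merged MAKER GFF3 from an open file handle and write the kept lines to a new file, which would then be the file named in pred_gff.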
--Carson

From: Janna Fierst
Date: Friday, October 4, 2013 1:06 PM
To:
Subject: [maker-devel] Fwd: exon/intron boundaries

Hi, thanks for your reply. I have been going through our annotations in detail and trying different parameter sets, and I think I have identified what is going on, but I'm not sure how to set the MAKER2 parameters for our situation.

We are working with a species of Caenorhabditid worm, and there are long gene-dense blocks that are being incorrectly annotated as large single genes instead of several smaller, closely spaced genes. The protein alignments (tblastx and protein2genome) show very clearly where the exon/intron boundaries are and in most cases agree with the augustus predictions. The assembled cufflinks output (through blastn, est2genome and est_gff:cufflinks) does not agree in some locations; I think this may be because in some cases the UTR nearly overlaps adjacent genes.

I have included a screenshot of an annotated region viewed in apollo to try to show this. The large gene in the middle is actually 7 different genes that are extremely close together, and MAKER2 is collapsing them into a single gene. I tried running without any RNASeq/cufflinks data, and MAKER2 then annotates the region as two genes instead of one, but I can't get it to recognize the 7 as different genes.

I have retrained SNAP, but we have not been able to successfully train Augustus; we are currently using the default caenorhabditis species model. I also included a species-specific repeat library. I tried setting correct_est_fusion=1 and reducing pred_flank, but these changes appear to really alter the annotations and we end up annotating almost nothing. I also tried setting est2genome=0 to decrease the influence of the cufflinks assembly, but it didn't appear to help. There are some very large introns in these genes, so I haven't yet tried decreasing the maximum intron size, because I'm concerned this may generate too many split genes instead of our current merged gene problem.
Thanks for your help; any advice is greatly appreciated!

-Janna Fierst

On Mon, Aug 26, 2013 at 12:21 PM, Carson Holt wrote:
> Are you getting gene fusions or just more exons? Gene fusions can be reduced
> by setting correct_est_fusion=1 or by reducing pred_flank, although reducing
> pred_flank can cause other issues (those generally only appear when setting
> the value below 150). Also, if you have the maximum intron size set too
> high (split_hit option), you may be generating bridging alignments that
> make evidence align across distant paralogous genes (this can also result
> in gene merging).
>
> You should also look at your results manually in a viewer like Apollo. Then
> see if the extra exons are supported by something such as protein alignments
> from another species. If this is the case, you may have a poorly annotated
> protein set being used as evidence that is carrying its erroneous exons
> over into the species you are annotating. If the extra exons are
> supported by EST evidence, then perhaps you should try to rebuild the EST
> assembly (for example, trinity has an option to use a Jaccard similarity
> coefficient to avoid fusing transcripts).
>
> Another option is to retrain SNAP or Augustus. MAKER does not actually
> produce any of the models itself (it is a pipeline, not a predictor). The
> models are all generated by these other algorithms; MAKER just feeds them
> hints based on protein and transcript alignments, so making sure training is
> sufficient is important for those programs to produce their best models.
>
> Finally, make sure your repeat database is sufficient; you may need to generate
> a species-specific repeat library using something like RepeatModeler. Repeats
> can end up being included as extra exons in gene models because they may
> contain reading frames that do code for proteins (e.g. reverse transcriptases).
>
> If you have any questions on any of the above, just let us know.
>
> Thanks,
> Carson
>
>
> From: Janna Fierst
> Date: Monday, August 26, 2013 2:54 PM
> To:
> Subject: [maker-devel] exon/intron boundaries
>
> Hi,
>
> I am using MAKER 2.28 to annotate a Caenorhabditid worm genome, and the
> initial results appear fairly good, but we seem to be annotating too many
> exons for multiple genes. I was wondering which parameters should be tuned to
> change the threshold for exon/intron boundaries? Thanks for your help -Janna
> Fierst
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From carsonhh at gmail.com Mon Oct 7 08:12:24 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 07 Oct 2013 10:12:24 -0400
Subject: [maker-devel] maker apollo error
In-Reply-To: <7A60AB257EFF2B48B1F4C814817EA05365E66B3C@mxb2.hg.genetics.utah.edu>
Message-ID:

This was a location from GMOD for the older Apollo. It looks like they've dropped it, so it now points to nothing. Perhaps there is another location that still has the old code I can get from Ed.

--Carson

On 10/7/13 7:36 AM, "Mark Yandell" wrote:

>
>Hi Gabriel,
>
>I've forwarded your request on to the MAKER email list...
>
>cheers,
>
>--mark
>
>Mark Yandell
>Professor of Human Genetics
>H.A. & Edna Benning Presidential Endowed Chair
>Eccles Institute of Human Genetics
>University of Utah
>15 North 2030 East, Room 2100
>Salt Lake City, UT 84112-5330
>ph:801-587-7707
>
>________________________________________
>From: ggpozo at us.es [ggpozo at us.es]
>Sent: Monday, October 07, 2013 2:49 AM
>To: Mark Yandell
>Subject: maker apollo error
>
>Dear Dr.
Yandell, > >First, thank you very much for provide us such a great tool as MAKER. I >am traying to install it in my Linux system, everything is ok, however >when I try to instal Apollo with ./Build apollo... > >Downloading apollo... >--2013-10-07 10:45:37-- >http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar >Resolviendo gmod.svn.sourceforge.net... 216.34.181.65, 216.34.181.177 >Connecting to gmod.svn.sourceforge.net|216.34.181.65|:80... conectado. >Petici?n HTTP enviada, esperando respuesta... 302 Found >Localizaci?n: http://sourceforge.net/p/gmod/svn/ [siguiendo] >--2013-10-07 10:45:38-- http://sourceforge.net/p/gmod/svn/ >Resolviendo sourceforge.net... 216.34.181.60 >Connecting to sourceforge.net|216.34.181.60|:80... conectado. >Petici?n HTTP enviada, esperando respuesta... 302 Found >Localizaci?n: http://sourceforge.net/p/gmod/svn/HEAD/tree/ [siguiendo] >--2013-10-07 10:45:38-- http://sourceforge.net/p/gmod/svn/HEAD/tree/ >Connecting to sourceforge.net|216.34.181.60|:80... conectado. >Petici?n HTTP enviada, esperando respuesta... 200 OK >Longitud: 37730 (37K) [text/html] >Saving to: `/home/maker/src/../exe/apollo.tar.gz' > >100%[===================================================================== >=============================================================>] 37.730 > 92,1K/s in 0,4s > >2013-10-07 10:45:40 (92,1 KB/s) - `/home/maker/src/../exe/apollo.tar.gz' >saved [37730/37730] > >Unpacking apollo tarball... > >gzip: stdin: not in gzip format >tar: Child returned status 1 >tar: Error is not recoverable: exiting now > > >ERROR: Failed installing apollo, now cleaning installation path... >You may need to install apollo manually. > > > >It seems that the http direction in locations file is wrong, Apollo has >changed its version and now Apollo is integrated in Apolloweb, may be >this the problem. > > > >I would greatly appreciate any help you can give me. > >Thank you very much. > > > >Dr. 
Gabriel Gutierrez
>
>Department of Genetics
>
>University of Seville
>
>Spain
>
>
>_______________________________________________
>maker-devel mailing list
>maker-devel at box290.bluehost.com
>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From carsonhh at gmail.com Mon Oct 7 08:33:44 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 07 Oct 2013 10:33:44 -0400
Subject: [maker-devel] Contigs failing - could not make datastore directory
In-Reply-To: <52501D8B.2060301@uni-wuerzburg.de>
Message-ID:

You can also try the 2.29 beta if you want (it is on the website). Also, when MAKER exits, it tries to delete anything left behind in temporary folders, so is it possible that it is freeing up space on exit after filling up /tmp?

Given that the issue seems to go away when you switch to another directory, this might be something specific to the current state of your system.

Thanks,
Carson

On 10/5/13 10:09 AM, "Felix Bemm" wrote:

>Hi Carson,
>
>there is definitely neither a quota nor a full local tmp device. I changed
>to another tmp device, which is unfortunately residing on nfs but now it
>seems to work. I am using the latest version of maker (2.28). The
>complete job is running on a single machine with 32 mpi threads.
>
>Bests
>Felix
>
>On 04.10.2013 20:21, Carson Holt wrote:
>> Hi Felix,
>>
>> There are other potential issues as well, for example a file quota limit
>> set by your administrator or full local /tmp which will be unique to each
>> node and cannot be queried directly from the root node on a cluster. On a
>> cluster one node may also not have the NFS location mounted. Or it could
>> be system related, for example on ibrix NFS mounted systems you can get a
>> weird filesystem issue with ghost files and directories when using MAKER
>> that can neither be created nor deleted. Try running the failed contigs
>> in a different directory (even if it's on the same disk).
Also what >> version of MAKER are you using? >> >> --Carson >> >> >> >> On 10/4/13 3:10 AM, "Felix Bemm" wrote: >> >>> Hi, >>> >>> I have a problem with Contigs failing like this: >>> >>> examining contents of the fasta file and run log >>> ERROR: could not make datastore directory >>> --> rank=11, hostname=r5n04 >>> ERROR: Failed while examining contents of the fasta file and run log >>> ERROR: Chunk failed at level:0, tier_type:0 >>> FAILED CONTIG:MA_gen_v8.05_c31675 >>> >>> There is enough space available on both the storage device and the tmp >>> device that is being used. The datastore log files looks like this: >>> >>> MA_gen_v8.05_c1149 FAILED >>> MA_gen_v8.05_c1153 FAILED >>> MA_gen_v8.05_c1154 FAILED >>> MA_gen_v8.05_c1155 FAILED >>> MA_gen_v8.05_c1156 FAILED >>> >>> Since there is no datastore for this contigs, I can not provide any >>>more >>> informationen. Approx. 1/3 of the contigs were annotated, after that >>> maker started to fail. It kept running for a while and finishing some >>>of >>> the contigs in the second or third try like this: >>> >>> MA_gen_v8.05_c4443 FAILED >>> MA_gen_v8.05_c4443 FAILED >>> >>>MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ STA >>>RT >>> ED >>> MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ >>> FINISHED >>> >>> Any ideas what can cause this problems? 
>>>
>>> Best regards
>>> Felix
>>>
>>> --
>>> Felix Bemm
>>> Department of Bioinformatics
>>> University of Würzburg, Germany
>>> Tel: +49 931 - 31 83696
>>> Fax: +49 931 - 31 84552
>>> felix.bemm at uni-wuerzburg.de
>>>
>>> _______________________________________________
>>> maker-devel mailing list
>>> maker-devel at box290.bluehost.com
>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
>
>--
>Felix Bemm
>Department of Bioinformatics
>University of Würzburg, Germany
>Tel: +49 931 - 31 83696
>Fax: +49 931 - 31 84552
>felix.bemm at uni-wuerzburg.de

From felix.bemm at uni-wuerzburg.de Mon Oct 7 10:40:12 2013
From: felix.bemm at uni-wuerzburg.de (Felix Bemm)
Date: Mon, 07 Oct 2013 18:40:12 +0200
Subject: [maker-devel] Contigs failing - could not make datastore directory
In-Reply-To:
References:
Message-ID: <5252E3EC.6050908@uni-wuerzburg.de>

Do you have a changelog for 2.29b?

On 07.10.2013 16:33, Carson Holt wrote:
> You can also try the 2.29 beta if you want (on the website). Also when
> MAKER exits, it tries to delete anything left behind in temporary folders,
> so is it possible that it is creating space on exit after filling up /tmp?

I will try to monitor that!

> Given that the issue seems to go away when you switch to another
> directory, it suggests that this might be the kind of thing that is
> specific to something about the current state of your system.

I agree with that. Interestingly, it happens with the one directory where I would not expect it.

> Thanks,
> Carson
>
>
> On 10/5/13 10:09 AM, "Felix Bemm" wrote:
>
>> Hi Carson,
>>
>> there is definitely no quota neither a full local tmp device. I changed
>> to another tmp device, which is unfortunately residing on nfs but now it
>> seems to work. I am using the latest version of maker (2.28). The
>> complete job is running on a single machine with 32 mpi threads.
>> >> Bests >> Felix >> >> Am 04.10.2013 20:21, schrieb Carson Holt: >>> Hi Felix, >>> >>> There are other potential issues as well, for example a file quota limit >>> set by your administrator or full local /tmp which will be unique to >>> each >>> node and cannot be queried directly from the root node on a cluster. >>> On a >>> cluster one node may also not have the NFS location mounted. Or it >>> could >>> be system related, for example on ibrix NFS mounted systems you can get >>> a >>> weird filesystem issue with ghost files and directories when using MAKER >>> that can neither be created nor deleted. Try running the failed contigs >>> in a different directory (even if it's on the same disk). Also what >>> version of MAKER are you using? >>> >>> --Carson >>> >>> >>> >>> On 10/4/13 3:10 AM, "Felix Bemm" wrote: >>> >>>> Hi, >>>> >>>> I have a problem with Contigs failing like this: >>>> >>>> examining contents of the fasta file and run log >>>> ERROR: could not make datastore directory >>>> --> rank=11, hostname=r5n04 >>>> ERROR: Failed while examining contents of the fasta file and run log >>>> ERROR: Chunk failed at level:0, tier_type:0 >>>> FAILED CONTIG:MA_gen_v8.05_c31675 >>>> >>>> There is enough space available on both the storage device and the tmp >>>> device that is being used. The datastore log files looks like this: >>>> >>>> MA_gen_v8.05_c1149 FAILED >>>> MA_gen_v8.05_c1153 FAILED >>>> MA_gen_v8.05_c1154 FAILED >>>> MA_gen_v8.05_c1155 FAILED >>>> MA_gen_v8.05_c1156 FAILED >>>> >>>> Since there is no datastore for this contigs, I can not provide any >>>> more >>>> informationen. Approx. 1/3 of the contigs were annotated, after that >>>> maker started to fail. 
It kept running for a while and finishing some of
>>>> the contigs in the second or third try like this:
>>>>
>>>> MA_gen_v8.05_c4443 FAILED
>>>> MA_gen_v8.05_c4443 FAILED
>>>> MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ STARTED
>>>> MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ FINISHED
>>>>
>>>> Any ideas what can cause this problems?
>>>>
>>>> Best regards
>>>> Felix
>>>>
>>>> --
>>>> Felix Bemm
>>>> Department of Bioinformatics
>>>> University of Würzburg, Germany
>>>> Tel: +49 931 - 31 83696
>>>> Fax: +49 931 - 31 84552
>>>> felix.bemm at uni-wuerzburg.de
>>>>
>>>> _______________________________________________
>>>> maker-devel mailing list
>>>> maker-devel at box290.bluehost.com
>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>
>>
>> --
>> Felix Bemm
>> Department of Bioinformatics
>> University of Würzburg, Germany
>> Tel: +49 931 - 31 83696
>> Fax: +49 931 - 31 84552
>> felix.bemm at uni-wuerzburg.de

--
Felix Bemm
Department of Bioinformatics
University of Würzburg, Germany
Tel: +49 931 - 31 83696
Fax: +49 931 - 31 84552
felix.bemm at uni-wuerzburg.de

From carsonhh at gmail.com Mon Oct 7 11:25:34 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 07 Oct 2013 13:25:34 -0400
Subject: [maker-devel] maker apollo error
In-Reply-To: <14a642f3dc9cb615f048b70228bad3be@buzonweb.us.es>
Message-ID:

Here is a new location for the source -->
http://sourceforge.net/code-snapshots/svn/g/gm/gmod/svn/gmod-svn-25291-apollo-trunk.zip

--Carson

From:
Date: Monday, October 7, 2013 12:47 PM
To: Carson Holt
Cc: Mark Yandell ,
Subject: Re: [maker-devel] maker apollo error

Thanks. I have modified the locations file to point to a personal server where I downloaded an old version of Apollo, but it did not work. If you can get a new one, I will try it again.

Thanks
Gabriel

On 07/10/2013 16:12, Carson Holt wrote:
> This was a location from GMOD for the older Apollo.
It looks like they've > dropped it, so it now points to nothing. Perhaps there is another location > that still has the old code I can get from Ed. > > --Carson > > > > On 10/7/13 7:36 AM, "Mark Yandell" wrote: >> Hi Gabriel, I've forwarded your request on to the MAKER email list... cheers, >> --mark Mark Yandell Professor of Human Genetics H.A. & Edna Benning >> Presidential Endowed Chair Eccles Institute of Human Genetics University of >> Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 >> ph:801-587-7707 ________________________________________ From: ggpozo at us.es >> [ggpozo at us.es] Sent: Monday, October 07, 2013 2:49 AM To: Mark Yandell >> Subject: maker apollo error Dear Dr. Yandell, First, thank you very much for >> provide us such a great tool as MAKER. I am traying to install it in my Linux >> system, everything is ok, however when I try to instal Apollo with ./Build >> apollo... Downloading apollo... --2013-10-07 10:45:37-- >> http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar >> Resolviendo gmod.svn.sourceforge.net... 216.34.181.65, 216.34.181.177 >> Connecting to gmod.svn.sourceforge.net|216.34.181.65|:80... conectado. >> Petici?n HTTP enviada, esperando respuesta... 302 Found Localizaci?n: >> http://sourceforge.net/p/gmod/svn/ [siguiendo] --2013-10-07 10:45:38-- >> http://sourceforge.net/p/gmod/svn/ Resolviendo sourceforge.net... >> 216.34.181.60 Connecting to sourceforge.net|216.34.181.60|:80... conectado. >> Petici?n HTTP enviada, esperando respuesta... 302 Found Localizaci?n: >> http://sourceforge.net/p/gmod/svn/HEAD/tree/ [siguiendo] --2013-10-07 >> 10:45:38-- http://sourceforge.net/p/gmod/svn/HEAD/tree/ Connecting to >> sourceforge.net|216.34.181.60|:80... conectado. Petici?n HTTP enviada, >> esperando respuesta... 
200 OK Longitud: 37730 (37K) [text/html] Saving to:
>> `/home/maker/src/../exe/apollo.tar.gz'
>> 100%[======================================================================>] 37.730
>> 92,1K/s in 0,4s 2013-10-07 10:45:40 (92,1 KB/s) -
>> `/home/maker/src/../exe/apollo.tar.gz' saved [37730/37730] Unpacking apollo
>> tarball... gzip: stdin: not in gzip format tar: Child returned status 1 tar:
>> Error is not recoverable: exiting now ERROR: Failed installing apollo, now
>> cleaning installation path... You may need to install apollo manually. It
>> seems that the http direction in locations file is wrong, Apollo has changed
>> its version and now Apollo is integrated in Apolloweb, may be this the
>> problem. I would greatly appreciate any help you can give me. Thank you very
>> much. Dr. Gabriel Gutierrez Department of Genetics University of Seville
>> Spain _______________________________________________ maker-devel mailing
>> list maker-devel at box290.bluehost.com
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From ggpozo at us.es Mon Oct 7 10:47:13 2013
From: ggpozo at us.es (ggpozo at us.es)
Date: Mon, 07 Oct 2013 18:47:13 +0200
Subject: [maker-devel] maker apollo error
In-Reply-To:
References:
Message-ID: <14a642f3dc9cb615f048b70228bad3be@buzonweb.us.es>

Thanks. I have modified the locations file to point to a personal server where I downloaded an old version of Apollo, but it did not work. If you can get a new one, I will try it again.

Thanks
Gabriel

On 07/10/2013 16:12, Carson Holt wrote:
> This was a location from GMOD for the older Apollo. It looks like they've
> dropped it, so it now points to nothing. Perhaps there is another location
> that still has the old code I can get from Ed.
> > --Carson > > On 10/7/13 7:36 AM, "Mark Yandell" wrote: > >> Hi Gabriel, I've forwarded your request on to the MAKER email list... cheers, --mark Mark Yandell Professor of Human Genetics H.A. & Edna Benning Presidential Endowed Chair Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:801-587-7707 ________________________________________ From: ggpozo at us.es [1] [ggpozo at us.es [2]] Sent: Monday, October 07, 2013 2:49 AM To: Mark Yandell Subject: maker apollo error Dear Dr. Yandell, First, thank you very much for provide us such a great tool as MAKER. I am traying to install it in my Linux system, everything is ok, however when I try to instal Apollo with ./Build apollo... Downloading apollo... --2013-10-07 10:45:37-- http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar [3] Resolviendo gmod.svn.sourceforge.net... 216.34.181.65, 216.34.181.177 Connecting to gmod.svn.sourceforge.net|216.34.181.65|:80... conectado. Petici?n HTTP enviada, esperando respuesta... 302 Found Localizaci?n: http://sourceforge.net/p/gmod/svn/ [4] [siguiendo] --2013-10-07 10:45:38-- http://sourceforge.net/p/gmod/svn/ [5] Resolviendo sourceforge.net... 216.34.181.60 Connecting to sourceforge.net|216.34.181.60|:80... conectado. Petici?n HTTP enviada, esperando respuesta... 302 Found Localizaci?n: http://sourceforge.net/p/gmod/svn/HEAD/tree/ [6] [siguiendo] --2013-10-07 10:45:38-- http://sourceforge.net/p/gmod/svn/HEAD/tree/ [7] Connecting to sourceforge.net|216.34.181.60|:80... conectado. Petici?n HTTP enviada, esperando respuesta... 200 OK Longitud: 37730 (37K) [text/html] Saving to: `/home/maker/src/../exe/apollo.tar.gz' 100%[===================================================================== =============================================================>] 37.730 92,1K/s in 0,4s 2013-10-07 10:45:40 (92,1 KB/s) - `/home/maker/src/../exe/apollo.tar.gz' saved [37730/37730] Unpacking apollo tarball... 
gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
ERROR: Failed installing apollo, now cleaning installation path...
You may need to install apollo manually.

It seems that the http direction in locations file is wrong, Apollo has changed its version and now Apollo is integrated in Apolloweb, may be this the problem. I would greatly appreciate any help you can give me. Thank you very much.

Dr. Gabriel Gutierrez
Department of Genetics
University of Seville
Spain

_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com [8]
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org [9]

Links:
------
[1] mailto:ggpozo at us.es
[2] mailto:ggpozo at us.es
[3] http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar
[4] http://sourceforge.net/p/gmod/svn/
[5] http://sourceforge.net/p/gmod/svn/
[6] http://sourceforge.net/p/gmod/svn/HEAD/tree/
[7] http://sourceforge.net/p/gmod/svn/HEAD/tree/
[8] mailto:maker-devel at box290.bluehost.com
[9] http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From felix.bemm at uni-wuerzburg.de Sat Oct 12 09:06:52 2013
From: felix.bemm at uni-wuerzburg.de (Felix Bemm)
Date: Sat, 12 Oct 2013 17:06:52 +0200
Subject: [maker-devel] Serious Start Codon Problem
Message-ID: <5259658C.4000102@uni-wuerzburg.de>

Hi,

I have a serious problem with MAKER annotations, especially the start codon placement. It started happening with maker version 2.27! About 1/3 of my genes are missing a proper start codon. When reviewing the GFF files, the gene predictors and EST evidence on their own would predict the right gene structure, including the start codon. I was thinking that this problem was somehow linked to my gene models, but after looking at their standalone predictions I discarded that theory.
Just some stats about proteins with a proper start codon (first column) and a missing start codon (second column).

          proper start  missing start
Maker             9268           4215
SNAP             16577            896
Genemark         18764            290

Numbers look the same when Augustus is used (currently retraining). Manually using ESTs in Apollo often fixes the problems, but I don't want to check > 4000 genes for this.

Do you have any idea what could cause such a problem? If you need some example gff files, I can send them to you.

Best regards
Felix

--
Felix Bemm
Department of Bioinformatics
University of Würzburg, Germany
Tel: +49 931 - 31 83696
Fax: +49 931 - 31 84552
felix.bemm at uni-wuerzburg.de

From carsonhh at gmail.com Sat Oct 12 09:48:29 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Sat, 12 Oct 2013 11:48:29 -0400
Subject: [maker-devel] Serious Start Codon Problem
In-Reply-To: <5259658C.4000102@uni-wuerzburg.de>
Message-ID:

This issue just came up this week in another e-mail as well.

One way to fix it is to edit the Bio::Tools::CodonTable module from BioPerl (use maker --debug to have maker print out the locations of all modules it uses). Such a hack should only be done as a temporary workaround.

You change line 256 from -->
---M---------------M---------------M----------------------------

To-->
-----------------------------------M----------------------------

Maker uses the BioPerl is_start_codon function to determine if the codon used in a result it receives is acceptable or if it should look upstream or downstream for a different one. Apparently there are actually 3 acceptable start codons in the standard codon table (2 rare ones and one common), so making the edit removes the two rare ones from the list.

I'm also preparing a 2.30 release that exports a "strict" codon table to the BioPerl module; it will be used by default and should fix the issue.

Thanks,
Carson

On 10/12/13 11:06 AM, "Felix Bemm" wrote:

>Hi,
>
>I have a serious problems with maker annotations especially the start
>codon placement.
It started happening with maker version 2.27! About 1/3
>of my genes are missing a proper start codon. When reviewing the GFF
>files gene predictors and est evidence standalone would predict the
>right gene structure including the start codon. I was thinking that this
>problem was somehow linked to my gene models but looking at their
>standalone prediction I discarded that theory. Just some stats about
>proteins with proper start codon (first column) and missing start codon
>(second column).
>
>Maker 9268 4215
>SNAP 16577 896
>Genemark 18764 290
>
>Numbers look the same when Augustus is used (currently retraining).
>Manually using EST's in Apollo often fixes the problems but I don't want
>to check > 4000 genes for this.
>
>Do you have any idea what could cause such a problem? If you need some
>example gff files I can send you.
>
>Best regards
>Felix
>
>--
>Felix Bemm
>Department of Bioinformatics
>University of Würzburg, Germany
>Tel: +49 931 - 31 83696
>Fax: +49 931 - 31 84552
>felix.bemm at uni-wuerzburg.de
>
>_______________________________________________
>maker-devel mailing list
>maker-devel at box290.bluehost.com
>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From felix.bemm at uni-wuerzburg.de Sat Oct 12 10:16:06 2013
From: felix.bemm at uni-wuerzburg.de (Felix Bemm)
Date: Sat, 12 Oct 2013 18:16:06 +0200
Subject: [maker-devel] Serious Start Codon Problem
In-Reply-To:
References:
Message-ID: <525975C6.4020403@uni-wuerzburg.de>

Thx Carson! Do you know when 2.30 will be available?

Bests

On 12.10.2013 17:48, Carson Holt wrote:
> This issue just came up this week on another e-mail as well.
>
> One way to fix it, is to edit the Bio::Tools::CodonTable module from
> BioPerl (use maker --debug to have maker print out the locations of all
> modules it uses). Such a hack should only be done as a temporary work
> around.
> > You change line 256 from --> > ---M---------------M---------------M---------------------------- > > To--> > -----------------------------------M---------------------------- > > > Maker uses the BioPerl is_start_codon function to determine if the codon > used in a result it receives is acceptable or if it should look upstream > or downstream for a different one. Apparently there are actually 3 > acceptable start codons in the standard codon table (2 rare ones and one > common), so making the edit removes the two rare ones from the list. > > I'm also preparing a 2.30 release that exports a "strict" codon table to > the BioPerl module, that will be used by default and should fix the issue. > > Thanks, > Carson > > > > On 10/12/13 11:06 AM, "Felix Bemm" wrote: > >> Hi, >> >> I have a serious problems with maker annotations especially the start >> codon placement. It started happening with maker version 2.27! About 1/3 >> of my genes are missing a proper start codon. When reviewing the GFF >> files gene predictors and est evidence standalone would predict the >> right gene structure including the start codon. I was thinking that this >> problem was somehow linked to my gene models but looking at their >> standalone prediction I discarded that theory. Just some stats about >> proteins with proper start codon (first column) and missing start codon >> (second column). >> >> Maker 9268 4215 >> SNAP 16577 896 >> Genemark 18764 290 >> >> Numbers look the same when Augustus is used (currently retraining). >> Manually using EST's in Apollo often fixes the problems but I don't want >> to check > 4000 genes for this. >> >> Do you have any idea what could cause such a problem? If you need some >> example gff files I can send you. 
>>
>> Best regards
>> Felix
>>
>> --
>> Felix Bemm
>> Department of Bioinformatics
>> University of Würzburg, Germany
>> Tel: +49 931 - 31 83696
>> Fax: +49 931 - 31 84552
>> felix.bemm at uni-wuerzburg.de
>>
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

--
Felix Bemm
Department of Bioinformatics
University of Würzburg, Germany
Tel: +49 931 - 31 83696
Fax: +49 931 - 31 84552
felix.bemm at uni-wuerzburg.de

From carsonhh at gmail.com Sat Oct 12 23:26:17 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 13 Oct 2013 01:26:17 -0400
Subject: [maker-devel] Serious Start Codon Problem
In-Reply-To: <525975C6.4020403@uni-wuerzburg.de>
Message-ID:

Before Wednesday, because after that I'm on a 6 day road trip, so I have to have it wrapped up before I leave.

--Carson

On 10/12/13 12:16 PM, "Felix Bemm" wrote:

>Thx Carson! Do you now when 2.30 will be available? Bests
>
>On 12.10.2013 17:48, Carson Holt wrote:
>> This issue just came up this week on another e-mail as well.
>>
>> One way to fix it, is to edit the Bio::Tools::CodonTable module from
>> BioPerl (use maker --debug to have maker print out the locations of all
>> modules it uses). Such a hack should only be done as a temporary work
>> around.
>>
>> You change line 256 from -->
>> ---M---------------M---------------M----------------------------
>>
>> To-->
>> -----------------------------------M----------------------------
>>
>>
>> Maker uses the BioPerl is_start_codon function to determine if the codon
>> used in a result it receives is acceptable or if it should look upstream
>> or downstream for a different one. Apparently there are actually 3
>> acceptable start codons in the standard codon table (2 rare ones and one
>> common), so making the edit removes the two rare ones from the list.
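To make the effect of that one-line edit concrete, here is a minimal Python sketch (not BioPerl code) that decodes the two start masks quoted above. It assumes the masks follow the NCBI codon ordering, with bases in the order T, C, A, G and the first base varying slowest; the names CODONS, start_codons and is_start_codon are illustrative stand-ins and not the BioPerl API.

```python
from itertools import product

BASES = "TCAG"  # assumed NCBI ordering used by codon-table strings
CODONS = ["".join(c) for c in product(BASES, repeat=3)]  # 64 codons, TTT first

# Start masks as quoted in the message, built piecewise so that the
# length of each (64 characters) is easy to verify.
STANDARD_STARTS = "-" * 3 + "M" + "-" * 15 + "M" + "-" * 15 + "M" + "-" * 28
STRICT_STARTS = "-" * 35 + "M" + "-" * 28


def start_codons(mask):
    """Return the codons flagged 'M' in a 64-character start mask."""
    if len(mask) != len(CODONS):
        raise ValueError("start mask must have one character per codon")
    return [codon for codon, flag in zip(CODONS, mask) if flag == "M"]


def is_start_codon(codon, mask=STRICT_STARTS):
    """True if the codon is an allowed start under the given mask."""
    return codon.upper().replace("U", "T") in start_codons(mask)


if __name__ == "__main__":
    print(start_codons(STANDARD_STARTS))  # ['TTG', 'CTG', 'ATG']
    print(start_codons(STRICT_STARTS))    # ['ATG']
```

Under that ordering the original mask accepts TTG, CTG and ATG, while the edited mask accepts only ATG, which is consistent with the description of two rare start codons and one common one.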
>> >> I'm also preparing a 2.30 release that exports a "strict" codon table to >> the BioPerl module, that will be used by default and should fix the >>issue. >> >> Thanks, >> Carson >> >> >> >> On 10/12/13 11:06 AM, "Felix Bemm" wrote: >> >>> Hi, >>> >>> I have a serious problems with maker annotations especially the start >>> codon placement. It started happening with maker version 2.27! About >>>1/3 >>> of my genes are missing a proper start codon. When reviewing the GFF >>> files gene predictors and est evidence standalone would predict the >>> right gene structure including the start codon. I was thinking that >>>this >>> problem was somehow linked to my gene models but looking at their >>> standalone prediction I discarded that theory. Just some stats about >>> proteins with proper start codon (first column) and missing start codon >>> (second column). >>> >>> Maker 9268 4215 >>> SNAP 16577 896 >>> Genemark 18764 290 >>> >>> Numbers look the same when Augustus is used (currently retraining). >>> Manually using EST's in Apollo often fixes the problems but I don't >>>want >>> to check > 4000 genes for this. >>> >>> Do you have any idea what could cause such a problem? If you need some >>> example gff files I can send you. 
>>> >>> Best regards >>> Felix >>> >>> -- >>> Felix Bemm >>> Department of Bioinformatics >>> University of Würzburg, Germany >>> Tel: +49 931 - 31 83696 >>> Fax: +49 931 - 31 84552 >>> felix.bemm at uni-wuerzburg.de >>> >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > >-- >Felix Bemm >Department of Bioinformatics >University of Würzburg, Germany >Tel: +49 931 - 31 83696 >Fax: +49 931 - 31 84552 >felix.bemm at uni-wuerzburg.de From carsonhh at gmail.com Mon Oct 14 08:45:25 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 14 Oct 2013 10:45:25 -0400 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: <525975C6.4020403@uni-wuerzburg.de> Message-ID: Probably Tuesday or Wednesday. --Carson On 10/12/13 12:16 PM, "Felix Bemm" wrote: >Thx Carson! Do you now when 2.30 will be available? Bests > >On 12.10.2013 17:48, Carson Holt wrote: >> This issue just came up this week on another e-mail as well. >> >> One way to fix it, is to edit the Bio::Tools::CodonTable module from >> BioPerl (use maker --debug to have maker print out the locations of all >> modules it uses). Such a hack should only be done as a temporary work >> around. >> >> You change line 256 from --> >> ---M---------------M---------------M---------------------------- >> >> To--> >> -----------------------------------M---------------------------- >> >> >> Maker uses the BioPerl is_start_codon function to determine if the codon >> used in a result it receives is acceptable or if it should look upstream >> or downstream for a different one. Apparently there are actually 3 >> acceptable start codons in the standard codon table (2 rare ones and one >> common), so making the edit removes the two rare ones from the list.
>> >> I'm also preparing a 2.30 release that exports a "strict" codon table to >> the BioPerl module, that will be used by default and should fix the >>issue. >> >> Thanks, >> Carson >> >> >> >> On 10/12/13 11:06 AM, "Felix Bemm" wrote: >> >>> Hi, >>> >>> I have a serious problems with maker annotations especially the start >>> codon placement. It started happening with maker version 2.27! About >>>1/3 >>> of my genes are missing a proper start codon. When reviewing the GFF >>> files gene predictors and est evidence standalone would predict the >>> right gene structure including the start codon. I was thinking that >>>this >>> problem was somehow linked to my gene models but looking at their >>> standalone prediction I discarded that theory. Just some stats about >>> proteins with proper start codon (first column) and missing start codon >>> (second column). >>> >>> Maker 9268 4215 >>> SNAP 16577 896 >>> Genemark 18764 290 >>> >>> Numbers look the same when Augustus is used (currently retraining). >>> Manually using EST's in Apollo often fixes the problems but I don't >>>want >>> to check > 4000 genes for this. >>> >>> Do you have any idea what could cause such a problem? If you need some >>> example gff files I can send you. 
>>> >>> Best regards >>> Felix >>> >>> -- >>> Felix Bemm >>> Department of Bioinformatics >>> University of Würzburg, Germany >>> Tel: +49 931 - 31 83696 >>> Fax: +49 931 - 31 84552 >>> felix.bemm at uni-wuerzburg.de >>> >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > >-- >Felix Bemm >Department of Bioinformatics >University of Würzburg, Germany >Tel: +49 931 - 31 83696 >Fax: +49 931 - 31 84552 >felix.bemm at uni-wuerzburg.de From cjfields at illinois.edu Mon Oct 14 09:07:43 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Mon, 14 Oct 2013 15:07:43 +0000 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: References: Message-ID: Carson, Regarding the CodonTable change below, should this be something that needs to be fixed, or made more flexible, in bioperl? I could see this being a problem if using MAKER for bacterial annotation (where the alternative start codons are actually more common). chris On Oct 14, 2013, at 9:45 AM, Carson Holt wrote: > Probably Tuesday or Wednesday. > > --Carson > > > On 10/12/13 12:16 PM, "Felix Bemm" wrote: > >> Thx Carson! Do you now when 2.30 will be available? Bests >> >> On 12.10.2013 17:48, Carson Holt wrote: >>> This issue just came up this week on another e-mail as well. >>> >>> One way to fix it, is to edit the Bio::Tools::CodonTable module from >>> BioPerl (use maker --debug to have maker print out the locations of all >>> modules it uses). Such a hack should only be done as a temporary work >>> around.
>>> >>> You change line 256 from --> >>> ---M---------------M---------------M---------------------------- >>> >>> To--> >>> -----------------------------------M---------------------------- >>> >>> >>> Maker uses the BioPerl is_start_codon function to determine if the codon >>> used in a result it receives is acceptable or if it should look upstream >>> or downstream for a different one. Apparently there are actually 3 >>> acceptable start codons in the standard codon table (2 rare ones and one >>> common), so making the edit removes the two rare ones from the list. >>> >>> I'm also preparing a 2.30 release that exports a "strict" codon table to >>> the BioPerl module, that will be used by default and should fix the >>> issue. >>> >>> Thanks, >>> Carson >>> >>> >>> >>> On 10/12/13 11:06 AM, "Felix Bemm" wrote: >>> >>>> Hi, >>>> >>>> I have a serious problems with maker annotations especially the start >>>> codon placement. It started happening with maker version 2.27! About >>>> 1/3 >>>> of my genes are missing a proper start codon. When reviewing the GFF >>>> files gene predictors and est evidence standalone would predict the >>>> right gene structure including the start codon. I was thinking that >>>> this >>>> problem was somehow linked to my gene models but looking at their >>>> standalone prediction I discarded that theory. Just some stats about >>>> proteins with proper start codon (first column) and missing start codon >>>> (second column). >>>> >>>> Maker 9268 4215 >>>> SNAP 16577 896 >>>> Genemark 18764 290 >>>> >>>> Numbers look the same when Augustus is used (currently retraining). >>>> Manually using EST's in Apollo often fixes the problems but I don't >>>> want >>>> to check > 4000 genes for this. >>>> >>>> Do you have any idea what could cause such a problem? If you need some >>>> example gff files I can send you. 
>>>> >>>> Best regards >>>> Felix >>>> >>>> -- >>>> Felix Bemm >>>> Department of Bioinformatics >>>> University of Würzburg, Germany >>>> Tel: +49 931 - 31 83696 >>>> Fax: +49 931 - 31 84552 >>>> felix.bemm at uni-wuerzburg.de >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >> -- >> Felix Bemm >> Department of Bioinformatics >> University of Würzburg, Germany >> Tel: +49 931 - 31 83696 >> Fax: +49 931 - 31 84552 >> felix.bemm at uni-wuerzburg.de > > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From carsonhh at gmail.com Mon Oct 14 11:58:24 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 14 Oct 2013 13:58:24 -0400 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: Message-ID: I think adding one more codon table to the set of available tables would be sufficient. Call it the 'Strict' table or 'Canonical' table. It doesn't need to be the default, but should be a selectable option just as the other codon tables are. There is a way to export codon tables to the module, which is what I've now added to MAKER, but I'm sure other BioPerl users would find a strictly canonical codon table useful as well. Thanks, Carson On 10/14/13 11:07 AM, "Fields, Christopher J" wrote: >Carson, > >Regarding the CodonTable change below, should this be something that >needs to be fixed, or made more flexible, in bioperl? I could see this >being a problem if using MAKER for bacterial annotation (where the >alternative start codons are actually more common). > >chris > >On Oct 14, 2013, at 9:45 AM, Carson Holt wrote: > >> Probably Tuesday or Wednesday. >> >> --Carson >> >> >> On 10/12/13 12:16 PM, "Felix Bemm" wrote: >> >>> Thx Carson!
Do you now when 2.30 will be available? Bests >>> >>> On 12.10.2013 17:48, Carson Holt wrote: >>>> This issue just came up this week on another e-mail as well. >>>> >>>> One way to fix it, is to edit the Bio::Tools::CodonTable module from >>>> BioPerl (use maker --debug to have maker print out the locations of >>>>all >>>> modules it uses). Such a hack should only be done as a temporary work >>>> around. >>>> >>>> You change line 256 from --> >>>> ---M---------------M---------------M---------------------------- >>>> >>>> To--> >>>> -----------------------------------M---------------------------- >>>> >>>> >>>> Maker uses the BioPerl is_start_codon function to determine if the >>>>codon >>>> used in a result it receives is acceptable or if it should look >>>>upstream >>>> or downstream for a different one. Apparently there are actually 3 >>>> acceptable start codons in the standard codon table (2 rare ones and >>>>one >>>> common), so making the edit removes the two rare ones from the list. >>>> >>>> I'm also preparing a 2.30 release that exports a "strict" codon table >>>>to >>>> the BioPerl module, that will be used by default and should fix the >>>> issue. >>>> >>>> Thanks, >>>> Carson >>>> >>>> >>>> >>>> On 10/12/13 11:06 AM, "Felix Bemm" >>>>wrote: >>>> >>>>> Hi, >>>>> >>>>> I have a serious problems with maker annotations especially the start >>>>> codon placement. It started happening with maker version 2.27! About >>>>> 1/3 >>>>> of my genes are missing a proper start codon. When reviewing the GFF >>>>> files gene predictors and est evidence standalone would predict the >>>>> right gene structure including the start codon. I was thinking that >>>>> this >>>>> problem was somehow linked to my gene models but looking at their >>>>> standalone prediction I discarded that theory. Just some stats about >>>>> proteins with proper start codon (first column) and missing start >>>>>codon >>>>> (second column). 
>>>>> >>>>> Maker 9268 4215 >>>>> SNAP 16577 896 >>>>> Genemark 18764 290 >>>>> >>>>> Numbers look the same when Augustus is used (currently retraining). >>>>> Manually using EST's in Apollo often fixes the problems but I don't >>>>> want >>>>> to check > 4000 genes for this. >>>>> >>>>> Do you have any idea what could cause such a problem? If you need >>>>>some >>>>> example gff files I can send you. >>>>> >>>>> Best regards >>>>> Felix >>>>> >>>>> -- >>>>> Felix Bemm >>>>> Department of Bioinformatics >>>>> University of Würzburg, Germany >>>>> Tel: +49 931 - 31 83696 >>>>> Fax: +49 931 - 31 84552 >>>>> felix.bemm at uni-wuerzburg.de >>>>> >>>>> _______________________________________________ >>>>> maker-devel mailing list >>>>> maker-devel at box290.bluehost.com >>>>> >>>>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >>> -- >>> Felix Bemm >>> Department of Bioinformatics >>> University of Würzburg, Germany >>> Tel: +49 931 - 31 83696 >>> Fax: +49 931 - 31 84552 >>> felix.bemm at uni-wuerzburg.de >> >> >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > From cjfields at illinois.edu Mon Oct 14 12:51:28 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Mon, 14 Oct 2013 18:51:28 +0000 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: References: Message-ID: <160BC401-0471-4285-8712-21440628A5DA@illinois.edu> Done, with some added tests. I labeled it 'Strict' ('Canonical' can be a bit of a loaded term). It's currently set as ID 24. I will likely push out a new 1.6 point release this week or next, I'll merge these in for that release. chris On Oct 14, 2013, at 12:58 PM, Carson Holt wrote: > I think adding one more codon table to the set of available tables would > be sufficient. Call it the 'Strict' table or 'Canonical' table.
It > doesn't need to be the default, but should be a selectable option just as > the other codon tables are. There is a way to export codon tables to the > module, which is what I've now added to MAKER, but I'm sure other BoiPerl > users would find a strictly canonical codon table useful as well. > > Thanks, > Carson > > > On 10/14/13 11:07 AM, "Fields, Christopher J" > wrote: > >> Carson, >> >> Regarding the CodonTable change below, should this be something that >> needs to be fixed, or made more flexible, in bioperl? I could see this >> being a problem if using MAKER for bacterial annotation (where the >> alternative start codons are actually more common). >> >> chris >> >> On Oct 14, 2013, at 9:45 AM, Carson Holt wrote: >> >>> Probably Tuesday or Wednesday. >>> >>> --Carson >>> >>> >>> On 10/12/13 12:16 PM, "Felix Bemm" wrote: >>> >>>> Thx Carson! Do you now when 2.30 will be available? Bests >>>> >>>> On 12.10.2013 17:48, Carson Holt wrote: >>>>> This issue just came up this week on another e-mail as well. >>>>> >>>>> One way to fix it, is to edit the Bio::Tools::CodonTable module from >>>>> BioPerl (use maker --debug to have maker print out the locations of >>>>> all >>>>> modules it uses). Such a hack should only be done as a temporary work >>>>> around. >>>>> >>>>> You change line 256 from --> >>>>> ---M---------------M---------------M---------------------------- >>>>> >>>>> To--> >>>>> -----------------------------------M---------------------------- >>>>> >>>>> >>>>> Maker uses the BioPerl is_start_codon function to determine if the >>>>> codon >>>>> used in a result it receives is acceptable or if it should look >>>>> upstream >>>>> or downstream for a different one. Apparently there are actually 3 >>>>> acceptable start codons in the standard codon table (2 rare ones and >>>>> one >>>>> common), so making the edit removes the two rare ones from the list. 
>>>>> >>>>> I'm also preparing a 2.30 release that exports a "strict" codon table >>>>> to >>>>> the BioPerl module, that will be used by default and should fix the >>>>> issue. >>>>> >>>>> Thanks, >>>>> Carson >>>>> >>>>> >>>>> >>>>> On 10/12/13 11:06 AM, "Felix Bemm" >>>>> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I have a serious problems with maker annotations especially the start >>>>>> codon placement. It started happening with maker version 2.27! About >>>>>> 1/3 >>>>>> of my genes are missing a proper start codon. When reviewing the GFF >>>>>> files gene predictors and est evidence standalone would predict the >>>>>> right gene structure including the start codon. I was thinking that >>>>>> this >>>>>> problem was somehow linked to my gene models but looking at their >>>>>> standalone prediction I discarded that theory. Just some stats about >>>>>> proteins with proper start codon (first column) and missing start >>>>>> codon >>>>>> (second column). >>>>>> >>>>>> Maker 9268 4215 >>>>>> SNAP 16577 896 >>>>>> Genemark 18764 290 >>>>>> >>>>>> Numbers look the same when Augustus is used (currently retraining). >>>>>> Manually using EST's in Apollo often fixes the problems but I don't >>>>>> want >>>>>> to check > 4000 genes for this. >>>>>> >>>>>> Do you have any idea what could cause such a problem? If you need >>>>>> some >>>>>> example gff files I can send you. 
>>>>>> >>>>>> Best regards >>>>>> Felix >>>>>> >>>>>> -- >>>>>> Felix Bemm >>>>>> Department of Bioinformatics >>>>>> University of Würzburg, Germany >>>>>> Tel: +49 931 - 31 83696 >>>>>> Fax: +49 931 - 31 84552 >>>>>> felix.bemm at uni-wuerzburg.de >>>>>> >>>>>> _______________________________________________ >>>>>> maker-devel mailing list >>>>>> maker-devel at box290.bluehost.com >>>>>> >>>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>>> >>>> -- >>>> Felix Bemm >>>> Department of Bioinformatics >>>> University of Würzburg, Germany >>>> Tel: +49 931 - 31 83696 >>>> Fax: +49 931 - 31 84552 >>>> felix.bemm at uni-wuerzburg.de >>> >>> >>> >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> > > From carsonhh at gmail.com Mon Oct 14 18:59:45 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 14 Oct 2013 20:59:45 -0400 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: <160BC401-0471-4285-8712-21440628A5DA@illinois.edu> Message-ID: Thanks!! --Carson On 10/14/13 2:51 PM, "Fields, Christopher J" wrote: >Done, with some added tests. I labeled it 'Strict' ('Canonical' can be a >bit of a loaded term). It's currently set as ID 24. > >I will likely push out a new 1.6 point release this week or next, I'll >merge these in for that release. > >chris > >On Oct 14, 2013, at 12:58 PM, Carson Holt wrote: > >> I think adding one more codon table to the set of available tables would >> be sufficient. Call it the 'Strict' table or 'Canonical' table. It >> doesn't need to be the default, but should be a selectable option just >>as >> the other codon tables are. There is a way to export codon tables to >>the >> module, which is what I've now added to MAKER, but I'm sure other >>BioPerl >> users would find a strictly canonical codon table useful as well.
>> >> Thanks, >> Carson >> >> >> On 10/14/13 11:07 AM, "Fields, Christopher J" >> wrote: >> >>> Carson, >>> >>> Regarding the CodonTable change below, should this be something that >>> needs to be fixed, or made more flexible, in bioperl? I could see this >>> being a problem if using MAKER for bacterial annotation (where the >>> alternative start codons are actually more common). >>> >>> chris >>> >>> On Oct 14, 2013, at 9:45 AM, Carson Holt wrote: >>> >>>> Probably Tuesday or Wednesday. >>>> >>>> --Carson >>>> >>>> >>>> On 10/12/13 12:16 PM, "Felix Bemm" >>>>wrote: >>>> >>>>> Thx Carson! Do you now when 2.30 will be available? Bests >>>>> >>>>> On 12.10.2013 17:48, Carson Holt wrote: >>>>>> This issue just came up this week on another e-mail as well. >>>>>> >>>>>> One way to fix it, is to edit the Bio::Tools::CodonTable module from >>>>>> BioPerl (use maker --debug to have maker print out the locations of >>>>>> all >>>>>> modules it uses). Such a hack should only be done as a temporary >>>>>>work >>>>>> around. >>>>>> >>>>>> You change line 256 from --> >>>>>> ---M---------------M---------------M---------------------------- >>>>>> >>>>>> To--> >>>>>> -----------------------------------M---------------------------- >>>>>> >>>>>> >>>>>> Maker uses the BioPerl is_start_codon function to determine if the >>>>>> codon >>>>>> used in a result it receives is acceptable or if it should look >>>>>> upstream >>>>>> or downstream for a different one. Apparently there are actually 3 >>>>>> acceptable start codons in the standard codon table (2 rare ones and >>>>>> one >>>>>> common), so making the edit removes the two rare ones from the list. >>>>>> >>>>>> I'm also preparing a 2.30 release that exports a "strict" codon >>>>>>table >>>>>> to >>>>>> the BioPerl module, that will be used by default and should fix the >>>>>> issue. 
>>>>>> >>>>>> Thanks, >>>>>> Carson >>>>>> >>>>>> >>>>>> >>>>>> On 10/12/13 11:06 AM, "Felix Bemm" >>>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I have a serious problems with maker annotations especially the >>>>>>>start >>>>>>> codon placement. It started happening with maker version 2.27! >>>>>>>About >>>>>>> 1/3 >>>>>>> of my genes are missing a proper start codon. When reviewing the >>>>>>>GFF >>>>>>> files gene predictors and est evidence standalone would predict the >>>>>>> right gene structure including the start codon. I was thinking that >>>>>>> this >>>>>>> problem was somehow linked to my gene models but looking at their >>>>>>> standalone prediction I discarded that theory. Just some stats >>>>>>>about >>>>>>> proteins with proper start codon (first column) and missing start >>>>>>> codon >>>>>>> (second column). >>>>>>> >>>>>>> Maker 9268 4215 >>>>>>> SNAP 16577 896 >>>>>>> Genemark 18764 290 >>>>>>> >>>>>>> Numbers look the same when Augustus is used (currently retraining). >>>>>>> Manually using EST's in Apollo often fixes the problems but I don't >>>>>>> want >>>>>>> to check > 4000 genes for this. >>>>>>> >>>>>>> Do you have any idea what could cause such a problem? If you need >>>>>>> some >>>>>>> example gff files I can send you. >>>>>>> >>>>>>> Best regards >>>>>>> Felix >>>>>>> >>>>>>> -- >>>>>>> Felix Bemm >>>>>>> Department of Bioinformatics >>>>>>> University of W?rzburg, Germany >>>>>>> Tel: +49 931 - 31 83696 >>>>>>> Fax: +49 931 - 31 84552 >>>>>>> felix.bemm at uni-wuerzburg.de >>>>>>> >>>>>>> _______________________________________________ >>>>>>> maker-devel mailing list >>>>>>> maker-devel at box290.bluehost.com >>>>>>> >>>>>>> >>>>>>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab. 
>>>>>>>or >>>>>>> g >>>>> >>>>> -- >>>>> Felix Bemm >>>>> Department of Bioinformatics >>>>> University of Würzburg, Germany >>>>> Tel: +49 931 - 31 83696 >>>>> Fax: +49 931 - 31 84552 >>>>> felix.bemm at uni-wuerzburg.de >>>> >>>> >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> >>>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >> >> > From jfierst at uoregon.edu Mon Oct 21 11:40:06 2013 From: jfierst at uoregon.edu (Janna Fierst) Date: Mon, 21 Oct 2013 10:40:06 -0700 Subject: [maker-devel] Fwd: exon/intron boundaries In-Reply-To: References: Message-ID: Hi, thanks for your help. I have used Trinity to assemble the ESTs but am having a problem with the 2.29-beta release. Instead of starting and finishing a set of scaffolds, it appears to re-start over and over again. I am running it in parallel across 32 cores and I'm wondering if this is a problem with the parallel implementation. I've gone through it several times and not been able to find any errors or output that would suggest what the problem is. Thanks -Janna On Mon, Oct 7, 2013 at 6:20 AM, Carson Holt wrote: > Hi Janna, > > There are a couple of things to do. Download the maker-2.29p-beta from > the lab website. This includes some changes made to improve the > performance of correct_est_fusion. Whenever using the correct_est_fusion > option, this also reduces the influence of ESTs on annotation. So normally > you would need to increase your protein dataset, but given that you have > supplied 3 nematode species and all of UniProt already you should probably > be fine. But there is one last thing you can do. Instead of using > Cufflinks, try using Trinity to assemble the ESTs. There is a Jaccard clip > option that reduces merging caused by overlapping UTR. Between > Trinity and correct_est_fusion you should be able to really reduce the > effect of those ESTs.
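The fallback in Carson's Oct 7 reply, passing SNAP and Augustus ab initio match/match_part features back in through pred_gff, first needs those features pulled out of MAKER's GFF3. A minimal sketch of that filtering step, assuming the ab initio features carry source labels such as snap_masked and augustus_masked in column 2. Those labels and the function name are assumptions, not taken from the thread, so check them against your own file.

```python
def filter_ab_initio(gff_lines, sources=("snap_masked", "augustus_masked")):
    """Return GFF3 feature lines that are match/match_part from the given sources."""
    kept = []
    for raw in gff_lines:
        line = raw.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more feature lines
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed GFF3 feature line
        source, feature_type = cols[1], cols[2]
        if source in sources and feature_type in ("match", "match_part"):
            kept.append(line)
    return kept


# Tiny made-up example: only the two snap_masked lines should survive.
example = [
    "##gff-version 3\n",
    "scaf1\tmaker\tgene\t100\t900\t.\t+\t.\tID=g1\n",
    "scaf1\tsnap_masked\tmatch\t100\t900\t.\t+\t.\tID=m1\n",
    "scaf1\tsnap_masked\tmatch_part\t100\t400\t.\t+\t.\tID=m1.1;Parent=m1\n",
    "scaf1\test2genome\tmatch\t100\t900\t.\t+\t.\tID=e1\n",
]

for feature in filter_ab_initio(example):
    print(feature)
```

The surviving lines could then be written to a file and named in pred_gff= in maker_opts.ctl, with snaphmm and augustus_species left blank as the reply describes.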
> > If those changes don't work there is one last option. If you take the > MAKER results and filter them for SNAP and Augustus ab initio results > (match/match_part in the GFF3), then you can pass those in to the pred_gff > option. Then turn snaphmm and augustus_species off in the control files. > Basically what this will do is turn MAKER's hint-based prediction off and > force it to filter the ab initio results and select models directly from > there. Since the merging is being caused by bad hints (merged transcripts > from mRNAseq) this would reduce that effect. You will still need > correct_est_fusion=1 though to trim UTR coming from the merged transcripts, > because even though MAKER can't process hints to rerun SNAP and Augustus > this way, it will try and add UTR using the EST evidence. > > --Carson > > > From: Janna Fierst > Date: Friday, October 4, 2013 1:06 PM > To: > Subject: [maker-devel] Fwd: exon/intron boundaries > > Hi, > > thanks for your reply. I have been going through our annotations in detail > and trying different parameter sets, and I think I have identified what is > going on but I'm not sure how to set the MAKER2 parameters for our > situation. We are working with a species of Caenorhabditid worm and there > are long gene-dense blocks that are being incorrectly annotated as large > single genes instead of several smaller closely spaced genes. The protein > alignments (tblastx and protein2genome) show very clearly where the > exon/intron boundaries are and in most cases agree with the Augustus > predictions. The assembled Cufflinks output (through blastn, est2genome and > est_gff:cufflinks) does not agree in some locations; I think this may be > because in some cases the UTR nearly overlaps adjacent genes. > > I have included a screenshot of an annotated region viewed in Apollo to > try to show this.
The large gene in the middle is actually 7 different > genes that are extremely close together and MAKER2 is collapsing them into > a single gene. I tried running without any RNASeq/Cufflinks data and MAKER2 > annotates the region as two genes instead of one, but I can't get it to > recognize the 7 as different genes. I have retrained SNAP but we have not > been able to successfully train Augustus; we are currently using the > default caenorhabditis species model. I also included a species-specific > repeat library. I tried setting correct_est_fusion=1 and reducing > pred_flank but these changes appear to really alter the annotations and we > end up annotating almost nothing. I also tried setting est2genome=0 to > decrease the influence of the Cufflinks assembly but it didn't appear to > help. There are some very large introns in these genes so I haven't yet > tried decreasing the maximum intron size because I'm concerned this may > generate too many split genes instead of our current merged gene problem. > Thanks for your help, any advice is greatly appreciated! -Janna Fierst > > > On Mon, Aug 26, 2013 at 12:21 PM, Carson Holt wrote: > >> Are you getting gene fusions or just more exons? Gene fusions can be >> reduced by setting correct_est_fusion=1, or reducing pred_flank, although >> reducing pred_flank can cause other issues (but those generally only appear >> if setting the value below 150). Also if you have the maximum intron >> size set too high (split_hit option), you may also be generating bridging >> alignments that make evidence align across distant paralogous genes as well >> (this can result in gene merging). >> >> You should also look at your results manually in a viewer like Apollo. >> Then see if the extra exons are supported by something such as protein >> alignments from another species.
If this is the case, you may have a >> poorly annotated protein set that is being used as evidence that is >> carrying over its erroneous exons into the species you are annotating. If >> the extra exons are supported by EST evidence, then perhaps you should try >> and rebuild the EST assembly (for example Trinity has an option to use a >> Jaccard similarity coefficient to avoid fusing transcripts). >> >> Another option is to retrain SNAP or Augustus. MAKER does not actually >> produce any of the models itself (it is a pipeline not a predictor). The >> models are all generated using these other algorithms; MAKER just feeds >> them hints based on protein and transcript alignments, so making sure >> training is sufficient is important for those programs to produce their >> best models. >> >> Finally make sure your repeat database is sufficient; you may need to >> generate a species-specific repeat library using something like >> RepeatModeler. Repeats can end up being included as extra exons in gene >> models because they may contain reading frames that do code for proteins >> (e.g. reverse transcriptases). >> >> If you have any questions on any of the above, just let us know. >> >> Thanks, >> Carson >> >> >> From: Janna Fierst >> Date: Monday, August 26, 2013 2:54 PM >> To: >> Subject: [maker-devel] exon/intron boundaries >> >> Hi, >> >> I am using MAKER 2.28 to annotate a Caenorhabditid worm genome, and the >> initial results appear fairly good but we seem to be annotating too many >> exons for multiple genes. I was wondering which parameters should be tuned >> to change the threshold for exon/intron boundaries?
Thanks for your help >> -Janna Fierst >> _______________________________________________ maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> > > > _______________________________________________ maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > From graham.etherington at sainsbury-laboratory.ac.uk Thu Oct 24 07:42:51 2013 From: graham.etherington at sainsbury-laboratory.ac.uk (graham etherington (TSL)) Date: Thu, 24 Oct 2013 13:42:51 +0000 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: Message-ID: Hi, I notice that the latest version of Maker available from http://www.yandell-lab.org/software/maker.html is still v2.29b (4 Oct 2013). As we're also having the start codon problem we'd like to get hold of v2.30. Do you know when this will be released? Many thanks, Graham Dr. Graham Etherington Bioinformatics Support Officer, The Sainsbury Laboratory, Norwich Research Park, Norwich NR4 7UH. UK Tel: +44 (0)1603 450601 On 13/10/2013 06:26, "Carson Holt" wrote: >Before Wednesday, because after that I'm on a 6 day road trip, so I have >to have it wrapped up before I leave. > >--Carson > > > > >On 10/12/13 12:16 PM, "Felix Bemm" wrote: > >>Thx Carson! Do you now when 2.30 will be available? Bests >> >>On 12.10.2013 17:48, Carson Holt wrote: >>> This issue just came up this week on another e-mail as well. >>> >>> One way to fix it, is to edit the Bio::Tools::CodonTable module from >>> BioPerl (use maker --debug to have maker print out the locations of all >>> modules it uses). Such a hack should only be done as a temporary work >>> around.
>>> >>> You change line 256 from --> >>> ---M---------------M---------------M---------------------------- >>> >>> To--> >>> -----------------------------------M---------------------------- >>> >>> >>> Maker uses the BioPerl is_start_codon function to determine if the >>>codon >>> used in a result it receives is acceptable or if it should look >>>upstream >>> or downstream for a different one. Apparently there are actually 3 >>> acceptable start codons in the standard codon table (2 rare ones and >>>one >>> common), so making the edit removes the two rare ones from the list. >>> >>> I'm also preparing a 2.30 release that exports a "strict" codon table >>>to >>> the BioPerl module, that will be used by default and should fix the >>>issue. >>> >>> Thanks, >>> Carson >>> >>> >>> >>> On 10/12/13 11:06 AM, "Felix Bemm" wrote: >>> >>>> Hi, >>>> >>>> I have a serious problems with maker annotations especially the start >>>> codon placement. It started happening with maker version 2.27! About >>>>1/3 >>>> of my genes are missing a proper start codon. When reviewing the GFF >>>> files gene predictors and est evidence standalone would predict the >>>> right gene structure including the start codon. I was thinking that >>>>this >>>> problem was somehow linked to my gene models but looking at their >>>> standalone prediction I discarded that theory. Just some stats about >>>> proteins with proper start codon (first column) and missing start >>>>codon >>>> (second column). >>>> >>>> Maker 9268 4215 >>>> SNAP 16577 896 >>>> Genemark 18764 290 >>>> >>>> Numbers look the same when Augustus is used (currently retraining). >>>> Manually using EST's in Apollo often fixes the problems but I don't >>>>want >>>> to check > 4000 genes for this. >>>> >>>> Do you have any idea what could cause such a problem? If you need some >>>> example gff files I can send you. 
>>>> >>>> Best regards >>>> Felix >>>> >>>> -- >>>> Felix Bemm >>>> Department of Bioinformatics >>>> University of Würzburg, Germany >>>> Tel: +49 931 - 31 83696 >>>> Fax: +49 931 - 31 84552 >>>> felix.bemm at uni-wuerzburg.de >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> >>>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >>-- >>Felix Bemm >>Department of Bioinformatics >>University of Würzburg, Germany >>Tel: +49 931 - 31 83696 >>Fax: +49 931 - 31 84552 >>felix.bemm at uni-wuerzburg.de > > > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
From myandell at genetics.utah.edu Thu Oct 24 07:55:05 2013 From: myandell at genetics.utah.edu (Mark Yandell) Date: Thu, 24 Oct 2013 13:55:05 +0000 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: References: , Message-ID: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> Hi Graham, Carson is currently on the road, moving his family from Canada back to the States. He should be back online sometime next week. Sorry for the delay. --mark Mark Yandell Professor of Human Genetics H.A. & Edna Benning Presidential Endowed Chair Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:801-587-7707
From felix.bemm at uni-wuerzburg.de Thu Oct 24 08:03:27 2013 From: felix.bemm at uni-wuerzburg.de (Felix Bemm) Date: Thu, 24 Oct 2013 16:03:27 +0200 Subject: [maker-devel] Serious Start Codon Problem
In-Reply-To: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> References: , <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> Message-ID: <526928AF.90108@uni-wuerzburg.de> Hi, I used the temporary fix for Bio::Tools::CodonTable. It's working pretty nicely. Almost every predicted protein now starts with an M ;) Bests Felix -- Felix Bemm Department of Bioinformatics University of Würzburg, Germany Tel: +49 931 - 31 83696 Fax: +49 931 - 31 84552 felix.bemm at uni-wuerzburg.de
From jim.hu.biobio at gmail.com Thu Oct 24 08:32:01 2013 From: jim.hu.biobio at gmail.com (JH Gmail) Date: Thu, 24 Oct 2013 09:32:01 -0500 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> References: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> Message-ID:
<69247335-1227-41CB-94EA-BFEEE5207D51@gmail.com> Slightly off-topic questions: What is the assessment of right and wrong based on? Is there a training set based on experiments? In bacteria, the use of the minor starts is significant, but the mechanism is so different I would expect eukaryotes to be more strict. Does maker handle ribosome profiling yet? jim
From amelia.ireland at gmod.org Thu Oct 24 12:52:17 2013 From: amelia.ireland at gmod.org (Amelia Ireland) Date: Thu, 24 Oct 2013 11:52:17 -0700 Subject: [maker-devel] GMOD 2014: San Diego Message-ID: Greetings GMOD community citizens!
Online registration is now open for the 2014 GMOD meeting, to be held at the Best Western Seven Seas, San Diego, CA, on January 16-17, 2014 (after PAG). The meeting is open to GMOD users, developers, and anyone interested in the project and the software we provide. Please see the meeting page, http://gmod.org/wiki/Jan_2014_GMOD_Meeting, for more information. Early bird registration rates apply until Dec. 14th. We have secured a discount on a block of rooms at the Best Western Seven Seas; please see the wiki for more information on booking. Please feel free to contact the GMOD helpdesk on help at gmod.org if you have any questions. Thanks! -- Amelia Ireland GMOD Community Support http://gmod.org || @gmodproject
From barry.moore at genetics.utah.edu Thu Oct 24 15:09:20 2013 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Thu, 24 Oct 2013 15:09:20 -0600 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: <69247335-1227-41CB-94EA-BFEEE5207D51@gmail.com> References: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> <69247335-1227-41CB-94EA-BFEEE5207D51@gmail.com> Message-ID: <1F2D4C5A-1B72-409A-8F7D-4793F9BCEE88@genetics.utah.edu> Hi Jim, You can add any kind of evidence you want to MAKER, including ribosomal profiling data, as long as it is in GFF3 format. You could pass in your GFF3-formatted ribosomal profiling data in maker_opts.ctl with the protein_gff option, and it will be treated as protein evidence. However, if you are wondering whether anyone has coded a specific addition to MAKER to accept ribosomal profiling data and treat it as a different kind of evidence, then no, this hasn't been done. B
Barry Moore Research Scientist Dept.
of Human Genetics University of Utah Salt Lake City, UT 84112 -------------------------------------------- (801) 585-3543
From marc.hoeppner at imbim.uu.se Fri Oct 25 08:20:04 2013 From: marc.hoeppner at imbim.uu.se (Marc P. Hoeppner) Date: Fri, 25 Oct 2013 16:20:04 +0200 Subject: [maker-devel] Maker MPICH2 issue (Multiple MAKER processes have been started in the same directory) Message-ID: <526A7E14.9010906@imbim.uu.se>

Dear list,

I know that MPICH2-related issues have been discussed previously, but the suggested solutions don't seem to work for me. Briefly, I have set up a new cluster with a fresh install of mpich2 (SL 6.4, from the yum repos) and a newly compiled build of Maker 2.28. When I start maker with mpiexec, I get spammed with 'WARNING: Multiple MAKER processes have been started in the same directory.'

Here is what I have tried so far - still getting the error :-< Any suggestions would be appreciated.

-------- MPICH2 ---------

which mpiexec: /usr/lib64/mpich2/bin/mpiexec

Started the mpd ring as normal user across 5 nodes (bnode-01 to bnode-04, and the head node bhead)

mpdtrace: bhead bnode-02 bnode-03 bnode-04 bnode-01

mpdringtest: time for 1 loops = 0.00128293037415 seconds

mpiexec -n 32 -machinefile hostfile hostname:

I also ran the Skampi Mpi Benchmark, which ran through just fine.

----------- MAKER -----------

./Build status

PERL Dependencies: VERIFIED
External Programs: VERIFIED
External C Libraries: VERIFIED
MPI SUPPORT: ENABLED
MWAS Web Interface: DISABLED
MAKER PACKAGE: CONFIGURATION OK

During configuration/installation, Build.PL automatically detected all MPI data and never complained, so I have to assume that everything went fine there.

-- ---------- Marc P.
Hoeppner, PhD Department of Microbiology and Medical Biochemistry Uppsala University, Sweden marc.hoeppner at imbim.uu.se From jim.hu.biobio at gmail.com Fri Oct 25 08:51:19 2013 From: jim.hu.biobio at gmail.com (JH Gmail) Date: Fri, 25 Oct 2013 09:51:19 -0500 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: <1F2D4C5A-1B72-409A-8F7D-4793F9BCEE88@genetics.utah.edu> References: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> <69247335-1227-41CB-94EA-BFEEE5207D51@gmail.com> <1F2D4C5A-1B72-409A-8F7D-4793F9BCEE88@genetics.utah.edu> Message-ID: <596400F6-3215-4F71-A896-AF52CEC8D77D@gmail.com> Thanks Barry, I'm thinking about this more theoretically, as we don't have immediate plans, but... This would work for gff derived from ribosome profiling analysis, but it would be nice to be able to start with bam files of mapped reads, or coverage scores as bigwig or gff. RNA-seq has similar issues, but IIRC Maker expects RNA-seq data to be assembled first. It may not be possible to do de novo cds assembly from ribosome profiling because I think the nuclease step may kill the overlaps. I'd have to look at some of the available data. Jim Sent from my iPad > On Oct 24, 2013, at 4:09 PM, Barry Moore wrote: > > Hi Jim, > > You can and any kind of evidence you want to MAKER including ribosomal profiling data as long as you format it in GFF3 format. You could pass in your GFF3 formatted ribosomal profiling data in maker_opt.ctl with the protein_gff option and it will be treated as protein evidence. However if you are wondering if anyone has coded a specific addition to MAKER to accept ribosomal profiling data and treat it as a different kind of evidence then no, this hasn't been done. > > B > >> On Oct 24, 2013, at 8:32 AM, JH Gmail wrote: >> >> Slightly off-topic questions: What is the assessment of right and wrong based on? Is there a training set based on experiments? 
In bacteria, the use of the minor starts is significant, but the mechanism is so different I would expect eukaryotes to be more strict. >> >> Does maker handle ribosome profiling yet? >> >> jim >> >> Sent from my iPad >> >>> On Oct 24, 2013, at 8:55 AM, Mark Yandell wrote: >>> >>> Hi Graham, >>> >>> Carson is currently on the road, moving his family from Canada back to the States. He should be back online sometime next week. Sorry for the delay. >>> >>> >>> --mark >>> >>> >>> Mark Yandell >>> Professor of Human Genetics >>> H.A. & Edna Benning Presidential Endowed Chair >>> Eccles Institute of Human Genetics >>> University of Utah >>> 15 North 2030 East, Room 2100 >>> Salt Lake City, UT 84112-5330 >>> ph:801-587-7707 >>> >>> ________________________________________ >>> From: maker-devel [maker-devel-bounces at yandell-lab.org] on behalf of graham etherington (TSL) [graham.etherington at sainsbury-laboratory.ac.uk] >>> Sent: Thursday, October 24, 2013 7:42 AM >>> To: Carson Holt; Felix Bemm; maker-devel at yandell-lab.org >>> Cc: Kate Bailey (TSL) >>> Subject: Re: [maker-devel] Serious Start Codon Problem >>> >>> Hi, >>> I notice that the latest version of Maker available from >>> http://www.yandell-lab.org/software/maker.html is still v2.29b (4 Oct >>> 2013). As we're also having the start codon problem we'd like to get hold >>> of v2.30. Do you know when this will be released? >>> Many thanks, >>> Graham >>> >>> >>> Dr. Graham Etherington >>> Bioinformatics Support Officer, >>> The Sainsbury Laboratory, >>> Norwich Research Park, >>> Norwich NR4 7UH. >>> UK >>> Tel: +44 (0)1603 450601 >>> >>> >>> >>> >>> >>>> On 13/10/2013 06:26, "Carson Holt" wrote: >>>> >>>> Before Wednesday, because after that I'm on a 6 day road trip, so I have >>>> to have it wrapped up before I leave. >>>> >>>> --Carson >>>> >>>> >>>> >>>> >>>>> On 10/12/13 12:16 PM, "Felix Bemm" wrote: >>>>> >>>>> Thx Carson! Do you now when 2.30 will be available? 
Bests >>>>> >>>>>> On 12.10.2013 17:48, Carson Holt wrote: >>>>>> This issue just came up this week on another e-mail as well. >>>>>> >>>>>> One way to fix it, is to edit the Bio::Tools::CodonTable module from >>>>>> BioPerl (use maker --debug to have maker print out the locations of all >>>>>> modules it uses). Such a hack should only be done as a temporary work >>>>>> around. >>>>>> >>>>>> You change line 256 from --> >>>>>> ---M---------------M---------------M---------------------------- >>>>>> >>>>>> To--> >>>>>> -----------------------------------M---------------------------- >>>>>> >>>>>> >>>>>> Maker uses the BioPerl is_start_codon function to determine if the >>>>>> codon >>>>>> used in a result it receives is acceptable or if it should look >>>>>> upstream >>>>>> or downstream for a different one. Apparently there are actually 3 >>>>>> acceptable start codons in the standard codon table (2 rare ones and >>>>>> one >>>>>> common), so making the edit removes the two rare ones from the list. >>>>>> >>>>>> I'm also preparing a 2.30 release that exports a "strict" codon table >>>>>> to >>>>>> the BioPerl module, that will be used by default and should fix the >>>>>> issue. >>>>>> >>>>>> Thanks, >>>>>> Carson >>>>>> >>>>>> >>>>>> >>>>>>> On 10/12/13 11:06 AM, "Felix Bemm" wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I have a serious problems with maker annotations especially the start >>>>>>> codon placement. It started happening with maker version 2.27! About >>>>>>> 1/3 >>>>>>> of my genes are missing a proper start codon. When reviewing the GFF >>>>>>> files gene predictors and est evidence standalone would predict the >>>>>>> right gene structure including the start codon. I was thinking that >>>>>>> this >>>>>>> problem was somehow linked to my gene models but looking at their >>>>>>> standalone prediction I discarded that theory. 
Just some stats about >>>>>>> proteins with proper start codon (first column) and missing start >>>>>>> codon >>>>>>> (second column). >>>>>>> >>>>>>> Maker 9268 4215 >>>>>>> SNAP 16577 896 >>>>>>> Genemark 18764 290 >>>>>>> >>>>>>> Numbers look the same when Augustus is used (currently retraining). >>>>>>> Manually using EST's in Apollo often fixes the problems but I don't >>>>>>> want >>>>>>> to check > 4000 genes for this. >>>>>>> >>>>>>> Do you have any idea what could cause such a problem? If you need some >>>>>>> example gff files I can send you. >>>>>>> >>>>>>> Best regards >>>>>>> Felix >>>>>>> >>>>>>> -- >>>>>>> Felix Bemm >>>>>>> Department of Bioinformatics >>>>>>> University of W?rzburg, Germany >>>>>>> Tel: +49 931 - 31 83696 >>>>>>> Fax: +49 931 - 31 84552 >>>>>>> felix.bemm at uni-wuerzburg.de >>>>>>> >>>>>>> _______________________________________________ >>>>>>> maker-devel mailing list >>>>>>> maker-devel at box290.bluehost.com >>>>>>> >>>>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>>>> >>>>> -- >>>>> Felix Bemm >>>>> Department of Bioinformatics >>>>> University of W?rzburg, Germany >>>>> Tel: +49 931 - 31 83696 >>>>> Fax: +49 931 - 31 84552 >>>>> felix.bemm at uni-wuerzburg.de >>>> >>>> >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >>> >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> 
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > Barry Moore > Research Scientist > Dept. of Human Genetics > University of Utah > Salt Lake City, UT 84112 > -------------------------------------------- > (801) 585-3543 > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry.moore at genetics.utah.edu Fri Oct 25 09:44:55 2013 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Fri, 25 Oct 2013 09:44:55 -0600 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: <596400F6-3215-4F71-A896-AF52CEC8D77D@gmail.com> References: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> <69247335-1227-41CB-94EA-BFEEE5207D51@gmail.com> <1F2D4C5A-1B72-409A-8F7D-4793F9BCEE88@genetics.utah.edu> <596400F6-3215-4F71-A896-AF52CEC8D77D@gmail.com> Message-ID: Thanks Jim. B On Oct 25, 2013, at 8:51 AM, JH Gmail wrote: > Thanks Barry, > > I'm thinking about this more theoretically, as we don't have immediate plans, but... > > This would work for GFF derived from ribosome profiling analysis, but it would be nice to be able to start with BAM files of mapped reads, or coverage scores as bigWig or GFF. RNA-seq has similar issues, but IIRC Maker expects RNA-seq data to be assembled first. It may not be possible to do de novo CDS assembly from ribosome profiling because I think the nuclease step may kill the overlaps. I'd have to look at some of the available data. > > Jim > > Sent from my iPad > > On Oct 24, 2013, at 4:09 PM, Barry Moore wrote: > >> Hi Jim, >> >> You can add any kind of evidence you want to MAKER, including ribosomal profiling data, as long as it is in GFF3 format. You could pass in your GFF3 formatted ribosomal profiling data in maker_opts.ctl with the protein_gff option and it will be treated as protein evidence.
However if you are wondering if anyone has coded a specific addition to MAKER to accept ribosomal profiling data and treat it as a different kind of evidence then no, this hasn't been done. >> >> B >> >> On Oct 24, 2013, at 8:32 AM, JH Gmail wrote: >> >>> Slightly off-topic questions: What is the assessment of right and wrong based on? Is there a training set based on experiments? In bacteria, the use of the minor starts is significant, but the mechanism is so different I would expect eukaryotes to be more strict. >>> >>> Does maker handle ribosome profiling yet? >>> >>> jim >>> >>> Sent from my iPad >>> >>>> On Oct 24, 2013, at 8:55 AM, Mark Yandell wrote: >>>> >>>> Hi Graham, >>>> >>>> Carson is currently on the road, moving his family from Canada back to the States. He should be back online sometime next week. Sorry for the delay. >>>> >>>> >>>> --mark >>>> >>>> >>>> Mark Yandell >>>> Professor of Human Genetics >>>> H.A. & Edna Benning Presidential Endowed Chair >>>> Eccles Institute of Human Genetics >>>> University of Utah >>>> 15 North 2030 East, Room 2100 >>>> Salt Lake City, UT 84112-5330 >>>> ph:801-587-7707 >>>> >>>> ________________________________________ >>>> From: maker-devel [maker-devel-bounces at yandell-lab.org] on behalf of graham etherington (TSL) [graham.etherington at sainsbury-laboratory.ac.uk] >>>> Sent: Thursday, October 24, 2013 7:42 AM >>>> To: Carson Holt; Felix Bemm; maker-devel at yandell-lab.org >>>> Cc: Kate Bailey (TSL) >>>> Subject: Re: [maker-devel] Serious Start Codon Problem >>>> >>>> Hi, >>>> I notice that the latest version of Maker available from >>>> http://www.yandell-lab.org/software/maker.html is still v2.29b (4 Oct >>>> 2013). As we're also having the start codon problem we'd like to get hold >>>> of v2.30. Do you know when this will be released? >>>> Many thanks, >>>> Graham >>>> >>>> >>>> Dr. 
Graham Etherington >>>> Bioinformatics Support Officer, >>>> The Sainsbury Laboratory, >>>> Norwich Research Park, >>>> Norwich NR4 7UH. >>>> UK >>>> Tel: +44 (0)1603 450601 >>>> >>>> >>>> >>>> >>>> >>>>> On 13/10/2013 06:26, "Carson Holt" wrote: >>>>> >>>>> Before Wednesday, because after that I'm on a 6 day road trip, so I have >>>>> to have it wrapped up before I leave. >>>>> >>>>> --Carson >>>>> >>>>> >>>>> >>>>> >>>>>> On 10/12/13 12:16 PM, "Felix Bemm" wrote: >>>>>> >>>>>> Thx Carson! Do you now when 2.30 will be available? Bests >>>>>> >>>>>>> On 12.10.2013 17:48, Carson Holt wrote: >>>>>>> This issue just came up this week on another e-mail as well. >>>>>>> >>>>>>> One way to fix it, is to edit the Bio::Tools::CodonTable module from >>>>>>> BioPerl (use maker --debug to have maker print out the locations of all >>>>>>> modules it uses). Such a hack should only be done as a temporary work >>>>>>> around. >>>>>>> >>>>>>> You change line 256 from --> >>>>>>> ---M---------------M---------------M---------------------------- >>>>>>> >>>>>>> To--> >>>>>>> -----------------------------------M---------------------------- >>>>>>> >>>>>>> >>>>>>> Maker uses the BioPerl is_start_codon function to determine if the >>>>>>> codon >>>>>>> used in a result it receives is acceptable or if it should look >>>>>>> upstream >>>>>>> or downstream for a different one. Apparently there are actually 3 >>>>>>> acceptable start codons in the standard codon table (2 rare ones and >>>>>>> one >>>>>>> common), so making the edit removes the two rare ones from the list. >>>>>>> >>>>>>> I'm also preparing a 2.30 release that exports a "strict" codon table >>>>>>> to >>>>>>> the BioPerl module, that will be used by default and should fix the >>>>>>> issue. 
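As a rough illustration of the start-codon strings quoted above, here is a minimal Python sketch. It assumes the 64-character string follows the usual NCBI/BioPerl base order T, C, A, G; the function names `start_codons` and `is_start_codon` are hypothetical helpers written for this note and are not BioPerl's or MAKER's actual code. Under that assumption, the three `M` flags in the standard string decode to TTG, CTG and ATG, and the edited string leaves only ATG.

```python
# Assumed base order for the 64-character codon-table strings: T, C, A, G.
BASES = "TCAG"

# Built from counts rather than typed dashes so the length is always 64.
# These mirror the "before" and "after" strings quoted in the message.
STANDARD_STARTS = "---M" + "-" * 15 + "M" + "-" * 15 + "M" + "-" * 28
STRICT_STARTS = "-" * 35 + "M" + "-" * 28


def start_codons(starts):
    """Return the codons flagged 'M' in a 64-character starts string."""
    if len(starts) != 64:
        raise ValueError("a starts string must have exactly 64 characters")
    codons = [a + b + c for a in BASES for b in BASES for c in BASES]
    return [codon for codon, flag in zip(codons, starts) if flag == "M"]


def is_start_codon(codon, starts):
    """Hypothetical helper: is this codon flagged as a start in the table?"""
    return codon.upper().replace("U", "T") in start_codons(starts)


print(start_codons(STANDARD_STARTS))  # ['TTG', 'CTG', 'ATG']
print(start_codons(STRICT_STARTS))    # ['ATG']
```

With the standard string a gene model beginning with CTG or TTG would pass such a check, which is consistent with the behaviour described in the thread; with the strict string only ATG passes.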
>>>>>>> >>>>>>> Thanks, >>>>>>> Carson >>>>>>> >>>>>>> >>>>>>> >>>>>>>> On 10/12/13 11:06 AM, "Felix Bemm" wrote: >>>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> I have a serious problems with maker annotations especially the start >>>>>>>> codon placement. It started happening with maker version 2.27! About >>>>>>>> 1/3 >>>>>>>> of my genes are missing a proper start codon. When reviewing the GFF >>>>>>>> files gene predictors and est evidence standalone would predict the >>>>>>>> right gene structure including the start codon. I was thinking that >>>>>>>> this >>>>>>>> problem was somehow linked to my gene models but looking at their >>>>>>>> standalone prediction I discarded that theory. Just some stats about >>>>>>>> proteins with proper start codon (first column) and missing start >>>>>>>> codon >>>>>>>> (second column). >>>>>>>> >>>>>>>> Maker 9268 4215 >>>>>>>> SNAP 16577 896 >>>>>>>> Genemark 18764 290 >>>>>>>> >>>>>>>> Numbers look the same when Augustus is used (currently retraining). >>>>>>>> Manually using EST's in Apollo often fixes the problems but I don't >>>>>>>> want >>>>>>>> to check > 4000 genes for this. >>>>>>>> >>>>>>>> Do you have any idea what could cause such a problem? If you need some >>>>>>>> example gff files I can send you. 
>>>>>>>> >>>>>>>> Best regards >>>>>>>> Felix >>>>>>>> >>>>>>>> -- >>>>>>>> Felix Bemm >>>>>>>> Department of Bioinformatics >>>>>>>> University of W?rzburg, Germany >>>>>>>> Tel: +49 931 - 31 83696 >>>>>>>> Fax: +49 931 - 31 84552 >>>>>>>> felix.bemm at uni-wuerzburg.de >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> maker-devel mailing list >>>>>>>> maker-devel at box290.bluehost.com >>>>>>>> >>>>>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>>>>> >>>>>> -- >>>>>> Felix Bemm >>>>>> Department of Bioinformatics >>>>>> University of W?rzburg, Germany >>>>>> Tel: +49 931 - 31 83696 >>>>>> Fax: +49 931 - 31 84552 >>>>>> felix.bemm at uni-wuerzburg.de >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> maker-devel mailing list >>>>> maker-devel at box290.bluehost.com >>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>>> >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >> Barry Moore >> Research Scientist >> Dept. of Human Genetics >> University of Utah >> Salt Lake City, UT 84112 >> -------------------------------------------- >> (801) 585-3543 >> >> >> >> Barry Moore Research Scientist Dept. of Human Genetics University of Utah Salt Lake City, UT 84112 -------------------------------------------- (801) 585-3543 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From marc.hoeppner at imbim.uu.se Sat Oct 26 05:54:07 2013 From: marc.hoeppner at imbim.uu.se (Marc P. Hoeppner) Date: Sat, 26 Oct 2013 13:54:07 +0200 Subject: [maker-devel] Maker MPICH2 issue (Multiple MAKER processes have been started in the same directory) In-Reply-To: <526A7E14.9010906@imbim.uu.se> References: <526A7E14.9010906@imbim.uu.se> Message-ID: <526BAD5F.2000800@imbim.uu.se> Dear List, after much more debugging, I found the solution myself. Here are the details: The Maker installation instructions refer to 'mpich2' in multiple places, including weblinks (which are outdated btw). Some digging revealed that mpich2 has since been renamed and merged back into mpich. So the website is www.mpich.org and the current stable version is mpich-3.0.4. Closely following the installation instructions yielded a working copy of mpiexec that Maker was willing to accept. Good news - but I suggest updating the documentation. Most importantly for SL/CentOS users, mpich2 as packaged with e.g. Scientific Linux *does not* work with Maker. You will need to download the latest stable build and compile it yourself (the mpich2 version bundled with Maker doesn't work either btw). Cheers, Marc > Dear list, > > I know that MPICH2 related issues have been discussed previously, but > the suggested solutions don't seem to work for me. > Briefly, I have set up a new cluster, with a fresh install of mpich2 > (SL 6.4, from the yum repos) and a newly compiled build of Maker 2.28. > > When I start maker with mpiexec, I get spammed with > 'WARNING: Multiple MAKER processes have been started in the same > directory.'
> > Here is what I have tried so far - still getting the error :-< Any > suggestions would be appreciated > > -------- > MPICH2 > --------- > > which mpiexec: > /usr/lib64/mpich2/bin/mpiexec > > Started the mpd ring as normal user across 5 nodes (bnode-01 to > bnode-04, and the head node bhead) > > mpdtrace: > bhead > bnode-02 > bnode-03 > bnode-04 > bnode-01 > > mpdringtest: > time for 1 loops = 0.00128293037415 seconds > > mpixec -n 32 -machinefile hostfile hostname: > > > > I also ran the Skampi Mpi Benchmark, which ran through just fine. > > ----------- > MAKER > ----------- > > ./Build status > > PERL Dependencies: VERIFIED > External Programs: VERIFIED > External C Libraries: VERIFIED > MPI SUPPORT: ENABLED > MWAS Web Interface: DISABLED > MAKER PACKAGE: CONFIGURATION OK > > During configuration/installation, Build.PL automatically detected all > MPI data and never complained, so I have to assume that everything > went fine there. > -- ---------- Marc P. Hoeppner, PhD Department of Microbiology and Medical Biochemistry Uppsala University, Sweden marc.hoeppner at imbim.uu.se From barry.utah at gmail.com Sat Oct 26 08:41:24 2013 From: barry.utah at gmail.com (Barry Moore) Date: Sat, 26 Oct 2013 08:41:24 -0600 Subject: [maker-devel] Maker MPICH2 issue (Multiple MAKER processes have been started in the same directory) In-Reply-To: <526BAD5F.2000800@imbim.uu.se> References: <526A7E14.9010906@imbim.uu.se> <526BAD5F.2000800@imbim.uu.se> Message-ID: Hi Marc, Thanks so much for sharing the results of all your hard work! We will follow up on your feedback and incorporate the necessary updates to MAKER docs and installation procedures. B On Oct 26, 2013, at 5:54 AM, Marc P. Hoeppner wrote: > Dear List, > > after much more debugging, I found the solution myself. Here are the details: > > The Maker installation instructions refer to 'mpich2' in multiple places, including weblinks (whchi are outdated btw). 
Some digging revealed that mpich2 has since been renamed and merged back into mpich. So the website is www.mpich.org and the current stable version is mpich-3.0.4. Closely following the installation instruction yielded a working copy of mpiexec that Maker was willing to accept. Good news - but I suggest to update the documentation. Most importantly for either SL/CentOS users, mpich2 as packaged with e.g. Scientific Linux *does not* work with Maker. You will need to download the latest stable build and compile it yourself (the mpich2 version bundled with Maker doesn't work either btw). > > Cheers, > > Marc > >> Dear list, >> >> I know that MPICH2 related issues have been discussed previously, but the suggested solutions don't seem to work for me. >> Briefly, I have set up a new cluster, with a fresh install of mpich2 (SL 6.4, from the yum repos) and a newly compiled build of Maker 2.28. >> >> When I start maker with mpiexec, I get spammed with >> 'WARNING: Multiple MAKER processes have been started in the same directory.' >> >> Here is what I have tried so far - still getting the error :-< Any suggestions would be appreciated >> >> -------- >> MPICH2 >> --------- >> >> which mpiexec: >> /usr/lib64/mpich2/bin/mpiexec >> >> Started the mpd ring as normal user across 5 nodes (bnode-01 to bnode-04, and the head node bhead) >> >> mpdtrace: >> bhead >> bnode-02 >> bnode-03 >> bnode-04 >> bnode-01 >> >> mpdringtest: >> time for 1 loops = 0.00128293037415 seconds >> >> mpixec -n 32 -machinefile hostfile hostname: >> >> >> >> I also ran the Skampi Mpi Benchmark, which ran through just fine. 
>> >> ----------- >> MAKER >> ----------- >> >> ./Build status >> >> PERL Dependencies: VERIFIED >> External Programs: VERIFIED >> External C Libraries: VERIFIED >> MPI SUPPORT: ENABLED >> MWAS Web Interface: DISABLED >> MAKER PACKAGE: CONFIGURATION OK >> >> During configuration/installation, Build.PL automatically detected all MPI data and never complained, so I have to assume that everything went fine there. >> > > > -- > ---------- > Marc P. Hoeppner, PhD > Department of Microbiology and Medical Biochemistry > Uppsala University, Sweden > marc.hoeppner at imbim.uu.se > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org Barry Moore Research Scientist Dept. of Human Genetics University of Utah Salt Lake City, UT 84112 -------------------------------------------- (801) 585-3543 -------------- next part -------------- An HTML attachment was scrubbed... URL: From wholtz at lygos.com Wed Oct 30 12:21:52 2013 From: wholtz at lygos.com (William Holtz) Date: Wed, 30 Oct 2013 11:21:52 -0700 (PDT) Subject: [maker-devel] maker apollo error Message-ID: <1383157312.95043.YahooMailNeo@web5706.biz.mail.ne1.yahoo.com> Hi, I'm attempting to reply to a thread on this list from before I was a member. My apologies if this gets messed up. I've tried installing maker v2.28 and v2.29p-beta and both of them fail due to the URL of the apollo tar file being outdated as previously reported on this list.?The updated URL that Carson Holt posted,?http://sourceforge.net/code-snapshots/svn/g/gm/gmod/svn/gmod-svn-25291-apollo-trunk.zip,?is also no longer valid. I was able to find a copy of the apollo source at?http://downloads.sourceforge.net/project/gmod/Apollo/1.11.1/apollo-1.11.1.tgz. 
However my attempts to integrate this into the maker build process have been unsuccessful, but likely could be done by someone more skilled than I. Any pointers on how to get past this hurdle would be appreciated. Thanks in advance for your help. -Will From barry.utah at gmail.com Wed Oct 30 16:31:58 2013 From: barry.utah at gmail.com (Barry Moore) Date: Wed, 30 Oct 2013 16:31:58 -0600 Subject: [maker-devel] maker apollo error In-Reply-To: <1383157312.95043.YahooMailNeo@web5706.biz.mail.ne1.yahoo.com> References: <1383157312.95043.YahooMailNeo@web5706.biz.mail.ne1.yahoo.com> Message-ID: Hi Will, You can edit these URLs in the following file: maker/src/locations In a section that looks like this: 'apollo' => {'Linux_x86_64' => 'http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar', 'Linux_i386' => 'http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar', 'Darwin_i386' => 'http://weatherby.genetics.utah.edu/apollo/apollo.tar.gz', 'Darwin_x86_64' => 'http://weatherby.genetics.utah.edu/apollo/apollo.tar.gz', 'src_' => 'http://weatherby.genetics.utah.edu/apollo/apollo.tar.gz', }, Carson may have more feedback for you as well, but he is right in the middle of a move, so it may be a few days before he can reply; modifying this file should work for you in the meantime. For any executables that are missing, you can always install them manually and add them to your PATH, and MAKER will find them. Also, Apollo shouldn't be required for a base MAKER install, so you could ignore that dependency as well. Thanks, B On Oct 30, 2013, at 12:21 PM, William Holtz wrote: > Hi, > > I'm attempting to reply to a thread on this list from before I was a member. My apologies if this gets messed up. > > I've tried installing maker v2.28 and v2.29p-beta and both of them fail due to the URL of the apollo tar file being outdated as previously reported on this list.
The updated URL that Carson Holt posted, http://sourceforge.net/code-snapshots/svn/g/gm/gmod/svn/gmod-svn-25291-apollo-trunk.zip, is also no longer valid. I was able to find a copy of the apollo source at http://downloads.sourceforge.net/project/gmod/Apollo/1.11.1/apollo-1.11.1.tgz. However my attempts to integrate this into the maker build process have been unsuccessful, but likely could be done by someone more skilled than I. Any pointers on how to get past this hurdle would be appreciated. > > Thanks in advance for your help. > > -Will > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org Barry Moore Research Scientist Dept. of Human Genetics University of Utah Salt Lake City, UT 84112 -------------------------------------------- (801) 585-3543 -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Wed Oct 2 08:14:44 2013 From: carsonhh at gmail.com (Carson Holt) Date: Wed, 02 Oct 2013 10:14:44 -0400 Subject: [maker-devel] maker2 scripts for functional annotation In-Reply-To: Message-ID: It still works, but performance will be reduced. Without ESTs you won't get any UTR prediction, and you will be limited by how well the protein set matches the genes. If you use the protein set of a close relative you will capture most things. You will probably capture 80-95% of what you would by also including ESTs. It all depends on how different the species in the protein evidence file are from the species being annotated. --Carson On 10/1/13 3:00 PM, "Xia.Cao at dupont.com" wrote: >Hi Carson, > >Thank you for your message and your kind help. Now conversion from >match/match_part to gene/mRNA/exons/CDS works well after blanking out >some options in control file. > >I have one more question about maker2.
In case we don't have EST evidence >(set of ESTs or assembled mRNA-seq in fasta format) for a genome, does >maker2 function? If it does function, could you please let me know the >performance of maker2 without providing EST evidence compared to the one >with EST evidence? > >Thank you again and I look forward to hearing from you. > >Best, >Xia > > > > >-----Original Message----- >From: Carson Holt [mailto:carsonhh at gmail.com] >Sent: Wednesday, September 25, 2013 10:36 AM >To: CAO, XIA; myandell at genetics.utah.edu; RIVERA, CORBAN GREGORY; >maker-devel at yandell-lab.org >Subject: Re: [maker-devel] maker2 scripts for functional annotation > >If it is launching predictors then you have snaphmm or augustus_species >set. You need to blank out all other options in the control files >(including repeat masking options, proteins, ESTs, etc.) when trying to >convert match/match_part to gene/mRNA/exons/CDS, or else those other >programs will run. > >--Carson > > >On 9/25/13 10:31 AM, "Xia.Cao at dupont.com" wrote: > >>Hi Carson, >> >>Thank you for the message and your kind help. We tested maker2 by >>setting keep_preds=1, pred_gff=generated_gff_file_from_first_makerRun. >>But it seemed maker2 started to launch all predictors again and it took >>a long time to finish. I wonder if there is any way that we can directly >>get gene/mRNA/exons/CDS gff file without re-running maker2 to convert >>match/match_part features into gene/mRNA/exons/CDS. >> >>Thanks, >>Xia >> >>-----Original Message----- >>From: Carson Holt [mailto:carsonhh at gmail.com] >>Sent: Thursday, September 19, 2013 5:58 PM >>To: Mark Yandell; CAO, XIA; RIVERA, CORBAN GREGORY; >>maker-devel at yandell-lab.org >>Subject: Re: [maker-devel] maker2 scripts for functional annotation >> >>Hello Corban & Xia, >> >>Some scripts like gff3_preds2models are deprecated.
To get the same >>result as was offered by gff3_preds2models, just give your >>match/match_part features to pred_gff= in the maker_opts.ctl file, set >>keep_preds=1, and run with all other options and predictors turned off. >>The final MAKER result will be your match/match_part features converted >>into gene/mRNA/exons/CDS. >> >>For functional annotation, you can use InterProScan, BLASTP against >>UniProt, or BLAST2GO. My preference is to use InterProScan to add GO >>terms and protein domains via the ipr_update_gff and iprscan2gff3 >>scripts. Then add putative gene functions via BLASTP to UniProt and >>maker_functional_fasta and maker_functional_gff scripts. >> >>Go ahead and take a look at those tools and let me know if you >>have any questions or need help configuring them. >> >>Thanks, >>Carson >> >> >>On 9/19/13 11:53 AM, "Mark Yandell" wrote: >> >>>Hi Corban & Xia, >>> >>> >>>I've forwarded your question along to the MAKER_dev list, where you can >>>get speedy answers to your maker related questions. Thanks for using >>>MAKER. >>> >>>--mark >>> >>> >>>Mark Yandell >>>Professor of Human Genetics >>>H.A. & Edna Benning Presidential Endowed Chair Eccles Institute of >>>Human Genetics University of Utah >>>15 North 2030 East, Room 2100 >>>Salt Lake City, UT 84112-5330 >>>ph:801-587-7707 >>> >>>________________________________________ >>>From: Xia.Cao at dupont.com [Xia.Cao at dupont.com] >>>Sent: Thursday, September 19, 2013 11:49 AM >>>To: Mark Yandell; Corban-Gregory.Rivera at dupont.com >>>Subject: maker2 scripts for functional annotation >>> >>>Dr. Yandell, >>> >>>We were recently evaluating maker2 for annotation and going through >>>the maker tutorial from 2012. >>> >>>http://gmod.org/wiki/MAKER_Tutorial_2012 >>> >>>The tutorial makes references to some scripts that we couldn't find in >>>the current release.
We were looking for scripts like >>>gff3_preds2models to convert match/match_part format into annotations >>>with gene/mRNA/exons/CDS and others. I was wondering if maybe we did >>>not have the most up to date version. >>> >>>In addition to getting accurate gene annotations, I was looking for a >>>solution to get functional assignments. I see that there are some >>>scripts like maker_functional_fasta that may help, but I was wondering >>>what you would recommend. >>> >>>Thanks, >>> >>>Corban & Xia >>> >>>This communication is for use by the intended recipient and contains >>>information that may be Privileged, confidential or copyrighted under >>>applicable law. If you are not the intended recipient, you are hereby >>>formally notified that any use, copying or distribution of this >>>e-mail, in whole or in part, is strictly prohibited. Please notify the >>>sender by return e-mail and delete this e-mail from your system. >>>Unless explicitly and conspicuously designated as "E-Contract >>>Intended", this e-mail does not constitute a contract offer, a >>>contract amendment, or an acceptance of a contract offer. This e-mail >>>does not constitute a consent to the use of sender's contact >>>information for direct marketing purposes or for transfers of data to >>>third parties. >>> >>>The dupont.com web address will continue in use for a transitional >>>period for communications sent or received on behalf of DuPont >>>Performance Coatings., which is not affiliated in any way with the >>>DuPont Company. 
>>> >>>Francais Deutsch Italiano Espanol Portugues Japanese Chinese >>>Korean >>> >>> http://www.DuPont.com/corp/email_disclaimer.html >>> >>>_______________________________________________ >>>maker-devel mailing list >>>maker-devel at box290.bluehost.com >>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.or >>>g >> >> >> >>This communication is for use by the intended recipient and contains >>information that may be Privileged, confidential or copyrighted under >>applicable law. If you are not the intended recipient, you are hereby >>formally notified that any use, copying or distribution of this e-mail, >>in whole or in part, is strictly prohibited. Please notify the sender >>by return e-mail and delete this e-mail from your system. Unless >>explicitly and conspicuously designated as "E-Contract Intended", this >>e-mail does not constitute a contract offer, a contract amendment, or >>an acceptance of a contract offer. This e-mail does not constitute a >>consent to the use of sender's contact information for direct marketing >>purposes or for transfers of data to third parties. >> >>The dupont.com web address will continue in use for >>a transitional period for communications sent or received on behalf of >>DuPont Performance Coatings., which is not affiliated in any way with >>the DuPont Company. >> >>Francais Deutsch Italiano Espanol Portugues Japanese Chinese >>Korean >> >> http://www.DuPont.com/corp/email_disclaimer.html >> > > > >This communication is for use by the intended recipient and contains >information that may be Privileged, confidential or copyrighted under >applicable law. If you are not the intended recipient, you are hereby >formally notified that any use, copying or distribution of this e-mail, >in whole or in part, is strictly prohibited. Please notify the sender by >return e-mail and delete this e-mail from your system. 
From Xia.Cao at dupont.com Tue Oct 1 13:00:42 2013
From: Xia.Cao at dupont.com (Xia.Cao at dupont.com)
Date: Tue, 1 Oct 2013 19:00:42 +0000
Subject: [maker-devel] maker2 scripts for functional annotation
In-Reply-To:
References:
Message-ID:

Hi Carson,

Thank you for your message and your kind help. Now conversion from match/match_part to gene/mRNA/exons/CDS works well after blanking out some options in the control file.

I have one more question about maker2. In case we don't have EST evidence (set of ESTs or assembled mRNA-seq in fasta format) for a genome, does maker2 function? If it does function, could you please let me know the performance of maker2 without providing EST evidence compared to the one with EST evidence?

Thank you again and I look forward to hearing from you.

Best,
Xia

From carsonhh at gmail.com Wed Oct 2 08:43:29 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 02 Oct 2013 10:43:29 -0400
Subject: [maker-devel] maker2 scripts for functional annotation
In-Reply-To:
Message-ID:

That's ok. If it is very closely related, provide it directly to the est= option, otherwise provide it to the altest= option. The difference between the two options is how they are aligned. The altest= alignments take about 10x longer than the est= alignments.

--Carson

On 10/2/13 10:40 AM, "Xia.Cao at dupont.com" wrote:

>Hi Carson,
>
>Thank you for your quick and kind reply. One more question here. If we
>don't have EST evidence for the strain we sequenced/assembled, will it be
>ok to provide some EST evidence that is from some other strains which are
>close to the one we are trying to annotate? Could you please let us know
>how this will affect performance of maker2?
>
>Thank you again.
>
>Best,
>Xia

From Carson.Holt at oicr.on.ca Wed Oct 2 09:24:42 2013
From: Carson.Holt at oicr.on.ca (Carson Holt)
Date: Wed, 2 Oct 2013 15:24:42 +0000
Subject: [maker-devel] Incorrect gene model, antisense transcripts
In-Reply-To:
Message-ID:

SNAP appears to be making small ab initio calls all over the place. You may need to retrain it, or drop it completely from your analysis and just use augustus. Try running with protein2genome and then training SNAP off of those results. Also make sure your repeat masking is sufficient. If it is not, you may be getting too many SNAP predictions.

Thanks,
Carson

From: Sivaranjani Namasivayam
Date: Tuesday, October 1, 2013 5:01 PM
To: "maker-devel-bounces at yandell-lab.org"
Subject: Incorrect gene model, antisense transcripts

Hello,

I am working on annotating a genome. I have RNAseq data assembled with Cufflinks and Trinity. I am using MAKER to predict gene models. Below is the procedure I used:

- For the first round of prediction I provided the following: EST evidence: assembled RNAseq, Protein evidence: proteins from related organism, Gene predictor: augustus trained on a related organism.
- The number of genes I got was much less than expected (expected at least 10K genes, got ~4K genes).
- Used the gene models from the first round of annotations to train SNAP (default parameters) and provided the HMM for the second round of annotation (used augustus also). All other data I passed as such in the gff3, except the gene models. I also included some manually curated genes in the EST evidence.
- The number of genes was much higher, ~11k, but many seemed incorrect and appear to have been generated from the SNAP models. I am attaching a screenshot from Apollo showing this.

Would you have any suggestions on why this might be happening and how I can fix it? I am essentially looking for gene models for my assembled RNAseq; can I achieve this by setting est2genome to 1? Would I be losing/misrepresenting any data if I do this?

Also, I recently noticed my RNAseq data has assembled antisense transcripts. MAKER seems to be predicting a coding region for these antisense transcripts in some cases (although the coding region predicted is much smaller than that of the sense gene model). Is there a way to tell MAKER not to predict gene models on both strands at the same locus?

Appreciate your help.

Thanks,
Ranjani

From Xia.Cao at dupont.com Wed Oct 2 08:40:12 2013
From: Xia.Cao at dupont.com (Xia.Cao at dupont.com)
Date: Wed, 2 Oct 2013 14:40:12 +0000
Subject: [maker-devel] maker2 scripts for functional annotation
In-Reply-To:
References:
Message-ID:

Hi Carson,

Thank you for your quick and kind reply. One more question here. If we don't have EST evidence for the strain we sequenced/assembled, will it be ok to provide some EST evidence that is from some other strains which are close to the one we are trying to annotate? Could you please let us know how this will affect performance of maker2?

Thank you again.
Best,
Xia

From markwiz at us.ibm.com Wed Oct 2 10:52:55 2013
From: markwiz at us.ibm.com (Mark Wisner)
Date: Wed, 2 Oct 2013 12:52:55 -0400
Subject: [maker-devel] Maker2 on IBM PowerLinux?
Message-ID:

IBM's PowerLinux is tuned for Big Data workloads. We have had queries for Maker2 on PowerLinux.
How do we work with Maker2 development to create a PowerLinux port?
Thanks,

Mark K. Wisner
Linux On Power ISV PM
IBM
3039 Cornwallis Rd
RTP, NC 27709
Cell 919-649-5813

From barry.moore at genetics.utah.edu Wed Oct 2 11:29:36 2013
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Wed, 2 Oct 2013 11:29:36 -0600
Subject: [maker-devel] Maker2 on IBM PowerLinux?
In-Reply-To:
References:
Message-ID:

Hi Mark,

Carson Holt is the primary Maker developer, but others of us may be able to help as well. I've added e-mails for Carson, Michael Campbell, Mark Yandell and myself, who are currently developers on the project. You can feel free to contact this mailing list - or if the conversation is getting too far afield for general interest you can contact us off list and we will do whatever we can to help.

If you feel like you need to do some modification of MAKER code to optimize it for PowerLinux I'm happy to give you a branch on the subversion repo to work with and then we can work with you to merge these changes back to the trunk when you feel like you have some improvements that benefit PowerLinux (and all other) users.

Thanks very much for your interest in MAKER - your participation in the community is most welcome!
Barry

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543

From carsonhh at gmail.com Wed Oct 2 11:29:50 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 02 Oct 2013 13:29:50 -0400
Subject: [maker-devel] Maker2 on IBM PowerLinux?
In-Reply-To:
Message-ID:

MAKER is written in perl, so it can work pretty much anywhere you have perl. It is more an issue of the tools MAKER uses. If you can get SNAP, BLAST, exonerate, etc. to run then MAKER should be no problem.

You can let MAKER try and install and configure any missing perl libraries as well as external tools for you (this is by far the easiest way). You will need an internet connection.

cd .../maker/src/
perl Build.PL #say 'Y' to local installation of libraries and 'N' to MPI for now
./Build installdeps #grabs missing perl libraries from the ether, and installs them in the .../maker/perl/ directory
./Build installexes #grabs missing tools from the ether, installs, and configures them in the .../maker/exe/ directory
./Build install #just installs maker

If you want to run via MPI, I'd recommend OpenMPI, and you will need to rerun the 'perl Build.PL' and './Build install' steps (say yes to MPI this time around).
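Since the port mostly comes down to whether the external tools run, a quick check of which ones are already on the PATH may be a useful first step on a new platform. The following is only a minimal sketch: the function name check_tools is hypothetical, and the executable names (snap, blastn, exonerate, RepeatMasker) are assumptions about a typical install rather than names confirmed in this thread.

```shell
#!/bin/sh
# Minimal sketch: report which external tools are already on PATH.
# The executable names passed at the bottom are assumptions about a
# typical install; edit them to match your system.

check_tools() {
    status=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            status=1
        fi
    done
    # Non-zero exit status if at least one tool was not found.
    return "$status"
}

check_tools perl snap blastn exonerate RepeatMasker ||
    echo "some tools are missing; the build steps above may be able to fetch them"
```

Anything reported as missing is a candidate for the automatic installation steps above, or for a manual build on the target architecture.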
Here are some variables you will likely have to add to your .bash_profile to get OpenMPI to run with MAKER. You must add these and source the profile before attempting to install with OpenMPI; otherwise compilation will not work correctly (OpenMPI shared library issue).

#you need to set $OMPIHOME to OpenMPI's location here -->
export LD_PRELOAD=$OMPIHOME/lib/libmpi.so:$LD_PRELOAD
#this one just suppresses a warning -->
export OMPI_MCA_mpi_warn_on_fork=0

And on some systems you have to add ' -mca btl ^openib' to the mpiexec command when running maker.
Example--> mpiexec -mca btl ^openib -n 20 maker

Thanks,
Carson

From barry.moore at genetics.utah.edu Wed Oct 2 11:21:29 2013
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Wed, 2 Oct 2013 11:21:29 -0600
Subject: [maker-devel] maker2 scripts for functional annotation
In-Reply-To:
References:
Message-ID:

Hi Xia,

You can use EST evidence from other strains/species. The considerations are as follows:

When you pass EST (or mRNA-seq based transcript) evidence as est, MAKER uses blastn to align it, with alignment parameters that assume the sequences will align easily with almost perfect identity - this is fast. If you pass EST/mRNA evidence from another species as altest, MAKER uses tblastx to compare the RNA evidence to the assembly, both translated into protein space - this works fine, but it is slow and often doesn't get you much more evidence than you could have gotten from proteins as evidence.

If you believe that your other strain should have very little sequence divergence at the nt level, then you could experiment with passing its EST/mRNA evidence as est in the control file. Consider this an experiment: do it on a few contigs and examine the results carefully and skeptically. If it works well, that would be the way to go because you'll get better UTRs. If it doesn't, you'll see that you get little support for models because sequence divergence was too great for good alignment. In that case you'll need to use altest, which will be slower and will likely give you little in the way of support for UTRs. In this case as well, do a few contigs and examine the output carefully to see if the additional alignment time with tblastx is worth it in your case.

Barry

From Xia.Cao at dupont.com Wed Oct 2 11:51:26 2013
From: Xia.Cao at dupont.com (Xia.Cao at dupont.com)
Date: Wed, 2 Oct 2013 17:51:26 +0000
Subject: [maker-devel] maker2 scripts for functional annotation
In-Reply-To:
References:
Message-ID:

Hi Barry,

Thank you very much for the message and suggestions. It's really helpful to us. We will do some tests to see how we can take advantage of Maker2 to produce high-quality gene models from our assembled genome.

Thank you again,
Xia

From cjfields at illinois.edu Thu Oct 3 12:44:07 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 3 Oct 2013 18:44:07 +0000
Subject: [maker-devel] NFSLock problem
Message-ID:

I have a MAKER job running that seems to be stalled on a failed scaffold. It's running via MPI (MAKER v2.28, OpenMPI 1.6.3), which appears to have worked successfully for the most part. This is a run that only uses transcriptome and protein information in order to get a decent dataset of

The failed scaffold seems to be holding the job from completion.
There do seem to be changes, but mainly they are to the NFSLock files:

./.NFSLock.gi_lock.NFSLock
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.281.282.junction.blastn.holdover.NFSLock
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.282.start.blastn.holdover.NFSLock
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.281.end.blastn.holdover.NFSLock
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.282.start.blastn.holdover.NFSLock.STACK
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.281.end.blastn.holdover.NFSLock.STACK

Everything else seems to have completed.

I have seen a few issues re: NFS locking problems; would this be related? Should I stop the job?

We're running GPFS for our NFS. Here's 'mount':

-system-specific-4.1$ mount
/dev/sda5 on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda1 on /boot type ext4 (rw)
/dev/sdb1 on /export type ext4 (rw)
/dev/sda2 on /var type ext4 (rw)
tmpfs on /var/lib/ganglia/rrds type tmpfs (rw,size=6180842000,gid=99,uid=99)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)
/dev/IGBHOME0 on /home type gpfs (rw,mtime,dev=IGBHOME0)
128.174.124.79:/shares/group on /archive/group type nfs (rw,sync,hard,intr,retrans=10,timeo=300,rsize=65536,wsize=1048576,vers=3,proto=tcp,mountproto=tcp,addr=128.174.124.79)
128.174.124.79:/shares/CBC on /archive/CBC type nfs (rw,sync,hard,intr,retrans=10,timeo=300,rsize=65536,wsize=1048576,vers=3,proto=tcp,mountproto=tcp,addr=128.174.124.79)

chris

From carsonhh at gmail.com Thu Oct 3 12:45:37 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 03 Oct 2013 14:45:37 -0400
Subject: [maker-devel] NFSLock problem
In-Reply-To:
Message-ID:

Stop the job and restart it. No need to delete anything. Just restart it.

Thanks,
Carson

From cjfields at illinois.edu Thu Oct 3 20:45:29 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Fri, 4 Oct 2013 02:45:29 +0000
Subject: [maker-devel] NFSLock problem
In-Reply-To:
References:
Message-ID:

Carson,

It took a couple of restarts, but the job finally processed through. I saw a prior email on the mailing list where you mentioned this could be due to failed jobs hanging things up; I may set up my next job to leave one processor free for logging into the nodes in case there is a similar problem, and maybe try to track this down.

chris

From felix.bemm at uni-wuerzburg.de Fri Oct 4 01:10:08 2013
From: felix.bemm at uni-wuerzburg.de (Felix Bemm)
Date: Fri, 04 Oct 2013 09:10:08 +0200
Subject: [maker-devel] Contigs failing - could not make datastore directory
Message-ID: <524E69D0.1090905@uni-wuerzburg.de>

Hi,

I have a problem with contigs failing like this:

examining contents of the fasta file and run log
ERROR: could not make datastore directory
--> rank=11, hostname=r5n04
ERROR: Failed while examining contents of the fasta file and run log
ERROR: Chunk failed at level:0, tier_type:0
FAILED CONTIG:MA_gen_v8.05_c31675

There is enough space available on both the storage device and the tmp device that is being used.
The datastore log file looks like this:

MA_gen_v8.05_c1149 FAILED
MA_gen_v8.05_c1153 FAILED
MA_gen_v8.05_c1154 FAILED
MA_gen_v8.05_c1155 FAILED
MA_gen_v8.05_c1156 FAILED

Since there is no datastore for these contigs, I cannot provide any more information. Approx. 1/3 of the contigs were annotated; after that MAKER started to fail. It kept running for a while, finishing some of the contigs on the second or third try, like this:

MA_gen_v8.05_c4443 FAILED
MA_gen_v8.05_c4443 FAILED
MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ STARTED
MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ FINISHED

Any ideas what could cause these problems?

Best regards
Felix

--
Felix Bemm
Department of Bioinformatics
University of Würzburg, Germany
Tel: +49 931 - 31 83696
Fax: +49 931 - 31 84552
felix.bemm at uni-wuerzburg.de

From carsonhh at gmail.com Fri Oct 4 12:15:35 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 04 Oct 2013 14:15:35 -0400
Subject: [maker-devel] Contigs failing - could not make datastore directory
In-Reply-To: <524E69D0.1090905@uni-wuerzburg.de>
Message-ID:

Let me know what you find. With some failures it's hard to identify the problem, especially with long-running processes, which is why I made MAKER restartable, so you can at least get completion by just launching again.

--Carson

On 10/4/13 3:10 AM, "Felix Bemm" wrote:

>Hi,
>
>I have a problem with Contigs failing like this:
>
>examining contents of the fasta file and run log
>ERROR: could not make datastore directory
>--> rank=11, hostname=r5n04
>ERROR: Failed while examining contents of the fasta file and run log
>ERROR: Chunk failed at level:0, tier_type:0
>FAILED CONTIG:MA_gen_v8.05_c31675
>
>There is enough space available on both the storage device and the tmp
>device that is being used. The datastore log file looks like this:
>
>MA_gen_v8.05_c1149 FAILED
>MA_gen_v8.05_c1153 FAILED
>MA_gen_v8.05_c1154 FAILED
>MA_gen_v8.05_c1155 FAILED
>MA_gen_v8.05_c1156 FAILED
>
>Since there is no datastore for these contigs, I cannot provide any more
>information. Approx. 1/3 of the contigs were annotated; after that
>MAKER started to fail. It kept running for a while, finishing some of
>the contigs on the second or third try, like this:
>
>MA_gen_v8.05_c4443 FAILED
>MA_gen_v8.05_c4443 FAILED
>MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ STARTED
>MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ FINISHED
>
>Any ideas what could cause these problems?
>
>Best regards
>Felix
>
>--
>Felix Bemm
>Department of Bioinformatics
>University of Würzburg, Germany
>Tel: +49 931 - 31 83696
>Fax: +49 931 - 31 84552
>felix.bemm at uni-wuerzburg.de

From carsonhh at gmail.com Fri Oct 4 12:21:42 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 04 Oct 2013 14:21:42 -0400
Subject: [maker-devel] Contigs failing - could not make datastore directory
In-Reply-To: <524E69D0.1090905@uni-wuerzburg.de>
Message-ID:

Hi Felix,

There are other potential issues as well, for example a file quota limit set by your administrator, or a full local /tmp, which is unique to each node and cannot be queried directly from the root node of a cluster. One node on the cluster may also not have the NFS location mounted. Or it could be system related: for example, on Ibrix NFS-mounted systems you can get a weird filesystem issue when using MAKER, with ghost files and directories that can neither be created nor deleted. Try running the failed contigs in a different directory (even if it's on the same disk). Also what version of MAKER are you using?
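A minimal sketch of the per-node checks suggested above, assuming Python 3 is available on the compute nodes. It is not part of MAKER; the function name `check_dir` and the `min_free_gb` threshold are made up for illustration. It only reports free space and whether a directory can be created and removed, so it would not reveal a quota that has not yet been reached.

```python
import os
import shutil
import socket
import tempfile


def check_dir(path, min_free_gb=1.0):
    """Report whether `path` on this node exists, has free space, and accepts new directories."""
    report = {"host": socket.gethostname(), "path": path}
    if not os.path.isdir(path):
        # Covers a node where the NFS location is not mounted.
        report.update(ok=False, reason="missing or not mounted")
        return report
    free_gb = shutil.disk_usage(path).free / 1024 ** 3
    report["free_gb"] = round(free_gb, 2)
    try:
        # Same kind of operation that fails with "could not make datastore directory".
        probe = tempfile.mkdtemp(prefix="maker_probe_", dir=path)
        os.rmdir(probe)
    except OSError as err:
        report.update(ok=False, reason=f"cannot create directory: {err}")
        return report
    if free_gb < min_free_gb:
        report.update(ok=False, reason="low free space")
        return report
    report.update(ok=True, reason="")
    return report


if __name__ == "__main__":
    # Check the local temporary directory and the current working directory.
    for directory in (tempfile.gettempdir(), os.getcwd()):
        print(check_dir(directory))
```

Because /tmp is local to each node, a script like this would have to be launched on every node (for example through the same MPI or batch launcher used for the MAKER job) rather than only on the head node.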
--Carson On 10/4/13 3:10 AM, "Felix Bemm" wrote: >Hi, > >I have a problem with Contigs failing like this: > >examining contents of the fasta file and run log >ERROR: could not make datastore directory >--> rank=11, hostname=r5n04 >ERROR: Failed while examining contents of the fasta file and run log >ERROR: Chunk failed at level:0, tier_type:0 >FAILED CONTIG:MA_gen_v8.05_c31675 > >There is enough space available on both the storage device and the tmp >device that is being used. The datastore log files looks like this: > >MA_gen_v8.05_c1149 FAILED >MA_gen_v8.05_c1153 FAILED >MA_gen_v8.05_c1154 FAILED >MA_gen_v8.05_c1155 FAILED >MA_gen_v8.05_c1156 FAILED > >Since there is no datastore for this contigs, I can not provide any more >informationen. Approx. 1/3 of the contigs were annotated, after that >maker started to fail. It kept running for a while and finishing some of >the contigs in the second or third try like this: > >MA_gen_v8.05_c4443 FAILED >MA_gen_v8.05_c4443 FAILED >MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ START >ED >MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ >FINISHED > >Any ideas what can cause this problems? > >Best regards >Felix > >-- >Felix Bemm >Department of Bioinformatics >University of W?rzburg, Germany >Tel: +49 931 - 31 83696 >Fax: +49 931 - 31 84552 >felix.bemm at uni-wuerzburg.de > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From felix.bemm at uni-wuerzburg.de Sat Oct 5 08:09:15 2013 From: felix.bemm at uni-wuerzburg.de (Felix Bemm) Date: Sat, 05 Oct 2013 16:09:15 +0200 Subject: [maker-devel] Contigs failing - could not make datastore directory In-Reply-To: References: Message-ID: <52501D8B.2060301@uni-wuerzburg.de> Hi Carson, there is definitely no quota neither a full local tmp device. 
I changed to another tmp device, which is unfortunately residing on nfs but now it seems to work. I am using the latest version of maker (2.28). The complete job is running on a single machine with 32 mpi threads. Bests Felix Am 04.10.2013 20:21, schrieb Carson Holt: > Hi Felix, > > There are other potential issues as well, for example a file quota limit > set by your administrator or full local /tmp which will be unique to each > node and cannot be queried directly from the root node on a cluster. On a > cluster one node may also not have the NFS location mounted. Or it could > be system related, for example on ibrix NFS mounted systems you can get a > weird filesystem issue with ghost files and directories when using MAKER > that can neither be created nor deleted. Try running the failed contigs > in a different directory (even if it's on the same disk). Also what > version of MAKER are you using? > > --Carson > > > > On 10/4/13 3:10 AM, "Felix Bemm" wrote: > >> Hi, >> >> I have a problem with Contigs failing like this: >> >> examining contents of the fasta file and run log >> ERROR: could not make datastore directory >> --> rank=11, hostname=r5n04 >> ERROR: Failed while examining contents of the fasta file and run log >> ERROR: Chunk failed at level:0, tier_type:0 >> FAILED CONTIG:MA_gen_v8.05_c31675 >> >> There is enough space available on both the storage device and the tmp >> device that is being used. The datastore log files looks like this: >> >> MA_gen_v8.05_c1149 FAILED >> MA_gen_v8.05_c1153 FAILED >> MA_gen_v8.05_c1154 FAILED >> MA_gen_v8.05_c1155 FAILED >> MA_gen_v8.05_c1156 FAILED >> >> Since there is no datastore for this contigs, I can not provide any more >> informationen. Approx. 1/3 of the contigs were annotated, after that >> maker started to fail. 
It kept running for a while and finishing some of >> the contigs in the second or third try like this: >> >> MA_gen_v8.05_c4443 FAILED >> MA_gen_v8.05_c4443 FAILED >> MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ START >> ED >> MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ >> FINISHED >> >> Any ideas what can cause this problems? >> >> Best regards >> Felix >> >> -- >> Felix Bemm >> Department of Bioinformatics >> University of W?rzburg, Germany >> Tel: +49 931 - 31 83696 >> Fax: +49 931 - 31 84552 >> felix.bemm at uni-wuerzburg.de >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -- Felix Bemm Department of Bioinformatics University of W?rzburg, Germany Tel: +49 931 - 31 83696 Fax: +49 931 - 31 84552 felix.bemm at uni-wuerzburg.de From jfierst at uoregon.edu Fri Oct 4 11:06:36 2013 From: jfierst at uoregon.edu (Janna Fierst) Date: Fri, 4 Oct 2013 10:06:36 -0700 Subject: [maker-devel] Fwd: exon/intron boundaries In-Reply-To: References: Message-ID: Hi, thanks for your reply- I have been going through our annotations in detail and trying different parameter sets, and I think I have identified what is going on but I'm not sure how to set the MAKER2 parameters for our situation. We are working with a species of Caenorhabditid worm and there are long gene-dense blocks that are being incorrectly annotated as large single genes instead of several smaller closely spaced genes. The protein alignments (tblastx and protein2genome) show very clearly where the exon/intron boundaries are and in most cases agree with the augustus predictions. The assembled cufflinks output (through blastn, est2genome and est_gff:cufflinks) does not agree in some locations; I think this may be because in some cases the UTR nearly overlaps adjacent genes. 
I have included a screenshot of an annotated region viewed in Apollo to try to show this. The large gene in the middle is actually 7 different genes that are extremely close together, and MAKER2 is collapsing them into a single gene. I tried running without any RNA-seq/cufflinks data and MAKER2 annotates the region as two genes instead of one, but I can't get it to recognize the 7 as different genes.

I have retrained SNAP but we have not been able to successfully train Augustus; we are currently using the default caenorhabditis species model. I also included a species-specific repeat library. I tried setting correct_est_fusion=1 and reducing pred_flank, but these changes appear to really alter the annotations and we end up annotating almost nothing. I also tried setting est2genome=0 to decrease the influence of the cufflinks assembly but it didn't appear to help. There are some very large introns in these genes, so I haven't yet tried decreasing the maximum intron size because I'm concerned this may generate too many split genes instead of our current merged gene problem.

Thanks for your help, any advice is greatly appreciated!
-Janna Fierst

On Mon, Aug 26, 2013 at 12:21 PM, Carson Holt wrote:

> Are you getting gene fusions or just more exons? Gene fusions can be
> reduced by setting correct_est_fusion=1, or reducing pred_flank, although
> reducing pred_flank can cause other issues (but those generally only appear
> if setting the value below 150). Also if you have the maximum intron
> size set too high (split_hit option), you may also be generating bridging
> alignments that make evidence align across distant paralogous genes as well
> (this can result in gene merging).
>
> You should also look at your results manually in a viewer like Apollo.
> Then see if the extra exons are supported by something such as protein
> alignments from another species. If this is the case, you may have a
> poorly annotated protein set that is being used as evidence that is
> carrying over its erroneous exons into the species you are annotating. If
> the extra exons are supported by EST evidence, then perhaps you should try
> and rebuild the EST assembly (for example trinity has an option to use a
> Jaccard similarity coefficient to avoid fusing transcripts).
>
> Another option is to retrain SNAP or Augustus. MAKER does not actually
> produce any of the models itself (it is a pipeline not a predictor). The
> models are all generated using these other algorithms, MAKER just feeds
> them hints based on protein and transcript alignments, so making sure
> training is sufficient is important for those programs to produce their
> best models.
>
> Finally make sure your repeat database is sufficient, you may need to
> generate a species-specific repeat library using something like
> RepeatModeler. Repeats can end up being included as extra exons in gene
> models because they may contain reading frames that do code for proteins
> (i.e. reverse transcriptases).
>
> If you have any questions on any of the above, just let us know.
>
> Thanks,
> Carson
>
>
> From: Janna Fierst
> Date: Monday, August 26, 2013 2:54 PM
> To:
> Subject: [maker-devel] exon/intron boundaries
>
> Hi,
>
> I am using MAKER 2.28 to annotate a Caenorhabditid worm genome, and the
> initial results appear fairly good but we seem to be annotating too many
> exons for multiple genes. I was wondering which parameters should be tuned
> to change the threshold for exon/intron boundaries? Thanks for your help
> -Janna Fierst

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 1.tiff Type: image/tiff Size: 115362 bytes Desc: not available URL:

From myandell at genetics.utah.edu Mon Oct 7 05:36:53 2013
From: myandell at genetics.utah.edu (Mark Yandell)
Date: Mon, 7 Oct 2013 11:36:53 +0000
Subject: [maker-devel] maker apollo error
In-Reply-To: <8687323557146056ef885dd8a1bf2ea4@buzonweb.us.es>
References: <8687323557146056ef885dd8a1bf2ea4@buzonweb.us.es>
Message-ID: <7A60AB257EFF2B48B1F4C814817EA05365E66B3C@mxb2.hg.genetics.utah.edu>

Hi Gabriel,

I've forwarded your request on to the MAKER email list...

cheers,

--mark

Mark Yandell
Professor of Human Genetics
H.A. & Edna Benning Presidential Endowed Chair
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
ph:801-587-7707

________________________________________
From: ggpozo at us.es [ggpozo at us.es]
Sent: Monday, October 07, 2013 2:49 AM
To: Mark Yandell
Subject: maker apollo error

Dear Dr.
Yandell,

First, thank you very much for providing us with such a great tool as MAKER. I am trying to install it on my Linux system. Everything is OK; however, when I try to install Apollo with ./Build apollo, I get the following:

Downloading apollo...
--2013-10-07 10:45:37-- http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar
Resolviendo gmod.svn.sourceforge.net... 216.34.181.65, 216.34.181.177
Connecting to gmod.svn.sourceforge.net|216.34.181.65|:80... conectado.
Petición HTTP enviada, esperando respuesta... 302 Found
Localización: http://sourceforge.net/p/gmod/svn/ [siguiendo]
--2013-10-07 10:45:38-- http://sourceforge.net/p/gmod/svn/
Resolviendo sourceforge.net... 216.34.181.60
Connecting to sourceforge.net|216.34.181.60|:80... conectado.
Petición HTTP enviada, esperando respuesta... 302 Found
Localización: http://sourceforge.net/p/gmod/svn/HEAD/tree/ [siguiendo]
--2013-10-07 10:45:38-- http://sourceforge.net/p/gmod/svn/HEAD/tree/
Connecting to sourceforge.net|216.34.181.60|:80... conectado.
Petición HTTP enviada, esperando respuesta... 200 OK
Longitud: 37730 (37K) [text/html]
Saving to: `/home/maker/src/../exe/apollo.tar.gz'

100%[==========================================================>] 37.730 92,1K/s in 0,4s

2013-10-07 10:45:40 (92,1 KB/s) - `/home/maker/src/../exe/apollo.tar.gz' saved [37730/37730]

Unpacking apollo tarball...

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now

ERROR: Failed installing apollo, now cleaning installation path...
You may need to install apollo manually.

It seems that the HTTP address in the locations file is wrong. Apollo has changed its version and is now integrated into Apolloweb; maybe this is the problem.

I would greatly appreciate any help you can give me.

Thank you very much.

Dr.
Gabriel Gutierrez
Department of Genetics
University of Seville
Spain

From carsonhh at gmail.com Mon Oct 7 07:20:17 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 07 Oct 2013 09:20:17 -0400
Subject: [maker-devel] Fwd: exon/intron boundaries
In-Reply-To:
Message-ID:

Hi Janna,

There are a couple of things to do. Download maker-2.29p-beta from the lab website. It includes some changes made to improve the performance of correct_est_fusion. Using the correct_est_fusion option also reduces the influence of ESTs on annotation, so normally you would need to increase your protein dataset, but given that you have already supplied 3 nematode species and all of UniProt you should probably be fine.

There is one more thing you can do. Instead of using cufflinks, try using trinity to assemble the ESTs. It has a Jaccard clip option that reduces merging caused by overlapping UTR. Between trinity and correct_est_fusion you should be able to really reduce the effect of those ESTs.

If those changes don't work there is one last option. Take the MAKER results and filter them for the SNAP and Augustus ab initio results (match/match_part in the GFF3), then pass those in to the pred_gff option and turn snaphmm and augustus_species off in the control files. Basically this turns MAKER's hint-based prediction off and forces it to filter the ab initio results and select models directly from them. Since the merging is being caused by bad hints (merged transcripts from mRNA-seq), this would reduce that effect. You will still need correct_est_fusion=1, though, to trim UTR coming from the merged transcripts, because even though MAKER can't process hints to rerun SNAP and Augustus this way, it will still try to add UTR using the EST evidence.
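The filtering step in the last option could be sketched roughly as follows. This is only an illustration, not a script shipped with MAKER: the function name `filter_ab_initio` and the file names in the comment are hypothetical, and the source labels `snap_masked` and `augustus_masked` (GFF3 column 2) are an assumption about how the ab initio results are labelled. Check column 2 of your own GFF3 before relying on them.

```python
import sys


def filter_ab_initio(lines, sources=("snap_masked", "augustus_masked")):
    """Yield GFF3 comment lines plus match/match_part features from the given sources."""
    for line in lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence section follows; no more features
        if line.startswith("#"):
            yield line  # keep directives such as ##gff-version
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # skip blank or malformed lines
        # Column 2 is the source, column 3 is the feature type.
        if cols[1] in sources and cols[2] in ("match", "match_part"):
            yield line


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python filter_preds.py maker_output.gff > ab_initio.gff
    with open(sys.argv[1]) as handle:
        sys.stdout.writelines(filter_ab_initio(handle))
```

The filtered file would then be the one given to pred_gff= in maker_opts.ctl, with snaphmm= and augustus_species= left blank and correct_est_fusion=1, as described above.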
--Carson From: Janna Fierst Date: Friday, October 4, 2013 1:06 PM To: Subject: [maker-devel] Fwd: exon/intron boundaries Hi, thanks for your reply- I have been going through our annotations in detail and trying different parameter sets, and I think I have identified what is going on but I'm not sure how to set the MAKER2 parameters for our situation. We are working with a species of Caenorhabditid worm and there are long gene-dense blocks that are being incorrectly annotated as large single genes instead of several smaller closely spaced genes. The protein alignments (tblastx and protein2genome) show very clearly where the exon/intron boundaries are and in most cases agree with the augustus predictions. The assembled cufflinks output (through blastn, est2genome and est_gff:cufflinks) does not agree in some locations; I think this may be because in some cases the UTR nearly overlaps adjacent genes. I have included a screenshot of an annotated region viewed in apollo to try to show this. The large gene in the middle is actually 7 different genes that are extremely close together and MAKER2 is collapsing them into a single gene. I tried running without any RNASeq/cufflinks data and MAKER2 annotates the region as two genes instead of one, but I can't get it to recognize the 7 as different genes. I have retrained SNAP but we have not been able to successfully train Augustus, we are currently using the default caenhorhabditis species model. I also included a species specific repeat library. I tried setting correct_est_fusion=1 and reducing pred_flank but these changes appear to really alter the annotations and we end up annotating almost nothing. I also tried setting est2genome=0 to decrease the influence of the cufflinks assembly but it didn't appear to help. There are some very large introns in these genes so I haven't tried yet decreasing the maximum intron size because I'm concerned this may generate too many split genes instead of our current merged gene problem. 
Thanks for your help, any advice is greatly appreciated! -Janna Fierst On Mon, Aug 26, 2013 at 12:21 PM, Carson Holt wrote: > Are you getting gene fusions or just more exons? Gene fusions can be reduced > by setting correct_est_fusion=1, or reducing pred_flank, although reducing > pred_flank can cause other issues (but those generally only appear if setting > the value below below 150). Also if you have the maximum intron size set to > high (split_hit option), you may also be generating bridging alignments that > make evidence align across distant paralogous genes as well (this can result > in gene merging) > > You should also look at your results manually in a viewer like Apollo. Then > see if the extra exons are supported by something such as protein alignments > from another species. If this is the case, you may have a poorly annotated > protein set that is being used as evidence that is carrying over it's > erroneous exons into the species you are annotating. If the extra exons are > supported by EST evidence, then perhaps you should try and rebuild the EST > assembly (for example trinity has an option to use a Jarccardian similarity > coefficient to avoid fusing transcripts). > > Another option, is to retrain SNAP or Augustus. MAKER does not actually > produce any of the models itself (it is a pipeline not a predictor). The > models are all generated using these other algorithms, MAKER just feeds them > hints based on protein and transcript alignments, so making sure training is > sufficient is important for those programs to produce their best models. > > Finally make sure your repeat database is sufficient, you may need to generate > a species specific repeat library using something like RepeatModeler. Repeats > can end up being included as extra exons in gene models because they may > contain reading frames the do code for proteins (I.e. reverse transcriptases). > > If you have any questions on any of the above, just let us know. 
> > Thanks, > Carson > > > From: Janna Fierst > Date: Monday, August 26, 2013 2:54 PM > To: > Subject: [maker-devel] exon/intron boundaries > > Hi, > > I am using MAKER 2.28 to annotate a Caenorhabditid worm genome, and the > initial results appear fairly good but we seem to be be annotating too many > exons for multiple genes. I was wondering which parameters should be tuned to > change the threshold for exon/intron boundaries? Thanks for your help -Janna > Fierst > _______________________________________________ maker-devel mailing list > maker-devel at box290.bluehost.comhttp://box290.bluehost.com/mailman/listinfo/mak > er-devel_yandell-lab.org _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon Oct 7 08:12:24 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 07 Oct 2013 10:12:24 -0400 Subject: [maker-devel] maker apollo error In-Reply-To: <7A60AB257EFF2B48B1F4C814817EA05365E66B3C@mxb2.hg.genetics.utah.edu> Message-ID: This was a location from GMOD for the older Apollo. It looks like they've dropped it, so it now points to nothing. Perhaps there is another location that still has the old code I can get from Ed. --Carson On 10/7/13 7:36 AM, "Mark Yandell" wrote: > >Hi Gabriel, > >I've forwarded your request on to the MAKER email list... > >cheers, > >--mark > >Mark Yandell >Professor of Human Genetics >H.A. & Edna Benning Presidential Endowed Chair >Eccles Institute of Human Genetics >University of Utah >15 North 2030 East, Room 2100 >Salt Lake City, UT 84112-5330 >ph:801-587-7707 > >________________________________________ >From: ggpozo at us.es [ggpozo at us.es] >Sent: Monday, October 07, 2013 2:49 AM >To: Mark Yandell >Subject: maker apollo error > >Dear Dr. 
Yandell, > >First, thank you very much for provide us such a great tool as MAKER. I >am traying to install it in my Linux system, everything is ok, however >when I try to instal Apollo with ./Build apollo... > >Downloading apollo... >--2013-10-07 10:45:37-- >http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar >Resolviendo gmod.svn.sourceforge.net... 216.34.181.65, 216.34.181.177 >Connecting to gmod.svn.sourceforge.net|216.34.181.65|:80... conectado. >Petici?n HTTP enviada, esperando respuesta... 302 Found >Localizaci?n: http://sourceforge.net/p/gmod/svn/ [siguiendo] >--2013-10-07 10:45:38-- http://sourceforge.net/p/gmod/svn/ >Resolviendo sourceforge.net... 216.34.181.60 >Connecting to sourceforge.net|216.34.181.60|:80... conectado. >Petici?n HTTP enviada, esperando respuesta... 302 Found >Localizaci?n: http://sourceforge.net/p/gmod/svn/HEAD/tree/ [siguiendo] >--2013-10-07 10:45:38-- http://sourceforge.net/p/gmod/svn/HEAD/tree/ >Connecting to sourceforge.net|216.34.181.60|:80... conectado. >Petici?n HTTP enviada, esperando respuesta... 200 OK >Longitud: 37730 (37K) [text/html] >Saving to: `/home/maker/src/../exe/apollo.tar.gz' > >100%[===================================================================== >=============================================================>] 37.730 > 92,1K/s in 0,4s > >2013-10-07 10:45:40 (92,1 KB/s) - `/home/maker/src/../exe/apollo.tar.gz' >saved [37730/37730] > >Unpacking apollo tarball... > >gzip: stdin: not in gzip format >tar: Child returned status 1 >tar: Error is not recoverable: exiting now > > >ERROR: Failed installing apollo, now cleaning installation path... >You may need to install apollo manually. > > > >It seems that the http direction in locations file is wrong, Apollo has >changed its version and now Apollo is integrated in Apolloweb, may be >this the problem. > > > >I would greatly appreciate any help you can give me. > >Thank you very much. > > > >Dr. 
Gabriel Gutierrez > >Department of Genetcis > >University of Seville > >Spain > > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From carsonhh at gmail.com Mon Oct 7 08:33:44 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 07 Oct 2013 10:33:44 -0400 Subject: [maker-devel] Contigs failing - could not make datastore directory In-Reply-To: <52501D8B.2060301@uni-wuerzburg.de> Message-ID: You can also try the 2.29 beta if you want (on the website). Also when MAKER exits, it tries to delete anything left behind in temporary folders, so is it possible that it is creating space on exit after filling up /tmp? Given that the issue seems to go away when you switch to another directory, it suggests that this might be the kind of thing that is specific to something about the current state of your your system. Thanks, Carson On 10/5/13 10:09 AM, "Felix Bemm" wrote: >Hi Carson, > >there is definitely no quota neither a full local tmp device. I changed >to another tmp device, which is unfortunately residing on nfs but now it >seems to work. I am using the latest version of maker (2.28). The >complete job is running on a single machine with 32 mpi threads. > >Bests >Felix > >Am 04.10.2013 20:21, schrieb Carson Holt: >> Hi Felix, >> >> There are other potential issues as well, for example a file quota limit >> set by your administrator or full local /tmp which will be unique to >>each >> node and cannot be queried directly from the root node on a cluster. >>On a >> cluster one node may also not have the NFS location mounted. Or it >>could >> be system related, for example on ibrix NFS mounted systems you can get >>a >> weird filesystem issue with ghost files and directories when using MAKER >> that can neither be created nor deleted. Try running the failed contigs >> in a different directory (even if it's on the same disk). 
Also, what version of MAKER are you using?
>>
>> --Carson
>>
>>
>>
>> On 10/4/13 3:10 AM, "Felix Bemm" wrote:
>>
>>> Hi,
>>>
>>> I have a problem with contigs failing like this:
>>>
>>> examining contents of the fasta file and run log
>>> ERROR: could not make datastore directory
>>> --> rank=11, hostname=r5n04
>>> ERROR: Failed while examining contents of the fasta file and run log
>>> ERROR: Chunk failed at level:0, tier_type:0
>>> FAILED CONTIG:MA_gen_v8.05_c31675
>>>
>>> There is enough space available on both the storage device and the tmp
>>> device that is being used. The datastore log file looks like this:
>>>
>>> MA_gen_v8.05_c1149 FAILED
>>> MA_gen_v8.05_c1153 FAILED
>>> MA_gen_v8.05_c1154 FAILED
>>> MA_gen_v8.05_c1155 FAILED
>>> MA_gen_v8.05_c1156 FAILED
>>>
>>> Since there is no datastore for these contigs, I cannot provide any more
>>> information. Approx. 1/3 of the contigs were annotated; after that
>>> maker started to fail. It kept running for a while, finishing some of
>>> the contigs on the second or third try, like this:
>>>
>>> MA_gen_v8.05_c4443 FAILED
>>> MA_gen_v8.05_c4443 FAILED
>>> MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ STARTED
>>> MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ FINISHED
>>>
>>> Any ideas what can cause these problems?
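
The log excerpt above suggests a quick way to separate contigs that eventually finished on a retry from those that never did. What follows is only a minimal sketch in Python, not part of MAKER: it assumes each log line is whitespace-separated with the contig name first and the status last (as in the excerpt), and the function names are made up for illustration.

```python
def last_status_by_contig(lines):
    """Map each contig name to the last status seen for it in the log."""
    status = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # Assumption: first field is the contig name, last field is the status.
        name, state = fields[0], fields[-1]
        status[name] = state
    return status


def contigs_still_failed(lines):
    """Return, sorted, the contigs whose most recent status is FAILED."""
    latest = last_status_by_contig(lines)
    return sorted(name for name, state in latest.items() if state == "FAILED")


# Sample lines taken from the message above.
sample = """\
MA_gen_v8.05_c1149 FAILED
MA_gen_v8.05_c4443 FAILED
MA_gen_v8.05_c4443 FAILED
MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ STARTED
MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ FINISHED
""".splitlines()

print(contigs_still_failed(sample))
```

Keeping only the last status per contig means a contig that failed twice and then finished is not reported, which matches the retry behaviour described in the message.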
>>>
>>> Best regards
>>> Felix
>>>
>>> --
>>> Felix Bemm
>>> Department of Bioinformatics
>>> University of Würzburg, Germany
>>> Tel: +49 931 - 31 83696
>>> Fax: +49 931 - 31 84552
>>> felix.bemm at uni-wuerzburg.de
>>>
>>> _______________________________________________
>>> maker-devel mailing list
>>> maker-devel at box290.bluehost.com
>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
>
>--
>Felix Bemm
>Department of Bioinformatics
>University of Würzburg, Germany
>Tel: +49 931 - 31 83696
>Fax: +49 931 - 31 84552
>felix.bemm at uni-wuerzburg.de

From felix.bemm at uni-wuerzburg.de Mon Oct 7 10:40:12 2013
From: felix.bemm at uni-wuerzburg.de (Felix Bemm)
Date: Mon, 07 Oct 2013 18:40:12 +0200
Subject: [maker-devel] Contigs failing - could not make datastore directory
In-Reply-To:
References:
Message-ID: <5252E3EC.6050908@uni-wuerzburg.de>

Do you have a changelog for 2.29b?

On 07.10.2013 16:33, Carson Holt wrote:
> You can also try the 2.29 beta if you want (on the website). Also, when
> MAKER exits, it tries to delete anything left behind in temporary folders,
> so is it possible that it is creating space on exit after filling up /tmp?

I will try to monitor that!

> Given that the issue seems to go away when you switch to another
> directory, it suggests that this might be the kind of thing that is
> specific to something about the current state of your system.

I agree with that. Interestingly, it happens with the only directory where I would not expect it.

> Thanks,
> Carson
>
>
> On 10/5/13 10:09 AM, "Felix Bemm" wrote:
>
>> Hi Carson,
>>
>> there is definitely no quota, nor is the local tmp device full. I changed
>> to another tmp device, which unfortunately resides on NFS, but now it
>> seems to work. I am using the latest version of maker (2.28). The
>> complete job is running on a single machine with 32 MPI threads.
>> >> Bests >> Felix >> >> Am 04.10.2013 20:21, schrieb Carson Holt: >>> Hi Felix, >>> >>> There are other potential issues as well, for example a file quota limit >>> set by your administrator or full local /tmp which will be unique to >>> each >>> node and cannot be queried directly from the root node on a cluster. >>> On a >>> cluster one node may also not have the NFS location mounted. Or it >>> could >>> be system related, for example on ibrix NFS mounted systems you can get >>> a >>> weird filesystem issue with ghost files and directories when using MAKER >>> that can neither be created nor deleted. Try running the failed contigs >>> in a different directory (even if it's on the same disk). Also what >>> version of MAKER are you using? >>> >>> --Carson >>> >>> >>> >>> On 10/4/13 3:10 AM, "Felix Bemm" wrote: >>> >>>> Hi, >>>> >>>> I have a problem with Contigs failing like this: >>>> >>>> examining contents of the fasta file and run log >>>> ERROR: could not make datastore directory >>>> --> rank=11, hostname=r5n04 >>>> ERROR: Failed while examining contents of the fasta file and run log >>>> ERROR: Chunk failed at level:0, tier_type:0 >>>> FAILED CONTIG:MA_gen_v8.05_c31675 >>>> >>>> There is enough space available on both the storage device and the tmp >>>> device that is being used. The datastore log files looks like this: >>>> >>>> MA_gen_v8.05_c1149 FAILED >>>> MA_gen_v8.05_c1153 FAILED >>>> MA_gen_v8.05_c1154 FAILED >>>> MA_gen_v8.05_c1155 FAILED >>>> MA_gen_v8.05_c1156 FAILED >>>> >>>> Since there is no datastore for this contigs, I can not provide any >>>> more >>>> informationen. Approx. 1/3 of the contigs were annotated, after that >>>> maker started to fail. 
It kept running for a while, finishing some of
>>>> the contigs on the second or third try, like this:
>>>>
>>>> MA_gen_v8.05_c4443 FAILED
>>>> MA_gen_v8.05_c4443 FAILED
>>>> MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ STARTED
>>>> MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ FINISHED
>>>>
>>>> Any ideas what can cause these problems?
>>>>
>>>> Best regards
>>>> Felix
>>>>
>>>> --
>>>> Felix Bemm
>>>> Department of Bioinformatics
>>>> University of Würzburg, Germany
>>>> Tel: +49 931 - 31 83696
>>>> Fax: +49 931 - 31 84552
>>>> felix.bemm at uni-wuerzburg.de
>>>>
>>>> _______________________________________________
>>>> maker-devel mailing list
>>>> maker-devel at box290.bluehost.com
>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>
>>
>> --
>> Felix Bemm
>> Department of Bioinformatics
>> University of Würzburg, Germany
>> Tel: +49 931 - 31 83696
>> Fax: +49 931 - 31 84552
>> felix.bemm at uni-wuerzburg.de

--
Felix Bemm
Department of Bioinformatics
University of Würzburg, Germany
Tel: +49 931 - 31 83696
Fax: +49 931 - 31 84552
felix.bemm at uni-wuerzburg.de

From carsonhh at gmail.com Mon Oct 7 11:25:34 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 07 Oct 2013 13:25:34 -0400
Subject: [maker-devel] maker apollo error
In-Reply-To: <14a642f3dc9cb615f048b70228bad3be@buzonweb.us.es>
Message-ID:

Here is a new location for the source -->
http://sourceforge.net/code-snapshots/svn/g/gm/gmod/svn/gmod-svn-25291-apollo-trunk.zip

--Carson

From:
Date: Monday, October 7, 2013 12:47 PM
To: Carson Holt
Cc: Mark Yandell ,
Subject: Re: [maker-devel] maker apollo error

Thanks, I modified the locations file to point to a personal server where I had placed an old version of Apollo, but it did not work. If you can get a new one, I will try again.

Thanks
Gabriel

On 07/10/2013 16:12, Carson Holt wrote:
> This was a location from GMOD for the older Apollo.
It looks like they've > dropped it, so it now points to nothing. Perhaps there is another location > that still has the old code I can get from Ed. > > --Carson > > > > On 10/7/13 7:36 AM, "Mark Yandell" wrote: >> Hi Gabriel, I've forwarded your request on to the MAKER email list... cheers, >> --mark Mark Yandell Professor of Human Genetics H.A. & Edna Benning >> Presidential Endowed Chair Eccles Institute of Human Genetics University of >> Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 >> ph:801-587-7707 ________________________________________ From: ggpozo at us.es >> [ggpozo at us.es] Sent: Monday, October 07, 2013 2:49 AM To: Mark Yandell >> Subject: maker apollo error Dear Dr. Yandell, First, thank you very much for >> provide us such a great tool as MAKER. I am traying to install it in my Linux >> system, everything is ok, however when I try to instal Apollo with ./Build >> apollo... Downloading apollo... --2013-10-07 10:45:37-- >> http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar >> Resolviendo gmod.svn.sourceforge.net... 216.34.181.65, 216.34.181.177 >> Connecting to gmod.svn.sourceforge.net|216.34.181.65|:80... conectado. >> Petici?n HTTP enviada, esperando respuesta... 302 Found Localizaci?n: >> http://sourceforge.net/p/gmod/svn/ [siguiendo] --2013-10-07 10:45:38-- >> http://sourceforge.net/p/gmod/svn/ Resolviendo sourceforge.net... >> 216.34.181.60 Connecting to sourceforge.net|216.34.181.60|:80... conectado. >> Petici?n HTTP enviada, esperando respuesta... 302 Found Localizaci?n: >> http://sourceforge.net/p/gmod/svn/HEAD/tree/ [siguiendo] --2013-10-07 >> 10:45:38-- http://sourceforge.net/p/gmod/svn/HEAD/tree/ Connecting to >> sourceforge.net|216.34.181.60|:80... conectado. Petici?n HTTP enviada, >> esperando respuesta... 
200 OK Longitud: 37730 (37K) [text/html] Saving to: >> `/home/maker/src/../exe/apollo.tar.gz' >> 100%[===================================================================== >> =============================================================>] 37.730 >> 92,1K/s in 0,4s 2013-10-07 10:45:40 (92,1 KB/s) - >> `/home/maker/src/../exe/apollo.tar.gz' saved [37730/37730] Unpacking apollo >> tarball... gzip: stdin: not in gzip format tar: Child returned status 1 tar: >> Error is not recoverable: exiting now ERROR: Failed installing apollo, now >> cleaning installation path... You may need to install apollo manually. It >> seems that the http direction in locations file is wrong, Apollo has changed >> its version and now Apollo is integrated in Apolloweb, may be this the >> problem. I would greatly appreciate any help you can give me. Thank you very >> much. Dr. Gabriel Gutierrez Department of Genetcis University of Seville >> Spain _______________________________________________ maker-devel mailing >> list maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From ggpozo at us.es Mon Oct 7 10:47:13 2013 From: ggpozo at us.es (ggpozo at us.es) Date: Mon, 07 Oct 2013 18:47:13 +0200 Subject: [maker-devel] maker apollo error In-Reply-To: References: Message-ID: <14a642f3dc9cb615f048b70228bad3be@buzonweb.us.es> Thanks, I have modified the locations file pointing to a personal server where I downloaded an old version of Apollo but it did not work, but if you can get a new one I would try it again. Thanks Gabriel El 07/10/2013 16:12, Carson Holt escribi?: > This was a location from GMOD for the older Apollo. It looks like they've > dropped it, so it now points to nothing. Perhaps there is another location > that still has the old code I can get from Ed. 
> > --Carson > > On 10/7/13 7:36 AM, "Mark Yandell" wrote: > >> Hi Gabriel, I've forwarded your request on to the MAKER email list... cheers, --mark Mark Yandell Professor of Human Genetics H.A. & Edna Benning Presidential Endowed Chair Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:801-587-7707 ________________________________________ From: ggpozo at us.es [1] [ggpozo at us.es [2]] Sent: Monday, October 07, 2013 2:49 AM To: Mark Yandell Subject: maker apollo error Dear Dr. Yandell, First, thank you very much for provide us such a great tool as MAKER. I am traying to install it in my Linux system, everything is ok, however when I try to instal Apollo with ./Build apollo... Downloading apollo... --2013-10-07 10:45:37-- http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar [3] Resolviendo gmod.svn.sourceforge.net... 216.34.181.65, 216.34.181.177 Connecting to gmod.svn.sourceforge.net|216.34.181.65|:80... conectado. Petici?n HTTP enviada, esperando respuesta... 302 Found Localizaci?n: http://sourceforge.net/p/gmod/svn/ [4] [siguiendo] --2013-10-07 10:45:38-- http://sourceforge.net/p/gmod/svn/ [5] Resolviendo sourceforge.net... 216.34.181.60 Connecting to sourceforge.net|216.34.181.60|:80... conectado. Petici?n HTTP enviada, esperando respuesta... 302 Found Localizaci?n: http://sourceforge.net/p/gmod/svn/HEAD/tree/ [6] [siguiendo] --2013-10-07 10:45:38-- http://sourceforge.net/p/gmod/svn/HEAD/tree/ [7] Connecting to sourceforge.net|216.34.181.60|:80... conectado. Petici?n HTTP enviada, esperando respuesta... 200 OK Longitud: 37730 (37K) [text/html] Saving to: `/home/maker/src/../exe/apollo.tar.gz' 100%[===================================================================== =============================================================>] 37.730 92,1K/s in 0,4s 2013-10-07 10:45:40 (92,1 KB/s) - `/home/maker/src/../exe/apollo.tar.gz' saved [37730/37730] Unpacking apollo tarball... 
gzip: stdin: not in gzip format tar: Child returned status 1 tar: Error is not recoverable: exiting now ERROR: Failed installing apollo, now cleaning installation path... You may need to install apollo manually. It seems that the http direction in locations file is wrong, Apollo has changed its version and now Apollo is integrated in Apolloweb, may be this the problem. I would greatly appreciate any help you can give me. Thank you very much. Dr. Gabriel Gutierrez Department of Genetcis University of Seville Spain _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com [8] http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org [9] Links: ------ [1] mailto:ggpozo at us.es [2] mailto:ggpozo at us.es [3] http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar [4] http://sourceforge.net/p/gmod/svn/ [5] http://sourceforge.net/p/gmod/svn/ [6] http://sourceforge.net/p/gmod/svn/HEAD/tree/ [7] http://sourceforge.net/p/gmod/svn/HEAD/tree/ [8] mailto:maker-devel at box290.bluehost.com [9] http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From felix.bemm at uni-wuerzburg.de Sat Oct 12 09:06:52 2013 From: felix.bemm at uni-wuerzburg.de (Felix Bemm) Date: Sat, 12 Oct 2013 17:06:52 +0200 Subject: [maker-devel] Serious Start Codon Problem Message-ID: <5259658C.4000102@uni-wuerzburg.de> Hi, I have a serious problems with maker annotations especially the start codon placement. It started happening with maker version 2.27! About 1/3 of my genes are missing a proper start codon. When reviewing the GFF files gene predictors and est evidence standalone would predict the right gene structure including the start codon. I was thinking that this problem was somehow linked to my gene models but looking at their standalone prediction I discarded that theory. 
Just some stats about proteins with a proper start codon (first column) and a missing start codon (second column).

Predictor   Proper start   Missing start
Maker               9268            4215
SNAP               16577             896
Genemark           18764             290

Numbers look the same when Augustus is used (currently retraining). Manually using ESTs in Apollo often fixes the problem, but I don't want to check > 4000 genes for this.

Do you have any idea what could cause such a problem? If you need some example gff files, I can send them to you.

Best regards
Felix

--
Felix Bemm
Department of Bioinformatics
University of Würzburg, Germany
Tel: +49 931 - 31 83696
Fax: +49 931 - 31 84552
felix.bemm at uni-wuerzburg.de

From carsonhh at gmail.com Sat Oct 12 09:48:29 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Sat, 12 Oct 2013 11:48:29 -0400
Subject: [maker-devel] Serious Start Codon Problem
In-Reply-To: <5259658C.4000102@uni-wuerzburg.de>
Message-ID:

This issue just came up this week on another e-mail as well.

One way to fix it is to edit the Bio::Tools::CodonTable module from BioPerl (use maker --debug to have maker print out the locations of all modules it uses). Such a hack should only be used as a temporary workaround.

You change line 256 from -->
---M---------------M---------------M----------------------------

To-->
-----------------------------------M----------------------------

Maker uses the BioPerl is_start_codon function to determine whether the codon used in a result it receives is acceptable or whether it should look upstream or downstream for a different one. Apparently there are actually 3 acceptable start codons in the standard codon table (2 rare ones and one common), so making the edit removes the two rare ones from the list.

I'm also preparing a 2.30 release that exports a "strict" codon table to the BioPerl module; it will be used by default and should fix the issue.

Thanks,
Carson

On 10/12/13 11:06 AM, "Felix Bemm" wrote:

>Hi,
>
>I have a serious problem with maker annotations, especially the start
>codon placement.
It started happening with maker version 2.27! About 1/3 >of my genes are missing a proper start codon. When reviewing the GFF >files gene predictors and est evidence standalone would predict the >right gene structure including the start codon. I was thinking that this >problem was somehow linked to my gene models but looking at their >standalone prediction I discarded that theory. Just some stats about >proteins with proper start codon (first column) and missing start codon >(second column). > >Maker 9268 4215 >SNAP 16577 896 >Genemark 18764 290 > >Numbers look the same when Augustus is used (currently retraining). >Manually using EST's in Apollo often fixes the problems but I don't want >to check > 4000 genes for this. > >Do you have any idea what could cause such a problem? If you need some >example gff files I can send you. > >Best regards >Felix > >-- >Felix Bemm >Department of Bioinformatics >University of W?rzburg, Germany >Tel: +49 931 - 31 83696 >Fax: +49 931 - 31 84552 >felix.bemm at uni-wuerzburg.de > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From felix.bemm at uni-wuerzburg.de Sat Oct 12 10:16:06 2013 From: felix.bemm at uni-wuerzburg.de (Felix Bemm) Date: Sat, 12 Oct 2013 18:16:06 +0200 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: References: Message-ID: <525975C6.4020403@uni-wuerzburg.de> Thx Carson! Do you now when 2.30 will be available? Bests On 12.10.2013 17:48, Carson Holt wrote: > This issue just came up this week on another e-mail as well. > > One way to fix it, is to edit the Bio::Tools::CodonTable module from > BioPerl (use maker --debug to have maker print out the locations of all > modules it uses). Such a hack should only be done as a temporary work > around. 
> > You change line 256 from --> > ---M---------------M---------------M---------------------------- > > To--> > -----------------------------------M---------------------------- > > > Maker uses the BioPerl is_start_codon function to determine if the codon > used in a result it receives is acceptable or if it should look upstream > or downstream for a different one. Apparently there are actually 3 > acceptable start codons in the standard codon table (2 rare ones and one > common), so making the edit removes the two rare ones from the list. > > I'm also preparing a 2.30 release that exports a "strict" codon table to > the BioPerl module, that will be used by default and should fix the issue. > > Thanks, > Carson > > > > On 10/12/13 11:06 AM, "Felix Bemm" wrote: > >> Hi, >> >> I have a serious problems with maker annotations especially the start >> codon placement. It started happening with maker version 2.27! About 1/3 >> of my genes are missing a proper start codon. When reviewing the GFF >> files gene predictors and est evidence standalone would predict the >> right gene structure including the start codon. I was thinking that this >> problem was somehow linked to my gene models but looking at their >> standalone prediction I discarded that theory. Just some stats about >> proteins with proper start codon (first column) and missing start codon >> (second column). >> >> Maker 9268 4215 >> SNAP 16577 896 >> Genemark 18764 290 >> >> Numbers look the same when Augustus is used (currently retraining). >> Manually using EST's in Apollo often fixes the problems but I don't want >> to check > 4000 genes for this. >> >> Do you have any idea what could cause such a problem? If you need some >> example gff files I can send you. 
>> >> Best regards >> Felix >> >> -- >> Felix Bemm >> Department of Bioinformatics >> University of W?rzburg, Germany >> Tel: +49 931 - 31 83696 >> Fax: +49 931 - 31 84552 >> felix.bemm at uni-wuerzburg.de >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -- Felix Bemm Department of Bioinformatics University of W?rzburg, Germany Tel: +49 931 - 31 83696 Fax: +49 931 - 31 84552 felix.bemm at uni-wuerzburg.de From carsonhh at gmail.com Sat Oct 12 23:26:17 2013 From: carsonhh at gmail.com (Carson Holt) Date: Sun, 13 Oct 2013 01:26:17 -0400 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: <525975C6.4020403@uni-wuerzburg.de> Message-ID: Before Wednesday, because after that I'm on a 6 day road trip, so I have to have it wrapped up before I leave. --Carson On 10/12/13 12:16 PM, "Felix Bemm" wrote: >Thx Carson! Do you now when 2.30 will be available? Bests > >On 12.10.2013 17:48, Carson Holt wrote: >> This issue just came up this week on another e-mail as well. >> >> One way to fix it, is to edit the Bio::Tools::CodonTable module from >> BioPerl (use maker --debug to have maker print out the locations of all >> modules it uses). Such a hack should only be done as a temporary work >> around. >> >> You change line 256 from --> >> ---M---------------M---------------M---------------------------- >> >> To--> >> -----------------------------------M---------------------------- >> >> >> Maker uses the BioPerl is_start_codon function to determine if the codon >> used in a result it receives is acceptable or if it should look upstream >> or downstream for a different one. Apparently there are actually 3 >> acceptable start codons in the standard codon table (2 rare ones and one >> common), so making the edit removes the two rare ones from the list. 
>> >> I'm also preparing a 2.30 release that exports a "strict" codon table to >> the BioPerl module, that will be used by default and should fix the >>issue. >> >> Thanks, >> Carson >> >> >> >> On 10/12/13 11:06 AM, "Felix Bemm" wrote: >> >>> Hi, >>> >>> I have a serious problems with maker annotations especially the start >>> codon placement. It started happening with maker version 2.27! About >>>1/3 >>> of my genes are missing a proper start codon. When reviewing the GFF >>> files gene predictors and est evidence standalone would predict the >>> right gene structure including the start codon. I was thinking that >>>this >>> problem was somehow linked to my gene models but looking at their >>> standalone prediction I discarded that theory. Just some stats about >>> proteins with proper start codon (first column) and missing start codon >>> (second column). >>> >>> Maker 9268 4215 >>> SNAP 16577 896 >>> Genemark 18764 290 >>> >>> Numbers look the same when Augustus is used (currently retraining). >>> Manually using EST's in Apollo often fixes the problems but I don't >>>want >>> to check > 4000 genes for this. >>> >>> Do you have any idea what could cause such a problem? If you need some >>> example gff files I can send you. 
>>> >>> Best regards >>> Felix >>> >>> -- >>> Felix Bemm >>> Department of Bioinformatics >>> University of W?rzburg, Germany >>> Tel: +49 931 - 31 83696 >>> Fax: +49 931 - 31 84552 >>> felix.bemm at uni-wuerzburg.de >>> >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > >-- >Felix Bemm >Department of Bioinformatics >University of W?rzburg, Germany >Tel: +49 931 - 31 83696 >Fax: +49 931 - 31 84552 >felix.bemm at uni-wuerzburg.de From carsonhh at gmail.com Mon Oct 14 08:45:25 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 14 Oct 2013 10:45:25 -0400 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: <525975C6.4020403@uni-wuerzburg.de> Message-ID: Probably Tuesday or Wednesday. --Carson On 10/12/13 12:16 PM, "Felix Bemm" wrote: >Thx Carson! Do you now when 2.30 will be available? Bests > >On 12.10.2013 17:48, Carson Holt wrote: >> This issue just came up this week on another e-mail as well. >> >> One way to fix it, is to edit the Bio::Tools::CodonTable module from >> BioPerl (use maker --debug to have maker print out the locations of all >> modules it uses). Such a hack should only be done as a temporary work >> around. >> >> You change line 256 from --> >> ---M---------------M---------------M---------------------------- >> >> To--> >> -----------------------------------M---------------------------- >> >> >> Maker uses the BioPerl is_start_codon function to determine if the codon >> used in a result it receives is acceptable or if it should look upstream >> or downstream for a different one. Apparently there are actually 3 >> acceptable start codons in the standard codon table (2 rare ones and one >> common), so making the edit removes the two rare ones from the list. 
>> >> I'm also preparing a 2.30 release that exports a "strict" codon table to >> the BioPerl module, that will be used by default and should fix the >>issue. >> >> Thanks, >> Carson >> >> >> >> On 10/12/13 11:06 AM, "Felix Bemm" wrote: >> >>> Hi, >>> >>> I have a serious problems with maker annotations especially the start >>> codon placement. It started happening with maker version 2.27! About >>>1/3 >>> of my genes are missing a proper start codon. When reviewing the GFF >>> files gene predictors and est evidence standalone would predict the >>> right gene structure including the start codon. I was thinking that >>>this >>> problem was somehow linked to my gene models but looking at their >>> standalone prediction I discarded that theory. Just some stats about >>> proteins with proper start codon (first column) and missing start codon >>> (second column). >>> >>> Maker 9268 4215 >>> SNAP 16577 896 >>> Genemark 18764 290 >>> >>> Numbers look the same when Augustus is used (currently retraining). >>> Manually using EST's in Apollo often fixes the problems but I don't >>>want >>> to check > 4000 genes for this. >>> >>> Do you have any idea what could cause such a problem? If you need some >>> example gff files I can send you. 
>>> >>> Best regards >>> Felix >>> >>> -- >>> Felix Bemm >>> Department of Bioinformatics >>> University of W?rzburg, Germany >>> Tel: +49 931 - 31 83696 >>> Fax: +49 931 - 31 84552 >>> felix.bemm at uni-wuerzburg.de >>> >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > >-- >Felix Bemm >Department of Bioinformatics >University of W?rzburg, Germany >Tel: +49 931 - 31 83696 >Fax: +49 931 - 31 84552 >felix.bemm at uni-wuerzburg.de From cjfields at illinois.edu Mon Oct 14 09:07:43 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Mon, 14 Oct 2013 15:07:43 +0000 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: References: Message-ID: Carson, Regarding the CodonTable change below, should this be something that needs to be fixed, or made more flexible, in bioperl? I could see this being a problem if using MAKER for bacterial annotation (where the alternative start codons are actually more common). chris On Oct 14, 2013, at 9:45 AM, Carson Holt wrote: > Probably Tuesday or Wednesday. > > --Carson > > > On 10/12/13 12:16 PM, "Felix Bemm" wrote: > >> Thx Carson! Do you now when 2.30 will be available? Bests >> >> On 12.10.2013 17:48, Carson Holt wrote: >>> This issue just came up this week on another e-mail as well. >>> >>> One way to fix it, is to edit the Bio::Tools::CodonTable module from >>> BioPerl (use maker --debug to have maker print out the locations of all >>> modules it uses). Such a hack should only be done as a temporary work >>> around. 
>>> >>> You change line 256 from --> >>> ---M---------------M---------------M---------------------------- >>> >>> To--> >>> -----------------------------------M---------------------------- >>> >>> >>> Maker uses the BioPerl is_start_codon function to determine if the codon >>> used in a result it receives is acceptable or if it should look upstream >>> or downstream for a different one. Apparently there are actually 3 >>> acceptable start codons in the standard codon table (2 rare ones and one >>> common), so making the edit removes the two rare ones from the list. >>> >>> I'm also preparing a 2.30 release that exports a "strict" codon table to >>> the BioPerl module, that will be used by default and should fix the >>> issue. >>> >>> Thanks, >>> Carson >>> >>> >>> >>> On 10/12/13 11:06 AM, "Felix Bemm" wrote: >>> >>>> Hi, >>>> >>>> I have a serious problems with maker annotations especially the start >>>> codon placement. It started happening with maker version 2.27! About >>>> 1/3 >>>> of my genes are missing a proper start codon. When reviewing the GFF >>>> files gene predictors and est evidence standalone would predict the >>>> right gene structure including the start codon. I was thinking that >>>> this >>>> problem was somehow linked to my gene models but looking at their >>>> standalone prediction I discarded that theory. Just some stats about >>>> proteins with proper start codon (first column) and missing start codon >>>> (second column). >>>> >>>> Maker 9268 4215 >>>> SNAP 16577 896 >>>> Genemark 18764 290 >>>> >>>> Numbers look the same when Augustus is used (currently retraining). >>>> Manually using EST's in Apollo often fixes the problems but I don't >>>> want >>>> to check > 4000 genes for this. >>>> >>>> Do you have any idea what could cause such a problem? If you need some >>>> example gff files I can send you. 
>>>> >>>> Best regards >>>> Felix >>>> >>>> -- >>>> Felix Bemm >>>> Department of Bioinformatics >>>> University of Würzburg, Germany >>>> Tel: +49 931 - 31 83696 >>>> Fax: +49 931 - 31 84552 >>>> felix.bemm at uni-wuerzburg.de >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >> -- >> Felix Bemm >> Department of Bioinformatics >> University of Würzburg, Germany >> Tel: +49 931 - 31 83696 >> Fax: +49 931 - 31 84552 >> felix.bemm at uni-wuerzburg.de > > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From carsonhh at gmail.com Mon Oct 14 11:58:24 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 14 Oct 2013 13:58:24 -0400 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: Message-ID: I think adding one more codon table to the set of available tables would be sufficient. Call it the 'Strict' table or 'Canonical' table. It doesn't need to be the default, but should be a selectable option just as the other codon tables are. There is a way to export codon tables to the module, which is what I've now added to MAKER, but I'm sure other BioPerl users would find a strictly canonical codon table useful as well. Thanks, Carson On 10/14/13 11:07 AM, "Fields, Christopher J" wrote: >Carson, > >Regarding the CodonTable change below, should this be something that >needs to be fixed, or made more flexible, in bioperl? I could see this >being a problem if using MAKER for bacterial annotation (where the >alternative start codons are actually more common). > >chris > >On Oct 14, 2013, at 9:45 AM, Carson Holt wrote: > >> Probably Tuesday or Wednesday. >> >> --Carson >> >> >> On 10/12/13 12:16 PM, "Felix Bemm" wrote: >> >>> Thx Carson!
Do you now when 2.30 will be available? Bests >>> >>> On 12.10.2013 17:48, Carson Holt wrote: >>>> This issue just came up this week on another e-mail as well. >>>> >>>> One way to fix it, is to edit the Bio::Tools::CodonTable module from >>>> BioPerl (use maker --debug to have maker print out the locations of >>>>all >>>> modules it uses). Such a hack should only be done as a temporary work >>>> around. >>>> >>>> You change line 256 from --> >>>> ---M---------------M---------------M---------------------------- >>>> >>>> To--> >>>> -----------------------------------M---------------------------- >>>> >>>> >>>> Maker uses the BioPerl is_start_codon function to determine if the >>>>codon >>>> used in a result it receives is acceptable or if it should look >>>>upstream >>>> or downstream for a different one. Apparently there are actually 3 >>>> acceptable start codons in the standard codon table (2 rare ones and >>>>one >>>> common), so making the edit removes the two rare ones from the list. >>>> >>>> I'm also preparing a 2.30 release that exports a "strict" codon table >>>>to >>>> the BioPerl module, that will be used by default and should fix the >>>> issue. >>>> >>>> Thanks, >>>> Carson >>>> >>>> >>>> >>>> On 10/12/13 11:06 AM, "Felix Bemm" >>>>wrote: >>>> >>>>> Hi, >>>>> >>>>> I have a serious problems with maker annotations especially the start >>>>> codon placement. It started happening with maker version 2.27! About >>>>> 1/3 >>>>> of my genes are missing a proper start codon. When reviewing the GFF >>>>> files gene predictors and est evidence standalone would predict the >>>>> right gene structure including the start codon. I was thinking that >>>>> this >>>>> problem was somehow linked to my gene models but looking at their >>>>> standalone prediction I discarded that theory. Just some stats about >>>>> proteins with proper start codon (first column) and missing start >>>>>codon >>>>> (second column). 
>>>>> >>>>> Maker 9268 4215 >>>>> SNAP 16577 896 >>>>> Genemark 18764 290 >>>>> >>>>> Numbers look the same when Augustus is used (currently retraining). >>>>> Manually using EST's in Apollo often fixes the problems but I don't >>>>> want >>>>> to check > 4000 genes for this. >>>>> >>>>> Do you have any idea what could cause such a problem? If you need some >>>>> example gff files I can send you. >>>>> >>>>> Best regards >>>>> Felix >>>>> >>>>> -- >>>>> Felix Bemm >>>>> Department of Bioinformatics >>>>> University of Würzburg, Germany >>>>> Tel: +49 931 - 31 83696 >>>>> Fax: +49 931 - 31 84552 >>>>> felix.bemm at uni-wuerzburg.de >>>>> >>>>> _______________________________________________ >>>>> maker-devel mailing list >>>>> maker-devel at box290.bluehost.com >>>>> >>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >>> -- >>> Felix Bemm >>> Department of Bioinformatics >>> University of Würzburg, Germany >>> Tel: +49 931 - 31 83696 >>> Fax: +49 931 - 31 84552 >>> felix.bemm at uni-wuerzburg.de >> >> >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > From cjfields at illinois.edu Mon Oct 14 12:51:28 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Mon, 14 Oct 2013 18:51:28 +0000 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: References: Message-ID: <160BC401-0471-4285-8712-21440628A5DA@illinois.edu> Done, with some added tests. I labeled it 'Strict' ('Canonical' can be a bit of a loaded term). It's currently set as ID 24. I will likely push out a new 1.6 point release this week or next, and I'll merge these in for that release. chris On Oct 14, 2013, at 12:58 PM, Carson Holt wrote: > I think adding one more codon table to the set of available tables would > be sufficient. Call it the 'Strict' table or 'Canonical' table.
It > doesn't need to be the default, but should be a selectable option just as > the other codon tables are. There is a way to export codon tables to the > module, which is what I've now added to MAKER, but I'm sure other BoiPerl > users would find a strictly canonical codon table useful as well. > > Thanks, > Carson > > > On 10/14/13 11:07 AM, "Fields, Christopher J" > wrote: > >> Carson, >> >> Regarding the CodonTable change below, should this be something that >> needs to be fixed, or made more flexible, in bioperl? I could see this >> being a problem if using MAKER for bacterial annotation (where the >> alternative start codons are actually more common). >> >> chris >> >> On Oct 14, 2013, at 9:45 AM, Carson Holt wrote: >> >>> Probably Tuesday or Wednesday. >>> >>> --Carson >>> >>> >>> On 10/12/13 12:16 PM, "Felix Bemm" wrote: >>> >>>> Thx Carson! Do you now when 2.30 will be available? Bests >>>> >>>> On 12.10.2013 17:48, Carson Holt wrote: >>>>> This issue just came up this week on another e-mail as well. >>>>> >>>>> One way to fix it, is to edit the Bio::Tools::CodonTable module from >>>>> BioPerl (use maker --debug to have maker print out the locations of >>>>> all >>>>> modules it uses). Such a hack should only be done as a temporary work >>>>> around. >>>>> >>>>> You change line 256 from --> >>>>> ---M---------------M---------------M---------------------------- >>>>> >>>>> To--> >>>>> -----------------------------------M---------------------------- >>>>> >>>>> >>>>> Maker uses the BioPerl is_start_codon function to determine if the >>>>> codon >>>>> used in a result it receives is acceptable or if it should look >>>>> upstream >>>>> or downstream for a different one. Apparently there are actually 3 >>>>> acceptable start codons in the standard codon table (2 rare ones and >>>>> one >>>>> common), so making the edit removes the two rare ones from the list. 
>>>>> >>>>> I'm also preparing a 2.30 release that exports a "strict" codon table >>>>> to >>>>> the BioPerl module, that will be used by default and should fix the >>>>> issue. >>>>> >>>>> Thanks, >>>>> Carson >>>>> >>>>> >>>>> >>>>> On 10/12/13 11:06 AM, "Felix Bemm" >>>>> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I have a serious problems with maker annotations especially the start >>>>>> codon placement. It started happening with maker version 2.27! About >>>>>> 1/3 >>>>>> of my genes are missing a proper start codon. When reviewing the GFF >>>>>> files gene predictors and est evidence standalone would predict the >>>>>> right gene structure including the start codon. I was thinking that >>>>>> this >>>>>> problem was somehow linked to my gene models but looking at their >>>>>> standalone prediction I discarded that theory. Just some stats about >>>>>> proteins with proper start codon (first column) and missing start >>>>>> codon >>>>>> (second column). >>>>>> >>>>>> Maker 9268 4215 >>>>>> SNAP 16577 896 >>>>>> Genemark 18764 290 >>>>>> >>>>>> Numbers look the same when Augustus is used (currently retraining). >>>>>> Manually using EST's in Apollo often fixes the problems but I don't >>>>>> want >>>>>> to check > 4000 genes for this. >>>>>> >>>>>> Do you have any idea what could cause such a problem? If you need >>>>>> some >>>>>> example gff files I can send you. 
>>>>>> >>>>>> Best regards >>>>>> Felix >>>>>> >>>>>> -- >>>>>> Felix Bemm >>>>>> Department of Bioinformatics >>>>>> University of Würzburg, Germany >>>>>> Tel: +49 931 - 31 83696 >>>>>> Fax: +49 931 - 31 84552 >>>>>> felix.bemm at uni-wuerzburg.de >>>>>> >>>>>> _______________________________________________ >>>>>> maker-devel mailing list >>>>>> maker-devel at box290.bluehost.com >>>>>> >>>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>>> >>>> -- >>>> Felix Bemm >>>> Department of Bioinformatics >>>> University of Würzburg, Germany >>>> Tel: +49 931 - 31 83696 >>>> Fax: +49 931 - 31 84552 >>>> felix.bemm at uni-wuerzburg.de >>> >>> >>> >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> > > From carsonhh at gmail.com Mon Oct 14 18:59:45 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 14 Oct 2013 20:59:45 -0400 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: <160BC401-0471-4285-8712-21440628A5DA@illinois.edu> Message-ID: Thanks!! --Carson On 10/14/13 2:51 PM, "Fields, Christopher J" wrote: >Done, with some added tests. I labeled it 'Strict' ('Canonical' can be a >bit of a loaded term). It's currently set as ID 24. > >I will likely push out a new 1.6 point release this week or next, I'll >merge these in for that release. > >chris > >On Oct 14, 2013, at 12:58 PM, Carson Holt wrote: > >> I think adding one more codon table to the set of available tables would >> be sufficient. Call it the 'Strict' table or 'Canonical' table. It >> doesn't need to be the default, but should be a selectable option just as >> the other codon tables are. There is a way to export codon tables to the >> module, which is what I've now added to MAKER, but I'm sure other BioPerl >> users would find a strictly canonical codon table useful as well.
>> >> Thanks, >> Carson >> >> >> On 10/14/13 11:07 AM, "Fields, Christopher J" >> wrote: >> >>> Carson, >>> >>> Regarding the CodonTable change below, should this be something that >>> needs to be fixed, or made more flexible, in bioperl? I could see this >>> being a problem if using MAKER for bacterial annotation (where the >>> alternative start codons are actually more common). >>> >>> chris >>> >>> On Oct 14, 2013, at 9:45 AM, Carson Holt wrote: >>> >>>> Probably Tuesday or Wednesday. >>>> >>>> --Carson >>>> >>>> >>>> On 10/12/13 12:16 PM, "Felix Bemm" >>>>wrote: >>>> >>>>> Thx Carson! Do you now when 2.30 will be available? Bests >>>>> >>>>> On 12.10.2013 17:48, Carson Holt wrote: >>>>>> This issue just came up this week on another e-mail as well. >>>>>> >>>>>> One way to fix it, is to edit the Bio::Tools::CodonTable module from >>>>>> BioPerl (use maker --debug to have maker print out the locations of >>>>>> all >>>>>> modules it uses). Such a hack should only be done as a temporary >>>>>>work >>>>>> around. >>>>>> >>>>>> You change line 256 from --> >>>>>> ---M---------------M---------------M---------------------------- >>>>>> >>>>>> To--> >>>>>> -----------------------------------M---------------------------- >>>>>> >>>>>> >>>>>> Maker uses the BioPerl is_start_codon function to determine if the >>>>>> codon >>>>>> used in a result it receives is acceptable or if it should look >>>>>> upstream >>>>>> or downstream for a different one. Apparently there are actually 3 >>>>>> acceptable start codons in the standard codon table (2 rare ones and >>>>>> one >>>>>> common), so making the edit removes the two rare ones from the list. >>>>>> >>>>>> I'm also preparing a 2.30 release that exports a "strict" codon >>>>>>table >>>>>> to >>>>>> the BioPerl module, that will be used by default and should fix the >>>>>> issue. 
>>>>>> >>>>>> Thanks, >>>>>> Carson >>>>>> >>>>>> >>>>>> >>>>>> On 10/12/13 11:06 AM, "Felix Bemm" >>>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I have a serious problems with maker annotations especially the >>>>>>>start >>>>>>> codon placement. It started happening with maker version 2.27! >>>>>>>About >>>>>>> 1/3 >>>>>>> of my genes are missing a proper start codon. When reviewing the >>>>>>>GFF >>>>>>> files gene predictors and est evidence standalone would predict the >>>>>>> right gene structure including the start codon. I was thinking that >>>>>>> this >>>>>>> problem was somehow linked to my gene models but looking at their >>>>>>> standalone prediction I discarded that theory. Just some stats >>>>>>>about >>>>>>> proteins with proper start codon (first column) and missing start >>>>>>> codon >>>>>>> (second column). >>>>>>> >>>>>>> Maker 9268 4215 >>>>>>> SNAP 16577 896 >>>>>>> Genemark 18764 290 >>>>>>> >>>>>>> Numbers look the same when Augustus is used (currently retraining). >>>>>>> Manually using EST's in Apollo often fixes the problems but I don't >>>>>>> want >>>>>>> to check > 4000 genes for this. >>>>>>> >>>>>>> Do you have any idea what could cause such a problem? If you need >>>>>>> some >>>>>>> example gff files I can send you. >>>>>>> >>>>>>> Best regards >>>>>>> Felix >>>>>>> >>>>>>> -- >>>>>>> Felix Bemm >>>>>>> Department of Bioinformatics >>>>>>> University of W?rzburg, Germany >>>>>>> Tel: +49 931 - 31 83696 >>>>>>> Fax: +49 931 - 31 84552 >>>>>>> felix.bemm at uni-wuerzburg.de >>>>>>> >>>>>>> _______________________________________________ >>>>>>> maker-devel mailing list >>>>>>> maker-devel at box290.bluehost.com >>>>>>> >>>>>>> >>>>>>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab. 
>>>>>>>or >>>>>>> g >>>>> >>>>> -- >>>>> Felix Bemm >>>>> Department of Bioinformatics >>>>> University of Würzburg, Germany >>>>> Tel: +49 931 - 31 83696 >>>>> Fax: +49 931 - 31 84552 >>>>> felix.bemm at uni-wuerzburg.de >>>> >>>> >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> >>>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >> >> > From jfierst at uoregon.edu Mon Oct 21 11:40:06 2013 From: jfierst at uoregon.edu (Janna Fierst) Date: Mon, 21 Oct 2013 10:40:06 -0700 Subject: [maker-devel] Fwd: exon/intron boundaries In-Reply-To: References: Message-ID: Hi, thanks for your help. I have used Trinity to assemble the ESTs but am having a problem with the 2.29-beta release. Instead of starting and finishing a set of scaffolds, it appears to re-start over and over again. I am running it in parallel across 32 cores and I'm wondering if this is a problem with the parallel implementation. I've gone through it several times and not been able to find any errors or output that would suggest what the problem is. Thanks -Janna On Mon, Oct 7, 2013 at 6:20 AM, Carson Holt wrote: > Hi Janna, > > There are a couple of things to do. Download the maker-2.29p-beta from > the lab website. This includes some changes made to improve the > performance of correct_est_fusion. Whenever using the correct_est_fusion > option, this also reduces the influence of ESTs on annotation. So normally > you would need to increase your protein dataset, but given that you have > supplied 3 nematode species and all of UniProt already you should probably > be fine. But there is one last thing you can do. Instead of using > cufflinks, try using trinity to assemble the ESTs. There is a Jaccard clip > option that reduces merging caused by overlapping UTR. Between the > trinity and correct_est_fusion you should be able to really reduce the > effect of those ESTs.
> > If those changes don't work there is one last option. If you take the > MAKER results and filter them for SNAP and Augustus ab initio results > (match/match_part in the GFF3), then you can pass those in to the pred_gff > option. Then turn snaphmm and augustus_species off in the control files. > Basically what this will do is turn MAKER's hint-based prediction off and > force it to filter the ab initio results and select models directly from > there. Since the merging is being caused by bad hints (merged transcripts > from mRNAseq) this would reduce that effect. You will still need > correct_est_fusion=1 though to trim UTR coming from the merged transcripts, > because even though MAKER can't process hints to rerun SNAP and Augustus > this way, it will try and add UTR using the EST evidence. > > --Carson > > > From: Janna Fierst > Date: Friday, October 4, 2013 1:06 PM > To: > Subject: [maker-devel] Fwd: exon/intron boundaries > > Hi, > > thanks for your reply. I have been going through our annotations in detail > and trying different parameter sets, and I think I have identified what is > going on but I'm not sure how to set the MAKER2 parameters for our > situation. We are working with a species of Caenorhabditid worm and there > are long gene-dense blocks that are being incorrectly annotated as large > single genes instead of several smaller closely spaced genes. The protein > alignments (tblastx and protein2genome) show very clearly where the > exon/intron boundaries are and in most cases agree with the augustus > predictions. The assembled cufflinks output (through blastn, est2genome and > est_gff:cufflinks) does not agree in some locations; I think this may be > because in some cases the UTR nearly overlaps adjacent genes. > > I have included a screenshot of an annotated region viewed in apollo to > try to show this.
The large gene in the middle is actually 7 different > genes that are extremely close together and MAKER2 is collapsing them into > a single gene. I tried running without any RNASeq/cufflinks data and MAKER2 > annotates the region as two genes instead of one, but I can't get it to > recognize the 7 as different genes. I have retrained SNAP but we have not > been able to successfully train Augustus; we are currently using the > default caenorhabditis species model. I also included a species-specific > repeat library. I tried setting correct_est_fusion=1 and reducing > pred_flank but these changes appear to really alter the annotations and we > end up annotating almost nothing. I also tried setting est2genome=0 to > decrease the influence of the cufflinks assembly but it didn't appear to > help. There are some very large introns in these genes so I haven't yet > tried decreasing the maximum intron size because I'm concerned this may > generate too many split genes instead of our current merged gene problem. > Thanks for your help, any advice is greatly appreciated! -Janna Fierst > > > On Mon, Aug 26, 2013 at 12:21 PM, Carson Holt wrote: > >> Are you getting gene fusions or just more exons? Gene fusions can be >> reduced by setting correct_est_fusion=1, or reducing pred_flank, although >> reducing pred_flank can cause other issues (but those generally only appear >> if setting the value below 150). Also if you have the maximum intron >> size set too high (split_hit option), you may also be generating bridging >> alignments that make evidence align across distant paralogous genes as well >> (this can result in gene merging). >> >> You should also look at your results manually in a viewer like Apollo. >> Then see if the extra exons are supported by something such as protein >> alignments from another species.
If this is the case, you may have a >> poorly annotated protein set that is being used as evidence that is >> carrying over its erroneous exons into the species you are annotating. If >> the extra exons are supported by EST evidence, then perhaps you should try >> and rebuild the EST assembly (for example trinity has an option to use a >> Jaccard similarity coefficient to avoid fusing transcripts). >> >> Another option is to retrain SNAP or Augustus. MAKER does not actually >> produce any of the models itself (it is a pipeline not a predictor). The >> models are all generated using these other algorithms; MAKER just feeds >> them hints based on protein and transcript alignments, so making sure >> training is sufficient is important for those programs to produce their >> best models. >> >> Finally make sure your repeat database is sufficient; you may need to >> generate a species-specific repeat library using something like >> RepeatModeler. Repeats can end up being included as extra exons in gene >> models because they may contain reading frames that do code for proteins >> (e.g. reverse transcriptases). >> >> If you have any questions on any of the above, just let us know. >> >> Thanks, >> Carson >> >> >> From: Janna Fierst >> Date: Monday, August 26, 2013 2:54 PM >> To: >> Subject: [maker-devel] exon/intron boundaries >> >> Hi, >> >> I am using MAKER 2.28 to annotate a Caenorhabditid worm genome, and the >> initial results appear fairly good but we seem to be annotating too many >> exons for multiple genes. I was wondering which parameters should be tuned >> to change the threshold for exon/intron boundaries?
Thanks for your help >> -Janna Fierst >> _______________________________________________ maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> > > > _______________________________________________ maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From graham.etherington at sainsbury-laboratory.ac.uk Thu Oct 24 07:42:51 2013 From: graham.etherington at sainsbury-laboratory.ac.uk (graham etherington (TSL)) Date: Thu, 24 Oct 2013 13:42:51 +0000 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: Message-ID: Hi, I notice that the latest version of Maker available from http://www.yandell-lab.org/software/maker.html is still v2.29b (4 Oct 2013). As we're also having the start codon problem we'd like to get hold of v2.30. Do you know when this will be released? Many thanks, Graham Dr. Graham Etherington Bioinformatics Support Officer, The Sainsbury Laboratory, Norwich Research Park, Norwich NR4 7UH. UK Tel: +44 (0)1603 450601 On 13/10/2013 06:26, "Carson Holt" wrote: >Before Wednesday, because after that I'm on a 6 day road trip, so I have >to have it wrapped up before I leave. > >--Carson > > > > >On 10/12/13 12:16 PM, "Felix Bemm" wrote: > >>Thx Carson! Do you now when 2.30 will be available? Bests >> >>On 12.10.2013 17:48, Carson Holt wrote: >>> This issue just came up this week on another e-mail as well. >>> >>> One way to fix it, is to edit the Bio::Tools::CodonTable module from >>> BioPerl (use maker --debug to have maker print out the locations of all >>> modules it uses). Such a hack should only be done as a temporary work >>> around. 
>>> >>> You change line 256 from --> >>> ---M---------------M---------------M---------------------------- >>> >>> To--> >>> -----------------------------------M---------------------------- >>> >>> >>> Maker uses the BioPerl is_start_codon function to determine if the >>>codon >>> used in a result it receives is acceptable or if it should look >>>upstream >>> or downstream for a different one. Apparently there are actually 3 >>> acceptable start codons in the standard codon table (2 rare ones and >>>one >>> common), so making the edit removes the two rare ones from the list. >>> >>> I'm also preparing a 2.30 release that exports a "strict" codon table >>>to >>> the BioPerl module, that will be used by default and should fix the >>>issue. >>> >>> Thanks, >>> Carson >>> >>> >>> >>> On 10/12/13 11:06 AM, "Felix Bemm" wrote: >>> >>>> Hi, >>>> >>>> I have a serious problems with maker annotations especially the start >>>> codon placement. It started happening with maker version 2.27! About >>>>1/3 >>>> of my genes are missing a proper start codon. When reviewing the GFF >>>> files gene predictors and est evidence standalone would predict the >>>> right gene structure including the start codon. I was thinking that >>>>this >>>> problem was somehow linked to my gene models but looking at their >>>> standalone prediction I discarded that theory. Just some stats about >>>> proteins with proper start codon (first column) and missing start >>>>codon >>>> (second column). >>>> >>>> Maker 9268 4215 >>>> SNAP 16577 896 >>>> Genemark 18764 290 >>>> >>>> Numbers look the same when Augustus is used (currently retraining). >>>> Manually using EST's in Apollo often fixes the problems but I don't >>>>want >>>> to check > 4000 genes for this. >>>> >>>> Do you have any idea what could cause such a problem? If you need some >>>> example gff files I can send you. 
>>>> >>>> Best regards >>>> Felix >>>> >>>> -- >>>> Felix Bemm >>>> Department of Bioinformatics >>>> University of W?rzburg, Germany >>>> Tel: +49 931 - 31 83696 >>>> Fax: +49 931 - 31 84552 >>>> felix.bemm at uni-wuerzburg.de >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> >>>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >>-- >>Felix Bemm >>Department of Bioinformatics >>University of W?rzburg, Germany >>Tel: +49 931 - 31 83696 >>Fax: +49 931 - 31 84552 >>felix.bemm at uni-wuerzburg.de > > > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From myandell at genetics.utah.edu Thu Oct 24 07:55:05 2013 From: myandell at genetics.utah.edu (Mark Yandell) Date: Thu, 24 Oct 2013 13:55:05 +0000 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: References: , Message-ID: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> Hi Graham, Carson is currently on the road, moving his family from Canada back to the States. He should be back online sometime next week. Sorry for the delay. --mark Mark Yandell Professor of Human Genetics H.A. & Edna Benning Presidential Endowed Chair Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:801-587-7707 ________________________________________ From: maker-devel [maker-devel-bounces at yandell-lab.org] on behalf of graham etherington (TSL) [graham.etherington at sainsbury-laboratory.ac.uk] Sent: Thursday, October 24, 2013 7:42 AM To: Carson Holt; Felix Bemm; maker-devel at yandell-lab.org Cc: Kate Bailey (TSL) Subject: Re: [maker-devel] Serious Start Codon Problem Hi, I notice that the latest version of Maker available from http://www.yandell-lab.org/software/maker.html is still v2.29b (4 Oct 2013). 
As we're also having the start codon problem we'd like to get hold of v2.30. Do you know when this will be released? Many thanks, Graham Dr. Graham Etherington Bioinformatics Support Officer, The Sainsbury Laboratory, Norwich Research Park, Norwich NR4 7UH. UK Tel: +44 (0)1603 450601 On 13/10/2013 06:26, "Carson Holt" wrote: >Before Wednesday, because after that I'm on a 6 day road trip, so I have >to have it wrapped up before I leave. > >--Carson > > > > >On 10/12/13 12:16 PM, "Felix Bemm" wrote: > >>Thx Carson! Do you now when 2.30 will be available? Bests >> >>On 12.10.2013 17:48, Carson Holt wrote: >>> This issue just came up this week on another e-mail as well. >>> >>> One way to fix it, is to edit the Bio::Tools::CodonTable module from >>> BioPerl (use maker --debug to have maker print out the locations of all >>> modules it uses). Such a hack should only be done as a temporary work >>> around. >>> >>> You change line 256 from --> >>> ---M---------------M---------------M---------------------------- >>> >>> To--> >>> -----------------------------------M---------------------------- >>> >>> >>> Maker uses the BioPerl is_start_codon function to determine if the >>>codon >>> used in a result it receives is acceptable or if it should look >>>upstream >>> or downstream for a different one. Apparently there are actually 3 >>> acceptable start codons in the standard codon table (2 rare ones and >>>one >>> common), so making the edit removes the two rare ones from the list. >>> >>> I'm also preparing a 2.30 release that exports a "strict" codon table >>>to >>> the BioPerl module, that will be used by default and should fix the >>>issue. >>> >>> Thanks, >>> Carson >>> >>> >>> >>> On 10/12/13 11:06 AM, "Felix Bemm" wrote: >>> >>>> Hi, >>>> >>>> I have a serious problems with maker annotations especially the start >>>> codon placement. It started happening with maker version 2.27! About >>>>1/3 >>>> of my genes are missing a proper start codon. 
When reviewing the GFF >>>> files gene predictors and est evidence standalone would predict the >>>> right gene structure including the start codon. I was thinking that >>>>this >>>> problem was somehow linked to my gene models but looking at their >>>> standalone prediction I discarded that theory. Just some stats about >>>> proteins with proper start codon (first column) and missing start >>>>codon >>>> (second column). >>>> >>>> Maker 9268 4215 >>>> SNAP 16577 896 >>>> Genemark 18764 290 >>>> >>>> Numbers look the same when Augustus is used (currently retraining). >>>> Manually using EST's in Apollo often fixes the problems but I don't >>>>want >>>> to check > 4000 genes for this. >>>> >>>> Do you have any idea what could cause such a problem? If you need some >>>> example gff files I can send you. >>>> >>>> Best regards >>>> Felix >>>> >>>> -- >>>> Felix Bemm >>>> Department of Bioinformatics >>>> University of W?rzburg, Germany >>>> Tel: +49 931 - 31 83696 >>>> Fax: +49 931 - 31 84552 >>>> felix.bemm at uni-wuerzburg.de >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> >>>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >>-- >>Felix Bemm >>Department of Bioinformatics >>University of W?rzburg, Germany >>Tel: +49 931 - 31 83696 >>Fax: +49 931 - 31 84552 >>felix.bemm at uni-wuerzburg.de > > > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From felix.bemm at uni-wuerzburg.de Thu Oct 24 08:03:27 2013 From: felix.bemm at uni-wuerzburg.de (Felix Bemm) Date: Thu, 24 Oct 2013 16:03:27 +0200 Subject: [maker-devel] Serious Start Codon Problem 
In-Reply-To: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu>
References: , <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu>
Message-ID: <526928AF.90108@uni-wuerzburg.de>

Hi,

I used the temporary fix for Bio::Tools::CodonTable. It's working pretty
nicely. Almost every predicted protein now starts with an M ;)

Bests
Felix

On 24.10.2013 15:55, Mark Yandell wrote:
> Hi Graham,
>
> Carson is currently on the road, moving his family from Canada back to
> the States. He should be back online sometime next week. Sorry for the
> delay.
>
> --mark
>
> Mark Yandell
> Professor of Human Genetics
> H.A. & Edna Benning Presidential Endowed Chair
> Eccles Institute of Human Genetics
> University of Utah
> 15 North 2030 East, Room 2100
> Salt Lake City, UT 84112-5330
> ph:801-587-7707

--
Felix Bemm
Department of Bioinformatics
University of Würzburg, Germany
Tel: +49 931 - 31 83696
Fax: +49 931 - 31 84552
felix.bemm at uni-wuerzburg.de

From jim.hu.biobio at gmail.com Thu Oct 24 08:32:01 2013
From: jim.hu.biobio at gmail.com (JH Gmail)
Date: Thu, 24 Oct 2013 09:32:01 -0500
Subject: [maker-devel] Serious Start Codon Problem
In-Reply-To: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu>
References: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu>
Message-ID:
<69247335-1227-41CB-94EA-BFEEE5207D51@gmail.com>

Slightly off-topic questions: What is the assessment of right and wrong
based on? Is there a training set based on experiments? In bacteria, the
use of the minor starts is significant, but the mechanism is so different
I would expect eukaryotes to be stricter.

Does maker handle ribosome profiling yet?

jim

Sent from my iPad

From amelia.ireland at gmod.org Thu Oct 24 12:52:17 2013
From: amelia.ireland at gmod.org (Amelia Ireland)
Date: Thu, 24 Oct 2013 11:52:17 -0700
Subject: [maker-devel] GMOD 2014: San Diego
Message-ID:

Greetings GMOD community citizens!
Online registration is now open for the 2014 GMOD meeting, to be held at
the Best Western Seven Seas, San Diego, CA, on January 16-17, 2014 (after
PAG). The meeting is open to GMOD users, developers, and anyone interested
in the project and the software we provide.

Please see the meeting page, http://gmod.org/wiki/Jan_2014_GMOD_Meeting,
for more information. Early bird registration rates apply until Dec. 14th.

We have secured a discount on a block of rooms at the Best Western Seven
Seas; please see the wiki for more information on booking.

Please feel free to contact the GMOD helpdesk on help at gmod.org if you
have any questions.

Thanks!

--
Amelia Ireland
GMOD Community Support
http://gmod.org || @gmodproject

From barry.moore at genetics.utah.edu Thu Oct 24 15:09:20 2013
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Thu, 24 Oct 2013 15:09:20 -0600
Subject: [maker-devel] Serious Start Codon Problem
In-Reply-To: <69247335-1227-41CB-94EA-BFEEE5207D51@gmail.com>
References: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> <69247335-1227-41CB-94EA-BFEEE5207D51@gmail.com>
Message-ID: <1F2D4C5A-1B72-409A-8F7D-4793F9BCEE88@genetics.utah.edu>

Hi Jim,

You can add any kind of evidence you want to MAKER, including ribosomal
profiling data, as long as it is in GFF3 format. You could pass in your
GFF3-formatted ribosomal profiling data in maker_opts.ctl with the
protein_gff option and it will be treated as protein evidence. However, if
you are wondering whether anyone has coded a specific addition to MAKER to
accept ribosomal profiling data and treat it as a different kind of
evidence, then no, this hasn't been done.

B

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543

From marc.hoeppner at imbim.uu.se Fri Oct 25 08:20:04 2013
From: marc.hoeppner at imbim.uu.se (Marc P. Hoeppner)
Date: Fri, 25 Oct 2013 16:20:04 +0200
Subject: [maker-devel] Maker MPICH2 issue (Multiple MAKER processes have been started in the same directory)
Message-ID: <526A7E14.9010906@imbim.uu.se>

Dear list,

I know that MPICH2 related issues have been discussed previously, but the
suggested solutions don't seem to work for me.

Briefly, I have set up a new cluster, with a fresh install of mpich2 (SL
6.4, from the yum repos) and a newly compiled build of Maker 2.28. When I
start maker with mpiexec, I get spammed with 'WARNING: Multiple MAKER
processes have been started in the same directory.'

Here is what I have tried so far - still getting the error :-<

Any suggestions would be appreciated

-------- MPICH2 ---------

which mpiexec:
/usr/lib64/mpich2/bin/mpiexec

Started the mpd ring as normal user across 5 nodes (bnode-01 to bnode-04,
and the head node bhead)

mpdtrace:
bhead
bnode-02
bnode-03
bnode-04
bnode-01

mpdringtest:
time for 1 loops = 0.00128293037415 seconds

mpiexec -n 32 -machinefile hostfile hostname:

I also ran the SKaMPI MPI benchmark, which ran through just fine.

----------- MAKER -----------

./Build status

PERL Dependencies: VERIFIED
External Programs: VERIFIED
External C Libraries: VERIFIED
MPI SUPPORT: ENABLED
MWAS Web Interface: DISABLED
MAKER PACKAGE: CONFIGURATION OK

During configuration/installation, Build.PL automatically detected all MPI
data and never complained, so I have to assume that everything went fine
there.

--
----------
Marc P.
Hoeppner, PhD Department of Microbiology and Medical Biochemistry Uppsala University, Sweden marc.hoeppner at imbim.uu.se From jim.hu.biobio at gmail.com Fri Oct 25 08:51:19 2013 From: jim.hu.biobio at gmail.com (JH Gmail) Date: Fri, 25 Oct 2013 09:51:19 -0500 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: <1F2D4C5A-1B72-409A-8F7D-4793F9BCEE88@genetics.utah.edu> References: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> <69247335-1227-41CB-94EA-BFEEE5207D51@gmail.com> <1F2D4C5A-1B72-409A-8F7D-4793F9BCEE88@genetics.utah.edu> Message-ID: <596400F6-3215-4F71-A896-AF52CEC8D77D@gmail.com> Thanks Barry, I'm thinking about this more theoretically, as we don't have immediate plans, but... This would work for gff derived from ribosome profiling analysis, but it would be nice to be able to start with bam files of mapped reads, or coverage scores as bigwig or gff. RNA-seq has similar issues, but IIRC Maker expects RNA-seq data to be assembled first. It may not be possible to do de novo cds assembly from ribosome profiling because I think the nuclease step may kill the overlaps. I'd have to look at some of the available data. Jim Sent from my iPad > On Oct 24, 2013, at 4:09 PM, Barry Moore wrote: > > Hi Jim, > > You can and any kind of evidence you want to MAKER including ribosomal profiling data as long as you format it in GFF3 format. You could pass in your GFF3 formatted ribosomal profiling data in maker_opt.ctl with the protein_gff option and it will be treated as protein evidence. However if you are wondering if anyone has coded a specific addition to MAKER to accept ribosomal profiling data and treat it as a different kind of evidence then no, this hasn't been done. > > B > >> On Oct 24, 2013, at 8:32 AM, JH Gmail wrote: >> >> Slightly off-topic questions: What is the assessment of right and wrong based on? Is there a training set based on experiments? 
In bacteria, the use of the minor starts is significant, but the mechanism is so different I would expect eukaryotes to be more strict. >> >> Does maker handle ribosome profiling yet? >> >> jim >> >> Sent from my iPad >> >>> On Oct 24, 2013, at 8:55 AM, Mark Yandell wrote: >>> >>> Hi Graham, >>> >>> Carson is currently on the road, moving his family from Canada back to the States. He should be back online sometime next week. Sorry for the delay. >>> >>> >>> --mark >>> >>> >>> Mark Yandell >>> Professor of Human Genetics >>> H.A. & Edna Benning Presidential Endowed Chair >>> Eccles Institute of Human Genetics >>> University of Utah >>> 15 North 2030 East, Room 2100 >>> Salt Lake City, UT 84112-5330 >>> ph:801-587-7707 >>> >>> ________________________________________ >>> From: maker-devel [maker-devel-bounces at yandell-lab.org] on behalf of graham etherington (TSL) [graham.etherington at sainsbury-laboratory.ac.uk] >>> Sent: Thursday, October 24, 2013 7:42 AM >>> To: Carson Holt; Felix Bemm; maker-devel at yandell-lab.org >>> Cc: Kate Bailey (TSL) >>> Subject: Re: [maker-devel] Serious Start Codon Problem >>> >>> Hi, >>> I notice that the latest version of Maker available from >>> http://www.yandell-lab.org/software/maker.html is still v2.29b (4 Oct >>> 2013). As we're also having the start codon problem we'd like to get hold >>> of v2.30. Do you know when this will be released? >>> Many thanks, >>> Graham >>> >>> >>> Dr. Graham Etherington >>> Bioinformatics Support Officer, >>> The Sainsbury Laboratory, >>> Norwich Research Park, >>> Norwich NR4 7UH. >>> UK >>> Tel: +44 (0)1603 450601 >>> >>> >>> >>> >>> >>>> On 13/10/2013 06:26, "Carson Holt" wrote: >>>> >>>> Before Wednesday, because after that I'm on a 6 day road trip, so I have >>>> to have it wrapped up before I leave. >>>> >>>> --Carson >>>> >>>> >>>> >>>> >>>>> On 10/12/13 12:16 PM, "Felix Bemm" wrote: >>>>> >>>>> Thx Carson! Do you now when 2.30 will be available? 
Bests >>>>> >>>>>> On 12.10.2013 17:48, Carson Holt wrote: >>>>>> This issue just came up this week on another e-mail as well. >>>>>> >>>>>> One way to fix it, is to edit the Bio::Tools::CodonTable module from >>>>>> BioPerl (use maker --debug to have maker print out the locations of all >>>>>> modules it uses). Such a hack should only be done as a temporary work >>>>>> around. >>>>>> >>>>>> You change line 256 from --> >>>>>> ---M---------------M---------------M---------------------------- >>>>>> >>>>>> To--> >>>>>> -----------------------------------M---------------------------- >>>>>> >>>>>> >>>>>> Maker uses the BioPerl is_start_codon function to determine if the >>>>>> codon >>>>>> used in a result it receives is acceptable or if it should look >>>>>> upstream >>>>>> or downstream for a different one. Apparently there are actually 3 >>>>>> acceptable start codons in the standard codon table (2 rare ones and >>>>>> one >>>>>> common), so making the edit removes the two rare ones from the list. >>>>>> >>>>>> I'm also preparing a 2.30 release that exports a "strict" codon table >>>>>> to >>>>>> the BioPerl module, that will be used by default and should fix the >>>>>> issue. >>>>>> >>>>>> Thanks, >>>>>> Carson >>>>>> >>>>>> >>>>>> >>>>>>> On 10/12/13 11:06 AM, "Felix Bemm" wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I have a serious problems with maker annotations especially the start >>>>>>> codon placement. It started happening with maker version 2.27! About >>>>>>> 1/3 >>>>>>> of my genes are missing a proper start codon. When reviewing the GFF >>>>>>> files gene predictors and est evidence standalone would predict the >>>>>>> right gene structure including the start codon. I was thinking that >>>>>>> this >>>>>>> problem was somehow linked to my gene models but looking at their >>>>>>> standalone prediction I discarded that theory. 
Just some stats about >>>>>>> proteins with proper start codon (first column) and missing start >>>>>>> codon >>>>>>> (second column). >>>>>>> >>>>>>> Maker 9268 4215 >>>>>>> SNAP 16577 896 >>>>>>> Genemark 18764 290 >>>>>>> >>>>>>> Numbers look the same when Augustus is used (currently retraining). >>>>>>> Manually using EST's in Apollo often fixes the problems but I don't >>>>>>> want >>>>>>> to check > 4000 genes for this. >>>>>>> >>>>>>> Do you have any idea what could cause such a problem? If you need some >>>>>>> example gff files I can send you. >>>>>>> >>>>>>> Best regards >>>>>>> Felix >>>>>>> >>>>>>> -- >>>>>>> Felix Bemm >>>>>>> Department of Bioinformatics >>>>>>> University of W?rzburg, Germany >>>>>>> Tel: +49 931 - 31 83696 >>>>>>> Fax: +49 931 - 31 84552 >>>>>>> felix.bemm at uni-wuerzburg.de >>>>>>> >>>>>>> _______________________________________________ >>>>>>> maker-devel mailing list >>>>>>> maker-devel at box290.bluehost.com >>>>>>> >>>>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>>>> >>>>> -- >>>>> Felix Bemm >>>>> Department of Bioinformatics >>>>> University of W?rzburg, Germany >>>>> Tel: +49 931 - 31 83696 >>>>> Fax: +49 931 - 31 84552 >>>>> felix.bemm at uni-wuerzburg.de >>>> >>>> >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >>> >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> 
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > Barry Moore > Research Scientist > Dept. of Human Genetics > University of Utah > Salt Lake City, UT 84112 > -------------------------------------------- > (801) 585-3543 > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry.moore at genetics.utah.edu Fri Oct 25 09:44:55 2013 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Fri, 25 Oct 2013 09:44:55 -0600 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: <596400F6-3215-4F71-A896-AF52CEC8D77D@gmail.com> References: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> <69247335-1227-41CB-94EA-BFEEE5207D51@gmail.com> <1F2D4C5A-1B72-409A-8F7D-4793F9BCEE88@genetics.utah.edu> <596400F6-3215-4F71-A896-AF52CEC8D77D@gmail.com> Message-ID: Thanks Jim. B On Oct 25, 2013, at 8:51 AM, JH Gmail wrote: > Thanks Barry, > > I'm thinking about this more theoretically, as we don't have immediate plans, but... > > This would work for gff derived from ribosome profiling analysis, but it would be nice to be able to start with bam files of mapped reads, or coverage scores as bigwig or gff. RNA-seq has similar issues, but IIRC Maker expects RNA-seq data to be assembled first. It may not be possible to do de novo cds assembly from ribosome profiling because I think the nuclease step may kill the overlaps. I'd have to look at some of the available data. > > Jim > > Sent from my iPad > > On Oct 24, 2013, at 4:09 PM, Barry Moore wrote: > >> Hi Jim, >> >> You can and any kind of evidence you want to MAKER including ribosomal profiling data as long as you format it in GFF3 format. You could pass in your GFF3 formatted ribosomal profiling data in maker_opt.ctl with the protein_gff option and it will be treated as protein evidence. 
>> However, if you are wondering whether anyone has coded a specific addition to MAKER to accept ribosomal profiling data and treat it as a different kind of evidence, then no, this hasn't been done.
>>
>> B
>>
>> On Oct 24, 2013, at 8:32 AM, JH Gmail wrote:
>>
>>> Slightly off-topic questions: What is the assessment of right and wrong based on? Is there a training set based on experiments? In bacteria, the use of the minor starts is significant, but the mechanism is so different I would expect eukaryotes to be more strict.
>>>
>>> Does maker handle ribosome profiling yet?
>>>
>>> jim
>>>
>>> Sent from my iPad
>>>
>>>> On Oct 24, 2013, at 8:55 AM, Mark Yandell wrote:
>>>>
>>>> Hi Graham,
>>>>
>>>> Carson is currently on the road, moving his family from Canada back to the States. He should be back online sometime next week. Sorry for the delay.
>>>>
>>>>
>>>> --mark
>>>>
>>>>
>>>> Mark Yandell
>>>> Professor of Human Genetics
>>>> H.A. & Edna Benning Presidential Endowed Chair
>>>> Eccles Institute of Human Genetics
>>>> University of Utah
>>>> 15 North 2030 East, Room 2100
>>>> Salt Lake City, UT 84112-5330
>>>> ph:801-587-7707
>>>>
>>>> ________________________________________
>>>> From: maker-devel [maker-devel-bounces at yandell-lab.org] on behalf of graham etherington (TSL) [graham.etherington at sainsbury-laboratory.ac.uk]
>>>> Sent: Thursday, October 24, 2013 7:42 AM
>>>> To: Carson Holt; Felix Bemm; maker-devel at yandell-lab.org
>>>> Cc: Kate Bailey (TSL)
>>>> Subject: Re: [maker-devel] Serious Start Codon Problem
>>>>
>>>> Hi,
>>>> I notice that the latest version of Maker available from
>>>> http://www.yandell-lab.org/software/maker.html is still v2.29b (4 Oct
>>>> 2013). As we're also having the start codon problem we'd like to get hold
>>>> of v2.30. Do you know when this will be released?
>>>> Many thanks,
>>>> Graham
>>>>
>>>>
>>>> Dr. Graham Etherington
>>>> Bioinformatics Support Officer,
>>>> The Sainsbury Laboratory,
>>>> Norwich Research Park,
>>>> Norwich NR4 7UH.
>>>> UK
>>>> Tel: +44 (0)1603 450601
>>>>
>>>>
>>>>> On 13/10/2013 06:26, "Carson Holt" wrote:
>>>>>
>>>>> Before Wednesday, because after that I'm on a 6 day road trip, so I have
>>>>> to have it wrapped up before I leave.
>>>>>
>>>>> --Carson
>>>>>
>>>>>
>>>>>> On 10/12/13 12:16 PM, "Felix Bemm" wrote:
>>>>>>
>>>>>> Thx Carson! Do you know when 2.30 will be available? Bests
>>>>>>
>>>>>>> On 12.10.2013 17:48, Carson Holt wrote:
>>>>>>> This issue just came up this week on another e-mail as well.
>>>>>>>
>>>>>>> One way to fix it is to edit the Bio::Tools::CodonTable module from
>>>>>>> BioPerl (use maker --debug to have maker print out the locations of all
>>>>>>> modules it uses). Such a hack should only be done as a temporary
>>>>>>> workaround.
>>>>>>>
>>>>>>> You change line 256 from -->
>>>>>>> ---M---------------M---------------M----------------------------
>>>>>>>
>>>>>>> To-->
>>>>>>> -----------------------------------M----------------------------
>>>>>>>
>>>>>>>
>>>>>>> Maker uses the BioPerl is_start_codon function to determine if the codon
>>>>>>> used in a result it receives is acceptable or if it should look upstream
>>>>>>> or downstream for a different one. Apparently there are actually 3
>>>>>>> acceptable start codons in the standard codon table (2 rare ones and one
>>>>>>> common), so making the edit removes the two rare ones from the list.
>>>>>>>
>>>>>>> I'm also preparing a 2.30 release that exports a "strict" codon table to
>>>>>>> the BioPerl module, that will be used by default and should fix the
>>>>>>> issue.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Carson
>>>>>>>
>>>>>>>
>>>>>>>> On 10/12/13 11:06 AM, "Felix Bemm" wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I have a serious problem with maker annotations, especially the start
>>>>>>>> codon placement. It started happening with maker version 2.27! About 1/3
>>>>>>>> of my genes are missing a proper start codon. When reviewing the GFF
>>>>>>>> files, gene predictors and EST evidence standalone would predict the
>>>>>>>> right gene structure including the start codon. I was thinking that this
>>>>>>>> problem was somehow linked to my gene models but looking at their
>>>>>>>> standalone prediction I discarded that theory. Just some stats about
>>>>>>>> proteins with proper start codon (first column) and missing start codon
>>>>>>>> (second column).
>>>>>>>>
>>>>>>>> Maker      9268   4215
>>>>>>>> SNAP      16577    896
>>>>>>>> Genemark  18764    290
>>>>>>>>
>>>>>>>> Numbers look the same when Augustus is used (currently retraining).
>>>>>>>> Manually using ESTs in Apollo often fixes the problems but I don't want
>>>>>>>> to check > 4000 genes for this.
>>>>>>>>
>>>>>>>> Do you have any idea what could cause such a problem? If you need some
>>>>>>>> example gff files I can send you.
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>> Felix
>>>>>>>>
>>>>>>>> --
>>>>>>>> Felix Bemm
>>>>>>>> Department of Bioinformatics
>>>>>>>> University of Würzburg, Germany
>>>>>>>> Tel: +49 931 - 31 83696
>>>>>>>> Fax: +49 931 - 31 84552
>>>>>>>> felix.bemm at uni-wuerzburg.de
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> maker-devel mailing list
>>>>>>>> maker-devel at box290.bluehost.com
>>>>>>>>
>>>>>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>>>>>
>>>>>> --
>>>>>> Felix Bemm
>>>>>> Department of Bioinformatics
>>>>>> University of Würzburg, Germany
>>>>>> Tel: +49 931 - 31 83696
>>>>>> Fax: +49 931 - 31 84552
>>>>>> felix.bemm at uni-wuerzburg.de
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> maker-devel mailing list
>>>>> maker-devel at box290.bluehost.com
>>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>>>
>>>>
>>>> _______________________________________________
>>>> maker-devel mailing list
>>>> maker-devel at box290.bluehost.com
>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>>> _______________________________________________
>>>> maker-devel mailing list
>>>> maker-devel at box290.bluehost.com
>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>>
>>> _______________________________________________
>>> maker-devel mailing list
>>> maker-devel at box290.bluehost.com
>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>
>> Barry Moore
>> Research Scientist
>> Dept. of Human Genetics
>> University of Utah
>> Salt Lake City, UT 84112
>> --------------------------------------------
>> (801) 585-3543
>>
>>
>>

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543
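
To illustrate the workaround discussed in the thread above: the 64-character string Carson refers to has one position per codon and an M wherever that codon is accepted as a start. The Python sketch below is only an illustration, not BioPerl or MAKER code. It assumes the usual NCBI-style codon ordering (bases in T, C, A, G order, third position varying fastest), and the names CODONS, STANDARD_STARTS, STRICT_STARTS and start_codons are made up for this example.

```python
from itertools import product

# Assumed ordering: TTT, TTC, TTA, TTG, TCT, ... GGG (third base varies fastest).
BASES = "TCAG"
CODONS = ["".join(p) for p in product(BASES, repeat=3)]

# Start-codon strings built piecewise so the positions are easy to check.
# STANDARD_STARTS follows the "before" pattern in the thread (three M's),
# STRICT_STARTS follows the "after" pattern (a single M).
STANDARD_STARTS = "---M" + "-" * 15 + "M" + "-" * 15 + "M" + "-" * 28
STRICT_STARTS = "-" * 35 + "M" + "-" * 28


def start_codons(starts):
    """Return the codons flagged with 'M' in a 64-character starts string."""
    if len(starts) != len(CODONS):
        raise ValueError("starts string must have one character per codon")
    return [codon for codon, flag in zip(CODONS, starts) if flag == "M"]


if __name__ == "__main__":
    print(start_codons(STANDARD_STARTS))  # ['TTG', 'CTG', 'ATG']
    print(start_codons(STRICT_STARTS))    # ['ATG']
```

Under these assumptions the edit leaves ATG as the only accepted start, which is consistent with the "2 rare ones and one common" description in the thread.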
From marc.hoeppner at imbim.uu.se Sat Oct 26 05:54:07 2013
From: marc.hoeppner at imbim.uu.se (Marc P. Hoeppner)
Date: Sat, 26 Oct 2013 13:54:07 +0200
Subject: [maker-devel] Maker MPICH2 issue (Multiple MAKER processes have been started in the same directory)
In-Reply-To: <526A7E14.9010906@imbim.uu.se>
References: <526A7E14.9010906@imbim.uu.se>
Message-ID: <526BAD5F.2000800@imbim.uu.se>

Dear List,

after much more debugging, I found the solution myself. Here are the details:

The Maker installation instructions refer to 'mpich2' in multiple places, including weblinks (which are outdated btw). Some digging revealed that mpich2 has since been renamed and merged back into mpich. So the website is www.mpich.org and the current stable version is mpich-3.0.4. Closely following the installation instruction yielded a working copy of mpiexec that Maker was willing to accept. Good news - but I suggest to update the documentation. Most importantly for either SL/CentOS users, mpich2 as packaged with e.g. Scientific Linux *does not* work with Maker. You will need to download the latest stable build and compile it yourself (the mpich2 version bundled with Maker doesn't work either btw).

Cheers,

Marc

> Dear list,
>
> I know that MPICH2 related issues have been discussed previously, but
> the suggested solutions don't seem to work for me.
> Briefly, I have set up a new cluster, with a fresh install of mpich2
> (SL 6.4, from the yum repos) and a newly compiled build of Maker 2.28.
>
> When I start maker with mpiexec, I get spammed with
> 'WARNING: Multiple MAKER processes have been started in the same
> directory.'
>
> Here is what I have tried so far - still getting the error :-< Any
> suggestions would be appreciated
>
> --------
> MPICH2
> ---------
>
> which mpiexec:
> /usr/lib64/mpich2/bin/mpiexec
>
> Started the mpd ring as normal user across 5 nodes (bnode-01 to
> bnode-04, and the head node bhead)
>
> mpdtrace:
> bhead
> bnode-02
> bnode-03
> bnode-04
> bnode-01
>
> mpdringtest:
> time for 1 loops = 0.00128293037415 seconds
>
> mpiexec -n 32 -machinefile hostfile hostname:
>
>
>
> I also ran the Skampi Mpi Benchmark, which ran through just fine.
>
> -----------
> MAKER
> -----------
>
> ./Build status
>
> PERL Dependencies: VERIFIED
> External Programs: VERIFIED
> External C Libraries: VERIFIED
> MPI SUPPORT: ENABLED
> MWAS Web Interface: DISABLED
> MAKER PACKAGE: CONFIGURATION OK
>
> During configuration/installation, Build.PL automatically detected all
> MPI data and never complained, so I have to assume that everything
> went fine there.
>

--
----------
Marc P. Hoeppner, PhD
Department of Microbiology and Medical Biochemistry
Uppsala University, Sweden
marc.hoeppner at imbim.uu.se

From barry.utah at gmail.com Sat Oct 26 08:41:24 2013
From: barry.utah at gmail.com (Barry Moore)
Date: Sat, 26 Oct 2013 08:41:24 -0600
Subject: [maker-devel] Maker MPICH2 issue (Multiple MAKER processes have been started in the same directory)
In-Reply-To: <526BAD5F.2000800@imbim.uu.se>
References: <526A7E14.9010906@imbim.uu.se> <526BAD5F.2000800@imbim.uu.se>
Message-ID:

Hi Marc,

Thanks so much for sharing the results of all your hard work! We will follow up on your feedback and incorporate the necessary updates to MAKER docs and installation procedures.

B

On Oct 26, 2013, at 5:54 AM, Marc P. Hoeppner wrote:

> Dear List,
>
> after much more debugging, I found the solution myself. Here are the details:
>
> The Maker installation instructions refer to 'mpich2' in multiple places, including weblinks (which are outdated btw).
> Some digging revealed that mpich2 has since been renamed and merged back into mpich. So the website is www.mpich.org and the current stable version is mpich-3.0.4. Closely following the installation instruction yielded a working copy of mpiexec that Maker was willing to accept. Good news - but I suggest to update the documentation. Most importantly for either SL/CentOS users, mpich2 as packaged with e.g. Scientific Linux *does not* work with Maker. You will need to download the latest stable build and compile it yourself (the mpich2 version bundled with Maker doesn't work either btw).
>
> Cheers,
>
> Marc
>
>> Dear list,
>>
>> I know that MPICH2 related issues have been discussed previously, but the suggested solutions don't seem to work for me.
>> Briefly, I have set up a new cluster, with a fresh install of mpich2 (SL 6.4, from the yum repos) and a newly compiled build of Maker 2.28.
>>
>> When I start maker with mpiexec, I get spammed with
>> 'WARNING: Multiple MAKER processes have been started in the same directory.'
>>
>> Here is what I have tried so far - still getting the error :-< Any suggestions would be appreciated
>>
>> --------
>> MPICH2
>> ---------
>>
>> which mpiexec:
>> /usr/lib64/mpich2/bin/mpiexec
>>
>> Started the mpd ring as normal user across 5 nodes (bnode-01 to bnode-04, and the head node bhead)
>>
>> mpdtrace:
>> bhead
>> bnode-02
>> bnode-03
>> bnode-04
>> bnode-01
>>
>> mpdringtest:
>> time for 1 loops = 0.00128293037415 seconds
>>
>> mpixec -n 32 -machinefile hostfile hostname:
>>
>>
>>
>> I also ran the Skampi Mpi Benchmark, which ran through just fine.
>>
>> -----------
>> MAKER
>> -----------
>>
>> ./Build status
>>
>> PERL Dependencies: VERIFIED
>> External Programs: VERIFIED
>> External C Libraries: VERIFIED
>> MPI SUPPORT: ENABLED
>> MWAS Web Interface: DISABLED
>> MAKER PACKAGE: CONFIGURATION OK
>>
>> During configuration/installation, Build.PL automatically detected all MPI data and never complained, so I have to assume that everything went fine there.
>>
>
>
> --
> ----------
> Marc P. Hoeppner, PhD
> Department of Microbiology and Medical Biochemistry
> Uppsala University, Sweden
> marc.hoeppner at imbim.uu.se
>
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543

From wholtz at lygos.com Wed Oct 30 12:21:52 2013
From: wholtz at lygos.com (William Holtz)
Date: Wed, 30 Oct 2013 11:21:52 -0700 (PDT)
Subject: [maker-devel] maker apollo error
Message-ID: <1383157312.95043.YahooMailNeo@web5706.biz.mail.ne1.yahoo.com>

Hi,

I'm attempting to reply to a thread on this list from before I was a member. My apologies if this gets messed up.

I've tried installing maker v2.28 and v2.29p-beta and both of them fail due to the URL of the apollo tar file being outdated as previously reported on this list. The updated URL that Carson Holt posted, http://sourceforge.net/code-snapshots/svn/g/gm/gmod/svn/gmod-svn-25291-apollo-trunk.zip, is also no longer valid. I was able to find a copy of the apollo source at http://downloads.sourceforge.net/project/gmod/Apollo/1.11.1/apollo-1.11.1.tgz.
However, my attempts to integrate this into the maker build process have been unsuccessful, but likely could be done by someone more skilled than I. Any pointers on how to get past this hurdle would be appreciated.

Thanks in advance for your help.

-Will

From barry.utah at gmail.com Wed Oct 30 16:31:58 2013
From: barry.utah at gmail.com (Barry Moore)
Date: Wed, 30 Oct 2013 16:31:58 -0600
Subject: [maker-devel] maker apollo error
In-Reply-To: <1383157312.95043.YahooMailNeo@web5706.biz.mail.ne1.yahoo.com>
References: <1383157312.95043.YahooMailNeo@web5706.biz.mail.ne1.yahoo.com>
Message-ID:

Hi Will,

You can edit these URLs in the following file:

maker/src/locations

In a section that looks like this:

    'apollo' => {'Linux_x86_64'  => 'http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar',
                 'Linux_i386'    => 'http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar',
                 'Darwin_i386'   => 'http://weatherby.genetics.utah.edu/apollo/apollo.tar.gz',
                 'Darwin_x86_64' => 'http://weatherby.genetics.utah.edu/apollo/apollo.tar.gz',
                 'src_'          => 'http://weatherby.genetics.utah.edu/apollo/apollo.tar.gz',
                },

Carson may have more feedback for you as well, but he is right in the middle of a move, so it may be a few days before he can reply; modifying this file should work for you in the meantime. For any of the executables that are missing, you can always install them manually and add them to your path, and maker will find them. Also, Apollo shouldn't be required for a base maker install, so you could ignore that dependency as well.

Thanks,

B

On Oct 30, 2013, at 12:21 PM, William Holtz wrote:

> Hi,
>
> I'm attempting to reply to a thread on this list from before I was a member. My apologies if this gets messed up.
>
> I've tried installing maker v2.28 and v2.29p-beta and both of them fail due to the URL of the apollo tar file being outdated as previously reported on this list.
> The updated URL that Carson Holt posted, http://sourceforge.net/code-snapshots/svn/g/gm/gmod/svn/gmod-svn-25291-apollo-trunk.zip, is also no longer valid. I was able to find a copy of the apollo source at http://downloads.sourceforge.net/project/gmod/Apollo/1.11.1/apollo-1.11.1.tgz. However my attempts to integrate this into the maker build process have been unsuccessful, but likely could be done by someone more skilled than I. Any pointers on how to get past this hurdle would be appreciated.
>
> Thanks in advance for your help.
>
> -Will
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543

From Xia.Cao at dupont.com Tue Oct 1 13:00:42 2013
From: Xia.Cao at dupont.com (Xia.Cao at dupont.com)
Date: Tue, 1 Oct 2013 19:00:42 +0000
Subject: [maker-devel] maker2 scripts for functional annotation
In-Reply-To:
References:
Message-ID:

Hi Carson,

Thank you for your message and your kind help. Now conversion from match/match_part to gene/mRNA/exons/CDS works well after blanking out some options in control file.

I have one more question about maker2. In case we don't have EST evidence (set of ESTs or assembled mRNA-seq in fasta format) for a genome, does maker2 function? If it does function, could you please let me know the performance of maker2 without providing EST evidence compared to the one with EST evidence?

Thank you again and I look forward to hearing from you.

Best,
Xia

-----Original Message-----
From: Carson Holt [mailto:carsonhh at gmail.com]
Sent: Wednesday, September 25, 2013 10:36 AM
To: CAO, XIA; myandell at genetics.utah.edu; RIVERA, CORBAN GREGORY; maker-devel at yandell-lab.org
Subject: Re: [maker-devel] maker2 scripts for functional annotation

If it is launching predictors then you have snap hmm or augustus_species set. You need to blank out all other options in the control files (including repeat masking options, proteins, ESTs, etc.)
when trying to convert match/match_part to gene/mRNA/exons/CDS, or else those other programs will run.

--Carson

On 9/25/13 10:31 AM, "Xia.Cao at dupont.com" wrote:

>Hi Carson,
>
>Thank you for the message and your kind help. We tested maker2 by
>setting keep_preds=1, pred_gff=generated_gff_file_from_first_makerRun .
>But it seemed maker2 started to launch all predictors again and it took
>a long time to finish. I wonder if there is any way that we can directly
>get gene/mRNA/exons/CDS gff file without re-running maker2 to convert
>match/match_part features into gene/mRNA/exons/CDS.
>
>Thanks,
>Xia
>
>-----Original Message-----
>From: Carson Holt [mailto:carsonhh at gmail.com]
>Sent: Thursday, September 19, 2013 5:58 PM
>To: Mark Yandell; CAO, XIA; RIVERA, CORBAN GREGORY;
>maker-devel at yandell-lab.org
>Subject: Re: [maker-devel] maker2 scripts for functional annotation
>
>Hello Corban & Xia,
>
>Some scripts like gff3_preds2models are deprecated. To get the same
>result as was offered by gff3_preds2models, just give your
>match/match_part features to pred_gff= in the maker_opts.ctl file, set
>keep_preds=1, and run with all other options and predictors turned off.
>The final MAKER result will be your match/match_part features converted
>into gene/mRNA/exons/CDS.
>
>For functional annotation, you can use InterProScan, BLASTP against
>UniProt, or BLAST2GO. My preference is to use InterProScan to add GO
>terms and protein domains via the ipr_update_gff and iprscan2gff3
>scripts. Then add putative gene functions via BLASTP to UniProt and
>maker_functional_fasta and maker_functional_gff scripts.
>
>Go ahead and take a look at those tools and let me know if you
>have any questions or need help configuring them.
>
>Thanks,
>Carson
>
>
>On 9/19/13 11:53 AM, "Mark Yandell" wrote:
>
>>Hi Corban & Xia,
>>
>>
>>I've forwarded your question along to the MAKER_dev list, where you can
>>get speedy answers to your maker related questions. Thanks for using
>>MAKER.
>>
>>--mark
>>
>>
>>Mark Yandell
>>Professor of Human Genetics
>>H.A. & Edna Benning Presidential Endowed Chair
>>Eccles Institute of Human Genetics
>>University of Utah
>>15 North 2030 East, Room 2100
>>Salt Lake City, UT 84112-5330
>>ph:801-587-7707
>>
>>________________________________________
>>From: Xia.Cao at dupont.com [Xia.Cao at dupont.com]
>>Sent: Thursday, September 19, 2013 11:49 AM
>>To: Mark Yandell; Corban-Gregory.Rivera at dupont.com
>>Subject: maker2 scripts for functional annotation
>>
>>Dr. Yandell,
>>
>>We were recently evaluating maker2 for annotation and going through
>>the maker tutorial from 2012.
>>
>>http://gmod.org/wiki/MAKER_Tutorial_2012
>>
>>The tutorial makes references to some scripts that we couldn't find in
>>the current release. We were looking for scripts like
>>gff3_preds2models to convert match/match_part format into annotations
>>with gene/mRNA/exons/CDS and others. I was wondering if maybe we did
>>not have the most up to date version.
>>
>>In addition to getting accurate gene annotations, I was looking for a
>>solution to get functional assignments. I see that there are some
>>scripts like maker_functional_fasta that may help, but I was wondering
>>what you would recommend.
>>
>>Thanks,
>>
>>Corban & Xia
>>
>>_______________________________________________
>>maker-devel mailing list
>>maker-devel at box290.bluehost.com
>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>

This communication is for use by the intended recipient and contains information that may be Privileged, confidential or copyrighted under applicable law. If you are not the intended recipient, you are hereby formally notified that any use, copying or distribution of this e-mail, in whole or in part, is strictly prohibited. Please notify the sender by return e-mail and delete this e-mail from your system. Unless explicitly and conspicuously designated as "E-Contract Intended", this e-mail does not constitute a contract offer, a contract amendment, or an acceptance of a contract offer. This e-mail does not constitute a consent to the use of sender's contact information for direct marketing purposes or for transfers of data to third parties.

The dupont.com web address will continue in use for a transitional period for communications sent or received on behalf of DuPont Performance Coatings., which is not affiliated in any way with the DuPont Company.

Francais Deutsch Italiano Espanol Portugues Japanese Chinese Korean

http://www.DuPont.com/corp/email_disclaimer.html

From carsonhh at gmail.com Wed Oct 2 08:43:29 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 02 Oct 2013 10:43:29 -0400
Subject: [maker-devel] maker2 scripts for functional annotation
In-Reply-To:
Message-ID:

That's ok. If it is very closely related, provide it directly to the est= option, otherwise provide it to the altest= option. The difference between the two options is how they are aligned. The altest= alignments take about 10x longer than the est= alignments.

--Carson

On 10/2/13 10:40 AM, "Xia.Cao at dupont.com" wrote:

>Hi Carson,
>
>Thank you for your quick and kind reply. One more question here. If we
>don't have EST evidence for the strain we sequenced/assembled, will it be
>ok to provide some EST evidence that is from some other strains which are
>close to the one we are trying to annotate? Could you please let us know
>how this will affect performance of maker2?
>
>Thank you again.
From Carson.Holt at oicr.on.ca Wed Oct 2 09:24:42 2013
From: Carson.Holt at oicr.on.ca (Carson Holt)
Date: Wed, 2 Oct 2013 15:24:42 +0000
Subject: [maker-devel] Incorrect gene model, antisense transcripts
In-Reply-To: 
Message-ID: 

SNAP appears to be making small ab initio calls all over the place. You may need to retrain it, or drop it from your analysis completely and just use Augustus. Try running with protein2genome and then training SNAP off of those results. Also make sure your repeat masking is sufficient; if it is not, you may be getting too many SNAP predictions.

Thanks,
Carson

From: Sivaranjani Namasivayam
Date: Tuesday, October 1, 2013 5:01 PM
To: "maker-devel-bounces at yandell-lab.org"
Subject: Incorrect gene model, antisense transcripts

Hello,

I am working on annotating a genome. I have RNAseq data assembled with Cufflinks and Trinity. I am using MAKER to predict gene models. Below is the procedure I used:
- The number of genes I got was much less than expected (Expect atleast 10K genes, got ~4K genes) - Used the gene models from the first round of annotations to train SNAP (default parameters) and provided the HMM for the second round of annotation (used augustus also). All other data I passed as such in the gff3, except the gene models.I also included the some manually curated genes in the EST evidence. - The number of genes were much higher, ~11k, but many seemed incorrect and appear to have been generated from the SNAP models. I am attaching a screen shot from Apollo showing this. Would you have any suggestions on why this might be happening and how I can fix? I am essentially looking for gene models for my assembled RNAseq, can I achieve this by setting est2genome to 1? Would I be losing/misrepresenting any data if I do this. Also, I recently noticed my RNAseq data has assembled antisense transcripts. MAKER seems to be predicting a coding region for these antisense transcripts in some cases, (although the coding region predicted is much smaller than that of the sense gene model.) Is there a way to tell MAKER not to predict gene models on both strands at the same loci. Appreciate your help. Thanks, Ranjani -------------- next part -------------- An HTML attachment was scrubbed... URL: From Xia.Cao at dupont.com Wed Oct 2 08:40:12 2013 From: Xia.Cao at dupont.com (Xia.Cao at dupont.com) Date: Wed, 2 Oct 2013 14:40:12 +0000 Subject: [maker-devel] maker2 scripts for functional annotation In-Reply-To: References: Message-ID: Hi Carson, Thank you for your quick and kind reply. One more question here. If we don't have EST evidence for the strain we sequenced/assembled, will it be ok to provide some EST evidence that is from some other strains which are close to the one we are trying to annotate? Could you please let us know how this will affect performance of maker2? Thank you again. 
Best,
Xia

-----Original Message-----
From: Carson Holt [mailto:carsonhh at gmail.com]
Sent: Wednesday, October 02, 2013 10:15 AM
To: CAO, XIA
Cc: myandell at genetics.utah.edu; RIVERA, CORBAN GREGORY; maker-devel at yandell-lab.org
Subject: Re: [maker-devel] maker2 scripts for functional annotation

It still works, but it will be reduced. Without ESTs you won't get any UTR prediction, and you will be limited by how well the protein set matches the genes. If you use the protein set of a close relative you will capture most things. You will probably capture 80-95% of what you would by also including ESTs. It all depends on how different the species in the protein evidence file are from the species being annotated.

--Carson
From markwiz at us.ibm.com Wed Oct 2 10:52:55 2013
From: markwiz at us.ibm.com (Mark Wisner)
Date: Wed, 2 Oct 2013 12:52:55 -0400
Subject: [maker-devel] Maker2 on IBM PowerLinux?
Message-ID: 

IBM's PowerLinux is tuned for Big Data workloads. We have had queries about Maker2 on PowerLinux. How do we work with the Maker2 developers to create a PowerLinux port?

Thanks,

Mark K. Wisner
Linux On Power ISV PM
IBM
3039 Cornwallis Rd
RTP, NC 27709
Cell 919-649-5813

From barry.moore at genetics.utah.edu Wed Oct 2 11:29:36 2013
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Wed, 2 Oct 2013 11:29:36 -0600
Subject: [maker-devel] Maker2 on IBM PowerLinux?
In-Reply-To: 
References: 
Message-ID: 

Hi Mark,

Carson Holt is the primary MAKER developer, but others of us may be able to help as well. I've added e-mail addresses for Carson, Michael Campbell, Mark Yandell and myself, who are currently developers on the project. Feel free to contact this mailing list, or, if the conversation gets too far afield for general interest, contact us off list and we will do whatever we can to help.

If you feel you need to modify MAKER code to optimize it for PowerLinux, I'm happy to give you a branch on the Subversion repo to work with. We can then work with you to merge those changes back to the trunk once you have improvements that benefit PowerLinux (and all other) users.

Thanks very much for your interest in MAKER - your participation in the community is most welcome!
Barry

On Oct 2, 2013, at 10:52 AM, Mark Wisner wrote:

> IBMs PowerLinux is tuned for Big Data workloads. We have had queries for Maker2 on PowerLinux.
> How do we work with Maker2 development to create a PowerLinux port.
> Thanks,
>
> Mark K. Wisner
> Linux On Power ISV PM
> IBM
> 3039 Cornwallis Rd
> RTP, NC 27709
> Cell 919-649-5813

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543

From carsonhh at gmail.com Wed Oct 2 11:29:50 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 02 Oct 2013 13:29:50 -0400
Subject: [maker-devel] Maker2 on IBM PowerLinux?
In-Reply-To: 
Message-ID: 

MAKER is written in Perl, so it can work pretty much anywhere you have Perl. It is more an issue of the tools MAKER uses. If you can get SNAP, BLAST, exonerate, etc. to run, then MAKER should be no problem.

You can let MAKER try to install and configure any missing Perl libraries and external tools for you (this is by far the easiest way). You will need an internet connection.

    cd .../maker/src/
    perl Build.PL        #say 'Y' to local installation of libraries and 'N' to MPI for now
    ./Build installdeps  #grabs missing perl libraries from the ether, and installs them in the .../maker/perl/ directory
    ./Build installers   #grabs missing tools from the ether, installs, and configures them in the .../maker/exe/ directory
    ./Build install      #just installs maker

If you want to run via MPI, I'd recommend OpenMPI, and you will need to rerun the 'perl Build.PL' and './Build install' steps (say yes to MPI this time around).
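Editorial sketch: since the message above says the port mostly depends on whether the external tools run, a rough preflight check before the install steps might look like the following. The function name check_tools is hypothetical, and the executable names passed to it are assumptions about how the tools are named on a given system.

```shell
#!/usr/bin/env bash
# Report which of the given executables are not on PATH.
# Prints one "missing: NAME" line per absent tool and returns 1 if any
# are absent, 0 otherwise.
check_tools() {
    local status=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            status=1
        fi
    done
    return "$status"
}

# Executable names below are assumptions; adjust them to your local builds.
check_tools perl snap blastn exonerate RepeatMasker augustus \
    || echo "Some tools are absent; './Build installers' may be able to fetch them."
```

This only checks that the programs can be found, not that they run correctly on the target architecture, so a short test annotation is still worth doing afterwards.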
Here are some variables you will likely have to add to your .bash_profile to get OpenMPI to run with MAKER. You must add these and source the profile before attempting to install with OpenMPI; otherwise compilation will not work correctly (an OpenMPI shared library issue).

    #you need to set $OMPIHOME to OpenMPI's location here
    export LD_PRELOAD=$OMPIHOME/lib/libmpi.so:$LD_PRELOAD
    #this one just suppresses a warning
    export OMPI_MCA_mpi_warn_on_fork=0

On some systems you also have to add ' -mca btl ^openib' to the mpiexec command when running maker. Example:

    mpiexec -mca btl ^openib -n 20 maker

Thanks,
Carson

From: Mark Wisner
Date: Wednesday, October 2, 2013 12:52 PM
To:
Subject: [maker-devel] Maker2 on IBM PowerLinux?

IBMs PowerLinux is tuned for Big Data workloads. We have had queries for Maker2 on PowerLinux.
How do we work with Maker2 development to create a PowerLinux port.
Thanks,

Mark K. Wisner
Linux On Power ISV PM
IBM
3039 Cornwallis Rd
RTP, NC 27709
Cell 919-649-5813

From barry.moore at genetics.utah.edu Wed Oct 2 11:21:29 2013
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Wed, 2 Oct 2013 11:21:29 -0600
Subject: [maker-devel] maker2 scripts for functional annotation
In-Reply-To: 
References: 
Message-ID: 

Hi Xia,

You can use EST evidence from other strains/species. The considerations are as follows:

When you pass EST (or mRNA-seq based transcript) evidence, MAKER uses blastn to align those sequences with alignment parameters that assume they will align easily with almost perfect identity - this is fast.
If you use EST/mRNA evidence from another species, MAKER uses tblastx to compare the RNA evidence to the assembly, both translated into protein space. This works fine, but it is slow and often doesn't get you much more evidence than you could have gotten from proteins as evidence.

If you believe that your other strain should have very little sequence divergence at the nucleotide level, then you could experiment with passing the EST/mRNA evidence as est in the control file. Consider this an experiment: do it on a few contigs and examine the results carefully and skeptically. If it works well, that would be the way to go because you'd get better UTRs; if it doesn't, you'll see little support for models because the sequence divergence was too great for good alignment. In that case you'll need to use altest, which will be slower and will likely give you little in the way of support for UTRs. In this case as well, do a few contigs and examine the output carefully to see whether the additional alignment time with tblastx is worth it for you.

Barry

On Oct 2, 2013, at 8:40 AM, wrote:

> Hi Carson,
>
> Thank you for your quick and kind reply. One more question here. If we don't have EST evidence for the strain we sequenced/assembled, will it be ok to provide some EST evidence that is from some other strains which are close to the one we are trying to annotate? Could you please let us know how this will affect performance of maker2?
>
> Thank you again.
>
> Best,
> Xia
This e-mail >>>> does not constitute a consent to the use of sender's contact >>>> information for direct marketing purposes or for transfers of data to >>>> third parties. >>>> >>>> The dupont.com web address will continue in use for a transitional >>>> period for communications sent or received on behalf of DuPont >>>> Performance Coatings., which is not affiliated in any way with the >>>> DuPont Company. >>>> >>>> Francais Deutsch Italiano Espanol Portugues Japanese Chinese >>>> Korean >>>> >>>> http://www.DuPont.com/corp/email_disclaimer.html >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.o >>>> r >>>> g >>> >>> >>> >>> This communication is for use by the intended recipient and contains >>> information that may be Privileged, confidential or copyrighted under >>> applicable law. If you are not the intended recipient, you are hereby >>> formally notified that any use, copying or distribution of this >>> e-mail, in whole or in part, is strictly prohibited. Please notify the >>> sender by return e-mail and delete this e-mail from your system. >>> Unless explicitly and conspicuously designated as "E-Contract >>> Intended", this e-mail does not constitute a contract offer, a >>> contract amendment, or an acceptance of a contract offer. This e-mail >>> does not constitute a consent to the use of sender's contact >>> information for direct marketing purposes or for transfers of data to third parties. >>> >>> The dupont.com web address will continue in use >>> for a transitional period for communications sent or received on >>> behalf of DuPont Performance Coatings., which is not affiliated in any >>> way with the DuPont Company. 
>>> >>> Francais Deutsch Italiano Espanol Portugues Japanese Chinese >>> Korean >>> >>> http://www.DuPont.com/corp/email_disclaimer.html >>> >> >> >> >> This communication is for use by the intended recipient and contains >> information that may be Privileged, confidential or copyrighted under >> applicable law. If you are not the intended recipient, you are hereby >> formally notified that any use, copying or distribution of this e-mail, >> in whole or in part, is strictly prohibited. Please notify the sender >> by return e-mail and delete this e-mail from your system. Unless >> explicitly and conspicuously designated as "E-Contract Intended", this >> e-mail does not constitute a contract offer, a contract amendment, or >> an acceptance of a contract offer. This e-mail does not constitute a >> consent to the use of sender's contact information for direct marketing >> purposes or for transfers of data to third parties. >> >> The dupont.com web address will continue in use for >> a transitional period for communications sent or received on behalf of >> DuPont Performance Coatings., which is not affiliated in any way with >> the DuPont Company. >> >> Francais Deutsch Italiano Espanol Portugues Japanese Chinese >> Korean >> >> http://www.DuPont.com/corp/email_disclaimer.html >> > > > > This communication is for use by the intended recipient and contains > information that may be Privileged, confidential or copyrighted under > applicable law. If you are not the intended recipient, you are hereby > formally notified that any use, copying or distribution of this e-mail, > in whole or in part, is strictly prohibited. Please notify the sender by > return e-mail and delete this e-mail from your system. Unless explicitly > and conspicuously designated as "E-Contract Intended", this e-mail does > not constitute a contract offer, a contract amendment, or an acceptance > of a contract offer. 
This e-mail does not constitute a consent to the > use of sender's contact information for direct marketing purposes or for > transfers of data to third parties. > > The dupont.com web address will continue in use for a > transitional period for communications sent or received on behalf of DuPont > Performance Coatings., which is not affiliated in any way with the DuPont Company. > > Francais Deutsch Italiano Espanol Portugues Japanese Chinese Korean > > http://www.DuPont.com/corp/email_disclaimer.html > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org Barry Moore Research Scientist Dept. of Human Genetics University of Utah Salt Lake City, UT 84112 -------------------------------------------- (801) 585-3543 -------------- next part -------------- An HTML attachment was scrubbed... URL: From Xia.Cao at dupont.com Wed Oct 2 11:51:26 2013 From: Xia.Cao at dupont.com (Xia.Cao at dupont.com) Date: Wed, 2 Oct 2013 17:51:26 +0000 Subject: [maker-devel] maker2 scripts for functional annotation In-Reply-To: References: Message-ID: Hi Barry, Thank you very much for the message and suggestions. It's really helpful to us. We will do some tests to see how we can take advantage of Maker2 to produce high-quality gene models from our assembled genome. Thank you again, Xia From: Barry Moore [mailto:barry.moore at genetics.utah.edu] Sent: Wednesday, October 02, 2013 1:21 PM To: CAO, XIA Cc: carsonhh at gmail.com; maker-devel at yandell-lab.org; RIVERA, CORBAN GREGORY Subject: Re: [maker-devel] maker2 scripts for functional annotation Hi Xia, You can use EST evidence from other strains/species. The considerations are as follows: When you pass EST (or mRNASeq based transcripts) evidence MAKER uses blastn to align those with alignment parameters that assume they sequences will align easily with almost perfect identity - this is fast. 
If you use EST/mRNA evidence from another species, MAKER uses tblastx to compare the RNA evidence to the assembly, both translated into protein space - this works fine, but it is slow and often doesn't get you much more evidence than you could have gotten from proteins as evidence.

If you believe that your other strain should have very little divergence in sequence at the nt level, then you could experiment with passing the EST/mRNA evidence as est in the control file. Consider this an experiment: do it on a few contigs and examine the results carefully and skeptically. If it works well, that would be the way to go because you'll get better UTRs, but if it doesn't, you'll see that you get little support for models because sequence divergence was too much for good alignment. In that case you'll need to use altest, and this will be slower and you'll likely get little in the way of support for UTRs - in this case as well, do a few contigs and examine the output carefully to see if the additional alignment time with tblastx is worth it in your case.

Barry

This communication is for use by the intended recipient and contains information that may be Privileged, confidential or copyrighted under applicable law. If you are not the intended recipient, you are hereby formally notified that any use, copying or distribution of this e-mail, in whole or in part, is strictly prohibited. Please notify the sender by return e-mail and delete this e-mail from your system. Unless explicitly and conspicuously designated as "E-Contract Intended", this e-mail does not constitute a contract offer, a contract amendment, or an acceptance of a contract offer. This e-mail does not constitute a consent to the use of sender's contact information for direct marketing purposes or for transfers of data to third parties.

The dupont.com web address will continue in use for a transitional period for communications sent or received on behalf of DuPont Performance Coatings., which is not affiliated in any way with the DuPont Company.

Francais Deutsch Italiano Espanol Portugues Japanese Chinese Korean

http://www.DuPont.com/corp/email_disclaimer.html

From cjfields at illinois.edu Thu Oct 3 12:44:07 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 3 Oct 2013 18:44:07 +0000
Subject: [maker-devel] NFSLock problem
Message-ID:

I have a MAKER job running that seems to be stalled on a failed scaffold. It's running via MPI (MAKER v2.28, openMPI 1.6.3), which appears to have worked successfully for the most part. This is a run that only uses transcriptome and protein information in order to get a decent dataset of

The failed scaffold seems to be holding the job from completion.
There do seem to be changes, but mainly they are on the NFSLock files:

./.NFSLock.gi_lock.NFSLock
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.281.282.junction.blastn.holdover.NFSLock
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.282.start.blastn.holdover.NFSLock
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.281.end.blastn.holdover.NFSLock
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.282.start.blastn.holdover.NFSLock.STACK
./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.281.end.blastn.holdover.NFSLock.STACK

Everything else seems to have completed.

I have seen a few issues re: NFS locking problems; would this be related? Should I stop the job?

We're running GPFS for our NFS. Here's 'mount':

-system-specific-4.1$ mount
/dev/sda5 on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda1 on /boot type ext4 (rw)
/dev/sdb1 on /export type ext4 (rw)
/dev/sda2 on /var type ext4 (rw)
tmpfs on /var/lib/ganglia/rrds type tmpfs (rw,size=6180842000,gid=99,uid=99)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)
/dev/IGBHOME0 on /home type gpfs (rw,mtime,dev=IGBHOME0)
128.174.124.79:/shares/group on /archive/group type nfs (rw,sync,hard,intr,retrans=10,timeo=300,rsize=65536,wsize=1048576,vers=3,proto=tcp,mountproto=tcp,addr=128.174.124.79)
128.174.124.79:/shares/CBC on /archive/CBC type nfs (rw,sync,hard,intr,retrans=10,timeo=300,rsize=65536,wsize=1048576,vers=3,proto=tcp,mountproto=tcp,addr=128.174.124.79)

chris

From carsonhh at gmail.com Thu Oct 3 12:45:37 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 03 Oct 2013 14:45:37 -0400
Subject: [maker-devel] NFSLock problem
In-Reply-To:
Message-ID:

Stop the job and restart it. No need to delete anything. Just restart it.

Thanks,
Carson

On 10/3/13 2:44 PM, "Fields, Christopher J" wrote:

>I have a MAKER job running that seems to be stalled on a failed scaffold.
>It's running via MPI (MAKER v2.28, openMPI 1.6.3), which appears to have
>worked successfully for the most part. This is a run that only uses
>transcriptome and protein information in order to get a decent dataset of
>
>The failed scaffold seems to be holding the job from completion. There
>do seem to be changes, but mainly they are on the NFSLock files:
>
>./.NFSLock.gi_lock.NFSLock
>./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1
>./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.281.282.junction.blastn.holdover.NFSLock
>./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.282.start.blastn.holdover.NFSLock
>./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.281.end.blastn.holdover.NFSLock
>./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.282.start.blastn.holdover.NFSLock.STACK
>./Zalbi.unplaced.scaf_datastore/2C/0F/KB913038.1/theVoid.KB913038.1/.NFSLock.KB913038%2E1.281.end.blastn.holdover.NFSLock.STACK
>
>Everything else seems to have completed.
>
>I have seen a few issues re: NFS locking problems; would this be related?
>Should I stop the job?
>
>We're running GPFS for our NFS. Here's 'mount':
>
>-system-specific-4.1$ mount
>/dev/sda5 on / type ext4 (rw)
>proc on /proc type proc (rw)
>sysfs on /sys type sysfs (rw)
>devpts on /dev/pts type devpts (rw,gid=5,mode=620)
>tmpfs on /dev/shm type tmpfs (rw)
>/dev/sda1 on /boot type ext4 (rw)
>/dev/sdb1 on /export type ext4 (rw)
>/dev/sda2 on /var type ext4 (rw)
>tmpfs on /var/lib/ganglia/rrds type tmpfs (rw,size=6180842000,gid=99,uid=99)
>none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
>sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
>nfsd on /proc/fs/nfsd type nfsd (rw)
>/dev/IGBHOME0 on /home type gpfs (rw,mtime,dev=IGBHOME0)
>128.174.124.79:/shares/group on /archive/group type nfs (rw,sync,hard,intr,retrans=10,timeo=300,rsize=65536,wsize=1048576,vers=3,proto=tcp,mountproto=tcp,addr=128.174.124.79)
>128.174.124.79:/shares/CBC on /archive/CBC type nfs (rw,sync,hard,intr,retrans=10,timeo=300,rsize=65536,wsize=1048576,vers=3,proto=tcp,mountproto=tcp,addr=128.174.124.79)
>
>chris

From cjfields at illinois.edu Thu Oct 3 20:45:29 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Fri, 4 Oct 2013 02:45:29 +0000
Subject: [maker-devel] NFSLock problem
In-Reply-To:
References:
Message-ID:

Carson,

Took a couple of restarts, but it finally processed through. I saw a prior email on the mailing list where you mentioned this could be due to failed jobs hanging things up; I may set up my next job to leave one processor free for logging into the nodes in case there is a similar problem, and maybe try to track this down.

chris

On Oct 3, 2013, at 1:45 PM, Carson Holt wrote:

> Stop the job and restart it. No need to delete anything. Just restart it.
>
> Thanks,
> Carson

From felix.bemm at uni-wuerzburg.de Fri Oct 4 01:10:08 2013
From: felix.bemm at uni-wuerzburg.de (Felix Bemm)
Date: Fri, 04 Oct 2013 09:10:08 +0200
Subject: [maker-devel] Contigs failing - could not make datastore directory
Message-ID: <524E69D0.1090905@uni-wuerzburg.de>

Hi,

I have a problem with contigs failing like this:

examining contents of the fasta file and run log
ERROR: could not make datastore directory
--> rank=11, hostname=r5n04
ERROR: Failed while examining contents of the fasta file and run log
ERROR: Chunk failed at level:0, tier_type:0
FAILED CONTIG:MA_gen_v8.05_c31675

There is enough space available on both the storage device and the tmp device that is being used.
The datastore log files looks like this: MA_gen_v8.05_c1149 FAILED MA_gen_v8.05_c1153 FAILED MA_gen_v8.05_c1154 FAILED MA_gen_v8.05_c1155 FAILED MA_gen_v8.05_c1156 FAILED Since there is no datastore for this contigs, I can not provide any more informationen. Approx. 1/3 of the contigs were annotated, after that maker started to fail. It kept running for a while and finishing some of the contigs in the second or third try like this: MA_gen_v8.05_c4443 FAILED MA_gen_v8.05_c4443 FAILED MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ STARTED MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ FINISHED Any ideas what can cause this problems? Best regards Felix -- Felix Bemm Department of Bioinformatics University of W?rzburg, Germany Tel: +49 931 - 31 83696 Fax: +49 931 - 31 84552 felix.bemm at uni-wuerzburg.de From carsonhh at gmail.com Fri Oct 4 12:15:35 2013 From: carsonhh at gmail.com (Carson Holt) Date: Fri, 04 Oct 2013 14:15:35 -0400 Subject: [maker-devel] Contigs failing - could not make datastore directory In-Reply-To: <524E69D0.1090905@uni-wuerzburg.de> Message-ID: Let me know what you find. With some failures it's hard to identify the problem especially with long running processes, which is why I made MAKER restartable, so you can at least get completion by just launching again. --Carson On 10/4/13 3:10 AM, "Felix Bemm" wrote: >Hi, > >I have a problem with Contigs failing like this: > >examining contents of the fasta file and run log >ERROR: could not make datastore directory >--> rank=11, hostname=r5n04 >ERROR: Failed while examining contents of the fasta file and run log >ERROR: Chunk failed at level:0, tier_type:0 >FAILED CONTIG:MA_gen_v8.05_c31675 > >There is enough space available on both the storage device and the tmp >device that is being used. 
The datastore log files looks like this: > >MA_gen_v8.05_c1149 FAILED >MA_gen_v8.05_c1153 FAILED >MA_gen_v8.05_c1154 FAILED >MA_gen_v8.05_c1155 FAILED >MA_gen_v8.05_c1156 FAILED > >Since there is no datastore for this contigs, I can not provide any more >informationen. Approx. 1/3 of the contigs were annotated, after that >maker started to fail. It kept running for a while and finishing some of >the contigs in the second or third try like this: > >MA_gen_v8.05_c4443 FAILED >MA_gen_v8.05_c4443 FAILED >MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ START >ED >MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ >FINISHED > >Any ideas what can cause this problems? > >Best regards >Felix > >-- >Felix Bemm >Department of Bioinformatics >University of W?rzburg, Germany >Tel: +49 931 - 31 83696 >Fax: +49 931 - 31 84552 >felix.bemm at uni-wuerzburg.de > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From carsonhh at gmail.com Fri Oct 4 12:21:42 2013 From: carsonhh at gmail.com (Carson Holt) Date: Fri, 04 Oct 2013 14:21:42 -0400 Subject: [maker-devel] Contigs failing - could not make datastore directory In-Reply-To: <524E69D0.1090905@uni-wuerzburg.de> Message-ID: Hi Felix, There are other potential issues as well, for example a file quota limit set by your administrator or full local /tmp which will be unique to each node and cannot be queried directly from the root node on a cluster. On a cluster one node may also not have the NFS location mounted. Or it could be system related, for example on ibrix NFS mounted systems you can get a weird filesystem issue with ghost files and directories when using MAKER that can neither be created nor deleted. Try running the failed contigs in a different directory (even if it's on the same disk). Also what version of MAKER are you using? 
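A minimal sketch of how the per-node checks mentioned above could be scripted, assuming Python 3 is available on every node. The function name probe_directory is made up for illustration and is not part of MAKER. Note that shutil.disk_usage reports filesystem-level free space only, so a per-user quota would at best show up as a failed directory creation.

```python
import os
import shutil
import socket
import tempfile


def probe_directory(path):
    """Report free space under `path` and whether a subdirectory can be made.

    `probe_directory` is a hypothetical helper, not a MAKER function.
    """
    report = {"host": socket.gethostname(), "path": path}
    try:
        # Free space on the filesystem that holds `path` (ignores quotas).
        report["free_bytes"] = shutil.disk_usage(path).free
        # Create and immediately remove a scratch directory.
        scratch = tempfile.mkdtemp(prefix="probe_", dir=path)
        os.rmdir(scratch)
        report["can_mkdir"] = True
    except OSError as err:
        report["can_mkdir"] = False
        report["error"] = str(err)
    return report


if __name__ == "__main__":
    # Check the local temporary directory and the current working directory.
    for target in (tempfile.gettempdir(), os.getcwd()):
        print(probe_directory(target))
```

Because local /tmp differs from node to node, the script would have to be run on each node separately (for instance through whatever scheduler or MPI launcher is already in use) rather than only on the head node.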
--Carson

On 10/4/13 3:10 AM, "Felix Bemm" wrote:

>Hi,
>
>I have a problem with Contigs failing like this:
>
>examining contents of the fasta file and run log
>ERROR: could not make datastore directory
>--> rank=11, hostname=r5n04
>ERROR: Failed while examining contents of the fasta file and run log
>ERROR: Chunk failed at level:0, tier_type:0
>FAILED CONTIG:MA_gen_v8.05_c31675
>
>There is enough space available on both the storage device and the tmp
>device that is being used. The datastore log file looks like this:
>
>MA_gen_v8.05_c1149 FAILED
>MA_gen_v8.05_c1153 FAILED
>MA_gen_v8.05_c1154 FAILED
>MA_gen_v8.05_c1155 FAILED
>MA_gen_v8.05_c1156 FAILED
>
>Since there is no datastore for these contigs, I cannot provide any more
>information. Approx. 1/3 of the contigs were annotated; after that
>maker started to fail. It kept running for a while, finishing some of
>the contigs on the second or third try like this:
>
>MA_gen_v8.05_c4443 FAILED
>MA_gen_v8.05_c4443 FAILED
>MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ STARTED
>MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ FINISHED
>
>Any ideas what could cause these problems?
>
>Best regards
>Felix
>
>--
>Felix Bemm
>Department of Bioinformatics
>University of Würzburg, Germany
>Tel: +49 931 - 31 83696
>Fax: +49 931 - 31 84552
>felix.bemm at uni-wuerzburg.de
>
>_______________________________________________
>maker-devel mailing list
>maker-devel at box290.bluehost.com
>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From felix.bemm at uni-wuerzburg.de Sat Oct 5 08:09:15 2013
From: felix.bemm at uni-wuerzburg.de (Felix Bemm)
Date: Sat, 05 Oct 2013 16:09:15 +0200
Subject: [maker-devel] Contigs failing - could not make datastore directory
In-Reply-To:
References:
Message-ID: <52501D8B.2060301@uni-wuerzburg.de>

Hi Carson,

there is definitely no quota, and the local tmp device is not full either. I switched to another tmp device, which unfortunately resides on NFS, but now it seems to work. I am using the latest version of maker (2.28). The complete job is running on a single machine with 32 MPI threads.

Best,
Felix

On 04.10.2013 20:21, Carson Holt wrote:
> Hi Felix,
>
> There are other potential issues as well, for example a file quota limit
> set by your administrator or full local /tmp which will be unique to each
> node and cannot be queried directly from the root node on a cluster. On a
> cluster one node may also not have the NFS location mounted. Or it could
> be system related, for example on ibrix NFS mounted systems you can get a
> weird filesystem issue with ghost files and directories when using MAKER
> that can neither be created nor deleted. Try running the failed contigs
> in a different directory (even if it's on the same disk). Also what
> version of MAKER are you using?
>
> --Carson
>
>
>
> On 10/4/13 3:10 AM, "Felix Bemm" wrote:
>
>> Hi,
>>
>> I have a problem with Contigs failing like this:
>>
>> examining contents of the fasta file and run log
>> ERROR: could not make datastore directory
>> --> rank=11, hostname=r5n04
>> ERROR: Failed while examining contents of the fasta file and run log
>> ERROR: Chunk failed at level:0, tier_type:0
>> FAILED CONTIG:MA_gen_v8.05_c31675
>>
>> There is enough space available on both the storage device and the tmp
>> device that is being used. The datastore log file looks like this:
>>
>> MA_gen_v8.05_c1149 FAILED
>> MA_gen_v8.05_c1153 FAILED
>> MA_gen_v8.05_c1154 FAILED
>> MA_gen_v8.05_c1155 FAILED
>> MA_gen_v8.05_c1156 FAILED
>>
>> Since there is no datastore for these contigs, I cannot provide any more
>> information. Approx. 1/3 of the contigs were annotated; after that
>> maker started to fail.
It kept running for a while and finishing some of
>> the contigs in the second or third try like this:
>>
>> MA_gen_v8.05_c4443 FAILED
>> MA_gen_v8.05_c4443 FAILED
>> MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ STARTED
>> MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ FINISHED
>>
>> Any ideas what could cause these problems?
>>
>> Best regards
>> Felix
>>
>> --
>> Felix Bemm
>> Department of Bioinformatics
>> University of Würzburg, Germany
>> Tel: +49 931 - 31 83696
>> Fax: +49 931 - 31 84552
>> felix.bemm at uni-wuerzburg.de
>>
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

--
Felix Bemm
Department of Bioinformatics
University of Würzburg, Germany
Tel: +49 931 - 31 83696
Fax: +49 931 - 31 84552
felix.bemm at uni-wuerzburg.de

From jfierst at uoregon.edu Fri Oct 4 11:06:36 2013
From: jfierst at uoregon.edu (Janna Fierst)
Date: Fri, 4 Oct 2013 10:06:36 -0700
Subject: [maker-devel] Fwd: exon/intron boundaries
In-Reply-To:
References:
Message-ID:

Hi, thanks for your reply.

I have been going through our annotations in detail and trying different parameter sets, and I think I have identified what is going on, but I'm not sure how to set the MAKER2 parameters for our situation. We are working with a species of Caenorhabditid worm, and there are long gene-dense blocks that are being incorrectly annotated as large single genes instead of several smaller, closely spaced genes. The protein alignments (tblastx and protein2genome) show very clearly where the exon/intron boundaries are and in most cases agree with the augustus predictions. The assembled cufflinks output (through blastn, est2genome and est_gff:cufflinks) does not agree in some locations; I think this may be because in some cases the UTR nearly overlaps adjacent genes.
I have included a screenshot of an annotated region viewed in apollo to try to show this. The large gene in the middle is actually 7 different genes that are extremely close together, and MAKER2 is collapsing them into a single gene. I tried running without any RNASeq/cufflinks data and MAKER2 annotates the region as two genes instead of one, but I can't get it to recognize the 7 as different genes.

I have retrained SNAP, but we have not been able to successfully train Augustus; we are currently using the default caenorhabditis species model. I also included a species-specific repeat library. I tried setting correct_est_fusion=1 and reducing pred_flank, but these changes appear to really alter the annotations and we end up annotating almost nothing. I also tried setting est2genome=0 to decrease the influence of the cufflinks assembly, but it didn't appear to help. There are some very large introns in these genes, so I haven't yet tried decreasing the maximum intron size because I'm concerned this may generate too many split genes instead of our current merged gene problem.

Thanks for your help, any advice is greatly appreciated!
-Janna Fierst

On Mon, Aug 26, 2013 at 12:21 PM, Carson Holt wrote:

> Are you getting gene fusions or just more exons? Gene fusions can be
> reduced by setting correct_est_fusion=1, or reducing pred_flank, although
> reducing pred_flank can cause other issues (but those generally only appear
> if setting the value below 150). Also if you have the maximum intron
> size set too high (split_hit option), you may also be generating bridging
> alignments that make evidence align across distant paralogous genes as well
> (this can result in gene merging).
>
> You should also look at your results manually in a viewer like Apollo.
> Then see if the extra exons are supported by something such as protein
> alignments from another species.
If this is the case, you may have a
> poorly annotated protein set that is being used as evidence that is
> carrying over its erroneous exons into the species you are annotating. If
> the extra exons are supported by EST evidence, then perhaps you should try
> and rebuild the EST assembly (for example Trinity has an option to use a
> Jaccard similarity coefficient to avoid fusing transcripts).
>
> Another option is to retrain SNAP or Augustus. MAKER does not actually
> produce any of the models itself (it is a pipeline, not a predictor). The
> models are all generated using these other algorithms; MAKER just feeds
> them hints based on protein and transcript alignments, so making sure
> training is sufficient is important for those programs to produce their
> best models.
>
> Finally make sure your repeat database is sufficient; you may need to
> generate a species-specific repeat library using something like
> RepeatModeler. Repeats can end up being included as extra exons in gene
> models because they may contain reading frames that do code for proteins
> (i.e. reverse transcriptases).
>
> If you have any questions on any of the above, just let us know.
>
> Thanks,
> Carson
>
>
> From: Janna Fierst
> Date: Monday, August 26, 2013 2:54 PM
> To:
> Subject: [maker-devel] exon/intron boundaries
>
> Hi,
>
> I am using MAKER 2.28 to annotate a Caenorhabditid worm genome, and the
> initial results appear fairly good but we seem to be annotating too many
> exons for multiple genes. I was wondering which parameters should be tuned
> to change the threshold for exon/intron boundaries? Thanks for your help
> -Janna Fierst
> _______________________________________________ maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 1.tiff
Type: image/tiff
Size: 115362 bytes
Desc: not available
URL:

From myandell at genetics.utah.edu Mon Oct 7 05:36:53 2013
From: myandell at genetics.utah.edu (Mark Yandell)
Date: Mon, 7 Oct 2013 11:36:53 +0000
Subject: [maker-devel] maker apollo error
In-Reply-To: <8687323557146056ef885dd8a1bf2ea4@buzonweb.us.es>
References: <8687323557146056ef885dd8a1bf2ea4@buzonweb.us.es>
Message-ID: <7A60AB257EFF2B48B1F4C814817EA05365E66B3C@mxb2.hg.genetics.utah.edu>

Hi Gabriel,

I've forwarded your request on to the MAKER email list...

cheers,

--mark

Mark Yandell
Professor of Human Genetics
H.A. & Edna Benning Presidential Endowed Chair
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
ph:801-587-7707

________________________________________
From: ggpozo at us.es [ggpozo at us.es]
Sent: Monday, October 07, 2013 2:49 AM
To: Mark Yandell
Subject: maker apollo error

Dear Dr.
Yandell,

First, thank you very much for providing us with such a great tool as MAKER. I am trying to install it on my Linux system. Everything is OK; however, when I try to install Apollo with ./Build apollo...

Downloading apollo...
--2013-10-07 10:45:37-- http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar
Resolving gmod.svn.sourceforge.net... 216.34.181.65, 216.34.181.177
Connecting to gmod.svn.sourceforge.net|216.34.181.65|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://sourceforge.net/p/gmod/svn/ [following]
--2013-10-07 10:45:38-- http://sourceforge.net/p/gmod/svn/
Resolving sourceforge.net... 216.34.181.60
Connecting to sourceforge.net|216.34.181.60|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://sourceforge.net/p/gmod/svn/HEAD/tree/ [following]
--2013-10-07 10:45:38-- http://sourceforge.net/p/gmod/svn/HEAD/tree/
Connecting to sourceforge.net|216.34.181.60|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 37730 (37K) [text/html]
Saving to: `/home/maker/src/../exe/apollo.tar.gz'

100%[==================================================================================================================================>] 37.730 92,1K/s in 0,4s

2013-10-07 10:45:40 (92,1 KB/s) - `/home/maker/src/../exe/apollo.tar.gz' saved [37730/37730]

Unpacking apollo tarball...

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now

ERROR: Failed installing apollo, now cleaning installation path...
You may need to install apollo manually.

It seems that the HTTP address in the locations file is wrong. Apollo has changed its version and is now integrated into Apolloweb; maybe this is the problem.

I would greatly appreciate any help you can give me.

Thank you very much.

Dr.
Gabriel Gutierrez
Department of Genetics
University of Seville
Spain

From carsonhh at gmail.com Mon Oct 7 07:20:17 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 07 Oct 2013 09:20:17 -0400
Subject: [maker-devel] Fwd: exon/intron boundaries
In-Reply-To:
Message-ID:

Hi Janna,

There are a couple of things to do. Download the maker-2.29p-beta from the lab website. This includes some changes made to improve the performance of correct_est_fusion. Whenever you use the correct_est_fusion option, it also reduces the influence of ESTs on annotation. So normally you would need to increase your protein dataset, but given that you have supplied 3 nematode species and all of UniProt already you should probably be fine.

But there is one more thing you can do. Instead of using cufflinks, try using Trinity to assemble the ESTs. There is a Jaccard clip option that reduces merging caused by overlapping UTR. Between Trinity and correct_est_fusion you should be able to really reduce the effect of those ESTs.

If those changes don't work there is one last option. If you take the MAKER results and filter them for SNAP and Augustus ab initio results (match/match_part in the GFF3), then you can pass those in to the pred_gff option. Then turn snaphmm and augustus_species off in the control files. Basically what this will do is turn MAKER's hint-based prediction off and force it to filter the ab initio results and select models directly from there. Since the merging is being caused by bad hints (merged transcripts from mRNA-seq) this would reduce that effect. You will still need correct_est_fusion=1 though to trim UTR coming from the merged transcripts, because even though MAKER can't process hints to rerun SNAP and Augustus this way, it will try and add UTR using the EST evidence.
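The filtering step described above could be sketched roughly as follows. This is only an illustration, not a MAKER script: the function name filter_ab_initio is invented here, and the source labels in column 2 (snap, snap_masked, augustus, augustus_masked) are an assumption that should be checked against column 2 of your own MAKER GFF3 before use.

```python
# Assumed column-2 source labels for ab initio predictions; verify these
# against your own MAKER GFF3 output before relying on them.
AB_INITIO_SOURCES = {"snap", "snap_masked", "augustus", "augustus_masked"}
KEPT_TYPES = {"match", "match_part"}


def filter_ab_initio(lines, sources=AB_INITIO_SOURCES):
    """Yield GFF3 comment lines plus ab initio match/match_part features.

    `filter_ab_initio` is a hypothetical helper, not part of MAKER.
    """
    for line in lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more feature lines
        if line.startswith("#"):
            yield line  # keep header/directive lines such as ##gff-version
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) == 9 and cols[1] in sources and cols[2] in KEPT_TYPES:
            yield line


# Example use (file names are placeholders):
# with open("maker_output.gff") as src, open("ab_initio.gff", "w") as dst:
#     dst.writelines(filter_ab_initio(src))
```

The resulting file would then be the one given to pred_gff, with snaphmm and augustus_species left blank as described in the message.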
--Carson From: Janna Fierst Date: Friday, October 4, 2013 1:06 PM To: Subject: [maker-devel] Fwd: exon/intron boundaries Hi, thanks for your reply- I have been going through our annotations in detail and trying different parameter sets, and I think I have identified what is going on but I'm not sure how to set the MAKER2 parameters for our situation. We are working with a species of Caenorhabditid worm and there are long gene-dense blocks that are being incorrectly annotated as large single genes instead of several smaller closely spaced genes. The protein alignments (tblastx and protein2genome) show very clearly where the exon/intron boundaries are and in most cases agree with the augustus predictions. The assembled cufflinks output (through blastn, est2genome and est_gff:cufflinks) does not agree in some locations; I think this may be because in some cases the UTR nearly overlaps adjacent genes. I have included a screenshot of an annotated region viewed in apollo to try to show this. The large gene in the middle is actually 7 different genes that are extremely close together and MAKER2 is collapsing them into a single gene. I tried running without any RNASeq/cufflinks data and MAKER2 annotates the region as two genes instead of one, but I can't get it to recognize the 7 as different genes. I have retrained SNAP but we have not been able to successfully train Augustus, we are currently using the default caenhorhabditis species model. I also included a species specific repeat library. I tried setting correct_est_fusion=1 and reducing pred_flank but these changes appear to really alter the annotations and we end up annotating almost nothing. I also tried setting est2genome=0 to decrease the influence of the cufflinks assembly but it didn't appear to help. There are some very large introns in these genes so I haven't tried yet decreasing the maximum intron size because I'm concerned this may generate too many split genes instead of our current merged gene problem. 
Thanks for your help, any advice is greatly appreciated! -Janna Fierst On Mon, Aug 26, 2013 at 12:21 PM, Carson Holt wrote: > Are you getting gene fusions or just more exons? Gene fusions can be reduced > by setting correct_est_fusion=1, or reducing pred_flank, although reducing > pred_flank can cause other issues (but those generally only appear if setting > the value below below 150). Also if you have the maximum intron size set to > high (split_hit option), you may also be generating bridging alignments that > make evidence align across distant paralogous genes as well (this can result > in gene merging) > > You should also look at your results manually in a viewer like Apollo. Then > see if the extra exons are supported by something such as protein alignments > from another species. If this is the case, you may have a poorly annotated > protein set that is being used as evidence that is carrying over it's > erroneous exons into the species you are annotating. If the extra exons are > supported by EST evidence, then perhaps you should try and rebuild the EST > assembly (for example trinity has an option to use a Jarccardian similarity > coefficient to avoid fusing transcripts). > > Another option, is to retrain SNAP or Augustus. MAKER does not actually > produce any of the models itself (it is a pipeline not a predictor). The > models are all generated using these other algorithms, MAKER just feeds them > hints based on protein and transcript alignments, so making sure training is > sufficient is important for those programs to produce their best models. > > Finally make sure your repeat database is sufficient, you may need to generate > a species specific repeat library using something like RepeatModeler. Repeats > can end up being included as extra exons in gene models because they may > contain reading frames the do code for proteins (I.e. reverse transcriptases). > > If you have any questions on any of the above, just let us know. 
> > Thanks, > Carson > > > From: Janna Fierst > Date: Monday, August 26, 2013 2:54 PM > To: > Subject: [maker-devel] exon/intron boundaries > > Hi, > > I am using MAKER 2.28 to annotate a Caenorhabditid worm genome, and the > initial results appear fairly good but we seem to be be annotating too many > exons for multiple genes. I was wondering which parameters should be tuned to > change the threshold for exon/intron boundaries? Thanks for your help -Janna > Fierst > _______________________________________________ maker-devel mailing list > maker-devel at box290.bluehost.comhttp://box290.bluehost.com/mailman/listinfo/mak > er-devel_yandell-lab.org _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon Oct 7 08:12:24 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 07 Oct 2013 10:12:24 -0400 Subject: [maker-devel] maker apollo error In-Reply-To: <7A60AB257EFF2B48B1F4C814817EA05365E66B3C@mxb2.hg.genetics.utah.edu> Message-ID: This was a location from GMOD for the older Apollo. It looks like they've dropped it, so it now points to nothing. Perhaps there is another location that still has the old code I can get from Ed. --Carson On 10/7/13 7:36 AM, "Mark Yandell" wrote: > >Hi Gabriel, > >I've forwarded your request on to the MAKER email list... > >cheers, > >--mark > >Mark Yandell >Professor of Human Genetics >H.A. & Edna Benning Presidential Endowed Chair >Eccles Institute of Human Genetics >University of Utah >15 North 2030 East, Room 2100 >Salt Lake City, UT 84112-5330 >ph:801-587-7707 > >________________________________________ >From: ggpozo at us.es [ggpozo at us.es] >Sent: Monday, October 07, 2013 2:49 AM >To: Mark Yandell >Subject: maker apollo error > >Dear Dr. 
Yandell, > >First, thank you very much for provide us such a great tool as MAKER. I >am traying to install it in my Linux system, everything is ok, however >when I try to instal Apollo with ./Build apollo... > >Downloading apollo... >--2013-10-07 10:45:37-- >http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar >Resolviendo gmod.svn.sourceforge.net... 216.34.181.65, 216.34.181.177 >Connecting to gmod.svn.sourceforge.net|216.34.181.65|:80... conectado. >Petici?n HTTP enviada, esperando respuesta... 302 Found >Localizaci?n: http://sourceforge.net/p/gmod/svn/ [siguiendo] >--2013-10-07 10:45:38-- http://sourceforge.net/p/gmod/svn/ >Resolviendo sourceforge.net... 216.34.181.60 >Connecting to sourceforge.net|216.34.181.60|:80... conectado. >Petici?n HTTP enviada, esperando respuesta... 302 Found >Localizaci?n: http://sourceforge.net/p/gmod/svn/HEAD/tree/ [siguiendo] >--2013-10-07 10:45:38-- http://sourceforge.net/p/gmod/svn/HEAD/tree/ >Connecting to sourceforge.net|216.34.181.60|:80... conectado. >Petici?n HTTP enviada, esperando respuesta... 200 OK >Longitud: 37730 (37K) [text/html] >Saving to: `/home/maker/src/../exe/apollo.tar.gz' > >100%[===================================================================== >=============================================================>] 37.730 > 92,1K/s in 0,4s > >2013-10-07 10:45:40 (92,1 KB/s) - `/home/maker/src/../exe/apollo.tar.gz' >saved [37730/37730] > >Unpacking apollo tarball... > >gzip: stdin: not in gzip format >tar: Child returned status 1 >tar: Error is not recoverable: exiting now > > >ERROR: Failed installing apollo, now cleaning installation path... >You may need to install apollo manually. > > > >It seems that the http direction in locations file is wrong, Apollo has >changed its version and now Apollo is integrated in Apolloweb, may be >this the problem. > > > >I would greatly appreciate any help you can give me. > >Thank you very much. > > > >Dr. 
Gabriel Gutierrez
>
>Department of Genetics
>
>University of Seville
>
>Spain
>
>
>_______________________________________________
>maker-devel mailing list
>maker-devel at box290.bluehost.com
>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From carsonhh at gmail.com Mon Oct 7 08:33:44 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 07 Oct 2013 10:33:44 -0400
Subject: [maker-devel] Contigs failing - could not make datastore directory
In-Reply-To: <52501D8B.2060301@uni-wuerzburg.de>
Message-ID:

You can also try the 2.29 beta if you want (on the website). Also, when MAKER exits, it tries to delete anything left behind in temporary folders, so is it possible that it is freeing up space on exit after filling up /tmp?

Given that the issue seems to go away when you switch to another directory, this suggests the problem is specific to the current state of your system.

Thanks,
Carson

On 10/5/13 10:09 AM, "Felix Bemm" wrote:

>Hi Carson,
>
>there is definitely no quota, and the local tmp device is not full either.
>I switched to another tmp device, which unfortunately resides on NFS, but
>now it seems to work. I am using the latest version of maker (2.28). The
>complete job is running on a single machine with 32 MPI threads.
>
>Best,
>Felix
>
>On 04.10.2013 20:21, Carson Holt wrote:
>> Hi Felix,
>>
>> There are other potential issues as well, for example a file quota limit
>> set by your administrator or full local /tmp which will be unique to each
>> node and cannot be queried directly from the root node on a cluster. On a
>> cluster one node may also not have the NFS location mounted. Or it could
>> be system related, for example on ibrix NFS mounted systems you can get a
>> weird filesystem issue with ghost files and directories when using MAKER
>> that can neither be created nor deleted. Try running the failed contigs
>> in a different directory (even if it's on the same disk).
Also what >> version of MAKER are you using? >> >> --Carson >> >> >> >> On 10/4/13 3:10 AM, "Felix Bemm" wrote: >> >>> Hi, >>> >>> I have a problem with Contigs failing like this: >>> >>> examining contents of the fasta file and run log >>> ERROR: could not make datastore directory >>> --> rank=11, hostname=r5n04 >>> ERROR: Failed while examining contents of the fasta file and run log >>> ERROR: Chunk failed at level:0, tier_type:0 >>> FAILED CONTIG:MA_gen_v8.05_c31675 >>> >>> There is enough space available on both the storage device and the tmp >>> device that is being used. The datastore log files looks like this: >>> >>> MA_gen_v8.05_c1149 FAILED >>> MA_gen_v8.05_c1153 FAILED >>> MA_gen_v8.05_c1154 FAILED >>> MA_gen_v8.05_c1155 FAILED >>> MA_gen_v8.05_c1156 FAILED >>> >>> Since there is no datastore for this contigs, I can not provide any >>>more >>> informationen. Approx. 1/3 of the contigs were annotated, after that >>> maker started to fail. It kept running for a while and finishing some >>>of >>> the contigs in the second or third try like this: >>> >>> MA_gen_v8.05_c4443 FAILED >>> MA_gen_v8.05_c4443 FAILED >>> >>>MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ STA >>>RT >>> ED >>> MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ >>> FINISHED >>> >>> Any ideas what can cause this problems? 
>>> >>> Best regards >>> Felix >>> >>> -- >>> Felix Bemm >>> Department of Bioinformatics >>> University of Würzburg, Germany >>> Tel: +49 931 - 31 83696 >>> Fax: +49 931 - 31 84552 >>> felix.bemm at uni-wuerzburg.de >>> >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > >-- >Felix Bemm >Department of Bioinformatics >University of Würzburg, Germany >Tel: +49 931 - 31 83696 >Fax: +49 931 - 31 84552 >felix.bemm at uni-wuerzburg.de From felix.bemm at uni-wuerzburg.de Mon Oct 7 10:40:12 2013 From: felix.bemm at uni-wuerzburg.de (Felix Bemm) Date: Mon, 07 Oct 2013 18:40:12 +0200 Subject: [maker-devel] Contigs failing - could not make datastore directory In-Reply-To: References: Message-ID: <5252E3EC.6050908@uni-wuerzburg.de> Do you have a changelog for 2.29b? On 07.10.2013 16:33, Carson Holt wrote: > You can also try the 2.29 beta if you want (on the website). Also when > MAKER exits, it tries to delete anything left behind in temporary folders, > so is it possible that it is creating space on exit after filling up /tmp? I will try to monitor that! > Given that the issue seems to go away when you switch to another > directory, it suggests that this might be the kind of thing that is > specific to something about the current state of your system. I agree with that. Interestingly, it happens with the only directory where I would not expect it. > Thanks, > Carson > > > On 10/5/13 10:09 AM, "Felix Bemm" wrote: > >> Hi Carson, >> >> there is definitely no quota and no full local tmp device. I changed >> to another tmp device, which is unfortunately residing on nfs but now it >> seems to work. I am using the latest version of maker (2.28). The >> complete job is running on a single machine with 32 mpi threads.
>> >> Bests >> Felix >> >> On 04.10.2013 20:21, Carson Holt wrote: >>> Hi Felix, >>> >>> There are other potential issues as well, for example a file quota limit >>> set by your administrator or full local /tmp which will be unique to each >>> node and cannot be queried directly from the root node on a cluster. On a >>> cluster one node may also not have the NFS location mounted. Or it could >>> be system related, for example on ibrix NFS mounted systems you can get a >>> weird filesystem issue with ghost files and directories when using MAKER >>> that can neither be created nor deleted. Try running the failed contigs >>> in a different directory (even if it's on the same disk). Also what >>> version of MAKER are you using? >>> >>> --Carson >>> >>> >>> >>> On 10/4/13 3:10 AM, "Felix Bemm" wrote: >>> >>>> Hi, >>>> >>>> I have a problem with Contigs failing like this: >>>> >>>> examining contents of the fasta file and run log >>>> ERROR: could not make datastore directory >>>> --> rank=11, hostname=r5n04 >>>> ERROR: Failed while examining contents of the fasta file and run log >>>> ERROR: Chunk failed at level:0, tier_type:0 >>>> FAILED CONTIG:MA_gen_v8.05_c31675 >>>> >>>> There is enough space available on both the storage device and the tmp >>>> device that is being used. The datastore log file looks like this: >>>> >>>> MA_gen_v8.05_c1149 FAILED >>>> MA_gen_v8.05_c1153 FAILED >>>> MA_gen_v8.05_c1154 FAILED >>>> MA_gen_v8.05_c1155 FAILED >>>> MA_gen_v8.05_c1156 FAILED >>>> >>>> Since there is no datastore for these contigs, I cannot provide any more >>>> information. Approx. 1/3 of the contigs were annotated; after that >>>> maker started to fail.
It kept running for a while and finished some of >>>> the contigs on the second or third try like this: >>>> >>>> MA_gen_v8.05_c4443 FAILED >>>> MA_gen_v8.05_c4443 FAILED >>>> MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ STARTED >>>> MA_gen_v8.05_c4443 MA_gen_v8.05M_datastore/A7/9B/MA_gen_v8.05_c4443/ >>>> FINISHED >>>> >>>> Any ideas what can cause these problems? >>>> >>>> Best regards >>>> Felix >>>> >>>> -- >>>> Felix Bemm >>>> Department of Bioinformatics >>>> University of Würzburg, Germany >>>> Tel: +49 931 - 31 83696 >>>> Fax: +49 931 - 31 84552 >>>> felix.bemm at uni-wuerzburg.de >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >> >> -- >> Felix Bemm >> Department of Bioinformatics >> University of Würzburg, Germany >> Tel: +49 931 - 31 83696 >> Fax: +49 931 - 31 84552 >> felix.bemm at uni-wuerzburg.de -- Felix Bemm Department of Bioinformatics University of Würzburg, Germany Tel: +49 931 - 31 83696 Fax: +49 931 - 31 84552 felix.bemm at uni-wuerzburg.de From carsonhh at gmail.com Mon Oct 7 11:25:34 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 07 Oct 2013 13:25:34 -0400 Subject: [maker-devel] maker apollo error In-Reply-To: <14a642f3dc9cb615f048b70228bad3be@buzonweb.us.es> Message-ID: Here is a new location for the source --> http://sourceforge.net/code-snapshots/svn/g/gm/gmod/svn/gmod-svn-25291-apollo-trunk.zip --Carson From: Date: Monday, October 7, 2013 12:47 PM To: Carson Holt Cc: Mark Yandell , Subject: Re: [maker-devel] maker apollo error Thanks, I modified the locations file to point to a personal server where I had downloaded an old version of Apollo, but it did not work. If you can get a new one, I will try again. Thanks Gabriel On 07/10/2013 16:12, Carson Holt wrote: > This was a location from GMOD for the older Apollo.
It looks like they've > dropped it, so it now points to nothing. Perhaps there is another location > that still has the old code I can get from Ed. > > --Carson > > > > On 10/7/13 7:36 AM, "Mark Yandell" wrote: >> Hi Gabriel, I've forwarded your request on to the MAKER email list... cheers, >> --mark Mark Yandell Professor of Human Genetics H.A. & Edna Benning >> Presidential Endowed Chair Eccles Institute of Human Genetics University of >> Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 >> ph:801-587-7707 ________________________________________ From: ggpozo at us.es >> [ggpozo at us.es] Sent: Monday, October 07, 2013 2:49 AM To: Mark Yandell >> Subject: maker apollo error Dear Dr. Yandell, First, thank you very much for >> providing us with such a great tool as MAKER. I am trying to install it on my Linux >> system. Everything is ok; however, when I try to install Apollo with ./Build >> apollo... Downloading apollo... --2013-10-07 10:45:37-- >> http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar >> Resolviendo gmod.svn.sourceforge.net... 216.34.181.65, 216.34.181.177 >> Connecting to gmod.svn.sourceforge.net|216.34.181.65|:80... conectado. >> Petición HTTP enviada, esperando respuesta... 302 Found Localización: >> http://sourceforge.net/p/gmod/svn/ [siguiendo] --2013-10-07 10:45:38-- >> http://sourceforge.net/p/gmod/svn/ Resolviendo sourceforge.net... >> 216.34.181.60 Connecting to sourceforge.net|216.34.181.60|:80... conectado. >> Petición HTTP enviada, esperando respuesta... 302 Found Localización: >> http://sourceforge.net/p/gmod/svn/HEAD/tree/ [siguiendo] --2013-10-07 >> 10:45:38-- http://sourceforge.net/p/gmod/svn/HEAD/tree/ Connecting to >> sourceforge.net|216.34.181.60|:80... conectado. Petición HTTP enviada, >> esperando respuesta...
200 OK Longitud: 37730 (37K) [text/html] Saving to: >> `/home/maker/src/../exe/apollo.tar.gz' >> 100%[===================================================================== >> =============================================================>] 37.730 >> 92,1K/s in 0,4s 2013-10-07 10:45:40 (92,1 KB/s) - >> `/home/maker/src/../exe/apollo.tar.gz' saved [37730/37730] Unpacking apollo >> tarball... gzip: stdin: not in gzip format tar: Child returned status 1 tar: >> Error is not recoverable: exiting now ERROR: Failed installing apollo, now >> cleaning installation path... You may need to install apollo manually. It >> seems that the http address in the locations file is wrong. Apollo has changed >> its version and is now integrated in Apolloweb; maybe this is the >> problem. I would greatly appreciate any help you can give me. Thank you very >> much. Dr. Gabriel Gutierrez Department of Genetics University of Seville >> Spain _______________________________________________ maker-devel mailing >> list maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From ggpozo at us.es Mon Oct 7 10:47:13 2013 From: ggpozo at us.es (ggpozo at us.es) Date: Mon, 07 Oct 2013 18:47:13 +0200 Subject: [maker-devel] maker apollo error In-Reply-To: References: Message-ID: <14a642f3dc9cb615f048b70228bad3be@buzonweb.us.es> Thanks, I modified the locations file to point to a personal server where I had downloaded an old version of Apollo, but it did not work. If you can get a new one, I will try again. Thanks Gabriel On 07/10/2013 16:12, Carson Holt wrote: > This was a location from GMOD for the older Apollo. It looks like they've > dropped it, so it now points to nothing. Perhaps there is another location > that still has the old code I can get from Ed.
> > --Carson > > On 10/7/13 7:36 AM, "Mark Yandell" wrote: > >> Hi Gabriel, I've forwarded your request on to the MAKER email list... cheers, --mark Mark Yandell Professor of Human Genetics H.A. & Edna Benning Presidential Endowed Chair Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:801-587-7707 ________________________________________ From: ggpozo at us.es [1] [ggpozo at us.es [2]] Sent: Monday, October 07, 2013 2:49 AM To: Mark Yandell Subject: maker apollo error Dear Dr. Yandell, First, thank you very much for providing us with such a great tool as MAKER. I am trying to install it on my Linux system. Everything is ok; however, when I try to install Apollo with ./Build apollo... Downloading apollo... --2013-10-07 10:45:37-- http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar [3] Resolviendo gmod.svn.sourceforge.net... 216.34.181.65, 216.34.181.177 Connecting to gmod.svn.sourceforge.net|216.34.181.65|:80... conectado. Petición HTTP enviada, esperando respuesta... 302 Found Localización: http://sourceforge.net/p/gmod/svn/ [4] [siguiendo] --2013-10-07 10:45:38-- http://sourceforge.net/p/gmod/svn/ [5] Resolviendo sourceforge.net... 216.34.181.60 Connecting to sourceforge.net|216.34.181.60|:80... conectado. Petición HTTP enviada, esperando respuesta... 302 Found Localización: http://sourceforge.net/p/gmod/svn/HEAD/tree/ [6] [siguiendo] --2013-10-07 10:45:38-- http://sourceforge.net/p/gmod/svn/HEAD/tree/ [7] Connecting to sourceforge.net|216.34.181.60|:80... conectado. Petición HTTP enviada, esperando respuesta... 200 OK Longitud: 37730 (37K) [text/html] Saving to: `/home/maker/src/../exe/apollo.tar.gz' 100%[===================================================================== =============================================================>] 37.730 92,1K/s in 0,4s 2013-10-07 10:45:40 (92,1 KB/s) - `/home/maker/src/../exe/apollo.tar.gz' saved [37730/37730] Unpacking apollo tarball...
gzip: stdin: not in gzip format tar: Child returned status 1 tar: Error is not recoverable: exiting now ERROR: Failed installing apollo, now cleaning installation path... You may need to install apollo manually. It seems that the http address in the locations file is wrong. Apollo has changed its version and is now integrated in Apolloweb; maybe this is the problem. I would greatly appreciate any help you can give me. Thank you very much. Dr. Gabriel Gutierrez Department of Genetics University of Seville Spain _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com [8] http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org [9] Links: ------ [1] mailto:ggpozo at us.es [2] mailto:ggpozo at us.es [3] http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar [4] http://sourceforge.net/p/gmod/svn/ [5] http://sourceforge.net/p/gmod/svn/ [6] http://sourceforge.net/p/gmod/svn/HEAD/tree/ [7] http://sourceforge.net/p/gmod/svn/HEAD/tree/ [8] mailto:maker-devel at box290.bluehost.com [9] http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From felix.bemm at uni-wuerzburg.de Sat Oct 12 09:06:52 2013 From: felix.bemm at uni-wuerzburg.de (Felix Bemm) Date: Sat, 12 Oct 2013 17:06:52 +0200 Subject: [maker-devel] Serious Start Codon Problem Message-ID: <5259658C.4000102@uni-wuerzburg.de> Hi, I have a serious problem with maker annotations, especially the start codon placement. It started happening with maker version 2.27! About 1/3 of my genes are missing a proper start codon. When reviewing the GFF files, the gene predictors and EST evidence on their own predict the right gene structure, including the start codon. I was thinking that this problem was somehow linked to my gene models but looking at their standalone prediction I discarded that theory.
Just some stats about proteins with proper start codon (first column) and missing start codon (second column). Maker 9268 4215 SNAP 16577 896 Genemark 18764 290 Numbers look the same when Augustus is used (currently retraining). Manually using ESTs in Apollo often fixes the problems but I don't want to check > 4000 genes for this. Do you have any idea what could cause such a problem? If you need some example gff files I can send them. Best regards Felix -- Felix Bemm Department of Bioinformatics University of Würzburg, Germany Tel: +49 931 - 31 83696 Fax: +49 931 - 31 84552 felix.bemm at uni-wuerzburg.de From carsonhh at gmail.com Sat Oct 12 09:48:29 2013 From: carsonhh at gmail.com (Carson Holt) Date: Sat, 12 Oct 2013 11:48:29 -0400 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: <5259658C.4000102@uni-wuerzburg.de> Message-ID: This issue just came up this week on another e-mail as well. One way to fix it is to edit the Bio::Tools::CodonTable module from BioPerl (use maker --debug to have maker print out the locations of all modules it uses). Such a hack should only be done as a temporary workaround. You change line 256 from --> ---M---------------M---------------M---------------------------- To--> -----------------------------------M---------------------------- Maker uses the BioPerl is_start_codon function to determine if the codon used in a result it receives is acceptable or if it should look upstream or downstream for a different one. Apparently there are actually 3 acceptable start codons in the standard codon table (2 rare ones and one common), so making the edit removes the two rare ones from the list. I'm also preparing a 2.30 release that exports a "strict" codon table to the BioPerl module, that will be used by default and should fix the issue. Thanks, Carson On 10/12/13 11:06 AM, "Felix Bemm" wrote: >Hi, > >I have a serious problem with maker annotations, especially the start >codon placement.
It started happening with maker version 2.27! About 1/3 >of my genes are missing a proper start codon. When reviewing the GFF >files, the gene predictors and EST evidence on their own predict the >right gene structure, including the start codon. I was thinking that this >problem was somehow linked to my gene models but looking at their >standalone prediction I discarded that theory. Just some stats about >proteins with proper start codon (first column) and missing start codon >(second column). > >Maker 9268 4215 >SNAP 16577 896 >Genemark 18764 290 > >Numbers look the same when Augustus is used (currently retraining). >Manually using ESTs in Apollo often fixes the problems but I don't want >to check > 4000 genes for this. > >Do you have any idea what could cause such a problem? If you need some >example gff files I can send them. > >Best regards >Felix > >-- >Felix Bemm >Department of Bioinformatics >University of Würzburg, Germany >Tel: +49 931 - 31 83696 >Fax: +49 931 - 31 84552 >felix.bemm at uni-wuerzburg.de > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From felix.bemm at uni-wuerzburg.de Sat Oct 12 10:16:06 2013 From: felix.bemm at uni-wuerzburg.de (Felix Bemm) Date: Sat, 12 Oct 2013 18:16:06 +0200 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: References: Message-ID: <525975C6.4020403@uni-wuerzburg.de> Thx Carson! Do you know when 2.30 will be available? Bests On 12.10.2013 17:48, Carson Holt wrote: > This issue just came up this week on another e-mail as well. > > One way to fix it is to edit the Bio::Tools::CodonTable module from > BioPerl (use maker --debug to have maker print out the locations of all > modules it uses). Such a hack should only be done as a temporary > workaround.
> > You change line 256 from --> > ---M---------------M---------------M---------------------------- > > To--> > -----------------------------------M---------------------------- > > > Maker uses the BioPerl is_start_codon function to determine if the codon > used in a result it receives is acceptable or if it should look upstream > or downstream for a different one. Apparently there are actually 3 > acceptable start codons in the standard codon table (2 rare ones and one > common), so making the edit removes the two rare ones from the list. > > I'm also preparing a 2.30 release that exports a "strict" codon table to > the BioPerl module, that will be used by default and should fix the issue. > > Thanks, > Carson > > > > On 10/12/13 11:06 AM, "Felix Bemm" wrote: > >> Hi, >> >> I have a serious problem with maker annotations, especially the start >> codon placement. It started happening with maker version 2.27! About 1/3 >> of my genes are missing a proper start codon. When reviewing the GFF >> files, the gene predictors and EST evidence on their own predict the >> right gene structure, including the start codon. I was thinking that this >> problem was somehow linked to my gene models but looking at their >> standalone prediction I discarded that theory. Just some stats about >> proteins with proper start codon (first column) and missing start codon >> (second column). >> >> Maker 9268 4215 >> SNAP 16577 896 >> Genemark 18764 290 >> >> Numbers look the same when Augustus is used (currently retraining). >> Manually using ESTs in Apollo often fixes the problems but I don't want >> to check > 4000 genes for this. >> >> Do you have any idea what could cause such a problem? If you need some >> example gff files I can send them.
>> >> Best regards >> Felix >> >> -- >> Felix Bemm >> Department of Bioinformatics >> University of Würzburg, Germany >> Tel: +49 931 - 31 83696 >> Fax: +49 931 - 31 84552 >> felix.bemm at uni-wuerzburg.de >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -- Felix Bemm Department of Bioinformatics University of Würzburg, Germany Tel: +49 931 - 31 83696 Fax: +49 931 - 31 84552 felix.bemm at uni-wuerzburg.de From carsonhh at gmail.com Sat Oct 12 23:26:17 2013 From: carsonhh at gmail.com (Carson Holt) Date: Sun, 13 Oct 2013 01:26:17 -0400 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: <525975C6.4020403@uni-wuerzburg.de> Message-ID: Before Wednesday, because after that I'm on a 6 day road trip, so I have to have it wrapped up before I leave. --Carson On 10/12/13 12:16 PM, "Felix Bemm" wrote: >Thx Carson! Do you know when 2.30 will be available? Bests > >On 12.10.2013 17:48, Carson Holt wrote: >> This issue just came up this week on another e-mail as well. >> >> One way to fix it is to edit the Bio::Tools::CodonTable module from >> BioPerl (use maker --debug to have maker print out the locations of all >> modules it uses). Such a hack should only be done as a temporary >> workaround. >> >> You change line 256 from --> >> ---M---------------M---------------M---------------------------- >> >> To--> >> -----------------------------------M---------------------------- >> >> >> Maker uses the BioPerl is_start_codon function to determine if the codon >> used in a result it receives is acceptable or if it should look upstream >> or downstream for a different one. Apparently there are actually 3 >> acceptable start codons in the standard codon table (2 rare ones and one >> common), so making the edit removes the two rare ones from the list.
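For readers following the start-mask edit quoted above, here is a minimal sketch in Python (not MAKER or BioPerl code) of how a 64-character start mask maps onto codons. It assumes the conventional T, C, A, G codon ordering used by NCBI-style codon tables; the function name start_codons and both constant names are hypothetical and exist only for this illustration. Under that assumption, the three "M" positions in the standard mask correspond to TTG, CTG and ATG, and the edited mask keeps only ATG.

```python
# Assumed base order of NCBI-style codon tables: first base varies slowest.
BASES = "TCAG"

# The two masks from the thread, built in pieces so each is exactly 64 long.
# "M" marks a codon accepted as a start; "-" marks everything else.
STANDARD_STARTS = "---M" + "-" * 15 + "M" + "-" * 15 + "M" + "-" * 28
STRICT_STARTS = "-" * 35 + "M" + "-" * 28


def start_codons(mask):
    """Return the codons flagged with 'M' in a 64-character start mask."""
    if len(mask) != 64:
        raise ValueError("expected one character per codon (64 total)")
    codons = [a + b + c for a in BASES for b in BASES for c in BASES]
    return [codon for codon, flag in zip(codons, mask) if flag == "M"]


print(start_codons(STANDARD_STARTS))  # ['TTG', 'CTG', 'ATG']
print(start_codons(STRICT_STARTS))    # ['ATG']
```

This only illustrates why blanking the first two "M" characters leaves ATG as the single accepted start; it says nothing about how MAKER itself searches upstream or downstream for an alternative codon.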
>> >> I'm also preparing a 2.30 release that exports a "strict" codon table to >> the BioPerl module, that will be used by default and should fix the issue. >> >> Thanks, >> Carson >> >> >> >> On 10/12/13 11:06 AM, "Felix Bemm" wrote: >> >>> Hi, >>> >>> I have a serious problem with maker annotations, especially the start >>> codon placement. It started happening with maker version 2.27! About 1/3 >>> of my genes are missing a proper start codon. When reviewing the GFF >>> files, the gene predictors and EST evidence on their own predict the >>> right gene structure, including the start codon. I was thinking that this >>> problem was somehow linked to my gene models but looking at their >>> standalone prediction I discarded that theory. Just some stats about >>> proteins with proper start codon (first column) and missing start codon >>> (second column). >>> >>> Maker 9268 4215 >>> SNAP 16577 896 >>> Genemark 18764 290 >>> >>> Numbers look the same when Augustus is used (currently retraining). >>> Manually using ESTs in Apollo often fixes the problems but I don't want >>> to check > 4000 genes for this. >>> >>> Do you have any idea what could cause such a problem? If you need some >>> example gff files I can send them.
>>> >>> Best regards >>> Felix >>> >>> -- >>> Felix Bemm >>> Department of Bioinformatics >>> University of Würzburg, Germany >>> Tel: +49 931 - 31 83696 >>> Fax: +49 931 - 31 84552 >>> felix.bemm at uni-wuerzburg.de >>> >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > >-- >Felix Bemm >Department of Bioinformatics >University of Würzburg, Germany >Tel: +49 931 - 31 83696 >Fax: +49 931 - 31 84552 >felix.bemm at uni-wuerzburg.de From carsonhh at gmail.com Mon Oct 14 08:45:25 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 14 Oct 2013 10:45:25 -0400 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: <525975C6.4020403@uni-wuerzburg.de> Message-ID: Probably Tuesday or Wednesday. --Carson On 10/12/13 12:16 PM, "Felix Bemm" wrote: >Thx Carson! Do you know when 2.30 will be available? Bests > >On 12.10.2013 17:48, Carson Holt wrote: >> This issue just came up this week on another e-mail as well. >> >> One way to fix it is to edit the Bio::Tools::CodonTable module from >> BioPerl (use maker --debug to have maker print out the locations of all >> modules it uses). Such a hack should only be done as a temporary >> workaround. >> >> You change line 256 from --> >> ---M---------------M---------------M---------------------------- >> >> To--> >> -----------------------------------M---------------------------- >> >> >> Maker uses the BioPerl is_start_codon function to determine if the codon >> used in a result it receives is acceptable or if it should look upstream >> or downstream for a different one. Apparently there are actually 3 >> acceptable start codons in the standard codon table (2 rare ones and one >> common), so making the edit removes the two rare ones from the list.
>> >> I'm also preparing a 2.30 release that exports a "strict" codon table to >> the BioPerl module, that will be used by default and should fix the issue. >> >> Thanks, >> Carson >> >> >> >> On 10/12/13 11:06 AM, "Felix Bemm" wrote: >> >>> Hi, >>> >>> I have a serious problem with maker annotations, especially the start >>> codon placement. It started happening with maker version 2.27! About 1/3 >>> of my genes are missing a proper start codon. When reviewing the GFF >>> files, the gene predictors and EST evidence on their own predict the >>> right gene structure, including the start codon. I was thinking that this >>> problem was somehow linked to my gene models but looking at their >>> standalone prediction I discarded that theory. Just some stats about >>> proteins with proper start codon (first column) and missing start codon >>> (second column). >>> >>> Maker 9268 4215 >>> SNAP 16577 896 >>> Genemark 18764 290 >>> >>> Numbers look the same when Augustus is used (currently retraining). >>> Manually using ESTs in Apollo often fixes the problems but I don't want >>> to check > 4000 genes for this. >>> >>> Do you have any idea what could cause such a problem? If you need some >>> example gff files I can send them.
>>> >>> Best regards >>> Felix >>> >>> -- >>> Felix Bemm >>> Department of Bioinformatics >>> University of Würzburg, Germany >>> Tel: +49 931 - 31 83696 >>> Fax: +49 931 - 31 84552 >>> felix.bemm at uni-wuerzburg.de >>> >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > >-- >Felix Bemm >Department of Bioinformatics >University of Würzburg, Germany >Tel: +49 931 - 31 83696 >Fax: +49 931 - 31 84552 >felix.bemm at uni-wuerzburg.de From cjfields at illinois.edu Mon Oct 14 09:07:43 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Mon, 14 Oct 2013 15:07:43 +0000 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: References: Message-ID: Carson, Regarding the CodonTable change below, should this be something that needs to be fixed, or made more flexible, in BioPerl? I could see this being a problem if using MAKER for bacterial annotation (where the alternative start codons are actually more common). chris On Oct 14, 2013, at 9:45 AM, Carson Holt wrote: > Probably Tuesday or Wednesday. > > --Carson > > > On 10/12/13 12:16 PM, "Felix Bemm" wrote: > >> Thx Carson! Do you know when 2.30 will be available? Bests >> >> On 12.10.2013 17:48, Carson Holt wrote: >>> This issue just came up this week on another e-mail as well. >>> >>> One way to fix it is to edit the Bio::Tools::CodonTable module from >>> BioPerl (use maker --debug to have maker print out the locations of all >>> modules it uses). Such a hack should only be done as a temporary >>> workaround.
>>> >>> You change line 256 from --> >>> ---M---------------M---------------M---------------------------- >>> >>> To--> >>> -----------------------------------M---------------------------- >>> >>> >>> Maker uses the BioPerl is_start_codon function to determine if the codon >>> used in a result it receives is acceptable or if it should look upstream >>> or downstream for a different one. Apparently there are actually 3 >>> acceptable start codons in the standard codon table (2 rare ones and one >>> common), so making the edit removes the two rare ones from the list. >>> >>> I'm also preparing a 2.30 release that exports a "strict" codon table to >>> the BioPerl module, that will be used by default and should fix the >>> issue. >>> >>> Thanks, >>> Carson >>> >>> >>> >>> On 10/12/13 11:06 AM, "Felix Bemm" wrote: >>> >>>> Hi, >>>> >>>> I have a serious problem with maker annotations, especially the start >>>> codon placement. It started happening with maker version 2.27! About 1/3 >>>> of my genes are missing a proper start codon. When reviewing the GFF >>>> files, the gene predictors and EST evidence on their own predict the >>>> right gene structure, including the start codon. I was thinking that this >>>> problem was somehow linked to my gene models but looking at their >>>> standalone prediction I discarded that theory. Just some stats about >>>> proteins with proper start codon (first column) and missing start codon >>>> (second column). >>>> >>>> Maker 9268 4215 >>>> SNAP 16577 896 >>>> Genemark 18764 290 >>>> >>>> Numbers look the same when Augustus is used (currently retraining). >>>> Manually using ESTs in Apollo often fixes the problems but I don't want >>>> to check > 4000 genes for this. >>>> >>>> Do you have any idea what could cause such a problem? If you need some >>>> example gff files I can send them.
>>>> >>>> Best regards >>>> Felix >>>> >>>> -- >>>> Felix Bemm >>>> Department of Bioinformatics >>>> University of Würzburg, Germany >>>> Tel: +49 931 - 31 83696 >>>> Fax: +49 931 - 31 84552 >>>> felix.bemm at uni-wuerzburg.de >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >> -- >> Felix Bemm >> Department of Bioinformatics >> University of Würzburg, Germany >> Tel: +49 931 - 31 83696 >> Fax: +49 931 - 31 84552 >> felix.bemm at uni-wuerzburg.de > > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From carsonhh at gmail.com Mon Oct 14 11:58:24 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 14 Oct 2013 13:58:24 -0400 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: Message-ID: I think adding one more codon table to the set of available tables would be sufficient. Call it the 'Strict' table or 'Canonical' table. It doesn't need to be the default, but should be a selectable option just as the other codon tables are. There is a way to export codon tables to the module, which is what I've now added to MAKER, but I'm sure other BioPerl users would find a strictly canonical codon table useful as well. Thanks, Carson On 10/14/13 11:07 AM, "Fields, Christopher J" wrote: >Carson, > >Regarding the CodonTable change below, should this be something that >needs to be fixed, or made more flexible, in BioPerl? I could see this >being a problem if using MAKER for bacterial annotation (where the >alternative start codons are actually more common). > >chris > >On Oct 14, 2013, at 9:45 AM, Carson Holt wrote: > >> Probably Tuesday or Wednesday. >> >> --Carson >> >> >> On 10/12/13 12:16 PM, "Felix Bemm" wrote: >> >>> Thx Carson!
Do you know when 2.30 will be available? Bests >>> >>> On 12.10.2013 17:48, Carson Holt wrote: >>>> This issue just came up this week on another e-mail as well. >>>> >>>> One way to fix it is to edit the Bio::Tools::CodonTable module from >>>> BioPerl (use maker --debug to have maker print out the locations of all >>>> modules it uses). Such a hack should only be done as a temporary >>>> workaround. >>>> >>>> You change line 256 from --> >>>> ---M---------------M---------------M---------------------------- >>>> >>>> To--> >>>> -----------------------------------M---------------------------- >>>> >>>> >>>> Maker uses the BioPerl is_start_codon function to determine if the codon >>>> used in a result it receives is acceptable or if it should look upstream >>>> or downstream for a different one. Apparently there are actually 3 >>>> acceptable start codons in the standard codon table (2 rare ones and one >>>> common), so making the edit removes the two rare ones from the list. >>>> >>>> I'm also preparing a 2.30 release that exports a "strict" codon table to >>>> the BioPerl module, that will be used by default and should fix the >>>> issue. >>>> >>>> Thanks, >>>> Carson >>>> >>>> >>>> >>>> On 10/12/13 11:06 AM, "Felix Bemm" wrote: >>>> >>>>> Hi, >>>>> >>>>> I have a serious problem with maker annotations, especially the start >>>>> codon placement. It started happening with maker version 2.27! About 1/3 >>>>> of my genes are missing a proper start codon. When reviewing the GFF >>>>> files, the gene predictors and EST evidence on their own predict the >>>>> right gene structure, including the start codon. I was thinking that this >>>>> problem was somehow linked to my gene models but looking at their >>>>> standalone prediction I discarded that theory. Just some stats about >>>>> proteins with proper start codon (first column) and missing start codon >>>>> (second column).
>>>>> >>>>> Maker 9268 4215 >>>>> SNAP 16577 896 >>>>> Genemark 18764 290 >>>>> >>>>> Numbers look the same when Augustus is used (currently retraining). >>>>> Manually using ESTs in Apollo often fixes the problems but I don't >>>>> want >>>>> to check > 4000 genes for this. >>>>> >>>>> Do you have any idea what could cause such a problem? If you need >>>>> some >>>>> example gff files I can send you. >>>>> >>>>> Best regards >>>>> Felix >>>>> >>>>> -- >>>>> Felix Bemm >>>>> Department of Bioinformatics >>>>> University of Würzburg, Germany >>>>> Tel: +49 931 - 31 83696 >>>>> Fax: +49 931 - 31 84552 >>>>> felix.bemm at uni-wuerzburg.de >>>>> >>>>> _______________________________________________ >>>>> maker-devel mailing list >>>>> maker-devel at box290.bluehost.com >>>>> >>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >>> -- >>> Felix Bemm >>> Department of Bioinformatics >>> University of Würzburg, Germany >>> Tel: +49 931 - 31 83696 >>> Fax: +49 931 - 31 84552 >>> felix.bemm at uni-wuerzburg.de >> >> >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > From cjfields at illinois.edu Mon Oct 14 12:51:28 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Mon, 14 Oct 2013 18:51:28 +0000 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: References: Message-ID: <160BC401-0471-4285-8712-21440628A5DA@illinois.edu> Done, with some added tests. I labeled it 'Strict' ('Canonical' can be a bit of a loaded term). It's currently set as ID 24. I will likely push out a new 1.6 point release this week or next; I'll merge these in for that release. chris On Oct 14, 2013, at 12:58 PM, Carson Holt wrote: > I think adding one more codon table to the set of available tables would > be sufficient. Call it the 'Strict' table or 'Canonical' table.
It > doesn't need to be the default, but should be a selectable option just as > the other codon tables are. There is a way to export codon tables to the > module, which is what I've now added to MAKER, but I'm sure other BoiPerl > users would find a strictly canonical codon table useful as well. > > Thanks, > Carson > > > On 10/14/13 11:07 AM, "Fields, Christopher J" > wrote: > >> Carson, >> >> Regarding the CodonTable change below, should this be something that >> needs to be fixed, or made more flexible, in bioperl? I could see this >> being a problem if using MAKER for bacterial annotation (where the >> alternative start codons are actually more common). >> >> chris >> >> On Oct 14, 2013, at 9:45 AM, Carson Holt wrote: >> >>> Probably Tuesday or Wednesday. >>> >>> --Carson >>> >>> >>> On 10/12/13 12:16 PM, "Felix Bemm" wrote: >>> >>>> Thx Carson! Do you now when 2.30 will be available? Bests >>>> >>>> On 12.10.2013 17:48, Carson Holt wrote: >>>>> This issue just came up this week on another e-mail as well. >>>>> >>>>> One way to fix it, is to edit the Bio::Tools::CodonTable module from >>>>> BioPerl (use maker --debug to have maker print out the locations of >>>>> all >>>>> modules it uses). Such a hack should only be done as a temporary work >>>>> around. >>>>> >>>>> You change line 256 from --> >>>>> ---M---------------M---------------M---------------------------- >>>>> >>>>> To--> >>>>> -----------------------------------M---------------------------- >>>>> >>>>> >>>>> Maker uses the BioPerl is_start_codon function to determine if the >>>>> codon >>>>> used in a result it receives is acceptable or if it should look >>>>> upstream >>>>> or downstream for a different one. Apparently there are actually 3 >>>>> acceptable start codons in the standard codon table (2 rare ones and >>>>> one >>>>> common), so making the edit removes the two rare ones from the list. 
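The effect of the one-line edit quoted above can be sketched in a few lines of Python. This is only an illustration, not BioPerl's code: it assumes the 64-character start mask follows the NCBI-style codon order (bases T, C, A, G, first base varying slowest), and the names `start_codons` and `is_start_codon` here are hypothetical helpers that mirror the idea of the BioPerl function rather than its implementation.

```python
from itertools import product

# Assumed codon order for the 64-character mask: T, C, A, G,
# with the first base varying slowest (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
CODONS = ["".join(triplet) for triplet in product(BASES, repeat=3)]

# Start masks as quoted in the thread, built piecewise so the
# positions of the 'M' flags are easy to check.
STANDARD_STARTS = "---M" + "-" * 15 + "M" + "-" * 15 + "M" + "-" * 28
STRICT_STARTS = "-" * 35 + "M" + "-" * 28


def start_codons(mask):
    """Return the set of codons flagged 'M' in a 64-character start mask."""
    if len(mask) != len(CODONS):
        raise ValueError("start mask must have exactly 64 characters")
    return {codon for codon, flag in zip(CODONS, mask) if flag == "M"}


def is_start_codon(codon, mask):
    """Hypothetical analogue of a start-codon check against a given mask."""
    return codon.upper().replace("U", "T") in start_codons(mask)


if __name__ == "__main__":
    print("standard:", sorted(start_codons(STANDARD_STARTS)))
    print("strict:  ", sorted(start_codons(STRICT_STARTS)))
```

Under these assumptions the standard mask accepts TTG, CTG and ATG, while the edited mask accepts only ATG, which matches the description of "2 rare ones and one common" start codon in the standard table.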
>>>>> >>>>> I'm also preparing a 2.30 release that exports a "strict" codon table >>>>> to >>>>> the BioPerl module, that will be used by default and should fix the >>>>> issue. >>>>> >>>>> Thanks, >>>>> Carson >>>>> >>>>> >>>>> >>>>> On 10/12/13 11:06 AM, "Felix Bemm" >>>>> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I have a serious problems with maker annotations especially the start >>>>>> codon placement. It started happening with maker version 2.27! About >>>>>> 1/3 >>>>>> of my genes are missing a proper start codon. When reviewing the GFF >>>>>> files gene predictors and est evidence standalone would predict the >>>>>> right gene structure including the start codon. I was thinking that >>>>>> this >>>>>> problem was somehow linked to my gene models but looking at their >>>>>> standalone prediction I discarded that theory. Just some stats about >>>>>> proteins with proper start codon (first column) and missing start >>>>>> codon >>>>>> (second column). >>>>>> >>>>>> Maker 9268 4215 >>>>>> SNAP 16577 896 >>>>>> Genemark 18764 290 >>>>>> >>>>>> Numbers look the same when Augustus is used (currently retraining). >>>>>> Manually using EST's in Apollo often fixes the problems but I don't >>>>>> want >>>>>> to check > 4000 genes for this. >>>>>> >>>>>> Do you have any idea what could cause such a problem? If you need >>>>>> some >>>>>> example gff files I can send you. 
>>>>>> >>>>>> Best regards >>>>>> Felix >>>>>> >>>>>> -- >>>>>> Felix Bemm >>>>>> Department of Bioinformatics >>>>>> University of W?rzburg, Germany >>>>>> Tel: +49 931 - 31 83696 >>>>>> Fax: +49 931 - 31 84552 >>>>>> felix.bemm at uni-wuerzburg.de >>>>>> >>>>>> _______________________________________________ >>>>>> maker-devel mailing list >>>>>> maker-devel at box290.bluehost.com >>>>>> >>>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.or >>>>>> g >>>> >>>> -- >>>> Felix Bemm >>>> Department of Bioinformatics >>>> University of W?rzburg, Germany >>>> Tel: +49 931 - 31 83696 >>>> Fax: +49 931 - 31 84552 >>>> felix.bemm at uni-wuerzburg.de >>> >>> >>> >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> > > From carsonhh at gmail.com Mon Oct 14 18:59:45 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 14 Oct 2013 20:59:45 -0400 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: <160BC401-0471-4285-8712-21440628A5DA@illinois.edu> Message-ID: Thanks!! --Carson On 10/14/13 2:51 PM, "Fields, Christopher J" wrote: >Done, with some added tests. I labeled it 'Strict' ('Canonical' can be a >bit of a loaded term). It's currently set as ID 24. > >I will likely push out a new 1.6 point release this week or next, I'll >merge these in for that release. > >chris > >On Oct 14, 2013, at 12:58 PM, Carson Holt wrote: > >> I think adding one more codon table to the set of available tables would >> be sufficient. Call it the 'Strict' table or 'Canonical' table. It >> doesn't need to be the default, but should be a selectable option just >>as >> the other codon tables are. There is a way to export codon tables to >>the >> module, which is what I've now added to MAKER, but I'm sure other >>BoiPerl >> users would find a strictly canonical codon table useful as well. 
>> >> Thanks, >> Carson >> >> >> On 10/14/13 11:07 AM, "Fields, Christopher J" >> wrote: >> >>> Carson, >>> >>> Regarding the CodonTable change below, should this be something that >>> needs to be fixed, or made more flexible, in bioperl? I could see this >>> being a problem if using MAKER for bacterial annotation (where the >>> alternative start codons are actually more common). >>> >>> chris >>> >>> On Oct 14, 2013, at 9:45 AM, Carson Holt wrote: >>> >>>> Probably Tuesday or Wednesday. >>>> >>>> --Carson >>>> >>>> >>>> On 10/12/13 12:16 PM, "Felix Bemm" >>>>wrote: >>>> >>>>> Thx Carson! Do you now when 2.30 will be available? Bests >>>>> >>>>> On 12.10.2013 17:48, Carson Holt wrote: >>>>>> This issue just came up this week on another e-mail as well. >>>>>> >>>>>> One way to fix it, is to edit the Bio::Tools::CodonTable module from >>>>>> BioPerl (use maker --debug to have maker print out the locations of >>>>>> all >>>>>> modules it uses). Such a hack should only be done as a temporary >>>>>>work >>>>>> around. >>>>>> >>>>>> You change line 256 from --> >>>>>> ---M---------------M---------------M---------------------------- >>>>>> >>>>>> To--> >>>>>> -----------------------------------M---------------------------- >>>>>> >>>>>> >>>>>> Maker uses the BioPerl is_start_codon function to determine if the >>>>>> codon >>>>>> used in a result it receives is acceptable or if it should look >>>>>> upstream >>>>>> or downstream for a different one. Apparently there are actually 3 >>>>>> acceptable start codons in the standard codon table (2 rare ones and >>>>>> one >>>>>> common), so making the edit removes the two rare ones from the list. >>>>>> >>>>>> I'm also preparing a 2.30 release that exports a "strict" codon >>>>>>table >>>>>> to >>>>>> the BioPerl module, that will be used by default and should fix the >>>>>> issue. 
>>>>>> >>>>>> Thanks, >>>>>> Carson >>>>>> >>>>>> >>>>>> >>>>>> On 10/12/13 11:06 AM, "Felix Bemm" >>>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I have a serious problems with maker annotations especially the >>>>>>>start >>>>>>> codon placement. It started happening with maker version 2.27! >>>>>>>About >>>>>>> 1/3 >>>>>>> of my genes are missing a proper start codon. When reviewing the >>>>>>>GFF >>>>>>> files gene predictors and est evidence standalone would predict the >>>>>>> right gene structure including the start codon. I was thinking that >>>>>>> this >>>>>>> problem was somehow linked to my gene models but looking at their >>>>>>> standalone prediction I discarded that theory. Just some stats >>>>>>>about >>>>>>> proteins with proper start codon (first column) and missing start >>>>>>> codon >>>>>>> (second column). >>>>>>> >>>>>>> Maker 9268 4215 >>>>>>> SNAP 16577 896 >>>>>>> Genemark 18764 290 >>>>>>> >>>>>>> Numbers look the same when Augustus is used (currently retraining). >>>>>>> Manually using EST's in Apollo often fixes the problems but I don't >>>>>>> want >>>>>>> to check > 4000 genes for this. >>>>>>> >>>>>>> Do you have any idea what could cause such a problem? If you need >>>>>>> some >>>>>>> example gff files I can send you. >>>>>>> >>>>>>> Best regards >>>>>>> Felix >>>>>>> >>>>>>> -- >>>>>>> Felix Bemm >>>>>>> Department of Bioinformatics >>>>>>> University of W?rzburg, Germany >>>>>>> Tel: +49 931 - 31 83696 >>>>>>> Fax: +49 931 - 31 84552 >>>>>>> felix.bemm at uni-wuerzburg.de >>>>>>> >>>>>>> _______________________________________________ >>>>>>> maker-devel mailing list >>>>>>> maker-devel at box290.bluehost.com >>>>>>> >>>>>>> >>>>>>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab. 
>>>>>>>or >>>>>>> g >>>>> >>>>> -- >>>>> Felix Bemm >>>>> Department of Bioinformatics >>>>> University of Würzburg, Germany >>>>> Tel: +49 931 - 31 83696 >>>>> Fax: +49 931 - 31 84552 >>>>> felix.bemm at uni-wuerzburg.de >>>> >>>> >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> >>>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >> >> > From jfierst at uoregon.edu Mon Oct 21 11:40:06 2013 From: jfierst at uoregon.edu (Janna Fierst) Date: Mon, 21 Oct 2013 10:40:06 -0700 Subject: [maker-devel] Fwd: exon/intron boundaries In-Reply-To: References: Message-ID: Hi, thanks for your help. I have used Trinity to assemble the ESTs but am having a problem with the 2.29-beta release. Instead of starting and finishing a set of scaffolds, it appears to restart over and over again. I am running it in parallel across 32 cores and I'm wondering if this is a problem with the parallel implementation. I've gone through it several times and have not been able to find any errors or output that would suggest what the problem is. Thanks -Janna On Mon, Oct 7, 2013 at 6:20 AM, Carson Holt wrote: > Hi Janna, > > There are a couple of things to do. Download the maker-2.29p-beta from > the lab website. This includes some changes made to improve the > performance of correct_est_fusion. Using the correct_est_fusion > option also reduces the influence of ESTs on annotation. So normally > you would need to increase your protein dataset, but given that you have > supplied 3 nematode species and all of UniProt already you should probably > be fine. But there is one last thing you can do. Instead of using > cufflinks, try using trinity to assemble the ESTs. There is a Jaccard clip > option that reduces merging caused by overlapping UTR. Between the > trinity and correct_est_fusion you should be able to really reduce the > effect of those ESTs.
> > If those changes don't work there is one last option. If you take the > MAKER results and filter them for SNAP and Augustus ab initio results > (match/match_part in the GFF3), then you can pass those into the pred_gff > option. Then turn snaphmm and augustus_species off in the control files. > Basically what this will do is turn MAKER's hint-based prediction off and > force it to filter the ab initio results and select models directly from > there. Since the merging is being caused by bad hints (merged transcripts > from mRNAseq) this would reduce that effect. You will still need > correct_est_fusion=1 though to trim UTR coming from the merged transcripts, > because even though MAKER can't process hints to rerun SNAP and Augustus > this way, it will try and add UTR using the EST evidence. > > --Carson > > > From: Janna Fierst > Date: Friday, October 4, 2013 1:06 PM > To: > Subject: [maker-devel] Fwd: exon/intron boundaries > > Hi, > > thanks for your reply. I have been going through our annotations in detail > and trying different parameter sets, and I think I have identified what is > going on but I'm not sure how to set the MAKER2 parameters for our > situation. We are working with a species of Caenorhabditid worm and there > are long gene-dense blocks that are being incorrectly annotated as large > single genes instead of several smaller closely spaced genes. The protein > alignments (tblastx and protein2genome) show very clearly where the > exon/intron boundaries are and in most cases agree with the Augustus > predictions. The assembled cufflinks output (through blastn, est2genome and > est_gff:cufflinks) does not agree in some locations; I think this may be > because in some cases the UTR nearly overlaps adjacent genes. > > I have included a screenshot of an annotated region viewed in Apollo to > try to show this.
The large gene in the middle is actually 7 different > genes that are extremely close together and MAKER2 is collapsing them into > a single gene. I tried running without any RNASeq/cufflinks data and MAKER2 > annotates the region as two genes instead of one, but I can't get it to > recognize the 7 as different genes. I have retrained SNAP but we have not > been able to successfully train Augustus; we are currently using the > default caenorhabditis species model. I also included a species-specific > repeat library. I tried setting correct_est_fusion=1 and reducing > pred_flank but these changes appear to really alter the annotations and we > end up annotating almost nothing. I also tried setting est2genome=0 to > decrease the influence of the cufflinks assembly but it didn't appear to > help. There are some very large introns in these genes so I haven't yet > tried decreasing the maximum intron size because I'm concerned this may > generate too many split genes instead of our current merged gene problem. > Thanks for your help, any advice is greatly appreciated! -Janna Fierst > > > On Mon, Aug 26, 2013 at 12:21 PM, Carson Holt wrote: > >> Are you getting gene fusions or just more exons? Gene fusions can be >> reduced by setting correct_est_fusion=1, or reducing pred_flank, although >> reducing pred_flank can cause other issues (but those generally only appear >> if setting the value below 150). Also if you have the maximum intron >> size set too high (split_hit option), you may also be generating bridging >> alignments that make evidence align across distant paralogous genes as well >> (this can result in gene merging). >> >> You should also look at your results manually in a viewer like Apollo. >> Then see if the extra exons are supported by something such as protein >> alignments from another species.
If this is the case, you may have a >> poorly annotated protein set that is being used as evidence that is >> carrying over its erroneous exons into the species you are annotating. If >> the extra exons are supported by EST evidence, then perhaps you should try >> and rebuild the EST assembly (for example Trinity has an option to use a >> Jaccard similarity coefficient to avoid fusing transcripts). >> >> Another option is to retrain SNAP or Augustus. MAKER does not actually >> produce any of the models itself (it is a pipeline, not a predictor). The >> models are all generated using these other algorithms; MAKER just feeds >> them hints based on protein and transcript alignments, so making sure >> training is sufficient is important for those programs to produce their >> best models. >> >> Finally make sure your repeat database is sufficient; you may need to >> generate a species-specific repeat library using something like >> RepeatModeler. Repeats can end up being included as extra exons in gene >> models because they may contain reading frames that do code for proteins >> (e.g. reverse transcriptases). >> >> If you have any questions on any of the above, just let us know. >> >> Thanks, >> Carson >> >> >> From: Janna Fierst >> Date: Monday, August 26, 2013 2:54 PM >> To: >> Subject: [maker-devel] exon/intron boundaries >> >> Hi, >> >> I am using MAKER 2.28 to annotate a Caenorhabditid worm genome, and the >> initial results appear fairly good but we seem to be annotating too many >> exons for multiple genes. I was wondering which parameters should be tuned >> to change the threshold for exon/intron boundaries?
Thanks for your help >> -Janna Fierst >> _______________________________________________ maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> > > > _______________________________________________ maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > From graham.etherington at sainsbury-laboratory.ac.uk Thu Oct 24 07:42:51 2013 From: graham.etherington at sainsbury-laboratory.ac.uk (graham etherington (TSL)) Date: Thu, 24 Oct 2013 13:42:51 +0000 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: Message-ID: Hi, I notice that the latest version of Maker available from http://www.yandell-lab.org/software/maker.html is still v2.29b (4 Oct 2013). As we're also having the start codon problem we'd like to get hold of v2.30. Do you know when this will be released? Many thanks, Graham Dr. Graham Etherington Bioinformatics Support Officer, The Sainsbury Laboratory, Norwich Research Park, Norwich NR4 7UH. UK Tel: +44 (0)1603 450601 On 13/10/2013 06:26, "Carson Holt" wrote: >Before Wednesday, because after that I'm on a 6 day road trip, so I have >to have it wrapped up before I leave. > >--Carson > > > > >On 10/12/13 12:16 PM, "Felix Bemm" wrote: > >>Thx Carson! Do you know when 2.30 will be available? Bests >> >>On 12.10.2013 17:48, Carson Holt wrote: >>> This issue just came up this week on another e-mail as well. >>> >>> One way to fix it is to edit the Bio::Tools::CodonTable module from >>> BioPerl (use maker --debug to have maker print out the locations of all >>> modules it uses). Such a hack should only be done as a temporary >>> workaround.
>>> >>> You change line 256 from --> >>> ---M---------------M---------------M---------------------------- >>> >>> To--> >>> -----------------------------------M---------------------------- >>> >>> >>> Maker uses the BioPerl is_start_codon function to determine if the >>>codon >>> used in a result it receives is acceptable or if it should look >>>upstream >>> or downstream for a different one. Apparently there are actually 3 >>> acceptable start codons in the standard codon table (2 rare ones and >>>one >>> common), so making the edit removes the two rare ones from the list. >>> >>> I'm also preparing a 2.30 release that exports a "strict" codon table >>>to >>> the BioPerl module, that will be used by default and should fix the >>>issue. >>> >>> Thanks, >>> Carson >>> >>> >>> >>> On 10/12/13 11:06 AM, "Felix Bemm" wrote: >>> >>>> Hi, >>>> >>>> I have a serious problems with maker annotations especially the start >>>> codon placement. It started happening with maker version 2.27! About >>>>1/3 >>>> of my genes are missing a proper start codon. When reviewing the GFF >>>> files gene predictors and est evidence standalone would predict the >>>> right gene structure including the start codon. I was thinking that >>>>this >>>> problem was somehow linked to my gene models but looking at their >>>> standalone prediction I discarded that theory. Just some stats about >>>> proteins with proper start codon (first column) and missing start >>>>codon >>>> (second column). >>>> >>>> Maker 9268 4215 >>>> SNAP 16577 896 >>>> Genemark 18764 290 >>>> >>>> Numbers look the same when Augustus is used (currently retraining). >>>> Manually using EST's in Apollo often fixes the problems but I don't >>>>want >>>> to check > 4000 genes for this. >>>> >>>> Do you have any idea what could cause such a problem? If you need some >>>> example gff files I can send you. 
>>>> >>>> Best regards >>>> Felix >>>> >>>> -- >>>> Felix Bemm >>>> Department of Bioinformatics >>>> University of W?rzburg, Germany >>>> Tel: +49 931 - 31 83696 >>>> Fax: +49 931 - 31 84552 >>>> felix.bemm at uni-wuerzburg.de >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> >>>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >>-- >>Felix Bemm >>Department of Bioinformatics >>University of W?rzburg, Germany >>Tel: +49 931 - 31 83696 >>Fax: +49 931 - 31 84552 >>felix.bemm at uni-wuerzburg.de > > > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From myandell at genetics.utah.edu Thu Oct 24 07:55:05 2013 From: myandell at genetics.utah.edu (Mark Yandell) Date: Thu, 24 Oct 2013 13:55:05 +0000 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: References: , Message-ID: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> Hi Graham, Carson is currently on the road, moving his family from Canada back to the States. He should be back online sometime next week. Sorry for the delay. --mark Mark Yandell Professor of Human Genetics H.A. & Edna Benning Presidential Endowed Chair Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:801-587-7707 ________________________________________ From: maker-devel [maker-devel-bounces at yandell-lab.org] on behalf of graham etherington (TSL) [graham.etherington at sainsbury-laboratory.ac.uk] Sent: Thursday, October 24, 2013 7:42 AM To: Carson Holt; Felix Bemm; maker-devel at yandell-lab.org Cc: Kate Bailey (TSL) Subject: Re: [maker-devel] Serious Start Codon Problem Hi, I notice that the latest version of Maker available from http://www.yandell-lab.org/software/maker.html is still v2.29b (4 Oct 2013). 
As we're also having the start codon problem we'd like to get hold of v2.30. Do you know when this will be released? Many thanks, Graham Dr. Graham Etherington Bioinformatics Support Officer, The Sainsbury Laboratory, Norwich Research Park, Norwich NR4 7UH. UK Tel: +44 (0)1603 450601 On 13/10/2013 06:26, "Carson Holt" wrote: >Before Wednesday, because after that I'm on a 6 day road trip, so I have >to have it wrapped up before I leave. > >--Carson > > > > >On 10/12/13 12:16 PM, "Felix Bemm" wrote: > >>Thx Carson! Do you now when 2.30 will be available? Bests >> >>On 12.10.2013 17:48, Carson Holt wrote: >>> This issue just came up this week on another e-mail as well. >>> >>> One way to fix it, is to edit the Bio::Tools::CodonTable module from >>> BioPerl (use maker --debug to have maker print out the locations of all >>> modules it uses). Such a hack should only be done as a temporary work >>> around. >>> >>> You change line 256 from --> >>> ---M---------------M---------------M---------------------------- >>> >>> To--> >>> -----------------------------------M---------------------------- >>> >>> >>> Maker uses the BioPerl is_start_codon function to determine if the >>>codon >>> used in a result it receives is acceptable or if it should look >>>upstream >>> or downstream for a different one. Apparently there are actually 3 >>> acceptable start codons in the standard codon table (2 rare ones and >>>one >>> common), so making the edit removes the two rare ones from the list. >>> >>> I'm also preparing a 2.30 release that exports a "strict" codon table >>>to >>> the BioPerl module, that will be used by default and should fix the >>>issue. >>> >>> Thanks, >>> Carson >>> >>> >>> >>> On 10/12/13 11:06 AM, "Felix Bemm" wrote: >>> >>>> Hi, >>>> >>>> I have a serious problems with maker annotations especially the start >>>> codon placement. It started happening with maker version 2.27! About >>>>1/3 >>>> of my genes are missing a proper start codon. 
When reviewing the GFF >>>> files gene predictors and est evidence standalone would predict the >>>> right gene structure including the start codon. I was thinking that >>>>this >>>> problem was somehow linked to my gene models but looking at their >>>> standalone prediction I discarded that theory. Just some stats about >>>> proteins with proper start codon (first column) and missing start >>>>codon >>>> (second column). >>>> >>>> Maker 9268 4215 >>>> SNAP 16577 896 >>>> Genemark 18764 290 >>>> >>>> Numbers look the same when Augustus is used (currently retraining). >>>> Manually using EST's in Apollo often fixes the problems but I don't >>>>want >>>> to check > 4000 genes for this. >>>> >>>> Do you have any idea what could cause such a problem? If you need some >>>> example gff files I can send you. >>>> >>>> Best regards >>>> Felix >>>> >>>> -- >>>> Felix Bemm >>>> Department of Bioinformatics >>>> University of W?rzburg, Germany >>>> Tel: +49 931 - 31 83696 >>>> Fax: +49 931 - 31 84552 >>>> felix.bemm at uni-wuerzburg.de >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> >>>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >>-- >>Felix Bemm >>Department of Bioinformatics >>University of W?rzburg, Germany >>Tel: +49 931 - 31 83696 >>Fax: +49 931 - 31 84552 >>felix.bemm at uni-wuerzburg.de > > > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From felix.bemm at uni-wuerzburg.de Thu Oct 24 08:03:27 2013 From: felix.bemm at uni-wuerzburg.de (Felix Bemm) Date: Thu, 24 Oct 2013 16:03:27 +0200 Subject: [maker-devel] Serious Start Codon Problem 
In-Reply-To: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> References: , <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> Message-ID: <526928AF.90108@uni-wuerzburg.de> Hi, I used the temporary fix to the Bio::Tools::CodonTable module. It's working pretty nicely. Almost every predicted protein now starts with an M ;) Bests Felix On 24.10.2013 15:55, Mark Yandell wrote: > Hi Graham, > > Carson is currently on the road, moving his family from Canada back to the States. He should be back online sometime next week. Sorry for the delay. > > > --mark > > > Mark Yandell > Professor of Human Genetics > H.A. & Edna Benning Presidential Endowed Chair > Eccles Institute of Human Genetics > University of Utah > 15 North 2030 East, Room 2100 > Salt Lake City, UT 84112-5330 > ph:801-587-7707 > > ________________________________________ > From: maker-devel [maker-devel-bounces at yandell-lab.org] on behalf of graham etherington (TSL) [graham.etherington at sainsbury-laboratory.ac.uk] > Sent: Thursday, October 24, 2013 7:42 AM > To: Carson Holt; Felix Bemm; maker-devel at yandell-lab.org > Cc: Kate Bailey (TSL) > Subject: Re: [maker-devel] Serious Start Codon Problem > > Hi, > I notice that the latest version of Maker available from > http://www.yandell-lab.org/software/maker.html is still v2.29b (4 Oct > 2013). As we're also having the start codon problem we'd like to get hold > of v2.30. Do you know when this will be released? > Many thanks, > Graham > > > Dr. Graham Etherington > Bioinformatics Support Officer, > The Sainsbury Laboratory, > Norwich Research Park, > Norwich NR4 7UH. > UK > Tel: +44 (0)1603 450601 > > > > > > On 13/10/2013 06:26, "Carson Holt" wrote: > >> Before Wednesday, because after that I'm on a 6 day road trip, so I have >> to have it wrapped up before I leave. >> >> --Carson >> >> >> >> >> On 10/12/13 12:16 PM, "Felix Bemm" wrote: >> >>> Thx Carson! Do you know when 2.30 will be available?
Bests >>> >>> On 12.10.2013 17:48, Carson Holt wrote: >>>> This issue just came up this week on another e-mail as well. >>>> >>>> One way to fix it, is to edit the Bio::Tools::CodonTable module from >>>> BioPerl (use maker --debug to have maker print out the locations of all >>>> modules it uses). Such a hack should only be done as a temporary work >>>> around. >>>> >>>> You change line 256 from --> >>>> ---M---------------M---------------M---------------------------- >>>> >>>> To--> >>>> -----------------------------------M---------------------------- >>>> >>>> >>>> Maker uses the BioPerl is_start_codon function to determine if the >>>> codon >>>> used in a result it receives is acceptable or if it should look >>>> upstream >>>> or downstream for a different one. Apparently there are actually 3 >>>> acceptable start codons in the standard codon table (2 rare ones and >>>> one >>>> common), so making the edit removes the two rare ones from the list. >>>> >>>> I'm also preparing a 2.30 release that exports a "strict" codon table >>>> to >>>> the BioPerl module, that will be used by default and should fix the >>>> issue. >>>> >>>> Thanks, >>>> Carson >>>> >>>> >>>> >>>> On 10/12/13 11:06 AM, "Felix Bemm" wrote: >>>> >>>>> Hi, >>>>> >>>>> I have a serious problems with maker annotations especially the start >>>>> codon placement. It started happening with maker version 2.27! About >>>>> 1/3 >>>>> of my genes are missing a proper start codon. When reviewing the GFF >>>>> files gene predictors and est evidence standalone would predict the >>>>> right gene structure including the start codon. I was thinking that >>>>> this >>>>> problem was somehow linked to my gene models but looking at their >>>>> standalone prediction I discarded that theory. Just some stats about >>>>> proteins with proper start codon (first column) and missing start >>>>> codon >>>>> (second column). 
>>>>> >>>>> Maker 9268 4215 >>>>> SNAP 16577 896 >>>>> Genemark 18764 290 >>>>> >>>>> Numbers look the same when Augustus is used (currently retraining). >>>>> Manually using EST's in Apollo often fixes the problems but I don't >>>>> want >>>>> to check > 4000 genes for this. >>>>> >>>>> Do you have any idea what could cause such a problem? If you need some >>>>> example gff files I can send you. >>>>> >>>>> Best regards >>>>> Felix >>>>> >>>>> -- >>>>> Felix Bemm >>>>> Department of Bioinformatics >>>>> University of W?rzburg, Germany >>>>> Tel: +49 931 - 31 83696 >>>>> Fax: +49 931 - 31 84552 >>>>> felix.bemm at uni-wuerzburg.de >>>>> >>>>> _______________________________________________ >>>>> maker-devel mailing list >>>>> maker-devel at box290.bluehost.com >>>>> >>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >>> -- >>> Felix Bemm >>> Department of Bioinformatics >>> University of W?rzburg, Germany >>> Tel: +49 931 - 31 83696 >>> Fax: +49 931 - 31 84552 >>> felix.bemm at uni-wuerzburg.de >> >> >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > -- Felix Bemm Department of Bioinformatics University of W?rzburg, Germany Tel: +49 931 - 31 83696 Fax: +49 931 - 31 84552 felix.bemm at uni-wuerzburg.de From jim.hu.biobio at gmail.com Thu Oct 24 08:32:01 2013 From: jim.hu.biobio at gmail.com (JH Gmail) Date: Thu, 24 Oct 2013 09:32:01 -0500 Subject: [maker-devel] Serious Start Codon Problem In-Reply-To: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> References: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> Message-ID: 
<69247335-1227-41CB-94EA-BFEEE5207D51@gmail.com>

Slightly off-topic questions: What is the assessment of right and wrong
based on? Is there a training set based on experiments? In bacteria, the
use of the minor starts is significant, but the mechanism is so different
I would expect eukaryotes to be more strict.

Does maker handle ribosome profiling yet?

jim

Sent from my iPad

> On Oct 24, 2013, at 8:55 AM, Mark Yandell wrote:
>
> Hi Graham,
>
> Carson is currently on the road, moving his family from Canada back to
> the States. He should be back online sometime next week. Sorry for the
> delay.
>
> --mark
>
> Mark Yandell
> Professor of Human Genetics
> H.A. & Edna Benning Presidential Endowed Chair
> Eccles Institute of Human Genetics
> University of Utah
> 15 North 2030 East, Room 2100
> Salt Lake City, UT 84112-5330
> ph:801-587-7707
>
> ________________________________________
> From: maker-devel [maker-devel-bounces at yandell-lab.org] on behalf of graham etherington (TSL) [graham.etherington at sainsbury-laboratory.ac.uk]
> Sent: Thursday, October 24, 2013 7:42 AM
> To: Carson Holt; Felix Bemm; maker-devel at yandell-lab.org
> Cc: Kate Bailey (TSL)
> Subject: Re: [maker-devel] Serious Start Codon Problem
>
> Hi,
> I notice that the latest version of Maker available from
> http://www.yandell-lab.org/software/maker.html is still v2.29b (4 Oct
> 2013). As we're also having the start codon problem we'd like to get hold
> of v2.30. Do you know when this will be released?
> Many thanks,
> Graham
>
> Dr. Graham Etherington
> Bioinformatics Support Officer,
> The Sainsbury Laboratory,
> Norwich Research Park,
> Norwich NR4 7UH.
> UK
> Tel: +44 (0)1603 450601
>
>> On 13/10/2013 06:26, "Carson Holt" wrote:
>>
>> Before Wednesday, because after that I'm on a 6 day road trip, so I have
>> to have it wrapped up before I leave.
>>
>> --Carson
>>
>>> On 10/12/13 12:16 PM, "Felix Bemm" wrote:
>>>
>>> Thx Carson! Do you know when 2.30 will be available?
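The codon-table edit Carson describes earlier in this thread can be illustrated with a short sketch. It is written in Python rather than BioPerl and is not MAKER or BioPerl code: it assumes the 64-character mask follows the NCBI genetic-code layout (bases ordered T, C, A, G), and the names start_mask and is_start_codon below are hypothetical helpers that only mimic the idea of a start-codon lookup.

```python
from itertools import product

# Assumption: codons are ordered with bases T, C, A, G, first base varying
# slowest (TTT, TTC, TTA, TTG, TCT, ...), as in the NCBI genetic-code tables.
BASES = "TCAG"
CODONS = ["".join(triplet) for triplet in product(BASES, repeat=3)]


def start_mask(start_codons):
    """Build a 64-character mask: 'M' at each start codon, '-' elsewhere."""
    starts = set(start_codons)
    return "".join("M" if codon in starts else "-" for codon in CODONS)


def is_start_codon(codon, mask):
    """Hypothetical lookup: True if the mask marks this codon as a start."""
    return mask[CODONS.index(codon.upper())] == "M"


# Mask with the two rare starts plus ATG, and the "strict" ATG-only mask.
relaxed = start_mask(["TTG", "CTG", "ATG"])
strict = start_mask(["ATG"])

print(relaxed)
print(strict)
print([c for c in CODONS if is_start_codon(c, relaxed)])
print([c for c in CODONS if is_start_codon(c, strict)])
```

Under that ordering assumption, the three 'M' positions in the original mask correspond to TTG, CTG and ATG, and the edited mask leaves only ATG as an accepted start.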
From amelia.ireland at gmod.org Thu Oct 24 12:52:17 2013
From: amelia.ireland at gmod.org (Amelia Ireland)
Date: Thu, 24 Oct 2013 11:52:17 -0700
Subject: [maker-devel] GMOD 2014: San Diego
Message-ID:

Greetings GMOD community citizens!
Online registration is now open for the 2014 GMOD meeting, to be held at the Best Western Seven Seas, San Diego, CA, on January 16-17, 2014 (after PAG). The meeting is open to GMOD users, developers, and anyone interested in the project and the software we provide.

Please see the meeting page, http://gmod.org/wiki/Jan_2014_GMOD_Meeting, for more information. Early bird registration rates apply until Dec. 14th.

We have secured a discount on a block of rooms at the Best Western Seven Seas; please see the wiki for more information on booking.

Please feel free to contact the GMOD helpdesk on help at gmod.org if you have any questions.

Thanks!

--
Amelia Ireland
GMOD Community Support
http://gmod.org || @gmodproject
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From barry.moore at genetics.utah.edu Thu Oct 24 15:09:20 2013
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Thu, 24 Oct 2013 15:09:20 -0600
Subject: [maker-devel] Serious Start Codon Problem
In-Reply-To: <69247335-1227-41CB-94EA-BFEEE5207D51@gmail.com>
References: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> <69247335-1227-41CB-94EA-BFEEE5207D51@gmail.com>
Message-ID: <1F2D4C5A-1B72-409A-8F7D-4793F9BCEE88@genetics.utah.edu>

Hi Jim,

You can add any kind of evidence you want to MAKER, including ribosomal profiling data, as long as you format it in GFF3 format. You could pass in your GFF3 formatted ribosomal profiling data in maker_opts.ctl with the protein_gff option and it will be treated as protein evidence. However, if you are wondering whether anyone has coded a specific addition to MAKER to accept ribosomal profiling data and treat it as a different kind of evidence, then no, this hasn't been done.

B

On Oct 24, 2013, at 8:32 AM, JH Gmail wrote:

> Slightly off-topic questions: What is the assessment of right and wrong based on? Is there a training set based on experiments?
> In bacteria, the use of the minor starts is significant, but the mechanism is so different I would expect eukaryotes to be more strict.
>
> Does maker handle ribosome profiling yet?
>
> jim
>
> Sent from my iPad

Barry Moore
Research Scientist
Dept.
of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From marc.hoeppner at imbim.uu.se Fri Oct 25 08:20:04 2013
From: marc.hoeppner at imbim.uu.se (Marc P. Hoeppner)
Date: Fri, 25 Oct 2013 16:20:04 +0200
Subject: [maker-devel] Maker MPICH2 issue (Multiple MAKER processes have been started in the same directory)
Message-ID: <526A7E14.9010906@imbim.uu.se>

Dear list,

I know that MPICH2 related issues have been discussed previously, but the suggested solutions don't seem to work for me.
Briefly, I have set up a new cluster, with a fresh install of mpich2 (SL 6.4, from the yum repos) and a newly compiled build of Maker 2.28.

When I start maker with mpiexec, I get spammed with
'WARNING: Multiple MAKER processes have been started in the same directory.'

Here is what I have tried so far - still getting the error :-< Any suggestions would be appreciated

--------
MPICH2
---------

which mpiexec:
/usr/lib64/mpich2/bin/mpiexec

Started the mpd ring as normal user across 5 nodes (bnode-01 to bnode-04, and the head node bhead)

mpdtrace:
bhead
bnode-02
bnode-03
bnode-04
bnode-01

mpdringtest:
time for 1 loops = 0.00128293037415 seconds

mpiexec -n 32 -machinefile hostfile hostname:

I also ran the Skampi Mpi Benchmark, which ran through just fine.

-----------
MAKER
-----------

./Build status

PERL Dependencies: VERIFIED
External Programs: VERIFIED
External C Libraries: VERIFIED
MPI SUPPORT: ENABLED
MWAS Web Interface: DISABLED
MAKER PACKAGE: CONFIGURATION OK

During configuration/installation, Build.PL automatically detected all MPI data and never complained, so I have to assume that everything went fine there.

--
----------
Marc P.
Hoeppner, PhD
Department of Microbiology and Medical Biochemistry
Uppsala University, Sweden
marc.hoeppner at imbim.uu.se
From jim.hu.biobio at gmail.com Fri Oct 25 08:51:19 2013
From: jim.hu.biobio at gmail.com (JH Gmail)
Date: Fri, 25 Oct 2013 09:51:19 -0500
Subject: [maker-devel] Serious Start Codon Problem
In-Reply-To: <1F2D4C5A-1B72-409A-8F7D-4793F9BCEE88@genetics.utah.edu>
References: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> <69247335-1227-41CB-94EA-BFEEE5207D51@gmail.com> <1F2D4C5A-1B72-409A-8F7D-4793F9BCEE88@genetics.utah.edu>
Message-ID: <596400F6-3215-4F71-A896-AF52CEC8D77D@gmail.com>

Thanks Barry,

I'm thinking about this more theoretically, as we don't have immediate plans, but...

This would work for gff derived from ribosome profiling analysis, but it would be nice to be able to start with bam files of mapped reads, or coverage scores as bigwig or gff. RNA-seq has similar issues, but IIRC Maker expects RNA-seq data to be assembled first. It may not be possible to do de novo cds assembly from ribosome profiling because I think the nuclease step may kill the overlaps. I'd have to look at some of the available data.

Jim

Sent from my iPad

> On Oct 24, 2013, at 4:09 PM, Barry Moore wrote:
>
> Hi Jim,
>
> You can add any kind of evidence you want to MAKER, including ribosomal profiling data, as long as you format it in GFF3 format. You could pass in your GFF3 formatted ribosomal profiling data in maker_opts.ctl with the protein_gff option and it will be treated as protein evidence. However, if you are wondering whether anyone has coded a specific addition to MAKER to accept ribosomal profiling data and treat it as a different kind of evidence, then no, this hasn't been done.
>
> B

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From barry.moore at genetics.utah.edu Fri Oct 25 09:44:55 2013
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Fri, 25 Oct 2013 09:44:55 -0600
Subject: [maker-devel] Serious Start Codon Problem
In-Reply-To: <596400F6-3215-4F71-A896-AF52CEC8D77D@gmail.com>
References: <7A60AB257EFF2B48B1F4C814817EA05365E778BF@mxb2.hg.genetics.utah.edu> <69247335-1227-41CB-94EA-BFEEE5207D51@gmail.com> <1F2D4C5A-1B72-409A-8F7D-4793F9BCEE88@genetics.utah.edu> <596400F6-3215-4F71-A896-AF52CEC8D77D@gmail.com>
Message-ID:

Thanks Jim.

B

On Oct 25, 2013, at 8:51 AM, JH Gmail wrote:

> Thanks Barry,
>
> I'm thinking about this more theoretically, as we don't have immediate plans, but...
>
> This would work for gff derived from ribosome profiling analysis, but it would be nice to be able to start with bam files of mapped reads, or coverage scores as bigwig or gff. RNA-seq has similar issues, but IIRC Maker expects RNA-seq data to be assembled first. It may not be possible to do de novo cds assembly from ribosome profiling because I think the nuclease step may kill the overlaps. I'd have to look at some of the available data.
>
> Jim
>
> Sent from my iPad

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From marc.hoeppner at imbim.uu.se Sat Oct 26 05:54:07 2013 From: marc.hoeppner at imbim.uu.se (Marc P. Hoeppner) Date: Sat, 26 Oct 2013 13:54:07 +0200 Subject: [maker-devel] Maker MPICH2 issue (Multiple MAKER processes have been started in the same directory) In-Reply-To: <526A7E14.9010906@imbim.uu.se> References: <526A7E14.9010906@imbim.uu.se> Message-ID: <526BAD5F.2000800@imbim.uu.se> Dear List, after much more debugging, I found the solution myself. Here are the details: The Maker installation instructions refer to 'mpich2' in multiple places, including weblinks (whchi are outdated btw). Some digging revealed that mpich2 has since been renamed and merged back into mpich. So the website is www.mpich.org and the current stable version is mpich-3.0.4. Closely following the installation instruction yielded a working copy of mpiexec that Maker was willing to accept. Good news - but I suggest to update the documentation. Most importantly for either SL/CentOS users, mpich2 as packaged with e.g. Scientific Linux *does not* work with Maker. You will need to download the latest stable build and compile it yourself (the mpich2 version bundled with Maker doesn't work either btw). Cheers, Marc > Dear list, > > I know that MPICH2 related issues have been discussed previously, but > the suggested solutions don't seem to work for me. > Briefly, I have set up a new cluster, with a fresh install of mpich2 > (SL 6.4, from the yum repos) and a newly compiled build of Maker 2.28. > > When I start maker with mpiexec, I get spammed with > 'WARNING: Multiple MAKER processes have been started in the same > directory.' 
>
> Here is what I have tried so far - still getting the error :-< Any
> suggestions would be appreciated
>
> --------
> MPICH2
> ---------
>
> which mpiexec:
> /usr/lib64/mpich2/bin/mpiexec
>
> Started the mpd ring as normal user across 5 nodes (bnode-01 to
> bnode-04, and the head node bhead)
>
> mpdtrace:
> bhead
> bnode-02
> bnode-03
> bnode-04
> bnode-01
>
> mpdringtest:
> time for 1 loops = 0.00128293037415 seconds
>
> mpiexec -n 32 -machinefile hostfile hostname:
>
>
>
> I also ran the Skampi Mpi Benchmark, which ran through just fine.
>
> -----------
> MAKER
> -----------
>
> ./Build status
>
> PERL Dependencies: VERIFIED
> External Programs: VERIFIED
> External C Libraries: VERIFIED
> MPI SUPPORT: ENABLED
> MWAS Web Interface: DISABLED
> MAKER PACKAGE: CONFIGURATION OK
>
> During configuration/installation, Build.PL automatically detected all
> MPI data and never complained, so I have to assume that everything
> went fine there.
>

--
----------
Marc P. Hoeppner, PhD
Department of Microbiology and Medical Biochemistry
Uppsala University, Sweden
marc.hoeppner at imbim.uu.se

From barry.utah at gmail.com Sat Oct 26 08:41:24 2013
From: barry.utah at gmail.com (Barry Moore)
Date: Sat, 26 Oct 2013 08:41:24 -0600
Subject: [maker-devel] Maker MPICH2 issue (Multiple MAKER processes have been started in the same directory)
In-Reply-To: <526BAD5F.2000800@imbim.uu.se>
References: <526A7E14.9010906@imbim.uu.se> <526BAD5F.2000800@imbim.uu.se>
Message-ID:

Hi Marc,

Thanks so much for sharing the results of all your hard work! We will follow up on your feedback and incorporate the necessary updates to MAKER docs and installation procedures.

B

On Oct 26, 2013, at 5:54 AM, Marc P. Hoeppner wrote:

> Dear List,
>
> after much more debugging, I found the solution myself. Here are the details:
>
> The Maker installation instructions refer to 'mpich2' in multiple places, including weblinks (which are outdated, btw).
> Some digging revealed that mpich2 has since been renamed and merged back into mpich. So the website is www.mpich.org and the current stable version is mpich-3.0.4. Closely following the installation instructions yielded a working copy of mpiexec that Maker was willing to accept. Good news, but I suggest updating the documentation. Most importantly for SL/CentOS users, mpich2 as packaged with e.g. Scientific Linux *does not* work with Maker. You will need to download the latest stable build and compile it yourself (the mpich2 version bundled with Maker doesn't work either, btw).
>
> Cheers,
>
> Marc
>
>> Dear list,
>>
>> I know that MPICH2 related issues have been discussed previously, but the suggested solutions don't seem to work for me.
>> Briefly, I have set up a new cluster, with a fresh install of mpich2 (SL 6.4, from the yum repos) and a newly compiled build of Maker 2.28.
>>
>> When I start maker with mpiexec, I get spammed with
>> 'WARNING: Multiple MAKER processes have been started in the same directory.'
>>
>> Here is what I have tried so far - still getting the error :-< Any suggestions would be appreciated
>>
>> --------
>> MPICH2
>> ---------
>>
>> which mpiexec:
>> /usr/lib64/mpich2/bin/mpiexec
>>
>> Started the mpd ring as normal user across 5 nodes (bnode-01 to bnode-04, and the head node bhead)
>>
>> mpdtrace:
>> bhead
>> bnode-02
>> bnode-03
>> bnode-04
>> bnode-01
>>
>> mpdringtest:
>> time for 1 loops = 0.00128293037415 seconds
>>
>> mpiexec -n 32 -machinefile hostfile hostname:
>>
>>
>>
>> I also ran the Skampi Mpi Benchmark, which ran through just fine.
>>
>> -----------
>> MAKER
>> -----------
>>
>> ./Build status
>>
>> PERL Dependencies: VERIFIED
>> External Programs: VERIFIED
>> External C Libraries: VERIFIED
>> MPI SUPPORT: ENABLED
>> MWAS Web Interface: DISABLED
>> MAKER PACKAGE: CONFIGURATION OK
>>
>> During configuration/installation, Build.PL automatically detected all MPI data and never complained, so I have to assume that everything went fine there.
>>
>
>
> --
> ----------
> Marc P. Hoeppner, PhD
> Department of Microbiology and Medical Biochemistry
> Uppsala University, Sweden
> marc.hoeppner at imbim.uu.se

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From wholtz at lygos.com Wed Oct 30 12:21:52 2013
From: wholtz at lygos.com (William Holtz)
Date: Wed, 30 Oct 2013 11:21:52 -0700 (PDT)
Subject: [maker-devel] maker apollo error
Message-ID: <1383157312.95043.YahooMailNeo@web5706.biz.mail.ne1.yahoo.com>

Hi,

I'm attempting to reply to a thread on this list from before I was a member. My apologies if this gets messed up.

I've tried installing maker v2.28 and v2.29p-beta and both of them fail due to the URL of the apollo tar file being outdated as previously reported on this list. The updated URL that Carson Holt posted, http://sourceforge.net/code-snapshots/svn/g/gm/gmod/svn/gmod-svn-25291-apollo-trunk.zip, is also no longer valid. I was able to find a copy of the apollo source at http://downloads.sourceforge.net/project/gmod/Apollo/1.11.1/apollo-1.11.1.tgz.
However, my attempts to integrate this into the maker build process have been unsuccessful, though it could likely be done by someone more skilled than I. Any pointers on how to get past this hurdle would be appreciated.

Thanks in advance for your help.

-Will

From barry.utah at gmail.com Wed Oct 30 16:31:58 2013
From: barry.utah at gmail.com (Barry Moore)
Date: Wed, 30 Oct 2013 16:31:58 -0600
Subject: [maker-devel] maker apollo error
In-Reply-To: <1383157312.95043.YahooMailNeo@web5706.biz.mail.ne1.yahoo.com>
References: <1383157312.95043.YahooMailNeo@web5706.biz.mail.ne1.yahoo.com>
Message-ID:

Hi Will,

You can edit these URLs in the following file:

maker/src/locations

In a section that looks like this:

    'apollo' => {'Linux_x86_64' => 'http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar',
                 'Linux_i386' => 'http://gmod.svn.sourceforge.net/viewvc/gmod/apollo/trunk/?view=tar',
                 'Darwin_i386' => 'http://weatherby.genetics.utah.edu/apollo/apollo.tar.gz',
                 'Darwin_x86_64' => 'http://weatherby.genetics.utah.edu/apollo/apollo.tar.gz',
                 'src_' => 'http://weatherby.genetics.utah.edu/apollo/apollo.tar.gz',
                },

Carson may have more feedback for you as well, but he is right in the middle of a move, so it may be a few days before he can reply. Modifying this file should work for you in the meantime. For any executables that are missing, you can always install them manually and add them to your path, and maker will find them. Also, Apollo shouldn't be required for a base maker install, so you could ignore that dependency as well.

Thanks,

B

On Oct 30, 2013, at 12:21 PM, William Holtz wrote:

> Hi,
>
> I'm attempting to reply to a thread on this list from before I was a member. My apologies if this gets messed up.
>
> I've tried installing maker v2.28 and v2.29p-beta and both of them fail due to the URL of the apollo tar file being outdated as previously reported on this list.
> The updated URL that Carson Holt posted, http://sourceforge.net/code-snapshots/svn/g/gm/gmod/svn/gmod-svn-25291-apollo-trunk.zip, is also no longer valid. I was able to find a copy of the apollo source at http://downloads.sourceforge.net/project/gmod/Apollo/1.11.1/apollo-1.11.1.tgz. However, my attempts to integrate this into the maker build process have been unsuccessful, though it could likely be done by someone more skilled than I. Any pointers on how to get past this hurdle would be appreciated.
>
> Thanks in advance for your help.
>
> -Will

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
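
[Editorial sketch] The edit to maker/src/locations suggested in the reply above could be scripted roughly as follows. This is only a sketch under stated assumptions: it assumes the 'apollo' section has the layout quoted in the thread, the function name replace_apollo_urls is hypothetical, and the replacement URL is simply the tarball William Holtz reported finding. Nothing in the thread confirms that MAKER's build step can unpack that archive, so treat the URL as a placeholder for whichever working copy you locate.

```python
import re

# Assumption: the tarball William Holtz found. The thread does NOT confirm
# that MAKER's build process can use this archive as-is.
NEW_APOLLO_URL = (
    "http://downloads.sourceforge.net/project/gmod/Apollo/1.11.1/apollo-1.11.1.tgz"
)


def replace_apollo_urls(locations_text: str, new_url: str) -> str:
    """Return locations_text with every URL in the 'apollo' => {...} block set to new_url.

    Hypothetical helper: it relies on the block layout quoted in the thread
    and leaves every other section of the file untouched.
    """
    # Locate the apollo block: from "'apollo' => {" up to the first closing brace.
    block = re.search(r"'apollo'\s*=>\s*\{.*?\}", locations_text, flags=re.DOTALL)
    if block is None:
        return locations_text  # no apollo section, nothing to change

    # Inside that block only, rewrite each  'platform' => 'url'  pair.
    patched = re.sub(
        r"('\w+'\s*=>\s*)'[^']*'",
        lambda m: m.group(1) + "'" + new_url + "'",
        block.group(0),
    )
    return locations_text[: block.start()] + patched + locations_text[block.end():]
```

To apply it, read maker/src/locations into a string, pass it through replace_apollo_urls, and write the result back after saving a backup copy of the original file. Editing the file by hand, as described in the reply, achieves the same thing.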