From carson.holt at genetics.utah.edu Thu Nov 6 01:04:07 2014
From: carson.holt at genetics.utah.edu (Carson Holt)
Date: Thu, 6 Nov 2014 07:04:07 +0000
Subject: [maker-devel] URGENT: Re: maker failure with example data
In-Reply-To:
References:
Message-ID: <3CCDC24F-756A-492C-8E7F-B1B97616EE45@genetics.utah.edu>

The final transcript and protein FASTA files will only exist if there were gene models with evidence support. If you did not provide an HMM for one of the ab initio gene predictors (SNAP, Augustus, etc.), then there will be no gene models, and if you do not provide protein or EST evidence FASTA files, then there will be no evidence support. Also, if your contigs are too short to contain gene models, then there will be no models.

Thanks,
Carson

On Nov 5, 2014, at 11:49 PM, Goutham atla wrote:

Dear All,

I have finished running maker. But I realised that there are no *transcripts.fasta and *protein.fasta files in any of the directories that maker has created. It has only gff files.

Example output of a test run (I have similar results on the original file also):

[User at motif jcf7180001838744]$ pwd
/home/User/Maker_Annotation/Maker_test.maker.output/Maker_test_datastore/35/C1/jcf7180001838744
[User at motif jcf7180001838744]$ ls
jcf7180001838744.gff run.log theVoid.jcf7180001838744

Any help in figuring out why there are no protein.fasta and transcripts.fasta files would be very helpful.

Regards,
Goutham

On Wed, Oct 1, 2014 at 11:28 AM, Goutham atla wrote:

Dear All,

Thank you. I figured out the problem is with mpich2. I was behind mpich2 but was unsuccessful. I installed mpich v3 and it's working fine now. Thank you all. The old GMOD tutorials are a bit misleading as the new versions have come up.

On Wed, Oct 1, 2014 at 11:09 AM, Marc Höppner wrote:

Another possibility could be that MPICH2 wasn't built properly, no? I remember something with enabling shared libraries during the compilation of mpich, without which the error below would appear.

/Marc

Marc P. Hoeppner, PhD
Team Leader
BILS Genome Annotation Platform
Department for Medical Biochemistry and Microbiology
Uppsala University, Sweden
marc.hoeppner at imbim.uu.se

On 30 Sep 2014, at 21:33, Carson Holt wrote:

The message is warning that there are multiple instances of MAKER running, but no MPI communication. When you build MAKER (the perl Build.PL step when installing MAKER), you need to specify the location of 'mpicc' and 'mpi.h' to build with MPI support. Otherwise you won't be able to link against the MPICH2 shared libraries. You probably need to rerun that step.

--Carson

From: Goutham atla
Date: Tuesday, September 30, 2014 at 10:49 AM
To: Carson Holt
Cc: "maker-devel at yandell-lab.org"
Subject: URGENT: Re: maker failure with example data

Hi Carson,

I figured out the problem is with the RepeatMasker installation and I fixed it.

I am running maker with MPICH2 and I get the following warning when I start it:

STATUS: Processing and indexing input FASTA files...
WARNING: Multiple MAKER processes have been started in the same directory.

I would like to know if this is common.

Regards,
Goutham

On Tue, Sep 30, 2014 at 12:02 PM, Goutham atla wrote:

Dear Carson,

Thank you for the reply. I reinstalled BioPerl and now I am getting the following error on the test data.

ERROR: RepeatMasker failed
--> rank=NA, hostname=motif
ERROR: Failed while doing repeat masking
ERROR: Chunk failed at level:0, tier_type:1
FAILED CONTIG:contig-dpp-500-500

On Mon, Sep 29, 2014 at 8:17 PM, Carson Holt wrote:

The error is caused by the BioPerl indexer returning an empty length for the indexed fasta sequence (possibly because of a corrupt index file or other reasons). You may need to reinstall BioPerl (use the CPAN version, not the BioPerl-live version), or reinstall Berkeley DB (used by the BioPerl indexer), or reinstall the Perl module DB_File via CPAN (Perl's interface to Berkeley DB). After reinstalling BioPerl, delete the mpi_blastdb directory for the MAKER run before retrying.

Also verify that the /tmp directory on your system, or the directory pointed to by TMP= in the maker_opts.ctl file, is not full and that TMP= is not set to an NFS-mounted location.

Thanks,
Carson

From: Goutham atla
Date: Monday, September 29, 2014 at 6:33 AM
To:
Subject: maker failure with example data

Dear All,

I am running maker with the demo file, i.e. dpp_contig.fasta, keeping all other parameters in the .ctl files as default. But it does not progress and shows the following message that the length of the sequence is 0. Can anybody help me?

--Next Contig--

MAKER WARNING: All old files will be erased before continuing
#---------------------------------------------------------------------
Skipping the contig because it is too short!!
SeqID: contig-dpp-500-500
Length: 0
#---------------------------------------------------------------------

Regards,
Goutham

--
Goutham Atla
_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From monica.poelchau at ars.usda.gov Fri Nov 7 07:17:04 2014
From: monica.poelchau at ars.usda.gov (Poelchau, Monica)
Date: Fri, 7 Nov 2014 13:17:04 +0000
Subject: [maker-devel] calculating AED values between two datasets
Message-ID:

Hi everyone,

I would like to generate a list of Maker AED values comparing two datasets: a set of computationally predicted genes, and manually curated genes from the Web Apollo program. The idea is to quantify the amount of nucleotide-level change that occurred during the manual curation process.

I have tried to run Maker in several ways to generate the AED values. Both gene sets are in (as far as I can tell) valid gff3 format.
First, I included the manually curated (Web Apollo) gff3 in the 'model_gff' field of maker_opts.ctl, and the gff3 of the computational predictions in the 'est_gff' field, with all of the other prediction and evidence alignment settings turned off. All resulting AEDs from this analysis were 1, even though many of the annotations had 100% overlap. Next, instead of using the computational predictions in gff3 format, I used the fasta file of the cDNA sequences from the computational predictions in the 'est' field. Here, the results made more sense, but a small but significant percentage of the AED values were 1 when they should have been less than 1.

I have tried the 2 analyses above using both the gff3 output straight from Web Apollo, and after running the gff3 through maker once as the only entry in the model_gff field, as explained in the MAKER2 paper (http://www.biomedcentral.com/1471-2105/12/491). This does not appear to make a difference.

Do you have any ideas where I might start to debug this?

Thanks for your help!

Monica

This electronic message contains information generated by the USDA solely for the intended recipients. Any unauthorized interception of this message or the use or disclosure of the information it contains may violate the law and subject the violator to civil or criminal penalties. If you believe you have received this message in error, please notify the sender and delete the email immediately.

From goutham.atla at gmail.com Thu Nov 6 23:39:51 2014
From: goutham.atla at gmail.com (Goutham atla)
Date: Fri, 7 Nov 2014 11:09:51 +0530
Subject: [maker-devel] URGENT: Re: maker failure with example data
In-Reply-To: <3CCDC24F-756A-492C-8E7F-B1B97616EE45@genetics.utah.edu>
References: <3CCDC24F-756A-492C-8E7F-B1B97616EE45@genetics.utah.edu>
Message-ID:

Dear Carson,

Thanks for the quick reply.
It worked after providing the assembled transcripts and protein fasta from closely related species.

Regards,
Goutham

--
Goutham Atla
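The conditions described in this thread (no gene model source, or no EST/protein evidence) can be checked before a run. The following is a minimal sketch, not part of MAKER: it reads the text of a maker_opts.ctl file and reports which of the two conditions is unmet. The option names are the usual control-file keys; the helper names `parse_ctl` and `diagnose`, and the grouping of options into "model sources" and "evidence", are assumptions made for illustration.

```python
# Sketch: flag a maker_opts.ctl that cannot yield transcript/protein FASTA output.
# The grouping of options below is an assumption of this illustration, based on
# the explanation given in the thread; it is not MAKER's own validation logic.

MODEL_KEYS = ("snaphmm", "augustus_species", "gmhmm", "pred_gff", "model_gff")
MODEL_FLAGS = ("est2genome", "protein2genome")  # count as a source when set to 1
EVIDENCE_KEYS = ("est", "altest", "est_gff", "protein", "protein_gff")


def parse_ctl(text):
    """Parse 'key=value #comment' lines into a dict of stripped strings."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def diagnose(settings):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    has_models = any(settings.get(k) for k in MODEL_KEYS) or any(
        settings.get(k) == "1" for k in MODEL_FLAGS
    )
    has_evidence = any(settings.get(k) for k in EVIDENCE_KEYS)
    if not has_models:
        problems.append("no gene model source: set an HMM/species or est2genome/protein2genome")
    if not has_evidence:
        problems.append("no evidence: set est= and/or protein= (or their GFF equivalents)")
    return problems


if __name__ == "__main__":
    example = "est= #EST fasta\nprotein= #protein fasta\nsnaphmm=\nest2genome=0\n"
    for problem in diagnose(parse_ctl(example)):
        print("WARNING:", problem)
```

An empty list from `diagnose` only means the two preconditions are configured; contigs that are too short to contain gene models will still produce no FASTA output.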
From carsonhh at gmail.com Fri Nov 7 09:26:31 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 7 Nov 2014 08:26:31 -0700
Subject: [maker-devel] calculating AED values between two datasets
In-Reply-To:
References:
Message-ID: <051D0D34-9E49-401F-B22D-16970EB93B66@gmail.com>

If you got every value as 1 with est_gff, then your GFF3 didn't load. The est_gff option expects alignments in match/match_part format, and yours may not have been structured correctly. When using fasta files instead, you may also need to set single_exon=1 and single_length=1; otherwise many of those alignments will be ignored for AED scoring purposes. You should also look at the output in a viewer like Apollo to visualize the comparison and see whether the reason you get 1 is that the aligner can't recover the original transcript alignment.

--Carson

From monica.poelchau at ars.usda.gov Fri Nov 7 13:00:26 2014
From: monica.poelchau at ars.usda.gov (Poelchau, Monica)
Date: Fri, 7 Nov 2014 19:00:26 +0000
Subject: [maker-devel] calculating AED values between two datasets
In-Reply-To: <051D0D34-9E49-401F-B22D-16970EB93B66@gmail.com>
References: <051D0D34-9E49-401F-B22D-16970EB93B66@gmail.com>
Message-ID:

Thank you for the prompt reply, Carson! Yes, my gff3 was structured as gene models, not match/match_part, so reformatting it may do the trick.
Monica

From Timothy.Stitt at tgac.ac.uk Sat Nov 8 07:58:53 2014
From: Timothy.Stitt at tgac.ac.uk (Timothy Stitt (TGAC))
Date: Sat, 8 Nov 2014 13:58:53 +0000
Subject: [maker-devel] DBD::SQLite::db do failed errors
Message-ID:

Dear Maker Support,

I'm running Maker v2.31.7 and I'm receiving lots of the following warnings/errors during the run. The errors don't seem to prevent the calculation from completing, so I was just wondering how I can avoid getting them?

STATUS: Setting up database for any GFF3 input...
DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 1.
DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 2.
DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 3.
...

Thanks in advance,

Tim.
---
Timothy Stitt PhD / Head of Scientific Computing
The Genome Analysis Centre (TGAC)
http://www.tgac.ac.uk/
p: +44 1603 450378
e: timothy.stitt at tgac.ac.uk

From jimhu at email.tamu.edu Fri Nov 7 12:34:11 2014
From: jimhu at email.tamu.edu (Jim Hu)
Date: Fri, 7 Nov 2014 12:34:11 -0600
Subject: [maker-devel] Speaking of AED...
Message-ID:

I was teaching Yandell and Ence (2012) in the genomics class I co-teach, and was having trouble understanding how the values for AED in Box 4 Figure Bb derive from the evidence set in Figure Ba.

Box 4 says: "AED is calculated in the same manner as SN and SP, but in place of a reference gene model, the coordinates of the union of the aligned evidence (see panel Ba) are used instead". In the union, I expect that a bp that is in an exon in any of the evidence would be considered a TP. If so, then why isn't the nt-level AED for Annotation 1 in Bb zero?

I'm probably missing something trivial.

Thanks

Jim
=====================================
Jim Hu
Professor
Dept. of Biochemistry and Biophysics
2128 TAMU
Texas A&M Univ.
College Station, TX 77843-2128
979-862-4054

From sarasank at umail.iu.edu Sat Nov 8 12:58:30 2014
From: sarasank at umail.iu.edu (Saranya Sankaranarayanan)
Date: Sat, 8 Nov 2014 13:58:30 -0500
Subject: [maker-devel] Fwd: AED plot
In-Reply-To:
References:
Message-ID:

Hi Maker authors,

I am new to using Maker. I have a few basic questions.
I have the maker annotation complete, and I ran

gff3_merge -n -d genome_master_datastore_index.log

to create the gff file. After that, I used the script AED_cdf_generator.pl to obtain the AED plot, but I get the error:

Use of uninitialized value $total in division (/) at ./AED_cdf_generator.pl line 43.
Illegal division by zero at ./AED_cdf_generator.pl line 43.

I passed my gff file as:

AED_cdf_generator.pl -b 0.025 maker.gff

Could anyone please help me with this error? Thank you. It looks like no value is assigned to the variable $total, but I am not able to work out why.

Regards,
Saranya

From carsonhh at gmail.com Sat Nov 8 17:52:26 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Sat, 8 Nov 2014 16:52:26 -0700
Subject: [maker-devel] Speaking of AED...
In-Reply-To:
References:
Message-ID: <443253CC-838D-42A7-8FEB-8BAF442FAE9A@gmail.com>

I think I would agree. Annotation 1 is a perfect match to the evidence. It is ab initio 1 that would have been AED of 0.2, but annotation 1 should have been AED of 0.

--Carson

From dence at genetics.utah.edu Sat Nov 8 17:53:19 2014
From: dence at genetics.utah.edu (Daniel Ence)
Date: Sat, 8 Nov 2014 23:53:19 +0000
Subject: [maker-devel] Speaking of AED...
In-Reply-To:
References:
Message-ID: <5FC7C806-E03F-4DC3-8932-65F6C0E1A7EF@genetics.utah.edu>

Hi Professor Hu,

I'm excited that you're teaching from this review. I hope that you find it useful for your class!

Annotation 1 has an AED of 0.2 and not 0 because the middle exon doesn't line up exactly with the evidence alignments. Since there are bps in the annotation that aren't supported by evidence, it has an AED > 0. It's a little hard to see in the figure, but if you use a straight-edge, you can see it. Feel free to let me know whether that helps clear things up.

Thanks,
Daniel
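The point made in this thread, that a few unsupported bases in one exon are enough to push AED above 0, can be illustrated with a small worked example. This is a sketch of the nucleotide-level calculation as described in the quoted Box 4 text, taking AED = 1 - (SN + SP) / 2 with SN = TP / (TP + FN) and SP = TP / (TP + FP), computed against the union of the evidence. The coordinates are invented for illustration and are not taken from the figure, and MAKER's own implementation may differ in detail.

```python
# Sketch: nucleotide-level AED of an annotation against the union of the evidence.
# Coordinates are 1-based, inclusive (start, end) pairs and are made up.


def covered(intervals):
    """Return the set of base positions covered by a list of (start, end) pairs."""
    bases = set()
    for start, end in intervals:
        bases.update(range(start, end + 1))
    return bases


def aed(annotation_exons, evidence_exons):
    """AED = 1 - (SN + SP) / 2, with the evidence union standing in for a reference."""
    ann = covered(annotation_exons)
    ev = covered(evidence_exons)
    if not ann or not ev:
        return 1.0  # nothing to compare: treated here as no support
    tp = len(ann & ev)       # annotated bases that the evidence supports
    sn = tp / len(ev)        # fraction of evidence bases recovered
    sp = tp / len(ann)       # fraction of annotated bases that are supported
    return 1 - (sn + sp) / 2


if __name__ == "__main__":
    evidence = [(100, 200), (300, 400), (500, 600)]
    exact = [(100, 200), (300, 400), (500, 600)]
    shifted = [(100, 200), (300, 420), (500, 600)]  # middle exon runs 20 bp long
    print("exact match AED:  ", aed(exact, evidence))
    print("shifted exon AED: ", round(aed(shifted, evidence), 4))
```

In the shifted case every evidence base is still recovered (SN = 1), but 20 of the 323 annotated bases are unsupported, so SP drops below 1 and the AED is small but nonzero.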
From bmoore at genetics.utah.edu Sat Nov 8 17:38:23 2014
From: bmoore at genetics.utah.edu (Barry Moore)
Date: Sat, 8 Nov 2014 23:38:23 +0000
Subject: [maker-devel] Speaking of AED...
In-Reply-To:
References:
Message-ID:

The 5'-most junction on the 3' terminal exon (assuming + strand) is discordant in both Annotation 1 & 2 from the evidence in Ba.

[cid:F7723E49-0CF1-4E2C-A8BF-64312129A65F]

B

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PastedGraphic-1.png
Type: image/png
Size: 344800 bytes
Desc: PastedGraphic-1.png

From carsonhh at gmail.com Sat Nov 8 18:13:44 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Sat, 8 Nov 2014 17:13:44 -0700
Subject: [maker-devel] DBD::SQLite::db do failed errors
In-Reply-To:
References:
Message-ID: <3100A718-B063-4BC5-A036-943DEBCC6484@gmail.com>

It's caused by one of the characters in your GFF3 file. For example, characters NOT in the set [a-zA-Z0-9.:^*$@!+_?-|] must be escaped in GFF3, with exceptions outlined in the format spec. You may have either a ? or a ? that must be escaped.

--Carson
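As an illustration of the escaping mentioned above, here is a minimal sketch that percent-encodes the characters the GFF3 specification reserves inside column 9 attribute values (`;`, `=`, `&`, `,`, `%`, and control characters such as tab and newline). The function names are hypothetical, and the sketch does not claim to identify which character triggered the SQLite error in this thread.

```python
import re

# Characters with reserved meaning in GFF3 column 9, plus '%' itself and
# control characters (which include tab, newline and carriage return).
_RESERVED = re.compile(r"[%;=&,\x00-\x1f\x7f]")


def escape_attribute_value(value):
    """Percent-encode reserved characters in a single attribute value."""
    return _RESERVED.sub(lambda m: "%{:02X}".format(ord(m.group())), value)


def format_attributes(pairs):
    """Build a column 9 string from (tag, value) pairs, escaping each value."""
    return ";".join(
        "{}={}".format(tag, escape_attribute_value(value)) for tag, value in pairs
    )


if __name__ == "__main__":
    print(format_attributes([("ID", "gene1"), ("Note", "kinase, putative; ATP-binding")]))
```

Values are escaped one at a time because an unescaped comma is itself meaningful in GFF3: it separates multiple values of the same tag.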
URL: From bmoore at genetics.utah.edu Sat Nov 8 18:07:43 2014 From: bmoore at genetics.utah.edu (Barry Moore) Date: Sun, 9 Nov 2014 00:07:43 +0000 Subject: [maker-devel] Speaking of AED... In-Reply-To: References: Message-ID: <6AB384C1-4F5C-4132-9B3A-23F0DE3A9351@genetics.utah.edu> Hmm, I missed the one Daniel pointed out, and then upon inspection noticed a third discordant exon. This figure needs to go into a book of optical illusions! [cid:87ED4E8E-56C4-4808-A7E3-9F0B4521CADB] On Nov 8, 2014, at 4:38 PM, Barry Moore > wrote: The 5? most junction on the 3? terminal exon (assuming + strand) is discordant in both Annotation 1 & 2 from the evidence in Ba. B On Nov 7, 2014, at 11:34 AM, Jim Hu > wrote: I was teaching Yandell and Ence (2012) in the genomics class I co-teach, and was having trouble understanding the values for AED in Box 4 Figure Bb derive from the evidence set in Figure Ba. Box 4 says: "AAED is caculated in the same manner as SN and SPm but in place of a reference gene model, the coordinates of the union of the aligned evidence (see panel Ba) are used instead". In the union, I expect that a bp that is in an exon in any of the evidence would be considered a TP. If so, then why isn't nt-level AED for Annotation 1 in Bb zero? I'm probably missing something trivial. Thanks Jim ===================================== Jim Hu Professor Dept. of Biochemistry and Biophysics 2128 TAMU Texas A&M Univ. College Station, TX 77843-2128 979-862-4054 _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: PastedGraphic-2.png
Type: image/png
Size: 362479 bytes
Desc: PastedGraphic-2.png
URL:

From michael.s.campbell1 at gmail.com Sun Nov 9 00:01:27 2014
From: michael.s.campbell1 at gmail.com (Michael Campbell)
Date: Sat, 8 Nov 2014 23:01:27 -0700
Subject: [maker-devel] Fwd: AED plot
In-Reply-To:
References:
Message-ID:

Hi Saranya,

If you can send me a copy of your gff3 file, I can look at it and see why you are getting the error. That is a pretty young accessory script, so there may be something in your file that it hasn't seen before.

Thanks,
Mike

On Sat, Nov 8, 2014 at 11:58 AM, Saranya Sankaranarayanan <sarasank at umail.iu.edu> wrote:

> Hi Maker authors,
>
> I am new to using Maker. I have a few basic questions.
>
> I have the maker annotation complete and I ran the
>
> gff3_merge -n -d genome_master_datastore_index.log - to create the gff file
>
> After that, I used the script AED_cdf_generator.pl to obtain the AED plot, while I get the error:
>
>
> Use of uninitialized value $total in division (/) at ./AED_cdf_generator.pl line 43.
> Illegal division by zero at ./AED_cdf_generator.pl line 43.
>
> I parsed my gff file as:
> AED_cdf_generator.pl -b 0.025 maker.gff
>
> Could anyone please help me with this error? Thank you. It looks like no value is parsed to the variable total, but I am not able to decipher why.
>
> Regards,
> Saranya
>
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
>

--
Michael Campbell MS, RD.
Doctoral Candidate
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
ph:585-3543
-------------- next part --------------
An HTML attachment was scrubbed...
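
The "Illegal division by zero" reported in this thread is what a cumulative-fraction calculation produces when no AED values are collected at all. The following is a rough Python sketch of that calculation, not the AED_cdf_generator.pl script itself: the function names are hypothetical, it assumes AED values are stored as an _AED=<value> attribute on mRNA features in the MAKER GFF3, and it raises a readable error instead of dividing by zero. The thread does not say what actually caused the empty total in Saranya's file.

```python
import re

# Assumption: MAKER writes the score as "_AED=<number>" in column 9 of mRNA rows.
AED_RE = re.compile(r"(?:^|;)_AED=([0-9.]+)")


def aed_values(gff3_lines):
    """Collect the _AED value of every mRNA feature in an iterable of GFF3 lines."""
    values = []
    for line in gff3_lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) == 9 and cols[2] == "mRNA":
            match = AED_RE.search(cols[8])
            if match:
                values.append(float(match.group(1)))
    return values


def aed_cdf(values, bin_size=0.025):
    """Return (cutoff, fraction of transcripts with AED <= cutoff) pairs."""
    total = len(values)
    if total == 0:
        # This is the situation that surfaces as a division by zero.
        raise ValueError("no mRNA features with an _AED attribute were found")
    n_bins = round(1 / bin_size)
    cdf = []
    for i in range(n_bins + 1):
        cutoff = i * bin_size
        # Small tolerance so values sitting exactly on a bin edge are counted.
        count = sum(v <= cutoff + 1e-9 for v in values)
        cdf.append((round(cutoff, 3), count / total))
    return cdf
```

A typical use would be aed_cdf(aed_values(open("maker.gff"))), where maker.gff is the merged file from the message above.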
From muriel.grosb at gmail.com Mon Nov 10 04:35:30 2014
From: muriel.grosb at gmail.com (Muriel Gros-Balthazard)
Date: Mon, 10 Nov 2014 11:35:30 +0100
Subject: [maker-devel] running Maker but skipping first steps
Message-ID: <546094F2.6000100@gmail.com>

Hello,

I want to run Maker but I would like to skip the first steps:

STATUS: Parsing control files...
STATUS: Processing and indexing input FASTA files...
STATUS: Setting up database for any GFF3 input...
A data structure will be created for you at:
/Data/Genomics/GeneAnnotation/Maker_pipeline/5_Run_Maker/Pdac_ref2013s.maker.output/Pdac_ref2013s_datastore

To access files for individual sequences use the datastore index:
/Data/Genomics/GeneAnnotation/Maker_pipeline/5_Run_Maker/Pdac_ref2013s.maker.output/Pdac_ref2013s_master_datastore_index.log

Indeed, there was an error in RepeatMasking (and I reinstalled RepeatMasker), but I believe that the previous steps are always the same.
Is there a way to run Maker so that it doesn't run these first steps again, given that the control files didn't change, the fasta files are already indexed and the database of gff3 is set up?

Thank you!

Muriel

From FeatherstonJ at arc.agric.za Mon Nov 10 07:42:15 2014
From: FeatherstonJ at arc.agric.za (Jonathan Featherston)
Date: Mon, 10 Nov 2014 13:42:15 +0000
Subject: [maker-devel] Maker
Message-ID: <57CFF349-6C9F-4172-ADB3-A9572E21D4A3@arc.agric.za>

Dear Carson

I've been trying to train SNAP with Maker but I'm getting empty genome.ann and .dna files. I have tried running the maker2zff on the implant page to see if my script was corrupt. No help from that. I've seen a few pages in the group and on seqanswers about the empty ZFF files, but most seem to have been resolved by including all outputs (maker2zff -n), and even this doesn't generate anything for me. So I'm guessing the problem is somewhere with the maker outputs.

I did get errors from the maker run but they seem to be about MPI and ALRM (a Perl error; what a pain getting Perl libs on a Mac).

Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184.
Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184.
Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184.
Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184.
Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184.
--------------------------------------------------------------------------
mpiexec has exited due to process rank 4 with PID 5935 on
node Administrators-MacBook-Pro-9 exiting improperly. There are three reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

3. this process called "MPI_Abort" or "orte_abort" and the mca parameter
orte_create_session_dirs is set to false. In this case, the run-time cannot
detect that the abort call was an abnormal termination. Hence, the only
error message you will receive is this one.

This may have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).

You can avoid this message by specifying -quiet on the mpiexec command line.

Maker did finish and the gff file produced (I can't produce a fasta file from the est2genome=1 option??) seems ok. It has produced protein-matches and match_part, although I don't see maker product.

Otherwise I ran maker with MPI using the command from the CPBI maker paper. I used -nohup mpiexec -n 8 maker < /dev/null & for my maker execution.

I'm using altest and protein homology for now.

Thank you very much for what help you can provide. I really enjoyed the workshop you and Mark presented!

Kind Regards
Jonathan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Timothy.Stitt at tgac.ac.uk Mon Nov 10 07:59:59 2014
From: Timothy.Stitt at tgac.ac.uk (Timothy Stitt (TGAC))
Date: Mon, 10 Nov 2014 13:59:59 +0000
Subject: [maker-devel] DBD::SQLite::db do failed errors
In-Reply-To: <3100A718-B063-4BC5-A036-943DEBCC6484@gmail.com>
References: <3100A718-B063-4BC5-A036-943DEBCC6484@gmail.com>
Message-ID:

Thanks Carson.

I checked the *.gff files for ' and " symbols. I only observed a bunch of " in one of the files as follows:

scaffold16677	exonerate:protein2genome:local	gene	128238	128710	339	-	.	gene_id 0 ; sequence Lus10000040|PACid:23139618 ; gene_orientation +
scaffold16677	exonerate:protein2genome:local	cds	128645	128710	.	-	.
scaffold16677	exonerate:protein2genome:local	exon	128645	128710	.	-	.	insertions 0 ; deletions 0
scaffold16677	exonerate:protein2genome:local	splice5	128643	128644	.	-	.	intron_id 1 ; splice_site "GT"
scaffold16677	exonerate:protein2genome:local	intron	128552	128644	.	-	.	intron_id 1
scaffold16677	exonerate:protein2genome:local	splice3	128552	128553	.	-	.	intron_id 0 ; splice_site "AG"
scaffold16677	exonerate:protein2genome:local	cds	128442	128551	.	-	.
scaffold16677	exonerate:protein2genome:local	exon	128442	128551	.	-	.	insertions 0 ; deletions 0
scaffold16677	exonerate:protein2genome:local	splice5	128440	128441	.	-	.	intron_id 2 ; splice_site "GT"
scaffold16677	exonerate:protein2genome:local	intron	128362	128441	.	-	.	intron_id 2
scaffold16677	exonerate:protein2genome:local	splice3	128362	128363	.	-	.	intron_id 1 ; splice_site "AG"

Would these "GT", "AG" etc. strings cause the problem? If so, how should I change them to be correct?

Thanks,

Tim.
---
Timothy Stitt PhD / Head of Scientific Computing
The Genome Analysis Centre (TGAC)
http://www.tgac.ac.uk/

p: +44 1603 450378
e: timothy.stitt at tgac.ac.uk

From: Carson Holt
Date: Sunday, 9 November 2014 00:13
To: Timothy Stitt
Cc: "maker-devel at yandell-lab.org"
Subject: Re: [maker-devel] DBD::SQLite::db do failed errors

It's caused by one of the characters in your GFF3 file. For example, characters NOT in the set [a-zA-Z0-9.:^*$@!+_?-|] must be escaped in GFF3, with exceptions outlined in the format spec. You may have either a ' or a " that must be escaped.

--Carson

On Nov 8, 2014, at 6:58 AM, Timothy Stitt (TGAC) wrote:

Dear Maker Support,

I'm running Maker v2.31.7 and I'm receiving lots of the following warnings/errors during the run. The errors don't seem to prevent the calculation from completing so I was just wondering how I can avoid getting them?

STATUS: Setting up database for any GFF3 input...
DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 1.
DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 2.
DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 3.
...

Thanks in advance,

Tim.
---
Timothy Stitt PhD / Head of Scientific Computing
The Genome Analysis Centre (TGAC)
http://www.tgac.ac.uk/

p: +44 1603 450378
e: timothy.stitt at tgac.ac.uk

_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From myandell at genetics.utah.edu Sat Nov 8 18:32:12 2014
From: myandell at genetics.utah.edu (Mark Yandell)
Date: Sun, 9 Nov 2014 00:32:12 +0000
Subject: [maker-devel] Speaking of AED...
In-Reply-To: <6AB384C1-4F5C-4132-9B3A-23F0DE3A9351@genetics.utah.edu>
References: , <6AB384C1-4F5C-4132-9B3A-23F0DE3A9351@genetics.utah.edu>
Message-ID: <7A60AB257EFF2B48B1F4C814817EA053E3664681@mxb1.hg.genetics.utah.edu>

And you are still missing one: the 3-prime end of the middle exon is also discordant.

I agree, though, that somehow the color makes it hard to see. Sorry.

--mark

Mark Yandell
Professor of Human Genetics
H.A. & Edna Benning Presidential Endowed Chair
Co-director USTAR Center for Genetic Discovery
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
ph:801-587-7707

________________________________________
From: maker-devel [maker-devel-bounces at yandell-lab.org] on behalf of Barry Moore [bmoore at genetics.utah.edu]
Sent: Saturday, November 08, 2014 5:07 PM
To: Jim Hu; maker-devel at yandell-lab.org
Cc: Barry Moore
Subject: Re: [maker-devel] Speaking of AED...

Hmm, I missed the one Daniel pointed out, and then upon inspection noticed a third discordant exon. This figure needs to go into a book of optical illusions!

[cid:87ED4E8E-56C4-4808-A7E3-9F0B4521CADB]

On Nov 8, 2014, at 4:38 PM, Barry Moore wrote:

The 5'-most junction on the 3' terminal exon (assuming + strand) is discordant in both Annotation 1 & 2 from the evidence in Ba.
B On Nov 7, 2014, at 11:34 AM, Jim Hu > wrote: I was teaching Yandell and Ence (2012) in the genomics class I co-teach, and was having trouble understanding the values for AED in Box 4 Figure Bb derive from the evidence set in Figure Ba. Box 4 says: "AAED is caculated in the same manner as SN and SPm but in place of a reference gene model, the coordinates of the union of the aligned evidence (see panel Ba) are used instead". In the union, I expect that a bp that is in an exon in any of the evidence would be considered a TP. If so, then why isn't nt-level AED for Annotation 1 in Bb zero? I'm probably missing something trivial. Thanks Jim ===================================== Jim Hu Professor Dept. of Biochemistry and Biophysics 2128 TAMU Texas A&M Univ. College Station, TX 77843-2128 979-862-4054 _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- A non-text attachment was scrubbed... Name: PastedGraphic-2.png Type: image/png Size: 362479 bytes Desc: PastedGraphic-2.png URL: From sarasank at umail.iu.edu Sun Nov 9 10:33:08 2014 From: sarasank at umail.iu.edu (Saranya Sankaranarayanan) Date: Sun, 9 Nov 2014 11:33:08 -0500 Subject: [maker-devel] Fwd: AED plot In-Reply-To: References: Message-ID: Hi Mike, Please find the gff3 file attached with this email. Thanks a lot for the very prompt response. Sincerely, Saranya Sankaranarayanan Master's Student, SoIC Indiana University On Sun, Nov 9, 2014 at 1:01 AM, Michael Campbell < michael.s.campbell1 at gmail.com> wrote: > Hi Sranya, > > If you can send me a copy of your gff3 file I can look at it and see why > you are getting the error. 
That is a pretty young accessory script so there > may be something in your file that it has't seen before. > > Thanks, > Mike > > On Sat, Nov 8, 2014 at 11:58 AM, Saranya Sankaranarayanan < > sarasank at umail.iu.edu> wrote: > >> Hi Maker authors, >> >> I am new to using Maker. I have a few basic questions. >> >> I have the maker annotation complete and I ran the >> >> gff3_merge -n -d genome_master_datastore_index.log - to create the gff >> file >> >> After that, I used the script AED_cdf_generator.pl to obtain the AED >> plot, while I get the error: >> >> >> Use of uninitialized value $total in division (/) at >> ./AED_cdf_generator.pl line 43. >> Illegal division by zero at ./AED_cdf_generator.pl line 43. >> >> I parsed my gff file as: >> AED_cdf_generator.pl -b 0.025 maker.gff >> >> Could anyone please help me with this error? Thank you. It looks like no >> value is parsed to the variable total, but I am not able to decipher why. >> >> Regards, >> Saranya >> >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >> > > > -- > Michael Campbell MS, RD. > Doctoral Candidate > Eccles Institute of Human Genetics > University of Utah > 15 North 2030 East, Room 2100 > Salt Lake City, UT 84112-5330 > ph:585-3543 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Gff3.zip Type: application/zip Size: 2098998 bytes Desc: not available URL: From carsonhh at gmail.com Mon Nov 10 09:16:28 2014 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 10 Nov 2014 08:16:28 -0700 Subject: [maker-devel] DBD::SQLite::db do failed errors In-Reply-To: References: <3100A718-B063-4BC5-A036-943DEBCC6484@gmail.com> Message-ID: Actually that is not a GFF3 file. It appears to be GTF which is structured different from GFF3. 
You would need to convert to GFF3. You can try the Sequence Ontology converter here --> http://www.sequenceontology.org/cgi-bin/converter.cgi

Unfortunately it will likely not be a painless process, as GTF files vary so much between sources that one GTF file might not actually be compatible with another GTF file, so you may have to spend some time editing the file for the converter to work.

--Carson

> On Nov 10, 2014, at 6:59 AM, Timothy Stitt (TGAC) wrote:
>
> Thanks Carson.
>
> I checked the *.gff files for ' and " symbols. I only observed a bunch of " in one of the files as follows:
>
> scaffold16677	exonerate:protein2genome:local	gene	128238	128710	339	-	.	gene_id 0 ; sequence Lus10000040|PACid:23139618 ; gene_orientation +
> scaffold16677	exonerate:protein2genome:local	cds	128645	128710	.	-	.
> scaffold16677	exonerate:protein2genome:local	exon	128645	128710	.	-	.	insertions 0 ; deletions 0
> scaffold16677	exonerate:protein2genome:local	splice5	128643	128644	.	-	.	intron_id 1 ; splice_site "GT"
> scaffold16677	exonerate:protein2genome:local	intron	128552	128644	.	-	.	intron_id 1
> scaffold16677	exonerate:protein2genome:local	splice3	128552	128553	.	-	.	intron_id 0 ; splice_site "AG"
> scaffold16677	exonerate:protein2genome:local	cds	128442	128551	.	-	.
> scaffold16677	exonerate:protein2genome:local	exon	128442	128551	.	-	.	insertions 0 ; deletions 0
> scaffold16677	exonerate:protein2genome:local	splice5	128440	128441	.	-	.	intron_id 2 ; splice_site "GT"
> scaffold16677	exonerate:protein2genome:local	intron	128362	128441	.	-	.	intron_id 2
> scaffold16677	exonerate:protein2genome:local	splice3	128362	128363	.	-	.	intron_id 1 ; splice_site "AG"
>
> Would these "GT", "AG" etc. strings cause the problem? If so, how should I change them to be correct?
>
> Thanks,
>
> Tim.
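
The distinction Carson draws here (GTF-style attributes versus GFF3) can be checked roughly before handing a file to MAKER. The snippet below is a minimal Python sketch under stated assumptions, not a validator: the function name attribute_style is hypothetical, and it only inspects column 9, treating key=value pairs as GFF3 and space-separated key/value pairs such as splice_site "GT" as GTF-like.

```python
def attribute_style(line):
    """Guess whether column 9 of one annotation line looks like GFF3 or GTF.

    Returns "gff3", "gtf-like" or "unknown". This is a heuristic only.
    """
    if line.startswith("#"):
        return "unknown"
    columns = line.rstrip("\n").split("\t")
    if len(columns) != 9:
        return "unknown"
    fields = [f.strip() for f in columns[8].split(";") if f.strip()]
    if not fields or fields == ["."]:
        return "unknown"
    # GFF3: every field is tag=value.
    if all("=" in f for f in fields):
        return "gff3"
    # GTF-like: every field is a tag followed by whitespace and a value.
    if all("=" not in f and len(f.split(None, 1)) == 2 for f in fields):
        return "gtf-like"
    return "unknown"


if __name__ == "__main__":
    row = "\t".join([
        "scaffold16677", "exonerate:protein2genome:local", "splice5",
        "128643", "128644", ".", "-", ".", 'intron_id 1 ; splice_site "GT"',
    ])
    print(attribute_style(row))
```

Rows with an empty attribute column, like the cds rows quoted in this thread, come back as "unknown", so a file is best judged on the rows that do carry attributes.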
From carsonhh at gmail.com Mon Nov 10 09:23:12 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 10 Nov 2014 08:23:12 -0700
Subject: [maker-devel] running Maker but skipping first steps
In-Reply-To: <546094F2.6000100@gmail.com>
References: <546094F2.6000100@gmail.com>
Message-ID:

These are just status messages. The steps don't actually rerun, except for the control file parsing. That obviously has to happen every time for MAKER to know the control files are still the same between runs.

For both of these messages -->
STATUS: Processing and indexing input FASTA files...
STATUS: Setting up database for any GFF3 input...

MAKER sees that the indexes already exist, validates their integrity, and then moves on. So there is no rerunning of steps.

--Carson

> On Nov 10, 2014, at 3:35 AM, Muriel Gros-Balthazard wrote:
>
> Hello,
>
> I want to run Maker but I would like to skip the first steps :
> STATUS: Parsing control files...
> STATUS: Processing and indexing input FASTA files...
> STATUS: Setting up database for any GFF3 input...
> A data structure will be created for you at:
> /Data/Genomics/GeneAnnotation/Maker_pipeline/5_Run_Maker/Pdac_ref2013s.maker.output/Pdac_ref2013s_datastore
>
> To access files for individual sequences use the datastore index:
> /Data/Genomics/GeneAnnotation/Maker_pipeline/5_Run_Maker/Pdac_ref2013s.maker.output/Pdac_ref2013s_master_datastore_index.log
>
> Indeed, there was an error in RepeatMasking (and I reinstalled RepeatMasker) but I believe that the previous steps are always the same.
> Is there a way to run Maker so that it doesn't run this first steps again given that the control files didn't change, the fasta files are already indexed and the database of gff3 is set up ?
>
> Thank you !
>
> Muriel
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From mike.thon at gmail.com Mon Nov 10 09:54:00 2014
From: mike.thon at gmail.com (Michael Thon)
Date: Mon, 10 Nov 2014 16:54:00 +0100
Subject: [maker-devel] map2assembly
Message-ID:

Hi -
We're using map2assembly to map genes from other gene annotation pipelines onto the genome sequence in order to compare AED values to a maker de novo annotation. We found a few transcripts that map2assembly maps to multiple loci in the genome. Is there any way to know if these are all equally good alignments, or are they all above some preset threshold? I'm trying to decide what to do with the multiple mappings - whether we should discard all but one (in that case we'd need to decide which one) or whether we should keep them all. Keeping them all makes the most sense, but the problem is they all have the same id. Should map2assembly append a number to the id when a transcript maps to multiple locations in the genome?

From carsonhh at gmail.com Mon Nov 10 10:08:26 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 10 Nov 2014 09:08:26 -0700
Subject: [maker-devel] map2assembly
In-Reply-To:
References:
Message-ID: <2D1BD5DD-7405-448F-B68D-80C8FEEDC6B3@gmail.com>

Try using the transcript score (column 6). It should indicate the % recovery. A score of 100 means a perfect match to the input transcript. The value is %identity multiplied by %coverage, so it will decrease because of a lack of identity or a lack of end-to-end alignment.

--Carson

> On Nov 10, 2014, at 8:54 AM, Michael Thon wrote:
>
> Hi -
> We're using map2assembly to map genes from other gene annotation pipelines onto the genome sequence in order to compare AED values to a maker de novo annotation. We found a few transcripts that map2assembly maps to multiple loci in the genome.
Is there any way to know if these are all equally good alignments, or are they all above some preset threshold? I'm trying to decide what to do with the multiple mappings - whether we should discard all but one (in that case we'd need to decide which one) or whether we should keep them all. Keeping them all makes the most sense, but the problem is they all have the same id. Should map2assembly append a number to the id when a transcript maps to multiple locations in the genome?
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From carsonhh at gmail.com Mon Nov 10 10:12:53 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 10 Nov 2014 09:12:53 -0700
Subject: [maker-devel] map2assembly
In-Reply-To: <2D1BD5DD-7405-448F-B68D-80C8FEEDC6B3@gmail.com>
References: <2D1BD5DD-7405-448F-B68D-80C8FEEDC6B3@gmail.com>
Message-ID: <681F266A-0549-4165-9261-FDD9F268D674@gmail.com>

You can also use the -l option when running gff3_merge to correct for unique IDs when merging multiple GFF3 files (i.e. IDs will be unique within a file, but may not be unique across files; when mapping transcripts, the IDs are copied directly from the aligned transcript).

--Carson

> On Nov 10, 2014, at 9:08 AM, Carson Holt wrote:
>
> Try using the transcript score (column 6). It should indicate the % recovery. A 100 means perfect match to the input transcript. The value is %identity multiplied by %coverage, so it will decrease because of a lack of identity or a lack of end-to-end alignment.
>
> --Carson
>
>
>
>
>
>> On Nov 10, 2014, at 8:54 AM, Michael Thon wrote:
>>
>> Hi -
>> We're using map2assembly to map genes from other gene annotation pipelines onto the genome sequence in order to compare AED values to a maker de novo annotation. We found a few transcripts that map2assembly maps to multiple loci in the genome.
Is there any way to know if these are all equally good alignments, or are they all above some preset threshold? I'm trying to decide what to do with the multiple mappings - whether we should discard all but one (in that case we'd need to decide which one) or whether we should keep them all. Keeping them all makes the most sense, but the problem is they all have the same id. Should map2assembly append a number to the id when a transcript maps to multiple locations in the genome?
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>

From carsonhh at gmail.com Mon Nov 10 10:34:33 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 10 Nov 2014 09:34:33 -0700
Subject: [maker-devel] Maker
In-Reply-To: <57CFF349-6C9F-4172-ADB3-A9572E21D4A3@arc.agric.za>
References: <57CFF349-6C9F-4172-ADB3-A9572E21D4A3@arc.agric.za>
Message-ID: <1354797B-F783-4671-BB94-42A0E1611B03@gmail.com>

You probably have an error further upstream. The 'Argument "ALRM" isn't numeric' error is just something you get as things are dying in a non-elegant way, but the cause will be further up the error log.

The lack of fasta files means that you have no final gene models. Either your contigs are too short to produce a model, or your evidence alignments are insufficient in end-to-end coverage, splice site recovery on polishing, or %identity, so MAKER cannot elucidate a usable model from alignment alone. What is your longest contig? Also try running CEGMA from the Korf lab to help identify whether the assembly is incomplete and by how much.

--Carson

> On Nov 10, 2014, at 6:42 AM, Jonathan Featherston wrote:
>
> Dear Carson
>
> I've been trying to train SNAP with Maker but I'm getting empty genome.ann and .dna files. I have tried running the maker2zff on the implant page to see if my script was corrupt. No help from that.
I've seen a few pages in the group and on seqanswers about the empty off files but most seem to have been resolved by including all outputs (maker2zff -n) and even this doesn't generate anything for me?. So I'm guessing the problem is somewhere with the maker outputs. > > I did get errors from the maker run but they seem to be about mli and ALRM (a perl error- what a pain getting perl libs on a mac). > > > Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. > Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. > Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. > Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. > Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. > -------------------------------------------------------------------------- > mpiexec has exited due to process rank 4 with PID 5935 on > node Administrators-MacBook-Pro-9 exiting improperly. There are three reasons this could occur: > > 1. this process did not call "init" before exiting, but others in > the job did. This can cause a job to hang indefinitely while it waits > for all processes to call "init". By rule, if one process calls "init", > then ALL processes must call "init" prior to termination. > > 2. this process called "init", but exited without calling "finalize". > By rule, all processes that call "init" MUST call "finalize" prior to > exiting or it will be considered an "abnormal termination" > > 3. this process called "MPI_Abort" or "orte_abort" and the mca parameter > orte_create_session_dirs is set to false. 
In this case, the run-time cannot
> detect that the abort call was an abnormal termination. Hence, the only
> error message you will receive is this one.
>
> This may have caused other processes in the application to be
> terminated by signals sent by mpiexec (as reported here).
>
> You can avoid this message by specifying -quiet on the mpiexec command line.
>
> Maker did finish and the gff file produced (I can't produce a fasta file from the est2genome=1 option??) seems ok. It has produced protein-matches and match_part although I don't see maker product.
>
> Otherwise I ran maker with MPI using the command from the CPBI maker paper. I used -nohup mpiexec -n 8 maker < /dev/null & for my maker execution.
>
> I'm using altest and protein homology for now.
>
> Thank you very much for what help you can provide. I really enjoyed the workshop you and Mark presented!
>
> Kind Regards
> Jonathan
>
>
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From michael.s.campbell1 at gmail.com Mon Nov 10 14:54:14 2014
From: michael.s.campbell1 at gmail.com (Michael Campbell)
Date: Mon, 10 Nov 2014 13:54:14 -0700
Subject: [maker-devel] Fwd: AED plot
In-Reply-To:
References:
Message-ID:

Hi Saranya,

I fixed the AED_cdf_generator.pl script and added it to the svn repository for MAKER, so it will be available in the next MAKER release. If you are using the svn repository, you can do an svn update and get the new version of the script in the MAKER bin. If not, I've attached a copy of the script to this email (I removed the .pl extension from the file since some email servers will block .pl files). Let me know if you have any more problems with it.
Thanks, Mike On Sun, Nov 9, 2014 at 9:33 AM, Saranya Sankaranarayanan < sarasank at umail.iu.edu> wrote: > Hi Mike, > > Please find the gff3 file attached with this email. Thanks a lot for the > very prompt response. > > Sincerely, > Saranya Sankaranarayanan > Master's Student, SoIC > Indiana University > > On Sun, Nov 9, 2014 at 1:01 AM, Michael Campbell < > michael.s.campbell1 at gmail.com> wrote: > >> Hi Sranya, >> >> If you can send me a copy of your gff3 file I can look at it and see why >> you are getting the error. That is a pretty young accessory script so there >> may be something in your file that it has't seen before. >> >> Thanks, >> Mike >> >> On Sat, Nov 8, 2014 at 11:58 AM, Saranya Sankaranarayanan < >> sarasank at umail.iu.edu> wrote: >> >>> Hi Maker authors, >>> >>> I am new to using Maker. I have a few basic questions. >>> >>> I have the maker annotation complete and I ran the >>> >>> gff3_merge -n -d genome_master_datastore_index.log - to create the gff >>> file >>> >>> After that, I used the script AED_cdf_generator.pl to obtain the AED >>> plot, while I get the error: >>> >>> >>> Use of uninitialized value $total in division (/) at >>> ./AED_cdf_generator.pl line 43. >>> Illegal division by zero at ./AED_cdf_generator.pl line 43. >>> >>> I parsed my gff file as: >>> AED_cdf_generator.pl -b 0.025 maker.gff >>> >>> Could anyone please help me with this error? Thank you. It looks like no >>> value is parsed to the variable total, but I am not able to decipher why. >>> >>> Regards, >>> Saranya >>> >>> >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >>> >> >> >> -- >> Michael Campbell MS, RD. >> Doctoral Candidate >> Eccles Institute of Human Genetics >> University of Utah >> 15 North 2030 East, Room 2100 >> Salt Lake City, UT 84112-5330 >> ph:585-3543 >> >> > -- Michael Campbell MS, RD. 
Doctoral Candidate Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:585-3543 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: AED_cdf_generator Type: application/octet-stream Size: 2980 bytes Desc: not available URL: From sarasank at umail.iu.edu Mon Nov 10 15:06:58 2014 From: sarasank at umail.iu.edu (Saranya Sankaranarayanan) Date: Mon, 10 Nov 2014 16:06:58 -0500 Subject: [maker-devel] Fwd: AED plot In-Reply-To: References: Message-ID: Great! It works now. Thanks a lot for the support! Sincerely, Saranya Sankaranarayanan Master's Student, SoIC Indiana University On Mon, Nov 10, 2014 at 3:54 PM, Michael Campbell < michael.s.campbell1 at gmail.com> wrote: > Hi Saranya, > > I fixed the AED_cdf_generator.pl scrip and added it to the svn repository > for MAKER so it will be available in the next MAKE release. If you are > using the svn repository you can do an svn update and get the new version > of the script in the MAKER bin. If not I've attached a copy of the script > to this email (I removed the .pl extension to the file since some email > servers will block .pl files). let me know if you have any more problems > with it. > > Thanks, > Mike > > On Sun, Nov 9, 2014 at 9:33 AM, Saranya Sankaranarayanan < > sarasank at umail.iu.edu> wrote: > >> Hi Mike, >> >> Please find the gff3 file attached with this email. Thanks a lot for the >> very prompt response. >> >> Sincerely, >> Saranya Sankaranarayanan >> Master's Student, SoIC >> Indiana University >> >> On Sun, Nov 9, 2014 at 1:01 AM, Michael Campbell < >> michael.s.campbell1 at gmail.com> wrote: >> >>> Hi Sranya, >>> >>> If you can send me a copy of your gff3 file I can look at it and see why >>> you are getting the error. That is a pretty young accessory script so there >>> may be something in your file that it has't seen before. 
>>> >>> Thanks, >>> Mike >>> >>> On Sat, Nov 8, 2014 at 11:58 AM, Saranya Sankaranarayanan < >>> sarasank at umail.iu.edu> wrote: >>> >>>> Hi Maker authors, >>>> >>>> I am new to using Maker. I have a few basic questions. >>>> >>>> I have the maker annotation complete and I ran the >>>> >>>> gff3_merge -n -d genome_master_datastore_index.log - to create the >>>> gff file >>>> >>>> After that, I used the script AED_cdf_generator.pl to obtain the AED >>>> plot, while I get the error: >>>> >>>> >>>> Use of uninitialized value $total in division (/) at >>>> ./AED_cdf_generator.pl line 43. >>>> Illegal division by zero at ./AED_cdf_generator.pl line 43. >>>> >>>> I parsed my gff file as: >>>> AED_cdf_generator.pl -b 0.025 maker.gff >>>> >>>> Could anyone please help me with this error? Thank you. It looks like >>>> no value is parsed to the variable total, but I am not able to decipher why. >>>> >>>> Regards, >>>> Saranya >>>> >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>>> >>>> >>> >>> >>> -- >>> Michael Campbell MS, RD. >>> Doctoral Candidate >>> Eccles Institute of Human Genetics >>> University of Utah >>> 15 North 2030 East, Room 2100 >>> Salt Lake City, UT 84112-5330 >>> ph:585-3543 >>> >>> >> > > > -- > Michael Campbell MS, RD. > Doctoral Candidate > Eccles Institute of Human Genetics > University of Utah > 15 North 2030 East, Room 2100 > Salt Lake City, UT 84112-5330 > ph:585-3543 > > -------------- next part -------------- An HTML attachment was scrubbed... 
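The "Illegal division by zero" reported in this thread appears to arise when no AED values are counted, so the total used as the divisor is empty. As a minimal sketch of the same kind of calculation (this is not the AED_cdf_generator.pl script, and the actual fix is not shown in the thread), the Python below assumes that MAKER stores each transcript's score as an _AED attribute on mRNA features in the GFF3. The names read_aeds and aed_cdf are hypothetical.

```python
import re
from typing import Iterable, List, Tuple

# "_AED=<number>" inside the GFF3 attributes column (column 9).
# Assumption: MAKER writes the score this way on mRNA features.
AED_ATTRIBUTE = re.compile(r"(?:^|;)_AED=([0-9]*\.?[0-9]+)")


def read_aeds(lines: Iterable[str]) -> List[float]:
    """Collect the _AED value of every mRNA feature in GFF3 text."""
    aeds = []
    for line in lines:
        if line.startswith("##FASTA"):
            break  # sequence section follows; no more features
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "mRNA":
            continue  # only tab-delimited mRNA rows carry the score
        match = AED_ATTRIBUTE.search(cols[8])
        if match:
            aeds.append(float(match.group(1)))
    return aeds


def aed_cdf(aeds: List[float], bin_size: float = 0.025) -> List[Tuple[float, float]]:
    """Return (cutoff, fraction of transcripts with AED <= cutoff) pairs."""
    if not aeds:
        # Guard against the zero divisor instead of failing in the division.
        raise ValueError("no mRNA features with an _AED attribute were found")
    total = len(aeds)
    n_bins = round(1 / bin_size)
    cdf = []
    for i in range(n_bins + 1):
        cutoff = round(i * bin_size, 6)
        fraction = sum(a <= cutoff for a in aeds) / total
        cdf.append((cutoff, fraction))
    return cdf
```

Raising an explicit error when nothing is parsed makes an unexpected GFF3 layout visible to the user, rather than surfacing later as a division error.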
URL:
From goutham.atla at gmail.com Thu Nov 13 00:22:46 2014
From: goutham.atla at gmail.com (Goutham atla)
Date: Thu, 13 Nov 2014 11:52:46 +0530
Subject: [maker-devel] URGENT: Re: maker failure with example data
In-Reply-To:
References: <3CCDC24F-756A-492C-8E7F-B1B97616EE45@genetics.utah.edu>
Message-ID:

Dear Carson,

MAKER throws an error if I provide an rmlib file for repeat masking. It says:

At this time the hmmer search engine can only be used with the Dfam database. Please rerun your search without the -lib option or switch to a different search engine.

We ran it without rmlib and it completed successfully. We got GFF, proteins and transcripts.fasta files.

We are working on Oryza sativa (subspecies indica), but we have Oryza sativa (subspecies japonica), which is fully annotated. I would like to know what would be the best way to do a functional annotation of the GFF file given by MAKER.

Regards,
Goutham

On Fri, Nov 7, 2014 at 11:09 AM, Goutham atla wrote:

> Dear Carson,
>
> Thanks for the quick reply. It worked after providing the assembled
> transcripts and protein fasta from closely related species.
>
>
> Regards,
> Goutham
>
> On Thu, Nov 6, 2014 at 12:34 PM, Carson Holt <
> carson.holt at genetics.utah.edu> wrote:
>
>> The final transcript and proteins fasta files will only exists if there
>> were gene models with evidence support. If you did not provide an HMM for
>> one of the ab initio gene predictors (SNAP, Augustus, etc.) then there will
>> be no gene models, and if you do not provide protein or est evidence
>> fastas, then there will be no evidence support. Also if your contigs are
>> too short to contain gene models then there will be no models.
>>
>> Thanks,
>> Carson
>>
>>
>>
>> On Nov 5, 2014, at 11:49 PM, Goutham atla
>> wrote:
>>
>> Dear All,
>>
>> I have finished running maker. But I realised that there are no
>> *transcripts.fasta and *protein.fasta files in any of the directories that
>> make has created.
It has only gtf files. >> >> Example output of a test run: I have similar results on original file >> also: >> >> [User at motif jcf7180001838744]$ pwd >> >> /home/User/Maker_Annotation/Maker_test.maker.output/Maker_test_datastore/35/C1/jcf7180001838744 >> [User at motif jcf7180001838744]$ ls >> jcf7180001838744.gff run.log theVoid.jcf7180001838744 >> >> Any help from you in figuring out why there are no protein.fasta >> and transcripts.fast would be very helpful. >> >> Regards, >> Goutham >> >> On Wed, Oct 1, 2014 at 11:28 AM, Goutham atla >> wrote: >> >>> Dear All, >>> >>> Thank you. I figured out th problem is with mpich2. I was behind >>> mpich2 but was unsuccessful. I installed mpich v3 and its working fine now. >>> Thank you all. The old GMDO tutorials are bit misleading as the new >>> versions have come up. >>> >>> On Wed, Oct 1, 2014 at 11:09 AM, Marc H?ppner >> > wrote: >>> >>>> Another possibility could be that MPICH2 wasn?t build properly, no? I >>>> remember something with enabling shared libraries during the compilation of >>>> mpich, without which the error below would appear. >>>> >>>> /Marc >>>> >>>> Marc P. Hoeppner, PhD >>>> Team Leader >>>> BILS Genome Annotation Platform >>>> Department for Medical Biochemistry and Microbiology >>>> Uppsala University, Sweden >>>> marc.hoeppner at imbim.uu.se >>>> >>>> >>>> >>>> On 30 Sep 2014, at 21:33, Carson Holt >>>> wrote: >>>> >>>> The message is warning that there are multiple instances of MAKER >>>> running, but no MPI communication. When you build MAKER (perl Build.PL step >>>> when installing MAKER), you need to specify the location of 'mpicc' and >>>> 'mpi.h' to build with MPI support. Otherwise you won't be able to link >>>> against MPICH2 shared libraries. You probably need to rerun that step. 
>>>> >>>> --Carson >>>> >>>> >>>> From: Goutham atla >>>> Date: Tuesday, September 30, 2014 at 10:49 AM >>>> To: Carson Holt >>>> Cc: "maker-devel at yandell-lab.org" >>>> Subject: URGENT: Re: maker failure with example data >>>> >>>> Hi Carson, >>>> >>>> I figured out the problem is with RepeatMasker installation and I >>>> fixed it. >>>> >>>> I am running maker with MPICH2 and I get the following warning when I >>>> start it: >>>> >>>> >>>> >>>> *STATUS: Processing and indexing input FASTA files... WARNING: Multiple >>>> MAKER processes have been started in the same directory.* >>>> >>>> I would like to if this is common. >>>> >>>> Regards, >>>> Goutham >>>> >>>> >>>> On Tue, Sep 30, 2014 at 12:02 PM, Goutham atla >>>> wrote: >>>> >>>>> Dear Carson, >>>>> >>>>> Thank you for the reply. I reinstalled the BioPerl and now I am >>>>> getting the following error on test data. >>>>> >>>>> ERROR: RepeatMasker failed >>>>> --> rank=NA, hostname=motif >>>>> ERROR: Failed while doing repeat masking >>>>> ERROR: Chunk failed at level:0, tier_type:1 >>>>> FAILED CONTIG:contig-dpp-500-500 >>>>> >>>>> On Mon, Sep 29, 2014 at 8:17 PM, Carson Holt < >>>>> carson.holt at genetics.utah.edu> wrote: >>>>> >>>>>> The error is caused by the BioPerl indexer returning an empty >>>>>> length for the indexed fasta sequence (possibly because of a corrupt index >>>>>> file or other reasons). You may need to reinstall BioPerl (use the CPAN >>>>>> version not the BioPerl-live version), or reinstall Berkley DB (used by the >>>>>> BioPerl indexer), or reinstall the Perl module DB_File via CPAN (Perl's >>>>>> interface to Berkley DB). After reinstalling BioPerl, delete the >>>>>> mpi_blastdb directory for the MAKER run before retrying. >>>>>> >>>>>> Also verify that the /tmp directory on your system or the directory >>>>>> pointed to by TMP= in the maker_opts,ctl file is not full and that TMP= is >>>>>> not set to an NFS mounted location. 
>>>>>> >>>>>> Thanks, >>>>>> Carson >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> From: Goutham atla >>>>>> Date: Monday, September 29, 2014 at 6:33 AM >>>>>> To: >>>>>> Subject: maker failure with example data >>>>>> >>>>>> Dear All, >>>>>> >>>>>> I am running maker with the demo file, i.e dip_contig.fasta by >>>>>> keeping all other parameters in .ctl files as default. But it do not >>>>>> progress and shows the following message that the length of the sequence is >>>>>> 0. Can anybody help me ? >>>>>> >>>>>> >>>>>> >>>>>> --Next Contig-- >>>>>> >>>>>> MAKER WARNING: All old files will be erased before continuing >>>>>> >>>>>> #--------------------------------------------------------------------- >>>>>> Skipping the contig because it is too short!! >>>>>> SeqID: contig-dpp-500-500 >>>>>> Length: 0 >>>>>> >>>>>> #--------------------------------------------------------------------- >>>>>> >>>>>> >>>>>> Regards, >>>>>> Goutham >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Goutham Atla >>>>> >>>> >>>> >>>> >>>> -- >>>> Goutham Atla >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>>> >>>> >>>> >>> >>> >>> -- >>> Goutham Atla >>> >> >> >> >> -- >> Goutham Atla >> >> >> > > > -- > Goutham Atla > -- Goutham Atla -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjfields at illinois.edu Thu Nov 13 22:34:06 2014 From: cjfields at illinois.edu (Fields, Christopher J) Date: Fri, 14 Nov 2014 04:34:06 +0000 Subject: [maker-devel] BioPerl Bio::Tools::CodonTable changes Message-ID: <6615DD9E-10F6-4EFE-9900-F66317BDA0EE@illinois.edu> Carson, Just a note that we need to address a specific hack added last year in BioPerl for MAKER re: ?strict? codon tables. 
I added a new one to the end of the list, not thinking that more would eventually be added, and that time has now come:

https://github.com/bioperl/bioperl-live/issues/90

I'm not sure how MAKER is setting the table, but if it's by using the codon table # that will likely subtly break as it will now point to the new codon table from NCBI.

chris

From carsonhh at gmail.com Fri Nov 14 10:46:58 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 14 Nov 2014 09:46:58 -0700
Subject: [maker-devel] BioPerl Bio::Tools::CodonTable changes
In-Reply-To: <6615DD9E-10F6-4EFE-9900-F66317BDA0EE@illinois.edu>
References: <6615DD9E-10F6-4EFE-9900-F66317BDA0EE@illinois.edu>
Message-ID: <27D24CD7-09E3-4618-8A52-578104B34E65@gmail.com>

Actually since I wanted to keep compatibility with old versions of BioPerl, I've been using the add_table method to just insert the table I need. Then I select it using the id method.

But I think I like the idea of making the strictly canonical codon table be table 0, since having a strictly canonical codon table in BioPerl seems rather important.

Thanks,
Carson

> On Nov 13, 2014, at 9:34 PM, Fields, Christopher J wrote:
>
> Carson,
>
> Just a note that we need to address a specific hack added last year in BioPerl for MAKER re: "strict" codon tables. I added a new one to the end of the list, not thinking that more would eventually be added, and that time has now come:
>
> https://github.com/bioperl/bioperl-live/issues/90
>
> I'm not sure how MAKER is setting the table, but if it's by using the codon table # that will likely subtly break as it will now point to the new codon table from NCBI.
>
> chris
>

From cjfields at illinois.edu Fri Nov 14 11:20:14 2014
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Fri, 14 Nov 2014 17:20:14 +0000
Subject: [maker-devel] BioPerl Bio::Tools::CodonTable changes
In-Reply-To: <27D24CD7-09E3-4618-8A52-578104B34E65@gmail.com>
References: <6615DD9E-10F6-4EFE-9900-F66317BDA0EE@illinois.edu> <27D24CD7-09E3-4618-8A52-578104B34E65@gmail.com>
Message-ID: <22112917-6961-4F53-87F2-DC4EA9E2175E@illinois.edu>

Okay, just wanted to make sure that a change in this wouldn't break MAKER.

chris

On Nov 14, 2014, at 10:46 AM, Carson Holt wrote:

> Actually since I wanted to keep compatibility with old versions of BioPerl, I've been using the add_table method to just insert the table I need. Then I select it using the id method.
>
> But I think I like the idea of making the strictly canonical codon table be table 0, since having a strictly canonical codon table in BioPerl seems rather important.
>
> Thanks,
> Carson
>
>
>
>
>> On Nov 13, 2014, at 9:34 PM, Fields, Christopher J wrote:
>>
>> Carson,
>>
>> Just a note that we need to address a specific hack added last year in BioPerl for MAKER re: "strict" codon tables. I added a new one to the end of the list, not thinking that more would eventually be added, and that time has now come:
>>
>> https://github.com/bioperl/bioperl-live/issues/90
>>
>> I'm not sure how MAKER is setting the table, but if it's by using the codon table # that will likely subtly break as it will now point to the new codon table from NCBI.
>>
>> chris
>>
>

From xiaenhua at gmail.com Wed Nov 19 06:47:28 2014
From: xiaenhua at gmail.com (xiaenhua at gmail.com)
Date: Wed, 19 Nov 2014 20:47:28 +0800
Subject: [maker-devel] ERROR: Failed while prepare section files
Message-ID: <2014111920472385185424@gmail.com>

Dear Maker developer Team,

When I reran MAKER using the GFF3 files derived from the first MAKER run together with two newly generated sets of evidence (proteins and ESTs), the run failed.
I set the parameters in the maker_opts.ctl file like this:
-------------------------------------
genome=CSL.fasta
maker_gff=CSL_1st_maker.gff;
est_gff=osi_csl_maker.pasa_assemblies_Maker.gff3;
protein_gff=CSL_wise.gff3;
est2genome=1;
protein2genome=1;
other parameters with default.

Then, I ran MAKER via MPI. However, the 2nd run failed. Below is the error message:
-----------------------------------
preparing ab-inits
preparing ab-inits
preparing ab-inits
gathering ab-init output files
gathering ab-init output files
gathering ab-init output files
gathering ab-init output files
prepare section files
Gathering GFF3 input into hits - chunk:0
gathering ab-init output files
gathering ab-init output files
gathering ab-init output files
gathering ab-init output files
prepare section files
Gathering GFF3 input into hits - chunk:0
prepare section files
Died at /home/xiaenhua/SoftWare/maker/bin/../lib/Bio/Search/Hit/PhatHit/Base.pm line 188.
--> rank=6, hostname=localhost.localdomain
ERROR: Failed while prepare section files
ERROR: Chunk failed at level:12, tier_type:3
FAILED CONTIG:scaffold3

ERROR: Chunk failed at level:4, tier_type:0
FAILED CONTIG:scaffold3

Gathering GFF3 input into hits - chunk:0
gathering ab-init output files
prepare section files
Gathering GFF3 input into hits - chunk:0
Died at /home/xiaenhua/SoftWare/maker/bin/../lib/Bio/Search/Hit/PhatHit/Base.pm line 188.
--> rank=8, hostname=localhost.localdomain
ERROR: Failed while prepare section files
ERROR: Chunk failed at level:12, tier_type:3
FAILED CONTIG:scaffold5

prepare section files
ERROR: Chunk failed at level:4, tier_type:0
FAILED CONTIG:scaffold5
........................
........................
--------------------------------------
My protein evidence gff3 file looks like this:
scaffold3 genewise match 1276842 1277727 . - . ID=GeneWise.45.m
scaffold3 genewise match_part 1277687 1277727 . - . ID=GeneWise.45.cds_1;Parent=GeneWise.45.m
scaffold3 genewise match_part 1276842 1277545 . - . ID=GeneWise.45.cds_2;Parent=GeneWise.45.m

EST evidence gff3:
scaffold3 match 1275835 1276664 . + . ID=align_24718.m
scaffold3 match_part 1275835 1276664 . + . ID=align_24718.cds_1;Parent=align_24718.m
scaffold3 match 2510415 2511782 . + . ID=align_24719.m
scaffold3 match_part 2510415 2511782 . + . ID=align_24719.cds_1;Parent=align_24719.m
scaffold3 match 4113431 4114364 . + . ID=align_24720.m
scaffold3 match_part 4113431 4114364 . + . ID=align_24720.cds_1;Parent=align_24720.m

I don't know what happened. Any help will be greatly appreciated!
Thank you!

All the best,
En-Hua Xia
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Wed Nov 19 09:55:17 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 19 Nov 2014 08:55:17 -0700
Subject: [maker-devel] ERROR: Failed while prepare section files
In-Reply-To: <2014111920472385185424@gmail.com>
References: <2014111920472385185424@gmail.com>
Message-ID: <824C3CBE-FD06-4571-A8AB-06710840FF41@gmail.com>

Could you rerun with the latest MAKER release, just to make sure that it still happens with the current release (Version 2.31.7). Run with 'maker -a'. If it still happens, then send me the GFF3 files you are using as input, and I'll take a look. Basically it's happening because you are missing a start or end position for a feature in one of the files.

--Carson

> On Nov 19, 2014, at 5:47 AM, xiaenhua at gmail.com wrote:
>
> Dear Maker developer Team,
> When I reran MAKER using the GFF3 files derived from the first MAKER run together with two newly generated sets of evidence (proteins and ESTs), the run failed. I set the parameters in the maker_opts.ctl file like this:
> -------------------------------------
> genome=CSL.fasta
> maker_gff=CSL_1st_maker.gff;
> est_gff=osi_csl_maker.pasa_assemblies_Maker.gff3;
> protein_gff=CSL_wise.gff3;
> est2genome=1;
> protein2genome=1;
> other parameters with default.
> Then, I ran MAKER via MPI. However, the 2nd run failed. Below is the error message:
> -----------------------------------
> preparing ab-inits
> preparing ab-inits
> preparing ab-inits
> gathering ab-init output files
> gathering ab-init output files
> gathering ab-init output files
> gathering ab-init output files
> prepare section files
> Gathering GFF3 input into hits - chunk:0
> gathering ab-init output files
> gathering ab-init output files
> gathering ab-init output files
> gathering ab-init output files
> prepare section files
> Gathering GFF3 input into hits - chunk:0
> prepare section files
> Died at /home/xiaenhua/SoftWare/maker/bin/../lib/Bio/Search/Hit/PhatHit/Base.pm line 188.
> --> rank=6, hostname=localhost.localdomain
> ERROR: Failed while prepare section files
> ERROR: Chunk failed at level:12, tier_type:3
> FAILED CONTIG:scaffold3
>
> ERROR: Chunk failed at level:4, tier_type:0
> FAILED CONTIG:scaffold3
>
> Gathering GFF3 input into hits - chunk:0
> gathering ab-init output files
> prepare section files
> Gathering GFF3 input into hits - chunk:0
> Died at /home/xiaenhua/SoftWare/maker/bin/../lib/Bio/Search/Hit/PhatHit/Base.pm line 188.
> --> rank=8, hostname=localhost.localdomain
> ERROR: Failed while prepare section files
> ERROR: Chunk failed at level:12, tier_type:3
> FAILED CONTIG:scaffold5
>
> prepare section files
> ERROR: Chunk failed at level:4, tier_type:0
> FAILED CONTIG:scaffold5
> ........................
> ........................
> --------------------------------------
> My protein evidence gff3 file looks like this:
> scaffold3 genewise match 1276842 1277727 . - . ID=GeneWise.45.m
> scaffold3 genewise match_part 1277687 1277727 . - . ID=GeneWise.45.cds_1;Parent=GeneWise.45.m
> scaffold3 genewise match_part 1276842 1277545 . - . ID=GeneWise.45.cds_2;Parent=GeneWise.45.m
>
> EST evidence gff3:
> scaffold3 match 1275835 1276664 . + . ID=align_24718.m
> scaffold3 match_part 1275835 1276664 . + . ID=align_24718.cds_1;Parent=align_24718.m
> scaffold3 match 2510415 2511782 . + . ID=align_24719.m
> scaffold3 match_part 2510415 2511782 . + . ID=align_24719.cds_1;Parent=align_24719.m
> scaffold3 match 4113431 4114364 . + . ID=align_24720.m
> scaffold3 match_part 4113431 4114364 . + . ID=align_24720.cds_1;Parent=align_24720.m
>
> I don't know what happened. Any help will be greatly appreciated!
> Thank you!
>
> All the best,
> En-Hua Xia
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ernesto at ebi.ac.uk Fri Nov 21 04:59:51 2014
From: ernesto at ebi.ac.uk (ernesto lowy gallego)
Date: Fri, 21 Nov 2014 10:59:51 +0000
Subject: [maker-devel] Latest release of MAKER version 2.31.7
Message-ID: <546F1B27.90309@ebi.ac.uk>

Hi,

I am trying to find the features of the latest release of MAKER (version 2.31.7, released on 31/10/2014).

Could you please let me know where I can find them?

Thanks a lot!

ernesto

--
Developer

VectorBase | Ensembl Genomes

From carsonhh at gmail.com Fri Nov 21 09:04:07 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 21 Nov 2014 08:04:07 -0700
Subject: [maker-devel] Latest release of MAKER version 2.31.7
In-Reply-To: <546F1B27.90309@ebi.ac.uk>
References: <546F1B27.90309@ebi.ac.uk>
Message-ID:

The only change is a bug fix for an issue that sometimes occurs when model_gff is mixed with correct_est_fusion=1 and always_complete=1.

--Carson

> On Nov 21, 2014, at 3:59 AM, ernesto lowy gallego wrote:
>
> Hi,
>
> I am trying to find the features of the latest release of MAKER (version 2.31.7, released on 31/10/2014).
>
> Could you please let me know where I can find them?
>
> Thanks a lot!
>
> ernesto
>
> --
> Developer
>
> VectorBase | Ensembl Genomes
>
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From muriel.grosb at gmail.com Fri Nov 21 10:07:39 2014
From: muriel.grosb at gmail.com (Muriel Gros-Balthazard)
Date: Fri, 21 Nov 2014 17:07:39 +0100
Subject: [maker-devel] Repeat masking in Maker
Message-ID: <546F634B.1000900@gmail.com>

Hello,

I generated my own library of repeats following the tutorial provided with Maker.
I also wanted to use all the species from the RepBase library for the masking.

It is not clear to me how this works in Maker.
Indeed, I put both these options :
model_org=all
rmlib=allRepeats.lib

However, when using RepeatMasker without Maker, you can't put both -lib allRepeats.lib and -species all as options.
Indeed, you can only say one species when also using the -lib option (-species arabidopsis for instance and not -species all)

What about Maker ?

Do I have masking of allRepeats.lib and also of all species repeats if I put these two arguments in Maker ?
model_org=all
rmlib=allRepeats.lib

Another question:
It is said that RepeatRunner is used as well. I put the option:
repeat_protein=te_proteins.fasta
But realized that RepeatRunner was not installed on my computer !!!
I had no problem to run Maker.
So, this file of te_proteins is used rather by RepeatMasker to mask them ?
It is not clear to me how RepeatRunner is involved in the pipeline ?

Thanks a lot for your answers,

Muriel
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From michael.s.campbell1 at gmail.com Fri Nov 21 11:09:17 2014
From: michael.s.campbell1 at gmail.com (Michael Campbell)
Date: Fri, 21 Nov 2014 10:09:17 -0700
Subject: [maker-devel] Repeat masking in Maker
In-Reply-To: <546F634B.1000900@gmail.com>
References: <546F634B.1000900@gmail.com>
Message-ID:

Hi Muriel,

By setting model_org=all, MAKER will run RepeatMasker using all of RepBase. MAKER will also run RepeatMasker with your species-specific repeat library when you set rmlib=allRepeats.lib.

For more information on what options can be used in the model_org= line of the maker_opts.ctl file, see this page on the MAKER wiki:

http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/The_MAKER_control_files_explained .

A few releases back RepeatRunner was added internally to MAKER, so you don't have to install it separately. If you look in the MAKER error output you can find where MAKER called RepeatRunner.

Thanks,
Mike

On Fri, Nov 21, 2014 at 9:07 AM, Muriel Gros-Balthazard <
muriel.grosb at gmail.com> wrote:

> Hello,
>
> I generated my own library of repeats following the tutorial provided with
> Maker.
> I also wanted to use all the species from the RepBase library for the
> masking.
>
> It is not clear to me how this works in Maker.
> Indeed, I put both these options :
> model_org=all
> rmlib=allRepeats.lib
>
> However, when using RepeatMasker without Maker, you can't put both -lib
> allRepeats.lib and -species all as options.
> Indeed, you can only say one species when also using the -lib option (-species
> arabidopsis for instance and not -species all)
>
> What about Maker ?
>
> Do I have masking of allRepeats.lib and also of all species repeats if I
> put these two arguments in Maker ?
> model_org=all
> rmlib=allRepeats.lib
>
> Another question:
> It is said that RepeatRunner is used as well. I put the option:
> repeat_protein=te_proteins.fasta
> But realized that RepeatRunner was not installed on my computer !!!
> I had no problem to run Maker.
> So, this file of te_proteins is used rather by RepeatMasker to mask them ?
> It is not clear to me how RepeatRunner is involved in the pipeline ?
>
> Thanks a lot for your answers,
>
> Muriel
>
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
>

--
Michael Campbell MS, RD.
Doctoral Candidate
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
ph:585-3543
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Fri Nov 21 11:21:28 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 21 Nov 2014 10:21:28 -0700
Subject: [maker-devel] Repeat masking in Maker
In-Reply-To:
References: <546F634B.1000900@gmail.com>
Message-ID:

Yes. If you set them both, then RepeatMasker runs twice (once with each setting), and then combines the results.

--Carson

> On Nov 21, 2014, at 10:09 AM, Michael Campbell wrote:
>
> Hi Muriel,
>
> By setting model_org=all, MAKER will run RepeatMasker using all of RepBase. MAKER will also run RepeatMasker with your species-specific repeat library when you set rmlib=allRepeats.lib.
>
> For more information on what options can be used in the model_org= line of the maker_opts.ctl file, see this page on the MAKER wiki:
>
> http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/The_MAKER_control_files_explained .
>
> A few releases back RepeatRunner was added internally to MAKER, so you don't have to install it separately. If you look in the MAKER error output you can find where MAKER called RepeatRunner.
>
> Thanks,
> Mike
>
> On Fri, Nov 21, 2014 at 9:07 AM, Muriel Gros-Balthazard wrote:
> Hello,
>
> I generated my own library of repeats following the tutorial provided with Maker.
> I also wanted to use all the species from the RepBase library for the masking.
>
> It is not clear to me how this works in Maker.
> Indeed, I put both these options :
> model_org=all
> rmlib=allRepeats.lib
>
> However, when using RepeatMasker without Maker, you can't put both -lib allRepeats.lib and -species all as options.
> Indeed, you can only say one species when also using the -lib option (-species arabidopsis for instance and not -species all)
>
> What about Maker ?
>
> Do I have masking of allRepeats.lib and also of all species repeats if I put these two arguments in Maker ?
> model_org=all
> rmlib=allRepeats.lib
>
> Another question:
> It is said that RepeatRunner is used as well. I put the option: repeat_protein=te_proteins.fasta
> But realized that RepeatRunner was not installed on my computer !!!
> I had no problem to run Maker.
> So, this file of te_proteins is used rather by RepeatMasker to mask them ?
> It is not clear to me how RepeatRunner is involved in the pipeline ?
>
> Thanks a lot for your answers,
>
> Muriel
>
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
>
>
>
> --
> Michael Campbell MS, RD.
> Doctoral Candidate
> Eccles Institute of Human Genetics
> University of Utah
> 15 North 2030 East, Room 2100
> Salt Lake City, UT 84112-5330
> ph:585-3543
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From muriel.grosb at gmail.com Thu Nov 27 03:22:26 2014
From: muriel.grosb at gmail.com (Muriel Gros-Balthazard)
Date: Thu, 27 Nov 2014 10:22:26 +0100
Subject: [maker-devel] gff output
Message-ID: <5476ED52.3060902@gmail.com>

Hello,

I have been using Maker to generate an annotation.
I especially set these options:
- est_gff with a list of transcripts.gff3 (Cufflinks output)
- model_org=all
- rmlib=allrepeats.lib
- repeat_protein=te_prot.fasta
- pred_gff= Augustus.gff3 (that I generated previously)

I obtain a gff file for each of my contigs. However, here are the three possibilities in the second column :
# est_gff:cufflinks
# repeatmasker
# repeatrunner

I have no information about exons and introns. And I am wondering if the Augustus.gff3 was used...

On top of that, I forgot to set up pred_stats to 1. If I understand well, I can just change this in the control file, and run Maker again. Since there is the output with everything, it won't run again the prediction, only this option. Is that right ?

Thank you,
Muriel

From monica.poelchau at ars.usda.gov Fri Nov 7 06:17:04 2014
From: monica.poelchau at ars.usda.gov (Poelchau, Monica)
Date: Fri, 7 Nov 2014 13:17:04 +0000
Subject: [maker-devel] calculating AED values between two datasets
Message-ID:

Hi everyone,

I would like to generate a list of Maker AED values comparing two datasets: a set of computationally predicted genes, and manually curated genes from the Web Apollo program. The idea is to quantify the amount of nucleotide-level change that occurred during the manual curation process.

I have tried to run Maker in several ways to generate the AED values. Both gene sets are in (as far as I can tell) valid gff3 format.

First, I included the manually curated (Web Apollo) gff3 in the 'model_gff' field of maker_opts.ctl, and the gff3 of the computational predictions in the 'est_gff' field, with all of the other prediction and evidence alignment settings turned off. All resulting AEDs from this analysis were 1, even though many of the annotations had 100% overlap.

Next, instead of using the computational predictions in gff3 format, I used the fasta file of the cDNA sequence from the computational predictions in the 'est' field. Here, the results made more sense, but there was a small but significant percentage of the AED values that were 1 that actually should have been less than 1.
I have tried the 2 analyses above using both the gff3 output straight from Web Apollo, and after running the gff3 through maker once as the only entry in the model-gff field, as explained in the MAKER2 paper (http://www.biomedcentral.com/1471-2105/12/491). This does not to appear to make a difference. Do you have any ideas where I might start to debug this? Thanks for your help! Monica This electronic message contains information generated by the USDA solely for the intended recipients. Any unauthorized interception of this message or the use or disclosure of the information it contains may violate the law and subject the violator to civil or criminal penalties. If you believe you have received this message in error, please notify the sender and delete the email immediately. -------------- next part -------------- An HTML attachment was scrubbed... URL: From goutham.atla at gmail.com Thu Nov 6 22:39:51 2014 From: goutham.atla at gmail.com (Goutham atla) Date: Fri, 7 Nov 2014 11:09:51 +0530 Subject: [maker-devel] URGENT: Re: maker failure with example data In-Reply-To: <3CCDC24F-756A-492C-8E7F-B1B97616EE45@genetics.utah.edu> References: <3CCDC24F-756A-492C-8E7F-B1B97616EE45@genetics.utah.edu> Message-ID: Dear Carson, Thanks for the quick reply. It worked after providing the assembled transcripts and protein fasta from closely related species. Regards, Goutham On Thu, Nov 6, 2014 at 12:34 PM, Carson Holt wrote: > The final transcript and proteins fasta files will only exists if there > were gene models with evidence support. If you did not provide an HMM for > one of the ab initio gene predictors (SNAP, Augustus, etc.) then there will > be no gene models, and if you do not provide protein or est evidence > fastas, then there will be no evidence support. Also if your contigs are > too short to contain gene models then there will be no models. > > Thanks, > Carson > > > > On Nov 5, 2014, at 11:49 PM, Goutham atla wrote: > > Dear All, > > I have finished running maker. 
But I realised that there are no > *transcripts.fasta and *protein.fasta files in any of the directories that > make has created. It has only gtf files. > > Example output of a test run: I have similar results on original file > also: > > [User at motif jcf7180001838744]$ pwd > > /home/User/Maker_Annotation/Maker_test.maker.output/Maker_test_datastore/35/C1/jcf7180001838744 > [User at motif jcf7180001838744]$ ls > jcf7180001838744.gff run.log theVoid.jcf7180001838744 > > Any help from you in figuring out why there are no protein.fasta > and transcripts.fast would be very helpful. > > Regards, > Goutham > > On Wed, Oct 1, 2014 at 11:28 AM, Goutham atla > wrote: > >> Dear All, >> >> Thank you. I figured out th problem is with mpich2. I was behind mpich2 >> but was unsuccessful. I installed mpich v3 and its working fine now. Thank >> you all. The old GMDO tutorials are bit misleading as the new versions have >> come up. >> >> On Wed, Oct 1, 2014 at 11:09 AM, Marc H?ppner >> wrote: >> >>> Another possibility could be that MPICH2 wasn?t build properly, no? I >>> remember something with enabling shared libraries during the compilation of >>> mpich, without which the error below would appear. >>> >>> /Marc >>> >>> Marc P. Hoeppner, PhD >>> Team Leader >>> BILS Genome Annotation Platform >>> Department for Medical Biochemistry and Microbiology >>> Uppsala University, Sweden >>> marc.hoeppner at imbim.uu.se >>> >>> >>> >>> On 30 Sep 2014, at 21:33, Carson Holt >>> wrote: >>> >>> The message is warning that there are multiple instances of MAKER >>> running, but no MPI communication. When you build MAKER (perl Build.PL step >>> when installing MAKER), you need to specify the location of 'mpicc' and >>> 'mpi.h' to build with MPI support. Otherwise you won't be able to link >>> against MPICH2 shared libraries. You probably need to rerun that step. 
>>> >>> --Carson >>> >>> >>> From: Goutham atla >>> Date: Tuesday, September 30, 2014 at 10:49 AM >>> To: Carson Holt >>> Cc: "maker-devel at yandell-lab.org" >>> Subject: URGENT: Re: maker failure with example data >>> >>> Hi Carson, >>> >>> I figured out the problem is with RepeatMasker installation and I fixed >>> it. >>> >>> I am running maker with MPICH2 and I get the following warning when I >>> start it: >>> >>> >>> >>> *STATUS: Processing and indexing input FASTA files... WARNING: Multiple >>> MAKER processes have been started in the same directory.* >>> >>> I would like to if this is common. >>> >>> Regards, >>> Goutham >>> >>> >>> On Tue, Sep 30, 2014 at 12:02 PM, Goutham atla >>> wrote: >>> >>>> Dear Carson, >>>> >>>> Thank you for the reply. I reinstalled the BioPerl and now I am >>>> getting the following error on test data. >>>> >>>> ERROR: RepeatMasker failed >>>> --> rank=NA, hostname=motif >>>> ERROR: Failed while doing repeat masking >>>> ERROR: Chunk failed at level:0, tier_type:1 >>>> FAILED CONTIG:contig-dpp-500-500 >>>> >>>> On Mon, Sep 29, 2014 at 8:17 PM, Carson Holt < >>>> carson.holt at genetics.utah.edu> wrote: >>>> >>>>> The error is caused by the BioPerl indexer returning an empty length >>>>> for the indexed fasta sequence (possibly because of a corrupt index file or >>>>> other reasons). You may need to reinstall BioPerl (use the CPAN version >>>>> not the BioPerl-live version), or reinstall Berkley DB (used by the BioPerl >>>>> indexer), or reinstall the Perl module DB_File via CPAN (Perl's interface >>>>> to Berkley DB). After reinstalling BioPerl, delete the mpi_blastdb >>>>> directory for the MAKER run before retrying. >>>>> >>>>> Also verify that the /tmp directory on your system or the directory >>>>> pointed to by TMP= in the maker_opts,ctl file is not full and that TMP= is >>>>> not set to an NFS mounted location. 
>>>>> >>>>> Thanks, >>>>> Carson >>>>> >>>>> >>>>> >>>>> >>>>> From: Goutham atla >>>>> Date: Monday, September 29, 2014 at 6:33 AM >>>>> To: >>>>> Subject: maker failure with example data >>>>> >>>>> Dear All, >>>>> >>>>> I am running maker with the demo file, i.e dip_contig.fasta by >>>>> keeping all other parameters in .ctl files as default. But it do not >>>>> progress and shows the following message that the length of the sequence is >>>>> 0. Can anybody help me ? >>>>> >>>>> >>>>> >>>>> --Next Contig-- >>>>> >>>>> MAKER WARNING: All old files will be erased before continuing >>>>> #--------------------------------------------------------------------- >>>>> Skipping the contig because it is too short!! >>>>> SeqID: contig-dpp-500-500 >>>>> Length: 0 >>>>> #--------------------------------------------------------------------- >>>>> >>>>> >>>>> Regards, >>>>> Goutham >>>>> >>>> >>>> >>>> >>>> -- >>>> Goutham Atla >>>> >>> >>> >>> >>> -- >>> Goutham Atla >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >>> >>> >> >> >> -- >> Goutham Atla >> > > > > -- > Goutham Atla > > > -- Goutham Atla -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Fri Nov 7 08:26:31 2014 From: carsonhh at gmail.com (Carson Holt) Date: Fri, 7 Nov 2014 08:26:31 -0700 Subject: [maker-devel] calculating AED values between two datasets In-Reply-To: References: Message-ID: <051D0D34-9E49-401F-B22D-16970EB93B66@gmail.com> If you got every value as 1 with the est_gff, then your GFF3 didn?t load. The est_gff option is expecting match/match_part format alignment format, and you may not have had it correctly structures. For using fasta files instead, you may also need to set single_exon=1 and single_length=1, otherwise many of those alignments will be ignored for AED scoring purposes. 
You should also look at the output in a viewer like Apollo to visualize the comparison to see if the reason you get 1 is because the aligner can't recover the original transcript alignment. --Carson > On Nov 7, 2014, at 6:17 AM, Poelchau, Monica wrote: > > Hi everyone, > > I would like to generate a list of Maker AED values comparing two datasets: a set of computationally predicted genes, and manually curated genes from the Web Apollo program. The idea is to quantify the amount of nucleotide-level change that occurred during the manual curation process. > > I have tried to run Maker in several ways to generate the AED values. Both gene sets are in (as far as I can tell) valid gff3 format. First, I included the manually curated (Web Apollo) gff3 in the 'model_gff' field of maker_opts.ctl, and the gff3 of the computational predictions in the 'est_gff' field, with all of the other prediction and evidence alignment settings turned off. All resulting AEDs from this analysis were 1, even though many of the annotations had 100% overlap. Next, instead of using the computational predictions in gff3 format, I used the fasta file of the cDNA sequence from the computational predictions in the 'est' field. Here, the results made more sense, but there was a small but significant percentage of the AED values that were 1 that actually should have been less than 1. > > I have tried the 2 analyses above using both the gff3 output straight from Web Apollo, and after running the gff3 through maker once as the only entry in the model_gff field, as explained in the MAKER2 paper (http://www.biomedcentral.com/1471-2105/12/491). This does not appear to make a difference. > > Do you have any ideas where I might start to debug this? > > Thanks for your help! > > Monica > > > > > > This electronic message contains information generated by the USDA solely for the intended recipients.
Any unauthorized interception of this message or the use or disclosure of the information it contains may violate the law and subject the violator to civil or criminal penalties. If you believe you have received this message in error, please notify the sender and delete the email immediately. > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From monica.poelchau at ars.usda.gov Fri Nov 7 12:00:26 2014 From: monica.poelchau at ars.usda.gov (Poelchau, Monica) Date: Fri, 7 Nov 2014 19:00:26 +0000 Subject: [maker-devel] calculating AED values between two datasets In-Reply-To: <051D0D34-9E49-401F-B22D-16970EB93B66@gmail.com> References: <051D0D34-9E49-401F-B22D-16970EB93B66@gmail.com> Message-ID: Thank you for the prompt reply, Carson! Yes, my gff3 was modeled as gene models, not match/match_part, so reformatting it may do the trick. Monica From: Carson Holt > Date: Friday, November 7, 2014 at 10:26 AM To: Monica Poelchau > Cc: "maker-devel at yandell-lab.org" > Subject: Re: [maker-devel] calculating AED values between two datasets If you got every value as 1 with the est_gff, then your GFF3 didn't load. The est_gff option is expecting match/match_part alignment format, and you may not have had it correctly structured. For using fasta files instead, you may also need to set single_exon=1 and single_length=1, otherwise many of those alignments will be ignored for AED scoring purposes. You should also look at the output in a viewer like Apollo to visualize the comparison to see if the reason you get 1 is because the aligner can't recover the original transcript alignment.
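The reformatting mentioned above, gene models rewritten as match/match_part features, could look roughly like the Python sketch below. It is an illustration under assumptions, not a MAKER utility: it assumes tab-separated GFF3 with `mRNA` rows carrying `ID` and `exon` rows carrying `Parent`, it drops gene, CDS and UTR rows, and it writes no `Target` attribute. The function names and the `:partN` ID scheme are made up here, and whether est_gff accepts exactly this output should be checked against MAKER's own GFF3 output.

```python
def parse_attributes(column9):
    """Split a GFF3 attribute column into a dict (no unescaping attempted)."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def gene_models_to_matches(gff3_lines):
    """Rewrite mRNA/exon features as match/match_part features."""
    out = []
    part_counts = {}  # transcript ID -> number of parts written so far
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue
        attrs = parse_attributes(cols[8])
        if cols[2] == "mRNA" and "ID" in attrs:
            cols[2] = "match"
            cols[8] = "ID={0};Name={0}".format(attrs["ID"])
            out.append("\t".join(cols))
        elif cols[2] == "exon" and "Parent" in attrs:
            # An exon shared by several transcripts yields one part per parent.
            for parent in attrs["Parent"].split(","):
                n = part_counts.get(parent, 0) + 1
                part_counts[parent] = n
                part = list(cols)
                part[2] = "match_part"
                part[8] = "ID={0}:part{1};Parent={0}".format(parent, n)
                out.append("\t".join(part))
    return out


# Hypothetical two-exon gene model.
example = [
    "##gff-version 3",
    "chr1\tapollo\tgene\t100\t600\t.\t+\t.\tID=g1",
    "chr1\tapollo\tmRNA\t100\t600\t.\t+\t.\tID=t1;Parent=g1",
    "chr1\tapollo\texon\t100\t200\t.\t+\t.\tID=e1;Parent=t1",
    "chr1\tapollo\texon\t300\t600\t.\t+\t.\tID=e2;Parent=t1",
]

for row in gene_models_to_matches(example):
    print(row)
```

Dropping the gene row is deliberate in this sketch: a match feature has no parent, so each transcript becomes its own top-level alignment.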
--Carson On Nov 7, 2014, at 6:17 AM, Poelchau, Monica > wrote: Hi everyone, I would like to generate a list of Maker AED values comparing two datasets: a set of computationally predicted genes, and manually curated genes from the Web Apollo program. The idea is to quantify the amount of nucleotide-level change that occurred during the manual curation process. I have tried to run Maker in several ways to generate the AED values. Both gene sets are in (as far as I can tell) valid gff3 format. First, I included the manually curated (Web Apollo) gff3 in the 'model_gff' field of maker_opts.ctl, and the gff3 of the computational predictions in the 'est_gff' field, with all of the other prediction and evidence alignment settings turned off. All resulting AEDs from this analysis were 1, even though many of the annotations had 100% overlap. Next, instead of using the computational predictions in gff3 format, I used the fasta file of the cDNA sequence from the computational predictions in the 'est' field. Here, the results made more sense, but there was a small but significant percentage of the AED values that were 1 that actually should have been less than 1. I have tried the 2 analyses above using both the gff3 output straight from Web Apollo, and after running the gff3 through maker once as the only entry in the model_gff field, as explained in the MAKER2 paper (http://www.biomedcentral.com/1471-2105/12/491). This does not appear to make a difference. Do you have any ideas where I might start to debug this? Thanks for your help! Monica This electronic message contains information generated by the USDA solely for the intended recipients. Any unauthorized interception of this message or the use or disclosure of the information it contains may violate the law and subject the violator to civil or criminal penalties. If you believe you have received this message in error, please notify the sender and delete the email immediately.
_______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From Timothy.Stitt at tgac.ac.uk Sat Nov 8 06:58:53 2014 From: Timothy.Stitt at tgac.ac.uk (Timothy Stitt (TGAC)) Date: Sat, 8 Nov 2014 13:58:53 +0000 Subject: [maker-devel] DBD::SQLite::db do failed errors Message-ID: Dear Maker Support, I'm running Maker v2.31.7 and I'm receiving lots of the following warnings/errors during the run. The errors don't seem to prevent the calculation from completing so I was just wondering how I can avoid getting them? STATUS: Setting up database for any GFF3 input... DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 1. DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 2. DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 3. ? Thanks in advance, Tim. --- Timothy Stitt PhD / Head of Scientific Computing The Genome Analysis Centre (TGAC) http://www.tgac.ac.uk/ p: +44 1603 450378 e: timothy.stitt at tgac.ac.uk -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimhu at email.tamu.edu Fri Nov 7 11:34:11 2014 From: jimhu at email.tamu.edu (Jim Hu) Date: Fri, 7 Nov 2014 12:34:11 -0600 Subject: [maker-devel] Speaking of AED... Message-ID: I was teaching Yandell and Ence (2012) in the genomics class I co-teach, and was having trouble understanding the values for AED in Box 4 Figure Bb derive from the evidence set in Figure Ba. 
Box 4 says: "AED is calculated in the same manner as SN and SP, but in place of a reference gene model, the coordinates of the union of the aligned evidence (see panel Ba) are used instead". In the union, I expect that a bp that is in an exon in any of the evidence would be considered a TP. If so, then why isn't nt-level AED for Annotation 1 in Bb zero? I'm probably missing something trivial. Thanks Jim ===================================== Jim Hu Professor Dept. of Biochemistry and Biophysics 2128 TAMU Texas A&M Univ. College Station, TX 77843-2128 979-862-4054 From sarasank at umail.iu.edu Sat Nov 8 11:58:30 2014 From: sarasank at umail.iu.edu (Saranya Sankaranarayanan) Date: Sat, 8 Nov 2014 13:58:30 -0500 Subject: [maker-devel] Fwd: AED plot In-Reply-To: References: Message-ID: Hi Maker authors, I am new to using Maker. I have a few basic questions. I have the maker annotation complete and I ran the gff3_merge -n -d genome_master_datastore_index.log - to create the gff file. After that, I used the script AED_cdf_generator.pl to obtain the AED plot, but I get the error: Use of uninitialized value $total in division (/) at ./AED_cdf_generator.pl line 43. Illegal division by zero at ./AED_cdf_generator.pl line 43. I passed my gff file as: AED_cdf_generator.pl -b 0.025 maker.gff Could anyone please help me with this error? Thank you. It looks like no value is parsed to the variable total, but I am not able to decipher why. Regards, Saranya From carsonhh at gmail.com Sat Nov 8 16:52:26 2014 From: carsonhh at gmail.com (Carson Holt) Date: Sat, 8 Nov 2014 16:52:26 -0700 Subject: [maker-devel] Speaking of AED... In-Reply-To: References: Message-ID: <443253CC-838D-42A7-8FEB-8BAF442FAE9A@gmail.com> I think I would agree. Annotation 1 is a perfect match to the evidence.
It is ab initio 1 that would have been AED of 0.2, but annotation 1 should have been AED of 0. --Carson > On Nov 7, 2014, at 11:34 AM, Jim Hu wrote: > > I was teaching Yandell and Ence (2012) in the genomics class I co-teach, and was having trouble understanding how the values for AED in Box 4 Figure Bb derive from the evidence set in Figure Ba. > > Box 4 says: "AED is calculated in the same manner as SN and SP, but in place of a reference gene model, the coordinates of the union of the aligned evidence (see panel Ba) are used instead". In the union, I expect that a bp that is in an exon in any of the evidence would be considered a TP. If so, then why isn't nt-level AED for Annotation 1 in Bb zero? > > I'm probably missing something trivial. > > Thanks > > Jim > ===================================== > Jim Hu > Professor > Dept. of Biochemistry and Biophysics > 2128 TAMU > Texas A&M Univ. > College Station, TX 77843-2128 > 979-862-4054 > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From dence at genetics.utah.edu Sat Nov 8 16:53:19 2014 From: dence at genetics.utah.edu (Daniel Ence) Date: Sat, 8 Nov 2014 23:53:19 +0000 Subject: [maker-devel] Speaking of AED... In-Reply-To: References: Message-ID: <5FC7C806-E03F-4DC3-8932-65F6C0E1A7EF@genetics.utah.edu> Hi Professor Hu, I'm excited that you're teaching from this review. I hope that you find it useful for your class! Annotation 1 has an AED of 0.2 and not 0 because the middle exon doesn't line up exactly with the evidence alignments. Since there are bps in the annotation that aren't supported by evidence, it has an AED > 0. It's a little hard to see in the figure, but if you use a straight-edge, you can see it. Feel free to let me know whether that helps clear things up.
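The nucleotide-level calculation discussed in this thread can be sketched in a few lines of Python. This is a minimal illustration, assuming the definition AED = 1 - (SN + SP)/2 with SN = TP/(TP+FN) and SP = TP/(TP+FP), where the "reference" is the union of aligned evidence. The coordinates are invented (they are not read off the figure in the review) and the function names are not MAKER's.

```python
def covered_bases(intervals):
    """Return the set of positions covered by 1-based inclusive intervals."""
    bases = set()
    for start, end in intervals:
        bases.update(range(start, end + 1))
    return bases


def aed(annotation_exons, evidence_exons):
    """Nucleotide-level AED = 1 - (SN + SP) / 2."""
    ann = covered_bases(annotation_exons)
    evi = covered_bases(evidence_exons)  # union of all aligned evidence
    if not ann or not evi:
        return 1.0  # nothing to compare, so no support at all
    tp = len(ann & evi)   # annotated bases that the evidence supports
    sn = tp / len(evi)    # TP / (TP + FN)
    sp = tp / len(ann)    # TP / (TP + FP)
    return 1.0 - (sn + sp) / 2.0


# Invented coordinates: three evidence exons and two candidate annotations.
evidence = [(100, 200), (300, 400), (500, 600)]
perfect = [(100, 200), (300, 400), (500, 600)]
shifted = [(100, 200), (290, 400), (500, 600)]  # middle exon starts 10 bp early

print(aed(perfect, evidence))
print(aed(shifted, evidence))
```

With these made-up exons, the annotation that matches the evidence exactly scores 0, while shifting a single exon boundary by a few bases adds unsupported bases, lowers SP, and gives a small positive AED, which is the effect described above for the middle exon.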
Thanks, Daniel > On Nov 7, 2014, at 11:34 AM, Jim Hu wrote: > > I was teaching Yandell and Ence (2012) in the genomics class I co-teach, and was having trouble understanding how the values for AED in Box 4 Figure Bb derive from the evidence set in Figure Ba. > > Box 4 says: "AED is calculated in the same manner as SN and SP, but in place of a reference gene model, the coordinates of the union of the aligned evidence (see panel Ba) are used instead". In the union, I expect that a bp that is in an exon in any of the evidence would be considered a TP. If so, then why isn't nt-level AED for Annotation 1 in Bb zero? > > I'm probably missing something trivial. > > Thanks > > Jim > ===================================== > Jim Hu > Professor > Dept. of Biochemistry and Biophysics > 2128 TAMU > Texas A&M Univ. > College Station, TX 77843-2128 > 979-862-4054 > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From bmoore at genetics.utah.edu Sat Nov 8 16:38:23 2014 From: bmoore at genetics.utah.edu (Barry Moore) Date: Sat, 8 Nov 2014 23:38:23 +0000 Subject: [maker-devel] Speaking of AED... In-Reply-To: References: Message-ID: The 5'-most junction on the 3' terminal exon (assuming + strand) is discordant in both Annotation 1 & 2 from the evidence in Ba. B On Nov 7, 2014, at 11:34 AM, Jim Hu > wrote: I was teaching Yandell and Ence (2012) in the genomics class I co-teach, and was having trouble understanding how the values for AED in Box 4 Figure Bb derive from the evidence set in Figure Ba. Box 4 says: "AED is calculated in the same manner as SN and SP, but in place of a reference gene model, the coordinates of the union of the aligned evidence (see panel Ba) are used instead". In the union, I expect that a bp that is in an exon in any of the evidence would be considered a TP.
If so, then why isn't nt-level AED for Annotation 1 in Bb zero? I'm probably missing something trivial. Thanks Jim ===================================== Jim Hu Professor Dept. of Biochemistry and Biophysics 2128 TAMU Texas A&M Univ. College Station, TX 77843-2128 979-862-4054 _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org [Attachment scrubbed: PastedGraphic-1.png] From carsonhh at gmail.com Sat Nov 8 17:13:44 2014 From: carsonhh at gmail.com (Carson Holt) Date: Sat, 8 Nov 2014 17:13:44 -0700 Subject: [maker-devel] DBD::SQLite::db do failed errors In-Reply-To: References: Message-ID: <3100A718-B063-4BC5-A036-943DEBCC6484@gmail.com> It's caused by one of the characters in your GFF3 file. For example characters NOT in the set [a-zA-Z0-9.:^*$@!+_?-|] must be escaped in GFF3 with exceptions outlined in the format spec. You may have either a ' or a " that must be escaped. --Carson > On Nov 8, 2014, at 6:58 AM, Timothy Stitt (TGAC) wrote: > > Dear Maker Support, > > I'm running Maker v2.31.7 and I'm receiving lots of the following warnings/errors during the run. The errors don't seem to prevent the calculation from completing so I was just wondering how I can avoid getting them? > > > STATUS: Setting up database for any GFF3 input... > DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 1. > DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 2.
> DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 3. > ? > > > Thanks in advance, > > Tim. > --- > Timothy Stitt PhD / Head of Scientific Computing > The Genome Analysis Centre (TGAC) > http://www.tgac.ac.uk/ > > p: +44 1603 450378 > e: timothy.stitt at tgac.ac.uk _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From bmoore at genetics.utah.edu Sat Nov 8 17:07:43 2014 From: bmoore at genetics.utah.edu (Barry Moore) Date: Sun, 9 Nov 2014 00:07:43 +0000 Subject: [maker-devel] Speaking of AED... In-Reply-To: References: Message-ID: <6AB384C1-4F5C-4132-9B3A-23F0DE3A9351@genetics.utah.edu> Hmm, I missed the one Daniel pointed out, and then upon inspection noticed a third discordant exon. This figure needs to go into a book of optical illusions! [cid:87ED4E8E-56C4-4808-A7E3-9F0B4521CADB] On Nov 8, 2014, at 4:38 PM, Barry Moore > wrote: The 5? most junction on the 3? terminal exon (assuming + strand) is discordant in both Annotation 1 & 2 from the evidence in Ba. B On Nov 7, 2014, at 11:34 AM, Jim Hu > wrote: I was teaching Yandell and Ence (2012) in the genomics class I co-teach, and was having trouble understanding the values for AED in Box 4 Figure Bb derive from the evidence set in Figure Ba. Box 4 says: "AAED is caculated in the same manner as SN and SPm but in place of a reference gene model, the coordinates of the union of the aligned evidence (see panel Ba) are used instead". In the union, I expect that a bp that is in an exon in any of the evidence would be considered a TP. If so, then why isn't nt-level AED for Annotation 1 in Bb zero? I'm probably missing something trivial. Thanks Jim ===================================== Jim Hu Professor Dept. 
of Biochemistry and Biophysics 2128 TAMU Texas A&M Univ. College Station, TX 77843-2128 979-862-4054 _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org [Attachment scrubbed: PastedGraphic-2.png] From michael.s.campbell1 at gmail.com Sat Nov 8 23:01:27 2014 From: michael.s.campbell1 at gmail.com (Michael Campbell) Date: Sat, 8 Nov 2014 23:01:27 -0700 Subject: [maker-devel] Fwd: AED plot In-Reply-To: References: Message-ID: Hi Saranya, If you can send me a copy of your gff3 file I can look at it and see why you are getting the error. That is a pretty young accessory script so there may be something in your file that it hasn't seen before. Thanks, Mike On Sat, Nov 8, 2014 at 11:58 AM, Saranya Sankaranarayanan < sarasank at umail.iu.edu> wrote: > Hi Maker authors, > > I am new to using Maker. I have a few basic questions. > > I have the maker annotation complete and I ran the > > gff3_merge -n -d genome_master_datastore_index.log - to create the gff > file > > After that, I used the script AED_cdf_generator.pl to obtain the AED > plot, while I get the error: > > > Use of uninitialized value $total in division (/) at > ./AED_cdf_generator.pl line 43. > Illegal division by zero at ./AED_cdf_generator.pl line 43. > > I parsed my gff file as: > AED_cdf_generator.pl -b 0.025 maker.gff > > Could anyone please help me with this error? Thank you. It looks like no > value is parsed to the variable total, but I am not able to decipher why.
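One way to narrow down the division-by-zero error quoted above is to count how many mRNA features in the merged GFF3 actually carry an AED value. The Python sketch below is not the accessory script and makes no claim about how that script works internally. It assumes MAKER writes `_AED=<value>` in column 9 of `mRNA` rows and that the CDF uses cumulative bins from 0 to 1; the function names and sample rows are made up.

```python
import re

# Matches "_AED=0.12" at the start of column 9 or after ";", but not "_eAED=".
AED_PATTERN = re.compile(r"(?:^|;)_AED=([0-9.]+)")


def aed_values(gff3_lines):
    """Collect the _AED value of every mRNA feature."""
    values = []
    for line in gff3_lines:
        if line.startswith("##FASTA"):
            break  # sequence section, no more features
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) == 9 and cols[2] == "mRNA":
            found = AED_PATTERN.search(cols[8])
            if found:
                values.append(float(found.group(1)))
    return values


def cumulative_fraction(values, bin_size=0.025):
    """Return (cutoff, fraction of transcripts with AED <= cutoff) pairs."""
    if not values:
        raise ValueError("no mRNA features with an _AED attribute were found")
    n_bins = int(round(1.0 / bin_size))
    table = []
    for i in range(n_bins + 1):
        cutoff = round(i * bin_size, 6)
        frac = sum(1 for v in values if v <= cutoff) / len(values)
        table.append((cutoff, frac))
    return table


# Made-up rows in the shape assumed above.
sample = [
    "chr1\tmaker\tgene\t1\t900\t.\t+\t.\tID=g1",
    "chr1\tmaker\tmRNA\t1\t900\t.\t+\t.\tID=g1-RA;Parent=g1;_AED=0.05;_eAED=0.07",
    "chr1\tmaker\tmRNA\t1\t900\t.\t+\t.\tID=g1-RB;Parent=g1;_AED=0.50;_eAED=0.50",
]

values = aed_values(sample)
print(len(values), "transcripts with AED values")
print(cumulative_fraction(values)[-1])
```

If `aed_values` comes back empty for a real file, the merged GFF3 probably holds evidence alignments but no MAKER gene models, which would be consistent with a total of zero.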
> > Regards, > Saranya > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > -- Michael Campbell MS, RD. Doctoral Candidate Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:585-3543 -------------- next part -------------- An HTML attachment was scrubbed... URL: From muriel.grosb at gmail.com Mon Nov 10 03:35:30 2014 From: muriel.grosb at gmail.com (Muriel Gros-Balthazard) Date: Mon, 10 Nov 2014 11:35:30 +0100 Subject: [maker-devel] running Maker but skipping first steps Message-ID: <546094F2.6000100@gmail.com> Hello, I want to run Maker but I would like to skip the first steps : STATUS: Parsing control files... STATUS: Processing and indexing input FASTA files... STATUS: Setting up database for any GFF3 input... A data structure will be created for you at: /Data/Genomics/GeneAnnotation/Maker_pipeline/5_Run_Maker/Pdac_ref2013s.maker.output/Pdac_ref2013s_datastore To access files for individual sequences use the datastore index: /Data/Genomics/GeneAnnotation/Maker_pipeline/5_Run_Maker/Pdac_ref2013s.maker.output/Pdac_ref2013s_master_datastore_index.log Indeed, there was an error in RepeatMasking (and I reinstalled RepeatMasker) but I believe that the previous steps are always the same. Is there a way to run Maker so that it doesn't run this first steps again given that the control files didn't change, the fasta files are already indexed and the database of gff3 is set up ? Thank you ! Muriel From FeatherstonJ at arc.agric.za Mon Nov 10 06:42:15 2014 From: FeatherstonJ at arc.agric.za (Jonathan Featherston) Date: Mon, 10 Nov 2014 13:42:15 +0000 Subject: [maker-devel] Maker Message-ID: <57CFF349-6C9F-4172-ADB3-A9572E21D4A3@arc.agric.za> Dear Carson I've been trying to train SNAP with Maker but I'm getting empty genome.ann and .dna files. 
I have tried running the maker2zff on the implant page to see if my script was corrupt. No help from that. I've seen a few pages in the group and on seqanswers about the empty off files but most seem to have been resolved by including all outputs (maker2zff -n) and even this doesn't generate anything for me?. So I'm guessing the problem is somewhere with the maker outputs. I did get errors from the maker run but they seem to be about mli and ALRM (a perl error- what a pain getting perl libs on a mac). Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. -------------------------------------------------------------------------- mpiexec has exited due to process rank 4 with PID 5935 on node Administrators-MacBook-Pro-9 exiting improperly. There are three reasons this could occur: 1. this process did not call "init" before exiting, but others in the job did. This can cause a job to hang indefinitely while it waits for all processes to call "init". By rule, if one process calls "init", then ALL processes must call "init" prior to termination. 2. this process called "init", but exited without calling "finalize". By rule, all processes that call "init" MUST call "finalize" prior to exiting or it will be considered an "abnormal termination" 3. 
this process called "MPI_Abort" or "orte_abort" and the mca parameter orte_create_session_dirs is set to false. In this case, the run-time cannot detect that the abort call was an abnormal termination. Hence, the only error message you will receive is this one. This may have caused other processes in the application to be terminated by signals sent by mpiexec (as reported here). You can avoid this message by specifying -quiet on the mpiexec command line. Maker did finish and the gff file produced (I can't produce a fasta file from the est2genome=1 option??) seems ok. It has produced protein-matches and match_part although I don't see maker product. Otherwise I ran maker with mli using the command from the CPBI maker paper. I used -nohup mpiexec -n 8 maker < /dev/null & for my maker execution. I'm using altest and protein homology for now. Thank you very much for what help you can provide. I really enjoyed the workshop you and Mark presented! Kind Regards Jonathan -------------- next part -------------- An HTML attachment was scrubbed... URL: From Timothy.Stitt at tgac.ac.uk Mon Nov 10 06:59:59 2014 From: Timothy.Stitt at tgac.ac.uk (Timothy Stitt (TGAC)) Date: Mon, 10 Nov 2014 13:59:59 +0000 Subject: [maker-devel] DBD::SQLite::db do failed errors In-Reply-To: <3100A718-B063-4BC5-A036-943DEBCC6484@gmail.com> References: <3100A718-B063-4BC5-A036-943DEBCC6484@gmail.com> Message-ID: Thanks Carson. I checked the *.gff files for ' and " symbols. I only observed a bunch of " in one of the files as follows: scaffold16677 exonerate:protein2genome:local gene 128238 128710 339 - . gene_id 0 ; sequence Lus10000040|PACid:23139618 ; gene_orientation + scaffold16677 exonerate:protein2genome:local cds 128645 128710 . - . scaffold16677 exonerate:protein2genome:local exon 128645 128710 . - . insertions 0 ; deletions 0 scaffold16677 exonerate:protein2genome:local splice5 128643 128644 . - . 
intron_id 1 ; splice_site "GT" scaffold16677 exonerate:protein2genome:local intron 128552 128644 . - . intron_id 1 scaffold16677 exonerate:protein2genome:local splice3 128552 128553 . - . intron_id 0 ; splice_site "AG" scaffold16677 exonerate:protein2genome:local cds 128442 128551 . - . scaffold16677 exonerate:protein2genome:local exon 128442 128551 . - . insertions 0 ; deletions 0 scaffold16677 exonerate:protein2genome:local splice5 128440 128441 . - . intron_id 2 ; splice_site "GT" scaffold16677 exonerate:protein2genome:local intron 128362 128441 . - . intron_id 2 scaffold16677 exonerate:protein2genome:local splice3 128362 128363 . - . intron_id 1 ; splice_site "AG" Would these "GT", "AG" etc. strings cause the problem? If so, how should I change them to be correct? Thanks, Tim. --- Timothy Stitt PhD / Head of Scientific Computing The Genome Analysis Centre (TGAC) http://www.tgac.ac.uk/ p: +44 1603 450378 e: timothy.stitt at tgac.ac.uk From: Carson Holt > Date: Sunday, 9 November 2014 00:13 To: Timothy Stitt > Cc: "maker-devel at yandell-lab.org" > Subject: Re: [maker-devel] DBD::SQLite::db do failed errors It?s caused by one of the characters in your GFF3 file. For example characters NOT in the set [a-zA-Z0-9.:^*$@!+_?-|] must be escaped in GFF3 with exceptions outlined in the format spec. You mayhave either a ? or a ? that must be escaped. ?Carson On Nov 8, 2014, at 6:58 AM, Timothy Stitt (TGAC) > wrote: Dear Maker Support, I'm running Maker v2.31.7 and I'm receiving lots of the following warnings/errors during the run. The errors don't seem to prevent the calculation from completing so I was just wondering how I can avoid getting them? STATUS: Setting up database for any GFF3 input... DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 1. 
DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 2.
DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 3.
...

Thanks in advance,

Tim.
---
Timothy Stitt PhD / Head of Scientific Computing
The Genome Analysis Centre (TGAC)
http://www.tgac.ac.uk/

p: +44 1603 450378
e: timothy.stitt at tgac.ac.uk

From myandell at genetics.utah.edu Sat Nov 8 17:32:12 2014
From: myandell at genetics.utah.edu (Mark Yandell)
Date: Sun, 9 Nov 2014 00:32:12 +0000
Subject: [maker-devel] Speaking of AED...
In-Reply-To: <6AB384C1-4F5C-4132-9B3A-23F0DE3A9351@genetics.utah.edu>
References: , <6AB384C1-4F5C-4132-9B3A-23F0DE3A9351@genetics.utah.edu>
Message-ID: <7A60AB257EFF2B48B1F4C814817EA053E3664681@mxb1.hg.genetics.utah.edu>

And you are still missing one: the 3-prime end of the middle exon is also discordant. I agree, though, that somehow the color makes it hard to see. Sorry.

--mark

Mark Yandell
Professor of Human Genetics
H.A. & Edna Benning Presidential Endowed Chair
Co-director USTAR Center for Genetic Discovery
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
ph:801-587-7707
________________________________________
From: maker-devel [maker-devel-bounces at yandell-lab.org] on behalf of Barry Moore [bmoore at genetics.utah.edu]
Sent: Saturday, November 08, 2014 5:07 PM
To: Jim Hu; maker-devel at yandell-lab.org
Cc: Barry Moore
Subject: Re: [maker-devel] Speaking of AED...

Hmm, I missed the one Daniel pointed out, and then upon inspection noticed a third discordant exon.
This figure needs to go into a book of optical illusions!

On Nov 8, 2014, at 4:38 PM, Barry Moore wrote:

The 5'-most junction on the 3' terminal exon (assuming + strand) is discordant in both Annotation 1 & 2 from the evidence in Ba.

B

On Nov 7, 2014, at 11:34 AM, Jim Hu wrote:

I was teaching Yandell and Ence (2012) in the genomics class I co-teach, and was having trouble understanding how the values for AED in Box 4 Figure Bb derive from the evidence set in Figure Ba. Box 4 says: "AED is calculated in the same manner as SN and SP, but in place of a reference gene model, the coordinates of the union of the aligned evidence (see panel Ba) are used instead".

In the union, I expect that a bp that is in an exon in any of the evidence would be considered a TP. If so, then why isn't nt-level AED for Annotation 1 in Bb zero?

I'm probably missing something trivial.

Thanks

Jim
=====================================
Jim Hu
Professor
Dept. of Biochemistry and Biophysics
2128 TAMU
Texas A&M Univ.
College Station, TX 77843-2128
979-862-4054

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PastedGraphic-2.png
Type: image/png
Size: 362479 bytes
Desc: PastedGraphic-2.png
URL:

From sarasank at umail.iu.edu Sun Nov 9 09:33:08 2014
From: sarasank at umail.iu.edu (Saranya Sankaranarayanan)
Date: Sun, 9 Nov 2014 11:33:08 -0500
Subject: [maker-devel] Fwd: AED plot
In-Reply-To:
References:
Message-ID:

Hi Mike,

Please find the gff3 file attached with this email. Thanks a lot for the very prompt response.
Sincerely, Saranya Sankaranarayanan Master's Student, SoIC Indiana University On Sun, Nov 9, 2014 at 1:01 AM, Michael Campbell < michael.s.campbell1 at gmail.com> wrote: > Hi Sranya, > > If you can send me a copy of your gff3 file I can look at it and see why > you are getting the error. That is a pretty young accessory script so there > may be something in your file that it has't seen before. > > Thanks, > Mike > > On Sat, Nov 8, 2014 at 11:58 AM, Saranya Sankaranarayanan < > sarasank at umail.iu.edu> wrote: > >> Hi Maker authors, >> >> I am new to using Maker. I have a few basic questions. >> >> I have the maker annotation complete and I ran the >> >> gff3_merge -n -d genome_master_datastore_index.log - to create the gff >> file >> >> After that, I used the script AED_cdf_generator.pl to obtain the AED >> plot, while I get the error: >> >> >> Use of uninitialized value $total in division (/) at >> ./AED_cdf_generator.pl line 43. >> Illegal division by zero at ./AED_cdf_generator.pl line 43. >> >> I parsed my gff file as: >> AED_cdf_generator.pl -b 0.025 maker.gff >> >> Could anyone please help me with this error? Thank you. It looks like no >> value is parsed to the variable total, but I am not able to decipher why. >> >> Regards, >> Saranya >> >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >> > > > -- > Michael Campbell MS, RD. > Doctoral Candidate > Eccles Institute of Human Genetics > University of Utah > 15 North 2030 East, Room 2100 > Salt Lake City, UT 84112-5330 > ph:585-3543 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Gff3.zip
Type: application/zip
Size: 2098998 bytes
Desc: not available
URL:

From carsonhh at gmail.com Mon Nov 10 08:16:28 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 10 Nov 2014 08:16:28 -0700
Subject: [maker-devel] DBD::SQLite::db do failed errors
In-Reply-To:
References: <3100A718-B063-4BC5-A036-943DEBCC6484@gmail.com>
Message-ID:

Actually, that is not a GFF3 file. It appears to be GTF, which is structured differently from GFF3, so you would need to convert it to GFF3. You can try the Sequence Ontology converter here: http://www.sequenceontology.org/cgi-bin/converter.cgi

Unfortunately, it is unlikely to be a painless process. GTF files vary so much between sources that one GTF file might not actually be compatible with another, so you may have to spend some time editing the file before the converter will work.

--Carson

> On Nov 10, 2014, at 6:59 AM, Timothy Stitt (TGAC) wrote:
>
> Thanks Carson.
>
> I checked the *.gff files for ' and " symbols. I only observed a bunch of " in one of the files as follows:
>
> scaffold16677 exonerate:protein2genome:local gene 128238 128710 339 - . gene_id 0 ; sequence Lus10000040|PACid:23139618 ; gene_orientation +
> scaffold16677 exonerate:protein2genome:local cds 128645 128710 . - .
> scaffold16677 exonerate:protein2genome:local exon 128645 128710 . - . insertions 0 ; deletions 0
> scaffold16677 exonerate:protein2genome:local splice5 128643 128644 . - . intron_id 1 ; splice_site "GT"
> scaffold16677 exonerate:protein2genome:local intron 128552 128644 . - . intron_id 1
> scaffold16677 exonerate:protein2genome:local splice3 128552 128553 . - . intron_id 0 ; splice_site "AG"
> scaffold16677 exonerate:protein2genome:local cds 128442 128551 . - .
> scaffold16677 exonerate:protein2genome:local exon 128442 128551 . - . insertions 0 ; deletions 0
> scaffold16677 exonerate:protein2genome:local splice5 128440 128441 . - .
> intron_id 2 ; splice_site "GT" > scaffold16677 exonerate:protein2genome:local > intron 128362 > 128441 . > - . > intron_id 2 > scaffold16677 exonerate:protein2genome:local > splice3 128362 > 128363 . > - . > intron_id 1 ; splice_site "AG" > > Would these "GT", "AG" etc. strings cause the problem? If so, how should I change them to be correct? > > Thanks, > > Tim. > --- > Timothy Stitt PhD / Head of Scientific Computing > The Genome Analysis Centre (TGAC) > http://www.tgac.ac.uk/ > > p: +44 1603 450378 > e: timothy.stitt at tgac.ac.uk > > From: Carson Holt > > Date: Sunday, 9 November 2014 00:13 > To: Timothy Stitt > > Cc: "maker-devel at yandell-lab.org " > > Subject: Re: [maker-devel] DBD::SQLite::db do failed errors > > It?s caused by one of the characters in your GFF3 file. For example characters NOT in the set [a-zA-Z0-9.:^*$@!+_?-|] must be escaped in GFF3 with exceptions outlined in the format spec. You mayhave either a ? or a ? that must be escaped. > > ?Carson > > > > >> On Nov 8, 2014, at 6:58 AM, Timothy Stitt (TGAC) > wrote: >> >> Dear Maker Support, >> >> I'm running Maker v2.31.7 and I'm receiving lots of the following warnings/errors during the run. The errors don't seem to prevent the calculation from completing so I was just wondering how I can avoid getting them? >> >> >> STATUS: Setting up database for any GFF3 input... >> DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 1. >> DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 2. >> DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 3. >> ? >> >> >> Thanks in advance, >> >> Tim. 
>> --- >> Timothy Stitt PhD / Head of Scientific Computing >> The Genome Analysis Centre (TGAC) >> http://www.tgac.ac.uk/ >> >> p: +44 1603 450378 >> e: timothy.stitt at tgac.ac.uk _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon Nov 10 08:23:12 2014 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 10 Nov 2014 08:23:12 -0700 Subject: [maker-devel] running Maker but skipping first steps In-Reply-To: <546094F2.6000100@gmail.com> References: <546094F2.6000100@gmail.com> Message-ID: These are just status messages. The steps don?t actually rerun, except for the control file parsing. That obviously has to happen every time for MAKER to know the control files are still the same between runs. Both these messages ?> STATUS: Processing and indexing input FASTA files... STATUS: Setting up database for any GFF3 input... MAKER sees that the indexes already exists, validates their integrity, and then moves on. So there is no rerunning of steps. ?Carson > On Nov 10, 2014, at 3:35 AM, Muriel Gros-Balthazard wrote: > > Hello, > > I want to run Maker but I would like to skip the first steps : > STATUS: Parsing control files... > STATUS: Processing and indexing input FASTA files... > STATUS: Setting up database for any GFF3 input... > A data structure will be created for you at: > /Data/Genomics/GeneAnnotation/Maker_pipeline/5_Run_Maker/Pdac_ref2013s.maker.output/Pdac_ref2013s_datastore > > To access files for individual sequences use the datastore index: > /Data/Genomics/GeneAnnotation/Maker_pipeline/5_Run_Maker/Pdac_ref2013s.maker.output/Pdac_ref2013s_master_datastore_index.log > > Indeed, there was an error in RepeatMasking (and I reinstalled RepeatMasker) but I believe that the previous steps are always the same. 
> Is there a way to run Maker so that it doesn't run this first steps again given that the control files didn't change, the fasta files are already indexed and the database of gff3 is set up ?
>
> Thank you !
>
> Muriel

From mike.thon at gmail.com Mon Nov 10 08:54:00 2014
From: mike.thon at gmail.com (Michael Thon)
Date: Mon, 10 Nov 2014 16:54:00 +0100
Subject: [maker-devel] map2assembly
Message-ID:

Hi -

We're using map2assembly to map genes from other gene annotation pipelines onto the genome sequence in order to compare AED values to a maker de novo annotation. We found a few transcripts that map2assembly maps to multiple loci in the genome. Is there any way to know whether these are all equally good alignments, or are they simply all above some preset threshold?

I'm trying to decide what to do with the multiple mappings: whether we should discard all but one (in which case we'd need to decide which one) or keep them all. Keeping them all makes the most sense, but the problem is that they all have the same ID. Should map2assembly append a number to the ID when a transcript maps to multiple locations in the genome?

From carsonhh at gmail.com Mon Nov 10 09:08:26 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 10 Nov 2014 09:08:26 -0700
Subject: [maker-devel] map2assembly
In-Reply-To:
References:
Message-ID: <2D1BD5DD-7405-448F-B68D-80C8FEEDC6B3@gmail.com>

Try using the transcript score (column 6). It should indicate the % recovery. A score of 100 means a perfect match to the input transcript. The value is %identity multiplied by %coverage, so it will decrease because of a lack of identity or a lack of end-to-end alignment.
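
As a rough illustration of how such a score could be used to rank multiple mappings of the same transcript, here is a small Python sketch. It assumes the score is simply percent identity times percent coverage rescaled to 0-100, as described above; the function name `map_score` and the three example hits are made up for illustration and are not MAKER code or real output.

```python
def map_score(percent_identity: float, percent_coverage: float) -> float:
    """Combine identity and coverage (both 0-100) into a single 0-100 score."""
    return percent_identity * percent_coverage / 100.0


# Hypothetical alignments of one transcript at three loci:
# (percent identity, percent coverage)
hits = {
    "locus_A": (100.0, 100.0),  # perfect, end-to-end match
    "locus_B": (98.0, 100.0),   # full length, a few mismatches
    "locus_C": (100.0, 60.0),   # identical, but only partial alignment
}

scores = {name: map_score(ident, cov) for name, (ident, cov) in hits.items()}
best = max(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}\t{score:.1f}")
print("best mapping:", best)
```

Under that assumption, a locus that scores 100 is an exact copy of the input transcript, while lower scores point to either mismatches or a truncated alignment.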
--Carson

From carsonhh at gmail.com Mon Nov 10 09:12:53 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 10 Nov 2014 09:12:53 -0700
Subject: [maker-devel] map2assembly
In-Reply-To: <2D1BD5DD-7405-448F-B68D-80C8FEEDC6B3@gmail.com>
References: <2D1BD5DD-7405-448F-B68D-80C8FEEDC6B3@gmail.com>
Message-ID: <681F266A-0549-4165-9261-FDD9F268D674@gmail.com>

You can also use the -l option when running gff3_merge to correct for non-unique IDs when merging multiple GFF3 files (i.e., IDs will be unique within a file but may not be unique across files; when mapping transcripts, the IDs are copied directly from the aligned transcript).

--Carson

> On Nov 10, 2014, at 9:08 AM, Carson Holt wrote:
>
> Try using the transcript score (column 6). It should indicate the % recovery. A 100 means perfect match to the input transcript. The value is %identity multiplied by %coverage, so it will decrease because of a lack of identity or a lack of end-to-end alignment.

From carsonhh at gmail.com Mon Nov 10 09:34:33 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 10 Nov 2014 09:34:33 -0700
Subject: [maker-devel] Maker
In-Reply-To: <57CFF349-6C9F-4172-ADB3-A9572E21D4A3@arc.agric.za>
References: <57CFF349-6C9F-4172-ADB3-A9572E21D4A3@arc.agric.za>
Message-ID: <1354797B-F783-4671-BB94-42A0E1611B03@gmail.com>

You probably have an error further upstream. The 'Argument "ALRM" isn't numeric' error is just something you get when things are dying in an inelegant way; the cause will be further up the error log.

The lack of fasta files means that you have no final gene models. Either your contigs are too short to produce a model, or your evidence alignments are insufficient in end-to-end coverage, splice site recovery on polishing, or %identity, so MAKER cannot derive a usable model from alignment alone. What is your longest contig? Also try running CEGMA from the Korf lab to help identify whether the assembly is incomplete and by how much.
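
One quick way to answer the longest-contig question is to scan the genome FASTA directly. The following is a minimal Python sketch, assuming a plain, uncompressed FASTA file; the helper name `contig_lengths` and the toy records are hypothetical and are not part of MAKER.

```python
import io


def contig_lengths(handle):
    """Yield (name, length) for each record in an open FASTA file handle."""
    name, length = None, 0
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, length
            fields = line[1:].split()
            # Use the first word of the header as the contig name.
            name, length = (fields[0] if fields else ""), 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield name, length


# Toy input standing in for a real genome file; for real data use
# open("genome.fasta") instead (the file name is an assumption).
demo = io.StringIO(">contig1 test\nACGT\nAC\n>contig2\nACGTACGTACGT\n")
longest = max(contig_lengths(demo), key=lambda record: record[1])
print("longest contig:", longest[0], "length:", longest[1])
```

If the longest contig is only a few kilobases, it is plausible that few or no complete gene models can be built from it, which would be consistent with the empty output described above.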
?Carson > On Nov 10, 2014, at 6:42 AM, Jonathan Featherston wrote: > > Dear Carson > > I've been trying to train SNAP with Maker but I'm getting empty genome.ann and .dna files. I have tried running the maker2zff on the implant page to see if my script was corrupt. No help from that. I've seen a few pages in the group and on seqanswers about the empty off files but most seem to have been resolved by including all outputs (maker2zff -n) and even this doesn't generate anything for me?. So I'm guessing the problem is somewhere with the maker outputs. > > I did get errors from the maker run but they seem to be about mli and ALRM (a perl error- what a pain getting perl libs on a mac). > > > Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. > Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. > Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. > Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. > Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. > -------------------------------------------------------------------------- > mpiexec has exited due to process rank 4 with PID 5935 on > node Administrators-MacBook-Pro-9 exiting improperly. There are three reasons this could occur: > > 1. this process did not call "init" before exiting, but others in > the job did. This can cause a job to hang indefinitely while it waits > for all processes to call "init". By rule, if one process calls "init", > then ALL processes must call "init" prior to termination. > > 2. 
this process called "init", but exited without calling "finalize".
> By rule, all processes that call "init" MUST call "finalize" prior to
> exiting or it will be considered an "abnormal termination"
>
> 3. this process called "MPI_Abort" or "orte_abort" and the mca parameter
> orte_create_session_dirs is set to false. In this case, the run-time cannot
> detect that the abort call was an abnormal termination. Hence, the only
> error message you will receive is this one.
>
> This may have caused other processes in the application to be
> terminated by signals sent by mpiexec (as reported here).
>
> You can avoid this message by specifying -quiet on the mpiexec command line.
>
> Maker did finish and the gff file produced (I can't produce a fasta file from the est2genome=1 option??) seems ok. It has produced protein-matches and match_part although I don't see maker product.
>
> Otherwise I ran maker with mli using the command from the CPBI maker paper. I used -nohup mpiexec -n 8 maker < /dev/null & for my maker execution.
>
> I'm using altest and protein homology for now.
>
> Thank you very much for what help you can provide. I really enjoyed the workshop you and Mark presented!
>
> Kind Regards
> Jonathan

From michael.s.campbell1 at gmail.com Mon Nov 10 13:54:14 2014
From: michael.s.campbell1 at gmail.com (Michael Campbell)
Date: Mon, 10 Nov 2014 13:54:14 -0700
Subject: [maker-devel] Fwd: AED plot
In-Reply-To:
References:
Message-ID:

Hi Saranya,

I fixed the AED_cdf_generator.pl script and added it to the svn repository for MAKER, so it will be available in the next MAKER release. If you are using the svn repository, you can do an svn update and get the new version of the script in the MAKER bin.
If not I've attached a copy of the script to this email (I removed the .pl extension to the file since some email servers will block .pl files). let me know if you have any more problems with it. Thanks, Mike On Sun, Nov 9, 2014 at 9:33 AM, Saranya Sankaranarayanan < sarasank at umail.iu.edu> wrote: > Hi Mike, > > Please find the gff3 file attached with this email. Thanks a lot for the > very prompt response. > > Sincerely, > Saranya Sankaranarayanan > Master's Student, SoIC > Indiana University > > On Sun, Nov 9, 2014 at 1:01 AM, Michael Campbell < > michael.s.campbell1 at gmail.com> wrote: > >> Hi Sranya, >> >> If you can send me a copy of your gff3 file I can look at it and see why >> you are getting the error. That is a pretty young accessory script so there >> may be something in your file that it has't seen before. >> >> Thanks, >> Mike >> >> On Sat, Nov 8, 2014 at 11:58 AM, Saranya Sankaranarayanan < >> sarasank at umail.iu.edu> wrote: >> >>> Hi Maker authors, >>> >>> I am new to using Maker. I have a few basic questions. >>> >>> I have the maker annotation complete and I ran the >>> >>> gff3_merge -n -d genome_master_datastore_index.log - to create the gff >>> file >>> >>> After that, I used the script AED_cdf_generator.pl to obtain the AED >>> plot, while I get the error: >>> >>> >>> Use of uninitialized value $total in division (/) at >>> ./AED_cdf_generator.pl line 43. >>> Illegal division by zero at ./AED_cdf_generator.pl line 43. >>> >>> I parsed my gff file as: >>> AED_cdf_generator.pl -b 0.025 maker.gff >>> >>> Could anyone please help me with this error? Thank you. It looks like no >>> value is parsed to the variable total, but I am not able to decipher why. >>> >>> Regards, >>> Saranya >>> >>> >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >>> >> >> >> -- >> Michael Campbell MS, RD. 
>> Doctoral Candidate >> Eccles Institute of Human Genetics >> University of Utah >> 15 North 2030 East, Room 2100 >> Salt Lake City, UT 84112-5330 >> ph:585-3543 >> >> > -- Michael Campbell MS, RD. Doctoral Candidate Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:585-3543 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: AED_cdf_generator Type: application/octet-stream Size: 2980 bytes Desc: not available URL: From sarasank at umail.iu.edu Mon Nov 10 14:06:58 2014 From: sarasank at umail.iu.edu (Saranya Sankaranarayanan) Date: Mon, 10 Nov 2014 16:06:58 -0500 Subject: [maker-devel] Fwd: AED plot In-Reply-To: References: Message-ID: Great! It works now. Thanks a lot for the support! Sincerely, Saranya Sankaranarayanan Master's Student, SoIC Indiana University On Mon, Nov 10, 2014 at 3:54 PM, Michael Campbell < michael.s.campbell1 at gmail.com> wrote: > Hi Saranya, > > I fixed the AED_cdf_generator.pl scrip and added it to the svn repository > for MAKER so it will be available in the next MAKE release. If you are > using the svn repository you can do an svn update and get the new version > of the script in the MAKER bin. If not I've attached a copy of the script > to this email (I removed the .pl extension to the file since some email > servers will block .pl files). let me know if you have any more problems > with it. > > Thanks, > Mike > > On Sun, Nov 9, 2014 at 9:33 AM, Saranya Sankaranarayanan < > sarasank at umail.iu.edu> wrote: > >> Hi Mike, >> >> Please find the gff3 file attached with this email. Thanks a lot for the >> very prompt response. 
>> >> Sincerely, >> Saranya Sankaranarayanan >> Master's Student, SoIC >> Indiana University >> >> On Sun, Nov 9, 2014 at 1:01 AM, Michael Campbell < >> michael.s.campbell1 at gmail.com> wrote: >> >>> Hi Sranya, >>> >>> If you can send me a copy of your gff3 file I can look at it and see why >>> you are getting the error. That is a pretty young accessory script so there >>> may be something in your file that it has't seen before. >>> >>> Thanks, >>> Mike >>> >>> On Sat, Nov 8, 2014 at 11:58 AM, Saranya Sankaranarayanan < >>> sarasank at umail.iu.edu> wrote: >>> >>>> Hi Maker authors, >>>> >>>> I am new to using Maker. I have a few basic questions. >>>> >>>> I have the maker annotation complete and I ran the >>>> >>>> gff3_merge -n -d genome_master_datastore_index.log - to create the >>>> gff file >>>> >>>> After that, I used the script AED_cdf_generator.pl to obtain the AED >>>> plot, while I get the error: >>>> >>>> >>>> Use of uninitialized value $total in division (/) at >>>> ./AED_cdf_generator.pl line 43. >>>> Illegal division by zero at ./AED_cdf_generator.pl line 43. >>>> >>>> I parsed my gff file as: >>>> AED_cdf_generator.pl -b 0.025 maker.gff >>>> >>>> Could anyone please help me with this error? Thank you. It looks like >>>> no value is parsed to the variable total, but I am not able to decipher why. >>>> >>>> Regards, >>>> Saranya >>>> >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>>> >>>> >>> >>> >>> -- >>> Michael Campbell MS, RD. >>> Doctoral Candidate >>> Eccles Institute of Human Genetics >>> University of Utah >>> 15 North 2030 East, Room 2100 >>> Salt Lake City, UT 84112-5330 >>> ph:585-3543 >>> >>> >> > > > -- > Michael Campbell MS, RD. 
> Doctoral Candidate > Eccles Institute of Human Genetics > University of Utah > 15 North 2030 East, Room 2100 > Salt Lake City, UT 84112-5330 > ph:585-3543 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From goutham.atla at gmail.com Wed Nov 12 23:22:46 2014 From: goutham.atla at gmail.com (Goutham atla) Date: Thu, 13 Nov 2014 11:52:46 +0530 Subject: [maker-devel] URGENT: Re: maker failure with example data In-Reply-To: References: <3CCDC24F-756A-492C-8E7F-B1B97616EE45@genetics.utah.edu> Message-ID: Dear Carson, MAKER throws an error if I provide an rmlib file for repeat masking. It says: At this time the hmmer search engine can only be used with the Dfam database. Please rerun your search without the -lib option or switch to a different search engine. We ran it without rmlib and it completed successfully. We got GFF, proteins and transcripts.fasta files. We are working on Oryza sativa (subspecies indica), but we also have Oryza sativa (subspecies japonica), which is fully annotated. I would like to know what would be the best way to do a functional annotation of the GFF file given by MAKER. Regards, Goutham On Fri, Nov 7, 2014 at 11:09 AM, Goutham atla wrote: > Dear Carson, > > Thanks for the quick reply. It worked after providing the assembled > transcripts and protein fasta from closely related species. > > > Regards, > Goutham > > On Thu, Nov 6, 2014 at 12:34 PM, Carson Holt < > carson.holt at genetics.utah.edu> wrote: > >> The final transcript and proteins fasta files will only exists if there >> were gene models with evidence support. If you did not provide an HMM for >> one of the ab initio gene predictors (SNAP, Augustus, etc.) then there will >> be no gene models, and if you do not provide protein or est evidence >> fastas, then there will be no evidence support. Also if your contigs are >> too short to contain gene models then there will be no models.
>> >> Thanks, >> Carson >> >> >> >> On Nov 5, 2014, at 11:49 PM, Goutham atla >> wrote: >> >> Dear All, >> >> I have finished running maker. But I realised that there are no >> *transcripts.fasta and *protein.fasta files in any of the directories that >> make has created. It has only gtf files. >> >> Example output of a test run: I have similar results on original file >> also: >> >> [User at motif jcf7180001838744]$ pwd >> >> /home/User/Maker_Annotation/Maker_test.maker.output/Maker_test_datastore/35/C1/jcf7180001838744 >> [User at motif jcf7180001838744]$ ls >> jcf7180001838744.gff run.log theVoid.jcf7180001838744 >> >> Any help from you in figuring out why there are no protein.fasta >> and transcripts.fast would be very helpful. >> >> Regards, >> Goutham >> >> On Wed, Oct 1, 2014 at 11:28 AM, Goutham atla >> wrote: >> >>> Dear All, >>> >>> Thank you. I figured out th problem is with mpich2. I was behind >>> mpich2 but was unsuccessful. I installed mpich v3 and its working fine now. >>> Thank you all. The old GMDO tutorials are bit misleading as the new >>> versions have come up. >>> >>> On Wed, Oct 1, 2014 at 11:09 AM, Marc H?ppner >> > wrote: >>> >>>> Another possibility could be that MPICH2 wasn?t build properly, no? I >>>> remember something with enabling shared libraries during the compilation of >>>> mpich, without which the error below would appear. >>>> >>>> /Marc >>>> >>>> Marc P. Hoeppner, PhD >>>> Team Leader >>>> BILS Genome Annotation Platform >>>> Department for Medical Biochemistry and Microbiology >>>> Uppsala University, Sweden >>>> marc.hoeppner at imbim.uu.se >>>> >>>> >>>> >>>> On 30 Sep 2014, at 21:33, Carson Holt >>>> wrote: >>>> >>>> The message is warning that there are multiple instances of MAKER >>>> running, but no MPI communication. When you build MAKER (perl Build.PL step >>>> when installing MAKER), you need to specify the location of 'mpicc' and >>>> 'mpi.h' to build with MPI support. 
Otherwise you won't be able to link >>>> against MPICH2 shared libraries. You probably need to rerun that step. >>>> >>>> --Carson >>>> >>>> >>>> From: Goutham atla >>>> Date: Tuesday, September 30, 2014 at 10:49 AM >>>> To: Carson Holt >>>> Cc: "maker-devel at yandell-lab.org" >>>> Subject: URGENT: Re: maker failure with example data >>>> >>>> Hi Carson, >>>> >>>> I figured out the problem is with RepeatMasker installation and I >>>> fixed it. >>>> >>>> I am running maker with MPICH2 and I get the following warning when I >>>> start it: >>>> >>>> >>>> >>>> *STATUS: Processing and indexing input FASTA files... WARNING: Multiple >>>> MAKER processes have been started in the same directory.* >>>> >>>> I would like to if this is common. >>>> >>>> Regards, >>>> Goutham >>>> >>>> >>>> On Tue, Sep 30, 2014 at 12:02 PM, Goutham atla >>>> wrote: >>>> >>>>> Dear Carson, >>>>> >>>>> Thank you for the reply. I reinstalled the BioPerl and now I am >>>>> getting the following error on test data. >>>>> >>>>> ERROR: RepeatMasker failed >>>>> --> rank=NA, hostname=motif >>>>> ERROR: Failed while doing repeat masking >>>>> ERROR: Chunk failed at level:0, tier_type:1 >>>>> FAILED CONTIG:contig-dpp-500-500 >>>>> >>>>> On Mon, Sep 29, 2014 at 8:17 PM, Carson Holt < >>>>> carson.holt at genetics.utah.edu> wrote: >>>>> >>>>>> The error is caused by the BioPerl indexer returning an empty >>>>>> length for the indexed fasta sequence (possibly because of a corrupt index >>>>>> file or other reasons). You may need to reinstall BioPerl (use the CPAN >>>>>> version not the BioPerl-live version), or reinstall Berkley DB (used by the >>>>>> BioPerl indexer), or reinstall the Perl module DB_File via CPAN (Perl's >>>>>> interface to Berkley DB). After reinstalling BioPerl, delete the >>>>>> mpi_blastdb directory for the MAKER run before retrying. 
>>>>>> >>>>>> Also verify that the /tmp directory on your system or the directory >>>>>> pointed to by TMP= in the maker_opts,ctl file is not full and that TMP= is >>>>>> not set to an NFS mounted location. >>>>>> >>>>>> Thanks, >>>>>> Carson >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> From: Goutham atla >>>>>> Date: Monday, September 29, 2014 at 6:33 AM >>>>>> To: >>>>>> Subject: maker failure with example data >>>>>> >>>>>> Dear All, >>>>>> >>>>>> I am running maker with the demo file, i.e dip_contig.fasta by >>>>>> keeping all other parameters in .ctl files as default. But it do not >>>>>> progress and shows the following message that the length of the sequence is >>>>>> 0. Can anybody help me ? >>>>>> >>>>>> >>>>>> >>>>>> --Next Contig-- >>>>>> >>>>>> MAKER WARNING: All old files will be erased before continuing >>>>>> >>>>>> #--------------------------------------------------------------------- >>>>>> Skipping the contig because it is too short!! >>>>>> SeqID: contig-dpp-500-500 >>>>>> Length: 0 >>>>>> >>>>>> #--------------------------------------------------------------------- >>>>>> >>>>>> >>>>>> Regards, >>>>>> Goutham >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Goutham Atla >>>>> >>>> >>>> >>>> >>>> -- >>>> Goutham Atla >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>>> >>>> >>>> >>> >>> >>> -- >>> Goutham Atla >>> >> >> >> >> -- >> Goutham Atla >> >> >> > > > -- > Goutham Atla > -- Goutham Atla -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cjfields at illinois.edu Thu Nov 13 21:34:06 2014 From: cjfields at illinois.edu (Fields, Christopher J) Date: Fri, 14 Nov 2014 04:34:06 +0000 Subject: [maker-devel] BioPerl Bio::Tools::CodonTable changes Message-ID: <6615DD9E-10F6-4EFE-9900-F66317BDA0EE@illinois.edu> Carson, Just a note that we need to address a specific hack added last year in BioPerl for MAKER re: "strict" codon tables. I added a new one to the end of the list, not thinking that more would eventually be added, and that time has now come: https://github.com/bioperl/bioperl-live/issues/90 I'm not sure how MAKER is setting the table, but if it's by using the codon table # that will likely subtly break as it will now point to the new codon table from NCBI. chris From carsonhh at gmail.com Fri Nov 14 09:46:58 2014 From: carsonhh at gmail.com (Carson Holt) Date: Fri, 14 Nov 2014 09:46:58 -0700 Subject: [maker-devel] BioPerl Bio::Tools::CodonTable changes In-Reply-To: <6615DD9E-10F6-4EFE-9900-F66317BDA0EE@illinois.edu> References: <6615DD9E-10F6-4EFE-9900-F66317BDA0EE@illinois.edu> Message-ID: <27D24CD7-09E3-4618-8A52-578104B34E65@gmail.com> Actually since I wanted to keep compatibility with old versions of BioPerl, I've been using the add_table method to just insert the table I need. Then I select it using the id method. But I think I like the idea of making the strictly canonical codon table be table 0, since having a strictly canonical codon table in BioPerl seems rather important. Thanks, Carson > On Nov 13, 2014, at 9:34 PM, Fields, Christopher J wrote: > > Carson, > > Just a note that we need to address a specific hack added last year in BioPerl for MAKER re: "strict" codon tables.
I added a new one to the end of the list, not thinking that more would eventually be added, and that time has now come: > > https://github.com/bioperl/bioperl-live/issues/90 > > I'm not sure how MAKER is setting the table, but if it's by using the codon table # that will likely subtly break as it will now point to the new codon table from NCBI. > > chris > From cjfields at illinois.edu Fri Nov 14 10:20:14 2014 From: cjfields at illinois.edu (Fields, Christopher J) Date: Fri, 14 Nov 2014 17:20:14 +0000 Subject: [maker-devel] BioPerl Bio::Tools::CodonTable changes In-Reply-To: <27D24CD7-09E3-4618-8A52-578104B34E65@gmail.com> References: <6615DD9E-10F6-4EFE-9900-F66317BDA0EE@illinois.edu> <27D24CD7-09E3-4618-8A52-578104B34E65@gmail.com> Message-ID: <22112917-6961-4F53-87F2-DC4EA9E2175E@illinois.edu> Okay, just wanted to make sure that a change in this wouldn't break MAKER. chris On Nov 14, 2014, at 10:46 AM, Carson Holt wrote: > Actually since I wanted to keep compatibility with old versions of BioPerl, I've been using the add_table method to just insert the table I need. Then I select it using the id method. > > But I think I like the idea of making the strictly canonical codon table be table 0, since having a strictly canonical codon table in BioPerl seems rather important. > > Thanks, > Carson > > > > >> On Nov 13, 2014, at 9:34 PM, Fields, Christopher J wrote: >> >> Carson, >> >> Just a note that we need to address a specific hack added last year in BioPerl for MAKER re: "strict" codon tables. I added a new one to the end of the list, not thinking that more would eventually be added, and that time has now come: >> >> https://github.com/bioperl/bioperl-live/issues/90 >> >> I'm not sure how MAKER is setting the table, but if it's by using the codon table # that will likely subtly break as it will now point to the new codon table from NCBI.
>> >> chris >> > From xiaenhua at gmail.com Wed Nov 19 05:47:28 2014 From: xiaenhua at gmail.com (xiaenhua at gmail.com) Date: Wed, 19 Nov 2014 20:47:28 +0800 Subject: [maker-devel] ERROR: Failed while prepare section files Message-ID: <2014111920472385185424@gmail.com> Dear Maker developer Team, When I rerun maker using the first maker derived GFF3 files together with two newly generated evidence of Proteins and ESTs, I failed. I set the parameters in the maker_opt.ctl file like this: ------------------------------------- genome=CSL.fasta maker_gff=CSL_1st_maker.gff; est_gff=osi_csl_maker.pasa_assemblies_Maker.gff3; protein_gff=CSL_wise.gff3; est2genome=1; protein2genome=1; other parameters with default. Then, I run maker via MPI. However, during the 2nd run, I failed. Below is the error message: ----------------------------------- preparing ab-inits preparing ab-inits preparing ab-inits gathering ab-init output files gathering ab-init output files gathering ab-init output files gathering ab-init output files prepare section files Gathering GFF3 input into hits - chunk:0 gathering ab-init output files gathering ab-init output files gathering ab-init output files gathering ab-init output files prepare section files Gathering GFF3 input into hits - chunk:0 prepare section files Died at /home/xiaenhua/SoftWare/maker/bin/../lib/Bio/Search/Hit/PhatHit/Base.pm line 188. --> rank=6, hostname=localhost.localdomain ERROR: Failed while prepare section files ERROR: Chunk failed at level:12, tier_type:3 FAILED CONTIG:scaffold3 ERROR: Chunk failed at level:4, tier_type:0 FAILED CONTIG:scaffold3 Gathering GFF3 input into hits - chunk:0 gathering ab-init output files prepare section files Gathering GFF3 input into hits - chunk:0 Died at /home/xiaenhua/SoftWare/maker/bin/../lib/Bio/Search/Hit/PhatHit/Base.pm line 188. 
--> rank=8, hostname=localhost.localdomain ERROR: Failed while prepare section files ERROR: Chunk failed at level:12, tier_type:3 FAILED CONTIG:scaffold5 prepare section files ERROR: Chunk failed at level:4, tier_type:0 FAILED CONTIG:scaffold5 ........................ ........................ -------------------------------------- My protein evidence gff3 file looks like this: scaffold3 genewise match 1276842 1277727 . - . ID=GeneWise.45.m scaffold3 genewise match_part 1277687 1277727 . - . ID=GeneWise.45.cds_1;Parent=GeneWise.45.m scaffold3 genewise match_part 1276842 1277545 . - . ID=GeneWise.45.cds_2;Parent=GeneWise.45.m EST evidence gff3: scaffold3 match 1275835 1276664 . + . ID=align_24718.m scaffold3 match_part 1275835 1276664 . + . ID=align_24718.cds_1;Parent=align_24718.m scaffold3 match 2510415 2511782 . + . ID=align_24719.m scaffold3 match_part 2510415 2511782 . + . ID=align_24719.cds_1;Parent=align_24719.m scaffold3 match 4113431 4114364 . + . ID=align_24720.m scaffold3 match_part 4113431 4114364 . + . ID=align_24720.cds_1;Parent=align_24720.m I don't know what happened. Any help would be greatly appreciated! Thank you! All the best, En-Hua Xia -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Wed Nov 19 08:55:17 2014 From: carsonhh at gmail.com (Carson Holt) Date: Wed, 19 Nov 2014 08:55:17 -0700 Subject: [maker-devel] ERROR: Failed while prepare section files In-Reply-To: <2014111920472385185424@gmail.com> References: <2014111920472385185424@gmail.com> Message-ID: <824C3CBE-FD06-4571-A8AB-06710840FF41@gmail.com> Could you rerun with the latest MAKER release, just to make sure that it still happens with the current release (Version 2.31.7). Run with 'maker -a'. If it still happens, then send me the GFF3 files you are using as input, and I'll take a look. Basically it's happening because you are missing a start or end position for a feature in one of the files.
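A quick way to look for such a feature before rerunning is to scan each evidence GFF3 file for records that lack a usable start or end. The following is a minimal sketch in Python, not part of MAKER; the function names are made up for illustration, and it assumes the standard nine tab-separated GFF3 columns with start and end in columns 4 and 5. Note that the EST records as quoted in the original message show only eight fields (no source column), which would shift every later column, though that may simply be an artifact of how the mail was archived.

```python
def is_position(text):
    """True if text is a plain positive integer, as a GFF3 start/end must be."""
    return text.isascii() and text.isdigit() and int(text) >= 1


def check_feature(line):
    """Return a list of problems for one GFF3 feature line (empty if it looks fine)."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 9:
        # A missing column shifts start/end, so report it and stop here.
        return [f"expected 9 tab-separated columns, found {len(fields)}"]
    problems = []
    start, end = fields[3], fields[4]
    if not is_position(start):
        problems.append(f"bad start {start!r}")
    if not is_position(end):
        problems.append(f"bad end {end!r}")
    if not problems and int(start) > int(end):
        problems.append(f"start {start} is greater than end {end}")
    return problems


def scan(lines):
    """Yield (line_number, problems) for every suspicious feature line."""
    for number, line in enumerate(lines, start=1):
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more features
        if not line.strip() or line.startswith("#"):
            continue  # blank lines, comments and directives
        problems = check_feature(line)
        if problems:
            yield number, problems
```

To use it, pass an open file handle for one of the evidence files to `scan` and print each `(line_number, problems)` pair it yields; any line it reports is a candidate for the feature that triggers the failure.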
--Carson > On Nov 19, 2014, at 5:47 AM, xiaenhua at gmail.com wrote: > > Dear Maker developer Team, > When I rerun maker using the first maker derived GFF3 files together with two newly generated evidence of Proteins and ESTs, I failed. I set the parameters in the maker_opt.ctl file like this: > ------------------------------------- > genome=CSL.fasta > maker_gff=CSL_1st_maker.gff; > est_gff=osi_csl_maker.pasa_assemblies_Maker.gff3; > protein_gff=CSL_wise.gff3; > est2genome=1; > protein2genome=1; > other parameters with default. > Then, I run maker via MPI. However, during the 2nd run, I failed. Below is the error message: > ----------------------------------- > preparing ab-inits > preparing ab-inits > preparing ab-inits > gathering ab-init output files > gathering ab-init output files > gathering ab-init output files > gathering ab-init output files > prepare section files > Gathering GFF3 input into hits - chunk:0 > gathering ab-init output files > gathering ab-init output files > gathering ab-init output files > gathering ab-init output files > prepare section files > Gathering GFF3 input into hits - chunk:0 > prepare section files > Died at /home/xiaenhua/SoftWare/maker/bin/../lib/Bio/Search/Hit/PhatHit/Base.pm line 188. > --> rank=6, hostname=localhost.localdomain > ERROR: Failed while prepare section files > ERROR: Chunk failed at level:12, tier_type:3 > FAILED CONTIG:scaffold3 > > ERROR: Chunk failed at level:4, tier_type:0 > FAILED CONTIG:scaffold3 > > Gathering GFF3 input into hits - chunk:0 > gathering ab-init output files > prepare section files > Gathering GFF3 input into hits - chunk:0 > Died at /home/xiaenhua/SoftWare/maker/bin/../lib/Bio/Search/Hit/PhatHit/Base.pm line 188.
> --> rank=8, hostname=localhost.localdomain > ERROR: Failed while prepare section files > ERROR: Chunk failed at level:12, tier_type:3 > FAILED CONTIG:scaffold5 > > prepare section files > ERROR: Chunk failed at level:4, tier_type:0 > FAILED CONTIG:scaffold5 > ........................ > ........................ > -------------------------------------- > My protein evidence gff3 file looks like this: > scaffold3 genewise match 1276842 1277727 . - . ID=GeneWise.45.m > scaffold3 genewise match_part 1277687 1277727 . - . ID=GeneWise.45.cds_1;Parent=GeneWise.45.m > scaffold3 genewise match_part 1276842 1277545 . - . ID=GeneWise.45.cds_2;Parent=GeneWise.45.m > > EST evidence gff3: > scaffold3 match 1275835 1276664 . + . ID=align_24718.m > scaffold3 match_part 1275835 1276664 . + . ID=align_24718.cds_1;Parent=align_24718.m > scaffold3 match 2510415 2511782 . + . ID=align_24719.m > scaffold3 match_part 2510415 2511782 . + . ID=align_24719.cds_1;Parent=align_24719.m > scaffold3 match 4113431 4114364 . + . ID=align_24720.m > scaffold3 match_part 4113431 4114364 . + . ID=align_24720.cds_1;Parent=align_24720.m > > I don't know what happened? Your any help will be appreciated greatly! > Thank you! > > All the best, > En-Hua Xia > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From ernesto at ebi.ac.uk Fri Nov 21 03:59:51 2014 From: ernesto at ebi.ac.uk (ernesto lowy gallego) Date: Fri, 21 Nov 2014 10:59:51 +0000 Subject: [maker-devel] Latest release of MAKER version 2.31.7 Message-ID: <546F1B27.90309@ebi.ac.uk> Hi, I am trying to find the features of the latest release of MAKER (version 2.31.7, released the 31/10/2014), Could you please let me know where can I find them? Thanks a lot! 
ernesto -- Developer VectorBase | Ensembl Genomes From carsonhh at gmail.com Fri Nov 21 08:04:07 2014 From: carsonhh at gmail.com (Carson Holt) Date: Fri, 21 Nov 2014 08:04:07 -0700 Subject: [maker-devel] Latest release of MAKER version 2.31.7 In-Reply-To: <546F1B27.90309@ebi.ac.uk> References: <546F1B27.90309@ebi.ac.uk> Message-ID: The only change is a bug fix for an issue that sometimes occurs when model_gff is mixed with correct_est_fusion=1 and always_complete=1. --Carson > On Nov 21, 2014, at 3:59 AM, ernesto lowy gallego wrote: > > Hi, > > I am trying to find the features of the latest release of MAKER (version 2.31.7, released the 31/10/2014), > > Could you please let me know where can I find them? > > Thanks a lot! > > ernesto > > -- > Developer > > VectorBase | Ensembl Genomes > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From muriel.grosb at gmail.com Fri Nov 21 09:07:39 2014 From: muriel.grosb at gmail.com (Muriel Gros-Balthazard) Date: Fri, 21 Nov 2014 17:07:39 +0100 Subject: [maker-devel] Repeat masking in Maker Message-ID: <546F634B.1000900@gmail.com> Hello, I generated my own library of repeats following the tutorial provided with Maker. I also wanted to use all the species from the RepBase library for the masking. It is not clear to me how this works in Maker. I set both of these options: model_org=all rmlib=allRepeats.lib However, when using RepeatMasker without Maker, you can't put both -lib allRepeats.lib and -species all as options. Indeed, you can only specify one species when also using the -lib option (-species arabidopsis for instance and not -species all). What about Maker? Do I have masking of allRepeats.lib and also of all species repeats if I put these two arguments in Maker? model_org=all rmlib=allRepeats.lib Another question: It is said that RepeatRunner is used as well.
I put the option: repeat_protein=te_proteins.fasta But realized that RepeatRunner was not installed on my computer! I had no problem running Maker. So, is this file of te_proteins used by RepeatMasker instead to mask them? It is not clear to me how RepeatRunner is involved in the pipeline. Thanks a lot for your answers, Muriel -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael.s.campbell1 at gmail.com Fri Nov 21 10:09:17 2014 From: michael.s.campbell1 at gmail.com (Michael Campbell) Date: Fri, 21 Nov 2014 10:09:17 -0700 Subject: [maker-devel] Repeat masking in Maker In-Reply-To: <546F634B.1000900@gmail.com> References: <546F634B.1000900@gmail.com> Message-ID: Hi Muriel, By setting model_org=all, MAKER will run RepeatMasker using all of RepBase. MAKER will also run RepeatMasker to mask with your species-specific repeat library when you set rmlib=allRepeats.lib. For more information on what options can be used in the model_org= line of the maker_opts.ctl file, see the MAKER wiki: http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/The_MAKER_control_files_explained . A few releases back RepeatRunner was added internally to MAKER, so you don't have to install it separately. If you look in the MAKER error output you can find where MAKER called RepeatRunner. Thanks, Mike On Fri, Nov 21, 2014 at 9:07 AM, Muriel Gros-Balthazard < muriel.grosb at gmail.com> wrote: > Hello, > > I generated my own library of repeats following the tutorial provided with > Maker. > I also wanted to use all the species from the RepBase library for the > masking. > > It is not clear to me how this works in Maker. > Indeed, I put both these options : > model_org=all > rmlib=allRepeats.lib > > However, when using RepeatMasker without Maker, you can't put both -lib > allRepeats.lib and -species all as options.
> Indeed, you can only say one species when also using the -lib option (-species > arabidopsis for instance and not -species all) > > What about Maker ? > > Do I have masking of allRepeats.lib and also of all species repeats if I > put these two arguments in Maker ? > model_org=all > rmlib=allRepeats.lib > > Another question: > It is said that RepeatRunner is used as well. I put the option: > repeat_protein=te_proteins.fasta > But realized that RepeatRunner was not installed on my computer !!! > I had no problem to run Maker. > So, this file of te_proteins is used rather by RepeatMasker to mask them ? > It is not clear to me how RepeatRunner is involved in the pipeline ? > > Thanks a lot for your answers, > > Muriel > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > -- Michael Campbell MS, RD. Doctoral Candidate Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:585-3543 -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Fri Nov 21 10:21:28 2014 From: carsonhh at gmail.com (Carson Holt) Date: Fri, 21 Nov 2014 10:21:28 -0700 Subject: [maker-devel] Repeat masking in Maker In-Reply-To: References: <546F634B.1000900@gmail.com> Message-ID: Yes. If you set them both, then RepeatMasker runs twice (once with each setting), and the results are then combined. --Carson > On Nov 21, 2014, at 10:09 AM, Michael Campbell wrote: > > Hi Muriel, > > By setting model_org=all, MAKER will run RepeatMasker using all of RepBase. MAKER will also run RepeatMasker to mask with your species-specific repeat library when you set rmlib=allRepeats.lib.
> > For more information on what options can be used in the model_org= line of the maker_opts.ctl file you can find it here on the MAKER wiki > > http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/The_MAKER_control_files_explained . > > A few releases back Repeat runner was added internally to MAKER, so you don't have to install it seperatly. If you look in the MAKER output error you can find where MAKER called repeat runner. > > Thanks, > Mike > > On Fri, Nov 21, 2014 at 9:07 AM, Muriel Gros-Balthazard > wrote: > Hello, > > I generated my own library of repeats following the tutorial provided with Maker. > I also wanted to use all the species from the RepBase library for the masking. > > It is not clear to me how this works in Maker. > Indeed, I put both these options : > model_org=all > rmlib=allRepeats.lib > > However, when using RepeatMasker without Maker, you can't put both -lib allRepeats.lib and -species all as options. > Indeed, you can only say one species when also using the -lib option (-species arabidopsis for instance and not -species all) > > What about Maker ? > > Do I have masking of allRepeats.lib and also of all species repeats if I put these two arguments in Maker ? > model_org=all > rmlib=allRepeats.lib > > Another question: > It is said that RepeatRunner is used as well. I put the option: repeat_protein=te_proteins.fasta > But realized that RepeatRunner was not installed on my computer !!! > I had no problem to run Maker. > So, this file of te_proteins is used rather by RepeatMasker to mask them ? > It is not clear to me how RepeatRunner is involved in the pipeline ? > > Thanks a lot for your answers, > > Muriel > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > > > > -- > Michael Campbell MS, RD. 
> Doctoral Candidate > Eccles Institute of Human Genetics > University of Utah > 15 North 2030 East, Room 2100 > Salt Lake City, UT 84112-5330 > ph:585-3543 > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From muriel.grosb at gmail.com Thu Nov 27 02:22:26 2014 From: muriel.grosb at gmail.com (Muriel Gros-Balthazard) Date: Thu, 27 Nov 2014 10:22:26 +0100 Subject: [maker-devel] gff output Message-ID: <5476ED52.3060902@gmail.com> Hello, I have been using Maker to generate an annotation. I especially set these options: - est_gff with a list of transcripts.gff3 (Cufflinks output) - model_org=all - rmlib=allrepeats.lib - repeat_protein=te_prot.fasta - pred_gff= Augustus.gff3 (that I generated previously) I obtain a gff file for each of my contigs. However, here are the three possibilities in the second column : # est_gff:cufflinks # repeatmasker # repeatrunner I have no information about exons and introns. And I am wondering if the Augustus.gff3 was used... On top of that, I forgot to set up pred_stats to 1. If I understand well, I can just change this in the ocntrol file, and run Maker again. Since there is the output with everything, it won't run again the prediction, only this option. Is that right ? Thank you, Muriel
From monica.poelchau at ars.usda.gov Fri Nov 7 06:17:04 2014 From: monica.poelchau at ars.usda.gov (Poelchau, Monica) Date: Fri, 7 Nov 2014 13:17:04 +0000 Subject: [maker-devel] calculating AED values between two datasets Message-ID: Hi everyone, I would like to generate a list of Maker AED values comparing two datasets: a set of computationally predicted genes, and manually curated genes from the Web Apollo program. The idea is to quantify the amount of nucleotide-level change that occurred during the manual curation process. I have tried to run Maker in several ways to generate the AED values. Both gene sets are in (as far as I can tell) valid gff3 format. First, I included the manually curated (Web Apollo) gff3 in the 'model_gff' field of maker_opts.exe, and the gff3 of the computational predictions in the 'est_gff' field, with all of the other prediction and evidence alignment settings turned off. All resulting AEDs from this analysis were 1, even though many of the annotations had 100% overlap. Next, instead of using the computational predictions in gff3 format, I used the fasta file of the cDNA sequence from the computational predictions in the 'est' field.
Here, the results made more sense, but a small but significant percentage of the AED values were 1 when they should actually have been less than 1.

I have tried the two analyses above using both the gff3 output straight from Web Apollo, and the same gff3 after running it through maker once as the only entry in the model_gff field, as explained in the MAKER2 paper (http://www.biomedcentral.com/1471-2105/12/491). This does not appear to make a difference.

Do you have any ideas where I might start to debug this?

Thanks for your help!

Monica

This electronic message contains information generated by the USDA solely for the intended recipients. Any unauthorized interception of this message or the use or disclosure of the information it contains may violate the law and subject the violator to civil or criminal penalties. If you believe you have received this message in error, please notify the sender and delete the email immediately.

From goutham.atla at gmail.com Thu Nov 6 22:39:51 2014
From: goutham.atla at gmail.com (Goutham atla)
Date: Fri, 7 Nov 2014 11:09:51 +0530
Subject: [maker-devel] URGENT: Re: maker failure with example data
In-Reply-To: <3CCDC24F-756A-492C-8E7F-B1B97616EE45@genetics.utah.edu>
References: <3CCDC24F-756A-492C-8E7F-B1B97616EE45@genetics.utah.edu>
Message-ID:

Dear Carson,

Thanks for the quick reply. It worked after providing the assembled transcripts and protein fasta from closely related species.

Regards,
Goutham

--
Goutham Atla

From carsonhh at gmail.com Fri Nov 7 08:26:31 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 7 Nov 2014 08:26:31 -0700
Subject: [maker-devel] calculating AED values between two datasets
In-Reply-To:
References:
Message-ID: <051D0D34-9E49-401F-B22D-16970EB93B66@gmail.com>

If you got every value as 1 with est_gff, then your GFF3 didn't load. The est_gff option expects alignments in match/match_part format, and yours may not have been structured correctly. If you use fasta files instead, you may also need to set single_exon=1 and single_length=1; otherwise many of those alignments will be ignored for AED scoring purposes.
You should also look at the output in a viewer like Apollo to visualize the comparison and see whether you get 1 because the aligner can't recover the original transcript alignment.

--Carson

From monica.poelchau at ars.usda.gov Fri Nov 7 12:00:26 2014
From: monica.poelchau at ars.usda.gov (Poelchau, Monica)
Date: Fri, 7 Nov 2014 19:00:26 +0000
Subject: [maker-devel] calculating AED values between two datasets
In-Reply-To: <051D0D34-9E49-401F-B22D-16970EB93B66@gmail.com>
References: <051D0D34-9E49-401F-B22D-16970EB93B66@gmail.com>
Message-ID:

Thank you for the prompt reply, Carson! Yes, my gff3 was modeled as gene models, not match/match_part, so reformatting it may do the trick.

Monica
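The reformatting mentioned in this exchange (gene models rewritten as match/match_part rows) can be sketched roughly as below. This is a minimal illustration only: the function name and the example rows are hypothetical, it assumes one mRNA per exon parent, and it does not check whether MAKER needs further attributes (for instance Name or Target) on these rows, so the MAKER documentation should be consulted before relying on it.

```python
def gene_models_to_matches(gff3_lines):
    """Rewrite mRNA/exon rows as match/match_part rows; drop everything else.

    Hypothetical helper for illustration, not part of MAKER.
    """
    converted = []
    for line in gff3_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a nine-column GFF3 feature row
        if cols[2] == "mRNA":
            cols[2] = "match"
            # The parent gene row is not written out, so drop the dangling Parent= tag.
            attrs = [a for a in cols[8].split(";") if a and not a.startswith("Parent=")]
            cols[8] = ";".join(attrs)
        elif cols[2] == "exon":
            cols[2] = "match_part"  # Parent=<mRNA ID> now points at the match row
        else:
            continue  # gene, CDS and UTR rows have no counterpart in this sketch
        converted.append("\t".join(cols))
    return converted


# Made-up rows, used only to show the shape of the output.
example = [
    "##gff-version 3",
    "ctg1\tpred\tgene\t100\t600\t.\t+\t.\tID=g1",
    "ctg1\tpred\tmRNA\t100\t600\t.\t+\t.\tID=t1;Parent=g1",
    "ctg1\tpred\texon\t100\t200\t.\t+\t.\tID=e1;Parent=t1",
    "ctg1\tpred\tCDS\t120\t200\t.\t+\t0\tID=c1;Parent=t1",
    "ctg1\tpred\texon\t300\t600\t.\t+\t.\tID=e2;Parent=t1",
]

for row in gene_models_to_matches(example):
    print(row)
```

Exons shared between several transcripts (Parent=t1,t2) would need to be duplicated per parent, which this sketch does not attempt.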
From Timothy.Stitt at tgac.ac.uk Sat Nov 8 06:58:53 2014
From: Timothy.Stitt at tgac.ac.uk (Timothy Stitt (TGAC))
Date: Sat, 8 Nov 2014 13:58:53 +0000
Subject: [maker-devel] DBD::SQLite::db do failed errors
Message-ID:

Dear Maker Support,

I'm running Maker v2.31.7 and I'm receiving lots of the following warnings/errors during the run. The errors don't seem to prevent the calculation from completing, so I was just wondering how I can avoid getting them.

STATUS: Setting up database for any GFF3 input...
DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 1.
DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 2.
DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 3.
...

Thanks in advance,

Tim.
---
Timothy Stitt PhD / Head of Scientific Computing
The Genome Analysis Centre (TGAC)
http://www.tgac.ac.uk/

p: +44 1603 450378
e: timothy.stitt at tgac.ac.uk

From jimhu at email.tamu.edu Fri Nov 7 11:34:11 2014
From: jimhu at email.tamu.edu (Jim Hu)
Date: Fri, 7 Nov 2014 12:34:11 -0600
Subject: [maker-devel] Speaking of AED...
Message-ID:

I was teaching Yandell and Ence (2012) in the genomics class I co-teach, and was having trouble understanding how the values for AED in Box 4 Figure Bb derive from the evidence set in Figure Ba.
Box 4 says: "AED is calculated in the same manner as SN and SP, but in place of a reference gene model, the coordinates of the union of the aligned evidence (see panel Ba) are used instead". In the union, I expect that a bp that is in an exon in any of the evidence would be considered a TP. If so, then why isn't the nt-level AED for Annotation 1 in Bb zero?

I'm probably missing something trivial.

Thanks

Jim
=====================================
Jim Hu
Professor
Dept. of Biochemistry and Biophysics
2128 TAMU
Texas A&M Univ.
College Station, TX 77843-2128
979-862-4054

From sarasank at umail.iu.edu Sat Nov 8 11:58:30 2014
From: sarasank at umail.iu.edu (Saranya Sankaranarayanan)
Date: Sat, 8 Nov 2014 13:58:30 -0500
Subject: [maker-devel] Fwd: AED plot
In-Reply-To:
References:
Message-ID:

Hi Maker authors,

I am new to using Maker. I have a few basic questions.

I have completed the maker annotation and I ran

gff3_merge -n -d genome_master_datastore_index.log

to create the gff file. After that, I used the script AED_cdf_generator.pl to obtain the AED plot, but I get the error:

Use of uninitialized value $total in division (/) at ./AED_cdf_generator.pl line 43.
Illegal division by zero at ./AED_cdf_generator.pl line 43.

I passed my gff file as:

AED_cdf_generator.pl -b 0.025 maker.gff

Could anyone please help me with this error? Thank you. It looks like no value is passed to the variable total, but I am not able to decipher why.

Regards,
Saranya

From carsonhh at gmail.com Sat Nov 8 16:52:26 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Sat, 8 Nov 2014 16:52:26 -0700
Subject: [maker-devel] Speaking of AED...
In-Reply-To:
References:
Message-ID: <443253CC-838D-42A7-8FEB-8BAF442FAE9A@gmail.com>

I think I would agree. Annotation 1 is a perfect match to the evidence.
It is ab initio 1 that would have had an AED of 0.2, but annotation 1 should have had an AED of 0.

--Carson

From dence at genetics.utah.edu Sat Nov 8 16:53:19 2014
From: dence at genetics.utah.edu (Daniel Ence)
Date: Sat, 8 Nov 2014 23:53:19 +0000
Subject: [maker-devel] Speaking of AED...
In-Reply-To:
References:
Message-ID: <5FC7C806-E03F-4DC3-8932-65F6C0E1A7EF@genetics.utah.edu>

Hi Professor Hu,

I'm excited that you're teaching from this review. I hope that you find it useful for your class!

Annotation 1 has an AED of 0.2 and not 0 because the middle exon doesn't line up exactly with the evidence alignments. Since there are bps in the annotation that aren't supported by evidence, it has an AED > 0. It's a little hard to see in the figure, but if you use a straight-edge, you can see it.

Feel free to let me know whether that helps clear things up.
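The nucleotide-level calculation discussed above can be sketched as follows. This is a minimal illustration, not MAKER's own code, assuming the usual definition AED = 1 - (SN + SP)/2, where SN is the fraction of evidence bases covered by the annotation and SP is the fraction of annotation bases covered by the evidence. The coordinates are made up and are not read off Box 4, and returning 1 when either set is empty is a choice made for this sketch.

```python
def covered(intervals):
    """Return the set of base positions covered by 1-based inclusive intervals."""
    bases = set()
    for start, end in intervals:
        bases.update(range(start, end + 1))
    return bases


def aed(annotation_exons, evidence_alignments):
    """Nucleotide-level AED of an annotation against the union of the evidence."""
    annot = covered(annotation_exons)
    evid = covered(evidence_alignments)  # the union of all aligned evidence
    if not annot or not evid:
        return 1.0  # assumption of this sketch: nothing to compare means no support
    shared = len(annot & evid)           # bases in both (the "TP" bases)
    sn = shared / len(evid)              # sensitivity
    sp = shared / len(annot)             # specificity
    return 1.0 - (sn + sp) / 2.0


# Hypothetical coordinates: three evidence-supported exons.
evidence = [(100, 200), (300, 400), (500, 600)]
perfect = [(100, 200), (300, 400), (500, 600)]   # identical to the evidence
shifted = [(100, 200), (300, 420), (500, 600)]   # middle exon overruns by 20 bp

print(aed(perfect, evidence))             # 0.0
print(round(aed(shifted, evidence), 3))   # 0.031
```

With these made-up coordinates, a middle exon that overruns the evidence by only 20 bp already gives a non-zero AED, which is the kind of small mismatch described for the figure.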
Thanks,
Daniel

From bmoore at genetics.utah.edu Sat Nov 8 16:38:23 2014
From: bmoore at genetics.utah.edu (Barry Moore)
Date: Sat, 8 Nov 2014 23:38:23 +0000
Subject: [maker-devel] Speaking of AED...
In-Reply-To:
References:
Message-ID:

The 5'-most junction on the 3' terminal exon (assuming + strand) is discordant in both Annotation 1 & 2 from the evidence in Ba.

B

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PastedGraphic-1.png
Type: image/png
Size: 344800 bytes
Desc: PastedGraphic-1.png
URL:

From carsonhh at gmail.com Sat Nov 8 17:13:44 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Sat, 8 Nov 2014 17:13:44 -0700
Subject: [maker-devel] DBD::SQLite::db do failed errors
In-Reply-To:
References:
Message-ID: <3100A718-B063-4BC5-A036-943DEBCC6484@gmail.com>

It's caused by one of the characters in your GFF3 file. For example, characters NOT in the set [a-zA-Z0-9.:^*$@!+_?-|] must be escaped in GFF3, with exceptions outlined in the format spec. You may have either a ' or a " that must be escaped.

--Carson

From bmoore at genetics.utah.edu Sat Nov 8 17:07:43 2014
From: bmoore at genetics.utah.edu (Barry Moore)
Date: Sun, 9 Nov 2014 00:07:43 +0000
Subject: [maker-devel] Speaking of AED...
In-Reply-To:
References:
Message-ID: <6AB384C1-4F5C-4132-9B3A-23F0DE3A9351@genetics.utah.edu>

Hmm, I missed the one Daniel pointed out, and then upon inspection noticed a third discordant exon. This figure needs to go into a book of optical illusions!

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PastedGraphic-2.png
Type: image/png
Size: 362479 bytes
Desc: PastedGraphic-2.png
URL:

From michael.s.campbell1 at gmail.com Sat Nov 8 23:01:27 2014
From: michael.s.campbell1 at gmail.com (Michael Campbell)
Date: Sat, 8 Nov 2014 23:01:27 -0700
Subject: [maker-devel] Fwd: AED plot
In-Reply-To:
References:
Message-ID:

Hi Saranya,

If you can send me a copy of your gff3 file, I can look at it and see why you are getting the error. That is a pretty young accessory script, so there may be something in your file that it hasn't seen before.

Thanks,
Mike

--
Michael Campbell MS, RD.
Doctoral Candidate
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
ph:585-3543

From muriel.grosb at gmail.com Mon Nov 10 03:35:30 2014
From: muriel.grosb at gmail.com (Muriel Gros-Balthazard)
Date: Mon, 10 Nov 2014 11:35:30 +0100
Subject: [maker-devel] running Maker but skipping first steps
Message-ID: <546094F2.6000100@gmail.com>

Hello,

I want to run Maker but I would like to skip the first steps:

STATUS: Parsing control files...
STATUS: Processing and indexing input FASTA files...
STATUS: Setting up database for any GFF3 input...
A data structure will be created for you at:
/Data/Genomics/GeneAnnotation/Maker_pipeline/5_Run_Maker/Pdac_ref2013s.maker.output/Pdac_ref2013s_datastore

To access files for individual sequences use the datastore index:
/Data/Genomics/GeneAnnotation/Maker_pipeline/5_Run_Maker/Pdac_ref2013s.maker.output/Pdac_ref2013s_master_datastore_index.log

There was an error in repeat masking (and I reinstalled RepeatMasker), but I believe that the previous steps are always the same. Is there a way to run Maker so that it doesn't run these first steps again, given that the control files didn't change, the fasta files are already indexed and the gff3 database is already set up?

Thank you!

Muriel

From FeatherstonJ at arc.agric.za Mon Nov 10 06:42:15 2014
From: FeatherstonJ at arc.agric.za (Jonathan Featherston)
Date: Mon, 10 Nov 2014 13:42:15 +0000
Subject: [maker-devel] Maker
Message-ID: <57CFF349-6C9F-4172-ADB3-A9572E21D4A3@arc.agric.za>

Dear Carson

I've been trying to train SNAP with Maker but I'm getting empty genome.ann and .dna files.
I have tried running the maker2zff on the implant page to see if my script was corrupt. No help from that. I've seen a few pages in the group and on seqanswers about the empty zff files, but most seem to have been resolved by including all outputs (maker2zff -n), and even this doesn't generate anything for me. So I'm guessing the problem is somewhere with the maker outputs.

I did get errors from the maker run, but they seem to be about MPI and ALRM (a perl error; what a pain getting perl libs on a mac).

Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184.
Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184.
Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184.
Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184.
Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184.
--------------------------------------------------------------------------
mpiexec has exited due to process rank 4 with PID 5935 on
node Administrators-MacBook-Pro-9 exiting improperly. There are three reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

3. this process called "MPI_Abort" or "orte_abort" and the mca parameter
orte_create_session_dirs is set to false. In this case, the run-time cannot
detect that the abort call was an abnormal termination. Hence, the only
error message you will receive is this one.

This may have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).

You can avoid this message by specifying -quiet on the mpiexec command line.

Maker did finish and the gff file produced (I can't produce a fasta file from the est2genome=1 option?) seems ok. It has produced protein-matches and match_part, although I don't see maker product.

Otherwise I ran maker with MPI using the command from the CPBI maker paper. I used -nohup mpiexec -n 8 maker < /dev/null & for my maker execution.

I'm using altest and protein homology for now.

Thank you very much for what help you can provide. I really enjoyed the workshop you and Mark presented!

Kind Regards
Jonathan

From Timothy.Stitt at tgac.ac.uk Mon Nov 10 06:59:59 2014
From: Timothy.Stitt at tgac.ac.uk (Timothy Stitt (TGAC))
Date: Mon, 10 Nov 2014 13:59:59 +0000
Subject: [maker-devel] DBD::SQLite::db do failed errors
In-Reply-To: <3100A718-B063-4BC5-A036-943DEBCC6484@gmail.com>
References: <3100A718-B063-4BC5-A036-943DEBCC6484@gmail.com>
Message-ID:

Thanks Carson.

I checked the *.gff files for ' and " symbols. I only observed a bunch of " in one of the files, as follows:

scaffold16677 exonerate:protein2genome:local gene 128238 128710 339 - . gene_id 0 ; sequence Lus10000040|PACid:23139618 ; gene_orientation +
scaffold16677 exonerate:protein2genome:local cds 128645 128710 . - .
scaffold16677 exonerate:protein2genome:local exon 128645 128710 . - . insertions 0 ; deletions 0
scaffold16677 exonerate:protein2genome:local splice5 128643 128644 . - . intron_id 1 ; splice_site "GT"
scaffold16677 exonerate:protein2genome:local intron 128552 128644 . - . intron_id 1
scaffold16677 exonerate:protein2genome:local splice3 128552 128553 . - . intron_id 0 ; splice_site "AG"
scaffold16677 exonerate:protein2genome:local cds 128442 128551 . - .
scaffold16677 exonerate:protein2genome:local exon 128442 128551 . - . insertions 0 ; deletions 0
scaffold16677 exonerate:protein2genome:local splice5 128440 128441 . - . intron_id 2 ; splice_site "GT"
scaffold16677 exonerate:protein2genome:local intron 128362 128441 . - . intron_id 2
scaffold16677 exonerate:protein2genome:local splice3 128362 128363 . - . intron_id 1 ; splice_site "AG"

Would these "GT", "AG" etc. strings cause the problem? If so, how should I change them to be correct?

Thanks,

Tim.
---
Timothy Stitt PhD / Head of Scientific Computing
The Genome Analysis Centre (TGAC)
http://www.tgac.ac.uk/

p: +44 1603 450378
e: timothy.stitt at tgac.ac.uk

From: Carson Holt
Date: Sunday, 9 November 2014 00:13
To: Timothy Stitt
Cc: "maker-devel at yandell-lab.org"
Subject: Re: [maker-devel] DBD::SQLite::db do failed errors

It's caused by one of the characters in your GFF3 file. For example, characters NOT in the set [a-zA-Z0-9.:^*$@!+_?-|] must be escaped in GFF3, with exceptions outlined in the format spec. You may have either a ' or a " that must be escaped.

--Carson

On Nov 8, 2014, at 6:58 AM, Timothy Stitt (TGAC) wrote:

Dear Maker Support,

I'm running Maker v2.31.7 and I'm receiving lots of the following warnings/errors during the run. The errors don't seem to prevent the calculation from completing, so I was just wondering how I can avoid getting them?

STATUS: Setting up database for any GFF3 input...
DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 1.
DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 2.
DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 3.
...

Thanks in advance,

Tim.
---
Timothy Stitt PhD / Head of Scientific Computing
The Genome Analysis Centre (TGAC)
http://www.tgac.ac.uk/

p: +44 1603 450378
e: timothy.stitt at tgac.ac.uk
_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From myandell at genetics.utah.edu Sat Nov 8 17:32:12 2014
From: myandell at genetics.utah.edu (Mark Yandell)
Date: Sun, 9 Nov 2014 00:32:12 +0000
Subject: [maker-devel] Speaking of AED...
In-Reply-To: <6AB384C1-4F5C-4132-9B3A-23F0DE3A9351@genetics.utah.edu>
References: , <6AB384C1-4F5C-4132-9B3A-23F0DE3A9351@genetics.utah.edu>
Message-ID: <7A60AB257EFF2B48B1F4C814817EA053E3664681@mxb1.hg.genetics.utah.edu>

And you are still missing one: the 3-prime end of the middle exon is also discordant. I agree, though, that somehow the color makes it hard to see. Sorry.

--mark

Mark Yandell
Professor of Human Genetics
H.A. & Edna Benning Presidential Endowed Chair
Co-director USTAR Center for Genetic Discovery
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
ph:801-587-7707
________________________________________
From: maker-devel [maker-devel-bounces at yandell-lab.org] on behalf of Barry Moore [bmoore at genetics.utah.edu]
Sent: Saturday, November 08, 2014 5:07 PM
To: Jim Hu; maker-devel at yandell-lab.org
Cc: Barry Moore
Subject: Re: [maker-devel] Speaking of AED...

Hmm, I missed the one Daniel pointed out, and then upon inspection noticed a third discordant exon. This figure needs to go into a book of optical illusions!

On Nov 8, 2014, at 4:38 PM, Barry Moore wrote:

The 5'-most junction on the 3' terminal exon (assuming + strand) is discordant in both Annotation 1 & 2 from the evidence in Ba.

B

On Nov 7, 2014, at 11:34 AM, Jim Hu wrote:

I was teaching Yandell and Ence (2012) in the genomics class I co-teach, and was having trouble understanding how the values for AED in Box 4 Figure Bb derive from the evidence set in Figure Ba.

Box 4 says: "AED is calculated in the same manner as SN and SP, but in place of a reference gene model, the coordinates of the union of the aligned evidence (see panel Ba) are used instead".

In the union, I expect that a bp that is in an exon in any of the evidence would be considered a TP. If so, then why isn't nt-level AED for Annotation 1 in Bb zero?

I'm probably missing something trivial.

Thanks

Jim
=====================================
Jim Hu
Professor
Dept. of Biochemistry and Biophysics
2128 TAMU
Texas A&M Univ.
College Station, TX 77843-2128
979-862-4054

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PastedGraphic-2.png
Type: image/png
Size: 362479 bytes
Desc: PastedGraphic-2.png
URL:

From sarasank at umail.iu.edu Sun Nov 9 09:33:08 2014
From: sarasank at umail.iu.edu (Saranya Sankaranarayanan)
Date: Sun, 9 Nov 2014 11:33:08 -0500
Subject: [maker-devel] Fwd: AED plot
In-Reply-To:
References:
Message-ID:

Hi Mike,

Please find the gff3 file attached with this email. Thanks a lot for the very prompt response.

Sincerely,
Saranya Sankaranarayanan
Master's Student, SoIC
Indiana University

On Sun, Nov 9, 2014 at 1:01 AM, Michael Campbell <michael.s.campbell1 at gmail.com> wrote:

> Hi Saranya,
>
> If you can send me a copy of your gff3 file I can look at it and see why
> you are getting the error. That is a pretty young accessory script, so there
> may be something in your file that it hasn't seen before.
>
> Thanks,
> Mike
>
> On Sat, Nov 8, 2014 at 11:58 AM, Saranya Sankaranarayanan <sarasank at umail.iu.edu> wrote:
>
>> Hi Maker authors,
>>
>> I am new to using Maker. I have a few basic questions.
>>
>> I have the maker annotation complete and I ran
>>
>> gff3_merge -n -d genome_master_datastore_index.log
>>
>> to create the gff file.
>>
>> After that, I used the script AED_cdf_generator.pl to obtain the AED
>> plot, but I get the error:
>>
>> Use of uninitialized value $total in division (/) at ./AED_cdf_generator.pl line 43.
>> Illegal division by zero at ./AED_cdf_generator.pl line 43.
>>
>> I passed my gff file as:
>> AED_cdf_generator.pl -b 0.025 maker.gff
>>
>> Could anyone please help me with this error? Thank you. It looks like no
>> value is passed to the variable total, but I am not able to decipher why.
>>
>> Regards,
>> Saranya
>
> --
> Michael Campbell MS, RD.
> Doctoral Candidate
> Eccles Institute of Human Genetics
> University of Utah
> 15 North 2030 East, Room 2100
> Salt Lake City, UT 84112-5330
> ph:585-3543

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Gff3.zip
Type: application/zip
Size: 2098998 bytes
Desc: not available
URL:

From carsonhh at gmail.com Mon Nov 10 08:16:28 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 10 Nov 2014 08:16:28 -0700
Subject: [maker-devel] DBD::SQLite::db do failed errors
In-Reply-To:
References: <3100A718-B063-4BC5-A036-943DEBCC6484@gmail.com>
Message-ID:

Actually, that is not a GFF3 file. It appears to be GTF, which is structured differently from GFF3. You would need to convert it to GFF3. You can try the Sequence Ontology converter here -> http://www.sequenceontology.org/cgi-bin/converter.cgi

Unfortunately, it is unlikely to be a painless process: GTF files vary so much between sources that one GTF file might not actually be compatible with another, so you may have to spend some time editing the file for the converter to work.

--Carson

From carsonhh at gmail.com Mon Nov 10 08:23:12 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 10 Nov 2014 08:23:12 -0700
Subject: [maker-devel] running Maker but skipping first steps
In-Reply-To: <546094F2.6000100@gmail.com>
References: <546094F2.6000100@gmail.com>
Message-ID:

These are just status messages. The steps don't actually rerun, except for the control file parsing. That has to happen every time so that MAKER knows whether the control files are still the same between runs. For both of these messages ->

STATUS: Processing and indexing input FASTA files...
STATUS: Setting up database for any GFF3 input...

MAKER sees that the indexes already exist, validates their integrity, and then moves on. So there is no rerunning of steps.

--Carson

> On Nov 10, 2014, at 3:35 AM, Muriel Gros-Balthazard wrote:
>
> Hello,
>
> I want to run Maker but I would like to skip the first steps:
> STATUS: Parsing control files...
> STATUS: Processing and indexing input FASTA files...
> STATUS: Setting up database for any GFF3 input...
> A data structure will be created for you at:
> /Data/Genomics/GeneAnnotation/Maker_pipeline/5_Run_Maker/Pdac_ref2013s.maker.output/Pdac_ref2013s_datastore
>
> To access files for individual sequences use the datastore index:
> /Data/Genomics/GeneAnnotation/Maker_pipeline/5_Run_Maker/Pdac_ref2013s.maker.output/Pdac_ref2013s_master_datastore_index.log
>
> Indeed, there was an error in RepeatMasking (and I reinstalled RepeatMasker) but I believe that the previous steps are always the same.
> Is there a way to run Maker so that it doesn't run these first steps again, given that the control files didn't change, the fasta files are already indexed, and the gff3 database is set up?
>
> Thank you!
>
> Muriel

From mike.thon at gmail.com Mon Nov 10 08:54:00 2014
From: mike.thon at gmail.com (Michael Thon)
Date: Mon, 10 Nov 2014 16:54:00 +0100
Subject: [maker-devel] map2assembly
Message-ID:

Hi -
We're using map2assembly to map genes from other gene annotation pipelines onto the genome sequence in order to compare AED values to a maker de novo annotation. We found a few transcripts that map2assembly maps to multiple loci in the genome. Is there any way to know if these are all equally good alignments, or are they all just above some preset threshold? I'm trying to decide what to do with the multiple mappings: whether we should discard all but one (in that case we'd need to decide which one) or whether we should keep them all. Keeping them all makes the most sense, but the problem is that they all have the same ID. Should map2assembly append a number to the ID when a transcript maps to multiple locations in the genome?

From carsonhh at gmail.com Mon Nov 10 09:08:26 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 10 Nov 2014 09:08:26 -0700
Subject: [maker-devel] map2assembly
In-Reply-To:
References:
Message-ID: <2D1BD5DD-7405-448F-B68D-80C8FEEDC6B3@gmail.com>

Try using the transcript score (column 6). It should indicate the % recovery. A score of 100 means a perfect match to the input transcript. The value is %identity multiplied by %coverage, so it will decrease because of a lack of identity or a lack of end-to-end alignment.
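
A minimal sketch of how the column 6 score could be used to rank multiple mappings of the same transcript. It assumes the mapped output is tab-delimited GFF3, that each mapped transcript appears as an mRNA line whose ID attribute is the copied transcript ID, and that column 6 holds the %identity x %coverage score described above. The function names and the example rows are hypothetical and not taken from MAKER.

```python
from collections import defaultdict


def parse_attributes(field):
    """Split a GFF3 column-9 string such as 'ID=t1;Name=x' into a dict."""
    attrs = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def best_mappings(gff3_lines, feature_type="mRNA"):
    """Group scored features by ID; each list is sorted best score first."""
    hits = defaultdict(list)
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != feature_type or cols[5] == ".":
            continue  # not a scored feature of the requested type
        tid = parse_attributes(cols[8]).get("ID")
        if tid is None:
            continue
        hits[tid].append((cols[0], int(cols[3]), int(cols[4]), float(cols[5])))
    for tid in hits:
        hits[tid].sort(key=lambda hit: hit[3], reverse=True)
    return dict(hits)


# Invented example rows: tx1 maps to two loci, tx2 to one.
example = [
    "chr1\tmaker\tmRNA\t100\t900\t100\t+\t.\tID=tx1;Name=tx1\n",
    "chr2\tmaker\tmRNA\t500\t1300\t87.5\t-\t.\tID=tx1;Name=tx1\n",
    "chr3\tmaker\tmRNA\t10\t400\t95\t+\t.\tID=tx2;Name=tx2\n",
]
ranked = best_mappings(example)
for tid, mappings in ranked.items():
    if len(mappings) > 1:
        print(tid, "maps to", len(mappings), "loci; best score", mappings[0][3])
```

Transcripts whose top two scores are equal would still need a manual decision; the sketch only orders the candidates.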

--Carson

From carsonhh at gmail.com Mon Nov 10 09:12:53 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 10 Nov 2014 09:12:53 -0700
Subject: [maker-devel] map2assembly
In-Reply-To: <2D1BD5DD-7405-448F-B68D-80C8FEEDC6B3@gmail.com>
References: <2D1BD5DD-7405-448F-B68D-80C8FEEDC6B3@gmail.com>
Message-ID: <681F266A-0549-4165-9261-FDD9F268D674@gmail.com>

You can also use the -l option when running gff3_merge to correct for non-unique IDs when merging multiple GFF3 files (i.e., IDs will be unique within a file but may not be unique across files, because when mapping transcripts the IDs are copied directly from the aligned transcript).

--Carson

From carsonhh at gmail.com Mon Nov 10 09:34:33 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 10 Nov 2014 09:34:33 -0700
Subject: [maker-devel] Maker
In-Reply-To: <57CFF349-6C9F-4172-ADB3-A9572E21D4A3@arc.agric.za>
References: <57CFF349-6C9F-4172-ADB3-A9572E21D4A3@arc.agric.za>
Message-ID: <1354797B-F783-4671-BB94-42A0E1611B03@gmail.com>

You probably have an error further upstream. The 'Argument "ALRM" isn't numeric' error is just something you get as things are dying in a non-elegant way, but the cause will be further up the error log.

The lack of fasta files means that you have no final gene models. Either your contigs are too short to produce a model, or your evidence alignments are insufficient in end-to-end coverage, splice site recovery on polishing, or %identity, so MAKER cannot derive a usable model from alignment alone. What is your longest contig? Also try running CEGMA from the Korf lab to help identify whether the assembly is incomplete and by how much.
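
One way to answer the "longest contig" question is to tally sequence lengths straight from the assembly FASTA. This is a minimal sketch, assuming a plain (uncompressed) FASTA with one header line per contig; the function name and the two demo contigs are hypothetical.

```python
def contig_lengths(fasta_lines):
    """Return {contig_name: sequence_length} for an iterable of FASTA lines."""
    lengths = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


# Invented demo input; in practice, pass an open file handle instead.
example = """>contig_a demo
ACGTACGTAC
GGTT
>contig_b
ACG
""".splitlines()

lengths = contig_lengths(example)
longest = max(lengths, key=lengths.get)
print(longest, lengths[longest])
```

For a real assembly the same function can be called as `contig_lengths(open(path))`, where `path` is the genome FASTA given to MAKER.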

--Carson

From michael.s.campbell1 at gmail.com Mon Nov 10 13:54:14 2014
From: michael.s.campbell1 at gmail.com (Michael Campbell)
Date: Mon, 10 Nov 2014 13:54:14 -0700
Subject: [maker-devel] Fwd: AED plot
In-Reply-To:
References:
Message-ID:

Hi Saranya,

I fixed the AED_cdf_generator.pl script and added it to the svn repository for MAKER, so it will be available in the next MAKER release. If you are using the svn repository, you can do an svn update and get the new version of the script in the MAKER bin.
If not, I've attached a copy of the script to this email (I removed the .pl extension from the file since some email servers will block .pl files). Let me know if you have any more problems with it.

Thanks,
Mike

--
Michael Campbell MS, RD.
Doctoral Candidate
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
ph:585-3543
-------------- next part --------------
A non-text attachment was scrubbed...
Name: AED_cdf_generator
Type: application/octet-stream
Size: 2981 bytes
Desc: not available
URL:

From sarasank at umail.iu.edu Mon Nov 10 14:06:58 2014
From: sarasank at umail.iu.edu (Saranya Sankaranarayanan)
Date: Mon, 10 Nov 2014 16:06:58 -0500
Subject: [maker-devel] Fwd: AED plot
In-Reply-To:
References:
Message-ID:

Great! It works now. Thanks a lot for the support!

Sincerely,
Saranya Sankaranarayanan
Master's Student, SoIC
Indiana University

From goutham.atla at gmail.com Wed Nov 12 23:22:46 2014
From: goutham.atla at gmail.com (Goutham atla)
Date: Thu, 13 Nov 2014 11:52:46 +0530
Subject: [maker-devel] URGENT: Re: maker failure with example data
In-Reply-To:
References: <3CCDC24F-756A-492C-8E7F-B1B97616EE45@genetics.utah.edu>
Message-ID:

Dear Carson,

Maker throws an error if I provide an rmlib file for repeat masking. It says:

At this time the hmmer search engine can only be used with the Dfam database. Please rerun your search without the -lib option or switch to a different search engine.

We ran it without rmlib and it completed successfully. We got GFF, proteins and transcripts.fasta files.

We are working on Oryza sativa (subspecies indica), but we have the fully annotated Oryza sativa (subspecies japonica). I would like to know what would be the best way to do a functional annotation of the GFF file produced by maker.

Regards,
Goutham

On Fri, Nov 7, 2014 at 11:09 AM, Goutham atla wrote:

> Dear Carson,
>
> Thanks for the quick reply. It worked after providing the assembled
> transcripts and protein fasta from closely related species.
>
> Regards,
> Goutham
>
> On Thu, Nov 6, 2014 at 12:34 PM, Carson Holt <carson.holt at genetics.utah.edu> wrote:
>
>> The final transcript and protein fasta files will only exist if there
>> were gene models with evidence support. If you did not provide an HMM for
>> one of the ab initio gene predictors (SNAP, Augustus, etc.) then there will
>> be no gene models, and if you do not provide protein or est evidence
>> fastas, then there will be no evidence support. Also if your contigs are
>> too short to contain gene models then there will be no models.
>> >> Thanks, >> Carson >> >> >> > > > -- > Goutham Atla > -- Goutham Atla -------------- next part -------------- An HTML attachment was scrubbed...
URL: From cjfields at illinois.edu Thu Nov 13 21:34:06 2014 From: cjfields at illinois.edu (Fields, Christopher J) Date: Fri, 14 Nov 2014 04:34:06 +0000 Subject: [maker-devel] BioPerl Bio::Tools::CodonTable changes Message-ID: <6615DD9E-10F6-4EFE-9900-F66317BDA0EE@illinois.edu> Carson, Just a note that we need to address a specific hack added last year in BioPerl for MAKER re: "strict" codon tables. I added a new one to the end of the list, not thinking that more would eventually be added, and that time has now come: https://github.com/bioperl/bioperl-live/issues/90 I'm not sure how MAKER is setting the table, but if it's by using the codon table #, that will likely subtly break, as it will now point to the new codon table from NCBI. chris From carsonhh at gmail.com Fri Nov 14 09:46:58 2014 From: carsonhh at gmail.com (Carson Holt) Date: Fri, 14 Nov 2014 09:46:58 -0700 Subject: [maker-devel] BioPerl Bio::Tools::CodonTable changes In-Reply-To: <6615DD9E-10F6-4EFE-9900-F66317BDA0EE@illinois.edu> References: <6615DD9E-10F6-4EFE-9900-F66317BDA0EE@illinois.edu> Message-ID: <27D24CD7-09E3-4618-8A52-578104B34E65@gmail.com> Actually, since I wanted to keep compatibility with old versions of BioPerl, I've been using the add_table method to just insert the table I need. Then I select it using the id method. But I think I like the idea of making the strictly canonical codon table be table 0, since having a strictly canonical codon table in BioPerl seems rather important. Thanks, Carson > On Nov 13, 2014, at 9:34 PM, Fields, Christopher J wrote: > > Carson, > > Just a note that we need to address a specific hack added last year in BioPerl for MAKER re: "strict" codon tables.
I added a new one to the end of the list, not thinking that more would eventually be added, and that time has now come: > > https://github.com/bioperl/bioperl-live/issues/90 > > I'm not sure how MAKER is setting the table, but if it's by using the codon table #, that will likely subtly break, as it will now point to the new codon table from NCBI. > > chris > From cjfields at illinois.edu Fri Nov 14 10:20:14 2014 From: cjfields at illinois.edu (Fields, Christopher J) Date: Fri, 14 Nov 2014 17:20:14 +0000 Subject: [maker-devel] BioPerl Bio::Tools::CodonTable changes In-Reply-To: <27D24CD7-09E3-4618-8A52-578104B34E65@gmail.com> References: <6615DD9E-10F6-4EFE-9900-F66317BDA0EE@illinois.edu> <27D24CD7-09E3-4618-8A52-578104B34E65@gmail.com> Message-ID: <22112917-6961-4F53-87F2-DC4EA9E2175E@illinois.edu> Okay, just wanted to make sure that a change in this wouldn't break MAKER. chris On Nov 14, 2014, at 10:46 AM, Carson Holt wrote: > Actually, since I wanted to keep compatibility with old versions of BioPerl, I've been using the add_table method to just insert the table I need. Then I select it using the id method. > > But I think I like the idea of making the strictly canonical codon table be table 0, since having a strictly canonical codon table in BioPerl seems rather important. > > Thanks, > Carson > > > > >> On Nov 13, 2014, at 9:34 PM, Fields, Christopher J wrote: >> >> Carson, >> >> Just a note that we need to address a specific hack added last year in BioPerl for MAKER re: "strict" codon tables. I added a new one to the end of the list, not thinking that more would eventually be added, and that time has now come: >> >> https://github.com/bioperl/bioperl-live/issues/90 >> >> I'm not sure how MAKER is setting the table, but if it's by using the codon table #, that will likely subtly break, as it will now point to the new codon table from NCBI.
>> >> chris >> > From xiaenhua at gmail.com Wed Nov 19 05:47:28 2014 From: xiaenhua at gmail.com (xiaenhua at gmail.com) Date: Wed, 19 Nov 2014 20:47:28 +0800 Subject: [maker-devel] ERROR: Failed while prepare section files Message-ID: <2014111920472385185424@gmail.com> Dear MAKER developer team, When I reran MAKER using the GFF3 files derived from the first MAKER run together with two newly generated sets of protein and EST evidence, the run failed. I set the parameters in the maker_opts.ctl file like this: ------------------------------------- genome=CSL.fasta maker_gff=CSL_1st_maker.gff; est_gff=osi_csl_maker.pasa_assemblies_Maker.gff3; protein_gff=CSL_wise.gff3; est2genome=1; protein2genome=1; other parameters left at their defaults. Then I ran MAKER via MPI. However, the second run failed. Below is the error message: ----------------------------------- preparing ab-inits preparing ab-inits preparing ab-inits gathering ab-init output files gathering ab-init output files gathering ab-init output files gathering ab-init output files prepare section files Gathering GFF3 input into hits - chunk:0 gathering ab-init output files gathering ab-init output files gathering ab-init output files gathering ab-init output files prepare section files Gathering GFF3 input into hits - chunk:0 prepare section files Died at /home/xiaenhua/SoftWare/maker/bin/../lib/Bio/Search/Hit/PhatHit/Base.pm line 188. --> rank=6, hostname=localhost.localdomain ERROR: Failed while prepare section files ERROR: Chunk failed at level:12, tier_type:3 FAILED CONTIG:scaffold3 ERROR: Chunk failed at level:4, tier_type:0 FAILED CONTIG:scaffold3 Gathering GFF3 input into hits - chunk:0 gathering ab-init output files prepare section files Gathering GFF3 input into hits - chunk:0 Died at /home/xiaenhua/SoftWare/maker/bin/../lib/Bio/Search/Hit/PhatHit/Base.pm line 188.
--> rank=8, hostname=localhost.localdomain ERROR: Failed while prepare section files ERROR: Chunk failed at level:12, tier_type:3 FAILED CONTIG:scaffold5 prepare section files ERROR: Chunk failed at level:4, tier_type:0 FAILED CONTIG:scaffold5 ........................ ........................ -------------------------------------- My protein evidence gff3 file looks like this: scaffold3 genewise match 1276842 1277727 . - . ID=GeneWise.45.m scaffold3 genewise match_part 1277687 1277727 . - . ID=GeneWise.45.cds_1;Parent=GeneWise.45.m scaffold3 genewise match_part 1276842 1277545 . - . ID=GeneWise.45.cds_2;Parent=GeneWise.45.m EST evidence gff3: scaffold3 match 1275835 1276664 . + . ID=align_24718.m scaffold3 match_part 1275835 1276664 . + . ID=align_24718.cds_1;Parent=align_24718.m scaffold3 match 2510415 2511782 . + . ID=align_24719.m scaffold3 match_part 2510415 2511782 . + . ID=align_24719.cds_1;Parent=align_24719.m scaffold3 match 4113431 4114364 . + . ID=align_24720.m scaffold3 match_part 4113431 4114364 . + . ID=align_24720.cds_1;Parent=align_24720.m I don't know what happened. Any help would be greatly appreciated! Thank you! All the best, En-Hua Xia -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Wed Nov 19 08:55:17 2014 From: carsonhh at gmail.com (Carson Holt) Date: Wed, 19 Nov 2014 08:55:17 -0700 Subject: [maker-devel] ERROR: Failed while prepare section files In-Reply-To: <2014111920472385185424@gmail.com> References: <2014111920472385185424@gmail.com> Message-ID: <824C3CBE-FD06-4571-A8AB-06710840FF41@gmail.com> Could you rerun with the latest MAKER release, just to make sure that it still happens with the current release (Version 2.31.7). Run with 'maker -a'. If it still happens, then send me the GFF3 files you are using as input, and I'll take a look. Basically it's happening because you are missing a start or end position for a feature in one of the files.
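One quick way to look for the kind of problem described above is to scan the evidence files for feature lines that lack a usable start or end. The following is a minimal Python sketch, not part of MAKER; the function names are made up for illustration, and it assumes tab-delimited GFF3 with nine columns. The second sample line mimics the EST lines as they appear in the pasted excerpt, which show eight fields (this may simply be an artifact of how the mail was formatted, so the real file should be checked).

```python
def check_gff3_line(line):
    """Return a list of problems found in one tab-delimited GFF3 feature line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return [f"expected 9 columns, found {len(fields)}"]
    problems = []
    # Columns 4 and 5 (indexes 3 and 4) hold the start and end coordinates.
    for name, value in (("start", fields[3]), ("end", fields[4])):
        if not (value.isascii() and value.isdigit()) or int(value) < 1:
            problems.append(f"{name} is not a positive integer: {value!r}")
    if not problems and int(fields[3]) > int(fields[4]):
        problems.append("start is greater than end")
    return problems


def check_gff3(lines):
    """Yield (line_number, problem) pairs for every malformed feature line."""
    for number, line in enumerate(lines, start=1):
        if line.startswith("##FASTA"):
            break  # everything after this directive is sequence, not features
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        for problem in check_gff3_line(line):
            yield number, problem


# Hypothetical sample: one well-formed line and one with the source column missing.
sample = [
    "scaffold3\tgenewise\tmatch\t1276842\t1277727\t.\t-\t.\tID=GeneWise.45.m\n",
    "scaffold3\tmatch\t1275835\t1276664\t.\t+\t.\tID=align_24718.m\n",
]
for number, problem in check_gff3(sample):
    print(f"line {number}: {problem}")
```

For a real file, an open file handle can be passed instead of the sample list, for example `check_gff3(open("CSL_wise.gff3"))`.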
--Carson > On Nov 19, 2014, at 5:47 AM, xiaenhua at gmail.com wrote: > > Dear Maker developer Team, > When I rerun maker using the first maker derived GFF3 files together with two newly generated evidence of Proteins and ESTs, I failed. I set the parameters in the maker_opt.ctl file like this: > ------------------------------------- > genome=CSL.fasta > maker_gff=CSL_1st_maker.gff; > est_gff=osi_csl_maker.pasa_assemblies_Maker.gff3; > protein_gff=CSL_wise.gff3; > est2genome=1; > protein2genome=1; > other parameters with default. > Then, I run maker via MPI. However, during the 2nd run, I failed. Below is the error message: > ----------------------------------- > preparing ab-inits > preparing ab-inits > preparing ab-inits > gathering ab-init output files > gathering ab-init output files > gathering ab-init output files > gathering ab-init output files > prepare section files > Gathering GFF3 input into hits - chunk:0 > gathering ab-init output files > gathering ab-init output files > gathering ab-init output files > gathering ab-init output files > prepare section files > Gathering GFF3 input into hits - chunk:0 > prepare section files > Died at /home/xiaenhua/SoftWare/maker/bin/../lib/Bio/Search/Hit/PhatHit/Base.pm line 188. > --> rank=6, hostname=localhost.localdomain > ERROR: Failed while prepare section files > ERROR: Chunk failed at level:12, tier_type:3 > FAILED CONTIG:scaffold3 > > ERROR: Chunk failed at level:4, tier_type:0 > FAILED CONTIG:scaffold3 > > Gathering GFF3 input into hits - chunk:0 > gathering ab-init output files > prepare section files > Gathering GFF3 input into hits - chunk:0 > Died at /home/xiaenhua/SoftWare/maker/bin/../lib/Bio/Search/Hit/PhatHit/Base.pm line 188.
> --> rank=8, hostname=localhost.localdomain > ERROR: Failed while prepare section files > ERROR: Chunk failed at level:12, tier_type:3 > FAILED CONTIG:scaffold5 > > prepare section files > ERROR: Chunk failed at level:4, tier_type:0 > FAILED CONTIG:scaffold5 > ........................ > ........................ > -------------------------------------- > My protein evidence gff3 file looks like this: > scaffold3 genewise match 1276842 1277727 . - . ID=GeneWise.45.m > scaffold3 genewise match_part 1277687 1277727 . - . ID=GeneWise.45.cds_1;Parent=GeneWise.45.m > scaffold3 genewise match_part 1276842 1277545 . - . ID=GeneWise.45.cds_2;Parent=GeneWise.45.m > > EST evidence gff3: > scaffold3 match 1275835 1276664 . + . ID=align_24718.m > scaffold3 match_part 1275835 1276664 . + . ID=align_24718.cds_1;Parent=align_24718.m > scaffold3 match 2510415 2511782 . + . ID=align_24719.m > scaffold3 match_part 2510415 2511782 . + . ID=align_24719.cds_1;Parent=align_24719.m > scaffold3 match 4113431 4114364 . + . ID=align_24720.m > scaffold3 match_part 4113431 4114364 . + . ID=align_24720.cds_1;Parent=align_24720.m > > I don't know what happened? Your any help will be appreciated greatly! > Thank you! > > All the best, > En-Hua Xia > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From ernesto at ebi.ac.uk Fri Nov 21 03:59:51 2014 From: ernesto at ebi.ac.uk (ernesto lowy gallego) Date: Fri, 21 Nov 2014 10:59:51 +0000 Subject: [maker-devel] Latest release of MAKER version 2.31.7 Message-ID: <546F1B27.90309@ebi.ac.uk> Hi, I am trying to find the features of the latest release of MAKER (version 2.31.7, released the 31/10/2014), Could you please let me know where can I find them? Thanks a lot! 
ernesto -- Developer VectorBase | Ensembl Genomes From carsonhh at gmail.com Fri Nov 21 08:04:07 2014 From: carsonhh at gmail.com (Carson Holt) Date: Fri, 21 Nov 2014 08:04:07 -0700 Subject: [maker-devel] Latest release of MAKER version 2.31.7 In-Reply-To: <546F1B27.90309@ebi.ac.uk> References: <546F1B27.90309@ebi.ac.uk> Message-ID: The only change is a bug fix for an issue that sometimes occurs when model_gff is mixed with correct_est_fusion=1 and always_complete=1. --Carson > On Nov 21, 2014, at 3:59 AM, ernesto lowy gallego wrote: > > Hi, > > I am trying to find the features of the latest release of MAKER (version 2.31.7, released the 31/10/2014), > > Could you please let me know where can I find them? > > Thanks a lot! > > ernesto > > -- > Developer > > VectorBase | Ensembl Genomes > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From muriel.grosb at gmail.com Fri Nov 21 09:07:39 2014 From: muriel.grosb at gmail.com (Muriel Gros-Balthazard) Date: Fri, 21 Nov 2014 17:07:39 +0100 Subject: [maker-devel] Repeat masking in Maker Message-ID: <546F634B.1000900@gmail.com> Hello, I generated my own library of repeats following the tutorial provided with Maker. I also wanted to use all the species from the RepBase library for the masking. It is not clear to me how this works in Maker. I set both of these options: model_org=all rmlib=allRepeats.lib However, when using RepeatMasker without Maker, you can't pass both -lib allRepeats.lib and -species all as options. You can only name one species when also using the -lib option (-species arabidopsis, for instance, and not -species all). What about Maker? Do I get masking with allRepeats.lib and also with the repeats of all species if I set these two options in Maker? model_org=all rmlib=allRepeats.lib Another question: It is said that RepeatRunner is used as well.
I put the option: repeat_protein=te_proteins.fasta But I realized that RepeatRunner was not installed on my computer! I had no problem running Maker. So is this te_proteins file used by RepeatMasker instead, to mask them? It is not clear to me how RepeatRunner is involved in the pipeline. Thanks a lot for your answers, Muriel -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael.s.campbell1 at gmail.com Fri Nov 21 10:09:17 2014 From: michael.s.campbell1 at gmail.com (Michael Campbell) Date: Fri, 21 Nov 2014 10:09:17 -0700 Subject: [maker-devel] Repeat masking in Maker In-Reply-To: <546F634B.1000900@gmail.com> References: <546F634B.1000900@gmail.com> Message-ID: Hi Muriel, By setting model_org=all, MAKER will run RepeatMasker using all of RepBase. MAKER will also run RepeatMasker to mask with your species-specific repeat library when you set rmlib=allRepeats.lib. For more information on which options can be used in the model_org= line of the maker_opts.ctl file, see the MAKER wiki: http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/The_MAKER_control_files_explained . A few releases back RepeatRunner was added internally to MAKER, so you don't have to install it separately. If you look in MAKER's error output you can find where MAKER called RepeatRunner. Thanks, Mike On Fri, Nov 21, 2014 at 9:07 AM, Muriel Gros-Balthazard < muriel.grosb at gmail.com> wrote: > Hello, > > I generated my own library of repeats following the tutorial provided with > Maker. > I also wanted to use all the species from the RepBase library for the > masking. > > It is not clear to me how this works in Maker. > Indeed, I put both these options : > model_org=all > rmlib=allRepeats.lib > > However, when using RepeatMasker without Maker, you can't put both -lib > allRepeats.lib and -species all as options.
> Indeed, you can only say one species when also using the -lib option (-species > arabidopsis for instance and not -species all) > > What about Maker ? > > Do I have masking of allRepeats.lib and also of all species repeats if I > put these two arguments in Maker ? > model_org=all > rmlib=allRepeats.lib > > Another question: > It is said that RepeatRunner is used as well. I put the option: > repeat_protein=te_proteins.fasta > But realized that RepeatRunner was not installed on my computer !!! > I had no problem to run Maker. > So, this file of te_proteins is used rather by RepeatMasker to mask them ? > It is not clear to me how RepeatRunner is involved in the pipeline ? > > Thanks a lot for your answers, > > Muriel > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > -- Michael Campbell MS, RD. Doctoral Candidate Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:585-3543 -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Fri Nov 21 10:21:28 2014 From: carsonhh at gmail.com (Carson Holt) Date: Fri, 21 Nov 2014 10:21:28 -0700 Subject: [maker-devel] Repeat masking in Maker In-Reply-To: References: <546F634B.1000900@gmail.com> Message-ID: Yes. If you set them both, then RepeatMasker runs twice (once with each setting), and the results are then combined. --Carson > On Nov 21, 2014, at 10:09 AM, Michael Campbell wrote: > > Hi Muriel, > > By setting model_org=all MAKER will run repeatmasker using all of RepBase. MAKER will also run repeatmasker to mask with your species specific repeat library when you set rmlib=allRepeats.lib.
> > For more information on what options can be used in the model_org= line of the maker_opts.ctl file you can find it here on the MAKER wiki > > http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/The_MAKER_control_files_explained . > > A few releases back Repeat runner was added internally to MAKER, so you don't have to install it seperatly. If you look in the MAKER output error you can find where MAKER called repeat runner. > > Thanks, > Mike > > On Fri, Nov 21, 2014 at 9:07 AM, Muriel Gros-Balthazard > wrote: > Hello, > > I generated my own library of repeats following the tutorial provided with Maker. > I also wanted to use all the species from the RepBase library for the masking. > > It is not clear to me how this works in Maker. > Indeed, I put both these options : > model_org=all > rmlib=allRepeats.lib > > However, when using RepeatMasker without Maker, you can't put both -lib allRepeats.lib and -species all as options. > Indeed, you can only say one species when also using the -lib option (-species arabidopsis for instance and not -species all) > > What about Maker ? > > Do I have masking of allRepeats.lib and also of all species repeats if I put these two arguments in Maker ? > model_org=all > rmlib=allRepeats.lib > > Another question: > It is said that RepeatRunner is used as well. I put the option: repeat_protein=te_proteins.fasta > But realized that RepeatRunner was not installed on my computer !!! > I had no problem to run Maker. > So, this file of te_proteins is used rather by RepeatMasker to mask them ? > It is not clear to me how RepeatRunner is involved in the pipeline ? > > Thanks a lot for your answers, > > Muriel > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > > > > -- > Michael Campbell MS, RD. 
> Doctoral Candidate > Eccles Institute of Human Genetics > University of Utah > 15 North 2030 East, Room 2100 > Salt Lake City, UT 84112-5330 > ph:585-3543 > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From muriel.grosb at gmail.com Thu Nov 27 02:22:26 2014 From: muriel.grosb at gmail.com (Muriel Gros-Balthazard) Date: Thu, 27 Nov 2014 10:22:26 +0100 Subject: [maker-devel] gff output Message-ID: <5476ED52.3060902@gmail.com> Hello, I have been using Maker to generate an annotation. I especially set these options: - est_gff with a list of transcripts.gff3 (Cufflinks output) - model_org=all - rmlib=allrepeats.lib - repeat_protein=te_prot.fasta - pred_gff= Augustus.gff3 (that I generated previously) I obtain a gff file for each of my contigs. However, here are the three possibilities in the second column : # est_gff:cufflinks # repeatmasker # repeatrunner I have no information about exons and introns. And I am wondering if the Augustus.gff3 was used... On top of that, I forgot to set up pred_stats to 1. If I understand well, I can just change this in the control file, and run Maker again. Since there is the output with everything, it won't run again the prediction, only this option. Is that right ? Thank you, Muriel From monica.poelchau at ars.usda.gov Fri Nov 7 06:17:04 2014 From: monica.poelchau at ars.usda.gov (Poelchau, Monica) Date: Fri, 7 Nov 2014 13:17:04 +0000 Subject: [maker-devel] calculating AED values between two datasets Message-ID: Hi everyone, I would like to generate a list of Maker AED values comparing two datasets: a set of computationally predicted genes, and manually curated genes from the Web Apollo program. The idea is to quantify the amount of nucleotide-level change that occurred during the manual curation process. I have tried to run Maker in several ways to generate the AED values. Both gene sets are in (as far as I can tell) valid gff3 format. First, I included the manually curated (Web Apollo) gff3 in the 'model_gff' field of maker_opts.exe, and the gff3 of the computational predictions in the 'est_gff' field, with all of the other prediction and evidence alignment settings turned off. All resulting AEDs from this analysis were 1, even though many of the annotations had 100% overlap. Next, instead of using the computational predictions in gff3 format, I used the fasta file of the cDNA sequence from the computational predictions in the 'est' field.
Here, the results made more sense, but a small but significant percentage of the AED values were 1 when they should have been less than 1. I have tried the 2 analyses above using both the gff3 output straight from Web Apollo, and the gff3 after running it through maker once as the only entry in the model_gff field, as explained in the MAKER2 paper (http://www.biomedcentral.com/1471-2105/12/491). This does not appear to make a difference. Do you have any ideas where I might start to debug this? Thanks for your help! Monica This electronic message contains information generated by the USDA solely for the intended recipients. Any unauthorized interception of this message or the use or disclosure of the information it contains may violate the law and subject the violator to civil or criminal penalties. If you believe you have received this message in error, please notify the sender and delete the email immediately. -------------- next part -------------- An HTML attachment was scrubbed... URL: From goutham.atla at gmail.com Thu Nov 6 22:39:51 2014 From: goutham.atla at gmail.com (Goutham atla) Date: Fri, 7 Nov 2014 11:09:51 +0530 Subject: [maker-devel] URGENT: Re: maker failure with example data In-Reply-To: <3CCDC24F-756A-492C-8E7F-B1B97616EE45@genetics.utah.edu> References: <3CCDC24F-756A-492C-8E7F-B1B97616EE45@genetics.utah.edu> Message-ID: Dear Carson, Thanks for the quick reply. It worked after providing the assembled transcripts and protein fasta from a closely related species. Regards, Goutham On Thu, Nov 6, 2014 at 12:34 PM, Carson Holt wrote: > The final transcript and proteins fasta files will only exist if there > were gene models with evidence support. If you did not provide an HMM for > one of the ab initio gene predictors (SNAP, Augustus, etc.) then there will > be no gene models, and if you do not provide protein or est evidence > fastas, then there will be no evidence support.
Also if your contigs are > too short to contain gene models then there will be no models. > > Thanks, > Carson > > > > -- > Goutham Atla > > > -- Goutham Atla -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Fri Nov 7 08:26:31 2014 From: carsonhh at gmail.com (Carson Holt) Date: Fri, 7 Nov 2014 08:26:31 -0700 Subject: [maker-devel] calculating AED values between two datasets In-Reply-To: References: Message-ID: <051D0D34-9E49-401F-B22D-16970EB93B66@gmail.com> If you got every value as 1 with the est_gff, then your GFF3 didn't load. The est_gff option is expecting alignments in match/match_part format, and you may not have had it correctly structured. For using fasta files instead, you may also need to set single_exon=1 and single_length=1, otherwise many of those alignments will be ignored for AED scoring purposes.
You should also look at the output in a viewer like Apollo to visualize the comparison and see whether the reason you get 1 is that the aligner can't recover the original transcript alignment.

--Carson

From monica.poelchau at ars.usda.gov Fri Nov 7 12:00:26 2014
From: monica.poelchau at ars.usda.gov (Poelchau, Monica)
Date: Fri, 7 Nov 2014 19:00:26 +0000
Subject: [maker-devel] calculating AED values between two datasets
In-Reply-To: <051D0D34-9E49-401F-B22D-16970EB93B66@gmail.com>
References: <051D0D34-9E49-401F-B22D-16970EB93B66@gmail.com>
Message-ID:

Thank you for the prompt reply, Carson! Yes, my gff3 was structured as gene models, not match/match_part, so reformatting it may do the trick.

Monica
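
For readers following this thread, the reformatting discussed here (gene models rewritten as match/match_part features) can be sketched in Python. This is an illustrative sketch only, not a MAKER utility. The function name gene_models_to_matches is hypothetical, and so are the choices to map mRNA to "match" and exon to "match_part", to keep only the ID, Name and Parent attributes, and to drop every other feature type. Whether MAKER needs further attributes on these features (for example Target) is not settled by this thread, so check the output against a GFF3 file that MAKER itself produced.

```python
def gene_models_to_matches(gff3_lines):
    """Rewrite mRNA/exon features as match/match_part (assumed mapping).

    Comment lines and all other feature types (gene, CDS, UTR, ...) are dropped.
    """
    out = []
    for line in gff3_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a standard nine-column GFF3 feature line
        # Parse the attribute column into key/value pairs.
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        if cols[2] == "mRNA":
            cols[2] = "match"
            # The match becomes the top-level feature, so Parent is dropped.
            kept = {k: attrs[k] for k in ("ID", "Name") if k in attrs}
        elif cols[2] == "exon":
            cols[2] = "match_part"
            kept = {k: attrs[k] for k in ("ID", "Parent") if k in attrs}
        else:
            continue
        cols[8] = ";".join(f"{key}={value}" for key, value in kept.items())
        out.append("\t".join(cols))
    return out
```

Exons shared by several transcripts (Parent=a,b) are passed through unchanged here; a real conversion would probably need to duplicate them, one copy per parent.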
From Timothy.Stitt at tgac.ac.uk Sat Nov 8 06:58:53 2014
From: Timothy.Stitt at tgac.ac.uk (Timothy Stitt (TGAC))
Date: Sat, 8 Nov 2014 13:58:53 +0000
Subject: [maker-devel] DBD::SQLite::db do failed errors
Message-ID:

Dear Maker Support,

I'm running Maker v2.31.7 and I'm receiving lots of the following warnings/errors during the run. The errors don't seem to prevent the calculation from completing, so I was just wondering how I can avoid getting them.

STATUS: Setting up database for any GFF3 input...
DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 1.
DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 2.
DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 3.
...

Thanks in advance,

Tim.
---
Timothy Stitt PhD / Head of Scientific Computing
The Genome Analysis Centre (TGAC)
http://www.tgac.ac.uk/
p: +44 1603 450378
e: timothy.stitt at tgac.ac.uk

From jimhu at email.tamu.edu Fri Nov 7 11:34:11 2014
From: jimhu at email.tamu.edu (Jim Hu)
Date: Fri, 7 Nov 2014 12:34:11 -0600
Subject: [maker-devel] Speaking of AED...
Message-ID:

I was teaching Yandell and Ence (2012) in the genomics class I co-teach, and was having trouble understanding how the values for AED in Box 4 Figure Bb derive from the evidence set in Figure Ba.

Box 4 says: "AED is calculated in the same manner as SN and SP, but in place of a reference gene model, the coordinates of the union of the aligned evidence (see panel Ba) are used instead". In the union, I expect that a bp that is in an exon in any of the evidence would be considered a TP. If so, then why isn't nt-level AED for Annotation 1 in Bb zero?

I'm probably missing something trivial.

Thanks

Jim
=====================================
Jim Hu
Professor
Dept. of Biochemistry and Biophysics
2128 TAMU
Texas A&M Univ.
College Station, TX 77843-2128
979-862-4054

From sarasank at umail.iu.edu Sat Nov 8 11:58:30 2014
From: sarasank at umail.iu.edu (Saranya Sankaranarayanan)
Date: Sat, 8 Nov 2014 13:58:30 -0500
Subject: [maker-devel] Fwd: AED plot
In-Reply-To:
References:
Message-ID:

Hi Maker authors,

I am new to using Maker. I have a few basic questions.

I have the maker annotation complete and I ran

gff3_merge -n -d genome_master_datastore_index.log

to create the gff file.

After that, I used the script AED_cdf_generator.pl to obtain the AED plot, but I get the error:

Use of uninitialized value $total in division (/) at ./AED_cdf_generator.pl line 43.
Illegal division by zero at ./AED_cdf_generator.pl line 43.

I passed my gff file as:

AED_cdf_generator.pl -b 0.025 maker.gff

Could anyone please help me with this error? Thank you. It looks like no value is passed to the variable total, but I am not able to decipher why.

Regards,
Saranya

From carsonhh at gmail.com Sat Nov 8 16:52:26 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Sat, 8 Nov 2014 16:52:26 -0700
Subject: [maker-devel] Speaking of AED...
In-Reply-To:
References:
Message-ID: <443253CC-838D-42A7-8FEB-8BAF442FAE9A@gmail.com>

I think I would agree. Annotation 1 is a perfect match to the evidence. It is ab initio 1 that would have had an AED of 0.2, but annotation 1 should have had an AED of 0.

--Carson

From dence at genetics.utah.edu Sat Nov 8 16:53:19 2014
From: dence at genetics.utah.edu (Daniel Ence)
Date: Sat, 8 Nov 2014 23:53:19 +0000
Subject: [maker-devel] Speaking of AED...
In-Reply-To:
References:
Message-ID: <5FC7C806-E03F-4DC3-8932-65F6C0E1A7EF@genetics.utah.edu>

Hi Professor Hu,

I'm excited that you're teaching from this review. I hope that you find it useful for your class!

Annotation 1 has an AED of 0.2 and not 0 because the middle exon doesn't line up exactly with the evidence alignments. Since there are bps in the annotation that aren't supported by evidence, it has an AED > 0. It's a little hard to see in the figure, but if you use a straight-edge, you can see it.

Feel free to let me know whether that helps clear things up.
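
The point about unsupported base pairs can be made concrete with a small calculation. The Python sketch below assumes the usual nucleotide-level definition, AED = 1 - (SN + SP)/2, where SN is the fraction of evidence bases covered by the annotation and SP is the fraction of annotation bases covered by the evidence. The function names and any coordinates passed to them are made up for illustration; nothing here is taken from the figure or from MAKER's own code.

```python
def covered_bases(intervals):
    """Return the set of positions covered by inclusive (start, end) intervals."""
    bases = set()
    for start, end in intervals:
        bases.update(range(start, end + 1))
    return bases


def aed(annotation_exons, evidence_exons):
    """Nucleotide-level AED between an annotation and the union of its evidence."""
    ann = covered_bases(annotation_exons)
    evi = covered_bases(evidence_exons)  # union over all evidence alignments
    if not ann or not evi:
        return 1.0  # nothing to compare: treat as completely unsupported
    overlap = len(ann & evi)
    sensitivity = overlap / len(evi)   # evidence bases recovered by the annotation
    specificity = overlap / len(ann)   # annotation bases supported by evidence
    return 1 - (sensitivity + specificity) / 2
```

With identical coordinates the result is 0. As soon as one exon boundary extends past the evidence, the specificity term drops below 1 and the AED becomes greater than 0, which is the situation described above for the middle exon.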
Thanks,
Daniel

From bmoore at genetics.utah.edu Sat Nov 8 16:38:23 2014
From: bmoore at genetics.utah.edu (Barry Moore)
Date: Sat, 8 Nov 2014 23:38:23 +0000
Subject: [maker-devel] Speaking of AED...
In-Reply-To:
References:
Message-ID:

The 5'-most junction on the 3' terminal exon (assuming + strand) is discordant in both Annotation 1 & 2 from the evidence in Ba.

B

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PastedGraphic-1.png
Type: image/png
Size: 344800 bytes
Desc: PastedGraphic-1.png

From carsonhh at gmail.com Sat Nov 8 17:13:44 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Sat, 8 Nov 2014 17:13:44 -0700
Subject: [maker-devel] DBD::SQLite::db do failed errors
In-Reply-To:
References:
Message-ID: <3100A718-B063-4BC5-A036-943DEBCC6484@gmail.com>

It's caused by one of the characters in your GFF3 file. For example, characters NOT in the set [a-zA-Z0-9.:^*$@!+_?-|] must be escaped in GFF3, with exceptions outlined in the format spec. You may have either a ' or a " that must be escaped.

--Carson

From bmoore at genetics.utah.edu Sat Nov 8 17:07:43 2014
From: bmoore at genetics.utah.edu (Barry Moore)
Date: Sun, 9 Nov 2014 00:07:43 +0000
Subject: [maker-devel] Speaking of AED...
In-Reply-To:
References:
Message-ID: <6AB384C1-4F5C-4132-9B3A-23F0DE3A9351@genetics.utah.edu>

Hmm, I missed the one Daniel pointed out, and then upon inspection noticed a third discordant exon. This figure needs to go into a book of optical illusions!

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PastedGraphic-2.png
Type: image/png
Size: 362479 bytes
Desc: PastedGraphic-2.png

From michael.s.campbell1 at gmail.com Sat Nov 8 23:01:27 2014
From: michael.s.campbell1 at gmail.com (Michael Campbell)
Date: Sat, 8 Nov 2014 23:01:27 -0700
Subject: [maker-devel] Fwd: AED plot
In-Reply-To:
References:
Message-ID:

Hi Saranya,

If you can send me a copy of your gff3 file, I can look at it and see why you are getting the error. That is a pretty young accessory script, so there may be something in your file that it hasn't seen before.

Thanks,
Mike

--
Michael Campbell MS, RD.
Doctoral Candidate
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
ph:585-3543

From muriel.grosb at gmail.com Mon Nov 10 03:35:30 2014
From: muriel.grosb at gmail.com (Muriel Gros-Balthazard)
Date: Mon, 10 Nov 2014 11:35:30 +0100
Subject: [maker-devel] running Maker but skipping first steps
Message-ID: <546094F2.6000100@gmail.com>

Hello,

I want to run Maker, but I would like to skip the first steps:

STATUS: Parsing control files...
STATUS: Processing and indexing input FASTA files...
STATUS: Setting up database for any GFF3 input...
A data structure will be created for you at:
/Data/Genomics/GeneAnnotation/Maker_pipeline/5_Run_Maker/Pdac_ref2013s.maker.output/Pdac_ref2013s_datastore
To access files for individual sequences use the datastore index:
/Data/Genomics/GeneAnnotation/Maker_pipeline/5_Run_Maker/Pdac_ref2013s.maker.output/Pdac_ref2013s_master_datastore_index.log

There was an error in repeat masking (and I reinstalled RepeatMasker), but I believe that the previous steps are always the same. Is there a way to run Maker so that it doesn't run these first steps again, given that the control files didn't change, the fasta files are already indexed and the gff3 database is set up?

Thank you!

Muriel

From FeatherstonJ at arc.agric.za Mon Nov 10 06:42:15 2014
From: FeatherstonJ at arc.agric.za (Jonathan Featherston)
Date: Mon, 10 Nov 2014 13:42:15 +0000
Subject: [maker-devel] Maker
Message-ID: <57CFF349-6C9F-4172-ADB3-A9572E21D4A3@arc.agric.za>

Dear Carson

I've been trying to train SNAP with Maker but I'm getting empty genome.ann and .dna files.
I have tried running the maker2zff on the implant page to see if my script was corrupt. No help from that. I've seen a few pages in the group and on seqanswers about the empty off files but most seem to have been resolved by including all outputs (maker2zff -n) and even this doesn't generate anything for me?. So I'm guessing the problem is somewhere with the maker outputs. I did get errors from the maker run but they seem to be about mli and ALRM (a perl error- what a pain getting perl libs on a mac). Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. -------------------------------------------------------------------------- mpiexec has exited due to process rank 4 with PID 5935 on node Administrators-MacBook-Pro-9 exiting improperly. There are three reasons this could occur: 1. this process did not call "init" before exiting, but others in the job did. This can cause a job to hang indefinitely while it waits for all processes to call "init". By rule, if one process calls "init", then ALL processes must call "init" prior to termination. 2. this process called "init", but exited without calling "finalize". By rule, all processes that call "init" MUST call "finalize" prior to exiting or it will be considered an "abnormal termination" 3. 
this process called "MPI_Abort" or "orte_abort" and the mca parameter orte_create_session_dirs is set to false. In this case, the run-time cannot detect that the abort call was an abnormal termination. Hence, the only error message you will receive is this one. This may have caused other processes in the application to be terminated by signals sent by mpiexec (as reported here). You can avoid this message by specifying -quiet on the mpiexec command line. Maker did finish and the gff file produced (I can't produce a fasta file from the est2genome=1 option??) seems ok. It has produced protein-matches and match_part although I don't see maker product. Otherwise I ran maker with mli using the command from the CPBI maker paper. I used -nohup mpiexec -n 8 maker < /dev/null & for my maker execution. I'm using altest and protein homology for now. Thank you very much for what help you can provide. I really enjoyed the workshop you and Mark presented! Kind Regards Jonathan -------------- next part -------------- An HTML attachment was scrubbed... URL: From Timothy.Stitt at tgac.ac.uk Mon Nov 10 06:59:59 2014 From: Timothy.Stitt at tgac.ac.uk (Timothy Stitt (TGAC)) Date: Mon, 10 Nov 2014 13:59:59 +0000 Subject: [maker-devel] DBD::SQLite::db do failed errors In-Reply-To: <3100A718-B063-4BC5-A036-943DEBCC6484@gmail.com> References: <3100A718-B063-4BC5-A036-943DEBCC6484@gmail.com> Message-ID: Thanks Carson. I checked the *.gff files for ' and " symbols. I only observed a bunch of " in one of the files as follows: scaffold16677 exonerate:protein2genome:local gene 128238 128710 339 - . gene_id 0 ; sequence Lus10000040|PACid:23139618 ; gene_orientation + scaffold16677 exonerate:protein2genome:local cds 128645 128710 . - . scaffold16677 exonerate:protein2genome:local exon 128645 128710 . - . insertions 0 ; deletions 0 scaffold16677 exonerate:protein2genome:local splice5 128643 128644 . - . 
intron_id 1 ; splice_site "GT" scaffold16677 exonerate:protein2genome:local intron 128552 128644 . - . intron_id 1 scaffold16677 exonerate:protein2genome:local splice3 128552 128553 . - . intron_id 0 ; splice_site "AG" scaffold16677 exonerate:protein2genome:local cds 128442 128551 . - . scaffold16677 exonerate:protein2genome:local exon 128442 128551 . - . insertions 0 ; deletions 0 scaffold16677 exonerate:protein2genome:local splice5 128440 128441 . - . intron_id 2 ; splice_site "GT" scaffold16677 exonerate:protein2genome:local intron 128362 128441 . - . intron_id 2 scaffold16677 exonerate:protein2genome:local splice3 128362 128363 . - . intron_id 1 ; splice_site "AG" Would these "GT", "AG" etc. strings cause the problem? If so, how should I change them to be correct? Thanks, Tim. --- Timothy Stitt PhD / Head of Scientific Computing The Genome Analysis Centre (TGAC) http://www.tgac.ac.uk/ p: +44 1603 450378 e: timothy.stitt at tgac.ac.uk From: Carson Holt > Date: Sunday, 9 November 2014 00:13 To: Timothy Stitt > Cc: "maker-devel at yandell-lab.org" > Subject: Re: [maker-devel] DBD::SQLite::db do failed errors It?s caused by one of the characters in your GFF3 file. For example characters NOT in the set [a-zA-Z0-9.:^*$@!+_?-|] must be escaped in GFF3 with exceptions outlined in the format spec. You mayhave either a ? or a ? that must be escaped. ?Carson On Nov 8, 2014, at 6:58 AM, Timothy Stitt (TGAC) > wrote: Dear Maker Support, I'm running Maker v2.31.7 and I'm receiving lots of the following warnings/errors during the run. The errors don't seem to prevent the calculation from completing so I was just wondering how I can avoid getting them? STATUS: Setting up database for any GFF3 input... DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 1. 
DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 2. DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 3. ? Thanks in advance, Tim. --- Timothy Stitt PhD / Head of Scientific Computing The Genome Analysis Centre (TGAC) http://www.tgac.ac.uk/ p: +44 1603 450378 e: timothy.stitt at tgac.ac.uk _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From myandell at genetics.utah.edu Sat Nov 8 17:32:12 2014 From: myandell at genetics.utah.edu (Mark Yandell) Date: Sun, 9 Nov 2014 00:32:12 +0000 Subject: [maker-devel] Speaking of AED... In-Reply-To: <6AB384C1-4F5C-4132-9B3A-23F0DE3A9351@genetics.utah.edu> References: , <6AB384C1-4F5C-4132-9B3A-23F0DE3A9351@genetics.utah.edu> Message-ID: <7A60AB257EFF2B48B1F4C814817EA053E3664681@mxb1.hg.genetics.utah.edu> And you are still missing one-- 3-prine end of the middle exon is also discordant. . I agree though somehow the color makes it hard to see. Sorry. --mark Mark Yandell Professor of Human Genetics H.A. & Edna Benning Presidential Endowed Chair Co-director USTAR Center for Genetic Discovery Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:801-587-7707 ________________________________________ From: maker-devel [maker-devel-bounces at yandell-lab.org] on behalf of Barry Moore [bmoore at genetics.utah.edu] Sent: Saturday, November 08, 2014 5:07 PM To: Jim Hu; maker-devel at yandell-lab.org Cc: Barry Moore Subject: Re: [maker-devel] Speaking of AED... Hmm, I missed the one Daniel pointed out, and then upon inspection noticed a third discordant exon. 
This figure needs to go into a book of optical illusions!

On Nov 8, 2014, at 4:38 PM, Barry Moore wrote:

The 5'-most junction on the 3' terminal exon (assuming + strand) is discordant in both Annotation 1 & 2 from the evidence in Ba.

B

On Nov 7, 2014, at 11:34 AM, Jim Hu wrote:

I was teaching Yandell and Ence (2012) in the genomics class I co-teach, and was having trouble understanding how the values for AED in Box 4 Figure Bb derive from the evidence set in Figure Ba.

Box 4 says: "AED is calculated in the same manner as SN and SP, but in place of a reference gene model, the coordinates of the union of the aligned evidence (see panel Ba) are used instead".

In the union, I expect that a bp that is in an exon in any of the evidence would be considered a TP. If so, then why isn't nt-level AED for Annotation 1 in Bb zero?

I'm probably missing something trivial.

Thanks

Jim
=====================================
Jim Hu
Professor
Dept. of Biochemistry and Biophysics
2128 TAMU
Texas A&M Univ.
College Station, TX 77843-2128
979-862-4054

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PastedGraphic-2.png
Type: image/png
Size: 362479 bytes
Desc: PastedGraphic-2.png

From sarasank at umail.iu.edu Sun Nov 9 09:33:08 2014
From: sarasank at umail.iu.edu (Saranya Sankaranarayanan)
Date: Sun, 9 Nov 2014 11:33:08 -0500
Subject: [maker-devel] Fwd: AED plot
In-Reply-To:
References:
Message-ID:

Hi Mike,

Please find the gff3 file attached with this email. Thanks a lot for the very prompt response.
Sincerely, Saranya Sankaranarayanan Master's Student, SoIC Indiana University On Sun, Nov 9, 2014 at 1:01 AM, Michael Campbell < michael.s.campbell1 at gmail.com> wrote: > Hi Sranya, > > If you can send me a copy of your gff3 file I can look at it and see why > you are getting the error. That is a pretty young accessory script so there > may be something in your file that it has't seen before. > > Thanks, > Mike > > On Sat, Nov 8, 2014 at 11:58 AM, Saranya Sankaranarayanan < > sarasank at umail.iu.edu> wrote: > >> Hi Maker authors, >> >> I am new to using Maker. I have a few basic questions. >> >> I have the maker annotation complete and I ran the >> >> gff3_merge -n -d genome_master_datastore_index.log - to create the gff >> file >> >> After that, I used the script AED_cdf_generator.pl to obtain the AED >> plot, while I get the error: >> >> >> Use of uninitialized value $total in division (/) at >> ./AED_cdf_generator.pl line 43. >> Illegal division by zero at ./AED_cdf_generator.pl line 43. >> >> I parsed my gff file as: >> AED_cdf_generator.pl -b 0.025 maker.gff >> >> Could anyone please help me with this error? Thank you. It looks like no >> value is parsed to the variable total, but I am not able to decipher why. >> >> Regards, >> Saranya >> >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >> > > > -- > Michael Campbell MS, RD. > Doctoral Candidate > Eccles Institute of Human Genetics > University of Utah > 15 North 2030 East, Room 2100 > Salt Lake City, UT 84112-5330 > ph:585-3543 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Gff3.zip
Type: application/zip
Size: 2098998 bytes
Desc: not available
URL:

From carsonhh at gmail.com Mon Nov 10 08:16:28 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 10 Nov 2014 08:16:28 -0700
Subject: [maker-devel] DBD::SQLite::db do failed errors
In-Reply-To:
References: <3100A718-B063-4BC5-A036-943DEBCC6484@gmail.com>
Message-ID:

Actually that is not a GFF3 file. It appears to be GTF, which is structured differently from GFF3. You would need to convert it to GFF3. You can try the Sequence Ontology converter here --> http://www.sequenceontology.org/cgi-bin/converter.cgi

Unfortunately it is unlikely to be a painless process. GTF files vary so much between sources that one GTF file might not actually be compatible with another GTF file, so you may have to spend some time editing the file for the converter to work.

--Carson

> On Nov 10, 2014, at 6:59 AM, Timothy Stitt (TGAC) wrote:
>
> Thanks Carson.
>
> I checked the *.gff files for ' and " symbols. I only observed a bunch of " in one of the files as follows:
>
> scaffold16677 exonerate:protein2genome:local gene 128238 128710 339 - . gene_id 0 ; sequence Lus10000040|PACid:23139618 ; gene_orientation +
> scaffold16677 exonerate:protein2genome:local cds 128645 128710 . - .
> scaffold16677 exonerate:protein2genome:local exon 128645 128710 . - . insertions 0 ; deletions 0
> scaffold16677 exonerate:protein2genome:local splice5 128643 128644 . - . intron_id 1 ; splice_site "GT"
> scaffold16677 exonerate:protein2genome:local intron 128552 128644 . - . intron_id 1
> scaffold16677 exonerate:protein2genome:local splice3 128552 128553 . - . intron_id 0 ; splice_site "AG"
> scaffold16677 exonerate:protein2genome:local cds 128442 128551 . - .
> scaffold16677 exonerate:protein2genome:local exon 128442 128551 . - . insertions 0 ; deletions 0
> scaffold16677 exonerate:protein2genome:local splice5 128440 128441 . - .
> intron_id 2 ; splice_site "GT" > scaffold16677 exonerate:protein2genome:local > intron 128362 > 128441 . > - . > intron_id 2 > scaffold16677 exonerate:protein2genome:local > splice3 128362 > 128363 . > - . > intron_id 1 ; splice_site "AG" > > Would these "GT", "AG" etc. strings cause the problem? If so, how should I change them to be correct? > > Thanks, > > Tim. > --- > Timothy Stitt PhD / Head of Scientific Computing > The Genome Analysis Centre (TGAC) > http://www.tgac.ac.uk/ > > p: +44 1603 450378 > e: timothy.stitt at tgac.ac.uk > > From: Carson Holt > > Date: Sunday, 9 November 2014 00:13 > To: Timothy Stitt > > Cc: "maker-devel at yandell-lab.org " > > Subject: Re: [maker-devel] DBD::SQLite::db do failed errors > > It?s caused by one of the characters in your GFF3 file. For example characters NOT in the set [a-zA-Z0-9.:^*$@!+_?-|] must be escaped in GFF3 with exceptions outlined in the format spec. You mayhave either a ? or a ? that must be escaped. > > ?Carson > > > > >> On Nov 8, 2014, at 6:58 AM, Timothy Stitt (TGAC) > wrote: >> >> Dear Maker Support, >> >> I'm running Maker v2.31.7 and I'm receiving lots of the following warnings/errors during the run. The errors don't seem to prevent the calculation from completing so I was just wondering how I can avoid getting them? >> >> >> STATUS: Setting up database for any GFF3 input... >> DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 1. >> DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 2. >> DBD::SQLite::db do failed: near ",": syntax error at /usr/users/TGAC_ga007/stittt/Software/MAKER/UV/2.31.7/bin/../lib/GFFDB.pm line 496, <$IN> line 3. >> ? >> >> >> Thanks in advance, >> >> Tim. 
>> ---
>> Timothy Stitt PhD / Head of Scientific Computing
>> The Genome Analysis Centre (TGAC)
>> http://www.tgac.ac.uk/
>>
>> p: +44 1603 450378
>> e: timothy.stitt at tgac.ac.uk

From carsonhh at gmail.com Mon Nov 10 08:23:12 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 10 Nov 2014 08:23:12 -0700
Subject: [maker-devel] running Maker but skipping first steps
In-Reply-To: <546094F2.6000100@gmail.com>
References: <546094F2.6000100@gmail.com>
Message-ID:

These are just status messages. The steps don't actually rerun, except for the control file parsing. That obviously has to happen every time for MAKER to know the control files are still the same between runs.

For both of these messages -->

STATUS: Processing and indexing input FASTA files...
STATUS: Setting up database for any GFF3 input...

MAKER sees that the indexes already exist, validates their integrity, and then moves on. So there is no rerunning of steps.

--Carson

> On Nov 10, 2014, at 3:35 AM, Muriel Gros-Balthazard wrote:
>
> Hello,
>
> I want to run Maker but I would like to skip the first steps :
> STATUS: Parsing control files...
> STATUS: Processing and indexing input FASTA files...
> STATUS: Setting up database for any GFF3 input...
> A data structure will be created for you at:
> /Data/Genomics/GeneAnnotation/Maker_pipeline/5_Run_Maker/Pdac_ref2013s.maker.output/Pdac_ref2013s_datastore
>
> To access files for individual sequences use the datastore index:
> /Data/Genomics/GeneAnnotation/Maker_pipeline/5_Run_Maker/Pdac_ref2013s.maker.output/Pdac_ref2013s_master_datastore_index.log
>
> Indeed, there was an error in RepeatMasking (and I reinstalled RepeatMasker) but I believe that the previous steps are always the same.
> Is there a way to run Maker so that it doesn't run these first steps again given that the control files didn't change, the fasta files are already indexed and the database of gff3 is set up ?
>
> Thank you !
>
> Muriel

From mike.thon at gmail.com Mon Nov 10 08:54:00 2014
From: mike.thon at gmail.com (Michael Thon)
Date: Mon, 10 Nov 2014 16:54:00 +0100
Subject: [maker-devel] map2assembly
Message-ID:

Hi -

We're using map2assembly to map genes from other gene annotation pipelines onto the genome sequence in order to compare AED values to a MAKER de novo annotation. We found a few transcripts that map2assembly maps to multiple loci in the genome. Is there any way to know if these are all equally good alignments, or are they all simply above some preset threshold? I'm trying to decide what to do with the multiple mappings: whether we should discard all but one (in that case we'd need to decide which one) or whether we should keep them all. Keeping them all makes the most sense, but the problem is they all have the same ID. Should map2assembly append a number to the ID when a transcript maps to multiple locations in the genome?

From carsonhh at gmail.com Mon Nov 10 09:08:26 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 10 Nov 2014 09:08:26 -0700
Subject: [maker-devel] map2assembly
In-Reply-To:
References:
Message-ID: <2D1BD5DD-7405-448F-B68D-80C8FEEDC6B3@gmail.com>

Try using the transcript score (column 6). It should indicate the % recovery. A score of 100 means a perfect match to the input transcript. The value is %identity multiplied by %coverage, so it will decrease because of a lack of identity or a lack of end-to-end alignment.
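
A rough sketch of how one might rank multiple mappings of the same transcript by that score. This is an illustration only, not MAKER code: the function names are hypothetical, the scaling of the score to a 0-100 range is inferred from "100 means a perfect match", and it assumes the mapped output is tab-delimited GFF3 in which all mappings of one transcript share the same ID attribute. The source name in column 2 of the example is made up.

```python
from collections import defaultdict


def recovery_score(pct_identity, pct_coverage):
    """Assumed scaling: %identity times %coverage, expressed on a 0-100 scale."""
    return pct_identity * pct_coverage / 100.0


def parse_attributes(column9):
    """Split a GFF3 attribute column into a dict (no URL-decoding is done)."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def rank_mappings(gff3_lines, key="ID"):
    """Group scored features by an attribute and sort each group best-first.

    Returns {attribute value: [(score, seqid, start, end), ...]}.
    """
    groups = defaultdict(list)
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, pragmas and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[5] == ".":
            continue  # not a feature line, or no score in column 6
        attrs = parse_attributes(cols[8])
        if key in attrs:
            groups[attrs[key]].append(
                (float(cols[5]), cols[0], int(cols[3]), int(cols[4]))
            )
    return {name: sorted(hits, reverse=True) for name, hits in groups.items()}


# Hypothetical example: one transcript mapped to two scaffolds.
example = [
    "##gff-version 3",
    "scaffold1\texample_source\tmatch\t100\t900\t100\t+\t.\tID=tx1;Name=tx1",
    "scaffold2\texample_source\tmatch\t50\t800\t88.2\t-\t.\tID=tx1;Name=tx1",
]
ranked = rank_mappings(example)
for transcript_id, hits in ranked.items():
    print(transcript_id, "best:", hits[0], "n_mappings:", len(hits))
```

Whether to keep only the top-scoring mapping or every mapping above some cutoff is a separate decision; the sketch only makes the scores easy to compare side by side.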
--Carson

> On Nov 10, 2014, at 8:54 AM, Michael Thon wrote:
>
> Hi -
> We're using map2assembly to map genes from other gene annotation pipelines onto the genome sequence in order to compare AED values to a MAKER de novo annotation. We found a few transcripts that map2assembly maps to multiple loci in the genome. Is there any way to know if these are all equally good alignments, or are they all simply above some preset threshold? I'm trying to decide what to do with the multiple mappings: whether we should discard all but one (in that case we'd need to decide which one) or whether we should keep them all. Keeping them all makes the most sense, but the problem is they all have the same ID. Should map2assembly append a number to the ID when a transcript maps to multiple locations in the genome?

From carsonhh at gmail.com Mon Nov 10 09:12:53 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 10 Nov 2014 09:12:53 -0700
Subject: [maker-devel] map2assembly
In-Reply-To: <2D1BD5DD-7405-448F-B68D-80C8FEEDC6B3@gmail.com>
References: <2D1BD5DD-7405-448F-B68D-80C8FEEDC6B3@gmail.com>
Message-ID: <681F266A-0549-4165-9261-FDD9F268D674@gmail.com>

You can also use the -l option when running gff3_merge to correct for non-unique IDs when merging multiple GFF3 files (i.e. IDs will be unique within a file, but may not be unique across files; when mapping transcripts, the IDs are copied directly from the aligned transcript).

--Carson

> On Nov 10, 2014, at 9:08 AM, Carson Holt wrote:
>
> Try using the transcript score (column 6). It should indicate the % recovery. A score of 100 means a perfect match to the input transcript. The value is %identity multiplied by %coverage, so it will decrease because of a lack of identity or a lack of end-to-end alignment.
>
> --Carson
>
>> On Nov 10, 2014, at 8:54 AM, Michael Thon wrote:
>>
>> Hi -
>> We're using map2assembly to map genes from other gene annotation pipelines onto the genome sequence in order to compare AED values to a MAKER de novo annotation. We found a few transcripts that map2assembly maps to multiple loci in the genome. Is there any way to know if these are all equally good alignments, or are they all simply above some preset threshold? I'm trying to decide what to do with the multiple mappings: whether we should discard all but one (in that case we'd need to decide which one) or whether we should keep them all. Keeping them all makes the most sense, but the problem is they all have the same ID. Should map2assembly append a number to the ID when a transcript maps to multiple locations in the genome?
>

From carsonhh at gmail.com Mon Nov 10 09:34:33 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 10 Nov 2014 09:34:33 -0700
Subject: [maker-devel] Maker
In-Reply-To: <57CFF349-6C9F-4172-ADB3-A9572E21D4A3@arc.agric.za>
References: <57CFF349-6C9F-4172-ADB3-A9572E21D4A3@arc.agric.za>
Message-ID: <1354797B-F783-4671-BB94-42A0E1611B03@gmail.com>

You probably have an error further upstream. The 'Argument "ALRM" isn't numeric' error is just something you get when things are dying in a non-elegant way; the actual cause will be further up the error log.

The lack of fasta files means that you have no final gene models. Either your contigs are too short to produce a model, or your evidence alignments are insufficient in end-to-end coverage, splice site recovery on polishing, or %identity, so MAKER cannot derive a usable model from alignment alone. What is your longest contig? Also try running CEGMA from the Korf lab to help identify whether the assembly is incomplete, and by how much.
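
For the "longest contig" question, a minimal sketch of one way to check contig lengths in the genome FASTA. This is a standalone illustration, not a MAKER utility: the function names are hypothetical, and it assumes an ordinary FASTA file in which each record starts with a ">" header line.

```python
def contig_lengths(fasta_lines):
    """Return {contig name: sequence length} for FASTA text given as lines."""
    lengths = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def longest_contig(lengths):
    """Return (name, length) of the longest contig, or None if there are none."""
    if not lengths:
        return None
    return max(lengths.items(), key=lambda item: item[1])


# Hypothetical example; for a real assembly, pass an open file handle instead,
# e.g. contig_lengths(open("genome.fasta")).
example = [">contig1 test", "ACGT", "ACG", ">contig2", "AC"]
lengths = contig_lengths(example)
print(longest_contig(lengths))  # ('contig1', 7)
```

If even the longest contig is shorter than a typical gene in the organism, the absence of gene models is expected regardless of the evidence supplied.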
?Carson > On Nov 10, 2014, at 6:42 AM, Jonathan Featherston wrote: > > Dear Carson > > I've been trying to train SNAP with Maker but I'm getting empty genome.ann and .dna files. I have tried running the maker2zff on the implant page to see if my script was corrupt. No help from that. I've seen a few pages in the group and on seqanswers about the empty off files but most seem to have been resolved by including all outputs (maker2zff -n) and even this doesn't generate anything for me?. So I'm guessing the problem is somewhere with the maker outputs. > > I did get errors from the maker run but they seem to be about mli and ALRM (a perl error- what a pain getting perl libs on a mac). > > > Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. > Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. > Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. > Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. > Argument "ALRM" isn't numeric in exit at /Users/Jonathan/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/darwin-2level/forks.pm line 2184. > -------------------------------------------------------------------------- > mpiexec has exited due to process rank 4 with PID 5935 on > node Administrators-MacBook-Pro-9 exiting improperly. There are three reasons this could occur: > > 1. this process did not call "init" before exiting, but others in > the job did. This can cause a job to hang indefinitely while it waits > for all processes to call "init". By rule, if one process calls "init", > then ALL processes must call "init" prior to termination. > > 2. 
this process called "init", but exited without calling "finalize".
> By rule, all processes that call "init" MUST call "finalize" prior to
> exiting or it will be considered an "abnormal termination"
>
> 3. this process called "MPI_Abort" or "orte_abort" and the mca parameter
> orte_create_session_dirs is set to false. In this case, the run-time cannot
> detect that the abort call was an abnormal termination. Hence, the only
> error message you will receive is this one.
>
> This may have caused other processes in the application to be
> terminated by signals sent by mpiexec (as reported here).
>
> You can avoid this message by specifying -quiet on the mpiexec command line.
>
> Maker did finish and the gff file produced (I can't produce a fasta file from the est2genome=1 option??) seems ok. It has produced protein-matches and match_part although I don't see maker product.
>
> Otherwise I ran maker with mli using the command from the CPBI maker paper. I used -nohup mpiexec -n 8 maker < /dev/null & for my maker execution.
>
> I'm using altest and protein homology for now.
>
> Thank you very much for what help you can provide. I really enjoyed the workshop you and Mark presented!
>
> Kind Regards
> Jonathan

From michael.s.campbell1 at gmail.com Mon Nov 10 13:54:14 2014
From: michael.s.campbell1 at gmail.com (Michael Campbell)
Date: Mon, 10 Nov 2014 13:54:14 -0700
Subject: [maker-devel] Fwd: AED plot
In-Reply-To:
References:
Message-ID:

Hi Saranya,

I fixed the AED_cdf_generator.pl script and added it to the svn repository for MAKER, so it will be available in the next MAKER release. If you are using the svn repository you can do an svn update and get the new version of the script in the MAKER bin.
If not I've attached a copy of the script to this email (I removed the .pl extension to the file since some email servers will block .pl files). let me know if you have any more problems with it. Thanks, Mike On Sun, Nov 9, 2014 at 9:33 AM, Saranya Sankaranarayanan < sarasank at umail.iu.edu> wrote: > Hi Mike, > > Please find the gff3 file attached with this email. Thanks a lot for the > very prompt response. > > Sincerely, > Saranya Sankaranarayanan > Master's Student, SoIC > Indiana University > > On Sun, Nov 9, 2014 at 1:01 AM, Michael Campbell < > michael.s.campbell1 at gmail.com> wrote: > >> Hi Sranya, >> >> If you can send me a copy of your gff3 file I can look at it and see why >> you are getting the error. That is a pretty young accessory script so there >> may be something in your file that it has't seen before. >> >> Thanks, >> Mike >> >> On Sat, Nov 8, 2014 at 11:58 AM, Saranya Sankaranarayanan < >> sarasank at umail.iu.edu> wrote: >> >>> Hi Maker authors, >>> >>> I am new to using Maker. I have a few basic questions. >>> >>> I have the maker annotation complete and I ran the >>> >>> gff3_merge -n -d genome_master_datastore_index.log - to create the gff >>> file >>> >>> After that, I used the script AED_cdf_generator.pl to obtain the AED >>> plot, while I get the error: >>> >>> >>> Use of uninitialized value $total in division (/) at >>> ./AED_cdf_generator.pl line 43. >>> Illegal division by zero at ./AED_cdf_generator.pl line 43. >>> >>> I parsed my gff file as: >>> AED_cdf_generator.pl -b 0.025 maker.gff >>> >>> Could anyone please help me with this error? Thank you. It looks like no >>> value is parsed to the variable total, but I am not able to decipher why. >>> >>> Regards, >>> Saranya >>> >>> >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >>> >> >> >> -- >> Michael Campbell MS, RD. 
>> Doctoral Candidate >> Eccles Institute of Human Genetics >> University of Utah >> 15 North 2030 East, Room 2100 >> Salt Lake City, UT 84112-5330 >> ph:585-3543 >> >> > -- Michael Campbell MS, RD. Doctoral Candidate Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:585-3543 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: AED_cdf_generator Type: application/octet-stream Size: 2981 bytes Desc: not available URL: From sarasank at umail.iu.edu Mon Nov 10 14:06:58 2014 From: sarasank at umail.iu.edu (Saranya Sankaranarayanan) Date: Mon, 10 Nov 2014 16:06:58 -0500 Subject: [maker-devel] Fwd: AED plot In-Reply-To: References: Message-ID: Great! It works now. Thanks a lot for the support! Sincerely, Saranya Sankaranarayanan Master's Student, SoIC Indiana University On Mon, Nov 10, 2014 at 3:54 PM, Michael Campbell < michael.s.campbell1 at gmail.com> wrote: > Hi Saranya, > > I fixed the AED_cdf_generator.pl scrip and added it to the svn repository > for MAKER so it will be available in the next MAKE release. If you are > using the svn repository you can do an svn update and get the new version > of the script in the MAKER bin. If not I've attached a copy of the script > to this email (I removed the .pl extension to the file since some email > servers will block .pl files). let me know if you have any more problems > with it. > > Thanks, > Mike > > On Sun, Nov 9, 2014 at 9:33 AM, Saranya Sankaranarayanan < > sarasank at umail.iu.edu> wrote: > >> Hi Mike, >> >> Please find the gff3 file attached with this email. Thanks a lot for the >> very prompt response. 
>> >> Sincerely, >> Saranya Sankaranarayanan >> Master's Student, SoIC >> Indiana University >> >> On Sun, Nov 9, 2014 at 1:01 AM, Michael Campbell < >> michael.s.campbell1 at gmail.com> wrote: >> >>> Hi Sranya, >>> >>> If you can send me a copy of your gff3 file I can look at it and see why >>> you are getting the error. That is a pretty young accessory script so there >>> may be something in your file that it has't seen before. >>> >>> Thanks, >>> Mike >>> >>> On Sat, Nov 8, 2014 at 11:58 AM, Saranya Sankaranarayanan < >>> sarasank at umail.iu.edu> wrote: >>> >>>> Hi Maker authors, >>>> >>>> I am new to using Maker. I have a few basic questions. >>>> >>>> I have the maker annotation complete and I ran the >>>> >>>> gff3_merge -n -d genome_master_datastore_index.log - to create the >>>> gff file >>>> >>>> After that, I used the script AED_cdf_generator.pl to obtain the AED >>>> plot, while I get the error: >>>> >>>> >>>> Use of uninitialized value $total in division (/) at >>>> ./AED_cdf_generator.pl line 43. >>>> Illegal division by zero at ./AED_cdf_generator.pl line 43. >>>> >>>> I parsed my gff file as: >>>> AED_cdf_generator.pl -b 0.025 maker.gff >>>> >>>> Could anyone please help me with this error? Thank you. It looks like >>>> no value is parsed to the variable total, but I am not able to decipher why. >>>> >>>> Regards, >>>> Saranya >>>> >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>>> >>>> >>> >>> >>> -- >>> Michael Campbell MS, RD. >>> Doctoral Candidate >>> Eccles Institute of Human Genetics >>> University of Utah >>> 15 North 2030 East, Room 2100 >>> Salt Lake City, UT 84112-5330 >>> ph:585-3543 >>> >>> >> > > > -- > Michael Campbell MS, RD. 
> Doctoral Candidate
> Eccles Institute of Human Genetics
> University of Utah
> 15 North 2030 East, Room 2100
> Salt Lake City, UT 84112-5330
> ph:585-3543

From goutham.atla at gmail.com Wed Nov 12 23:22:46 2014
From: goutham.atla at gmail.com (Goutham atla)
Date: Thu, 13 Nov 2014 11:52:46 +0530
Subject: [maker-devel] URGENT: Re: maker failure with example data
In-Reply-To:
References: <3CCDC24F-756A-492C-8E7F-B1B97616EE45@genetics.utah.edu>
Message-ID:

Dear Carson,

MAKER throws an error if I provide an rmlib file for repeat masking. It says:

At this time the hmmer search engine can only be used with the Dfam database. Please rerun your search without the -lib option or switch to a different search engine.

We ran it without rmlib and it completed successfully. We got GFF, proteins and transcripts.fasta files.

We are working on Oryza sativa (subspecies indica), and a fully annotated Oryza sativa (subspecies japonica) genome is available. I would like to know what would be the best way to do a functional annotation of the GFF file given by MAKER.

Regards,
Goutham

On Fri, Nov 7, 2014 at 11:09 AM, Goutham atla wrote:

> Dear Carson,
>
> Thanks for the quick reply. It worked after providing the assembled
> transcripts and protein fasta from closely related species.
>
> Regards,
> Goutham
>
> On Thu, Nov 6, 2014 at 12:34 PM, Carson Holt <
> carson.holt at genetics.utah.edu> wrote:
>
>> The final transcript and proteins fasta files will only exist if there
>> were gene models with evidence support. If you did not provide an HMM for
>> one of the ab initio gene predictors (SNAP, Augustus, etc.) then there will
>> be no gene models, and if you do not provide protein or est evidence
>> fastas, then there will be no evidence support. Also if your contigs are
>> too short to contain gene models then there will be no models.
>> >> Thanks, >> Carson >> >> >> >> On Nov 5, 2014, at 11:49 PM, Goutham atla >> wrote: >> >> Dear All, >> >> I have finished running maker. But I realised that there are no >> *transcripts.fasta and *protein.fasta files in any of the directories that >> make has created. It has only gtf files. >> >> Example output of a test run: I have similar results on original file >> also: >> >> [User at motif jcf7180001838744]$ pwd >> >> /home/User/Maker_Annotation/Maker_test.maker.output/Maker_test_datastore/35/C1/jcf7180001838744 >> [User at motif jcf7180001838744]$ ls >> jcf7180001838744.gff run.log theVoid.jcf7180001838744 >> >> Any help from you in figuring out why there are no protein.fasta >> and transcripts.fast would be very helpful. >> >> Regards, >> Goutham >> >> On Wed, Oct 1, 2014 at 11:28 AM, Goutham atla >> wrote: >> >>> Dear All, >>> >>> Thank you. I figured out th problem is with mpich2. I was behind >>> mpich2 but was unsuccessful. I installed mpich v3 and its working fine now. >>> Thank you all. The old GMDO tutorials are bit misleading as the new >>> versions have come up. >>> >>> On Wed, Oct 1, 2014 at 11:09 AM, Marc H?ppner >> > wrote: >>> >>>> Another possibility could be that MPICH2 wasn?t build properly, no? I >>>> remember something with enabling shared libraries during the compilation of >>>> mpich, without which the error below would appear. >>>> >>>> /Marc >>>> >>>> Marc P. Hoeppner, PhD >>>> Team Leader >>>> BILS Genome Annotation Platform >>>> Department for Medical Biochemistry and Microbiology >>>> Uppsala University, Sweden >>>> marc.hoeppner at imbim.uu.se >>>> >>>> >>>> >>>> On 30 Sep 2014, at 21:33, Carson Holt >>>> wrote: >>>> >>>> The message is warning that there are multiple instances of MAKER >>>> running, but no MPI communication. When you build MAKER (perl Build.PL step >>>> when installing MAKER), you need to specify the location of 'mpicc' and >>>> 'mpi.h' to build with MPI support. 
Otherwise you won't be able to link >>>> against MPICH2 shared libraries. You probably need to rerun that step. >>>> >>>> --Carson >>>> >>>> >>>> From: Goutham atla >>>> Date: Tuesday, September 30, 2014 at 10:49 AM >>>> To: Carson Holt >>>> Cc: "maker-devel at yandell-lab.org" >>>> Subject: URGENT: Re: maker failure with example data >>>> >>>> Hi Carson, >>>> >>>> I figured out the problem is with RepeatMasker installation and I >>>> fixed it. >>>> >>>> I am running maker with MPICH2 and I get the following warning when I >>>> start it: >>>> >>>> >>>> >>>> *STATUS: Processing and indexing input FASTA files... WARNING: Multiple >>>> MAKER processes have been started in the same directory.* >>>> >>>> I would like to if this is common. >>>> >>>> Regards, >>>> Goutham >>>> >>>> >>>> On Tue, Sep 30, 2014 at 12:02 PM, Goutham atla >>>> wrote: >>>> >>>>> Dear Carson, >>>>> >>>>> Thank you for the reply. I reinstalled the BioPerl and now I am >>>>> getting the following error on test data. >>>>> >>>>> ERROR: RepeatMasker failed >>>>> --> rank=NA, hostname=motif >>>>> ERROR: Failed while doing repeat masking >>>>> ERROR: Chunk failed at level:0, tier_type:1 >>>>> FAILED CONTIG:contig-dpp-500-500 >>>>> >>>>> On Mon, Sep 29, 2014 at 8:17 PM, Carson Holt < >>>>> carson.holt at genetics.utah.edu> wrote: >>>>> >>>>>> The error is caused by the BioPerl indexer returning an empty >>>>>> length for the indexed fasta sequence (possibly because of a corrupt index >>>>>> file or other reasons). You may need to reinstall BioPerl (use the CPAN >>>>>> version not the BioPerl-live version), or reinstall Berkley DB (used by the >>>>>> BioPerl indexer), or reinstall the Perl module DB_File via CPAN (Perl's >>>>>> interface to Berkley DB). After reinstalling BioPerl, delete the >>>>>> mpi_blastdb directory for the MAKER run before retrying. 
>>>>>>
>>>>>> Also verify that the /tmp directory on your system or the directory
>>>>>> pointed to by TMP= in the maker_opts.ctl file is not full and that TMP= is
>>>>>> not set to an NFS mounted location.
>>>>>>
>>>>>> Thanks,
>>>>>> Carson
>>>>>>
>>>>>>
>>>>>> From: Goutham atla
>>>>>> Date: Monday, September 29, 2014 at 6:33 AM
>>>>>> To:
>>>>>> Subject: maker failure with example data
>>>>>>
>>>>>> Dear All,
>>>>>>
>>>>>> I am running maker with the demo file, i.e. dip_contig.fasta, by
>>>>>> keeping all other parameters in .ctl files as default. But it does not
>>>>>> progress and shows the following message that the length of the sequence is
>>>>>> 0. Can anybody help me?
>>>>>>
>>>>>>
>>>>>> --Next Contig--
>>>>>>
>>>>>> MAKER WARNING: All old files will be erased before continuing
>>>>>>
>>>>>> #---------------------------------------------------------------------
>>>>>> Skipping the contig because it is too short!!
>>>>>> SeqID: contig-dpp-500-500
>>>>>> Length: 0
>>>>>>
>>>>>> #---------------------------------------------------------------------
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Goutham
>>>>>>
>>>>>
>>>>> --
>>>>> Goutham Atla
>>>>>
>>>>
>>>> --
>>>> Goutham Atla
>>>> _______________________________________________
>>>> maker-devel mailing list
>>>> maker-devel at box290.bluehost.com
>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>>>
>>>
>>> --
>>> Goutham Atla
>>>
>>
>> --
>> Goutham Atla
>>
>
> --
> Goutham Atla
>

--
Goutham Atla

From cjfields at illinois.edu Thu Nov 13 21:34:06 2014
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Fri, 14 Nov 2014 04:34:06 +0000
Subject: [maker-devel] BioPerl Bio::Tools::CodonTable changes
Message-ID: <6615DD9E-10F6-4EFE-9900-F66317BDA0EE@illinois.edu>

Carson,

Just a note that we need to address a specific hack added last year in BioPerl for MAKER re: "strict" codon tables. I added a new one to the end of the list, not thinking that more would eventually be added, and that time has now come:

https://github.com/bioperl/bioperl-live/issues/90

I'm not sure how MAKER is setting the table, but if it's by using the codon table # that will likely subtly break as it will now point to the new codon table from NCBI.

chris

From carsonhh at gmail.com Fri Nov 14 09:46:58 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 14 Nov 2014 09:46:58 -0700
Subject: [maker-devel] BioPerl Bio::Tools::CodonTable changes
In-Reply-To: <6615DD9E-10F6-4EFE-9900-F66317BDA0EE@illinois.edu>
References: <6615DD9E-10F6-4EFE-9900-F66317BDA0EE@illinois.edu>
Message-ID: <27D24CD7-09E3-4618-8A52-578104B34E65@gmail.com>

Actually since I wanted to keep compatibility with old versions of BioPerl, I've been using the add_table method to just insert the table I need. Then I select it using the id method.

But I think I like the idea of making the strictly canonical codon table be table 0, since having a strictly canonical codon table in BioPerl seems rather important.

Thanks,
Carson

> On Nov 13, 2014, at 9:34 PM, Fields, Christopher J wrote:
>
> Carson,
>
> Just a note that we need to address a specific hack added last year in BioPerl for MAKER re: "strict" codon tables.
> I added a new one to the end of the list, not thinking that more would eventually be added, and that time has now come:
>
> https://github.com/bioperl/bioperl-live/issues/90
>
> I'm not sure how MAKER is setting the table, but if it's by using the codon table # that will likely subtly break as it will now point to the new codon table from NCBI.
>
> chris
>

From cjfields at illinois.edu Fri Nov 14 10:20:14 2014
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Fri, 14 Nov 2014 17:20:14 +0000
Subject: [maker-devel] BioPerl Bio::Tools::CodonTable changes
In-Reply-To: <27D24CD7-09E3-4618-8A52-578104B34E65@gmail.com>
References: <6615DD9E-10F6-4EFE-9900-F66317BDA0EE@illinois.edu> <27D24CD7-09E3-4618-8A52-578104B34E65@gmail.com>
Message-ID: <22112917-6961-4F53-87F2-DC4EA9E2175E@illinois.edu>

Okay, just wanted to make sure that a change in this wouldn't break MAKER.

chris

On Nov 14, 2014, at 10:46 AM, Carson Holt wrote:

> Actually since I wanted to keep compatibility with old versions of BioPerl, I've been using the add_table method to just insert the table I need. Then I select it using the id method.
>
> But I think I like the idea of making the strictly canonical codon table be table 0, since having a strictly canonical codon table in BioPerl seems rather important.
>
> Thanks,
> Carson
>
>
>> On Nov 13, 2014, at 9:34 PM, Fields, Christopher J wrote:
>>
>> Carson,
>>
>> Just a note that we need to address a specific hack added last year in BioPerl for MAKER re: "strict" codon tables. I added a new one to the end of the list, not thinking that more would eventually be added, and that time has now come:
>>
>> https://github.com/bioperl/bioperl-live/issues/90
>>
>> I'm not sure how MAKER is setting the table, but if it's by using the codon table # that will likely subtly break as it will now point to the new codon table from NCBI.
>>
>> chris
>>
>

From xiaenhua at gmail.com Wed Nov 19 05:47:28 2014
From: xiaenhua at gmail.com (xiaenhua at gmail.com)
Date: Wed, 19 Nov 2014 20:47:28 +0800
Subject: [maker-devel] ERROR: Failed while prepare section files
Message-ID: <2014111920472385185424@gmail.com>

Dear Maker developer Team,

When I reran MAKER using the GFF3 file derived from the first MAKER run together with two newly generated sets of evidence (proteins and ESTs), the run failed. I set the parameters in the maker_opts.ctl file like this:
-------------------------------------
genome=CSL.fasta
maker_gff=CSL_1st_maker.gff;
est_gff=osi_csl_maker.pasa_assemblies_Maker.gff3;
protein_gff=CSL_wise.gff3;
est2genome=1;
protein2genome=1;
All other parameters were left at their defaults.

I then ran MAKER via MPI. However, the second run failed. Below is the error message:
-----------------------------------
preparing ab-inits
preparing ab-inits
preparing ab-inits
gathering ab-init output files
gathering ab-init output files
gathering ab-init output files
gathering ab-init output files
prepare section files
Gathering GFF3 input into hits - chunk:0
gathering ab-init output files
gathering ab-init output files
gathering ab-init output files
gathering ab-init output files
prepare section files
Gathering GFF3 input into hits - chunk:0
prepare section files
Died at /home/xiaenhua/SoftWare/maker/bin/../lib/Bio/Search/Hit/PhatHit/Base.pm line 188.
--> rank=6, hostname=localhost.localdomain
ERROR: Failed while prepare section files
ERROR: Chunk failed at level:12, tier_type:3
FAILED CONTIG:scaffold3

ERROR: Chunk failed at level:4, tier_type:0
FAILED CONTIG:scaffold3

Gathering GFF3 input into hits - chunk:0
gathering ab-init output files
prepare section files
Gathering GFF3 input into hits - chunk:0
Died at /home/xiaenhua/SoftWare/maker/bin/../lib/Bio/Search/Hit/PhatHit/Base.pm line 188.
--> rank=8, hostname=localhost.localdomain
ERROR: Failed while prepare section files
ERROR: Chunk failed at level:12, tier_type:3
FAILED CONTIG:scaffold5

prepare section files
ERROR: Chunk failed at level:4, tier_type:0
FAILED CONTIG:scaffold5
........................
........................
--------------------------------------
My protein evidence gff3 file looks like this:
scaffold3 genewise match 1276842 1277727 . - . ID=GeneWise.45.m
scaffold3 genewise match_part 1277687 1277727 . - . ID=GeneWise.45.cds_1;Parent=GeneWise.45.m
scaffold3 genewise match_part 1276842 1277545 . - . ID=GeneWise.45.cds_2;Parent=GeneWise.45.m

EST evidence gff3:
scaffold3 match 1275835 1276664 . + . ID=align_24718.m
scaffold3 match_part 1275835 1276664 . + . ID=align_24718.cds_1;Parent=align_24718.m
scaffold3 match 2510415 2511782 . + . ID=align_24719.m
scaffold3 match_part 2510415 2511782 . + . ID=align_24719.cds_1;Parent=align_24719.m
scaffold3 match 4113431 4114364 . + . ID=align_24720.m
scaffold3 match_part 4113431 4114364 . + . ID=align_24720.cds_1;Parent=align_24720.m

I don't know what happened. Any help will be greatly appreciated!
Thank you!

All the best,
En-Hua Xia

From carsonhh at gmail.com Wed Nov 19 08:55:17 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 19 Nov 2014 08:55:17 -0700
Subject: [maker-devel] ERROR: Failed while prepare section files
In-Reply-To: <2014111920472385185424@gmail.com>
References: <2014111920472385185424@gmail.com>
Message-ID: <824C3CBE-FD06-4571-A8AB-06710840FF41@gmail.com>

Could you rerun with the latest MAKER release, just to make sure that it still happens with the current release (version 2.31.7)? Run with 'maker -a'. If it still happens, then send me the GFF3 files you are using as input, and I'll take a look. Basically it's happening because you are missing a start or end position for a feature in one of the files.
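
One way to look for a feature with a missing start or end position is to scan each input GFF3 file for lines that do not have nine tab-separated columns or whose start and end fields are not integers. The following is a minimal sketch in Python and is not part of MAKER; the function name `find_bad_gff3_lines` is hypothetical. For what it is worth, the EST lines quoted in the original message show only eight fields (no source column), which a check like this would flag, although whether that is the actual cause here is only an assumption.

```python
import sys


def find_bad_gff3_lines(lines):
    """Return (line_number, reason) pairs for malformed GFF3 feature lines.

    Hypothetical helper for illustration only; it is not a MAKER function.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines and comment/directive lines (e.g. ##gff-version 3).
        if not line.strip() or line.startswith("#"):
            continue
        # A FASTA header marks the end of the feature table.
        if line.startswith(">"):
            break
        columns = line.split("\t")
        if len(columns) != 9:
            problems.append(
                (number, f"expected 9 tab-separated columns, found {len(columns)}")
            )
            continue
        start, end = columns[3], columns[4]
        if not (start.isdigit() and end.isdigit()):
            problems.append((number, f"non-integer start/end: {start!r}, {end!r}"))
        elif int(start) > int(end):
            problems.append((number, f"start {start} is greater than end {end}"))
    return problems


if __name__ == "__main__":
    # Check every file named on the command line and report suspicious lines.
    for path in sys.argv[1:]:
        with open(path) as handle:
            for number, reason in find_bad_gff3_lines(handle):
                print(f"{path}:{number}: {reason}")
```

Saved under a name such as check_gff3.py (a made-up file name), it could be run against the est_gff and protein_gff inputs before restarting MAKER.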
--Carson

> On Nov 19, 2014, at 5:47 AM, xiaenhua at gmail.com wrote:
>
> Dear Maker developer Team,
> When I reran MAKER using the GFF3 file derived from the first MAKER run together with two newly generated sets of evidence (proteins and ESTs), the run failed. I set the parameters in the maker_opts.ctl file like this:
> -------------------------------------
> genome=CSL.fasta
> maker_gff=CSL_1st_maker.gff;
> est_gff=osi_csl_maker.pasa_assemblies_Maker.gff3;
> protein_gff=CSL_wise.gff3;
> est2genome=1;
> protein2genome=1;
> All other parameters were left at their defaults.
> I then ran MAKER via MPI. However, the second run failed. Below is the error message:
> -----------------------------------
> preparing ab-inits
> preparing ab-inits
> preparing ab-inits
> gathering ab-init output files
> gathering ab-init output files
> gathering ab-init output files
> gathering ab-init output files
> prepare section files
> Gathering GFF3 input into hits - chunk:0
> gathering ab-init output files
> gathering ab-init output files
> gathering ab-init output files
> gathering ab-init output files
> prepare section files
> Gathering GFF3 input into hits - chunk:0
> prepare section files
> Died at /home/xiaenhua/SoftWare/maker/bin/../lib/Bio/Search/Hit/PhatHit/Base.pm line 188.
> --> rank=6, hostname=localhost.localdomain
> ERROR: Failed while prepare section files
> ERROR: Chunk failed at level:12, tier_type:3
> FAILED CONTIG:scaffold3
>
> ERROR: Chunk failed at level:4, tier_type:0
> FAILED CONTIG:scaffold3
>
> Gathering GFF3 input into hits - chunk:0
> gathering ab-init output files
> prepare section files
> Gathering GFF3 input into hits - chunk:0
> Died at /home/xiaenhua/SoftWare/maker/bin/../lib/Bio/Search/Hit/PhatHit/Base.pm line 188.
> --> rank=8, hostname=localhost.localdomain
> ERROR: Failed while prepare section files
> ERROR: Chunk failed at level:12, tier_type:3
> FAILED CONTIG:scaffold5
>
> prepare section files
> ERROR: Chunk failed at level:4, tier_type:0
> FAILED CONTIG:scaffold5
> ........................
> ........................
> --------------------------------------
> My protein evidence gff3 file looks like this:
> scaffold3 genewise match 1276842 1277727 . - . ID=GeneWise.45.m
> scaffold3 genewise match_part 1277687 1277727 . - . ID=GeneWise.45.cds_1;Parent=GeneWise.45.m
> scaffold3 genewise match_part 1276842 1277545 . - . ID=GeneWise.45.cds_2;Parent=GeneWise.45.m
>
> EST evidence gff3:
> scaffold3 match 1275835 1276664 . + . ID=align_24718.m
> scaffold3 match_part 1275835 1276664 . + . ID=align_24718.cds_1;Parent=align_24718.m
> scaffold3 match 2510415 2511782 . + . ID=align_24719.m
> scaffold3 match_part 2510415 2511782 . + . ID=align_24719.cds_1;Parent=align_24719.m
> scaffold3 match 4113431 4114364 . + . ID=align_24720.m
> scaffold3 match_part 4113431 4114364 . + . ID=align_24720.cds_1;Parent=align_24720.m
>
> I don't know what happened. Any help will be greatly appreciated!
> Thank you!
>
> All the best,
> En-Hua Xia

From ernesto at ebi.ac.uk Fri Nov 21 03:59:51 2014
From: ernesto at ebi.ac.uk (ernesto lowy gallego)
Date: Fri, 21 Nov 2014 10:59:51 +0000
Subject: [maker-devel] Latest release of MAKER version 2.31.7
Message-ID: <546F1B27.90309@ebi.ac.uk>

Hi,

I am trying to find the features of the latest release of MAKER (version 2.31.7, released on 31/10/2014).

Could you please let me know where I can find them?

Thanks a lot!

ernesto

--
Developer

VectorBase | Ensembl Genomes

From carsonhh at gmail.com Fri Nov 21 08:04:07 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 21 Nov 2014 08:04:07 -0700
Subject: [maker-devel] Latest release of MAKER version 2.31.7
In-Reply-To: <546F1B27.90309@ebi.ac.uk>
References: <546F1B27.90309@ebi.ac.uk>
Message-ID:

The only change is a bug fix for an issue that sometimes occurs when model_gff is mixed with correct_est_fusion=1 and always_complete=1.

--Carson

> On Nov 21, 2014, at 3:59 AM, ernesto lowy gallego wrote:
>
> Hi,
>
> I am trying to find the features of the latest release of MAKER (version 2.31.7, released on 31/10/2014).
>
> Could you please let me know where I can find them?
>
> Thanks a lot!
>
> ernesto
>
> --
> Developer
>
> VectorBase | Ensembl Genomes

From muriel.grosb at gmail.com Fri Nov 21 09:07:39 2014
From: muriel.grosb at gmail.com (Muriel Gros-Balthazard)
Date: Fri, 21 Nov 2014 17:07:39 +0100
Subject: [maker-devel] Repeat masking in Maker
Message-ID: <546F634B.1000900@gmail.com>

Hello,

I generated my own library of repeats following the tutorial provided with Maker.
I also wanted to use all the species from the RepBase library for the masking.

It is not clear to me how this works in Maker.
I set both of these options:
model_org=all
rmlib=allRepeats.lib

However, when using RepeatMasker without Maker, you can't put both -lib allRepeats.lib and -species all as options.
Indeed, you can only specify one species when also using the -lib option (-species arabidopsis for instance and not -species all).

What about Maker?

Do I get masking with allRepeats.lib and also with the repeats of all species if I put these two arguments in Maker?
model_org=all
rmlib=allRepeats.lib

Another question:
It is said that RepeatRunner is used as well.
I put the option: repeat_protein=te_proteins.fasta
But I realized that RepeatRunner was not installed on my computer!
I had no problem running Maker.
So is the te_proteins file used by RepeatMasker instead to mask them?
It is not clear to me how RepeatRunner is involved in the pipeline.

Thanks a lot for your answers,

Muriel

From michael.s.campbell1 at gmail.com Fri Nov 21 10:09:17 2014
From: michael.s.campbell1 at gmail.com (Michael Campbell)
Date: Fri, 21 Nov 2014 10:09:17 -0700
Subject: [maker-devel] Repeat masking in Maker
In-Reply-To: <546F634B.1000900@gmail.com>
References: <546F634B.1000900@gmail.com>
Message-ID:

Hi Muriel,

By setting model_org=all, MAKER will run RepeatMasker using all of RepBase. MAKER will also run RepeatMasker to mask with your species-specific repeat library when you set rmlib=allRepeats.lib.

For more information on what options can be used in the model_org= line of the maker_opts.ctl file, see the MAKER wiki:

http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/The_MAKER_control_files_explained .

A few releases back, RepeatRunner was added internally to MAKER, so you don't have to install it separately. If you look in the MAKER error output, you can find where MAKER called RepeatRunner.

Thanks,
Mike

On Fri, Nov 21, 2014 at 9:07 AM, Muriel Gros-Balthazard <
muriel.grosb at gmail.com> wrote:

> Hello,
>
> I generated my own library of repeats following the tutorial provided with
> Maker.
> I also wanted to use all the species from the RepBase library for the
> masking.
>
> It is not clear to me how this works in Maker.
> I set both of these options:
> model_org=all
> rmlib=allRepeats.lib
>
> However, when using RepeatMasker without Maker, you can't put both -lib
> allRepeats.lib and -species all as options.
> Indeed, you can only specify one species when also using the -lib option (-species
> arabidopsis for instance and not -species all).
>
> What about Maker?
>
> Do I get masking with allRepeats.lib and also with the repeats of all species if I
> put these two arguments in Maker?
> model_org=all
> rmlib=allRepeats.lib
>
> Another question:
> It is said that RepeatRunner is used as well. I put the option:
> repeat_protein=te_proteins.fasta
> But I realized that RepeatRunner was not installed on my computer!
> I had no problem running Maker.
> So is the te_proteins file used by RepeatMasker instead to mask them?
> It is not clear to me how RepeatRunner is involved in the pipeline.
>
> Thanks a lot for your answers,
>
> Muriel
>

--
Michael Campbell MS, RD.
Doctoral Candidate
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
ph:585-3543

From carsonhh at gmail.com Fri Nov 21 10:21:28 2014
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 21 Nov 2014 10:21:28 -0700
Subject: [maker-devel] Repeat masking in Maker
In-Reply-To:
References: <546F634B.1000900@gmail.com>
Message-ID:

Yes. If you set them both, then RepeatMasker runs twice (once with each setting), and then combines the results.

--Carson

> On Nov 21, 2014, at 10:09 AM, Michael Campbell wrote:
>
> Hi Muriel,
>
> By setting model_org=all, MAKER will run RepeatMasker using all of RepBase. MAKER will also run RepeatMasker to mask with your species-specific repeat library when you set rmlib=allRepeats.lib.
>
> For more information on what options can be used in the model_org= line of the maker_opts.ctl file, see the MAKER wiki:
>
> http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/The_MAKER_control_files_explained .
>
> A few releases back, RepeatRunner was added internally to MAKER, so you don't have to install it separately. If you look in the MAKER error output, you can find where MAKER called RepeatRunner.
>
> Thanks,
> Mike
>
> On Fri, Nov 21, 2014 at 9:07 AM, Muriel Gros-Balthazard wrote:
> Hello,
>
> I generated my own library of repeats following the tutorial provided with Maker.
> I also wanted to use all the species from the RepBase library for the masking.
>
> It is not clear to me how this works in Maker.
> I set both of these options:
> model_org=all
> rmlib=allRepeats.lib
>
> However, when using RepeatMasker without Maker, you can't put both -lib allRepeats.lib and -species all as options.
> Indeed, you can only specify one species when also using the -lib option (-species arabidopsis for instance and not -species all).
>
> What about Maker?
>
> Do I get masking with allRepeats.lib and also with the repeats of all species if I put these two arguments in Maker?
> model_org=all
> rmlib=allRepeats.lib
>
> Another question:
> It is said that RepeatRunner is used as well. I put the option: repeat_protein=te_proteins.fasta
> But I realized that RepeatRunner was not installed on my computer!
> I had no problem running Maker.
> So is the te_proteins file used by RepeatMasker instead to mask them?
> It is not clear to me how RepeatRunner is involved in the pipeline.
>
> Thanks a lot for your answers,
>
> Muriel
>
>
> --
> Michael Campbell MS, RD.
> Doctoral Candidate
> Eccles Institute of Human Genetics
> University of Utah
> 15 North 2030 East, Room 2100
> Salt Lake City, UT 84112-5330
> ph:585-3543

From muriel.grosb at gmail.com Thu Nov 27 02:22:26 2014
From: muriel.grosb at gmail.com (Muriel Gros-Balthazard)
Date: Thu, 27 Nov 2014 10:22:26 +0100
Subject: [maker-devel] gff output
Message-ID: <5476ED52.3060902@gmail.com>

Hello,

I have been using Maker to generate an annotation.
In particular, I set these options:
- est_gff with a list of transcripts.gff3 (Cufflinks output)
- model_org=all
- rmlib=allrepeats.lib
- repeat_protein=te_prot.fasta
- pred_gff=Augustus.gff3 (that I generated previously)

I obtain a GFF file for each of my contigs.
However, only these three values appear in the second column:
- est_gff:cufflinks
- repeatmasker
- repeatrunner

I have no information about exons and introns, and I am wondering whether Augustus.gff3 was used...

On top of that, I forgot to set pred_stats to 1.
If I understand correctly, I can just change this in the control file and run Maker again. Since the full output already exists, it won't rerun the prediction, only apply this option. Is that right?

Thank you,

Muriel