From kenlee.nakasugi at sydney.edu.au Sun Mar 3 17:44:01 2013
From: kenlee.nakasugi at sydney.edu.au (Kenlee Nakasugi)
Date: Mon, 04 Mar 2013 10:44:01 +1100
Subject: [maker-devel] regarding mpich
In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6E7062FF@CHIMBX5.ad.uillinois.edu>
References: <118F034CF4C3EF48A96F86CE585B94BF6E7062FF@CHIMBX5.ad.uillinois.edu>
Message-ID: <1362354241.2252.38.camel@waterhouse874-8>

Hi,

I'm trying to install mpi_maker (Maker 2.1) on a new system (intel ix x86_64) which has mpich v3.0.2 installed, but I can't seem to get maker Build.PL to recognize it. I tried editing the Build.pm file to point to it, but with no success.
All dependencies have been installed and successfully recognized; it is just MPI support that is not.

Is there anything I could modify in the install scripts to make it recognize this? Currently, the direct path to where mpicc and mpiexec are is /apps/mpich/3.0.2/bin
I don't have sys admin rights for the machine, and I'm not sure if this version of mpich was installed for shared libraries as per the GMOD tutorial. But I have previously circumvented this with an earlier version of mpich by modifying the Build.pm module with success. I'm wondering if mpich v3.0.2 is not compatible?

Cheers,
Ken

--
Kenlee Nakasugi | Research Fellow
School of Molecular Bioscience
Level 8, SMB Building (G08) | The University of Sydney | NSW | 2006
T: +61 2 9114 1321
E: kenlee.nakasugi at sydney.edu.au

From carsonhh at gmail.com Mon Mar 4 07:35:03 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 04 Mar 2013 08:35:03 -0500
Subject: [maker-devel] regarding mpich
In-Reply-To: <1362354241.2252.38.camel@waterhouse874-8>
Message-ID:

Use the last MPICH2 version, as MPICH3 is very different (it's the first attempt to implement the new MPI3 protocol set, and not just a version update). Alternatively you can use OpenMPI. Also use maker version 2.27 instead for MPI.

Thanks,
Carson

_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From canchaya at uvigo.es Mon Mar 4 05:10:26 2013
From: canchaya at uvigo.es (Carlos A. Canchaya)
Date: Mon, 4 Mar 2013 12:10:26 +0100
Subject: [maker-devel] Sharing benchmarks of maker
References: <6472D2A0-7BA8-41F0-ACFD-4D3C800D36FB@uvigo.es>
Message-ID: <7F41714C-6C75-4892-AA5B-D7649DDA7DF2@uvigo.es>

Hi,

I've just installed maker2 on our server and run a first test with our data. The input was about 30 000 sequences (9.6 Mb) and it was run on just one server with 32 processors for 36 hours with mpich2.
Our server has 250 Gb of memory and cpus of 2,4 Gb. The test was simple because it only ran repeatmasker and SNAP.

Considering that we would like to use other gene prediction/annotation tools available in MAKER, I wonder if you can share some of your benchmarks so that we can judge how well we could scale up to our production cluster in order to annotate our 1.6 Gb draft genome.

Best,
Carlos

Carlos A. Canchaya, PhD
IPP Research Fellow
Department of Biochemistry, Genetics and Immunology
Faculty of Biology
Campus Universitario
University of Vigo
36310 Vigo
Spain
http://darwin.uvigo.es/~ccanchaya/
email: canchaya at uvigo.es
Tel: +34 986 130048
Fax: +34 986 812556

From carsonhh at gmail.com Mon Mar 4 09:12:06 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 04 Mar 2013 10:12:06 -0500
Subject: [maker-devel] Sharing benchmarks of maker
In-Reply-To: <7F41714C-6C75-4892-AA5B-D7649DDA7DF2@uvigo.es>
Message-ID:

Performance is highly dependent on the size of evidence datasets used (proteins/ESTs) as well as the IO performance of a system when running via MPI (you can hit IO bottlenecks well before cpu bottlenecks depending on cluster configuration).

The Arabidopsis genome (120 Mb assembly) running SNAP and Augustus, 1.1 Gb EST dataset, and 10 Mb protein dataset takes ~1 hour 30 min on 1,500 cpus with OpenMPI.

The Maize genome (2.1 Gb) running SNAP and Augustus, 3 Gb EST dataset, and 16 Mb protein dataset takes ~4 hours 30 min on 2,200 cpus.

A human-sized genome would take 5-6 days on 100 cpus.

MAKER is fully restartable (keeps log of progress). So if there is any failure or the user kills it in the middle of a job, it will pick up at the point it left off on restart (so you don't waste all that processing time).

2 Gb of RAM per processing core is recommended when parallelizing MAKER via MPI, but fragmented genomes with smaller contigs can get by with less than 1 Gb per core.

MAKER version 2.28, which has additional optimization for OpenMPI and a lower memory footprint, will be available in a couple of weeks. Until then 2.27 is recommended over 2.1 for MPI. 2.27 should also work with OpenMPI. 2.1 only works with older versions of MPICH2 using the mpd launcher and not the current hydra launcher.

Thanks,
Carson

From carsonhh at gmail.com Mon Mar 4 09:33:02 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 04 Mar 2013 10:33:02 -0500
Subject: [maker-devel] Sharing benchmarks of maker
In-Reply-To:
Message-ID:

For the Arabidopsis genome it also took ~2 hours 10 min on 600 cpus, so there was only a 40 min gain by going from 600 to 1,500 cpus.
This is because assembly structure has a lot to do with the efficiency of the parallelization, so you can hit a point of diminishing returns on some assemblies sooner than others.

--Carson

From kenlee.nakasugi at sydney.edu.au Mon Mar 4 14:50:27 2013
From: kenlee.nakasugi at sydney.edu.au (Kenlee Nakasugi)
Date: Mon, 4 Mar 2013 20:50:27 +0000
Subject: [maker-devel] regarding mpich
In-Reply-To:
References: <1362354241.2252.38.camel@waterhouse874-8>,
Message-ID:

Thanks Carson. Will Maker 2.27 be able to continue analysis on Maker 2.1 files that stopped halfway?

Thanks
Ken

From dsth at ebi.ac.uk Mon Mar 4 14:57:01 2013
From: dsth at ebi.ac.uk (Daniel Hughes)
Date: Mon, 4 Mar 2013 20:57:01 +0000
Subject: [maker-devel] regarding mpich
In-Reply-To:
References: <1362354241.2252.38.camel@waterhouse874-8>
Message-ID:

Unlikely. Probably safer to export what has finished as GFF and run it as re-annotation if you don't want to waste what was already processed for running additional iterations.

Dan

from me phone...
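A minimal sketch of the re-annotation route suggested above, assuming the finished results have already been exported to a single GFF3 file and that maker_opts.ctl uses its usual `key=value #comment` layout. The helper name `enable_gff_passthrough` and the file names are hypothetical, and which `*_pass` switches to turn on depends on what you want carried over; this is an illustration, not the procedure the list members actually used.

```shell
#!/bin/sh
# Hypothetical helper, not part of MAKER: point maker_opts.ctl at a GFF3 of
# finished results and switch on the pass-through options.
enable_gff_passthrough() {
    ctl=$1    # path to maker_opts.ctl
    gff=$2    # GFF3 exported from the finished contigs (must not contain '|')
    tmp="$ctl.tmp.$$"

    # Set maker_gff=, keeping any trailing "#comment" on the line.
    sed "s|^maker_gff=[^#]*|maker_gff=$gff |" "$ctl" > "$tmp" && mv "$tmp" "$ctl"

    # Flip each pass-through switch from 0 to 1 (assumed set; trim as needed).
    for key in est_pass altest_pass protein_pass rm_pass model_pass pred_pass other_pass; do
        sed "s|^$key=0|$key=1|" "$ctl" > "$tmp" && mv "$tmp" "$ctl"
    done
}
```

Usage would look like `enable_gff_passthrough maker_opts.ctl finished.gff`, after which the run is restarted with the newer MAKER version against the edited control file.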
From carsonhh at gmail.com Mon Mar 4 14:58:21 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 04 Mar 2013 15:58:21 -0500
Subject: [maker-devel] regarding mpich
In-Reply-To:
Message-ID:

Some files it can reuse, but not all. So, exporting finished contigs with GFF3 pass-through is an option.

--Carson

From kenlee.nakasugi at sydney.edu.au Mon Mar 4 19:49:09 2013
From: kenlee.nakasugi at sydney.edu.au (Kenlee Nakasugi)
Date: Tue, 05 Mar 2013 12:49:09 +1100
Subject: [maker-devel] hex char:29 error with Signal.pm
Message-ID: <1362448149.6346.46.camel@waterhouse874-8>

Hi again,

I'm running into the following error when I run maker 2.1:

##
Ran into unknown state (hex char: 29) at /home/programs/maker/lib/File/..//Proc/Signal.pm line 94.
##

I tried applying the patch as described here:
http://gmod.827538.n3.nabble.com/cluster-error-running-maker-td4022354.html

Using the command:
$ patch -np1 < 646785-and-handle-Hex29.patch

I did this in maker/lib/Proc and maker/lib/Process directories, but am getting this error:

##
patch: **** Only garbage was found in the patch input.
##

Apparently, this isn't a fatal error:
http://gmod.827538.n3.nabble.com/Ran-into-unknown-state-hex-char-29-at-home-a200302-maker-2-10-lib-File-Proc-Signal-pm-line-94-td3034795.html
and I might eventually have to run the latest version of Maker, but I need to continue a previous analysis and not having this constant error would be great.

The version of Proc::ProcessTable is already the latest, 0.47. The platform is ix x86_64 GNU/Linux.

Thanks,
Ken

From carsonhh at gmail.com Mon Mar 4 22:48:17 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 04 Mar 2013 23:48:17 -0500
Subject: [maker-devel] hex char:29 error with Signal.pm
In-Reply-To: <1362448149.6346.46.camel@waterhouse874-8>
Message-ID:

This is an issue with Proc::ProcessTable on some systems. If you upgrade to MAKER 2.27 it goes away because it no longer uses Proc::ProcessTable.

Thanks,
Carson

From Carson.Holt at oicr.on.ca Wed Mar 6 11:45:40 2013
From: Carson.Holt at oicr.on.ca (Carson Holt)
Date: Wed, 6 Mar 2013 17:45:40 +0000
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
Message-ID:

The failed thread is usually just a symptom. There is something causing the thread to fail. Could you send me your STDERR? Oftentimes there is a warning or error further up.

Thanks,
Carson

From ramonfallon at gmail.com Wed Mar 6 11:34:59 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Wed, 6 Mar 2013 18:34:59 +0100
Subject: [maker-devel] thread terminated, causing all processes to fail
Message-ID:

Hi,

I'm using the maker_svn rev 995 version and hand-compiled MPICH2 on a single multicore machine.

I've successfully run the dpp_contig.fasta (MPI/8 processes) example but am having trouble with larger contig fasta files of my own, which are well formed.

I've run into a problem whereby an mpiexec run of 8 processes will stop due to a perl-thread related problem which says

FATAL: Thread terminated, causing all processes to fail

this corresponds to line 924 in the maker executable (which is for the secondary/worker threads), and is the result of a test on !$thr OR'd with !$thr->is_running, so clearly one of these is failing.
$thr itself is a threads->new(\&$node_thread, $gdbfile). Despite being a programmer, I've only recently started to look at the code and have not got the hang of the parallelisation setup here, though I gather the master must use threads to initially generate the parallel instances which then use the message passing. Of course threads don't have message passing ability, so I guess something clever is going on and will take some time for me to understand.

Clearly however, it has worked before on dpp_contigs, so it may be that something is wrong with my datafile or the way I am carrying out the analysis.

Any clues that can be put my way are welcome.

Thank you!

From ramonfallon at gmail.com Wed Mar 6 11:57:12 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Wed, 6 Mar 2013 18:57:12 +0100
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
References:
Message-ID:

Hi,

Many thanks for your quick reply and hint.

Yes, you're right .. further up there is indeed

Calling FastaDB::new at /opt/src/maker_svn/bin/../lib/FastaSeq.pm line 148 thread 1.
Thread 1 terminated abnormally: ERROR: Could not reestablish DB to thaw FastaSeq for Storable
--> rank=5, hostname=fatnode, at /opt/src/maker_svn/bin/maker line 1457 thread 1.

I run a "script" session and have maker on -debug so I have everything in one file. Would you prefer to have it attached to a post to this mailing list (if it accepts txt attachments)?

Cheers.

From Carson.Holt at oicr.on.ca Wed Mar 6 12:04:30 2013
From: Carson.Holt at oicr.on.ca (Carson Holt)
Date: Wed, 6 Mar 2013 18:04:30 +0000
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
Message-ID:

If you do reply all to this message, I should get the attachment. It will be stripped from the one going to the list though.

Thanks,
Carson
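The advice in this thread, that the FATAL line is a symptom and the real cause sits further up in STDERR, can be sketched as a small log filter. This assumes the run's output was captured to a file; the function name `errors_before_fatal` and the log file name in the comment are hypothetical, and the patterns are only the strings quoted in the messages above.

```shell
#!/bin/sh
# Hypothetical helper: list ERROR/WARNING lines that occur before the first
# "FATAL: Thread terminated" line in a captured MAKER log.
errors_before_fatal() {
    log=$1   # e.g. captured with: mpiexec -n 8 maker -debug 2> maker.stderr.log
    awk '/FATAL: Thread terminated/ { exit }
         /ERROR|WARNING/            { print NR ": " $0 }' "$log"
}
```

Each reported line is prefixed with its line number in the log, so the surrounding context (such as the "Calling FastaDB::new" line above the error) is easy to find.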
From ramonfallon at gmail.com Wed Mar 6 12:15:01 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Wed, 6 Mar 2013 19:15:01 +0100
Subject: [maker-devel] thread terminated, causing all processes to fail

OK great, here goes .. many thanks!

On Wed, Mar 6, 2013 at 7:04 PM, Carson Holt wrote:

> If you do reply all to this message, I should get the attachment. It
> will be stripped from the one going to the list though.
>
> Thanks,
> Carson

-------------- next part --------------
A non-text attachment was scrubbed...
Name: rf_mkr_run.scriptlog.zip
Type: application/zip
Size: 7598 bytes
Desc: not available

From Carson.Holt at oicr.on.ca Wed Mar 6 12:22:38 2013
From: Carson.Holt at oicr.on.ca (Carson Holt)
Date: Wed, 6 Mar 2013 18:22:38 +0000
Subject: [maker-devel] thread terminated, causing all processes to fail

Could you delete your ../*maker.output/mpi_blastdb directory, and then, when rerunning maker, run with the -a flag?

Thanks,
Carson

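The cleanup Carson asks for above (delete mpi_blastdb, then rerun with the -a flag) could be sketched in shell roughly as follows. This is only an illustration under assumptions: the helper name reset_mpi_blastdb and the example directory name are hypothetical, and the rerun command in the trailing comment just adds -a to the "mpiexec -n 8 maker -debug" invocation used in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of MAKER): remove the mpi_blastdb directory
# from one <name>.maker.output directory so it is rebuilt on the next run.
reset_mpi_blastdb() {
    outdir=${1%/}    # drop a trailing slash, if any
    case "$outdir" in
        *.maker.output) ;;
        *)
            echo "refusing: '$outdir' is not a *.maker.output directory" >&2
            return 1
            ;;
    esac
    if [ -d "$outdir/mpi_blastdb" ]; then
        rm -rf -- "$outdir/mpi_blastdb"
        echo "removed $outdir/mpi_blastdb"
    else
        echo "nothing to remove in $outdir"
    fi
}

# Example use (the directory name is a placeholder, not taken from the thread):
#   reset_mpi_blastdb ./mygenome.maker.output
#   mpiexec -n 8 maker -a -debug
```

The guard on the directory name is there only so that a mistyped argument cannot delete something unrelated.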
From ramonfallon at gmail.com Wed Mar 6 12:49:46 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Wed, 6 Mar 2013 19:49:46 +0100
Subject: [maker-devel] thread terminated, causing all processes to fail

OK, will do.

Will get back to you tomorrow on it.

Many thanks!

From ramonfallon at gmail.com Thu Mar 7 08:40:53 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Thu, 7 Mar 2013 15:40:53 +0100
Subject: [maker-devel] thread terminated, causing all processes to fail

Hi Carson,

I'm sending you a zip of the text file of my repeated maker session, this time having deleted the mpi_blastdb dir and with the -a flag added to the "mpiexec -n 8 maker -debug" command line.

Cheers / Ramón.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: rf_mkr_run.scriptlog2.zip
Type: application/zip
Size: 6430 bytes
Desc: not available

From carsonhh at gmail.com Thu Mar 7 10:44:40 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 07 Mar 2013 11:44:40 -0500
Subject: [maker-devel] thread terminated, causing all processes to fail

That is extremely odd. It fails to even generate the indexes. Could you check the drive space of your working directory and your /tmp directory?

It is odd because Bioperl uses the stat command to check on the file right before making a tied hash. So it was there for the stat but not the tie, which is immediately following.

If you check manually, does it exist now? -->
/home/ramonf/makertrials/mgallocut7/sca29310_8.maker.output/mpi_blastdb/sca29310_8%2Efa.mpi.1/sca29310_8%2Efa.mpi.1.0.index

Are you running in an NFS mounted directory?

--Carson

_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

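The checks Carson lists in his reply (free space in the working directory and /tmp, and whether the index file exists) could be run with a sketch like the one below. The function names check_index and show_space are made up for illustration; df -P and wc -c are standard POSIX tools.

```shell
#!/bin/sh
# Hypothetical helper: report whether a file exists and how many bytes it has.
check_index() {
    f=$1
    if [ -e "$f" ]; then
        size=$(wc -c < "$f" | tr -d ' ')
        echo "exists size=$size"
    else
        echo "missing"
        return 1
    fi
}

# Free space for the current working directory and /tmp, in POSIX format.
show_space() {
    df -P . /tmp
}

# Example use, with the index path quoted in the thread:
#   show_space
#   check_index /home/ramonf/makertrials/mgallocut7/sca29310_8.maker.output/mpi_blastdb/sca29310_8%2Efa.mpi.1/sca29310_8%2Efa.mpi.1.0.index
```

For the NFS question, on a Linux system with GNU coreutils, "df -T ." additionally prints the filesystem type of the current directory; that option is an assumption about the platform and behaves differently on other systems.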
From ramonfallon at gmail.com Thu Mar 7 11:47:53 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Thu, 7 Mar 2013 18:47:53 +0100
Subject: [maker-devel] thread terminated, causing all processes to fail

This is a standalone machine with no NFS at all. "df" gives a healthy amount of disk space, so there should be no problem there.

Yes, that file does exist, although it has the nominal 12288 bytes size, which appears to be the minimum for a DB_File tie.

As I mentioned, the dpp_contig.fa example set does work, so part of my investigation is looking at how.

I can do some trivial unit tests on the Bioperl stat-before-tied-hashes situation and see what comes up.

So I'll attempt to clear that up and then revert.

Many thanks! / Ramón.

From carsonhh at gmail.com Thu Mar 7 11:57:46 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 07 Mar 2013 12:57:46 -0500
Subject: [maker-devel] thread terminated, causing all processes to fail

Try running maker outside of with the -a flag after deleting mpi_blastdb. Does it still happen? Also, if you try again with MPI with the -a flag and having deleted mpi_blastdb, does it fail the same every time?

Could you also check for background maker processes that may be trying to work in the same directory that you may not have realized were running?

Thanks,
Carson

From carsonhh at gmail.com Thu Mar 7 15:09:34 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 07 Mar 2013 16:09:34 -0500
Subject: [maker-devel] thread terminated, causing all processes to fail

It should have said "Try running maker outside of MPI".

--Carson

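Carson's two follow-up questions (does it fail the same way every time, and are stray maker processes still running in the same directory) can be approached with a small sketch like this one. It assumes that each run's output was saved to a log file, as with the "script" sessions mentioned earlier in the thread; the function names and log file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the distinct lines containing "ERROR" from one log.
error_signature() {
    grep 'ERROR' "$1" | sort -u
}

# Report whether two runs produced the same set of ERROR lines.
same_failure() {
    if [ "$(error_signature "$1")" = "$(error_signature "$2")" ]; then
        echo "same"
    else
        echo "different"
    fi
}

# List running processes whose command line mentions maker
# (assumes the pgrep utility is installed).
list_maker_procs() {
    pgrep -fl maker
}

# Example use (log names are placeholders):
#   same_failure run1.log run2.log
#   list_maker_procs
```

Comparing only the ERROR lines is a deliberate simplification: details such as the MPI rank reported on the following line can legitimately differ between otherwise identical failures.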
From kangyangjae at gmail.com Thu Mar 7 22:00:19 2013
From: kangyangjae at gmail.com (Kang, Yang Jae)
Date: Fri, 8 Mar 2013 13:00:19 +0900
Subject: [maker-devel] retrying the FAILED scaffolds
Message-ID: <13f201ce1bb1$769c9e20$63d5da60$@gmail.com>

Hello

I have a question regarding some FAILED scaffolds.

Is there any way to re-try the maker pipeline on just the failed scaffolds separately?

And do I have to manually erase the failed directories named ../theVoid.scaffold_#/?

And how can I track down the reason why only those 20 out of around 3000 scaffolds failed?

Thank you

Kang, Yang Jae Ph.D.
Cropgenomics Lab.
College of Agriculture and Life Science
Seoul National University
Korea

From carsonhh at gmail.com Thu Mar 7 22:13:08 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 07 Mar 2013 23:13:08 -0500
Subject: [maker-devel] retrying the FAILED scaffolds
In-Reply-To: <13f201ce1bb1$769c9e20$63d5da60$@gmail.com>
Message-ID:

Is there any way to re-try the maker pipeline on just the failed scaffolds separately?

> Yes. The failed contig fasta will be in the maker.output subdirectory for that
> contig. Alternatively use the fasta_tool script to extract them from the
> genome file. You can then run them in a separate directory, or use the
> '-base' command line flag to force it to use the base name of the current
> results directory. Use the -g option to override the genome file without
> having to edit the control files.
>
> Example:
>
> maker -g failed.fasta -base maize_assemby
>
> Output will end up here --> maize_assemby.maker.output

And do I have to manually erase the failed directories named ../theVoid.scaffold_#/?

> No. You can let MAKER just retry them as is (let maker handle what to delete
> and keep) or set clean_try=1 to force full deletion before rerunning.

And how can I track down the reason why only those 20 out of around 3000 scaffolds failed?
> Search for the tag "ERROR" in the standard output of your run. What MAKER
> version are you using? I can take a look at the STDERR as well if you want.
> If it's too big for e-mail, you can share it via dropbox.

Thanks,
Carson

From carsonhh at gmail.com Fri Mar 8 14:20:37 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 08 Mar 2013 15:20:37 -0500
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
Message-ID:

I think I've found the potential cause and committed the necessary changes to fix it.

Thanks,
Carson

From: Ramón Fallon
Date: Thursday, 7 March, 2013 12:47 PM
To: Carson Holt
Cc: "maker-devel at yandell-lab.org"
Subject: Re: [maker-devel] thread terminated, causing all processes to fail

This is a standalone machine and no NFS at all. "df" gives a healthy amount of disk space, so there should be no problem there.

Yes that file does exist although it has the nominal 12288 bytes size, which appears to be the minimum for a DB_File tie.

As I mentioned the dpp_contig.fa example set does work so part of my investigation is looking at how.

I can do some trivial unit tests on the Bioperl stat-before-tied-hashes situation and see what comes up.

So I'll attempt to clear that up and then revert.

Many thanks! / Ramón.

On Thu, Mar 7, 2013 at 5:44 PM, Carson Holt wrote:
> That is extremely odd. It fails to even generate the indexes. Could you check
> the drive space of your working directory and your /tmp directory?
>
> It is odd because Bioperl uses the stat command to check on the file right
> before making a tied hash. So it was there for the stat but not the tie,
> which is immediately following.
>
> If you check manually does it exist now? -->
> /home/ramonf/makertrials/mgallocut7/sca29310_8.maker.output/mpi_blastdb/sca29310_8%2Efa.mpi.1/sca29310_8%2Efa.mpi.1.0.index
>
> Are you running in an NFS mounted directory?
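For the "retrying the FAILED scaffolds" thread above, one way to build the failed.fasta file that is passed to `maker -g failed.fasta` can be sketched as follows. This is only an illustration, and it rests on assumptions that should be checked against your own run: that the run's master datastore index log is tab-separated with three columns (sequence ID, output directory, status), and that a final status of FAILED marks a failed scaffold. The function names `failed_ids` and `select_fasta` are hypothetical; the fasta_tool script mentioned in that reply is the route suggested there.

```python
def failed_ids(index_log_text):
    """Return sequence IDs whose most recent status is FAILED.

    Assumes tab-separated lines: sequence ID, directory, status.
    """
    last_status = {}
    for line in index_log_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        seq_id, status = fields[0], fields[2]
        last_status[seq_id] = status  # later entries override earlier ones
    return [seq_id for seq_id, status in last_status.items() if status == "FAILED"]


def select_fasta(fasta_text, wanted):
    """Return FASTA text containing only records whose ID is in `wanted`."""
    wanted = set(wanted)
    out = []
    keep = False
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # The record ID is the first whitespace-delimited token of the header.
            fields = line[1:].split()
            keep = bool(fields) and fields[0] in wanted
        if keep:
            out.append(line)
    return "\n".join(out) + ("\n" if out else "")
```

Reading the log and genome from disk, passing their contents through these two functions, and writing the result to failed.fasta gives a file that can then be used with the `-g` and `-base` flags as in the example in that thread.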
From carsonhh at gmail.com Fri Mar 8 14:28:32 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 08 Mar 2013 15:28:32 -0500
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
Message-ID:

Also delete mpi_blastdb before retrying with the new svn repository.

Thanks,
Carson
From carsonhh at gmail.com Sun Mar 10 11:31:27 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 10 Mar 2013 12:31:27 -0400
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
Message-ID:

I've fixed the missing script issue.

Thanks,
Carson

From ramonfallon at gmail.com Sun Mar 10 09:45:38 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Sun, 10 Mar 2013 15:45:38 +0100
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
References:
Message-ID:

Hi Carson,

In terms of rev 995, on a simplified version of our data set, I tried a sequential run successfully, and even a "mpiexec -n 4" which ran to completion.

In any case, many thanks for the new version 996.
I did have a problem with the build, namely the new line:

'bin/TACC.PL' => ['bin/ibrun'],

I tried to find TACC.PL unsuccessfully, so I decided to dispense with this new line and then it compiled fine.

I started one or two tests and will inform you later about them. From my end I must admit I am using a rather large EST fasta file, which is not useful for testing .. I will try to cut it down Monday or Tues so that tests can be more agile.

Many thanks / Ramón.

From mikheyev at gmail.com Mon Mar 11 04:46:06 2013
From: mikheyev at gmail.com (Sasha Mikheyev)
Date: Mon, 11 Mar 2013 18:46:06 +0900
Subject: [maker-devel] duplicate CDS in annotation
Message-ID:

Dear Yandell lab,

I am re-annotating the harvester ant genome using protein and RNA-seq data. However, I get many artifacts like the one below. It seems that there are several CDS records that should tie in to the same mRNA, but they are really hanging out separately, and produce several nucleotide sequences with the same name when extracted from the gff. I would appreciate any guidance about how to fix this!

Thank you,

Sasha

grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff
pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150;
pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159;
pbar_scf7180000350377 maker mRNA 538308 558769 . + . ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01;
pbar_scf7180000350377 maker exon 538308 538334 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506;
pbar_scf7180000350377 maker exon 538748 538968 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506;
pbar_scf7180000350377 maker exon 539842 540242 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506;
pbar_scf7180000350377 maker exon 542624 542798 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506;
pbar_scf7180000350377 maker exon 555823 556025 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506;
pbar_scf7180000350377 maker exon 558609 558769 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506;
pbar_scf7180000350377 maker CDS 538308 538334 . + 0 ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506;
pbar_scf7180000350377 maker CDS 538748 538968 . + 0 ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506;
pbar_scf7180000350377 maker CDS 539842 540242 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506;
pbar_scf7180000350377 maker CDS 542624 542798 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506;
pbar_scf7180000350377 maker CDS 555823 556025 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506;
pbar_scf7180000350377 maker CDS 558609 558769 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506;
URL: From barry.moore at genetics.utah.edu Mon Mar 11 06:32:44 2013 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Mon, 11 Mar 2013 05:32:44 -0600 Subject: [maker-devel] duplicate CDS in annotation In-Reply-To: References: Message-ID: <52822F85-760B-451B-B156-8861EA77A910@genetics.utah.edu> Hi Sasha, This gene model appears to be correctly formatted to me. In GFF3 format the CDS features are allowed to span multiple lines and they share the same ID to indicate that it is all the same features. See the GFF3 specification on the Sequence Ontology website (http://www.sequenceontology.org/resources/gff3.html), and in particular the description of the ID attribute specifies: ID Indicates the ID of the feature. IDs for each feature must be unique within the scope of the GFF file. In the case of discontinuous features (i.e. a single feature that exists over multiple genomic locations) the same ID may appear on multiple lines. All lines that share an ID collectively represent a single feature. So each of those CDS lines forms one part of the single CDS feature for this gene. B On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote: > Dear Yandell lab, > > I am re-annotating the harvester and genome using protein and RNA-seq data. However, I get many artifacts like the one below. It seems that there are several CDS records that should tie in to the same mRNA, but they are really hanging out separately, and produce several nucleotide sequences with the same name when extracted from the gff. I would appreciate any guidance about how to fix this! > > Thank you, > > Sasha > > grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff > pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; > pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . 
ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159; > pbar_scf7180000350377 maker mRNA 538308 558769 . + . ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; > pbar_scf7180000350377 maker exon 538308 538334 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 538748 538968 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 539842 540242 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 542624 542798 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 555823 556025 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 558609 558769 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 538308 538334 . + 0 ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 538748 538968 . + 0 ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 539842 540242 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 542624 542798 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 555823 556025 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 558609 558769 . 
+ 2 ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506;
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543

From carsonhh at gmail.com Mon Mar 11 08:02:13 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 11 Mar 2013 09:02:13 -0400
Subject: [maker-devel] duplicate CDS in annotation
In-Reply-To: <52822F85-760B-451B-B156-8861EA77A910@genetics.utah.edu>
Message-ID:

I think the issue is that you are getting a match feature that is being printed with the same ID as the mRNA feature. Correct?

What version of MAKER are you using, and what does the file you are giving to pred_gff or model_gff look like? Could you send them?

Thanks,
Carson

From: Barry Moore
Date: Monday, 11 March, 2013 7:32 AM
To: Sasha Mikheyev
Cc:
Subject: Re: [maker-devel] duplicate CDS in annotation

Hi Sasha,

This gene model appears to be correctly formatted to me. In GFF3 format the CDS features are allowed to span multiple lines, and they share the same ID to indicate that they are all part of the same feature. See the GFF3 specification on the Sequence Ontology website (http://www.sequenceontology.org/resources/gff3.html), and in particular the description of the ID attribute, which specifies:

> ID Indicates the ID of the feature. IDs for each feature must be unique
> within the scope of the GFF file. In the case of discontinuous features (i.e.
> a single feature that exists over multiple genomic locations) the same ID may
> appear on multiple lines. All lines that share an ID collectively represent a
> single feature.

So each of those CDS lines forms one part of the single CDS feature for this gene.
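The case Carson asks about (a match feature printed with the same ID as the mRNA) differs from the discontinuous-feature case in the specification, because the lines sharing the ID have different feature types. A minimal sketch of how one might look for such IDs follows. It assumes the GFF3 file is tab-delimited with nine columns; the function name `ids_with_mixed_types` is made up for illustration and is not part of MAKER, and the sample lines are trimmed copies of records from this thread.

```python
from collections import defaultdict


def ids_with_mixed_types(gff_lines):
    """Return {ID: sorted feature types} for IDs used by more than one feature type."""
    types_by_id = defaultdict(set)
    for line in gff_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a feature line (assumes tab-delimited GFF3)
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(
            pair.split("=", 1) for pair in cols[8].split(";") if "=" in pair
        )
        if "ID" in attrs:
            types_by_id[attrs["ID"]].add(cols[2])
    return {fid: sorted(t) for fid, t in types_by_id.items() if len(t) > 1}


# Trimmed records from the thread (attributes shortened, columns joined by tabs).
SAMPLE = [
    "pbar_scf7180000350377\tprotein2genome\tprotein_match\t172004\t172162\t150\t-\t.\t"
    "ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150;",
    "pbar_scf7180000350377\tmaker\tmRNA\t538308\t558769\t.\t+\t.\t"
    "ID=pbar_scf7180000350377:hit:2506;score=0.01;",
    "pbar_scf7180000350377\tmaker\tCDS\t538308\t538334\t.\t+\t0\t"
    "ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506;",
]

if __name__ == "__main__":
    for fid, types in ids_with_mixed_types(SAMPLE).items():
        print(fid, "is used by:", ", ".join(types))
```

An ID reported here is shared by features of different types, which is presumably not an intended discontinuous feature and may explain why sequences extracted from the GFF end up with the same name.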
B On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote: > Dear Yandell lab, > > I am re-annotating the harvester and genome using protein and RNA-seq data. > However, I get many artifacts like the one below. It seems that there are > several CDS records that should tie in to the same mRNA, but they are really > hanging out separately, and produce several nucleotide sequences with the same > name when extracted from the gff. I would appreciate any guidance about how to > fix this! > > Thank you, > > Sasha > > grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff > pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . > ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; > pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . > ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=H > sal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159; > pbar_scf7180000350377 maker mRNA 538308 558769 . + . > ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377 > -abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29- > mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; > pbar_scf7180000350377 maker exon 538308 538334 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:25 > 06; > pbar_scf7180000350377 maker exon 538748 538968 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:25 > 06; > pbar_scf7180000350377 maker exon 539842 540242 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:25 > 06; > pbar_scf7180000350377 maker exon 542624 542798 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:25 > 06; > pbar_scf7180000350377 maker exon 555823 556025 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:25 > 06; > pbar_scf7180000350377 maker exon 558609 558769 0.01 + . 
> ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:25 > 06; > pbar_scf7180000350377 maker CDS 538308 538334 . + 0 > ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:250 > 6; > pbar_scf7180000350377 maker CDS 538748 538968 . + 0 > ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:250 > 6; > pbar_scf7180000350377 maker CDS 539842 540242 . + 1 > ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:250 > 6; > pbar_scf7180000350377 maker CDS 542624 542798 . + 2 > ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:250 > 6; > pbar_scf7180000350377 maker CDS 555823 556025 . + 1 > ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:250 > 6; > pbar_scf7180000350377 maker CDS 558609 558769 . + 2 > ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:250 > 6; > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org Barry Moore Research Scientist Dept. of Human Genetics University of Utah Salt Lake City, UT 84112 -------------------------------------------- (801) 585-3543 _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From sitaram.rajaraman at helsinki.fi Mon Mar 11 09:33:27 2013 From: sitaram.rajaraman at helsinki.fi (Sitaram Rajaraman) Date: Mon, 11 Mar 2013 16:33:27 +0200 Subject: [maker-devel] Doubts in the synthesis part of MAKER Message-ID: <513DEB37.6090601@helsinki.fi> Hello MAKER developers, I'm Sitaram, working as a Bioinformatician at the University of Helsinki. 
We are trying out MAKER as part of a gene prediction/annotation pipeline and have some doubts regarding this. In the synthesis step in the paper, I find it a bit hard to visualise how the hints are generated from the various sources and how the scores are calculated. It would be nice if you could shed some light on this. Also, if you could point to the particular .pm file which contains the actual source code, it would be convenient, as there is quite a lot of source code and debugging the whole set is a bit cumbersome.

Regards,

--
Sitaram Rajaraman,
Plant Stress Research Group,
Dept of Biosciences,
University of Helsinki.

From carsonhh at gmail.com Mon Mar 11 09:51:56 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 11 Mar 2013 10:51:56 -0400
Subject: [maker-devel] Doubts in the synthesis part of MAKER
In-Reply-To: <513DEB37.6090601@helsinki.fi>
Message-ID:

Hints are basically CDS location, exon location, and intron location. The CDS hints are based on protein alignment. Intron and exon hints are based on the EST alignments, which when polished should give exact intron coordinates. Ironically, the most useless part of the gene model (the intron coordinates) is actually the most informative feature for gene prediction.

lib/Process/MPIchunk.pm will have the steps in the _go method. It is a little hard to follow, as MAKER is designed for distributed parallelization (i.e. parallelization without shared memory, with steps potentially divided among different machines on the other end of the network).

It is divided into MPItier and MPIchunk objects. The MPItier object encapsulates a series of linear steps or 'levels', while each MPIchunk object encapsulates a single step sent to a machine across the network and exists within a single 'level' of the MPItier object. Note there can be multiple chunks assigned to a 'level'.
MPItiers can also have MPItiers as children at a given level instead of MPIchunks, so the process structure then branches like a tree and can then merge back somewhere in the middle of the algorithm.

The 'maker' script is really just the communication script for the objects. In MPI one maker thread is launched to handle communication and another to run the MPItiers and MPIchunks. The communication threads then pass MPIchunks and MPItiers back and forth across the network, either by requesting things to do from other nodes or by asking for help if they have a large number of MPIchunks or MPItiers to process.

Thanks,
Carson

On 13-03-11 10:33 AM, "Sitaram Rajaraman" wrote:

>Hello MAKER developers,
> I'm Sitaram, working as a Bioinformatician at the University of
>Helsinki. We are trying out MAKER as part of a gene prediction/annotation
>pipeline and have some doubts regarding this. In the synthesis step in
>the paper, I find it a bit hard to visualise how the hints are generated
>from the various sources and the scores are calculated. It would be nice
>if you could throw some light on this. Also if you could point to the
>particular .pm file which contains the actual source code, it would be
>convenient as there is quite a lot of source code and debugging the whole
>set is a bit cumbersome.
>
>Regards,
>
>--
>Sitaram Rajaraman,
>Plant Stress Research Group,
>Dept of Biosciences,
>University of Helsinki.

From sitaram.rajaraman at helsinki.fi Mon Mar 11 10:03:20 2013
From: sitaram.rajaraman at helsinki.fi (Sitaram Rajaraman)
Date: Mon, 11 Mar 2013 17:03:20 +0200
Subject: [maker-devel] Doubts in the synthesis part of MAKER
In-Reply-To:
References:
Message-ID: <513DF238.4050109@helsinki.fi>

Thank you! I will proceed with this information!

- Sitaram.
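For readers trying to picture the tier/level/chunk nesting described in Carson's reply above, here is a minimal sketch. It is not MAKER's code: the class names `Tier` and `Chunk`, the item names, and the `demo` function are invented for illustration, and everything runs serially in one process, with no MPI and no network transport. It only shows that levels run in order, that a level may hold several chunks, and that a level may hold a child tier in place of a chunk.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Union


@dataclass
class Chunk:
    """One unit of work; in a real system it could be shipped to another machine."""
    name: str
    work: Callable[[], str]

    def run(self) -> List[str]:
        return [f"{self.name}: {self.work()}"]


@dataclass
class Tier:
    """An ordered series of levels; each level holds chunks and/or child tiers."""
    name: str
    levels: List[List[Union["Chunk", "Tier"]]] = field(default_factory=list)

    def run(self) -> List[str]:
        results: List[str] = []
        for level in self.levels:      # levels are processed strictly in order
            for item in level:         # items inside one level are independent
                results.extend(item.run())
        return results


def demo() -> List[str]:
    # A child tier sits inside level 2 of the top tier, so the structure
    # branches there and merges back before level 3 starts.
    child = Tier("child", levels=[
        [Chunk("c1", lambda: "done"), Chunk("c2", lambda: "done")],
        [Chunk("c3", lambda: "done")],
    ])
    top = Tier("top", levels=[
        [Chunk("a", lambda: "done")],
        [child, Chunk("b", lambda: "done")],
        [Chunk("z", lambda: "done")],
    ])
    return top.run()


if __name__ == "__main__":
    for line in demo():
        print(line)
```

In a distributed setting the inner loop over a level is where work could be handed to other nodes; the sketch keeps it serial so the ordering is easy to follow.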
On 03/11/2013 04:51 PM, Carson Holt wrote: > Hints are basically CDS location, exon location, and intron location. The > CDS hints are based on protein alignment. Intron and exon hints are based > on the EST alignments, which when polished should give exact intron > coordinates. Ironically the most useless part of the gene model is > actually the most informative feature for gene prediction (the intron > coordinates). > > lib/Process/MPIchunk.pm will have the steps in the _go method. It is a > little hard to follow as MAKER is designed for distributed parallelization > (i.e. parallelization without shared memory with steps potentially divided > on different machines on the other end of the network). > > It is divided into MPItier and MPIchunk objects. The MPItier object > encapsulate a series of linear steps or 'levels' while the MPIchunk > objects encapsulate a single step sent to a machine across the network and > it exists within a single 'level' of the MPITier object. Note there can > be multiple chunks assigned to a 'level'. MPItiers can also have MPITiers > as children at a given level instead of MPIchunks, so the process > structure then branches like a tree and can then merges back somewhere in > the middle of the algorithm. > > The 'maker' script is really just the communication script for the > objects. In MPI one maker thread is launched to handle communication and > another to run the MPItiers and MPIchunks. They communication threads > then pass MPIchunks and MPITiers back and forth across the network by > either requesting things to do from other nodes or by asking for help if > they have a large number of MPIChunks or MPItiers to process. > > Thanks, > Carson > > > > > > On 13-03-11 10:33 AM, "Sitaram Rajaraman" > wrote: > >> Hello MAKER developers, >> I'm Sitaram, working as a Bioinformatician at the University of >> Helsinki. We are trying out MAKER as part of a gene prediction/annotation >> pipeline and have some doubts regarding this. 
In the synthesis step in
>> the paper, I find it a bit hard to visualise how the hints are generated
>> from the various sources and the scores are calculated. It would be nice
>> if you could throw some light on this. Also if you could point to the
>> particular .pm file which contains the actual source code, it would be
>> convenient as there is quite a lot of source code and debugging the whole
>> set is a bit cumbersome.
>>
>> Regards,
>>
>> --
>> Sitaram Rajaraman,
>> Plant Stress Research Group,
>> Dept of Biosciences,
>> University of Helsinki.
>

--
Sitaram Rajaraman,
Plant Stress Research Group,
Dept of Biosciences,
University of Helsinki.

From carsonhh at gmail.com Mon Mar 11 10:05:30 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 11 Mar 2013 11:05:30 -0400
Subject: [maker-devel] Doubts in the synthesis part of MAKER
In-Reply-To:
Message-ID:

One more detail. There are basically 5 steps per level.

Load --> this step creates the chunks for that level. This is where I decide how many chunks to make, and where any special variables needed for packaging into that chunk are generated.
Init --> this is just a declaration of the variables to package into a chunk (only give the chunk what it needs).
Run --> this is the actual code that will be run after the chunk is transported to its destination.
Result --> this describes how to merge the results of that chunk back into the parent object.
Flow --> this decides what to do when all chunks for that level are complete (i.e. which level to move on to next). The default is the next level in linear succession, but it can jump forward or backward several levels if needed.

Thanks,
Carson

On 13-03-11 10:51 AM, "Carson Holt" wrote:

>Hints are basically CDS location, exon location, and intron location.
>The >CDS hints are based on protein alignment. Intron and exon hints are >based >on the EST alignments, which when polished should give exact intron >coordinates. Ironically the most useless part of the gene model is >actually the most informative feature for gene prediction (the intron >coordinates). > >lib/Process/MPIchunk.pm will have the steps in the _go method. It is a >little hard to follow as MAKER is designed for distributed >parallelization >(i.e. parallelization without shared memory with steps potentially >divided >on different machines on the other end of the network). > >It is divided into MPItier and MPIchunk objects. The MPItier object >encapsulate a series of linear steps or 'levels' while the MPIchunk >objects encapsulate a single step sent to a machine across the network >and >it exists within a single 'level' of the MPITier object. Note there can >be multiple chunks assigned to a 'level'. MPItiers can also have >MPITiers >as children at a given level instead of MPIchunks, so the process >structure then branches like a tree and can then merges back somewhere in >the middle of the algorithm. > >The 'maker' script is really just the communication script for the >objects. In MPI one maker thread is launched to handle communication and >another to run the MPItiers and MPIchunks. They communication threads >then pass MPIchunks and MPITiers back and forth across the network by >either requesting things to do from other nodes or by asking for help if >they have a large number of MPIChunks or MPItiers to process. > >Thanks, >Carson > > > > > >On 13-03-11 10:33 AM, "Sitaram Rajaraman" >wrote: > >>Hello MAKER developers, >> I'm Sitaram, working as a Bioinformatician at the University of >>Helsinki. We are trying out MAKER as part of a gene >>prediction/annotation >>pipeline and have some doubts regarding this. 
In the synthesis step in
>>the paper, I find it a bit hard to visualise how the hints are generated
>>from the various sources and the scores are calculated. It would be nice
>>if you could throw some light on this. Also if you could point to the
>>particular .pm file which contains the actual source code, it would be
>>convenient as there is quite a lot of source code and debugging the whole
>>set is a bit cumbersome.
>>
>>Regards,
>>
>>--
>>Sitaram Rajaraman,
>>Plant Stress Research Group,
>>Dept of Biosciences,
>>University of Helsinki.

From isradelacon at gmail.com Mon Mar 11 13:34:27 2013
From: isradelacon at gmail.com (Israel Barrantes)
Date: Mon, 11 Mar 2013 19:34:27 +0100
Subject: [maker-devel] different RNA-seq experiment outputs in separate annotation passes?
Message-ID:

Dear maker-devel,

I have several RNA-seq experiment outputs that I want to use as input for MAKER annotation:

(1) Illumina 1.3, strain A, cell stage N
(2) Illumina 1.8, strain A, cell stage N
(3) Illumina 1.8, strain B, cell stage N
(4) 454, strain unknown, cell stage M

For each experiment I mapped the reads and produced GTFs with tophat/cufflinks separately (and later converted them to GFF3s with the supplied script).

Q1: Does it make a difference to run a different annotation pass for each GFF3 from tophat/cufflinks?

Q2: If so, will altering the order in which the cDNA GFFs are passed (e.g., experiment 1 GFF in the first pass, then experiment 2 in the second pass, etc.) produce more or fewer transcripts?

Q3: Is it better to simply merge these GFFs into a single nonredundant file (e.g. bedtools intersect) than using them separately, one for each MAKER pass?
Thank you in advance,

--
Israel Barrantes
Otto-von-Guericke-Universität
Lehrstuhl für Regulationsbiologie
IBIO/FNW
Deutschland

From dence at genetics.utah.edu Mon Mar 11 13:39:01 2013
From: dence at genetics.utah.edu (Daniel Ence)
Date: Mon, 11 Mar 2013 18:39:01 +0000
Subject: [maker-devel] different RNA-seq experiment outputs in separate annotation passes?
In-Reply-To:
References:
Message-ID:

Hi Israel,

I think that for general annotation purposes, you want to use all of those GFF files together in a single MAKER run to annotate the whole genome. If you're interested in exploring which genes are expressed in your different strains and cell stages, then you can take your annotation results and BLAST them against the different RNA-seq experiments.

I didn't answer your questions separately, but hopefully that gives some good guidance. If I missed something, let me know.

Thanks,
Daniel

Daniel Ence
Graduate Student
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330

________________________________
From: maker-devel-bounces at yandell-lab.org [maker-devel-bounces at yandell-lab.org] on behalf of Israel Barrantes [isradelacon at gmail.com]
Sent: Monday, March 11, 2013 12:34 PM
To: maker-devel at yandell-lab.org
Subject: [maker-devel] different RNA-seq experiment outputs in separate annotation passes?

Dear maker-devel,

I have several RNA-seq experiment outputs that I want to use as input for MAKER annotation:

(1) Illumina 1.3, strain A, cell stage N
(2) Illumina 1.8, strain A, cell stage N
(3) Illumina 1.8, strain B, cell stage N
(4) 454, strain unknown, cell stage M

For each experiment I mapped the reads and produced GTFs with tophat/cufflinks separately (and later converted them to GFF3s with the supplied script).

Q1: Does it make a difference to run a different annotation pass for each GFF3 from tophat/cufflinks?
Q2: If so, will altering the order in which the cDNA GFFs are passed (e.g., experiment 1 GFF in the first pass, then experiment 2 in the second pass, etc.) produce more or fewer transcripts?

Q3: Is it better to simply merge these GFFs into a single nonredundant file (e.g. bedtools intersect) than using them separately, one for each MAKER pass?

Thank you in advance,

--
Israel Barrantes
Otto-von-Guericke-Universität
Lehrstuhl für Regulationsbiologie
IBIO/FNW
Deutschland

From carsonhh at gmail.com Tue Mar 12 08:37:35 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 12 Mar 2013 09:37:35 -0400
Subject: [maker-devel] duplicate CDS in annotation
In-Reply-To:
Message-ID:

Yes. Try the newer version and see if you still have the issue.

Thanks,
Carson

From: Sasha Mikheyev
Date: Tuesday, 12 March, 2013 1:26 AM
To: Carson Holt
Cc: Barry Moore
Subject: Re: [maker-devel] duplicate CDS in annotation

Hi Carson,

I have been using version 2.10. Is it worth trying with a newer version?

You can find the model file here. It is rather large, as it includes all of the output from the first maker run.

Yours,
Sasha

On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt wrote:

> I think the issue is that you are getting a match feature that is being
> printed with the same ID as the mRNA feature. Correct?
>
> What version of MAKER are you using, and what does the file you are giving to
> pred_gff or model_gff look like? Could you send them?
>
> Thanks,
> Carson
>
>
> From: Barry Moore
> Date: Monday, 11 March, 2013 7:32 AM
> To: Sasha Mikheyev
> Cc:
> Subject: Re: [maker-devel] duplicate CDS in annotation
>
> Hi Sasha,
>
> This gene model appears to be correctly formatted to me. In GFF3 format the
> CDS features are allowed to span multiple lines and they share the same ID to
> indicate that they are all part of the same feature.
See the GFF3 specification on the > Sequence Ontology website > (http://www.sequenceontology.org/resources/gff3.html), and in particular the > description of the ID attribute specifies: > >> ID Indicates the ID of the feature. IDs for each feature must be unique >> within the scope of the GFF file. In the case of discontinuous features >> (i.e. a single feature that exists over multiple genomic locations) the same >> ID may appear on multiple lines. All lines that share an ID collectively >> represent a single feature. > > So each of those CDS lines forms one part of the single CDS feature for this > gene. > > B > > On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote: > >> Dear Yandell lab, >> >> I am re-annotating the harvester and genome using protein and RNA-seq data. >> However, I get many artifacts like the one below. It seems that there are >> several CDS records that should tie in to the same mRNA, but they are really >> hanging out separately, and produce several nucleotide sequences with the >> same name when extracted from the gff. I would appreciate any guidance about >> how to fix this! >> >> Thank you, >> >> Sasha >> >> grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff >> pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . >> ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; >> pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . >> ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name= >> Hsal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159; >> pbar_scf7180000350377 maker mRNA 538308 558769 . + . >> ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf718000035037 >> 7-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.2 >> 9-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; >> pbar_scf7180000350377 maker exon 538308 538334 0.01 + . 
>> ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2 >> 506; >> pbar_scf7180000350377 maker exon 538748 538968 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2 >> 506; >> pbar_scf7180000350377 maker exon 539842 540242 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2 >> 506; >> pbar_scf7180000350377 maker exon 542624 542798 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2 >> 506; >> pbar_scf7180000350377 maker exon 555823 556025 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2 >> 506; >> pbar_scf7180000350377 maker exon 558609 558769 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2 >> 506; >> pbar_scf7180000350377 maker CDS 538308 538334 . + 0 >> ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:25 >> 06; >> pbar_scf7180000350377 maker CDS 538748 538968 . + 0 >> ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:25 >> 06; >> pbar_scf7180000350377 maker CDS 539842 540242 . + 1 >> ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:25 >> 06; >> pbar_scf7180000350377 maker CDS 542624 542798 . + 2 >> ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:25 >> 06; >> pbar_scf7180000350377 maker CDS 555823 556025 . + 1 >> ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:25 >> 06; >> pbar_scf7180000350377 maker CDS 558609 558769 . + 2 >> ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:25 >> 06; >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > Barry Moore > Research Scientist > Dept. 
of Human Genetics
> University of Utah
> Salt Lake City, UT 84112
> --------------------------------------------
> (801) 585-3543

From alex.marshall at ed.ac.uk Mon Mar 11 11:15:05 2013
From: alex.marshall at ed.ac.uk (Alex Marshall)
Date: Mon, 11 Mar 2013 16:15:05 +0000
Subject: [maker-devel] symbol lookup error: ProcessTable.so: undefined symbol: Perl_Tstack_sp_ptr
Message-ID: <513E0309.7010004@ed.ac.uk>

Hi to the maker-devel,

I am getting an error every time I run the maker script.

symbol lookup error:
/path/to/software/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi/auto/Proc/ProcessTable/ProcessTable.so:
undefined symbol: Perl_Tstack_sp_ptr

Your help would be much appreciated.

Best wishes,
Alex

----------------
Edinburgh University

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.

From mikheyev at gmail.com Tue Mar 12 00:26:53 2013
From: mikheyev at gmail.com (Sasha Mikheyev)
Date: Tue, 12 Mar 2013 14:26:53 +0900
Subject: [maker-devel] duplicate CDS in annotation
In-Reply-To:
References: <52822F85-760B-451B-B156-8861EA77A910@genetics.utah.edu>
Message-ID:

Hi Carson,

I have been using version 2.10. Is it worth trying with a newer version?

You can find the model file here. It is rather large, as it includes all of the output from the first maker run.

Yours,
Sasha

On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt wrote:

> I think the issue is that you are getting a match feature that is being
> printed with the same ID as the mRNA feature. Correct?
>
> What version of MAKER are you using, and what does the file you are giving
> to pred_gff or model_gff look like? Could you send them?
> > Thanks, > Carson > > > From: Barry Moore > Date: Monday, 11 March, 2013 7:32 AM > To: Sasha Mikheyev > Cc: > Subject: Re: [maker-devel] duplicate CDS in annotation > > Hi Sasha, > > This gene model appears to be correctly formatted to me. In GFF3 format > the CDS features are allowed to span multiple lines and they share the same > ID to indicate that it is all the same features. See the GFF3 > specification on the Sequence Ontology website ( > http://www.sequenceontology.org/resources/gff3.html), and in particular > the description of the ID attribute specifies: > > ID Indicates the ID of the feature. IDs for each feature must be unique > within the scope of the GFF file. In the case of discontinuous features > (i.e. a single feature that exists over multiple genomic locations) the > same ID may appear on multiple lines. All lines that share an ID > collectively represent a single feature. > > > So each of those CDS lines forms one part of the single CDS feature for > this gene. > > B > > On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote: > > Dear Yandell lab, > > I am re-annotating the harvester and genome using protein and RNA-seq > data. However, I get many artifacts like the one below. It seems that there > are several CDS records that should tie in to the same mRNA, but they are > really hanging out separately, and produce several nucleotide sequences > with the same name when extracted from the gff. I would appreciate any > guidance about how to fix this! > > Thank you, > > Sasha > > grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff > pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . > ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; > pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 > 1 53 +;Gap=M159; > pbar_scf7180000350377 maker mRNA 538308 558769 . + . 
> ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01;
> pbar_scf7180000350377 maker exon 538308 538334 0.01 + .
> ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker exon 538748 538968 0.01 + .
> ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker exon 539842 540242 0.01 + .
> ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker exon 542624 542798 0.01 + .
> ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker exon 555823 556025 0.01 + .
> ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker exon 558609 558769 0.01 + .
> ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker CDS 538308 538334 . + 0
> ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker CDS 538748 538968 . + 0
> ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker CDS 539842 540242 . + 1
> ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker CDS 542624 542798 . + 2
> ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker CDS 555823 556025 . + 1
> ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker CDS 558609 558769 . + 2
> ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506;
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
>
> Barry Moore
> Research Scientist
> Dept. of Human Genetics
> University of Utah
> Salt Lake City, UT 84112
> --------------------------------------------
> (801) 585-3543
>
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Tue Mar 12 09:27:44 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 12 Mar 2013 10:27:44 -0400
Subject: [maker-devel] symbol lookup error: ProcessTable.so: undefined symbol: Perl_Tstack_sp_ptr
In-Reply-To: <513E0309.7010004@ed.ac.uk>
Message-ID:

Could you try the 2.27 version of MAKER? You are using 2.10 correct?

Thanks,
Carson

On 13-03-11 12:15 PM, "Alex Marshall" wrote:

>Hi to the maker-devel,
>
>I am getting an error everytime I run the maker script.
>
>symbol lookup error:
>/path/to/software/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi/auto/Proc/ProcessTable/ProcessTable.so:
>undefined symbol: Perl_Tstack_sp_ptr
>
>Your help would be very appreciated.
>
>Best wishes,
>Alex
>
>----------------
>Edinburgh University
>
>--
>The University of Edinburgh is a charitable body, registered in
>Scotland, with registration number SC005336.
>
>
>_______________________________________________
>maker-devel mailing list
>maker-devel at box290.bluehost.com
>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From barry.moore at genetics.utah.edu Tue Mar 12 18:57:32 2013
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Tue, 12 Mar 2013 17:57:32 -0600
Subject: [maker-devel] MAKER subversion repositories
Message-ID:

For any of you who are running MAKER straight from our subversion
repositories in the lab - we have migrated those repos to a new server.
Reply to Shawn or me for info on how to connect to the new repos. Thanks.

Barry

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ares711122 at gmail.com Tue Mar 12 19:24:42 2013
From: ares711122 at gmail.com (Hung-Wei Hsu)
Date: Wed, 13 Mar 2013 08:24:42 +0800
Subject: [maker-devel] ERROR: Could not obtain lock to format database
Message-ID:

Hi MAKER developers,

I tried MAKER 2.27b on one E. coli scaffold sequence with the uniprot
protein database. The analysis failed to run and I got the error message
below.

Could not obtain lock to format database at maker-2.27b/bin/../lib/GI.pm

Any suggestions or help will be deeply appreciated.

Best regards,
Hung-Wei
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Wed Mar 13 08:24:44 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 13 Mar 2013 09:24:44 -0400
Subject: [maker-devel] symbol lookup error: ProcessTable.so: undefined symbol: Perl_Tstack_sp_ptr
In-Reply-To: <513FE1F0.2030209@ed.ac.uk>
Message-ID:

I'm very glad it's working. Those kinds of errors are the hardest to track
down.

--Carson

On 13-03-12 10:18 PM, "Alex Marshall" wrote:

>I have some great news. I uninstalled every one of my local perl
>libraries. Basically, getting rid of my libraries and then using your
>Build script to install the maker dependencies totally fixed it. It
>worked with the test.fasta file, no errors whatsoever. I am smiling so
>much right now that my face might crack ;) You were right, broken perl.
>I just checked, and I am getting lots of "finished" entries in the
>master_datastore_index.log. Thank you so much.
>
>Alex
>
>
>On 12/03/2013 19:11, Carson Holt wrote:
>> I do think your perl has a problem. I've added some changes to each of
>> these modules that should help force perl to generate the correct object
>> method lookup table.
>>
>> Could you test them out (place most under the /lib/Iterator/
>> subdirectory).
>>
>> --Carson
>>
>>
>> On 13-03-12 3:00 PM, "Alex Marshall" wrote:
>>
>>> We had maker working happily for ages.
>>>
>>> Then we upgraded from perl version 5.8.8 to 5.10, which stopped maker
>>> working.
>>>
>>> Maker said it couldn't find forks.pm; I added that library path to fix
>>> the error.
>>>
>>> Then that particular error below started happening.
>>>
>>> Alex
>>>
>>>
>>> On 12/03/2013 18:54, Alex Marshall wrote:
>>>> version 5.10 on an HPC cluster
>>>>
>>>> Alex
>>>>
>>>>
>>>> On 12/03/2013 18:48, Carson Holt wrote:
>>>>> That means the first time it called fileHandle it didn't die (which
>>>>> should be impossible).
>>>>>
>>>>> Then the second time it called it, it died. It begs the question,
>>>>> what happened to the first call.
>>>>>
>>>>> This is looking more and more like you have a broken perl.
>>>>>
>>>>> What version of perl are you using?
>>>>>
>>>>> --Carson
>>>>>
>>>>>
>>>>> On 13-03-12 2:28 PM, "Alex Marshall" wrote:
>>>>>
>>>>>> I deleted Iterator.pm, put the new one in the maker/lib folder,
>>>>>> then reran maker.
>>>>>>
>>>>>> vi Iterator.pm confirms this:
>>>>>>
>>>>>> sub fileHandle {
>>>>>> die "this should die";
>>>>>>
>>>>>> error:
>>>>>> STATUS: Parsing control files...
>>>>>> Opening a new filehandle: Iterator:GFF3 >>>>>> Gettign the existing filehandle: Iterator::GFF3 >>>>>> Checking if it still exists: Iterator::GFF3 >>>>>> ERROR: No open filehandle Iterator::GFF3 >>>>>> --> rank=NA, hostname=frontend04 >>>>>> >>>>>> >>>>>> >>>>>> On 12/03/2013 18:21, Carson Holt wrote: >>>>>>> Try this one. >>>>>>> >>>>>>> It should fail immediately >>>>>>> >>>>>>> Code --> die "this should die"; >>>>>>> >>>>>>> >>>>>>> I'm just making sure it's being called as expected. >>>>>>> >>>>>>> --Carson >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 13-03-12 2:18 PM, "Alex Marshall" >>>>>>>wrote: >>>>>>> >>>>>>>> I have Iterator.pm and GFF3.pm in the right place: >>>>>>>> >>>>>>>> ..../software/maker2/maker-2.27/lib/Iterator.pm >>>>>>>> ..../software/maker2/maker-2.27/lib/Iterator/GFF3.pm >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 12/03/2013 18:16, Alex Marshall wrote: >>>>>>>>> I have deleted Iterator.pm, and replaced again (just to be sure). >>>>>>>>> >>>>>>>>> STATUS: Parsing control files... >>>>>>>>> Opening a new filehandle: Iterator:GFF3 >>>>>>>>> Gettign the existing filehandle: Iterator::GFF3 >>>>>>>>> Checking if it still exists: Iterator::GFF3 >>>>>>>>> ERROR: No open filehandle Iterator::GFF3 >>>>>>>>> --> rank=NA, hostname=frontend04 >>>>>>>>> >>>>>>>>> >>>>>>>>> On 12/03/2013 18:11, Carson Holt wrote: >>>>>>>>>> It's missing all the standard error from the Iterator.pm >>>>>>>>>>message I >>>>>>>>>> added? >>>>>>>>>> Could you double check that you replaced that one too. >>>>>>>>>> >>>>>>>>>> --Carson >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 13-03-12 2:07 PM, "Alex Marshall" >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> STATUS: Parsing control files... 
>>>>>>>>>>> Opening a new filehandle: Iterator:GFF3 >>>>>>>>>>> Gettign the existing filehandle: Iterator::GFF3 >>>>>>>>>>> Checking if it still exists: Iterator::GFF3 >>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3 >>>>>>>>>>> --> rank=NA, hostname=frontend04 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 12/03/2013 18:02, Carson Holt wrote: >>>>>>>>>>>> Please use these two and send me the full STDERR (replaces >>>>>>>>>>>>also >>>>>>>>>>>> Iterator/GFF3.pm). >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Carson >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 13-03-12 1:55 PM, "Alex Marshall" >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> same again: >>>>>>>>>>>>> >>>>>>>>>>>>> STATUS: Parsing control files... >>>>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3 >>>>>>>>>>>>> --> rank=NA, hostname=frontend04 >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 12/03/2013 17:45, Carson Holt wrote: >>>>>>>>>>>>>> Try this one. This is a code snippet --> >>>>>>>>>>>>>> >>>>>>>>>>>>>> my $fh = new FileHandle(); >>>>>>>>>>>>>> $fh->open("$arg") or die "ERROR: Could not >>>>>>>>>>>>>> open >>>>>>>>>>>>>> file: >>>>>>>>>>>>>> $!\n"; >>>>>>>>>>>>>> $self->{fileHandle} = $fh; >>>>>>>>>>>>>> $self->startPos($fh->getpos()); >>>>>>>>>>>>>> if (! openhandle($fh)){ #checks to see if >>>>>>>>>>>>>>file >>>>>>>>>>>>>> handle >>>>>>>>>>>>>> is >>>>>>>>>>>>>> open >>>>>>>>>>>>>> confess "ERROR: No open filehandle in Iterator\n"; >>>>>>>>>>>>>> } >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> All it does is open the handle, check the reading position >>>>>>>>>>>>>>and >>>>>>>>>>>>>> then >>>>>>>>>>>>>> check >>>>>>>>>>>>>> to see if the handle is still open. 
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --Carson
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 13-03-12 1:37 PM, "Alex Marshall" wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1] If I comment out the error in the GFF3.pm file:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> STATUS: Parsing control files...
>>>>>>>>>>>>>>> Can't call method "getpos" without a package or object reference at
>>>>>>>>>>>>>>> /exports/work/biology_ieb_mblaxter/software/maker2/maker-2.27/bin/../lib/Iterator/GFF3.pm
>>>>>>>>>>>>>>> line 42, line 121.
>>>>>>>>>>>>>>> --> rank=NA, hostname=frontend04
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [2] If I add the error comment back to the GFF3.pm, and add the
>>>>>>>>>>>>>>> second new Iterator.pm:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> STATUS: Parsing control files...
>>>>>>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3
>>>>>>>>>>>>>>> --> rank=NA, hostname=frontend04
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Alex
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 12/03/2013 17:31, Carson Holt wrote:
>>>>>>>>>>>>>>>> There is one other thing it does right before. It calls this -->
>>>>>>>>>>>>>>>> $self->fileHandle()->getpos()
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I switched the chaining off so it is just $fh->getpos in the
>>>>>>>>>>>>>>>> attached module (replace again).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I don't see why a failure would happen special for you there,
>>>>>>>>>>>>>>>> but try it again.
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> --Carson >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 13-03-12 1:24 PM, "Carson Holt" >>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> This is the new line in Iterator.pm >>>>>>>>>>>>>>>>> --> $fh->open("$arg") or die "ERROR: Could not open file: >>>>>>>>>>>>>>>>> $!\n"; >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The extra info would be from $! >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> In the place where the error is occurring, all MAKER does >>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>> open a >>>>>>>>>>>>>>>>> file >>>>>>>>>>>>>>>>> handle in Iterator.pm and then check to see if it is open >>>>>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>> Iterator::GFF3 (it does one and then instantly the >>>>>>>>>>>>>>>>>other). >>>>>>>>>>>>>>>>> The >>>>>>>>>>>>>>>>> second >>>>>>>>>>>>>>>>> failure is just the check on the filehandle. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> If the open succeeds, but for some reason it can't tell >>>>>>>>>>>>>>>>>it >>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>> open, >>>>>>>>>>>>>>>>> then >>>>>>>>>>>>>>>>> it is something to do with your system. You can try >>>>>>>>>>>>>>>>> reinstalling >>>>>>>>>>>>>>>>> Scalar::Util as that is the module that implements >>>>>>>>>>>>>>>>> openhandle >>>>>>>>>>>>>>>>> method >>>>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>>> is called. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> You can also try just commenting out line 37 of >>>>>>>>>>>>>>>>> lib/Iterator/GFF3.pm >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> --Carson >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 13-03-12 1:15 PM, "Alex Marshall" >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I am looking at Iterator.pm >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> so it should of thrown more error information? 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 12/03/2013 17:14, Alex Marshall wrote: >>>>>>>>>>>>>>>>>>> replaced Iterator.pm in maker2/maker-2.27/lib >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> error: same as before >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> STATUS: Parsing control files... >>>>>>>>>>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3 >>>>>>>>>>>>>>>>>>> --> rank=NA, hostname=frontend04 >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> ...../software/maker2/maker-2.27/lib/Iterator/GFF3.pm >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> in sub new >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> my $fh = $self->fileHandle(); >>>>>>>>>>>>>>>>>>> if (! openhandle($fh)){ #checks to see if file >>>>>>>>>>>>>>>>>>> handle >>>>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>>>> open >>>>>>>>>>>>>>>>>>> die "ERROR: No open filehandle >>>>>>>>>>>>>>>>>>> Iterator::GFF3\n"; >>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 12/03/2013 17:06, Carson Holt wrote: >>>>>>>>>>>>>>>>>>>> I get not errors, and don?t see any issues. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Could you replace the Iterator.pm in the lib directory >>>>>>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>>>>> this >>>>>>>>>>>>>>>>>>>> one. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I >>>>>>>>>>>>>>>>>>>> added some more output to the STDERR if opening a >>>>>>>>>>>>>>>>>>>> filehandle >>>>>>>>>>>>>>>>>>>> fails. >>>>>>>>>>>>>>>>>>>> At >>>>>>>>>>>>>>>>>>>> least it should provide more information. Could you >>>>>>>>>>>>>>>>>>>> then >>>>>>>>>>>>>>>>>>>> let me >>>>>>>>>>>>>>>>>>>> know >>>>>>>>>>>>>>>>>>>> what >>>>>>>>>>>>>>>>>>>> it says. 
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> Carson >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On 13-03-12 12:35 PM, "Alex Marshall" >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> please find: maker_opts.ctl and test.fa attached >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 12/03/2013 16:31, Alex Marshall wrote: >>>>>>>>>>>>>>>>>>>>>> will send to you now... >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 16:29, Carson Holt wrote: >>>>>>>>>>>>>>>>>>>>>>> Could you send me the entire captured STDERR, your >>>>>>>>>>>>>>>>>>>>>>> maker_opts.ctl >>>>>>>>>>>>>>>>>>>>>>> file and >>>>>>>>>>>>>>>>>>>>>>> you test.fasta? >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>> Carson >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On 13-03-12 12:23 PM, "Alex Marshall" >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> It is in fasta format not GFF format >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 16:16, Alex Marshall wrote: >>>>>>>>>>>>>>>>>>>>>>>>> I have been looking through maker_opts.ctl >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> #-----Genome (Required for De-Novo Annotation) >>>>>>>>>>>>>>>>>>>>>>>>> genome=test.fna #genome sequence (fasta format or >>>>>>>>>>>>>>>>>>>>>>>>> fasta >>>>>>>>>>>>>>>>>>>>>>>>> embeded >>>>>>>>>>>>>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>>>>>>>>>> GFF3) >>>>>>>>>>>>>>>>>>>>>>>>> organism_type=eukaryotic #eukaryotic or prokaryotic. >>>>>>>>>>>>>>>>>>>>>>>>> Default >>>>>>>>>>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>>>>>>>>>> eukaryotic >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I added the path to the genome, same error. 
>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 16:11, Carson Holt wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> What does you maker_opts.ctl file look like. What is >>>>>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>> value >>>>>>>>>>>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>>>>>>>>>> genome? If you did not give a genome fasta file and are >>>>>>>>>>>>>>>>>>>>>>>>>> using >>>>>>>>>>>>>>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>>>>>>> gff3 as >>>>>>>>>>>>>>>>>>>>>>>>>> input, is there a FASTA file embedded in it? >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> --Carson >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> On 13-03-12 12:06 PM, "Alex Marshall" >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> [1] hard drive - enough space >>>>>>>>>>>>>>>>>>>>>>>>>>> [2] ./Build realclean - done >>>>>>>>>>>>>>>>>>>>>>>>>>> [3] delete the maker_path/perl directory and >>>>>>>>>>>>>>>>>>>>>>>>>>> maker_path/bin - >>>>>>>>>>>>>>>>>>>>>>>>>>> done >>>>>>>>>>>>>>>>>>>>>>>>>>> [4] LD_PRELOAD=/.....path...../openmpi/lib/libmpi.so - >>>>>>>>>>>>>>>>>>>>>>>>>>> done >>>>>>>>>>>>>>>>>>>>>>>>>>> [5] perl Build.PL - done >>>>>>>>>>>>>>>>>>>>>>>>>>> [6] installation of 2.27 worked >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> and back to original error: >>>>>>>>>>>>>>>>>>>>>>>>>>> STATUS: Parsing control files... 
>>>>>>>>>>>>>>>>>>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3 >>>>>>>>>>>>>>>>>>>>>>>>>>> --> rank=NA, hostname=frontend04 >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 15:26, Carson Holt wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>> So the odd unrelated errors you are getting suggest >>>>>>>>>>>>>>>>>>>>>>>>>>>> there >>>>>>>>>>>>>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>>>>>>>>>>>>> something >>>>>>>>>>>>>>>>>>>>>>>>>>>> else going on that needs to be resolved first. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Check your drive space 'df -h maker_path'. Make sure >>>>>>>>>>>>>>>>>>>>>>>>>>>> you >>>>>>>>>>>>>>>>>>>>>>>>>>>> don't >>>>>>>>>>>>>>>>>>>>>>>>>>>> just >>>>>>>>>>>>>>>>>>>>>>>>>>>> have >>>>>>>>>>>>>>>>>>>>>>>>>>>> a full hard drive. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Run './Build realclean', and delete the >> maker_path/perl >>>>>>>>>>>>>>>>>>>>>>>>>>>> directory and >>>>>>>>>>>>>>>>>>>>>>>>>>>> maker_path/bin sidreactory completely. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Make sure to execute the export >>>>>>>>>>>>>>>>>>>>>>>>>>>> LD_PRELOAD=/.....path...../openmpi/lib/libmpi.so >> comamnd >>>>>>>>>>>>>>>>>>>>>>>>>>>> before >>>>>>>>>>>>>>>>>>>>>>>>>>>> ever >>>>>>>>>>>>>>>>>>>>>>>>>>>> running 'perl Build.PL' >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Which version of OPenMPI are you using. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>>>>> Carson >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> On 13-03-12 11:21 AM, "Alex >>>>>>>>>>>>>>>>>>>>>>>>>>>> Marshall" >>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> I am using openmpi and yes I ran ./Build install >> step. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Configuring MAKER with MPI support
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Can't exec "/bin/sh": Argument list too long at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /....path...../lib/perl5/Inline/C.pm line 801.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> A problem was encountered while attempting to compile and install
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> your Inline C code. The command that failed was:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /usr/bin/perl Makefile.PL > out.Makefile_PL 2>&1
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The build directory was:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /....path...../maker2/maker-2.27/src/blib/build/Parallel/Application/MPI
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> To debug the problem, cd to the build directory, and inspect the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> output files.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> at /....path...../maker2/maker-2.27/src/lib/Parallel/Application/MPI.pm
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> line 223.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 15:14, Carson Holt wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Also the place it is trying to load from
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /....path...../maker2/maker-2.27/src/blib/lib/auto/Parallel/Application/MPI/MPI.so
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> That is not the final install location? Did you run the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> './Build install' step? When that runs everything related to MPI
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> will be here -->
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /....path...../maker2/maker-2.27/perl/lib/auto/Parallel/Application/MPI/MPI.so
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --Carson
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 13-03-12 11:11 AM, "Alex Marshall" wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> now getting mpi problems:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Can't load
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> '/....path...../maker2/maker-2.27/src/blib/lib/auto/Parallel/Application/MPI/MPI.so'
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for module Parallel::Application::MPI: libmpich.so.1.0: cannot
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> open shared object file: No such file or directory at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /usr/lib64/perl5/DynaLoader.pm line 200.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> at /....path...../lib/perl5/Inline.pm line 536.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> at /....path...../maker2/maker-2.27/src/lib/Parallel/Application/MPI.pm
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> line 223
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> you suggest: export LD_PRELOAD=/.....path...../openmpi/lib/libmpi.so
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I do that, and run again, same error.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 14:43, Alex Marshall wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ok I will upgrade to 2.27 now.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 14:42, Carson Holt wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The original error is caused by an issue in Proc::ProcessTable on
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> some systems. I no longer use that module in maker for that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> reason. After the first error, you may have to delete the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> mpi_blastdb and any files with the extension .db in the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> maker.output directory before retrying. I would recommend using
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2.27.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Carson
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 13-03-12 10:40 AM, "Alex Marshall" wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I managed to fix that error.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I am using version 2.25-beta.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> new error:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 14:27, Carson Holt wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Could you try the 2.27 version of MAKER? You are using 2.10
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> correct?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Carson
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 13-03-11 12:15 PM, "Alex Marshall" wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi to the maker-devel,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I am getting an error everytime I run the maker script.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> symbol lookup error:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /path/to/software/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi/auto/Proc/ProcessTable/ProcessTable.so:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> undefined symbol: Perl_Tstack_sp_ptr
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Your help would be very appreciated.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best wishes,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ----------------
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Edinburgh University
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The University of Edinburgh is a charitable body, registered in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Scotland, with registration number SC005336.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> maker-devel mailing list
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> maker-devel at box290.bluehost.com
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -----------------------------
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex Marshall,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Room 3.54, Blaxter Lab,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Ashworth Laboratories,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Institute of Evolutionary Biology,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The King's Buildings,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The University of Edinburgh,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Edinburgh, EH9 3JT
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -----------------------------
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> alex.marshall at ed.ac.uk
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> +44(0)131 650 7403
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -----------------------------
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The University of Edinburgh is a charitable body,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> registered in >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Scotland, with registration number SC005336. 
From mikheyev at gmail.com Wed Mar 13 02:23:25 2013 From: mikheyev at gmail.com (Sasha Mikheyev) Date: Wed, 13 Mar 2013 16:23:25 +0900 Subject: [maker-devel] duplicate CDS in annotation In-Reply-To: References: Message-ID: Dear Carson, The new version does indeed fix the problem! However, I noticed that some of the CDS annotations were swallowed. This seems to affect ~600 genes. e.g. input: pbar_scf7180000349951 maker mRNA 98033 98530 . - . 
ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1;_AED=1.00;_QI=0|0|0|0|0|0|2|0|81; pbar_scf7180000349951 maker exon 98393 98530 . - . ID=PB12301-RA:exon:10283;Parent=PB12301-RA; pbar_scf7180000349951 maker exon 98033 98140 . - . ID=PB12301-RA:exon:10284;Parent=PB12301-RA; pbar_scf7180000349951 maker CDS 98033 98140 . - 0 ID=PB12301-RA:cds:10114;Parent=PB12301-RA; pbar_scf7180000349951 maker CDS 98393 98530 . - 0 ID=PB12301-RA:cds:10113;Parent=PB12301-RA; output: pbar_scf7180000349951 maker mRNA 98033 98530 . - . ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;_AED=0.38;_eAED=0.38;_QI=0|0|0.33|1|0.5|1|3|246|165;Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1,PB12301-RA pbar_scf7180000349951 maker exon 98033 98530 . - . ID=PB12301-RA:exon:134;Parent=PB12301-RA pbar_scf7180000349951 maker exon 98033 98140 . - . ID=PB12301-RA:exon:133;Parent=PB12301-RA pbar_scf7180000349951 maker exon 98393 98530 . - . ID=PB12301-RA:exon:132;Parent=PB12301-RA pbar_scf7180000349951 maker three_prime_UTR 98393 98530 . - . ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA pbar_scf7180000349951 maker three_prime_UTR 98033 98140 . - . ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA pbar_scf7180000349951 maker CDS 98033 98530 . - 0 ID=PB12301-RA:cds;Parent=PB12301-RA Thank you, Sasha On Tue, Mar 12, 2013 at 10:37 PM, Carson Holt wrote: > Yes. Try the newer version and see if you still have the issue. > > Thanks, > Carson > > > From: Sasha Mikheyev > Date: Tuesday, 12 March, 2013 1:26 AM > To: Carson Holt > Cc: Barry Moore , < > maker-devel at yandell-lab.org> > > Subject: Re: [maker-devel] duplicate CDS in annotation > > Hi Carson, > > I have been using version 2.10. Is it worth trying with a newer version? > > You can find the model file here. > It is rather large, as it includes all of the output from the first maker > run. 
> > Yours, > > Sasha > > > On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt wrote: > >> I think the issue is that you are getting a match feature that is being >> printed with the same ID as the mRNA feature. Correct? >> >> What version of MAKER are you using, and what does the file you are >> giving to pred_gff or model_gff look like? Could you send them? >> >> Thanks, >> Carson >> >> >> From: Barry Moore >> Date: Monday, 11 March, 2013 7:32 AM >> To: Sasha Mikheyev >> Cc: >> Subject: Re: [maker-devel] duplicate CDS in annotation >> >> Hi Sasha, >> >> This gene model appears to be correctly formatted to me. In GFF3 format >> the CDS features are allowed to span multiple lines and they share the same >> ID to indicate that they are all the same feature. See the GFF3 >> specification on the Sequence Ontology website ( >> http://www.sequenceontology.org/resources/gff3.html); in particular, >> the description of the ID attribute specifies: >> >> ID Indicates the ID of the feature. IDs for each feature must be unique >> within the scope of the GFF file. In the case of discontinuous features >> (i.e. a single feature that exists over multiple genomic locations) the >> same ID may appear on multiple lines. All lines that share an ID >> collectively represent a single feature. >> >> >> So each of those CDS lines forms one part of the single CDS feature for >> this gene. >> >> B >> >> On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote: >> >> Dear Yandell lab, >> >> I am re-annotating the harvester ant genome using protein and RNA-seq >> data. However, I get many artifacts like the one below. It seems that there >> are several CDS records that should tie in to the same mRNA, but they are >> really hanging out separately, and produce several nucleotide sequences >> with the same name when extracted from the gff. I would appreciate any >> guidance about how to fix this! 
>> >> Thank you, >> >> Sasha >> >> grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff >> pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . >> ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; >> pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 >> 1 53 +;Gap=M159; >> pbar_scf7180000350377 maker mRNA 538308 558769 . + . >> ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; >> pbar_scf7180000350377 maker exon 538308 538334 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker exon 538748 538968 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker exon 539842 540242 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker exon 542624 542798 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker exon 555823 556025 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker exon 558609 558769 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker CDS 538308 538334 . + 0 >> ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker CDS 538748 538968 . + 0 >> ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker CDS 539842 540242 . 
+ 1 >> ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker CDS 542624 542798 . + 2 >> ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker CDS 555823 556025 . + 1 >> ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker CDS 558609 558769 . + 2 >> ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506; >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >> >> Barry Moore >> Research Scientist >> Dept. of Human Genetics >> University of Utah >> Salt Lake City, UT 84112 >> -------------------------------------------- >> (801) 585-3543 >> >> >> >> >> _______________________________________________ maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Hossein.Borhan at AGR.GC.CA Wed Mar 13 16:49:44 2013 From: Hossein.Borhan at AGR.GC.CA (Borhan, Hossein) Date: Wed, 13 Mar 2013 17:49:44 -0400 Subject: [maker-devel] do of the maker predicted proteins do not start with M Message-ID: <7B64340A44B6634C814A22BCFA6179D5020887CC@onottaxms5.AGR.GC.CA> Hi I have run maker and some of the protein predicted by maker does not start with a Methionine. 
I am not sure why Here are some examples >maker-scaffold00001-snap-gene-0.8-mRNA-1 protein AED:0.27496328928047 eAED:0.27496328928047 QI:0|0.4|0.16|0.5|1|1|6|0|453 VIIKFKTFAKASRSVELFGHEGAWARGDGYCNFKTESEKADRSVKSSCSLNIPFTYDVGR RQYVIKGDRFCLSHNHLVMIPSPTTVIVNDQRDLTPDQLSYIINLGKYSLPFPMVTRMLS DQFPDCRIQKPLLHRLLRKGKLQAFGGDRDAMNALINLGRSYEEHGGFFEIDIDVDCRLE KIWLARAEGLQFASVYNDVVQIDGGAKMNAYGFVFLPVTVIDCLGKSYVVGAMAGPSAEN KADVVKTLEYFRVKRSESVLIADDALAFRAAAVECDMVYHQCTKHYQAKIARACAGLGHE GKEFMIKANTLVYHIFPSEDAFFAKADEYRLMFLQYGGAVKLFDDIVDKRQQLCRTFTSC KFTGSHSSNQRAEGTISRTKRDVQPWLSRANLFEMFTHLEMIQKQQEDEAARLLSNLIRK GKHWSDYVDSIFRERQLNSRLLSSVREVDTGLH >snap-scaffold00087-abinit-gene-2.145-mRNA-1 protein AED:0.0539495114006514 eAED:0.0539495114006514 QI:2|1|0.6|1|1|1|5|0|817 ALSLHGTRQAFARVPPPCRRAHPAERRQPGGGMSADAPVKAGYLLKLTSSLSHWNRRYFI VADTKLFYCKTEDDLLRRKFQGEIDLAGAQIALYTRNDETAKRFSDHHHMLGVKPAGCDR IYILDADSEHAQKEWVACLRRHASQAPVSSPVDAAVAAAPRKDPQSVREGFLTKRGETIK NWKMRYFVLKGNYLHYYRSIEDAQPAGSILLLGTRTTAEPKAVTGMPHSFSIARADAKRK YMIHADSKEECDAWVGAIQQQSVFVRHAGTDSAPPEVVAPAAATPVHQQHQSRSSFGNRP NVADDSADDDEAALDEVALSNGPPALAPHGIANTGASTGLNLKQKVSKKKRRFVTDEFDL DLTYITENIIAMGFPAESMEAMFRNSMSDVQRFLDGRHPDAYRVYNLCSERDYDPAKFHH NVCRFPFDDHNCPNFEDLIPLCEDIHNWLSIQSDHVVAIHCKAGKGRTGLVICAYLLYSG AWRTARDALQFYGFVRTQDQKGVTIPSQIRYVEYFEQYMADPEILSRNNGPLVISEIFVG RGCRPFDTVTITNMGRRMNSKDWGKYWKDALDDGLLLQLPKGACQVDKDFKVEFLASGLL GKKTRVAGFWLHTAFIQDGVVDIDKSMIDKVNKEKDCPAFSIQVFFGGRTYVDRRCRIPV APPQPTGPLLLSPATVRIRNADPLPVPNPSSPSESPSFSAMSSVPSLSLESLSSVSSSLS PTTGPKAAPSPKKQDPGLDPGSPPGTVKSAPVAEAGAAPVDARSDNKAPARSCSLPHGRF PGDVAGDGAFQQAAVQVRIAFWNNLQSEALQRRNSRL >augustus-scaffold00087-abinit-gene-0.106-mRNA-1 protein AED:0.10935424621144 eAED:0.10935424621144 QI:1|1|0.66|1|1|1|3|0|483 STSTVFCPGLWNRVGRYVSGGLQTSPVTVPRQVRPICLATQPATDQPGYVMSSTSALVGI GVVTVALLCRWAPILVTTSGPGSPARSSADVMRIWADHDWSAGTTSLPISQESLLAKRVL SKSFDGLPPNLHVQDDTVPVALLRAHLNAGRHMRLRDMCPTAGACDLQGADPDHGLAPLH MAAMRDDRSSIAYLMALGADPDAMDRAGRQYRNLSFTNFVRNARRAAEERGSTCQLPEVN 
LAGLERADLDRSWAEIRRLAHEGEPVAIRGLLGAYDRSDVLDWDLDAFLTRHGHVPVNVG DVPYAQYFGLPIQSMPLSKYVASLAPGSASYVFAKDDGICRDALQILDRFARDALPPYFV SPAALGSDAVHFYLGNKGSGAPFHLHSDAVNLLAHGSKTWFVTPPPQSVYSRTPIGEFAA NGTSGIESLRCEQNPGDAIYIPFDWGHAVLNNEDSTFGFAVELLNKRDSLHFLRPSSQVP AGQ Regards Hossein -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Wed Mar 13 16:40:39 2013 From: carsonhh at gmail.com (Carson Holt) Date: Wed, 13 Mar 2013 17:40:39 -0400 Subject: [maker-devel] ERROR: Could not obtain lock to format database In-Reply-To: Message-ID: Could you check to make sure your hard drive is not full, and that whatever location you set as TMP= in the control files is not full either (the default is /tmp)? Also make sure you do not set /tmp to an NFS-mounted or tmpfs location. Could you also send the full captured STDERR? Thanks, Carson From: Hung-Wei Hsu Date: Tuesday, 12 March, 2013 8:24 PM To: Subject: [maker-devel] ERROR: Could not obtain lock to format database Hi MAKER developers, I tried MAKER 2.27b on one E. coli scaffold sequence with uniprot protein database. I failed to run the analysis and got an error message as below. Could not obtain lock to format database at maker-2.27b/bin/../lib/GI.pm Any suggestions or help will be deeply appreciated. Best regards, Hung-Wei _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Wed Mar 13 16:47:06 2013 From: carsonhh at gmail.com (Carson Holt) Date: Wed, 13 Mar 2013 17:47:06 -0400 Subject: [maker-devel] duplicate CDS in annotation In-Reply-To: Message-ID: The output shows that the original model was Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1 and the new model replacing it is Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1. 
So it is really a completely different model (as one derived from SNAP and one from GeneMark). I'm guessing you have map_forward=1 set and are using the GFF3 passthrough options correct? Thanks, Carson From: Sasha Mikheyev Date: Wednesday, 13 March, 2013 3:23 AM To: Carson Holt Cc: Barry Moore , Subject: Re: [maker-devel] duplicate CDS in annotation Dear Carson, The new version does indeed fix the problem! However, I noticed that some of the CDS annotations were swallowed. This seems to affect a ~600 genes. e.g. input: pbar_scf7180000349951 maker mRNA 98033 98530 . - . ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;Alias=maker-pbar_scf71800003499 51-snap-gene-1.17-mRNA-1;_AED=1.00;_QI=0|0|0|0|0|0|2|0|81; pbar_scf7180000349951 maker exon 98393 98530 . - . ID=PB12301-RA:exon:10283;Parent=PB12301-RA; pbar_scf7180000349951 maker exon 98033 98140 . - . ID=PB12301-RA:exon:10284;Parent=PB12301-RA; pbar_scf7180000349951 maker CDS 98033 98140 . - 0 ID=PB12301-RA:cds:10114;Parent=PB12301-RA; pbar_scf7180000349951 maker CDS 98393 98530 . - 0 ID=PB12301-RA:cds:10113;Parent=PB12301-RA; output: pbar_scf7180000349951 maker mRNA 98033 98530 . - . ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;_AED=0.38;_eAED=0.38;_QI=0|0|0. 33|1|0.5|1|3|246|165;Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-m RNA-1,PB12301-RA pbar_scf7180000349951 maker exon 98033 98530 . - . ID=PB12301-RA:exon:134;Parent=PB12301-RA pbar_scf7180000349951 maker exon 98033 98140 . - . ID=PB12301-RA:exon:133;Parent=PB12301-RA pbar_scf7180000349951 maker exon 98393 98530 . - . ID=PB12301-RA:exon:132;Parent=PB12301-RA pbar_scf7180000349951 maker three_prime_UTR 98393 98530 . - . ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA pbar_scf7180000349951 maker three_prime_UTR 98033 98140 . - . ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA pbar_scf7180000349951 maker CDS 98033 98530 . - 0 ID=PB12301-RA:cds;Parent=PB12301-RA Thank you, Sasha On Tue, Mar 12, 2013 at 10:37 PM, Carson Holt wrote: > Yes. 
Try the newer version and see if you still have the issue. > > Thanks, > Carson > > > From: Sasha Mikheyev > Date: Tuesday, 12 March, 2013 1:26 AM > To: Carson Holt > Cc: Barry Moore , > > > Subject: Re: [maker-devel] duplicate CDS in annotation > > Hi Carson, > > I have been using version 2.10. Is it worth trying with a newer version? > > You can find the model file here. It is rather large, as it includes all of the output from the first maker > run. > > Yours, > > Sasha > > > On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt wrote: >> I think the issue is that you are getting a match feature that is being >> printed with the same ID as the mRNA feature. Correct? >> >> What version of MAKER are you using, and what does the file you are giving to >> pred_gff or model_gff look like? Could you send them? >> >> Thanks, >> Carson >> >> >> From: Barry Moore >> Date: Monday, 11 March, 2013 7:32 AM >> To: Sasha Mikheyev >> Cc: >> Subject: Re: [maker-devel] duplicate CDS in annotation >> >> Hi Sasha, >> >> This gene model appears to be correctly formatted to me. In GFF3 format the >> CDS features are allowed to span multiple lines and they share the same ID to >> indicate that they are all the same feature. See the GFF3 specification on the >> Sequence Ontology website >> (http://www.sequenceontology.org/resources/gff3.html); in particular, the >> description of the ID attribute specifies: >> >>> ID Indicates the ID of the feature. IDs for each feature must be unique >>> within the scope of the GFF file. In the case of discontinuous features >>> (i.e. a single feature that exists over multiple genomic locations) the same >>> ID may appear on multiple lines. All lines that share an ID collectively >>> represent a single feature. >> >> So each of those CDS lines forms one part of the single CDS feature for this >> gene. 
>> >> B >> >> On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote: >> >>> Dear Yandell lab, >>> >>> I am re-annotating the harvester and genome using protein and RNA-seq data. >>> However, I get many artifacts like the one below. It seems that there are >>> several CDS records that should tie in to the same mRNA, but they are really >>> hanging out separately, and produce several nucleotide sequences with the >>> same name when extracted from the gff. I would appreciate any guidance about >>> how to fix this! >>> >>> Thank you, >>> >>> Sasha >>> >>> grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff >>> pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . >>> ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; >>> pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . >>> ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name >>> =Hsal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159; >>> pbar_scf7180000350377 maker mRNA 538308 558769 . + . >>> ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf71800003503 >>> 77-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5 >>> .29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; >>> pbar_scf7180000350377 maker exon 538308 538334 0.01 + . >>> ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit: >>> 2506; >>> pbar_scf7180000350377 maker exon 538748 538968 0.01 + . >>> ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit: >>> 2506; >>> pbar_scf7180000350377 maker exon 539842 540242 0.01 + . >>> ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit: >>> 2506; >>> pbar_scf7180000350377 maker exon 542624 542798 0.01 + . >>> ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit: >>> 2506; >>> pbar_scf7180000350377 maker exon 555823 556025 0.01 + . 
>>> ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit: >>> 2506; >>> pbar_scf7180000350377 maker exon 558609 558769 0.01 + . >>> ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit: >>> 2506; >>> pbar_scf7180000350377 maker CDS 538308 538334 . + 0 >>> ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2 >>> 506; >>> pbar_scf7180000350377 maker CDS 538748 538968 . + 0 >>> ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2 >>> 506; >>> pbar_scf7180000350377 maker CDS 539842 540242 . + 1 >>> ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2 >>> 506; >>> pbar_scf7180000350377 maker CDS 542624 542798 . + 2 >>> ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2 >>> 506; >>> pbar_scf7180000350377 maker CDS 555823 556025 . + 1 >>> ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2 >>> 506; >>> pbar_scf7180000350377 maker CDS 558609 558769 . + 2 >>> ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2 >>> 506; >>> >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >> Barry Moore >> Research Scientist >> Dept. of Human Genetics >> University of Utah >> Salt Lake City, UT 84112 >> -------------------------------------------- >> (801) 585-3543 >> >> >> >> >> _______________________________________________ maker-devel mailing list >> maker-devel at box290.bluehost.comhttp://box290.bluehost.com/mailman/listinfo/ma >> ker-devel_yandell-lab.org > -------------- next part -------------- An HTML attachment was scrubbed... 
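For readers following the duplicate-CDS thread above: a minimal sketch, in Python, of how the CDS rows of one transcript can be collected into a single feature before a sequence is extracted. This is not part of MAKER. The function names (parse_attributes, cds_segments_by_parent, cds_length) are hypothetical, and the sketch assumes tab-separated 9-column GFF3 rows with one CDS per mRNA. It groups by the Parent attribute rather than by ID, because in the records quoted in this thread the CDS rows carry distinct IDs but a common Parent; rows that share an ID, as the GFF3 specification allows, are grouped the same way.

```python
from collections import defaultdict


def parse_attributes(column9):
    """Parse a GFF3 attribute column such as 'ID=x;Parent=y;' into a dict."""
    attributes = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attributes[key.strip()] = value.strip()
    return attributes


def cds_segments_by_parent(gff_lines):
    """Collect (start, end, phase) for every CDS row, keyed by Parent."""
    segments = defaultdict(list)
    for line in gff_lines:
        columns = line.rstrip("\n").split("\t")
        # Skip comments, blank lines, and anything that is not a 9-column CDS row.
        if line.startswith("#") or len(columns) != 9 or columns[2] != "CDS":
            continue
        parents = parse_attributes(columns[8]).get("Parent", "")
        # GFF3 allows a comma-separated list of parents.
        for parent in filter(None, parents.split(",")):
            segments[parent].append((int(columns[3]), int(columns[4]), columns[7]))
    # Sort each transcript's segments by start coordinate.
    return {parent: sorted(rows) for parent, rows in segments.items()}


def cds_length(segments):
    """Total CDS length; GFF3 coordinates are 1-based and inclusive."""
    return sum(end - start + 1 for start, end, _phase in segments)


# The two CDS rows of PB12301-RA quoted in the thread, rebuilt with tabs.
EXAMPLE = [
    "pbar_scf7180000349951\tmaker\tCDS\t98033\t98140\t.\t-\t0\tID=PB12301-RA:cds:10114;Parent=PB12301-RA;",
    "pbar_scf7180000349951\tmaker\tCDS\t98393\t98530\t.\t-\t0\tID=PB12301-RA:cds:10113;Parent=PB12301-RA;",
]

if __name__ == "__main__":
    for parent, rows in cds_segments_by_parent(EXAMPLE).items():
        print(parent, rows, cds_length(rows))
```

A script that writes one FASTA record per CDS row will emit several sequences with the same name, which matches the symptom described in the thread. Joining the grouped segments first (in reverse order and reverse-complemented for a minus-strand model) yields one record per transcript.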
URL: From carsonhh at gmail.com Wed Mar 13 19:26:25 2013 From: carsonhh at gmail.com (Carson Holt) Date: Wed, 13 Mar 2013 20:26:25 -0400 Subject: [maker-devel] do of the maker predicted proteins do not start with M In-Reply-To: <7B64340A44B6634C814A22BCFA6179D5020887CC@onottaxms5.AGR.GC.CA> Message-ID: SNAP and other gene prediction programs are capable of producing partial models if they can't find reasonable start and stop codons. You can set always_complete=1 in the maker_opts.ctl file to get MAKER to walk forward and backwards to search for starts and stops after the ab initio predictors do their work in an attempt to force model completion. Thanks, Carson From: "Borhan, Hossein" Date: Wednesday, 13 March, 2013 5:49 PM To: Subject: [maker-devel] do of the maker predicted proteins do not start with M Hi I have run maker and some of the protein predicted by maker does not start with a Methionine. I am not sure why Here are some examples >maker-scaffold00001-snap-gene-0.8-mRNA-1 protein AED:0.27496328928047 eAED:0.27496328928047 QI:0|0.4|0.16|0.5|1|1|6|0|453 VIIKFKTFAKASRSVELFGHEGAWARGDGYCNFKTESEKADRSVKSSCSLNIPFTYDVGR RQYVIKGDRFCLSHNHLVMIPSPTTVIVNDQRDLTPDQLSYIINLGKYSLPFPMVTRMLS DQFPDCRIQKPLLHRLLRKGKLQAFGGDRDAMNALINLGRSYEEHGGFFEIDIDVDCRLE KIWLARAEGLQFASVYNDVVQIDGGAKMNAYGFVFLPVTVIDCLGKSYVVGAMAGPSAEN KADVVKTLEYFRVKRSESVLIADDALAFRAAAVECDMVYHQCTKHYQAKIARACAGLGHE GKEFMIKANTLVYHIFPSEDAFFAKADEYRLMFLQYGGAVKLFDDIVDKRQQLCRTFTSC KFTGSHSSNQRAEGTISRTKRDVQPWLSRANLFEMFTHLEMIQKQQEDEAARLLSNLIRK GKHWSDYVDSIFRERQLNSRLLSSVREVDTGLH >snap-scaffold00087-abinit-gene-2.145-mRNA-1 protein AED:0.0539495114006514 eAED:0.0539495114006514 QI:2|1|0.6|1|1|1|5|0|817 ALSLHGTRQAFARVPPPCRRAHPAERRQPGGGMSADAPVKAGYLLKLTSSLSHWNRRYFI VADTKLFYCKTEDDLLRRKFQGEIDLAGAQIALYTRNDETAKRFSDHHHMLGVKPAGCDR IYILDADSEHAQKEWVACLRRHASQAPVSSPVDAAVAAAPRKDPQSVREGFLTKRGETIK NWKMRYFVLKGNYLHYYRSIEDAQPAGSILLLGTRTTAEPKAVTGMPHSFSIARADAKRK YMIHADSKEECDAWVGAIQQQSVFVRHAGTDSAPPEVVAPAAATPVHQQHQSRSSFGNRP 
NVADDSADDDEAALDEVALSNGPPALAPHGIANTGASTGLNLKQKVSKKKRRFVTDEFDL DLTYITENIIAMGFPAESMEAMFRNSMSDVQRFLDGRHPDAYRVYNLCSERDYDPAKFHH NVCRFPFDDHNCPNFEDLIPLCEDIHNWLSIQSDHVVAIHCKAGKGRTGLVICAYLLYSG AWRTARDALQFYGFVRTQDQKGVTIPSQIRYVEYFEQYMADPEILSRNNGPLVISEIFVG RGCRPFDTVTITNMGRRMNSKDWGKYWKDALDDGLLLQLPKGACQVDKDFKVEFLASGLL GKKTRVAGFWLHTAFIQDGVVDIDKSMIDKVNKEKDCPAFSIQVFFGGRTYVDRRCRIPV APPQPTGPLLLSPATVRIRNADPLPVPNPSSPSESPSFSAMSSVPSLSLESLSSVSSSLS PTTGPKAAPSPKKQDPGLDPGSPPGTVKSAPVAEAGAAPVDARSDNKAPARSCSLPHGRF PGDVAGDGAFQQAAVQVRIAFWNNLQSEALQRRNSRL >augustus-scaffold00087-abinit-gene-0.106-mRNA-1 protein AED:0.10935424621144 eAED:0.10935424621144 QI:1|1|0.66|1|1|1|3|0|483 STSTVFCPGLWNRVGRYVSGGLQTSPVTVPRQVRPICLATQPATDQPGYVMSSTSALVGI GVVTVALLCRWAPILVTTSGPGSPARSSADVMRIWADHDWSAGTTSLPISQESLLAKRVL SKSFDGLPPNLHVQDDTVPVALLRAHLNAGRHMRLRDMCPTAGACDLQGADPDHGLAPLH MAAMRDDRSSIAYLMALGADPDAMDRAGRQYRNLSFTNFVRNARRAAEERGSTCQLPEVN LAGLERADLDRSWAEIRRLAHEGEPVAIRGLLGAYDRSDVLDWDLDAFLTRHGHVPVNVG DVPYAQYFGLPIQSMPLSKYVASLAPGSASYVFAKDDGICRDALQILDRFARDALPPYFV SPAALGSDAVHFYLGNKGSGAPFHLHSDAVNLLAHGSKTWFVTPPPQSVYSRTPIGEFAA NGTSGIESLRCEQNPGDAIYIPFDWGHAVLNNEDSTFGFAVELLNKRDSLHFLRPSSQVP AGQ Regards Hossein _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Wed Mar 13 21:54:55 2013 From: carsonhh at gmail.com (Carson Holt) Date: Wed, 13 Mar 2013 22:54:55 -0400 Subject: [maker-devel] duplicate CDS in annotation In-Reply-To: Message-ID: Yes. map_forward=1 allows new models to keep the names of the models they replace. It makes it so you don't have to relocate genes every time a model gets a slight modification during reannotation. --Carson From: Sasha Mikheyev Date: Wednesday, 13 March, 2013 9:17 PM To: Carson Holt Cc: Barry Moore , Subject: Re: [maker-devel] duplicate CDS in annotation OK. Got it! 
I did pass through the gene model names. I guess I now see that a new gene model may become associated with the old name in the re-annotation. Sasha On Thu, Mar 14, 2013 at 6:47 AM, Carson Holt wrote: > The output shows that the original model was > Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1 and the new model > replacing it is Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1. > > So it is really a completely different model (as one derived from SNAP and one > from GeneMark). I'm guessing you have map_forward=1 set and are using the > GFF3 passthrough options, correct? > > Thanks, > Carson > > > > From: Sasha Mikheyev > Date: Wednesday, 13 March, 2013 3:23 AM > > To: Carson Holt > Cc: Barry Moore , > > Subject: Re: [maker-devel] duplicate CDS in annotation > > Dear Carson, > > The new version does indeed fix the problem! > > However, I noticed that some of the CDS annotations were swallowed. This seems > to affect ~600 genes. > > e.g. input: > > pbar_scf7180000349951 maker mRNA 98033 98530 . - . > ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;Alias=maker-pbar_scf7180000349951 > -snap-gene-1.17-mRNA-1;_AED=1.00;_QI=0|0|0|0|0|0|2|0|81; > pbar_scf7180000349951 maker exon 98393 98530 . - . > ID=PB12301-RA:exon:10283;Parent=PB12301-RA; > pbar_scf7180000349951 maker exon 98033 98140 . - . > ID=PB12301-RA:exon:10284;Parent=PB12301-RA; > pbar_scf7180000349951 maker CDS 98033 98140 . - 0 > ID=PB12301-RA:cds:10114;Parent=PB12301-RA; > pbar_scf7180000349951 maker CDS 98393 98530 . - 0 > ID=PB12301-RA:cds:10113;Parent=PB12301-RA; > > output: > > pbar_scf7180000349951 maker mRNA 98033 98530 . - . > ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;_AED=0.38;_eAED=0.38;_QI=0|0|0.33 > |1|0.5|1|3|246|165;Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA- > 1,PB12301-RA > pbar_scf7180000349951 maker exon 98033 98530 . - . > ID=PB12301-RA:exon:134;Parent=PB12301-RA > pbar_scf7180000349951 maker exon 98033 98140 . - .
> ID=PB12301-RA:exon:133;Parent=PB12301-RA > pbar_scf7180000349951 maker exon 98393 98530 . - . > ID=PB12301-RA:exon:132;Parent=PB12301-RA > pbar_scf7180000349951 maker three_prime_UTR 98393 98530 . - . > ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA > pbar_scf7180000349951 maker three_prime_UTR 98033 98140 . - . > ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA > pbar_scf7180000349951 maker CDS 98033 98530 . - 0 > ID=PB12301-RA:cds;Parent=PB12301-RA > > Thank you, > > Sasha > > On Tue, Mar 12, 2013 at 10:37 PM, Carson Holt wrote: >> Yes. Try the newer version and see if you still have the issue. >> >> Thanks, >> Carson >> >> >> From: Sasha Mikheyev >> Date: Tuesday, 12 March, 2013 1:26 AM >> To: Carson Holt >> Cc: Barry Moore , >> >> >> Subject: Re: [maker-devel] duplicate CDS in annotation >> >> Hi Carson, >> >> I have been using version 2.10. Is it worth trying with a newer version? >> >> You can find the model file here >> . It is rather large, as it >> includes all of the output from the first maker run. >> >> Yours, >> >> Sasha >> >> >> On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt wrote: >>> I think the issue is that you are getting a match feature that is being >>> printed with the same ID as the mRNA feature. Correct? >>> >>> What version of MAKER are you using, and what does the file you are giving >>> to pred_gff or model_gff look like? Could you send them? >>> >>> Thanks, >>> Carson >>> >>> >>> From: Barry Moore >>> Date: Monday, 11 March, 2013 7:32 AM >>> To: Sasha Mikheyev >>> Cc: >>> Subject: Re: [maker-devel] duplicate CDS in annotation >>> >>> Hi Sasha, >>> >>> This gene model appears to be correctly formatted to me. In GFF3 format the >>> CDS features are allowed to span multiple lines and they share the same ID >>> to indicate that it is all the same feature.
See the GFF3 specification on >>> the Sequence Ontology website >>> (http://www.sequenceontology.org/resources/gff3.html), and in particular the >>> description of the ID attribute specifies: >>> >>>> ID Indicates the ID of the feature. IDs for each feature must be unique >>>> within the scope of the GFF file. In the case of discontinuous features >>>> (i.e. a single feature that exists over multiple genomic locations) the >>>> same ID may appear on multiple lines. All lines that share an ID >>>> collectively represent a single feature. >>> >>> So each of those CDS lines forms one part of the single CDS feature for this >>> gene. >>> >>> B >>> >>> On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote: >>> >>>> Dear Yandell lab, >>>> >>>> I am re-annotating the harvester ant genome using protein and RNA-seq data. >>>> However, I get many artifacts like the one below. It seems that there are >>>> several CDS records that should tie in to the same mRNA, but they are >>>> really hanging out separately, and produce several nucleotide sequences >>>> with the same name when extracted from the gff. I would appreciate any >>>> guidance about how to fix this! >>>> >>>> Thank you, >>>> >>>> Sasha >>>> >>>> grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff >>>> pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . >>>> ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; >>>> pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . >>>> ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Nam >>>> e=Hsal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159; >>>> pbar_scf7180000350377 maker mRNA 538308 558769 . + . >>>> ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350 >>>> 377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene >>>> -5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; >>>> pbar_scf7180000350377 maker exon 538308 538334 0.01 + .
>>>> ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit >>>> :2506; >>>> pbar_scf7180000350377 maker exon 538748 538968 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit >>>> :2506; >>>> pbar_scf7180000350377 maker exon 539842 540242 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit >>>> :2506; >>>> pbar_scf7180000350377 maker exon 542624 542798 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit >>>> :2506; >>>> pbar_scf7180000350377 maker exon 555823 556025 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit >>>> :2506; >>>> pbar_scf7180000350377 maker exon 558609 558769 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit >>>> :2506; >>>> pbar_scf7180000350377 maker CDS 538308 538334 . + 0 >>>> ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit: >>>> 2506; >>>> pbar_scf7180000350377 maker CDS 538748 538968 . + 0 >>>> ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit: >>>> 2506; >>>> pbar_scf7180000350377 maker CDS 539842 540242 . + 1 >>>> ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit: >>>> 2506; >>>> pbar_scf7180000350377 maker CDS 542624 542798 . + 2 >>>> ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit: >>>> 2506; >>>> pbar_scf7180000350377 maker CDS 555823 556025 . + 1 >>>> ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit: >>>> 2506; >>>> pbar_scf7180000350377 maker CDS 558609 558769 . 
+ 2 >>>> ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit: >>>> 2506; >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >>> Barry Moore >>> Research Scientist >>> Dept. of Human Genetics >>> University of Utah >>> Salt Lake City, UT 84112 >>> -------------------------------------------- >>> (801) 585-3543 >>> >>> >>> >>> >>> _______________________________________________ maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikheyev at gmail.com Wed Mar 13 20:17:40 2013 From: mikheyev at gmail.com (Sasha Mikheyev) Date: Thu, 14 Mar 2013 10:17:40 +0900 Subject: [maker-devel] duplicate CDS in annotation In-Reply-To: References: Message-ID: OK. Got it! I did pass through the gene model names. I guess I now see that a new gene model may become associated with the old name in the re-annotation. Sasha -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikheyev at gmail.com Wed Mar 13 22:34:52 2013 From: mikheyev at gmail.com (Sasha Mikheyev) Date: Thu, 14 Mar 2013 12:34:52 +0900 Subject: [maker-devel] duplicate CDS in annotation In-Reply-To: References: Message-ID: Thank you very much! Problem solved! Sasha On Thu, Mar 14, 2013 at 11:54 AM, Carson Holt wrote: > Yes. map_forward=1 allows new models to keep the names of the models they > replace.
It makes it so you don't have to relocate genes every time a > model gets a slight modification during reannotation. > > --Carson -------------- next part -------------- An HTML attachment was scrubbed... URL: From ramonfallon at gmail.com Thu Mar 14 10:19:47 2013 From: ramonfallon at gmail.com (Ramón Fallon) Date: Thu, 14 Mar 2013 16:19:47 +0100 Subject: [maker-devel] 12core speed check Message-ID: Hi, I was trying to tweak some of our machines to maximise Mpich2/Maker (svn rev 997) throughput and describe one small set of results on this mailing list to allow sharing of experiences. I use the example input dataset "dpp_contig.fasta" with the original sequence repeated 125 times within the same file (under different names of course) to allow for a decent size run. This file totalled 4.019 megabases. I use the dpp_proteins.fasta and The maker_opts.ctl has "cpus=1" set as the docs recommend for MPI. Hardware is a standalone HP Proliant SL390 with two Intel X5675 @ 3.07GHz, totalling 12 cores with 192GB RAM and 1TB disk (local, no NFS) running Ubuntu 10.04 with 2.6.32-41 linux kernel commandline was "mpiexec -n <#cores> maker" within a dedicated directory containing all relevant files.
#cores  time(mins)  Megabases/hr
     1       27.00          8.93
     2      126.25          1.91
     4       42.57          5.66
     6       25.42          9.49
     8       18.60         12.96
    10       16.67         14.47
    12       13.98         17.24

I attach a png file with graph. The upshot of this particular experiment is that 2 processes show anomalous behaviour and that 6 processors are needed to gain an advantage on the 1 processor run, while 12 processors achieves a speed-up of nearly 2 on the 1 processor version. I am now going to move on to a three node cluster with 2x 8core processors each (so I can go up to 48 processors), so will report back with higher core numbers. Any suggestions on further speed optimizations welcome. Cheers / Ramón. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: twelvecore_spup.png Type: image/png Size: 25749 bytes Desc: not available URL: From carsonhh at gmail.com Thu Mar 14 10:53:33 2013 From: carsonhh at gmail.com (Carson Holt) Date: Thu, 14 Mar 2013 11:53:33 -0400 Subject: [maker-devel] 12core speed check In-Reply-To: Message-ID: I can give a similar setup a try as well to see if anything is amiss in the development version. The expected behavior is that 1 and 2 cores should have identical performance (as one process is always fully dedicated to communication). --Carson -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnuhn at ebi.ac.uk Thu Mar 14 11:20:01 2013 From: mnuhn at ebi.ac.uk (Michael Nuhn) Date: Thu, 14 Mar 2013 16:20:01 +0000 Subject: [maker-devel] master_datastore_index.log file shrinks. Message-ID: <5141F8B1.7020808@ebi.ac.uk> Hello! I'm trying to keep track of the progress of maker (version 2.27) while it is running by looking at the master_datastore_index.log file every once in a while. Sometimes the number of lines in it decreases. Just now it went down from more than two hundred to thirty seven. When I start more instances of maker, the number of lines in it increases when they start. But sometimes I check and the number of lines has greatly reduced since the last time.
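A minimal sketch of a progress check that does not rely on the raw line count. It assumes (the layout is not stated in this thread) that each line of the index log is tab-separated as contig name, output directory, status, and that a contig gains a further line whenever its status changes. The function name summarize_index is hypothetical.

```shell
#!/bin/sh
# Tally the most recent status recorded for each contig in a MAKER
# datastore index log.
# ASSUMPTION: tab-separated columns are name, directory, status, and later
# lines for the same contig supersede earlier ones.
summarize_index() {
    awk -F '\t' '
        NF >= 3 { last[$1] = $3 }   # keep only the latest status per contig
        END {
            for (name in last) count[last[name]]++
            for (status in count) printf "%s\t%d\n", status, count[status]
        }
    ' "$1" | sort
}
```

Called with the path to the log as its only argument, it prints one line per status followed by the number of contigs whose latest entry carries that status.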
I'm afraid that the newer instances of maker are deleting the file and starting it from scratch instead of adding their progress to it. Is this a file locking issue I should be worried about? Cheers, Michael. From olaf.mueller at duke.edu Thu Mar 14 11:13:20 2013 From: olaf.mueller at duke.edu (Olaf Mueller) Date: Thu, 14 Mar 2013 12:13:20 -0400 Subject: [maker-devel] 12core speed check In-Reply-To: References: Message-ID: <5141F720.20502@duke.edu> The X5675 supports hyperthreading. Does i.e. "mpiexec -n 2 maker" use 2 physical cores or 2 threads of the 1st core? If the latter happens it would be interesting to see your series extended to -n 24. Cheers Olaf On 03/14/2013 11:19 AM, Ram?n Fallon wrote: > Hi, > > I was trying to tweak some of our machines to maximise Mpich2/Maker > (svn rev 997) throughput and describe one small set of results on > this mailing list to allow sharing of experiences. > > I use the example input dataset "dpp_contig.fasta" with the original > sequence repeated 125 times within the same file (under different > names of course) to allow for a decent size run. This file totalled > 4.019 megabases. I use the dpp_proteins.fasta and The maker_opts.ctl > has "cpus=1" set as the docs recommend for MPI. > > Hardware is a standalone HP Proliant SL390 with two Intel X5675 @ > 3.07GHz, totalling 12 cores with 192GB RAM and 1TB disk (local, no > NFS) running Ubuntu 10.04 with 2.6.32-41 linux kernel > > commandline was "mpiexec -n <#cores> maker" within a dedicated > directory containing all relevant files. > > #cores time(mins) Megabases/hr > 1 27.00 8.93 > 2 126.25 1.91 > 4 42.57 5.66 > 6 25.42 9.49 > 8 18.60 12.96 > 10 16.67 14.47 > 12 13.98 17.24 > > I attach a png file with graph. The upshot of this particular > experiment is that 2 processes show anomalous behaviour and that 6 > processors are needed to gain an advantage on the 1 processor run, > while 12 processors achieves a speed-up of nearly 2 on the 1 processor > version. 
>
> I am now going to move on to a three node cluster with 2x 8core
> processors each (so I can go up to 48 processors), so will report back
> with higher core numbers. Any suggestions on further speed
> optimizations welcome.
>
> Cheers / Ramón.

From carsonhh at gmail.com  Thu Mar 14 11:21:47 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 14 Mar 2013 12:21:47 -0400
Subject: [maker-devel] master_datastore_index.log file shrinks.
In-Reply-To: <5141F8B1.7020808@ebi.ac.uk>
Message-ID: 

The file should only be deleted if there are no instances running and a new one starts. Then it rebuilds it. If it is being deleted while other instances are still active, then yes that is a lock issue. There are several other locks that should protect individual contigs while that particular lock is only protecting the datastore_index.log file.

If any of the contig locks are not working you would start to see failures of contigs with weird errors that say there are missing files.

Try dialling back on the number of simultaneous instances you start and instead use MPI or the -cpus option to get the parallelization boost.
Alternatively you can also split up the input file and use the -base option so everything gets written to the same place (then you never have to worry about locks affecting individual contigs - as no single instance has access to all the contigs)

Example:

    fasta_tool --chunks 5 maize_assembly.fasta
    maker -g maize_assembly_0.fasta -base maize_assembly
    maker -g maize_assembly_1.fasta -base maize_assembly
    maker -g maize_assembly_2.fasta -base maize_assembly
    maker -g maize_assembly_3.fasta -base maize_assembly
    maker -g maize_assembly_4.fasta -base maize_assembly
    maker -dsindex

Everything then gets written to maize_assembly.maker.output for all results. The last call to maker with the -dsindex flag then rebuilds the datastore_index.log file to match the original maize_assembly.fasta file

Thanks,
Carson

From mnuhn at ebi.ac.uk  Thu Mar 14 11:49:19 2013
From: mnuhn at ebi.ac.uk (Michael Nuhn)
Date: Thu, 14 Mar 2013 16:49:19 +0000
Subject: [maker-devel] master_datastore_index.log file shrinks.
In-Reply-To: 
References: 
Message-ID: <5141FF8F.2050900@ebi.ac.uk>

Hello Carson!

Thanks for your quick response and your ideas. I'll give them a try.

Cheers,
Michael.

On 03/14/2013 04:21 PM, Carson Holt wrote:
> The file should only be deleted if there are no instances running and a
> new one starts. Then it rebuilds it. If it is being deleted while other
> instances are still active, then yes that is a lock issue. There are
> several other locks that should protect individual contigs while that
> particular lock is only protecting the datastore_index.log file.
>
> If any of the contig locks are not working you would start to see failures
> of contigs with weird errors that say there are missing files.
>
> Try dialling back on the number of simultaneous instances you start and
> instead use MPI or the -cpus option to get the parallelization boost.
> Alternatively you can also split up the input file and use the -base
> option so everything gets written to the same place (then you never have
> to worry about locks affecting individual contigs - as no single instance
> has access to all the contigs)
>
> Example:
> fasta_tool --chunks 5 maize_assembly.fasta
> maker -g maize_assembly_0.fasta -base maize_assembly
> maker -g maize_assembly_1.fasta -base maize_assembly
> maker -g maize_assembly_2.fasta -base maize_assembly
> maker -g maize_assembly_3.fasta -base maize_assembly
> maker -g maize_assembly_4.fasta -base maize_assembly
> maker -dsindex
>
> Everything then gets written to maize_assembly.maker.output for all
> results. The last call to maker with the -dsindex flag then rebuilds the
> datastore_index.log file to match the original maize_assembly.fasta file
>
> Thanks,
> Carson
>
> On 13-03-14 12:20 PM, "Michael Nuhn" wrote:
>
>> Hello!
>>
>> I'm trying to keep track of the progress of maker (version 2.27) while
>> it is running by looking at the master_datastore_index.log file every
>> once in a while.
>>
>> Sometimes the number of lines in it decreases. Just now it went down
>> from more than two hundred to thirty seven.
>>
>> When I start more instances of maker, the number of lines in it
>> increases when they start. But sometimes I check and the number of lines
>> has greatly reduced since the last time.
>>
>> I'm afraid that the newer instances of maker are deleting the file and
>> starting it from scratch instead of adding their progress to it.
>>
>> Is this a file locking issue I should be worried about?
>>
>> Cheers,
>> Michael.

From carsonhh at gmail.com  Thu Mar 14 12:51:41 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 14 Mar 2013 13:51:41 -0400
Subject: [maker-devel] 12core speed check
In-Reply-To: 
Message-ID: 

Could you update to 998? A recent commit to the devel version caused a weird pause.

Thanks,
Carson

From: Ramón Fallon
Date: Thursday, 14 March, 2013 11:19 AM
To: 
Subject: [maker-devel] 12core speed check

Hi,

I was trying to tweak some of our machines to maximise Mpich2/Maker (svn rev 997) throughput and describe one small set of results on this mailing list to allow sharing of experiences.

I use the example input dataset "dpp_contig.fasta" with the original sequence repeated 125 times within the same file (under different names of course) to allow for a decent size run. This file totalled 4.019 megabases. I use the dpp_proteins.fasta and the maker_opts.ctl has "cpus=1" set as the docs recommend for MPI.

Hardware is a standalone HP Proliant SL390 with two Intel X5675 @ 3.07GHz, totalling 12 cores with 192GB RAM and 1TB disk (local, no NFS) running Ubuntu 10.04 with 2.6.32-41 linux kernel

commandline was "mpiexec -n <#cores> maker" within a dedicated directory containing all relevant files.

#cores   time(mins)   Megabases/hr
   1        27.00         8.93
   2       126.25         1.91
   4        42.57         5.66
   6        25.42         9.49
   8        18.60        12.96
  10        16.67        14.47
  12        13.98        17.24

I attach a png file with graph.
The upshot of this particular experiment is that 2 processes show anomalous behaviour and that 6 processors are needed to gain an advantage on the 1 processor run, while 12 processors achieves a speed-up of nearly 2 on the 1 processor version.

I am now going to move on to a three node cluster with 2x 8core processors each (so I can go up to 48 processors), so will report back with higher core numbers. Any suggestions on further speed optimizations welcome.

Cheers / Ramón.

From carsonhh at gmail.com  Thu Mar 14 12:55:38 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 14 Mar 2013 13:55:38 -0400
Subject: [maker-devel] 12core speed check
In-Reply-To: <5141F720.20502@duke.edu>
Message-ID: 

It should use 2 physical cores. Hyperthreading shouldn't come into play unless you start more processes than there are physical cores. I haven't seen any big performance advantage in most cases with hyperthreading on linux machines. I find more often than not it just confuses students into thinking there are free processors and then starting too many jobs.

--Carson

From: Olaf Mueller
Date: Thursday, 14 March, 2013 12:13 PM
To: 
Subject: Re: [maker-devel] 12core speed check

The X5675 supports hyperthreading. Does i.e. "mpiexec -n 2 maker" use 2 physical cores or 2 threads of the 1st core? If the latter happens it would be interesting to see your series extended to -n 24.

Cheers
Olaf

On 03/14/2013 11:19 AM, Ramón Fallon wrote:
> Hi,
>
> I was trying to tweak some of our machines to maximise Mpich2/Maker (svn rev
> 997) throughput and describe one small set of results on this mailing list to
> allow sharing of experiences.
>
> I use the example input dataset "dpp_contig.fasta" with the original sequence
> repeated 125 times within the same file (under different names of course) to
> allow for a decent size run. This file totalled 4.019 megabases. I use the
> dpp_proteins.fasta and The maker_opts.ctl has "cpus=1" set as the docs
> recommend for MPI.
>
> Hardware is a standalone HP Proliant SL390 with two Intel X5675 @ 3.07GHz,
> totalling 12 cores with 192GB RAM and 1TB disk (local, no NFS) running Ubuntu
> 10.04 with 2.6.32-41 linux kernel
>
> commandline was "mpiexec -n <#cores> maker" within a dedicated directory
> containing all relevant files.
>
> #cores   time(mins)   Megabases/hr
>    1        27.00         8.93
>    2       126.25         1.91
>    4        42.57         5.66
>    6        25.42         9.49
>    8        18.60        12.96
>   10        16.67        14.47
>   12        13.98        17.24
>
> I attach a png file with graph. The upshot of this particular experiment is
> that 2 processes show anomalous behaviour and that 6 processors are needed to
> gain an advantage on the 1 processor run, while 12 processors achieves a
> speed-up of nearly 2 on the 1 processor version.
>
> I am now going to move on to a three node cluster with 2x 8core processors
> each (so I can go up to 48 processors), so will report back with higher core
> numbers. Any suggestions on further speed optimizations welcome.
>
> Cheers / Ramón.

From myandell at genetics.utah.edu  Thu Mar 14 12:59:37 2013
From: myandell at genetics.utah.edu (Mark Yandell)
Date: Thu, 14 Mar 2013 17:59:37 +0000
Subject: [maker-devel] 12core speed check
In-Reply-To: 
References: , 
Message-ID: <7A60AB257EFF2B48B1F4C814817EA05350ED9082@mxb2.hg.genetics.utah.edu>

Thanks Ramon. super interesting analysis!

Mark Yandell
Professor of Human Genetics
H.A. & Edna Benning Presidential Endowed Chair
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
ph:801-587-7707

________________________________________
From: maker-devel-bounces at yandell-lab.org [maker-devel-bounces at yandell-lab.org] on behalf of Carson Holt [carsonhh at gmail.com]
Sent: Thursday, March 14, 2013 11:51 AM
To: Ramón Fallon; maker-devel at yandell-lab.org
Subject: Re: [maker-devel] 12core speed check

Could you update to 998. It was a recent commit to the devel version that caused a weird pause.

Thanks,
Carson

From: Ramón Fallon
Date: Thursday, 14 March, 2013 11:19 AM
To: 
Subject: [maker-devel] 12core speed check

Hi,

I was trying to tweak some of our machines to maximise Mpich2/Maker (svn rev 997) throughput and describe one small set of results on this mailing list to allow sharing of experiences.

I use the example input dataset "dpp_contig.fasta" with the original sequence repeated 125 times within the same file (under different names of course) to allow for a decent size run. This file totalled 4.019 megabases. I use the dpp_proteins.fasta and The maker_opts.ctl has "cpus=1" set as the docs recommend for MPI.

Hardware is a standalone HP Proliant SL390 with two Intel X5675 @ 3.07GHz, totalling 12 cores with 192GB RAM and 1TB disk (local, no NFS) running Ubuntu 10.04 with 2.6.32-41 linux kernel

commandline was "mpiexec -n <#cores> maker" within a dedicated directory containing all relevant files.

#cores   time(mins)   Megabases/hr
   1        27.00         8.93
   2       126.25         1.91
   4        42.57         5.66
   6        25.42         9.49
   8        18.60        12.96
  10        16.67        14.47
  12        13.98        17.24

I attach a png file with graph. The upshot of this particular experiment is that 2 processes show anomalous behaviour and that 6 processors are needed to gain an advantage on the 1 processor run, while 12 processors achieves a speed-up of nearly 2 on the 1 processor version.

I am now going to move on to a three node cluster with 2x 8core processors each (so I can go up to 48 processors), so will report back with higher core numbers. Any suggestions on further speed optimizations welcome.

Cheers / Ramón.

From daniel.quest at gmail.com  Thu Mar 14 21:07:34 2013
From: daniel.quest at gmail.com (Dan Quest)
Date: Fri, 15 Mar 2013 02:07:34 +0000 (UTC)
Subject: [maker-devel] Invitation to connect on LinkedIn
Message-ID: <1487511280.7392755.1363313254244.JavaMail.app@ela4-app2322.prod>

LinkedIn
------------

I'd like to add you to my professional network on LinkedIn.

- Dan

Dan Quest
Senior Analyst Programmer at Mayo Clinic
Rochester, Minnesota Area

From ares711122 at gmail.com  Thu Mar 14 22:13:55 2013
From: ares711122 at gmail.com (Hung-Wei Hsu)
Date: Fri, 15 Mar 2013 11:13:55 +0800
Subject: [maker-devel] ERROR: Could not obtain lock to format database
In-Reply-To: 
References: 
Message-ID: 

You may find the error messages in the run log as attached.
Thanks a lot in advance.

Best regards,
Hung-Wei

2013/3/14 Carson Holt
> Could you check to make sure your hard drive is not full, whatever
> location you set as TMP= in the control files is not full (default is
> /tmp). Also make sure you do not set /tmp to an NFS mounted or a tmpfs
> location.
>
> Could you also send the full captured STDERR.
>
> Thanks,
> Carson
>
> From: Hung-Wei Hsu
> Date: Tuesday, 12 March, 2013 8:24 PM
> To: 
> Subject: [maker-devel] ERROR: Could not obtain lock to format database
>
> Hi MAKER developers,
>
> I tried MAKER 2.27b on one E. coli scaffold sequence with uniprot protein
> database.
> I failed to run the analysis and got an error message as below.
>
> Could not obtain lock to format database at maker-2.27b/bin/../lib/GI.pm
>
> Any suggestions or help will be deeply appreciated.
>
> Best regards,
> Hung-Wei

-------------- next part --------------
A non-text attachment was scrubbed...
Name: run.log
Type: application/octet-stream
Size: 27205 bytes
Desc: not available
URL: 

From ares711122 at gmail.com  Thu Mar 14 22:35:09 2013
From: ares711122 at gmail.com (Hung-Wei Hsu)
Date: Fri, 15 Mar 2013 11:35:09 +0800
Subject: [maker-devel] ERROR: Could not obtain lock to format database
In-Reply-To: 
References: 
Message-ID: 

The hard disk where I tried MAKER is about 2TB in size.
TMP was not set to an NFS mounted or a tmpfs location and was empty before analysis.
The hard disk on which the TMP directory was located was also about 2TB in size.
Thanks a lot in advance.

Best regards,
Hung-Wei

From carsonhh at gmail.com  Fri Mar 15 13:06:21 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 15 Mar 2013 14:06:21 -0400
Subject: [maker-devel] ERROR: Could not obtain lock to format database
In-Reply-To: 
Message-ID: 

Were you by any chance running multiple instances of MAKER at the same time in the same directory? It looks like two processes started to work on the same contig (normally a first set of locks blocks this possibility, but rarely they get past that step). Then when it got to a part where an analysis is performed one properly failed when it realized that the other had the lock.
In any case, it looks like it just retried and finished the contig in question. So the snippet seems to indicate expected behavior. Do you see the contig in question as being finished and having an output GFF3?

--Carson

From ramonfallon at gmail.com  Mon Mar 18 09:35:04 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Mon, 18 Mar 2013 15:35:04 +0100
Subject: [maker-devel] Fwd: 12core speed check
In-Reply-To: 
References: <7A60AB257EFF2B48B1F4C814817EA05350ED9082@mxb2.hg.genetics.utah.edu>
Message-ID: 

Hi!

I've tried again from two different machines, and I can't do a "svn co" nor "svn update" on the malachite server.
Can you verify that the server and the svn service are OK on your side?

Many thanks / Ramón.

On Fri, Mar 15, 2013 at 1:18 PM, Ramón Fallon wrote:
> Hi Mark and Carson,
>
> Many thanks for the comments and the speedy replies!
>
> Previously, I never had a problem connecting to the svn server on
> malachite.genetics.utah.edu, but this morning, I couldn't connect to
> update to rev 998.
>
> I'll try again later.
>
> Cheers / Ramón.
>
> On Thu, Mar 14, 2013 at 6:59 PM, Mark Yandell wrote:
>
>> Thanks Ramon. super interesting analysis!
>>
>> Mark Yandell
>> Professor of Human Genetics
>> H.A. & Edna Benning Presidential Endowed Chair
>> Eccles Institute of Human Genetics
>> University of Utah
>> 15 North 2030 East, Room 2100
>> Salt Lake City, UT 84112-5330
>> ph:801-587-7707
>>
>> ________________________________________
>> From: maker-devel-bounces at yandell-lab.org [
>> maker-devel-bounces at yandell-lab.org] on behalf of Carson Holt [
>> carsonhh at gmail.com]
>> Sent: Thursday, March 14, 2013 11:51 AM
>> To: Ramón Fallon; maker-devel at yandell-lab.org
>> Subject: Re: [maker-devel] 12core speed check
>>
>> Could you update to 998. It was a recent commit to the devel version
>> that caused a weird pause.
>>
>> Thanks,
>> Carson
>>
>> From: Ramón Fallon
>> Date: Thursday, 14 March, 2013 11:19 AM
>> To: 
>> Subject: [maker-devel] 12core speed check
>>
>> Hi,
>>
>> I was trying to tweak some of our machines to maximise Mpich2/Maker (svn
>> rev 997) throughput and describe one small set of results on this mailing
>> list to allow sharing of experiences.
>>
>> I use the example input dataset "dpp_contig.fasta" with the original
>> sequence repeated 125 times within the same file (under different names of
>> course) to allow for a decent size run. This file totalled 4.019 megabases.
>> I use the dpp_proteins.fasta and The maker_opts.ctl has "cpus=1" set as
>> the docs recommend for MPI.
>>
>> Hardware is a standalone HP Proliant SL390 with two Intel X5675 @
>> 3.07GHz, totalling 12 cores with 192GB RAM and 1TB disk (local, no NFS)
>> running Ubuntu 10.04 with 2.6.32-41 linux kernel
>>
>> commandline was "mpiexec -n <#cores> maker" within a dedicated directory
>> containing all relevant files.
>>
>> #cores   time(mins)   Megabases/hr
>>    1        27.00         8.93
>>    2       126.25         1.91
>>    4        42.57         5.66
>>    6        25.42         9.49
>>    8        18.60        12.96
>>   10        16.67        14.47
>>   12        13.98        17.24
>>
>> I attach a png file with graph. The upshot of this particular experiment
>> is that 2 processes show anomalous behaviour and that 6 processors are
>> needed to gain an advantage on the 1 processor run, while 12 processors
>> achieves a speed-up of nearly 2 on the 1 processor version.
>>
>> I am now going to move on to a three node cluster with 2x 8core
>> processors each (so I can go up to 48 processors), so will report back with
>> higher core numbers. Any suggestions on further speed optimizations welcome.
>>
>> Cheers / Ramón.

From carsonhh at gmail.com  Mon Mar 18 09:51:37 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 18 Mar 2013 10:51:37 -0400
Subject: [maker-devel] Fwd: 12core speed check
In-Reply-To: 
Message-ID: 

For any users currently using the devel subversion repository: if you need to update, please send me an e-mail to get information on how to switch over to our new server.

Thanks,
Carson

From: Ramón Fallon
Date: Monday, 18 March, 2013 10:35 AM
To: 
Subject: [maker-devel] Fwd: 12core speed check

Hi!

I've tried again from two different machines, and I can't do a "svn co" nor "svn update" on the malachite server.

Can you verify the server and the svn service is OK on your side?

Many thanks / Ramón.

On Fri, Mar 15, 2013 at 1:18 PM, Ramón Fallon wrote:
> Hi Mark and Carson,
>
> Many thanks for the comments and the speedy replies!
>
> Previously, I never had problem connecting to the svn server on
> malachite.genetics.utah.edu, but this
> morning, I couldn't connect to update to rev 998.
>
> I'l try again later.
>
> Cheers / Ramón.
>
> On Thu, Mar 14, 2013 at 6:59 PM, Mark Yandell wrote:
>> Thanks Ramon. super interesting analysis!
>>
>> Mark Yandell
>> Professor of Human Genetics
>> H.A. & Edna Benning Presidential Endowed Chair
>> Eccles Institute of Human Genetics
>> University of Utah
>> 15 North 2030 East, Room 2100
>> Salt Lake City, UT 84112-5330
>> ph:801-587-7707
>>
>> ________________________________________
>> From: maker-devel-bounces at yandell-lab.org
>> [maker-devel-bounces at yandell-lab.org] on behalf of Carson Holt
>> [carsonhh at gmail.com]
>> Sent: Thursday, March 14, 2013 11:51 AM
>> To: Ramón Fallon; maker-devel at yandell-lab.org
>> Subject: Re: [maker-devel] 12core speed check
>>
>> Could you update to 998. It was a recent commit to the devel version that
>> caused a weird pause.
>>
>> Thanks,
>> Carson
>>
>> From: Ramón Fallon
>> Date: Thursday, 14 March, 2013 11:19 AM
>> To: 
>> Subject: [maker-devel] 12core speed check
>>
>> Hi,
>>
>> I was trying to tweak some of our machines to maximise Mpich2/Maker (svn rev
>> 997) throughput and describe one small set of results on this mailing list
>> to allow sharing of experiences.
>>
>> I use the example input dataset "dpp_contig.fasta" with the original sequence
>> repeated 125 times within the same file (under different names of course) to
>> allow for a decent size run. This file totalled 4.019 megabases. I use the
>> dpp_proteins.fasta and The maker_opts.ctl has "cpus=1" set as the docs
>> recommend for MPI.
>>
>> Hardware is a standalone HP Proliant SL390 with two Intel X5675 @ 3.07GHz,
>> totalling 12 cores with 192GB RAM and 1TB disk (local, no NFS) running Ubuntu
>> 10.04 with 2.6.32-41 linux kernel
>>
>> commandline was "mpiexec -n <#cores> maker" within a dedicated directory
>> containing all relevant files.
>>
>> #cores   time(mins)   Megabases/hr
>>    1        27.00         8.93
>>    2       126.25         1.91
>>    4        42.57         5.66
>>    6        25.42         9.49
>>    8        18.60        12.96
>>   10        16.67        14.47
>>   12        13.98        17.24
>>
>> I attach a png file with graph. The upshot of this particular experiment is
>> that 2 processes show anomalous behaviour and that 6 processors are needed to
>> gain an advantage on the 1 processor run, while 12 processors achieves a
>> speed-up of nearly 2 on the 1 processor version.
>>
>> I am now going to move on to a three node cluster with 2x 8core processors
>> each (so I can go up to 48 processors), so will report back with higher core
>> numbers. Any suggestions on further speed optimizations welcome.
>>
>> Cheers / Ramón.

From hudarul at yahoo.com  Mon Mar 18 15:13:21 2013
From: hudarul at yahoo.com (Hud Hud)
Date: Mon, 18 Mar 2013 13:13:21 -0700 (PDT)
Subject: [maker-devel] Maker-no such file or directory
Message-ID: <1363637601.24386.YahooMailNeo@web164901.mail.bf1.yahoo.com>

I have a problem with maker.

I am trying to work with the example data in the data directory, but I am getting the following error. Can anyone help me?

error

$ maker
STATUS: Parsing control files...
dpp_contig.fasta (fasta file or fasta embeded in GFF3 file): No such file or directory at /home/Dorah/maker-2.27-beta/maker/bin/../lib/GI.pm line 186
--> rank=NA, hostname=NurKaiyisah

my maker_opts.ctl

genome=$home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta
est=$home/Dorah/maker-2.27-beta/maker/data/dpp_est.fasta
protein=$home/Dorah/maker-2.27-beta/maker/data/dpp_protein.fasta

From Hossein.Borhan at AGR.GC.CA  Mon Mar 18 15:40:38 2013
From: Hossein.Borhan at AGR.GC.CA (Borhan, Hossein)
Date: Mon, 18 Mar 2013 16:40:38 -0400
Subject: [maker-devel] failed gene prediction
Message-ID: <7B64340A44B6634C814A22BCFA6179D5020887DA@onottaxms5.AGR.GC.CA>

Hi

I have tried maker on a fungus genome of 45 Mb with 1/3 being repeat rich. It did not produce any prediction. I am not sure what is causing this. Attached are the STDERR and opts.ctl.

I appreciate your help

Hossein

-------------- next part --------------
A non-text attachment was scrubbed...
Name: wa74-maker-stderr.log
Type: application/octet-stream
Size: 6325713 bytes
Desc: wa74-maker-stderr.log
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_opts.ctl
Type: application/octet-stream
Size: 5244 bytes
Desc: maker_opts.ctl
URL: 

From carsonhh at gmail.com  Mon Mar 18 15:44:41 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 18 Mar 2013 16:44:41 -0400
Subject: [maker-devel] Maker-no such file or directory
In-Reply-To: <1363637601.24386.YahooMailNeo@web164901.mail.bf1.yahoo.com>
Message-ID: 

Does 'ls -al $home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta' show a valid location? The error is just saying that the file location as written in the maker_opts.ctl file does not exist.
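
The same check can be done for every input in the control file at once. What follows is only a minimal sketch under stated assumptions, not part of MAKER: check_ctl_paths is a hypothetical helper, it assumes one plain path per key with "#" starting a comment, and it expands environment variables the way Python does, which may differ from how MAKER itself reads the file.

```python
import os


def check_ctl_paths(ctl_text, keys=("genome", "est", "protein")):
    """Return {key: (expanded_path, exists)} for selected control-file keys.

    Hypothetical helper for illustration only; assumes 'key=value #comment'
    lines with a single path per key.
    """
    results = {}
    for raw_line in ctl_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key in keys and value:
            # Unset variables are left untouched by expandvars, so a value
            # that still starts with "$" points at an undefined variable.
            expanded = os.path.expandvars(value)
            results[key] = (expanded, os.path.isfile(expanded))
    return results
```

Calling check_ctl_paths(open("maker_opts.ctl").read()) and looking for entries whose second element is False, or whose path still contains a "$", shows which inputs cannot be found. Note that shells normally define HOME rather than home, so a path written with $home may not expand to anything useful.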

--Carson

From carsonhh at gmail.com  Mon Mar 18 15:49:30 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 18 Mar 2013 16:49:30 -0400
Subject: [maker-devel] failed gene prediction
In-Reply-To: <7B64340A44B6634C814A22BCFA6179D5020887DA@onottaxms5.AGR.GC.CA>
Message-ID: 

You didn't supply any evidence or HMM files for gene predictors. Just raw assembly data by itself is insufficient for genome annotation.

Here is some nice documentation for running MAKER --> http://gmod.org/wiki/MAKER_Tutorial_2012

Here is a nice overview of genome annotation in general --> http://fasta.bioch.virginia.edu/cshl/pdf/12/ajm12/euk_genome_annotation_review.pdf

Once you've gone through the documentation and examples, if you come across any questions just let us know.

Thanks,
Carson

From: "Borhan, Hossein"
Date: Monday, 18 March, 2013 4:40 PM
To: 
Subject: [maker-devel] failed gene prediction

Hi

I have tried maker on a fungus genome of 45 mb with 1/3 being repeat rich. It did not produce any prediction.
I am not sure what is causing this. Attached are the STDERR and opts.ctl. I appreciate your help Hossein _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From ares711122 at gmail.com Mon Mar 18 19:44:39 2013 From: ares711122 at gmail.com (Hung-Wei Hsu) Date: Tue, 19 Mar 2013 08:44:39 +0800 Subject: [maker-devel] ERROR: Could not obtain lock to format database In-Reply-To: References: Message-ID: I make sure I just ran one instance of MAKER at the same time. I only analyzed one contig for the test. After MAKER interruption, I can't find an GFF3 output of this contig. There are only a theVoidXXX directory and a run.log file. I'm trying 2.26b with the same parameters for the same data. Hopefully, it can work well. Hung-Wei 2013/3/16 Carson Holt > Were you by any chance running multiple instances of MAKER at the same > time in the same directory? It looks like two processes started to work on > the same contig (normally a first set of locks blocks this possibility ? > but rarely they get past that step). Then when it got to a part where an > analysis is performed one properly failed when it realized that the other > had the lock. In any case, it looks like it just retried and finished the > contig in question. So the snippet seems to indicate expected behavior. > Do you see the contig in question as being finished and having an output > GFF3? > > --Carson > > > > > From: Hung-Wei Hsu > Date: Thursday, 14 March, 2013 11:13 PM > To: Carson Holt > Cc: > Subject: Re: [maker-devel] ERROR: Could not obtain lock to format database > > You may find the error messages in the run log as attached. > Thanks a lot in advance. 
> > Best regards, > Hung-Wei > > > 2013/3/14 Carson Holt > >> Could you check to make sure your hard drive is not full, whatever >> location you set as TMP= in the control files is not full (default is >> /tmp). Also maker sure you do not set /tmp to an NFS mounted or a tmpfs >> location. >> >> Could you also send the full captured STDERR. >> >> Thanks, >> Carson >> >> >> >> From: Hung-Wei Hsu >> Date: Tuesday, 12 March, 2013 8:24 PM >> To: >> Subject: [maker-devel] ERROR: Could not obtain lock to format database >> >> Hi MAKER developers, >> >> I tried MAKER 2.27b on one E. coli scaffold sequence with uniprot protein >> database. >> I failed to run the analysis and got an error message as below. >> >> Could not obtain lock to format database at maker-2.27b/bin/../lib/GI.pm >> >> Any suggestions or helps will be deeply appreciated. >> >> Best regards, >> Hung-Wei >> _______________________________________________ maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnuhn at ebi.ac.uk Tue Mar 19 07:12:32 2013 From: mnuhn at ebi.ac.uk (Michael Nuhn) Date: Tue, 19 Mar 2013 12:12:32 +0000 Subject: [maker-devel] master_datastore_index.log file shrinks. In-Reply-To: <5141FF8F.2050900@ebi.ac.uk> References: <5141FF8F.2050900@ebi.ac.uk> Message-ID: <51485630.6080701@ebi.ac.uk> Hello Carson! On 03/14/2013 04:49 PM, Michael Nuhn wrote: >> Try dialling back on the number of simultaneous instances you start and >> instead use MPI or the -cpus option to get the parallelization boost. 
>> Alternatively you can also split up the input file and use the -base >> option so everything gets written to the same place (then you never have >> to worry about locks affecting individual contigs - as no single instance >> has access to all the contigs) >> >> Example: >> fasta_tool --chunks 5 maize_assembly.fasta >> maker -g maize_assembly_0.fasta -base maize_assembly >> maker -g maize_assembly_1.fasta -base maize_assembly >> >> maker -g maize_assembly_2.fasta -base maize_assembly >> >> maker -g maize_assembly_3.fasta -base maize_assembly >> >> maker -g maize_assembly_4.fasta -base maize_assembly >> >> maker -dsindex >> >> Everything then gets written to maize_assembly.maker.output for all >> results. The last call to maker with the -dsindex flag then rebuilds the >> datastore_index.log file to match the original maize_assembly.fasta file I have tried this, split my genome into 50 files and run them as you suggested above. This worked well most of the time, but now I am getting locking issues again. The working directory gets flooded with STACK.STACK.STACK.STACK ... files. What I think is happening is that for some reason the maker instances decide that they want to rebuild the index. This takes a lot of time and this blocks even more instances wanting to lock the index files. In the end most of the maker instances end up waiting. I would like to try the following, but I don't know, if this might cause problems later on: I would like to run all of the split sequence files as separate maker projects as if they were independent genomes. In the end I'd merge all the individual gff files using the gff3_merge script. Do you see any reason why this wouldn't work? Cheers, Michael. From Bob_Freeman at hms.harvard.edu Tue Mar 19 08:03:00 2013 From: Bob_Freeman at hms.harvard.edu (Freeman, Robert M.) 
Date: Tue, 19 Mar 2013 09:03:00 -0400 Subject: [maker-devel] Failing to recoup all gff annotations, including ab-initio Message-ID: Carson et al., Thanks again for a great suite tools! We're using MAKER now to generate gene models (and model fragments) for a ciliate, the models for which we'll be using to generate a high-quality protein database for searches with mass spec. I bootstrapped the process using the core set of proteins with CEGMA, then trained SNAP. After the final round of running MAKER, I get about 1100 evidence-based models and 34K ab-initio. And that's fine (for now). I am able to collect the fasta files for both transcripts and proteins (evidence-based and ab-initio) without problem. My problem is that when I use the gff3_merge script, I only get annotations for the evidence-based models. I'm not sure why the ab-initio model annotations are being collected. I've tried using and not the '-g' switch, but this doesn't seem to make a difference. Thoughts? Tx, B ----------------------------------------------------- Bob Freeman, Ph.D. Acorn Worm Informatics, Kirschner lab Dept of Systems Biology, Alpert 524 Harvard Medical School 200 Longwood Avenue Boston, MA 02115 617/432.2294, vox "Sorry I'm late. Oh, God, that sounded insincere. I'm late." -- Karen Walker, from Will and Grace -------------- next part -------------- An HTML attachment was scrubbed... URL: From dsth at ebi.ac.uk Tue Mar 19 08:33:13 2013 From: dsth at ebi.ac.uk (Daniel Hughes) Date: Tue, 19 Mar 2013 13:33:13 +0000 Subject: [maker-devel] master_datastore_index.log file shrinks.] Message-ID: Daniel S. T. Hughes M.Biochem (Hons; Oxford), Ph.D (Cambridge) ------------------------------------------------------------------------------------- dsth at cantab.net dsth at cpan.org Hi Michael, You're using ebi cluster? i have to ask, is this all just a really elaborate way of avoiding the use of MPI that works perfectly well on both the ebi and sanger compute farms? 
if you carry on in the direction you seem to be going you're likely to end up with a considerable level of unnecessary overhead and should possibly consider adapting the ensembl genebuild pipeline to your specific needs. Dan Hello Carson! > > On 03/14/2013 04:49 PM, Michael Nuhn wrote: > >> Try dialling back on the number of simultaneous instances you start and > >> instead use MPI or the -cpus option to get the parallelization boost. > >> Alternatively you can also split up the input file and use the -base > >> option so everything gets written to the same place (then you never have > >> to worry about locks affecting individual contigs - as no single > instance > >> has access to all the contigs) > >> > >> Example: > >> fasta_tool --chunks 5 maize_assembly.fasta > >> maker -g maize_assembly_0.fasta -base maize_assembly > >> maker -g maize_assembly_1.fasta -base maize_assembly > >> > >> maker -g maize_assembly_2.fasta -base maize_assembly > >> > >> maker -g maize_assembly_3.fasta -base maize_assembly > >> > >> maker -g maize_assembly_4.fasta -base maize_assembly > >> > >> maker -dsindex > >> > >> Everything then gets written to maize_assembly.maker.output for all > >> results. The last call to maker with the -dsindex flag then rebuilds > the > >> datastore_index.log file to match the original maize_assembly.fasta file > > I have tried this, split my genome into 50 files and run them as you > suggested above. > > This worked well most of the time, but now I am getting locking issues > again. The working directory gets flooded with STACK.STACK.STACK.STACK > ... files. > > What I think is happening is that for some reason the maker instances > decide that they want to rebuild the index. This takes a lot of time > and this blocks even more instances wanting to lock the index files. > In the end most of the maker instances end up waiting. 
>
> I would like to try the following, but I don't know, if this might
> cause problems later on:
>
> I would like to run all of the split sequence files as separate maker
> projects as if they were independent genomes. In the end I'd merge all
> the individual gff files using the gff3_merge script.
>
> Do you see any reason why this wouldn't work?
>
> Cheers,
> Michael.
>
> ----- End forwarded message -----
>

From carsonhh at gmail.com Tue Mar 19 09:27:16 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 19 Mar 2013 10:27:16 -0400
Subject: [maker-devel] master_datastore_index.log file shrinks.]
In-Reply-To:
Message-ID:

Yes. If at all possible use MPI. It removes the overhead of locks, which are taken per primary instance of MAKER. So one maker job using 1000 cpus via MPI will have one shared set of locks. 1000 serial instances of MAKER on the other hand would have 1000x the locks.

Alternatively, if you do need to continue without MPI for some reason, I just finished a devel version of MAKER that has a --no_locks option. You can never start two instances using the same input fasta when --no_locks is specified, but the splitting to use different input fastas I mentioned before in the example will still work fine.

I have also updated the indexing/reindexing, so if indexing failures happen, MAKER will switch between the current working directory and the TMP= directory from the maker_opts.ctl file so as to try different IO locations (i.e. NFS and non-NFS). Note you should never set TMP= in the control files to an NFS-mounted location (it not only makes things a lot slower, but Berkeley DB and SQLite will get frequent errors on NFS).
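To see what kind of filesystem a candidate TMP= directory sits on, something like the following can help. It is a rough sketch under stated assumptions: it relies on GNU coreutils `stat` (Linux), and `tmp_fs_ok` and `check_tmp_dir` are hypothetical helper names, not MAKER commands. tmpfs is flagged as well because an earlier message in this thread warns against it alongside NFS.

```shell
#!/bin/sh
# tmp_fs_ok TYPE
# Succeeds unless the filesystem type name is NFS (nfs, nfs4, ...) or tmpfs.
tmp_fs_ok() {
    case "$1" in
        nfs*|tmpfs) return 1 ;;
        *) return 0 ;;
    esac
}

# check_tmp_dir DIR
# Looks up DIR's filesystem type with GNU stat (assumption: Linux) and
# prints whether it looks suitable for TMP=.
check_tmp_dir() {
    fstype=$(stat -f -c %T "$1") || return 2
    if tmp_fs_ok "$fstype"; then
        echo "$1: $fstype (looks usable for TMP=)"
    else
        echo "$1: $fstype (avoid for TMP=)"
    fi
}

# Example use:
# check_tmp_dir /tmp
```

The classification is kept separate from the lookup so the list of types to avoid can be adjusted for a given site.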
TMP= defaults to /tmp when not specified.

I'll send you download information in a separate e-mail. Try a regular MAKER run to see if the indexing/reindexing changes are sufficient before attempting the --no_locks option.

Thanks,
Carson

From: Daniel Hughes
Date: Tuesday, 19 March, 2013 9:33 AM
To: Michael Nuhn ,
Subject: Re: [maker-devel] master_datastore_index.log file shrinks.]

Daniel S. T. Hughes M.Biochem (Hons; Oxford), Ph.D (Cambridge)
-------------------------------------------------------------------------------------
dsth at cantab.net
dsth at cpan.org

Hi Michael,

You're using ebi cluster? i have to ask, is this all just a really elaborate way of avoiding the use of MPI that works perfectly well on both the ebi and sanger compute farms? if you carry on in the direction you seem to be going you're likely to end up with a considerable level of unnecessary overhead and should possibly consider adapting the ensembl genebuild pipeline to your specific needs.

Dan

> Hello Carson!
>
> On 03/14/2013 04:49 PM, Michael Nuhn wrote:
>>> >> Try dialling back on the number of simultaneous instances you start and
>>> >> instead use MPI or the -cpus option to get the parallelization boost.
>>> >> Alternatively you can also split up the input file and use the -base
>>> >> option so everything gets written to the same place (then you never have
>>> >> to worry about locks affecting individual contigs - as no single instance
>>> >> has access to all the contigs)
>>> >>
>>> >> Example:
>>> >> fasta_tool --chunks 5 maize_assembly.fasta
>>> >> maker -g maize_assembly_0.fasta -base maize_assembly
>>> >> maker -g maize_assembly_1.fasta -base maize_assembly
>>> >>
>>> >> maker -g maize_assembly_2.fasta -base maize_assembly
>>> >>
>>> >> maker -g maize_assembly_3.fasta -base maize_assembly
>>> >>
>>> >> maker -g maize_assembly_4.fasta -base maize_assembly
>>> >>
>>> >> maker -dsindex
>>> >>
>>> >> Everything then gets written to maize_assembly.maker.output for all
>>> >> results.
The last call to maker with the -dsindex flag then rebuilds the >>> >> datastore_index.log file to match the original maize_assembly.fasta file > > I have tried this, split my genome into 50 files and run them as you > suggested above. > > This worked well most of the time, but now I am getting locking issues > again. The working directory gets flooded with STACK.STACK.STACK.STACK > ... files. > > What I think is happening is that for some reason the maker instances > decide that they want to rebuild the index. This takes a lot of time > and this blocks even more instances wanting to lock the index files. > In the end most of the maker instances end up waiting. > > I would like to try the following, but I don't know, if this might > cause problems later on: > > I would like to run all of the split sequence files as separate maker > projects as if they were independent genomes. In the end I'd merge all > the individual gff files using the gff3_merge script. > > Do you see any reason why this wouldn't work? > > Cheers, > Michael. > > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > ----- End forwarded message ----- > _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Tue Mar 19 09:38:00 2013 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 19 Mar 2013 10:38:00 -0400 Subject: [maker-devel] master_datastore_index.log file shrinks.] In-Reply-To: Message-ID: You can also talk to Eleanor Stanley at Sanger, she has a pre-release of MAKER 2.28 already installed and running on the Sanger cluster with OpenMPI. 
Thanks, Carson From: Carson Holt Date: Tuesday, 19 March, 2013 10:27 AM To: Daniel Hughes , Michael Nuhn , Subject: Re: [maker-devel] master_datastore_index.log file shrinks.] Yes. If at all possible use MPI. It removes the overhead of locks which happen per primary instance of MAKER. So one maker job using 1000 cpus via MPI will have one shared set of locks. 1000 serial instances of MAKER on the other hand would have 1000x the locks. Alternatively if you do need to continue without MPI for some reason, I just finished a devel version of MAKER that has a --no_locks option. You can never start two instances using the same input fasta when --no_locks is specified, but the splitting to use different input fastas I mentioned before in the example will still work fine. I also have updated the indexing/reindexing, so if indexing failures happen, MAKER will switch between the current working directory and the TMP= directory from the maker_opts.ctl file so as to try different IO locations (I.e. NFS and non-NFS). Note you should never set TMP= in the control files to an NFS mounted location (it not only makes things a lot slower, but berkleydb and sqllite will get frequent errors on NFS). TMP= defaults to /tmp when not specified I'll send you download information in a separate e-mail. Try a regular MAKER run to see if the indexing/reindexing changes are sufficient before attempting the ?no_locks option. Thanks, Carson From: Daniel Hughes Date: Tuesday, 19 March, 2013 9:33 AM To: Michael Nuhn , Subject: Re: [maker-devel] master_datastore_index.log file shrinks.] Daniel S. T. Hughes M.Biochem (Hons; Oxford), Ph.D (Cambridge) ---------------------------------------------------------------------------- --------- dsth at cantab.net dsth at cpan.org Hi Michael, You're using ebi cluster? i have to ask, is this all just a really elaborate way of avoiding the use of MPI that works perfectly well on both the ebi and sanger compute farms? 
if you carry on in the direction you seem to be going you're likely to end up with a considerable level of unnecessary overhead and should possibly consider adapting the ensembl genebuild pipeline to your specific needs. Dan > Hello Carson! > > On 03/14/2013 04:49 PM, Michael Nuhn wrote: >>> >> Try dialling back on the number of simultaneous instances you start and >>> >> instead use MPI or the -cpus option to get the parallelization boost. >>> >> Alternatively you can also split up the input file and use the -base >>> >> option so everything gets written to the same place (then you never have >>> >> to worry about locks affecting individual contigs - as no single instance >>> >> has access to all the contigs) >>> >> >>> >> Example: >>> >> fasta_tool --chunks 5 maize_assembly.fasta >>> >> maker -g maize_assembly_0.fasta -base maize_assembly >>> >> maker -g maize_assembly_1.fasta -base maize_assembly >>> >> >>> >> maker -g maize_assembly_2.fasta -base maize_assembly >>> >> >>> >> maker -g maize_assembly_3.fasta -base maize_assembly >>> >> >>> >> maker -g maize_assembly_4.fasta -base maize_assembly >>> >> >>> >> maker -dsindex >>> >> >>> >> Everything then gets written to maize_assembly.maker.output for all >>> >> results. The last call to maker with the -dsindex flag then rebuilds the >>> >> datastore_index.log file to match the original maize_assembly.fasta file > > I have tried this, split my genome into 50 files and run them as you > suggested above. > > This worked well most of the time, but now I am getting locking issues > again. The working directory gets flooded with STACK.STACK.STACK.STACK > ... files. > > What I think is happening is that for some reason the maker instances > decide that they want to rebuild the index. This takes a lot of time > and this blocks even more instances wanting to lock the index files. > In the end most of the maker instances end up waiting. 
>
> I would like to try the following, but I don't know, if this might
> cause problems later on:
>
> I would like to run all of the split sequence files as separate maker
> projects as if they were independent genomes. In the end I'd merge all
> the individual gff files using the gff3_merge script.
>
> Do you see any reason why this wouldn't work?
>
> Cheers,
> Michael.
>
> ----- End forwarded message -----
>

From carsonhh at gmail.com Tue Mar 19 09:52:19 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 19 Mar 2013 10:52:19 -0400
Subject: [maker-devel] Failing to recoup all gff annotations, including ab-initio
In-Reply-To:
Message-ID:

Ab initio models without evidence support are not considered final models by default (newly trained ab initio predictors tend to have a very high false positive rate). If you really want the ab initio models without support to be upgraded, set keep_preds=1 in the maker_opts.ctl file. All ab initio models are also stored in the GFF3 as match/match_part features for reference purposes, not as gene/mRNA/exon/CDS features.

Thanks,
Carson

From: "Freeman, Robert M."
Date: Tuesday, 19 March, 2013 9:03 AM
To: "maker-devel at yandell-lab.org"
Subject: [maker-devel] Failing to recoup all gff annotations, including ab-initio

Carson et al.,

Thanks again for a great suite tools! We're using MAKER now to generate gene models (and model fragments) for a ciliate, the models for which we'll be using to generate a high-quality protein database for searches with mass spec.
I bootstrapped the process using the core set of proteins with CEGMA, then trained SNAP. After the final round of running MAKER, I get about 1100 evidence-based models and 34K ab-initio. And that's fine (for now). I am able to collect the fasta files for both transcripts and proteins (evidence-based and ab-initio) without problem. My problem is that when I use the gff3_merge script, I only get annotations for the evidence-based models. I'm not sure why the ab-initio model annotations are being collected. I've tried using and not the '-g' switch, but this doesn't seem to make a difference. Thoughts? Tx, B ----------------------------------------------------- Bob Freeman, Ph.D. Acorn Worm Informatics, Kirschner lab Dept of Systems Biology, Alpert 524 Harvard Medical School 200 Longwood Avenue Boston, MA 02115 617/432.2294, vox "Sorry I'm late. Oh, God, that sounded insincere. I'm late." -- Karen Walker, from Will and Grace _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnuhn at ebi.ac.uk Tue Mar 19 10:19:25 2013 From: mnuhn at ebi.ac.uk (Michael Nuhn) Date: Tue, 19 Mar 2013 15:19:25 +0000 Subject: [maker-devel] master_datastore_index.log file shrinks.] In-Reply-To: References: Message-ID: <514881FD.4020003@ebi.ac.uk> Hello Carson! On 03/19/2013 02:27 PM, Carson Holt wrote: > Yes. If at all possible use MPI. It removes the overhead of locks > which happen per primary instance of MAKER. So one maker job using 1000 > cpus via MPI will have one shared set of locks. 1000 serial instances > of MAKER on the other hand would have 1000x the locks. I don't know a thing about MPI. I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and open mpi and none of them worked for me. 
I also tried the automatic installation that comes with maker, but it didn't work for me either. If need be, I could spend time getting to the bottom of this, but there is no telling how long this would take me so I'd rather not, if there is an alternative. Would the approach I outlined before work? (Treating the split files as separate genomes to annotate and then combine the gffs afterwards) I also like this approach, because I would select a few contigs in the beginning which I would run on their own. They would complete early and this way I would get a preview of the results of the run instead of having to wait for everything to complete. It might also be more robust, because file locking issues would be confined to the instances working on a sequence chunk, but the rest of the instances could continue working. Cheers, Michael. > Alternatively if you do need to continue without MPI for some reason, I > just finished a devel version of MAKER that has a --no_locks option. > You can never start two instances using the same input fasta when > --no_locks is specified, but the splitting to use different input fastas > I mentioned before in the example will still work fine. > > I also have updated the indexing/reindexing, so if indexing failures > happen, MAKER will switch between the current working directory and the > TMP= directory from the maker_opts.ctl file so as to try different IO > locations (I.e. NFS and non-NFS). Note you should never set TMP= in the > control files to an NFS mounted location (it not only makes things a lot > slower, but berkleydb and sqllite will get frequent errors on NFS). > TMP= defaults to /tmp when not specified > > I'll send you download information in a separate e-mail. Try a regular > MAKER run to see if the indexing/reindexing changes are sufficient > before attempting the ?no_locks option. 
> > Thanks, > Carson From carsonhh at gmail.com Tue Mar 19 10:02:22 2013 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 19 Mar 2013 11:02:22 -0400 Subject: [maker-devel] master_datastore_index.log file shrinks.] In-Reply-To: <514881FD.4020003@ebi.ac.uk> Message-ID: Try it with the no_locks option then. Make sure to let one instance finish populating the mpi_blastdb directory before running other instances as that is where most initial locking occurs. I'll send you more details on how to install with OpenMPI, so you can give that a shot while your jobs are also running serially (so you don't lose time). Also instead of 50 serial instances, you could try 10 with -cpus set to 5. Thanks, Carson On 13-03-19 11:19 AM, "Michael Nuhn" wrote: >Hello Carson! > >On 03/19/2013 02:27 PM, Carson Holt wrote: >> Yes. If at all possible use MPI. It removes the overhead of locks >> which happen per primary instance of MAKER. So one maker job using 1000 >> cpus via MPI will have one shared set of locks. 1000 serial instances >> of MAKER on the other hand would have 1000x the locks. > >I don't know a thing about MPI. > >I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and open >mpi and none of them worked for me. I also tried the automatic >installation that comes with maker, but it didn't work for me either. > >If need be, I could spend time getting to the bottom of this, but there >is no telling how long this would take me so I'd rather not, if there is >an alternative. > >Would the approach I outlined before work? (Treating the split files as >separate genomes to annotate and then combine the gffs afterwards) > >I also like this approach, because I would select a few contigs in the >beginning which I would run on their own. They would complete early and >this way I would get a preview of the results of the run instead of >having to wait for everything to complete. 
> >It might also be more robust, because file locking issues would be >confined to the instances working on a sequence chunk, but the rest of >the instances could continue working. > >Cheers, >Michael. > >> Alternatively if you do need to continue without MPI for some reason, I >> just finished a devel version of MAKER that has a --no_locks option. >> You can never start two instances using the same input fasta when >> --no_locks is specified, but the splitting to use different input fastas >> I mentioned before in the example will still work fine. >> >> I also have updated the indexing/reindexing, so if indexing failures >> happen, MAKER will switch between the current working directory and the >> TMP= directory from the maker_opts.ctl file so as to try different IO >> locations (I.e. NFS and non-NFS). Note you should never set TMP= in the >> control files to an NFS mounted location (it not only makes things a lot >> slower, but berkleydb and sqllite will get frequent errors on NFS). >> TMP= defaults to /tmp when not specified >> >> I'll send you download information in a separate e-mail. Try a regular >> MAKER run to see if the indexing/reindexing changes are sufficient >> before attempting the ?no_locks option. >> >> Thanks, >> Carson > From dsth at ebi.ac.uk Tue Mar 19 10:13:51 2013 From: dsth at ebi.ac.uk (Daniel Hughes) Date: Tue, 19 Mar 2013 15:13:51 +0000 Subject: [maker-devel] master_datastore_index.log file shrinks.] In-Reply-To: <514881FD.4020003@ebi.ac.uk> References: <514881FD.4020003@ebi.ac.uk> Message-ID: You really don't need to know anything about MPI. While MPI is itself pretty complex, I seem to recall maker uses the p2p subset alone mainly to send serialised perl objects as c strings etc., for IPC across ad hoc infrastructure - but none of that is relevant as Carson has done all the IPC debugging for you and its use should be transparent. 
If it's failing, its almost certainly because you've got discrepencies between the mpi libraries visible at compile-time vs. run-time and you may need to force the dynamic linker to behave itself. The only other caveat on ebi infrastructure i can think of off the top of my head relates to cross-node MPI usage when going into the hundreds of processes but i'm assuming you not doing that? You need to be more specific about how it's failing. dan from me phone... On Mar 19, 2013 11:55 AM, "Michael Nuhn" wrote: > Hello Carson! > > On 03/19/2013 02:27 PM, Carson Holt wrote: > >> Yes. If at all possible use MPI. It removes the overhead of locks >> which happen per primary instance of MAKER. So one maker job using 1000 >> cpus via MPI will have one shared set of locks. 1000 serial instances >> of MAKER on the other hand would have 1000x the locks. >> > > I don't know a thing about MPI. > > I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and open > mpi and none of them worked for me. I also tried the automatic installation > that comes with maker, but it didn't work for me either. > > If need be, I could spend time getting to the bottom of this, but there is > no telling how long this would take me so I'd rather not, if there is an > alternative. > > Would the approach I outlined before work? (Treating the split files as > separate genomes to annotate and then combine the gffs afterwards) > > I also like this approach, because I would select a few contigs in the > beginning which I would run on their own. They would complete early and > this way I would get a preview of the results of the run instead of having > to wait for everything to complete. > > It might also be more robust, because file locking issues would be > confined to the instances working on a sequence chunk, but the rest of the > instances could continue working. > > Cheers, > Michael. 
>
> Alternatively if you do need to continue without MPI for some reason, I
>> just finished a devel version of MAKER that has a --no_locks option.
>> You can never start two instances using the same input fasta when
>> --no_locks is specified, but the splitting to use different input fastas
>> I mentioned before in the example will still work fine.
>>
>> I also have updated the indexing/reindexing, so if indexing failures
>> happen, MAKER will switch between the current working directory and the
>> TMP= directory from the maker_opts.ctl file so as to try different IO
>> locations (I.e. NFS and non-NFS). Note you should never set TMP= in the
>> control files to an NFS mounted location (it not only makes things a lot
>> slower, but berkleydb and sqllite will get frequent errors on NFS).
>> TMP= defaults to /tmp when not specified
>>
>> I'll send you download information in a separate e-mail. Try a regular
>> MAKER run to see if the indexing/reindexing changes are sufficient
>> before attempting the --no_locks option.
>>
>> Thanks,
>> Carson
>>
>

From carsonhh at gmail.com Tue Mar 19 10:22:22 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 19 Mar 2013 11:22:22 -0400
Subject: [maker-devel] master_datastore_index.log file shrinks.]
In-Reply-To:
Message-ID:

I have MAKER working under OpenMPI 1.4.3 (intel compiled). I had to set a couple of environment variables prior to setup. You would probably need to set these values as well. If your OpenMPI path were, for example, /software/openmpi-1.4.3/, run the following commands (path set accordingly) before even attempting maker setup.

export OMPI_MCA_mpi_warn_on_fork=0
export LD_PRELOAD=/software/openmpi-1.4.3/lib/libmpi.so:$LD_PRELOAD

These need to be set not only before compilation, but also before any run (so add them to your ~/.bashrc or ~/.bash_profile or any module load scripts).
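In bash the two settings above take the `NAME=value` form. The sketch below wraps them in a small function that first checks the library is actually present. It is an illustration only: `setup_openmpi_env` is a hypothetical name, and the prefix is just the example path from this message, so adjust it to the local install.

```shell
#!/bin/sh
# Hypothetical default prefix, taken from the example path in the message.
OPENMPI_PREFIX=${OPENMPI_PREFIX:-/software/openmpi-1.4.3}

# setup_openmpi_env PREFIX
# Exports the two variables described above, but only if PREFIX/lib/libmpi.so
# exists. An existing LD_PRELOAD value is kept after the new entry.
setup_openmpi_env() {
    libmpi="$1/lib/libmpi.so"
    if [ ! -e "$libmpi" ]; then
        echo "libmpi.so not found under $1/lib" >&2
        return 1
    fi
    export OMPI_MCA_mpi_warn_on_fork=0
    # Append the old value only when it is non-empty, to avoid a stray colon.
    export LD_PRELOAD="$libmpi${LD_PRELOAD:+:$LD_PRELOAD}"
}

# Example use, e.g. from ~/.bashrc:
# setup_openmpi_env "$OPENMPI_PREFIX"
```

Checking for the file first means a mistyped prefix produces one clear message instead of a preload warning on every command.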
The LD_PRELOAD statement needs to be set for any program using OpenMPI's shared libraries and not just MAKER, so it's normally a good idea to have that set system wide for all users. The details can be found in the OpenMPI documentation. Note that system library updates can sometimes break OpenMPI's shared libraries while not breaking OpenMPI itself, so you might also need to recompile OpenMPI if it has broken shared libraries.

Once you have those commands in place, run the perl Build.PL step. Say yes to install with MPI. Then run ./Build install

Thanks,
Carson

On 13-03-19 11:02 AM, "Carson Holt" wrote:

>Try it with the --no_locks option then. Make sure to let one instance
>finish populating the mpi_blastdb directory before running other instances
>as that is where most initial locking occurs.
>
>I'll send you more details on how to install with OpenMPI, so you can give
>that a shot while your jobs are also running serially (so you don't lose
>time). Also instead of 50 serial instances, you could try 10 with -cpus
>set to 5.
>
>Thanks,
>Carson
>
>
>
>On 13-03-19 11:19 AM, "Michael Nuhn" wrote:
>
>>Hello Carson!
>>
>>On 03/19/2013 02:27 PM, Carson Holt wrote:
>>> Yes. If at all possible use MPI. It removes the overhead of locks
>>> which happen per primary instance of MAKER. So one maker job using 1000
>>> cpus via MPI will have one shared set of locks. 1000 serial instances
>>> of MAKER on the other hand would have 1000x the locks.
>>
>>I don't know a thing about MPI.
>>
>>I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and open
>>mpi and none of them worked for me. I also tried the automatic
>>installation that comes with maker, but it didn't work for me either.
>>
>>If need be, I could spend time getting to the bottom of this, but there
>>is no telling how long this would take me so I'd rather not, if there is
>>an alternative.
>>
>>Would the approach I outlined before work?
(Treating the split files as >>separate genomes to annotate and then combine the gffs afterwards) >> >>I also like this approach, because I would select a few contigs in the >>beginning which I would run on their own. They would complete early and >>this way I would get a preview of the results of the run instead of >>having to wait for everything to complete. >> >>It might also be more robust, because file locking issues would be >>confined to the instances working on a sequence chunk, but the rest of >>the instances could continue working. >> >>Cheers, >>Michael. >> >>> Alternatively if you do need to continue without MPI for some reason, I >>> just finished a devel version of MAKER that has a --no_locks option. >>> You can never start two instances using the same input fasta when >>> --no_locks is specified, but the splitting to use different input >>>fastas >>> I mentioned before in the example will still work fine. >>> >>> I also have updated the indexing/reindexing, so if indexing failures >>> happen, MAKER will switch between the current working directory and the >>> TMP= directory from the maker_opts.ctl file so as to try different IO >>> locations (I.e. NFS and non-NFS). Note you should never set TMP= in >>>the >>> control files to an NFS mounted location (it not only makes things a >>>lot >>> slower, but berkleydb and sqllite will get frequent errors on NFS). >>> TMP= defaults to /tmp when not specified >>> >>> I'll send you download information in a separate e-mail. Try a regular >>> MAKER run to see if the indexing/reindexing changes are sufficient >>> before attempting the ?no_locks option. >>> >>> Thanks, >>> Carson From dsth at ebi.ac.uk Tue Mar 19 10:26:02 2013 From: dsth at ebi.ac.uk (Daniel Hughes) Date: Tue, 19 Mar 2013 15:26:02 +0000 Subject: [maker-devel] master_datastore_index.log file shrinks.] 
In-Reply-To: References: <514881FD.4020003@ebi.ac.uk> Message-ID: oh and (1) it will work as long as evidence etc., is synchronous, (2) it will be really inefficient - be glad ebi doesn't use a by group compute time fair-share policy ;) Dan from me phone... On Mar 19, 2013 12:13 PM, "Daniel Hughes" wrote: > You really don't need to know anything about MPI. While MPI is itself > pretty complex, I seem to recall maker uses the p2p subset alone mainly to > send serialised perl objects as c strings etc., for IPC across ad hoc > infrastructure - but none of that is relevant as Carson has done all the > IPC debugging for you and its use should be transparent. If it's failing, > its almost certainly because you've got discrepencies between the mpi > libraries visible at compile-time vs. run-time and you may need to force > the dynamic linker to behave itself. The only other caveat on ebi > infrastructure i can think of off the top of my head relates to cross-node > MPI usage when going into the hundreds of processes but i'm assuming you > not doing that? You need to be more specific about how it's failing. > > dan > > from me phone... > On Mar 19, 2013 11:55 AM, "Michael Nuhn" wrote: > >> Hello Carson! >> >> On 03/19/2013 02:27 PM, Carson Holt wrote: >> >>> Yes. If at all possible use MPI. It removes the overhead of locks >>> which happen per primary instance of MAKER. So one maker job using 1000 >>> cpus via MPI will have one shared set of locks. 1000 serial instances >>> of MAKER on the other hand would have 1000x the locks. >>> >> >> I don't know a thing about MPI. >> >> I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and open >> mpi and none of them worked for me. I also tried the automatic installation >> that comes with maker, but it didn't work for me either. >> >> If need be, I could spend time getting to the bottom of this, but there >> is no telling how long this would take me so I'd rather not, if there is an >> alternative. 
>> >> Would the approach I outlined before work? (Treating the split files as >> separate genomes to annotate and then combine the gffs afterwards) >> >> I also like this approach, because I would select a few contigs in the >> beginning which I would run on their own. They would complete early and >> this way I would get a preview of the results of the run instead of having >> to wait for everything to complete. >> >> It might also be more robust, because file locking issues would be >> confined to the instances working on a sequence chunk, but the rest of the >> instances could continue working. >> >> Cheers, >> Michael. >> >> Alternatively if you do need to continue without MPI for some reason, I >>> just finished a devel version of MAKER that has a --no_locks option. >>> You can never start two instances using the same input fasta when >>> --no_locks is specified, but the splitting to use different input fastas >>> I mentioned before in the example will still work fine. >>> >>> I also have updated the indexing/reindexing, so if indexing failures >>> happen, MAKER will switch between the current working directory and the >>> TMP= directory from the maker_opts.ctl file so as to try different IO >>> locations (I.e. NFS and non-NFS). Note you should never set TMP= in the >>> control files to an NFS mounted location (it not only makes things a lot >>> slower, but berkleydb and sqllite will get frequent errors on NFS). >>> TMP= defaults to /tmp when not specified >>> >>> I'll send you download information in a separate e-mail. Try a regular >>> MAKER run to see if the indexing/reindexing changes are sufficient >>> before attempting the ?no_locks option. >>> >>> Thanks, >>> Carson >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnuhn at ebi.ac.uk Tue Mar 19 10:54:34 2013 From: mnuhn at ebi.ac.uk (Michael Nuhn) Date: Tue, 19 Mar 2013 15:54:34 +0000 Subject: [maker-devel] master_datastore_index.log file shrinks.] 
In-Reply-To: References: Message-ID: <51488A3A.20106@ebi.ac.uk> Hello Carson! Thanks for the pointers. I'll give mpi another shot. Cheers, Michael. On 03/19/2013 03:22 PM, Carson Holt wrote: > I have MAKER working under OpemnMPI 1.4.3 (intel compiled). > > I had to set a couple of environmental variables prior to setup. You would > probably need to set these values as well. If you your OpenMPI path was > here for example --> /software/openmpi-1.4.3/, run the following commands > (path set accordingly) before even attempting maker setup. > > export OMPI_MCA_mpi_warn_on_fork 0 > export LD_PRELOAD /software/openmpi-1.4.3/lib/libmpi.so:$LD_PRELOAD > > These not only need to be set before compilation, but also before any run > (so add them to you ~.bashrc or ~/.bash_profile or any module load scripts > thanks). The LD_PRELOAD statement needs to be set for any program using > OpenMPI's shared libraries and not just MAKER, so it's normally a good > idea to have that set system wide for all users. The detail can be found > in the OpenMPI documentation. Note sometimes system library updates can > break OpenMPI's shared libraries while not breaking OpenMPI itself, so you > might also need to recompile OpenMPI if it has broken shared libraries. > > Once you have those commands in place, run the perl Buil.PL step. Say yes > to install with MPI. Then run ./Build install > > Thanks, > Carson > > > > On 13-03-19 11:02 AM, "Carson Holt" wrote: > >> Try it with the no_locks option then. Make sure to let one instance >> finish populating the mpi_blastdb directory before running other >> instances >> as that is where most initial locking occurs. >> >> I'll send you more details on how to install with OpenMPI, so you can >> give >> that a shot while your jobs are also running serially (so you don't lose >> time). Also instead of 50 serial instances, you could try 10 with -cpus >> set to 5. 
>> >> Thanks, >> Carson >> >> >> >> On 13-03-19 11:19 AM, "Michael Nuhn" wrote: >> >>> Hello Carson! >>> >>> On 03/19/2013 02:27 PM, Carson Holt wrote: >>>> Yes. If at all possible use MPI. It removes the overhead of locks >>>> which happen per primary instance of MAKER. So one maker job using >>>> 1000 >>>> cpus via MPI will have one shared set of locks. 1000 serial instances >>>> of MAKER on the other hand would have 1000x the locks. >>> >>> I don't know a thing about MPI. >>> >>> I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and open >>> mpi and none of them worked for me. I also tried the automatic >>> installation that comes with maker, but it didn't work for me either. >>> >>> If need be, I could spend time getting to the bottom of this, but there >>> is no telling how long this would take me so I'd rather not, if there is >>> an alternative. >>> >>> Would the approach I outlined before work? (Treating the split files as >>> separate genomes to annotate and then combine the gffs afterwards) >>> >>> I also like this approach, because I would select a few contigs in the >>> beginning which I would run on their own. They would complete early and >>> this way I would get a preview of the results of the run instead of >>> having to wait for everything to complete. >>> >>> It might also be more robust, because file locking issues would be >>> confined to the instances working on a sequence chunk, but the rest of >>> the instances could continue working. >>> >>> Cheers, >>> Michael. >>> >>>> Alternatively if you do need to continue without MPI for some reason, I >>>> just finished a devel version of MAKER that has a --no_locks option. >>>> You can never start two instances using the same input fasta when >>>> --no_locks is specified, but the splitting to use different input >>>> fastas >>>> I mentioned before in the example will still work fine. 
>>>> >>>> I also have updated the indexing/reindexing, so if indexing failures >>>> happen, MAKER will switch between the current working directory and the >>>> TMP= directory from the maker_opts.ctl file so as to try different IO >>>> locations (I.e. NFS and non-NFS). Note you should never set TMP= in >>>> the >>>> control files to an NFS mounted location (it not only makes things a >>>> lot >>>> slower, but berkleydb and sqllite will get frequent errors on NFS). >>>> TMP= defaults to /tmp when not specified >>>> >>>> I'll send you download information in a separate e-mail. Try a regular >>>> MAKER run to see if the indexing/reindexing changes are sufficient >>>> before attempting the ?no_locks option. >>>> >>>> Thanks, >>>> Carson > > From es9 at sanger.ac.uk Tue Mar 19 10:40:08 2013 From: es9 at sanger.ac.uk (Eleanor Stanley) Date: Tue, 19 Mar 2013 15:40:08 +0000 Subject: [maker-devel] master_datastore_index.log file shrinks.] In-Reply-To: <51488A3A.20106@ebi.ac.uk> References: <51488A3A.20106@ebi.ac.uk> Message-ID: For the Sanger farm I have a wrapper script to run MPI maker so that the same environmental variables are forced to all nodes. Eleanor On 19 Mar 2013, at 15:54, Michael Nuhn wrote: > Hello Carson! > > Thanks for the pointers. I'll give mpi another shot. > > Cheers, > Michael. > > On 03/19/2013 03:22 PM, Carson Holt wrote: >> I have MAKER working under OpemnMPI 1.4.3 (intel compiled). >> >> I had to set a couple of environmental variables prior to setup. You would >> probably need to set these values as well. If you your OpenMPI path was >> here for example --> /software/openmpi-1.4.3/, run the following commands >> (path set accordingly) before even attempting maker setup. >> >> export OMPI_MCA_mpi_warn_on_fork 0 >> export LD_PRELOAD /software/openmpi-1.4.3/lib/libmpi.so:$LD_PRELOAD >> >> These not only need to be set before compilation, but also before any run >> (so add them to you ~.bashrc or ~/.bash_profile or any module load scripts >> thanks). 
The LD_PRELOAD statement needs to be set for any program using >> OpenMPI's shared libraries and not just MAKER, so it's normally a good >> idea to have that set system wide for all users. The detail can be found >> in the OpenMPI documentation. Note sometimes system library updates can >> break OpenMPI's shared libraries while not breaking OpenMPI itself, so you >> might also need to recompile OpenMPI if it has broken shared libraries. >> >> Once you have those commands in place, run the perl Buil.PL step. Say yes >> to install with MPI. Then run ./Build install >> >> Thanks, >> Carson >> >> >> >> On 13-03-19 11:02 AM, "Carson Holt" wrote: >> >>> Try it with the no_locks option then. Make sure to let one instance >>> finish populating the mpi_blastdb directory before running other >>> instances >>> as that is where most initial locking occurs. >>> >>> I'll send you more details on how to install with OpenMPI, so you can >>> give >>> that a shot while your jobs are also running serially (so you don't lose >>> time). Also instead of 50 serial instances, you could try 10 with -cpus >>> set to 5. >>> >>> Thanks, >>> Carson >>> >>> >>> >>> On 13-03-19 11:19 AM, "Michael Nuhn" wrote: >>> >>>> Hello Carson! >>>> >>>> On 03/19/2013 02:27 PM, Carson Holt wrote: >>>>> Yes. If at all possible use MPI. It removes the overhead of locks >>>>> which happen per primary instance of MAKER. So one maker job using >>>>> 1000 >>>>> cpus via MPI will have one shared set of locks. 1000 serial instances >>>>> of MAKER on the other hand would have 1000x the locks. >>>> >>>> I don't know a thing about MPI. >>>> >>>> I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and open >>>> mpi and none of them worked for me. I also tried the automatic >>>> installation that comes with maker, but it didn't work for me either. 
>>>> >>>> If need be, I could spend time getting to the bottom of this, but there >>>> is no telling how long this would take me so I'd rather not, if there is >>>> an alternative. >>>> >>>> Would the approach I outlined before work? (Treating the split files as >>>> separate genomes to annotate and then combine the gffs afterwards) >>>> >>>> I also like this approach, because I would select a few contigs in the >>>> beginning which I would run on their own. They would complete early and >>>> this way I would get a preview of the results of the run instead of >>>> having to wait for everything to complete. >>>> >>>> It might also be more robust, because file locking issues would be >>>> confined to the instances working on a sequence chunk, but the rest of >>>> the instances could continue working. >>>> >>>> Cheers, >>>> Michael. >>>> >>>>> Alternatively if you do need to continue without MPI for some reason, I >>>>> just finished a devel version of MAKER that has a --no_locks option. >>>>> You can never start two instances using the same input fasta when >>>>> --no_locks is specified, but the splitting to use different input >>>>> fastas >>>>> I mentioned before in the example will still work fine. >>>>> >>>>> I also have updated the indexing/reindexing, so if indexing failures >>>>> happen, MAKER will switch between the current working directory and the >>>>> TMP= directory from the maker_opts.ctl file so as to try different IO >>>>> locations (I.e. NFS and non-NFS). Note you should never set TMP= in >>>>> the >>>>> control files to an NFS mounted location (it not only makes things a >>>>> lot >>>>> slower, but berkleydb and sqllite will get frequent errors on NFS). >>>>> TMP= defaults to /tmp when not specified >>>>> >>>>> I'll send you download information in a separate e-mail. Try a regular >>>>> MAKER run to see if the indexing/reindexing changes are sufficient >>>>> before attempting the ?no_locks option. 
>>>>>
>>>>> Thanks,
>>>>> Carson
>>
>>
>
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

--
The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.

From Bob_Freeman at hms.harvard.edu Tue Mar 19 11:18:11 2013
From: Bob_Freeman at hms.harvard.edu (Freeman, Robert M.)
Date: Tue, 19 Mar 2013 12:18:11 -0400
Subject: [maker-devel] Failing to recoup all gff annotations, including ab-initio
In-Reply-To:
References:
Message-ID: <06F15FF0-2384-4BDD-AD9B-9C1D0AB6370C@hms.harvard.edu>

Thanks, Carson. This explains the behavior I saw and will help us moving forward.

Best,
Bob

On Mar 19, 2013, at 10:52 AM, Carson Holt wrote:

Ab initio models without evidence support are not considered final models by default (newly trained ab initio predictors tend to have a very high false positive rate). If you really want the ab initio models without support to be upgraded, set keep_preds=1 in the maker_opts.ctl file. All ab initio models are also stored in the GFF3 as match/match_part features for reference purposes, not as gene/mRNA/exon/CDS.

Thanks,
Carson

From: "Freeman, Robert M."
Date: Tuesday, 19 March, 2013 9:03 AM
To: "maker-devel at yandell-lab.org"
Subject: [maker-devel] Failing to recoup all gff annotations, including ab-initio

Carson et al.,

Thanks again for a great suite of tools! We're using MAKER now to generate gene models (and model fragments) for a ciliate, the models for which we'll be using to generate a high-quality protein database for searches with mass spec. I bootstrapped the process using the core set of proteins with CEGMA, then trained SNAP. After the final round of running MAKER, I get about 1100 evidence-based models and 34K ab-initio.
And that's fine (for now). I am able to collect the fasta files for both transcripts and proteins (evidence-based and ab-initio) without problem. My problem is that when I use the gff3_merge script, I only get annotations for the evidence-based models. I'm not sure why the ab-initio model annotations are being collected. I've tried using and not the '-g' switch, but this doesn't seem to make a difference. Thoughts? Tx, B _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org ----------------------------------------------------- Bob Freeman, Ph.D. Acorn Worm Informatics, Kirschner lab Dept of Systems Biology, Alpert 524 Harvard Medical School 200 Longwood Avenue Boston, MA 02115 617/432.2294, vox "Sorry I'm late. Oh, God, that sounded insincere. I'm late." -- Karen Walker, from Will and Grace -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjfields at illinois.edu Tue Mar 19 14:04:18 2013 From: cjfields at illinois.edu (Fields, Christopher J) Date: Tue, 19 Mar 2013 19:04:18 +0000 Subject: [maker-devel] Alternative start codons Message-ID: <118F034CF4C3EF48A96F86CE585B94BF74DA507D@CHIMBX5.ad.uillinois.edu> We had a user notice that MAKER is not observing alternative start codons for bacterial genomes. For instance, this predicted transcript: >Xf_Mul_000007-RA transcript Name:"Protein of unknown function" offset:79 AED:0.42 eAED:1.00 QI:79|-1|0|1|-1|1|1|20|24 GTGGGATACAGGCCGCTGATCGCTGATGGCGCGTACCTGAAACTGCTGCTGGACTACTAC GTTACAGTGCAGCCTTTGCATGCCGATTGGAAAGATCTATATATCATCGCTTGCGCTATT ACAGCGGCTAAAAAGAGTCTTCAATTTGGCGTAATTCAGTCATTGGCGGGGTAG Yields this protein sequence. 
>Xf_Mul_000007-RA protein AED:0.42 eAED:1.00 QI:79|-1|0|1|-1|1|1|20|24 MPIGKIYISSLALLQRLKRVFNLA I'm pretty sure I know what is going on, namely that MAKER is treating the 5' end as UTR and looking for the first ATG (there is one in the sequence above). Is there any way to change this behavior, though? For instance, allow alternative start codons like GTG/TTG? chris From hudarul at yahoo.com Tue Mar 19 14:08:55 2013 From: hudarul at yahoo.com (Hud Hud) Date: Tue, 19 Mar 2013 12:08:55 -0700 (PDT) Subject: [maker-devel] Maker-no such file or directory In-Reply-To: References: <1363637601.24386.YahooMailNeo@web164901.mail.bf1.yahoo.com> Message-ID: <1363720135.24498.YahooMailNeo@web164901.mail.bf1.yahoo.com> Hello everyone I have some queries, i cant run MAKER locally, so can i use MWAS on my contigs, but since my contigs too long to be run on MWAS, is it possible to combine the results after i upload and run the analysis on my contigs separately... ________________________________ From: Carson Holt To: Hud Hud ; "maker-devel at yandell-lab.org" Sent: Tuesday, March 19, 2013 4:44 AM Subject: Re: [maker-devel] Maker-no such file or directory Does 'ls -al?$home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta' ?show a valid location? The error is just saying that the file location as written in the maker_opts.ctl file does not exist. --Carson From: Hud Hud Reply-To: Hud Hud Date: Monday, 18 March, 2013 4:13 PM To: "maker-devel at yandell-lab.org" Subject: [maker-devel] Maker-no such file or directory I have some problem with maker 1. i try to work with the example data in data directory, but im having this kind of error..anyone can help me error $ maker STATUS: Parsing control files... 
dpp_contig.fasta (fasta file or fasta embeded in GFF3 file): No such file or directory at /home/Dorah/maker-2.27-beta/maker/bin/../lib/GI.pm line 186 --> rank=NA, hostname=NurKaiyisah my maker_opts.ctl genome=$home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta est=$home/Dorah/maker-2.27-beta/maker/data/dpp_est.fasta protein=$home/Dorah/maker-2.27-beta/maker/data/dpp_protein.fasta _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Tue Mar 19 14:30:09 2013 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 19 Mar 2013 15:30:09 -0400 Subject: [maker-devel] Maker-no such file or directory In-Reply-To: <1363720135.24498.YahooMailNeo@web164901.mail.bf1.yahoo.com> Message-ID: You can. It will be very slow though as MWAS only dedicates a single cpu per job. So with a 5Mb max per job submission it could take a very long time depending on the size of the assembly (emphasis on very long). --Carson From: Hud Hud Reply-To: Hud Hud Date: Tuesday, 19 March, 2013 3:08 PM To: "maker-devel at yandell-lab.org" Subject: Re: [maker-devel] Maker-no such file or directory Hello everyone I have some queries, i cant run MAKER locally, so can i use MWAS on my contigs, but since my contigs too long to be run on MWAS, is it possible to combine the results after i upload and run the analysis on my contigs separately... From: Carson Holt To: Hud Hud ; "maker-devel at yandell-lab.org" Sent: Tuesday, March 19, 2013 4:44 AM Subject: Re: [maker-devel] Maker-no such file or directory Does 'ls -al $home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta' show a valid location? The error is just saying that the file location as written in the maker_opts.ctl file does not exist. 
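The check suggested above can be sketched as a small loop over the input options in the control file. This is only an illustration under stated assumptions: the stand-in file, the list of keys, and the parsing of `key=value #comment` lines are mine, and comma-separated lists of files are not handled. Whether MAKER expands shell variables such as $home inside maker_opts.ctl is not established in this thread; note also that bash spells the variable $HOME, so in bash `ls -al $home/...` would test a different path than intended.

```shell
# Stand-in control file for the demo; point ctl at your real maker_opts.ctl instead.
ctl="$(mktemp)"
printf 'genome=/no/such/dir/dpp_contig.fasta #genome sequence\n' > "$ctl"
printf 'est=\n' >> "$ctl"

missing=0
for key in genome est protein; do
    # Take the value after "key=", dropping any trailing comment.
    path="$(sed -n "s/^${key}=\([^[:space:]#]*\).*/\1/p" "$ctl")"
    [ -z "$path" ] && continue   # option left empty or absent
    if [ ! -e "$path" ]; then
        echo "$key: no such file: $path" >&2
        missing=$((missing + 1))
    fi
done
echo "missing paths: $missing"
```

Writing absolute paths in the control file sidesteps the question of variable expansion altogether.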
--Carson From: Hud Hud Reply-To: Hud Hud Date: Monday, 18 March, 2013 4:13 PM To: "maker-devel at yandell-lab.org" Subject: [maker-devel] Maker-no such file or directory I have some problem with maker 1. i try to work with the example data in data directory, but im having this kind of error..anyone can help me error $ maker STATUS: Parsing control files... dpp_contig.fasta (fasta file or fasta embeded in GFF3 file): No such file or directory at /home/Dorah/maker-2.27-beta/maker/bin/../lib/GI.pm line 186 --> rank=NA, hostname=NurKaiyisah my maker_opts.ctl genome=$home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta est=$home/Dorah/maker-2.27-beta/maker/data/dpp_est.fasta protein=$home/Dorah/maker-2.27-beta/maker/data/dpp_protein.fasta _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Tue Mar 19 14:33:46 2013 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 19 Mar 2013 15:33:46 -0400 Subject: [maker-devel] Alternative start codons In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF74DA507D@CHIMBX5.ad.uillinois.edu> Message-ID: It could be changed. I imagine that this is a protein2genome or est2genome gene, as MAKER won't try and determine by itself the start and end if it comes from a gene predictor. --Carson On 13-03-19 3:04 PM, "Fields, Christopher J" wrote: >We had a user notice that MAKER is not observing alternative start codons >for bacterial genomes. 
For instance, this predicted transcript:
>
>>Xf_Mul_000007-RA transcript Name:"Protein of unknown function" offset:79
>>AED:0.42 eAED:1.00 QI:79|-1|0|1|-1|1|1|20|24
>GTGGGATACAGGCCGCTGATCGCTGATGGCGCGTACCTGAAACTGCTGCTGGACTACTAC
>GTTACAGTGCAGCCTTTGCATGCCGATTGGAAAGATCTATATATCATCGCTTGCGCTATT
>ACAGCGGCTAAAAAGAGTCTTCAATTTGGCGTAATTCAGTCATTGGCGGGGTAG
>
>Yields this protein sequence.
>
>>Xf_Mul_000007-RA protein AED:0.42 eAED:1.00 QI:79|-1|0|1|-1|1|1|20|24
>MPIGKIYISSLALLQRLKRVFNLA
>
>I'm pretty sure I know what is going on, namely that MAKER is treating
>the 5' end as UTR and looking for the first ATG (there is one in the
>sequence above). Is there any way to change this behavior, though? For
>instance, allow alternative start codons like GTG/TTG?
>
>chris
>_______________________________________________
>maker-devel mailing list
>maker-devel at box290.bluehost.com
>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From myandell at genetics.utah.edu Tue Mar 19 19:02:36 2013
From: myandell at genetics.utah.edu (Mark Yandell)
Date: Wed, 20 Mar 2013 00:02:36 +0000
Subject: [maker-devel] Maker2 gff file output
In-Reply-To:
References:
Message-ID: <7A60AB257EFF2B48B1F4C814817EA05350EDC688@mxb2.hg.genetics.utah.edu>

Hi Blake,

I've forwarded this on to the maker_devel list; they can help you more there.

Regarding your comment 'When I view the output of many contigs in Apollo, there is many times where 3 or 4 models show close to identical gene structure, but the final maker output does not contain that gene call.': those calls are in the output files, but they are in a different multifasta file; they are the non-overlapping ab initio models. Another way is to set the config flag that allows MAKER to use unspliced EST and RNA-seq alignments as evidence.

cheers,

--mark

Mark Yandell
Professor of Human Genetics
H.A.
& Edna Benning Presidential Endowed Chair Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:801-587-7707 ________________________________________ From: Blake Hovde [hovdebt at uw.edu] Sent: Tuesday, March 19, 2013 2:35 PM To: Mark Yandell Subject: Maker2 gff file output Hi Dr. Yandell, I am currently running MAKER2 on a new algal genome and am running into a couple of problems that I would like your input on the genome size is ~60Mb and is currently in ~3100 contigs. First, I am having trouble doing multiple iterations of hmm training with SNAP due to the fact that I have so many gff output files in the datastore (1 for each contig in my draft genome). not just a single gff output that seems to be in the examples and tutorials I have followed thus far. Is there a way to combine all of my gff files together to make use of the SNAP hmm training or re-annotation? Second, Using multiple lines of evidence (augustus, genemarkES, RNAseq data, and COGs based on homology searches) I am having a hard time getting a lot of maker gene calls. It seems that the calling is too stringent in many cases. When I view the output of many contigs in Apollo, there is many times where 3 or 4 models show close to identical gene structure, but the final maker output does not contain that gene call. Do you have any suggestions on how to lower the stringency of the MAKER output so that more genes will be called? In some cases I am getting less than 3000 gene calls in the final output. Where an Augustus model trained on Chlamydamonas will return ~15000. Thanks very much for your help! 
Sincerely, Blake Hovde Graduate Student Department of Genome Sciences University of Washington From carsonhh at gmail.com Tue Mar 19 21:43:44 2013 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 19 Mar 2013 22:43:44 -0400 Subject: [maker-devel] Maker2 gff file output In-Reply-To: <7A60AB257EFF2B48B1F4C814817EA05350EDC688@mxb2.hg.genetics.utah.edu> Message-ID: >I am currently running MAKER2 on a new algal genome and am running >into a couple of problems that I would like your input on the genome >size is ~60Mb and is currently in ~3100 contigs. >First, I am having trouble doing multiple iterations of hmm training >with SNAP due to the fact that I have so many gff output files in the >datastore (1 for each contig in my draft genome). not just a single >gff output that seems to be in the examples and tutorials I have >followed thus far. Is there a way to combine all of my gff files >together to make use of the SNAP hmm training or re-annotation? Use the gff3_merge script in the .../maker/bin/ directory > >Second, Using multiple lines of evidence (augustus, genemarkES, RNAseq >data, and COGs based on homology searches) I am having a hard time >getting a lot of maker gene calls. It seems that the calling is too >stringent in many cases. When I view the output of many contigs in >Apollo, there is many times where 3 or 4 models show close to >identical gene structure, but the final maker output does not contain >that gene call. Do you have any suggestions on how to lower the >stringency of the MAKER output so that more genes will be called? In >some cases I am getting less than 3000 gene calls in the final output. >Where an Augustus model trained on Chlamydamonas will return ~15000. I agree with Mark. You may want to set single_exon=1 to accept single exon evidence, try increasing the depth of your protein evidence file as well, or if the genome is relatively gene dense, set keep_preds=1. 
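The two settings just mentioned live in maker_opts.ctl. Here is a minimal sketch of flipping them from the command line, assuming the usual `option=value #comment` layout of the control file. The stand-in file and its placeholder comments are invented for the demo; with a real run you would edit your own maker_opts.ctl, by hand or otherwise.

```shell
# Stand-in control file; a real maker_opts.ctl has many more options.
ctl="$(mktemp)"
cat > "$ctl" <<'EOF'
single_exon=0 #placeholder comment
keep_preds=0 #placeholder comment
EOF

# Flip both options to 1 while keeping any trailing comment intact.
sed -E 's/^(single_exon|keep_preds)=[^[:space:]#]*/\1=1/' "$ctl" > "$ctl.new"
mv "$ctl.new" "$ctl"

# Show the resulting settings.
grep -E '^(single_exon|keep_preds)=' "$ctl"
```

Keeping a copy of the original control file next to the edited one makes it easy to compare gene counts between the two runs.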
On some genomes that are gene dense (fungi for example) ab initio predictors don't have that high a false positive rate, so this can be safe. However on more complex genomes doing so can produce more false positives than there are genes. Thanks, Carson On 13-03-19 8:02 PM, "Mark Yandell" wrote: >Hi Blake, > >I'be forwarded this onto the maker_devel list, they can help you more >there. > >regarding your comment g 'When I view the output of many contigs in >Apollo, there is many times where 3 or 4 models show close to identical >gene structure, but the final maker output does not contain that gene >call. ' Those calls are in the output files, but there are in a >different multifasta file; there are non-overalpping ab intio models. >Another way is to set the config flag to allow MAKEr to use unspliced EST >and RNA-seq alignments as evidence, > >I'be forwarded this onto the maker_devel list, they can help you more >there. > >cheers, > >--mark > > >Mark Yandell >Professor of Human Genetics >H.A. & Edna Benning Presidential Endowed Chair >Eccles Institute of Human Genetics >University of Utah >15 North 2030 East, Room 2100 >Salt Lake City, UT 84112-5330 >ph:801-587-7707 > >________________________________________ >From: Blake Hovde [hovdebt at uw.edu] >Sent: Tuesday, March 19, 2013 2:35 PM >To: Mark Yandell >Subject: Maker2 gff file output > >Hi Dr. Yandell, > >I am currently running MAKER2 on a new algal genome and am running >into a couple of problems that I would like your input on the genome >size is ~60Mb and is currently in ~3100 contigs. >First, I am having trouble doing multiple iterations of hmm training >with SNAP due to the fact that I have so many gff output files in the >datastore (1 for each contig in my draft genome). not just a single >gff output that seems to be in the examples and tutorials I have >followed thus far. Is there a way to combine all of my gff files >together to make use of the SNAP hmm training or re-annotation? 
> >Second, Using multiple lines of evidence (augustus, genemarkES, RNAseq >data, and COGs based on homology searches) I am having a hard time >getting a lot of maker gene calls. It seems that the calling is too >stringent in many cases. When I view the output of many contigs in >Apollo, there is many times where 3 or 4 models show close to >identical gene structure, but the final maker output does not contain >that gene call. Do you have any suggestions on how to lower the >stringency of the MAKER output so that more genes will be called? In >some cases I am getting less than 3000 gene calls in the final output. > Where an Augustus model trained on Chlamydamonas will return ~15000. > >Thanks very much for your help! > >Sincerely, >Blake Hovde >Graduate Student >Department of Genome Sciences >University of Washington > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From Carson.Holt at oicr.on.ca Wed Mar 20 08:51:29 2013 From: Carson.Holt at oicr.on.ca (Carson Holt) Date: Wed, 20 Mar 2013 13:51:29 +0000 Subject: [maker-devel] AED calculations using the MAKER pipeline In-Reply-To: Message-ID: In the current MAKER download when using GFF3 passthrough there was an issue with everything being done at the very last step. This of course leads to a memory spike and a very slow last step. That seems to be similar to what you are describing. It should be resolved in what will become version 2.28. I can give you access to the pre-release code, so you can check that the issue is resolved for you. I'll send details in a separate e-mail. Also the ### will be printed after every ~100,000 bp of assembly processed by MAKER. You can ignore them, but they actually have a meaning in GFF3. Basically everything between two sets of ###'s are fully resolved. 
It allows programs that read GFF3 to parallelize file loading or just load sections of a file as they can rapidly identify "safe chunks". Without them the entire file must be loaded into memory in order to be certain that all feature parts are there (as there is no requirement for sorting or order in GFF3). log.child files will always be empty unless you run analysis like snap or blast. Thanks, Carson On 13-03-20 9:05 AM, "Krishnakumar, Vivek" wrote: >Hi, > >We have been using the MAKER pipeline here at JCVI to calculate AED >scores by feeding in our annotation set as `model_gff` and the protein >and EST evidence as `protein_gff` and `est_gff` respectively. Here is the >issue we are having: > >When running the above pipeline with protein2genome and est2genome >evidence generated earlier by MAKER, there are no problems calculating >the AED score. Normally this pipeline takes a little over 12 hours to >complete. > >But if we use our own evidence, AAT and Genewise aligned proteins for >`protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline >runs very very slow and the intermediary *.gff.ann file has many chunks >(separated by '###') that are completely empty. Our evidence in formatted >in the same way as est2genome or protein2genome (GFF file with >"expressed_sequence_match::match_part" or "protein_match::match_part" >features respectively) > >The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use the >default `max_dna_len` parameter used to split the large assemblies into >chunks. > >Investigating the master_datastore.log shows me that the scaffolds run >through without any issues and the chromosomes are still being processed. >For any of the chromosomes, investigating the 'run.log' file, one level >above 'theVoid' shows me how many "final.section" jobs were started and >how many finished. And in the case of all the chromosomes, it tells me >that everything that was started has finished. 
And the 'log.child.*' >files within `theVoid` are all empty. Also within `theVoid`, I'm noticing >that the "raw.section" and "evidence_*.gff" files are not empty. But one >thing that is surprising is that of all the "final.section" files, only >the one pertaining to the last chunk is very large (proportional to the >size of the evidence); the rest are all exactly the same size (exactly 331 >bytes). > >I'm running MAKER in MPI mode spawning 48 processes on a high memory >machine with 64 available cores and 1TB of RAM. > >I hope I've been able to explain my situation clearly in this email. > >Any help is appreciated. >Thank you. > >Vivek From vKrishna at jcvi.org Wed Mar 20 08:05:55 2013 From: vKrishna at jcvi.org (Krishnakumar, Vivek) Date: Wed, 20 Mar 2013 09:05:55 -0400 Subject: [maker-devel] AED calculations using the MAKER pipeline Message-ID: Hi, We have been using the MAKER pipeline here at JCVI to calculate AED scores by feeding in our annotation set as `model_gff` and the protein and EST evidence as `protein_gff` and `est_gff` respectively. Here is the issue we are having: When running the above pipeline with protein2genome and est2genome evidence generated earlier by MAKER, there are no problems calculating the AED score. Normally this pipeline takes a little over 12 hours to complete. But if we use our own evidence, AAT and Genewise aligned proteins for `protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline runs very slowly and the intermediary *.gff.ann file has many chunks (separated by '###') that are completely empty. Our evidence is formatted in the same way as est2genome or protein2genome (GFF file with "expressed_sequence_match::match_part" or "protein_match::match_part" features respectively). The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use the default `max_dna_len` parameter used to split the large assemblies into chunks. 
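The empty '###'-delimited chunks described above can be counted with a few lines of Python, which may help when comparing a slow run against a normal one. This is only a minimal sketch under stated assumptions: the function name count_gff3_chunks is hypothetical (it is not part of MAKER), a chunk is taken to be the feature lines that precede each '###' line, and parsing stops at a '##FASTA' section if one is present.

```python
def count_gff3_chunks(lines):
    """Return (total_chunks, empty_chunks) for GFF3 text split on '###' lines.

    Hypothetical helper for illustration only; not a MAKER function.
    """
    total = 0
    empty = 0
    features = 0  # feature lines seen since the previous '###'
    for raw in lines:
        line = raw.strip()
        if line == "##FASTA":
            # Assumption: everything after this directive is sequence, not features
            break
        if line == "###":
            total += 1
            if features == 0:
                empty += 1
            features = 0
        elif line and not line.startswith("#"):
            features += 1
    if features:
        # Trailing features without a closing '###' still form a chunk
        total += 1
    return total, empty
```

For a file on disk, the function can be given an open file handle, for example `with open(path) as fh: print(count_gff3_chunks(fh))`, where `path` is whichever intermediate or final GFF3 file is being inspected.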
Investigating the master_datastore.log shows me that the scaffolds run through without any issues and the chromosomes are still being processed. For any of the chromosomes, investigating the 'run.log' file, one level above 'theVoid' shows me how many "final.section" jobs were started and how many finished. And in the case of all the chromosomes, it tells me that everything that was started has finished. And the 'log.child.*' files within `theVoid` are all empty. Also within `theVoid`, I'm noticing that the "raw.section" and "evidence_*.gff" files are not empty. But one thing that is surprising is that of all the "final.section" files, only the one pertaining to the last chunk is very large (proportional to the size of the evidence); the rest are all exactly the same size (exactly 331 bytes). I'm running MAKER in MPI mode spawning 48 processes on a high memory machine with 64 available cores and 1TB of RAM. I hope I've been able to explain my situation clearly in this email. Any help is appreciated. Thank you. Vivek From cdtown at jcvi.org Wed Mar 20 08:54:33 2013 From: cdtown at jcvi.org (Town, Christopher D.) Date: Wed, 20 Mar 2013 09:54:33 -0400 Subject: [maker-devel] AED calculations using the MAKER pipeline In-Reply-To: References: Message-ID: Thanks. Is there any way of estimating when this final step might be completed? We are in a time crunch here to get this analysis finished and the data/annotation out. Best Chris -----Original Message----- From: Carson Holt [mailto:Carson.Holt at oicr.on.ca] Sent: Wednesday, March 20, 2013 9:51 AM To: Krishnakumar, Vivek; maker-devel at yandell-lab.org Cc: Town, Christopher D.; Tang, Haibao; Bidwell, Shelby; Rosen, Benjamin Subject: Re: AED calculations using the MAKER pipeline In the current MAKER download when using GFF3 passthrough there was an issue with everything being done at the very last step. This of course leads to a memory spike and a very slow last step. That seems to be similar to what you are describing. 
It should be resolved in what will become version 2.28. I can give you access to the pre-release code, so you can check that the issue is resolved for you. I'll send details in a separate e-mail. Also the ### will be printed after every ~100,000 bp of assembly processed by MAKER. You can ignore them, but they actually have a meaning in GFF3. Basically everything between two sets of ###'s are fully resolved. It allows programs that read GFF3 to parallelize file loading or just load sections of a file as they can rapidly identify "safe chunks". Without them the entire file must be loaded into memory in order to be certain that all feature parts are there (as there is no requirement for sorting or order in GFF3). log.child files will always be empty unless you run analysis like snap or blast. Thanks, Carson On 13-03-20 9:05 AM, "Krishnakumar, Vivek" wrote: >Hi, > >We have been using the MAKER pipeline here at JCVI to calculate AED >scores by feeding in our annotation set as `model_gff` and the protein >and EST evidence as `protein_gff` and `est_gff` respectively. Here is >the issue we are having: > >When running the above pipeline with protein2genome and est2genome >evidence generated earlier by MAKER, there are no problems calculating >the AED score. Normally this pipeline takes a little over 12 hours to >complete. > >But if we use our own evidence, AAT and Genewise aligned proteins for >`protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline >runs very very slow and the intermediary *.gff.ann file has many chunks >(separated by '###') that are completely empty. Our evidence in >formatted in the same way as est2genome or protein2genome (GFF file >with "expressed_sequence_match::match_part" or "protein_match::match_part" >features respectively) > >The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use >the default `max_dna_len` parameter used to split the large assemblies >into chunks. 
> >Investigating the master_datastore.log shows me that the scaffolds run >through without any issues and the chromosomes are still being processed. >For any of the chromosomes, investigating the 'run.log' file, one level >above 'theVoid' shows me how many "final.section" jobs were started and >how many finished. And in the case of all the chromosomes, it tells me >that everything that was started has finished. And the 'log.child.*' >files within `theVoid` are all empty. Also within `theVoid`, I'm >noticing that the "raw.section" and "evidence_*.gff" files are not >empty. But one thing that is surprising is that of all the >"final.section" files, only the one pertaining to the last chunk is >very large (proportional to the size of the evidence); the rest are all >exactly the same size (exactly 331 bytes). > >I'm running MAKER in MPI mode spawning 48 processes on a high memory >machine with 64 available cores and 1TB of RAM. > >I hope I've been able to explain my situation clearly in this email. > >Any help is appreciated. >Thank you. > >Vivek From myandell at genetics.utah.edu Wed Mar 20 09:55:38 2013 From: myandell at genetics.utah.edu (Mark Yandell) Date: Wed, 20 Mar 2013 14:55:38 +0000 Subject: [maker-devel] AED calculations using the MAKER pipeline In-Reply-To: References: Message-ID: <7A60AB257EFF2B48B1F4C814817EA05350EDCE05@mxb2.hg.genetics.utah.edu> Hi Vivek, Sounds like it may be a problem with the protein2genome GFF file. Can you send us a sample file that is known to produce the problem? cheers, --mark Mark Yandell Professor of Human Genetics H.A. 
& Edna Benning Presidential Endowed Chair Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:801-587-7707 ________________________________________ From: maker-devel-bounces at yandell-lab.org [maker-devel-bounces at yandell-lab.org] on behalf of Krishnakumar, Vivek [vKrishna at jcvi.org] Sent: Wednesday, March 20, 2013 7:05 AM To: maker-devel at yandell-lab.org Cc: Tang, Haibao; Rosen, Benjamin; Town, Christopher D.; Bidwell, Shelby Subject: [maker-devel] AED calculations using the MAKER pipeline Hi, We have been using the MAKER pipeline here at JCVI to calculate AED scores by feeding in our annotation set as `model_gff` and the protein and EST evidence as `protein_gff` and `est_gff` respectively. Here is the issue we are having: When running the above pipeline with protein2genome and est2genome evidence generated earlier by MAKER, there are no problems calculating the AED score. Normally this pipeline takes a little over 12 hours to complete. But if we use our own evidence, AAT and Genewise aligned proteins for `protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline runs very very slow and the intermediary *.gff.ann file has many chunks (separated by '###') that are completely empty. Our evidence in formatted in the same way as est2genome or protein2genome (GFF file with "expressed_sequence_match::match_part" or "protein_match::match_part" features respectively) The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use the default `max_dna_len` parameter used to split the large assemblies into chunks. Investigating the master_datastore.log shows me that the scaffolds run through without any issues and the chromosomes are still being processed. For any of the chromosomes, investigating the 'run.log' file, one level above 'theVoid' shows me how many "final.section" jobs were started and how many finished. 
And in the case of all the chromosomes, it tells me that everything that was started has finished. And the 'log.child.*' files within `theVoid` are all empty. Also within `theVoid`, I'm noticing that the "raw.section" and "evidence_*.gff" files are not empty. But one thing that is surprising is that of all the "final.section" files, only the one pertaining to the last chunk is very large (proportional to the size of the evidnce), the rest are all exactly the same size (exactly 331 bytes). I'm running MAKER in MPI mode spawning 48 processes on a high memory machine with 64 available cores and 1TB of RAM. I hope I've been able to explain my situation clearly in this email. Any help is appreciated. Thank you. Vivek _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From myandell at genetics.utah.edu Wed Mar 20 09:57:17 2013 From: myandell at genetics.utah.edu (Mark Yandell) Date: Wed, 20 Mar 2013 14:57:17 +0000 Subject: [maker-devel] AED calculations using the MAKER pipeline In-Reply-To: References: , Message-ID: <7A60AB257EFF2B48B1F4C814817EA05350EDCE15@mxb2.hg.genetics.utah.edu> whoops. looks like carson has got this one already. Thanks! Mark Yandell Professor of Human Genetics H.A. & Edna Benning Presidential Endowed Chair Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:801-587-7707 ________________________________________ From: maker-devel-bounces at yandell-lab.org [maker-devel-bounces at yandell-lab.org] on behalf of Town, Christopher D. [cdtown at jcvi.org] Sent: Wednesday, March 20, 2013 7:54 AM To: Carson Holt; Krishnakumar, Vivek; maker-devel at yandell-lab.org Cc: Tang, Haibao; Rosen, Benjamin; Bidwell, Shelby Subject: Re: [maker-devel] AED calculations using the MAKER pipeline Thanks. Is there any way of guestimating when this final step might be completed. 
We are in a time crunch here to get this analysis finished and the data/annotation out. Best Chris -----Original Message----- From: Carson Holt [mailto:Carson.Holt at oicr.on.ca] Sent: Wednesday, March 20, 2013 9:51 AM To: Krishnakumar, Vivek; maker-devel at yandell-lab.org Cc: Town, Christopher D.; Tang, Haibao; Bidwell, Shelby; Rosen, Benjamin Subject: Re: AED calculations using the MAKER pipeline In the current MAKER download when using GFF3 passthrough there was an issue with everything being done at the very last step. This of course leads to a memory spike and a very slow last step. That seems to be similar to what you are describing. It should be resolved in what will become version 2.28. I can give you access to the pre-release code, so you can check that the issue is resolved for you. I'll send details in a separate e-mail. Also the ### will be printed after every ~100,000 bp of assembly processed by MAKER. You can ignore them, but they actually have a meaning in GFF3. Basically everything between two sets of ###'s are fully resolved. It allows programs that read GFF3 to parallelize file loading or just load sections of a file as they can rapidly identify "safe chunks". Without them the entire file must be loaded into memory in order to be certain that all feature parts are there (as there is no requirement for sorting or order in GFF3). log.child files will always be empty unless you run analysis like snap or blast. Thanks, Carson On 13-03-20 9:05 AM, "Krishnakumar, Vivek" wrote: >Hi, > >We have been using the MAKER pipeline here at JCVI to calculate AED >scores by feeding in our annotation set as `model_gff` and the protein >and EST evidence as `protein_gff` and `est_gff` respectively. Here is >the issue we are having: > >When running the above pipeline with protein2genome and est2genome >evidence generated earlier by MAKER, there are no problems calculating >the AED score. Normally this pipeline takes a little over 12 hours to >complete. 
> >But if we use our own evidence, AAT and Genewise aligned proteins for >`protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline >runs very very slow and the intermediary *.gff.ann file has many chunks >(separated by '###') that are completely empty. Our evidence in >formatted in the same way as est2genome or protein2genome (GFF file >with "expressed_sequence_match::match_part" or "protein_match::match_part" >features respectively) > >The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use >the default `max_dna_len` parameter used to split the large assemblies >into chunks. > >Investigating the master_datastore.log shows me that the scaffolds run >through without any issues and the chromosomes are still being processed. >For any of the chromosomes, investigating the 'run.log' file, one level >above 'theVoid' shows me how many "final.section" jobs were started and >how many finished. And in the case of all the chromosomes, it tells me >that everything that was started has finished. And the 'log.child.*' >files within `theVoid` are all empty. Also within `theVoid`, I'm >noticing that the "raw.section" and "evidence_*.gff" files are not >empty. But one thing that is surprising is that of all the >"final.section" files, only the one pertaining to the last chunk is >very large (proportional to the size of the evidnce), the rest are all >exactly the same size (exactly 331 bytes). > >I'm running MAKER in MPI mode spawning 48 processes on a high memory >machine with 64 available cores and 1TB of RAM. > >I hope I've been able to explain my situation clearly in this email. > >Any help is appreciated. >Thank you. 
> >Vivek _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From carsonhh at gmail.com Wed Mar 20 12:36:30 2013 From: carsonhh at gmail.com (Carson Holt) Date: Wed, 20 Mar 2013 13:36:30 -0400 Subject: [maker-devel] AED calculations using the MAKER pipeline In-Reply-To: Message-ID: On the few cases where I found this (if it is the same issue you are experiencing), it was very much dependent on the total size of the evidence database and the length of the contigs. For me it took about 25-50% longer, but used up 10-15x as much RAM (primarily because the contigs were very long > 50 Mb each). The issue was unnoticeable on the short contigs that are more typical of de novo annotation. Thanks, Carson On 13-03-20 9:54 AM, "Town, Christopher D." wrote: >Thanks. Is there any way of guestimating when this final step might be >completed. We are in a time crunch here to get this analysis finished and >the data/annotation out. > >Best > >Chris > >-----Original Message----- >From: Carson Holt [mailto:Carson.Holt at oicr.on.ca] >Sent: Wednesday, March 20, 2013 9:51 AM >To: Krishnakumar, Vivek; maker-devel at yandell-lab.org >Cc: Town, Christopher D.; Tang, Haibao; Bidwell, Shelby; Rosen, Benjamin >Subject: Re: AED calculations using the MAKER pipeline > >In the current MAKER download when using GFF3 passthrough there was an >issue with everything being done at the very last step. This of course >leads to a memory spike and a very slow last step. That seems to be >similar to what you are describing. It should be resolved in what will >become version 2.28. I can give you access to the pre-release code, so >you can check that the issue is resolved for you. I'll send details in a >separate e-mail. > >Also the ### will be printed after every ~100,000 bp of assembly >processed by MAKER. You can ignore them, but they actually have a >meaning in GFF3. 
>Basically everything between two sets of ###'s are fully resolved. It >allows programs that read GFF3 to parallelize file loading or just load >sections of a file as they can rapidly identify "safe chunks". Without >them the entire file must be loaded into memory in order to be certain >that all feature parts are there (as there is no requirement for sorting >or order in GFF3). > >log.child files will always be empty unless you run analysis like snap or >blast. > >Thanks, >Carson > > > > > > >On 13-03-20 9:05 AM, "Krishnakumar, Vivek" wrote: > >>Hi, >> >>We have been using the MAKER pipeline here at JCVI to calculate AED >>scores by feeding in our annotation set as `model_gff` and the protein >>and EST evidence as `protein_gff` and `est_gff` respectively. Here is >>the issue we are having: >> >>When running the above pipeline with protein2genome and est2genome >>evidence generated earlier by MAKER, there are no problems calculating >>the AED score. Normally this pipeline takes a little over 12 hours to >>complete. >> >>But if we use our own evidence, AAT and Genewise aligned proteins for >>`protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline >>runs very very slow and the intermediary *.gff.ann file has many chunks >>(separated by '###') that are completely empty. Our evidence in >>formatted in the same way as est2genome or protein2genome (GFF file >>with "expressed_sequence_match::match_part" or >>"protein_match::match_part" >>features respectively) >> >>The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use >>the default `max_dna_len` parameter used to split the large assemblies >>into chunks. >> >>Investigating the master_datastore.log shows me that the scaffolds run >>through without any issues and the chromosomes are still being processed. >>For any of the chromosomes, investigating the 'run.log' file, one level >>above 'theVoid' shows me how many "final.section" jobs were started and >>how many finished. 
And in the case of all the chromosomes, it tells me >>that everything that was started has finished. And the 'log.child.*' >>files within `theVoid` are all empty. Also within `theVoid`, I'm >>noticing that the "raw.section" and "evidence_*.gff" files are not >>empty. But one thing that is surprising is that of all the >>"final.section" files, only the one pertaining to the last chunk is >>very large (proportional to the size of the evidnce), the rest are all >>exactly the same size (exactly 331 bytes). >> >>I'm running MAKER in MPI mode spawning 48 processes on a high memory >>machine with 64 available cores and 1TB of RAM. >> >>I hope I've been able to explain my situation clearly in this email. >> >>Any help is appreciated. >>Thank you. >> >>Vivek > > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From ares711122 at gmail.com Thu Mar 21 19:08:45 2013 From: ares711122 at gmail.com (Hung-Wei Hsu) Date: Fri, 22 Mar 2013 08:08:45 +0800 Subject: [maker-devel] Directory structure is too deep! Message-ID: Hi MAKER developers, I found that the MAKER outputs of each contigs were located in separate deep directory. Can MAKER collect these outputs in one simple directory so that these results can be easily examined? Thanks a lot in advance. Warmest regards, Hung-Wei -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Thu Mar 21 21:07:23 2013 From: carsonhh at gmail.com (Carson Holt) Date: Thu, 21 Mar 2013 22:07:23 -0400 Subject: [maker-devel] Directory structure is too deep! In-Reply-To: Message-ID: You can use gff3_merge to collect them into a single file, or to keep them as separate files but in the same directory just use the standard linux copy command. Similarly you can use fasta_merge to collect the fasta files. 
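For the second option (separate files, one directory), the copy step can also be scripted so that it does not depend on how many directory levels the datastore uses. The following is a minimal Python sketch, not part of MAKER: the function name collect_gffs is hypothetical, skipping directories whose names start with "theVoid" is an assumption made so that temporary evidence files are not picked up, and it assumes the per-contig file names are unique and that the destination lies outside the directory being searched.

```python
import os
import shutil


def collect_gffs(datastore_root, dest_dir):
    """Copy every *.gff under datastore_root into dest_dir; return copied paths.

    Hypothetical helper for illustration only; not a MAKER script.
    """
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for dirpath, dirnames, filenames in os.walk(datastore_root):
        # Assumption: "theVoid.*" directories hold temporary files; prune them
        dirnames[:] = sorted(d for d in dirnames if not d.startswith("theVoid"))
        for name in sorted(filenames):
            if not name.endswith(".gff"):
                continue
            target = os.path.join(dest_dir, name)
            if os.path.exists(target):
                # Refuse to overwrite silently if two contigs share a file name
                raise FileExistsError(f"duplicate file name: {target}")
            shutil.copy2(os.path.join(dirpath, name), target)
            copied.append(target)
    return copied
```

A call such as `collect_gffs("genome.maker.output", "results")` would then leave one GFF file per contig in `results/`; the directory names here are placeholders.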
Example: > mkdir results > cp *.maker.output/*_datastore/*/*/*.gff results/ Thanks, Carson From: Hung-Wei Hsu Date: Thursday, 21 March, 2013 8:08 PM To: Subject: [maker-devel] Directory structure is too deep! Hi MAKER developers, I found that the MAKER outputs of each contigs were located in separate deep directory. Can MAKER collect these outputs in one simple directory so that these results can be easily examined? Thanks a lot in advance. Warmest regards, Hung-Wei _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From jason.stajich at gmail.com Fri Mar 22 01:12:55 2013 From: jason.stajich at gmail.com (Jason Stajich) Date: Thu, 21 Mar 2013 20:12:55 -1000 Subject: [maker-devel] failed gene prediction In-Reply-To: References: Message-ID: <59B5B965-7B15-449E-B42F-E41D4F448B6A@gmail.com> For fungi, I've put up some of the gene prediction parameters that I've built or trained, if that is helpful for you. https://github.com/hyphaltip/fungi-gene-prediction-params In the absence of any ESTs or RNA-Seq I also recommend generating a starting training set with CEGMA first and then training your predictors from there, except for GeneMark.hmm, which seems to do okay with self-training. Jason On Mar 18, 2013, at 10:49 AM, Carson Holt wrote: > You didn't supply any evidence or HMM files for gene predictors. Just raw assembly data by itself is insufficient for genome annotation. > > Here is some nice documentation for running MAKER --> http://gmod.org/wiki/MAKER_Tutorial_2012 > Here is a nice overview of genome annotation in general --> http://fasta.bioch.virginia.edu/cshl/pdf/12/ajm12/euk_genome_annotation_review.pdf > > Once you've gone through the documentation and examples, if you come across any questions just let us know. 
> > Thanks, > Carson > > > From: "Borhan, Hossein" > Date: Monday, 18 March, 2013 4:40 PM > To: > Subject: [maker-devel] failed gene prediction > > Hi > > I have tried maker on a fungus genome of 45 mb with 1/3 being repeat rich. It did not produce any prediction. I am not sure what is causing this. Attached are the STDERR and opts.ctl. I appreciate your help > > > Hossein > > > > > > > > > > _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org Jason Stajich jason.stajich at gmail.com jason at bioperl.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From ares711122 at gmail.com Fri Mar 22 02:52:25 2013 From: ares711122 at gmail.com (Hung-Wei Hsu) Date: Fri, 22 Mar 2013 15:52:25 +0800 Subject: [maker-devel] Can MAKER analyze the viral genome? Message-ID: Hi MAKER developers, I'm wondering if MAKER can deal with the viral genome. If yes, how do I set the running parameters? Thanks. Kind regards, Hung-Wei -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Sat Mar 23 18:42:39 2013 From: carsonhh at gmail.com (Carson Holt) Date: Sat, 23 Mar 2013 19:42:39 -0400 Subject: [maker-devel] Can MAKER analyze the viral genome? In-Reply-To: Message-ID: You can set organism type to prokaryotic and use the protein2genome option for annotation. It's not a perfect match as it only allows for partial gene spatial overlap and not full gene within a gene like you can see in viruses. Thanks, Carson From: Hung-Wei Hsu Date: Friday, 22 March, 2013 3:52 AM To: Subject: [maker-devel] Can MAKER analyze the viral genome? Hi MAKER developers, I'm wondering if MAKER can deal with the viral genome. 
If yes, how do I set the running parameters? Thanks. Kind regards, Hung-Wei _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From jjin01 at mail.rockefeller.edu Sat Mar 23 19:43:54 2013 From: jjin01 at mail.rockefeller.edu (Jingjing Jin) Date: Sun, 24 Mar 2013 00:43:54 +0000 Subject: [maker-devel] maker running error Message-ID: Dear all, When I run the maker, there is an error like this: *** buffer overflow detected ***: /usr/bin/perl terminated ======= Backtrace: ========= /lib64/libc.so.6(__fortify_fail+0x37)[0x3582d01d47] /lib64/libc.so.6[0x3582cffc30] /lib64/libc.so.6[0x3582cff089] /lib64/libc.so.6(__printf_fp+0x1531)[0x3582c4afa1] /lib64/libc.so.6(_IO_vfprintf+0x11a7)[0x3582c45407] /lib64/libc.so.6(__vsprintf_chk+0x9d)[0x3582cff12d] /lib64/libc.so.6(__sprintf_chk+0x7f)[0x3582cff06f] /usr/local/maker/lib/File/../../perl/lib/auto/Proc/ProcessTable/ProcessTable.so(OS_get_table+0x9bb)[0x7f328e8eb69b] /usr/local/maker/lib/File/../../perl/lib/auto/Proc/ProcessTable/ProcessTable.so(XS_Proc__ProcessTable_table+0x182)[0x7f328e8ecc02] /usr/lib64/perl5/CORE/libperl.so(Perl_pp_entersub+0x5a5)[0x35848a66d5] /usr/lib64/perl5/CORE/libperl.so(Perl_runops_standard+0x16)[0x35848a49c6] /usr/lib64/perl5/CORE/libperl.so(perl_run+0x338)[0x358484d0d8] /usr/bin/perl(main+0xec)[0x400cac] /lib64/libc.so.6(__libc_start_main+0xfd)[0x3582c1ecdd] /usr/bin/perl[0x400af9] ======= Memory map: ======== Could anyone give me some suggestion about how to deal with this problem? Thanks! Jingjing -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From carsonhh at gmail.com Sat Mar 23 22:04:49 2013 From: carsonhh at gmail.com (Carson Holt) Date: Sat, 23 Mar 2013 23:04:49 -0400 Subject: [maker-devel] maker running error In-Reply-To: Message-ID: Could you try maker version 2.27 from the website? Proc::ProcessTable may have problems on your system in accessing the process table. Version 2.27 tries to access the same information by first parsing the output of the standard 'df' command and only tries to access the process table directly if that fails. Thanks, Carson From: Jingjing Jin Date: Saturday, 23 March, 2013 8:43 PM To: "maker-devel at yandell-lab.org" Subject: [maker-devel] maker running error Dear all, When I run the maker, there is an error like this: *** buffer overflow detected ***: /usr/bin/perl terminated ======= Backtrace: ========= /lib64/libc.so.6(__fortify_fail+0x37)[0x3582d01d47] /lib64/libc.so.6[0x3582cffc30] /lib64/libc.so.6[0x3582cff089] /lib64/libc.so.6(__printf_fp+0x1531)[0x3582c4afa1] /lib64/libc.so.6(_IO_vfprintf+0x11a7)[0x3582c45407] /lib64/libc.so.6(__vsprintf_chk+0x9d)[0x3582cff12d] /lib64/libc.so.6(__sprintf_chk+0x7f)[0x3582cff06f] /usr/local/maker/lib/File/../../perl/lib/auto/Proc/ProcessTable/ProcessTable .so(OS_get_table+0x9bb)[0x7f328e8eb69b] /usr/local/maker/lib/File/../../perl/lib/auto/Proc/ProcessTable/ProcessTable .so(XS_Proc__ProcessTable_table+0x182)[0x7f328e8ecc02] /usr/lib64/perl5/CORE/libperl.so(Perl_pp_entersub+0x5a5)[0x35848a66d5] /usr/lib64/perl5/CORE/libperl.so(Perl_runops_standard+0x16)[0x35848a49c6] /usr/lib64/perl5/CORE/libperl.so(perl_run+0x338)[0x358484d0d8] /usr/bin/perl(main+0xec)[0x400cac] /lib64/libc.so.6(__libc_start_main+0xfd)[0x3582c1ecdd] /usr/bin/perl[0x400af9] ======= Memory map: ======== Could anyone give me some suggestion about how to deal with this problem? Thanks! 
Jingjing _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnuhn at ebi.ac.uk Mon Mar 25 07:18:11 2013 From: mnuhn at ebi.ac.uk (mnuhn) Date: Mon, 25 Mar 2013 12:18:11 +0000 Subject: [maker-devel] =?utf-8?q?master=5Fdatastore=5Findex=2Elog_file_shr?= =?utf-8?q?inks=2E?= In-Reply-To: References: Message-ID: <407ae892252062e886fb3855bb6bf74c@ebi.ac.uk> Thanks, this works and mpi maker is running now. Cheers, Michael. P.S.: If anyone is trying to reproduce this, I only had one directory in LD_PRELOAD and it didn't like the trailing colon, so I removed it to make it work: export LD_PRELOAD=/software/openmpi-1.4.3/lib/libmpi.so On 2013-03-19 15:22, Carson Holt wrote: > I have MAKER working under OpemnMPI 1.4.3 (intel compiled). > > I had to set a couple of environmental variables prior to setup. You > would > probably need to set these values as well. If you your OpenMPI path > was > here for example --> /software/openmpi-1.4.3/, run the following > commands > (path set accordingly) before even attempting maker setup. > > export OMPI_MCA_mpi_warn_on_fork 0 > export LD_PRELOAD /software/openmpi-1.4.3/lib/libmpi.so:$LD_PRELOAD > > These not only need to be set before compilation, but also before any > run > (so add them to you ~.bashrc or ~/.bash_profile or any module load > scripts > thanks). The LD_PRELOAD statement needs to be set for any program > using > OpenMPI's shared libraries and not just MAKER, so it's normally a > good > idea to have that set system wide for all users. The detail can be > found > in the OpenMPI documentation. Note sometimes system library updates > can > break OpenMPI's shared libraries while not breaking OpenMPI itself, > so you > might also need to recompile OpenMPI if it has broken shared > libraries. 
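The LD_PRELOAD setup described above can be sketched in shell as follows. This is only a minimal sketch, not the exact commands from the thread: the helper name prepend_path is hypothetical, and /software/openmpi-1.4.3 and /opt/other are example paths. It relies on the standard ${var:+...} parameter expansion so that no trailing colon is left when LD_PRELOAD starts out empty, which is the case reported in the P.S.

```shell
# prepend_path NEW CURRENT
# Prints NEW alone when CURRENT is empty, otherwise NEW:CURRENT.
# (Hypothetical helper, not part of MAKER or Open MPI.)
prepend_path() {
    printf '%s%s\n' "$1" "${2:+:$2}"
}

# Example path from this thread; adjust to the local Open MPI install.
OMPI_LIB=/software/openmpi-1.4.3/lib/libmpi.so

# Show the value that would be exported (nothing is exported here).
prepend_path "$OMPI_LIB" ""
prepend_path "$OMPI_LIB" "/opt/other/libfoo.so"
```

To apply it, lines such as export OMPI_MCA_mpi_warn_on_fork=0 and export LD_PRELOAD="$(prepend_path "$OMPI_LIB" "${LD_PRELOAD:-}")" would presumably go into ~/.bashrc or a module load script, before the perl Build.PL step and before every run.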
> > Once you have those commands in place, run the perl Buil.PL step. Say > yes > to install with MPI. Then run ./Build install > > Thanks, > Carson > > > > On 13-03-19 11:02 AM, "Carson Holt" wrote: > >>Try it with the no_locks option then. Make sure to let one instance >>finish populating the mpi_blastdb directory before running other >>instances >>as that is where most initial locking occurs. >> >>I'll send you more details on how to install with OpenMPI, so you can >>give >>that a shot while your jobs are also running serially (so you don't >> lose >>time). Also instead of 50 serial instances, you could try 10 with >> -cpus >>set to 5. >> >>Thanks, >>Carson >> >> >> >>On 13-03-19 11:19 AM, "Michael Nuhn" wrote: >> >>>Hello Carson! >>> >>>On 03/19/2013 02:27 PM, Carson Holt wrote: >>>> Yes. If at all possible use MPI. It removes the overhead of >>>> locks >>>> which happen per primary instance of MAKER. So one maker job >>>> using >>>>1000 >>>> cpus via MPI will have one shared set of locks. 1000 serial >>>> instances >>>> of MAKER on the other hand would have 1000x the locks. >>> >>>I don't know a thing about MPI. >>> >>>I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and >>> open >>>mpi and none of them worked for me. I also tried the automatic >>>installation that comes with maker, but it didn't work for me >>> either. >>> >>>If need be, I could spend time getting to the bottom of this, but >>> there >>>is no telling how long this would take me so I'd rather not, if >>> there is >>>an alternative. >>> >>>Would the approach I outlined before work? (Treating the split files >>> as >>>separate genomes to annotate and then combine the gffs afterwards) >>> >>>I also like this approach, because I would select a few contigs in >>> the >>>beginning which I would run on their own. They would complete early >>> and >>>this way I would get a preview of the results of the run instead of >>>having to wait for everything to complete. 
>>> >>>It might also be more robust, because file locking issues would be >>>confined to the instances working on a sequence chunk, but the rest >>> of >>>the instances could continue working. >>> >>>Cheers, >>>Michael. >>> >>>> Alternatively if you do need to continue without MPI for some >>>> reason, I >>>> just finished a devel version of MAKER that has a --no_locks >>>> option. >>>> You can never start two instances using the same input fasta >>>> when >>>> --no_locks is specified, but the splitting to use different input >>>>fastas >>>> I mentioned before in the example will still work fine. >>>> >>>> I also have updated the indexing/reindexing, so if indexing >>>> failures >>>> happen, MAKER will switch between the current working directory >>>> and the >>>> TMP= directory from the maker_opts.ctl file so as to try different >>>> IO >>>> locations (I.e. NFS and non-NFS). Note you should never set TMP= >>>> in >>>>the >>>> control files to an NFS mounted location (it not only makes things >>>> a >>>>lot >>>> slower, but berkleydb and sqllite will get frequent errors on >>>> NFS). >>>> TMP= defaults to /tmp when not specified >>>> >>>> I'll send you download information in a separate e-mail. Try a >>>> regular >>>> MAKER run to see if the indexing/reindexing changes are sufficient >>>> before attempting the ?no_locks option. >>>> >>>> Thanks, >>>> Carson From lengjingmao at gmail.com Mon Mar 25 08:49:11 2013 From: lengjingmao at gmail.com (shaohua.fan) Date: Mon, 25 Mar 2013 14:49:11 +0100 Subject: [maker-devel] maker terminated strangely Message-ID: Hi Maker developers, I met a problem when I was using Maker version 2.27 beta version that the pipeline terminated in the middle of the process without any error message. The genome I am working with is a Eukaryotic genome which is consisted by around 6000 scaffolds. 
I combined de novo (Augustus and SNAP) and evidence based (protein from a closely related species and transcriptome from the same species) for the gene prediction (the genome is already repeat masked). The MPI (mpich2 version 1.5) enabled maker was run on a cluster by using SGE. I checked with the administrator of our cluster, there is no limitation of SGE job. The maker was run by using mpiexec -n 48 /home/shafan/maker/bin/maker maker_opts.ctl maker_bopts.ctl maker_exe.ctl I attached my maker_opt.ctl, please let me know if you need any information for this problem. Thanks a lot! Shaohua -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: maker_opts.ctl Type: application/octet-stream Size: 4518 bytes Desc: not available URL: From carsonhh at gmail.com Mon Mar 25 09:01:45 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 25 Mar 2013 10:01:45 -0400 Subject: [maker-devel] maker terminated strangely In-Reply-To: Message-ID: Could you send your captured standard error. That would contain messages that highlight the specific cause. Thanks, Carson From: "shaohua.fan" Date: Monday, 25 March, 2013 9:49 AM To: Subject: [maker-devel] maker terminated strangely Hi Maker developers, I met a problem when I was using Maker version 2.27 beta version that the pipeline terminated in the middle of the process without any error message. The genome I am working with is a Eukaryotic genome which is consisted by around 6000 scaffolds. I combined de novo (Augustus and SNAP) and evidence based (protein from a closely related species and transcriptome from the same species) for the gene prediction (the genome is already repeat masked). The MPI (mpich2 version 1.5) enabled maker was run on a cluster by using SGE. I checked with the administrator of our cluster, there is no limitation of SGE job. 
The maker was run by using mpiexec -n 48 /home/shafan/maker/bin/maker maker_opts.ctl maker_bopts.ctl maker_exe.ctl I attached my maker_opt.ctl, please let me know if you need any information for this problem. Thanks a lot! Shaohua _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From lengjingmao at gmail.com Mon Mar 25 09:07:17 2013 From: lengjingmao at gmail.com (shaohua.fan) Date: Mon, 25 Mar 2013 15:07:17 +0100 Subject: [maker-devel] maker terminated strangely In-Reply-To: References: Message-ID: Hi Carson, Do you mean standard output from maker? If yes, I need to upload the file to our ftp server, since it is quite big around 1.1 Gb. Shaohua 2013/3/25 Carson Holt > Could you send your captured standard error. That would contain messages > that highlight the specific cause. > > Thanks, > Carson > > > From: "shaohua.fan" > Date: Monday, 25 March, 2013 9:49 AM > To: > Subject: [maker-devel] maker terminated strangely > > Hi Maker developers, > > I met a problem when I was using Maker version 2.27 beta version that the > pipeline terminated in the middle of the process without any error message. > > > The genome I am working with is a Eukaryotic genome which is consisted by > around 6000 scaffolds. I combined de novo (Augustus and SNAP) and evidence > based (protein from a closely related species and transcriptome from the > same species) for the gene prediction (the genome is already repeat > masked). The MPI (mpich2 version 1.5) enabled maker was run on a cluster by > using SGE. I checked with the administrator of our cluster, there is no > limitation of SGE job. 
> > The maker was run by using mpiexec -n 48 /home/shafan/maker/bin/maker > maker_opts.ctl maker_bopts.ctl maker_exe.ctl > > I attached my maker_opt.ctl, please let me know if you need any > information for this problem. > > Thanks a lot! > > Shaohua > _______________________________________________ maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon Mar 25 09:07:45 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 25 Mar 2013 10:07:45 -0400 Subject: [maker-devel] master_datastore_index.log file shrinks. In-Reply-To: <407ae892252062e886fb3855bb6bf74c@ebi.ac.uk> Message-ID: Great news. I'm glad it's working. If you have more questions, just let me know. --Carson On 13-03-25 8:18 AM, "mnuhn" wrote: >Thanks, this works and mpi maker is running now. > >Cheers, >Michael. > >P.S.: > >If anyone is trying to reproduce this, I only had one directory in >LD_PRELOAD and it didn't like the trailing colon, so I removed it to >make it work: > >export LD_PRELOAD=/software/openmpi-1.4.3/lib/libmpi.so > >On 2013-03-19 15:22, Carson Holt wrote: >> I have MAKER working under OpemnMPI 1.4.3 (intel compiled). >> >> I had to set a couple of environmental variables prior to setup. You >> would >> probably need to set these values as well. If you your OpenMPI path >> was >> here for example --> /software/openmpi-1.4.3/, run the following >> commands >> (path set accordingly) before even attempting maker setup. >> >> export OMPI_MCA_mpi_warn_on_fork 0 >> export LD_PRELOAD /software/openmpi-1.4.3/lib/libmpi.so:$LD_PRELOAD >> >> These not only need to be set before compilation, but also before any >> run >> (so add them to you ~.bashrc or ~/.bash_profile or any module load >> scripts >> thanks). 
The LD_PRELOAD statement needs to be set for any program >> using >> OpenMPI's shared libraries and not just MAKER, so it's normally a >> good >> idea to have that set system wide for all users. The detail can be >> found >> in the OpenMPI documentation. Note sometimes system library updates >> can >> break OpenMPI's shared libraries while not breaking OpenMPI itself, >> so you >> might also need to recompile OpenMPI if it has broken shared >> libraries. >> >> Once you have those commands in place, run the perl Buil.PL step. Say >> yes >> to install with MPI. Then run ./Build install >> >> Thanks, >> Carson >> >> >> >> On 13-03-19 11:02 AM, "Carson Holt" wrote: >> >>>Try it with the no_locks option then. Make sure to let one instance >>>finish populating the mpi_blastdb directory before running other >>>instances >>>as that is where most initial locking occurs. >>> >>>I'll send you more details on how to install with OpenMPI, so you can >>>give >>>that a shot while your jobs are also running serially (so you don't >>> lose >>>time). Also instead of 50 serial instances, you could try 10 with >>> -cpus >>>set to 5. >>> >>>Thanks, >>>Carson >>> >>> >>> >>>On 13-03-19 11:19 AM, "Michael Nuhn" wrote: >>> >>>>Hello Carson! >>>> >>>>On 03/19/2013 02:27 PM, Carson Holt wrote: >>>>> Yes. If at all possible use MPI. It removes the overhead of >>>>> locks >>>>> which happen per primary instance of MAKER. So one maker job >>>>> using >>>>>1000 >>>>> cpus via MPI will have one shared set of locks. 1000 serial >>>>> instances >>>>> of MAKER on the other hand would have 1000x the locks. >>>> >>>>I don't know a thing about MPI. >>>> >>>>I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and >>>> open >>>>mpi and none of them worked for me. I also tried the automatic >>>>installation that comes with maker, but it didn't work for me >>>> either. 
>>>> >>>>If need be, I could spend time getting to the bottom of this, but >>>> there >>>>is no telling how long this would take me so I'd rather not, if >>>> there is >>>>an alternative. >>>> >>>>Would the approach I outlined before work? (Treating the split files >>>> as >>>>separate genomes to annotate and then combine the gffs afterwards) >>>> >>>>I also like this approach, because I would select a few contigs in >>>> the >>>>beginning which I would run on their own. They would complete early >>>> and >>>>this way I would get a preview of the results of the run instead of >>>>having to wait for everything to complete. >>>> >>>>It might also be more robust, because file locking issues would be >>>>confined to the instances working on a sequence chunk, but the rest >>>> of >>>>the instances could continue working. >>>> >>>>Cheers, >>>>Michael. >>>> >>>>> Alternatively if you do need to continue without MPI for some >>>>> reason, I >>>>> just finished a devel version of MAKER that has a --no_locks >>>>> option. >>>>> You can never start two instances using the same input fasta >>>>> when >>>>> --no_locks is specified, but the splitting to use different input >>>>>fastas >>>>> I mentioned before in the example will still work fine. >>>>> >>>>> I also have updated the indexing/reindexing, so if indexing >>>>> failures >>>>> happen, MAKER will switch between the current working directory >>>>> and the >>>>> TMP= directory from the maker_opts.ctl file so as to try different >>>>> IO >>>>> locations (I.e. NFS and non-NFS). Note you should never set TMP= >>>>> in >>>>>the >>>>> control files to an NFS mounted location (it not only makes things >>>>> a >>>>>lot >>>>> slower, but berkleydb and sqllite will get frequent errors on >>>>> NFS). >>>>> TMP= defaults to /tmp when not specified >>>>> >>>>> I'll send you download information in a separate e-mail. 
Try a >>>>> regular >>>>> MAKER run to see if the indexing/reindexing changes are sufficient >>>>> before attempting the ?no_locks option. >>>>> >>>>> Thanks, >>>>> Carson > From carsonhh at gmail.com Mon Mar 25 09:08:17 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 25 Mar 2013 10:08:17 -0400 Subject: [maker-devel] maker terminated strangely In-Reply-To: Message-ID: Yes. Thanks, Carson From: "shaohua.fan" Date: Monday, 25 March, 2013 10:07 AM To: Carson Holt Cc: Subject: Re: [maker-devel] maker terminated strangely Hi Carson, Do you mean standard output from maker? If yes, I need to upload the file to our ftp server, since it is quite big around 1.1 Gb. Shaohua 2013/3/25 Carson Holt > Could you send your captured standard error. That would contain messages that > highlight the specific cause. > > Thanks, > Carson > > > From: "shaohua.fan" > Date: Monday, 25 March, 2013 9:49 AM > To: > Subject: [maker-devel] maker terminated strangely > > Hi Maker developers, > > I met a problem when I was using Maker version 2.27 beta version that the > pipeline terminated in the middle of the process without any error message. > > > The genome I am working with is a Eukaryotic genome which is consisted by > around 6000 scaffolds. I combined de novo (Augustus and SNAP) and evidence > based (protein from a closely related species and transcriptome from the same > species) for the gene prediction (the genome is already repeat masked). The > MPI (mpich2 version 1.5) enabled maker was run on a cluster by using SGE. I > checked with the administrator of our cluster, there is no limitation of SGE > job. > > The maker was run by using mpiexec -n 48 /home/shafan/maker/bin/maker > maker_opts.ctl maker_bopts.ctl maker_exe.ctl > > I attached my maker_opt.ctl, please let me know if you need any information > for this problem. > > Thanks a lot! 
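Since the captured output in this thread is around 1.1 Gb, it may be easier to pull out only the error lines than to upload the whole file. The following is a sketch under assumptions: maker_errors is a hypothetical helper, and the patterns ERROR and FAILED are guesses based on the kind of messages MAKER prints when a contig fails (for example "ERROR: Chunk failed" and "FAILED CONTIG").

```shell
# maker_errors LOGFILE [N]
# Prints the last N (default 50) lines of LOGFILE that contain ERROR or FAILED,
# each prefixed with its line number in the log.
# (Hypothetical helper, not shipped with MAKER.)
maker_errors() {
    grep -n -E 'ERROR|FAILED' "$1" | tail -n "${2:-50}"
}

# Standard error can also be kept apart from standard output at launch time
# with ordinary shell redirection (file names here are arbitrary), e.g.:
#   mpiexec -n 48 /home/shafan/maker/bin/maker maker_opts.ctl maker_bopts.ctl \
#       maker_exe.ctl > maker.out 2> maker.err
# and then inspected with:
#   maker_errors maker.err
```

Whether every MPI rank's standard error ends up in that one file depends on the launcher and the scheduler, so the scheduler's own error file is worth checking as well.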
>
> Shaohua
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ares711122 at gmail.com Mon Mar 25 21:50:52 2013
From: ares711122 at gmail.com (Hung-Wei Hsu)
Date: Tue, 26 Mar 2013 10:50:52 +0800
Subject: [maker-devel] Why are some start positions minus in the gff result?
Message-ID: 

Hi MAKER developers,

I could successfully run MAKER and get the final GFF, but I found that some start positions in the GFF were negative. That led to an error in the GFF reader. Is this a bug? Could you please help to resolve this problem?

Thanks a lot in advance.

Best regards,
Hung-Wei

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From carsonhh at gmail.com Mon Mar 25 22:24:01 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 25 Mar 2013 23:24:01 -0400
Subject: [maker-devel] Why are some start positions minus in the gff result?
In-Reply-To: 
Message-ID: 

I haven't seen that before, so could you package up the job (all input and control files) that generates this and send it to me? You're using MAKER's prokaryotic settings to try to get it to annotate viral genomes, correct?

--Carson

From: Hung-Wei Hsu
Date: Monday, 25 March, 2013 10:50 PM
To: 
Subject: [maker-devel] Why are some start positions minus in the gff result?

Hi MAKER developers,

I could successfully run MAKER and get the final GFF, but I found that some start positions in the GFF were negative. That led to an error in the GFF reader. Is this a bug? Could you please help to resolve this problem?

Thanks a lot in advance.

Best regards,
Hung-Wei

_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

-------------- next part --------------
An HTML attachment was scrubbed...
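To narrow down a report like the one above, the offending features can be listed before packaging up the job. A minimal sketch, assuming a tab-separated GFF3 file in which column 4 is the start coordinate (GFF3 coordinates are 1-based, so any start below 1 is invalid); gff_bad_starts is a hypothetical helper and the two demo records are made up for illustration.

```shell
# gff_bad_starts FILE
# Prints GFF3 feature lines whose start coordinate (column 4) is less than 1.
# Comment/directive lines and any embedded FASTA lines are skipped because they
# either start with "#" or do not have nine tab-separated columns.
# (Hypothetical helper, not part of MAKER.)
gff_bad_starts() {
    awk -F'\t' '!/^#/ && NF >= 9 && $4 ~ /^-?[0-9]+$/ && $4 + 0 < 1' "$1"
}

# Tiny made-up demo file: one record with a negative start, one valid record.
demo=$(mktemp)
printf 'contig1\tmaker\tgene\t-12\t340\t.\t+\t.\tID=geneA\n'  > "$demo"
printf 'contig1\tmaker\tgene\t400\t900\t.\t-\t.\tID=geneB\n' >> "$demo"
gff_bad_starts "$demo"
rm -f "$demo"
```

Knowing which contigs and feature types carry the negative starts should make the failing case much smaller to reproduce.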
URL: 

From hudarul at yahoo.com Sun Mar 31 15:02:04 2013
From: hudarul at yahoo.com (Hud Hud)
Date: Sun, 31 Mar 2013 13:02:04 -0700 (PDT)
Subject: [maker-devel] Help on error-Repeat masker
Message-ID: <1364760124.37890.YahooMailNeo@web164901.mail.bf1.yahoo.com>

Hello,

I have a problem when running MAKER: I get the error below. What could be going wrong here? Thanks so much.

setting up GFF3 output and fasta chunks
doing repeat masking
running repeat masker.
#--------- command -------------#
Widget::RepeatMasker:
cd /tmp/maker_WOVHsi; /home/maker-2.27-beta/maker/exe/RepeatMasker/RepeatMasker /home/maker-2.27-beta/maker/data/contig.maker.output/contig_datastore/61/0D/contig172//theVoid.contig172/contig172.0.simple.rb -dir /home/maker-2.27-beta/maker/data/contig.maker.output/contig_datastore/61/0D/contig172//theVoid.contig172 -pa 1 -lib /tmp/maker_WOVHsi/b1piBcWHlH
#-------------------------------#
sh: /home/maker-2.27-beta/maker/exe/RepeatMasker/RepeatMasker: /u1/local/bin/perl: bad interpreter: Permission denied
ERROR: RepeatMasker failed
--> rank=NA, hostname=Homis
ERROR: Failed while doing repeat masking
ERROR: Chunk failed at level:0, tier_type:1
FAILED CONTIG:contig172

ERROR: Chunk failed at level:2, tier_type:0
FAILED CONTIG:172

examining contents of the fasta file and run log

-------------- next part --------------
An HTML attachment was scrubbed...
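The "bad interpreter: Permission denied" message above means the shell could not execute the program named on the first ("#!") line of the RepeatMasker script, here /u1/local/bin/perl. A minimal diagnostic sketch follows; shebang_interp and check_interp are hypothetical helpers, not part of MAKER or RepeatMasker.

```shell
# shebang_interp FILE
# Prints the interpreter path named on FILE's "#!" line (nothing if there is none).
shebang_interp() {
    head -n 1 "$1" | sed -n 's/^#![[:space:]]*\([^[:space:]]*\).*/\1/p'
}

# check_interp FILE
# Reports whether the current user can execute that interpreter.
check_interp() {
    interp=$(shebang_interp "$1")
    if [ -x "$interp" ]; then
        echo "ok: $interp"
    else
        echo "cannot execute: $interp"
    fi
}

# Usage against the script from the error message (path taken from the log):
#   check_interp /home/maker-2.27-beta/maker/exe/RepeatMasker/RepeatMasker
```

If the interpreter is reported as not executable, the likely fix is to point the script's first line at a perl the current user can run (for example the one printed by command -v perl), or to fix the permissions on the listed path.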
URL: From kenlee.nakasugi at sydney.edu.au Sun Mar 3 16:44:01 2013 From: kenlee.nakasugi at sydney.edu.au (Kenlee Nakasugi) Date: Mon, 04 Mar 2013 10:44:01 +1100 Subject: [maker-devel] regarding mpich In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6E7062FF@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6E7062FF@CHIMBX5.ad.uillinois.edu> Message-ID: <1362354241.2252.38.camel@waterhouse874-8> Hi, I'm trying to install mpi_maker (Maker 2.1) on a new system (intel ix x86_64) which has mpich v3.0.2 installed, but I can't seem to get maker Build.PL to recognize it. I tried editing the Build.pm file to point to it, but with no success. All dependencies have been installed and successfully recognized, it is just MPI support that is not. Is there anything I could modify in the install scripts to make it recognize this? Currently, the directly path to where the mpicc and mpiexec are is /apps/mpich/3.0.2/bin I don't have sys admin rights for the machine, and I'm not sure if this version of mpich was installed for shared libraries as per the GMOD tutorial. But I have previously circumvented this with an earlier version of mpich by modifying the Build.pm module with success. I'm wondering if mpichv3.02 is not compatible? Cheers, Ken -- Kenlee Nakasugi | Research Fellow School of Molecular Bioscience Level 8, SMB Building (G08)| The University of Sydney | NSW | 2006 T: +61 2 9114 1321 E: kenlee.nakasugi at sydney.edu.au -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon Mar 4 06:35:03 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 04 Mar 2013 08:35:03 -0500 Subject: [maker-devel] regarding mpich In-Reply-To: <1362354241.2252.38.camel@waterhouse874-8> Message-ID: Use the last MPICH2 version, as MPICH3 is very different (it's the first attempt to implement the new MPI3 protocol set, and not just a version update). Alternatively you can use OpenMPI. 
Also use maker version 2.27 instead for MPI. Thanks, Carson From: Kenlee Nakasugi Date: Sunday, 3 March, 2013 6:44 PM To: "maker-devel at yandell-lab.org List" Subject: [maker-devel] regarding mpich Hi, I'm trying to install mpi_maker (Maker 2.1) on a new system (intel ix x86_64) which has mpich v3.0.2 installed, but I can't seem to get maker Build.PL to recognize it. I tried editing the Build.pm file to point to it, but with no success. All dependencies have been installed and successfully recognized, it is just MPI support that is not. Is there anything I could modify in the install scripts to make it recognize this? Currently, the directly path to where the mpicc and mpiexec are is /apps/mpich/3.0.2/bin I don't have sys admin rights for the machine, and I'm not sure if this version of mpich was installed for shared libraries as per the GMOD tutorial. But I have previously circumvented this with an earlier version of mpich by modifying the Build.pm module with success. I'm wondering if mpichv3.02 is not compatible? Cheers, Ken -- Kenlee Nakasugi | Research Fellow School of Molecular Bioscience Level 8, SMB Building (G08)| The University of Sydney | NSW | 2006 T: +61 2 9114 1321 E: kenlee.nakasugi at sydney.edu.au _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From canchaya at uvigo.es Mon Mar 4 04:10:26 2013 From: canchaya at uvigo.es (Carlos A. Canchaya) Date: Mon, 4 Mar 2013 12:10:26 +0100 Subject: [maker-devel] Sharing benchmarks of maker References: <6472D2A0-7BA8-41F0-ACFD-4D3C800D36FB@uvigo.es> Message-ID: <7F41714C-6C75-4892-AA5B-D7649DDA7DF2@uvigo.es> Hi, I've just install maker2 in our server and run a first test with our data. 
The input was about 30 000 sequences (9.6 Mb) and it was run in just one server with 32 processors for 36 hours) with mpich2. Our server has 250 Gb of memory and cpus of 2,4 Gb. The test was simple because it only ran repeatmasker and SNAP. Considering that we would like to use other gene prediction/annotation tools available in MAKER, I wonder if you can share some of your benchmarks in order to know if we could scale up pretty well to our production cluster in order to annotate our 1.6 Gb draft genome Best, Carlos Carlos A. Canchaya, PhD IPP Research Fellow Department of Biochemistry, Genetics and Immunology Faculty of Biology Campus Universitario University of Vigo 36310 Vigo Spain http://darwin.uvigo.es/~ccanchaya/ email: canchaya at uvigo.es Tel : +34 986 130048 Fax: +34 986 812556 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon Mar 4 08:12:06 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 04 Mar 2013 10:12:06 -0500 Subject: [maker-devel] Sharing benchmarks of maker In-Reply-To: <7F41714C-6C75-4892-AA5B-D7649DDA7DF2@uvigo.es> Message-ID: Performance is highly dependent on the size of evidence datasets used (proteins/ESTs) as well as the IO performance of a system when running via MPI (you can hit IO bottlenecks well before cpu bottlenecks depending on cluster configuration). The Arabidopsis genome (120Mb assembly) running SNAP and Augustus, 1.1Gb EST dataset, and 10Mb protein dataset takes ~1 hour 30 min on 1,500 cpus with OpenMPI. The Maize genome (2.1 Gb) running SNAP and Augustus, 3Gb EST dataset, and 16 Mb protein dataset takes ~4 hours 30 min on 2200 cpus. A human sized genome would take 5-6 days on 100 cpus. MAKER is fully restartable (keeps log of progress). So if there is any failure or the user kills it in the middle of a job, it will pick up at the point it left off on restart (so you don't waste all that processing time). 
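For a rough sense of scale, the wall-clock figures above can be converted to total cpu-hours. This is a back-of-the-envelope sketch only: cpu_hours is a hypothetical helper, the human-sized figure uses the 5-day lower bound, and since run time depends heavily on evidence size and IO, these totals should not be extrapolated to another genome without care.

```shell
# cpu_hours WALL_HOURS CPUS
# Prints total cpu-hours, i.e. wall-clock hours multiplied by the cpu count.
# (Hypothetical helper for illustration.)
cpu_hours() {
    awk -v h="$1" -v n="$2" 'BEGIN { printf "%.0f\n", h * n }'
}

cpu_hours 1.5 1500   # Arabidopsis, 120 Mb: 1 h 30 min on 1,500 cpus
cpu_hours 4.5 2200   # Maize, 2.1 Gb: 4 h 30 min on 2,200 cpus
cpu_hours 120 100    # human-sized genome: 5 days (lower bound) on 100 cpus
```

The three totals differ by far less than the genome sizes do, which is consistent with the point that evidence size and parallel efficiency matter as much as assembly size.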
2Gb of RAM per processing core is recommended when parallelizing MAKER via MPI, but fragmented genomes with smaller contigs can get by with less than 1Gb per core. MAKER version 2.28 which has additional optimization for OpenMPI and lower memory footprint will be available in a couple of weeks. Until then 2.27 is recommended over 2.1 for MPI. 2.27 should also work with OpenMPI. 2.1 only works with older versions of MPICH2 using the mpd launcher and not the current hydra launcher. Thanks, Carson From: "Carlos A. Canchaya" Date: Monday, 4 March, 2013 6:10 AM To: Subject: [maker-devel] Sharing benchmarks of maker Hi, I've just install maker2 in our server and run a first test with our data. The input was about 30 000 sequences (9.6 Mb) and it was run in just one server with 32 processors for 36 hours) with mpich2. Our server has 250 Gb of memory and cpus of 2,4 Gb. The test was simple because it only ran repeatmasker and SNAP. Considering that we would like to use other gene prediction/annotation tools available in MAKER, I wonder if you can share some of your benchmarks in order to know if we could scale up pretty well to our production cluster in order to annotate our 1.6 Gb draft genome Best, Carlos Carlos A. Canchaya, PhD IPP Research Fellow Department of Biochemistry, Genetics and Immunology Faculty of Biology Campus Universitario University of Vigo 36310 Vigo Spain http://darwin.uvigo.es/~ccanchaya/ email: canchaya at uvigo.es Tel : +34 986 130048 Fax: +34 986 812556 > _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From carsonhh at gmail.com Mon Mar 4 08:33:02 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 04 Mar 2013 10:33:02 -0500
Subject: [maker-devel] Sharing benchmarks of maker
In-Reply-To: 
Message-ID: 

For the Arabidopsis genome it also took ~2 hours 10 min on 600 cpus, so there was only a 40 min gain by going from 600 to 1,500 cpus. This is because assembly structure has a lot to do with the efficiency of the parallelization, so you can hit a point of diminishing returns on some assemblies sooner than others.

--Carson

From: Carson Holt
Date: Monday, 4 March, 2013 10:12 AM
To: "Carlos A. Canchaya" ,
Subject: Re: [maker-devel] Sharing benchmarks of maker

Performance is highly dependent on the size of evidence datasets used (proteins/ESTs) as well as the IO performance of a system when running via MPI (you can hit IO bottlenecks well before cpu bottlenecks depending on cluster configuration).

The Arabidopsis genome (120Mb assembly) running SNAP and Augustus, 1.1Gb EST dataset, and 10Mb protein dataset takes ~1 hour 30 min on 1,500 cpus with OpenMPI. The Maize genome (2.1 Gb) running SNAP and Augustus, 3Gb EST dataset, and 16 Mb protein dataset takes ~4 hours 30 min on 2200 cpus. A human sized genome would take 5-6 days on 100 cpus.

MAKER is fully restartable (keeps log of progress). So if there is any failure or the user kills it in the middle of a job, it will pick up at the point it left off on restart (so you don't waste all that processing time).

2Gb of RAM per processing core is recommended when parallelizing MAKER via MPI, but fragmented genomes with smaller contigs can get by with less than 1Gb per core.

MAKER version 2.28 which has additional optimization for OpenMPI and lower memory footprint will be available in a couple of weeks. Until then 2.27 is recommended over 2.1 for MPI. 2.27 should also work with OpenMPI. 2.1 only works with older versions of MPICH2 using the mpd launcher and not the current hydra launcher.

Thanks,
Carson

From: "Carlos A.
Canchaya" Date: Monday, 4 March, 2013 6:10 AM To: Subject: [maker-devel] Sharing benchmarks of maker Hi, I've just install maker2 in our server and run a first test with our data. The input was about 30 000 sequences (9.6 Mb) and it was run in just one server with 32 processors for 36 hours) with mpich2. Our server has 250 Gb of memory and cpus of 2,4 Gb. The test was simple because it only ran repeatmasker and SNAP. Considering that we would like to use other gene prediction/annotation tools available in MAKER, I wonder if you can share some of your benchmarks in order to know if we could scale up pretty well to our production cluster in order to annotate our 1.6 Gb draft genome Best, Carlos Carlos A. Canchaya, PhD IPP Research Fellow Department of Biochemistry, Genetics and Immunology Faculty of Biology Campus Universitario University of Vigo 36310 Vigo Spain http://darwin.uvigo.es/~ccanchaya/ email: canchaya at uvigo.es Tel : +34 986 130048 Fax: +34 986 812556 > _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.comhttp://box290.bluehost.com/mailman/listinfo/m aker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From kenlee.nakasugi at sydney.edu.au Mon Mar 4 13:50:27 2013 From: kenlee.nakasugi at sydney.edu.au (Kenlee Nakasugi) Date: Mon, 4 Mar 2013 20:50:27 +0000 Subject: [maker-devel] regarding mpich In-Reply-To: References: <1362354241.2252.38.camel@waterhouse874-8>, Message-ID: Thanks Carson. Will Maker 2.27 be able to continue analysis on Maker 2.1 files that stopped halfway? Thanks Ken On 05/03/2013, at 1:44 AM, "Carson Holt" > wrote: Use the last MPICH2 version, as MPICH3 is very different (it's the first attempt to implement the new MPI3 protocol set, and not just a version update). Alternatively you can use OpenMPI. Also use maker version 2.27 instead for MPI. 
From dsth at ebi.ac.uk Mon Mar 4 13:57:01 2013
From: dsth at ebi.ac.uk (Daniel Hughes)
Date: Mon, 4 Mar 2013 20:57:01 +0000
Subject: [maker-devel] regarding mpich
In-Reply-To:
References: <1362354241.2252.38.camel@waterhouse874-8>
Message-ID:

Unlikely. Probably safer to export what has finished as gff and run it as re-annotation if you don't want to waste what was already processed for running additional iterations.

Dan from me phone...

From carsonhh at gmail.com Mon Mar 4 13:58:21 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 04 Mar 2013 15:58:21 -0500
Subject: [maker-devel] regarding mpich
In-Reply-To:
Message-ID:

Some files it can reuse, but not all. So, exporting finished contigs with GFF3 pass-through is an option.

--Carson
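Before exporting finished contigs as suggested above, it helps to know which contigs actually finished in the interrupted run. The following is a minimal sketch, not MAKER's own tooling. It assumes (this is not stated in the thread) that the run directory holds a master datastore index log with one tab-separated record per line in the form contig name, datastore path, status, and that a later record for the same contig supersedes an earlier one. The function name finished_contigs is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list contigs whose most recent status in a MAKER
# master datastore index log is FINISHED.
# ASSUMED log format (not confirmed in this thread):
#   <contig name> TAB <datastore path> TAB <status>
finished_contigs() {
    awk -F '\t' '
        # Remember the last status seen for each contig name.
        { status[$1] = $3 }
        END {
            for (name in status)
                if (status[name] == "FINISHED") print name
        }
    ' "$1" | sort    # awk iteration order is unspecified, so sort the names
}

# Possible usage (file name is an assumption):
#   finished_contigs mygenome_master_datastore_index.log > finished.txt
```

The resulting list only says which contigs are candidates for export; the GFF3 files themselves still have to be collected from the datastore and passed back to MAKER as described in the replies above.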
From kenlee.nakasugi at sydney.edu.au Mon Mar 4 18:49:09 2013
From: kenlee.nakasugi at sydney.edu.au (Kenlee Nakasugi)
Date: Tue, 05 Mar 2013 12:49:09 +1100
Subject: [maker-devel] hex char:29 error with Signal.pm
Message-ID: <1362448149.6346.46.camel@waterhouse874-8>

Hi again,

I'm running into the following error when I run maker 2.1:

##
Ran into unknown state (hex char: 29) at /home/programs/maker/lib/File/..//Proc/Signal.pm line 94.
##

I tried applying the patch as described here:
http://gmod.827538.n3.nabble.com/cluster-error-running-maker-td4022354.html

Using the command:
$ patch -np1 < 646785-and-handle-Hex29.patch

I did this in maker/lib/Proc and maker/lib/Process directories, but am getting this error:

##
patch: **** Only garbage was found in the patch input.
##

Apparently, this isn't a fatal error:
http://gmod.827538.n3.nabble.com/Ran-into-unknown-state-hex-char-29-at-home-a200302-maker-2-10-lib-File-Proc-Signal-pm-line-94-td3034795.html
and I might eventually have to run the latest version of Maker, but I need to continue a previous analysis, and not having this constant error would be great.

The version of Proc::ProcessTable is already the latest, 0.47.
The platform is ix x86_64 GNU/Linux

Thanks,
Ken

--
Kenlee Nakasugi | Research Fellow
School of Molecular Bioscience
Level 8, SMB Building (G08)| The University of Sydney | NSW | 2006
T: +61 2 9114 1321
E: kenlee.nakasugi at sydney.edu.au

From carsonhh at gmail.com Mon Mar 4 21:48:17 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 04 Mar 2013 23:48:17 -0500
Subject: [maker-devel] hex char:29 error with Signal.pm
In-Reply-To: <1362448149.6346.46.camel@waterhouse874-8>
Message-ID:

This is an issue with Proc::ProcessTable on some systems. If you upgrade to MAKER 2.27 it goes away because it no longer uses Proc::ProcessTable.
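On the "Only garbage was found in the patch input" message quoted above: GNU patch prints it when it finds nothing it recognises as a diff. Two possible causes, neither confirmed in this thread, are that the saved file is not a plain diff (for example an HTML page saved from the web archive), or that the lowercase -n in -np1 makes GNU patch read the input as a "normal" diff (-n is --normal; the uppercase -N is --forward). The sketch below is a hypothetical pre-check, assuming the patch is meant to be a unified diff; the function name looks_like_unified_diff is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) only if the file contains the three
# line prefixes that every unified diff has: "--- ", "+++ " and "@@ ".
looks_like_unified_diff() {
    grep -q '^--- ' "$1" &&
    grep -q '^+++ ' "$1" &&
    grep -q '^@@ ' "$1"
}

# Possible usage (not run here; file name taken from the message above):
#   looks_like_unified_diff 646785-and-handle-Hex29.patch \
#       || echo "not a unified diff; check how the file was downloaded"
#   # GNU patch can also rehearse the change without touching any file:
#   patch -p1 --dry-run < 646785-and-handle-Hex29.patch
```

If the check passes, rerunning without the lowercase -n lets patch detect the diff format on its own.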
Thanks,
Carson

From Carson.Holt at oicr.on.ca Wed Mar 6 10:45:40 2013
From: Carson.Holt at oicr.on.ca (Carson Holt)
Date: Wed, 6 Mar 2013 17:45:40 +0000
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
Message-ID:

The failed thread is usually just a symptom. There is something causing the thread to fail. Could you send me your STDERR? Oftentimes there is a warning or error further up.
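One way to look "further up" is to capture STDERR to a file and print the earliest lines that mention a problem. This is a minimal sketch under stated assumptions: the log file name, the redirection, and the function name first_problems are illustrative, and the keyword list (ERROR, WARNING, FATAL) is a guess at what MAKER and Perl emit, not a documented format.

```shell
#!/bin/sh
# Hypothetical helper: print the first N lines (default 5) of a captured log
# that mention ERROR, WARNING or FATAL, each prefixed with its line number.
first_problems() {
    grep -n -E 'ERROR|WARNING|FATAL' "$1" | head -n "${2:-5}"
}

# Possible capture step (the mpiexec command line appears later in this
# thread; the redirection target is an assumption):
#   mpiexec -n 8 maker -debug 2> maker.stderr.log
#   first_problems maker.stderr.log
```

The earliest hit is usually more informative than the final "Thread terminated" line, which only reports that some worker died.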
Thanks,
Carson

From ramonfallon at gmail.com Wed Mar 6 10:34:59 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Wed, 6 Mar 2013 18:34:59 +0100
Subject: [maker-devel] thread terminated, causing all processes to fail
Message-ID:

Hi,

I'm using the maker_svn rev 995 version and hand-compiled MPICH2 on a single multicore machine.

I've successfully run the dpp_contig.fasta (MPI/8 processes) example but am having trouble with larger contigs fasta files of my own, which are well formed.

I've run into a problem whereby an mpiexec run of 8 processes will stop due to a perl-thread related problem which says

FATAL: Thread terminated, causing all processes to fail

this corresponds to line 924 in the maker executable (which is for the secondary/worker threads), and is the result of a test on !$thr OR'd with !$thr->is_running, so clearly one of these is failing.

$thr itself is a threads->new(\&$node_thread, $gdbfile). Despite being a programmer, I've only recently started to look at the code and have not got the hang of the parallelisation setup here, though I gather the master must use threads to initially generate the parallel instances which then use the message passing. Of course threads don't have message passing ability, so I guess something clever is going on and will take some time for me to understand.

Clearly however, it has worked before on dpp_contigs, so it may be that something is wrong with my datafile or the way I am carrying out the analysis.

Any clues that can be put my way are welcome.

Thank you!

From ramonfallon at gmail.com Wed Mar 6 10:57:12 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Wed, 6 Mar 2013 18:57:12 +0100
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
References:
Message-ID:

Hi,

Many thanks for your quick reply and hint.

Yes, you're right .. further up there is indeed

Calling FastaDB::new at /opt/src/maker_svn/bin/../lib/FastaSeq.pm line 148 thread 1.
Thread 1 terminated abnormally: ERROR: Could not reestablish DB to thaw FastaSeq for Storable
--> rank=5, hostname=fatnode, at /opt/src/maker_svn/bin/maker line 1457 thread 1.

I run a "script" session and have maker on -debug so I have everything in one file. Do you prefer to have it attached to a post to this mailing list (if it accepts txt attachments)?

Cheers.

From Carson.Holt at oicr.on.ca Wed Mar 6 11:04:30 2013
From: Carson.Holt at oicr.on.ca (Carson Holt)
Date: Wed, 6 Mar 2013 18:04:30 +0000
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
Message-ID:

If you do reply all to this message, I should get the attachment. It will be stripped from the one going to the list though.
Thanks,
Carson

From ramonfallon at gmail.com Wed Mar 6 11:15:01 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Wed, 6 Mar 2013 19:15:01 +0100
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
References:
Message-ID:

OK great, here goes .. many thanks!

-------------- next part --------------
A non-text attachment was scrubbed...
Name: rf_mkr_run.scriptlog.zip
Type: application/zip
Size: 7598 bytes
Desc: not available
URL:

From Carson.Holt at oicr.on.ca Wed Mar 6 11:22:38 2013
From: Carson.Holt at oicr.on.ca (Carson Holt)
Date: Wed, 6 Mar 2013 18:22:38 +0000
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
Message-ID:

Could you delete your ../*maker.output/mpi_blastdb directory, and then when rerunning maker, run with the -a flag.

Thanks,
Carson

From ramonfallon at gmail.com Wed Mar 6 11:49:46 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Wed, 6 Mar 2013 19:49:46 +0100
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
References:
Message-ID:

OK, will do.

Will get back to you tomorrow on it.

Many thanks!
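Carson's request above (delete the mpi_blastdb directory under the *maker.output directory, then rerun with the -a flag) can be sketched as a small shell helper. This is only an illustration: the function name clear_mpi_blastdb is hypothetical, and it assumes the output directories sit directly under the directory passed in and are named with a .maker.output suffix, as in the path quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: delete the mpi_blastdb directory inside every
# *.maker.output directory directly under the given directory (default ".").
clear_mpi_blastdb() {
    for out in "${1:-.}"/*.maker.output; do
        # Skip when the glob matched nothing or there is no mpi_blastdb.
        [ -d "$out/mpi_blastdb" ] || continue
        rm -r -- "$out/mpi_blastdb"
        echo "removed $out/mpi_blastdb"
    done
}

# Possible usage from the run directory (flags as quoted in this thread):
#   clear_mpi_blastdb .
#   mpiexec -n 8 maker -a -debug
```

Only the mpi_blastdb directory is touched; everything else under the output directory is left in place.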
URL: From ramonfallon at gmail.com Thu Mar 7 07:40:53 2013 From: ramonfallon at gmail.com (=?ISO-8859-1?Q?Ram=F3n_Fallon?=) Date: Thu, 7 Mar 2013 15:40:53 +0100 Subject: [maker-devel] thread terminated, causing all processes to fail In-Reply-To: References: Message-ID: Hi Carson, I send you a zip of the text file of my repeated maker session, this time having deleted the mpi_blastdb dir and with the -a flag added to "mpiexec -n 8 maker -debug". Command line. Cheers / Ram?n. On Wed, Mar 6, 2013 at 7:49 PM, Ram?n Fallon wrote: > OK, will do. > > Will get back to you tomorrow on it. > > Many thanks! > > > On Wed, Mar 6, 2013 at 7:22 PM, Carson Holt wrote: > >> Could you delete your ../*maker.output/mpi_blastdb directory, and then >> when rerunning maker, run with the ?a flag. >> >> Thanks, >> Carson >> >> >> From: Ram?n Fallon >> Date: Wednesday, 6 March, 2013 1:15 PM >> To: Carson Holt >> Cc: "maker-devel at yandell-lab.org" >> >> Subject: Re: thread terminated, causing all processes to fail >> >> OK great, here goes .. many thanks! >> >> >> >> On Wed, Mar 6, 2013 at 7:04 PM, Carson Holt wrote: >> >>> If you do reply all to this message, I should get the attachment. It >>> will be stripped from the one going to the list though. >>> >>> Thanks, >>> Carson >>> >>> >>> >>> From: Ram?n Fallon >>> Date: Wednesday, 6 March, 2013 12:57 PM >>> To: >>> Subject: Re: thread terminated, causing all processes to fail >>> >>> Hi, >>> >>> Many thanks for your quick reply and hint. >>> >>> Yes, you're right .. further up there is indeed >>> >>> Calling FastaDB::new at /opt/src/maker_svn/bin/../lib/FastaSeq.pm line >>> 148 thread 1. >>> Thread 1 terminated abnormally: ERROR: Could not reestablish DB to >>> thaw FastaSeq for Storable >>> --> rank=5, hostname=fatnode, at /opt/src/maker_svn/bin/maker line 1457 >>> thread 1. >>> >>> I run a "script" session and have maker on -debug so I have everything >>> in one file. 
Do you prefer to have it attached to a post to this mailing >>> list (if it accepts txt attachments) >>> >>> Cheers. >>> >>> >>> On Wed, Mar 6, 2013 at 6:34 PM, Ram?n Fallon wrote: >>> >>>> Hi, >>>> >>>> I'm using the maker_svn rev 995 version and hand-compiled MPICH2 on a >>>> single multicore machine. >>>> >>>> I've successfully run the dpp_contig.fasta (MPI/8 processes) example >>>> but am having trouble with larger contigs fasta files of my own, which are >>>> well formed. >>>> >>>> I've run into a problem whereby an mpiexec run of 8 processes will >>>> stop due to a perl-thread related problem which says >>>> >>>> FATAL: Thread terminated, causing all processes to fail >>>> >>>> this corresponds to line 924 in the maker executable (which is for the >>>> secondary/worker threads), and is the result of a test on !$thr OR'd with >>>> !$thr->is_running, so clearly one of these is failing. >>>> >>>> $thr itself is a threads->new(\&$node_thread, $gdbfile). Despite >>>> being a programmer, I've only recently started to look at the code and have >>>> not got the hang of the parallelisation setup here, though I gather the >>>> master must use threads to initially generate the parallel instances which >>>> then use the message passing. Of course threads don't have message passing >>>> ability, so I guess something clever is going on and will take some time >>>> for me to understand. >>>> >>>> Clearly however, it has worked before on dpp_contigs, so it may be is >>>> something wrong with my datafile or the way I am carrying out the analysis. >>>> >>>> Any clues that can be put my way are welcome. >>>> >>>> Thank you! >>>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: rf_mkr_run.scriptlog2.zip Type: application/zip Size: 6430 bytes Desc: not available URL: From carsonhh at gmail.com Thu Mar 7 09:44:40 2013 From: carsonhh at gmail.com (Carson Holt) Date: Thu, 07 Mar 2013 11:44:40 -0500 Subject: [maker-devel] thread terminated, causing all processes to fail In-Reply-To: Message-ID: That is extremely odd. It fails to even generate the indexes. Could you check the drive space of your working directory and your /tmp directory? It is odd because Bioperl uses the stat command to check on the file right before making a tied hash. So it was there for the stat but not the tie, which is immediately following. If you check manually does it exist now? --> /home/ramonf/makertrials/mgallocut7/sca29310_8.maker.output/mpi_blastdb/sca2 9310_8%2Efa.mpi.1/sca29310_8%2Efa.mpi.1.0.index Are you running in an NFS mounted directory? --Carson From: Ram?n Fallon Date: Thursday, 7 March, 2013 9:40 AM To: Carson Holt Cc: "maker-devel at yandell-lab.org" Subject: Re: [maker-devel] thread terminated, causing all processes to fail Hi Carson, I send you a zip of the text file of my repeated maker session, this time having deleted the mpi_blastdb dir and with the -a flag added to "mpiexec -n 8 maker -debug". Command line. Cheers / Ram?n. On Wed, Mar 6, 2013 at 7:49 PM, Ram?n Fallon wrote: > OK, will do. > > Will get back to you tomorrow on it. > > Many thanks! > > > On Wed, Mar 6, 2013 at 7:22 PM, Carson Holt wrote: >> Could you delete your ../*maker.output/mpi_blastdb directory, and then when >> rerunning maker, run with the ?a flag. >> >> Thanks, >> Carson >> >> >> From: Ram?n Fallon >> Date: Wednesday, 6 March, 2013 1:15 PM >> To: Carson Holt >> Cc: "maker-devel at yandell-lab.org" >> >> Subject: Re: thread terminated, causing all processes to fail >> >> OK great, here goes .. many thanks! >> >> >> >> On Wed, Mar 6, 2013 at 7:04 PM, Carson Holt wrote: >>> If you do reply all to this message, I should get the attachment. 
It will >>> be stripped from the one going to the list though. >>> >>> Thanks, >>> Carson >>> >>> >>> >>> From: Ram?n Fallon >>> Date: Wednesday, 6 March, 2013 12:57 PM >>> To: >>> Subject: Re: thread terminated, causing all processes to fail >>> >>> Hi, >>> >>> Many thanks for your quick reply and hint. >>> >>> Yes, you're right .. further up there is indeed >>> >>> Calling FastaDB::new at /opt/src/maker_svn/bin/../lib/FastaSeq.pm line 148 >>> thread 1. >>> Thread 1 terminated abnormally: ERROR: Could not reestablish DB to thaw >>> FastaSeq for Storable >>> --> rank=5, hostname=fatnode, at /opt/src/maker_svn/bin/maker line 1457 >>> thread 1. >>> >>> I run a "script" session and have maker on -debug so I have everything in >>> one file. Do you prefer to have it attached to a post to this mailing list >>> (if it accepts txt attachments) >>> >>> Cheers. >>> >>> >>> On Wed, Mar 6, 2013 at 6:34 PM, Ram?n Fallon wrote: >>>> Hi, >>>> >>>> I'm using the maker_svn rev 995 version and hand-compiled MPICH2 on a >>>> single multicore machine. >>>> >>>> I've successfully run the dpp_contig.fasta (MPI/8 processes) example but am >>>> having trouble with larger contigs fasta files of my own, which are well >>>> formed. >>>> >>>> I've run into a problem whereby an mpiexec run of 8 processes will stop due >>>> to a perl-thread related problem which says >>>> >>>> FATAL: Thread terminated, causing all processes to fail >>>> >>>> this corresponds to line 924 in the maker executable (which is for the >>>> secondary/worker threads), and is the result of a test on !$thr OR'd with >>>> !$thr->is_running, so clearly one of these is failing. >>>> >>>> $thr itself is a threads->new(\&$node_thread, $gdbfile). Despite being a >>>> programmer, I've only recently started to look at the code and have not got >>>> the hang of the parallelisation setup here, though I gather the master must >>>> use threads to initially generate the parallel instances which then use the >>>> message passing. 
>>>> Of course threads don't have message passing ability, so I
>>>> guess something clever is going on and will take some time for me to
>>>> understand.
>>>>
>>>> Clearly however, it has worked before on dpp_contigs, so it may be that
>>>> something is wrong with my datafile or the way I am carrying out the analysis.
>>>>
>>>> Any clues that can be put my way are welcome.
>>>>
>>>> Thank you!

_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From ramonfallon at gmail.com Thu Mar 7 10:47:53 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Thu, 7 Mar 2013 18:47:53 +0100
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
References:
Message-ID:

This is a standalone machine and no NFS at all. "df" gives a healthy amount of disk space, so there should be no problem there.

Yes, that file does exist, although it has the nominal 12288 bytes size, which appears to be the minimum for a DB_File tie.

As I mentioned, the dpp_contig.fa example set does work, so part of my investigation is looking at how.

I can do some trivial unit tests on the Bioperl stat-before-tied-hashes situation and see what comes up.

So I'll attempt to clear that up and then revert.

Many thanks! / Ramón.
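The manual checks discussed in this exchange (free space on the working directory and /tmp, and whether the index file exists and how large it is) could be sketched in shell as below. This is only a minimal sketch, not anything shipped with MAKER: the function names `report_space` and `check_index` are hypothetical, and it assumes nothing beyond the standard `df`, `wc`, and `test` utilities.

```shell
#!/bin/sh
# Hypothetical helpers for the manual checks discussed above; not part of MAKER.

# Print free space for the given directories (defaults: current dir and /tmp).
report_space() {
    if [ "$#" -eq 0 ]; then
        set -- . /tmp
    fi
    df -h "$@"
}

# Report whether an index file exists and, if so, its size in bytes.
# Returns 0 when the file exists as a regular file, 1 otherwise.
check_index() {
    index_file=$1
    if [ ! -f "$index_file" ]; then
        echo "missing: $index_file"
        return 1
    fi
    # tr strips the padding some wc implementations add around the count
    size=$(wc -c < "$index_file" | tr -d ' ')
    echo "exists: $index_file ($size bytes)"
}

# Example usage (index path taken from the thread):
# report_space
# check_index /home/ramonf/makertrials/mgallocut7/sca29310_8.maker.output/mpi_blastdb/sca29310_8%2Efa.mpi.1/sca29310_8%2Efa.mpi.1.0.index
```

A file that is present but suspiciously small, like the 12288-byte index reported in the thread, would still print the `exists:` form, so the size in the output is the part worth comparing against an index from a run that worked.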
From carsonhh at gmail.com Thu Mar 7 10:57:46 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 07 Mar 2013 12:57:46 -0500
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
Message-ID:

Try running maker outside of with the -a flag after deleting mpi_blastdb. Does it still happen? Also if you try again with MPI with the -a flag and having deleted mpi_blastdb, does it fail the same every time? Could you also check for background maker processes that may be trying to work in the same directory that you may not have realized were running?

Thanks,
Carson
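The retry sequence Carson describes (look for stray maker processes, delete mpi_blastdb, rerun with -a once without MPI and once with it) might be scripted roughly as follows. This is a sketch under stated assumptions: the helper names `clean_mpi_blastdb` and `list_maker_processes` are hypothetical, the output directory name is the one mentioned in the thread, and the maker invocations are left as comments because they depend on the local install and control files.

```shell
#!/bin/sh
# Hypothetical helpers; not part of MAKER.

# Delete the mpi_blastdb directory inside a MAKER output directory, if present.
clean_mpi_blastdb() {
    out_dir=$1
    if [ -d "$out_dir/mpi_blastdb" ]; then
        rm -rf "$out_dir/mpi_blastdb"
        echo "removed $out_dir/mpi_blastdb"
    else
        echo "nothing to remove in $out_dir"
    fi
}

# List running processes whose command line mentions maker.
# The [m] bracket keeps grep from matching its own process entry.
list_maker_processes() {
    ps -ef | grep '[m]aker' || echo "no maker processes found"
}

# Sequence suggested in the thread, run from the directory holding the control files:
# list_maker_processes
# clean_mpi_blastdb sca29310_8.maker.output
# maker -a                          # serial run, outside MPI
# clean_mpi_blastdb sca29310_8.maker.output
# mpiexec -n 8 maker -a -debug      # MPI run, as in the original report
```

Running the serial and MPI cases from the same clean starting point makes it easier to tell whether the failure is tied to MPI or happens every time.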
From carsonhh at gmail.com Thu Mar 7 14:09:34 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 07 Mar 2013 16:09:34 -0500
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
Message-ID:

It should have said "Try running maker outside of MPI".

--Carson
From kangyangjae at gmail.com Thu Mar 7 21:00:19 2013
From: kangyangjae at gmail.com (Kang, Yang Jae)
Date: Fri, 8 Mar 2013 13:00:19 +0900
Subject: [maker-devel] retrying the FAILED scaffolds
Message-ID: <13f201ce1bb1$769c9e20$63d5da60$@gmail.com>

Hello

I have a question regarding some FAILED scaffolds.

Is there any way to re-try the maker pipeline on just the failed scaffolds separately?

And do I have to manually erase the failed directories named ../theVoid.scaffold_#/?

And how can I track down the reason why only those 20 out of around 3000 scaffolds failed?

Thank you

Kang, Yang Jae Ph.D.
Cropgenomics Lab.
College of Agriculture and Life Science
Seoul National University
Korea

From carsonhh at gmail.com Thu Mar 7 21:13:08 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 07 Mar 2013 23:13:08 -0500
Subject: [maker-devel] retrying the FAILED scaffolds
In-Reply-To: <13f201ce1bb1$769c9e20$63d5da60$@gmail.com>
Message-ID:

Is there any way to re-try the maker pipeline on just the failed scaffolds separately?
> Yes. The failed contig fasta will be in the maker.output subdirectory for that
> contig. Alternatively use the fasta_tool script to extract them from the
> genome file. You can then run them in a separate directory, or use the
> '-base' command line flag to force it to use the base name of the current
> results directory. Use the -g option to override the genome file without
> having to edit the control files.
>
> Example:
>
> maker -g failed.fasta -base maize_assemby
>
> Output will end up here --> maize_assemby.maker.output

And do I have to manually erase the failed directories named ../theVoid.scaffold_#/?
> No. You can let MAKER just retry them as is (let maker handle what to delete
> and keep) or set clean_try=1 to force full deletion before rerunning.

And how can I track down the reason why only those 20 out of around 3000 scaffolds failed?
> Search for the tag "ERROR" in the standard output of your run. What MAKER
> version are you using? I can take a look at the STDERR as well if you want.
> If it's too big for e-mail, you can share it via dropbox.

Thanks,
Carson

From carsonhh at gmail.com Fri Mar 8 13:20:37 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 08 Mar 2013 15:20:37 -0500
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
Message-ID:

I think I've found the potential cause and committed the necessary changes to fix it.

Thanks,
Carson
From carsonhh at gmail.com Fri Mar 8 13:28:32 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 08 Mar 2013 15:28:32 -0500
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
Message-ID:

Also delete mpi_blastdb before retrying with the new svn repository.

Thanks,
Carson
From carsonhh at gmail.com Sun Mar 10 10:31:27 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 10 Mar 2013 12:31:27 -0400
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
Message-ID:

I've fixed the missing script issue.

Thanks,
Carson

From: Ramón Fallon
Date: Sunday, 10 March, 2013 10:45 AM
To: Carson Holt
Cc: "maker-devel at yandell-lab.org"
Subject: Re: [maker-devel] thread terminated, causing all processes to fail

Hi Carson,

In terms of rev 995, on a simplified version of our data set, I tried a sequential run successfully, and even a "mpiexec -n 4" which ran to completion.

In any case, many thanks for the new version 996. I did have a problem with the build, namely the new line:

'bin/TACC.PL ' => ['bin/ibrun'],

I tried to find TACC.PL unsuccessfully, so I decided to dispense with this new line and then it compiled fine.

I started one or two tests and will inform you later about them. From my end I must admit I am using a rather large EST fasta file, but it is not useful for testing .. I will try to cut it down Monday or Tues so that tests can be more agile.

Many thanks / Ramón.
>>>>> >>>>> I run a "script" session and have maker on -debug so I have everything in >>>>> one file. Do you prefer to have it attached to a post to this mailing list >>>>> (if it accepts txt attachments) >>>>> >>>>> Cheers. >>>>> >>>>> >>>>> On Wed, Mar 6, 2013 at 6:34 PM, Ram?n Fallon >>>>> wrote: >>>>>> Hi, >>>>>> >>>>>> I'm using the maker_svn rev 995 version and hand-compiled MPICH2 on a >>>>>> single multicore machine. >>>>>> >>>>>> I've successfully run the dpp_contig.fasta (MPI/8 processes) example but >>>>>> am having trouble with larger contigs fasta files of my own, which are >>>>>> well formed. >>>>>> >>>>>> I've run into a problem whereby an mpiexec run of 8 processes will stop >>>>>> due to a perl-thread related problem which says >>>>>> >>>>>> FATAL: Thread terminated, causing all processes to fail >>>>>> >>>>>> this corresponds to line 924 in the maker executable (which is for the >>>>>> secondary/worker threads), and is the result of a test on !$thr OR'd with >>>>>> !$thr->is_running, so clearly one of these is failing. >>>>>> >>>>>> $thr itself is a threads->new(\&$node_thread, $gdbfile). Despite being a >>>>>> programmer, I've only recently started to look at the code and have not >>>>>> got the hang of the parallelisation setup here, though I gather the >>>>>> master must use threads to initially generate the parallel instances >>>>>> which then use the message passing. Of course threads don't have message >>>>>> passing ability, so I guess something clever is going on and will take >>>>>> some time for me to understand. >>>>>> >>>>>> Clearly however, it has worked before on dpp_contigs, so it may be is >>>>>> something wrong with my datafile or the way I am carrying out the >>>>>> analysis. >>>>>> >>>>>> Any clues that can be put my way are welcome. >>>>>> >>>>>> Thank you! 
>>>>> >>>> >>> >> >> _______________________________________________ maker-devel mailing list >> maker-devel at box290.bluehost.comhttp://box290.bluehost.com/mailman/listinfo/ma >> ker-devel_yandell-lab.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ramonfallon at gmail.com Sun Mar 10 08:45:38 2013 From: ramonfallon at gmail.com (=?ISO-8859-1?Q?Ram=F3n_Fallon?=) Date: Sun, 10 Mar 2013 15:45:38 +0100 Subject: [maker-devel] thread terminated, causing all processes to fail In-Reply-To: References: Message-ID: Hi Carson, In terms of rev 995, on a simplified version of our data set, I tried a sequential run successfully, and even a "mpiexec -n 4" which ran to completion. In any case, many thanks for the new version 996. I did have a problem with the build, namely the new line: 'bin/TACC.PL' => ['bin/ibrun'], I tried to find TACC.PL unsuccessfully, so I decided to dispense with this new line and then it compiled fine. I started one or two tests and will inform you later about them. From my end I must admit I am using a rather large EST fasta file, but is not useful for test .. I will try to cut it down Monday or Tues so that tests can be more agile. Many thanks / Ram?n. On Fri, Mar 8, 2013 at 9:28 PM, Carson Holt wrote: > Also delete mpi_blastdb before retrying with the new svn repository. > > Thanks, > Carson > > > From: Carson Holt > Date: Friday, 8 March, 2013 3:20 PM > To: Ram?n Fallon > > Cc: "maker-devel at yandell-lab.org" > Subject: Re: [maker-devel] thread terminated, causing all processes to > fail > > I think I've found the potential cause and committed the necessary changes > to fix it. > > Thanks, > Carson > > > From: Ram?n Fallon > Date: Thursday, 7 March, 2013 12:47 PM > To: Carson Holt > Cc: "maker-devel at yandell-lab.org" > Subject: Re: [maker-devel] thread terminated, causing all processes to > fail > > This is a standalone machine and no NFS at all. 
"df" gives a healthy > amount of disk space, so there should be no problem there. > > Yes that file does exist although it has the nominal 12288 bytes size, > which appears to be the minimum for a DB_file tie. > > As I mentioned the dpp_contig.fa example set does work so part of my > investigation is looking at how. > > I can do some trivial unit tests on the Bioperl stat-before-tied-hashes > situation and see what comes up. > > So I'll attempt to clear that up and then revert. > > Many thanks! / Ram?n. > > > On Thu, Mar 7, 2013 at 5:44 PM, Carson Holt wrote: > >> That is extremely odd. It fails to even generate the indexes. Could you >> check the drive space of your working directory and your /tmp directory? >> >> It is odd because Bioperl uses the stat command to check on the file >> right before making a tied hash. So it was there for the stat but not the >> tie, which is immediately following. >> >> If you check manually does it exist now? --> >> /home/ramonf/makertrials/mgallocut7/sca29310_8.maker.output/mpi_blastdb/sca29310_8%2Efa.mpi.1/sca29310_8%2Efa.mpi.1.0.index >> >> Are you running in an NFS mounted directory? >> >> --Carson >> >> >> From: Ram?n Fallon >> Date: Thursday, 7 March, 2013 9:40 AM >> >> To: Carson Holt >> Cc: "maker-devel at yandell-lab.org" >> Subject: Re: [maker-devel] thread terminated, causing all processes to >> fail >> >> Hi Carson, >> >> I send you a zip of the text file of my repeated maker session, this time >> having deleted the mpi_blastdb dir and with the -a flag added to "mpiexec >> -n 8 maker -debug". Command line. >> >> Cheers / Ram?n. >> >> >> On Wed, Mar 6, 2013 at 7:49 PM, Ram?n Fallon wrote: >> >>> OK, will do. >>> >>> Will get back to you tomorrow on it. >>> >>> Many thanks! >>> >>> >>> On Wed, Mar 6, 2013 at 7:22 PM, Carson Holt wrote: >>> >>>> Could you delete your ../*maker.output/mpi_blastdb directory, and then >>>> when rerunning maker, run with the ?a flag. 
>>>> >>>> Thanks, >>>> Carson >>>> >>>> >>>> From: Ram?n Fallon >>>> Date: Wednesday, 6 March, 2013 1:15 PM >>>> To: Carson Holt >>>> Cc: "maker-devel at yandell-lab.org" >>>> >>>> Subject: Re: thread terminated, causing all processes to fail >>>> >>>> OK great, here goes .. many thanks! >>>> >>>> >>>> >>>> On Wed, Mar 6, 2013 at 7:04 PM, Carson Holt wrote: >>>> >>>>> If you do reply all to this message, I should get the attachment. It >>>>> will be stripped from the one going to the list though. >>>>> >>>>> Thanks, >>>>> Carson >>>>> >>>>> >>>>> >>>>> From: Ram?n Fallon >>>>> Date: Wednesday, 6 March, 2013 12:57 PM >>>>> To: >>>>> Subject: Re: thread terminated, causing all processes to fail >>>>> >>>>> Hi, >>>>> >>>>> Many thanks for your quick reply and hint. >>>>> >>>>> Yes, you're right .. further up there is indeed >>>>> >>>>> Calling FastaDB::new at /opt/src/maker_svn/bin/../lib/FastaSeq.pm line >>>>> 148 thread 1. >>>>> Thread 1 terminated abnormally: ERROR: Could not reestablish DB to >>>>> thaw FastaSeq for Storable >>>>> --> rank=5, hostname=fatnode, at /opt/src/maker_svn/bin/maker line >>>>> 1457 thread 1. >>>>> >>>>> I run a "script" session and have maker on -debug so I have everything >>>>> in one file. Do you prefer to have it attached to a post to this mailing >>>>> list (if it accepts txt attachments) >>>>> >>>>> Cheers. >>>>> >>>>> >>>>> On Wed, Mar 6, 2013 at 6:34 PM, Ram?n Fallon wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I'm using the maker_svn rev 995 version and hand-compiled MPICH2 on a >>>>>> single multicore machine. >>>>>> >>>>>> I've successfully run the dpp_contig.fasta (MPI/8 processes) example >>>>>> but am having trouble with larger contigs fasta files of my own, which are >>>>>> well formed. 
>>>>>> >>>>>> I've run into a problem whereby an mpiexec run of 8 processes will >>>>>> stop due to a perl-thread related problem which says >>>>>> >>>>>> FATAL: Thread terminated, causing all processes to fail >>>>>> >>>>>> this corresponds to line 924 in the maker executable (which is for >>>>>> the secondary/worker threads), and is the result of a test on !$thr OR'd >>>>>> with !$thr->is_running, so clearly one of these is failing. >>>>>> >>>>>> $thr itself is a threads->new(\&$node_thread, $gdbfile). Despite >>>>>> being a programmer, I've only recently started to look at the code and have >>>>>> not got the hang of the parallelisation setup here, though I gather the >>>>>> master must use threads to initially generate the parallel instances which >>>>>> then use the message passing. Of course threads don't have message passing >>>>>> ability, so I guess something clever is going on and will take some time >>>>>> for me to understand. >>>>>> >>>>>> Clearly however, it has worked before on dpp_contigs, so it may be is >>>>>> something wrong with my datafile or the way I am carrying out the analysis. >>>>>> >>>>>> Any clues that can be put my way are welcome. >>>>>> >>>>>> Thank you! >>>>>> >>>>> >>>>> >>>> >>> >> _______________________________________________ maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikheyev at gmail.com Mon Mar 11 03:46:06 2013 From: mikheyev at gmail.com (Sasha Mikheyev) Date: Mon, 11 Mar 2013 18:46:06 +0900 Subject: [maker-devel] duplicate CDS in annotation Message-ID: Dear Yandell lab, I am re-annotating the harvester and genome using protein and RNA-seq data. However, I get many artifacts like the one below. 
It seems that there are several CDS records that should tie in to the same
mRNA, but they are really hanging out separately, and produce several
nucleotide sequences with the same name when extracted from the gff. I would
appreciate any guidance about how to fix this!

Thank you,

Sasha

grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff
pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150;
pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159;
pbar_scf7180000350377 maker mRNA 538308 558769 . + . ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01;
pbar_scf7180000350377 maker exon 538308 538334 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506;
pbar_scf7180000350377 maker exon 538748 538968 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506;
pbar_scf7180000350377 maker exon 539842 540242 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506;
pbar_scf7180000350377 maker exon 542624 542798 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506;
pbar_scf7180000350377 maker exon 555823 556025 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506;
pbar_scf7180000350377 maker exon 558609 558769 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506;
pbar_scf7180000350377 maker CDS 538308 538334 . + 0 ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506;
pbar_scf7180000350377 maker CDS 538748 538968 . + 0 ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506;
pbar_scf7180000350377 maker CDS 539842 540242 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506;
pbar_scf7180000350377 maker CDS 542624 542798 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506;
pbar_scf7180000350377 maker CDS 555823 556025 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506;
pbar_scf7180000350377 maker CDS 558609 558769 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506;

From barry.moore at genetics.utah.edu Mon Mar 11 05:32:44 2013
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Mon, 11 Mar 2013 05:32:44 -0600
Subject: [maker-devel] duplicate CDS in annotation
In-Reply-To:
References:
Message-ID: <52822F85-760B-451B-B156-8861EA77A910@genetics.utah.edu>

Hi Sasha,

This gene model appears to be correctly formatted to me. In GFF3 format the
CDS features are allowed to span multiple lines, and they share the same ID to
indicate that they are all parts of the same feature. See the GFF3
specification on the Sequence Ontology website
(http://www.sequenceontology.org/resources/gff3.html), and in particular the
description of the ID attribute specifies:

    ID  Indicates the ID of the feature. IDs for each feature must be unique
    within the scope of the GFF file. In the case of discontinuous features
    (i.e. a single feature that exists over multiple genomic locations) the
    same ID may appear on multiple lines. All lines that share an ID
    collectively represent a single feature.

So each of those CDS lines forms one part of the single CDS feature for this
gene.

B

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543

From carsonhh at gmail.com Mon Mar 11 07:02:13 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 11 Mar 2013 09:02:13 -0400
Subject: [maker-devel] duplicate CDS in annotation
In-Reply-To: <52822F85-760B-451B-B156-8861EA77A910@genetics.utah.edu>
Message-ID:

I think the issue is that you are getting a match feature that is being
printed with the same ID as the mRNA feature. Correct? What version of MAKER
are you using, and what does the file you are giving to pred_gff or model_gff
look like? Could you send them?

Thanks,
Carson

From sitaram.rajaraman at helsinki.fi Mon Mar 11 08:33:27 2013
From: sitaram.rajaraman at helsinki.fi (Sitaram Rajaraman)
Date: Mon, 11 Mar 2013 16:33:27 +0200
Subject: [maker-devel] Doubts in the synthesis part of MAKER
Message-ID: <513DEB37.6090601@helsinki.fi>

Hello MAKER developers,

I'm Sitaram, working as a Bioinformatician at the University of Helsinki. We
are trying out MAKER as part of a gene prediction/annotation pipeline and have
some doubts regarding this. In the synthesis step in the paper, I find it a
bit hard to visualise how the hints are generated from the various sources and
how the scores are calculated. It would be nice if you could throw some light
on this. Also, if you could point to the particular .pm file which contains
the actual source code, it would be convenient, as there is quite a lot of
source code and debugging the whole set is a bit cumbersome.

Regards,

--
Sitaram Rajaraman,
Plant Stress Research Group,
Dept of Biosciences,
University of Helsinki.

From carsonhh at gmail.com Mon Mar 11 08:51:56 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 11 Mar 2013 10:51:56 -0400
Subject: [maker-devel] Doubts in the synthesis part of MAKER
In-Reply-To: <513DEB37.6090601@helsinki.fi>
Message-ID:

Hints are basically CDS location, exon location, and intron location. The CDS
hints are based on protein alignment. Intron and exon hints are based on the
EST alignments, which when polished should give exact intron coordinates.
Ironically the most useless part of the gene model is actually the most
informative feature for gene prediction (the intron coordinates).
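
To make the point about intron coordinates concrete: once a spliced alignment
is polished, the introns are simply the gaps between consecutive aligned
blocks. The following is only a minimal sketch of that idea and is not taken
from the MAKER source; the function name introns_from_exons is hypothetical,
and the sample intervals are the first three exon coordinates from the GFF
excerpt posted in the "duplicate CDS in annotation" thread, used purely as
example input.

```python
def introns_from_exons(exons):
    """Return the 1-based inclusive intron intervals implied by exon intervals.

    `exons` is a list of (start, end) pairs on one strand of one contig.
    This is an illustration only, not MAKER's hint-generation code.
    """
    ordered = sorted(exons)
    introns = []
    # Walk over neighbouring exon pairs; the gap between them is the intron.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start - prev_end > 1:  # skip exons that abut or overlap
            introns.append((prev_end + 1, next_start - 1))
    return introns


# Sample input: three exon intervals copied from the GFF excerpt above.
exons = [(538308, 538334), (538748, 538968), (539842, 540242)]
print(introns_from_exons(exons))
# → [(538335, 538747), (538969, 539841)]
```

An unpolished alignment would shift the block boundaries by a few bases, and
every derived intron coordinate would shift with them, which is why the
polishing step matters for this kind of hint.
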
lib/Process/MPIchunk.pm will have the steps in the _go method. It is a little hard to follow as MAKER is designed for distributed parallelization (i.e. parallelization without shared memory with steps potentially divided on different machines on the other end of the network). It is divided into MPItier and MPIchunk objects. The MPItier object encapsulate a series of linear steps or 'levels' while the MPIchunk objects encapsulate a single step sent to a machine across the network and it exists within a single 'level' of the MPITier object. Note there can be multiple chunks assigned to a 'level'. MPItiers can also have MPITiers as children at a given level instead of MPIchunks, so the process structure then branches like a tree and can then merges back somewhere in the middle of the algorithm. The 'maker' script is really just the communication script for the objects. In MPI one maker thread is launched to handle communication and another to run the MPItiers and MPIchunks. They communication threads then pass MPIchunks and MPITiers back and forth across the network by either requesting things to do from other nodes or by asking for help if they have a large number of MPIChunks or MPItiers to process. Thanks, Carson On 13-03-11 10:33 AM, "Sitaram Rajaraman" wrote: >Hello MAKER developers, > I'm Sitaram, working as a Bioinformatician at the University of >Helsinki. We are trying out MAKER as part of a gene prediction/annotation >pipeline and have some doubts regarding this. In the synthesis step in >the paper, I find it a bit hard to visualise how the hints are generated >from the various sources and the scores are calculated. It would be nice >if you could throw some light on this. Also if you could point to the >particular .Pm file which contains the actual source code, it would be >convenient as there quite a lot of source code and debugging the whole >set is bit cumbersome. 
> >Regards, > >-- >Sitaram Rajaraman, >Plant Stress Research Group, >Dept of Biosciences, >University of Helsinki. > > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From sitaram.rajaraman at helsinki.fi Mon Mar 11 09:03:20 2013 From: sitaram.rajaraman at helsinki.fi (Sitaram Rajaraman) Date: Mon, 11 Mar 2013 17:03:20 +0200 Subject: [maker-devel] Doubts in the synthesis part of MAKER In-Reply-To: References: Message-ID: <513DF238.4050109@helsinki.fi> Thank you ! I will proceed with this information ! - Sitaram. On 03/11/2013 04:51 PM, Carson Holt wrote: > Hints are basically CDS location, exon location, and intron location. The > CDS hints are based on protein alignment. Intron and exon hints are based > on the EST alignments, which when polished should give exact intron > coordinates. Ironically the most useless part of the gene model is > actually the most informative feature for gene prediction (the intron > coordinates). > > lib/Process/MPIchunk.pm will have the steps in the _go method. It is a > little hard to follow as MAKER is designed for distributed parallelization > (i.e. parallelization without shared memory with steps potentially divided > on different machines on the other end of the network). > > It is divided into MPItier and MPIchunk objects. The MPItier object > encapsulate a series of linear steps or 'levels' while the MPIchunk > objects encapsulate a single step sent to a machine across the network and > it exists within a single 'level' of the MPITier object. Note there can > be multiple chunks assigned to a 'level'. MPItiers can also have MPITiers > as children at a given level instead of MPIchunks, so the process > structure then branches like a tree and can then merges back somewhere in > the middle of the algorithm. > > The 'maker' script is really just the communication script for the > objects. 
In MPI one maker thread is launched to handle communication and > another to run the MPItiers and MPIchunks. They communication threads > then pass MPIchunks and MPITiers back and forth across the network by > either requesting things to do from other nodes or by asking for help if > they have a large number of MPIChunks or MPItiers to process. > > Thanks, > Carson > > > > > > On 13-03-11 10:33 AM, "Sitaram Rajaraman" > wrote: > >> Hello MAKER developers, >> I'm Sitaram, working as a Bioinformatician at the University of >> Helsinki. We are trying out MAKER as part of a gene prediction/annotation >> pipeline and have some doubts regarding this. In the synthesis step in >> the paper, I find it a bit hard to visualise how the hints are generated > >from the various sources and the scores are calculated. It would be nice >> if you could throw some light on this. Also if you could point to the >> particular .Pm file which contains the actual source code, it would be >> convenient as there quite a lot of source code and debugging the whole >> set is bit cumbersome. >> >> Regards, >> >> -- >> Sitaram Rajaraman, >> Plant Stress Research Group, >> Dept of Biosciences, >> University of Helsinki. >> >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > -- Sitaram Rajaraman, Plant Stress Research Group, Dept of Biosciences, University of Helsinki. From carsonhh at gmail.com Mon Mar 11 09:05:30 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 11 Mar 2013 11:05:30 -0400 Subject: [maker-devel] Doubts in the synthesis part of MAKER In-Reply-To: Message-ID: One more detail. There are basically 5 steps per level. Load -> this step creates the chunks for that level. This is where I decide how many chunks to make, and any special variables need to be generated for packaging into that chunk. 
Init --> this is just a declaration of the variables to package into a chunk (only give the chunk what it needs).
Run --> this is the actual code that will be run after the chunk is transported to its destination.
Result --> this describes how to merge results of that chunk back into the parent object.
Flow --> this decides what to do when all chunks for that level are complete (i.e. which level to move onto next). Default is next level in linear succession, but it can jump forward and backwards several levels if needed.

Thanks,
Carson

On 13-03-11 10:51 AM, "Carson Holt" wrote:

>Hints are basically CDS location, exon location, and intron location. The
>CDS hints are based on protein alignment. Intron and exon hints are based
>on the EST alignments, which when polished should give exact intron
>coordinates. Ironically the most useless part of the gene model is
>actually the most informative feature for gene prediction (the intron
>coordinates).
>
>lib/Process/MPIchunk.pm will have the steps in the _go method. It is a
>little hard to follow as MAKER is designed for distributed parallelization
>(i.e. parallelization without shared memory with steps potentially divided
>on different machines on the other end of the network).
>
>It is divided into MPItier and MPIchunk objects. The MPItier object
>encapsulates a series of linear steps or 'levels', while each MPIchunk
>object encapsulates a single step sent to a machine across the network and
>exists within a single 'level' of the MPITier object. Note there can
>be multiple chunks assigned to a 'level'. MPItiers can also have MPITiers
>as children at a given level instead of MPIchunks, so the process
>structure then branches like a tree and can then merge back somewhere in
>the middle of the algorithm.
>
>The 'maker' script is really just the communication script for the
>objects. In MPI one maker thread is launched to handle communication and
>another to run the MPItiers and MPIchunks.
>The communication threads then pass MPIchunks and MPITiers back and forth
>across the network by either requesting things to do from other nodes or
>by asking for help if they have a large number of MPIChunks or MPItiers
>to process.
>
>Thanks,
>Carson
>
>
>On 13-03-11 10:33 AM, "Sitaram Rajaraman" wrote:
>
>>Hello MAKER developers,
>> I'm Sitaram, working as a Bioinformatician at the University of
>>Helsinki. We are trying out MAKER as part of a gene prediction/annotation
>>pipeline and have some doubts regarding this. In the synthesis step in
>>the paper, I find it a bit hard to visualise how the hints are generated
>>from the various sources and the scores are calculated. It would be nice
>>if you could throw some light on this. Also if you could point to the
>>particular .pm file which contains the actual source code, it would be
>>convenient as there is quite a lot of source code and debugging the whole
>>set is a bit cumbersome.
>>
>>Regards,
>>
>>--
>>Sitaram Rajaraman,
>>Plant Stress Research Group,
>>Dept of Biosciences,
>>University of Helsinki.

From isradelacon at gmail.com Mon Mar 11 12:34:27 2013
From: isradelacon at gmail.com (Israel Barrantes)
Date: Mon, 11 Mar 2013 19:34:27 +0100
Subject: [maker-devel] different RNA-seq experiment outputs in separate annotation passes?
Message-ID:

Dear maker-devel,

I have several RNA-seq experiment outputs that I want to use as input for MAKER annotation:

(1) Illumina 1.3, strain A, cell stage N
(2) Illumina 1.8, strain A, cell stage N
(3) Illumina 1.8, strain B, cell stage N
(4) 454, strain unknown, cell stage M

For each experiment I mapped the reads and produced GTFs with tophat/cufflinks separately (and later converted them to GFF3s with the supplied script).

Q1: Does it make a difference to run a different annotation pass for each GFF3 from tophat/cufflinks?

Q2: If this is the case, will altering the order of passing the cDNA GFFs (e.g., first pass, experiment 1 GFF, then exp. 2 in second pass, etc.) produce more or fewer transcripts?

Q3: Is it better to simply merge these GFFs into a single nonredundant file (e.g. bedtools intersect) than to use them separately, one for each MAKER pass?

Thank you in advance,

--
Israel Barrantes
Otto-von-Guericke-Universität
Lehrstuhl für Regulationsbiologie
IBIO/FNW
Deutschland

From dence at genetics.utah.edu Mon Mar 11 12:39:01 2013
From: dence at genetics.utah.edu (Daniel Ence)
Date: Mon, 11 Mar 2013 18:39:01 +0000
Subject: [maker-devel] different RNA-seq experiment outputs in separate annotation passes?
In-Reply-To:
References:
Message-ID:

Hi Israel,

I think that for general annotation purposes, you want to use all of those GFF files in a single MAKER run to annotate the whole genome. If you're interested in exploring which genes are expressed in your different strains and cell stages, then you can use your annotation results and blast against the different RNA-seq experiments.

I didn't answer your questions separately, but hopefully that gives some good guidance. If I missed something, let me know.
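
As a rough illustration of the single-run suggestion, the relevant control-file entry might look like the fragment below. This is only a sketch: it assumes that est_gff in maker_opts.ctl accepts a comma-separated list of files, and the four file names are hypothetical placeholders for the converted tophat/cufflinks GFF3s. Please check both against the maker_opts.ctl shipped with your MAKER version.

```
# Fragment of maker_opts.ctl (sketch; file names are hypothetical).
# Assumption: est_gff takes a comma-separated list, so all four
# experiments are supplied as evidence in one annotation run.
est_gff=exp1_illumina13_strainA.gff3,exp2_illumina18_strainA.gff3,exp3_illumina18_strainB.gff3,exp4_454.gff3
```

With this layout no merging step and no ordering of passes is involved, since all of the evidence is available to the same run.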
Thanks,
Daniel

Daniel Ence
Graduate Student
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330

________________________________
From: maker-devel-bounces at yandell-lab.org [maker-devel-bounces at yandell-lab.org] on behalf of Israel Barrantes [isradelacon at gmail.com]
Sent: Monday, March 11, 2013 12:34 PM
To: maker-devel at yandell-lab.org
Subject: [maker-devel] different RNA-seq experiment outputs in separate annotation passes?

Dear maker-devel,

I have several RNA-seq experiment outputs that I want to use as input for MAKER annotation:

(1) Illumina 1.3, strain A, cell stage N
(2) Illumina 1.8, strain A, cell stage N
(3) Illumina 1.8, strain B, cell stage N
(4) 454, strain unknown, cell stage M

For each experiment I mapped the reads and produced GTFs with tophat/cufflinks separately (and later converted to GFF3s with the supplied script)

Q1: Does it make a difference to run a different annotation pass for each GFF3 from tophat/cufflinks?

Q2: If this is the case, altering the order of passing the cDNA GFFs (e.g., first pass, experiment 1 GFF, then exp.2 in second pass, etc) will produce more or less transcripts?

Q3: Is it better to simply merge this GFFs into a single nonredundant file (e.g. bedtools intersect) than using them separately, one for each MAKER pass?

Thank you in advance,

--
Israel Barrantes
Otto-von-Guericke-Universität
Lehrstuhl für Regulationsbiologie
IBIO/FNW
Deutschland

From carsonhh at gmail.com Tue Mar 12 07:37:35 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 12 Mar 2013 09:37:35 -0400
Subject: [maker-devel] duplicate CDS in annotation
In-Reply-To:
Message-ID:

Yes. Try the newer version and see if you still have the issue.
Thanks,
Carson

From: Sasha Mikheyev
Date: Tuesday, 12 March, 2013 1:26 AM
To: Carson Holt
Cc: Barry Moore
Subject: Re: [maker-devel] duplicate CDS in annotation

Hi Carson,

I have been using version 2.10. Is it worth trying with a newer version? You can find the model file here. It is rather large, as it includes all of the output from the first maker run.

Yours,
Sasha

On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt wrote:

> I think the issue is that you are getting a match feature that is being
> printed with the same ID as the mRNA feature. Correct?
>
> What version of MAKER are you using, and what does the file you are giving to
> pred_gff or model_gff look like? Could you send them?
>
> Thanks,
> Carson
>
>
> From: Barry Moore
> Date: Monday, 11 March, 2013 7:32 AM
> To: Sasha Mikheyev
> Cc:
> Subject: Re: [maker-devel] duplicate CDS in annotation
>
> Hi Sasha,
>
> This gene model appears to be correctly formatted to me. In GFF3 format the
> CDS features are allowed to span multiple lines and they share the same ID to
> indicate that it is all the same feature. See the GFF3 specification on the
> Sequence Ontology website
> (http://www.sequenceontology.org/resources/gff3.html); in particular, the
> description of the ID attribute specifies:
>
>> ID Indicates the ID of the feature. IDs for each feature must be unique
>> within the scope of the GFF file. In the case of discontinuous features
>> (i.e. a single feature that exists over multiple genomic locations) the same
>> ID may appear on multiple lines. All lines that share an ID collectively
>> represent a single feature.
>
> So each of those CDS lines forms one part of the single CDS feature for this
> gene.
>
> B
>
> On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote:
>
>> Dear Yandell lab,
>>
>> I am re-annotating the harvester ant genome using protein and RNA-seq data.
>> However, I get many artifacts like the one below.
It seems that there are >> several CDS records that should tie in to the same mRNA, but they are really >> hanging out separately, and produce several nucleotide sequences with the >> same name when extracted from the gff. I would appreciate any guidance about >> how to fix this! >> >> Thank you, >> >> Sasha >> >> grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff >> pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . >> ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; >> pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . >> ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name= >> Hsal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159; >> pbar_scf7180000350377 maker mRNA 538308 558769 . + . >> ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf718000035037 >> 7-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.2 >> 9-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; >> pbar_scf7180000350377 maker exon 538308 538334 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2 >> 506; >> pbar_scf7180000350377 maker exon 538748 538968 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2 >> 506; >> pbar_scf7180000350377 maker exon 539842 540242 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2 >> 506; >> pbar_scf7180000350377 maker exon 542624 542798 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2 >> 506; >> pbar_scf7180000350377 maker exon 555823 556025 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2 >> 506; >> pbar_scf7180000350377 maker exon 558609 558769 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2 >> 506; >> pbar_scf7180000350377 maker CDS 538308 538334 . 
+ 0 >> ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:25 >> 06; >> pbar_scf7180000350377 maker CDS 538748 538968 . + 0 >> ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:25 >> 06; >> pbar_scf7180000350377 maker CDS 539842 540242 . + 1 >> ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:25 >> 06; >> pbar_scf7180000350377 maker CDS 542624 542798 . + 2 >> ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:25 >> 06; >> pbar_scf7180000350377 maker CDS 555823 556025 . + 1 >> ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:25 >> 06; >> pbar_scf7180000350377 maker CDS 558609 558769 . + 2 >> ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:25 >> 06; >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > Barry Moore > Research Scientist > Dept. of Human Genetics > University of Utah > Salt Lake City, UT 84112 > -------------------------------------------- > (801) 585-3543 > > > > > _______________________________________________ maker-devel mailing list > maker-devel at box290.bluehost.comhttp://box290.bluehost.com/mailman/listinfo/mak > er-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.marshall at ed.ac.uk Mon Mar 11 10:15:05 2013 From: alex.marshall at ed.ac.uk (Alex Marshall) Date: Mon, 11 Mar 2013 16:15:05 +0000 Subject: [maker-devel] symbol lookup error: ProcessTable.so: undefined symbol: Perl_Tstack_sp_ptr Message-ID: <513E0309.7010004@ed.ac.uk> Hi to the maker-devel, I am getting an error everytime I run the maker script. 
symbol lookup error:
/path/to/software/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi/auto/Proc/ProcessTable/ProcessTable.so:
undefined symbol: Perl_Tstack_sp_ptr

Your help would be much appreciated.

Best wishes,
Alex

----------------
Edinburgh University

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

From mikheyev at gmail.com Mon Mar 11 23:26:53 2013
From: mikheyev at gmail.com (Sasha Mikheyev)
Date: Tue, 12 Mar 2013 14:26:53 +0900
Subject: [maker-devel] duplicate CDS in annotation
In-Reply-To:
References: <52822F85-760B-451B-B156-8861EA77A910@genetics.utah.edu>
Message-ID:

Hi Carson,

I have been using version 2.10. Is it worth trying with a newer version? You can find the model file here. It is rather large, as it includes all of the output from the first maker run.

Yours,
Sasha

On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt wrote:

> I think the issue is that you are getting a match feature that is being
> printed with the same ID as the mRNA feature. Correct?
>
> What version of MAKER are you using, and what does the file you are giving
> to pred_gff or model_gff look like? Could you send them?
>
> Thanks,
> Carson
>
>
> From: Barry Moore
> Date: Monday, 11 March, 2013 7:32 AM
> To: Sasha Mikheyev
> Cc:
> Subject: Re: [maker-devel] duplicate CDS in annotation
>
> Hi Sasha,
>
> This gene model appears to be correctly formatted to me. In GFF3 format
> the CDS features are allowed to span multiple lines and they share the same
> ID to indicate that it is all the same feature. See the GFF3
> specification on the Sequence Ontology website (
> http://www.sequenceontology.org/resources/gff3.html); in particular,
> the description of the ID attribute specifies:
>
> ID Indicates the ID of the feature. IDs for each feature must be unique
> within the scope of the GFF file. In the case of discontinuous features
> (i.e.
a single feature that exists over multiple genomic locations) the > same ID may appear on multiple lines. All lines that share an ID > collectively represent a single feature. > > > So each of those CDS lines forms one part of the single CDS feature for > this gene. > > B > > On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote: > > Dear Yandell lab, > > I am re-annotating the harvester and genome using protein and RNA-seq > data. However, I get many artifacts like the one below. It seems that there > are several CDS records that should tie in to the same mRNA, but they are > really hanging out separately, and produce several nucleotide sequences > with the same name when extracted from the gff. I would appreciate any > guidance about how to fix this! > > Thank you, > > Sasha > > grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff > pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . > ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; > pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 > 1 53 +;Gap=M159; > pbar_scf7180000350377 maker mRNA 538308 558769 . + . > ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; > pbar_scf7180000350377 maker exon 538308 538334 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 538748 538968 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 539842 540242 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 542624 542798 0.01 + . 
> ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 555823 556025 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 558609 558769 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 538308 538334 . + 0 > ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 538748 538968 . + 0 > ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 539842 540242 . + 1 > ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 542624 542798 . + 2 > ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 555823 556025 . + 1 > ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 558609 558769 . + 2 > ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506; > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > > Barry Moore > Research Scientist > Dept. of Human Genetics > University of Utah > Salt Lake City, UT 84112 > -------------------------------------------- > (801) 585-3543 > > > > > _______________________________________________ maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > -------------- next part -------------- An HTML attachment was scrubbed... 
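
The diagnosis suggested in this thread (a match feature printed with the same ID as an mRNA feature) can be checked mechanically. The sketch below is not part of MAKER; the function name and the abbreviated sample records are made up for illustration. It relies only on the GFF3 rule quoted above, that all lines sharing an ID represent a single feature, and therefore reports any ID that is attached to more than one feature type.

```python
from collections import defaultdict


def ids_shared_across_types(gff_lines):
    """Return {ID: sorted feature types} for every ID used by more than one type.

    Hypothetical helper, not a MAKER function. Lines that are blank, start
    with '#', or do not have nine tab-separated columns are ignored.
    """
    types_by_id = defaultdict(set)
    for line in gff_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a GFF3 feature line
        # Column 9 holds 'key=value' pairs separated by ';'
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        feature_id = attrs.get("ID")
        if feature_id is not None:
            types_by_id[feature_id].add(cols[2])
    return {fid: sorted(t) for fid, t in types_by_id.items() if len(t) > 1}


# Abbreviated records modelled on the grep output in this thread.
SAMPLE = [
    "\t".join(["pbar_scf7180000350377", "protein2genome", "protein_match",
               "172004", "172162", "150", "-", ".",
               "ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150;"]),
    "\t".join(["pbar_scf7180000350377", "maker", "mRNA",
               "538308", "558769", ".", "+", ".",
               "ID=pbar_scf7180000350377:hit:2506;score=0.01;"]),
    "\t".join(["pbar_scf7180000350377", "maker", "exon",
               "538308", "538334", "0.01", "+", ".",
               "ID=pbar_scf7180000350377:hit:2506:exon:305;"
               "Parent=pbar_scf7180000350377:hit:2506;"]),
]

if __name__ == "__main__":
    for feature_id, types in ids_shared_across_types(SAMPLE).items():
        print(feature_id, "is used by:", ", ".join(types))
```

To scan a whole file, pass an open file handle instead of SAMPLE, for example `ids_shared_across_types(open("Pbar.2.0.gff"))`. A legitimately discontinuous feature (several CDS lines sharing one ID) is not reported, because all of its lines have the same type.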
From carsonhh at gmail.com Tue Mar 12 08:27:44 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 12 Mar 2013 10:27:44 -0400
Subject: [maker-devel] symbol lookup error: ProcessTable.so: undefined symbol: Perl_Tstack_sp_ptr
In-Reply-To: <513E0309.7010004@ed.ac.uk>
Message-ID:

Could you try the 2.27 version of MAKER? You are using 2.10, correct?

Thanks,
Carson

On 13-03-11 12:15 PM, "Alex Marshall" wrote:

>Hi to the maker-devel,
>
>I am getting an error everytime I run the maker script.
>
>symbol lookup error:
>/path/to/software/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi/auto/Proc/ProcessTable/ProcessTable.so:
>undefined symbol: Perl_Tstack_sp_ptr
>
>Your help would be very appreciated.
>
>Best wishes,
>Alex
>
>----------------
>Edinburgh University
>
>--
>The University of Edinburgh is a charitable body, registered in
>Scotland, with registration number SC005336.

From barry.moore at genetics.utah.edu Tue Mar 12 17:57:32 2013
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Tue, 12 Mar 2013 17:57:32 -0600
Subject: [maker-devel] MAKER subversion repositories
Message-ID:

For any of you who are running MAKER straight from our subversion repositories in the lab: we have migrated those repos to a new server. Reply to Shawn or me for info on how to connect to the new repos. Thanks.

Barry

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543
From ares711122 at gmail.com Tue Mar 12 18:24:42 2013
From: ares711122 at gmail.com (Hung-Wei Hsu)
Date: Wed, 13 Mar 2013 08:24:42 +0800
Subject: [maker-devel] ERROR: Could not obtain lock to format database
Message-ID:

Hi MAKER developers,

I tried MAKER 2.27b on one E. coli scaffold sequence with the UniProt protein database. The analysis failed with the error message below.

Could not obtain lock to format database at maker-2.27b/bin/../lib/GI.pm

Any suggestions or help will be deeply appreciated.

Best regards,
Hung-Wei

From carsonhh at gmail.com Wed Mar 13 07:24:44 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 13 Mar 2013 09:24:44 -0400
Subject: [maker-devel] symbol lookup error: ProcessTable.so: undefined symbol: Perl_Tstack_sp_ptr
In-Reply-To: <513FE1F0.2030209@ed.ac.uk>
Message-ID:

I'm very glad it's working. Those kinds of errors are the hardest to track down.

--Carson

On 13-03-12 10:18 PM, "Alex Marshall" wrote:

>I have some great news. I uninstalled every one of my local perl
>libraries. Getting rid of my libraries and then using your
>Build script to install the maker dependencies totally fixed it. It
>worked with the test.fasta file, no errors whatsoever. I am smiling so
>much right now that my face might crack ;) you were right, broken perl.
>I just checked, getting lots of finished in the
>master_datastore_index.log. thank you so much.
>
>Alex
>
>
>On 12/03/2013 19:11, Carson Holt wrote:
>> I do think your perl has a problem. I've added some changes to each of
>> these modules that should help force perl to generate the correct object
>> method lookup table.
>>
>> Could you test them out (place most under the /lib/Iterator/
>> subdirectory).
>>
>> --Carson
>>
>>
>> On 13-03-12 3:00 PM, "Alex Marshall" wrote:
>>
>>> We had maker working happily for ages.
>>> >>> Then we upgraded from perl version 5.8.8 to 5.10 which stopped maker >>> working. >>> >>> Maker said it couldn't find forks.pm, added that library path, to fix >>> the error. >>> >>> Then that particular error below started happening. >>> >>> Alex >>> >>> >>> On 12/03/2013 18:54, Alex Marshall wrote: >>>> version 5.10 on a hpc cluster >>>> >>>> Alex >>>> >>>> >>>> >>>> On 12/03/2013 18:48, Carson Holt wrote: >>>>> That means the first time it called fileHandle it didn't die (which >>>>> should >>>>> be impossible) >>>>> >>>>> Then the second time it called it, it died. It begs the question, >>>>>what >>>>> happened to the first call. >>>>> >>>>> This is looking more and more like you have a broken perl. >>>>> >>>>> What version of perl are you using? >>>>> >>>>> --Carson >>>>> >>>>> >>>>> >>>>> On 13-03-12 2:28 PM, "Alex Marshall" wrote: >>>>> >>>>>> I deleted Iterator.pm, I put the new one in the maker/lib folder, >>>>>>then >>>>>> reran maker >>>>>> >>>>>> vi interator.pm confirms this: >>>>>> >>>>>> sub fileHandle { >>>>>> die "this should die"; >>>>>> >>>>>> error: >>>>>> STATUS: Parsing control files... >>>>>> Opening a new filehandle: Iterator:GFF3 >>>>>> Gettign the existing filehandle: Iterator::GFF3 >>>>>> Checking if it still exists: Iterator::GFF3 >>>>>> ERROR: No open filehandle Iterator::GFF3 >>>>>> --> rank=NA, hostname=frontend04 >>>>>> >>>>>> >>>>>> >>>>>> On 12/03/2013 18:21, Carson Holt wrote: >>>>>>> Try this one. >>>>>>> >>>>>>> It should fail immediately >>>>>>> >>>>>>> Code --> die "this should die"; >>>>>>> >>>>>>> >>>>>>> I'm just making sure it's being called as expected. 
>>>>>>> >>>>>>> --Carson >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 13-03-12 2:18 PM, "Alex Marshall" >>>>>>>wrote: >>>>>>> >>>>>>>> I have Iterator.pm and GFF3.pm in the right place: >>>>>>>> >>>>>>>> ..../software/maker2/maker-2.27/lib/Iterator.pm >>>>>>>> ..../software/maker2/maker-2.27/lib/Iterator/GFF3.pm >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 12/03/2013 18:16, Alex Marshall wrote: >>>>>>>>> I have deleted Iterator.pm, and replaced again (just to be sure). >>>>>>>>> >>>>>>>>> STATUS: Parsing control files... >>>>>>>>> Opening a new filehandle: Iterator:GFF3 >>>>>>>>> Gettign the existing filehandle: Iterator::GFF3 >>>>>>>>> Checking if it still exists: Iterator::GFF3 >>>>>>>>> ERROR: No open filehandle Iterator::GFF3 >>>>>>>>> --> rank=NA, hostname=frontend04 >>>>>>>>> >>>>>>>>> >>>>>>>>> On 12/03/2013 18:11, Carson Holt wrote: >>>>>>>>>> It's missing all the standard error from the Iterator.pm >>>>>>>>>>message I >>>>>>>>>> added? >>>>>>>>>> Could you double check that you replaced that one too. >>>>>>>>>> >>>>>>>>>> --Carson >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 13-03-12 2:07 PM, "Alex Marshall" >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> STATUS: Parsing control files... >>>>>>>>>>> Opening a new filehandle: Iterator:GFF3 >>>>>>>>>>> Gettign the existing filehandle: Iterator::GFF3 >>>>>>>>>>> Checking if it still exists: Iterator::GFF3 >>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3 >>>>>>>>>>> --> rank=NA, hostname=frontend04 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 12/03/2013 18:02, Carson Holt wrote: >>>>>>>>>>>> Please use these two and send me the full STDERR (replaces >>>>>>>>>>>>also >>>>>>>>>>>> Iterator/GFF3.pm). >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Carson >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 13-03-12 1:55 PM, "Alex Marshall" >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> same again: >>>>>>>>>>>>> >>>>>>>>>>>>> STATUS: Parsing control files... 
>>>>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3 >>>>>>>>>>>>> --> rank=NA, hostname=frontend04 >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 12/03/2013 17:45, Carson Holt wrote: >>>>>>>>>>>>>> Try this one. This is a code snippet --> >>>>>>>>>>>>>> >>>>>>>>>>>>>> my $fh = new FileHandle(); >>>>>>>>>>>>>> $fh->open("$arg") or die "ERROR: Could not >>>>>>>>>>>>>> open >>>>>>>>>>>>>> file: >>>>>>>>>>>>>> $!\n"; >>>>>>>>>>>>>> $self->{fileHandle} = $fh; >>>>>>>>>>>>>> $self->startPos($fh->getpos()); >>>>>>>>>>>>>> if (! openhandle($fh)){ #checks to see if >>>>>>>>>>>>>>file >>>>>>>>>>>>>> handle >>>>>>>>>>>>>> is >>>>>>>>>>>>>> open >>>>>>>>>>>>>> confess "ERROR: No open filehandle in Iterator\n"; >>>>>>>>>>>>>> } >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> All it does is open the handle, check the reading position >>>>>>>>>>>>>>and >>>>>>>>>>>>>> then >>>>>>>>>>>>>> check >>>>>>>>>>>>>> to see if the handle is still open. >>>>>>>>>>>>>> >>>>>>>>>>>>>> --Carson >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 13-03-12 1:37 PM, "Alex Marshall" >>>>>>>>>>>>>> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> [1] If I comment out the error in the GFF3.pm file: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> STATUS: Parsing control files... >>>>>>>>>>>>>>> Can't call method "getpos" without a package or object >>>>>>>>>>>>>>> reference >>>>>>>>>>>>>>> at >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>/exports/work/biology_ieb_mblaxter/software/maker2/maker-2.2 >>>>>>>>>>>>>>>7/ >>>>>>>>>>>>>>> bin >>>>>>>>>>>>>>> /. >>>>>>>>>>>>>>> ./l >>>>>>>>>>>>>>> ib >>>>>>>>>>>>>>> /I >>>>>>>>>>>>>>> terator/GFF3.pm >>>>>>>>>>>>>>> line 42, line 121. 
>>>>>>>>>>>>>>> --> rank=NA, hostname=frontend04 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> [2] If I add the error comment back to the GFF3.pm, and add >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>> second >>>>>>>>>>>>>>> new Iterator.pm: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> STATUS: Parsing control files... >>>>>>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3 >>>>>>>>>>>>>>> --> rank=NA, hostname=frontend04 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 12/03/2013 17:31, Carson Holt wrote: >>>>>>>>>>>>>>>> There is one other thing it does right before. It calls >>>>>>>>>>>>>>>> this >>>>>>>>>>>>>>>> --> >>>>>>>>>>>>>>>> $self->fileHandle()->getpos() >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I switched the chaining off so it is just $fh->getpos in >>>>>>>>>>>>>>>>the >>>>>>>>>>>>>>>> attached >>>>>>>>>>>>>>>> module (replace again). >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I don't see why a failure would happen special for you >>>>>>>>>>>>>>>> there, >>>>>>>>>>>>>>>> but >>>>>>>>>>>>>>>> try >>>>>>>>>>>>>>>> it >>>>>>>>>>>>>>>> again. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> --Carson >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 13-03-12 1:24 PM, "Carson Holt" >>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> This is the new line in Iterator.pm >>>>>>>>>>>>>>>>> --> $fh->open("$arg") or die "ERROR: Could not open file: >>>>>>>>>>>>>>>>> $!\n"; >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The extra info would be from $! >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> In the place where the error is occurring, all MAKER does >>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>> open a >>>>>>>>>>>>>>>>> file >>>>>>>>>>>>>>>>> handle in Iterator.pm and then check to see if it is open >>>>>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>> Iterator::GFF3 (it does one and then instantly the >>>>>>>>>>>>>>>>>other). >>>>>>>>>>>>>>>>> The >>>>>>>>>>>>>>>>> second >>>>>>>>>>>>>>>>> failure is just the check on the filehandle. 
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> If the open succeeds, but for some reason it can't tell >>>>>>>>>>>>>>>>>it >>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>> open, >>>>>>>>>>>>>>>>> then >>>>>>>>>>>>>>>>> it is something to do with your system. You can try >>>>>>>>>>>>>>>>> reinstalling >>>>>>>>>>>>>>>>> Scalar::Util as that is the module that implements >>>>>>>>>>>>>>>>> openhandle >>>>>>>>>>>>>>>>> method >>>>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>>> is called. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> You can also try just commenting out line 37 of >>>>>>>>>>>>>>>>> lib/Iterator/GFF3.pm >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> --Carson >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 13-03-12 1:15 PM, "Alex Marshall" >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I am looking at Iterator.pm >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> so it should of thrown more error information? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 12/03/2013 17:14, Alex Marshall wrote: >>>>>>>>>>>>>>>>>>> replaced Iterator.pm in maker2/maker-2.27/lib >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> error: same as before >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> STATUS: Parsing control files... >>>>>>>>>>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3 >>>>>>>>>>>>>>>>>>> --> rank=NA, hostname=frontend04 >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> ...../software/maker2/maker-2.27/lib/Iterator/GFF3.pm >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> in sub new >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> my $fh = $self->fileHandle(); >>>>>>>>>>>>>>>>>>> if (! 
openhandle($fh)){ #checks to see if file >>>>>>>>>>>>>>>>>>> handle >>>>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>>>> open >>>>>>>>>>>>>>>>>>> die "ERROR: No open filehandle >>>>>>>>>>>>>>>>>>> Iterator::GFF3\n"; >>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 12/03/2013 17:06, Carson Holt wrote: >>>>>>>>>>>>>>>>>>>> I get not errors, and don?t see any issues. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Could you replace the Iterator.pm in the lib directory >>>>>>>>>>>>>>>>>>>> with >>>>>>>>>>>>>>>>>>>> this >>>>>>>>>>>>>>>>>>>> one. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I >>>>>>>>>>>>>>>>>>>> added some more output to the STDERR if opening a >>>>>>>>>>>>>>>>>>>> filehandle >>>>>>>>>>>>>>>>>>>> fails. >>>>>>>>>>>>>>>>>>>> At >>>>>>>>>>>>>>>>>>>> least it should provide more information. Could you >>>>>>>>>>>>>>>>>>>> then >>>>>>>>>>>>>>>>>>>> let me >>>>>>>>>>>>>>>>>>>> know >>>>>>>>>>>>>>>>>>>> what >>>>>>>>>>>>>>>>>>>> it says. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> Carson >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On 13-03-12 12:35 PM, "Alex Marshall" >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> please find: maker_opts.ctl and test.fa attached >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 12/03/2013 16:31, Alex Marshall wrote: >>>>>>>>>>>>>>>>>>>>>> will send to you now... >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 16:29, Carson Holt wrote: >>>>>>>>>>>>>>>>>>>>>>> Could you send me the entire captured STDERR, your >>>>>>>>>>>>>>>>>>>>>>> maker_opts.ctl >>>>>>>>>>>>>>>>>>>>>>> file and >>>>>>>>>>>>>>>>>>>>>>> you test.fasta? 
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>> Carson
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On 13-03-12 12:23 PM, "Alex Marshall" wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> It is in fasta format not GFF format
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 16:16, Alex Marshall wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> I have been looking through maker_opts.ctl
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> #-----Genome (Required for De-Novo Annotation)
>>>>>>>>>>>>>>>>>>>>>>>>> genome=test.fna #genome sequence (fasta format or fasta embeded in GFF3)
>>>>>>>>>>>>>>>>>>>>>>>>> organism_type=eukaryotic #eukaryotic or prokaryotic. Default is eukaryotic
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I added the path to the genome, same error.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Alex
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 16:11, Carson Holt wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> What does your maker_opts.ctl file look like? What is the value for
>>>>>>>>>>>>>>>>>>>>>>>>>> genome? If you did not give a genome fasta file and are using a gff3
>>>>>>>>>>>>>>>>>>>>>>>>>> as input, is there a FASTA file embedded in it?
>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> --Carson >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> On 13-03-12 12:06 PM, "Alex Marshall" >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> [1] hard drive - enough space >>>>>>>>>>>>>>>>>>>>>>>>>>> [2] ./Build realclean - done >>>>>>>>>>>>>>>>>>>>>>>>>>> [3] delete the maker_path/perl directory and >>>>>>>>>>>>>>>>>>>>>>>>>>> maker_path/bin - >>>>>>>>>>>>>>>>>>>>>>>>>>> done >>>>>>>>>>>>>>>>>>>>>>>>>>> [4] LD_PRELOAD=/.....path...../openmpi/lib/libmpi.so - >>>>>>>>>>>>>>>>>>>>>>>>>>> done >>>>>>>>>>>>>>>>>>>>>>>>>>> [5] perl Build.PL - done >>>>>>>>>>>>>>>>>>>>>>>>>>> [6] installation of 2.27 worked >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> and back to original error: >>>>>>>>>>>>>>>>>>>>>>>>>>> STATUS: Parsing control files... >>>>>>>>>>>>>>>>>>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3 >>>>>>>>>>>>>>>>>>>>>>>>>>> --> rank=NA, hostname=frontend04 >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 15:26, Carson Holt wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>> So the odd unrelated errors you are getting suggest >>>>>>>>>>>>>>>>>>>>>>>>>>>> there >>>>>>>>>>>>>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>>>>>>>>>>>>> something >>>>>>>>>>>>>>>>>>>>>>>>>>>> else going on that needs to be resolved first. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Check your drive space 'df -h maker_path'. Make sure >>>>>>>>>>>>>>>>>>>>>>>>>>>> you >>>>>>>>>>>>>>>>>>>>>>>>>>>> don't >>>>>>>>>>>>>>>>>>>>>>>>>>>> just >>>>>>>>>>>>>>>>>>>>>>>>>>>> have >>>>>>>>>>>>>>>>>>>>>>>>>>>> a full hard drive. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Run './Build realclean', and delete the maker_path/perl directory and
>>>>>>>>>>>>>>>>>>>>>>>>>>>> maker_path/bin subdirectory completely.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Make sure to execute the export
>>>>>>>>>>>>>>>>>>>>>>>>>>>> LD_PRELOAD=/.....path...../openmpi/lib/libmpi.so command before ever
>>>>>>>>>>>>>>>>>>>>>>>>>>>> running 'perl Build.PL'
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Which version of OpenMPI are you using?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Carson
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 13-03-12 11:21 AM, "Alex Marshall" wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I am using openmpi and yes I ran ./Build install step.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Configuring MAKER with MPI support
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Can't exec "/bin/sh": Argument list too long at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /....path...../lib/perl5/Inline/C.pm line 801.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> A problem was encountered while attempting to compile and install your
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Inline C code.
The command that failed was: >>>>>>>>>>>>>>>>>>>>>>>>>>>>> /usr/bin/perl Makefile.PL > out.Makefile_PL >> 2>&1 >>>>>>>>>>>>>>>>>>>>>>>>>>>>> The build directory was: >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>> /....path...../maker2/maker-2.27/src/blib/build/Parallel >>>>>>>>>>>>>>>>>>>>>>>>>>>>> /A >>>>>>>>>>>>>>>>>>>>>>>>>>>>> pp >>>>>>>>>>>>>>>>>>>>>>>>>>>>> li >>>>>>>>>>>>>>>>>>>>>>>>>>>>> c >>>>>>>>>>>>>>>>>>>>>>>>>>>>> ati >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> on/ >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> MPI >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> To debug the problem, cd to the build directory, and >>>>>>>>>>>>>>>>>>>>>>>>>>>>> inspect >>>>>>>>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>> output >>>>>>>>>>>>>>>>>>>>>>>>>>>>> files. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>> /....path...../maker2/maker-2.27/src/lib/Parallel/Applic >>>>>>>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>>>>>>>>>>>> io >>>>>>>>>>>>>>>>>>>>>>>>>>>>> n/ >>>>>>>>>>>>>>>>>>>>>>>>>>>>> M >>>>>>>>>>>>>>>>>>>>>>>>>>>>> PI. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> pm >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> line 223. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 15:14, Carson Holt wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Also the place it is trying to load from >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>> /....path...../maker2/maker-2.27/src/blib/lib/auto/Para >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ll >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> el >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /A >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> p >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pli >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> cat >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ion >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /M >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PI >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /MPI.so >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> That is not the final install location? Did you >> run >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> './Build >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> install' >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> step? 
When that runs everything related to MPI >> will >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> here --> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>> /....path...../maker2/maker-2.27/perl/lib/auto/Parallel >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /A >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pp >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> li >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> c >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ati >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> on/ >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> MPI >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /M >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PI >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .so >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --Carson >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 13-03-12 11:11 AM, "Alex >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Marshall" >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> now getting mpi problems: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Can't load >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>> '/....path...../maker2/maker-2.27/src/blib/lib/auto/Para >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ll >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> el >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> / >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> App >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> lic >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ati >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> on 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /M >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PI/MPI.so' >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for module Parallel::Application::MPI: >>>>> libmpich.so.1.0: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> cannot >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> open >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> shared object file: No such file or directory at >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /usr/lib64/perl5/DynaLoader.pm line 200. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> at /....path...../lib/perl5/Inline.pm line >> 536. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>> /....path...../maker2/maker-2.27/src/lib/Parallel/Applic >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> io >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> n >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /MP >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I.p >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> m >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> line 223 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> you suggest: export >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> LD_PRELOAD=/.....path...../openmpi/lib/libmpi.so >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I do that, and run again, same error. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 14:43, Alex Marshall wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ok I will upgrade to 2.27 now. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 14:42, Carson Holt wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The original error is caused by an issue in >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Proc::ProcessTable >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> on >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> some >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> systems. I no longer use that module in maker >> for >>>>>>> that >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> reason. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> After >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> first error, you may have to delete the >> mpi_blastdb >>>>>>> and >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> any >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> files >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> with the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> extension .db in the maker.output directory >> before >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> retrying. I >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> recommend using 2.27. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Carson >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 13-03-12 10:40 AM, "Alex >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Marshall" >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I managed to fix that error. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I am using version 2.25-beta. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> new error: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 14:27, Carson Holt wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Could you try the 2.27 version of MAKER? You >> are >>>>>>> using >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2.10 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> correct? >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Carson >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 13-03-11 12:15 PM, "Alex >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Marshall" >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi to the maker-devel, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I am getting an error everytime I run the >> maker >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> script. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> symbol lookup error: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> /path/to/software/lib64/perl5/site_perl/5.8.8/x86_64-li >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> n >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ux- >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> thr >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ead >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -m >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ul >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ti/ >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> au >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to/Proc/ProcessTable/ProcessTable.so: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> undefined symbol: Perl_Tstack_sp_ptr >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Your help would be very appreciated. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best wishes, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ---------------- >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Edinburgh University >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The University of Edinburgh is a charitable >>>>> body, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> registered in >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Scotland, with registration number SC005336. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> _______________________________________________ >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> maker-devel mailing list >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> maker-devel at box290.bluehost.com >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> _ >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> yan >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> del >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> l-l >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ab >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .o >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> rg >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ----------------------------- >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex Marshall, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Room 3.54, Blaxter Lab, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Ashworth Laboratories, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Institute of Evolutionary Biology, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The King's Buildings, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The University of Edinburgh, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Edinburgh, EH9 3JT >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ----------------------------- >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> alex.marshall at ed.ac.uk >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> +44(0)131 650 7403 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ----------------------------- >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The University of Edinburgh is a charitable >> body, 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> registered in Scotland, with registration number SC005336.
From mikheyev at gmail.com Wed Mar 13 01:23:25 2013
From: mikheyev at gmail.com (Sasha Mikheyev)
Date: Wed, 13 Mar 2013 16:23:25 +0900
Subject: [maker-devel] duplicate CDS in annotation
In-Reply-To:
References:
Message-ID:

Dear Carson,

The new version does indeed fix the problem! However, I noticed that some of the CDS annotations were swallowed. This seems to affect ~600 genes, e.g.

input:
pbar_scf7180000349951 maker mRNA 98033 98530 . - . ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1;_AED=1.00;_QI=0|0|0|0|0|0|2|0|81;
pbar_scf7180000349951 maker exon 98393 98530 . - . ID=PB12301-RA:exon:10283;Parent=PB12301-RA;
pbar_scf7180000349951 maker exon 98033 98140 . - . ID=PB12301-RA:exon:10284;Parent=PB12301-RA;
pbar_scf7180000349951 maker CDS 98033 98140 . - 0 ID=PB12301-RA:cds:10114;Parent=PB12301-RA;
pbar_scf7180000349951 maker CDS 98393 98530 . - 0 ID=PB12301-RA:cds:10113;Parent=PB12301-RA;

output:
pbar_scf7180000349951 maker mRNA 98033 98530 . - . ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;_AED=0.38;_eAED=0.38;_QI=0|0|0.33|1|0.5|1|3|246|165;Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1,PB12301-RA
pbar_scf7180000349951 maker exon 98033 98530 . - . ID=PB12301-RA:exon:134;Parent=PB12301-RA
pbar_scf7180000349951 maker exon 98033 98140 . - . ID=PB12301-RA:exon:133;Parent=PB12301-RA
pbar_scf7180000349951 maker exon 98393 98530 . - . ID=PB12301-RA:exon:132;Parent=PB12301-RA
pbar_scf7180000349951 maker three_prime_UTR 98393 98530 . - . ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA
pbar_scf7180000349951 maker three_prime_UTR 98033 98140 . - . ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA
pbar_scf7180000349951 maker CDS 98033 98530 . - 0 ID=PB12301-RA:cds;Parent=PB12301-RA

Thank you,
Sasha

On Tue, Mar 12, 2013 at 10:37 PM, Carson Holt wrote:

> Yes. Try the newer version and see if you still have the issue.
>
> Thanks,
> Carson
>
>
> From: Sasha Mikheyev
> Date: Tuesday, 12 March, 2013 1:26 AM
> To: Carson Holt
> Cc: Barry Moore , <
> maker-devel at yandell-lab.org>
>
> Subject: Re: [maker-devel] duplicate CDS in annotation
>
> Hi Carson,
>
> I have been using version 2.10. Is it worth trying with a newer version?
>
> You can find the model file here.
> It is rather large, as it includes all of the output from the first maker
> run.
>
> Yours,
>
> Sasha
>
>
> On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt wrote:
>
>> I think the issue is that you are getting a match feature that is being
>> printed with the same ID as the mRNA feature. Correct?
>>
>> What version of MAKER are you using, and what does the file you are
>> giving to pred_gff or model_gff look like? Could you send them?
>>
>> Thanks,
>> Carson
>>
>>
>> From: Barry Moore
>> Date: Monday, 11 March, 2013 7:32 AM
>> To: Sasha Mikheyev
>> Cc:
>> Subject: Re: [maker-devel] duplicate CDS in annotation
>>
>> Hi Sasha,
>>
>> This gene model appears to be correctly formatted to me. In GFF3 format
>> the CDS features are allowed to span multiple lines and they share the same
>> ID to indicate that it is all the same feature. See the GFF3
>> specification on the Sequence Ontology website
>> (http://www.sequenceontology.org/resources/gff3.html), in particular
>> the description of the ID attribute, which specifies:
>>
>> ID Indicates the ID of the feature. IDs for each feature must be unique
>> within the scope of the GFF file. In the case of discontinuous features
>> (i.e. a single feature that exists over multiple genomic locations) the
>> same ID may appear on multiple lines. All lines that share an ID
>> collectively represent a single feature.
>>
>>
>> So each of those CDS lines forms one part of the single CDS feature for
>> this gene.
>>
>> B
>>
>> On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote:
>>
>> Dear Yandell lab,
>>
>> I am re-annotating the harvester ant genome using protein and RNA-seq
>> data. However, I get many artifacts like the one below. It seems that there
>> are several CDS records that should tie in to the same mRNA, but they are
>> really hanging out separately, and produce several nucleotide sequences
>> with the same name when extracted from the gff. I would appreciate any
>> guidance about how to fix this!
>> >> Thank you, >> >> Sasha >> >> grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff >> pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . >> ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; >> pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 >> 1 53 +;Gap=M159; >> pbar_scf7180000350377 maker mRNA 538308 558769 . + . >> ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; >> pbar_scf7180000350377 maker exon 538308 538334 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker exon 538748 538968 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker exon 539842 540242 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker exon 542624 542798 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker exon 555823 556025 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker exon 558609 558769 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker CDS 538308 538334 . + 0 >> ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker CDS 538748 538968 . + 0 >> ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker CDS 539842 540242 . 
+ 1 >> ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker CDS 542624 542798 . + 2 >> ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker CDS 555823 556025 . + 1 >> ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker CDS 558609 558769 . + 2 >> ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506; >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >> >> Barry Moore >> Research Scientist >> Dept. of Human Genetics >> University of Utah >> Salt Lake City, UT 84112 >> -------------------------------------------- >> (801) 585-3543 >> >> >> >> >> _______________________________________________ maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Hossein.Borhan at AGR.GC.CA Wed Mar 13 15:49:44 2013 From: Hossein.Borhan at AGR.GC.CA (Borhan, Hossein) Date: Wed, 13 Mar 2013 17:49:44 -0400 Subject: [maker-devel] do of the maker predicted proteins do not start with M Message-ID: <7B64340A44B6634C814A22BCFA6179D5020887CC@onottaxms5.AGR.GC.CA> Hi I have run maker and some of the protein predicted by maker does not start with a Methionine. 
I am not sure why Here are some examples >maker-scaffold00001-snap-gene-0.8-mRNA-1 protein AED:0.27496328928047 eAED:0.27496328928047 QI:0|0.4|0.16|0.5|1|1|6|0|453 VIIKFKTFAKASRSVELFGHEGAWARGDGYCNFKTESEKADRSVKSSCSLNIPFTYDVGR RQYVIKGDRFCLSHNHLVMIPSPTTVIVNDQRDLTPDQLSYIINLGKYSLPFPMVTRMLS DQFPDCRIQKPLLHRLLRKGKLQAFGGDRDAMNALINLGRSYEEHGGFFEIDIDVDCRLE KIWLARAEGLQFASVYNDVVQIDGGAKMNAYGFVFLPVTVIDCLGKSYVVGAMAGPSAEN KADVVKTLEYFRVKRSESVLIADDALAFRAAAVECDMVYHQCTKHYQAKIARACAGLGHE GKEFMIKANTLVYHIFPSEDAFFAKADEYRLMFLQYGGAVKLFDDIVDKRQQLCRTFTSC KFTGSHSSNQRAEGTISRTKRDVQPWLSRANLFEMFTHLEMIQKQQEDEAARLLSNLIRK GKHWSDYVDSIFRERQLNSRLLSSVREVDTGLH >snap-scaffold00087-abinit-gene-2.145-mRNA-1 protein AED:0.0539495114006514 eAED:0.0539495114006514 QI:2|1|0.6|1|1|1|5|0|817 ALSLHGTRQAFARVPPPCRRAHPAERRQPGGGMSADAPVKAGYLLKLTSSLSHWNRRYFI VADTKLFYCKTEDDLLRRKFQGEIDLAGAQIALYTRNDETAKRFSDHHHMLGVKPAGCDR IYILDADSEHAQKEWVACLRRHASQAPVSSPVDAAVAAAPRKDPQSVREGFLTKRGETIK NWKMRYFVLKGNYLHYYRSIEDAQPAGSILLLGTRTTAEPKAVTGMPHSFSIARADAKRK YMIHADSKEECDAWVGAIQQQSVFVRHAGTDSAPPEVVAPAAATPVHQQHQSRSSFGNRP NVADDSADDDEAALDEVALSNGPPALAPHGIANTGASTGLNLKQKVSKKKRRFVTDEFDL DLTYITENIIAMGFPAESMEAMFRNSMSDVQRFLDGRHPDAYRVYNLCSERDYDPAKFHH NVCRFPFDDHNCPNFEDLIPLCEDIHNWLSIQSDHVVAIHCKAGKGRTGLVICAYLLYSG AWRTARDALQFYGFVRTQDQKGVTIPSQIRYVEYFEQYMADPEILSRNNGPLVISEIFVG RGCRPFDTVTITNMGRRMNSKDWGKYWKDALDDGLLLQLPKGACQVDKDFKVEFLASGLL GKKTRVAGFWLHTAFIQDGVVDIDKSMIDKVNKEKDCPAFSIQVFFGGRTYVDRRCRIPV APPQPTGPLLLSPATVRIRNADPLPVPNPSSPSESPSFSAMSSVPSLSLESLSSVSSSLS PTTGPKAAPSPKKQDPGLDPGSPPGTVKSAPVAEAGAAPVDARSDNKAPARSCSLPHGRF PGDVAGDGAFQQAAVQVRIAFWNNLQSEALQRRNSRL >augustus-scaffold00087-abinit-gene-0.106-mRNA-1 protein AED:0.10935424621144 eAED:0.10935424621144 QI:1|1|0.66|1|1|1|3|0|483 STSTVFCPGLWNRVGRYVSGGLQTSPVTVPRQVRPICLATQPATDQPGYVMSSTSALVGI GVVTVALLCRWAPILVTTSGPGSPARSSADVMRIWADHDWSAGTTSLPISQESLLAKRVL SKSFDGLPPNLHVQDDTVPVALLRAHLNAGRHMRLRDMCPTAGACDLQGADPDHGLAPLH MAAMRDDRSSIAYLMALGADPDAMDRAGRQYRNLSFTNFVRNARRAAEERGSTCQLPEVN 
LAGLERADLDRSWAEIRRLAHEGEPVAIRGLLGAYDRSDVLDWDLDAFLTRHGHVPVNVG DVPYAQYFGLPIQSMPLSKYVASLAPGSASYVFAKDDGICRDALQILDRFARDALPPYFV SPAALGSDAVHFYLGNKGSGAPFHLHSDAVNLLAHGSKTWFVTPPPQSVYSRTPIGEFAA NGTSGIESLRCEQNPGDAIYIPFDWGHAVLNNEDSTFGFAVELLNKRDSLHFLRPSSQVP AGQ Regards Hossein -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Wed Mar 13 15:40:39 2013 From: carsonhh at gmail.com (Carson Holt) Date: Wed, 13 Mar 2013 17:40:39 -0400 Subject: [maker-devel] ERROR: Could not obtain lock to format database In-Reply-To: Message-ID: Could you check to make sure your hard drive is not full, and that whatever location you set as TMP= in the control files is not full (default is /tmp)? Also make sure you do not set /tmp to an NFS-mounted or a tmpfs location. Could you also send the full captured STDERR? Thanks, Carson From: Hung-Wei Hsu Date: Tuesday, 12 March, 2013 8:24 PM To: Subject: [maker-devel] ERROR: Could not obtain lock to format database Hi MAKER developers, I tried MAKER 2.27b on one E. coli scaffold sequence with the UniProt protein database. I failed to run the analysis and got the error message below. Could not obtain lock to format database at maker-2.27b/bin/../lib/GI.pm Any suggestions or help will be deeply appreciated. Best regards, Hung-Wei _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Wed Mar 13 15:47:06 2013 From: carsonhh at gmail.com (Carson Holt) Date: Wed, 13 Mar 2013 17:47:06 -0400 Subject: [maker-devel] duplicate CDS in annotation In-Reply-To: Message-ID: The output shows that the original model was Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1 and the new model replacing it is Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1.
So it is really a completely different model (as one derived from SNAP and one from GeneMark). I'm guessing you have map_forward=1 set and are using the GFF3 passthrough options correct? Thanks, Carson -------------- next part -------------- An HTML attachment was scrubbed...
URL: From carsonhh at gmail.com Wed Mar 13 18:26:25 2013 From: carsonhh at gmail.com (Carson Holt) Date: Wed, 13 Mar 2013 20:26:25 -0400 Subject: [maker-devel] do of the maker predicted proteins do not start with M In-Reply-To: <7B64340A44B6634C814A22BCFA6179D5020887CC@onottaxms5.AGR.GC.CA> Message-ID: SNAP and other gene prediction programs are capable of producing partial models if they can't find reasonable start and stop codons. You can set always_complete=1 in the maker_opts.ctl file to get MAKER to walk forward and backwards to search for starts and stops after the ab initio predictors do their work in an attempt to force model completion. Thanks, Carson -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Wed Mar 13 20:54:55 2013 From: carsonhh at gmail.com (Carson Holt) Date: Wed, 13 Mar 2013 22:54:55 -0400 Subject: [maker-devel] duplicate CDS in annotation In-Reply-To: Message-ID: Yes. map_forward=1 allows new models to keep the names of the models they replace. It makes it so you don't have to relocate genes every time a model gets a slight modification during reannotation. --Carson -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikheyev at gmail.com Wed Mar 13 19:17:40 2013 From: mikheyev at gmail.com (Sasha Mikheyev) Date: Thu, 14 Mar 2013 10:17:40 +0900 Subject: [maker-devel] duplicate CDS in annotation In-Reply-To: References: Message-ID: OK. Got it! I did pass through the gene model names. I guess I now see that a new gene model may become associated with the old name in the re-annotation. Sasha -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikheyev at gmail.com Wed Mar 13 21:34:52 2013 From: mikheyev at gmail.com (Sasha Mikheyev) Date: Thu, 14 Mar 2013 12:34:52 +0900 Subject: [maker-devel] duplicate CDS in annotation In-Reply-To: References: Message-ID: Thank you very much! Problem solved! Sasha On Thu, Mar 14, 2013 at 11:54 AM, Carson Holt wrote: > Yes. map_forward=1 allows new models to keep the names of the models they > replace.
It makes it so you don't have to relocate genes every time a > model gets a slight modification during reannotation. > > --Carson > > > From: Sasha Mikheyev > Date: Wednesday, 13 March, 2013 9:17 PM > > To: Carson Holt > Cc: Barry Moore , < > maker-devel at yandell-lab.org> > Subject: Re: [maker-devel] duplicate CDS in annotation > > OK. Got it! I did pass through the gene model names. I guess I now see > that a new gene model may become associated with the old name in the > re-annotation. > > Sasha > > On Thu, Mar 14, 2013 at 6:47 AM, Carson Holt wrote: > >> The output shows that the original model >> was Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1 and the new >> model replacing it is >> Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1. >> >> So it is really a completely different model (as one derived from SNAP >> and one from GeneMark). I'm guessing you have map_forward=1 set and are >> using the GFF3 passthrough options correct? >> >> Thanks, >> Carson >> >> >> >> From: Sasha Mikheyev >> Date: Wednesday, 13 March, 2013 3:23 AM >> >> To: Carson Holt >> Cc: Barry Moore , < >> maker-devel at yandell-lab.org> >> Subject: Re: [maker-devel] duplicate CDS in annotation >> >> Dear Carson, >> >> The new version does indeed fix the problem! >> >> However, I noticed that some of the CDS annotations were swallowed. This >> seems to affect a ~600 genes. >> >> e.g. input: >> >> pbar_scf7180000349951 maker mRNA 98033 98530 . - . >> ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1;_AED=1.00;_QI=0|0|0|0|0|0|2|0|81; >> pbar_scf7180000349951 maker exon 98393 98530 . - . >> ID=PB12301-RA:exon:10283;Parent=PB12301-RA; >> pbar_scf7180000349951 maker exon 98033 98140 . - . >> ID=PB12301-RA:exon:10284;Parent=PB12301-RA; >> pbar_scf7180000349951 maker CDS 98033 98140 . - 0 >> ID=PB12301-RA:cds:10114;Parent=PB12301-RA; >> pbar_scf7180000349951 maker CDS 98393 98530 . 
- 0 >> ID=PB12301-RA:cds:10113;Parent=PB12301-RA; >> >> output: >> >> pbar_scf7180000349951 maker mRNA 98033 98530 . - . >> ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;_AED=0.38;_eAED=0.38;_QI=0|0|0.33|1|0.5|1|3|246|165;Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1,PB12301-RA >> pbar_scf7180000349951 maker exon 98033 98530 . - . >> ID=PB12301-RA:exon:134;Parent=PB12301-RA >> pbar_scf7180000349951 maker exon 98033 98140 . - . >> ID=PB12301-RA:exon:133;Parent=PB12301-RA >> pbar_scf7180000349951 maker exon 98393 98530 . - . >> ID=PB12301-RA:exon:132;Parent=PB12301-RA >> pbar_scf7180000349951 maker three_prime_UTR 98393 98530 . - . >> ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA >> pbar_scf7180000349951 maker three_prime_UTR 98033 98140 . - . >> ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA >> pbar_scf7180000349951 maker CDS 98033 98530 . - 0 >> ID=PB12301-RA:cds;Parent=PB12301-RA >> >> Thank you, >> >> Sasha >> >> On Tue, Mar 12, 2013 at 10:37 PM, Carson Holt wrote: >> >>> Yes. Try the newer version and see if you still have the issue. >>> >>> Thanks, >>> Carson >>> >>> >>> From: Sasha Mikheyev >>> Date: Tuesday, 12 March, 2013 1:26 AM >>> To: Carson Holt >>> Cc: Barry Moore , < >>> maker-devel at yandell-lab.org> >>> >>> Subject: Re: [maker-devel] duplicate CDS in annotation >>> >>> Hi Carson, >>> >>> I have been using version 2.10. Is it worth trying with a newer version? >>> >>> You can find the model file here. >>> It is rather large, as it includes all of the output from the first maker >>> run. >>> >>> Yours, >>> >>> Sasha >>> >>> >>> On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt wrote: >>> >>>> I think the issue is that you are getting a match feature that is being >>>> printed with the same ID as the mRNA feature. Correct? >>>> >>>> What version of MAKER are you using, and what does the gile you are >>>> giving to pred_gff or model_gff look like? Could you send them? 
>>>> >>>> Thanks, >>>> Carson >>>> >>>> >>>> From: Barry Moore >>>> Date: Monday, 11 March, 2013 7:32 AM >>>> To: Sasha Mikheyev >>>> Cc: >>>> Subject: Re: [maker-devel] duplicate CDS in annotation >>>> >>>> Hi Sasha, >>>> >>>> This gene model appears to be correctly formatted to me. In GFF3 >>>> format the CDS features are allowed to span multiple lines and they share >>>> the same ID to indicate that it is all the same features. See the GFF3 >>>> specification on the Sequence Ontology website ( >>>> http://www.sequenceontology.org/resources/gff3.html), and in >>>> particular the description of the ID attribute specifies: >>>> >>>> ID Indicates the ID of the feature. IDs for each feature must be unique >>>> within the scope of the GFF file. In the case of discontinuous features >>>> (i.e. a single feature that exists over multiple genomic locations) the >>>> same ID may appear on multiple lines. All lines that share an ID >>>> collectively represent a single feature. >>>> >>>> >>>> So each of those CDS lines forms one part of the single CDS feature for >>>> this gene. >>>> >>>> B >>>> >>>> On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote: >>>> >>>> Dear Yandell lab, >>>> >>>> I am re-annotating the harvester and genome using protein and RNA-seq >>>> data. However, I get many artifacts like the one below. It seems that there >>>> are several CDS records that should tie in to the same mRNA, but they are >>>> really hanging out separately, and produce several nucleotide sequences >>>> with the same name when extracted from the gff. I would appreciate any >>>> guidance about how to fix this! >>>> >>>> Thank you, >>>> >>>> Sasha >>>> >>>> grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff >>>> pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - >>>> . ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; >>>> pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . 
ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 >>>> 1 53 +;Gap=M159; >>>> pbar_scf7180000350377 maker mRNA 538308 558769 . + . >>>> ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; >>>> pbar_scf7180000350377 maker exon 538308 538334 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker exon 538748 538968 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker exon 539842 540242 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker exon 542624 542798 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker exon 555823 556025 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker exon 558609 558769 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker CDS 538308 538334 . + 0 >>>> ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker CDS 538748 538968 . + 0 >>>> ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker CDS 539842 540242 . + 1 >>>> ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker CDS 542624 542798 . + 2 >>>> ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker CDS 555823 556025 . 
+ 1 >>>> ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker CDS 558609 558769 . + 2 >>>> ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506; >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>>> >>>> >>>> Barry Moore >>>> Research Scientist >>>> Dept. of Human Genetics >>>> University of Utah >>>> Salt Lake City, UT 84112 >>>> -------------------------------------------- >>>> (801) 585-3543 >>>> >>>> >>>> >>>> >>>> _______________________________________________ maker-devel mailing >>>> list maker-devel at box290.bluehost.com >>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ramonfallon at gmail.com Thu Mar 14 09:19:47 2013 From: ramonfallon at gmail.com (=?ISO-8859-1?Q?Ram=F3n_Fallon?=) Date: Thu, 14 Mar 2013 16:19:47 +0100 Subject: [maker-devel] 12core speed check Message-ID: Hi, I was trying to tweak some of our machines to maximise Mpich2/Maker (svn rev 997) throughput and describe one small set of results on this mailing list to allow sharing of experiences. I use the example input dataset "dpp_contig.fasta" with the original sequence repeated 125 times within the same file (under different names of course) to allow for a decent size run. This file totalled 4.019 megabases. I use the dpp_proteins.fasta and The maker_opts.ctl has "cpus=1" set as the docs recommend for MPI. Hardware is a standalone HP Proliant SL390 with two Intel X5675 @ 3.07GHz, totalling 12 cores with 192GB RAM and 1TB disk (local, no NFS) running Ubuntu 10.04 with 2.6.32-41 linux kernel commandline was "mpiexec -n <#cores> maker" within a dedicated directory containing all relevant files. 
#cores  time(mins)  Megabases/hr
     1       27.00          8.93
     2      126.25          1.91
     4       42.57          5.66
     6       25.42          9.49
     8       18.60         12.96
    10       16.67         14.47
    12       13.98         17.24

I attach a png file with graph. The upshot of this particular experiment is that 2 processes show anomalous behaviour and that 6 processors are needed to gain an advantage on the 1 processor run, while 12 processors achieves a speed-up of nearly 2 on the 1 processor version.

I am now going to move on to a three node cluster with 2x 8core processors each (so I can go up to 48 processors), so will report back with higher core numbers. Any suggestions on further speed optimizations welcome.

Cheers / Ramón.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: twelvecore_spup.png
Type: image/png
Size: 25749 bytes
Desc: not available
URL:

From carsonhh at gmail.com Thu Mar 14 09:53:33 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 14 Mar 2013 11:53:33 -0400
Subject: [maker-devel] 12core speed check
In-Reply-To:
Message-ID:

I can give a similar setup a try as well to see if anything is amiss in the development version. The expected behavior is that 1 and 2 cores should have identical performance (as one process is always fully dedicated to communication).

--Carson

From: Ramón Fallon
Date: Thursday, 14 March, 2013 11:19 AM
To:
Subject: [maker-devel] 12core speed check

Hi,

I was trying to tweak some of our machines to maximise Mpich2/Maker (svn rev 997) throughput and describe one small set of results on this mailing list to allow sharing of experiences.

I use the example input dataset "dpp_contig.fasta" with the original sequence repeated 125 times within the same file (under different names of course) to allow for a decent size run. This file totalled 4.019 megabases. I use the dpp_proteins.fasta and The maker_opts.ctl has "cpus=1" set as the docs recommend for MPI.
Hardware is a standalone HP Proliant SL390 with two Intel X5675 @ 3.07GHz, totalling 12 cores with 192GB RAM and 1TB disk (local, no NFS) running Ubuntu 10.04 with 2.6.32-41 linux kernel commandline was "mpiexec -n <#cores> maker" within a dedicated directory containing all relevant files. #cores time(mins) Megabases/hr 1 27.00 8.93 2 126.25 1.91 4 42.57 5.66 6 25.42 9.49 8 18.60 12.96 10 16.67 14.47 12 13.98 17.24 I attach a png file with graph. The upshot of this particular experiment is that 2 processes show anomalous behaviour and that 6 processors are needed to gain an advantage on the 1 processor run, while 12 processors achieves a speed-up of nearly 2 on the 1 processor version. I am now going to move on to a three node cluster with 2x 8core processors each (so I can go up to 48 processors), so will report back with higher core numbers. Any suggestions on further speed optimizations welcome. Cheers / Ram?n. _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnuhn at ebi.ac.uk Thu Mar 14 10:20:01 2013 From: mnuhn at ebi.ac.uk (Michael Nuhn) Date: Thu, 14 Mar 2013 16:20:01 +0000 Subject: [maker-devel] master_datastore_index.log file shrinks. Message-ID: <5141F8B1.7020808@ebi.ac.uk> Hello! I'm trying to keep track of the progress of maker (version 2.27) while it is running by looking at the master_datastore_index.log file every once in a while. Sometimes the number of lines in it decreases. Just now it went down from more than two hundred to thirty seven. When I start more instances of maker, the number of lines in it increases when they start. But sometimes I check and the number of lines has greatly reduced since the last time. 
I'm afraid that the newer instances of maker are deleting the file and starting it from scratch instead of adding their progress to it. Is this a file locking issue I should be worried about? Cheers, Michael. From olaf.mueller at duke.edu Thu Mar 14 10:13:20 2013 From: olaf.mueller at duke.edu (Olaf Mueller) Date: Thu, 14 Mar 2013 12:13:20 -0400 Subject: [maker-devel] 12core speed check In-Reply-To: References: Message-ID: <5141F720.20502@duke.edu> The X5675 supports hyperthreading. Does i.e. "mpiexec -n 2 maker" use 2 physical cores or 2 threads of the 1st core? If the latter happens it would be interesting to see your series extended to -n 24. Cheers Olaf On 03/14/2013 11:19 AM, Ram?n Fallon wrote: > Hi, > > I was trying to tweak some of our machines to maximise Mpich2/Maker > (svn rev 997) throughput and describe one small set of results on > this mailing list to allow sharing of experiences. > > I use the example input dataset "dpp_contig.fasta" with the original > sequence repeated 125 times within the same file (under different > names of course) to allow for a decent size run. This file totalled > 4.019 megabases. I use the dpp_proteins.fasta and The maker_opts.ctl > has "cpus=1" set as the docs recommend for MPI. > > Hardware is a standalone HP Proliant SL390 with two Intel X5675 @ > 3.07GHz, totalling 12 cores with 192GB RAM and 1TB disk (local, no > NFS) running Ubuntu 10.04 with 2.6.32-41 linux kernel > > commandline was "mpiexec -n <#cores> maker" within a dedicated > directory containing all relevant files. > > #cores time(mins) Megabases/hr > 1 27.00 8.93 > 2 126.25 1.91 > 4 42.57 5.66 > 6 25.42 9.49 > 8 18.60 12.96 > 10 16.67 14.47 > 12 13.98 17.24 > > I attach a png file with graph. The upshot of this particular > experiment is that 2 processes show anomalous behaviour and that 6 > processors are needed to gain an advantage on the 1 processor run, > while 12 processors achieves a speed-up of nearly 2 on the 1 processor > version. 
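The throughput column and the speed-up figure in the quoted benchmark follow directly from the reported run times, so they can be recomputed with a few lines of Python. This is only a sketch: the timings and the 4.019 megabase input size are copied from the message, while the function names and the extra parallel-efficiency column are illustrative additions, not part of MAKER or of the original analysis.

```python
# Wall-clock timings copied from the quoted benchmark table:
# number of MPI processes -> run time in minutes.
RUN_MINUTES = {1: 27.00, 2: 126.25, 4: 42.57, 6: 25.42,
               8: 18.60, 10: 16.67, 12: 13.98}

# Size of the replicated dpp_contig.fasta input, in megabases.
INPUT_MEGABASES = 4.019


def throughput_mb_per_hr(minutes):
    """Megabases processed per hour for a run of the given length."""
    return INPUT_MEGABASES / (minutes / 60.0)


def speedup(n_procs):
    """Speed-up of an n-process run relative to the 1-process run."""
    return RUN_MINUTES[1] / RUN_MINUTES[n_procs]


def efficiency(n_procs):
    """Parallel efficiency: speed-up divided by the number of processes
    (an illustrative extra column, not reported in the original table)."""
    return speedup(n_procs) / n_procs


if __name__ == "__main__":
    print(f"{'procs':>5} {'Mb/hr':>7} {'speedup':>8} {'efficiency':>10}")
    for n in sorted(RUN_MINUTES):
        print(f"{n:>5} {throughput_mb_per_hr(RUN_MINUTES[n]):>7.2f} "
              f"{speedup(n):>8.2f} {efficiency(n):>10.2f}")
```

With these numbers the 12-process run works out to 27.00 / 13.98, a speed-up of roughly 1.93, and 6 processes is the first multi-process count that beats the single-process time.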
> > I am now going to move on to a three node cluster with 2x 8core > processors each (so I can go up to 48 processors), so will report back > with higher core numbers. Any suggestions on further speed > optimizations welcome. > > Cheers / Ram?n. > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Thu Mar 14 10:21:47 2013 From: carsonhh at gmail.com (Carson Holt) Date: Thu, 14 Mar 2013 12:21:47 -0400 Subject: [maker-devel] master_datastore_index.log file shrinks. In-Reply-To: <5141F8B1.7020808@ebi.ac.uk> Message-ID: The file should only be deleted if there are no instances running and a new one starts. Then it rebuilds it. If it is being deleted while other instances are still active, then yes that is a lock issue. There are several other locks that should protect individual contigs while that particular lock is only protecting the datastore_index.log file. If any of the contig locks are not working you would start to see failures of contigs with weird errors that say there are missing files. Try dialling back on the number of simultaneous instances you start and instead use MPI or the -cpus option to get the parallelization boost. 
Alternatively you can also split up the input file and use the -base option so everything gets written to the same place (then you never have to worry about locks affecting individual contigs - as no single instance has access to all the contigs)

Example:
fasta_tool --chunks 5 maize_assembly.fasta
maker -g maize_assembly_0.fasta -base maize_assembly
maker -g maize_assembly_1.fasta -base maize_assembly
maker -g maize_assembly_2.fasta -base maize_assembly
maker -g maize_assembly_3.fasta -base maize_assembly
maker -g maize_assembly_4.fasta -base maize_assembly
maker -dsindex

Everything then gets written to maize_assembly.maker.output for all results. The last call to maker with the -dsindex flag then rebuilds the datastore_index.log file to match the original maize_assembly.fasta file

Thanks,
Carson

On 13-03-14 12:20 PM, "Michael Nuhn" wrote:

>Hello!
>
>I'm trying to keep track of the progress of maker (version 2.27) while
>it is running by looking at the master_datastore_index.log file every
>once in a while.
>
>Sometimes the number of lines in it decreases. Just now it went down
>from more than two hundred to thirty seven.
>
>When I start more instances of maker, the number of lines in it
>increases when they start. But sometimes I check and the number of lines
>has greatly reduced since the last time.
>
>I'm afraid that the newer instances of maker are deleting the file and
>starting it from scratch instead of adding their progress to it.
>
>Is this a file locking issue I should be worried about?
>
>Cheers,
>Michael.
>
>_______________________________________________
>maker-devel mailing list
>maker-devel at box290.bluehost.com
>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From mnuhn at ebi.ac.uk Thu Mar 14 10:49:19 2013
From: mnuhn at ebi.ac.uk (Michael Nuhn)
Date: Thu, 14 Mar 2013 16:49:19 +0000
Subject: [maker-devel] master_datastore_index.log file shrinks.
In-Reply-To:
References:
Message-ID: <5141FF8F.2050900@ebi.ac.uk>

Hello Carson!
Thanks for your quick response and your ideas. I'll give them a try. Cheers, Michael. On 03/14/2013 04:21 PM, Carson Holt wrote: > The file should only be deleted if there are no instances running and a > new one starts. Then it rebuilds it. If it is being deleted while other > instances are still active, then yes that is a lock issue. There are > several other locks that should protect individual contigs while that > particular lock is only protecting the datastore_index.log file. > > If any of the contig locks are not working you would start to see failures > of contigs with weird errors that say there are missing files. > > Try dialling back on the number of simultaneous instances you start and > instead use MPI or the -cpus option to get the parallelization boost. > Alternatively you can also split up the input file and use the -base > option so everything gets written to the same place (then you never have > to worry about locks affecting individual contigs - as no single instance > has access to all the contigs) > > Example: > fasta_tool --chunks 5 maize_assembly.fasta > maker -g maize_assembly_0.fasta -base maize_assembly > maker -g maize_assembly_1.fasta -base maize_assembly > > maker -g maize_assembly_2.fasta -base maize_assembly > > maker -g maize_assembly_3.fasta -base maize_assembly > > maker -g maize_assembly_4.fasta -base maize_assembly > > maker -dsindex > > Everything then gets written to maize_assembly.maker.output for all > results. The last call to maker with the -dsindex flag then rebuilds the > datastore_index.log file to match the original maize_assembly.fasta file > > > Thanks, > Carson > > > > > > On 13-03-14 12:20 PM, "Michael Nuhn" wrote: > >> Hello! >> >> I'm trying to keep track of the progress of maker (version 2.27) while >> it is running by looking at the master_datastore_index.log file every >> once in a while. >> >> Sometimes the number of lines in it decreases. Just now it went down >>from more than two hundred to thirty seven. 
>> >> When I start more instances of maker, the number of lines in it >> increases when they start. But sometimes I check and the number of lines >> has greatly reduced since the last time. >> >> I'm afraid that the newer instances of maker are deleting the file and >> starting it from scratch instead of adding their progress to it. >> >> Is this a file locking issue I should be worried about? >> >> Cheers, >> Michael. >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > From carsonhh at gmail.com Thu Mar 14 11:51:41 2013 From: carsonhh at gmail.com (Carson Holt) Date: Thu, 14 Mar 2013 13:51:41 -0400 Subject: [maker-devel] 12core speed check In-Reply-To: Message-ID: Could you update to 998. It was a recent commit to the devel version that caused a weird pause. Thanks, Carson From: Ram?n Fallon Date: Thursday, 14 March, 2013 11:19 AM To: Subject: [maker-devel] 12core speed check Hi, I was trying to tweak some of our machines to maximise Mpich2/Maker (svn rev 997) throughput and describe one small set of results on this mailing list to allow sharing of experiences. I use the example input dataset "dpp_contig.fasta" with the original sequence repeated 125 times within the same file (under different names of course) to allow for a decent size run. This file totalled 4.019 megabases. I use the dpp_proteins.fasta and The maker_opts.ctl has "cpus=1" set as the docs recommend for MPI. Hardware is a standalone HP Proliant SL390 with two Intel X5675 @ 3.07GHz, totalling 12 cores with 192GB RAM and 1TB disk (local, no NFS) running Ubuntu 10.04 with 2.6.32-41 linux kernel commandline was "mpiexec -n <#cores> maker" within a dedicated directory containing all relevant files. #cores time(mins) Megabases/hr 1 27.00 8.93 2 126.25 1.91 4 42.57 5.66 6 25.42 9.49 8 18.60 12.96 10 16.67 14.47 12 13.98 17.24 I attach a png file with graph. 
The upshot of this particular experiment is that 2 processes show anomalous behaviour and that 6 processors are needed to gain an advantage on the 1 processor run, while 12 processors achieves a speed-up of nearly 2 on the 1 processor version. I am now going to move on to a three node cluster with 2x 8core processors each (so I can go up to 48 processors), so will report back with higher core numbers. Any suggestions on further speed optimizations welcome. Cheers / Ram?n. _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Thu Mar 14 11:55:38 2013 From: carsonhh at gmail.com (Carson Holt) Date: Thu, 14 Mar 2013 13:55:38 -0400 Subject: [maker-devel] 12core speed check In-Reply-To: <5141F720.20502@duke.edu> Message-ID: It should use 2 physical cores. Hyperthreading shouldn't come into play unless you start more processes than there are physical cores. I haven't seen any big performance advantage in most cases with hyperthreading on linux machines. I find more often than not it just confuses students into thinking there are free processors and then starting too many jobs. --Carson From: Olaf Mueller Date: Thursday, 14 March, 2013 12:13 PM To: Subject: Re: [maker-devel] 12core speed check The X5675 supports hyperthreading. Does i.e. "mpiexec -n 2 maker" use 2 physical cores or 2 threads of the 1st core? If the latter happens it would be interesting to see your series extended to -n 24. Cheers Olaf On 03/14/2013 11:19 AM, Ram?n Fallon wrote: > Hi, > > > > I was trying to tweak some of our machines to maximise Mpich2/Maker (svn rev > 997) throughput and describe one small set of results on this mailing list to > allow sharing of experiences. 
> > > > > I use the example input dataset "dpp_contig.fasta" with the original sequence > repeated 125 times within the same file (under different names of course) to > allow for a decent size run. This file totalled 4.019 megabases. I use the > dpp_proteins.fasta and The maker_opts.ctl has "cpus=1" set as the docs > recommend for MPI. > > > > > Hardware is a standalone HP Proliant SL390 with two Intel X5675 @ 3.07GHz, > totalling 12 cores with 192GB RAM and 1TB disk (local, no NFS) running Ubuntu > 10.04 with 2.6.32-41 linux kernel > > > > > commandline was "mpiexec -n <#cores> maker" within a dedicated directory > containing all relevant files. > > > > > > #cores time(mins) Megabases/hr > > 1 27.00 8.93 > > 2 126.25 1.91 > > 4 42.57 5.66 > > 6 25.42 9.49 > > 8 18.60 12.96 > > 10 16.67 14.47 > > 12 13.98 17.24 > > > > > > I attach a png file with graph. The upshot of this particular experiment is > that 2 processes show anomalous behaviour and that 6 processors are needed to > gain an advantage on the 1 processor run, while 12 processors achieves a > speed-up of nearly 2 on the 1 processor version. > > > > > I am now going to move on to a three node cluster with 2x 8core processors > each (so I can go up to 48 processors), so will report back with higher core > numbers. Any suggestions on further speed optimizations welcome. > > > > > Cheers / Ram?n. > > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.comhttp://box290.bluehost.com/mailman/listinfo/mak > er-devel_yandell-lab.org > _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From myandell at genetics.utah.edu Thu Mar 14 11:59:37 2013 From: myandell at genetics.utah.edu (Mark Yandell) Date: Thu, 14 Mar 2013 17:59:37 +0000 Subject: [maker-devel] 12core speed check In-Reply-To: References: , Message-ID: <7A60AB257EFF2B48B1F4C814817EA05350ED9082@mxb2.hg.genetics.utah.edu> Thanks Ramon. super interesting analysis! Mark Yandell Professor of Human Genetics H.A. & Edna Benning Presidential Endowed Chair Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:801-587-7707 ________________________________________ From: maker-devel-bounces at yandell-lab.org [maker-devel-bounces at yandell-lab.org] on behalf of Carson Holt [carsonhh at gmail.com] Sent: Thursday, March 14, 2013 11:51 AM To: Ram?n Fallon; maker-devel at yandell-lab.org Subject: Re: [maker-devel] 12core speed check Could you update to 998. It was a recent commit to the devel version that caused a weird pause. Thanks, Carson From: Ram?n Fallon > Date: Thursday, 14 March, 2013 11:19 AM To: > Subject: [maker-devel] 12core speed check Hi, I was trying to tweak some of our machines to maximise Mpich2/Maker (svn rev 997) throughput and describe one small set of results on this mailing list to allow sharing of experiences. I use the example input dataset "dpp_contig.fasta" with the original sequence repeated 125 times within the same file (under different names of course) to allow for a decent size run. This file totalled 4.019 megabases. I use the dpp_proteins.fasta and The maker_opts.ctl has "cpus=1" set as the docs recommend for MPI. Hardware is a standalone HP Proliant SL390 with two Intel X5675 @ 3.07GHz, totalling 12 cores with 192GB RAM and 1TB disk (local, no NFS) running Ubuntu 10.04 with 2.6.32-41 linux kernel commandline was "mpiexec -n <#cores> maker" within a dedicated directory containing all relevant files. 

#cores  time(mins)  Megabases/hr
     1       27.00          8.93
     2      126.25          1.91
     4       42.57          5.66
     6       25.42          9.49
     8       18.60         12.96
    10       16.67         14.47
    12       13.98         17.24

I attach a png file with graph. The upshot of this particular experiment is that 2 processes show anomalous behaviour and that 6 processors are needed to gain an advantage on the 1 processor run, while 12 processors achieves a speed-up of nearly 2 on the 1 processor version.

I am now going to move on to a three node cluster with 2x 8core processors each (so I can go up to 48 processors), so will report back with higher core numbers. Any suggestions on further speed optimizations welcome.

Cheers / Ramón.

-------------- next part --------------
An HTML attachment was scrubbed...
From ares711122 at gmail.com Thu Mar 14 21:13:55 2013
From: ares711122 at gmail.com (Hung-Wei Hsu)
Date: Fri, 15 Mar 2013 11:13:55 +0800
Subject: [maker-devel] ERROR: Could not obtain lock to format database

You may find the error messages in the run log as attached.
Thanks a lot in advance.

Best regards,
Hung-Wei

2013/3/14 Carson Holt

> Could you check to make sure your hard drive is not full, and that whatever
> location you set as TMP= in the control files is not full (default is
> /tmp). Also make sure you do not set /tmp to an NFS mounted or a tmpfs
> location.
>
> Could you also send the full captured STDERR.
>
> Thanks,
> Carson
>
> From: Hung-Wei Hsu
> Date: Tuesday, 12 March, 2013 8:24 PM
> Subject: [maker-devel] ERROR: Could not obtain lock to format database
>
> Hi MAKER developers,
>
> I tried MAKER 2.27b on one E. coli scaffold sequence with uniprot protein
> database.
> I failed to run the analysis and got an error message as below.
>
> Could not obtain lock to format database at maker-2.27b/bin/../lib/GI.pm
>
> Any suggestions or helps will be deeply appreciated.
>
> Best regards,
> Hung-Wei

-------------- next part --------------
A non-text attachment was scrubbed...
Name: run.log
Type: application/octet-stream
Size: 27205 bytes
Desc: not available

From ares711122 at gmail.com Thu Mar 14 21:35:09 2013
From: ares711122 at gmail.com (Hung-Wei Hsu)
Date: Fri, 15 Mar 2013 11:35:09 +0800
Subject: [maker-devel] ERROR: Could not obtain lock to format database

The hard disk where I tried MAKER is about 2TB in size.
TMP was not set to an NFS mounted or a tmpfs location and was empty before analysis.
The hard disk where the TMP directory was located was about 2TB in size.
Thanks a lot in advance.

Best regards,
Hung-Wei

From carsonhh at gmail.com Fri Mar 15 12:06:21 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 15 Mar 2013 14:06:21 -0400
Subject: [maker-devel] ERROR: Could not obtain lock to format database

Were you by any chance running multiple instances of MAKER at the same time in the same directory? It looks like two processes started to work on the same contig (normally a first set of locks blocks this possibility - but rarely they get past that step). Then when it got to a part where an analysis is performed, one properly failed when it realized that the other had the lock.
In any case, it looks like it just retried and finished the contig in question. So the snippet seems to indicate expected behavior. Do you see the contig in question as being finished and having an output GFF3?

--Carson

From ramonfallon at gmail.com Mon Mar 18 08:35:04 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Mon, 18 Mar 2013 15:35:04 +0100
Subject: [maker-devel] Fwd: 12core speed check
References: <7A60AB257EFF2B48B1F4C814817EA05350ED9082@mxb2.hg.genetics.utah.edu>

Hi!

I've tried again from two different machines, and I can't do a "svn co" nor "svn update" on the malachite server.
Can you verify the server and the svn service are OK on your side?

Many thanks / Ramón.

On Fri, Mar 15, 2013 at 1:18 PM, Ramón Fallon wrote:

> Hi Mark and Carson,
>
> Many thanks for the comments and the speedy replies!
>
> Previously, I never had a problem connecting to the svn server on
> malachite.genetics.utah.edu, but this morning, I couldn't connect to
> update to rev 998.
>
> I'll try again later.
>
> Cheers / Ramón.

From carsonhh at gmail.com Mon Mar 18 08:51:37 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 18 Mar 2013 10:51:37 -0400
Subject: [maker-devel] Fwd: 12core speed check

For any users currently using the devel subversion repository: if you need to update, please send me an e-mail to get information on how to switch over to our new server.

Thanks,
Carson

From hudarul at yahoo.com Mon Mar 18 14:13:21 2013
From: hudarul at yahoo.com (Hud Hud)
Date: Mon, 18 Mar 2013 13:13:21 -0700 (PDT)
Subject: [maker-devel] Maker-no such file or directory
Message-ID: <1363637601.24386.YahooMailNeo@web164901.mail.bf1.yahoo.com>

I have some problem with maker

1. I try to work with the example data in the data directory, but I'm having this kind of error. Can anyone help me?

error
$ maker
STATUS: Parsing control files...
dpp_contig.fasta (fasta file or fasta embeded in GFF3 file): No such file or directory at /home/Dorah/maker-2.27-beta/maker/bin/../lib/GI.pm line 186
--> rank=NA, hostname=NurKaiyisah

my maker_opts.ctl
genome=$home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta
est=$home/Dorah/maker-2.27-beta/maker/data/dpp_est.fasta
protein=$home/Dorah/maker-2.27-beta/maker/data/dpp_protein.fasta

From Hossein.Borhan at AGR.GC.CA Mon Mar 18 14:40:38 2013
From: Hossein.Borhan at AGR.GC.CA (Borhan, Hossein)
Date: Mon, 18 Mar 2013 16:40:38 -0400
Subject: [maker-devel] failed gene prediction
Message-ID: <7B64340A44B6634C814A22BCFA6179D5020887DA@onottaxms5.AGR.GC.CA>

Hi

I have tried maker on a fungus genome of 45 Mb with 1/3 being repeat rich. It did not produce any prediction. I am not sure what is causing this. Attached are the STDERR and opts.ctl.
I appreciate your help

Hossein

-------------- next part --------------
A non-text attachment was scrubbed...
Name: wa74-maker-stderr.log
Type: application/octet-stream
Size: 6325713 bytes
Desc: wa74-maker-stderr.log
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_opts.ctl
Type: application/octet-stream
Size: 5244 bytes
Desc: maker_opts.ctl

From carsonhh at gmail.com Mon Mar 18 14:44:41 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 18 Mar 2013 16:44:41 -0400
Subject: [maker-devel] Maker-no such file or directory
In-Reply-To: <1363637601.24386.YahooMailNeo@web164901.mail.bf1.yahoo.com>

Does 'ls -al $home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta' show a valid location? The error is just saying that the file location as written in the maker_opts.ctl file does not exist.
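The same check can be applied to every input path in a control file at once. What follows is a minimal shell sketch, assuming the `key=value #comment` layout shown in the message above and plain paths as values; `check_ctl_paths` is a hypothetical helper, not a MAKER tool. It tests each value literally, so a value such as `$home/...` is reported as missing. Whether MAKER expands shell variables in control files is not established in this thread; in POSIX shells such as sh and bash the home directory variable is the upper-case `$HOME`, and lower-case `$home` is normally unset.

```shell
#!/bin/sh
# check_ctl_paths FILE
# Hypothetical helper (not part of MAKER): list genome=/est=/protein=
# entries in a control file whose value is not an existing path.
# Prints one "key=path" line per missing entry; returns 1 if any are missing.
check_ctl_paths() {
    missing=$(
        awk '/^(genome|est|protein)=/ {
            eq  = index($0, "=")
            key = substr($0, 1, eq - 1)
            val = substr($0, eq + 1)
            sub(/#.*/, "", val)                 # drop a trailing comment
            gsub(/^[ \t]+|[ \t]+$/, "", val)    # trim surrounding blanks
            n = split(val, parts, ",")          # assumption: comma-separated lists
            for (i = 1; i <= n; i++)
                if (parts[i] != "") print key, parts[i]
        }' "$1" |
        while read -r key path; do
            # Paths are tested literally; shell variables are not expanded.
            [ -e "$path" ] || printf '%s=%s\n' "$key" "$path"
        done
    )
    if [ -n "$missing" ]; then
        printf '%s\n' "$missing"
        return 1
    fi
    return 0
}

# Usage (file name as in the thread): check_ctl_paths maker_opts.ctl
```

An empty result with exit status 0 means every listed input exists as written; any line printed is a path to correct in the control file before rerunning maker.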
--Carson

From carsonhh at gmail.com Mon Mar 18 14:49:30 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 18 Mar 2013 16:49:30 -0400
Subject: [maker-devel] failed gene prediction
In-Reply-To: <7B64340A44B6634C814A22BCFA6179D5020887DA@onottaxms5.AGR.GC.CA>

You didn't supply any evidence or HMM files for gene predictors. Raw assembly data by itself is insufficient for genome annotation.

Here is some nice documentation for running MAKER -->
http://gmod.org/wiki/MAKER_Tutorial_2012

Here is a nice overview of genome annotation in general -->
http://fasta.bioch.virginia.edu/cshl/pdf/12/ajm12/euk_genome_annotation_review.pdf

Once you've gone through the documentation and examples, if you come across any questions just let us know.

Thanks,
Carson

From ares711122 at gmail.com Mon Mar 18 18:44:39 2013
From: ares711122 at gmail.com (Hung-Wei Hsu)
Date: Tue, 19 Mar 2013 08:44:39 +0800
Subject: [maker-devel] ERROR: Could not obtain lock to format database

I am sure I ran just one instance of MAKER at a time.
I only analyzed one contig for the test.
After MAKER was interrupted, I can't find a GFF3 output for this contig.
There is only a theVoidXXX directory and a run.log file.

I'm trying 2.26b with the same parameters for the same data.
Hopefully, it can work well.

Hung-Wei

From mnuhn at ebi.ac.uk Tue Mar 19 06:12:32 2013
From: mnuhn at ebi.ac.uk (Michael Nuhn)
Date: Tue, 19 Mar 2013 12:12:32 +0000
Subject: [maker-devel] master_datastore_index.log file shrinks.
In-Reply-To: <5141FF8F.2050900@ebi.ac.uk>
References: <5141FF8F.2050900@ebi.ac.uk>
Message-ID: <51485630.6080701@ebi.ac.uk>

Hello Carson!

On 03/14/2013 04:49 PM, Michael Nuhn wrote:
>> Try dialling back on the number of simultaneous instances you start and
>> instead use MPI or the -cpus option to get the parallelization boost.
>> Alternatively you can also split up the input file and use the -base
>> option so everything gets written to the same place (then you never have
>> to worry about locks affecting individual contigs - as no single instance
>> has access to all the contigs)
>>
>> Example:
>> fasta_tool --chunks 5 maize_assembly.fasta
>> maker -g maize_assembly_0.fasta -base maize_assembly
>> maker -g maize_assembly_1.fasta -base maize_assembly
>> maker -g maize_assembly_2.fasta -base maize_assembly
>> maker -g maize_assembly_3.fasta -base maize_assembly
>> maker -g maize_assembly_4.fasta -base maize_assembly
>> maker -dsindex
>>
>> Everything then gets written to maize_assembly.maker.output for all
>> results. The last call to maker with the -dsindex flag then rebuilds the
>> datastore_index.log file to match the original maize_assembly.fasta file

I have tried this, split my genome into 50 files and run them as you suggested above.

This worked well most of the time, but now I am getting locking issues again. The working directory gets flooded with STACK.STACK.STACK.STACK ... files.

What I think is happening is that for some reason the maker instances decide that they want to rebuild the index. This takes a lot of time and this blocks even more instances wanting to lock the index files. In the end most of the maker instances end up waiting.

I would like to try the following, but I don't know if this might cause problems later on:

I would like to run all of the split sequence files as separate maker projects as if they were independent genomes. In the end I'd merge all the individual gff files using the gff3_merge script.

Do you see any reason why this wouldn't work?

Cheers,
Michael.

From Bob_Freeman at hms.harvard.edu Tue Mar 19 07:03:00 2013
From: Bob_Freeman at hms.harvard.edu (Freeman, Robert M.)
Date: Tue, 19 Mar 2013 09:03:00 -0400
Subject: [maker-devel] Failing to recoup all gff annotations, including ab-initio

Carson et al.,

Thanks again for a great suite of tools! We're using MAKER now to generate gene models (and model fragments) for a ciliate; we'll be using these models to generate a high-quality protein database for searches with mass spec.

I bootstrapped the process using the core set of proteins with CEGMA, then trained SNAP. After the final round of running MAKER, I get about 1100 evidence-based models and 34K ab-initio. And that's fine (for now). I am able to collect the fasta files for both transcripts and proteins (evidence-based and ab-initio) without problem.

My problem is that when I use the gff3_merge script, I only get annotations for the evidence-based models. I'm not sure why the ab-initio model annotations aren't being collected. I've tried with and without the '-g' switch, but this doesn't seem to make a difference.

Thoughts?

Tx,
B

-----------------------------------------------------
Bob Freeman, Ph.D.
Acorn Worm Informatics, Kirschner lab
Dept of Systems Biology, Alpert 524
Harvard Medical School
200 Longwood Avenue
Boston, MA 02115
617/432.2294, vox

"Sorry I'm late. Oh, God, that sounded insincere. I'm late."
-- Karen Walker, from Will and Grace

From dsth at ebi.ac.uk Tue Mar 19 07:33:13 2013
From: dsth at ebi.ac.uk (Daniel Hughes)
Date: Tue, 19 Mar 2013 13:33:13 +0000
Subject: [maker-devel] master_datastore_index.log file shrinks.]

Daniel S. T. Hughes M.Biochem (Hons; Oxford), Ph.D (Cambridge)
-------------------------------------------------------------------------------------
dsth at cantab.net
dsth at cpan.org

Hi Michael,

You're using ebi cluster? I have to ask, is this all just a really elaborate way of avoiding the use of MPI that works perfectly well on both the ebi and sanger compute farms?
If you carry on in the direction you seem to be going you're likely to end up with a considerable level of unnecessary overhead and should possibly consider adapting the ensembl genebuild pipeline to your specific needs.

Dan

From carsonhh at gmail.com Tue Mar 19 08:27:16 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 19 Mar 2013 10:27:16 -0400
Subject: [maker-devel] master_datastore_index.log file shrinks.]

Yes. If at all possible use MPI. It removes the overhead of locks, which happen per primary instance of MAKER. So one maker job using 1000 cpus via MPI will have one shared set of locks. 1000 serial instances of MAKER on the other hand would have 1000x the locks.

Alternatively if you do need to continue without MPI for some reason, I just finished a devel version of MAKER that has a --no_locks option. You can never start two instances using the same input fasta when --no_locks is specified, but the splitting to use different input fastas I mentioned before in the example will still work fine.

I also have updated the indexing/reindexing, so if indexing failures happen, MAKER will switch between the current working directory and the TMP= directory from the maker_opts.ctl file so as to try different IO locations (i.e. NFS and non-NFS). Note you should never set TMP= in the control files to an NFS mounted location (it not only makes things a lot slower, but Berkeley DB and SQLite will get frequent errors on NFS).
TMP= defaults to /tmp when not specified.

I'll send you download information in a separate e-mail. Try a regular MAKER run to see if the indexing/reindexing changes are sufficient before attempting the --no_locks option.

Thanks,
Carson

From carsonhh at gmail.com Tue Mar 19 08:38:00 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 19 Mar 2013 10:38:00 -0400
Subject: [maker-devel] master_datastore_index.log file shrinks.]

You can also talk to Eleanor Stanley at Sanger; she has a pre-release of MAKER 2.28 already installed and running on the Sanger cluster with OpenMPI.
Thanks, Carson
From carsonhh at gmail.com Tue Mar 19 08:52:19 2013 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 19 Mar 2013 10:52:19 -0400 Subject: [maker-devel] Failing to recoup all gff annotations, including ab-initio In-Reply-To: Message-ID: Ab initio models without evidence support are not considered final models by default (newly trained ab initio predictors tend to have a very high false positive rate). If you really want the ab initio models without support to be upgraded, set keep_preds=1 in the maker_opts.ctl file. All ab initio models are also stored in the GFF3 as match/match_part features for reference purposes, not as gene/mRNA/exon/CDS. Thanks, Carson From: "Freeman, Robert M." Date: Tuesday, 19 March, 2013 9:03 AM To: "maker-devel at yandell-lab.org" Subject: [maker-devel] Failing to recoup all gff annotations, including ab-initio Carson et al., Thanks again for a great suite of tools! We're using MAKER now to generate gene models (and model fragments) for a ciliate, the models for which we'll be using to generate a high-quality protein database for searches with mass spec. 
I bootstrapped the process using the core set of proteins with CEGMA, then trained SNAP. After the final round of running MAKER, I get about 1100 evidence-based models and 34K ab-initio. And that's fine (for now). I am able to collect the fasta files for both transcripts and proteins (evidence-based and ab-initio) without problem. My problem is that when I use the gff3_merge script, I only get annotations for the evidence-based models. I'm not sure why the ab-initio model annotations aren't being collected. I've tried with and without the '-g' switch, but this doesn't seem to make a difference. Thoughts? Tx, B ----------------------------------------------------- Bob Freeman, Ph.D. Acorn Worm Informatics, Kirschner lab Dept of Systems Biology, Alpert 524 Harvard Medical School 200 Longwood Avenue Boston, MA 02115 617/432.2294, vox "Sorry I'm late. Oh, God, that sounded insincere. I'm late." -- Karen Walker, from Will and Grace
From mnuhn at ebi.ac.uk Tue Mar 19 09:19:25 2013 From: mnuhn at ebi.ac.uk (Michael Nuhn) Date: Tue, 19 Mar 2013 15:19:25 +0000 Subject: [maker-devel] master_datastore_index.log file shrinks.] In-Reply-To: References: Message-ID: <514881FD.4020003@ebi.ac.uk> Hello Carson! On 03/19/2013 02:27 PM, Carson Holt wrote: > Yes. If at all possible use MPI. It removes the overhead of locks > which happen per primary instance of MAKER. So one maker job using 1000 > cpus via MPI will have one shared set of locks. 1000 serial instances > of MAKER on the other hand would have 1000x the locks. I don't know a thing about MPI. I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and Open MPI and none of them worked for me. 
I also tried the automatic installation that comes with maker, but it didn't work for me either. If need be, I could spend time getting to the bottom of this, but there is no telling how long this would take me so I'd rather not, if there is an alternative. Would the approach I outlined before work? (Treating the split files as separate genomes to annotate and then combine the gffs afterwards) I also like this approach, because I would select a few contigs in the beginning which I would run on their own. They would complete early and this way I would get a preview of the results of the run instead of having to wait for everything to complete. It might also be more robust, because file locking issues would be confined to the instances working on a sequence chunk, but the rest of the instances could continue working. Cheers, Michael. > Alternatively if you do need to continue without MPI for some reason, I > just finished a devel version of MAKER that has a --no_locks option. > You can never start two instances using the same input fasta when > --no_locks is specified, but the splitting to use different input fastas > I mentioned before in the example will still work fine. > > I also have updated the indexing/reindexing, so if indexing failures > happen, MAKER will switch between the current working directory and the > TMP= directory from the maker_opts.ctl file so as to try different IO > locations (i.e. NFS and non-NFS). Note you should never set TMP= in the > control files to an NFS-mounted location (it not only makes things a lot > slower, but BerkeleyDB and SQLite will get frequent errors on NFS). > TMP= defaults to /tmp when not specified. > > I'll send you download information in a separate e-mail. Try a regular > MAKER run to see if the indexing/reindexing changes are sufficient > before attempting the --no_locks option. 
> > Thanks, > Carson
From carsonhh at gmail.com Tue Mar 19 09:02:22 2013 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 19 Mar 2013 11:02:22 -0400 Subject: [maker-devel] master_datastore_index.log file shrinks.] In-Reply-To: <514881FD.4020003@ebi.ac.uk> Message-ID: Try it with the --no_locks option then. Make sure to let one instance finish populating the mpi_blastdb directory before running other instances as that is where most initial locking occurs. I'll send you more details on how to install with OpenMPI, so you can give that a shot while your jobs are also running serially (so you don't lose time). Also instead of 50 serial instances, you could try 10 with -cpus set to 5. Thanks, Carson
From dsth at ebi.ac.uk Tue Mar 19 09:13:51 2013 From: dsth at ebi.ac.uk (Daniel Hughes) Date: Tue, 19 Mar 2013 15:13:51 +0000 Subject: [maker-devel] master_datastore_index.log file shrinks.] In-Reply-To: <514881FD.4020003@ebi.ac.uk> References: <514881FD.4020003@ebi.ac.uk> Message-ID: You really don't need to know anything about MPI. While MPI is itself pretty complex, I seem to recall maker uses the p2p subset alone mainly to send serialised perl objects as c strings etc., for IPC across ad hoc infrastructure - but none of that is relevant as Carson has done all the IPC debugging for you and its use should be transparent. 
If it's failing, it's almost certainly because you've got discrepancies between the MPI libraries visible at compile-time vs. run-time and you may need to force the dynamic linker to behave itself. The only other caveat on EBI infrastructure I can think of off the top of my head relates to cross-node MPI usage when going into the hundreds of processes, but I'm assuming you're not doing that? You need to be more specific about how it's failing. dan from me phone...
From carsonhh at gmail.com Tue Mar 19 09:22:22 2013 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 19 Mar 2013 11:22:22 -0400 Subject: [maker-devel] master_datastore_index.log file shrinks.] In-Reply-To: Message-ID: I have MAKER working under OpenMPI 1.4.3 (Intel compiled). I had to set a couple of environment variables prior to setup. You would probably need to set these values as well. If your OpenMPI path was, for example, /software/openmpi-1.4.3/, run the following commands (path set accordingly) before even attempting maker setup.
export OMPI_MCA_mpi_warn_on_fork=0
export LD_PRELOAD=/software/openmpi-1.4.3/lib/libmpi.so:$LD_PRELOAD
These not only need to be set before compilation, but also before any run (so add them to your ~/.bashrc or ~/.bash_profile or any module load scripts). 
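The two settings above can be sketched as a small fragment for ~/.bashrc. This is only an illustration under stated assumptions: bash is the login shell, the `OPENMPI_HOME` variable name is made up here, the default path is the example install location from this message, and the existence check is an addition rather than anything MAKER or OpenMPI requires.

```shell
# Sketch for ~/.bashrc. OPENMPI_HOME is a hypothetical variable name; the
# default is the example install path from this message. Adjust as needed.
OPENMPI_HOME="${OPENMPI_HOME:-/software/openmpi-1.4.3}"
LIBMPI="$OPENMPI_HOME/lib/libmpi.so"

# Open MPI reads MCA parameters from OMPI_MCA_<name> environment variables;
# this one turns off the warning printed when a process calls fork().
export OMPI_MCA_mpi_warn_on_fork=0

# Preload libmpi.so only if it exists, keeping any earlier LD_PRELOAD entries.
if [ -e "$LIBMPI" ]; then
    export LD_PRELOAD="$LIBMPI${LD_PRELOAD:+:$LD_PRELOAD}"
else
    echo "warning: $LIBMPI not found, LD_PRELOAD left unchanged" >&2
fi
```

Running `env | grep -E 'OMPI_MCA|LD_PRELOAD'` afterwards shows what a MAKER job started from this shell would inherit.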
The LD_PRELOAD statement needs to be set for any program using OpenMPI's shared libraries and not just MAKER, so it's normally a good idea to have that set system-wide for all users. The details can be found in the OpenMPI documentation. Note sometimes system library updates can break OpenMPI's shared libraries while not breaking OpenMPI itself, so you might also need to recompile OpenMPI if it has broken shared libraries. Once you have those commands in place, run the perl Build.PL step and say yes to install with MPI, then run ./Build install. Thanks, Carson
From dsth at ebi.ac.uk Tue Mar 19 09:26:02 2013 From: dsth at ebi.ac.uk (Daniel Hughes) Date: Tue, 19 Mar 2013 15:26:02 +0000 Subject: [maker-devel] master_datastore_index.log file shrinks.] 
In-Reply-To: References: <514881FD.4020003@ebi.ac.uk> Message-ID: oh and (1) it will work as long as evidence etc., is synchronous, (2) it will be really inefficient - be glad ebi doesn't use a by group compute time fair-share policy ;) Dan from me phone...
From mnuhn at ebi.ac.uk Tue Mar 19 09:54:34 2013 From: mnuhn at ebi.ac.uk (Michael Nuhn) Date: Tue, 19 Mar 2013 15:54:34 +0000 Subject: [maker-devel] master_datastore_index.log file shrinks.] 
In-Reply-To: References: Message-ID: <51488A3A.20106@ebi.ac.uk> Hello Carson! Thanks for the pointers. I'll give mpi another shot. Cheers, Michael.
From es9 at sanger.ac.uk Tue Mar 19 09:40:08 2013 From: es9 at sanger.ac.uk (Eleanor Stanley) Date: Tue, 19 Mar 2013 15:40:08 +0000 Subject: [maker-devel] master_datastore_index.log file shrinks.] In-Reply-To: <51488A3A.20106@ebi.ac.uk> References: <51488A3A.20106@ebi.ac.uk> Message-ID: For the Sanger farm I have a wrapper script to run MPI maker so that the same environment variables are forced to all nodes. Eleanor On 19 Mar 2013, at 15:54, Michael Nuhn wrote: > Hello Carson! > > Thanks for the pointers. I'll give mpi another shot. > > Cheers, > Michael. > > On 03/19/2013 03:22 PM, Carson Holt wrote: >> I have MAKER working under OpenMPI 1.4.3 (Intel compiled). >> >> I had to set a couple of environment variables prior to setup. You would >> probably need to set these values as well. If your OpenMPI path was, >> for example, /software/openmpi-1.4.3/, run the following commands >> (path set accordingly) before even attempting maker setup. >> >> export OMPI_MCA_mpi_warn_on_fork=0 >> export LD_PRELOAD=/software/openmpi-1.4.3/lib/libmpi.so:$LD_PRELOAD >> >> These not only need to be set before compilation, but also before any run >> (so add them to your ~/.bashrc or ~/.bash_profile or any module load scripts). 
>> The LD_PRELOAD statement needs to be set for any program using
>> OpenMPI's shared libraries and not just MAKER, so it's normally a good
>> idea to have that set system wide for all users. The details can be found
>> in the OpenMPI documentation. Note sometimes system library updates can
>> break OpenMPI's shared libraries while not breaking OpenMPI itself, so you
>> might also need to recompile OpenMPI if it has broken shared libraries.
>>
>> Once you have those commands in place, run the perl Build.PL step. Say yes
>> to install with MPI. Then run ./Build install
>>
>> Thanks,
>> Carson
>>
>>
>>
>> On 13-03-19 11:02 AM, "Carson Holt" wrote:
>>
>>> Try it with the no_locks option then. Make sure to let one instance
>>> finish populating the mpi_blastdb directory before running other
>>> instances, as that is where most initial locking occurs.
>>>
>>> I'll send you more details on how to install with OpenMPI, so you can
>>> give that a shot while your jobs are also running serially (so you don't
>>> lose time). Also instead of 50 serial instances, you could try 10 with
>>> -cpus set to 5.
>>>
>>> Thanks,
>>> Carson
>>>
>>>
>>>
>>> On 13-03-19 11:19 AM, "Michael Nuhn" wrote:
>>>
>>>> Hello Carson!
>>>>
>>>> On 03/19/2013 02:27 PM, Carson Holt wrote:
>>>>> Yes. If at all possible use MPI. It removes the overhead of locks,
>>>>> which happen per primary instance of MAKER. So one maker job using
>>>>> 1000 cpus via MPI will have one shared set of locks. 1000 serial
>>>>> instances of MAKER on the other hand would have 1000x the locks.
>>>>
>>>> I don't know a thing about MPI.
>>>>
>>>> I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and open
>>>> mpi and none of them worked for me. I also tried the automatic
>>>> installation that comes with maker, but it didn't work for me either.
>>>> >>>> If need be, I could spend time getting to the bottom of this, but there >>>> is no telling how long this would take me so I'd rather not, if there is >>>> an alternative. >>>> >>>> Would the approach I outlined before work? (Treating the split files as >>>> separate genomes to annotate and then combine the gffs afterwards) >>>> >>>> I also like this approach, because I would select a few contigs in the >>>> beginning which I would run on their own. They would complete early and >>>> this way I would get a preview of the results of the run instead of >>>> having to wait for everything to complete. >>>> >>>> It might also be more robust, because file locking issues would be >>>> confined to the instances working on a sequence chunk, but the rest of >>>> the instances could continue working. >>>> >>>> Cheers, >>>> Michael. >>>> >>>>> Alternatively if you do need to continue without MPI for some reason, I >>>>> just finished a devel version of MAKER that has a --no_locks option. >>>>> You can never start two instances using the same input fasta when >>>>> --no_locks is specified, but the splitting to use different input >>>>> fastas >>>>> I mentioned before in the example will still work fine. >>>>> >>>>> I also have updated the indexing/reindexing, so if indexing failures >>>>> happen, MAKER will switch between the current working directory and the >>>>> TMP= directory from the maker_opts.ctl file so as to try different IO >>>>> locations (I.e. NFS and non-NFS). Note you should never set TMP= in >>>>> the >>>>> control files to an NFS mounted location (it not only makes things a >>>>> lot >>>>> slower, but berkleydb and sqllite will get frequent errors on NFS). >>>>> TMP= defaults to /tmp when not specified >>>>> >>>>> I'll send you download information in a separate e-mail. Try a regular >>>>> MAKER run to see if the indexing/reindexing changes are sufficient >>>>> before attempting the ?no_locks option. 
>>>>>
>>>>> Thanks,
>>>>> Carson
>>
>>
>
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

--
The Wellcome Trust Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a
company registered in England with number 2742969, whose registered
office is 215 Euston Road, London, NW1 2BE.

From Bob_Freeman at hms.harvard.edu Tue Mar 19 10:18:11 2013
From: Bob_Freeman at hms.harvard.edu (Freeman, Robert M.)
Date: Tue, 19 Mar 2013 12:18:11 -0400
Subject: [maker-devel] Failing to recoup all gff annotations, including ab-initio
In-Reply-To:
References:
Message-ID: <06F15FF0-2384-4BDD-AD9B-9C1D0AB6370C@hms.harvard.edu>

Thanks, Carson. This explains the behavior I saw and will help us moving
forward.

Best,
Bob

On Mar 19, 2013, at 10:52 AM, Carson Holt wrote:

Ab initio models without evidence support are not considered final models
by default (newly trained ab initio predictors tend to have a very high
false positive rate). If you really want the ab initio models without
support to be upgraded, set keep_preds=1 in the maker_opts.ctl file. All
ab initio models are also stored in the GFF3 as match/match_part features
for reference purposes, not as gene/mRNA/exon/CDS.

Thanks,
Carson

From: "Freeman, Robert M."
Date: Tuesday, 19 March, 2013 9:03 AM
To: "maker-devel at yandell-lab.org"
Subject: [maker-devel] Failing to recoup all gff annotations, including ab-initio

Carson et al.,

Thanks again for a great suite of tools! We're using MAKER now to generate
gene models (and model fragments) for a ciliate, the models for which
we'll be using to generate a high-quality protein database for searches
with mass spec. I bootstrapped the process using the core set of proteins
with CEGMA, then trained SNAP. After the final round of running MAKER, I
get about 1100 evidence-based models and 34K ab-initio.
And that's fine (for now). I am able to collect the fasta files for both
transcripts and proteins (evidence-based and ab-initio) without problem.
My problem is that when I use the gff3_merge script, I only get
annotations for the evidence-based models. I'm not sure why the ab-initio
model annotations are not being collected. I've tried with and without
the '-g' switch, but this doesn't seem to make a difference.

Thoughts?

Tx,
B

-----------------------------------------------------
Bob Freeman, Ph.D.
Acorn Worm Informatics, Kirschner lab
Dept of Systems Biology, Alpert 524
Harvard Medical School
200 Longwood Avenue
Boston, MA 02115
617/432.2294, vox

"Sorry I'm late. Oh, God, that sounded insincere. I'm late."
-- Karen Walker, from Will and Grace

From cjfields at illinois.edu Tue Mar 19 13:04:18 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 19 Mar 2013 19:04:18 +0000
Subject: [maker-devel] Alternative start codons
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF74DA507D@CHIMBX5.ad.uillinois.edu>

We had a user notice that MAKER is not observing alternative start codons
for bacterial genomes. For instance, this predicted transcript:

>Xf_Mul_000007-RA transcript Name:"Protein of unknown function" offset:79 AED:0.42 eAED:1.00 QI:79|-1|0|1|-1|1|1|20|24
GTGGGATACAGGCCGCTGATCGCTGATGGCGCGTACCTGAAACTGCTGCTGGACTACTAC
GTTACAGTGCAGCCTTTGCATGCCGATTGGAAAGATCTATATATCATCGCTTGCGCTATT
ACAGCGGCTAAAAAGAGTCTTCAATTTGGCGTAATTCAGTCATTGGCGGGGTAG

Yields this protein sequence.
>Xf_Mul_000007-RA protein AED:0.42 eAED:1.00 QI:79|-1|0|1|-1|1|1|20|24
MPIGKIYISSLALLQRLKRVFNLA

I'm pretty sure I know what is going on, namely that MAKER is treating the
5' end as UTR and looking for the first ATG (there is one in the sequence
above). Is there any way to change this behavior, though? For instance,
allow alternative start codons like GTG/TTG?

chris

From hudarul at yahoo.com Tue Mar 19 13:08:55 2013
From: hudarul at yahoo.com (Hud Hud)
Date: Tue, 19 Mar 2013 12:08:55 -0700 (PDT)
Subject: [maker-devel] Maker-no such file or directory
In-Reply-To:
References: <1363637601.24386.YahooMailNeo@web164901.mail.bf1.yahoo.com>
Message-ID: <1363720135.24498.YahooMailNeo@web164901.mail.bf1.yahoo.com>

Hello everyone,

I have a question. I can't run MAKER locally, so can I use MWAS on my
contigs? Since my contigs are too long to be run on MWAS together, is it
possible to combine the results after I upload and analyze the contigs
separately?

________________________________
From: Carson Holt
To: Hud Hud; "maker-devel at yandell-lab.org"
Sent: Tuesday, March 19, 2013 4:44 AM
Subject: Re: [maker-devel] Maker-no such file or directory

Does 'ls -al $home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta' show
a valid location? The error is just saying that the file location as
written in the maker_opts.ctl file does not exist.

--Carson

From: Hud Hud
Reply-To: Hud Hud
Date: Monday, 18 March, 2013 4:13 PM
To: "maker-devel at yandell-lab.org"
Subject: [maker-devel] Maker-no such file or directory

I have a problem with maker.

1. I am trying to work with the example data in the data directory, but I
get the error below. Can anyone help me?

error

$ maker
STATUS: Parsing control files...
dpp_contig.fasta (fasta file or fasta embeded in GFF3 file): No such file or directory at /home/Dorah/maker-2.27-beta/maker/bin/../lib/GI.pm line 186
--> rank=NA, hostname=NurKaiyisah

my maker_opts.ctl

genome=$home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta
est=$home/Dorah/maker-2.27-beta/maker/data/dpp_est.fasta
protein=$home/Dorah/maker-2.27-beta/maker/data/dpp_protein.fasta

From carsonhh at gmail.com Tue Mar 19 13:30:09 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 19 Mar 2013 15:30:09 -0400
Subject: [maker-devel] Maker-no such file or directory
In-Reply-To: <1363720135.24498.YahooMailNeo@web164901.mail.bf1.yahoo.com>
Message-ID:

You can. It will be very slow though, as MWAS only dedicates a single cpu
per job. So with a 5Mb max per job submission it could take a very long
time depending on the size of the assembly (emphasis on very long).

--Carson

From: Hud Hud
Reply-To: Hud Hud
Date: Tuesday, 19 March, 2013 3:08 PM
To: "maker-devel at yandell-lab.org"
Subject: Re: [maker-devel] Maker-no such file or directory

Hello everyone,

I have a question. I can't run MAKER locally, so can I use MWAS on my
contigs? Since my contigs are too long to be run on MWAS together, is it
possible to combine the results after I upload and analyze the contigs
separately?

From: Carson Holt
To: Hud Hud; "maker-devel at yandell-lab.org"
Sent: Tuesday, March 19, 2013 4:44 AM
Subject: Re: [maker-devel] Maker-no such file or directory

Does 'ls -al $home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta' show
a valid location? The error is just saying that the file location as
written in the maker_opts.ctl file does not exist.
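
The same check can be scripted for all three input lines at once. What
follows is only a minimal sketch in Python, not part of MAKER:
missing_inputs is a hypothetical helper, and it assumes simple key=value
lines with optional # comments and comma-separated file lists. Whether
MAKER itself expands shell-style variables in control files is not stated
in this thread, so the sketch expands them on its own. Note that variable
names are case-sensitive in the shell, so $home is normally unset even
where $HOME is defined.

```python
import os


def missing_inputs(ctl_text, keys=("genome", "est", "protein")):
    """Return (key, path) pairs from control-file text whose path is absent.

    Hypothetical helper, not part of MAKER. Assumes simple `key=value`
    lines with optional `#` comments and comma-separated file lists.
    """
    missing = []
    for line in ctl_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key not in keys:
            continue
        for path in filter(None, (p.strip() for p in value.split(","))):
            # Expand $VARS and ~ ourselves; an unset variable such as
            # $home is left untouched and so shows up as a missing path.
            expanded = os.path.expanduser(os.path.expandvars(path))
            if not os.path.exists(expanded):
                missing.append((key, expanded))
    return missing


if __name__ == "__main__":
    # Control-file lines exactly as posted in the thread.
    ctl = (
        "genome=$home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta\n"
        "est=$home/Dorah/maker-2.27-beta/maker/data/dpp_est.fasta\n"
        "protein=$home/Dorah/maker-2.27-beta/maker/data/dpp_protein.fasta\n"
    )
    for key, path in missing_inputs(ctl):
        print(f"{key}: no such file: {path}")
```

Any path the sketch reports is one that the control file names but the
filesystem does not contain, which is what the error above is describing.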
--Carson From: Hud Hud Reply-To: Hud Hud Date: Monday, 18 March, 2013 4:13 PM To: "maker-devel at yandell-lab.org" Subject: [maker-devel] Maker-no such file or directory I have some problem with maker 1. i try to work with the example data in data directory, but im having this kind of error..anyone can help me error $ maker STATUS: Parsing control files... dpp_contig.fasta (fasta file or fasta embeded in GFF3 file): No such file or directory at /home/Dorah/maker-2.27-beta/maker/bin/../lib/GI.pm line 186 --> rank=NA, hostname=NurKaiyisah my maker_opts.ctl genome=$home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta est=$home/Dorah/maker-2.27-beta/maker/data/dpp_est.fasta protein=$home/Dorah/maker-2.27-beta/maker/data/dpp_protein.fasta _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Tue Mar 19 13:33:46 2013 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 19 Mar 2013 15:33:46 -0400 Subject: [maker-devel] Alternative start codons In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF74DA507D@CHIMBX5.ad.uillinois.edu> Message-ID: It could be changed. I imagine that this is a protein2genome or est2genome gene, as MAKER won't try and determine by itself the start and end if it comes from a gene predictor. --Carson On 13-03-19 3:04 PM, "Fields, Christopher J" wrote: >We had a user notice that MAKER is not observing alternative start codons >for bacterial genomes. 
>For instance, this predicted transcript:
>
>>Xf_Mul_000007-RA transcript Name:"Protein of unknown function" offset:79
>>AED:0.42 eAED:1.00 QI:79|-1|0|1|-1|1|1|20|24
>GTGGGATACAGGCCGCTGATCGCTGATGGCGCGTACCTGAAACTGCTGCTGGACTACTAC
>GTTACAGTGCAGCCTTTGCATGCCGATTGGAAAGATCTATATATCATCGCTTGCGCTATT
>ACAGCGGCTAAAAAGAGTCTTCAATTTGGCGTAATTCAGTCATTGGCGGGGTAG
>
>Yields this protein sequence.
>
>>Xf_Mul_000007-RA protein AED:0.42 eAED:1.00 QI:79|-1|0|1|-1|1|1|20|24
>MPIGKIYISSLALLQRLKRVFNLA
>
>I'm pretty sure I know what is going on, namely that MAKER is treating
>the 5' end as UTR and looking for the first ATG (there is one in the
>sequence above). Is there any way to change this behavior, though? For
>instance, allow alternative start codons like GTG/TTG?
>
>chris
>_______________________________________________
>maker-devel mailing list
>maker-devel at box290.bluehost.com
>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From myandell at genetics.utah.edu Tue Mar 19 18:02:36 2013
From: myandell at genetics.utah.edu (Mark Yandell)
Date: Wed, 20 Mar 2013 00:02:36 +0000
Subject: [maker-devel] Maker2 gff file output
In-Reply-To:
References:
Message-ID: <7A60AB257EFF2B48B1F4C814817EA05350EDC688@mxb2.hg.genetics.utah.edu>

Hi Blake,

I've forwarded this on to the maker-devel list; they can help you more
there.

Regarding your comment 'When I view the output of many contigs in Apollo,
there is many times where 3 or 4 models show close to identical gene
structure, but the final maker output does not contain that gene call.':
those calls are in the output files, but they are in a different
multi-fasta file; they are the non-overlapping ab initio models. Another
way is to set the config flag that allows MAKER to use unspliced EST and
RNA-seq alignments as evidence.

cheers,

--mark

Mark Yandell
Professor of Human Genetics
H.A.
& Edna Benning Presidential Endowed Chair
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
ph:801-587-7707

________________________________________
From: Blake Hovde [hovdebt at uw.edu]
Sent: Tuesday, March 19, 2013 2:35 PM
To: Mark Yandell
Subject: Maker2 gff file output

Hi Dr. Yandell,

I am currently running MAKER2 on a new algal genome and am running into a
couple of problems that I would like your input on. The genome size is
~60Mb and it is currently in ~3100 contigs.

First, I am having trouble doing multiple iterations of hmm training with
SNAP because I have so many gff output files in the datastore (one for
each contig in my draft genome), not just the single gff output that the
examples and tutorials I have followed thus far seem to use. Is there a
way to combine all of my gff files together to make use of the SNAP hmm
training or re-annotation?

Second, using multiple lines of evidence (augustus, genemarkES, RNAseq
data, and COGs based on homology searches) I am having a hard time getting
a lot of maker gene calls. It seems that the calling is too stringent in
many cases. When I view the output of many contigs in Apollo, there are
many cases where 3 or 4 models show close to identical gene structure, but
the final maker output does not contain that gene call. Do you have any
suggestions on how to lower the stringency of the MAKER output so that
more genes will be called? In some cases I am getting fewer than 3000 gene
calls in the final output, whereas an Augustus model trained on
Chlamydomonas will return ~15000.

Thanks very much for your help!
Sincerely,
Blake Hovde
Graduate Student
Department of Genome Sciences
University of Washington

From carsonhh at gmail.com Tue Mar 19 20:43:44 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 19 Mar 2013 22:43:44 -0400
Subject: [maker-devel] Maker2 gff file output
In-Reply-To: <7A60AB257EFF2B48B1F4C814817EA05350EDC688@mxb2.hg.genetics.utah.edu>
Message-ID:

>I am currently running MAKER2 on a new algal genome and am running
>into a couple of problems that I would like your input on the genome
>size is ~60Mb and is currently in ~3100 contigs.
>First, I am having trouble doing multiple iterations of hmm training
>with SNAP due to the fact that I have so many gff output files in the
>datastore (1 for each contig in my draft genome). not just a single
>gff output that seems to be in the examples and tutorials I have
>followed thus far. Is there a way to combine all of my gff files
>together to make use of the SNAP hmm training or re-annotation?

Use the gff3_merge script in the .../maker/bin/ directory.

>
>Second, Using multiple lines of evidence (augustus, genemarkES, RNAseq
>data, and COGs based on homology searches) I am having a hard time
>getting a lot of maker gene calls. It seems that the calling is too
>stringent in many cases. When I view the output of many contigs in
>Apollo, there is many times where 3 or 4 models show close to
>identical gene structure, but the final maker output does not contain
>that gene call. Do you have any suggestions on how to lower the
>stringency of the MAKER output so that more genes will be called? In
>some cases I am getting less than 3000 gene calls in the final output.
>Where an Augustus model trained on Chlamydomonas will return ~15000.

I agree with Mark. You may want to set single_exon=1 to accept single exon
evidence, try increasing the depth of your protein evidence file as well,
or if the genome is relatively gene dense, set keep_preds=1.
On some genomes that are gene dense (fungi for example) ab initio predictors don't have that high a false positive rate, so this can be safe. However on more complex genomes doing so can produce more false positives than there are genes. Thanks, Carson On 13-03-19 8:02 PM, "Mark Yandell" wrote: >Hi Blake, > >I'be forwarded this onto the maker_devel list, they can help you more >there. > >regarding your comment g 'When I view the output of many contigs in >Apollo, there is many times where 3 or 4 models show close to identical >gene structure, but the final maker output does not contain that gene >call. ' Those calls are in the output files, but there are in a >different multifasta file; there are non-overalpping ab intio models. >Another way is to set the config flag to allow MAKEr to use unspliced EST >and RNA-seq alignments as evidence, > >I'be forwarded this onto the maker_devel list, they can help you more >there. > >cheers, > >--mark > > >Mark Yandell >Professor of Human Genetics >H.A. & Edna Benning Presidential Endowed Chair >Eccles Institute of Human Genetics >University of Utah >15 North 2030 East, Room 2100 >Salt Lake City, UT 84112-5330 >ph:801-587-7707 > >________________________________________ >From: Blake Hovde [hovdebt at uw.edu] >Sent: Tuesday, March 19, 2013 2:35 PM >To: Mark Yandell >Subject: Maker2 gff file output > >Hi Dr. Yandell, > >I am currently running MAKER2 on a new algal genome and am running >into a couple of problems that I would like your input on the genome >size is ~60Mb and is currently in ~3100 contigs. >First, I am having trouble doing multiple iterations of hmm training >with SNAP due to the fact that I have so many gff output files in the >datastore (1 for each contig in my draft genome). not just a single >gff output that seems to be in the examples and tutorials I have >followed thus far. Is there a way to combine all of my gff files >together to make use of the SNAP hmm training or re-annotation? 
> >Second, Using multiple lines of evidence (augustus, genemarkES, RNAseq >data, and COGs based on homology searches) I am having a hard time >getting a lot of maker gene calls. It seems that the calling is too >stringent in many cases. When I view the output of many contigs in >Apollo, there is many times where 3 or 4 models show close to >identical gene structure, but the final maker output does not contain >that gene call. Do you have any suggestions on how to lower the >stringency of the MAKER output so that more genes will be called? In >some cases I am getting less than 3000 gene calls in the final output. > Where an Augustus model trained on Chlamydamonas will return ~15000. > >Thanks very much for your help! > >Sincerely, >Blake Hovde >Graduate Student >Department of Genome Sciences >University of Washington > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From Carson.Holt at oicr.on.ca Wed Mar 20 07:51:29 2013 From: Carson.Holt at oicr.on.ca (Carson Holt) Date: Wed, 20 Mar 2013 13:51:29 +0000 Subject: [maker-devel] AED calculations using the MAKER pipeline In-Reply-To: Message-ID: In the current MAKER download when using GFF3 passthrough there was an issue with everything being done at the very last step. This of course leads to a memory spike and a very slow last step. That seems to be similar to what you are describing. It should be resolved in what will become version 2.28. I can give you access to the pre-release code, so you can check that the issue is resolved for you. I'll send details in a separate e-mail. Also the ### will be printed after every ~100,000 bp of assembly processed by MAKER. You can ignore them, but they actually have a meaning in GFF3. Basically everything between two sets of ###'s are fully resolved. 
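
What the ### directive buys a GFF3 reader can be sketched in Python. This
is only an illustration, not code from MAKER: iter_resolved_chunks is a
hypothetical helper, and the sample features are made up. It assumes the
usual GFF3 conventions that a line consisting of ### closes all features
seen so far and that ##FASTA starts the sequence section.

```python
def iter_resolved_chunks(lines):
    """Yield lists of GFF3 feature lines, one list per '###'-closed chunk.

    Hypothetical helper. Empty chunks are skipped, and reading stops at
    the '##FASTA' directive.
    """
    chunk = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("##FASTA"):
            break
        if line.strip() == "###":
            # Everything collected so far is self-contained: no later
            # line may refer back to these features.
            if chunk:
                yield chunk
            chunk = []
        elif line and not line.startswith("#"):
            chunk.append(line)
    if chunk:
        yield chunk


if __name__ == "__main__":
    # Made-up example: two genes separated by an empty chunk.
    example = """##gff-version 3
ctg1\tmaker\tgene\t100\t900\t.\t+\t.\tID=g1
ctg1\tmaker\tmRNA\t100\t900\t.\t+\t.\tID=g1-RA;Parent=g1
###
###
ctg1\tmaker\tgene\t150000\t151000\t.\t-\t.\tID=g2
###
"""
    for i, chunk in enumerate(iter_resolved_chunks(example.splitlines())):
        print(i, len(chunk), "feature line(s)")
```

Because each yielded chunk is complete on its own, a reader can hand
chunks to separate workers or discard them after use instead of holding
the whole file in memory.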
It allows programs that read GFF3 to parallelize file loading or just load sections of a file as they can rapidly identify "safe chunks". Without them the entire file must be loaded into memory in order to be certain that all feature parts are there (as there is no requirement for sorting or order in GFF3). log.child files will always be empty unless you run analysis like snap or blast. Thanks, Carson On 13-03-20 9:05 AM, "Krishnakumar, Vivek" wrote: >Hi, > >We have been using the MAKER pipeline here at JCVI to calculate AED >scores by feeding in our annotation set as `model_gff` and the protein >and EST evidence as `protein_gff` and `est_gff` respectively. Here is the >issue we are having: > >When running the above pipeline with protein2genome and est2genome >evidence generated earlier by MAKER, there are no problems calculating >the AED score. Normally this pipeline takes a little over 12 hours to >complete. > >But if we use our own evidence, AAT and Genewise aligned proteins for >`protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline >runs very very slow and the intermediary *.gff.ann file has many chunks >(separated by '###') that are completely empty. Our evidence in formatted >in the same way as est2genome or protein2genome (GFF file with >"expressed_sequence_match::match_part" or "protein_match::match_part" >features respectively) > >The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use the >default `max_dna_len` parameter used to split the large assemblies into >chunks. > >Investigating the master_datastore.log shows me that the scaffolds run >through without any issues and the chromosomes are still being processed. >For any of the chromosomes, investigating the 'run.log' file, one level >above 'theVoid' shows me how many "final.section" jobs were started and >how many finished. And in the case of all the chromosomes, it tells me >that everything that was started has finished. 
>And the 'log.child.*' files within `theVoid` are all empty. Also within
>`theVoid`, I'm noticing that the "raw.section" and "evidence_*.gff" files
>are not empty. But one thing that is surprising is that of all the
>"final.section" files, only the one pertaining to the last chunk is very
>large (proportional to the size of the evidence), the rest are all
>exactly the same size (exactly 331 bytes).
>
>I'm running MAKER in MPI mode spawning 48 processes on a high memory
>machine with 64 available cores and 1TB of RAM.
>
>I hope I've been able to explain my situation clearly in this email.
>
>Any help is appreciated.
>Thank you.
>
>Vivek

From vKrishna at jcvi.org Wed Mar 20 07:05:55 2013
From: vKrishna at jcvi.org (Krishnakumar, Vivek)
Date: Wed, 20 Mar 2013 09:05:55 -0400
Subject: [maker-devel] AED calculations using the MAKER pipeline
Message-ID:

Hi,

We have been using the MAKER pipeline here at JCVI to calculate AED scores
by feeding in our annotation set as `model_gff` and the protein and EST
evidence as `protein_gff` and `est_gff` respectively. Here is the issue we
are having:

When running the above pipeline with protein2genome and est2genome
evidence generated earlier by MAKER, there are no problems calculating the
AED score. Normally this pipeline takes a little over 12 hours to
complete.

But if we use our own evidence, AAT and Genewise aligned proteins for
`protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline
runs very slowly and the intermediary *.gff.ann file has many chunks
(separated by '###') that are completely empty. Our evidence is formatted
in the same way as est2genome or protein2genome (GFF file with
"expressed_sequence_match::match_part" or "protein_match::match_part"
features respectively).

The input to my pipeline is 8 chromosomes and ~2200 scaffolds, and I use
the default `max_dna_len` parameter, which is used to split the large
assemblies into chunks.
Investigating the master_datastore.log shows me that the scaffolds run
through without any issues and the chromosomes are still being processed.
For any of the chromosomes, investigating the 'run.log' file, one level
above 'theVoid', shows me how many "final.section" jobs were started and
how many finished. And in the case of all the chromosomes, it tells me
that everything that was started has finished. And the 'log.child.*' files
within `theVoid` are all empty. Also within `theVoid`, I'm noticing that
the "raw.section" and "evidence_*.gff" files are not empty. But one thing
that is surprising is that of all the "final.section" files, only the one
pertaining to the last chunk is very large (proportional to the size of
the evidence), the rest are all exactly the same size (exactly 331 bytes).

I'm running MAKER in MPI mode spawning 48 processes on a high memory
machine with 64 available cores and 1TB of RAM.

I hope I've been able to explain my situation clearly in this email.

Any help is appreciated.
Thank you.

Vivek

From cdtown at jcvi.org Wed Mar 20 07:54:33 2013
From: cdtown at jcvi.org (Town, Christopher D.)
Date: Wed, 20 Mar 2013 09:54:33 -0400
Subject: [maker-devel] AED calculations using the MAKER pipeline
In-Reply-To:
References:
Message-ID:

Thanks. Is there any way of guesstimating when this final step might be
completed? We are in a time crunch here to get this analysis finished and
the data/annotation out.

Best
Chris

-----Original Message-----
From: Carson Holt [mailto:Carson.Holt at oicr.on.ca]
Sent: Wednesday, March 20, 2013 9:51 AM
To: Krishnakumar, Vivek; maker-devel at yandell-lab.org
Cc: Town, Christopher D.; Tang, Haibao; Bidwell, Shelby; Rosen, Benjamin
Subject: Re: AED calculations using the MAKER pipeline

In the current MAKER download when using GFF3 passthrough there was an
issue with everything being done at the very last step. This of course
leads to a memory spike and a very slow last step. That seems to be
similar to what you are describing.
It should be resolved in what will become version 2.28. I can give you access to the pre-release code, so you can check that the issue is resolved for you. I'll send details in a separate e-mail. Also the ### will be printed after every ~100,000 bp of assembly processed by MAKER. You can ignore them, but they actually have a meaning in GFF3. Basically everything between two sets of ###'s are fully resolved. It allows programs that read GFF3 to parallelize file loading or just load sections of a file as they can rapidly identify "safe chunks". Without them the entire file must be loaded into memory in order to be certain that all feature parts are there (as there is no requirement for sorting or order in GFF3). log.child files will always be empty unless you run analysis like snap or blast. Thanks, Carson On 13-03-20 9:05 AM, "Krishnakumar, Vivek" wrote: >Hi, > >We have been using the MAKER pipeline here at JCVI to calculate AED >scores by feeding in our annotation set as `model_gff` and the protein >and EST evidence as `protein_gff` and `est_gff` respectively. Here is >the issue we are having: > >When running the above pipeline with protein2genome and est2genome >evidence generated earlier by MAKER, there are no problems calculating >the AED score. Normally this pipeline takes a little over 12 hours to >complete. > >But if we use our own evidence, AAT and Genewise aligned proteins for >`protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline >runs very very slow and the intermediary *.gff.ann file has many chunks >(separated by '###') that are completely empty. Our evidence in >formatted in the same way as est2genome or protein2genome (GFF file >with "expressed_sequence_match::match_part" or "protein_match::match_part" >features respectively) > >The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use >the default `max_dna_len` parameter used to split the large assemblies >into chunks. 
>
>Investigating the master_datastore.log shows me that the scaffolds run
>through without any issues and the chromosomes are still being processed.
>For any of the chromosomes, investigating the 'run.log' file, one level
>above 'theVoid' shows me how many "final.section" jobs were started and
>how many finished. And in the case of all the chromosomes, it tells me
>that everything that was started has finished. And the 'log.child.*'
>files within `theVoid` are all empty. Also within `theVoid`, I'm
>noticing that the "raw.section" and "evidence_*.gff" files are not
>empty. But one thing that is surprising is that of all the
>"final.section" files, only the one pertaining to the last chunk is
>very large (proportional to the size of the evidence), the rest are all
>exactly the same size (exactly 331 bytes).
>
>I'm running MAKER in MPI mode spawning 48 processes on a high memory
>machine with 64 available cores and 1TB of RAM.
>
>I hope I've been able to explain my situation clearly in this email.
>
>Any help is appreciated.
>Thank you.
>
>Vivek

From myandell at genetics.utah.edu Wed Mar 20 08:55:38 2013
From: myandell at genetics.utah.edu (Mark Yandell)
Date: Wed, 20 Mar 2013 14:55:38 +0000
Subject: [maker-devel] AED calculations using the MAKER pipeline
In-Reply-To:
References:
Message-ID: <7A60AB257EFF2B48B1F4C814817EA05350EDCE05@mxb2.hg.genetics.utah.edu>

Hi Vivek,

Sounds like it may be a problem with the protein2genome GFF file. Can you
send us a sample file that is known to produce the problem?

cheers,

--mark

Mark Yandell
Professor of Human Genetics
H.A.
& Edna Benning Presidential Endowed Chair Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:801-587-7707 ________________________________________ From: maker-devel-bounces at yandell-lab.org [maker-devel-bounces at yandell-lab.org] on behalf of Krishnakumar, Vivek [vKrishna at jcvi.org] Sent: Wednesday, March 20, 2013 7:05 AM To: maker-devel at yandell-lab.org Cc: Tang, Haibao; Rosen, Benjamin; Town, Christopher D.; Bidwell, Shelby Subject: [maker-devel] AED calculations using the MAKER pipeline Hi, We have been using the MAKER pipeline here at JCVI to calculate AED scores by feeding in our annotation set as `model_gff` and the protein and EST evidence as `protein_gff` and `est_gff` respectively. Here is the issue we are having: When running the above pipeline with protein2genome and est2genome evidence generated earlier by MAKER, there are no problems calculating the AED score. Normally this pipeline takes a little over 12 hours to complete. But if we use our own evidence, AAT and Genewise aligned proteins for `protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline runs very very slow and the intermediary *.gff.ann file has many chunks (separated by '###') that are completely empty. Our evidence in formatted in the same way as est2genome or protein2genome (GFF file with "expressed_sequence_match::match_part" or "protein_match::match_part" features respectively) The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use the default `max_dna_len` parameter used to split the large assemblies into chunks. Investigating the master_datastore.log shows me that the scaffolds run through without any issues and the chromosomes are still being processed. For any of the chromosomes, investigating the 'run.log' file, one level above 'theVoid' shows me how many "final.section" jobs were started and how many finished. 
And in the case of all the chromosomes, it tells me that everything that was started has finished. And the 'log.child.*' files within `theVoid` are all empty. Also within `theVoid`, I'm noticing that the "raw.section" and "evidence_*.gff" files are not empty. But one thing that is surprising is that of all the "final.section" files, only the one pertaining to the last chunk is very large (proportional to the size of the evidnce), the rest are all exactly the same size (exactly 331 bytes). I'm running MAKER in MPI mode spawning 48 processes on a high memory machine with 64 available cores and 1TB of RAM. I hope I've been able to explain my situation clearly in this email. Any help is appreciated. Thank you. Vivek _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From myandell at genetics.utah.edu Wed Mar 20 08:57:17 2013 From: myandell at genetics.utah.edu (Mark Yandell) Date: Wed, 20 Mar 2013 14:57:17 +0000 Subject: [maker-devel] AED calculations using the MAKER pipeline In-Reply-To: References: , Message-ID: <7A60AB257EFF2B48B1F4C814817EA05350EDCE15@mxb2.hg.genetics.utah.edu> whoops. looks like carson has got this one already. Thanks! Mark Yandell Professor of Human Genetics H.A. & Edna Benning Presidential Endowed Chair Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:801-587-7707 ________________________________________ From: maker-devel-bounces at yandell-lab.org [maker-devel-bounces at yandell-lab.org] on behalf of Town, Christopher D. [cdtown at jcvi.org] Sent: Wednesday, March 20, 2013 7:54 AM To: Carson Holt; Krishnakumar, Vivek; maker-devel at yandell-lab.org Cc: Tang, Haibao; Rosen, Benjamin; Bidwell, Shelby Subject: Re: [maker-devel] AED calculations using the MAKER pipeline Thanks. Is there any way of guestimating when this final step might be completed. 
We are in a time crunch here to get this analysis finished and the data/annotation out.

Best

Chris

-----Original Message-----
From: Carson Holt [mailto:Carson.Holt at oicr.on.ca]
Sent: Wednesday, March 20, 2013 9:51 AM
To: Krishnakumar, Vivek; maker-devel at yandell-lab.org
Cc: Town, Christopher D.; Tang, Haibao; Bidwell, Shelby; Rosen, Benjamin
Subject: Re: AED calculations using the MAKER pipeline

In the current MAKER download, when using GFF3 passthrough, there was an issue with everything being done at the very last step. This of course leads to a memory spike and a very slow last step. That seems to be similar to what you are describing. It should be resolved in what will become version 2.28. I can give you access to the pre-release code, so you can check that the issue is resolved for you. I'll send details in a separate e-mail.

Also, the ### will be printed after every ~100,000 bp of assembly processed by MAKER. You can ignore them, but they actually have a meaning in GFF3. Basically, everything between two ### lines is fully resolved. This allows programs that read GFF3 to parallelize file loading, or to load only sections of a file, because they can rapidly identify "safe chunks". Without them the entire file must be loaded into memory in order to be certain that all feature parts are there (as there is no requirement for sorting or order in GFF3).

log.child files will always be empty unless you run analyses like SNAP or BLAST.

Thanks,
Carson

On 13-03-20 9:05 AM, "Krishnakumar, Vivek" wrote:

>Hi,
>
>We have been using the MAKER pipeline here at JCVI to calculate AED
>scores by feeding in our annotation set as `model_gff` and the protein
>and EST evidence as `protein_gff` and `est_gff` respectively. Here is
>the issue we are having:
>
>When running the above pipeline with protein2genome and est2genome
>evidence generated earlier by MAKER, there are no problems calculating
>the AED score. Normally this pipeline takes a little over 12 hours to
>complete.
> >But if we use our own evidence, AAT and Genewise aligned proteins for >`protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline >runs very very slow and the intermediary *.gff.ann file has many chunks >(separated by '###') that are completely empty. Our evidence in >formatted in the same way as est2genome or protein2genome (GFF file >with "expressed_sequence_match::match_part" or "protein_match::match_part" >features respectively) > >The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use >the default `max_dna_len` parameter used to split the large assemblies >into chunks. > >Investigating the master_datastore.log shows me that the scaffolds run >through without any issues and the chromosomes are still being processed. >For any of the chromosomes, investigating the 'run.log' file, one level >above 'theVoid' shows me how many "final.section" jobs were started and >how many finished. And in the case of all the chromosomes, it tells me >that everything that was started has finished. And the 'log.child.*' >files within `theVoid` are all empty. Also within `theVoid`, I'm >noticing that the "raw.section" and "evidence_*.gff" files are not >empty. But one thing that is surprising is that of all the >"final.section" files, only the one pertaining to the last chunk is >very large (proportional to the size of the evidnce), the rest are all >exactly the same size (exactly 331 bytes). > >I'm running MAKER in MPI mode spawning 48 processes on a high memory >machine with 64 available cores and 1TB of RAM. > >I hope I've been able to explain my situation clearly in this email. > >Any help is appreciated. >Thank you. 
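The `###` behaviour Carson describes can be illustrated with a small reader. What follows is a minimal sketch in Python, not MAKER code: `gff3_chunks` is a hypothetical helper name, and the sample lines are invented for illustration. It assumes a GFF3 stream in which a line consisting of `###` closes a section. Empty sections are yielded too, so stretches of assembly with no features (such as the empty chunks seen in the *.gff.ann file) can be counted rather than silently skipped.

```python
from typing import Iterable, Iterator, List


def gff3_chunks(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield the feature lines of each section delimited by '###' lines."""
    chunk: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more feature lines
        if line.strip() == "###":
            yield chunk  # section closed; may be empty
            chunk = []
        elif line and not line.startswith("#"):
            chunk.append(line)  # ordinary feature line
    if chunk:
        yield chunk  # trailing features without a closing '###'


# Invented sample: one gene, an empty section, then another gene.
sample = [
    "##gff-version 3",
    "chr1\tmaker\tgene\t1\t100\t.\t+\t.\tID=g1",
    "###",
    "###",
    "chr1\tmaker\tgene\t200000\t200100\t.\t+\t.\tID=g2",
]
chunks = list(gff3_chunks(sample))
empty = sum(1 for c in chunks if not c)
print(len(chunks), "sections,", empty, "empty")
```

Because each yielded section is self-contained, a caller could hand sections to separate workers or stop reading early, which is the "safe chunks" use mentioned in the reply.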
> >Vivek _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From carsonhh at gmail.com Wed Mar 20 11:36:30 2013 From: carsonhh at gmail.com (Carson Holt) Date: Wed, 20 Mar 2013 13:36:30 -0400 Subject: [maker-devel] AED calculations using the MAKER pipeline In-Reply-To: Message-ID: On the few cases where I found this (if it is the same issue you are experiencing), it was very much dependent on the total size of the evidence database and the length of the contigs. For me it took about 25-50% longer, but used up 10-15x as much RAM (primarily because the contigs were very long > 50 Mb each). The issue was unnoticeable on the short contigs that are more typical of de novo annotation. Thanks, Carson On 13-03-20 9:54 AM, "Town, Christopher D." wrote: >Thanks. Is there any way of guestimating when this final step might be >completed. We are in a time crunch here to get this analysis finished and >the data/annotation out. > >Best > >Chris > >-----Original Message----- >From: Carson Holt [mailto:Carson.Holt at oicr.on.ca] >Sent: Wednesday, March 20, 2013 9:51 AM >To: Krishnakumar, Vivek; maker-devel at yandell-lab.org >Cc: Town, Christopher D.; Tang, Haibao; Bidwell, Shelby; Rosen, Benjamin >Subject: Re: AED calculations using the MAKER pipeline > >In the current MAKER download when using GFF3 passthrough there was an >issue with everything being done at the very last step. This of course >leads to a memory spike and a very slow last step. That seems to be >similar to what you are describing. It should be resolved in what will >become version 2.28. I can give you access to the pre-release code, so >you can check that the issue is resolved for you. I'll send details in a >separate e-mail. > >Also the ### will be printed after every ~100,000 bp of assembly >processed by MAKER. You can ignore them, but they actually have a >meaning in GFF3. 
>Basically everything between two sets of ###'s are fully resolved. It >allows programs that read GFF3 to parallelize file loading or just load >sections of a file as they can rapidly identify "safe chunks". Without >them the entire file must be loaded into memory in order to be certain >that all feature parts are there (as there is no requirement for sorting >or order in GFF3). > >log.child files will always be empty unless you run analysis like snap or >blast. > >Thanks, >Carson > > > > > > >On 13-03-20 9:05 AM, "Krishnakumar, Vivek" wrote: > >>Hi, >> >>We have been using the MAKER pipeline here at JCVI to calculate AED >>scores by feeding in our annotation set as `model_gff` and the protein >>and EST evidence as `protein_gff` and `est_gff` respectively. Here is >>the issue we are having: >> >>When running the above pipeline with protein2genome and est2genome >>evidence generated earlier by MAKER, there are no problems calculating >>the AED score. Normally this pipeline takes a little over 12 hours to >>complete. >> >>But if we use our own evidence, AAT and Genewise aligned proteins for >>`protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline >>runs very very slow and the intermediary *.gff.ann file has many chunks >>(separated by '###') that are completely empty. Our evidence in >>formatted in the same way as est2genome or protein2genome (GFF file >>with "expressed_sequence_match::match_part" or >>"protein_match::match_part" >>features respectively) >> >>The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use >>the default `max_dna_len` parameter used to split the large assemblies >>into chunks. >> >>Investigating the master_datastore.log shows me that the scaffolds run >>through without any issues and the chromosomes are still being processed. >>For any of the chromosomes, investigating the 'run.log' file, one level >>above 'theVoid' shows me how many "final.section" jobs were started and >>how many finished. 
And in the case of all the chromosomes, it tells me >>that everything that was started has finished. And the 'log.child.*' >>files within `theVoid` are all empty. Also within `theVoid`, I'm >>noticing that the "raw.section" and "evidence_*.gff" files are not >>empty. But one thing that is surprising is that of all the >>"final.section" files, only the one pertaining to the last chunk is >>very large (proportional to the size of the evidnce), the rest are all >>exactly the same size (exactly 331 bytes). >> >>I'm running MAKER in MPI mode spawning 48 processes on a high memory >>machine with 64 available cores and 1TB of RAM. >> >>I hope I've been able to explain my situation clearly in this email. >> >>Any help is appreciated. >>Thank you. >> >>Vivek > > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From ares711122 at gmail.com Thu Mar 21 18:08:45 2013 From: ares711122 at gmail.com (Hung-Wei Hsu) Date: Fri, 22 Mar 2013 08:08:45 +0800 Subject: [maker-devel] Directory structure is too deep! Message-ID: Hi MAKER developers, I found that the MAKER outputs of each contigs were located in separate deep directory. Can MAKER collect these outputs in one simple directory so that these results can be easily examined? Thanks a lot in advance. Warmest regards, Hung-Wei -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Thu Mar 21 20:07:23 2013 From: carsonhh at gmail.com (Carson Holt) Date: Thu, 21 Mar 2013 22:07:23 -0400 Subject: [maker-devel] Directory structure is too deep! In-Reply-To: Message-ID: You can use gff3_merge to collect them into a single file, or to keep them as separate files but in the same directory just use the standard linux copy command. Similarly you can use fasta_merge to collect the fasta files. 
Example:
> mkdir results
> cp *.maker.output/*_datastore/*/*/*.gff results/

Thanks,
Carson

From: Hung-Wei Hsu
Date: Thursday, 21 March, 2013 8:08 PM
To:
Subject: [maker-devel] Directory structure is too deep!

Hi MAKER developers,

I found that the MAKER outputs of each contigs were located in separate deep directory. Can MAKER collect these outputs in one simple directory so that these results can be easily examined?

Thanks a lot in advance.

Warmest regards,
Hung-Wei

From jason.stajich at gmail.com Fri Mar 22 00:12:55 2013
From: jason.stajich at gmail.com (Jason Stajich)
Date: Thu, 21 Mar 2013 20:12:55 -1000
Subject: [maker-devel] failed gene prediction
In-Reply-To:
References:
Message-ID: <59B5B965-7B15-449E-B42F-E41D4F448B6A@gmail.com>

For fungi, I've put up some of the gene prediction parameters that I've built or trained, if that is helpful for you.
https://github.com/hyphaltip/fungi-gene-prediction-params

In the absence of any ESTs or RNA-Seq, I also recommend generating a starting training set with CEGMA first and then training your predictors from there, except for GeneMark.hmm, which seems to do okay with self-training.

Jason

On Mar 18, 2013, at 10:49 AM, Carson Holt wrote:

> You didn't supply any evidence or HMM files for gene predictors. Just raw assembly data by itself is insufficient for genome annotation.
>
> Here is some nice documentation for running MAKER --> http://gmod.org/wiki/MAKER_Tutorial_2012
> Here is a nice overview of genome annotation in general --> http://fasta.bioch.virginia.edu/cshl/pdf/12/ajm12/euk_genome_annotation_review.pdf
>
> Once you've gone through the documentation and examples, if you come across any questions just let us know.
> > Thanks, > Carson > > > From: "Borhan, Hossein" > Date: Monday, 18 March, 2013 4:40 PM > To: > Subject: [maker-devel] failed gene prediction > > Hi > > I have tried maker on a fungus genome of 45 mb with 1/3 being repeat rich. It did not produce any prediction. I am not sure what is causing this. Attached are the STDERR and opts.ctl. I appreciate your help > > > Hossein > > > > > > > > > > _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org Jason Stajich jason.stajich at gmail.com jason at bioperl.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From ares711122 at gmail.com Fri Mar 22 01:52:25 2013 From: ares711122 at gmail.com (Hung-Wei Hsu) Date: Fri, 22 Mar 2013 15:52:25 +0800 Subject: [maker-devel] Can MAKER analyze the viral genome? Message-ID: Hi MAKER developers, I'm wondering if MAKER can deal with the viral genome. If yes, how do I set the running parameters? Thanks. Kind regards, Hung-Wei -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Sat Mar 23 17:42:39 2013 From: carsonhh at gmail.com (Carson Holt) Date: Sat, 23 Mar 2013 19:42:39 -0400 Subject: [maker-devel] Can MAKER analyze the viral genome? In-Reply-To: Message-ID: You can set organism type to prokaryotic and use the protein2genome option for annotation. It's not a perfect match as it only allows for partial gene spatial overlap and not full gene within a gene like you can see in viruses. Thanks, Carson From: Hung-Wei Hsu Date: Friday, 22 March, 2013 3:52 AM To: Subject: [maker-devel] Can MAKER analyze the viral genome? Hi MAKER developers, I'm wondering if MAKER can deal with the viral genome. 
If yes, how do I set the running parameters? Thanks. Kind regards, Hung-Wei _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From jjin01 at mail.rockefeller.edu Sat Mar 23 18:43:54 2013 From: jjin01 at mail.rockefeller.edu (Jingjing Jin) Date: Sun, 24 Mar 2013 00:43:54 +0000 Subject: [maker-devel] maker running error Message-ID: Dear all, When I run the maker, there is an error like this: *** buffer overflow detected ***: /usr/bin/perl terminated ======= Backtrace: ========= /lib64/libc.so.6(__fortify_fail+0x37)[0x3582d01d47] /lib64/libc.so.6[0x3582cffc30] /lib64/libc.so.6[0x3582cff089] /lib64/libc.so.6(__printf_fp+0x1531)[0x3582c4afa1] /lib64/libc.so.6(_IO_vfprintf+0x11a7)[0x3582c45407] /lib64/libc.so.6(__vsprintf_chk+0x9d)[0x3582cff12d] /lib64/libc.so.6(__sprintf_chk+0x7f)[0x3582cff06f] /usr/local/maker/lib/File/../../perl/lib/auto/Proc/ProcessTable/ProcessTable.so(OS_get_table+0x9bb)[0x7f328e8eb69b] /usr/local/maker/lib/File/../../perl/lib/auto/Proc/ProcessTable/ProcessTable.so(XS_Proc__ProcessTable_table+0x182)[0x7f328e8ecc02] /usr/lib64/perl5/CORE/libperl.so(Perl_pp_entersub+0x5a5)[0x35848a66d5] /usr/lib64/perl5/CORE/libperl.so(Perl_runops_standard+0x16)[0x35848a49c6] /usr/lib64/perl5/CORE/libperl.so(perl_run+0x338)[0x358484d0d8] /usr/bin/perl(main+0xec)[0x400cac] /lib64/libc.so.6(__libc_start_main+0xfd)[0x3582c1ecdd] /usr/bin/perl[0x400af9] ======= Memory map: ======== Could anyone give me some suggestion about how to deal with this problem? Thanks! Jingjing -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From carsonhh at gmail.com Sat Mar 23 21:04:49 2013 From: carsonhh at gmail.com (Carson Holt) Date: Sat, 23 Mar 2013 23:04:49 -0400 Subject: [maker-devel] maker running error In-Reply-To: Message-ID: Could you try maker version 2.27 from the website? Proc::ProcessTable may have problems on your system in accessing the process table. Version 2.27 tries to access the same information by first parsing the output of the standard 'df' command and only tries to access the process table directly if that fails. Thanks, Carson From: Jingjing Jin Date: Saturday, 23 March, 2013 8:43 PM To: "maker-devel at yandell-lab.org" Subject: [maker-devel] maker running error Dear all, When I run the maker, there is an error like this: *** buffer overflow detected ***: /usr/bin/perl terminated ======= Backtrace: ========= /lib64/libc.so.6(__fortify_fail+0x37)[0x3582d01d47] /lib64/libc.so.6[0x3582cffc30] /lib64/libc.so.6[0x3582cff089] /lib64/libc.so.6(__printf_fp+0x1531)[0x3582c4afa1] /lib64/libc.so.6(_IO_vfprintf+0x11a7)[0x3582c45407] /lib64/libc.so.6(__vsprintf_chk+0x9d)[0x3582cff12d] /lib64/libc.so.6(__sprintf_chk+0x7f)[0x3582cff06f] /usr/local/maker/lib/File/../../perl/lib/auto/Proc/ProcessTable/ProcessTable .so(OS_get_table+0x9bb)[0x7f328e8eb69b] /usr/local/maker/lib/File/../../perl/lib/auto/Proc/ProcessTable/ProcessTable .so(XS_Proc__ProcessTable_table+0x182)[0x7f328e8ecc02] /usr/lib64/perl5/CORE/libperl.so(Perl_pp_entersub+0x5a5)[0x35848a66d5] /usr/lib64/perl5/CORE/libperl.so(Perl_runops_standard+0x16)[0x35848a49c6] /usr/lib64/perl5/CORE/libperl.so(perl_run+0x338)[0x358484d0d8] /usr/bin/perl(main+0xec)[0x400cac] /lib64/libc.so.6(__libc_start_main+0xfd)[0x3582c1ecdd] /usr/bin/perl[0x400af9] ======= Memory map: ======== Could anyone give me some suggestion about how to deal with this problem? Thanks! 
Jingjing

From mnuhn at ebi.ac.uk Mon Mar 25 06:18:11 2013
From: mnuhn at ebi.ac.uk (mnuhn)
Date: Mon, 25 Mar 2013 12:18:11 +0000
Subject: [maker-devel] master_datastore_index.log file shrinks.
In-Reply-To:
References:
Message-ID: <407ae892252062e886fb3855bb6bf74c@ebi.ac.uk>

Thanks, this works and mpi maker is running now.

Cheers,
Michael.

P.S.:

If anyone is trying to reproduce this, I only had one directory in LD_PRELOAD and it didn't like the trailing colon, so I removed it to make it work:

export LD_PRELOAD=/software/openmpi-1.4.3/lib/libmpi.so

On 2013-03-19 15:22, Carson Holt wrote:
> I have MAKER working under OpenMPI 1.4.3 (intel compiled).
>
> I had to set a couple of environmental variables prior to setup. You would
> probably need to set these values as well. If your OpenMPI path was
> here for example --> /software/openmpi-1.4.3/, run the following commands
> (path set accordingly) before even attempting maker setup.
>
> export OMPI_MCA_mpi_warn_on_fork=0
> export LD_PRELOAD=/software/openmpi-1.4.3/lib/libmpi.so:$LD_PRELOAD
>
> These not only need to be set before compilation, but also before any run
> (so add them to your ~/.bashrc or ~/.bash_profile or any module load scripts
> thanks). The LD_PRELOAD statement needs to be set for any program using
> OpenMPI's shared libraries and not just MAKER, so it's normally a good
> idea to have that set system wide for all users. The detail can be found
> in the OpenMPI documentation. Note sometimes system library updates can
> break OpenMPI's shared libraries while not breaking OpenMPI itself, so you
> might also need to recompile OpenMPI if it has broken shared libraries.
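When MAKER is launched from a script rather than an interactive shell, the two settings from the quoted instructions can be built so that `LD_PRELOAD` never ends in a colon, which is the problem Michael reports in his P.S. The following is a minimal Python sketch under stated assumptions: `mpi_env` is a hypothetical helper (not part of MAKER or OpenMPI), and the library path is simply the example path used in this thread.

```python
import os
from typing import Dict, Mapping, Optional


def mpi_env(libmpi: str, base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with the two OpenMPI settings applied."""
    env = dict(os.environ if base is None else base)
    # Turn off OpenMPI's warning about fork(), as in the quoted instructions.
    env["OMPI_MCA_mpi_warn_on_fork"] = "0"
    # Prepend libmpi.so; add ':' only when something is already preloaded,
    # so an empty LD_PRELOAD does not produce a trailing colon.
    current = env.get("LD_PRELOAD", "")
    env["LD_PRELOAD"] = f"{libmpi}:{current}" if current else libmpi
    return env


# Example path taken from the thread; adjust to the local OpenMPI install.
env = mpi_env("/software/openmpi-1.4.3/lib/libmpi.so", base={})
print(env["LD_PRELOAD"])
```

The returned dictionary can be passed as the `env=` argument of `subprocess.run` when starting `mpiexec`, so the settings apply to that job without editing shell start-up files.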
> > Once you have those commands in place, run the perl Buil.PL step. Say > yes > to install with MPI. Then run ./Build install > > Thanks, > Carson > > > > On 13-03-19 11:02 AM, "Carson Holt" wrote: > >>Try it with the no_locks option then. Make sure to let one instance >>finish populating the mpi_blastdb directory before running other >>instances >>as that is where most initial locking occurs. >> >>I'll send you more details on how to install with OpenMPI, so you can >>give >>that a shot while your jobs are also running serially (so you don't >> lose >>time). Also instead of 50 serial instances, you could try 10 with >> -cpus >>set to 5. >> >>Thanks, >>Carson >> >> >> >>On 13-03-19 11:19 AM, "Michael Nuhn" wrote: >> >>>Hello Carson! >>> >>>On 03/19/2013 02:27 PM, Carson Holt wrote: >>>> Yes. If at all possible use MPI. It removes the overhead of >>>> locks >>>> which happen per primary instance of MAKER. So one maker job >>>> using >>>>1000 >>>> cpus via MPI will have one shared set of locks. 1000 serial >>>> instances >>>> of MAKER on the other hand would have 1000x the locks. >>> >>>I don't know a thing about MPI. >>> >>>I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and >>> open >>>mpi and none of them worked for me. I also tried the automatic >>>installation that comes with maker, but it didn't work for me >>> either. >>> >>>If need be, I could spend time getting to the bottom of this, but >>> there >>>is no telling how long this would take me so I'd rather not, if >>> there is >>>an alternative. >>> >>>Would the approach I outlined before work? (Treating the split files >>> as >>>separate genomes to annotate and then combine the gffs afterwards) >>> >>>I also like this approach, because I would select a few contigs in >>> the >>>beginning which I would run on their own. They would complete early >>> and >>>this way I would get a preview of the results of the run instead of >>>having to wait for everything to complete. 
>>>
>>>It might also be more robust, because file locking issues would be
>>>confined to the instances working on a sequence chunk, but the rest of
>>>the instances could continue working.
>>>
>>>Cheers,
>>>Michael.
>>>
>>>> Alternatively if you do need to continue without MPI for some reason, I
>>>> just finished a devel version of MAKER that has a --no_locks option.
>>>> You can never start two instances using the same input fasta when
>>>> --no_locks is specified, but the splitting to use different input fastas
>>>> I mentioned before in the example will still work fine.
>>>>
>>>> I also have updated the indexing/reindexing, so if indexing failures
>>>> happen, MAKER will switch between the current working directory and the
>>>> TMP= directory from the maker_opts.ctl file so as to try different IO
>>>> locations (i.e. NFS and non-NFS). Note you should never set TMP= in the
>>>> control files to an NFS mounted location (it not only makes things a lot
>>>> slower, but BerkeleyDB and SQLite will get frequent errors on NFS).
>>>> TMP= defaults to /tmp when not specified
>>>>
>>>> I'll send you download information in a separate e-mail. Try a regular
>>>> MAKER run to see if the indexing/reindexing changes are sufficient
>>>> before attempting the --no_locks option.
>>>>
>>>> Thanks,
>>>> Carson

From lengjingmao at gmail.com Mon Mar 25 07:49:11 2013
From: lengjingmao at gmail.com (shaohua.fan)
Date: Mon, 25 Mar 2013 14:49:11 +0100
Subject: [maker-devel] maker terminated strangely
Message-ID:

Hi Maker developers,

I ran into a problem while using the MAKER 2.27 beta version: the pipeline terminated in the middle of the process without any error message.

The genome I am working with is a eukaryotic genome consisting of around 6000 scaffolds.
I combined de novo (Augustus and SNAP) and evidence based (protein from a closely related species and transcriptome from the same species) for the gene prediction (the genome is already repeat masked). The MPI (mpich2 version 1.5) enabled maker was run on a cluster by using SGE. I checked with the administrator of our cluster, there is no limitation of SGE job. The maker was run by using mpiexec -n 48 /home/shafan/maker/bin/maker maker_opts.ctl maker_bopts.ctl maker_exe.ctl I attached my maker_opt.ctl, please let me know if you need any information for this problem. Thanks a lot! Shaohua -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: maker_opts.ctl Type: application/octet-stream Size: 4518 bytes Desc: not available URL: From carsonhh at gmail.com Mon Mar 25 08:01:45 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 25 Mar 2013 10:01:45 -0400 Subject: [maker-devel] maker terminated strangely In-Reply-To: Message-ID: Could you send your captured standard error. That would contain messages that highlight the specific cause. Thanks, Carson From: "shaohua.fan" Date: Monday, 25 March, 2013 9:49 AM To: Subject: [maker-devel] maker terminated strangely Hi Maker developers, I met a problem when I was using Maker version 2.27 beta version that the pipeline terminated in the middle of the process without any error message. The genome I am working with is a Eukaryotic genome which is consisted by around 6000 scaffolds. I combined de novo (Augustus and SNAP) and evidence based (protein from a closely related species and transcriptome from the same species) for the gene prediction (the genome is already repeat masked). The MPI (mpich2 version 1.5) enabled maker was run on a cluster by using SGE. I checked with the administrator of our cluster, there is no limitation of SGE job. 
The maker was run by using mpiexec -n 48 /home/shafan/maker/bin/maker maker_opts.ctl maker_bopts.ctl maker_exe.ctl I attached my maker_opt.ctl, please let me know if you need any information for this problem. Thanks a lot! Shaohua _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From lengjingmao at gmail.com Mon Mar 25 08:07:17 2013 From: lengjingmao at gmail.com (shaohua.fan) Date: Mon, 25 Mar 2013 15:07:17 +0100 Subject: [maker-devel] maker terminated strangely In-Reply-To: References: Message-ID: Hi Carson, Do you mean standard output from maker? If yes, I need to upload the file to our ftp server, since it is quite big around 1.1 Gb. Shaohua 2013/3/25 Carson Holt > Could you send your captured standard error. That would contain messages > that highlight the specific cause. > > Thanks, > Carson > > > From: "shaohua.fan" > Date: Monday, 25 March, 2013 9:49 AM > To: > Subject: [maker-devel] maker terminated strangely > > Hi Maker developers, > > I met a problem when I was using Maker version 2.27 beta version that the > pipeline terminated in the middle of the process without any error message. > > > The genome I am working with is a Eukaryotic genome which is consisted by > around 6000 scaffolds. I combined de novo (Augustus and SNAP) and evidence > based (protein from a closely related species and transcriptome from the > same species) for the gene prediction (the genome is already repeat > masked). The MPI (mpich2 version 1.5) enabled maker was run on a cluster by > using SGE. I checked with the administrator of our cluster, there is no > limitation of SGE job. 
> > The maker was run by using mpiexec -n 48 /home/shafan/maker/bin/maker > maker_opts.ctl maker_bopts.ctl maker_exe.ctl > > I attached my maker_opt.ctl, please let me know if you need any > information for this problem. > > Thanks a lot! > > Shaohua > _______________________________________________ maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon Mar 25 08:07:45 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 25 Mar 2013 10:07:45 -0400 Subject: [maker-devel] master_datastore_index.log file shrinks. In-Reply-To: <407ae892252062e886fb3855bb6bf74c@ebi.ac.uk> Message-ID: Great news. I'm glad it's working. If you have more questions, just let me know. --Carson On 13-03-25 8:18 AM, "mnuhn" wrote: >Thanks, this works and mpi maker is running now. > >Cheers, >Michael. > >P.S.: > >If anyone is trying to reproduce this, I only had one directory in >LD_PRELOAD and it didn't like the trailing colon, so I removed it to >make it work: > >export LD_PRELOAD=/software/openmpi-1.4.3/lib/libmpi.so > >On 2013-03-19 15:22, Carson Holt wrote: >> I have MAKER working under OpemnMPI 1.4.3 (intel compiled). >> >> I had to set a couple of environmental variables prior to setup. You >> would >> probably need to set these values as well. If you your OpenMPI path >> was >> here for example --> /software/openmpi-1.4.3/, run the following >> commands >> (path set accordingly) before even attempting maker setup. >> >> export OMPI_MCA_mpi_warn_on_fork 0 >> export LD_PRELOAD /software/openmpi-1.4.3/lib/libmpi.so:$LD_PRELOAD >> >> These not only need to be set before compilation, but also before any >> run >> (so add them to you ~.bashrc or ~/.bash_profile or any module load >> scripts >> thanks). 
The LD_PRELOAD statement needs to be set for any program >> using >> OpenMPI's shared libraries and not just MAKER, so it's normally a >> good >> idea to have that set system wide for all users. The detail can be >> found >> in the OpenMPI documentation. Note sometimes system library updates >> can >> break OpenMPI's shared libraries while not breaking OpenMPI itself, >> so you >> might also need to recompile OpenMPI if it has broken shared >> libraries. >> >> Once you have those commands in place, run the perl Buil.PL step. Say >> yes >> to install with MPI. Then run ./Build install >> >> Thanks, >> Carson >> >> >> >> On 13-03-19 11:02 AM, "Carson Holt" wrote: >> >>>Try it with the no_locks option then. Make sure to let one instance >>>finish populating the mpi_blastdb directory before running other >>>instances >>>as that is where most initial locking occurs. >>> >>>I'll send you more details on how to install with OpenMPI, so you can >>>give >>>that a shot while your jobs are also running serially (so you don't >>> lose >>>time). Also instead of 50 serial instances, you could try 10 with >>> -cpus >>>set to 5. >>> >>>Thanks, >>>Carson >>> >>> >>> >>>On 13-03-19 11:19 AM, "Michael Nuhn" wrote: >>> >>>>Hello Carson! >>>> >>>>On 03/19/2013 02:27 PM, Carson Holt wrote: >>>>> Yes. If at all possible use MPI. It removes the overhead of >>>>> locks >>>>> which happen per primary instance of MAKER. So one maker job >>>>> using >>>>>1000 >>>>> cpus via MPI will have one shared set of locks. 1000 serial >>>>> instances >>>>> of MAKER on the other hand would have 1000x the locks. >>>> >>>>I don't know a thing about MPI. >>>> >>>>I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and >>>> open >>>>mpi and none of them worked for me. I also tried the automatic >>>>installation that comes with maker, but it didn't work for me >>>> either. 
>>>> >>>>If need be, I could spend time getting to the bottom of this, but >>>> there >>>>is no telling how long this would take me so I'd rather not, if >>>> there is >>>>an alternative. >>>> >>>>Would the approach I outlined before work? (Treating the split files >>>> as >>>>separate genomes to annotate and then combine the gffs afterwards) >>>> >>>>I also like this approach, because I would select a few contigs in >>>> the >>>>beginning which I would run on their own. They would complete early >>>> and >>>>this way I would get a preview of the results of the run instead of >>>>having to wait for everything to complete. >>>> >>>>It might also be more robust, because file locking issues would be >>>>confined to the instances working on a sequence chunk, but the rest >>>> of >>>>the instances could continue working. >>>> >>>>Cheers, >>>>Michael. >>>> >>>>> Alternatively if you do need to continue without MPI for some >>>>> reason, I >>>>> just finished a devel version of MAKER that has a --no_locks >>>>> option. >>>>> You can never start two instances using the same input fasta >>>>> when >>>>> --no_locks is specified, but the splitting to use different input >>>>>fastas >>>>> I mentioned before in the example will still work fine. >>>>> >>>>> I also have updated the indexing/reindexing, so if indexing >>>>> failures >>>>> happen, MAKER will switch between the current working directory >>>>> and the >>>>> TMP= directory from the maker_opts.ctl file so as to try different >>>>> IO >>>>> locations (i.e. NFS and non-NFS). Note you should never set TMP= >>>>> in >>>>>the >>>>> control files to an NFS mounted location (it not only makes things >>>>> a >>>>>lot >>>>> slower, but BerkeleyDB and SQLite will get frequent errors on >>>>> NFS). >>>>> TMP= defaults to /tmp when not specified. >>>>> >>>>> I'll send you download information in a separate e-mail.
Try a >>>>> regular >>>>> MAKER run to see if the indexing/reindexing changes are sufficient >>>>> before attempting the --no_locks option. >>>>> >>>>> Thanks, >>>>> Carson > From carsonhh at gmail.com Mon Mar 25 08:08:17 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 25 Mar 2013 10:08:17 -0400 Subject: [maker-devel] maker terminated strangely In-Reply-To: Message-ID: Yes. Thanks, Carson From: "shaohua.fan" Date: Monday, 25 March, 2013 10:07 AM To: Carson Holt Cc: Subject: Re: [maker-devel] maker terminated strangely Hi Carson, Do you mean standard output from maker? If yes, I need to upload the file to our ftp server, since it is quite big, around 1.1 Gb. Shaohua 2013/3/25 Carson Holt > Could you send your captured standard error? That would contain messages that > highlight the specific cause. > > Thanks, > Carson > > > From: "shaohua.fan" > Date: Monday, 25 March, 2013 9:49 AM > To: > Subject: [maker-devel] maker terminated strangely > > Hi Maker developers, > > I ran into a problem when using the Maker 2.27 beta version: the > pipeline terminated in the middle of the process without any error message. > > > The genome I am working with is a eukaryotic genome which consists of > around 6000 scaffolds. I combined de novo (Augustus and SNAP) and evidence > based (protein from a closely related species and transcriptome from the same > species) for the gene prediction (the genome is already repeat masked). The > MPI (mpich2 version 1.5) enabled maker was run on a cluster by using SGE. I > checked with the administrator of our cluster; there is no limit on SGE > jobs. > > The maker was run by using mpiexec -n 48 /home/shafan/maker/bin/maker > maker_opts.ctl maker_bopts.ctl maker_exe.ctl > > I attached my maker_opt.ctl, please let me know if you need any information > for this problem. > > Thanks a lot!
> > Shaohua > _______________________________________________ maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From ares711122 at gmail.com Mon Mar 25 20:50:52 2013 From: ares711122 at gmail.com (Hung-Wei Hsu) Date: Tue, 26 Mar 2013 10:50:52 +0800 Subject: [maker-devel] Why are some start positions minus in the gff result? Message-ID: Hi MAKER developers, I could successfully run MAKER and get the final gff. But I found that some start positions in the gff were negative. That led to an error in the gff reader. Is this a bug? Could you please help to resolve this problem? Thanks a lot in advance. Best regards, Hung-Wei -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon Mar 25 21:24:01 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 25 Mar 2013 23:24:01 -0400 Subject: [maker-devel] Why are some start positions minus in the gff result? In-Reply-To: Message-ID: I haven't seen that before, so could you package up the job (all input and control files) that generates this and send it to me? You're using maker's prokaryotic settings to try and get it to annotate viral genomes, correct? --Carson From: Hung-Wei Hsu Date: Monday, 25 March, 2013 10:50 PM To: Subject: [maker-devel] Why are some start positions minus in the gff result? Hi MAKER developers, I could successfully run MAKER and get the final gff. But I found that some start positions in the gff were negative. That led to an error in the gff reader. Is this a bug? Could you please help to resolve this problem? Thanks a lot in advance. Best regards, Hung-Wei _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed...
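Before packaging up a job like the one discussed in this thread, it can help to list exactly which features have a start below 1. The following is a minimal sketch, assuming a standard nine-column, tab-separated GFF3 file in which column 4 is the start coordinate; the function name find_bad_starts is illustrative and is not part of MAKER.

```shell
#!/bin/sh
# Print every GFF3 feature line whose start coordinate (column 4) is below 1.
# GFF3 coordinates are 1-based, so a zero or negative start is invalid.
# Comment/directive lines (starting with '#') and lines with fewer than
# nine tab-separated fields (e.g. an embedded FASTA section) are skipped.
find_bad_starts() {
    awk -F '\t' '!/^#/ && NF >= 9 && $4 + 0 < 1' "$1"
}

# Example (hypothetical file name):
#   find_bad_starts genome.all.gff
```

The offending lines, together with the control files, give a much smaller reproduction case to send along than the full output.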
URL: From hudarul at yahoo.com Sun Mar 31 14:02:04 2013 From: hudarul at yahoo.com (Hud Hud) Date: Sun, 31 Mar 2013 13:02:04 -0700 (PDT) Subject: [maker-devel] Help on error-Repeat masker Message-ID: <1364760124.37890.YahooMailNeo@web164901.mail.bf1.yahoo.com> Hello, I have a problem when running maker. I get the following error; what could possibly be going wrong here? Thanks so much. setting up GFF3 output and fasta chunks doing repeat masking running repeat masker. #--------- command -------------# Widget::RepeatMasker: cd /tmp/maker_WOVHsi; /home/maker-2.27-beta/maker/exe/RepeatMasker/RepeatMasker /home/maker-2.27-beta/maker/data/contig.maker.output/contig_datastore/61/0D/contig172//theVoid.contig172/contig172.0.simple.rb -dir /home/maker-2.27-beta/maker/data/contig.maker.output/contig_datastore/61/0D/contig172//theVoid.contig172 -pa 1 -lib /tmp/maker_WOVHsi/b1piBcWHlH #-------------------------------# sh: /home/maker-2.27-beta/maker/exe/RepeatMasker/RepeatMasker: /u1/local/bin/perl: bad interpreter: Permission denied ERROR: RepeatMasker failed --> rank=NA, hostname=Homis ERROR: Failed while doing repeat masking ERROR: Chunk failed at level:0, tier_type:1 FAILED CONTIG:contig172 ERROR: Chunk failed at level:2, tier_type:0 FAILED CONTIG:172 examining contents of the fasta file and run log -------------- next part -------------- An HTML attachment was scrubbed...
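The "bad interpreter: Permission denied" line in the log above means the shell could not execute the program named on the first line of the RepeatMasker script (here /u1/local/bin/perl). A minimal sketch for checking this on any script follows; the function name check_interpreter is illustrative, and the check only looks at the interpreter path itself, not at directory permissions further up the path.

```shell
#!/bin/sh
# Report the interpreter named on a script's "#!" line and whether the
# current user is able to execute it.
check_interpreter() {
    script=$1
    # Take the first line, drop the leading "#!" and any blanks after it,
    # and keep only the interpreter path (arguments are discarded).
    interp=$(head -n 1 "$script" | sed -n 's/^#![[:space:]]*\([^[:space:]]*\).*/\1/p')
    if [ -z "$interp" ]; then
        echo "no shebang line in $script"
        return 2
    fi
    if [ -x "$interp" ]; then
        echo "ok: $interp"
    else
        echo "cannot execute: $interp"
        return 1
    fi
}

# Example, using the path from the log above:
#   check_interpreter /home/maker-2.27-beta/maker/exe/RepeatMasker/RepeatMasker
```

If the check reports that the interpreter cannot be executed, the first line of the script probably needs to point at a Perl that the current user can run.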
URL: From kenlee.nakasugi at sydney.edu.au Sun Mar 3 16:44:01 2013 From: kenlee.nakasugi at sydney.edu.au (Kenlee Nakasugi) Date: Mon, 04 Mar 2013 10:44:01 +1100 Subject: [maker-devel] regarding mpich In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6E7062FF@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6E7062FF@CHIMBX5.ad.uillinois.edu> Message-ID: <1362354241.2252.38.camel@waterhouse874-8> Hi, I'm trying to install mpi_maker (Maker 2.1) on a new system (intel ix x86_64) which has mpich v3.0.2 installed, but I can't seem to get maker Build.PL to recognize it. I tried editing the Build.pm file to point to it, but with no success. All dependencies have been installed and successfully recognized, it is just MPI support that is not. Is there anything I could modify in the install scripts to make it recognize this? Currently, the directly path to where the mpicc and mpiexec are is /apps/mpich/3.0.2/bin I don't have sys admin rights for the machine, and I'm not sure if this version of mpich was installed for shared libraries as per the GMOD tutorial. But I have previously circumvented this with an earlier version of mpich by modifying the Build.pm module with success. I'm wondering if mpichv3.02 is not compatible? Cheers, Ken -- Kenlee Nakasugi | Research Fellow School of Molecular Bioscience Level 8, SMB Building (G08)| The University of Sydney | NSW | 2006 T: +61 2 9114 1321 E: kenlee.nakasugi at sydney.edu.au -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon Mar 4 06:35:03 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 04 Mar 2013 08:35:03 -0500 Subject: [maker-devel] regarding mpich In-Reply-To: <1362354241.2252.38.camel@waterhouse874-8> Message-ID: Use the last MPICH2 version, as MPICH3 is very different (it's the first attempt to implement the new MPI3 protocol set, and not just a version update). Alternatively you can use OpenMPI. 
Also use maker version 2.27 instead for MPI. Thanks, Carson From: Kenlee Nakasugi Date: Sunday, 3 March, 2013 6:44 PM To: "maker-devel at yandell-lab.org List" Subject: [maker-devel] regarding mpich Hi, I'm trying to install mpi_maker (Maker 2.1) on a new system (intel ix x86_64) which has mpich v3.0.2 installed, but I can't seem to get maker Build.PL to recognize it. I tried editing the Build.pm file to point to it, but with no success. All dependencies have been installed and successfully recognized, it is just MPI support that is not. Is there anything I could modify in the install scripts to make it recognize this? Currently, the directly path to where the mpicc and mpiexec are is /apps/mpich/3.0.2/bin I don't have sys admin rights for the machine, and I'm not sure if this version of mpich was installed for shared libraries as per the GMOD tutorial. But I have previously circumvented this with an earlier version of mpich by modifying the Build.pm module with success. I'm wondering if mpichv3.02 is not compatible? Cheers, Ken -- Kenlee Nakasugi | Research Fellow School of Molecular Bioscience Level 8, SMB Building (G08)| The University of Sydney | NSW | 2006 T: +61 2 9114 1321 E: kenlee.nakasugi at sydney.edu.au _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From canchaya at uvigo.es Mon Mar 4 04:10:26 2013 From: canchaya at uvigo.es (Carlos A. Canchaya) Date: Mon, 4 Mar 2013 12:10:26 +0100 Subject: [maker-devel] Sharing benchmarks of maker References: <6472D2A0-7BA8-41F0-ACFD-4D3C800D36FB@uvigo.es> Message-ID: <7F41714C-6C75-4892-AA5B-D7649DDA7DF2@uvigo.es> Hi, I've just install maker2 in our server and run a first test with our data. 
The input was about 30 000 sequences (9.6 Mb) and it was run in just one server with 32 processors for 36 hours) with mpich2. Our server has 250 Gb of memory and cpus of 2,4 Gb. The test was simple because it only ran repeatmasker and SNAP. Considering that we would like to use other gene prediction/annotation tools available in MAKER, I wonder if you can share some of your benchmarks in order to know if we could scale up pretty well to our production cluster in order to annotate our 1.6 Gb draft genome Best, Carlos Carlos A. Canchaya, PhD IPP Research Fellow Department of Biochemistry, Genetics and Immunology Faculty of Biology Campus Universitario University of Vigo 36310 Vigo Spain http://darwin.uvigo.es/~ccanchaya/ email: canchaya at uvigo.es Tel : +34 986 130048 Fax: +34 986 812556 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon Mar 4 08:12:06 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 04 Mar 2013 10:12:06 -0500 Subject: [maker-devel] Sharing benchmarks of maker In-Reply-To: <7F41714C-6C75-4892-AA5B-D7649DDA7DF2@uvigo.es> Message-ID: Performance is highly dependent on the size of evidence datasets used (proteins/ESTs) as well as the IO performance of a system when running via MPI (you can hit IO bottlenecks well before cpu bottlenecks depending on cluster configuration). The Arabidopsis genome (120Mb assembly) running SNAP and Augustus, 1.1Gb EST dataset, and 10Mb protein dataset takes ~1 hour 30 min on 1,500 cpus with OpenMPI. The Maize genome (2.1 Gb) running SNAP and Augustus, 3Gb EST dataset, and 16 Mb protein dataset takes ~4 hours 30 min on 2200 cpus. A human sized genome would take 5-6 days on 100 cpus. MAKER is fully restartable (keeps log of progress). So if there is any failure or the user kills it in the middle of a job, it will pick up at the point it left off on restart (so you don't waste all that processing time). 
2Gb of RAM per processing core is recommended when parallelizing MAKER via MPI, but fragmented genomes with smaller contigs can get by with less than 1Gb per core. MAKER version 2.28 which has additional optimization for OpenMPI and lower memory footprint will be available in a couple of weeks. Until then 2.27 is recommended over 2.1 for MPI. 2.27 should also work with OpenMPI. 2.1 only works with older versions of MPICH2 using the mpd launcher and not the current hydra launcher. Thanks, Carson From: "Carlos A. Canchaya" Date: Monday, 4 March, 2013 6:10 AM To: Subject: [maker-devel] Sharing benchmarks of maker Hi, I've just install maker2 in our server and run a first test with our data. The input was about 30 000 sequences (9.6 Mb) and it was run in just one server with 32 processors for 36 hours) with mpich2. Our server has 250 Gb of memory and cpus of 2,4 Gb. The test was simple because it only ran repeatmasker and SNAP. Considering that we would like to use other gene prediction/annotation tools available in MAKER, I wonder if you can share some of your benchmarks in order to know if we could scale up pretty well to our production cluster in order to annotate our 1.6 Gb draft genome Best, Carlos Carlos A. Canchaya, PhD IPP Research Fellow Department of Biochemistry, Genetics and Immunology Faculty of Biology Campus Universitario University of Vigo 36310 Vigo Spain http://darwin.uvigo.es/~ccanchaya/ email: canchaya at uvigo.es Tel : +34 986 130048 Fax: +34 986 812556 > _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... 
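The figures in this message lend themselves to quick back-of-the-envelope sizing. The sketch below is only arithmetic on the numbers quoted above (2 GB of RAM per core, and the Arabidopsis and Maize run times); the function names are illustrative, and real requirements will depend on the evidence datasets and the cluster's IO, as the message itself points out.

```shell
#!/bin/sh
# Rough sizing helpers for an MPI MAKER run. Names are illustrative only.

# Total RAM in GB for a number of cores, at 2 GB per core unless a
# different per-core figure is given as the second argument.
estimate_ram_gb() {
    cores=$1
    per_core_gb=${2:-2}
    echo $(( cores * per_core_gb ))
}

# CPU-hours used by a run: cores times wall-clock minutes, divided by 60.
cpu_hours() {
    cores=$1
    minutes=$2
    echo $(( cores * minutes / 60 ))
}

estimate_ram_gb 32      # a 32-core server            -> 64
cpu_hours 1500 90       # Arabidopsis, 1 h 30 min     -> 2250
cpu_hours 2200 270      # Maize, 4 h 30 min           -> 9900
```

By this estimate the 32-processor, 250 Gb server described in the original question sits well above the 2 GB per core recommendation.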
URL: From carsonhh at gmail.com Mon Mar 4 08:33:02 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 04 Mar 2013 10:33:02 -0500 Subject: [maker-devel] Sharing benchmarks of maker In-Reply-To: Message-ID: For the Arabidopsis genome it also took ~2 hours 10 min on 600 cpus, so there was only a 40 min gain by going from 600 to 1,500 cpus. This is because assembly structure has a lot to do with the efficiency of the parallelization, so you can hit a point of diminishing returns on some assemblies sooner than others. --Carson From: Carson Holt Date: Monday, 4 March, 2013 10:12 AM To: "Carlos A. Canchaya" , Subject: Re: [maker-devel] Sharing benchmarks of maker Performance is highly dependent on the size of evidence datasets used (proteins/ESTs) as well as the IO performance of a system when running via MPI (you can hit IO bottlenecks well before cpu bottlenecks depending on cluster configuration). The Arabidopsis genome (120Mb assembly) running SNAP and Augustus, 1.1Gb EST dataset, and 10Mb protein dataset takes ~1 hour 30 min on 1,500 cpus with OpenMPI. The Maize genome (2.1 Gb) running SNAP and Augustus, 3Gb EST dataset, and 16 Mb protein dataset takes ~4 hours 30 min on 2200 cpus. A human sized genome would take 5-6 days on 100 cpus. MAKER is fully restartable (keeps log of progress). So if there is any failure or the user kills it in the middle of a job, it will pick up at the point it left off on restart (so you don't waste all that processing time). 2Gb of RAM per processing core is recommended when parallelizing MAKER via MPI, but fragmented genomes with smaller contigs can get by with less than 1Gb per core. MAKER version 2.28 which has additional optimization for OpenMPI and lower memory footprint will be available in a couple of weeks. Until then 2.27 is recommended over 2.1 for MPI. 2.27 should also work with OpenMPI. 2.1 only works with older versions of MPICH2 using the mpd launcher and not the current hydra launcher. Thanks, Carson From: "Carlos A.
Canchaya" Date: Monday, 4 March, 2013 6:10 AM To: Subject: [maker-devel] Sharing benchmarks of maker Hi, I've just install maker2 in our server and run a first test with our data. The input was about 30 000 sequences (9.6 Mb) and it was run in just one server with 32 processors for 36 hours) with mpich2. Our server has 250 Gb of memory and cpus of 2,4 Gb. The test was simple because it only ran repeatmasker and SNAP. Considering that we would like to use other gene prediction/annotation tools available in MAKER, I wonder if you can share some of your benchmarks in order to know if we could scale up pretty well to our production cluster in order to annotate our 1.6 Gb draft genome Best, Carlos Carlos A. Canchaya, PhD IPP Research Fellow Department of Biochemistry, Genetics and Immunology Faculty of Biology Campus Universitario University of Vigo 36310 Vigo Spain http://darwin.uvigo.es/~ccanchaya/ email: canchaya at uvigo.es Tel : +34 986 130048 Fax: +34 986 812556 > _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.comhttp://box290.bluehost.com/mailman/listinfo/m aker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From kenlee.nakasugi at sydney.edu.au Mon Mar 4 13:50:27 2013 From: kenlee.nakasugi at sydney.edu.au (Kenlee Nakasugi) Date: Mon, 4 Mar 2013 20:50:27 +0000 Subject: [maker-devel] regarding mpich In-Reply-To: References: <1362354241.2252.38.camel@waterhouse874-8>, Message-ID: Thanks Carson. Will Maker 2.27 be able to continue analysis on Maker 2.1 files that stopped halfway? Thanks Ken On 05/03/2013, at 1:44 AM, "Carson Holt" > wrote: Use the last MPICH2 version, as MPICH3 is very different (it's the first attempt to implement the new MPI3 protocol set, and not just a version update). Alternatively you can use OpenMPI. Also use maker version 2.27 instead for MPI. 
Thanks, Carson From: Kenlee Nakasugi > Date: Sunday, 3 March, 2013 6:44 PM To: "maker-devel at yandell-lab.org List" > Subject: [maker-devel] regarding mpich Hi, I'm trying to install mpi_maker (Maker 2.1) on a new system (intel ix x86_64) which has mpich v3.0.2 installed, but I can't seem to get maker Build.PL to recognize it. I tried editing the Build.pm file to point to it, but with no success. All dependencies have been installed and successfully recognized, it is just MPI support that is not. Is there anything I could modify in the install scripts to make it recognize this? Currently, the directly path to where the mpicc and mpiexec are is /apps/mpich/3.0.2/bin I don't have sys admin rights for the machine, and I'm not sure if this version of mpich was installed for shared libraries as per the GMOD tutorial. But I have previously circumvented this with an earlier version of mpich by modifying the Build.pm module with success. I'm wondering if mpichv3.02 is not compatible? Cheers, Ken -- Kenlee Nakasugi | Research Fellow School of Molecular Bioscience Level 8, SMB Building (G08)| The University of Sydney | NSW | 2006 T: +61 2 9114 1321 E: kenlee.nakasugi at sydney.edu.au _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From dsth at ebi.ac.uk Mon Mar 4 13:57:01 2013 From: dsth at ebi.ac.uk (Daniel Hughes) Date: Mon, 4 Mar 2013 20:57:01 +0000 Subject: [maker-devel] regarding mpich In-Reply-To: References: <1362354241.2252.38.camel@waterhouse874-8> Message-ID: Unlikely. Probably safer to export what has finished as gff and run it as re-annotation if you don't want to waste what was alteady processed for running additional iterations. Dan from me phone... On Mar 4, 2013 8:52 PM, "Kenlee Nakasugi" wrote: > Thanks Carson. 
Will Maker 2.27 be able to continue analysis on Maker 2.1 > files that stopped halfway? > Thanks > Ken > > > > On 05/03/2013, at 1:44 AM, "Carson Holt" wrote: > > Use the last MPICH2 version, as MPICH3 is very different (it's the > first attempt to implement the new MPI3 protocol set, and not just a > version update). Alternatively you can use OpenMPI. Also use maker > version 2.27 instead for MPI. > > Thanks, > Carson > > > > From: Kenlee Nakasugi > Date: Sunday, 3 March, 2013 6:44 PM > To: "maker-devel at yandell-lab.org List" > Subject: [maker-devel] regarding mpich > > Hi, > > I'm trying to install mpi_maker (Maker 2.1) on a new system (intel ix > x86_64) which has mpich v3.0.2 installed, but I can't seem to get maker > Build.PL to recognize it. I tried editing the Build.pm file to point to it, > but with no success. > All dependencies have been installed and successfully recognized, it is > just MPI support that is not. > > Is there anything I could modify in the install scripts to make it > recognize this? Currently, the directly path to where the mpicc and mpiexec > are is /apps/mpich/3.0.2/bin > I don't have sys admin rights for the machine, and I'm not sure if this > version of mpich was installed for shared libraries as per the GMOD > tutorial. But I have previously circumvented this with an earlier version > of mpich by modifying the Build.pm module with success. I'm wondering if > mpichv3.02 is not compatible? 
> > > Cheers, > Ken > > -- > Kenlee Nakasugi | Research Fellow > School of Molecular Bioscience > Level 8, SMB Building (G08)| The University of Sydney | NSW | 2006 > T: +61 2 9114 1321 > E: kenlee.nakasugi at sydney.edu.au > > _______________________________________________ maker-devel mailing > list maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon Mar 4 13:58:21 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 04 Mar 2013 15:58:21 -0500 Subject: [maker-devel] regarding mpich In-Reply-To: Message-ID: Some files it can reuse, but not all. So, exporting finished contigs with GFF3 pass-through is an option. --Carson From: Daniel Hughes Date: Monday, 4 March, 2013 3:57 PM To: Kenlee Nakasugi Cc: "maker-devel at yandell-lab.org List" , Carson Holt Subject: Re: [maker-devel] regarding mpich Unlikely. Probably safer to export what has finished as gff and run it as re-annotation if you don't want to waste what was alteady processed for running additional iterations. Dan from me phone... On Mar 4, 2013 8:52 PM, "Kenlee Nakasugi" wrote: > Thanks Carson. Will Maker 2.27 be able to continue analysis on Maker 2.1 files > that stopped halfway? > Thanks > Ken > > > > On 05/03/2013, at 1:44 AM, "Carson Holt" wrote: > >> Use the last MPICH2 version, as MPICH3 is very different (it's the first >> attempt to implement the new MPI3 protocol set, and not just a version >> update). Alternatively you can use OpenMPI. Also use maker version 2.27 >> instead for MPI. 
>> >> Thanks, >> Carson >> >> >> >> From: Kenlee Nakasugi >> Date: Sunday, 3 March, 2013 6:44 PM >> To: "maker-devel at yandell-lab.org List" >> Subject: [maker-devel] regarding mpich >> >> Hi, >> >> I'm trying to install mpi_maker (Maker 2.1) on a new system (intel ix x86_64) >> which has mpich v3.0.2 installed, but I can't seem to get maker Build.PL to >> recognize it. I tried editing the Build.pm file to point to it, but with no >> success. >> All dependencies have been installed and successfully recognized, it is just >> MPI support that is not. >> >> Is there anything I could modify in the install scripts to make it recognize >> this? Currently, the directly path to where the mpicc and mpiexec are is >> /apps/mpich/3.0.2/bin >> I don't have sys admin rights for the machine, and I'm not sure if this >> version of mpich was installed for shared libraries as per the GMOD tutorial. >> But I have previously circumvented this with an earlier version of mpich by >> modifying the Build.pm module with success. I'm wondering if mpichv3.02 is >> not compatible? >> >> >> Cheers, >> Ken >> >> -- >> Kenlee Nakasugi | Research Fellow >> School of Molecular Bioscience >> Level 8, SMB Building (G08)| The University of Sydney | NSW | 2006 >> T: +61 2 9114 1321 >> E: kenlee.nakasugi at sydney.edu.au >> _______________________________________________ maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kenlee.nakasugi at sydney.edu.au Mon Mar 4 18:49:09 2013 From: kenlee.nakasugi at sydney.edu.au (Kenlee Nakasugi) Date: Tue, 05 Mar 2013 12:49:09 +1100 Subject: [maker-devel] hex char:29 error with Signal.pm Message-ID: <1362448149.6346.46.camel@waterhouse874-8> Hi again, I'm running into the following error when I run maker 2.1: ## Ran into unknown state (hex char: 29) at /home/programs/maker/lib/File/..//Proc/Signal.pm line 94. ## I tried applying the patch as described here: http://gmod.827538.n3.nabble.com/cluster-error-running-maker-td4022354.html Using the command: $ patch -np1 < 646785-and-handle-Hex29.patch I did this in maker/lib/Proc and maker/lib/Process directories, but am getting this error: ## patch: **** Only garbage was found in the patch input. ## Apparently, this isn't a fatal error: http://gmod.827538.n3.nabble.com/Ran-into-unknown-state-hex-char-29-at-home-a200302-maker-2-10-lib-File-Proc-Signal-pm-line-94-td3034795.html and I might eventually have to run the latest version of Maker, but I need to continue a previous analyses and not having this constant error would be great. The version of Proc::ProcessTable is already latest, 0.47. The platform is ix x86_64 GNU/Linux Thanks, Ken -- Kenlee Nakasugi | Research Fellow School of Molecular Bioscience Level 8, SMB Building (G08)| The University of Sydney | NSW | 2006 T: +61 2 9114 1321 E: kenlee.nakasugi at sydney.edu.au -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon Mar 4 21:48:17 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 04 Mar 2013 23:48:17 -0500 Subject: [maker-devel] hex char:29 error with Signal.pm In-Reply-To: <1362448149.6346.46.camel@waterhouse874-8> Message-ID: This is an issue with Proc::ProcessTable on some systems. If you upgrade to MAKER 2.27 it goes away because it no longer uses Proc::ProcessTable. 
Thanks, Carson From: Kenlee Nakasugi Date: Monday, 4 March, 2013 8:49 PM To: "maker-devel at yandell-lab.org List" Subject: [maker-devel] hex char:29 error with Signal.pm Hi again, I'm running into the following error when I run maker 2.1: ## Ran into unknown state (hex char: 29) at /home/programs/maker/lib/File/..//Proc/Signal.pm line 94. ## I tried applying the patch as described here: http://gmod.827538.n3.nabble.com/cluster-error-running-maker-td4022354.html Using the command: $ patch -np1 < 646785-and-handle-Hex29.patch I did this in maker/lib/Proc and maker/lib/Process directories, but am getting this error: ## patch: **** Only garbage was found in the patch input. ## Apparently, this isn't a fatal error: http://gmod.827538.n3.nabble.com/Ran-into-unknown-state-hex-char-29-at-home- a200302-maker-2-10-lib-File-Proc-Signal-pm-line-94-td3034795.html and I might eventually have to run the latest version of Maker, but I need to continue a previous analyses and not having this constant error would be great. The version of Proc::ProcessTable is already latest, 0.47. The platform is ix x86_64 GNU/Linux Thanks, Ken -- Kenlee Nakasugi | Research Fellow School of Molecular Bioscience Level 8, SMB Building (G08)| The University of Sydney | NSW | 2006 T: +61 2 9114 1321 E: kenlee.nakasugi at sydney.edu.au _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From Carson.Holt at oicr.on.ca Wed Mar 6 10:45:40 2013 From: Carson.Holt at oicr.on.ca (Carson Holt) Date: Wed, 6 Mar 2013 17:45:40 +0000 Subject: [maker-devel] thread terminated, causing all processes to fail In-Reply-To: Message-ID: The failed thread is usually just a symptom. There is something causing the thread to fail. Could you send me your STDERR. Often times there is a warning or error further up. 
Thanks, Carson From: Ramón Fallon > Date: Wednesday, 6 March, 2013 12:34 PM To: > Subject: thread terminated, causing all processes to fail Hi, I'm using the maker_svn rev 995 version and hand-compiled MPICH2 on a single multicore machine. I've successfully run the dpp_contig.fasta (MPI/8 processes) example but am having trouble with larger contigs fasta files of my own, which are well formed. I've run into a problem whereby an mpiexec run of 8 processes will stop due to a perl-thread related problem which says FATAL: Thread terminated, causing all processes to fail this corresponds to line 924 in the maker executable (which is for the secondary/worker threads), and is the result of a test on !$thr OR'd with !$thr->is_running, so clearly one of these is failing. $thr itself is a threads->new(\&$node_thread, $gdbfile). Despite being a programmer, I've only recently started to look at the code and have not got the hang of the parallelisation setup here, though I gather the master must use threads to initially generate the parallel instances which then use the message passing. Of course threads don't have message passing ability, so I guess something clever is going on and will take some time for me to understand. Clearly however, it has worked before on dpp_contigs, so it may be is something wrong with my datafile or the way I am carrying out the analysis. Any clues that can be put my way are welcome. Thank you! -------------- next part -------------- An HTML attachment was scrubbed... URL: From ramonfallon at gmail.com Wed Mar 6 10:34:59 2013 From: ramonfallon at gmail.com (Ramón Fallon) Date: Wed, 6 Mar 2013 18:34:59 +0100 Subject: [maker-devel] thread terminated, causing all processes to fail Message-ID: Hi, I'm using the maker_svn rev 995 version and hand-compiled MPICH2 on a single multicore machine.
I've successfully run the dpp_contig.fasta (MPI/8 processes) example but am having trouble with larger contigs fasta files of my own, which are well formed. I've run into a problem whereby an mpiexec run of 8 processes will stop due to a perl-thread related problem which says FATAL: Thread terminated, causing all processes to fail this corresponds to line 924 in the maker executable (which is for the secondary/worker threads), and is the result of a test on !$thr OR'd with !$thr->is_running, so clearly one of these is failing. $thr itself is a threads->new(\&$node_thread, $gdbfile). Despite being a programmer, I've only recently started to look at the code and have not got the hang of the parallelisation setup here, though I gather the master must use threads to initially generate the parallel instances which then use the message passing. Of course threads don't have message passing ability, so I guess something clever is going on and will take some time for me to understand. Clearly however, it has worked before on dpp_contigs, so it may be is something wrong with my datafile or the way I am carrying out the analysis. Any clues that can be put my way are welcome. Thank you! -------------- next part -------------- An HTML attachment was scrubbed... URL: From ramonfallon at gmail.com Wed Mar 6 10:57:12 2013 From: ramonfallon at gmail.com (=?ISO-8859-1?Q?Ram=F3n_Fallon?=) Date: Wed, 6 Mar 2013 18:57:12 +0100 Subject: [maker-devel] thread terminated, causing all processes to fail In-Reply-To: References: Message-ID: Hi, Many thanks for your quick reply and hint. Yes, you're right .. further up there is indeed Calling FastaDB::new at /opt/src/maker_svn/bin/../lib/FastaSeq.pm line 148 thread 1. Thread 1 terminated abnormally: ERROR: Could not reestablish DB to thaw FastaSeq for Storable --> rank=5, hostname=fatnode, at /opt/src/maker_svn/bin/maker line 1457 thread 1. I run a "script" session and have maker on -debug so I have everything in one file. 
Do you prefer to have it attached to a post to this mailing list (if it accepts txt attachments)?

Cheers.

From Carson.Holt at oicr.on.ca Wed Mar 6 11:04:30 2013 From: Carson.Holt at oicr.on.ca (Carson Holt) Date: Wed, 6 Mar 2013 18:04:30 +0000 Subject: [maker-devel] thread terminated, causing all processes to fail In-Reply-To: Message-ID:

If you do reply all to this message, I should get the attachment. It will be stripped from the one going to the list though.
Thanks, Carson

From ramonfallon at gmail.com Wed Mar 6 11:15:01 2013 From: ramonfallon at gmail.com (Ramón Fallon) Date: Wed, 6 Mar 2013 19:15:01 +0100 Subject: [maker-devel] thread terminated, causing all processes to fail In-Reply-To: References: Message-ID:

OK great, here goes .. many thanks!

-------------- next part -------------- A non-text attachment was scrubbed... Name: rf_mkr_run.scriptlog.zip Type: application/zip Size: 7598 bytes Desc: not available URL:

From Carson.Holt at oicr.on.ca Wed Mar 6 11:22:38 2013 From: Carson.Holt at oicr.on.ca (Carson Holt) Date: Wed, 6 Mar 2013 18:22:38 +0000 Subject: [maker-devel] thread terminated, causing all processes to fail In-Reply-To: Message-ID:

Could you delete your ../*maker.output/mpi_blastdb directory, and then when rerunning maker, run with the -a flag?

Thanks, Carson
From ramonfallon at gmail.com Wed Mar 6 11:49:46 2013 From: ramonfallon at gmail.com (Ramón Fallon) Date: Wed, 6 Mar 2013 19:49:46 +0100 Subject: [maker-devel] thread terminated, causing all processes to fail In-Reply-To: References: Message-ID:

OK, will do. Will get back to you tomorrow on it.

Many thanks!
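For readers following along, Carson's request (delete the mpi_blastdb directory, then rerun with -a) could be scripted roughly as below. This is only a sketch: the helper names and the "my_genome.maker.output" directory are made up for illustration, and the command line is simply the "mpiexec -n 8 maker -debug" invocation mentioned in the thread with -a added.

```python
import shutil
from pathlib import Path


def reset_mpi_blastdb(maker_output_dir):
    """Remove <maker_output_dir>/mpi_blastdb; return True if it existed."""
    db_dir = Path(maker_output_dir) / "mpi_blastdb"
    if db_dir.is_dir():
        shutil.rmtree(db_dir)
        return True
    return False


def rerun_command(n_procs=8):
    """Build the MPI command line quoted in the thread, with -a added."""
    return ["mpiexec", "-n", str(n_procs), "maker", "-a", "-debug"]


if __name__ == "__main__":
    # Hypothetical directory name; substitute your own *.maker.output.
    removed = reset_mpi_blastdb("my_genome.maker.output")
    print("removed old mpi_blastdb:", removed)
    print("now run:", " ".join(rerun_command()))
```

The script only prints the command rather than launching it, so the rerun itself stays a deliberate manual step.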
From ramonfallon at gmail.com Thu Mar 7 07:40:53 2013 From: ramonfallon at gmail.com (Ramón Fallon) Date: Thu, 7 Mar 2013 15:40:53 +0100 Subject: [maker-devel] thread terminated, causing all processes to fail In-Reply-To: References: Message-ID:

Hi Carson,

I'm sending you a zip of the text file of my repeated maker session, this time having deleted the mpi_blastdb dir and with the -a flag added to the "mpiexec -n 8 maker -debug" command line.

Cheers / Ramón.

-------------- next part -------------- A non-text attachment was scrubbed...
Name: rf_mkr_run.scriptlog2.zip Type: application/zip Size: 6430 bytes Desc: not available URL:

From carsonhh at gmail.com Thu Mar 7 09:44:40 2013 From: carsonhh at gmail.com (Carson Holt) Date: Thu, 07 Mar 2013 11:44:40 -0500 Subject: [maker-devel] thread terminated, causing all processes to fail In-Reply-To: Message-ID:

That is extremely odd. It fails to even generate the indexes. Could you check the drive space of your working directory and your /tmp directory?

It is odd because Bioperl uses the stat command to check on the file right before making a tied hash. So it was there for the stat but not the tie, which is immediately following.

If you check manually does it exist now? -->
/home/ramonf/makertrials/mgallocut7/sca29310_8.maker.output/mpi_blastdb/sca29310_8%2Efa.mpi.1/sca29310_8%2Efa.mpi.1.0.index

Are you running in an NFS mounted directory?

--Carson
_______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From ramonfallon at gmail.com Thu Mar 7 10:47:53 2013 From: ramonfallon at gmail.com (Ramón Fallon) Date: Thu, 7 Mar 2013 18:47:53 +0100 Subject: [maker-devel] thread terminated, causing all processes to fail In-Reply-To: References: Message-ID:

This is a standalone machine with no NFS at all. "df" gives a healthy amount of disk space, so there should be no problem there.

Yes, that file does exist, although it has the nominal 12288-byte size, which appears to be the minimum for a DB_File tie.

As I mentioned, the dpp_contig.fa example set does work, so part of my investigation is looking at how. I can do some trivial unit tests on the Bioperl stat-before-tied-hashes situation and see what comes up.

So I'll attempt to clear that up and then revert. Many thanks! / Ramón.
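The two manual checks discussed above (free space in the working directory and /tmp, and whether the index file exists and how big it is) could be scripted along these lines. It is a rough sketch only: the function names and the example index path are placeholders, and it does not attempt the Perl DB_File tie itself, so it can only confirm what "df" and a directory listing would show.

```python
import os
import shutil


def free_gib(path):
    """Free space, in GiB, on the filesystem holding `path`."""
    return shutil.disk_usage(path).free / 2**30


def describe_file(path):
    """Return (exists, size_in_bytes); size is 0 when the file is missing."""
    try:
        return True, os.stat(path).st_size
    except FileNotFoundError:
        return False, 0


if __name__ == "__main__":
    for d in (".", "/tmp"):
        if os.path.isdir(d):
            print(f"{d}: {free_gib(d):.1f} GiB free")
    # Hypothetical path; point this at the .index file named in the error log.
    print(describe_file("example.maker.output/mpi_blastdb/example.index"))
```

A result of (True, 12288) would match the size Ramón reports, which he takes to be a freshly created, essentially empty index.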
From carsonhh at gmail.com Thu Mar 7 10:57:46 2013 From: carsonhh at gmail.com (Carson Holt) Date: Thu, 07 Mar 2013 12:57:46 -0500 Subject: [maker-devel] thread terminated, causing all processes to fail In-Reply-To: Message-ID:

Try running maker outside of with the -a flag after deleting mpi_blastdb. Does it still happen? Also, if you try again with MPI with the -a flag and having deleted mpi_blastdb, does it fail the same way every time? Could you also check for background maker processes that may be trying to work in the same directory that you may not have realized were running?

Thanks, Carson
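One way to look for stray background maker processes, as asked above, is to filter a process listing for commands that invoke an executable named "maker". The sketch below assumes a POSIX "ps" that accepts "-eo pid,args"; the function names are illustrative and the matching rule (any word whose basename is exactly "maker") is a guess at what is useful, not something taken from MAKER itself.

```python
import subprocess


def find_maker_processes(ps_lines):
    """Return (pid, command) pairs whose command invokes something named 'maker'."""
    hits = []
    for line in ps_lines:
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skips the header row and blank lines
        pid, cmd = int(parts[0]), parts[1].strip()
        # Compare the basename of every word, so "perl /path/to/maker" matches
        # but "vim maker_opts.ctl" does not.
        if any(word.rsplit("/", 1)[-1] == "maker" for word in cmd.split()):
            hits.append((pid, cmd))
    return hits


def list_processes():
    """Assumption: a POSIX ps that understands -eo pid,args."""
    result = subprocess.run(["ps", "-eo", "pid,args"],
                            capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


if __name__ == "__main__":
    for pid, cmd in find_maker_processes(list_processes()):
        print(pid, cmd)
```

Any hit that was not started deliberately is worth stopping before the next rerun, since it may be working in the same output directory.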
From carsonhh at gmail.com Thu Mar 7 14:09:34 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 07 Mar 2013 16:09:34 -0500
Subject: [maker-devel] thread terminated, causing all processes to fail

It should have said "Try running maker outside of MPI".

--Carson

From: Carson Holt
Date: Thursday, 7 March, 2013 12:57 PM
To: Ramón Fallon
Cc: "maker-devel at yandell-lab.org"
Subject: Re: [maker-devel] thread terminated, causing all processes to fail

Try running maker outside of with the -a flag after deleting mpi_blastdb. Does it still happen?

Also if you try again with MPI with the -a flag and having deleted mpi_blastdb, does it fail the same every time?

Could you also check for background maker processes that may be trying to work in the same directory that you may not have realized were running.

Thanks,
Carson

From: Ramón Fallon
Date: Thursday, 7 March, 2013 12:47 PM
To: Carson Holt
Cc: "maker-devel at yandell-lab.org"
Subject: Re: [maker-devel] thread terminated, causing all processes to fail

This is a standalone machine and no NFS at all. "df" gives a healthy amount of disk space, so there should be no problem there.

Yes that file does exist although it has the nominal 12288 bytes size, which appears to be the minimum for a DB_file tie.

As I mentioned the dpp_contig.fa example set does work so part of my investigation is looking at how.

I can do some trivial unit tests on the Bioperl stat-before-tied-hashes situation and see what comes up.

So I'll attempt to clear that up and then revert.

Many thanks! / Ramón.

On Thu, Mar 7, 2013 at 5:44 PM, Carson Holt wrote:
> That is extremely odd. It fails to even generate the indexes. Could you
> check the drive space of your working directory and your /tmp directory?
>
> It is odd because Bioperl uses the stat command to check on the file right
> before making a tied hash. So it was there for the stat but not the tie,
> which is immediately following.
>
> If you check manually does it exist now? -->
> /home/ramonf/makertrials/mgallocut7/sca29310_8.maker.output/mpi_blastdb/sca29310_8%2Efa.mpi.1/sca29310_8%2Efa.mpi.1.0.index
>
> Are you running in an NFS mounted directory?
>
> --Carson

From kangyangjae at gmail.com Thu Mar 7 21:00:19 2013
From: kangyangjae at gmail.com (Kang, Yang Jae)
Date: Fri, 8 Mar 2013 13:00:19 +0900
Subject: [maker-devel] retrying the FAILED scaffolds
Message-ID: <13f201ce1bb1$769c9e20$63d5da60$@gmail.com>

Hello

I have a question regarding some FAILED scaffolds

Is there any way to re-try maker pipeline on just Failed scaffolds separately?

And do I have to manually erase for the failed directories named as ../theVoid.scaffold_#/?

And how can I track down the reason why only those 20 out of around 3000 scaffolds?

Thank you

Kang, Yang Jae Ph.D.
Cropgenomics Lab.
College of Agriculture and Life Science
Seoul National University
Korea

From carsonhh at gmail.com Thu Mar 7 21:13:08 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 07 Mar 2013 23:13:08 -0500
Subject: [maker-devel] retrying the FAILED scaffolds
In-Reply-To: <13f201ce1bb1$769c9e20$63d5da60$@gmail.com>

Is there any way to re-try maker pipeline on just Failed scaffolds separately?
> Yes. The failed contig fasta will be in the maker.output subdirectory for
> that contig. Alternatively use the fasta_tool script to extract them from
> the genome file. You can then run them in a separate directory, or use the
> '-base' command line flag to force it to use the base name of the current
> results directory.
> Use the -g option to override the genome file without having to edit the
> control files.
>
> Example:
>
> maker -g failed.fasta -base maize_assemby
>
> Output will end up here --> maize_assemby.maker.output

And do I have to manually erase for the failed directories named as ../theVoid.scaffold_#/?
> No. You can let MAKER just retry them as is (let maker handle what to delete
> and keep) or set clean_try=1 to force full deletion before rerunning.

And how can I track down the reason why only those 20 out of around 3000 scaffolds?
> Search for the tag "ERROR" in the standard output of your run. What MAKER
> version are you using? I can take a look at the STDERR as well if you want.
> If it's too big for e-mail, you can share it via dropbox.

Thanks,
Carson

From carsonhh at gmail.com Fri Mar 8 13:20:37 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 08 Mar 2013 15:20:37 -0500
Subject: [maker-devel] thread terminated, causing all processes to fail

I think I've found the potential cause and committed the necessary changes to fix it.

Thanks,
Carson

From: Ramón Fallon
Date: Thursday, 7 March, 2013 12:47 PM
To: Carson Holt
Cc: "maker-devel at yandell-lab.org"
Subject: Re: [maker-devel] thread terminated, causing all processes to fail

This is a standalone machine and no NFS at all. "df" gives a healthy amount of disk space, so there should be no problem there.

Yes that file does exist although it has the nominal 12288 bytes size, which appears to be the minimum for a DB_file tie.

As I mentioned the dpp_contig.fa example set does work so part of my investigation is looking at how.

I can do some trivial unit tests on the Bioperl stat-before-tied-hashes situation and see what comes up.

So I'll attempt to clear that up and then revert.

Many thanks! / Ramón.

On Thu, Mar 7, 2013 at 5:44 PM, Carson Holt wrote:
> That is extremely odd.
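For the retry of failed scaffolds described in the reply to Kang above, the failed IDs have to be collected and their sequences written to a separate FASTA file before running maker with -g and -base. A minimal Python sketch of that step follows. It assumes the run's index log is tab-separated with the sequence ID in the first column and a status such as FAILED in the third; that layout is an assumption here and should be checked against your own output. The names failed_ids and extract_fasta are hypothetical helpers, not MAKER tools; the fasta_tool script mentioned in the reply remains the supported route.

```python
def failed_ids(index_log_lines):
    """Return IDs whose most recent status is FAILED.

    Assumption: each line has at least three tab-separated fields
    (sequence ID, output directory, status); shorter lines are skipped.
    """
    last_status = {}
    for line in index_log_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue
        # Later lines overwrite earlier ones, so a retry that finished wins.
        last_status[fields[0]] = fields[2]
    return [seq_id for seq_id, status in last_status.items() if status == "FAILED"]


def extract_fasta(fasta_lines, wanted):
    """Yield the lines of FASTA records whose ID (first word after '>') is in `wanted`."""
    wanted = set(wanted)
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            header = line[1:].split()
            keep = bool(header) and header[0] in wanted
        if keep:
            yield line
```

The yielded lines can be written to failed.fasta and passed to maker with -g failed.fasta, together with -base if the results should land in the existing output directory.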
From carsonhh at gmail.com Fri Mar 8 13:28:32 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 08 Mar 2013 15:28:32 -0500
Subject: [maker-devel] thread terminated, causing all processes to fail

Also delete mpi_blastdb before retrying with the new svn repository.

Thanks,
Carson

From: Carson Holt
Date: Friday, 8 March, 2013 3:20 PM
To: Ramón Fallon
Cc: "maker-devel at yandell-lab.org"
Subject: Re: [maker-devel] thread terminated, causing all processes to fail

I think I've found the potential cause and committed the necessary changes to fix it.

Thanks,
Carson
From carsonhh at gmail.com Sun Mar 10 10:31:27 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 10 Mar 2013 12:31:27 -0400
Subject: [maker-devel] thread terminated, causing all processes to fail

I've fixed the missing script issue.

Thanks,
Carson

From: Ramón Fallon
Date: Sunday, 10 March, 2013 10:45 AM
To: Carson Holt
Cc: "maker-devel at yandell-lab.org"
Subject: Re: [maker-devel] thread terminated, causing all processes to fail

Hi Carson,

In terms of rev 995, on a simplified version of our data set, I tried a sequential run successfully, and even a "mpiexec -n 4" which ran to completion.

In any case, many thanks for the new version 996. I did have a problem with the build, namely the new line:

'bin/TACC.PL' => ['bin/ibrun'],

I tried to find TACC.PL unsuccessfully, so I decided to dispense with this new line and then it compiled fine.

I started one or two tests and will inform you later about them.

From my end I must admit I am using a rather large EST fasta file, but is not useful for test .. I will try to cut it down Monday or Tues so that tests can be more agile.

Many thanks / Ramón.

On Fri, Mar 8, 2013 at 9:28 PM, Carson Holt wrote:
> Also delete mpi_blastdb before retrying with the new svn repository.
>
> Thanks,
> Carson

From ramonfallon at gmail.com Sun Mar 10 08:45:38 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Sun, 10 Mar 2013 15:45:38 +0100
Subject: [maker-devel] thread terminated, causing all processes to fail

Hi Carson,

In terms of rev 995, on a simplified version of our data set, I tried a sequential run successfully, and even a "mpiexec -n 4" which ran to completion.

In any case, many thanks for the new version 996. I did have a problem with the build, namely the new line:

'bin/TACC.PL' => ['bin/ibrun'],

I tried to find TACC.PL unsuccessfully, so I decided to dispense with this new line and then it compiled fine.

I started one or two tests and will inform you later about them.

From my end I must admit I am using a rather large EST fasta file, but is not useful for test .. I will try to cut it down Monday or Tues so that tests can be more agile.

Many thanks / Ramón.

On Fri, Mar 8, 2013 at 9:28 PM, Carson Holt wrote:
> Also delete mpi_blastdb before retrying with the new svn repository.
>
> Thanks,
> Carson

From mikheyev at gmail.com Mon Mar 11 03:46:06 2013
From: mikheyev at gmail.com (Sasha Mikheyev)
Date: Mon, 11 Mar 2013 18:46:06 +0900
Subject: [maker-devel] duplicate CDS in annotation

Dear Yandell lab,

I am re-annotating the harvester ant genome using protein and RNA-seq data. However, I get many artifacts like the one below.
It seems that there are several CDS records that should tie in to the same mRNA, but they are really hanging out separately, and produce several nucleotide sequences with the same name when extracted from the gff. I would appreciate any guidance about how to fix this! Thank you, Sasha grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159; pbar_scf7180000350377 maker mRNA 538308 558769 . + . ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; pbar_scf7180000350377 maker exon 538308 538334 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506; pbar_scf7180000350377 maker exon 538748 538968 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506; pbar_scf7180000350377 maker exon 539842 540242 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506; pbar_scf7180000350377 maker exon 542624 542798 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506; pbar_scf7180000350377 maker exon 555823 556025 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506; pbar_scf7180000350377 maker exon 558609 558769 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506; pbar_scf7180000350377 maker CDS 538308 538334 . + 0 ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506; pbar_scf7180000350377 maker CDS 538748 538968 . 
+ 0 ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506; pbar_scf7180000350377 maker CDS 539842 540242 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506; pbar_scf7180000350377 maker CDS 542624 542798 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506; pbar_scf7180000350377 maker CDS 555823 556025 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506; pbar_scf7180000350377 maker CDS 558609 558769 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506; -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry.moore at genetics.utah.edu Mon Mar 11 05:32:44 2013 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Mon, 11 Mar 2013 05:32:44 -0600 Subject: [maker-devel] duplicate CDS in annotation In-Reply-To: References: Message-ID: <52822F85-760B-451B-B156-8861EA77A910@genetics.utah.edu> Hi Sasha, This gene model appears to be correctly formatted to me. In GFF3 format the CDS features are allowed to span multiple lines and they share the same ID to indicate that it is all the same features. See the GFF3 specification on the Sequence Ontology website (http://www.sequenceontology.org/resources/gff3.html), and in particular the description of the ID attribute specifies: ID Indicates the ID of the feature. IDs for each feature must be unique within the scope of the GFF file. In the case of discontinuous features (i.e. a single feature that exists over multiple genomic locations) the same ID may appear on multiple lines. All lines that share an ID collectively represent a single feature. So each of those CDS lines forms one part of the single CDS feature for this gene. B On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote: > Dear Yandell lab, > > I am re-annotating the harvester and genome using protein and RNA-seq data. However, I get many artifacts like the one below. 
It seems that there are several CDS records that should tie in to the same mRNA, but they are really hanging out separately, and produce several nucleotide sequences with the same name when extracted from the gff. I would appreciate any guidance about how to fix this! > > Thank you, > > Sasha > > grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff > pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; > pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159; > pbar_scf7180000350377 maker mRNA 538308 558769 . + . ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; > pbar_scf7180000350377 maker exon 538308 538334 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 538748 538968 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 539842 540242 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 542624 542798 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 555823 556025 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 558609 558769 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 538308 538334 . + 0 ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 538748 538968 . 
+ 0 ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker CDS 539842 540242 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker CDS 542624 542798 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker CDS 555823 556025 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker CDS 558609 558769 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506;
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543

From carsonhh at gmail.com Mon Mar 11 07:02:13 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 11 Mar 2013 09:02:13 -0400
Subject: [maker-devel] duplicate CDS in annotation
In-Reply-To: <52822F85-760B-451B-B156-8861EA77A910@genetics.utah.edu>
Message-ID:

I think the issue is that you are getting a match feature that is being printed with the same ID as the mRNA feature. Correct?

What version of MAKER are you using, and what does the file you are giving to pred_gff or model_gff look like? Could you send them?

Thanks,
Carson

From: Barry Moore
Date: Monday, 11 March, 2013 7:32 AM
To: Sasha Mikheyev
Cc:
Subject: Re: [maker-devel] duplicate CDS in annotation

Hi Sasha,

This gene model appears to be correctly formatted to me. In GFF3 format the CDS features are allowed to span multiple lines and they share the same ID to indicate that it is all the same features.
See the GFF3 specification on the Sequence Ontology website (http://www.sequenceontology.org/resources/gff3.html), and in particular the description of the ID attribute specifies: > ID Indicates the ID of the feature. IDs for each feature must be unique > within the scope of the GFF file. In the case of discontinuous features (i.e. > a single feature that exists over multiple genomic locations) the same ID may > appear on multiple lines. All lines that share an ID collectively represent a > single feature. So each of those CDS lines forms one part of the single CDS feature for this gene. B On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote: > Dear Yandell lab, > > I am re-annotating the harvester and genome using protein and RNA-seq data. > However, I get many artifacts like the one below. It seems that there are > several CDS records that should tie in to the same mRNA, but they are really > hanging out separately, and produce several nucleotide sequences with the same > name when extracted from the gff. I would appreciate any guidance about how to > fix this! > > Thank you, > > Sasha > > grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff > pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . > ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; > pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . > ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=H > sal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159; > pbar_scf7180000350377 maker mRNA 538308 558769 . + . > ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377 > -abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29- > mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; > pbar_scf7180000350377 maker exon 538308 538334 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:25 > 06; > pbar_scf7180000350377 maker exon 538748 538968 0.01 + . 
> ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:25 > 06; > pbar_scf7180000350377 maker exon 539842 540242 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:25 > 06; > pbar_scf7180000350377 maker exon 542624 542798 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:25 > 06; > pbar_scf7180000350377 maker exon 555823 556025 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:25 > 06; > pbar_scf7180000350377 maker exon 558609 558769 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:25 > 06; > pbar_scf7180000350377 maker CDS 538308 538334 . + 0 > ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:250 > 6; > pbar_scf7180000350377 maker CDS 538748 538968 . + 0 > ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:250 > 6; > pbar_scf7180000350377 maker CDS 539842 540242 . + 1 > ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:250 > 6; > pbar_scf7180000350377 maker CDS 542624 542798 . + 2 > ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:250 > 6; > pbar_scf7180000350377 maker CDS 555823 556025 . + 1 > ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:250 > 6; > pbar_scf7180000350377 maker CDS 558609 558769 . + 2 > ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:250 > 6; > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org Barry Moore Research Scientist Dept. 
of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543

_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From sitaram.rajaraman at helsinki.fi Mon Mar 11 08:33:27 2013
From: sitaram.rajaraman at helsinki.fi (Sitaram Rajaraman)
Date: Mon, 11 Mar 2013 16:33:27 +0200
Subject: [maker-devel] Doubts in the synthesis part of MAKER
Message-ID: <513DEB37.6090601@helsinki.fi>

Hello MAKER developers,

I'm Sitaram, working as a Bioinformatician at the University of Helsinki. We are trying out MAKER as part of a gene prediction/annotation pipeline and have some doubts regarding this. In the synthesis step in the paper, I find it a bit hard to visualise how the hints are generated from the various sources and how the scores are calculated. It would be nice if you could throw some light on this. Also, if you could point to the particular .pm file which contains the actual source code, it would be convenient, as there is quite a lot of source code and debugging the whole set is a bit cumbersome.

Regards,

--
Sitaram Rajaraman,
Plant Stress Research Group,
Dept of Biosciences,
University of Helsinki.

From carsonhh at gmail.com Mon Mar 11 08:51:56 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 11 Mar 2013 10:51:56 -0400
Subject: [maker-devel] Doubts in the synthesis part of MAKER
In-Reply-To: <513DEB37.6090601@helsinki.fi>
Message-ID:

Hints are basically CDS location, exon location, and intron location. The CDS hints are based on protein alignment. Intron and exon hints are based on the EST alignments, which when polished should give exact intron coordinates. Ironically the most useless part of the gene model is actually the most informative feature for gene prediction (the intron coordinates).
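As a rough illustration of why a polished spliced alignment pins down intron coordinates, here is a minimal Python sketch. It is not MAKER's code: the function name `introns_from_exons` is hypothetical, and the exon coordinates are simply borrowed from the maker exon rows quoted elsewhere in this archive. Given the aligned exon blocks of one transcript, the introns are the gaps between consecutive blocks.

```python
def introns_from_exons(exons):
    """Return the introns implied by one transcript's exon blocks.

    Coordinates are 1-based and inclusive, as in GFF3.
    (Hypothetical helper for illustration; not part of MAKER.)
    """
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        # Only a real gap between adjacent blocks counts as an intron.
        if next_start - prev_end > 1:
            introns.append((prev_end + 1, next_start - 1))
    return introns


# Exon coordinates taken from the maker exon rows quoted in this thread.
exons = [
    (538308, 538334),
    (538748, 538968),
    (539842, 540242),
    (542624, 542798),
    (555823, 556025),
    (558609, 558769),
]

for start, end in introns_from_exons(exons):
    print(f"intron\t{start}\t{end}")
```

Six exon blocks give five intron intervals, each starting one base after an exon ends and stopping one base before the next exon starts.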
lib/Process/MPIchunk.pm will have the steps in the _go method. It is a little hard to follow as MAKER is designed for distributed parallelization (i.e. parallelization without shared memory, with steps potentially divided among different machines on the other end of the network).

It is divided into MPItier and MPIchunk objects. The MPItier object encapsulates a series of linear steps or 'levels', while each MPIchunk object encapsulates a single step sent to a machine across the network and exists within a single 'level' of the MPItier object. Note there can be multiple chunks assigned to a 'level'. MPItiers can also have MPItiers as children at a given level instead of MPIchunks, so the process structure then branches like a tree and can then merge back somewhere in the middle of the algorithm.

The 'maker' script is really just the communication script for the objects. In MPI one maker thread is launched to handle communication and another to run the MPItiers and MPIchunks. The communication threads then pass MPIchunks and MPItiers back and forth across the network by either requesting things to do from other nodes or by asking for help if they have a large number of MPIchunks or MPItiers to process.

Thanks,
Carson

On 13-03-11 10:33 AM, "Sitaram Rajaraman" wrote:

>Hello MAKER developers,
> I'm Sitaram, working as a Bioinformatician at the University of
>Helsinki. We are trying out MAKER as part of a gene prediction/annotation
>pipeline and have some doubts regarding this. In the synthesis step in
>the paper, I find it a bit hard to visualise how the hints are generated
>from the various sources and the scores are calculated. It would be nice
>if you could throw some light on this. Also if you could point to the
>particular .Pm file which contains the actual source code, it would be
>convenient as there quite a lot of source code and debugging the whole
>set is bit cumbersome.
> >Regards, > >-- >Sitaram Rajaraman, >Plant Stress Research Group, >Dept of Biosciences, >University of Helsinki. > > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From sitaram.rajaraman at helsinki.fi Mon Mar 11 09:03:20 2013 From: sitaram.rajaraman at helsinki.fi (Sitaram Rajaraman) Date: Mon, 11 Mar 2013 17:03:20 +0200 Subject: [maker-devel] Doubts in the synthesis part of MAKER In-Reply-To: References: Message-ID: <513DF238.4050109@helsinki.fi> Thank you ! I will proceed with this information ! - Sitaram. On 03/11/2013 04:51 PM, Carson Holt wrote: > Hints are basically CDS location, exon location, and intron location. The > CDS hints are based on protein alignment. Intron and exon hints are based > on the EST alignments, which when polished should give exact intron > coordinates. Ironically the most useless part of the gene model is > actually the most informative feature for gene prediction (the intron > coordinates). > > lib/Process/MPIchunk.pm will have the steps in the _go method. It is a > little hard to follow as MAKER is designed for distributed parallelization > (i.e. parallelization without shared memory with steps potentially divided > on different machines on the other end of the network). > > It is divided into MPItier and MPIchunk objects. The MPItier object > encapsulate a series of linear steps or 'levels' while the MPIchunk > objects encapsulate a single step sent to a machine across the network and > it exists within a single 'level' of the MPITier object. Note there can > be multiple chunks assigned to a 'level'. MPItiers can also have MPITiers > as children at a given level instead of MPIchunks, so the process > structure then branches like a tree and can then merges back somewhere in > the middle of the algorithm. > > The 'maker' script is really just the communication script for the > objects. 
In MPI one maker thread is launched to handle communication and
> another to run the MPItiers and MPIchunks. They communication threads
> then pass MPIchunks and MPITiers back and forth across the network by
> either requesting things to do from other nodes or by asking for help if
> they have a large number of MPIChunks or MPItiers to process.
>
> Thanks,
> Carson
>
>
>
>
>
> On 13-03-11 10:33 AM, "Sitaram Rajaraman"
> wrote:
>
>> Hello MAKER developers,
>> I'm Sitaram, working as a Bioinformatician at the University of
>> Helsinki. We are trying out MAKER as part of a gene prediction/annotation
>> pipeline and have some doubts regarding this. In the synthesis step in
>> the paper, I find it a bit hard to visualise how the hints are generated
>> from the various sources and the scores are calculated. It would be nice
>> if you could throw some light on this. Also if you could point to the
>> particular .Pm file which contains the actual source code, it would be
>> convenient as there quite a lot of source code and debugging the whole
>> set is bit cumbersome.
>>
>> Regards,
>>
>> --
>> Sitaram Rajaraman,
>> Plant Stress Research Group,
>> Dept of Biosciences,
>> University of Helsinki.
>>
>>
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>

--
Sitaram Rajaraman,
Plant Stress Research Group,
Dept of Biosciences,
University of Helsinki.

From carsonhh at gmail.com Mon Mar 11 09:05:30 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 11 Mar 2013 11:05:30 -0400
Subject: [maker-devel] Doubts in the synthesis part of MAKER
In-Reply-To:
Message-ID:

One more detail. There are basically 5 steps per level.

Load --> this step creates the chunks for that level. This is where I decide how many chunks to make, and which special variables need to be generated for packaging into that chunk.
Init --> this is just a declaration of the variables to package into a chunk (only give the chunk what it needs)
Run --> this is the actual code that will be run after the chunk is transported to its destination
Result --> this describes how to merge results of that chunk back into the parent object
Flow --> this decides what to do when all chunks for that level are complete (i.e. which level to move onto next). Default is next level in linear succession, but it can jump forward and backwards several levels if needed.

Thanks,
Carson

On 13-03-11 10:51 AM, "Carson Holt" wrote:

>Hints are basically CDS location, exon location, and intron location.
>The
>CDS hints are based on protein alignment. Intron and exon hints are
>based
>on the EST alignments, which when polished should give exact intron
>coordinates. Ironically the most useless part of the gene model is
>actually the most informative feature for gene prediction (the intron
>coordinates).
>
>lib/Process/MPIchunk.pm will have the steps in the _go method. It is a
>little hard to follow as MAKER is designed for distributed
>parallelization
>(i.e. parallelization without shared memory with steps potentially
>divided
>on different machines on the other end of the network).
>
>It is divided into MPItier and MPIchunk objects. The MPItier object
>encapsulate a series of linear steps or 'levels' while the MPIchunk
>objects encapsulate a single step sent to a machine across the network
>and
>it exists within a single 'level' of the MPITier object. Note there can
>be multiple chunks assigned to a 'level'. MPItiers can also have
>MPITiers
>as children at a given level instead of MPIchunks, so the process
>structure then branches like a tree and can then merges back somewhere in
>the middle of the algorithm.
>
>The 'maker' script is really just the communication script for the
>objects. In MPI one maker thread is launched to handle communication and
>another to run the MPItiers and MPIchunks.
They communication threads >then pass MPIchunks and MPITiers back and forth across the network by >either requesting things to do from other nodes or by asking for help if >they have a large number of MPIChunks or MPItiers to process. > >Thanks, >Carson > > > > > >On 13-03-11 10:33 AM, "Sitaram Rajaraman" >wrote: > >>Hello MAKER developers, >> I'm Sitaram, working as a Bioinformatician at the University of >>Helsinki. We are trying out MAKER as part of a gene >>prediction/annotation >>pipeline and have some doubts regarding this. In the synthesis step in >>the paper, I find it a bit hard to visualise how the hints are generated >>from the various sources and the scores are calculated. It would be nice >>if you could throw some light on this. Also if you could point to the >>particular .Pm file which contains the actual source code, it would be >>convenient as there quite a lot of source code and debugging the whole >>set is bit cumbersome. >> >>Regards, >> >>-- >>Sitaram Rajaraman, >>Plant Stress Research Group, >>Dept of Biosciences, >>University of Helsinki. >> >> >>_______________________________________________ >>maker-devel mailing list >>maker-devel at box290.bluehost.com >>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From isradelacon at gmail.com Mon Mar 11 12:34:27 2013 From: isradelacon at gmail.com (Israel Barrantes) Date: Mon, 11 Mar 2013 19:34:27 +0100 Subject: [maker-devel] different RNA-seq experiment outputs in separate annotation passes? 
Message-ID:

Dear maker-devel,

I have several RNA-seq experiment outputs that I want to use as input for MAKER annotation:

(1) Illumina 1.3, strain A, cell stage N
(2) Illumina 1.8, strain A, cell stage N
(3) Illumina 1.8, strain B, cell stage N
(4) 454, strain unknown, cell stage M

For each experiment I mapped the reads and produced GTFs with tophat/cufflinks separately (and later converted them to GFF3s with the supplied script).

Q1: Does it make a difference to run a separate annotation pass for each GFF3 from tophat/cufflinks?
Q2: If so, will altering the order in which the cDNA GFFs are passed (e.g., experiment 1 GFF in the first pass, then experiment 2 in the second pass, etc.) produce more or fewer transcripts?
Q3: Is it better to simply merge these GFFs into a single nonredundant file (e.g. with bedtools intersect) than to use them separately, one for each MAKER pass?

Thank you in advance,

--
Israel Barrantes
Otto-von-Guericke-Universität
Lehrstuhl für Regulationsbiologie
IBIO/FNW
Deutschland

From dence at genetics.utah.edu Mon Mar 11 12:39:01 2013
From: dence at genetics.utah.edu (Daniel Ence)
Date: Mon, 11 Mar 2013 18:39:01 +0000
Subject: [maker-devel] different RNA-seq experiment outputs in separate annotation passes?
In-Reply-To:
References:
Message-ID:

Hi Israel,

I think that for general annotation purposes, you want to use all of those GFF files during your one MAKER run to annotate the whole genome. If you're interested in exploring which genes are expressed in your different strains and cell stages, then you can use your annotation results and blast against the different RNA-seq experiments.

I didn't answer your questions separately, but hopefully that gives some good guidance. If I missed something, let me know.
Thanks,
Daniel

Daniel Ence
Graduate Student
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
________________________________
From: maker-devel-bounces at yandell-lab.org [maker-devel-bounces at yandell-lab.org] on behalf of Israel Barrantes [isradelacon at gmail.com]
Sent: Monday, March 11, 2013 12:34 PM
To: maker-devel at yandell-lab.org
Subject: [maker-devel] different RNA-seq experiment outputs in separate annotation passes?

Dear maker-devel,

I have several RNA-seq experiment outputs that I want to use as input for MAKER annotation:

(1) Illumina 1.3, strain A, cell stage N
(2) Illumina 1.8, strain A, cell stage N
(3) Illumina 1.8, strain B, cell stage N
(4) 454, strain unknown, cell stage M

For each experiment I mapped the reads and produced GTFs with tophat/cufflinks separately (and later converted to GFF3s with the supplied script)

Q1: Does it make a difference to run a different annotation pass for each GFF3 from tophat/cufflinks?
Q2: If this is the case, altering the order of passing the cDNA GFFs (e.g., first pass, experiment 1 GFF, then exp.2 in second pass, etc) will produce more or less transcripts?
Q3: Is it better to simply merge this GFFs into a single nonredundant file (e.g. bedtools intersect) than using them separately, one for each MAKER pass?

Thank you in advance,

--
Israel Barrantes
Otto-von-Guericke-Universität
Lehrstuhl für Regulationsbiologie
IBIO/FNW
Deutschland

From carsonhh at gmail.com Tue Mar 12 07:37:35 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 12 Mar 2013 09:37:35 -0400
Subject: [maker-devel] duplicate CDS in annotation
In-Reply-To:
Message-ID:

Yes. Try the newer version and see if you still have the issue.
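To check whether a rerun still shows the problem, a short script can look for IDs that are attached to more than one feature type, such as the protein_match and mRNA rows in this thread that both carry ID=pbar_scf7180000350377:hit:2506. This is a minimal Python sketch, not part of MAKER; the function name `ids_shared_across_types` is hypothetical, and the sample rows are trimmed copies of lines from the grep output, rebuilt with tab separators as GFF3 expects.

```python
from collections import defaultdict


def ids_shared_across_types(gff_lines):
    """Map each GFF3 ID used by more than one feature type to those types.

    Hypothetical helper for illustration. Several CDS rows sharing one ID
    are fine in GFF3; an ID reused by, say, a protein_match and an mRNA
    is the collision discussed in this thread.
    """
    types_by_id = defaultdict(set)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a feature row
        for field in cols[8].split(";"):
            key, sep, value = field.partition("=")
            if sep and key == "ID":
                types_by_id[value].add(cols[2])
    return {fid: sorted(t) for fid, t in types_by_id.items() if len(t) > 1}


# Trimmed rows from the thread (attribute columns shortened for brevity).
sample = [
    "\t".join(cols)
    for cols in [
        ("pbar_scf7180000350377", "protein2genome", "protein_match",
         "172004", "172162", "150", "-", ".",
         "ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150;"),
        ("pbar_scf7180000350377", "maker", "mRNA",
         "538308", "558769", ".", "+", ".",
         "ID=pbar_scf7180000350377:hit:2506;"
         "Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;"),
        ("pbar_scf7180000350377", "maker", "CDS",
         "538308", "538334", ".", "+", "0",
         "ID=pbar_scf7180000350377:hit:2506:cds:305;"
         "Parent=pbar_scf7180000350377:hit:2506;"),
    ]
]

for feature_id, types in ids_shared_across_types(sample).items():
    print(feature_id, "is used by:", ", ".join(types))
```

For a real file, the same function can be given an open file handle, for example `ids_shared_across_types(open("Pbar.2.0.gff"))`, since it only iterates over lines.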
Thanks, Carson From: Sasha Mikheyev Date: Tuesday, 12 March, 2013 1:26 AM To: Carson Holt Cc: Barry Moore , Subject: Re: [maker-devel] duplicate CDS in annotation Hi Carson, I have been using version 2.10. Is it worth trying with a newer version? You can find the model file here . It is rather large, as it includes all of the output from the first maker run. Yours, Sasha On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt wrote: > I think the issue is that you are getting a match feature that is being > printed with the same ID as the mRNA feature. Correct? > > What version of MAKER are you using, and what does the gile you are giving to > pred_gff or model_gff look like? Could you send them? > > Thanks, > Carson > > > From: Barry Moore > Date: Monday, 11 March, 2013 7:32 AM > To: Sasha Mikheyev > Cc: > Subject: Re: [maker-devel] duplicate CDS in annotation > > Hi Sasha, > > This gene model appears to be correctly formatted to me. In GFF3 format the > CDS features are allowed to span multiple lines and they share the same ID to > indicate that it is all the same features. See the GFF3 specification on the > Sequence Ontology website > (http://www.sequenceontology.org/resources/gff3.html), and in particular the > description of the ID attribute specifies: > >> ID Indicates the ID of the feature. IDs for each feature must be unique >> within the scope of the GFF file. In the case of discontinuous features >> (i.e. a single feature that exists over multiple genomic locations) the same >> ID may appear on multiple lines. All lines that share an ID collectively >> represent a single feature. > > So each of those CDS lines forms one part of the single CDS feature for this > gene. > > B > > On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote: > >> Dear Yandell lab, >> >> I am re-annotating the harvester and genome using protein and RNA-seq data. >> However, I get many artifacts like the one below. 
>> It seems that there are several CDS records that should tie in to the same
>> mRNA, but they are really hanging out separately, and produce several
>> nucleotide sequences with the same name when extracted from the gff. I would
>> appreciate any guidance about how to fix this!
>>
>> Thank you,
>>
>> Sasha
>>
>> grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff
>> pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150;
>> pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159;
>> pbar_scf7180000350377 maker mRNA 538308 558769 . + . ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01;
>> pbar_scf7180000350377 maker exon 538308 538334 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker exon 538748 538968 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker exon 539842 540242 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker exon 542624 542798 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker exon 555823 556025 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker exon 558609 558769 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker CDS 538308 538334 . + 0 ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker CDS 538748 538968 . + 0 ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker CDS 539842 540242 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker CDS 542624 542798 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker CDS 555823 556025 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker CDS 558609 558769 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506;
>>
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
> Barry Moore
> Research Scientist
> Dept. of Human Genetics
> University of Utah
> Salt Lake City, UT 84112
> --------------------------------------------
> (801) 585-3543
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From alex.marshall at ed.ac.uk Mon Mar 11 10:15:05 2013
From: alex.marshall at ed.ac.uk (Alex Marshall)
Date: Mon, 11 Mar 2013 16:15:05 +0000
Subject: [maker-devel] symbol lookup error: ProcessTable.so: undefined symbol: Perl_Tstack_sp_ptr
Message-ID: <513E0309.7010004@ed.ac.uk>

Hi to the maker-devel,

I am getting an error every time I run the maker script.
symbol lookup error:
/path/to/software/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi/auto/Proc/ProcessTable/ProcessTable.so:
undefined symbol: Perl_Tstack_sp_ptr

Your help would be very much appreciated.

Best wishes,
Alex

----------------
Edinburgh University

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

From mikheyev at gmail.com Mon Mar 11 23:26:53 2013
From: mikheyev at gmail.com (Sasha Mikheyev)
Date: Tue, 12 Mar 2013 14:26:53 +0900
Subject: [maker-devel] duplicate CDS in annotation
In-Reply-To:
References: <52822F85-760B-451B-B156-8861EA77A910@genetics.utah.edu>
Message-ID:

Hi Carson,

I have been using version 2.10. Is it worth trying with a newer version? You
can find the model file here. It is rather large, as it includes all of the
output from the first maker run.

Yours,
Sasha

On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt wrote:
> I think the issue is that you are getting a match feature that is being
> printed with the same ID as the mRNA feature. Correct?
>
> What version of MAKER are you using, and what does the file you are giving
> to pred_gff or model_gff look like? Could you send them?
>
> Thanks,
> Carson
>
>
> From: Barry Moore
> Date: Monday, 11 March, 2013 7:32 AM
> To: Sasha Mikheyev
> Cc:
> Subject: Re: [maker-devel] duplicate CDS in annotation
>
> Hi Sasha,
>
> This gene model appears to be correctly formatted to me. In GFF3 format
> the CDS features are allowed to span multiple lines and they share the same
> ID to indicate that they are all parts of the same feature. See the GFF3
> specification on the Sequence Ontology website
> (http://www.sequenceontology.org/resources/gff3.html), and in particular
> the description of the ID attribute, which specifies:
>
> ID Indicates the ID of the feature. IDs for each feature must be unique
> within the scope of the GFF file. In the case of discontinuous features
> (i.e. a single feature that exists over multiple genomic locations) the
> same ID may appear on multiple lines. All lines that share an ID
> collectively represent a single feature.
>
>
> So each of those CDS lines forms one part of the single CDS feature for
> this gene.
>
> B
>
> On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote:
>
> Dear Yandell lab,
>
> I am re-annotating the harvester ant genome using protein and RNA-seq
> data. However, I get many artifacts like the one below. It seems that there
> are several CDS records that should tie in to the same mRNA, but they are
> really hanging out separately, and produce several nucleotide sequences
> with the same name when extracted from the gff. I would appreciate any
> guidance about how to fix this!
>
> Thank you,
>
> Sasha
>
> grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff
> pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150;
> pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159;
> pbar_scf7180000350377 maker mRNA 538308 558769 . + . ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01;
> pbar_scf7180000350377 maker exon 538308 538334 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker exon 538748 538968 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker exon 539842 540242 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker exon 542624 542798 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker exon 555823 556025 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker exon 558609 558769 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker CDS 538308 538334 . + 0 ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker CDS 538748 538968 . + 0 ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker CDS 539842 540242 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker CDS 542624 542798 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker CDS 555823 556025 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506;
> pbar_scf7180000350377 maker CDS 558609 558769 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506;
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
>
> Barry Moore
> Research Scientist
> Dept. of Human Genetics
> University of Utah
> Salt Lake City, UT 84112
> --------------------------------------------
> (801) 585-3543
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Tue Mar 12 08:27:44 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 12 Mar 2013 10:27:44 -0400
Subject: [maker-devel] symbol lookup error: ProcessTable.so: undefined symbol: Perl_Tstack_sp_ptr
In-Reply-To: <513E0309.7010004@ed.ac.uk>
Message-ID:

Could you try the 2.27 version of MAKER? You are using 2.10, correct?

Thanks,
Carson

On 13-03-11 12:15 PM, "Alex Marshall" wrote:

>Hi to the maker-devel,
>
>I am getting an error every time I run the maker script.
>
>symbol lookup error:
>/path/to/software/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi/auto/Proc/ProcessTable/ProcessTable.so:
>undefined symbol: Perl_Tstack_sp_ptr
>
>Your help would be very much appreciated.
>
>Best wishes,
>Alex
>
>----------------
>Edinburgh University
>
>--
>The University of Edinburgh is a charitable body, registered in
>Scotland, with registration number SC005336.
>
>
>_______________________________________________
>maker-devel mailing list
>maker-devel at box290.bluehost.com
>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From barry.moore at genetics.utah.edu Tue Mar 12 17:57:32 2013
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Tue, 12 Mar 2013 17:57:32 -0600
Subject: [maker-devel] MAKER subversion repositories
Message-ID:

For any of you who are running MAKER straight from our subversion
repositories in the lab - we have migrated those repos to a new server.
Reply to Shawn or me for info on how to connect to the new repos.

Thanks.

Barry

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ares711122 at gmail.com Tue Mar 12 18:24:42 2013
From: ares711122 at gmail.com (Hung-Wei Hsu)
Date: Wed, 13 Mar 2013 08:24:42 +0800
Subject: [maker-devel] ERROR: Could not obtain lock to format database
Message-ID:

Hi MAKER developers,

I tried MAKER 2.27b on one E. coli scaffold sequence with a UniProt protein
database. The analysis failed with the error message below.

Could not obtain lock to format database at maker-2.27b/bin/../lib/GI.pm

Any suggestions or help will be deeply appreciated.

Best regards,
Hung-Wei
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Wed Mar 13 07:24:44 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 13 Mar 2013 09:24:44 -0400
Subject: [maker-devel] symbol lookup error: ProcessTable.so: undefined symbol: Perl_Tstack_sp_ptr
In-Reply-To: <513FE1F0.2030209@ed.ac.uk>
Message-ID:

I'm very glad it's working. Those kinds of errors are the hardest to track
down.

--Carson

On 13-03-12 10:18 PM, "Alex Marshall" wrote:

>I have some great news. I uninstalled every one of my local perl
>libraries. Basically, getting rid of my libraries and then using your
>Build script to install the maker dependencies totally fixed it. It
>worked with the test.fasta file, no errors whatsoever. I am smiling so
>much right now that my face might crack ;) You were right, broken perl.
>I just checked, getting lots of finished in the
>master_datastore_index.log. Thank you so much.
>
>Alex
>
>
>On 12/03/2013 19:11, Carson Holt wrote:
>> I do think your perl has a problem. I've added some changes to each of
>> these modules that should help force perl to generate the correct object
>> method lookup table.
>>
>> Could you test them out (place most under the /lib/Iterator/
>> subdirectory).
>>
>> --Carson
>>
>>
>> On 13-03-12 3:00 PM, "Alex Marshall" wrote:
>>
>>> We had maker working happily for ages.
>>>
>>> Then we upgraded from perl version 5.8.8 to 5.10, which stopped maker
>>> working.
>>>
>>> Maker said it couldn't find forks.pm, so I added that library path to fix
>>> the error.
>>>
>>> Then that particular error below started happening.
>>>
>>> Alex
>>>
>>>
>>> On 12/03/2013 18:54, Alex Marshall wrote:
>>>> version 5.10 on an HPC cluster
>>>>
>>>> Alex
>>>>
>>>>
>>>> On 12/03/2013 18:48, Carson Holt wrote:
>>>>> That means the first time it called fileHandle it didn't die (which
>>>>> should be impossible)
>>>>>
>>>>> Then the second time it called it, it died. It begs the question,
>>>>> what happened to the first call?
>>>>>
>>>>> This is looking more and more like you have a broken perl.
>>>>>
>>>>> What version of perl are you using?
>>>>>
>>>>> --Carson
>>>>>
>>>>>
>>>>> On 13-03-12 2:28 PM, "Alex Marshall" wrote:
>>>>>
>>>>>> I deleted Iterator.pm, I put the new one in the maker/lib folder,
>>>>>> then reran maker
>>>>>>
>>>>>> vi Iterator.pm confirms this:
>>>>>>
>>>>>> sub fileHandle {
>>>>>>     die "this should die";
>>>>>>
>>>>>> error:
>>>>>> STATUS: Parsing control files...
>>>>>> Opening a new filehandle: Iterator:GFF3
>>>>>> Gettign the existing filehandle: Iterator::GFF3
>>>>>> Checking if it still exists: Iterator::GFF3
>>>>>> ERROR: No open filehandle Iterator::GFF3
>>>>>> --> rank=NA, hostname=frontend04
>>>>>>
>>>>>>
>>>>>> On 12/03/2013 18:21, Carson Holt wrote:
>>>>>>> Try this one.
>>>>>>>
>>>>>>> It should fail immediately
>>>>>>>
>>>>>>> Code --> die "this should die";
>>>>>>>
>>>>>>>
>>>>>>> I'm just making sure it's being called as expected.
>>>>>>>
>>>>>>> --Carson
>>>>>>>
>>>>>>>
>>>>>>> On 13-03-12 2:18 PM, "Alex Marshall" wrote:
>>>>>>>
>>>>>>>> I have Iterator.pm and GFF3.pm in the right place:
>>>>>>>>
>>>>>>>> ..../software/maker2/maker-2.27/lib/Iterator.pm
>>>>>>>> ..../software/maker2/maker-2.27/lib/Iterator/GFF3.pm
>>>>>>>>
>>>>>>>>
>>>>>>>> On 12/03/2013 18:16, Alex Marshall wrote:
>>>>>>>>> I have deleted Iterator.pm, and replaced it again (just to be sure).
>>>>>>>>>
>>>>>>>>> STATUS: Parsing control files...
>>>>>>>>> Opening a new filehandle: Iterator:GFF3
>>>>>>>>> Gettign the existing filehandle: Iterator::GFF3
>>>>>>>>> Checking if it still exists: Iterator::GFF3
>>>>>>>>> ERROR: No open filehandle Iterator::GFF3
>>>>>>>>> --> rank=NA, hostname=frontend04
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 12/03/2013 18:11, Carson Holt wrote:
>>>>>>>>>> It's missing all the standard error from the Iterator.pm
>>>>>>>>>> message I added?
>>>>>>>>>> Could you double check that you replaced that one too.
>>>>>>>>>>
>>>>>>>>>> --Carson
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 13-03-12 2:07 PM, "Alex Marshall" wrote:
>>>>>>>>>>
>>>>>>>>>>> STATUS: Parsing control files...
>>>>>>>>>>> Opening a new filehandle: Iterator:GFF3
>>>>>>>>>>> Gettign the existing filehandle: Iterator::GFF3
>>>>>>>>>>> Checking if it still exists: Iterator::GFF3
>>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3
>>>>>>>>>>> --> rank=NA, hostname=frontend04
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 12/03/2013 18:02, Carson Holt wrote:
>>>>>>>>>>>> Please use these two and send me the full STDERR (this also
>>>>>>>>>>>> replaces Iterator/GFF3.pm).
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Carson
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 13-03-12 1:55 PM, "Alex Marshall" wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> same again:
>>>>>>>>>>>>>
>>>>>>>>>>>>> STATUS: Parsing control files...
>>>>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3
>>>>>>>>>>>>> --> rank=NA, hostname=frontend04
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 12/03/2013 17:45, Carson Holt wrote:
>>>>>>>>>>>>>> Try this one. This is a code snippet -->
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> my $fh = new FileHandle();
>>>>>>>>>>>>>> $fh->open("$arg") or die "ERROR: Could not open file: $!\n";
>>>>>>>>>>>>>> $self->{fileHandle} = $fh;
>>>>>>>>>>>>>> $self->startPos($fh->getpos());
>>>>>>>>>>>>>> if (! openhandle($fh)){ #checks to see if file handle is open
>>>>>>>>>>>>>>     confess "ERROR: No open filehandle in Iterator\n";
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> All it does is open the handle, check the reading position and
>>>>>>>>>>>>>> then check to see if the handle is still open.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --Carson
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 13-03-12 1:37 PM, "Alex Marshall" wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1] If I comment out the error in the GFF3.pm file:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> STATUS: Parsing control files...
>>>>>>>>>>>>>>> Can't call method "getpos" without a package or object reference at
>>>>>>>>>>>>>>> /exports/work/biology_ieb_mblaxter/software/maker2/maker-2.27/bin/../lib/Iterator/GFF3.pm
>>>>>>>>>>>>>>> line 42, line 121.
>>>>>>>>>>>>>>> --> rank=NA, hostname=frontend04
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [2] If I add the error comment back to the GFF3.pm, and add the
>>>>>>>>>>>>>>> second new Iterator.pm:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> STATUS: Parsing control files...
>>>>>>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3
>>>>>>>>>>>>>>> --> rank=NA, hostname=frontend04
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Alex
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 12/03/2013 17:31, Carson Holt wrote:
>>>>>>>>>>>>>>>> There is one other thing it does right before. It calls this -->
>>>>>>>>>>>>>>>> $self->fileHandle()->getpos()
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I switched the chaining off so it is just $fh->getpos in the
>>>>>>>>>>>>>>>> attached module (replace again).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I don't see why a failure would happen specifically for you
>>>>>>>>>>>>>>>> there, but try it again.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --Carson
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 13-03-12 1:24 PM, "Carson Holt" wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This is the new line in Iterator.pm
>>>>>>>>>>>>>>>>> --> $fh->open("$arg") or die "ERROR: Could not open file: $!\n";
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The extra info would be from $!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In the place where the error is occurring, all MAKER does is
>>>>>>>>>>>>>>>>> open a file handle in Iterator.pm and then check to see if it is
>>>>>>>>>>>>>>>>> open in Iterator::GFF3 (it does one and then instantly the other).
>>>>>>>>>>>>>>>>> The second failure is just the check on the filehandle.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If the open succeeds, but for some reason it can't tell it is
>>>>>>>>>>>>>>>>> open, then it is something to do with your system. You can try
>>>>>>>>>>>>>>>>> reinstalling Scalar::Util as that is the module that implements
>>>>>>>>>>>>>>>>> the openhandle method that is called.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> You can also try just commenting out line 37 of
>>>>>>>>>>>>>>>>> lib/Iterator/GFF3.pm
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --Carson
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 13-03-12 1:15 PM, "Alex Marshall" wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I am looking at Iterator.pm
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> so it should have thrown more error information?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Alex
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 12/03/2013 17:14, Alex Marshall wrote:
>>>>>>>>>>>>>>>>>>> replaced Iterator.pm in maker2/maker-2.27/lib
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> error: same as before
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> STATUS: Parsing control files...
>>>>>>>>>>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3
>>>>>>>>>>>>>>>>>>> --> rank=NA, hostname=frontend04
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ...../software/maker2/maker-2.27/lib/Iterator/GFF3.pm
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> in sub new
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> my $fh = $self->fileHandle();
>>>>>>>>>>>>>>>>>>> if (! openhandle($fh)){ #checks to see if file handle is open
>>>>>>>>>>>>>>>>>>>     die "ERROR: No open filehandle Iterator::GFF3\n";
>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Alex
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 12/03/2013 17:06, Carson Holt wrote:
>>>>>>>>>>>>>>>>>>>> I get no errors, and don't see any issues.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Could you replace the Iterator.pm in the lib directory with
>>>>>>>>>>>>>>>>>>>> this one?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I added some more output to the STDERR if opening a filehandle
>>>>>>>>>>>>>>>>>>>> fails. At least it should provide more information. Could you
>>>>>>>>>>>>>>>>>>>> then let me know what it says?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> Carson
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 13-03-12 12:35 PM, "Alex Marshall" wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> please find: maker_opts.ctl and test.fa attached
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Alex
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 16:31, Alex Marshall wrote:
>>>>>>>>>>>>>>>>>>>>>> will send to you now...
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 16:29, Carson Holt wrote:
>>>>>>>>>>>>>>>>>>>>>>> Could you send me the entire captured STDERR, your
>>>>>>>>>>>>>>>>>>>>>>> maker_opts.ctl file and your test.fasta?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>> Carson
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On 13-03-12 12:23 PM, "Alex Marshall" wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> It is in fasta format not GFF format
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 16:16, Alex Marshall wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> I have been looking through maker_opts.ctl
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> #-----Genome (Required for De-Novo Annotation)
>>>>>>>>>>>>>>>>>>>>>>>>> genome=test.fna #genome sequence (fasta format or fasta embeded in GFF3)
>>>>>>>>>>>>>>>>>>>>>>>>> organism_type=eukaryotic #eukaryotic or prokaryotic. Default is eukaryotic
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I added the path to the genome, same error.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Alex
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 16:11, Carson Holt wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> What does your maker_opts.ctl file look like? What is the
>>>>>>>>>>>>>>>>>>>>>>>>>> value for genome? If you did not give a genome fasta file and
>>>>>>>>>>>>>>>>>>>>>>>>>> are using a gff3 as input, is there a FASTA file embedded in it?
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> --Carson
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On 13-03-12 12:06 PM, "Alex Marshall" wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> [1] hard drive - enough space
>>>>>>>>>>>>>>>>>>>>>>>>>>> [2] ./Build realclean - done
>>>>>>>>>>>>>>>>>>>>>>>>>>> [3] delete the maker_path/perl directory and maker_path/bin - done
>>>>>>>>>>>>>>>>>>>>>>>>>>> [4] LD_PRELOAD=/.....path...../openmpi/lib/libmpi.so - done
>>>>>>>>>>>>>>>>>>>>>>>>>>> [5] perl Build.PL - done
>>>>>>>>>>>>>>>>>>>>>>>>>>> [6] installation of 2.27 worked
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> and back to original error:
>>>>>>>>>>>>>>>>>>>>>>>>>>> STATUS: Parsing control files...
>>>>>>>>>>>>>>>>>>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3
>>>>>>>>>>>>>>>>>>>>>>>>>>> --> rank=NA, hostname=frontend04
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 15:26, Carson Holt wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> So the odd unrelated errors you are getting suggest there is
>>>>>>>>>>>>>>>>>>>>>>>>>>>> something else going on that needs to be resolved first.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Check your drive space 'df -h maker_path'. Make sure you don't
>>>>>>>>>>>>>>>>>>>>>>>>>>>> just have a full hard drive.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Run './Build realclean', and delete the maker_path/perl
>>>>>>>>>>>>>>>>>>>>>>>>>>>> directory and maker_path/bin subdirectory completely.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Make sure to execute the export
>>>>>>>>>>>>>>>>>>>>>>>>>>>> LD_PRELOAD=/.....path...../openmpi/lib/libmpi.so command
>>>>>>>>>>>>>>>>>>>>>>>>>>>> before ever running 'perl Build.PL'
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Which version of OpenMPI are you using?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Carson
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 13-03-12 11:21 AM, "Alex Marshall" wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I am using openmpi and yes I ran the ./Build install step.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Configuring MAKER with MPI support
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Can't exec "/bin/sh": Argument list too long at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /....path...../lib/perl5/Inline/C.pm line 801.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> A problem was encountered while attempting to compile and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> install your Inline C code.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The command that failed was:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /usr/bin/perl Makefile.PL > out.Makefile_PL 2>&1
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The build directory was:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /....path...../maker2/maker-2.27/src/blib/build/Parallel/Application/MPI
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> To debug the problem, cd to the build directory, and inspect
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the output files.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /....path...../maker2/maker-2.27/src/lib/Parallel/Application/MPI.pm
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> line 223.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 15:14, Carson Holt wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Also the place it is trying to load from
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /....path...../maker2/maker-2.27/src/blib/lib/auto/Parallel/Application/MPI/MPI.so
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> That is not the final install location? Did you run the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> './Build install' step?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> When that runs everything related to MPI will be here -->
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /....path...../maker2/maker-2.27/perl/lib/auto/Parallel/Application/MPI/MPI.so
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --Carson
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 13-03-12 11:11 AM, "Alex Marshall" wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> now getting mpi problems:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Can't load
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> '/....path...../maker2/maker-2.27/src/blib/lib/auto/Parallel/Application/MPI/MPI.so'
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for module Parallel::Application::MPI: libmpich.so.1.0: cannot
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> open shared object file: No such file or directory at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /usr/lib64/perl5/DynaLoader.pm line 200.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> at /....path...../lib/perl5/Inline.pm line 536.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /....path...../maker2/maker-2.27/src/lib/Parallel/Application/MPI.pm
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> line 223
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> you suggest: export
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> LD_PRELOAD=/.....path...../openmpi/lib/libmpi.so
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I do that, and run again, same error.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 14:43, Alex Marshall wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ok I will upgrade to 2.27 now.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 14:42, Carson Holt wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The original error is caused by an issue in Proc::ProcessTable on some systems. I no longer use that module in maker for that reason. After the first error, you may have to delete the mpi_blastdb and any files with the extension .db in the maker.output directory before retrying. I would recommend using 2.27.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Carson
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 13-03-12 10:40 AM, "Alex Marshall" wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I managed to fix that error.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I am using version 2.25-beta.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> new error:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 14:27, Carson Holt wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Could you try the 2.27 version of MAKER? You are using 2.10 correct?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Carson
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 13-03-11 12:15 PM, "Alex Marshall" wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi to the maker-devel,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I am getting an error everytime I run the maker script.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> symbol lookup error: /path/to/software/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi/auto/Proc/ProcessTable/ProcessTable.so: undefined symbol: Perl_Tstack_sp_ptr
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Your help would be very appreciated.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best wishes,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ----------------
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Edinburgh University
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> maker-devel mailing list
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> maker-devel at box290.bluehost.com
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -----------------------------
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex Marshall,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Room 3.54, Blaxter Lab,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Ashworth Laboratories,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Institute of Evolutionary Biology,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The King's Buildings,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The University of Edinburgh,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Edinburgh, EH9 3JT
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -----------------------------
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> alex.marshall at ed.ac.uk
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> +44(0)131 650 7403
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -----------------------------
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The University of Edinburgh is a charitable body,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> registered in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Scotland, with registration number SC005336.
From mikheyev at gmail.com Wed Mar 13 01:23:25 2013
From: mikheyev at gmail.com (Sasha Mikheyev)
Date: Wed, 13 Mar 2013 16:23:25 +0900
Subject: [maker-devel] duplicate CDS in annotation
In-Reply-To:
References:
Message-ID:

Dear Carson,

The new version does indeed fix the problem! However, I noticed that some of the CDS annotations were swallowed. This seems to affect ~600 genes, e.g.:

input:
pbar_scf7180000349951 maker mRNA 98033 98530 . - . ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1;_AED=1.00;_QI=0|0|0|0|0|0|2|0|81;
pbar_scf7180000349951 maker exon 98393 98530 . - . ID=PB12301-RA:exon:10283;Parent=PB12301-RA;
pbar_scf7180000349951 maker exon 98033 98140 . - . ID=PB12301-RA:exon:10284;Parent=PB12301-RA;
pbar_scf7180000349951 maker CDS 98033 98140 . - 0 ID=PB12301-RA:cds:10114;Parent=PB12301-RA;
pbar_scf7180000349951 maker CDS 98393 98530 . - 0 ID=PB12301-RA:cds:10113;Parent=PB12301-RA;

output:
pbar_scf7180000349951 maker mRNA 98033 98530 . - . ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;_AED=0.38;_eAED=0.38;_QI=0|0|0.33|1|0.5|1|3|246|165;Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1,PB12301-RA
pbar_scf7180000349951 maker exon 98033 98530 . - . ID=PB12301-RA:exon:134;Parent=PB12301-RA
pbar_scf7180000349951 maker exon 98033 98140 . - . ID=PB12301-RA:exon:133;Parent=PB12301-RA
pbar_scf7180000349951 maker exon 98393 98530 . - . ID=PB12301-RA:exon:132;Parent=PB12301-RA
pbar_scf7180000349951 maker three_prime_UTR 98393 98530 . - . ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA
pbar_scf7180000349951 maker three_prime_UTR 98033 98140 . - . ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA
pbar_scf7180000349951 maker CDS 98033 98530 . - 0 ID=PB12301-RA:cds;Parent=PB12301-RA

Thank you,

Sasha

On Tue, Mar 12, 2013 at 10:37 PM, Carson Holt wrote:

> Yes. Try the newer version and see if you still have the issue.
>
> Thanks,
> Carson
>
>
> From: Sasha Mikheyev
> Date: Tuesday, 12 March, 2013 1:26 AM
> To: Carson Holt
> Cc: Barry Moore, <maker-devel at yandell-lab.org>
> Subject: Re: [maker-devel] duplicate CDS in annotation
>
> Hi Carson,
>
> I have been using version 2.10. Is it worth trying with a newer version?
>
> You can find the model file here.
> It is rather large, as it includes all of the output from the first maker
> run.
>
> Yours,
>
> Sasha
>
>
> On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt wrote:
>
>> I think the issue is that you are getting a match feature that is being
>> printed with the same ID as the mRNA feature. Correct?
>>
>> What version of MAKER are you using, and what does the file you are
>> giving to pred_gff or model_gff look like? Could you send them?
>>
>> Thanks,
>> Carson
>>
>>
>> From: Barry Moore
>> Date: Monday, 11 March, 2013 7:32 AM
>> To: Sasha Mikheyev
>> Cc:
>> Subject: Re: [maker-devel] duplicate CDS in annotation
>>
>> Hi Sasha,
>>
>> This gene model appears to be correctly formatted to me. In GFF3 format
>> the CDS features are allowed to span multiple lines, and they share the same
>> ID to indicate that they are all parts of the same feature. See the GFF3
>> specification on the Sequence Ontology website
>> (http://www.sequenceontology.org/resources/gff3.html); in particular,
>> the description of the ID attribute specifies:
>>
>> ID Indicates the ID of the feature. IDs for each feature must be unique
>> within the scope of the GFF file. In the case of discontinuous features
>> (i.e. a single feature that exists over multiple genomic locations) the
>> same ID may appear on multiple lines. All lines that share an ID
>> collectively represent a single feature.
>>
>>
>> So each of those CDS lines forms one part of the single CDS feature for
>> this gene.
>>
>> B
>>
>> On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote:
>>
>> Dear Yandell lab,
>>
>> I am re-annotating the harvester ant genome using protein and RNA-seq
>> data. However, I get many artifacts like the one below. It seems that there
>> are several CDS records that should tie in to the same mRNA, but they are
>> really hanging out separately, and produce several nucleotide sequences
>> with the same name when extracted from the gff. I would appreciate any
>> guidance about how to fix this!
>>
>> Thank you,
>>
>> Sasha
>>
>> grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff
>> pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150;
>> pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159;
>> pbar_scf7180000350377 maker mRNA 538308 558769 . + . ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01;
>> pbar_scf7180000350377 maker exon 538308 538334 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker exon 538748 538968 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker exon 539842 540242 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker exon 542624 542798 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker exon 555823 556025 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker exon 558609 558769 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker CDS 538308 538334 . + 0 ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker CDS 538748 538968 . + 0 ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker CDS 539842 540242 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker CDS 542624 542798 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker CDS 555823 556025 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker CDS 558609 558769 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506;
>>
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>
>>
>> Barry Moore
>> Research Scientist
>> Dept. of Human Genetics
>> University of Utah
>> Salt Lake City, UT 84112
>> --------------------------------------------
>> (801) 585-3543
>>
>>
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From Hossein.Borhan at AGR.GC.CA Wed Mar 13 15:49:44 2013
From: Hossein.Borhan at AGR.GC.CA (Borhan, Hossein)
Date: Wed, 13 Mar 2013 17:49:44 -0400
Subject: [maker-devel] do of the maker predicted proteins do not start with M
Message-ID: <7B64340A44B6634C814A22BCFA6179D5020887CC@onottaxms5.AGR.GC.CA>

Hi

I have run maker and some of the proteins predicted by maker do not start with a methionine.
I am not sure why. Here are some examples:

>maker-scaffold00001-snap-gene-0.8-mRNA-1 protein AED:0.27496328928047 eAED:0.27496328928047 QI:0|0.4|0.16|0.5|1|1|6|0|453
VIIKFKTFAKASRSVELFGHEGAWARGDGYCNFKTESEKADRSVKSSCSLNIPFTYDVGR
RQYVIKGDRFCLSHNHLVMIPSPTTVIVNDQRDLTPDQLSYIINLGKYSLPFPMVTRMLS
DQFPDCRIQKPLLHRLLRKGKLQAFGGDRDAMNALINLGRSYEEHGGFFEIDIDVDCRLE
KIWLARAEGLQFASVYNDVVQIDGGAKMNAYGFVFLPVTVIDCLGKSYVVGAMAGPSAEN
KADVVKTLEYFRVKRSESVLIADDALAFRAAAVECDMVYHQCTKHYQAKIARACAGLGHE
GKEFMIKANTLVYHIFPSEDAFFAKADEYRLMFLQYGGAVKLFDDIVDKRQQLCRTFTSC
KFTGSHSSNQRAEGTISRTKRDVQPWLSRANLFEMFTHLEMIQKQQEDEAARLLSNLIRK
GKHWSDYVDSIFRERQLNSRLLSSVREVDTGLH
>snap-scaffold00087-abinit-gene-2.145-mRNA-1 protein AED:0.0539495114006514 eAED:0.0539495114006514 QI:2|1|0.6|1|1|1|5|0|817
ALSLHGTRQAFARVPPPCRRAHPAERRQPGGGMSADAPVKAGYLLKLTSSLSHWNRRYFI
VADTKLFYCKTEDDLLRRKFQGEIDLAGAQIALYTRNDETAKRFSDHHHMLGVKPAGCDR
IYILDADSEHAQKEWVACLRRHASQAPVSSPVDAAVAAAPRKDPQSVREGFLTKRGETIK
NWKMRYFVLKGNYLHYYRSIEDAQPAGSILLLGTRTTAEPKAVTGMPHSFSIARADAKRK
YMIHADSKEECDAWVGAIQQQSVFVRHAGTDSAPPEVVAPAAATPVHQQHQSRSSFGNRP
NVADDSADDDEAALDEVALSNGPPALAPHGIANTGASTGLNLKQKVSKKKRRFVTDEFDL
DLTYITENIIAMGFPAESMEAMFRNSMSDVQRFLDGRHPDAYRVYNLCSERDYDPAKFHH
NVCRFPFDDHNCPNFEDLIPLCEDIHNWLSIQSDHVVAIHCKAGKGRTGLVICAYLLYSG
AWRTARDALQFYGFVRTQDQKGVTIPSQIRYVEYFEQYMADPEILSRNNGPLVISEIFVG
RGCRPFDTVTITNMGRRMNSKDWGKYWKDALDDGLLLQLPKGACQVDKDFKVEFLASGLL
GKKTRVAGFWLHTAFIQDGVVDIDKSMIDKVNKEKDCPAFSIQVFFGGRTYVDRRCRIPV
APPQPTGPLLLSPATVRIRNADPLPVPNPSSPSESPSFSAMSSVPSLSLESLSSVSSSLS
PTTGPKAAPSPKKQDPGLDPGSPPGTVKSAPVAEAGAAPVDARSDNKAPARSCSLPHGRF
PGDVAGDGAFQQAAVQVRIAFWNNLQSEALQRRNSRL
>augustus-scaffold00087-abinit-gene-0.106-mRNA-1 protein AED:0.10935424621144 eAED:0.10935424621144 QI:1|1|0.66|1|1|1|3|0|483
STSTVFCPGLWNRVGRYVSGGLQTSPVTVPRQVRPICLATQPATDQPGYVMSSTSALVGI
GVVTVALLCRWAPILVTTSGPGSPARSSADVMRIWADHDWSAGTTSLPISQESLLAKRVL
SKSFDGLPPNLHVQDDTVPVALLRAHLNAGRHMRLRDMCPTAGACDLQGADPDHGLAPLH
MAAMRDDRSSIAYLMALGADPDAMDRAGRQYRNLSFTNFVRNARRAAEERGSTCQLPEVN
LAGLERADLDRSWAEIRRLAHEGEPVAIRGLLGAYDRSDVLDWDLDAFLTRHGHVPVNVG
DVPYAQYFGLPIQSMPLSKYVASLAPGSASYVFAKDDGICRDALQILDRFARDALPPYFV
SPAALGSDAVHFYLGNKGSGAPFHLHSDAVNLLAHGSKTWFVTPPPQSVYSRTPIGEFAA
NGTSGIESLRCEQNPGDAIYIPFDWGHAVLNNEDSTFGFAVELLNKRDSLHFLRPSSQVP
AGQ

Regards

Hossein
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From carsonhh at gmail.com Wed Mar 13 15:40:39 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 13 Mar 2013 17:40:39 -0400
Subject: [maker-devel] ERROR: Could not obtain lock to format database
In-Reply-To:
Message-ID:

Could you check to make sure your hard drive is not full, and that whatever location you set as TMP= in the control files is not full (default is /tmp)? Also make sure you do not set /tmp to an NFS mounted or a tmpfs location. Could you also send the full captured STDERR?

Thanks,
Carson

From: Hung-Wei Hsu
Date: Tuesday, 12 March, 2013 8:24 PM
To:
Subject: [maker-devel] ERROR: Could not obtain lock to format database

Hi MAKER developers,

I tried MAKER 2.27b on one E. coli scaffold sequence with uniprot protein database. I failed to run the analysis and got an error message as below.

Could not obtain lock to format database at maker-2.27b/bin/../lib/GI.pm

Any suggestions or help will be deeply appreciated.

Best regards,
Hung-Wei
_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From carsonhh at gmail.com Wed Mar 13 15:47:06 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 13 Mar 2013 17:47:06 -0400
Subject: [maker-devel] duplicate CDS in annotation
In-Reply-To:
Message-ID:

The output shows that the original model was Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1 and the new model replacing it is Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1.
So it is really a completely different model (one is derived from SNAP and the other from GeneMark). I'm guessing you have map_forward=1 set and are using the GFF3 passthrough options, correct?

Thanks,
Carson

From: Sasha Mikheyev
Date: Wednesday, 13 March, 2013 3:23 AM
To: Carson Holt
Cc: Barry Moore,
Subject: Re: [maker-devel] duplicate CDS in annotation

Dear Carson,

The new version does indeed fix the problem! However, I noticed that some of the CDS annotations were swallowed. This seems to affect ~600 genes, e.g.:

input:
pbar_scf7180000349951 maker mRNA 98033 98530 . - . ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1;_AED=1.00;_QI=0|0|0|0|0|0|2|0|81;
pbar_scf7180000349951 maker exon 98393 98530 . - . ID=PB12301-RA:exon:10283;Parent=PB12301-RA;
pbar_scf7180000349951 maker exon 98033 98140 . - . ID=PB12301-RA:exon:10284;Parent=PB12301-RA;
pbar_scf7180000349951 maker CDS 98033 98140 . - 0 ID=PB12301-RA:cds:10114;Parent=PB12301-RA;
pbar_scf7180000349951 maker CDS 98393 98530 . - 0 ID=PB12301-RA:cds:10113;Parent=PB12301-RA;

output:
pbar_scf7180000349951 maker mRNA 98033 98530 . - . ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;_AED=0.38;_eAED=0.38;_QI=0|0|0.33|1|0.5|1|3|246|165;Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1,PB12301-RA
pbar_scf7180000349951 maker exon 98033 98530 . - . ID=PB12301-RA:exon:134;Parent=PB12301-RA
pbar_scf7180000349951 maker exon 98033 98140 . - . ID=PB12301-RA:exon:133;Parent=PB12301-RA
pbar_scf7180000349951 maker exon 98393 98530 . - . ID=PB12301-RA:exon:132;Parent=PB12301-RA
pbar_scf7180000349951 maker three_prime_UTR 98393 98530 . - . ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA
pbar_scf7180000349951 maker three_prime_UTR 98033 98140 . - . ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA
pbar_scf7180000349951 maker CDS 98033 98530 . - 0 ID=PB12301-RA:cds;Parent=PB12301-RA

Thank you,

Sasha

On Tue, Mar 12, 2013 at 10:37 PM, Carson Holt wrote:
> Yes. Try the newer version and see if you still have the issue.
>
> Thanks,
> Carson
>
>
> From: Sasha Mikheyev
> Date: Tuesday, 12 March, 2013 1:26 AM
> To: Carson Holt
> Cc: Barry Moore,
> Subject: Re: [maker-devel] duplicate CDS in annotation
>
> Hi Carson,
>
> I have been using version 2.10. Is it worth trying with a newer version?
>
> You can find the model file here. It is rather large, as it includes all of
> the output from the first maker run.
>
> Yours,
>
> Sasha
>
>
> On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt wrote:
>> I think the issue is that you are getting a match feature that is being
>> printed with the same ID as the mRNA feature. Correct?
>>
>> What version of MAKER are you using, and what does the file you are giving to
>> pred_gff or model_gff look like? Could you send them?
>>
>> Thanks,
>> Carson
>>
>>
>> From: Barry Moore
>> Date: Monday, 11 March, 2013 7:32 AM
>> To: Sasha Mikheyev
>> Cc:
>> Subject: Re: [maker-devel] duplicate CDS in annotation
>>
>> Hi Sasha,
>>
>> This gene model appears to be correctly formatted to me. In GFF3 format the
>> CDS features are allowed to span multiple lines, and they share the same ID to
>> indicate that they are all parts of the same feature. See the GFF3
>> specification on the Sequence Ontology website
>> (http://www.sequenceontology.org/resources/gff3.html); in particular, the
>> description of the ID attribute specifies:
>>
>>> ID Indicates the ID of the feature. IDs for each feature must be unique
>>> within the scope of the GFF file. In the case of discontinuous features
>>> (i.e. a single feature that exists over multiple genomic locations) the same
>>> ID may appear on multiple lines. All lines that share an ID collectively
>>> represent a single feature.
>>
>> So each of those CDS lines forms one part of the single CDS feature for this
>> gene.
>> >> B >> >> On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote: >> >>> Dear Yandell lab, >>> >>> I am re-annotating the harvester and genome using protein and RNA-seq data. >>> However, I get many artifacts like the one below. It seems that there are >>> several CDS records that should tie in to the same mRNA, but they are really >>> hanging out separately, and produce several nucleotide sequences with the >>> same name when extracted from the gff. I would appreciate any guidance about >>> how to fix this! >>> >>> Thank you, >>> >>> Sasha >>> >>> grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff >>> pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . >>> ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; >>> pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . >>> ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name >>> =Hsal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159; >>> pbar_scf7180000350377 maker mRNA 538308 558769 . + . >>> ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf71800003503 >>> 77-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5 >>> .29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; >>> pbar_scf7180000350377 maker exon 538308 538334 0.01 + . >>> ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit: >>> 2506; >>> pbar_scf7180000350377 maker exon 538748 538968 0.01 + . >>> ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit: >>> 2506; >>> pbar_scf7180000350377 maker exon 539842 540242 0.01 + . >>> ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit: >>> 2506; >>> pbar_scf7180000350377 maker exon 542624 542798 0.01 + . >>> ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit: >>> 2506; >>> pbar_scf7180000350377 maker exon 555823 556025 0.01 + . 
>>> ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit: >>> 2506; >>> pbar_scf7180000350377 maker exon 558609 558769 0.01 + . >>> ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit: >>> 2506; >>> pbar_scf7180000350377 maker CDS 538308 538334 . + 0 >>> ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2 >>> 506; >>> pbar_scf7180000350377 maker CDS 538748 538968 . + 0 >>> ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2 >>> 506; >>> pbar_scf7180000350377 maker CDS 539842 540242 . + 1 >>> ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2 >>> 506; >>> pbar_scf7180000350377 maker CDS 542624 542798 . + 2 >>> ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2 >>> 506; >>> pbar_scf7180000350377 maker CDS 555823 556025 . + 1 >>> ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2 >>> 506; >>> pbar_scf7180000350377 maker CDS 558609 558769 . + 2 >>> ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2 >>> 506; >>> >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >> Barry Moore >> Research Scientist >> Dept. of Human Genetics >> University of Utah >> Salt Lake City, UT 84112 >> -------------------------------------------- >> (801) 585-3543 >> >> >> >> >> _______________________________________________ maker-devel mailing list >> maker-devel at box290.bluehost.comhttp://box290.bluehost.com/mailman/listinfo/ma >> ker-devel_yandell-lab.org > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From carsonhh at gmail.com Wed Mar 13 18:26:25 2013 From: carsonhh at gmail.com (Carson Holt) Date: Wed, 13 Mar 2013 20:26:25 -0400 Subject: [maker-devel] do of the maker predicted proteins do not start with M In-Reply-To: <7B64340A44B6634C814A22BCFA6179D5020887CC@onottaxms5.AGR.GC.CA> Message-ID: SNAP and other gene prediction programs are capable of producing partial models if they can't find reasonable start and stop codons. You can set always_complete=1 in the maker_opts.ctl file to get MAKER to walk forward and backwards to search for starts and stops after the ab initio predictors do their work in an attempt to force model completion. Thanks, Carson From: "Borhan, Hossein" Date: Wednesday, 13 March, 2013 5:49 PM To: Subject: [maker-devel] do of the maker predicted proteins do not start with M Hi I have run maker and some of the protein predicted by maker does not start with a Methionine. I am not sure why Here are some examples >maker-scaffold00001-snap-gene-0.8-mRNA-1 protein AED:0.27496328928047 eAED:0.27496328928047 QI:0|0.4|0.16|0.5|1|1|6|0|453 VIIKFKTFAKASRSVELFGHEGAWARGDGYCNFKTESEKADRSVKSSCSLNIPFTYDVGR RQYVIKGDRFCLSHNHLVMIPSPTTVIVNDQRDLTPDQLSYIINLGKYSLPFPMVTRMLS DQFPDCRIQKPLLHRLLRKGKLQAFGGDRDAMNALINLGRSYEEHGGFFEIDIDVDCRLE KIWLARAEGLQFASVYNDVVQIDGGAKMNAYGFVFLPVTVIDCLGKSYVVGAMAGPSAEN KADVVKTLEYFRVKRSESVLIADDALAFRAAAVECDMVYHQCTKHYQAKIARACAGLGHE GKEFMIKANTLVYHIFPSEDAFFAKADEYRLMFLQYGGAVKLFDDIVDKRQQLCRTFTSC KFTGSHSSNQRAEGTISRTKRDVQPWLSRANLFEMFTHLEMIQKQQEDEAARLLSNLIRK GKHWSDYVDSIFRERQLNSRLLSSVREVDTGLH >snap-scaffold00087-abinit-gene-2.145-mRNA-1 protein AED:0.0539495114006514 eAED:0.0539495114006514 QI:2|1|0.6|1|1|1|5|0|817 ALSLHGTRQAFARVPPPCRRAHPAERRQPGGGMSADAPVKAGYLLKLTSSLSHWNRRYFI VADTKLFYCKTEDDLLRRKFQGEIDLAGAQIALYTRNDETAKRFSDHHHMLGVKPAGCDR IYILDADSEHAQKEWVACLRRHASQAPVSSPVDAAVAAAPRKDPQSVREGFLTKRGETIK NWKMRYFVLKGNYLHYYRSIEDAQPAGSILLLGTRTTAEPKAVTGMPHSFSIARADAKRK YMIHADSKEECDAWVGAIQQQSVFVRHAGTDSAPPEVVAPAAATPVHQQHQSRSSFGNRP 
NVADDSADDDEAALDEVALSNGPPALAPHGIANTGASTGLNLKQKVSKKKRRFVTDEFDL DLTYITENIIAMGFPAESMEAMFRNSMSDVQRFLDGRHPDAYRVYNLCSERDYDPAKFHH NVCRFPFDDHNCPNFEDLIPLCEDIHNWLSIQSDHVVAIHCKAGKGRTGLVICAYLLYSG AWRTARDALQFYGFVRTQDQKGVTIPSQIRYVEYFEQYMADPEILSRNNGPLVISEIFVG RGCRPFDTVTITNMGRRMNSKDWGKYWKDALDDGLLLQLPKGACQVDKDFKVEFLASGLL GKKTRVAGFWLHTAFIQDGVVDIDKSMIDKVNKEKDCPAFSIQVFFGGRTYVDRRCRIPV APPQPTGPLLLSPATVRIRNADPLPVPNPSSPSESPSFSAMSSVPSLSLESLSSVSSSLS PTTGPKAAPSPKKQDPGLDPGSPPGTVKSAPVAEAGAAPVDARSDNKAPARSCSLPHGRF PGDVAGDGAFQQAAVQVRIAFWNNLQSEALQRRNSRL >augustus-scaffold00087-abinit-gene-0.106-mRNA-1 protein AED:0.10935424621144 eAED:0.10935424621144 QI:1|1|0.66|1|1|1|3|0|483 STSTVFCPGLWNRVGRYVSGGLQTSPVTVPRQVRPICLATQPATDQPGYVMSSTSALVGI GVVTVALLCRWAPILVTTSGPGSPARSSADVMRIWADHDWSAGTTSLPISQESLLAKRVL SKSFDGLPPNLHVQDDTVPVALLRAHLNAGRHMRLRDMCPTAGACDLQGADPDHGLAPLH MAAMRDDRSSIAYLMALGADPDAMDRAGRQYRNLSFTNFVRNARRAAEERGSTCQLPEVN LAGLERADLDRSWAEIRRLAHEGEPVAIRGLLGAYDRSDVLDWDLDAFLTRHGHVPVNVG DVPYAQYFGLPIQSMPLSKYVASLAPGSASYVFAKDDGICRDALQILDRFARDALPPYFV SPAALGSDAVHFYLGNKGSGAPFHLHSDAVNLLAHGSKTWFVTPPPQSVYSRTPIGEFAA NGTSGIESLRCEQNPGDAIYIPFDWGHAVLNNEDSTFGFAVELLNKRDSLHFLRPSSQVP AGQ Regards Hossein _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Wed Mar 13 20:54:55 2013 From: carsonhh at gmail.com (Carson Holt) Date: Wed, 13 Mar 2013 22:54:55 -0400 Subject: [maker-devel] duplicate CDS in annotation In-Reply-To: Message-ID: Yes. map_forward=1 allows new models to keep the names of the models they replace. It makes it so you don't have to relocate genes every time a model gets a slight modification during reannotation. --Carson From: Sasha Mikheyev Date: Wednesday, 13 March, 2013 9:17 PM To: Carson Holt Cc: Barry Moore , Subject: Re: [maker-devel] duplicate CDS in annotation OK. Got it! 
I did pass through the gene model names. I guess I now see that a new gene model may become associated with the old name in the re-annotation. Sasha On Thu, Mar 14, 2013 at 6:47 AM, Carson Holt wrote: > The output shows that the original model was > Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1 and the new model > replacing it is Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1. > > So it is really a completely different model (as one derived from SNAP and one > from GeneMark). I'm guessing you have map_forward=1 set and are using the > GFF3 passthrough options correct? > > Thanks, > Carson > > > > From: Sasha Mikheyev > Date: Wednesday, 13 March, 2013 3:23 AM > > To: Carson Holt > Cc: Barry Moore , > > Subject: Re: [maker-devel] duplicate CDS in annotation > > Dear Carson, > > The new version does indeed fix the problem! > > However, I noticed that some of the CDS annotations were swallowed. This seems > to affect a ~600 genes. > > e.g. input: > > pbar_scf7180000349951 maker mRNA 98033 98530 . - . > ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;Alias=maker-pbar_scf7180000349951 > -snap-gene-1.17-mRNA-1;_AED=1.00;_QI=0|0|0|0|0|0|2|0|81; > pbar_scf7180000349951 maker exon 98393 98530 . - . > ID=PB12301-RA:exon:10283;Parent=PB12301-RA; > pbar_scf7180000349951 maker exon 98033 98140 . - . > ID=PB12301-RA:exon:10284;Parent=PB12301-RA; > pbar_scf7180000349951 maker CDS 98033 98140 . - 0 > ID=PB12301-RA:cds:10114;Parent=PB12301-RA; > pbar_scf7180000349951 maker CDS 98393 98530 . - 0 > ID=PB12301-RA:cds:10113;Parent=PB12301-RA; > > output: > > pbar_scf7180000349951 maker mRNA 98033 98530 . - . > ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;_AED=0.38;_eAED=0.38;_QI=0|0|0.33 > |1|0.5|1|3|246|165;Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA- > 1,PB12301-RA > pbar_scf7180000349951 maker exon 98033 98530 . - . > ID=PB12301-RA:exon:134;Parent=PB12301-RA > pbar_scf7180000349951 maker exon 98033 98140 . - . 
> ID=PB12301-RA:exon:133;Parent=PB12301-RA > pbar_scf7180000349951 maker exon 98393 98530 . - . > ID=PB12301-RA:exon:132;Parent=PB12301-RA > pbar_scf7180000349951 maker three_prime_UTR 98393 98530 . - . > ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA > pbar_scf7180000349951 maker three_prime_UTR 98033 98140 . - . > ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA > pbar_scf7180000349951 maker CDS 98033 98530 . - 0 > ID=PB12301-RA:cds;Parent=PB12301-RA > > Thank you, > > Sasha > > On Tue, Mar 12, 2013 at 10:37 PM, Carson Holt wrote: >> Yes. Try the newer version and see if you still have the issue. >> >> Thanks, >> Carson >> >> >> From: Sasha Mikheyev >> Date: Tuesday, 12 March, 2013 1:26 AM >> To: Carson Holt >> Cc: Barry Moore , >> >> >> Subject: Re: [maker-devel] duplicate CDS in annotation >> >> Hi Carson, >> >> I have been using version 2.10. Is it worth trying with a newer version? >> >> You can find the model file here >> . It is rather large, as it >> includes all of the output from the first maker run. >> >> Yours, >> >> Sasha >> >> >> On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt wrote: >>> I think the issue is that you are getting a match feature that is being >>> printed with the same ID as the mRNA feature. Correct? >>> >>> What version of MAKER are you using, and what does the gile you are giving >>> to pred_gff or model_gff look like? Could you send them? >>> >>> Thanks, >>> Carson >>> >>> >>> From: Barry Moore >>> Date: Monday, 11 March, 2013 7:32 AM >>> To: Sasha Mikheyev >>> Cc: >>> Subject: Re: [maker-devel] duplicate CDS in annotation >>> >>> Hi Sasha, >>> >>> This gene model appears to be correctly formatted to me. In GFF3 format the >>> CDS features are allowed to span multiple lines and they share the same ID >>> to indicate that it is all the same features. 
See the GFF3 specification on >>> the Sequence Ontology website >>> (http://www.sequenceontology.org/resources/gff3.html), and in particular the >>> description of the ID attribute specifies: >>> >>>> ID Indicates the ID of the feature. IDs for each feature must be unique >>>> within the scope of the GFF file. In the case of discontinuous features >>>> (i.e. a single feature that exists over multiple genomic locations) the >>>> same ID may appear on multiple lines. All lines that share an ID >>>> collectively represent a single feature. >>> >>> So each of those CDS lines forms one part of the single CDS feature for this >>> gene. >>> >>> B >>> >>> On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote: >>> >>>> Dear Yandell lab, >>>> >>>> I am re-annotating the harvester and genome using protein and RNA-seq data. >>>> However, I get many artifacts like the one below. It seems that there are >>>> several CDS records that should tie in to the same mRNA, but they are >>>> really hanging out separately, and produce several nucleotide sequences >>>> with the same name when extracted from the gff. I would appreciate any >>>> guidance about how to fix this! >>>> >>>> Thank you, >>>> >>>> Sasha >>>> >>>> grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff >>>> pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . >>>> ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; >>>> pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . >>>> ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Nam >>>> e=Hsal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159; >>>> pbar_scf7180000350377 maker mRNA 538308 558769 . + . >>>> ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350 >>>> 377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene >>>> -5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; >>>> pbar_scf7180000350377 maker exon 538308 538334 0.01 + . 
>>>> ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit >>>> :2506; >>>> pbar_scf7180000350377 maker exon 538748 538968 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit >>>> :2506; >>>> pbar_scf7180000350377 maker exon 539842 540242 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit >>>> :2506; >>>> pbar_scf7180000350377 maker exon 542624 542798 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit >>>> :2506; >>>> pbar_scf7180000350377 maker exon 555823 556025 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit >>>> :2506; >>>> pbar_scf7180000350377 maker exon 558609 558769 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit >>>> :2506; >>>> pbar_scf7180000350377 maker CDS 538308 538334 . + 0 >>>> ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit: >>>> 2506; >>>> pbar_scf7180000350377 maker CDS 538748 538968 . + 0 >>>> ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit: >>>> 2506; >>>> pbar_scf7180000350377 maker CDS 539842 540242 . + 1 >>>> ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit: >>>> 2506; >>>> pbar_scf7180000350377 maker CDS 542624 542798 . + 2 >>>> ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit: >>>> 2506; >>>> pbar_scf7180000350377 maker CDS 555823 556025 . + 1 >>>> ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit: >>>> 2506; >>>> pbar_scf7180000350377 maker CDS 558609 558769 . 
+ 2 >>>> ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit: >>>> 2506; >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >>> Barry Moore >>> Research Scientist >>> Dept. of Human Genetics >>> University of Utah >>> Salt Lake City, UT 84112 >>> -------------------------------------------- >>> (801) 585-3543 >>> >>> >>> >>> >>> _______________________________________________ maker-devel mailing list >>> maker-devel at box290.bluehost.comhttp://box290.bluehost.com/mailman/listinfo/m >>> aker-devel_yandell-lab.org >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikheyev at gmail.com Wed Mar 13 19:17:40 2013 From: mikheyev at gmail.com (Sasha Mikheyev) Date: Thu, 14 Mar 2013 10:17:40 +0900 Subject: [maker-devel] duplicate CDS in annotation In-Reply-To: References: Message-ID: OK. Got it! I did pass through the gene model names. I guess I now see that a new gene model may become associated with the old name in the re-annotation. Sasha On Thu, Mar 14, 2013 at 6:47 AM, Carson Holt wrote: > The output shows that the original model > was Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1 and the new > model replacing it is > Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1. > > So it is really a completely different model (as one derived from SNAP and > one from GeneMark). I'm guessing you have map_forward=1 set and are using > the GFF3 passthrough options correct? > > Thanks, > Carson > > > > From: Sasha Mikheyev > Date: Wednesday, 13 March, 2013 3:23 AM > > To: Carson Holt > Cc: Barry Moore , < > maker-devel at yandell-lab.org> > Subject: Re: [maker-devel] duplicate CDS in annotation > > Dear Carson, > > The new version does indeed fix the problem! > > However, I noticed that some of the CDS annotations were swallowed. 
This > seems to affect a ~600 genes. > > e.g. input: > > pbar_scf7180000349951 maker mRNA 98033 98530 . - . > ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1;_AED=1.00;_QI=0|0|0|0|0|0|2|0|81; > pbar_scf7180000349951 maker exon 98393 98530 . - . > ID=PB12301-RA:exon:10283;Parent=PB12301-RA; > pbar_scf7180000349951 maker exon 98033 98140 . - . > ID=PB12301-RA:exon:10284;Parent=PB12301-RA; > pbar_scf7180000349951 maker CDS 98033 98140 . - 0 > ID=PB12301-RA:cds:10114;Parent=PB12301-RA; > pbar_scf7180000349951 maker CDS 98393 98530 . - 0 > ID=PB12301-RA:cds:10113;Parent=PB12301-RA; > > output: > > pbar_scf7180000349951 maker mRNA 98033 98530 . - . > ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;_AED=0.38;_eAED=0.38;_QI=0|0|0.33|1|0.5|1|3|246|165;Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1,PB12301-RA > pbar_scf7180000349951 maker exon 98033 98530 . - . > ID=PB12301-RA:exon:134;Parent=PB12301-RA > pbar_scf7180000349951 maker exon 98033 98140 . - . > ID=PB12301-RA:exon:133;Parent=PB12301-RA > pbar_scf7180000349951 maker exon 98393 98530 . - . > ID=PB12301-RA:exon:132;Parent=PB12301-RA > pbar_scf7180000349951 maker three_prime_UTR 98393 98530 . - . > ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA > pbar_scf7180000349951 maker three_prime_UTR 98033 98140 . - . > ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA > pbar_scf7180000349951 maker CDS 98033 98530 . - 0 > ID=PB12301-RA:cds;Parent=PB12301-RA > > Thank you, > > Sasha > > On Tue, Mar 12, 2013 at 10:37 PM, Carson Holt wrote: > >> Yes. Try the newer version and see if you still have the issue. >> >> Thanks, >> Carson >> >> >> From: Sasha Mikheyev >> Date: Tuesday, 12 March, 2013 1:26 AM >> To: Carson Holt >> Cc: Barry Moore , < >> maker-devel at yandell-lab.org> >> >> Subject: Re: [maker-devel] duplicate CDS in annotation >> >> Hi Carson, >> >> I have been using version 2.10. Is it worth trying with a newer version? 
>> >> You can find the model file here. >> It is rather large, as it includes all of the output from the first maker >> run. >> >> Yours, >> >> Sasha >> >> >> On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt wrote: >> >>> I think the issue is that you are getting a match feature that is being >>> printed with the same ID as the mRNA feature. Correct? >>> >>> What version of MAKER are you using, and what does the gile you are >>> giving to pred_gff or model_gff look like? Could you send them? >>> >>> Thanks, >>> Carson >>> >>> >>> From: Barry Moore >>> Date: Monday, 11 March, 2013 7:32 AM >>> To: Sasha Mikheyev >>> Cc: >>> Subject: Re: [maker-devel] duplicate CDS in annotation >>> >>> Hi Sasha, >>> >>> This gene model appears to be correctly formatted to me. In GFF3 format >>> the CDS features are allowed to span multiple lines and they share the same >>> ID to indicate that it is all the same features. See the GFF3 >>> specification on the Sequence Ontology website ( >>> http://www.sequenceontology.org/resources/gff3.html), and in particular >>> the description of the ID attribute specifies: >>> >>> ID Indicates the ID of the feature. IDs for each feature must be unique >>> within the scope of the GFF file. In the case of discontinuous features >>> (i.e. a single feature that exists over multiple genomic locations) the >>> same ID may appear on multiple lines. All lines that share an ID >>> collectively represent a single feature. >>> >>> >>> So each of those CDS lines forms one part of the single CDS feature for >>> this gene. >>> >>> B >>> >>> On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote: >>> >>> Dear Yandell lab, >>> >>> I am re-annotating the harvester and genome using protein and RNA-seq >>> data. However, I get many artifacts like the one below. 
It seems that there >>> are several CDS records that should tie in to the same mRNA, but they are >>> really hanging out separately, and produce several nucleotide sequences >>> with the same name when extracted from the gff. I would appreciate any >>> guidance about how to fix this! >>> >>> Thank you, >>> >>> Sasha >>> >>> grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff >>> pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . >>> ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; >>> pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 >>> 1 53 +;Gap=M159; >>> pbar_scf7180000350377 maker mRNA 538308 558769 . + . >>> ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; >>> pbar_scf7180000350377 maker exon 538308 538334 0.01 + . >>> ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506; >>> pbar_scf7180000350377 maker exon 538748 538968 0.01 + . >>> ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506; >>> pbar_scf7180000350377 maker exon 539842 540242 0.01 + . >>> ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506; >>> pbar_scf7180000350377 maker exon 542624 542798 0.01 + . >>> ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506; >>> pbar_scf7180000350377 maker exon 555823 556025 0.01 + . >>> ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506; >>> pbar_scf7180000350377 maker exon 558609 558769 0.01 + . >>> ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506; >>> pbar_scf7180000350377 maker CDS 538308 538334 . 
+ 0 >>> ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506; >>> pbar_scf7180000350377 maker CDS 538748 538968 . + 0 >>> ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506; >>> pbar_scf7180000350377 maker CDS 539842 540242 . + 1 >>> ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506; >>> pbar_scf7180000350377 maker CDS 542624 542798 . + 2 >>> ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506; >>> pbar_scf7180000350377 maker CDS 555823 556025 . + 1 >>> ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506; >>> pbar_scf7180000350377 maker CDS 558609 558769 . + 2 >>> ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506; >>> >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >>> >>> Barry Moore >>> Research Scientist >>> Dept. of Human Genetics >>> University of Utah >>> Salt Lake City, UT 84112 >>> -------------------------------------------- >>> (801) 585-3543 >>> >>> >>> >>> >>> _______________________________________________ maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikheyev at gmail.com Wed Mar 13 21:34:52 2013 From: mikheyev at gmail.com (Sasha Mikheyev) Date: Thu, 14 Mar 2013 12:34:52 +0900 Subject: [maker-devel] duplicate CDS in annotation In-Reply-To: References: Message-ID: Thank you very much! Problem solved! Sasha On Thu, Mar 14, 2013 at 11:54 AM, Carson Holt wrote: > Yes. map_forward=1 allows new models to keep the names of the models they > replace. 
It makes it so you don't have to relocate genes every time a > model gets a slight modification during reannotation. > > --Carson > > > From: Sasha Mikheyev > Date: Wednesday, 13 March, 2013 9:17 PM > > To: Carson Holt > Cc: Barry Moore , < > maker-devel at yandell-lab.org> > Subject: Re: [maker-devel] duplicate CDS in annotation > > OK. Got it! I did pass through the gene model names. I guess I now see > that a new gene model may become associated with the old name in the > re-annotation. > > Sasha > > On Thu, Mar 14, 2013 at 6:47 AM, Carson Holt wrote: > >> The output shows that the original model >> was Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1 and the new >> model replacing it is >> Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1. >> >> So it is really a completely different model (as one derived from SNAP >> and one from GeneMark). I'm guessing you have map_forward=1 set and are >> using the GFF3 passthrough options correct? >> >> Thanks, >> Carson >> >> >> >> From: Sasha Mikheyev >> Date: Wednesday, 13 March, 2013 3:23 AM >> >> To: Carson Holt >> Cc: Barry Moore , < >> maker-devel at yandell-lab.org> >> Subject: Re: [maker-devel] duplicate CDS in annotation >> >> Dear Carson, >> >> The new version does indeed fix the problem! >> >> However, I noticed that some of the CDS annotations were swallowed. This >> seems to affect a ~600 genes. >> >> e.g. input: >> >> pbar_scf7180000349951 maker mRNA 98033 98530 . - . >> ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1;_AED=1.00;_QI=0|0|0|0|0|0|2|0|81; >> pbar_scf7180000349951 maker exon 98393 98530 . - . >> ID=PB12301-RA:exon:10283;Parent=PB12301-RA; >> pbar_scf7180000349951 maker exon 98033 98140 . - . >> ID=PB12301-RA:exon:10284;Parent=PB12301-RA; >> pbar_scf7180000349951 maker CDS 98033 98140 . - 0 >> ID=PB12301-RA:cds:10114;Parent=PB12301-RA; >> pbar_scf7180000349951 maker CDS 98393 98530 . 
- 0 >> ID=PB12301-RA:cds:10113;Parent=PB12301-RA; >> >> output: >> >> pbar_scf7180000349951 maker mRNA 98033 98530 . - . >> ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;_AED=0.38;_eAED=0.38;_QI=0|0|0.33|1|0.5|1|3|246|165;Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1,PB12301-RA >> pbar_scf7180000349951 maker exon 98033 98530 . - . >> ID=PB12301-RA:exon:134;Parent=PB12301-RA >> pbar_scf7180000349951 maker exon 98033 98140 . - . >> ID=PB12301-RA:exon:133;Parent=PB12301-RA >> pbar_scf7180000349951 maker exon 98393 98530 . - . >> ID=PB12301-RA:exon:132;Parent=PB12301-RA >> pbar_scf7180000349951 maker three_prime_UTR 98393 98530 . - . >> ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA >> pbar_scf7180000349951 maker three_prime_UTR 98033 98140 . - . >> ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA >> pbar_scf7180000349951 maker CDS 98033 98530 . - 0 >> ID=PB12301-RA:cds;Parent=PB12301-RA >> >> Thank you, >> >> Sasha >> >> On Tue, Mar 12, 2013 at 10:37 PM, Carson Holt wrote: >> >>> Yes. Try the newer version and see if you still have the issue. >>> >>> Thanks, >>> Carson >>> >>> >>> From: Sasha Mikheyev >>> Date: Tuesday, 12 March, 2013 1:26 AM >>> To: Carson Holt >>> Cc: Barry Moore , < >>> maker-devel at yandell-lab.org> >>> >>> Subject: Re: [maker-devel] duplicate CDS in annotation >>> >>> Hi Carson, >>> >>> I have been using version 2.10. Is it worth trying with a newer version? >>> >>> You can find the model file here. >>> It is rather large, as it includes all of the output from the first maker >>> run. >>> >>> Yours, >>> >>> Sasha >>> >>> >>> On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt wrote: >>> >>>> I think the issue is that you are getting a match feature that is being >>>> printed with the same ID as the mRNA feature. Correct? >>>> >>>> What version of MAKER are you using, and what does the gile you are >>>> giving to pred_gff or model_gff look like? Could you send them? 
>>>> >>>> Thanks, >>>> Carson >>>> >>>> >>>> From: Barry Moore >>>> Date: Monday, 11 March, 2013 7:32 AM >>>> To: Sasha Mikheyev >>>> Cc: >>>> Subject: Re: [maker-devel] duplicate CDS in annotation >>>> >>>> Hi Sasha, >>>> >>>> This gene model appears to be correctly formatted to me. In GFF3 >>>> format the CDS features are allowed to span multiple lines and they share >>>> the same ID to indicate that it is all the same features. See the GFF3 >>>> specification on the Sequence Ontology website ( >>>> http://www.sequenceontology.org/resources/gff3.html), and in >>>> particular the description of the ID attribute specifies: >>>> >>>> ID Indicates the ID of the feature. IDs for each feature must be unique >>>> within the scope of the GFF file. In the case of discontinuous features >>>> (i.e. a single feature that exists over multiple genomic locations) the >>>> same ID may appear on multiple lines. All lines that share an ID >>>> collectively represent a single feature. >>>> >>>> >>>> So each of those CDS lines forms one part of the single CDS feature for >>>> this gene. >>>> >>>> B >>>> >>>> On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote: >>>> >>>> Dear Yandell lab, >>>> >>>> I am re-annotating the harvester and genome using protein and RNA-seq >>>> data. However, I get many artifacts like the one below. It seems that there >>>> are several CDS records that should tie in to the same mRNA, but they are >>>> really hanging out separately, and produce several nucleotide sequences >>>> with the same name when extracted from the gff. I would appreciate any >>>> guidance about how to fix this! >>>> >>>> Thank you, >>>> >>>> Sasha >>>> >>>> grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff >>>> pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - >>>> . ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; >>>> pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . 
ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 >>>> 1 53 +;Gap=M159; >>>> pbar_scf7180000350377 maker mRNA 538308 558769 . + . >>>> ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; >>>> pbar_scf7180000350377 maker exon 538308 538334 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker exon 538748 538968 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker exon 539842 540242 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker exon 542624 542798 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker exon 555823 556025 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker exon 558609 558769 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker CDS 538308 538334 . + 0 >>>> ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker CDS 538748 538968 . + 0 >>>> ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker CDS 539842 540242 . + 1 >>>> ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker CDS 542624 542798 . + 2 >>>> ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker CDS 555823 556025 . 
+ 1
>>>> ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506;
>>>> pbar_scf7180000350377 maker CDS 558609 558769 . + 2
>>>> ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506;
>>>>
>>>> _______________________________________________
>>>> maker-devel mailing list
>>>> maker-devel at box290.bluehost.com
>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>>>
>>>>
>>>> Barry Moore
>>>> Research Scientist
>>>> Dept. of Human Genetics
>>>> University of Utah
>>>> Salt Lake City, UT 84112
>>>> --------------------------------------------
>>>> (801) 585-3543
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________ maker-devel mailing
>>>> list maker-devel at box290.bluehost.com
>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>>>
>>>
>>>
>>
>

From ramonfallon at gmail.com Thu Mar 14 09:19:47 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Thu, 14 Mar 2013 16:19:47 +0100
Subject: [maker-devel] 12core speed check
Message-ID:

Hi,

I was trying to tweak some of our machines to maximise Mpich2/Maker (svn rev 997) throughput and describe one small set of results on this mailing list to allow sharing of experiences.

I use the example input dataset "dpp_contig.fasta" with the original sequence repeated 125 times within the same file (under different names of course) to allow for a decent size run. This file totalled 4.019 megabases. I use the dpp_proteins.fasta and The maker_opts.ctl has "cpus=1" set as the docs recommend for MPI.

Hardware is a standalone HP Proliant SL390 with two Intel X5675 @ 3.07GHz, totalling 12 cores with 192GB RAM and 1TB disk (local, no NFS) running Ubuntu 10.04 with 2.6.32-41 linux kernel

commandline was "mpiexec -n <#cores> maker" within a dedicated directory containing all relevant files.

#cores  time(mins)  Megabases/hr
 1        27.00        8.93
 2       126.25        1.91
 4        42.57        5.66
 6        25.42        9.49
 8        18.60       12.96
10        16.67       14.47
12        13.98       17.24

I attach a png file with graph. The upshot of this particular experiment is that 2 processes show anomalous behaviour and that 6 processors are needed to gain an advantage on the 1 processor run, while 12 processors achieves a speed-up of nearly 2 on the 1 processor version.

I am now going to move on to a three node cluster with 2x 8core processors each (so I can go up to 48 processors), so will report back with higher core numbers. Any suggestions on further speed optimizations welcome.

Cheers / Ramón.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: twelvecore_spup.png
Type: image/png
Size: 25749 bytes
Desc: not available
URL:

From carsonhh at gmail.com Thu Mar 14 09:53:33 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 14 Mar 2013 11:53:33 -0400
Subject: [maker-devel] 12core speed check
In-Reply-To:
Message-ID:

I can give a similar setup a try as well to see if anything is amiss in the development version. The expected behavior is that 1 and 2 cores should have identical performance (as one process is always fully dedicated to communication).

--Carson

From mnuhn at ebi.ac.uk Thu Mar 14 10:20:01 2013
From: mnuhn at ebi.ac.uk (Michael Nuhn)
Date: Thu, 14 Mar 2013 16:20:01 +0000
Subject: [maker-devel] master_datastore_index.log file shrinks.
Message-ID: <5141F8B1.7020808@ebi.ac.uk>

Hello!

I'm trying to keep track of the progress of maker (version 2.27) while it is running by looking at the master_datastore_index.log file every once in a while.

Sometimes the number of lines in it decreases. Just now it went down from more than two hundred to thirty seven.

When I start more instances of maker, the number of lines in it increases when they start. But sometimes I check and the number of lines has greatly reduced since the last time.

I'm afraid that the newer instances of maker are deleting the file and starting it from scratch instead of adding their progress to it.

Is this a file locking issue I should be worried about?

Cheers,
Michael.

From olaf.mueller at duke.edu Thu Mar 14 10:13:20 2013
From: olaf.mueller at duke.edu (Olaf Mueller)
Date: Thu, 14 Mar 2013 12:13:20 -0400
Subject: [maker-devel] 12core speed check
In-Reply-To:
References:
Message-ID: <5141F720.20502@duke.edu>

The X5675 supports hyperthreading. Does, e.g., "mpiexec -n 2 maker" use 2 physical cores or 2 threads of the 1st core? If the latter happens it would be interesting to see your series extended to -n 24.

Cheers
Olaf

From carsonhh at gmail.com Thu Mar 14 10:21:47 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 14 Mar 2013 12:21:47 -0400
Subject: [maker-devel] master_datastore_index.log file shrinks.
In-Reply-To: <5141F8B1.7020808@ebi.ac.uk>
Message-ID:

The file should only be deleted if there are no instances running and a new one starts. Then it rebuilds it. If it is being deleted while other instances are still active, then yes that is a lock issue. There are several other locks that should protect individual contigs while that particular lock is only protecting the datastore_index.log file.

If any of the contig locks are not working you would start to see failures of contigs with weird errors that say there are missing files.

Try dialling back on the number of simultaneous instances you start and instead use MPI or the -cpus option to get the parallelization boost.
Alternatively you can also split up the input file and use the -base option so everything gets written to the same place (then you never have to worry about locks affecting individual contigs - as no single instance has access to all the contigs)

Example:
fasta_tool --chunks 5 maize_assembly.fasta
maker -g maize_assembly_0.fasta -base maize_assembly
maker -g maize_assembly_1.fasta -base maize_assembly
maker -g maize_assembly_2.fasta -base maize_assembly
maker -g maize_assembly_3.fasta -base maize_assembly
maker -g maize_assembly_4.fasta -base maize_assembly
maker -dsindex

Everything then gets written to maize_assembly.maker.output for all results. The last call to maker with the -dsindex flag then rebuilds the datastore_index.log file to match the original maize_assembly.fasta file.

Thanks,
Carson

From mnuhn at ebi.ac.uk Thu Mar 14 10:49:19 2013
From: mnuhn at ebi.ac.uk (Michael Nuhn)
Date: Thu, 14 Mar 2013 16:49:19 +0000
Subject: [maker-devel] master_datastore_index.log file shrinks.
In-Reply-To:
References:
Message-ID: <5141FF8F.2050900@ebi.ac.uk>

Hello Carson!

Thanks for your quick response and your ideas. I'll give them a try.

Cheers,
Michael.

From carsonhh at gmail.com Thu Mar 14 11:51:41 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 14 Mar 2013 13:51:41 -0400
Subject: [maker-devel] 12core speed check
In-Reply-To:
Message-ID:

Could you update to 998? It was a recent commit to the devel version that caused a weird pause.

Thanks,
Carson

From carsonhh at gmail.com Thu Mar 14 11:55:38 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 14 Mar 2013 13:55:38 -0400
Subject: [maker-devel] 12core speed check
In-Reply-To: <5141F720.20502@duke.edu>
Message-ID:

It should use 2 physical cores. Hyperthreading shouldn't come into play unless you start more processes than there are physical cores. I haven't seen any big performance advantage in most cases with hyperthreading on linux machines. I find more often than not it just confuses students into thinking there are free processors and then starting too many jobs.

--Carson
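
As an illustration of the physical-versus-logical core distinction raised in the hyperthreading exchange, here is a minimal sketch of how one might count both on a Linux machine before choosing a value for "mpiexec -n". It assumes the usual /proc/cpuinfo layout with "processor", "physical id" and "core id" fields; the function name count_cores and the SAMPLE text are hypothetical, and on systems that omit those fields (some virtual machines) the physical count will not be meaningful.

```python
def count_cores(cpuinfo_text):
    """Return (logical, physical) core counts parsed from /proc/cpuinfo-style text.

    Assumption: each logical CPU is a blank-line-separated block that carries
    'processor', 'physical id' and 'core id' fields, as on typical x86 Linux.
    """
    logical = 0
    physical = set()
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            if ":" in line:
                key, _, value = line.partition(":")
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue  # not a CPU block
        logical += 1
        # Hyperthreads of one core share the same (socket, core) pair.
        physical.add((fields.get("physical id"), fields.get("core id")))
    return logical, len(physical)


# Hypothetical sample: one socket, two cores, two threads per core.
SAMPLE = """processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 1

processor : 2
physical id : 0
core id : 0

processor : 3
physical id : 0
core id : 1
"""

if __name__ == "__main__":
    logical, physical = count_cores(SAMPLE)
    print(f"logical CPUs: {logical}, physical cores: {physical}")
```

On a real machine one would pass the contents of /proc/cpuinfo instead of SAMPLE. If the logical count is twice the physical count, hyperthreading is likely enabled, and starting more MPI processes than physical cores is the situation the reply above warns about.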

From myandell at genetics.utah.edu Thu Mar 14 11:59:37 2013
From: myandell at genetics.utah.edu (Mark Yandell)
Date: Thu, 14 Mar 2013 17:59:37 +0000
Subject: [maker-devel] 12core speed check
In-Reply-To:
References: ,
Message-ID: <7A60AB257EFF2B48B1F4C814817EA05350ED9082@mxb2.hg.genetics.utah.edu>

Thanks Ramón. Super interesting analysis!

Mark Yandell
Professor of Human Genetics
H.A. & Edna Benning Presidential Endowed Chair
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
ph:801-587-7707
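
The throughput and speed-up figures in the 12-core benchmark table from this thread can be re-derived with a few lines of arithmetic. The sketch below uses only the numbers reported in the post (4.019 megabases of input and the wall-clock minutes per run); the names RUNS, ASSEMBLY_MB and summarize are hypothetical, and "efficiency" is simply speed-up divided by the number of MPI processes, which is an assumption about how one might want to read the data rather than anything stated in the thread.

```python
ASSEMBLY_MB = 4.019  # input size in megabases, as reported in the post

# (MPI processes, wall-clock minutes) copied from the benchmark table
RUNS = [(1, 27.00), (2, 126.25), (4, 42.57), (6, 25.42),
        (8, 18.60), (10, 16.67), (12, 13.98)]


def summarize(runs, size_mb):
    """Return (processes, Mb/hr, speed-up vs 1 process, efficiency) per run."""
    base_minutes = dict(runs)[1]  # the single-process run is the baseline
    rows = []
    for n, minutes in runs:
        throughput = size_mb / (minutes / 60.0)  # megabases per hour
        speedup = base_minutes / minutes
        efficiency = speedup / n  # assumption: divide by total process count
        rows.append((n, throughput, speedup, efficiency))
    return rows


if __name__ == "__main__":
    for n, thr, spd, eff in summarize(RUNS, ASSEMBLY_MB):
        print(f"{n:>2}  {thr:6.2f} Mb/hr  speed-up {spd:4.2f}  efficiency {eff:4.2f}")
```

The 12-process run comes out at a speed-up of roughly 1.93 over the single-process run, in line with the "nearly 2" quoted in the post. Since one process is said to be fully dedicated to communication, dividing by the total process count understates per-worker efficiency somewhat.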

From ares711122 at gmail.com Thu Mar 14 21:13:55 2013
From: ares711122 at gmail.com (Hung-Wei Hsu)
Date: Fri, 15 Mar 2013 11:13:55 +0800
Subject: [maker-devel] ERROR: Could not obtain lock to format database
In-Reply-To:
References:
Message-ID:

You may find the error messages in the run log as attached.
Thanks a lot in advance.

Best regards,
Hung-Wei

2013/3/14 Carson Holt

> Could you check to make sure your hard drive is not full, whatever
> location you set as TMP= in the control files is not full (default is
> /tmp). Also make sure you do not set /tmp to an NFS mounted or a tmpfs
> location.
>
> Could you also send the full captured STDERR.
>
> Thanks,
> Carson
>
> From: Hung-Wei Hsu
> Date: Tuesday, 12 March, 2013 8:24 PM
> To:
> Subject: [maker-devel] ERROR: Could not obtain lock to format database
>
> Hi MAKER developers,
>
> I tried MAKER 2.27b on one E. coli scaffold sequence with uniprot protein
> database.
> I failed to run the analysis and got an error message as below.
>
> Could not obtain lock to format database at maker-2.27b/bin/../lib/GI.pm
>
> Any suggestions or helps will be deeply appreciated.
>
> Best regards,
> Hung-Wei
-------------- next part --------------
A non-text attachment was scrubbed...
Name: run.log
Type: application/octet-stream
Size: 27206 bytes
Desc: not available
URL:

From ares711122 at gmail.com Thu Mar 14 21:35:09 2013
From: ares711122 at gmail.com (Hung-Wei Hsu)
Date: Fri, 15 Mar 2013 11:35:09 +0800
Subject: [maker-devel] ERROR: Could not obtain lock to format database
In-Reply-To:
References:
Message-ID:

The hard disk where I tried MAKER is about 2TB in size.
TMP was not set to an NFS mounted or a tmpfs location and was empty before analysis.
The hard disk where the TMP directory was located was about 2TB in size.
Thanks a lot in advance.

Best regards,
Hung-Wei

From carsonhh at gmail.com Fri Mar 15 12:06:21 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 15 Mar 2013 14:06:21 -0400
Subject: [maker-devel] ERROR: Could not obtain lock to format database
In-Reply-To:
Message-ID:

Were you by any chance running multiple instances of MAKER at the same time in the same directory? It looks like two processes started to work on the same contig (normally a first set of locks blocks this possibility - but rarely they get past that step). Then when it got to a part where an analysis is performed one properly failed when it realized that the other had the lock.
In any case, it looks like it just retried and finished the contig in question. So the snippet seems to indicate expected behavior. Do you see the contig in question as being finished and having an output GFF3?

--Carson

From ramonfallon at gmail.com Mon Mar 18 08:35:04 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Mon, 18 Mar 2013 15:35:04 +0100
Subject: [maker-devel] Fwd: 12core speed check
In-Reply-To:
References: <7A60AB257EFF2B48B1F4C814817EA05350ED9082@mxb2.hg.genetics.utah.edu>
Message-ID:

Hi!

I've tried again from two different machines, and I can't do a "svn co" nor "svn update" on the malachite server.
Can you verify the server and the svn service is OK on your side? Many thanks / Ram?n. On Fri, Mar 15, 2013 at 1:18 PM, Ram?n Fallon wrote: > Hi Mark and Carson, > > Many thanks for the comments and the speedy replies! > > Previously, I never had problem connecting to the svn server on > malachite.genetics.utah.edu, but this morning, I couldn't connect to > update to rev 998. > > I'l try again later. > > Cheers / Ram?n. > > > On Thu, Mar 14, 2013 at 6:59 PM, Mark Yandell wrote: > >> Thanks Ramon. super interesting analysis! >> >> >> Mark Yandell >> Professor of Human Genetics >> H.A. & Edna Benning Presidential Endowed Chair >> Eccles Institute of Human Genetics >> University of Utah >> 15 North 2030 East, Room 2100 >> Salt Lake City, UT 84112-5330 >> ph:801-587-7707 >> >> ________________________________________ >> From: maker-devel-bounces at yandell-lab.org [ >> maker-devel-bounces at yandell-lab.org] on behalf of Carson Holt [ >> carsonhh at gmail.com] >> Sent: Thursday, March 14, 2013 11:51 AM >> To: Ram?n Fallon; maker-devel at yandell-lab.org >> Subject: Re: [maker-devel] 12core speed check >> >> Could you update to 998. It was a recent commit to the devel version >> that caused a weird pause. >> >> Thanks, >> Carson >> >> >> From: Ram?n Fallon > >> Date: Thursday, 14 March, 2013 11:19 AM >> To: > >> Subject: [maker-devel] 12core speed check >> >> Hi, >> >> I was trying to tweak some of our machines to maximise Mpich2/Maker (svn >> rev 997) throughput and describe one small set of results on this mailing >> list to allow sharing of experiences. >> >> I use the example input dataset "dpp_contig.fasta" with the original >> sequence repeated 125 times within the same file (under different names of >> course) to allow for a decent size run. This file totalled 4.019 megabases. >> I use the dpp_proteins.fasta and The maker_opts.ctl has "cpus=1" set as >> the docs recommend for MPI. 
>> >> Hardware is a standalone HP Proliant SL390 with two Intel X5675 @ >> 3.07GHz, totalling 12 cores with 192GB RAM and 1TB disk (local, no NFS) >> running Ubuntu 10.04 with 2.6.32-41 linux kernel >> >> commandline was "mpiexec -n <#cores> maker" within a dedicated directory >> containing all relevant files. >> >> #cores time(mins) Megabases/hr >> 1 27.00 8.93 >> 2 126.25 1.91 >> 4 42.57 5.66 >> 6 25.42 9.49 >> 8 18.60 12.96 >> 10 16.67 14.47 >> 12 13.98 17.24 >> >> I attach a png file with graph. The upshot of this particular experiment >> is that 2 processes show anomalous behaviour and that 6 processors are >> needed to gain an advantage on the 1 processor run, while 12 processors >> achieves a speed-up of nearly 2 on the 1 processor version. >> >> I am now going to move on to a three node cluster with 2x 8core >> processors each (so I can go up to 48 processors), so will report back with >> higher core numbers. Any suggestions on further speed optimizations welcome. >> >> Cheers / Ram?n. >> _______________________________________________ maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon Mar 18 08:51:37 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 18 Mar 2013 10:51:37 -0400 Subject: [maker-devel] Fwd: 12core speed check In-Reply-To: Message-ID: For any users currently using the devel subversion repository. If you need to update, please send me an e-mail to get information on how to switch over to our new server. Thanks, Carson From: Ram?n Fallon Date: Monday, 18 March, 2013 10:35 AM To: Subject: [maker-devel] Fwd: 12core speed check Hi! I've tried again from two different machines, and I can't do a "svn co" nor "svn update" on the malachite server. Can you verify the server and the svn service is OK on your side? Many thanks / Ram?n. 

On Fri, Mar 15, 2013 at 1:18 PM, Ramón Fallon wrote:
> Hi Mark and Carson,
>
> Many thanks for the comments and the speedy replies!
>
> Previously, I never had problem connecting to the svn server on
> malachite.genetics.utah.edu, but this
> morning, I couldn't connect to update to rev 998.
>
> I'll try again later.
>
> Cheers / Ramón.
>
>
> On Thu, Mar 14, 2013 at 6:59 PM, Mark Yandell
> wrote:
>> Thanks Ramon. super interesting analysis!
>>
>>
>> Mark Yandell
>> Professor of Human Genetics
>> H.A. & Edna Benning Presidential Endowed Chair
>> Eccles Institute of Human Genetics
>> University of Utah
>> 15 North 2030 East, Room 2100
>> Salt Lake City, UT 84112-5330
>> ph:801-587-7707
>>
>> ________________________________________
>> From: maker-devel-bounces at yandell-lab.org
>> [maker-devel-bounces at yandell-lab.org] on behalf of Carson Holt
>> [carsonhh at gmail.com]
>> Sent: Thursday, March 14, 2013 11:51 AM
>> To: Ramón Fallon; maker-devel at yandell-lab.org
>> Subject: Re: [maker-devel] 12core speed check
>>
>> Could you update to 998. It was a recent commit to the devel version that
>> caused a weird pause.
>>
>> Thanks,
>> Carson
>>
>>
>> From: Ramón Fallon
>> Date: Thursday, 14 March, 2013 11:19 AM
>> To:
>> Subject: [maker-devel] 12core speed check
>>
>> Hi,
>>
>> I was trying to tweak some of our machines to maximise Mpich2/Maker (svn rev
>> 997) throughput and describe one small set of results on this mailing list
>> to allow sharing of experiences.
>>
>> I use the example input dataset "dpp_contig.fasta" with the original sequence
>> repeated 125 times within the same file (under different names of course) to
>> allow for a decent size run. This file totalled 4.019 megabases. I use the
>> dpp_proteins.fasta and The maker_opts.ctl has "cpus=1" set as the docs
>> recommend for MPI.
>>
>> Hardware is a standalone HP Proliant SL390 with two Intel X5675 @ 3.07GHz,
>> totalling 12 cores with 192GB RAM and 1TB disk (local, no NFS) running Ubuntu
>> 10.04 with 2.6.32-41 linux kernel
>>
>> commandline was "mpiexec -n <#cores> maker" within a dedicated directory
>> containing all relevant files.
>>
>> #cores   time(mins)   Megabases/hr
>>  1        27.00         8.93
>>  2       126.25         1.91
>>  4        42.57         5.66
>>  6        25.42         9.49
>>  8        18.60        12.96
>> 10        16.67        14.47
>> 12        13.98        17.24
>>
>> I attach a png file with graph. The upshot of this particular experiment is
>> that 2 processes show anomalous behaviour and that 6 processors are needed to
>> gain an advantage on the 1 processor run, while 12 processors achieves a
>> speed-up of nearly 2 on the 1 processor version.
>>
>> I am now going to move on to a three node cluster with 2x 8core processors
>> each (so I can go up to 48 processors), so will report back with higher core
>> numbers. Any suggestions on further speed optimizations welcome.
>>
>> Cheers / Ramón.

From hudarul at yahoo.com Mon Mar 18 14:13:21 2013
From: hudarul at yahoo.com (Hud Hud)
Date: Mon, 18 Mar 2013 13:13:21 -0700 (PDT)
Subject: [maker-devel] Maker-no such file or directory
Message-ID: <1363637601.24386.YahooMailNeo@web164901.mail.bf1.yahoo.com>

I have a problem with maker.

1. I tried to work with the example data in the data directory, but I'm
getting this kind of error. Can anyone help me?

error
$ maker
STATUS: Parsing control files...
dpp_contig.fasta (fasta file or fasta embeded in GFF3 file): No such file or directory at /home/Dorah/maker-2.27-beta/maker/bin/../lib/GI.pm line 186
--> rank=NA, hostname=NurKaiyisah

my maker_opts.ctl
genome=$home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta
est=$home/Dorah/maker-2.27-beta/maker/data/dpp_est.fasta
protein=$home/Dorah/maker-2.27-beta/maker/data/dpp_protein.fasta

From Hossein.Borhan at AGR.GC.CA Mon Mar 18 14:40:38 2013
From: Hossein.Borhan at AGR.GC.CA (Borhan, Hossein)
Date: Mon, 18 Mar 2013 16:40:38 -0400
Subject: [maker-devel] failed gene prediction
Message-ID: <7B64340A44B6634C814A22BCFA6179D5020887DA@onottaxms5.AGR.GC.CA>

Hi

I have tried maker on a fungus genome of 45 Mb with 1/3 being repeat rich.
It did not produce any prediction. I am not sure what is causing this.
Attached are the STDERR and opts.ctl.

I appreciate your help

Hossein

-------------- next part --------------
A non-text attachment was scrubbed...
Name: wa74-maker-stderr.log
Type: application/octet-stream
Size: 6325713 bytes
Desc: wa74-maker-stderr.log
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_opts.ctl
Type: application/octet-stream
Size: 5244 bytes
Desc: maker_opts.ctl
URL:

From carsonhh at gmail.com Mon Mar 18 14:44:41 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 18 Mar 2013 16:44:41 -0400
Subject: [maker-devel] Maker-no such file or directory
In-Reply-To: <1363637601.24386.YahooMailNeo@web164901.mail.bf1.yahoo.com>
Message-ID:

Does 'ls -al $home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta' show a
valid location? The error is just saying that the file location as written
in the maker_opts.ctl file does not exist.
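
The same check can be run over every input entry in the control file at once. The following is only a sketch, not MAKER's own parser: the function name `missing_input_files` is hypothetical, it assumes plain `key=value #comment` lines like the excerpt above, and it tests each path exactly as written, without expanding shell variables such as `$home`.

```python
import os


def missing_input_files(ctl_text, keys=("genome", "est", "protein")):
    """Return {key: path} for control-file entries whose file does not exist.

    Hypothetical helper: assumes simple `key=value #comment` lines and
    checks the value literally (no shell-variable expansion).
    """
    missing = {}
    for line in ctl_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key in keys and value and not os.path.isfile(value):
            missing[key] = value
    return missing


if __name__ == "__main__":
    # Report the entries of a maker_opts.ctl in the current directory.
    with open("maker_opts.ctl") as handle:
        for key, path in missing_input_files(handle.read()).items():
            print("%s: no such file: %s" % (key, path))
```

Any entry it reports is a path that does not exist as typed, which is what the error message above is complaining about.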

--Carson

From: Hud Hud
Reply-To: Hud Hud
Date: Monday, 18 March, 2013 4:13 PM
To: "maker-devel at yandell-lab.org"
Subject: [maker-devel] Maker-no such file or directory

I have a problem with maker.

1. I tried to work with the example data in the data directory, but I'm
getting this kind of error. Can anyone help me?

error
$ maker
STATUS: Parsing control files...
dpp_contig.fasta (fasta file or fasta embeded in GFF3 file): No such file or directory at /home/Dorah/maker-2.27-beta/maker/bin/../lib/GI.pm line 186
--> rank=NA, hostname=NurKaiyisah

my maker_opts.ctl
genome=$home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta
est=$home/Dorah/maker-2.27-beta/maker/data/dpp_est.fasta
protein=$home/Dorah/maker-2.27-beta/maker/data/dpp_protein.fasta

From carsonhh at gmail.com Mon Mar 18 14:49:30 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 18 Mar 2013 16:49:30 -0400
Subject: [maker-devel] failed gene prediction
In-Reply-To: <7B64340A44B6634C814A22BCFA6179D5020887DA@onottaxms5.AGR.GC.CA>
Message-ID:

You didn't supply any evidence or HMM files for gene predictors. Just raw
assembly data by itself is insufficient for genome annotation.

Here is some nice documentation for running MAKER -->
http://gmod.org/wiki/MAKER_Tutorial_2012

Here is a nice overview of genome annotation in general -->
http://fasta.bioch.virginia.edu/cshl/pdf/12/ajm12/euk_genome_annotation_review.pdf

Once you've gone through the documentation and examples, if you come across
any questions just let us know.

Thanks,
Carson

From: "Borhan, Hossein"
Date: Monday, 18 March, 2013 4:40 PM
To:
Subject: [maker-devel] failed gene prediction

Hi

I have tried maker on a fungus genome of 45 Mb with 1/3 being repeat rich.
It did not produce any prediction.
I am not sure what is causing this.
Attached are the STDERR and opts.ctl.

I appreciate your help

Hossein

From ares711122 at gmail.com Mon Mar 18 18:44:39 2013
From: ares711122 at gmail.com (Hung-Wei Hsu)
Date: Tue, 19 Mar 2013 08:44:39 +0800
Subject: [maker-devel] ERROR: Could not obtain lock to format database
In-Reply-To:
References:
Message-ID:

I made sure I ran only one instance of MAKER at a time.
I only analyzed one contig for the test.
After MAKER was interrupted, I can't find a GFF3 output for this contig.
There is only a theVoidXXX directory and a run.log file.

I'm trying 2.26b with the same parameters for the same data.
Hopefully, it can work well.

Hung-Wei

2013/3/16 Carson Holt

> Were you by any chance running multiple instances of MAKER at the same
> time in the same directory? It looks like two processes started to work on
> the same contig (normally a first set of locks blocks this possibility,
> but rarely they get past that step). Then when it got to a part where an
> analysis is performed one properly failed when it realized that the other
> had the lock. In any case, it looks like it just retried and finished the
> contig in question. So the snippet seems to indicate expected behavior.
> Do you see the contig in question as being finished and having an output
> GFF3?
>
> --Carson
>
>
>
>
> From: Hung-Wei Hsu
> Date: Thursday, 14 March, 2013 11:13 PM
> To: Carson Holt
> Cc:
> Subject: Re: [maker-devel] ERROR: Could not obtain lock to format database
>
> You may find the error messages in the run log as attached.
> Thanks a lot in advance.
>
> Best regards,
> Hung-Wei
>
>
> 2013/3/14 Carson Holt
>
>> Could you check to make sure your hard drive is not full, and that whatever
>> location you set as TMP= in the control files is not full (default is
>> /tmp). Also make sure you do not set /tmp to an NFS mounted or a tmpfs
>> location.
>>
>> Could you also send the full captured STDERR.
>>
>> Thanks,
>> Carson
>>
>>
>>
>> From: Hung-Wei Hsu
>> Date: Tuesday, 12 March, 2013 8:24 PM
>> To:
>> Subject: [maker-devel] ERROR: Could not obtain lock to format database
>>
>> Hi MAKER developers,
>>
>> I tried MAKER 2.27b on one E. coli scaffold sequence with uniprot protein
>> database.
>> I failed to run the analysis and got an error message as below.
>>
>> Could not obtain lock to format database at maker-2.27b/bin/../lib/GI.pm
>>
>> Any suggestions or help will be deeply appreciated.
>>
>> Best regards,
>> Hung-Wei
>>
>
>

From mnuhn at ebi.ac.uk Tue Mar 19 06:12:32 2013
From: mnuhn at ebi.ac.uk (Michael Nuhn)
Date: Tue, 19 Mar 2013 12:12:32 +0000
Subject: [maker-devel] master_datastore_index.log file shrinks.
In-Reply-To: <5141FF8F.2050900@ebi.ac.uk>
References: <5141FF8F.2050900@ebi.ac.uk>
Message-ID: <51485630.6080701@ebi.ac.uk>

Hello Carson!

On 03/14/2013 04:49 PM, Michael Nuhn wrote:
>> Try dialling back on the number of simultaneous instances you start and
>> instead use MPI or the -cpus option to get the parallelization boost.
>> Alternatively you can also split up the input file and use the -base
>> option so everything gets written to the same place (then you never have
>> to worry about locks affecting individual contigs - as no single instance
>> has access to all the contigs)
>>
>> Example:
>> fasta_tool --chunks 5 maize_assembly.fasta
>> maker -g maize_assembly_0.fasta -base maize_assembly
>> maker -g maize_assembly_1.fasta -base maize_assembly
>> maker -g maize_assembly_2.fasta -base maize_assembly
>> maker -g maize_assembly_3.fasta -base maize_assembly
>> maker -g maize_assembly_4.fasta -base maize_assembly
>> maker -dsindex
>>
>> Everything then gets written to maize_assembly.maker.output for all
>> results. The last call to maker with the -dsindex flag then rebuilds the
>> datastore_index.log file to match the original maize_assembly.fasta file

I have tried this, split my genome into 50 files and run them as you
suggested above.

This worked well most of the time, but now I am getting locking issues
again. The working directory gets flooded with STACK.STACK.STACK.STACK
... files.

What I think is happening is that for some reason the maker instances
decide that they want to rebuild the index. This takes a lot of time
and this blocks even more instances wanting to lock the index files.
In the end most of the maker instances end up waiting.

I would like to try the following, but I don't know if this might
cause problems later on:

I would like to run all of the split sequence files as separate maker
projects as if they were independent genomes. In the end I'd merge all
the individual gff files using the gff3_merge script.

Do you see any reason why this wouldn't work?

Cheers,
Michael.

From Bob_Freeman at hms.harvard.edu Tue Mar 19 07:03:00 2013
From: Bob_Freeman at hms.harvard.edu (Freeman, Robert M.)
Date: Tue, 19 Mar 2013 09:03:00 -0400
Subject: [maker-devel] Failing to recoup all gff annotations, including ab-initio
Message-ID:

Carson et al.,

Thanks again for a great suite of tools! We're using MAKER now to generate
gene models (and model fragments) for a ciliate, the models for which we'll
be using to generate a high-quality protein database for searches with mass
spec.

I bootstrapped the process using the core set of proteins with CEGMA, then
trained SNAP. After the final round of running MAKER, I get about 1100
evidence-based models and 34K ab-initio. And that's fine (for now). I am
able to collect the fasta files for both transcripts and proteins
(evidence-based and ab-initio) without problem.

My problem is that when I use the gff3_merge script, I only get annotations
for the evidence-based models. I'm not sure why the ab-initio model
annotations are not being collected. I've tried with and without the '-g'
switch, but this doesn't seem to make a difference.

Thoughts?

Tx,
B

-----------------------------------------------------
Bob Freeman, Ph.D.
Acorn Worm Informatics, Kirschner lab
Dept of Systems Biology, Alpert 524
Harvard Medical School
200 Longwood Avenue
Boston, MA 02115
617/432.2294, vox

"Sorry I'm late. Oh, God, that sounded insincere. I'm late."
-- Karen Walker, from Will and Grace

From dsth at ebi.ac.uk Tue Mar 19 07:33:13 2013
From: dsth at ebi.ac.uk (Daniel Hughes)
Date: Tue, 19 Mar 2013 13:33:13 +0000
Subject: [maker-devel] master_datastore_index.log file shrinks.]
Message-ID:

Daniel S. T. Hughes M.Biochem (Hons; Oxford), Ph.D (Cambridge)
-------------------------------------------------------------------------------------
dsth at cantab.net
dsth at cpan.org

Hi Michael,

You're using the EBI cluster? I have to ask, is this all just a really
elaborate way of avoiding the use of MPI that works perfectly well on both
the ebi and sanger compute farms?
if you carry on in the direction you seem to be going you're likely to end up with a considerable level of unnecessary overhead and should possibly consider adapting the ensembl genebuild pipeline to your specific needs. Dan Hello Carson! > > On 03/14/2013 04:49 PM, Michael Nuhn wrote: > >> Try dialling back on the number of simultaneous instances you start and > >> instead use MPI or the -cpus option to get the parallelization boost. > >> Alternatively you can also split up the input file and use the -base > >> option so everything gets written to the same place (then you never have > >> to worry about locks affecting individual contigs - as no single > instance > >> has access to all the contigs) > >> > >> Example: > >> fasta_tool --chunks 5 maize_assembly.fasta > >> maker -g maize_assembly_0.fasta -base maize_assembly > >> maker -g maize_assembly_1.fasta -base maize_assembly > >> > >> maker -g maize_assembly_2.fasta -base maize_assembly > >> > >> maker -g maize_assembly_3.fasta -base maize_assembly > >> > >> maker -g maize_assembly_4.fasta -base maize_assembly > >> > >> maker -dsindex > >> > >> Everything then gets written to maize_assembly.maker.output for all > >> results. The last call to maker with the -dsindex flag then rebuilds > the > >> datastore_index.log file to match the original maize_assembly.fasta file > > I have tried this, split my genome into 50 files and run them as you > suggested above. > > This worked well most of the time, but now I am getting locking issues > again. The working directory gets flooded with STACK.STACK.STACK.STACK > ... files. > > What I think is happening is that for some reason the maker instances > decide that they want to rebuild the index. This takes a lot of time > and this blocks even more instances wanting to lock the index files. > In the end most of the maker instances end up waiting. 
>
> I would like to try the following, but I don't know, if this might
> cause problems later on:
>
> I would like to run all of the split sequence files as separate maker
> projects as if they were independent genomes. In the end I'd merge all
> the individual gff files using the gff3_merge script.
>
> Do you see any reason why this wouldn't work?
>
> Cheers,
> Michael.
>
>
> ----- End forwarded message -----
>
>

From carsonhh at gmail.com Tue Mar 19 08:27:16 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 19 Mar 2013 10:27:16 -0400
Subject: [maker-devel] master_datastore_index.log file shrinks.]
In-Reply-To:
Message-ID:

Yes. If at all possible use MPI. It removes the overhead of locks, which
happen per primary instance of MAKER. So one maker job using 1000 cpus via
MPI will have one shared set of locks. 1000 serial instances of MAKER on the
other hand would have 1000x the locks.

Alternatively, if you do need to continue without MPI for some reason, I just
finished a devel version of MAKER that has a --no_locks option. You can
never start two instances using the same input fasta when --no_locks is
specified, but the splitting to use different input fastas I mentioned before
in the example will still work fine.

I have also updated the indexing/reindexing, so if indexing failures happen,
MAKER will switch between the current working directory and the TMP=
directory from the maker_opts.ctl file so as to try different IO locations
(i.e. NFS and non-NFS). Note you should never set TMP= in the control files
to an NFS mounted location (it not only makes things a lot slower, but
BerkeleyDB and SQLite will get frequent errors on NFS).
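
One way to check a candidate TMP= directory before a run is to look up which mount it sits on. The sketch below is an illustration only and not part of MAKER: the function names are hypothetical, it assumes a Linux-style mount table (device, mount point, filesystem type per line, as in /proc/mounts), and it does not resolve symbolic links.

```python
import os


def filesystem_type(path, mounts_text):
    """Return the filesystem type of the mount that contains `path`.

    `mounts_text` is the text of a /proc/mounts-style table:
    device, mount point, type, options, ... with one mount per line.
    Symbolic links in `path` are not resolved (assumption of this sketch).
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # Keep the longest mount point that is a prefix of the path;
        # on ties, the later (overlaying) entry wins.
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


def looks_like_nfs(path, mounts_text):
    """True if `path` sits on a mount whose type starts with 'nfs'."""
    fs_type = filesystem_type(path, mounts_text)
    return fs_type is not None and fs_type.startswith("nfs")
```

On a Linux machine this could be called as `looks_like_nfs("/tmp", open("/proc/mounts").read())`; a True result would suggest picking a different, local directory for TMP=.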
TMP= defaults to /tmp when not specified.

I'll send you download information in a separate e-mail. Try a regular
MAKER run to see if the indexing/reindexing changes are sufficient before
attempting the --no_locks option.

Thanks,
Carson

From: Daniel Hughes
Date: Tuesday, 19 March, 2013 9:33 AM
To: Michael Nuhn
Subject: Re: [maker-devel] master_datastore_index.log file shrinks.]

Daniel S. T. Hughes M.Biochem (Hons; Oxford), Ph.D (Cambridge)
-------------------------------------------------------------------------------------
dsth at cantab.net
dsth at cpan.org

Hi Michael,

You're using the EBI cluster? I have to ask, is this all just a really
elaborate way of avoiding the use of MPI that works perfectly well on both
the ebi and sanger compute farms? if you carry on in the direction you seem
to be going you're likely to end up with a considerable level of unnecessary
overhead and should possibly consider adapting the ensembl genebuild
pipeline to your specific needs.

Dan

> Hello Carson!
>
> On 03/14/2013 04:49 PM, Michael Nuhn wrote:
>>> >> Try dialling back on the number of simultaneous instances you start and
>>> >> instead use MPI or the -cpus option to get the parallelization boost.
>>> >> Alternatively you can also split up the input file and use the -base
>>> >> option so everything gets written to the same place (then you never have
>>> >> to worry about locks affecting individual contigs - as no single instance
>>> >> has access to all the contigs)
>>> >>
>>> >> Example:
>>> >> fasta_tool --chunks 5 maize_assembly.fasta
>>> >> maker -g maize_assembly_0.fasta -base maize_assembly
>>> >> maker -g maize_assembly_1.fasta -base maize_assembly
>>> >> maker -g maize_assembly_2.fasta -base maize_assembly
>>> >> maker -g maize_assembly_3.fasta -base maize_assembly
>>> >> maker -g maize_assembly_4.fasta -base maize_assembly
>>> >> maker -dsindex
>>> >>
>>> >> Everything then gets written to maize_assembly.maker.output for all
>>> >> results.
The last call to maker with the -dsindex flag then rebuilds the >>> >> datastore_index.log file to match the original maize_assembly.fasta file > > I have tried this, split my genome into 50 files and run them as you > suggested above. > > This worked well most of the time, but now I am getting locking issues > again. The working directory gets flooded with STACK.STACK.STACK.STACK > ... files. > > What I think is happening is that for some reason the maker instances > decide that they want to rebuild the index. This takes a lot of time > and this blocks even more instances wanting to lock the index files. > In the end most of the maker instances end up waiting. > > I would like to try the following, but I don't know, if this might > cause problems later on: > > I would like to run all of the split sequence files as separate maker > projects as if they were independent genomes. In the end I'd merge all > the individual gff files using the gff3_merge script. > > Do you see any reason why this wouldn't work? > > Cheers, > Michael. > > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > ----- End forwarded message ----- > _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Tue Mar 19 08:38:00 2013 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 19 Mar 2013 10:38:00 -0400 Subject: [maker-devel] master_datastore_index.log file shrinks.] In-Reply-To: Message-ID: You can also talk to Eleanor Stanley at Sanger, she has a pre-release of MAKER 2.28 already installed and running on the Sanger cluster with OpenMPI. 
Thanks, Carson From: Carson Holt Date: Tuesday, 19 March, 2013 10:27 AM To: Daniel Hughes , Michael Nuhn , Subject: Re: [maker-devel] master_datastore_index.log file shrinks.] Yes. If at all possible use MPI. It removes the overhead of locks which happen per primary instance of MAKER. So one maker job using 1000 cpus via MPI will have one shared set of locks. 1000 serial instances of MAKER on the other hand would have 1000x the locks. Alternatively if you do need to continue without MPI for some reason, I just finished a devel version of MAKER that has a --no_locks option. You can never start two instances using the same input fasta when --no_locks is specified, but the splitting to use different input fastas I mentioned before in the example will still work fine. I also have updated the indexing/reindexing, so if indexing failures happen, MAKER will switch between the current working directory and the TMP= directory from the maker_opts.ctl file so as to try different IO locations (I.e. NFS and non-NFS). Note you should never set TMP= in the control files to an NFS mounted location (it not only makes things a lot slower, but berkleydb and sqllite will get frequent errors on NFS). TMP= defaults to /tmp when not specified I'll send you download information in a separate e-mail. Try a regular MAKER run to see if the indexing/reindexing changes are sufficient before attempting the ?no_locks option. Thanks, Carson From: Daniel Hughes Date: Tuesday, 19 March, 2013 9:33 AM To: Michael Nuhn , Subject: Re: [maker-devel] master_datastore_index.log file shrinks.] Daniel S. T. Hughes M.Biochem (Hons; Oxford), Ph.D (Cambridge) ---------------------------------------------------------------------------- --------- dsth at cantab.net dsth at cpan.org Hi Michael, You're using ebi cluster? i have to ask, is this all just a really elaborate way of avoiding the use of MPI that works perfectly well on both the ebi and sanger compute farms? 
if you carry on in the direction you seem to be going you're likely to end up with a considerable level of unnecessary overhead and should possibly consider adapting the ensembl genebuild pipeline to your specific needs. Dan > Hello Carson! > > On 03/14/2013 04:49 PM, Michael Nuhn wrote: >>> >> Try dialling back on the number of simultaneous instances you start and >>> >> instead use MPI or the -cpus option to get the parallelization boost. >>> >> Alternatively you can also split up the input file and use the -base >>> >> option so everything gets written to the same place (then you never have >>> >> to worry about locks affecting individual contigs - as no single instance >>> >> has access to all the contigs) >>> >> >>> >> Example: >>> >> fasta_tool --chunks 5 maize_assembly.fasta >>> >> maker -g maize_assembly_0.fasta -base maize_assembly >>> >> maker -g maize_assembly_1.fasta -base maize_assembly >>> >> >>> >> maker -g maize_assembly_2.fasta -base maize_assembly >>> >> >>> >> maker -g maize_assembly_3.fasta -base maize_assembly >>> >> >>> >> maker -g maize_assembly_4.fasta -base maize_assembly >>> >> >>> >> maker -dsindex >>> >> >>> >> Everything then gets written to maize_assembly.maker.output for all >>> >> results. The last call to maker with the -dsindex flag then rebuilds the >>> >> datastore_index.log file to match the original maize_assembly.fasta file > > I have tried this, split my genome into 50 files and run them as you > suggested above. > > This worked well most of the time, but now I am getting locking issues > again. The working directory gets flooded with STACK.STACK.STACK.STACK > ... files. > > What I think is happening is that for some reason the maker instances > decide that they want to rebuild the index. This takes a lot of time > and this blocks even more instances wanting to lock the index files. > In the end most of the maker instances end up waiting. 
>
> I would like to try the following, but I don't know, if this might
> cause problems later on:
>
> I would like to run all of the split sequence files as separate maker
> projects as if they were independent genomes. In the end I'd merge all
> the individual gff files using the gff3_merge script.
>
> Do you see any reason why this wouldn't work?
>
> Cheers,
> Michael.
>
>
> ----- End forwarded message -----
>

From carsonhh at gmail.com Tue Mar 19 08:52:19 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 19 Mar 2013 10:52:19 -0400
Subject: [maker-devel] Failing to recoup all gff annotations, including ab-initio
In-Reply-To:
Message-ID:

Ab initio models without evidence support are not considered final models by
default (newly trained ab initio predictors tend to have a very high false
positive rate). If you really want the ab initio models without support to
be upgraded, set keep_preds=1 in the maker_opts.ctl file. All ab initio
models are also stored in the GFF3 as match/match_part features for
reference purposes, not as gene/mRNA/exon/CDS.

Thanks,
Carson

From: "Freeman, Robert M."
Date: Tuesday, 19 March, 2013 9:03 AM
To: "maker-devel at yandell-lab.org"
Subject: [maker-devel] Failing to recoup all gff annotations, including ab-initio

Carson et al.,

Thanks again for a great suite of tools! We're using MAKER now to generate
gene models (and model fragments) for a ciliate, the models for which we'll
be using to generate a high-quality protein database for searches with mass
spec.
I bootstrapped the process using the core set of proteins with CEGMA, then trained SNAP. After the final round of running MAKER, I get about 1100 evidence-based models and 34K ab-initio. And that's fine (for now). I am able to collect the fasta files for both transcripts and proteins (evidence-based and ab-initio) without problem. My problem is that when I use the gff3_merge script, I only get annotations for the evidence-based models. I'm not sure why the ab-initio model annotations are being collected. I've tried using and not the '-g' switch, but this doesn't seem to make a difference. Thoughts? Tx, B ----------------------------------------------------- Bob Freeman, Ph.D. Acorn Worm Informatics, Kirschner lab Dept of Systems Biology, Alpert 524 Harvard Medical School 200 Longwood Avenue Boston, MA 02115 617/432.2294, vox "Sorry I'm late. Oh, God, that sounded insincere. I'm late." -- Karen Walker, from Will and Grace _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnuhn at ebi.ac.uk Tue Mar 19 09:19:25 2013 From: mnuhn at ebi.ac.uk (Michael Nuhn) Date: Tue, 19 Mar 2013 15:19:25 +0000 Subject: [maker-devel] master_datastore_index.log file shrinks.] In-Reply-To: References: Message-ID: <514881FD.4020003@ebi.ac.uk> Hello Carson! On 03/19/2013 02:27 PM, Carson Holt wrote: > Yes. If at all possible use MPI. It removes the overhead of locks > which happen per primary instance of MAKER. So one maker job using 1000 > cpus via MPI will have one shared set of locks. 1000 serial instances > of MAKER on the other hand would have 1000x the locks. I don't know a thing about MPI. I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and open mpi and none of them worked for me. 
I also tried the automatic installation that comes with maker, but it didn't work for me either. If need be, I could spend time getting to the bottom of this, but there is no telling how long this would take me so I'd rather not, if there is an alternative. Would the approach I outlined before work? (Treating the split files as separate genomes to annotate and then combine the gffs afterwards) I also like this approach, because I would select a few contigs in the beginning which I would run on their own. They would complete early and this way I would get a preview of the results of the run instead of having to wait for everything to complete. It might also be more robust, because file locking issues would be confined to the instances working on a sequence chunk, but the rest of the instances could continue working. Cheers, Michael. > Alternatively if you do need to continue without MPI for some reason, I > just finished a devel version of MAKER that has a --no_locks option. > You can never start two instances using the same input fasta when > --no_locks is specified, but the splitting to use different input fastas > I mentioned before in the example will still work fine. > > I also have updated the indexing/reindexing, so if indexing failures > happen, MAKER will switch between the current working directory and the > TMP= directory from the maker_opts.ctl file so as to try different IO > locations (I.e. NFS and non-NFS). Note you should never set TMP= in the > control files to an NFS mounted location (it not only makes things a lot > slower, but berkleydb and sqllite will get frequent errors on NFS). > TMP= defaults to /tmp when not specified > > I'll send you download information in a separate e-mail. Try a regular > MAKER run to see if the indexing/reindexing changes are sufficient > before attempting the ?no_locks option. 
> > Thanks, > Carson From carsonhh at gmail.com Tue Mar 19 09:02:22 2013 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 19 Mar 2013 11:02:22 -0400 Subject: [maker-devel] master_datastore_index.log file shrinks.] In-Reply-To: <514881FD.4020003@ebi.ac.uk> Message-ID: Try it with the no_locks option then. Make sure to let one instance finish populating the mpi_blastdb directory before running other instances as that is where most initial locking occurs. I'll send you more details on how to install with OpenMPI, so you can give that a shot while your jobs are also running serially (so you don't lose time). Also instead of 50 serial instances, you could try 10 with -cpus set to 5. Thanks, Carson On 13-03-19 11:19 AM, "Michael Nuhn" wrote: >Hello Carson! > >On 03/19/2013 02:27 PM, Carson Holt wrote: >> Yes. If at all possible use MPI. It removes the overhead of locks >> which happen per primary instance of MAKER. So one maker job using 1000 >> cpus via MPI will have one shared set of locks. 1000 serial instances >> of MAKER on the other hand would have 1000x the locks. > >I don't know a thing about MPI. > >I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and open >mpi and none of them worked for me. I also tried the automatic >installation that comes with maker, but it didn't work for me either. > >If need be, I could spend time getting to the bottom of this, but there >is no telling how long this would take me so I'd rather not, if there is >an alternative. > >Would the approach I outlined before work? (Treating the split files as >separate genomes to annotate and then combine the gffs afterwards) > >I also like this approach, because I would select a few contigs in the >beginning which I would run on their own. They would complete early and >this way I would get a preview of the results of the run instead of >having to wait for everything to complete. 
> >It might also be more robust, because file locking issues would be >confined to the instances working on a sequence chunk, but the rest of >the instances could continue working. > >Cheers, >Michael. > >> Alternatively if you do need to continue without MPI for some reason, I >> just finished a devel version of MAKER that has a --no_locks option. >> You can never start two instances using the same input fasta when >> --no_locks is specified, but the splitting to use different input fastas >> I mentioned before in the example will still work fine. >> >> I also have updated the indexing/reindexing, so if indexing failures >> happen, MAKER will switch between the current working directory and the >> TMP= directory from the maker_opts.ctl file so as to try different IO >> locations (I.e. NFS and non-NFS). Note you should never set TMP= in the >> control files to an NFS mounted location (it not only makes things a lot >> slower, but berkleydb and sqllite will get frequent errors on NFS). >> TMP= defaults to /tmp when not specified >> >> I'll send you download information in a separate e-mail. Try a regular >> MAKER run to see if the indexing/reindexing changes are sufficient >> before attempting the ?no_locks option. >> >> Thanks, >> Carson > From dsth at ebi.ac.uk Tue Mar 19 09:13:51 2013 From: dsth at ebi.ac.uk (Daniel Hughes) Date: Tue, 19 Mar 2013 15:13:51 +0000 Subject: [maker-devel] master_datastore_index.log file shrinks.] In-Reply-To: <514881FD.4020003@ebi.ac.uk> References: <514881FD.4020003@ebi.ac.uk> Message-ID: You really don't need to know anything about MPI. While MPI is itself pretty complex, I seem to recall maker uses the p2p subset alone mainly to send serialised perl objects as c strings etc., for IPC across ad hoc infrastructure - but none of that is relevant as Carson has done all the IPC debugging for you and its use should be transparent. 
If it's failing, it's almost certainly because you've got discrepancies between the mpi libraries visible at compile-time vs. run-time and you may need to force the dynamic linker to behave itself. The only other caveat on ebi infrastructure i can think of off the top of my head relates to cross-node MPI usage when going into the hundreds of processes but i'm assuming you're not doing that? You need to be more specific about how it's failing.

dan

from me phone...
On Mar 19, 2013 11:55 AM, "Michael Nuhn" wrote:
> Hello Carson!
>
> On 03/19/2013 02:27 PM, Carson Holt wrote:
>
>> Yes. If at all possible use MPI. It removes the overhead of locks
>> which happen per primary instance of MAKER. So one maker job using 1000
>> cpus via MPI will have one shared set of locks. 1000 serial instances
>> of MAKER on the other hand would have 1000x the locks.
>>
>
> I don't know a thing about MPI.
>
> I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and open
> mpi and none of them worked for me. I also tried the automatic installation
> that comes with maker, but it didn't work for me either.
>
> If need be, I could spend time getting to the bottom of this, but there is
> no telling how long this would take me so I'd rather not, if there is an
> alternative.
>
> Would the approach I outlined before work? (Treating the split files as
> separate genomes to annotate and then combine the gffs afterwards)
>
> I also like this approach, because I would select a few contigs in the
> beginning which I would run on their own. They would complete early and
> this way I would get a preview of the results of the run instead of having
> to wait for everything to complete.
>
> It might also be more robust, because file locking issues would be
> confined to the instances working on a sequence chunk, but the rest of the
> instances could continue working.
>
> Cheers,
> Michael.
>
> Alternatively if you do need to continue without MPI for some reason, I
>> just finished a devel version of MAKER that has a --no_locks option.
>> You can never start two instances using the same input fasta when
>> --no_locks is specified, but the splitting to use different input fastas
>> I mentioned before in the example will still work fine.
>>
>> I also have updated the indexing/reindexing, so if indexing failures
>> happen, MAKER will switch between the current working directory and the
>> TMP= directory from the maker_opts.ctl file so as to try different IO
>> locations (i.e. NFS and non-NFS). Note you should never set TMP= in the
>> control files to an NFS mounted location (it not only makes things a lot
>> slower, but BerkeleyDB and SQLite will get frequent errors on NFS).
>> TMP= defaults to /tmp when not specified
>>
>> I'll send you download information in a separate e-mail. Try a regular
>> MAKER run to see if the indexing/reindexing changes are sufficient
>> before attempting the --no_locks option.
>>
>> Thanks,
>> Carson
>>
>
>

From carsonhh at gmail.com Tue Mar 19 09:22:22 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 19 Mar 2013 11:22:22 -0400
Subject: [maker-devel] master_datastore_index.log file shrinks.]
In-Reply-To:
Message-ID:

I have MAKER working under OpenMPI 1.4.3 (intel compiled).

I had to set a couple of environmental variables prior to setup. You would probably need to set these values as well. If your OpenMPI path was, for example, /software/openmpi-1.4.3/, run the following commands (path set accordingly) before even attempting maker setup.

export OMPI_MCA_mpi_warn_on_fork=0
export LD_PRELOAD=/software/openmpi-1.4.3/lib/libmpi.so:$LD_PRELOAD

These not only need to be set before compilation, but also before any run (so add them to your ~/.bashrc or ~/.bash_profile, or to any module load scripts).
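A minimal sketch of how those two settings could look in ~/.bashrc, using bash's `export NAME=value` form. It assumes the example prefix /software/openmpi-1.4.3 from the message; the OPENMPI_HOME variable and the file-exists guard are illustrative additions, not something MAKER or OpenMPI requires.

```shell
# Hypothetical variable naming the OpenMPI install prefix; adjust to your site.
OPENMPI_HOME="${OPENMPI_HOME:-/software/openmpi-1.4.3}"

# Silence OpenMPI's warning when a process calls fork().
export OMPI_MCA_mpi_warn_on_fork=0

# Preload libmpi.so only if it exists, so a wrong path does not make every
# later command print ld.so preload errors.
if [ -f "$OPENMPI_HOME/lib/libmpi.so" ]; then
    # Prepend the library; append the old value only if one was already set.
    export LD_PRELOAD="$OPENMPI_HOME/lib/libmpi.so${LD_PRELOAD:+:$LD_PRELOAD}"
fi
```

After sourcing the file, `echo "$LD_PRELOAD"` should list the libmpi.so path first if the library was found at that prefix.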
The LD_PRELOAD statement needs to be set for any program using OpenMPI's shared libraries and not just MAKER, so it's normally a good idea to have that set system wide for all users. The detail can be found in the OpenMPI documentation. Note sometimes system library updates can break OpenMPI's shared libraries while not breaking OpenMPI itself, so you might also need to recompile OpenMPI if it has broken shared libraries.

Once you have those commands in place, run the perl Build.PL step. Say yes to install with MPI. Then run ./Build install

Thanks,
Carson

On 13-03-19 11:02 AM, "Carson Holt" wrote:

>Try it with the no_locks option then. Make sure to let one instance
>finish populating the mpi_blastdb directory before running other
>instances
>as that is where most initial locking occurs.
>
>I'll send you more details on how to install with OpenMPI, so you can
>give
>that a shot while your jobs are also running serially (so you don't lose
>time). Also instead of 50 serial instances, you could try 10 with -cpus
>set to 5.
>
>Thanks,
>Carson
>
>
>
>On 13-03-19 11:19 AM, "Michael Nuhn" wrote:
>
>>Hello Carson!
>>
>>On 03/19/2013 02:27 PM, Carson Holt wrote:
>>> Yes. If at all possible use MPI. It removes the overhead of locks
>>> which happen per primary instance of MAKER. So one maker job using
>>>1000
>>> cpus via MPI will have one shared set of locks. 1000 serial instances
>>> of MAKER on the other hand would have 1000x the locks.
>>
>>I don't know a thing about MPI.
>>
>>I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and open
>>mpi and none of them worked for me. I also tried the automatic
>>installation that comes with maker, but it didn't work for me either.
>>
>>If need be, I could spend time getting to the bottom of this, but there
>>is no telling how long this would take me so I'd rather not, if there is
>>an alternative.
>>
>>Would the approach I outlined before work?
(Treating the split files as >>separate genomes to annotate and then combine the gffs afterwards) >> >>I also like this approach, because I would select a few contigs in the >>beginning which I would run on their own. They would complete early and >>this way I would get a preview of the results of the run instead of >>having to wait for everything to complete. >> >>It might also be more robust, because file locking issues would be >>confined to the instances working on a sequence chunk, but the rest of >>the instances could continue working. >> >>Cheers, >>Michael. >> >>> Alternatively if you do need to continue without MPI for some reason, I >>> just finished a devel version of MAKER that has a --no_locks option. >>> You can never start two instances using the same input fasta when >>> --no_locks is specified, but the splitting to use different input >>>fastas >>> I mentioned before in the example will still work fine. >>> >>> I also have updated the indexing/reindexing, so if indexing failures >>> happen, MAKER will switch between the current working directory and the >>> TMP= directory from the maker_opts.ctl file so as to try different IO >>> locations (I.e. NFS and non-NFS). Note you should never set TMP= in >>>the >>> control files to an NFS mounted location (it not only makes things a >>>lot >>> slower, but berkleydb and sqllite will get frequent errors on NFS). >>> TMP= defaults to /tmp when not specified >>> >>> I'll send you download information in a separate e-mail. Try a regular >>> MAKER run to see if the indexing/reindexing changes are sufficient >>> before attempting the ?no_locks option. >>> >>> Thanks, >>> Carson From dsth at ebi.ac.uk Tue Mar 19 09:26:02 2013 From: dsth at ebi.ac.uk (Daniel Hughes) Date: Tue, 19 Mar 2013 15:26:02 +0000 Subject: [maker-devel] master_datastore_index.log file shrinks.] 
In-Reply-To: References: <514881FD.4020003@ebi.ac.uk> Message-ID: oh and (1) it will work as long as evidence etc., is synchronous, (2) it will be really inefficient - be glad ebi doesn't use a by group compute time fair-share policy ;) Dan from me phone... On Mar 19, 2013 12:13 PM, "Daniel Hughes" wrote: > You really don't need to know anything about MPI. While MPI is itself > pretty complex, I seem to recall maker uses the p2p subset alone mainly to > send serialised perl objects as c strings etc., for IPC across ad hoc > infrastructure - but none of that is relevant as Carson has done all the > IPC debugging for you and its use should be transparent. If it's failing, > its almost certainly because you've got discrepencies between the mpi > libraries visible at compile-time vs. run-time and you may need to force > the dynamic linker to behave itself. The only other caveat on ebi > infrastructure i can think of off the top of my head relates to cross-node > MPI usage when going into the hundreds of processes but i'm assuming you > not doing that? You need to be more specific about how it's failing. > > dan > > from me phone... > On Mar 19, 2013 11:55 AM, "Michael Nuhn" wrote: > >> Hello Carson! >> >> On 03/19/2013 02:27 PM, Carson Holt wrote: >> >>> Yes. If at all possible use MPI. It removes the overhead of locks >>> which happen per primary instance of MAKER. So one maker job using 1000 >>> cpus via MPI will have one shared set of locks. 1000 serial instances >>> of MAKER on the other hand would have 1000x the locks. >>> >> >> I don't know a thing about MPI. >> >> I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and open >> mpi and none of them worked for me. I also tried the automatic installation >> that comes with maker, but it didn't work for me either. >> >> If need be, I could spend time getting to the bottom of this, but there >> is no telling how long this would take me so I'd rather not, if there is an >> alternative. 
>> >> Would the approach I outlined before work? (Treating the split files as >> separate genomes to annotate and then combine the gffs afterwards) >> >> I also like this approach, because I would select a few contigs in the >> beginning which I would run on their own. They would complete early and >> this way I would get a preview of the results of the run instead of having >> to wait for everything to complete. >> >> It might also be more robust, because file locking issues would be >> confined to the instances working on a sequence chunk, but the rest of the >> instances could continue working. >> >> Cheers, >> Michael. >> >> Alternatively if you do need to continue without MPI for some reason, I >>> just finished a devel version of MAKER that has a --no_locks option. >>> You can never start two instances using the same input fasta when >>> --no_locks is specified, but the splitting to use different input fastas >>> I mentioned before in the example will still work fine. >>> >>> I also have updated the indexing/reindexing, so if indexing failures >>> happen, MAKER will switch between the current working directory and the >>> TMP= directory from the maker_opts.ctl file so as to try different IO >>> locations (I.e. NFS and non-NFS). Note you should never set TMP= in the >>> control files to an NFS mounted location (it not only makes things a lot >>> slower, but berkleydb and sqllite will get frequent errors on NFS). >>> TMP= defaults to /tmp when not specified >>> >>> I'll send you download information in a separate e-mail. Try a regular >>> MAKER run to see if the indexing/reindexing changes are sufficient >>> before attempting the ?no_locks option. >>> >>> Thanks, >>> Carson >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnuhn at ebi.ac.uk Tue Mar 19 09:54:34 2013 From: mnuhn at ebi.ac.uk (Michael Nuhn) Date: Tue, 19 Mar 2013 15:54:34 +0000 Subject: [maker-devel] master_datastore_index.log file shrinks.] 
In-Reply-To: References: Message-ID: <51488A3A.20106@ebi.ac.uk> Hello Carson! Thanks for the pointers. I'll give mpi another shot. Cheers, Michael. On 03/19/2013 03:22 PM, Carson Holt wrote: > I have MAKER working under OpemnMPI 1.4.3 (intel compiled). > > I had to set a couple of environmental variables prior to setup. You would > probably need to set these values as well. If you your OpenMPI path was > here for example --> /software/openmpi-1.4.3/, run the following commands > (path set accordingly) before even attempting maker setup. > > export OMPI_MCA_mpi_warn_on_fork 0 > export LD_PRELOAD /software/openmpi-1.4.3/lib/libmpi.so:$LD_PRELOAD > > These not only need to be set before compilation, but also before any run > (so add them to you ~.bashrc or ~/.bash_profile or any module load scripts > thanks). The LD_PRELOAD statement needs to be set for any program using > OpenMPI's shared libraries and not just MAKER, so it's normally a good > idea to have that set system wide for all users. The detail can be found > in the OpenMPI documentation. Note sometimes system library updates can > break OpenMPI's shared libraries while not breaking OpenMPI itself, so you > might also need to recompile OpenMPI if it has broken shared libraries. > > Once you have those commands in place, run the perl Buil.PL step. Say yes > to install with MPI. Then run ./Build install > > Thanks, > Carson > > > > On 13-03-19 11:02 AM, "Carson Holt" wrote: > >> Try it with the no_locks option then. Make sure to let one instance >> finish populating the mpi_blastdb directory before running other >> instances >> as that is where most initial locking occurs. >> >> I'll send you more details on how to install with OpenMPI, so you can >> give >> that a shot while your jobs are also running serially (so you don't lose >> time). Also instead of 50 serial instances, you could try 10 with -cpus >> set to 5. 
>> >> Thanks, >> Carson >> >> >> >> On 13-03-19 11:19 AM, "Michael Nuhn" wrote: >> >>> Hello Carson! >>> >>> On 03/19/2013 02:27 PM, Carson Holt wrote: >>>> Yes. If at all possible use MPI. It removes the overhead of locks >>>> which happen per primary instance of MAKER. So one maker job using >>>> 1000 >>>> cpus via MPI will have one shared set of locks. 1000 serial instances >>>> of MAKER on the other hand would have 1000x the locks. >>> >>> I don't know a thing about MPI. >>> >>> I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and open >>> mpi and none of them worked for me. I also tried the automatic >>> installation that comes with maker, but it didn't work for me either. >>> >>> If need be, I could spend time getting to the bottom of this, but there >>> is no telling how long this would take me so I'd rather not, if there is >>> an alternative. >>> >>> Would the approach I outlined before work? (Treating the split files as >>> separate genomes to annotate and then combine the gffs afterwards) >>> >>> I also like this approach, because I would select a few contigs in the >>> beginning which I would run on their own. They would complete early and >>> this way I would get a preview of the results of the run instead of >>> having to wait for everything to complete. >>> >>> It might also be more robust, because file locking issues would be >>> confined to the instances working on a sequence chunk, but the rest of >>> the instances could continue working. >>> >>> Cheers, >>> Michael. >>> >>>> Alternatively if you do need to continue without MPI for some reason, I >>>> just finished a devel version of MAKER that has a --no_locks option. >>>> You can never start two instances using the same input fasta when >>>> --no_locks is specified, but the splitting to use different input >>>> fastas >>>> I mentioned before in the example will still work fine. 
>>>> >>>> I also have updated the indexing/reindexing, so if indexing failures >>>> happen, MAKER will switch between the current working directory and the >>>> TMP= directory from the maker_opts.ctl file so as to try different IO >>>> locations (I.e. NFS and non-NFS). Note you should never set TMP= in >>>> the >>>> control files to an NFS mounted location (it not only makes things a >>>> lot >>>> slower, but berkleydb and sqllite will get frequent errors on NFS). >>>> TMP= defaults to /tmp when not specified >>>> >>>> I'll send you download information in a separate e-mail. Try a regular >>>> MAKER run to see if the indexing/reindexing changes are sufficient >>>> before attempting the ?no_locks option. >>>> >>>> Thanks, >>>> Carson > > From es9 at sanger.ac.uk Tue Mar 19 09:40:08 2013 From: es9 at sanger.ac.uk (Eleanor Stanley) Date: Tue, 19 Mar 2013 15:40:08 +0000 Subject: [maker-devel] master_datastore_index.log file shrinks.] In-Reply-To: <51488A3A.20106@ebi.ac.uk> References: <51488A3A.20106@ebi.ac.uk> Message-ID: For the Sanger farm I have a wrapper script to run MPI maker so that the same environmental variables are forced to all nodes. Eleanor On 19 Mar 2013, at 15:54, Michael Nuhn wrote: > Hello Carson! > > Thanks for the pointers. I'll give mpi another shot. > > Cheers, > Michael. > > On 03/19/2013 03:22 PM, Carson Holt wrote: >> I have MAKER working under OpemnMPI 1.4.3 (intel compiled). >> >> I had to set a couple of environmental variables prior to setup. You would >> probably need to set these values as well. If you your OpenMPI path was >> here for example --> /software/openmpi-1.4.3/, run the following commands >> (path set accordingly) before even attempting maker setup. >> >> export OMPI_MCA_mpi_warn_on_fork 0 >> export LD_PRELOAD /software/openmpi-1.4.3/lib/libmpi.so:$LD_PRELOAD >> >> These not only need to be set before compilation, but also before any run >> (so add them to you ~.bashrc or ~/.bash_profile or any module load scripts >> thanks). 
The LD_PRELOAD statement needs to be set for any program using >> OpenMPI's shared libraries and not just MAKER, so it's normally a good >> idea to have that set system wide for all users. The detail can be found >> in the OpenMPI documentation. Note sometimes system library updates can >> break OpenMPI's shared libraries while not breaking OpenMPI itself, so you >> might also need to recompile OpenMPI if it has broken shared libraries. >> >> Once you have those commands in place, run the perl Buil.PL step. Say yes >> to install with MPI. Then run ./Build install >> >> Thanks, >> Carson >> >> >> >> On 13-03-19 11:02 AM, "Carson Holt" wrote: >> >>> Try it with the no_locks option then. Make sure to let one instance >>> finish populating the mpi_blastdb directory before running other >>> instances >>> as that is where most initial locking occurs. >>> >>> I'll send you more details on how to install with OpenMPI, so you can >>> give >>> that a shot while your jobs are also running serially (so you don't lose >>> time). Also instead of 50 serial instances, you could try 10 with -cpus >>> set to 5. >>> >>> Thanks, >>> Carson >>> >>> >>> >>> On 13-03-19 11:19 AM, "Michael Nuhn" wrote: >>> >>>> Hello Carson! >>>> >>>> On 03/19/2013 02:27 PM, Carson Holt wrote: >>>>> Yes. If at all possible use MPI. It removes the overhead of locks >>>>> which happen per primary instance of MAKER. So one maker job using >>>>> 1000 >>>>> cpus via MPI will have one shared set of locks. 1000 serial instances >>>>> of MAKER on the other hand would have 1000x the locks. >>>> >>>> I don't know a thing about MPI. >>>> >>>> I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and open >>>> mpi and none of them worked for me. I also tried the automatic >>>> installation that comes with maker, but it didn't work for me either. 
>>>> >>>> If need be, I could spend time getting to the bottom of this, but there >>>> is no telling how long this would take me so I'd rather not, if there is >>>> an alternative. >>>> >>>> Would the approach I outlined before work? (Treating the split files as >>>> separate genomes to annotate and then combine the gffs afterwards) >>>> >>>> I also like this approach, because I would select a few contigs in the >>>> beginning which I would run on their own. They would complete early and >>>> this way I would get a preview of the results of the run instead of >>>> having to wait for everything to complete. >>>> >>>> It might also be more robust, because file locking issues would be >>>> confined to the instances working on a sequence chunk, but the rest of >>>> the instances could continue working. >>>> >>>> Cheers, >>>> Michael. >>>> >>>>> Alternatively if you do need to continue without MPI for some reason, I >>>>> just finished a devel version of MAKER that has a --no_locks option. >>>>> You can never start two instances using the same input fasta when >>>>> --no_locks is specified, but the splitting to use different input >>>>> fastas >>>>> I mentioned before in the example will still work fine. >>>>> >>>>> I also have updated the indexing/reindexing, so if indexing failures >>>>> happen, MAKER will switch between the current working directory and the >>>>> TMP= directory from the maker_opts.ctl file so as to try different IO >>>>> locations (I.e. NFS and non-NFS). Note you should never set TMP= in >>>>> the >>>>> control files to an NFS mounted location (it not only makes things a >>>>> lot >>>>> slower, but berkleydb and sqllite will get frequent errors on NFS). >>>>> TMP= defaults to /tmp when not specified >>>>> >>>>> I'll send you download information in a separate e-mail. Try a regular >>>>> MAKER run to see if the indexing/reindexing changes are sufficient >>>>> before attempting the ?no_locks option. 
>>>>>
>>>>> Thanks,
>>>>> Carson
>>
>>
>
>

--
The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.

From Bob_Freeman at hms.harvard.edu Tue Mar 19 10:18:11 2013
From: Bob_Freeman at hms.harvard.edu (Freeman, Robert M.)
Date: Tue, 19 Mar 2013 12:18:11 -0400
Subject: [maker-devel] Failing to recoup all gff annotations, including ab-initio
In-Reply-To:
References:
Message-ID: <06F15FF0-2384-4BDD-AD9B-9C1D0AB6370C@hms.harvard.edu>

Thanks, Carson. This explains the behavior I saw and will help us moving forward.

Best,
Bob

On Mar 19, 2013, at 10:52 AM, Carson Holt wrote:

Ab initio models without evidence support are not considered final models by default (newly trained ab initio predictors tend to have a very high false positive rate). If you really want the ab initio models without support to be upgraded, set keep_preds=1 in the maker_opts.ctl file. All ab initio models are also stored in the GFF3 as match/match_part features for reference purposes, not as gene/mRNA/exon/CDS features.

Thanks,
Carson

From: "Freeman, Robert M."
Date: Tuesday, 19 March, 2013 9:03 AM
To: "maker-devel at yandell-lab.org"
Subject: [maker-devel] Failing to recoup all gff annotations, including ab-initio

Carson et al.,

Thanks again for a great suite of tools! We're using MAKER now to generate gene models (and model fragments) for a ciliate, the models for which we'll be using to generate a high-quality protein database for searches with mass spec. I bootstrapped the process using the core set of proteins with CEGMA, then trained SNAP. After the final round of running MAKER, I get about 1100 evidence-based models and 34K ab-initio.
And that's fine (for now). I am able to collect the fasta files for both transcripts and proteins (evidence-based and ab-initio) without problem. My problem is that when I use the gff3_merge script, I only get annotations for the evidence-based models. I'm not sure why the ab-initio model annotations aren't being collected. I've tried with and without the '-g' switch, but this doesn't seem to make a difference.

Thoughts?

Tx,
B

-----------------------------------------------------
Bob Freeman, Ph.D.
Acorn Worm Informatics, Kirschner lab
Dept of Systems Biology, Alpert 524
Harvard Medical School
200 Longwood Avenue
Boston, MA 02115
617/432.2294, vox

"Sorry I'm late. Oh, God, that sounded insincere. I'm late."
-- Karen Walker, from Will and Grace

From cjfields at illinois.edu Tue Mar 19 13:04:18 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 19 Mar 2013 19:04:18 +0000
Subject: [maker-devel] Alternative start codons
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF74DA507D@CHIMBX5.ad.uillinois.edu>

We had a user notice that MAKER is not observing alternative start codons for bacterial genomes. For instance, this predicted transcript:

>Xf_Mul_000007-RA transcript Name:"Protein of unknown function" offset:79 AED:0.42 eAED:1.00 QI:79|-1|0|1|-1|1|1|20|24
GTGGGATACAGGCCGCTGATCGCTGATGGCGCGTACCTGAAACTGCTGCTGGACTACTAC
GTTACAGTGCAGCCTTTGCATGCCGATTGGAAAGATCTATATATCATCGCTTGCGCTATT
ACAGCGGCTAAAAAGAGTCTTCAATTTGGCGTAATTCAGTCATTGGCGGGGTAG

Yields this protein sequence.

>Xf_Mul_000007-RA protein AED:0.42 eAED:1.00 QI:79|-1|0|1|-1|1|1|20|24
MPIGKIYISSLALLQRLKRVFNLA

I'm pretty sure I know what is going on, namely that MAKER is treating the 5' end as UTR and looking for the first ATG (there is one in the sequence above). Is there any way to change this behavior, though? For instance, allow alternative start codons like GTG/TTG?

chris

From hudarul at yahoo.com Tue Mar 19 13:08:55 2013
From: hudarul at yahoo.com (Hud Hud)
Date: Tue, 19 Mar 2013 12:08:55 -0700 (PDT)
Subject: [maker-devel] Maker-no such file or directory
In-Reply-To:
References: <1363637601.24386.YahooMailNeo@web164901.mail.bf1.yahoo.com>
Message-ID: <1363720135.24498.YahooMailNeo@web164901.mail.bf1.yahoo.com>

Hello everyone

I have some queries. I can't run MAKER locally, so can I use MWAS on my contigs? Since my contigs are too long to be run on MWAS, is it possible to combine the results after I upload and run the analysis on my contigs separately?

________________________________
From: Carson Holt
To: Hud Hud ; "maker-devel at yandell-lab.org"
Sent: Tuesday, March 19, 2013 4:44 AM
Subject: Re: [maker-devel] Maker-no such file or directory

Does 'ls -al $home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta' show a valid location? The error is just saying that the file location as written in the maker_opts.ctl file does not exist.

--Carson

From: Hud Hud
Reply-To: Hud Hud
Date: Monday, 18 March, 2013 4:13 PM
To: "maker-devel at yandell-lab.org"
Subject: [maker-devel] Maker-no such file or directory

I have some problem with maker
1. i try to work with the example data in data directory, but im having this kind of error..anyone can help me

error
$ maker
STATUS: Parsing control files...
dpp_contig.fasta (fasta file or fasta embeded in GFF3 file): No such file or directory at /home/Dorah/maker-2.27-beta/maker/bin/../lib/GI.pm line 186 --> rank=NA, hostname=NurKaiyisah my maker_opts.ctl genome=$home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta est=$home/Dorah/maker-2.27-beta/maker/data/dpp_est.fasta protein=$home/Dorah/maker-2.27-beta/maker/data/dpp_protein.fasta _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Tue Mar 19 13:30:09 2013 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 19 Mar 2013 15:30:09 -0400 Subject: [maker-devel] Maker-no such file or directory In-Reply-To: <1363720135.24498.YahooMailNeo@web164901.mail.bf1.yahoo.com> Message-ID: You can. It will be very slow though as MWAS only dedicates a single cpu per job. So with a 5Mb max per job submission it could take a very long time depending on the size of the assembly (emphasis on very long). --Carson From: Hud Hud Reply-To: Hud Hud Date: Tuesday, 19 March, 2013 3:08 PM To: "maker-devel at yandell-lab.org" Subject: Re: [maker-devel] Maker-no such file or directory Hello everyone I have some queries, i cant run MAKER locally, so can i use MWAS on my contigs, but since my contigs too long to be run on MWAS, is it possible to combine the results after i upload and run the analysis on my contigs separately... From: Carson Holt To: Hud Hud ; "maker-devel at yandell-lab.org" Sent: Tuesday, March 19, 2013 4:44 AM Subject: Re: [maker-devel] Maker-no such file or directory Does 'ls -al $home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta' show a valid location? The error is just saying that the file location as written in the maker_opts.ctl file does not exist. 
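A minimal sketch of the check suggested above, applied to every input path in the control file at once. The function name check_ctl_paths is hypothetical (it is not part of MAKER), and the sketch assumes each of genome=, est= and protein= holds a single path. It tests the path exactly as written: in bash the home directory variable is `$HOME`, and `$home` is normally unset; whether MAKER expands shell variables inside control files is not stated in this thread.

```shell
# Hypothetical helper, not part of MAKER: report whether each input file
# named in a control file exists. Assumes one path per option.
check_ctl_paths() {
    # Keep only the genome=, est= and protein= lines and drop trailing "#comment" text.
    grep -E '^(genome|est|protein)=' "$1" | sed 's/[[:space:]]*#.*$//' |
    while IFS='=' read -r key value; do
        # Options left empty in the control file are skipped.
        [ -n "$value" ] || continue
        if [ -e "$value" ]; then
            echo "OK      $key=$value"
        else
            echo "MISSING $key=$value"
        fi
    done
}
```

Running `check_ctl_paths maker_opts.ctl` in the directory holding the control files prints one line per configured input; any MISSING line points at a path to correct before rerunning maker.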
--Carson From: Hud Hud Reply-To: Hud Hud Date: Monday, 18 March, 2013 4:13 PM To: "maker-devel at yandell-lab.org" Subject: [maker-devel] Maker-no such file or directory I have some problem with maker 1. i try to work with the example data in data directory, but im having this kind of error..anyone can help me error $ maker STATUS: Parsing control files... dpp_contig.fasta (fasta file or fasta embeded in GFF3 file): No such file or directory at /home/Dorah/maker-2.27-beta/maker/bin/../lib/GI.pm line 186 --> rank=NA, hostname=NurKaiyisah my maker_opts.ctl genome=$home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta est=$home/Dorah/maker-2.27-beta/maker/data/dpp_est.fasta protein=$home/Dorah/maker-2.27-beta/maker/data/dpp_protein.fasta _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Tue Mar 19 13:33:46 2013 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 19 Mar 2013 15:33:46 -0400 Subject: [maker-devel] Alternative start codons In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF74DA507D@CHIMBX5.ad.uillinois.edu> Message-ID: It could be changed. I imagine that this is a protein2genome or est2genome gene, as MAKER won't try and determine by itself the start and end if it comes from a gene predictor. --Carson On 13-03-19 3:04 PM, "Fields, Christopher J" wrote: >We had a user notice that MAKER is not observing alternative start codons >for bacterial genomes. 
For instance, this predicted transcript:
>
>>Xf_Mul_000007-RA transcript Name:"Protein of unknown function" offset:79
>>AED:0.42 eAED:1.00 QI:79|-1|0|1|-1|1|1|20|24
>GTGGGATACAGGCCGCTGATCGCTGATGGCGCGTACCTGAAACTGCTGCTGGACTACTAC
>GTTACAGTGCAGCCTTTGCATGCCGATTGGAAAGATCTATATATCATCGCTTGCGCTATT
>ACAGCGGCTAAAAAGAGTCTTCAATTTGGCGTAATTCAGTCATTGGCGGGGTAG
>
>Yields this protein sequence.
>
>>Xf_Mul_000007-RA protein AED:0.42 eAED:1.00 QI:79|-1|0|1|-1|1|1|20|24
>MPIGKIYISSLALLQRLKRVFNLA
>
>I'm pretty sure I know what is going on, namely that MAKER is treating
>the 5' end as UTR and looking for the first ATG (there is one in the
>sequence above). Is there any way to change this behavior, though? For
>instance, allow alternative start codons like GTG/TTG?
>
>chris
>_______________________________________________
>maker-devel mailing list
>maker-devel at box290.bluehost.com
>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From myandell at genetics.utah.edu Tue Mar 19 18:02:36 2013
From: myandell at genetics.utah.edu (Mark Yandell)
Date: Wed, 20 Mar 2013 00:02:36 +0000
Subject: [maker-devel] Maker2 gff file output
In-Reply-To:
References:
Message-ID: <7A60AB257EFF2B48B1F4C814817EA05350EDC688@mxb2.hg.genetics.utah.edu>

Hi Blake,

I've forwarded this on to the maker-devel list; they can help you more there.

Regarding your comment 'When I view the output of many contigs in Apollo, there is many times where 3 or 4 models show close to identical gene structure, but the final maker output does not contain that gene call.': those calls are in the output files, but they are in a different multi-FASTA file; they are the non-overlapping ab initio models. Another approach is to set the config flag that allows MAKER to use unspliced EST and RNA-seq alignments as evidence.

cheers,

--mark

Mark Yandell
Professor of Human Genetics
H.A.
& Edna Benning Presidential Endowed Chair Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:801-587-7707 ________________________________________ From: Blake Hovde [hovdebt at uw.edu] Sent: Tuesday, March 19, 2013 2:35 PM To: Mark Yandell Subject: Maker2 gff file output Hi Dr. Yandell, I am currently running MAKER2 on a new algal genome and am running into a couple of problems that I would like your input on the genome size is ~60Mb and is currently in ~3100 contigs. First, I am having trouble doing multiple iterations of hmm training with SNAP due to the fact that I have so many gff output files in the datastore (1 for each contig in my draft genome). not just a single gff output that seems to be in the examples and tutorials I have followed thus far. Is there a way to combine all of my gff files together to make use of the SNAP hmm training or re-annotation? Second, Using multiple lines of evidence (augustus, genemarkES, RNAseq data, and COGs based on homology searches) I am having a hard time getting a lot of maker gene calls. It seems that the calling is too stringent in many cases. When I view the output of many contigs in Apollo, there is many times where 3 or 4 models show close to identical gene structure, but the final maker output does not contain that gene call. Do you have any suggestions on how to lower the stringency of the MAKER output so that more genes will be called? In some cases I am getting less than 3000 gene calls in the final output. Where an Augustus model trained on Chlamydamonas will return ~15000. Thanks very much for your help! 
Sincerely,
Blake Hovde
Graduate Student
Department of Genome Sciences
University of Washington

From carsonhh at gmail.com Tue Mar 19 20:43:44 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 19 Mar 2013 22:43:44 -0400
Subject: [maker-devel] Maker2 gff file output
In-Reply-To: <7A60AB257EFF2B48B1F4C814817EA05350EDC688@mxb2.hg.genetics.utah.edu>
Message-ID:

>I am currently running MAKER2 on a new algal genome and am running
>into a couple of problems that I would like your input on the genome
>size is ~60Mb and is currently in ~3100 contigs.
>First, I am having trouble doing multiple iterations of hmm training
>with SNAP due to the fact that I have so many gff output files in the
>datastore (1 for each contig in my draft genome). not just a single
>gff output that seems to be in the examples and tutorials I have
>followed thus far. Is there a way to combine all of my gff files
>together to make use of the SNAP hmm training or re-annotation?

Use the gff3_merge script in the .../maker/bin/ directory

>
>Second, Using multiple lines of evidence (augustus, genemarkES, RNAseq
>data, and COGs based on homology searches) I am having a hard time
>getting a lot of maker gene calls. It seems that the calling is too
>stringent in many cases. When I view the output of many contigs in
>Apollo, there is many times where 3 or 4 models show close to
>identical gene structure, but the final maker output does not contain
>that gene call. Do you have any suggestions on how to lower the
>stringency of the MAKER output so that more genes will be called? In
>some cases I am getting less than 3000 gene calls in the final output.
>Where an Augustus model trained on Chlamydamonas will return ~15000.

I agree with Mark. You may want to set single_exon=1 to accept single exon evidence, try increasing the depth of your protein evidence file as well, or if the genome is relatively gene dense, set keep_preds=1.
On some genomes that are gene dense (fungi for example) ab initio predictors don't have that high a false positive rate, so this can be safe. However on more complex genomes doing so can produce more false positives than there are genes. Thanks, Carson On 13-03-19 8:02 PM, "Mark Yandell" wrote: >Hi Blake, > >I'be forwarded this onto the maker_devel list, they can help you more >there. > >regarding your comment g 'When I view the output of many contigs in >Apollo, there is many times where 3 or 4 models show close to identical >gene structure, but the final maker output does not contain that gene >call. ' Those calls are in the output files, but there are in a >different multifasta file; there are non-overalpping ab intio models. >Another way is to set the config flag to allow MAKEr to use unspliced EST >and RNA-seq alignments as evidence, > >I'be forwarded this onto the maker_devel list, they can help you more >there. > >cheers, > >--mark > > >Mark Yandell >Professor of Human Genetics >H.A. & Edna Benning Presidential Endowed Chair >Eccles Institute of Human Genetics >University of Utah >15 North 2030 East, Room 2100 >Salt Lake City, UT 84112-5330 >ph:801-587-7707 > >________________________________________ >From: Blake Hovde [hovdebt at uw.edu] >Sent: Tuesday, March 19, 2013 2:35 PM >To: Mark Yandell >Subject: Maker2 gff file output > >Hi Dr. Yandell, > >I am currently running MAKER2 on a new algal genome and am running >into a couple of problems that I would like your input on the genome >size is ~60Mb and is currently in ~3100 contigs. >First, I am having trouble doing multiple iterations of hmm training >with SNAP due to the fact that I have so many gff output files in the >datastore (1 for each contig in my draft genome). not just a single >gff output that seems to be in the examples and tutorials I have >followed thus far. Is there a way to combine all of my gff files >together to make use of the SNAP hmm training or re-annotation? 
> >Second, Using multiple lines of evidence (augustus, genemarkES, RNAseq >data, and COGs based on homology searches) I am having a hard time >getting a lot of maker gene calls. It seems that the calling is too >stringent in many cases. When I view the output of many contigs in >Apollo, there is many times where 3 or 4 models show close to >identical gene structure, but the final maker output does not contain >that gene call. Do you have any suggestions on how to lower the >stringency of the MAKER output so that more genes will be called? In >some cases I am getting less than 3000 gene calls in the final output. > Where an Augustus model trained on Chlamydamonas will return ~15000. > >Thanks very much for your help! > >Sincerely, >Blake Hovde >Graduate Student >Department of Genome Sciences >University of Washington > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From Carson.Holt at oicr.on.ca Wed Mar 20 07:51:29 2013 From: Carson.Holt at oicr.on.ca (Carson Holt) Date: Wed, 20 Mar 2013 13:51:29 +0000 Subject: [maker-devel] AED calculations using the MAKER pipeline In-Reply-To: Message-ID: In the current MAKER download when using GFF3 passthrough there was an issue with everything being done at the very last step. This of course leads to a memory spike and a very slow last step. That seems to be similar to what you are describing. It should be resolved in what will become version 2.28. I can give you access to the pre-release code, so you can check that the issue is resolved for you. I'll send details in a separate e-mail. Also the ### will be printed after every ~100,000 bp of assembly processed by MAKER. You can ignore them, but they actually have a meaning in GFF3. Basically everything between two sets of ###'s are fully resolved. 
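To make the role of the `###` directive concrete, here is a small sketch of how a GFF3 reader could use it to split a file into independently loadable chunks. The function name `gff3_chunks` is hypothetical, and the sketch skips other directives and stops at an embedded FASTA section; it only illustrates the idea and is not MAKER's own code.

```python
def gff3_chunks(lines):
    """Yield lists of feature lines separated by '###' directives.

    Illustrative only: comments and other directives are skipped,
    empty chunks are not yielded, and reading stops at '##FASTA'.
    """
    chunk = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("##FASTA"):
            # Sequence data follows; no more features to collect.
            break
        if line.strip() == "###":
            # Everything seen so far is fully resolved: no later
            # line will refer back to these features.
            if chunk:
                yield chunk
            chunk = []
        elif line and not line.startswith("#"):
            chunk.append(line)
    if chunk:
        yield chunk
```

Each yielded chunk can be parsed on its own or handed to a separate worker, which is the practical point of the `###` directive.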
It allows programs that read GFF3 to parallelize file loading or just load sections of a file as they can rapidly identify "safe chunks". Without them the entire file must be loaded into memory in order to be certain that all feature parts are there (as there is no requirement for sorting or order in GFF3). log.child files will always be empty unless you run analysis like snap or blast. Thanks, Carson On 13-03-20 9:05 AM, "Krishnakumar, Vivek" wrote: >Hi, > >We have been using the MAKER pipeline here at JCVI to calculate AED >scores by feeding in our annotation set as `model_gff` and the protein >and EST evidence as `protein_gff` and `est_gff` respectively. Here is the >issue we are having: > >When running the above pipeline with protein2genome and est2genome >evidence generated earlier by MAKER, there are no problems calculating >the AED score. Normally this pipeline takes a little over 12 hours to >complete. > >But if we use our own evidence, AAT and Genewise aligned proteins for >`protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline >runs very very slow and the intermediary *.gff.ann file has many chunks >(separated by '###') that are completely empty. Our evidence in formatted >in the same way as est2genome or protein2genome (GFF file with >"expressed_sequence_match::match_part" or "protein_match::match_part" >features respectively) > >The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use the >default `max_dna_len` parameter used to split the large assemblies into >chunks. > >Investigating the master_datastore.log shows me that the scaffolds run >through without any issues and the chromosomes are still being processed. >For any of the chromosomes, investigating the 'run.log' file, one level >above 'theVoid' shows me how many "final.section" jobs were started and >how many finished. And in the case of all the chromosomes, it tells me >that everything that was started has finished. 
And the 'log.child.*' >files within `theVoid` are all empty. Also within `theVoid`, I'm noticing >that the "raw.section" and "evidence_*.gff" files are not empty. But one >thing that is surprising is that of all the "final.section" files, only >the one pertaining to the last chunk is very large (proportional to the >size of the evidnce), the rest are all exactly the same size (exactly 331 >bytes). > >I'm running MAKER in MPI mode spawning 48 processes on a high memory >machine with 64 available cores and 1TB of RAM. > >I hope I've been able to explain my situation clearly in this email. > >Any help is appreciated. >Thank you. > >Vivek From vKrishna at jcvi.org Wed Mar 20 07:05:55 2013 From: vKrishna at jcvi.org (Krishnakumar, Vivek) Date: Wed, 20 Mar 2013 09:05:55 -0400 Subject: [maker-devel] AED calculations using the MAKER pipeline Message-ID: Hi, We have been using the MAKER pipeline here at JCVI to calculate AED scores by feeding in our annotation set as `model_gff` and the protein and EST evidence as `protein_gff` and `est_gff` respectively. Here is the issue we are having: When running the above pipeline with protein2genome and est2genome evidence generated earlier by MAKER, there are no problems calculating the AED score. Normally this pipeline takes a little over 12 hours to complete. But if we use our own evidence, AAT and Genewise aligned proteins for `protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline runs very very slow and the intermediary *.gff.ann file has many chunks (separated by '###') that are completely empty. Our evidence in formatted in the same way as est2genome or protein2genome (GFF file with "expressed_sequence_match::match_part" or "protein_match::match_part" features respectively) The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use the default `max_dna_len` parameter used to split the large assemblies into chunks. 
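As a rough illustration of what splitting by `max_dna_len` means, the sketch below cuts a sequence length into consecutive windows. The value 100000 is an assumption taken from the "~100,000 bp" figure in Carson's reply, and the helper `chunk_coords` is hypothetical. How MAKER handles features that span a chunk boundary is not described in this thread and is not modelled here.

```python
def chunk_coords(seq_length, max_dna_len=100000):
    """Return 1-based inclusive (start, end) windows covering a sequence.

    Hypothetical helper; plain non-overlapping windows only.
    """
    if max_dna_len <= 0:
        raise ValueError("max_dna_len must be positive")
    coords = []
    start = 1
    while start <= seq_length:
        # The last window is shortened to end at the sequence end.
        end = min(start + max_dna_len - 1, seq_length)
        coords.append((start, end))
        start = end + 1
    return coords
```

Under these assumptions a 250,000 bp scaffold would give three windows, the last one shorter than the others, while a chromosome of tens of megabases would give hundreds.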
Investigating the master_datastore.log shows me that the scaffolds run through without any issues and the chromosomes are still being processed. For any of the chromosomes, investigating the 'run.log' file, one level above 'theVoid' shows me how many "final.section" jobs were started and how many finished. And in the case of all the chromosomes, it tells me that everything that was started has finished. And the 'log.child.*' files within `theVoid` are all empty. Also within `theVoid`, I'm noticing that the "raw.section" and "evidence_*.gff" files are not empty. But one thing that is surprising is that of all the "final.section" files, only the one pertaining to the last chunk is very large (proportional to the size of the evidnce), the rest are all exactly the same size (exactly 331 bytes). I'm running MAKER in MPI mode spawning 48 processes on a high memory machine with 64 available cores and 1TB of RAM. I hope I've been able to explain my situation clearly in this email. Any help is appreciated. Thank you. Vivek From cdtown at jcvi.org Wed Mar 20 07:54:33 2013 From: cdtown at jcvi.org (Town, Christopher D.) Date: Wed, 20 Mar 2013 09:54:33 -0400 Subject: [maker-devel] AED calculations using the MAKER pipeline In-Reply-To: References: Message-ID: Thanks. Is there any way of guestimating when this final step might be completed. We are in a time crunch here to get this analysis finished and the data/annotation out. Best Chris -----Original Message----- From: Carson Holt [mailto:Carson.Holt at oicr.on.ca] Sent: Wednesday, March 20, 2013 9:51 AM To: Krishnakumar, Vivek; maker-devel at yandell-lab.org Cc: Town, Christopher D.; Tang, Haibao; Bidwell, Shelby; Rosen, Benjamin Subject: Re: AED calculations using the MAKER pipeline In the current MAKER download when using GFF3 passthrough there was an issue with everything being done at the very last step. This of course leads to a memory spike and a very slow last step. That seems to be similar to what you are describing. 
It should be resolved in what will become version 2.28. I can give you access to the pre-release code, so you can check that the issue is resolved for you. I'll send details in a separate e-mail. Also the ### will be printed after every ~100,000 bp of assembly processed by MAKER. You can ignore them, but they actually have a meaning in GFF3. Basically everything between two sets of ###'s are fully resolved. It allows programs that read GFF3 to parallelize file loading or just load sections of a file as they can rapidly identify "safe chunks". Without them the entire file must be loaded into memory in order to be certain that all feature parts are there (as there is no requirement for sorting or order in GFF3). log.child files will always be empty unless you run analysis like snap or blast. Thanks, Carson On 13-03-20 9:05 AM, "Krishnakumar, Vivek" wrote: >Hi, > >We have been using the MAKER pipeline here at JCVI to calculate AED >scores by feeding in our annotation set as `model_gff` and the protein >and EST evidence as `protein_gff` and `est_gff` respectively. Here is >the issue we are having: > >When running the above pipeline with protein2genome and est2genome >evidence generated earlier by MAKER, there are no problems calculating >the AED score. Normally this pipeline takes a little over 12 hours to >complete. > >But if we use our own evidence, AAT and Genewise aligned proteins for >`protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline >runs very very slow and the intermediary *.gff.ann file has many chunks >(separated by '###') that are completely empty. Our evidence in >formatted in the same way as est2genome or protein2genome (GFF file >with "expressed_sequence_match::match_part" or "protein_match::match_part" >features respectively) > >The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use >the default `max_dna_len` parameter used to split the large assemblies >into chunks. 
> >Investigating the master_datastore.log shows me that the scaffolds run >through without any issues and the chromosomes are still being processed. >For any of the chromosomes, investigating the 'run.log' file, one level >above 'theVoid' shows me how many "final.section" jobs were started and >how many finished. And in the case of all the chromosomes, it tells me >that everything that was started has finished. And the 'log.child.*' >files within `theVoid` are all empty. Also within `theVoid`, I'm >noticing that the "raw.section" and "evidence_*.gff" files are not >empty. But one thing that is surprising is that of all the >"final.section" files, only the one pertaining to the last chunk is >very large (proportional to the size of the evidnce), the rest are all >exactly the same size (exactly 331 bytes). > >I'm running MAKER in MPI mode spawning 48 processes on a high memory >machine with 64 available cores and 1TB of RAM. > >I hope I've been able to explain my situation clearly in this email. > >Any help is appreciated. >Thank you. > >Vivek From myandell at genetics.utah.edu Wed Mar 20 08:55:38 2013 From: myandell at genetics.utah.edu (Mark Yandell) Date: Wed, 20 Mar 2013 14:55:38 +0000 Subject: [maker-devel] AED calculations using the MAKER pipeline In-Reply-To: References: Message-ID: <7A60AB257EFF2B48B1F4C814817EA05350EDCE05@mxb2.hg.genetics.utah.edu> Hi Vivek, sound like its a maybe problem with the protein2genome GFF file. Cane you send us a sample file that is known to produce the problem? cheers, --mark Mark Yandell Professor of Human Genetics H.A. 
& Edna Benning Presidential Endowed Chair Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:801-587-7707 ________________________________________ From: maker-devel-bounces at yandell-lab.org [maker-devel-bounces at yandell-lab.org] on behalf of Krishnakumar, Vivek [vKrishna at jcvi.org] Sent: Wednesday, March 20, 2013 7:05 AM To: maker-devel at yandell-lab.org Cc: Tang, Haibao; Rosen, Benjamin; Town, Christopher D.; Bidwell, Shelby Subject: [maker-devel] AED calculations using the MAKER pipeline Hi, We have been using the MAKER pipeline here at JCVI to calculate AED scores by feeding in our annotation set as `model_gff` and the protein and EST evidence as `protein_gff` and `est_gff` respectively. Here is the issue we are having: When running the above pipeline with protein2genome and est2genome evidence generated earlier by MAKER, there are no problems calculating the AED score. Normally this pipeline takes a little over 12 hours to complete. But if we use our own evidence, AAT and Genewise aligned proteins for `protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline runs very very slow and the intermediary *.gff.ann file has many chunks (separated by '###') that are completely empty. Our evidence in formatted in the same way as est2genome or protein2genome (GFF file with "expressed_sequence_match::match_part" or "protein_match::match_part" features respectively) The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use the default `max_dna_len` parameter used to split the large assemblies into chunks. Investigating the master_datastore.log shows me that the scaffolds run through without any issues and the chromosomes are still being processed. For any of the chromosomes, investigating the 'run.log' file, one level above 'theVoid' shows me how many "final.section" jobs were started and how many finished. 
And in the case of all the chromosomes, it tells me that everything that was started has finished. And the 'log.child.*' files within `theVoid` are all empty. Also within `theVoid`, I'm noticing that the "raw.section" and "evidence_*.gff" files are not empty. But one thing that is surprising is that of all the "final.section" files, only the one pertaining to the last chunk is very large (proportional to the size of the evidnce), the rest are all exactly the same size (exactly 331 bytes). I'm running MAKER in MPI mode spawning 48 processes on a high memory machine with 64 available cores and 1TB of RAM. I hope I've been able to explain my situation clearly in this email. Any help is appreciated. Thank you. Vivek _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From myandell at genetics.utah.edu Wed Mar 20 08:57:17 2013 From: myandell at genetics.utah.edu (Mark Yandell) Date: Wed, 20 Mar 2013 14:57:17 +0000 Subject: [maker-devel] AED calculations using the MAKER pipeline In-Reply-To: References: , Message-ID: <7A60AB257EFF2B48B1F4C814817EA05350EDCE15@mxb2.hg.genetics.utah.edu> whoops. looks like carson has got this one already. Thanks! Mark Yandell Professor of Human Genetics H.A. & Edna Benning Presidential Endowed Chair Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:801-587-7707 ________________________________________ From: maker-devel-bounces at yandell-lab.org [maker-devel-bounces at yandell-lab.org] on behalf of Town, Christopher D. [cdtown at jcvi.org] Sent: Wednesday, March 20, 2013 7:54 AM To: Carson Holt; Krishnakumar, Vivek; maker-devel at yandell-lab.org Cc: Tang, Haibao; Rosen, Benjamin; Bidwell, Shelby Subject: Re: [maker-devel] AED calculations using the MAKER pipeline Thanks. Is there any way of guestimating when this final step might be completed. 
We are in a time crunch here to get this analysis finished and the data/annotation out. Best Chris -----Original Message----- From: Carson Holt [mailto:Carson.Holt at oicr.on.ca] Sent: Wednesday, March 20, 2013 9:51 AM To: Krishnakumar, Vivek; maker-devel at yandell-lab.org Cc: Town, Christopher D.; Tang, Haibao; Bidwell, Shelby; Rosen, Benjamin Subject: Re: AED calculations using the MAKER pipeline In the current MAKER download when using GFF3 passthrough there was an issue with everything being done at the very last step. This of course leads to a memory spike and a very slow last step. That seems to be similar to what you are describing. It should be resolved in what will become version 2.28. I can give you access to the pre-release code, so you can check that the issue is resolved for you. I'll send details in a separate e-mail. Also the ### will be printed after every ~100,000 bp of assembly processed by MAKER. You can ignore them, but they actually have a meaning in GFF3. Basically everything between two sets of ###'s are fully resolved. It allows programs that read GFF3 to parallelize file loading or just load sections of a file as they can rapidly identify "safe chunks". Without them the entire file must be loaded into memory in order to be certain that all feature parts are there (as there is no requirement for sorting or order in GFF3). log.child files will always be empty unless you run analysis like snap or blast. Thanks, Carson On 13-03-20 9:05 AM, "Krishnakumar, Vivek" wrote: >Hi, > >We have been using the MAKER pipeline here at JCVI to calculate AED >scores by feeding in our annotation set as `model_gff` and the protein >and EST evidence as `protein_gff` and `est_gff` respectively. Here is >the issue we are having: > >When running the above pipeline with protein2genome and est2genome >evidence generated earlier by MAKER, there are no problems calculating >the AED score. Normally this pipeline takes a little over 12 hours to >complete. 
> >But if we use our own evidence, AAT and Genewise aligned proteins for >`protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline >runs very very slow and the intermediary *.gff.ann file has many chunks >(separated by '###') that are completely empty. Our evidence in >formatted in the same way as est2genome or protein2genome (GFF file >with "expressed_sequence_match::match_part" or "protein_match::match_part" >features respectively) > >The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use >the default `max_dna_len` parameter used to split the large assemblies >into chunks. > >Investigating the master_datastore.log shows me that the scaffolds run >through without any issues and the chromosomes are still being processed. >For any of the chromosomes, investigating the 'run.log' file, one level >above 'theVoid' shows me how many "final.section" jobs were started and >how many finished. And in the case of all the chromosomes, it tells me >that everything that was started has finished. And the 'log.child.*' >files within `theVoid` are all empty. Also within `theVoid`, I'm >noticing that the "raw.section" and "evidence_*.gff" files are not >empty. But one thing that is surprising is that of all the >"final.section" files, only the one pertaining to the last chunk is >very large (proportional to the size of the evidnce), the rest are all >exactly the same size (exactly 331 bytes). > >I'm running MAKER in MPI mode spawning 48 processes on a high memory >machine with 64 available cores and 1TB of RAM. > >I hope I've been able to explain my situation clearly in this email. > >Any help is appreciated. >Thank you. 
> >Vivek _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From carsonhh at gmail.com Wed Mar 20 11:36:30 2013 From: carsonhh at gmail.com (Carson Holt) Date: Wed, 20 Mar 2013 13:36:30 -0400 Subject: [maker-devel] AED calculations using the MAKER pipeline In-Reply-To: Message-ID: On the few cases where I found this (if it is the same issue you are experiencing), it was very much dependent on the total size of the evidence database and the length of the contigs. For me it took about 25-50% longer, but used up 10-15x as much RAM (primarily because the contigs were very long > 50 Mb each). The issue was unnoticeable on the short contigs that are more typical of de novo annotation. Thanks, Carson On 13-03-20 9:54 AM, "Town, Christopher D." wrote: >Thanks. Is there any way of guestimating when this final step might be >completed. We are in a time crunch here to get this analysis finished and >the data/annotation out. > >Best > >Chris > >-----Original Message----- >From: Carson Holt [mailto:Carson.Holt at oicr.on.ca] >Sent: Wednesday, March 20, 2013 9:51 AM >To: Krishnakumar, Vivek; maker-devel at yandell-lab.org >Cc: Town, Christopher D.; Tang, Haibao; Bidwell, Shelby; Rosen, Benjamin >Subject: Re: AED calculations using the MAKER pipeline > >In the current MAKER download when using GFF3 passthrough there was an >issue with everything being done at the very last step. This of course >leads to a memory spike and a very slow last step. That seems to be >similar to what you are describing. It should be resolved in what will >become version 2.28. I can give you access to the pre-release code, so >you can check that the issue is resolved for you. I'll send details in a >separate e-mail. > >Also the ### will be printed after every ~100,000 bp of assembly >processed by MAKER. You can ignore them, but they actually have a >meaning in GFF3. 
>Basically everything between two sets of ###'s are fully resolved. It >allows programs that read GFF3 to parallelize file loading or just load >sections of a file as they can rapidly identify "safe chunks". Without >them the entire file must be loaded into memory in order to be certain >that all feature parts are there (as there is no requirement for sorting >or order in GFF3). > >log.child files will always be empty unless you run analysis like snap or >blast. > >Thanks, >Carson > > > > > > >On 13-03-20 9:05 AM, "Krishnakumar, Vivek" wrote: > >>Hi, >> >>We have been using the MAKER pipeline here at JCVI to calculate AED >>scores by feeding in our annotation set as `model_gff` and the protein >>and EST evidence as `protein_gff` and `est_gff` respectively. Here is >>the issue we are having: >> >>When running the above pipeline with protein2genome and est2genome >>evidence generated earlier by MAKER, there are no problems calculating >>the AED score. Normally this pipeline takes a little over 12 hours to >>complete. >> >>But if we use our own evidence, AAT and Genewise aligned proteins for >>`protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline >>runs very very slow and the intermediary *.gff.ann file has many chunks >>(separated by '###') that are completely empty. Our evidence in >>formatted in the same way as est2genome or protein2genome (GFF file >>with "expressed_sequence_match::match_part" or >>"protein_match::match_part" >>features respectively) >> >>The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use >>the default `max_dna_len` parameter used to split the large assemblies >>into chunks. >> >>Investigating the master_datastore.log shows me that the scaffolds run >>through without any issues and the chromosomes are still being processed. >>For any of the chromosomes, investigating the 'run.log' file, one level >>above 'theVoid' shows me how many "final.section" jobs were started and >>how many finished. 
And in the case of all the chromosomes, it tells me
>>that everything that was started has finished. And the 'log.child.*'
>>files within `theVoid` are all empty. Also within `theVoid`, I'm
>>noticing that the "raw.section" and "evidence_*.gff" files are not
>>empty. But one thing that is surprising is that of all the
>>"final.section" files, only the one pertaining to the last chunk is
>>very large (proportional to the size of the evidence), the rest are all
>>exactly the same size (exactly 331 bytes).
>>
>>I'm running MAKER in MPI mode spawning 48 processes on a high memory
>>machine with 64 available cores and 1TB of RAM.
>>
>>I hope I've been able to explain my situation clearly in this email.
>>
>>Any help is appreciated.
>>Thank you.
>>
>>Vivek

From ares711122 at gmail.com Thu Mar 21 18:08:45 2013
From: ares711122 at gmail.com (Hung-Wei Hsu)
Date: Fri, 22 Mar 2013 08:08:45 +0800
Subject: [maker-devel] Directory structure is too deep!
Message-ID:

Hi MAKER developers,

I found that the MAKER outputs for each contig are located in separate, deeply nested directories. Can MAKER collect these outputs in one simple directory so that the results can be easily examined?

Thanks a lot in advance.

Warmest regards,
Hung-Wei

From carsonhh at gmail.com Thu Mar 21 20:07:23 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 21 Mar 2013 22:07:23 -0400
Subject: [maker-devel] Directory structure is too deep!
In-Reply-To:
Message-ID:

You can use gff3_merge to collect them into a single file. To keep them as separate files but in the same directory, just use the standard Linux copy command. Similarly, you can use fasta_merge to collect the FASTA files.
Example:
> mkdir results
> cp *.maker.output/*_datastore/*/*/*.gff results/

Thanks,
Carson

From: Hung-Wei Hsu
Date: Thursday, 21 March, 2013 8:08 PM
To:
Subject: [maker-devel] Directory structure is too deep!

Hi MAKER developers,

I found that the MAKER outputs for each contig are located in separate, deeply nested directories. Can MAKER collect these outputs in one simple directory so that the results can be easily examined?

Thanks a lot in advance.

Warmest regards,
Hung-Wei

From jason.stajich at gmail.com Fri Mar 22 00:12:55 2013
From: jason.stajich at gmail.com (Jason Stajich)
Date: Thu, 21 Mar 2013 20:12:55 -1000
Subject: [maker-devel] failed gene prediction
In-Reply-To:
References:
Message-ID: <59B5B965-7B15-449E-B42F-E41D4F448B6A@gmail.com>

For fungi, I've put up some of the gene prediction parameters that I've built or trained, if that is helpful for you.
https://github.com/hyphaltip/fungi-gene-prediction-params

In the absence of any ESTs or RNA-Seq, I also recommend generating a starting training set with CEGMA first and then training your predictors from there, except for GeneMark.hmm, which seems to do okay with self-training.

Jason

On Mar 18, 2013, at 10:49 AM, Carson Holt wrote:

> You didn't supply any evidence or HMM files for gene predictors. Just raw assembly data by itself is insufficient for genome annotation.
>
> Here is some nice documentation for running MAKER --> http://gmod.org/wiki/MAKER_Tutorial_2012
> Here is a nice overview of genome annotation in general --> http://fasta.bioch.virginia.edu/cshl/pdf/12/ajm12/euk_genome_annotation_review.pdf
>
> Once you've gone through the documentation and examples, if you come across any questions just let us know.
> > Thanks, > Carson > > > From: "Borhan, Hossein" > Date: Monday, 18 March, 2013 4:40 PM > To: > Subject: [maker-devel] failed gene prediction > > Hi > > I have tried maker on a fungus genome of 45 mb with 1/3 being repeat rich. It did not produce any prediction. I am not sure what is causing this. Attached are the STDERR and opts.ctl. I appreciate your help > > > Hossein > > > > > > > > > > _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org Jason Stajich jason.stajich at gmail.com jason at bioperl.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From ares711122 at gmail.com Fri Mar 22 01:52:25 2013 From: ares711122 at gmail.com (Hung-Wei Hsu) Date: Fri, 22 Mar 2013 15:52:25 +0800 Subject: [maker-devel] Can MAKER analyze the viral genome? Message-ID: Hi MAKER developers, I'm wondering if MAKER can deal with the viral genome. If yes, how do I set the running parameters? Thanks. Kind regards, Hung-Wei -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Sat Mar 23 17:42:39 2013 From: carsonhh at gmail.com (Carson Holt) Date: Sat, 23 Mar 2013 19:42:39 -0400 Subject: [maker-devel] Can MAKER analyze the viral genome? In-Reply-To: Message-ID: You can set organism type to prokaryotic and use the protein2genome option for annotation. It's not a perfect match as it only allows for partial gene spatial overlap and not full gene within a gene like you can see in viruses. Thanks, Carson From: Hung-Wei Hsu Date: Friday, 22 March, 2013 3:52 AM To: Subject: [maker-devel] Can MAKER analyze the viral genome? Hi MAKER developers, I'm wondering if MAKER can deal with the viral genome. 
If yes, how do I set the running parameters? Thanks. Kind regards, Hung-Wei _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From jjin01 at mail.rockefeller.edu Sat Mar 23 18:43:54 2013 From: jjin01 at mail.rockefeller.edu (Jingjing Jin) Date: Sun, 24 Mar 2013 00:43:54 +0000 Subject: [maker-devel] maker running error Message-ID: Dear all, When I run the maker, there is an error like this: *** buffer overflow detected ***: /usr/bin/perl terminated ======= Backtrace: ========= /lib64/libc.so.6(__fortify_fail+0x37)[0x3582d01d47] /lib64/libc.so.6[0x3582cffc30] /lib64/libc.so.6[0x3582cff089] /lib64/libc.so.6(__printf_fp+0x1531)[0x3582c4afa1] /lib64/libc.so.6(_IO_vfprintf+0x11a7)[0x3582c45407] /lib64/libc.so.6(__vsprintf_chk+0x9d)[0x3582cff12d] /lib64/libc.so.6(__sprintf_chk+0x7f)[0x3582cff06f] /usr/local/maker/lib/File/../../perl/lib/auto/Proc/ProcessTable/ProcessTable.so(OS_get_table+0x9bb)[0x7f328e8eb69b] /usr/local/maker/lib/File/../../perl/lib/auto/Proc/ProcessTable/ProcessTable.so(XS_Proc__ProcessTable_table+0x182)[0x7f328e8ecc02] /usr/lib64/perl5/CORE/libperl.so(Perl_pp_entersub+0x5a5)[0x35848a66d5] /usr/lib64/perl5/CORE/libperl.so(Perl_runops_standard+0x16)[0x35848a49c6] /usr/lib64/perl5/CORE/libperl.so(perl_run+0x338)[0x358484d0d8] /usr/bin/perl(main+0xec)[0x400cac] /lib64/libc.so.6(__libc_start_main+0xfd)[0x3582c1ecdd] /usr/bin/perl[0x400af9] ======= Memory map: ======== Could anyone give me some suggestion about how to deal with this problem? Thanks! Jingjing -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From carsonhh at gmail.com Sat Mar 23 21:04:49 2013 From: carsonhh at gmail.com (Carson Holt) Date: Sat, 23 Mar 2013 23:04:49 -0400 Subject: [maker-devel] maker running error In-Reply-To: Message-ID: Could you try maker version 2.27 from the website? Proc::ProcessTable may have problems on your system in accessing the process table. Version 2.27 tries to access the same information by first parsing the output of the standard 'df' command and only tries to access the process table directly if that fails. Thanks, Carson From: Jingjing Jin Date: Saturday, 23 March, 2013 8:43 PM To: "maker-devel at yandell-lab.org" Subject: [maker-devel] maker running error Dear all, When I run the maker, there is an error like this: *** buffer overflow detected ***: /usr/bin/perl terminated ======= Backtrace: ========= /lib64/libc.so.6(__fortify_fail+0x37)[0x3582d01d47] /lib64/libc.so.6[0x3582cffc30] /lib64/libc.so.6[0x3582cff089] /lib64/libc.so.6(__printf_fp+0x1531)[0x3582c4afa1] /lib64/libc.so.6(_IO_vfprintf+0x11a7)[0x3582c45407] /lib64/libc.so.6(__vsprintf_chk+0x9d)[0x3582cff12d] /lib64/libc.so.6(__sprintf_chk+0x7f)[0x3582cff06f] /usr/local/maker/lib/File/../../perl/lib/auto/Proc/ProcessTable/ProcessTable .so(OS_get_table+0x9bb)[0x7f328e8eb69b] /usr/local/maker/lib/File/../../perl/lib/auto/Proc/ProcessTable/ProcessTable .so(XS_Proc__ProcessTable_table+0x182)[0x7f328e8ecc02] /usr/lib64/perl5/CORE/libperl.so(Perl_pp_entersub+0x5a5)[0x35848a66d5] /usr/lib64/perl5/CORE/libperl.so(Perl_runops_standard+0x16)[0x35848a49c6] /usr/lib64/perl5/CORE/libperl.so(perl_run+0x338)[0x358484d0d8] /usr/bin/perl(main+0xec)[0x400cac] /lib64/libc.so.6(__libc_start_main+0xfd)[0x3582c1ecdd] /usr/bin/perl[0x400af9] ======= Memory map: ======== Could anyone give me some suggestion about how to deal with this problem? Thanks! 
Jingjing

From mnuhn at ebi.ac.uk Mon Mar 25 06:18:11 2013
From: mnuhn at ebi.ac.uk (mnuhn)
Date: Mon, 25 Mar 2013 12:18:11 +0000
Subject: [maker-devel] master_datastore_index.log file shrinks.
In-Reply-To:
References:
Message-ID: <407ae892252062e886fb3855bb6bf74c@ebi.ac.uk>

Thanks, this works and mpi maker is running now.

Cheers,
Michael.

P.S.:

If anyone is trying to reproduce this, I only had one directory in
LD_PRELOAD and it didn't like the trailing colon, so I removed it to
make it work:

export LD_PRELOAD=/software/openmpi-1.4.3/lib/libmpi.so

On 2013-03-19 15:22, Carson Holt wrote:
> I have MAKER working under OpenMPI 1.4.3 (intel compiled).
>
> I had to set a couple of environment variables prior to setup. You would
> probably need to set these values as well. If your OpenMPI path was
> here, for example --> /software/openmpi-1.4.3/, run the following commands
> (path set accordingly) before even attempting maker setup.
>
> export OMPI_MCA_mpi_warn_on_fork=0
> export LD_PRELOAD=/software/openmpi-1.4.3/lib/libmpi.so:$LD_PRELOAD
>
> These not only need to be set before compilation, but also before any run
> (so add them to your ~/.bashrc or ~/.bash_profile or any module load
> scripts). The LD_PRELOAD statement needs to be set for any program using
> OpenMPI's shared libraries and not just MAKER, so it's normally a good
> idea to have that set system wide for all users. The details can be found
> in the OpenMPI documentation. Note sometimes system library updates can
> break OpenMPI's shared libraries while not breaking OpenMPI itself, so you
> might also need to recompile OpenMPI if it has broken shared
> libraries.
>
> Once you have those commands in place, run the perl Build.PL step. Say yes
> to install with MPI. Then run ./Build install
>
> Thanks,
> Carson
>
>
>
> On 13-03-19 11:02 AM, "Carson Holt" wrote:
>
>>Try it with the no_locks option then. Make sure to let one instance
>>finish populating the mpi_blastdb directory before running other instances
>>as that is where most initial locking occurs.
>>
>>I'll send you more details on how to install with OpenMPI, so you can give
>>that a shot while your jobs are also running serially (so you don't lose
>>time). Also instead of 50 serial instances, you could try 10 with -cpus
>>set to 5.
>>
>>Thanks,
>>Carson
>>
>>
>>
>>On 13-03-19 11:19 AM, "Michael Nuhn" wrote:
>>
>>>Hello Carson!
>>>
>>>On 03/19/2013 02:27 PM, Carson Holt wrote:
>>>> Yes. If at all possible use MPI. It removes the overhead of locks
>>>> which happen per primary instance of MAKER. So one maker job using 1000
>>>> cpus via MPI will have one shared set of locks. 1000 serial instances
>>>> of MAKER on the other hand would have 1000x the locks.
>>>
>>>I don't know a thing about MPI.
>>>
>>>I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and open
>>>mpi and none of them worked for me. I also tried the automatic
>>>installation that comes with maker, but it didn't work for me either.
>>>
>>>If need be, I could spend time getting to the bottom of this, but there
>>>is no telling how long this would take me so I'd rather not, if there is
>>>an alternative.
>>>
>>>Would the approach I outlined before work? (Treating the split files as
>>>separate genomes to annotate and then combine the gffs afterwards)
>>>
>>>I also like this approach, because I would select a few contigs in the
>>>beginning which I would run on their own. They would complete early and
>>>this way I would get a preview of the results of the run instead of
>>>having to wait for everything to complete.
>>>
>>>It might also be more robust, because file locking issues would be
>>>confined to the instances working on a sequence chunk, but the rest of
>>>the instances could continue working.
>>>
>>>Cheers,
>>>Michael.
>>>
>>>> Alternatively if you do need to continue without MPI for some reason, I
>>>> just finished a devel version of MAKER that has a --no_locks option.
>>>> You can never start two instances using the same input fasta when
>>>> --no_locks is specified, but the splitting to use different input fastas
>>>> I mentioned before in the example will still work fine.
>>>>
>>>> I also have updated the indexing/reindexing, so if indexing failures
>>>> happen, MAKER will switch between the current working directory and the
>>>> TMP= directory from the maker_opts.ctl file so as to try different IO
>>>> locations (i.e. NFS and non-NFS). Note you should never set TMP= in the
>>>> control files to an NFS mounted location (it not only makes things a lot
>>>> slower, but Berkeley DB and SQLite will get frequent errors on NFS).
>>>> TMP= defaults to /tmp when not specified.
>>>>
>>>> I'll send you download information in a separate e-mail. Try a regular
>>>> MAKER run to see if the indexing/reindexing changes are sufficient
>>>> before attempting the --no_locks option.
>>>>
>>>> Thanks,
>>>> Carson

From lengjingmao at gmail.com Mon Mar 25 07:49:11 2013
From: lengjingmao at gmail.com (shaohua.fan)
Date: Mon, 25 Mar 2013 14:49:11 +0100
Subject: [maker-devel] maker terminated strangely
Message-ID:

Hi Maker developers,

I ran into a problem when using the Maker 2.27 beta version: the pipeline terminated in the middle of the process without any error message.

The genome I am working with is a eukaryotic genome which consists of around 6000 scaffolds.
I combined de novo (Augustus and SNAP) and evidence based (protein from a closely related species and transcriptome from the same species) for the gene prediction (the genome is already repeat masked). The MPI (mpich2 version 1.5) enabled maker was run on a cluster by using SGE. I checked with the administrator of our cluster, there is no limitation of SGE job. The maker was run by using mpiexec -n 48 /home/shafan/maker/bin/maker maker_opts.ctl maker_bopts.ctl maker_exe.ctl I attached my maker_opt.ctl, please let me know if you need any information for this problem. Thanks a lot! Shaohua -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: maker_opts.ctl Type: application/octet-stream Size: 4519 bytes Desc: not available URL: From carsonhh at gmail.com Mon Mar 25 08:01:45 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 25 Mar 2013 10:01:45 -0400 Subject: [maker-devel] maker terminated strangely In-Reply-To: Message-ID: Could you send your captured standard error. That would contain messages that highlight the specific cause. Thanks, Carson From: "shaohua.fan" Date: Monday, 25 March, 2013 9:49 AM To: Subject: [maker-devel] maker terminated strangely Hi Maker developers, I met a problem when I was using Maker version 2.27 beta version that the pipeline terminated in the middle of the process without any error message. The genome I am working with is a Eukaryotic genome which is consisted by around 6000 scaffolds. I combined de novo (Augustus and SNAP) and evidence based (protein from a closely related species and transcriptome from the same species) for the gene prediction (the genome is already repeat masked). The MPI (mpich2 version 1.5) enabled maker was run on a cluster by using SGE. I checked with the administrator of our cluster, there is no limitation of SGE job. 
The maker was run by using mpiexec -n 48 /home/shafan/maker/bin/maker maker_opts.ctl maker_bopts.ctl maker_exe.ctl I attached my maker_opt.ctl, please let me know if you need any information for this problem. Thanks a lot! Shaohua _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From lengjingmao at gmail.com Mon Mar 25 08:07:17 2013 From: lengjingmao at gmail.com (shaohua.fan) Date: Mon, 25 Mar 2013 15:07:17 +0100 Subject: [maker-devel] maker terminated strangely In-Reply-To: References: Message-ID: Hi Carson, Do you mean standard output from maker? If yes, I need to upload the file to our ftp server, since it is quite big around 1.1 Gb. Shaohua 2013/3/25 Carson Holt > Could you send your captured standard error. That would contain messages > that highlight the specific cause. > > Thanks, > Carson > > > From: "shaohua.fan" > Date: Monday, 25 March, 2013 9:49 AM > To: > Subject: [maker-devel] maker terminated strangely > > Hi Maker developers, > > I met a problem when I was using Maker version 2.27 beta version that the > pipeline terminated in the middle of the process without any error message. > > > The genome I am working with is a Eukaryotic genome which is consisted by > around 6000 scaffolds. I combined de novo (Augustus and SNAP) and evidence > based (protein from a closely related species and transcriptome from the > same species) for the gene prediction (the genome is already repeat > masked). The MPI (mpich2 version 1.5) enabled maker was run on a cluster by > using SGE. I checked with the administrator of our cluster, there is no > limitation of SGE job. 
> > The maker was run by using mpiexec -n 48 /home/shafan/maker/bin/maker > maker_opts.ctl maker_bopts.ctl maker_exe.ctl > > I attached my maker_opt.ctl, please let me know if you need any > information for this problem. > > Thanks a lot! > > Shaohua > _______________________________________________ maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon Mar 25 08:07:45 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 25 Mar 2013 10:07:45 -0400 Subject: [maker-devel] master_datastore_index.log file shrinks. In-Reply-To: <407ae892252062e886fb3855bb6bf74c@ebi.ac.uk> Message-ID: Great news. I'm glad it's working. If you have more questions, just let me know. --Carson On 13-03-25 8:18 AM, "mnuhn" wrote: >Thanks, this works and mpi maker is running now. > >Cheers, >Michael. > >P.S.: > >If anyone is trying to reproduce this, I only had one directory in >LD_PRELOAD and it didn't like the trailing colon, so I removed it to >make it work: > >export LD_PRELOAD=/software/openmpi-1.4.3/lib/libmpi.so > >On 2013-03-19 15:22, Carson Holt wrote: >> I have MAKER working under OpemnMPI 1.4.3 (intel compiled). >> >> I had to set a couple of environmental variables prior to setup. You >> would >> probably need to set these values as well. If you your OpenMPI path >> was >> here for example --> /software/openmpi-1.4.3/, run the following >> commands >> (path set accordingly) before even attempting maker setup. >> >> export OMPI_MCA_mpi_warn_on_fork 0 >> export LD_PRELOAD /software/openmpi-1.4.3/lib/libmpi.so:$LD_PRELOAD >> >> These not only need to be set before compilation, but also before any >> run >> (so add them to you ~.bashrc or ~/.bash_profile or any module load >> scripts >> thanks). 
The LD_PRELOAD statement needs to be set for any program >> using >> OpenMPI's shared libraries and not just MAKER, so it's normally a >> good >> idea to have that set system wide for all users. The detail can be >> found >> in the OpenMPI documentation. Note sometimes system library updates >> can >> break OpenMPI's shared libraries while not breaking OpenMPI itself, >> so you >> might also need to recompile OpenMPI if it has broken shared >> libraries. >> >> Once you have those commands in place, run the perl Buil.PL step. Say >> yes >> to install with MPI. Then run ./Build install >> >> Thanks, >> Carson >> >> >> >> On 13-03-19 11:02 AM, "Carson Holt" wrote: >> >>>Try it with the no_locks option then. Make sure to let one instance >>>finish populating the mpi_blastdb directory before running other >>>instances >>>as that is where most initial locking occurs. >>> >>>I'll send you more details on how to install with OpenMPI, so you can >>>give >>>that a shot while your jobs are also running serially (so you don't >>> lose >>>time). Also instead of 50 serial instances, you could try 10 with >>> -cpus >>>set to 5. >>> >>>Thanks, >>>Carson >>> >>> >>> >>>On 13-03-19 11:19 AM, "Michael Nuhn" wrote: >>> >>>>Hello Carson! >>>> >>>>On 03/19/2013 02:27 PM, Carson Holt wrote: >>>>> Yes. If at all possible use MPI. It removes the overhead of >>>>> locks >>>>> which happen per primary instance of MAKER. So one maker job >>>>> using >>>>>1000 >>>>> cpus via MPI will have one shared set of locks. 1000 serial >>>>> instances >>>>> of MAKER on the other hand would have 1000x the locks. >>>> >>>>I don't know a thing about MPI. >>>> >>>>I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and >>>> open >>>>mpi and none of them worked for me. I also tried the automatic >>>>installation that comes with maker, but it didn't work for me >>>> either. 
>>>> >>>>If need be, I could spend time getting to the bottom of this, but >>>> there >>>>is no telling how long this would take me so I'd rather not, if >>>> there is >>>>an alternative. >>>> >>>>Would the approach I outlined before work? (Treating the split files >>>> as >>>>separate genomes to annotate and then combine the gffs afterwards) >>>> >>>>I also like this approach, because I would select a few contigs in >>>> the >>>>beginning which I would run on their own. They would complete early >>>> and >>>>this way I would get a preview of the results of the run instead of >>>>having to wait for everything to complete. >>>> >>>>It might also be more robust, because file locking issues would be >>>>confined to the instances working on a sequence chunk, but the rest >>>> of >>>>the instances could continue working. >>>> >>>>Cheers, >>>>Michael. >>>> >>>>> Alternatively if you do need to continue without MPI for some >>>>> reason, I >>>>> just finished a devel version of MAKER that has a --no_locks >>>>> option. >>>>> You can never start two instances using the same input fasta >>>>> when >>>>> --no_locks is specified, but the splitting to use different input >>>>>fastas >>>>> I mentioned before in the example will still work fine. >>>>> >>>>> I also have updated the indexing/reindexing, so if indexing >>>>> failures >>>>> happen, MAKER will switch between the current working directory >>>>> and the >>>>> TMP= directory from the maker_opts.ctl file so as to try different >>>>> IO >>>>> locations (I.e. NFS and non-NFS). Note you should never set TMP= >>>>> in >>>>>the >>>>> control files to an NFS mounted location (it not only makes things >>>>> a >>>>>lot >>>>> slower, but berkleydb and sqllite will get frequent errors on >>>>> NFS). >>>>> TMP= defaults to /tmp when not specified >>>>> >>>>> I'll send you download information in a separate e-mail. 
Try a >>>>> regular >>>>> MAKER run to see if the indexing/reindexing changes are sufficient >>>>> before attempting the ?no_locks option. >>>>> >>>>> Thanks, >>>>> Carson > From carsonhh at gmail.com Mon Mar 25 08:08:17 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 25 Mar 2013 10:08:17 -0400 Subject: [maker-devel] maker terminated strangely In-Reply-To: Message-ID: Yes. Thanks, Carson From: "shaohua.fan" Date: Monday, 25 March, 2013 10:07 AM To: Carson Holt Cc: Subject: Re: [maker-devel] maker terminated strangely Hi Carson, Do you mean standard output from maker? If yes, I need to upload the file to our ftp server, since it is quite big around 1.1 Gb. Shaohua 2013/3/25 Carson Holt > Could you send your captured standard error. That would contain messages that > highlight the specific cause. > > Thanks, > Carson > > > From: "shaohua.fan" > Date: Monday, 25 March, 2013 9:49 AM > To: > Subject: [maker-devel] maker terminated strangely > > Hi Maker developers, > > I met a problem when I was using Maker version 2.27 beta version that the > pipeline terminated in the middle of the process without any error message. > > > The genome I am working with is a Eukaryotic genome which is consisted by > around 6000 scaffolds. I combined de novo (Augustus and SNAP) and evidence > based (protein from a closely related species and transcriptome from the same > species) for the gene prediction (the genome is already repeat masked). The > MPI (mpich2 version 1.5) enabled maker was run on a cluster by using SGE. I > checked with the administrator of our cluster, there is no limitation of SGE > job. > > The maker was run by using mpiexec -n 48 /home/shafan/maker/bin/maker > maker_opts.ctl maker_bopts.ctl maker_exe.ctl > > I attached my maker_opt.ctl, please let me know if you need any information > for this problem. > > Thanks a lot! 
>
> Shaohua

From ares711122 at gmail.com Mon Mar 25 20:50:52 2013
From: ares711122 at gmail.com (Hung-Wei Hsu)
Date: Tue, 26 Mar 2013 10:50:52 +0800
Subject: [maker-devel] Why are some start positions minus in the gff result?
Message-ID:

Hi MAKER developers,

I could successfully run MAKER and get the final GFF. But I found that some start positions in the GFF were negative. That led to an error in the GFF reader. Is this a bug? Could you please help to resolve this problem?

Thanks a lot in advance.

Best regards,
Hung-Wei

From carsonhh at gmail.com Mon Mar 25 21:24:01 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 25 Mar 2013 23:24:01 -0400
Subject: [maker-devel] Why are some start positions minus in the gff result?
In-Reply-To:
Message-ID:

I haven't seen that before, so could you package up the job (all input and control files) that generates this and send it to me? You're using maker's prokaryotic settings to try and get it to annotate viral genomes, correct?

--Carson

From: Hung-Wei Hsu
Date: Monday, 25 March, 2013 10:50 PM
To:
Subject: [maker-devel] Why are some start positions minus in the gff result?

Hi MAKER developers,

I could successfully run MAKER and get the final GFF. But I found that some start positions in the GFF were negative. That led to an error in the GFF reader. Is this a bug? Could you please help to resolve this problem?

Thanks a lot in advance.

Best regards,
Hung-Wei
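When putting together a report like the one requested above, it may help to list the offending records first. What follows is only a minimal Python sketch and not part of MAKER: the function name find_bad_starts and the sample records are made up for illustration. It relies solely on the GFF3 convention that columns 4 and 5 hold 1-based start and end coordinates, so any start below 1 is invalid.

```python
def find_bad_starts(lines):
    """Return (line_number, seqid, start) for features whose start is below 1."""
    bad = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more feature lines
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments, and directives such as ###
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a nine-column feature line
        try:
            start = int(cols[3])
        except ValueError:
            continue  # start column is not an integer; ignore here
        if start < 1:
            bad.append((number, cols[0], start))
    return bad


# Hypothetical records, for illustration only.
sample = [
    "##gff-version 3",
    "contig1\tmaker\tgene\t-12\t300\t.\t+\t.\tID=gene1",
    "contig1\tmaker\tgene\t500\t900\t.\t-\t.\tID=gene2",
    "###",
]
print(find_bad_starts(sample))
```

The function accepts any iterable of lines, so an open file handle for a merged GFF3 file can be passed in place of the sample list.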
From hudarul at yahoo.com Sun Mar 31 14:02:04 2013
From: hudarul at yahoo.com (Hud Hud)
Date: Sun, 31 Mar 2013 13:02:04 -0700 (PDT)
Subject: [maker-devel] Help on error-Repeat masker
Message-ID: <1364760124.37890.YahooMailNeo@web164901.mail.bf1.yahoo.com>

Hello,

I have a problem when running maker. I get the following error; what could be going wrong here? Thanks so much.

setting up GFF3 output and fasta chunks
doing repeat masking
running repeat masker.
#--------- command -------------#
Widget::RepeatMasker:
cd /tmp/maker_WOVHsi; /home/maker-2.27-beta/maker/exe/RepeatMasker/RepeatMasker /home/maker-2.27-beta/maker/data/contig.maker.output/contig_datastore/61/0D/contig172//theVoid.contig172/contig172.0.simple.rb -dir /home/maker-2.27-beta/maker/data/contig.maker.output/contig_datastore/61/0D/contig172//theVoid.contig172 -pa 1 -lib /tmp/maker_WOVHsi/b1piBcWHlH
#-------------------------------#
sh: /home/maker-2.27-beta/maker/exe/RepeatMasker/RepeatMasker: /u1/local/bin/perl: bad interpreter: Permission denied
ERROR: RepeatMasker failed
--> rank=NA, hostname=Homis
ERROR: Failed while doing repeat masking
ERROR: Chunk failed at level:0, tier_type:1
FAILED CONTIG:contig172

ERROR: Chunk failed at level:2, tier_type:0
FAILED CONTIG:172

examining contents of the fasta file and run log
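The "bad interpreter: Permission denied" line in the log above is the shell reporting that it could not execute the interpreter named on the first line of the RepeatMasker script (here /u1/local/bin/perl). This usually means that path exists but cannot be executed by the current user. A minimal Python sketch for checking a script's first line follows; it assumes nothing about MAKER or RepeatMasker internals, and the helper name check_interpreter and the throwaway demo scripts are made up for illustration.

```python
import os
import sys
import tempfile


def check_interpreter(script_path):
    """Return (interpreter, usable) for a script's #! line, or (None, False)."""
    with open(script_path, "rb") as handle:
        first = handle.readline().decode("utf-8", errors="replace").strip()
    if not first.startswith("#!"):
        return None, False  # no interpreter line at all
    parts = first[2:].split()
    if not parts:
        return None, False  # "#!" with nothing after it
    interpreter = parts[0]
    # Usable means: a regular file that the current user may execute.
    usable = os.path.isfile(interpreter) and os.access(interpreter, os.X_OK)
    return interpreter, usable


# Self-contained demonstration with two throwaway scripts (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good_script")
    bad = os.path.join(tmp, "bad_script")
    with open(good, "w") as fh:
        fh.write("#!" + sys.executable + "\nprint('hi')\n")
    with open(bad, "w") as fh:
        fh.write("#!/no/such/dir/perl -w\n")
    good_result = check_interpreter(good)
    bad_result = check_interpreter(bad)

print(good_result)
print(bad_result)
```

If the check reports the interpreter as unusable for the RepeatMasker script, one option is to point its first line at a perl that the current user is allowed to execute.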
URL: From kenlee.nakasugi at sydney.edu.au Sun Mar 3 16:44:01 2013 From: kenlee.nakasugi at sydney.edu.au (Kenlee Nakasugi) Date: Mon, 04 Mar 2013 10:44:01 +1100 Subject: [maker-devel] regarding mpich In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF6E7062FF@CHIMBX5.ad.uillinois.edu> References: <118F034CF4C3EF48A96F86CE585B94BF6E7062FF@CHIMBX5.ad.uillinois.edu> Message-ID: <1362354241.2252.38.camel@waterhouse874-8> Hi, I'm trying to install mpi_maker (Maker 2.1) on a new system (intel ix x86_64) which has mpich v3.0.2 installed, but I can't seem to get maker Build.PL to recognize it. I tried editing the Build.pm file to point to it, but with no success. All dependencies have been installed and successfully recognized, it is just MPI support that is not. Is there anything I could modify in the install scripts to make it recognize this? Currently, the directly path to where the mpicc and mpiexec are is /apps/mpich/3.0.2/bin I don't have sys admin rights for the machine, and I'm not sure if this version of mpich was installed for shared libraries as per the GMOD tutorial. But I have previously circumvented this with an earlier version of mpich by modifying the Build.pm module with success. I'm wondering if mpichv3.02 is not compatible? Cheers, Ken -- Kenlee Nakasugi | Research Fellow School of Molecular Bioscience Level 8, SMB Building (G08)| The University of Sydney | NSW | 2006 T: +61 2 9114 1321 E: kenlee.nakasugi at sydney.edu.au -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon Mar 4 06:35:03 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 04 Mar 2013 08:35:03 -0500 Subject: [maker-devel] regarding mpich In-Reply-To: <1362354241.2252.38.camel@waterhouse874-8> Message-ID: Use the last MPICH2 version, as MPICH3 is very different (it's the first attempt to implement the new MPI3 protocol set, and not just a version update). Alternatively you can use OpenMPI. 
Also use maker version 2.27 instead for MPI. Thanks, Carson From: Kenlee Nakasugi Date: Sunday, 3 March, 2013 6:44 PM To: "maker-devel at yandell-lab.org List" Subject: [maker-devel] regarding mpich Hi, I'm trying to install mpi_maker (Maker 2.1) on a new system (intel ix x86_64) which has mpich v3.0.2 installed, but I can't seem to get maker Build.PL to recognize it. I tried editing the Build.pm file to point to it, but with no success. All dependencies have been installed and successfully recognized, it is just MPI support that is not. Is there anything I could modify in the install scripts to make it recognize this? Currently, the directly path to where the mpicc and mpiexec are is /apps/mpich/3.0.2/bin I don't have sys admin rights for the machine, and I'm not sure if this version of mpich was installed for shared libraries as per the GMOD tutorial. But I have previously circumvented this with an earlier version of mpich by modifying the Build.pm module with success. I'm wondering if mpichv3.02 is not compatible? Cheers, Ken -- Kenlee Nakasugi | Research Fellow School of Molecular Bioscience Level 8, SMB Building (G08)| The University of Sydney | NSW | 2006 T: +61 2 9114 1321 E: kenlee.nakasugi at sydney.edu.au _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From canchaya at uvigo.es Mon Mar 4 04:10:26 2013 From: canchaya at uvigo.es (Carlos A. Canchaya) Date: Mon, 4 Mar 2013 12:10:26 +0100 Subject: [maker-devel] Sharing benchmarks of maker References: <6472D2A0-7BA8-41F0-ACFD-4D3C800D36FB@uvigo.es> Message-ID: <7F41714C-6C75-4892-AA5B-D7649DDA7DF2@uvigo.es> Hi, I've just install maker2 in our server and run a first test with our data. 
The input was about 30 000 sequences (9.6 Mb), and it was run on just one server with 32 processors for 36 hours with mpich2. Our server has 250 Gb of memory and cpus of 2,4 Gb. The test was simple because it only ran repeatmasker and SNAP.

Considering that we would like to use other gene prediction/annotation tools available in MAKER, I wonder if you can share some of your benchmarks, so that we know whether we could scale up well to our production cluster in order to annotate our 1.6 Gb draft genome.

Best,

Carlos

Carlos A. Canchaya, PhD
IPP Research Fellow
Department of Biochemistry, Genetics and Immunology
Faculty of Biology
Campus Universitario
University of Vigo
36310 Vigo
Spain
http://darwin.uvigo.es/~ccanchaya/
email: canchaya at uvigo.es
Tel : +34 986 130048
Fax: +34 986 812556

From carsonhh at gmail.com Mon Mar 4 08:12:06 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 04 Mar 2013 10:12:06 -0500
Subject: [maker-devel] Sharing benchmarks of maker
In-Reply-To: <7F41714C-6C75-4892-AA5B-D7649DDA7DF2@uvigo.es>
Message-ID:

Performance is highly dependent on the size of evidence datasets used (proteins/ESTs) as well as the IO performance of a system when running via MPI (you can hit IO bottlenecks well before cpu bottlenecks depending on cluster configuration).

The Arabidopsis genome (120Mb assembly) running SNAP and Augustus, 1.1Gb EST dataset, and 10Mb protein dataset takes ~1 hour 30 min on 1,500 cpus with OpenMPI. The Maize genome (2.1 Gb) running SNAP and Augustus, 3Gb EST dataset, and 16 Mb protein dataset takes ~4 hours 30 min on 2200 cpus. A human sized genome would take 5-6 days on 100 cpus.

MAKER is fully restartable (it keeps a log of progress). So if there is any failure or the user kills it in the middle of a job, it will pick up at the point it left off on restart (so you don't waste all that processing time).
2Gb of RAM per processing core is recommended when parallelizing MAKER via MPI, but fragmented genomes with smaller contigs can get by with less than 1Gb per core.

MAKER version 2.28, which has additional optimization for OpenMPI and a lower memory footprint, will be available in a couple of weeks. Until then 2.27 is recommended over 2.1 for MPI. 2.27 should also work with OpenMPI. 2.1 only works with older versions of MPICH2 using the mpd launcher and not the current hydra launcher.

Thanks,
Carson

From: "Carlos A. Canchaya"
Date: Monday, 4 March, 2013 6:10 AM
To:
Subject: [maker-devel] Sharing benchmarks of maker

Hi,

I've just installed maker2 on our server and run a first test with our data. The input was about 30 000 sequences (9.6 Mb), and it was run on just one server with 32 processors for 36 hours with mpich2. Our server has 250 Gb of memory and cpus of 2,4 Gb. The test was simple because it only ran repeatmasker and SNAP.

Considering that we would like to use other gene prediction/annotation tools available in MAKER, I wonder if you can share some of your benchmarks, so that we know whether we could scale up well to our production cluster in order to annotate our 1.6 Gb draft genome.

Best,

Carlos

Carlos A. Canchaya, PhD
IPP Research Fellow
Department of Biochemistry, Genetics and Immunology
Faculty of Biology
Campus Universitario
University of Vigo
36310 Vigo
Spain
http://darwin.uvigo.es/~ccanchaya/
email: canchaya at uvigo.es
Tel : +34 986 130048
Fax: +34 986 812556
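As a rough aid for reading the benchmark figures quoted above, the following Python sketch converts them into cpu-hours and cpu-hours per Mb of assembly, and applies the 2 Gb-per-core guideline to a 32-processor server. Only the numbers come from the thread; the function names are hypothetical, and because performance depends heavily on evidence size and IO, this is not meant as a way to predict run time for another genome.

```python
# Figures quoted in the thread; names and structure here are illustrative only.
BENCHMARKS = {
    # name: (assembly size in Mb, wall-clock hours, number of cpus)
    "Arabidopsis": (120, 1.5, 1500),
    "Maize": (2100, 4.5, 2200),
}
RECOMMENDED_GB_PER_CORE = 2  # "2Gb of RAM per processing core is recommended"


def cpu_hours(wall_hours, cpus):
    """Total cpu time consumed by a run (wall-clock hours times cpus)."""
    return wall_hours * cpus


def cpu_hours_per_mb(size_mb, wall_hours, cpus):
    """cpu-hours spent per Mb of assembly for one benchmark run."""
    return cpu_hours(wall_hours, cpus) / size_mb


def recommended_ram_gb(cores, gb_per_core=RECOMMENDED_GB_PER_CORE):
    """RAM suggested by the per-core guideline for a given core count."""
    return cores * gb_per_core


if __name__ == "__main__":
    for name, (size_mb, hours, cpus) in BENCHMARKS.items():
        total = cpu_hours(hours, cpus)
        per_mb = cpu_hours_per_mb(size_mb, hours, cpus)
        print(f"{name}: {total:.0f} cpu-hours, {per_mb:.2f} cpu-hours/Mb")
    print(f"32 cores: {recommended_ram_gb(32)} Gb RAM suggested")
```

The two runs differ widely in cpu-hours per Mb, which is consistent with the point that evidence size, IO and assembly structure matter more than genome size alone. By the same guideline, a 32-processor server with 250 Gb of memory sits well above the suggested 64 Gb.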
From carsonhh at gmail.com Mon Mar 4 08:33:02 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 04 Mar 2013 10:33:02 -0500
Subject: [maker-devel] Sharing benchmarks of maker
In-Reply-To:
Message-ID:

For the Arabidopsis genome it also took ~2 hours 10 min on 600 cpus, so there was only a 40 min gain by going from 600 to 1,500 cpus. This is because assembly structure has a lot to do with the efficiency of the parallelization, so you can hit a point of diminishing returns on some assemblies sooner than others.

--Carson

From: Carson Holt
Date: Monday, 4 March, 2013 10:12 AM
To: "Carlos A. Canchaya"
Subject: Re: [maker-devel] Sharing benchmarks of maker

Performance is highly dependent on the size of evidence datasets used (proteins/ESTs) as well as the IO performance of a system when running via MPI (you can hit IO bottlenecks well before cpu bottlenecks depending on cluster configuration).

The Arabidopsis genome (120Mb assembly) running SNAP and Augustus, 1.1Gb EST dataset, and 10Mb protein dataset takes ~1 hour 30 min on 1,500 cpus with OpenMPI. The Maize genome (2.1 Gb) running SNAP and Augustus, 3Gb EST dataset, and 16 Mb protein dataset takes ~4 hours 30 min on 2200 cpus. A human sized genome would take 5-6 days on 100 cpus.

MAKER is fully restartable (it keeps a log of progress). So if there is any failure or the user kills it in the middle of a job, it will pick up at the point it left off on restart (so you don't waste all that processing time).

2Gb of RAM per processing core is recommended when parallelizing MAKER via MPI, but fragmented genomes with smaller contigs can get by with less than 1Gb per core.

MAKER version 2.28, which has additional optimization for OpenMPI and a lower memory footprint, will be available in a couple of weeks. Until then 2.27 is recommended over 2.1 for MPI. 2.27 should also work with OpenMPI. 2.1 only works with older versions of MPICH2 using the mpd launcher and not the current hydra launcher.

Thanks,
Carson

From: "Carlos A.
Canchaya"
Date: Monday, 4 March, 2013 6:10 AM
To:
Subject: [maker-devel] Sharing benchmarks of maker

Hi,

I've just installed maker2 on our server and run a first test with our data. The input was about 30 000 sequences (9.6 Mb), and it was run on just one server with 32 processors for 36 hours with mpich2. Our server has 250 Gb of memory and cpus of 2,4 Gb. The test was simple because it only ran repeatmasker and SNAP.

Considering that we would like to use other gene prediction/annotation tools available in MAKER, I wonder if you can share some of your benchmarks, so that we know whether we could scale up well to our production cluster in order to annotate our 1.6 Gb draft genome.

Best,

Carlos

Carlos A. Canchaya, PhD
IPP Research Fellow
Department of Biochemistry, Genetics and Immunology
Faculty of Biology
Campus Universitario
University of Vigo
36310 Vigo
Spain
http://darwin.uvigo.es/~ccanchaya/
email: canchaya at uvigo.es
Tel : +34 986 130048
Fax: +34 986 812556

From kenlee.nakasugi at sydney.edu.au Mon Mar 4 13:50:27 2013
From: kenlee.nakasugi at sydney.edu.au (Kenlee Nakasugi)
Date: Mon, 4 Mar 2013 20:50:27 +0000
Subject: [maker-devel] regarding mpich
In-Reply-To:
References: <1362354241.2252.38.camel@waterhouse874-8>,
Message-ID:

Thanks Carson. Will Maker 2.27 be able to continue analysis on Maker 2.1 files that stopped halfway?

Thanks
Ken

On 05/03/2013, at 1:44 AM, "Carson Holt" wrote:

Use the last MPICH2 version, as MPICH3 is very different (it's the first attempt to implement the new MPI3 protocol set, and not just a version update). Alternatively you can use OpenMPI. Also use maker version 2.27 instead for MPI.
Thanks,
Carson

From: Kenlee Nakasugi
Date: Sunday, 3 March, 2013 6:44 PM
To: "maker-devel at yandell-lab.org List"
Subject: [maker-devel] regarding mpich

Hi,

I'm trying to install mpi_maker (Maker 2.1) on a new system (intel ix x86_64) which has mpich v3.0.2 installed, but I can't seem to get maker Build.PL to recognize it. I tried editing the Build.pm file to point to it, but with no success. All dependencies have been installed and successfully recognized; it is just MPI support that is not.

Is there anything I could modify in the install scripts to make it recognize this? Currently, the direct path to where mpicc and mpiexec are is /apps/mpich/3.0.2/bin
I don't have sys admin rights for the machine, and I'm not sure if this version of mpich was installed for shared libraries as per the GMOD tutorial. But I have previously circumvented this with an earlier version of mpich by modifying the Build.pm module with success. I'm wondering if mpich v3.0.2 is not compatible?

Cheers,
Ken

--
Kenlee Nakasugi | Research Fellow
School of Molecular Bioscience
Level 8, SMB Building (G08) | The University of Sydney | NSW | 2006
T: +61 2 9114 1321
E: kenlee.nakasugi at sydney.edu.au

From dsth at ebi.ac.uk Mon Mar 4 13:57:01 2013
From: dsth at ebi.ac.uk (Daniel Hughes)
Date: Mon, 4 Mar 2013 20:57:01 +0000
Subject: [maker-devel] regarding mpich
In-Reply-To:
References: <1362354241.2252.38.camel@waterhouse874-8>
Message-ID:

Unlikely. Probably safer to export what has finished as gff and run it as re-annotation if you don't want to waste what was already processed for running additional iterations.

Dan from me phone...

On Mar 4, 2013 8:52 PM, "Kenlee Nakasugi" wrote:

> Thanks Carson.
> Will Maker 2.27 be able to continue analysis on Maker 2.1 files that stopped halfway?
> Thanks
> Ken
>
> On 05/03/2013, at 1:44 AM, "Carson Holt" wrote:
>
> Use the last MPICH2 version, as MPICH3 is very different (it's the first attempt to implement the new MPI3 protocol set, and not just a version update). Alternatively you can use OpenMPI. Also use maker version 2.27 instead for MPI.
>
> Thanks,
> Carson
>
> From: Kenlee Nakasugi
> Date: Sunday, 3 March, 2013 6:44 PM
> To: "maker-devel at yandell-lab.org List"
> Subject: [maker-devel] regarding mpich
>
> Hi,
>
> I'm trying to install mpi_maker (Maker 2.1) on a new system (intel ix x86_64) which has mpich v3.0.2 installed, but I can't seem to get maker Build.PL to recognize it. I tried editing the Build.pm file to point to it, but with no success.
> All dependencies have been installed and successfully recognized; it is just MPI support that is not.
>
> Is there anything I could modify in the install scripts to make it recognize this? Currently, the direct path to where mpicc and mpiexec are is /apps/mpich/3.0.2/bin
> I don't have sys admin rights for the machine, and I'm not sure if this version of mpich was installed for shared libraries as per the GMOD tutorial. But I have previously circumvented this with an earlier version of mpich by modifying the Build.pm module with success. I'm wondering if mpich v3.0.2 is not compatible?
>
> Cheers,
> Ken
>
> --
> Kenlee Nakasugi | Research Fellow
> School of Molecular Bioscience
> Level 8, SMB Building (G08) | The University of Sydney | NSW | 2006
> T: +61 2 9114 1321
> E: kenlee.nakasugi at sydney.edu.au

From carsonhh at gmail.com Mon Mar 4 13:58:21 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 04 Mar 2013 15:58:21 -0500
Subject: [maker-devel] regarding mpich
In-Reply-To:
Message-ID:

Some files it can reuse, but not all. So, exporting finished contigs with GFF3 pass-through is an option.

--Carson

From: Daniel Hughes
Date: Monday, 4 March, 2013 3:57 PM
To: Kenlee Nakasugi
Cc: "maker-devel at yandell-lab.org List", Carson Holt
Subject: Re: [maker-devel] regarding mpich

Unlikely. Probably safer to export what has finished as gff and run it as re-annotation if you don't want to waste what was already processed for running additional iterations.

Dan from me phone...

On Mar 4, 2013 8:52 PM, "Kenlee Nakasugi" wrote:

> Thanks Carson. Will Maker 2.27 be able to continue analysis on Maker 2.1 files that stopped halfway?
> Thanks
> Ken
>
> On 05/03/2013, at 1:44 AM, "Carson Holt" wrote:
>
>> Use the last MPICH2 version, as MPICH3 is very different (it's the first attempt to implement the new MPI3 protocol set, and not just a version update). Alternatively you can use OpenMPI. Also use maker version 2.27 instead for MPI.
>>
>> Thanks,
>> Carson
>>
>> From: Kenlee Nakasugi
>> Date: Sunday, 3 March, 2013 6:44 PM
>> To: "maker-devel at yandell-lab.org List"
>> Subject: [maker-devel] regarding mpich
>>
>> Hi,
>>
>> I'm trying to install mpi_maker (Maker 2.1) on a new system (intel ix x86_64) which has mpich v3.0.2 installed, but I can't seem to get maker Build.PL to recognize it. I tried editing the Build.pm file to point to it, but with no success.
>> All dependencies have been installed and successfully recognized; it is just MPI support that is not.
>>
>> Is there anything I could modify in the install scripts to make it recognize this? Currently, the direct path to where mpicc and mpiexec are is /apps/mpich/3.0.2/bin
>> I don't have sys admin rights for the machine, and I'm not sure if this version of mpich was installed for shared libraries as per the GMOD tutorial. But I have previously circumvented this with an earlier version of mpich by modifying the Build.pm module with success. I'm wondering if mpich v3.0.2 is not compatible?
>>
>> Cheers,
>> Ken
>>
>> --
>> Kenlee Nakasugi | Research Fellow
>> School of Molecular Bioscience
>> Level 8, SMB Building (G08) | The University of Sydney | NSW | 2006
>> T: +61 2 9114 1321
>> E: kenlee.nakasugi at sydney.edu.au
From kenlee.nakasugi at sydney.edu.au Mon Mar 4 18:49:09 2013
From: kenlee.nakasugi at sydney.edu.au (Kenlee Nakasugi)
Date: Tue, 05 Mar 2013 12:49:09 +1100
Subject: [maker-devel] hex char:29 error with Signal.pm
Message-ID: <1362448149.6346.46.camel@waterhouse874-8>

Hi again,

I'm running into the following error when I run maker 2.1:

##
Ran into unknown state (hex char: 29) at /home/programs/maker/lib/File/..//Proc/Signal.pm line 94.
##

I tried applying the patch as described here:
http://gmod.827538.n3.nabble.com/cluster-error-running-maker-td4022354.html

Using the command:
$ patch -np1 < 646785-and-handle-Hex29.patch

I did this in the maker/lib/Proc and maker/lib/Process directories, but am getting this error:

##
patch: **** Only garbage was found in the patch input.
##

Apparently, this isn't a fatal error:
http://gmod.827538.n3.nabble.com/Ran-into-unknown-state-hex-char-29-at-home-a200302-maker-2-10-lib-File-Proc-Signal-pm-line-94-td3034795.html
and I might eventually have to run the latest version of Maker, but I need to continue a previous analysis, and not having this constant error would be great.

The version of Proc::ProcessTable is already the latest, 0.47.
The platform is ix x86_64 GNU/Linux

Thanks,
Ken

--
Kenlee Nakasugi | Research Fellow
School of Molecular Bioscience
Level 8, SMB Building (G08) | The University of Sydney | NSW | 2006
T: +61 2 9114 1321
E: kenlee.nakasugi at sydney.edu.au

From carsonhh at gmail.com Mon Mar 4 21:48:17 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 04 Mar 2013 23:48:17 -0500
Subject: [maker-devel] hex char:29 error with Signal.pm
In-Reply-To: <1362448149.6346.46.camel@waterhouse874-8>
Message-ID:

This is an issue with Proc::ProcessTable on some systems. If you upgrade to MAKER 2.27 it goes away because it no longer uses Proc::ProcessTable.
Thanks,
Carson

From: Kenlee Nakasugi
Date: Monday, 4 March, 2013 8:49 PM
To: "maker-devel at yandell-lab.org List"
Subject: [maker-devel] hex char:29 error with Signal.pm

Hi again,

I'm running into the following error when I run maker 2.1:

##
Ran into unknown state (hex char: 29) at /home/programs/maker/lib/File/..//Proc/Signal.pm line 94.
##

I tried applying the patch as described here:
http://gmod.827538.n3.nabble.com/cluster-error-running-maker-td4022354.html

Using the command:
$ patch -np1 < 646785-and-handle-Hex29.patch

I did this in the maker/lib/Proc and maker/lib/Process directories, but am getting this error:

##
patch: **** Only garbage was found in the patch input.
##

Apparently, this isn't a fatal error:
http://gmod.827538.n3.nabble.com/Ran-into-unknown-state-hex-char-29-at-home-a200302-maker-2-10-lib-File-Proc-Signal-pm-line-94-td3034795.html
and I might eventually have to run the latest version of Maker, but I need to continue a previous analysis, and not having this constant error would be great.

The version of Proc::ProcessTable is already the latest, 0.47.
The platform is ix x86_64 GNU/Linux

Thanks,
Ken

--
Kenlee Nakasugi | Research Fellow
School of Molecular Bioscience
Level 8, SMB Building (G08) | The University of Sydney | NSW | 2006
T: +61 2 9114 1321
E: kenlee.nakasugi at sydney.edu.au

From Carson.Holt at oicr.on.ca Wed Mar 6 10:45:40 2013
From: Carson.Holt at oicr.on.ca (Carson Holt)
Date: Wed, 6 Mar 2013 17:45:40 +0000
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
Message-ID:

The failed thread is usually just a symptom. There is something causing the thread to fail. Could you send me your STDERR? Oftentimes there is a warning or error further up.
Thanks,
Carson

From: Ramón Fallon
Date: Wednesday, 6 March, 2013 12:34 PM
To:
Subject: thread terminated, causing all processes to fail

Hi,

I'm using the maker_svn rev 995 version and hand-compiled MPICH2 on a single multicore machine.

I've successfully run the dpp_contig.fasta (MPI/8 processes) example but am having trouble with larger contigs fasta files of my own, which are well formed.

I've run into a problem whereby an mpiexec run of 8 processes will stop due to a perl-thread related problem which says

FATAL: Thread terminated, causing all processes to fail

This corresponds to line 924 in the maker executable (which is for the secondary/worker threads), and is the result of a test on !$thr OR'd with !$thr->is_running, so clearly one of these is failing.

$thr itself is a threads->new(\&$node_thread, $gdbfile). Despite being a programmer, I've only recently started to look at the code and have not got the hang of the parallelisation setup here, though I gather the master must use threads to initially generate the parallel instances which then use the message passing. Of course threads don't have message passing ability, so I guess something clever is going on and will take some time for me to understand.

Clearly however, it has worked before on dpp_contigs, so it may be that something is wrong with my datafile or the way I am carrying out the analysis.

Any clues that can be put my way are welcome.

Thank you!

From ramonfallon at gmail.com Wed Mar 6 10:34:59 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Wed, 6 Mar 2013 18:34:59 +0100
Subject: [maker-devel] thread terminated, causing all processes to fail
Message-ID:

Hi,

I'm using the maker_svn rev 995 version and hand-compiled MPICH2 on a single multicore machine.
I've successfully run the dpp_contig.fasta (MPI/8 processes) example but am having trouble with larger contigs fasta files of my own, which are well formed.

I've run into a problem whereby an mpiexec run of 8 processes will stop due to a perl-thread related problem which says

FATAL: Thread terminated, causing all processes to fail

This corresponds to line 924 in the maker executable (which is for the secondary/worker threads), and is the result of a test on !$thr OR'd with !$thr->is_running, so clearly one of these is failing.

$thr itself is a threads->new(\&$node_thread, $gdbfile). Despite being a programmer, I've only recently started to look at the code and have not got the hang of the parallelisation setup here, though I gather the master must use threads to initially generate the parallel instances which then use the message passing. Of course threads don't have message passing ability, so I guess something clever is going on and will take some time for me to understand.

Clearly however, it has worked before on dpp_contigs, so it may be that something is wrong with my datafile or the way I am carrying out the analysis.

Any clues that can be put my way are welcome.

Thank you!

From ramonfallon at gmail.com Wed Mar 6 10:57:12 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Wed, 6 Mar 2013 18:57:12 +0100
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
References:
Message-ID:

Hi,

Many thanks for your quick reply and hint.

Yes, you're right .. further up there is indeed

Calling FastaDB::new at /opt/src/maker_svn/bin/../lib/FastaSeq.pm line 148 thread 1.
Thread 1 terminated abnormally: ERROR: Could not reestablish DB to thaw FastaSeq for Storable
--> rank=5, hostname=fatnode, at /opt/src/maker_svn/bin/maker line 1457 thread 1.

I run a "script" session and have maker on -debug so I have everything in one file.
Do you prefer to have it attached to a post to this mailing list (if it accepts txt attachments)?

Cheers.

On Wed, Mar 6, 2013 at 6:34 PM, Ramón Fallon wrote:

> Hi,
>
> I'm using the maker_svn rev 995 version and hand-compiled MPICH2 on a single multicore machine.
>
> I've successfully run the dpp_contig.fasta (MPI/8 processes) example but am having trouble with larger contigs fasta files of my own, which are well formed.
>
> I've run into a problem whereby an mpiexec run of 8 processes will stop due to a perl-thread related problem which says
>
> FATAL: Thread terminated, causing all processes to fail
>
> This corresponds to line 924 in the maker executable (which is for the secondary/worker threads), and is the result of a test on !$thr OR'd with !$thr->is_running, so clearly one of these is failing.
>
> $thr itself is a threads->new(\&$node_thread, $gdbfile). Despite being a programmer, I've only recently started to look at the code and have not got the hang of the parallelisation setup here, though I gather the master must use threads to initially generate the parallel instances which then use the message passing. Of course threads don't have message passing ability, so I guess something clever is going on and will take some time for me to understand.
>
> Clearly however, it has worked before on dpp_contigs, so it may be that something is wrong with my datafile or the way I am carrying out the analysis.
>
> Any clues that can be put my way are welcome.
>
> Thank you!

From Carson.Holt at oicr.on.ca Wed Mar 6 11:04:30 2013
From: Carson.Holt at oicr.on.ca (Carson Holt)
Date: Wed, 6 Mar 2013 18:04:30 +0000
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
Message-ID:

If you do reply all to this message, I should get the attachment. It will be stripped from the one going to the list though.
Thanks,
Carson

From: Ramón Fallon
Date: Wednesday, 6 March, 2013 12:57 PM
To:
Subject: Re: thread terminated, causing all processes to fail

Hi,

Many thanks for your quick reply and hint.

Yes, you're right .. further up there is indeed

Calling FastaDB::new at /opt/src/maker_svn/bin/../lib/FastaSeq.pm line 148 thread 1.
Thread 1 terminated abnormally: ERROR: Could not reestablish DB to thaw FastaSeq for Storable
--> rank=5, hostname=fatnode, at /opt/src/maker_svn/bin/maker line 1457 thread 1.

I run a "script" session and have maker on -debug so I have everything in one file. Do you prefer to have it attached to a post to this mailing list (if it accepts txt attachments)?

Cheers.

On Wed, Mar 6, 2013 at 6:34 PM, Ramón Fallon wrote:

Hi,

I'm using the maker_svn rev 995 version and hand-compiled MPICH2 on a single multicore machine.

I've successfully run the dpp_contig.fasta (MPI/8 processes) example but am having trouble with larger contigs fasta files of my own, which are well formed.

I've run into a problem whereby an mpiexec run of 8 processes will stop due to a perl-thread related problem which says

FATAL: Thread terminated, causing all processes to fail

This corresponds to line 924 in the maker executable (which is for the secondary/worker threads), and is the result of a test on !$thr OR'd with !$thr->is_running, so clearly one of these is failing.

$thr itself is a threads->new(\&$node_thread, $gdbfile). Despite being a programmer, I've only recently started to look at the code and have not got the hang of the parallelisation setup here, though I gather the master must use threads to initially generate the parallel instances which then use the message passing. Of course threads don't have message passing ability, so I guess something clever is going on and will take some time for me to understand.

Clearly however, it has worked before on dpp_contigs, so it may be that something is wrong with my datafile or the way I am carrying out the analysis.
Any clues that can be put my way are welcome.

Thank you!

From ramonfallon at gmail.com Wed Mar 6 11:15:01 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Wed, 6 Mar 2013 19:15:01 +0100
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
References:
Message-ID:

OK great, here goes .. many thanks!

On Wed, Mar 6, 2013 at 7:04 PM, Carson Holt wrote:

> If you do reply all to this message, I should get the attachment. It will be stripped from the one going to the list though.
>
> Thanks,
> Carson
>
> From: Ramón Fallon
> Date: Wednesday, 6 March, 2013 12:57 PM
> To:
> Subject: Re: thread terminated, causing all processes to fail
>
> Hi,
>
> Many thanks for your quick reply and hint.
>
> Yes, you're right .. further up there is indeed
>
> Calling FastaDB::new at /opt/src/maker_svn/bin/../lib/FastaSeq.pm line 148 thread 1.
> Thread 1 terminated abnormally: ERROR: Could not reestablish DB to thaw FastaSeq for Storable
> --> rank=5, hostname=fatnode, at /opt/src/maker_svn/bin/maker line 1457 thread 1.
>
> I run a "script" session and have maker on -debug so I have everything in one file. Do you prefer to have it attached to a post to this mailing list (if it accepts txt attachments)?
>
> Cheers.
>
> On Wed, Mar 6, 2013 at 6:34 PM, Ramón Fallon wrote:
>
>> Hi,
>>
>> I'm using the maker_svn rev 995 version and hand-compiled MPICH2 on a single multicore machine.
>>
>> I've successfully run the dpp_contig.fasta (MPI/8 processes) example but am having trouble with larger contigs fasta files of my own, which are well formed.
>>
>> I've run into a problem whereby an mpiexec run of 8 processes will stop due to a perl-thread related problem which says
>>
>> FATAL: Thread terminated, causing all processes to fail
>>
>> This corresponds to line 924 in the maker executable (which is for the secondary/worker threads), and is the result of a test on !$thr OR'd with !$thr->is_running, so clearly one of these is failing.
>>
>> $thr itself is a threads->new(\&$node_thread, $gdbfile). Despite being a programmer, I've only recently started to look at the code and have not got the hang of the parallelisation setup here, though I gather the master must use threads to initially generate the parallel instances which then use the message passing. Of course threads don't have message passing ability, so I guess something clever is going on and will take some time for me to understand.
>>
>> Clearly however, it has worked before on dpp_contigs, so it may be that something is wrong with my datafile or the way I am carrying out the analysis.
>>
>> Any clues that can be put my way are welcome.
>>
>> Thank you!

-------------- next part --------------
A non-text attachment was scrubbed...
Name: rf_mkr_run.scriptlog.zip
Type: application/zip
Size: 7598 bytes
Desc: not available
URL:

From Carson.Holt at oicr.on.ca Wed Mar 6 11:22:38 2013
From: Carson.Holt at oicr.on.ca (Carson Holt)
Date: Wed, 6 Mar 2013 18:22:38 +0000
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
Message-ID:

Could you delete your ../*maker.output/mpi_blastdb directory, and then when rerunning maker, run with the -a flag.

Thanks,
Carson

From: Ramón Fallon
Date: Wednesday, 6 March, 2013 1:15 PM
To: Carson Holt
Cc: "maker-devel at yandell-lab.org"
Subject: Re: thread terminated, causing all processes to fail

OK great, here goes .. many thanks!
On Wed, Mar 6, 2013 at 7:04 PM, Carson Holt wrote:

If you do reply all to this message, I should get the attachment. It will be stripped from the one going to the list though.

Thanks,
Carson

From: Ramón Fallon
Date: Wednesday, 6 March, 2013 12:57 PM
To:
Subject: Re: thread terminated, causing all processes to fail

Hi,

Many thanks for your quick reply and hint.

Yes, you're right .. further up there is indeed

Calling FastaDB::new at /opt/src/maker_svn/bin/../lib/FastaSeq.pm line 148 thread 1.
Thread 1 terminated abnormally: ERROR: Could not reestablish DB to thaw FastaSeq for Storable
--> rank=5, hostname=fatnode, at /opt/src/maker_svn/bin/maker line 1457 thread 1.

I run a "script" session and have maker on -debug so I have everything in one file. Do you prefer to have it attached to a post to this mailing list (if it accepts txt attachments)?

Cheers.

On Wed, Mar 6, 2013 at 6:34 PM, Ramón Fallon wrote:

Hi,

I'm using the maker_svn rev 995 version and hand-compiled MPICH2 on a single multicore machine.

I've successfully run the dpp_contig.fasta (MPI/8 processes) example but am having trouble with larger contigs fasta files of my own, which are well formed.

I've run into a problem whereby an mpiexec run of 8 processes will stop due to a perl-thread related problem which says

FATAL: Thread terminated, causing all processes to fail

This corresponds to line 924 in the maker executable (which is for the secondary/worker threads), and is the result of a test on !$thr OR'd with !$thr->is_running, so clearly one of these is failing.

$thr itself is a threads->new(\&$node_thread, $gdbfile). Despite being a programmer, I've only recently started to look at the code and have not got the hang of the parallelisation setup here, though I gather the master must use threads to initially generate the parallel instances which then use the message passing.
Of course threads don't have message passing ability, so I guess something clever is going on and will take some time for me to understand.

Clearly however, it has worked before on dpp_contigs, so it may be that something is wrong with my datafile or the way I am carrying out the analysis.

Any clues that can be put my way are welcome.

Thank you!

From ramonfallon at gmail.com Wed Mar 6 11:49:46 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Wed, 6 Mar 2013 19:49:46 +0100
Subject: [maker-devel] thread terminated, causing all processes to fail
In-Reply-To:
References:
Message-ID:

OK, will do.

Will get back to you tomorrow on it.

Many thanks!

On Wed, Mar 6, 2013 at 7:22 PM, Carson Holt wrote:

> Could you delete your ../*maker.output/mpi_blastdb directory, and then when rerunning maker, run with the -a flag.
>
> Thanks,
> Carson
>
> From: Ramón Fallon
> Date: Wednesday, 6 March, 2013 1:15 PM
> To: Carson Holt
> Cc: "maker-devel at yandell-lab.org"
> Subject: Re: thread terminated, causing all processes to fail
>
> OK great, here goes .. many thanks!
>
> On Wed, Mar 6, 2013 at 7:04 PM, Carson Holt wrote:
>
>> If you do reply all to this message, I should get the attachment. It will be stripped from the one going to the list though.
>>
>> Thanks,
>> Carson
>>
>> From: Ramón Fallon
>> Date: Wednesday, 6 March, 2013 12:57 PM
>> To:
>> Subject: Re: thread terminated, causing all processes to fail
>>
>> Hi,
>>
>> Many thanks for your quick reply and hint.
>>
>> Yes, you're right .. further up there is indeed
>>
>> Calling FastaDB::new at /opt/src/maker_svn/bin/../lib/FastaSeq.pm line 148 thread 1.
>> Thread 1 terminated abnormally: ERROR: Could not reestablish DB to thaw FastaSeq for Storable
>> --> rank=5, hostname=fatnode, at /opt/src/maker_svn/bin/maker line 1457 thread 1.
>> >> I run a "script" session and have maker on -debug so I have everything >> in one file. Do you prefer to have it attached to a post to this mailing >> list (if it accepts txt attachments) >> >> Cheers. >> >> >> On Wed, Mar 6, 2013 at 6:34 PM, Ram?n Fallon wrote: >> >>> Hi, >>> >>> I'm using the maker_svn rev 995 version and hand-compiled MPICH2 on a >>> single multicore machine. >>> >>> I've successfully run the dpp_contig.fasta (MPI/8 processes) example >>> but am having trouble with larger contigs fasta files of my own, which are >>> well formed. >>> >>> I've run into a problem whereby an mpiexec run of 8 processes will >>> stop due to a perl-thread related problem which says >>> >>> FATAL: Thread terminated, causing all processes to fail >>> >>> this corresponds to line 924 in the maker executable (which is for the >>> secondary/worker threads), and is the result of a test on !$thr OR'd with >>> !$thr->is_running, so clearly one of these is failing. >>> >>> $thr itself is a threads->new(\&$node_thread, $gdbfile). Despite being >>> a programmer, I've only recently started to look at the code and have not >>> got the hang of the parallelisation setup here, though I gather the master >>> must use threads to initially generate the parallel instances which then >>> use the message passing. Of course threads don't have message passing >>> ability, so I guess something clever is going on and will take some time >>> for me to understand. >>> >>> Clearly however, it has worked before on dpp_contigs, so it may be is >>> something wrong with my datafile or the way I am carrying out the analysis. >>> >>> Any clues that can be put my way are welcome. >>> >>> Thank you! >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
From ramonfallon at gmail.com Thu Mar 7 07:40:53 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Thu, 7 Mar 2013 15:40:53 +0100
Subject: [maker-devel] thread terminated, causing all processes to fail

Hi Carson,

I am sending you a zip of the text file of my repeated maker session, this time having deleted the mpi_blastdb dir and with the -a flag added to the "mpiexec -n 8 maker -debug" command line.

Cheers / Ramón.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: rf_mkr_run.scriptlog2.zip
Type: application/zip
Size: 6430 bytes
Desc: not available

From carsonhh at gmail.com Thu Mar 7 09:44:40 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 07 Mar 2013 11:44:40 -0500
Subject: [maker-devel] thread terminated, causing all processes to fail

That is extremely odd. It fails to even generate the indexes. Could you check the drive space of your working directory and your /tmp directory?

It is odd because Bioperl uses the stat command to check on the file right before making a tied hash. So it was there for the stat but not the tie, which is immediately following.

If you check manually does it exist now? -->
/home/ramonf/makertrials/mgallocut7/sca29310_8.maker.output/mpi_blastdb/sca29310_8%2Efa.mpi.1/sca29310_8%2Efa.mpi.1.0.index

Are you running in an NFS mounted directory?

--Carson

_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From ramonfallon at gmail.com Thu Mar 7 10:47:53 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Thu, 7 Mar 2013 18:47:53 +0100
Subject: [maker-devel] thread terminated, causing all processes to fail

This is a standalone machine and no NFS at all. "df" gives a healthy amount of disk space, so there should be no problem there.

Yes that file does exist although it has the nominal 12288 bytes size, which appears to be the minimum for a DB_File tie.

As I mentioned the dpp_contig.fa example set does work so part of my investigation is looking at how. I can do some trivial unit tests on the Bioperl stat-before-tied-hashes situation and see what comes up.

So I'll attempt to clear that up and then revert.

Many thanks! / Ramón.

From carsonhh at gmail.com Thu Mar 7 10:57:46 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 07 Mar 2013 12:57:46 -0500
Subject: [maker-devel] thread terminated, causing all processes to fail

Try running maker outside of with the -a flag after deleting mpi_blastdb. Does it still happen? Also if you try again with MPI with the -a flag and having deleted mpi_blastdb, does it fail the same every time? Could you also check for background maker processes that may be trying to work in the same directory that you may not have realized were running?

Thanks,
Carson

From carsonhh at gmail.com Thu Mar 7 14:09:34 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 07 Mar 2013 16:09:34 -0500
Subject: [maker-devel] thread terminated, causing all processes to fail

It should have said "Try running maker outside of MPI".

--Carson

From kangyangjae at gmail.com Thu Mar 7 21:00:19 2013
From: kangyangjae at gmail.com (Kang, Yang Jae)
Date: Fri, 8 Mar 2013 13:00:19 +0900
Subject: [maker-devel] retrying the FAILED scaffolds
Message-ID: <13f201ce1bb1$769c9e20$63d5da60$@gmail.com>

Hello

I have a question regarding some FAILED scaffolds.

Is there any way to re-try maker pipeline on just Failed scaffolds separately?

And do I have to manually erase the failed directories named as ../theVoid.scaffold_#/?

And how can I track down the reason why only those 20 out of around 3000 scaffolds failed?

Thank you

Kang, Yang Jae Ph.D.
Cropgenomics Lab.
College of Agriculture and Life Science
Seoul National University
Korea

From carsonhh at gmail.com Thu Mar 7 21:13:08 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 07 Mar 2013 23:13:08 -0500
Subject: [maker-devel] retrying the FAILED scaffolds
In-Reply-To: <13f201ce1bb1$769c9e20$63d5da60$@gmail.com>

Is there any way to re-try maker pipeline on just Failed scaffolds separately?
> Yes. The failed contig fasta will be in the maker.output subdirectory for that
> contig. Alternatively use the fasta_tool script to extract them from the
> genome file. You can then run them in a separate directory, or use the
> '-base' command line flag to force it to use the base name of the current
> results directory.
Use the ?g option to override the genome file without > having to edit the control files > > Example: > > maker -g failed.fasta ?base maize_assemby > > Output will end up here --> maize_assemby.maker.output And do I have to manually erase for the failed directories named as ../theVoid.scaffold_#/? > No. You can let MAKER just retry them as is (let maker handle what to delete > and keep) or set clean_try=1 to force full deletion before rerunning And how can I track down the reason why only those 20 out of around 3000 scaffolds? > Search for the tag "ERROR" in the standard output of your run. What MAKER > version are you using? I can take a look at the STDERR as wel if you want. > If it's too big for e-mail, you can share it via dropbox. Thanks, Carson -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Fri Mar 8 13:20:37 2013 From: carsonhh at gmail.com (Carson Holt) Date: Fri, 08 Mar 2013 15:20:37 -0500 Subject: [maker-devel] thread terminated, causing all processes to fail In-Reply-To: Message-ID: I think I've found the potential cause and committed the necessary changes to fix it. Thanks, Carson From: Ram?n Fallon Date: Thursday, 7 March, 2013 12:47 PM To: Carson Holt Cc: "maker-devel at yandell-lab.org" Subject: Re: [maker-devel] thread terminated, causing all processes to fail This is a standalone machine and no NFS at all. "df" gives a healthy amount of disk space, so there should be no problem there. Yes that file does exist although it has the nominal 12288 bytes size, which appears to be the minimum for a DB_file tie. As I mentioned the dpp_contig.fa example set does work so part of my investigation is looking at how. I can do some trivial unit tests on the Bioperl stat-before-tied-hashes situation and see what comes up. So I'll attempt to clear that up and then revert. Many thanks! / Ram?n. On Thu, Mar 7, 2013 at 5:44 PM, Carson Holt wrote: > That is extremely odd. 
It fails to even generate the indexes. Could you check > the drive space of your working directory and your /tmp directory? > > It is odd because Bioperl uses the stat command to check on the file right > before making a tied hash. So it was there for the stat but not the tie, > which is immediately following. > > If you check manually does it exist now? --> > /home/ramonf/makertrials/mgallocut7/sca29310_8.maker.output/mpi_blastdb/sca293 > 10_8%2Efa.mpi.1/sca29310_8%2Efa.mpi.1.0.index > > Are you running in an NFS mounted directory? > > --Carson > > > From: Ram?n Fallon > Date: Thursday, 7 March, 2013 9:40 AM > > To: Carson Holt > Cc: "maker-devel at yandell-lab.org" > Subject: Re: [maker-devel] thread terminated, causing all processes to fail > > Hi Carson, > > I send you a zip of the text file of my repeated maker session, this time > having deleted the mpi_blastdb dir and with the -a flag added to "mpiexec -n 8 > maker -debug". Command line. > > Cheers / Ram?n. > > > On Wed, Mar 6, 2013 at 7:49 PM, Ram?n Fallon wrote: >> OK, will do. >> >> Will get back to you tomorrow on it. >> >> Many thanks! >> >> >> On Wed, Mar 6, 2013 at 7:22 PM, Carson Holt wrote: >>> Could you delete your ../*maker.output/mpi_blastdb directory, and then when >>> rerunning maker, run with the ?a flag. >>> >>> Thanks, >>> Carson >>> >>> >>> From: Ram?n Fallon >>> Date: Wednesday, 6 March, 2013 1:15 PM >>> To: Carson Holt >>> Cc: "maker-devel at yandell-lab.org" >>> >>> Subject: Re: thread terminated, causing all processes to fail >>> >>> OK great, here goes .. many thanks! >>> >>> >>> >>> On Wed, Mar 6, 2013 at 7:04 PM, Carson Holt wrote: >>>> If you do reply all to this message, I should get the attachment. It will >>>> be stripped from the one going to the list though. 
>>>> >>>> Thanks, >>>> Carson >>>> >>>> >>>> >>>> From: Ramón Fallon >>>> Date: Wednesday, 6 March, 2013 12:57 PM >>>> To: >>>> Subject: Re: thread terminated, causing all processes to fail >>>> >>>> Hi, >>>> >>>> Many thanks for your quick reply and hint. >>>> >>>> Yes, you're right .. further up there is indeed >>>> >>>> Calling FastaDB::new at /opt/src/maker_svn/bin/../lib/FastaSeq.pm line 148 >>>> thread 1. >>>> Thread 1 terminated abnormally: ERROR: Could not reestablish DB to thaw >>>> FastaSeq for Storable >>>> --> rank=5, hostname=fatnode, at /opt/src/maker_svn/bin/maker line 1457 >>>> thread 1. >>>> >>>> I run a "script" session and have maker on -debug so I have everything in >>>> one file. Do you prefer to have it attached to a post to this mailing list >>>> (if it accepts txt attachments) >>>> >>>> Cheers. >>>> >>>> >>>> On Wed, Mar 6, 2013 at 6:34 PM, Ramón Fallon wrote: >>>>> Hi, >>>>> >>>>> I'm using the maker_svn rev 995 version and hand-compiled MPICH2 on a >>>>> single multicore machine. >>>>> >>>>> I've successfully run the dpp_contig.fasta (MPI/8 processes) example but >>>>> am having trouble with larger contigs fasta files of my own, which are >>>>> well formed. >>>>> >>>>> I've run into a problem whereby an mpiexec run of 8 processes will stop >>>>> due to a perl-thread related problem which says >>>>> >>>>> FATAL: Thread terminated, causing all processes to fail >>>>> >>>>> this corresponds to line 924 in the maker executable (which is for the >>>>> secondary/worker threads), and is the result of a test on !$thr OR'd with >>>>> !$thr->is_running, so clearly one of these is failing. >>>>> >>>>> $thr itself is a threads->new(\&$node_thread, $gdbfile). Despite being a >>>>> programmer, I've only recently started to look at the code and have not >>>>> got the hang of the parallelisation setup here, though I gather the master >>>>> must use threads to initially generate the parallel instances which then >>>>> use the message passing. 
Of course threads don't have message passing >>>>> ability, so I guess something clever is going on and will take some time >>>>> for me to understand. >>>>> >>>>> Clearly however, it has worked before on dpp_contigs, so it may be that >>>>> something is wrong with my datafile or the way I am carrying out the >>>>> analysis. >>>>> >>>>> Any clues that can be put my way are welcome. >>>>> >>>>> Thank you! >>>> >>> >> > > _______________________________________________ maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Fri Mar 8 13:28:32 2013 From: carsonhh at gmail.com (Carson Holt) Date: Fri, 08 Mar 2013 15:28:32 -0500 Subject: [maker-devel] thread terminated, causing all processes to fail In-Reply-To: Message-ID: Also delete mpi_blastdb before retrying with the new svn repository. Thanks, Carson From: Carson Holt Date: Friday, 8 March, 2013 3:20 PM To: Ramón Fallon Cc: "maker-devel at yandell-lab.org" Subject: Re: [maker-devel] thread terminated, causing all processes to fail I think I've found the potential cause and committed the necessary changes to fix it. Thanks, Carson From: Ramón Fallon Date: Thursday, 7 March, 2013 12:47 PM To: Carson Holt Cc: "maker-devel at yandell-lab.org" Subject: Re: [maker-devel] thread terminated, causing all processes to fail This is a standalone machine and no NFS at all. "df" gives a healthy amount of disk space, so there should be no problem there. Yes that file does exist although it has the nominal 12288 bytes size, which appears to be the minimum for a DB_file tie. As I mentioned the dpp_contig.fa example set does work so part of my investigation is looking at how. I can do some trivial unit tests on the Bioperl stat-before-tied-hashes situation and see what comes up. So I'll attempt to clear that up and then revert. Many thanks! / Ramón. 
On Thu, Mar 7, 2013 at 5:44 PM, Carson Holt wrote: > That is extremely odd. It fails to even generate the indexes. Could you check > the drive space of your working directory and your /tmp directory? > > It is odd because Bioperl uses the stat command to check on the file right > before making a tied hash. So it was there for the stat but not the tie, > which is immediately following. > > If you check manually does it exist now? --> > /home/ramonf/makertrials/mgallocut7/sca29310_8.maker.output/mpi_blastdb/sca29310_8%2Efa.mpi.1/sca29310_8%2Efa.mpi.1.0.index > > Are you running in an NFS mounted directory? > > --Carson > > > From: Ramón Fallon > Date: Thursday, 7 March, 2013 9:40 AM > > To: Carson Holt > Cc: "maker-devel at yandell-lab.org" > Subject: Re: [maker-devel] thread terminated, causing all processes to fail > > Hi Carson, > > I send you a zip of the text file of my repeated maker session, this time > having deleted the mpi_blastdb dir and with the -a flag added to "mpiexec -n 8 > maker -debug". Command line. > > Cheers / Ramón. > > > On Wed, Mar 6, 2013 at 7:49 PM, Ramón Fallon wrote: >> OK, will do. >> >> Will get back to you tomorrow on it. >> >> Many thanks! >> >> >> On Wed, Mar 6, 2013 at 7:22 PM, Carson Holt wrote: >>> Could you delete your ../*maker.output/mpi_blastdb directory, and then when >>> rerunning maker, run with the -a flag. >>> >>> Thanks, >>> Carson >>> >>> >>> From: Ramón Fallon >>> Date: Wednesday, 6 March, 2013 1:15 PM >>> To: Carson Holt >>> Cc: "maker-devel at yandell-lab.org" >>> >>> Subject: Re: thread terminated, causing all processes to fail >>> >>> OK great, here goes .. many thanks! >>> >>> >>> >>> On Wed, Mar 6, 2013 at 7:04 PM, Carson Holt wrote: >>>> If you do reply all to this message, I should get the attachment. It will >>>> be stripped from the one going to the list though. 
>>>> >>>> Thanks, >>>> Carson >>>> >>>> >>>> >>>> From: Ramón Fallon >>>> Date: Wednesday, 6 March, 2013 12:57 PM >>>> To: >>>> Subject: Re: thread terminated, causing all processes to fail >>>> >>>> Hi, >>>> >>>> Many thanks for your quick reply and hint. >>>> >>>> Yes, you're right .. further up there is indeed >>>> >>>> Calling FastaDB::new at /opt/src/maker_svn/bin/../lib/FastaSeq.pm line 148 >>>> thread 1. >>>> Thread 1 terminated abnormally: ERROR: Could not reestablish DB to thaw >>>> FastaSeq for Storable >>>> --> rank=5, hostname=fatnode, at /opt/src/maker_svn/bin/maker line 1457 >>>> thread 1. >>>> >>>> I run a "script" session and have maker on -debug so I have everything in >>>> one file. Do you prefer to have it attached to a post to this mailing list >>>> (if it accepts txt attachments) >>>> >>>> Cheers. >>>> >>>> >>>> On Wed, Mar 6, 2013 at 6:34 PM, Ramón Fallon wrote: >>>>> Hi, >>>>> >>>>> I'm using the maker_svn rev 995 version and hand-compiled MPICH2 on a >>>>> single multicore machine. >>>>> >>>>> I've successfully run the dpp_contig.fasta (MPI/8 processes) example but >>>>> am having trouble with larger contigs fasta files of my own, which are >>>>> well formed. >>>>> >>>>> I've run into a problem whereby an mpiexec run of 8 processes will stop >>>>> due to a perl-thread related problem which says >>>>> >>>>> FATAL: Thread terminated, causing all processes to fail >>>>> >>>>> this corresponds to line 924 in the maker executable (which is for the >>>>> secondary/worker threads), and is the result of a test on !$thr OR'd with >>>>> !$thr->is_running, so clearly one of these is failing. >>>>> >>>>> $thr itself is a threads->new(\&$node_thread, $gdbfile). Despite being a >>>>> programmer, I've only recently started to look at the code and have not >>>>> got the hang of the parallelisation setup here, though I gather the master >>>>> must use threads to initially generate the parallel instances which then >>>>> use the message passing. 
Of course threads don't have message passing >>>>> ability, so I guess something clever is going on and will take some time >>>>> for me to understand. >>>>> >>>>> Clearly however, it has worked before on dpp_contigs, so it may be that >>>>> something is wrong with my datafile or the way I am carrying out the >>>>> analysis. >>>>> >>>>> Any clues that can be put my way are welcome. >>>>> >>>>> Thank you! >>>> >>> >> > > _______________________________________________ maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Sun Mar 10 10:31:27 2013 From: carsonhh at gmail.com (Carson Holt) Date: Sun, 10 Mar 2013 12:31:27 -0400 Subject: [maker-devel] thread terminated, causing all processes to fail In-Reply-To: Message-ID: I've fixed the missing script issue. Thanks, Carson From: Ramón Fallon Date: Sunday, 10 March, 2013 10:45 AM To: Carson Holt Cc: "maker-devel at yandell-lab.org" Subject: Re: [maker-devel] thread terminated, causing all processes to fail Hi Carson, In terms of rev 995, on a simplified version of our data set, I tried a sequential run successfully, and even a "mpiexec -n 4" which ran to completion. In any case, many thanks for the new version 996. I did have a problem with the build, namely the new line: 'bin/TACC.PL' => ['bin/ibrun'], I tried to find TACC.PL unsuccessfully, so I decided to dispense with this new line and then it compiled fine. I started one or two tests and will inform you later about them. From my end I must admit I am using a rather large EST fasta file, but it is not useful for testing .. I will try to cut it down Monday or Tues so that tests can be more agile. Many thanks / Ramón. On Fri, Mar 8, 2013 at 9:28 PM, Carson Holt wrote: > Also delete mpi_blastdb before retrying with the new svn repository. 
> > Thanks, > Carson > > > From: Carson Holt > Date: Friday, 8 March, 2013 3:20 PM > To: Ramón Fallon > > Cc: "maker-devel at yandell-lab.org" > Subject: Re: [maker-devel] thread terminated, causing all processes to fail > > I think I've found the potential cause and committed the necessary changes to > fix it. > > Thanks, > Carson > > > From: Ramón Fallon > Date: Thursday, 7 March, 2013 12:47 PM > To: Carson Holt > Cc: "maker-devel at yandell-lab.org" > Subject: Re: [maker-devel] thread terminated, causing all processes to fail > > This is a standalone machine and no NFS at all. "df" gives a healthy amount of > disk space, so there should be no problem there. > > Yes that file does exist although it has the nominal 12288 bytes size, which > appears to be the minimum for a DB_file tie. > > As I mentioned the dpp_contig.fa example set does work so part of my > investigation is looking at how. > > I can do some trivial unit tests on the Bioperl stat-before-tied-hashes > situation and see what comes up. > > So I'll attempt to clear that up and then revert. > > Many thanks! / Ramón. > > > On Thu, Mar 7, 2013 at 5:44 PM, Carson Holt wrote: >> That is extremely odd. It fails to even generate the indexes. Could you >> check the drive space of your working directory and your /tmp directory? >> >> It is odd because Bioperl uses the stat command to check on the file right >> before making a tied hash. So it was there for the stat but not the tie, >> which is immediately following. >> >> If you check manually does it exist now? --> >> /home/ramonf/makertrials/mgallocut7/sca29310_8.maker.output/mpi_blastdb/sca29310_8%2Efa.mpi.1/sca29310_8%2Efa.mpi.1.0.index >> >> Are you running in an NFS mounted directory? 
>> >> --Carson >> >> >> From: Ramón Fallon >> Date: Thursday, 7 March, 2013 9:40 AM >> >> To: Carson Holt >> Cc: "maker-devel at yandell-lab.org" >> Subject: Re: [maker-devel] thread terminated, causing all processes to fail >> >> Hi Carson, >> >> I send you a zip of the text file of my repeated maker session, this time >> having deleted the mpi_blastdb dir and with the -a flag added to "mpiexec -n >> 8 maker -debug". Command line. >> >> Cheers / Ramón. >> >> >> On Wed, Mar 6, 2013 at 7:49 PM, Ramón Fallon wrote: >>> OK, will do. >>> >>> Will get back to you tomorrow on it. >>> >>> Many thanks! >>> >>> >>> On Wed, Mar 6, 2013 at 7:22 PM, Carson Holt wrote: >>>> Could you delete your ../*maker.output/mpi_blastdb directory, and then when >>>> rerunning maker, run with the -a flag. >>>> >>>> Thanks, >>>> Carson >>>> >>>> >>>> From: Ramón Fallon >>>> Date: Wednesday, 6 March, 2013 1:15 PM >>>> To: Carson Holt >>>> Cc: "maker-devel at yandell-lab.org" >>>> >>>> Subject: Re: thread terminated, causing all processes to fail >>>> >>>> OK great, here goes .. many thanks! >>>> >>>> >>>> >>>> On Wed, Mar 6, 2013 at 7:04 PM, Carson Holt wrote: >>>>> If you do reply all to this message, I should get the attachment. It will >>>>> be stripped from the one going to the list though. >>>>> >>>>> Thanks, >>>>> Carson >>>>> >>>>> >>>>> >>>>> From: Ramón Fallon >>>>> Date: Wednesday, 6 March, 2013 12:57 PM >>>>> To: >>>>> Subject: Re: thread terminated, causing all processes to fail >>>>> >>>>> Hi, >>>>> >>>>> Many thanks for your quick reply and hint. >>>>> >>>>> Yes, you're right .. further up there is indeed >>>>> >>>>> Calling FastaDB::new at /opt/src/maker_svn/bin/../lib/FastaSeq.pm line 148 >>>>> thread 1. >>>>> Thread 1 terminated abnormally: ERROR: Could not reestablish DB to thaw >>>>> FastaSeq for Storable >>>>> --> rank=5, hostname=fatnode, at /opt/src/maker_svn/bin/maker line 1457 >>>>> thread 1. 
>>>>> >>>>> I run a "script" session and have maker on -debug so I have everything in >>>>> one file. Do you prefer to have it attached to a post to this mailing list >>>>> (if it accepts txt attachments) >>>>> >>>>> Cheers. >>>>> >>>>> >>>>> On Wed, Mar 6, 2013 at 6:34 PM, Ramón Fallon >>>>> wrote: >>>>>> Hi, >>>>>> >>>>>> I'm using the maker_svn rev 995 version and hand-compiled MPICH2 on a >>>>>> single multicore machine. >>>>>> >>>>>> I've successfully run the dpp_contig.fasta (MPI/8 processes) example but >>>>>> am having trouble with larger contigs fasta files of my own, which are >>>>>> well formed. >>>>>> >>>>>> I've run into a problem whereby an mpiexec run of 8 processes will stop >>>>>> due to a perl-thread related problem which says >>>>>> >>>>>> FATAL: Thread terminated, causing all processes to fail >>>>>> >>>>>> this corresponds to line 924 in the maker executable (which is for the >>>>>> secondary/worker threads), and is the result of a test on !$thr OR'd with >>>>>> !$thr->is_running, so clearly one of these is failing. >>>>>> >>>>>> $thr itself is a threads->new(\&$node_thread, $gdbfile). Despite being a >>>>>> programmer, I've only recently started to look at the code and have not >>>>>> got the hang of the parallelisation setup here, though I gather the >>>>>> master must use threads to initially generate the parallel instances >>>>>> which then use the message passing. Of course threads don't have message >>>>>> passing ability, so I guess something clever is going on and will take >>>>>> some time for me to understand. >>>>>> >>>>>> Clearly however, it has worked before on dpp_contigs, so it may be that >>>>>> something is wrong with my datafile or the way I am carrying out the >>>>>> analysis. >>>>>> >>>>>> Any clues that can be put my way are welcome. >>>>>> >>>>>> Thank you! 
>>>>> >>>> >>> >> >> _______________________________________________ maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ramonfallon at gmail.com Sun Mar 10 08:45:38 2013 From: ramonfallon at gmail.com (Ramón Fallon) Date: Sun, 10 Mar 2013 15:45:38 +0100 Subject: [maker-devel] thread terminated, causing all processes to fail In-Reply-To: References: Message-ID: Hi Carson, In terms of rev 995, on a simplified version of our data set, I tried a sequential run successfully, and even a "mpiexec -n 4" which ran to completion. In any case, many thanks for the new version 996. I did have a problem with the build, namely the new line: 'bin/TACC.PL' => ['bin/ibrun'], I tried to find TACC.PL unsuccessfully, so I decided to dispense with this new line and then it compiled fine. I started one or two tests and will inform you later about them. From my end I must admit I am using a rather large EST fasta file, but it is not useful for testing .. I will try to cut it down Monday or Tues so that tests can be more agile. Many thanks / Ramón. On Fri, Mar 8, 2013 at 9:28 PM, Carson Holt wrote: > Also delete mpi_blastdb before retrying with the new svn repository. > > Thanks, > Carson > > > From: Carson Holt > Date: Friday, 8 March, 2013 3:20 PM > To: Ramón Fallon > > Cc: "maker-devel at yandell-lab.org" > Subject: Re: [maker-devel] thread terminated, causing all processes to > fail > > I think I've found the potential cause and committed the necessary changes > to fix it. > > Thanks, > Carson > > > From: Ramón Fallon > Date: Thursday, 7 March, 2013 12:47 PM > To: Carson Holt > Cc: "maker-devel at yandell-lab.org" > Subject: Re: [maker-devel] thread terminated, causing all processes to > fail > > This is a standalone machine and no NFS at all. 
"df" gives a healthy > amount of disk space, so there should be no problem there. > > Yes that file does exist although it has the nominal 12288 bytes size, > which appears to be the minimum for a DB_file tie. > > As I mentioned the dpp_contig.fa example set does work so part of my > investigation is looking at how. > > I can do some trivial unit tests on the Bioperl stat-before-tied-hashes > situation and see what comes up. > > So I'll attempt to clear that up and then revert. > > Many thanks! / Ramón. > > > On Thu, Mar 7, 2013 at 5:44 PM, Carson Holt wrote: > >> That is extremely odd. It fails to even generate the indexes. Could you >> check the drive space of your working directory and your /tmp directory? >> >> It is odd because Bioperl uses the stat command to check on the file >> right before making a tied hash. So it was there for the stat but not the >> tie, which is immediately following. >> >> If you check manually does it exist now? --> >> /home/ramonf/makertrials/mgallocut7/sca29310_8.maker.output/mpi_blastdb/sca29310_8%2Efa.mpi.1/sca29310_8%2Efa.mpi.1.0.index >> >> Are you running in an NFS mounted directory? >> >> --Carson >> >> >> From: Ramón Fallon >> Date: Thursday, 7 March, 2013 9:40 AM >> >> To: Carson Holt >> Cc: "maker-devel at yandell-lab.org" >> Subject: Re: [maker-devel] thread terminated, causing all processes to >> fail >> >> Hi Carson, >> >> I send you a zip of the text file of my repeated maker session, this time >> having deleted the mpi_blastdb dir and with the -a flag added to "mpiexec >> -n 8 maker -debug". Command line. >> >> Cheers / Ramón. >> >> >> On Wed, Mar 6, 2013 at 7:49 PM, Ramón Fallon wrote: >> >>> OK, will do. >>> >>> Will get back to you tomorrow on it. >>> >>> Many thanks! >>> >>> >>> On Wed, Mar 6, 2013 at 7:22 PM, Carson Holt wrote: >>> >>>> Could you delete your ../*maker.output/mpi_blastdb directory, and then >>>> when rerunning maker, run with the -a flag. 
>>>> >>>> Thanks, >>>> Carson >>>> >>>> >>>> From: Ramón Fallon >>>> Date: Wednesday, 6 March, 2013 1:15 PM >>>> To: Carson Holt >>>> Cc: "maker-devel at yandell-lab.org" >>>> >>>> Subject: Re: thread terminated, causing all processes to fail >>>> >>>> OK great, here goes .. many thanks! >>>> >>>> >>>> >>>> On Wed, Mar 6, 2013 at 7:04 PM, Carson Holt wrote: >>>> >>>>> If you do reply all to this message, I should get the attachment. It >>>>> will be stripped from the one going to the list though. >>>>> >>>>> Thanks, >>>>> Carson >>>>> >>>>> >>>>> >>>>> From: Ramón Fallon >>>>> Date: Wednesday, 6 March, 2013 12:57 PM >>>>> To: >>>>> Subject: Re: thread terminated, causing all processes to fail >>>>> >>>>> Hi, >>>>> >>>>> Many thanks for your quick reply and hint. >>>>> >>>>> Yes, you're right .. further up there is indeed >>>>> >>>>> Calling FastaDB::new at /opt/src/maker_svn/bin/../lib/FastaSeq.pm line >>>>> 148 thread 1. >>>>> Thread 1 terminated abnormally: ERROR: Could not reestablish DB to >>>>> thaw FastaSeq for Storable >>>>> --> rank=5, hostname=fatnode, at /opt/src/maker_svn/bin/maker line >>>>> 1457 thread 1. >>>>> >>>>> I run a "script" session and have maker on -debug so I have everything >>>>> in one file. Do you prefer to have it attached to a post to this mailing >>>>> list (if it accepts txt attachments) >>>>> >>>>> Cheers. >>>>> >>>>> >>>>> On Wed, Mar 6, 2013 at 6:34 PM, Ramón Fallon wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I'm using the maker_svn rev 995 version and hand-compiled MPICH2 on a >>>>>> single multicore machine. >>>>>> >>>>>> I've successfully run the dpp_contig.fasta (MPI/8 processes) example >>>>>> but am having trouble with larger contigs fasta files of my own, which are >>>>>> well formed. 
>>>>>> >>>>>> I've run into a problem whereby an mpiexec run of 8 processes will >>>>>> stop due to a perl-thread related problem which says >>>>>> >>>>>> FATAL: Thread terminated, causing all processes to fail >>>>>> >>>>>> this corresponds to line 924 in the maker executable (which is for >>>>>> the secondary/worker threads), and is the result of a test on !$thr OR'd >>>>>> with !$thr->is_running, so clearly one of these is failing. >>>>>> >>>>>> $thr itself is a threads->new(\&$node_thread, $gdbfile). Despite >>>>>> being a programmer, I've only recently started to look at the code and have >>>>>> not got the hang of the parallelisation setup here, though I gather the >>>>>> master must use threads to initially generate the parallel instances which >>>>>> then use the message passing. Of course threads don't have message passing >>>>>> ability, so I guess something clever is going on and will take some time >>>>>> for me to understand. >>>>>> >>>>>> Clearly however, it has worked before on dpp_contigs, so it may be that >>>>>> something is wrong with my datafile or the way I am carrying out the analysis. >>>>>> >>>>>> Any clues that can be put my way are welcome. >>>>>> >>>>>> Thank you! >>>>>> >>>>> >>>>> >>>> >>> >> _______________________________________________ maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikheyev at gmail.com Mon Mar 11 03:46:06 2013 From: mikheyev at gmail.com (Sasha Mikheyev) Date: Mon, 11 Mar 2013 18:46:06 +0900 Subject: [maker-devel] duplicate CDS in annotation Message-ID: Dear Yandell lab, I am re-annotating the harvester ant genome using protein and RNA-seq data. However, I get many artifacts like the one below. 
It seems that there are several CDS records that should tie in to the same mRNA, but they are really hanging out separately, and produce several nucleotide sequences with the same name when extracted from the gff. I would appreciate any guidance about how to fix this! Thank you, Sasha grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159; pbar_scf7180000350377 maker mRNA 538308 558769 . + . ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; pbar_scf7180000350377 maker exon 538308 538334 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506; pbar_scf7180000350377 maker exon 538748 538968 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506; pbar_scf7180000350377 maker exon 539842 540242 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506; pbar_scf7180000350377 maker exon 542624 542798 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506; pbar_scf7180000350377 maker exon 555823 556025 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506; pbar_scf7180000350377 maker exon 558609 558769 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506; pbar_scf7180000350377 maker CDS 538308 538334 . + 0 ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506; pbar_scf7180000350377 maker CDS 538748 538968 . 
+ 0 ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506; pbar_scf7180000350377 maker CDS 539842 540242 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506; pbar_scf7180000350377 maker CDS 542624 542798 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506; pbar_scf7180000350377 maker CDS 555823 556025 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506; pbar_scf7180000350377 maker CDS 558609 558769 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506; -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry.moore at genetics.utah.edu Mon Mar 11 05:32:44 2013 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Mon, 11 Mar 2013 05:32:44 -0600 Subject: [maker-devel] duplicate CDS in annotation In-Reply-To: References: Message-ID: <52822F85-760B-451B-B156-8861EA77A910@genetics.utah.edu> Hi Sasha, This gene model appears to be correctly formatted to me. In GFF3 format a CDS feature is allowed to span multiple lines, and those lines share the same ID to indicate that they are all parts of the same feature. See the GFF3 specification on the Sequence Ontology website (http://www.sequenceontology.org/resources/gff3.html), and in particular the description of the ID attribute specifies: ID Indicates the ID of the feature. IDs for each feature must be unique within the scope of the GFF file. In the case of discontinuous features (i.e. a single feature that exists over multiple genomic locations) the same ID may appear on multiple lines. All lines that share an ID collectively represent a single feature. So each of those CDS lines forms one part of the single CDS feature for this gene. B On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote: > Dear Yandell lab, > > I am re-annotating the harvester ant genome using protein and RNA-seq data. However, I get many artifacts like the one below. 
It seems that there are several CDS records that should tie in to the same mRNA, but they are really hanging out separately, and produce several nucleotide sequences with the same name when extracted from the gff. I would appreciate any guidance about how to fix this! > > Thank you, > > Sasha > > grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff > pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; > pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159; > pbar_scf7180000350377 maker mRNA 538308 558769 . + . ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; > pbar_scf7180000350377 maker exon 538308 538334 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 538748 538968 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 539842 540242 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 542624 542798 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 555823 556025 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 558609 558769 0.01 + . ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 538308 538334 . + 0 ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 538748 538968 . 
+ 0 ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 539842 540242 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 542624 542798 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 555823 556025 . + 1 ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 558609 558769 . + 2 ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506; > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org Barry Moore Research Scientist Dept. of Human Genetics University of Utah Salt Lake City, UT 84112 -------------------------------------------- (801) 585-3543 -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon Mar 11 07:02:13 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 11 Mar 2013 09:02:13 -0400 Subject: [maker-devel] duplicate CDS in annotation In-Reply-To: <52822F85-760B-451B-B156-8861EA77A910@genetics.utah.edu> Message-ID: I think the issue is that you are getting a match feature that is being printed with the same ID as the mRNA feature. Correct? What version of MAKER are you using, and what does the file you are giving to pred_gff or model_gff look like? Could you send them? Thanks, Carson From: Barry Moore Date: Monday, 11 March, 2013 7:32 AM To: Sasha Mikheyev Cc: Subject: Re: [maker-devel] duplicate CDS in annotation Hi Sasha, This gene model appears to be correctly formatted to me. In GFF3 format a CDS feature is allowed to span multiple lines, and those lines share the same ID to indicate that they are all parts of the same feature. 
See the GFF3 specification on the Sequence Ontology website (http://www.sequenceontology.org/resources/gff3.html), and in particular the description of the ID attribute specifies: > ID Indicates the ID of the feature. IDs for each feature must be unique > within the scope of the GFF file. In the case of discontinuous features (i.e. > a single feature that exists over multiple genomic locations) the same ID may > appear on multiple lines. All lines that share an ID collectively represent a > single feature. So each of those CDS lines forms one part of the single CDS feature for this gene. B On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote: > Dear Yandell lab, > > I am re-annotating the harvester and genome using protein and RNA-seq data. > However, I get many artifacts like the one below. It seems that there are > several CDS records that should tie in to the same mRNA, but they are really > hanging out separately, and produce several nucleotide sequences with the same > name when extracted from the gff. I would appreciate any guidance about how to > fix this! > > Thank you, > > Sasha > > grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff > pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . > ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; > pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . > ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=H > sal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159; > pbar_scf7180000350377 maker mRNA 538308 558769 . + . > ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377 > -abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29- > mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; > pbar_scf7180000350377 maker exon 538308 538334 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:25 > 06; > pbar_scf7180000350377 maker exon 538748 538968 0.01 + . 
> ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:25 > 06; > pbar_scf7180000350377 maker exon 539842 540242 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:25 > 06; > pbar_scf7180000350377 maker exon 542624 542798 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:25 > 06; > pbar_scf7180000350377 maker exon 555823 556025 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:25 > 06; > pbar_scf7180000350377 maker exon 558609 558769 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:25 > 06; > pbar_scf7180000350377 maker CDS 538308 538334 . + 0 > ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:250 > 6; > pbar_scf7180000350377 maker CDS 538748 538968 . + 0 > ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:250 > 6; > pbar_scf7180000350377 maker CDS 539842 540242 . + 1 > ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:250 > 6; > pbar_scf7180000350377 maker CDS 542624 542798 . + 2 > ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:250 > 6; > pbar_scf7180000350377 maker CDS 555823 556025 . + 1 > ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:250 > 6; > pbar_scf7180000350377 maker CDS 558609 558769 . + 2 > ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:250 > 6; > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org Barry Moore Research Scientist Dept. 
of Human Genetics University of Utah Salt Lake City, UT 84112 -------------------------------------------- (801) 585-3543 _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From sitaram.rajaraman at helsinki.fi Mon Mar 11 08:33:27 2013 From: sitaram.rajaraman at helsinki.fi (Sitaram Rajaraman) Date: Mon, 11 Mar 2013 16:33:27 +0200 Subject: [maker-devel] Doubts in the synthesis part of MAKER Message-ID: <513DEB37.6090601@helsinki.fi> Hello MAKER developers, I'm Sitaram, working as a Bioinformatician at the University of Helsinki. We are trying out MAKER as part of a gene prediction/annotation pipeline and have some doubts regarding this. In the synthesis step in the paper, I find it a bit hard to visualise how the hints are generated from the various sources and the scores are calculated. It would be nice if you could throw some light on this. Also if you could point to the particular .Pm file which contains the actual source code, it would be convenient as there quite a lot of source code and debugging the whole set is bit cumbersome. Regards, -- Sitaram Rajaraman, Plant Stress Research Group, Dept of Biosciences, University of Helsinki. From carsonhh at gmail.com Mon Mar 11 08:51:56 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 11 Mar 2013 10:51:56 -0400 Subject: [maker-devel] Doubts in the synthesis part of MAKER In-Reply-To: <513DEB37.6090601@helsinki.fi> Message-ID: Hints are basically CDS location, exon location, and intron location. The CDS hints are based on protein alignment. Intron and exon hints are based on the EST alignments, which when polished should give exact intron coordinates. Ironically the most useless part of the gene model is actually the most informative feature for gene prediction (the intron coordinates). 
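A minimal sketch of the intron hint idea described above, written in Python rather than MAKER's own Perl and not taken from the MAKER source: given the exon blocks of one polished spliced alignment, the introns are simply the gaps between consecutive exons. The function name and the coordinate convention (1-based, inclusive) are assumptions for illustration; the sample coordinates are borrowed from the exon lines quoted elsewhere in this thread.

```python
def introns_from_exons(exons):
    """Return intron (start, end) pairs implied by the exon blocks of one
    spliced alignment. Coordinates are assumed 1-based and inclusive."""
    ordered = sorted(exons)
    introns = []
    # Walk over consecutive exon pairs; the gap between them is an intron.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end + 1:  # skip abutting or overlapping blocks
            introns.append((prev_end + 1, next_start - 1))
    return introns


# Illustrative exon blocks (hypothetical use of coordinates from the thread).
exons = [(538308, 538334), (538748, 538968), (539842, 540242)]
print(introns_from_exons(exons))
# [(538335, 538747), (538969, 539841)]
```

Because the intron boundaries fall directly out of the aligned exon blocks, a well-polished alignment pins down splice sites exactly, which is presumably why these coordinates are so informative for a gene predictor.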
lib/Process/MPIchunk.pm will have the steps in the _go method. It is a little hard to follow as MAKER is designed for distributed parallelization (i.e. parallelization without shared memory, with steps potentially divided among different machines on the other end of the network). It is divided into MPItier and MPIchunk objects. The MPItier object encapsulates a series of linear steps or 'levels', while each MPIchunk object encapsulates a single step sent to a machine across the network and exists within a single 'level' of the MPItier object. Note there can be multiple chunks assigned to a 'level'. MPItiers can also have MPItiers as children at a given level instead of MPIchunks, so the process structure then branches like a tree and can then merge back somewhere in the middle of the algorithm. The 'maker' script is really just the communication script for the objects. In MPI one maker thread is launched to handle communication and another to run the MPItiers and MPIchunks. The communication threads then pass MPIchunks and MPItiers back and forth across the network by either requesting things to do from other nodes or by asking for help if they have a large number of MPIchunks or MPItiers to process. Thanks, Carson On 13-03-11 10:33 AM, "Sitaram Rajaraman" wrote: >Hello MAKER developers, > I'm Sitaram, working as a Bioinformatician at the University of >Helsinki. We are trying out MAKER as part of a gene prediction/annotation >pipeline and have some doubts regarding this. In the synthesis step in >the paper, I find it a bit hard to visualise how the hints are generated >from the various sources and the scores are calculated. It would be nice >if you could throw some light on this. Also if you could point to the >particular .Pm file which contains the actual source code, it would be >convenient as there quite a lot of source code and debugging the whole >set is bit cumbersome.
> >Regards, > >-- >Sitaram Rajaraman, >Plant Stress Research Group, >Dept of Biosciences, >University of Helsinki. > > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From sitaram.rajaraman at helsinki.fi Mon Mar 11 09:03:20 2013 From: sitaram.rajaraman at helsinki.fi (Sitaram Rajaraman) Date: Mon, 11 Mar 2013 17:03:20 +0200 Subject: [maker-devel] Doubts in the synthesis part of MAKER In-Reply-To: References: Message-ID: <513DF238.4050109@helsinki.fi> Thank you ! I will proceed with this information ! - Sitaram. On 03/11/2013 04:51 PM, Carson Holt wrote: > Hints are basically CDS location, exon location, and intron location. The > CDS hints are based on protein alignment. Intron and exon hints are based > on the EST alignments, which when polished should give exact intron > coordinates. Ironically the most useless part of the gene model is > actually the most informative feature for gene prediction (the intron > coordinates). > > lib/Process/MPIchunk.pm will have the steps in the _go method. It is a > little hard to follow as MAKER is designed for distributed parallelization > (i.e. parallelization without shared memory with steps potentially divided > on different machines on the other end of the network). > > It is divided into MPItier and MPIchunk objects. The MPItier object > encapsulate a series of linear steps or 'levels' while the MPIchunk > objects encapsulate a single step sent to a machine across the network and > it exists within a single 'level' of the MPITier object. Note there can > be multiple chunks assigned to a 'level'. MPItiers can also have MPITiers > as children at a given level instead of MPIchunks, so the process > structure then branches like a tree and can then merges back somewhere in > the middle of the algorithm. > > The 'maker' script is really just the communication script for the > objects. 
In MPI one maker thread is launched to handle communication and > another to run the MPItiers and MPIchunks. They communication threads > then pass MPIchunks and MPITiers back and forth across the network by > either requesting things to do from other nodes or by asking for help if > they have a large number of MPIChunks or MPItiers to process. > > Thanks, > Carson > > > > > > On 13-03-11 10:33 AM, "Sitaram Rajaraman" > wrote: > >> Hello MAKER developers, >> I'm Sitaram, working as a Bioinformatician at the University of >> Helsinki. We are trying out MAKER as part of a gene prediction/annotation >> pipeline and have some doubts regarding this. In the synthesis step in >> the paper, I find it a bit hard to visualise how the hints are generated > >from the various sources and the scores are calculated. It would be nice >> if you could throw some light on this. Also if you could point to the >> particular .Pm file which contains the actual source code, it would be >> convenient as there quite a lot of source code and debugging the whole >> set is bit cumbersome. >> >> Regards, >> >> -- >> Sitaram Rajaraman, >> Plant Stress Research Group, >> Dept of Biosciences, >> University of Helsinki. >> >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > -- Sitaram Rajaraman, Plant Stress Research Group, Dept of Biosciences, University of Helsinki. From carsonhh at gmail.com Mon Mar 11 09:05:30 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 11 Mar 2013 11:05:30 -0400 Subject: [maker-devel] Doubts in the synthesis part of MAKER In-Reply-To: Message-ID: One more detail. There are basically 5 steps per level. Load -> this step creates the chunks for that level. This is where I decide how many chunks to make, and any special variables need to be generated for packaging into that chunk. 
Init --> this is just a declaration of the variables to package into a chunk (only give the chunk what it needs)
Run --> this is the actual code that will be run after the chunk is transported to its destination
Result --> this describes how to merge the results of that chunk back into the parent object
Flow --> this decides what to do when all chunks for that level are complete (i.e. which level to move on to next). The default is the next level in linear succession, but it can jump forward or backward several levels if needed.
Thanks, Carson On 13-03-11 10:51 AM, "Carson Holt" wrote: >Hints are basically CDS location, exon location, and intron location. >The >CDS hints are based on protein alignment. Intron and exon hints are >based >on the EST alignments, which when polished should give exact intron >coordinates. Ironically the most useless part of the gene model is >actually the most informative feature for gene prediction (the intron >coordinates). > >lib/Process/MPIchunk.pm will have the steps in the _go method. It is a >little hard to follow as MAKER is designed for distributed >parallelization >(i.e. parallelization without shared memory with steps potentially >divided >on different machines on the other end of the network). > >It is divided into MPItier and MPIchunk objects. The MPItier object >encapsulate a series of linear steps or 'levels' while the MPIchunk >objects encapsulate a single step sent to a machine across the network >and >it exists within a single 'level' of the MPITier object. Note there can >be multiple chunks assigned to a 'level'. MPItiers can also have >MPITiers >as children at a given level instead of MPIchunks, so the process >structure then branches like a tree and can then merges back somewhere in >the middle of the algorithm. > >The 'maker' script is really just the communication script for the >objects. In MPI one maker thread is launched to handle communication and >another to run the MPItiers and MPIchunks.
They communication threads >then pass MPIchunks and MPITiers back and forth across the network by >either requesting things to do from other nodes or by asking for help if >they have a large number of MPIChunks or MPItiers to process. > >Thanks, >Carson > > > > > >On 13-03-11 10:33 AM, "Sitaram Rajaraman" >wrote: > >>Hello MAKER developers, >> I'm Sitaram, working as a Bioinformatician at the University of >>Helsinki. We are trying out MAKER as part of a gene >>prediction/annotation >>pipeline and have some doubts regarding this. In the synthesis step in >>the paper, I find it a bit hard to visualise how the hints are generated >>from the various sources and the scores are calculated. It would be nice >>if you could throw some light on this. Also if you could point to the >>particular .Pm file which contains the actual source code, it would be >>convenient as there quite a lot of source code and debugging the whole >>set is bit cumbersome. >> >>Regards, >> >>-- >>Sitaram Rajaraman, >>Plant Stress Research Group, >>Dept of Biosciences, >>University of Helsinki. >> >> >>_______________________________________________ >>maker-devel mailing list >>maker-devel at box290.bluehost.com >>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From isradelacon at gmail.com Mon Mar 11 12:34:27 2013 From: isradelacon at gmail.com (Israel Barrantes) Date: Mon, 11 Mar 2013 19:34:27 +0100 Subject: [maker-devel] different RNA-seq experiment outputs in separate annotation passes? 
Message-ID: Dear maker-devel, I have several RNA-seq experiment outputs that I want to use as input for MAKER annotation: (1) Illumina 1.3, strain A, cell stage N (2) Illumina 1.8, strain A, cell stage N (3) Illumina 1.8, strain B, cell stage N (4) 454, strain unknown, cell stage M For each experiment I mapped the reads and produced GTFs with tophat/cufflinks separately (and later converted them to GFF3s with the supplied script). Q1: Does it make a difference to run a separate annotation pass for each GFF3 from tophat/cufflinks? Q2: If so, will altering the order in which the cDNA GFFs are passed (e.g., experiment 1 GFF in the first pass, then exp. 2 in the second pass, etc.) produce more or fewer transcripts? Q3: Is it better to simply merge these GFFs into a single nonredundant file (e.g. with bedtools intersect) than to use them separately, one for each MAKER pass? Thank you in advance, -- Israel Barrantes Otto-von-Guericke-Universität Lehrstuhl für Regulationsbiologie IBIO/FNW Deutschland From dence at genetics.utah.edu Mon Mar 11 12:39:01 2013 From: dence at genetics.utah.edu (Daniel Ence) Date: Mon, 11 Mar 2013 18:39:01 +0000 Subject: [maker-devel] different RNA-seq experiment outputs in separate annotation passes? In-Reply-To: References: Message-ID: Hi Israel, I think that for general annotation purposes, you want to use all of those GFF files together in a single MAKER run to annotate the whole genome. If you're interested in exploring which genes are expressed in your different strains and cell stages, then you can use your annotation results and blast against the different RNA-seq experiments. I didn't answer your questions separately, but hopefully that gives some good guidance. If I missed something, let me know.
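As a rough illustration of supplying all of the RNA-seq GFF3 files to one run, the Python sketch below builds a single `est_gff=` line for maker_opts.ctl. It assumes that `est_gff` accepts a comma-separated list of files, which is worth confirming against the comments in the control file of the installed MAKER version; the file names and the helper function are hypothetical.

```python
def est_gff_line(paths):
    """Build an `est_gff=` control-file line from a list of GFF3 paths.

    Assumption: the option takes a comma-separated list of files.
    """
    # Drop empty entries and stray whitespace before joining.
    cleaned = [p.strip() for p in paths if p.strip()]
    return "est_gff=" + ",".join(cleaned)


# Hypothetical file names, one per RNA-seq experiment.
gff_files = [
    "illumina13_strainA.gff3",
    "illumina18_strainA.gff3",
    "illumina18_strainB.gff3",
    "454_unknown.gff3",
]
print(est_gff_line(gff_files))
# est_gff=illumina13_strainA.gff3,illumina18_strainA.gff3,illumina18_strainB.gff3,454_unknown.gff3
```

Keeping the experiments as separate files in one list, rather than merging them beforehand, leaves each source identifiable in the inputs while still letting a single run see all of the evidence.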
Thanks, Daniel Daniel Ence Graduate Student Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ________________________________ From: maker-devel-bounces at yandell-lab.org [maker-devel-bounces at yandell-lab.org] on behalf of Israel Barrantes [isradelacon at gmail.com] Sent: Monday, March 11, 2013 12:34 PM To: maker-devel at yandell-lab.org Subject: [maker-devel] different RNA-seq experiment outputs in separate annotation passes? Dear maker-devel, I have several RNA-seq experiment outputs that I want to use as input for MAKER annotation: (1) Illumina 1.3, strain A, cell stage N (2) Illumina 1.8, strain A, cell stage N (3) Illumina 1.8, strain B, cell stage N (4) 454, strain unknown, cell stage M For each experiment I mapped the reads and produced GTFs with tophat/cufflinks separately (and later converted to GFF3s with the supplied script) Q1: Does it make a difference to run a different annotation pass for each GFF3 from tophat/cufflinks? Q2: If this is the case, altering the order of passing the cDNA GFFs (e.g., first pass, experiment 1 GFF, then exp.2 in second pass, etc) will produce more or less transcripts? Q3: Is it better to simply merge this GFFs into a single nonredundant file (e.g. bedtools intersect) than using them separately, one for each MAKER pass? Thank you in advance, -- Israel Barrantes Otto-von-Guericke-Universität Lehrstuhl für Regulationsbiologie IBIO/FNW Deutschland From carsonhh at gmail.com Tue Mar 12 07:37:35 2013 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 12 Mar 2013 09:37:35 -0400 Subject: [maker-devel] duplicate CDS in annotation In-Reply-To: Message-ID: Yes. Try the newer version and see if you still have the issue.
Thanks, Carson From: Sasha Mikheyev Date: Tuesday, 12 March, 2013 1:26 AM To: Carson Holt Cc: Barry Moore , Subject: Re: [maker-devel] duplicate CDS in annotation Hi Carson, I have been using version 2.10. Is it worth trying with a newer version? You can find the model file here . It is rather large, as it includes all of the output from the first maker run. Yours, Sasha On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt wrote: > I think the issue is that you are getting a match feature that is being > printed with the same ID as the mRNA feature. Correct? > > What version of MAKER are you using, and what does the gile you are giving to > pred_gff or model_gff look like? Could you send them? > > Thanks, > Carson > > > From: Barry Moore > Date: Monday, 11 March, 2013 7:32 AM > To: Sasha Mikheyev > Cc: > Subject: Re: [maker-devel] duplicate CDS in annotation > > Hi Sasha, > > This gene model appears to be correctly formatted to me. In GFF3 format the > CDS features are allowed to span multiple lines and they share the same ID to > indicate that it is all the same features. See the GFF3 specification on the > Sequence Ontology website > (http://www.sequenceontology.org/resources/gff3.html), and in particular the > description of the ID attribute specifies: > >> ID Indicates the ID of the feature. IDs for each feature must be unique >> within the scope of the GFF file. In the case of discontinuous features >> (i.e. a single feature that exists over multiple genomic locations) the same >> ID may appear on multiple lines. All lines that share an ID collectively >> represent a single feature. > > So each of those CDS lines forms one part of the single CDS feature for this > gene. > > B > > On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote: > >> Dear Yandell lab, >> >> I am re-annotating the harvester and genome using protein and RNA-seq data. >> However, I get many artifacts like the one below. 
It seems that there are >> several CDS records that should tie in to the same mRNA, but they are really >> hanging out separately, and produce several nucleotide sequences with the >> same name when extracted from the gff. I would appreciate any guidance about >> how to fix this! >> >> Thank you, >> >> Sasha >> >> grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff >> pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . >> ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; >> pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . >> ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name= >> Hsal|HS9704;Target=Hsal|HS9704 1 53 +;Gap=M159; >> pbar_scf7180000350377 maker mRNA 538308 558769 . + . >> ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf718000035037 >> 7-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.2 >> 9-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; >> pbar_scf7180000350377 maker exon 538308 538334 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2 >> 506; >> pbar_scf7180000350377 maker exon 538748 538968 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2 >> 506; >> pbar_scf7180000350377 maker exon 539842 540242 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2 >> 506; >> pbar_scf7180000350377 maker exon 542624 542798 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2 >> 506; >> pbar_scf7180000350377 maker exon 555823 556025 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2 >> 506; >> pbar_scf7180000350377 maker exon 558609 558769 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2 >> 506; >> pbar_scf7180000350377 maker CDS 538308 538334 . 
+ 0 >> ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:25 >> 06; >> pbar_scf7180000350377 maker CDS 538748 538968 . + 0 >> ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:25 >> 06; >> pbar_scf7180000350377 maker CDS 539842 540242 . + 1 >> ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:25 >> 06; >> pbar_scf7180000350377 maker CDS 542624 542798 . + 2 >> ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:25 >> 06; >> pbar_scf7180000350377 maker CDS 555823 556025 . + 1 >> ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:25 >> 06; >> pbar_scf7180000350377 maker CDS 558609 558769 . + 2 >> ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:25 >> 06; >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > Barry Moore > Research Scientist > Dept. of Human Genetics > University of Utah > Salt Lake City, UT 84112 > -------------------------------------------- > (801) 585-3543 > > > > > _______________________________________________ maker-devel mailing list > maker-devel at box290.bluehost.comhttp://box290.bluehost.com/mailman/listinfo/mak > er-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.marshall at ed.ac.uk Mon Mar 11 10:15:05 2013 From: alex.marshall at ed.ac.uk (Alex Marshall) Date: Mon, 11 Mar 2013 16:15:05 +0000 Subject: [maker-devel] symbol lookup error: ProcessTable.so: undefined symbol: Perl_Tstack_sp_ptr Message-ID: <513E0309.7010004@ed.ac.uk> Hi to the maker-devel, I am getting an error everytime I run the maker script. 
symbol lookup error: /path/to/software/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi/auto/Proc/ProcessTable/ProcessTable.so: undefined symbol: Perl_Tstack_sp_ptr Your help would be very appreciated. Best wishes, Alex ---------------- Edinburgh University -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From mikheyev at gmail.com Mon Mar 11 23:26:53 2013 From: mikheyev at gmail.com (Sasha Mikheyev) Date: Tue, 12 Mar 2013 14:26:53 +0900 Subject: [maker-devel] duplicate CDS in annotation In-Reply-To: References: <52822F85-760B-451B-B156-8861EA77A910@genetics.utah.edu> Message-ID: Hi Carson, I have been using version 2.10. Is it worth trying with a newer version? You can find the model file here. It is rather large, as it includes all of the output from the first maker run. Yours, Sasha On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt wrote: > I think the issue is that you are getting a match feature that is being > printed with the same ID as the mRNA feature. Correct? > > What version of MAKER are you using, and what does the gile you are giving > to pred_gff or model_gff look like? Could you send them? > > Thanks, > Carson > > > From: Barry Moore > Date: Monday, 11 March, 2013 7:32 AM > To: Sasha Mikheyev > Cc: > Subject: Re: [maker-devel] duplicate CDS in annotation > > Hi Sasha, > > This gene model appears to be correctly formatted to me. In GFF3 format > the CDS features are allowed to span multiple lines and they share the same > ID to indicate that it is all the same features. See the GFF3 > specification on the Sequence Ontology website ( > http://www.sequenceontology.org/resources/gff3.html), and in particular > the description of the ID attribute specifies: > > ID Indicates the ID of the feature. IDs for each feature must be unique > within the scope of the GFF file. In the case of discontinuous features > (i.e. 
a single feature that exists over multiple genomic locations) the > same ID may appear on multiple lines. All lines that share an ID > collectively represent a single feature. > > > So each of those CDS lines forms one part of the single CDS feature for > this gene. > > B > > On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote: > > Dear Yandell lab, > > I am re-annotating the harvester and genome using protein and RNA-seq > data. However, I get many artifacts like the one below. It seems that there > are several CDS records that should tie in to the same mRNA, but they are > really hanging out separately, and produce several nucleotide sequences > with the same name when extracted from the gff. I would appreciate any > guidance about how to fix this! > > Thank you, > > Sasha > > grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff > pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . > ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; > pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 > 1 53 +;Gap=M159; > pbar_scf7180000350377 maker mRNA 538308 558769 . + . > ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; > pbar_scf7180000350377 maker exon 538308 538334 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 538748 538968 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 539842 540242 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 542624 542798 0.01 + . 
> ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 555823 556025 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker exon 558609 558769 0.01 + . > ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 538308 538334 . + 0 > ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 538748 538968 . + 0 > ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 539842 540242 . + 1 > ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 542624 542798 . + 2 > ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 555823 556025 . + 1 > ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506; > pbar_scf7180000350377 maker CDS 558609 558769 . + 2 > ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506; > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > > Barry Moore > Research Scientist > Dept. of Human Genetics > University of Utah > Salt Lake City, UT 84112 > -------------------------------------------- > (801) 585-3543 > > > > > _______________________________________________ maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From carsonhh at gmail.com Tue Mar 12 08:27:44 2013 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 12 Mar 2013 10:27:44 -0400 Subject: [maker-devel] symbol lookup error: ProcessTable.so: undefined symbol: Perl_Tstack_sp_ptr In-Reply-To: <513E0309.7010004@ed.ac.uk> Message-ID: Could you try the 2.27 version of MAKER? You are using 2.10 correct? Thanks, Carson On 13-03-11 12:15 PM, "Alex Marshall" wrote: >Hi to the maker-devel, > >I am getting an error everytime I run the maker script. > >symbol lookup error: >/path/to/software/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi/au >to/Proc/ProcessTable/ProcessTable.so: >undefined symbol: Perl_Tstack_sp_ptr > >Your help would be very appreciated. > >Best wishes, >Alex > >---------------- >Edinburgh University > >-- >The University of Edinburgh is a charitable body, registered in >Scotland, with registration number SC005336. > > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From barry.moore at genetics.utah.edu Tue Mar 12 17:57:32 2013 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Tue, 12 Mar 2013 17:57:32 -0600 Subject: [maker-devel] MAKER subversion repositories Message-ID: For any of you who are running MAKER straight from our subversion repositories in the lab - we have migrated those repos to a new server. Reply to Shawn or I for info on how to connect to the new repos. Thanks. Barry Barry Moore Research Scientist Dept. of Human Genetics University of Utah Salt Lake City, UT 84112 -------------------------------------------- (801) 585-3543 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ares711122 at gmail.com Tue Mar 12 18:24:42 2013 From: ares711122 at gmail.com (Hung-Wei Hsu) Date: Wed, 13 Mar 2013 08:24:42 +0800 Subject: [maker-devel] ERROR: Could not obtain lock to format database Message-ID: Hi MAKER developers, I tried MAKER 2.27b on one E. coli scaffold sequence with uniprot protein database. I failed to run the analysis and got an error message as below. Could not obtain lock to format database at maker-2.27b/bin/../lib/GI.pm Any suggestions or helps will be deeply appreciated. Best regards, Hung-Wei -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Wed Mar 13 07:24:44 2013 From: carsonhh at gmail.com (Carson Holt) Date: Wed, 13 Mar 2013 09:24:44 -0400 Subject: [maker-devel] symbol lookup error: ProcessTable.so: undefined symbol: Perl_Tstack_sp_ptr In-Reply-To: <513FE1F0.2030209@ed.ac.uk> Message-ID: I'm very glad it's working. Those kind of errors are the hardest to track down. --Carson On 13-03-12 10:18 PM, "Alex Marshall" wrote: >I have some great news. I uninstalled every one of my local perl >libraries. Basically by getting rid of my libraries, and then using your >Build script to install the maker dependencies totally fixed it. It >worked with the test.fasta file, no errors whatsoever. I am smiling so >much right now that my face might crack ;) you were right, broken perl. >I just checked, getting lots of finished in the >master_datastore_index.log. thank you so much. > >Alex > > > > > >On 12/03/2013 19:11, Carson Holt wrote: >> I do think your perl has a problem. I've added some changes to each of >> these modules that should help force perl to generate the correct object >> method lookup table. >> >> Could you test them out (place most under the /lib/Iterator/ >>subdirectory). >> >> --Carson >> >> >> On 13-03-12 3:00 PM, "Alex Marshall" wrote: >> >>> We had maker working happily for ages. 
>>> >>> Then we upgraded from perl version 5.8.8 to 5.10 which stopped maker >>> working. >>> >>> Maker said it couldn't find forks.pm, added that library path, to fix >>> the error. >>> >>> Then that particular error below started happening. >>> >>> Alex >>> >>> >>> On 12/03/2013 18:54, Alex Marshall wrote: >>>> version 5.10 on a hpc cluster >>>> >>>> Alex >>>> >>>> >>>> >>>> On 12/03/2013 18:48, Carson Holt wrote: >>>>> That means the first time it called fileHandle it didn't die (which >>>>> should >>>>> be impossible) >>>>> >>>>> Then the second time it called it, it died. It begs the question, >>>>>what >>>>> happened to the first call. >>>>> >>>>> This is looking more and more like you have a broken perl. >>>>> >>>>> What version of perl are you using? >>>>> >>>>> --Carson >>>>> >>>>> >>>>> >>>>> On 13-03-12 2:28 PM, "Alex Marshall" wrote: >>>>> >>>>>> I deleted Iterator.pm, I put the new one in the maker/lib folder, >>>>>>then >>>>>> reran maker >>>>>> >>>>>> vi interator.pm confirms this: >>>>>> >>>>>> sub fileHandle { >>>>>> die "this should die"; >>>>>> >>>>>> error: >>>>>> STATUS: Parsing control files... >>>>>> Opening a new filehandle: Iterator:GFF3 >>>>>> Gettign the existing filehandle: Iterator::GFF3 >>>>>> Checking if it still exists: Iterator::GFF3 >>>>>> ERROR: No open filehandle Iterator::GFF3 >>>>>> --> rank=NA, hostname=frontend04 >>>>>> >>>>>> >>>>>> >>>>>> On 12/03/2013 18:21, Carson Holt wrote: >>>>>>> Try this one. >>>>>>> >>>>>>> It should fail immediately >>>>>>> >>>>>>> Code --> die "this should die"; >>>>>>> >>>>>>> >>>>>>> I'm just making sure it's being called as expected. 
>>>>>>> >>>>>>> --Carson >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 13-03-12 2:18 PM, "Alex Marshall" >>>>>>>wrote: >>>>>>> >>>>>>>> I have Iterator.pm and GFF3.pm in the right place: >>>>>>>> >>>>>>>> ..../software/maker2/maker-2.27/lib/Iterator.pm >>>>>>>> ..../software/maker2/maker-2.27/lib/Iterator/GFF3.pm >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 12/03/2013 18:16, Alex Marshall wrote: >>>>>>>>> I have deleted Iterator.pm, and replaced again (just to be sure). >>>>>>>>> >>>>>>>>> STATUS: Parsing control files... >>>>>>>>> Opening a new filehandle: Iterator:GFF3 >>>>>>>>> Gettign the existing filehandle: Iterator::GFF3 >>>>>>>>> Checking if it still exists: Iterator::GFF3 >>>>>>>>> ERROR: No open filehandle Iterator::GFF3 >>>>>>>>> --> rank=NA, hostname=frontend04 >>>>>>>>> >>>>>>>>> >>>>>>>>> On 12/03/2013 18:11, Carson Holt wrote: >>>>>>>>>> It's missing all the standard error from the Iterator.pm >>>>>>>>>>message I >>>>>>>>>> added? >>>>>>>>>> Could you double check that you replaced that one too. >>>>>>>>>> >>>>>>>>>> --Carson >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 13-03-12 2:07 PM, "Alex Marshall" >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> STATUS: Parsing control files... >>>>>>>>>>> Opening a new filehandle: Iterator:GFF3 >>>>>>>>>>> Gettign the existing filehandle: Iterator::GFF3 >>>>>>>>>>> Checking if it still exists: Iterator::GFF3 >>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3 >>>>>>>>>>> --> rank=NA, hostname=frontend04 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 12/03/2013 18:02, Carson Holt wrote: >>>>>>>>>>>> Please use these two and send me the full STDERR (replaces >>>>>>>>>>>>also >>>>>>>>>>>> Iterator/GFF3.pm). >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Carson >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 13-03-12 1:55 PM, "Alex Marshall" >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> same again: >>>>>>>>>>>>> >>>>>>>>>>>>> STATUS: Parsing control files... 
>>>>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3 >>>>>>>>>>>>> --> rank=NA, hostname=frontend04 >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 12/03/2013 17:45, Carson Holt wrote: >>>>>>>>>>>>>> Try this one. This is a code snippet --> >>>>>>>>>>>>>> >>>>>>>>>>>>>> my $fh = new FileHandle(); >>>>>>>>>>>>>> $fh->open("$arg") or die "ERROR: Could not >>>>>>>>>>>>>> open >>>>>>>>>>>>>> file: >>>>>>>>>>>>>> $!\n"; >>>>>>>>>>>>>> $self->{fileHandle} = $fh; >>>>>>>>>>>>>> $self->startPos($fh->getpos()); >>>>>>>>>>>>>> if (! openhandle($fh)){ #checks to see if >>>>>>>>>>>>>>file >>>>>>>>>>>>>> handle >>>>>>>>>>>>>> is >>>>>>>>>>>>>> open >>>>>>>>>>>>>> confess "ERROR: No open filehandle in Iterator\n"; >>>>>>>>>>>>>> } >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> All it does is open the handle, check the reading position >>>>>>>>>>>>>>and >>>>>>>>>>>>>> then >>>>>>>>>>>>>> check >>>>>>>>>>>>>> to see if the handle is still open. >>>>>>>>>>>>>> >>>>>>>>>>>>>> --Carson >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 13-03-12 1:37 PM, "Alex Marshall" >>>>>>>>>>>>>> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> [1] If I comment out the error in the GFF3.pm file: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> STATUS: Parsing control files... >>>>>>>>>>>>>>> Can't call method "getpos" without a package or object >>>>>>>>>>>>>>> reference >>>>>>>>>>>>>>> at >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>/exports/work/biology_ieb_mblaxter/software/maker2/maker-2.2 >>>>>>>>>>>>>>>7/ >>>>>>>>>>>>>>> bin >>>>>>>>>>>>>>> /. >>>>>>>>>>>>>>> ./l >>>>>>>>>>>>>>> ib >>>>>>>>>>>>>>> /I >>>>>>>>>>>>>>> terator/GFF3.pm >>>>>>>>>>>>>>> line 42, line 121. 
>>>>>>>>>>>>>>> --> rank=NA, hostname=frontend04 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> [2] If I add the error comment back to the GFF3.pm, and add >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>> second >>>>>>>>>>>>>>> new Iterator.pm: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> STATUS: Parsing control files... >>>>>>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3 >>>>>>>>>>>>>>> --> rank=NA, hostname=frontend04 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 12/03/2013 17:31, Carson Holt wrote: >>>>>>>>>>>>>>>> There is one other thing it does right before. It calls >>>>>>>>>>>>>>>> this >>>>>>>>>>>>>>>> --> >>>>>>>>>>>>>>>> $self->fileHandle()->getpos() >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I switched the chaining off so it is just $fh->getpos in >>>>>>>>>>>>>>>>the >>>>>>>>>>>>>>>> attached >>>>>>>>>>>>>>>> module (replace again). >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I don't see why a failure would happen special for you >>>>>>>>>>>>>>>> there, >>>>>>>>>>>>>>>> but >>>>>>>>>>>>>>>> try >>>>>>>>>>>>>>>> it >>>>>>>>>>>>>>>> again. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> --Carson >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 13-03-12 1:24 PM, "Carson Holt" >>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> This is the new line in Iterator.pm >>>>>>>>>>>>>>>>> --> $fh->open("$arg") or die "ERROR: Could not open file: >>>>>>>>>>>>>>>>> $!\n"; >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The extra info would be from $! >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> In the place where the error is occurring, all MAKER does >>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>> open a >>>>>>>>>>>>>>>>> file >>>>>>>>>>>>>>>>> handle in Iterator.pm and then check to see if it is open >>>>>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>> Iterator::GFF3 (it does one and then instantly the >>>>>>>>>>>>>>>>>other). >>>>>>>>>>>>>>>>> The >>>>>>>>>>>>>>>>> second >>>>>>>>>>>>>>>>> failure is just the check on the filehandle. 
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> If the open succeeds, but for some reason it can't tell it is open, then it is something to do with your system. You can try reinstalling Scalar::Util, as that is the module that implements the openhandle function that is called. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> You can also try just commenting out line 37 of lib/Iterator/GFF3.pm >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> --Carson >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 13-03-12 1:15 PM, "Alex Marshall" wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I am looking at Iterator.pm >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> So it should have thrown more error information? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 12/03/2013 17:14, Alex Marshall wrote: >>>>>>>>>>>>>>>>>>> replaced Iterator.pm in maker2/maker-2.27/lib >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> error: same as before >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> STATUS: Parsing control files... >>>>>>>>>>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3 >>>>>>>>>>>>>>>>>>> --> rank=NA, hostname=frontend04 >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> ...../software/maker2/maker-2.27/lib/Iterator/GFF3.pm >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> in sub new >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> my $fh = $self->fileHandle(); >>>>>>>>>>>>>>>>>>> if (!
openhandle($fh)){ #checks to see if file handle is open >>>>>>>>>>>>>>>>>>> die "ERROR: No open filehandle Iterator::GFF3\n"; >>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 12/03/2013 17:06, Carson Holt wrote: >>>>>>>>>>>>>>>>>>>> I get no errors, and don't see any issues. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Could you replace the Iterator.pm in the lib directory with this one? I added some more output to the STDERR if opening a filehandle fails. At least it should provide more information. Could you then let me know what it says? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> Carson >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On 13-03-12 12:35 PM, "Alex Marshall" wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> please find: maker_opts.ctl and test.fa attached >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 12/03/2013 16:31, Alex Marshall wrote: >>>>>>>>>>>>>>>>>>>>>> will send to you now... >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 16:29, Carson Holt wrote: >>>>>>>>>>>>>>>>>>>>>>> Could you send me the entire captured STDERR, your maker_opts.ctl file and your test.fasta?
>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>> Carson >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On 13-03-12 12:23 PM, "Alex Marshall" wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> It is in fasta format not GFF format >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 16:16, Alex Marshall wrote: >>>>>>>>>>>>>>>>>>>>>>>>> I have been looking through maker_opts.ctl >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> #-----Genome (Required for De-Novo Annotation) >>>>>>>>>>>>>>>>>>>>>>>>> genome=test.fna #genome sequence (fasta format or fasta embeded in GFF3) >>>>>>>>>>>>>>>>>>>>>>>>> organism_type=eukaryotic #eukaryotic or prokaryotic. Default is eukaryotic >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I added the path to the genome, same error. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 16:11, Carson Holt wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> What does your maker_opts.ctl file look like? What is the value for genome? If you did not give a genome fasta file and are using a gff3 as input, is there a FASTA file embedded in it?
>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> --Carson >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> On 13-03-12 12:06 PM, "Alex Marshall" >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> [1] hard drive - enough space >>>>>>>>>>>>>>>>>>>>>>>>>>> [2] ./Build realclean - done >>>>>>>>>>>>>>>>>>>>>>>>>>> [3] delete the maker_path/perl directory and >>>>>>>>>>>>>>>>>>>>>>>>>>> maker_path/bin - >>>>>>>>>>>>>>>>>>>>>>>>>>> done >>>>>>>>>>>>>>>>>>>>>>>>>>> [4] LD_PRELOAD=/.....path...../openmpi/lib/libmpi.so - >>>>>>>>>>>>>>>>>>>>>>>>>>> done >>>>>>>>>>>>>>>>>>>>>>>>>>> [5] perl Build.PL - done >>>>>>>>>>>>>>>>>>>>>>>>>>> [6] installation of 2.27 worked >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> and back to original error: >>>>>>>>>>>>>>>>>>>>>>>>>>> STATUS: Parsing control files... >>>>>>>>>>>>>>>>>>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3 >>>>>>>>>>>>>>>>>>>>>>>>>>> --> rank=NA, hostname=frontend04 >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 15:26, Carson Holt wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>> So the odd unrelated errors you are getting suggest >>>>>>>>>>>>>>>>>>>>>>>>>>>> there >>>>>>>>>>>>>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>>>>>>>>>>>>> something >>>>>>>>>>>>>>>>>>>>>>>>>>>> else going on that needs to be resolved first. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Check your drive space 'df -h maker_path'. Make sure >>>>>>>>>>>>>>>>>>>>>>>>>>>> you >>>>>>>>>>>>>>>>>>>>>>>>>>>> don't >>>>>>>>>>>>>>>>>>>>>>>>>>>> just >>>>>>>>>>>>>>>>>>>>>>>>>>>> have >>>>>>>>>>>>>>>>>>>>>>>>>>>> a full hard drive. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Run './Build realclean', and delete the maker_path/perl directory and maker_path/bin subdirectory completely. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Make sure to execute the export LD_PRELOAD=/.....path...../openmpi/lib/libmpi.so command before ever running 'perl Build.PL' >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Which version of OpenMPI are you using? >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>>>>> Carson >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> On 13-03-12 11:21 AM, "Alex Marshall" wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> I am using openmpi and yes I ran ./Build install step. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Configuring MAKER with MPI support >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Can't exec "/bin/sh": Argument list too long at >>>>>>>>>>>>>>>>>>>>>>>>>>>>> /....path...../lib/perl5/Inline/C.pm line 801. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> A problem was encountered while attempting to compile and install your Inline C code.
The command that failed was: >>>>>>>>>>>>>>>>>>>>>>>>>>>>> /usr/bin/perl Makefile.PL > out.Makefile_PL >> 2>&1 >>>>>>>>>>>>>>>>>>>>>>>>>>>>> The build directory was: >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>> /....path...../maker2/maker-2.27/src/blib/build/Parallel >>>>>>>>>>>>>>>>>>>>>>>>>>>>> /A >>>>>>>>>>>>>>>>>>>>>>>>>>>>> pp >>>>>>>>>>>>>>>>>>>>>>>>>>>>> li >>>>>>>>>>>>>>>>>>>>>>>>>>>>> c >>>>>>>>>>>>>>>>>>>>>>>>>>>>> ati >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> on/ >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> MPI >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> To debug the problem, cd to the build directory, and >>>>>>>>>>>>>>>>>>>>>>>>>>>>> inspect >>>>>>>>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>> output >>>>>>>>>>>>>>>>>>>>>>>>>>>>> files. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>> /....path...../maker2/maker-2.27/src/lib/Parallel/Applic >>>>>>>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>>>>>>>>>>>> io >>>>>>>>>>>>>>>>>>>>>>>>>>>>> n/ >>>>>>>>>>>>>>>>>>>>>>>>>>>>> M >>>>>>>>>>>>>>>>>>>>>>>>>>>>> PI. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> pm >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> line 223. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 15:14, Carson Holt wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Also the place it is trying to load from >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>> /....path...../maker2/maker-2.27/src/blib/lib/auto/Para >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ll >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> el >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /A >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> p >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pli >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> cat >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ion >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /M >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PI >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /MPI.so >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> That is not the final install location? Did you >> run >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> './Build >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> install' >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> step? 
When that runs everything related to MPI >> will >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> here --> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>> /....path...../maker2/maker-2.27/perl/lib/auto/Parallel >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /A >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pp >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> li >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> c >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ati >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> on/ >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> MPI >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /M >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PI >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .so >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --Carson >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 13-03-12 11:11 AM, "Alex >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Marshall" >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> now getting mpi problems: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Can't load >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>> '/....path...../maker2/maker-2.27/src/blib/lib/auto/Para >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ll >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> el >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> / >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> App >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> lic >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ati >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> on 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /M >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PI/MPI.so' >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for module Parallel::Application::MPI: >>>>> libmpich.so.1.0: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> cannot >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> open >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> shared object file: No such file or directory at >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /usr/lib64/perl5/DynaLoader.pm line 200. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> at /....path...../lib/perl5/Inline.pm line >> 536. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>> /....path...../maker2/maker-2.27/src/lib/Parallel/Applic >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> io >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> n >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> /MP >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I.p >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> m >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> line 223 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> you suggest: export >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> LD_PRELOAD=/.....path...../openmpi/lib/libmpi.so >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I do that, and run again, same error. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 14:43, Alex Marshall wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ok I will upgrade to 2.27 now. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 14:42, Carson Holt wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The original error is caused by an issue in >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Proc::ProcessTable >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> on >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> some >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> systems. I no longer use that module in maker >> for >>>>>>> that >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> reason. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> After >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> first error, you may have to delete the >> mpi_blastdb >>>>>>> and >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> any >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> files >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> with the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> extension .db in the maker.output directory >> before >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> retrying. I >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> recommend using 2.27. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Carson >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 13-03-12 10:40 AM, "Alex >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Marshall" >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I managed to fix that error. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I am using version 2.25-beta. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> new error: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ERROR: No open filehandle Iterator::GFF3 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 12/03/2013 14:27, Carson Holt wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Could you try the 2.27 version of MAKER? You >> are >>>>>>> using >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2.10 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> correct? >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Carson >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 13-03-11 12:15 PM, "Alex >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Marshall" >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi to the maker-devel, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I am getting an error everytime I run the >> maker >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> script. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> symbol lookup error: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> /path/to/software/lib64/perl5/site_perl/5.8.8/x86_64-li >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> n >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ux- >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> thr >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ead >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -m >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ul >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ti/ >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> au >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to/Proc/ProcessTable/ProcessTable.so: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> undefined symbol: Perl_Tstack_sp_ptr >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Your help would be very appreciated. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best wishes, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ---------------- >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Edinburgh University >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The University of Edinburgh is a charitable >>>>> body, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> registered in >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Scotland, with registration number SC005336. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> _______________________________________________ >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> maker-devel mailing list >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> maker-devel at box290.bluehost.com >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> _ >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> yan >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> del >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> l-l >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ab >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .o >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> rg >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ----------------------------- >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Alex Marshall, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Room 3.54, Blaxter Lab, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Ashworth Laboratories, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Institute of Evolutionary Biology, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The King's Buildings, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The University of Edinburgh, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Edinburgh, EH9 3JT >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ----------------------------- >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> alex.marshall at ed.ac.uk >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> +44(0)131 650 7403 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ----------------------------- >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The University of Edinburgh is a charitable >> body, 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> registered in >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Scotland, with registration number SC005336.
From mikheyev at gmail.com Wed Mar 13 01:23:25 2013
From: mikheyev at gmail.com (Sasha Mikheyev)
Date: Wed, 13 Mar 2013 16:23:25 +0900
Subject: [maker-devel] duplicate CDS in annotation
In-Reply-To:
References:
Message-ID:

Dear Carson,

The new version does indeed fix the problem!

However, I noticed that some of the CDS annotations were swallowed. This seems to affect ~600 genes.

e.g. input:

pbar_scf7180000349951 maker mRNA 98033 98530 . - .
ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1;_AED=1.00;_QI=0|0|0|0|0|0|2|0|81; pbar_scf7180000349951 maker exon 98393 98530 . - . ID=PB12301-RA:exon:10283;Parent=PB12301-RA; pbar_scf7180000349951 maker exon 98033 98140 . - . ID=PB12301-RA:exon:10284;Parent=PB12301-RA; pbar_scf7180000349951 maker CDS 98033 98140 . - 0 ID=PB12301-RA:cds:10114;Parent=PB12301-RA; pbar_scf7180000349951 maker CDS 98393 98530 . - 0 ID=PB12301-RA:cds:10113;Parent=PB12301-RA; output: pbar_scf7180000349951 maker mRNA 98033 98530 . - . ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;_AED=0.38;_eAED=0.38;_QI=0|0|0.33|1|0.5|1|3|246|165;Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1,PB12301-RA pbar_scf7180000349951 maker exon 98033 98530 . - . ID=PB12301-RA:exon:134;Parent=PB12301-RA pbar_scf7180000349951 maker exon 98033 98140 . - . ID=PB12301-RA:exon:133;Parent=PB12301-RA pbar_scf7180000349951 maker exon 98393 98530 . - . ID=PB12301-RA:exon:132;Parent=PB12301-RA pbar_scf7180000349951 maker three_prime_UTR 98393 98530 . - . ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA pbar_scf7180000349951 maker three_prime_UTR 98033 98140 . - . ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA pbar_scf7180000349951 maker CDS 98033 98530 . - 0 ID=PB12301-RA:cds;Parent=PB12301-RA Thank you, Sasha On Tue, Mar 12, 2013 at 10:37 PM, Carson Holt wrote: > Yes. Try the newer version and see if you still have the issue. > > Thanks, > Carson > > > From: Sasha Mikheyev > Date: Tuesday, 12 March, 2013 1:26 AM > To: Carson Holt > Cc: Barry Moore , < > maker-devel at yandell-lab.org> > > Subject: Re: [maker-devel] duplicate CDS in annotation > > Hi Carson, > > I have been using version 2.10. Is it worth trying with a newer version? > > You can find the model file here. > It is rather large, as it includes all of the output from the first maker > run. 
>
> Yours,
>
> Sasha
>
>
> On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt wrote:
>
>> I think the issue is that you are getting a match feature that is being
>> printed with the same ID as the mRNA feature. Correct?
>>
>> What version of MAKER are you using, and what does the file you are
>> giving to pred_gff or model_gff look like? Could you send them?
>>
>> Thanks,
>> Carson
>>
>>
>> From: Barry Moore
>> Date: Monday, 11 March, 2013 7:32 AM
>> To: Sasha Mikheyev
>> Cc:
>> Subject: Re: [maker-devel] duplicate CDS in annotation
>>
>> Hi Sasha,
>>
>> This gene model appears to be correctly formatted to me. In GFF3 format
>> the CDS features are allowed to span multiple lines and they share the same
>> ID to indicate that it is all the same feature. See the GFF3
>> specification on the Sequence Ontology website
>> (http://www.sequenceontology.org/resources/gff3.html); in particular,
>> the description of the ID attribute specifies:
>>
>> ID Indicates the ID of the feature. IDs for each feature must be unique
>> within the scope of the GFF file. In the case of discontinuous features
>> (i.e. a single feature that exists over multiple genomic locations) the
>> same ID may appear on multiple lines. All lines that share an ID
>> collectively represent a single feature.
>>
>>
>> So each of those CDS lines forms one part of the single CDS feature for
>> this gene.
>>
>> B
>>
>> On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote:
>>
>> Dear Yandell lab,
>>
>> I am re-annotating the harvester ant genome using protein and RNA-seq
>> data. However, I get many artifacts like the one below. It seems that there
>> are several CDS records that should tie in to the same mRNA, but they are
>> really hanging out separately, and produce several nucleotide sequences
>> with the same name when extracted from the gff. I would appreciate any
>> guidance about how to fix this!
>> >> Thank you, >> >> Sasha >> >> grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff >> pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . >> ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; >> pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 >> 1 53 +;Gap=M159; >> pbar_scf7180000350377 maker mRNA 538308 558769 . + . >> ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; >> pbar_scf7180000350377 maker exon 538308 538334 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker exon 538748 538968 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker exon 539842 540242 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker exon 542624 542798 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker exon 555823 556025 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker exon 558609 558769 0.01 + . >> ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker CDS 538308 538334 . + 0 >> ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker CDS 538748 538968 . + 0 >> ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506; >> pbar_scf7180000350377 maker CDS 539842 540242 . 
+ 1
>> ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker CDS 542624 542798 . + 2
>> ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker CDS 555823 556025 . + 1
>> ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506;
>> pbar_scf7180000350377 maker CDS 558609 558769 . + 2
>> ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506;
>>
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>
>>
>> Barry Moore
>> Research Scientist
>> Dept. of Human Genetics
>> University of Utah
>> Salt Lake City, UT 84112
>> --------------------------------------------
>> (801) 585-3543
>>
>>
>>
>>
>> _______________________________________________ maker-devel mailing list
>> maker-devel at box290.bluehost.com
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From Hossein.Borhan at AGR.GC.CA Wed Mar 13 15:49:44 2013
From: Hossein.Borhan at AGR.GC.CA (Borhan, Hossein)
Date: Wed, 13 Mar 2013 17:49:44 -0400
Subject: [maker-devel] do of the maker predicted proteins do not start with M
Message-ID: <7B64340A44B6634C814A22BCFA6179D5020887CC@onottaxms5.AGR.GC.CA>

Hi,

I have run MAKER and some of the proteins predicted by MAKER do not start with a methionine.
I am not sure why. Here are some examples:

>maker-scaffold00001-snap-gene-0.8-mRNA-1 protein AED:0.27496328928047 eAED:0.27496328928047 QI:0|0.4|0.16|0.5|1|1|6|0|453
VIIKFKTFAKASRSVELFGHEGAWARGDGYCNFKTESEKADRSVKSSCSLNIPFTYDVGR
RQYVIKGDRFCLSHNHLVMIPSPTTVIVNDQRDLTPDQLSYIINLGKYSLPFPMVTRMLS
DQFPDCRIQKPLLHRLLRKGKLQAFGGDRDAMNALINLGRSYEEHGGFFEIDIDVDCRLE
KIWLARAEGLQFASVYNDVVQIDGGAKMNAYGFVFLPVTVIDCLGKSYVVGAMAGPSAEN
KADVVKTLEYFRVKRSESVLIADDALAFRAAAVECDMVYHQCTKHYQAKIARACAGLGHE
GKEFMIKANTLVYHIFPSEDAFFAKADEYRLMFLQYGGAVKLFDDIVDKRQQLCRTFTSC
KFTGSHSSNQRAEGTISRTKRDVQPWLSRANLFEMFTHLEMIQKQQEDEAARLLSNLIRK
GKHWSDYVDSIFRERQLNSRLLSSVREVDTGLH
>snap-scaffold00087-abinit-gene-2.145-mRNA-1 protein AED:0.0539495114006514 eAED:0.0539495114006514 QI:2|1|0.6|1|1|1|5|0|817
ALSLHGTRQAFARVPPPCRRAHPAERRQPGGGMSADAPVKAGYLLKLTSSLSHWNRRYFI
VADTKLFYCKTEDDLLRRKFQGEIDLAGAQIALYTRNDETAKRFSDHHHMLGVKPAGCDR
IYILDADSEHAQKEWVACLRRHASQAPVSSPVDAAVAAAPRKDPQSVREGFLTKRGETIK
NWKMRYFVLKGNYLHYYRSIEDAQPAGSILLLGTRTTAEPKAVTGMPHSFSIARADAKRK
YMIHADSKEECDAWVGAIQQQSVFVRHAGTDSAPPEVVAPAAATPVHQQHQSRSSFGNRP
NVADDSADDDEAALDEVALSNGPPALAPHGIANTGASTGLNLKQKVSKKKRRFVTDEFDL
DLTYITENIIAMGFPAESMEAMFRNSMSDVQRFLDGRHPDAYRVYNLCSERDYDPAKFHH
NVCRFPFDDHNCPNFEDLIPLCEDIHNWLSIQSDHVVAIHCKAGKGRTGLVICAYLLYSG
AWRTARDALQFYGFVRTQDQKGVTIPSQIRYVEYFEQYMADPEILSRNNGPLVISEIFVG
RGCRPFDTVTITNMGRRMNSKDWGKYWKDALDDGLLLQLPKGACQVDKDFKVEFLASGLL
GKKTRVAGFWLHTAFIQDGVVDIDKSMIDKVNKEKDCPAFSIQVFFGGRTYVDRRCRIPV
APPQPTGPLLLSPATVRIRNADPLPVPNPSSPSESPSFSAMSSVPSLSLESLSSVSSSLS
PTTGPKAAPSPKKQDPGLDPGSPPGTVKSAPVAEAGAAPVDARSDNKAPARSCSLPHGRF
PGDVAGDGAFQQAAVQVRIAFWNNLQSEALQRRNSRL
>augustus-scaffold00087-abinit-gene-0.106-mRNA-1 protein AED:0.10935424621144 eAED:0.10935424621144 QI:1|1|0.66|1|1|1|3|0|483
STSTVFCPGLWNRVGRYVSGGLQTSPVTVPRQVRPICLATQPATDQPGYVMSSTSALVGI
GVVTVALLCRWAPILVTTSGPGSPARSSADVMRIWADHDWSAGTTSLPISQESLLAKRVL
SKSFDGLPPNLHVQDDTVPVALLRAHLNAGRHMRLRDMCPTAGACDLQGADPDHGLAPLH
MAAMRDDRSSIAYLMALGADPDAMDRAGRQYRNLSFTNFVRNARRAAEERGSTCQLPEVN
LAGLERADLDRSWAEIRRLAHEGEPVAIRGLLGAYDRSDVLDWDLDAFLTRHGHVPVNVG
DVPYAQYFGLPIQSMPLSKYVASLAPGSASYVFAKDDGICRDALQILDRFARDALPPYFV
SPAALGSDAVHFYLGNKGSGAPFHLHSDAVNLLAHGSKTWFVTPPPQSVYSRTPIGEFAA
NGTSGIESLRCEQNPGDAIYIPFDWGHAVLNNEDSTFGFAVELLNKRDSLHFLRPSSQVP
AGQ

Regards
Hossein

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From carsonhh at gmail.com Wed Mar 13 15:40:39 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 13 Mar 2013 17:40:39 -0400
Subject: [maker-devel] ERROR: Could not obtain lock to format database
In-Reply-To:
Message-ID:

Could you check that your hard drive is not full, and that whatever location you set as TMP= in the control files is not full (the default is /tmp)? Also make sure you do not set /tmp to an NFS mounted or a tmpfs location. Could you also send the full captured STDERR?

Thanks,
Carson

From: Hung-Wei Hsu
Date: Tuesday, 12 March, 2013 8:24 PM
To:
Subject: [maker-devel] ERROR: Could not obtain lock to format database

Hi MAKER developers,

I tried MAKER 2.27b on one E. coli scaffold sequence with the UniProt protein database. The analysis failed to run and I got the error message below.

Could not obtain lock to format database at maker-2.27b/bin/../lib/GI.pm

Any suggestions or help would be deeply appreciated.

Best regards,
Hung-Wei

_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From carsonhh at gmail.com Wed Mar 13 15:47:06 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 13 Mar 2013 17:47:06 -0400
Subject: [maker-devel] duplicate CDS in annotation
In-Reply-To:
Message-ID:

The output shows that the original model was Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1 and the new model replacing it is Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1.
So it is really a completely different model (as one derived from SNAP and one from GeneMark). I'm guessing you have map_forward=1 set and are using the GFF3 passthrough options correct?

Thanks,
Carson

-------------- next part --------------
An HTML attachment was scrubbed...
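
[Editorial sketch, not part of the original thread.] The Alias comparison Carson describes in the message above can be checked mechanically. The following is a minimal Python sketch, assuming that a record which keeps its ID but shares no Alias value with its earlier version has been replaced by a different underlying model. That rule is an illustrative heuristic, not MAKER's own logic, and the function names `parse_attributes` and `model_was_replaced` are hypothetical. The two attribute strings are the mRNA lines quoted in the thread.

```python
def parse_attributes(column9):
    """Split a GFF3 attribute column into {key: [values]}.

    Sketch only: it does not URL-unescape values as a full GFF3 parser would.
    """
    attrs = {}
    for field in column9.strip().strip(";").split(";"):
        if not field:
            continue
        key, _, value = field.partition("=")
        attrs[key] = value.split(",")
    return attrs


def model_was_replaced(old_column9, new_column9):
    """Heuristic (an assumption, not MAKER's rule): same ID, no shared Alias."""
    old = parse_attributes(old_column9)
    new = parse_attributes(new_column9)
    same_id = old.get("ID") == new.get("ID")
    shared_aliases = set(old.get("Alias", [])) & set(new.get("Alias", []))
    return same_id and not shared_aliases


# Attribute columns of the input and output mRNA lines from the thread.
before = ("ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;"
          "Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1;"
          "_AED=1.00;_QI=0|0|0|0|0|0|2|0|81;")
after = ("ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;_AED=0.38;_eAED=0.38;"
         "_QI=0|0|0.33|1|0.5|1|3|246|165;"
         "Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1,PB12301-RA")

print(model_was_replaced(before, after))  # prints True
```

Running the same check over every mRNA pair in the old and new GFF3 files would list the names that now point at a different prediction, which may help when reviewing the roughly 600 affected genes.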
URL:
From carsonhh at gmail.com Wed Mar 13 18:26:25 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 13 Mar 2013 20:26:25 -0400
Subject: [maker-devel] do of the maker predicted proteins do not start with M
In-Reply-To: <7B64340A44B6634C814A22BCFA6179D5020887CC@onottaxms5.AGR.GC.CA>
Message-ID:

SNAP and other gene prediction programs are capable of producing partial models if they can't find reasonable start and stop codons. You can set always_complete=1 in the maker_opts.ctl file to get MAKER to walk forward and backwards to search for starts and stops after the ab initio predictors do their work in an attempt to force model completion.

Thanks,
Carson

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From carsonhh at gmail.com Wed Mar 13 20:54:55 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 13 Mar 2013 22:54:55 -0400
Subject: [maker-devel] duplicate CDS in annotation
In-Reply-To:
Message-ID:

Yes. map_forward=1 allows new models to keep the names of the models they replace. It makes it so you don't have to relocate genes every time a model gets a slight modification during reannotation.

--Carson

From: Sasha Mikheyev
Date: Wednesday, 13 March, 2013 9:17 PM
To: Carson Holt
Cc: Barry Moore ,
Subject: Re: [maker-devel] duplicate CDS in annotation

OK. Got it!
I did pass through the gene model names. I guess I now see that a new gene model may become associated with the old name in the re-annotation.

Sasha

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From mikheyev at gmail.com Wed Mar 13 19:17:40 2013
From: mikheyev at gmail.com (Sasha Mikheyev)
Date: Thu, 14 Mar 2013 10:17:40 +0900
Subject: [maker-devel] duplicate CDS in annotation
In-Reply-To:
References:
Message-ID:

OK. Got it!

I did pass through the gene model names. I guess I now see that a new gene model may become associated with the old name in the re-annotation.

Sasha

On Thu, Mar 14, 2013 at 6:47 AM, Carson Holt wrote:

> The output shows that the original model
> was Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1 and the new
> model replacing it is
> Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1.
>
> So it is really a completely different model (as one derived from SNAP and
> one from GeneMark). I'm guessing you have map_forward=1 set and are using
> the GFF3 passthrough options correct?
>
> Thanks,
> Carson
>
>
>
> From: Sasha Mikheyev
> Date: Wednesday, 13 March, 2013 3:23 AM
>
> To: Carson Holt
> Cc: Barry Moore , <
> maker-devel at yandell-lab.org>
> Subject: Re: [maker-devel] duplicate CDS in annotation
>
> Dear Carson,
>
> The new version does indeed fix the problem!
>
> However, I noticed that some of the CDS annotations were swallowed.
This > seems to affect a ~600 genes. > > e.g. input: > > pbar_scf7180000349951 maker mRNA 98033 98530 . - . > ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1;_AED=1.00;_QI=0|0|0|0|0|0|2|0|81; > pbar_scf7180000349951 maker exon 98393 98530 . - . > ID=PB12301-RA:exon:10283;Parent=PB12301-RA; > pbar_scf7180000349951 maker exon 98033 98140 . - . > ID=PB12301-RA:exon:10284;Parent=PB12301-RA; > pbar_scf7180000349951 maker CDS 98033 98140 . - 0 > ID=PB12301-RA:cds:10114;Parent=PB12301-RA; > pbar_scf7180000349951 maker CDS 98393 98530 . - 0 > ID=PB12301-RA:cds:10113;Parent=PB12301-RA; > > output: > > pbar_scf7180000349951 maker mRNA 98033 98530 . - . > ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;_AED=0.38;_eAED=0.38;_QI=0|0|0.33|1|0.5|1|3|246|165;Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1,PB12301-RA > pbar_scf7180000349951 maker exon 98033 98530 . - . > ID=PB12301-RA:exon:134;Parent=PB12301-RA > pbar_scf7180000349951 maker exon 98033 98140 . - . > ID=PB12301-RA:exon:133;Parent=PB12301-RA > pbar_scf7180000349951 maker exon 98393 98530 . - . > ID=PB12301-RA:exon:132;Parent=PB12301-RA > pbar_scf7180000349951 maker three_prime_UTR 98393 98530 . - . > ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA > pbar_scf7180000349951 maker three_prime_UTR 98033 98140 . - . > ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA > pbar_scf7180000349951 maker CDS 98033 98530 . - 0 > ID=PB12301-RA:cds;Parent=PB12301-RA > > Thank you, > > Sasha > > On Tue, Mar 12, 2013 at 10:37 PM, Carson Holt wrote: > >> Yes. Try the newer version and see if you still have the issue. >> >> Thanks, >> Carson >> >> >> From: Sasha Mikheyev >> Date: Tuesday, 12 March, 2013 1:26 AM >> To: Carson Holt >> Cc: Barry Moore , < >> maker-devel at yandell-lab.org> >> >> Subject: Re: [maker-devel] duplicate CDS in annotation >> >> Hi Carson, >> >> I have been using version 2.10. Is it worth trying with a newer version? 
>> >> You can find the model file here. >> It is rather large, as it includes all of the output from the first maker >> run. >> >> Yours, >> >> Sasha >> >> >> On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt wrote: >> >>> I think the issue is that you are getting a match feature that is being >>> printed with the same ID as the mRNA feature. Correct? >>> >>> What version of MAKER are you using, and what does the gile you are >>> giving to pred_gff or model_gff look like? Could you send them? >>> >>> Thanks, >>> Carson >>> >>> >>> From: Barry Moore >>> Date: Monday, 11 March, 2013 7:32 AM >>> To: Sasha Mikheyev >>> Cc: >>> Subject: Re: [maker-devel] duplicate CDS in annotation >>> >>> Hi Sasha, >>> >>> This gene model appears to be correctly formatted to me. In GFF3 format >>> the CDS features are allowed to span multiple lines and they share the same >>> ID to indicate that it is all the same features. See the GFF3 >>> specification on the Sequence Ontology website ( >>> http://www.sequenceontology.org/resources/gff3.html), and in particular >>> the description of the ID attribute specifies: >>> >>> ID Indicates the ID of the feature. IDs for each feature must be unique >>> within the scope of the GFF file. In the case of discontinuous features >>> (i.e. a single feature that exists over multiple genomic locations) the >>> same ID may appear on multiple lines. All lines that share an ID >>> collectively represent a single feature. >>> >>> >>> So each of those CDS lines forms one part of the single CDS feature for >>> this gene. >>> >>> B >>> >>> On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote: >>> >>> Dear Yandell lab, >>> >>> I am re-annotating the harvester and genome using protein and RNA-seq >>> data. However, I get many artifacts like the one below. 
It seems that there >>> are several CDS records that should tie in to the same mRNA, but they are >>> really hanging out separately, and produce several nucleotide sequences >>> with the same name when extracted from the gff. I would appreciate any >>> guidance about how to fix this! >>> >>> Thank you, >>> >>> Sasha >>> >>> grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff >>> pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - . >>> ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; >>> pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 >>> 1 53 +;Gap=M159; >>> pbar_scf7180000350377 maker mRNA 538308 558769 . + . >>> ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; >>> pbar_scf7180000350377 maker exon 538308 538334 0.01 + . >>> ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506; >>> pbar_scf7180000350377 maker exon 538748 538968 0.01 + . >>> ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506; >>> pbar_scf7180000350377 maker exon 539842 540242 0.01 + . >>> ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506; >>> pbar_scf7180000350377 maker exon 542624 542798 0.01 + . >>> ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506; >>> pbar_scf7180000350377 maker exon 555823 556025 0.01 + . >>> ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506; >>> pbar_scf7180000350377 maker exon 558609 558769 0.01 + . >>> ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506; >>> pbar_scf7180000350377 maker CDS 538308 538334 . 
+ 0 >>> ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506; >>> pbar_scf7180000350377 maker CDS 538748 538968 . + 0 >>> ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506; >>> pbar_scf7180000350377 maker CDS 539842 540242 . + 1 >>> ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506; >>> pbar_scf7180000350377 maker CDS 542624 542798 . + 2 >>> ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506; >>> pbar_scf7180000350377 maker CDS 555823 556025 . + 1 >>> ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506; >>> pbar_scf7180000350377 maker CDS 558609 558769 . + 2 >>> ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506; >>> >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >>> >>> Barry Moore >>> Research Scientist >>> Dept. of Human Genetics >>> University of Utah >>> Salt Lake City, UT 84112 >>> -------------------------------------------- >>> (801) 585-3543 >>> >>> >>> >>> >>> _______________________________________________ maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikheyev at gmail.com Wed Mar 13 21:34:52 2013 From: mikheyev at gmail.com (Sasha Mikheyev) Date: Thu, 14 Mar 2013 12:34:52 +0900 Subject: [maker-devel] duplicate CDS in annotation In-Reply-To: References: Message-ID: Thank you very much! Problem solved! Sasha On Thu, Mar 14, 2013 at 11:54 AM, Carson Holt wrote: > Yes. map_forward=1 allows new models to keep the names of the models they > replace. 
It makes it so you don't have to relocate genes every time a > model gets a slight modification during reannotation. > > --Carson > > > From: Sasha Mikheyev > Date: Wednesday, 13 March, 2013 9:17 PM > > To: Carson Holt > Cc: Barry Moore , < > maker-devel at yandell-lab.org> > Subject: Re: [maker-devel] duplicate CDS in annotation > > OK. Got it! I did pass through the gene model names. I guess I now see > that a new gene model may become associated with the old name in the > re-annotation. > > Sasha > > On Thu, Mar 14, 2013 at 6:47 AM, Carson Holt wrote: > >> The output shows that the original model >> was Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1 and the new >> model replacing it is >> Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1. >> >> So it is really a completely different model (as one derived from SNAP >> and one from GeneMark). I'm guessing you have map_forward=1 set and are >> using the GFF3 passthrough options correct? >> >> Thanks, >> Carson >> >> >> >> From: Sasha Mikheyev >> Date: Wednesday, 13 March, 2013 3:23 AM >> >> To: Carson Holt >> Cc: Barry Moore , < >> maker-devel at yandell-lab.org> >> Subject: Re: [maker-devel] duplicate CDS in annotation >> >> Dear Carson, >> >> The new version does indeed fix the problem! >> >> However, I noticed that some of the CDS annotations were swallowed. This >> seems to affect a ~600 genes. >> >> e.g. input: >> >> pbar_scf7180000349951 maker mRNA 98033 98530 . - . >> ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;Alias=maker-pbar_scf7180000349951-snap-gene-1.17-mRNA-1;_AED=1.00;_QI=0|0|0|0|0|0|2|0|81; >> pbar_scf7180000349951 maker exon 98393 98530 . - . >> ID=PB12301-RA:exon:10283;Parent=PB12301-RA; >> pbar_scf7180000349951 maker exon 98033 98140 . - . >> ID=PB12301-RA:exon:10284;Parent=PB12301-RA; >> pbar_scf7180000349951 maker CDS 98033 98140 . - 0 >> ID=PB12301-RA:cds:10114;Parent=PB12301-RA; >> pbar_scf7180000349951 maker CDS 98393 98530 . 
- 0 >> ID=PB12301-RA:cds:10113;Parent=PB12301-RA; >> >> output: >> >> pbar_scf7180000349951 maker mRNA 98033 98530 . - . >> ID=PB12301-RA;Parent=PB12301;Name=PB12301-RA;_AED=0.38;_eAED=0.38;_QI=0|0|0.33|1|0.5|1|3|246|165;Alias=genemark-pbar_scf7180000349951-abinit-gene-1.14-mRNA-1,PB12301-RA >> pbar_scf7180000349951 maker exon 98033 98530 . - . >> ID=PB12301-RA:exon:134;Parent=PB12301-RA >> pbar_scf7180000349951 maker exon 98033 98140 . - . >> ID=PB12301-RA:exon:133;Parent=PB12301-RA >> pbar_scf7180000349951 maker exon 98393 98530 . - . >> ID=PB12301-RA:exon:132;Parent=PB12301-RA >> pbar_scf7180000349951 maker three_prime_UTR 98393 98530 . - . >> ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA >> pbar_scf7180000349951 maker three_prime_UTR 98033 98140 . - . >> ID=PB12301-RA:three_prime_utr;Parent=PB12301-RA >> pbar_scf7180000349951 maker CDS 98033 98530 . - 0 >> ID=PB12301-RA:cds;Parent=PB12301-RA >> >> Thank you, >> >> Sasha >> >> On Tue, Mar 12, 2013 at 10:37 PM, Carson Holt wrote: >> >>> Yes. Try the newer version and see if you still have the issue. >>> >>> Thanks, >>> Carson >>> >>> >>> From: Sasha Mikheyev >>> Date: Tuesday, 12 March, 2013 1:26 AM >>> To: Carson Holt >>> Cc: Barry Moore , < >>> maker-devel at yandell-lab.org> >>> >>> Subject: Re: [maker-devel] duplicate CDS in annotation >>> >>> Hi Carson, >>> >>> I have been using version 2.10. Is it worth trying with a newer version? >>> >>> You can find the model file here. >>> It is rather large, as it includes all of the output from the first maker >>> run. >>> >>> Yours, >>> >>> Sasha >>> >>> >>> On Mon, Mar 11, 2013 at 10:02 PM, Carson Holt wrote: >>> >>>> I think the issue is that you are getting a match feature that is being >>>> printed with the same ID as the mRNA feature. Correct? >>>> >>>> What version of MAKER are you using, and what does the gile you are >>>> giving to pred_gff or model_gff look like? Could you send them? 
>>>> >>>> Thanks, >>>> Carson >>>> >>>> >>>> From: Barry Moore >>>> Date: Monday, 11 March, 2013 7:32 AM >>>> To: Sasha Mikheyev >>>> Cc: >>>> Subject: Re: [maker-devel] duplicate CDS in annotation >>>> >>>> Hi Sasha, >>>> >>>> This gene model appears to be correctly formatted to me. In GFF3 >>>> format the CDS features are allowed to span multiple lines and they share >>>> the same ID to indicate that it is all the same features. See the GFF3 >>>> specification on the Sequence Ontology website ( >>>> http://www.sequenceontology.org/resources/gff3.html), and in >>>> particular the description of the ID attribute specifies: >>>> >>>> ID Indicates the ID of the feature. IDs for each feature must be unique >>>> within the scope of the GFF file. In the case of discontinuous features >>>> (i.e. a single feature that exists over multiple genomic locations) the >>>> same ID may appear on multiple lines. All lines that share an ID >>>> collectively represent a single feature. >>>> >>>> >>>> So each of those CDS lines forms one part of the single CDS feature for >>>> this gene. >>>> >>>> B >>>> >>>> On Mar 11, 2013, at 3:46 AM, Sasha Mikheyev wrote: >>>> >>>> Dear Yandell lab, >>>> >>>> I am re-annotating the harvester and genome using protein and RNA-seq >>>> data. However, I get many artifacts like the one below. It seems that there >>>> are several CDS records that should tie in to the same mRNA, but they are >>>> really hanging out separately, and produce several nucleotide sequences >>>> with the same name when extracted from the gff. I would appreciate any >>>> guidance about how to fix this! >>>> >>>> Thank you, >>>> >>>> Sasha >>>> >>>> grep "pbar_scf7180000350377:hit:2506" Pbar.2.0.gff >>>> pbar_scf7180000350377 protein2genome protein_match 172004 172162 150 - >>>> . ID=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;score=150; >>>> pbar_scf7180000350377 protein2genome match_part 172004 172162 150 - . 
ID=pbar_scf7180000350377:hsp:2798;Parent=pbar_scf7180000350377:hit:2506;Name=Hsal|HS9704;Target=Hsal|HS9704 >>>> 1 53 +;Gap=M159; >>>> pbar_scf7180000350377 maker mRNA 538308 558769 . + . >>>> ID=pbar_scf7180000350377:hit:2506;Parent=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29;Name=augustus_masked-pbar_scf7180000350377-abinit-gene-5.29-mRNA-1;_AED=0.48;_eAED=0.39;_QI=0|0|0|0.5|1|1|6|0|395;score=0.01; >>>> pbar_scf7180000350377 maker exon 538308 538334 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:305;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker exon 538748 538968 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:306;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker exon 539842 540242 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:307;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker exon 542624 542798 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:308;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker exon 555823 556025 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:309;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker exon 558609 558769 0.01 + . >>>> ID=pbar_scf7180000350377:hit:2506:exon:310;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker CDS 538308 538334 . + 0 >>>> ID=pbar_scf7180000350377:hit:2506:cds:305;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker CDS 538748 538968 . + 0 >>>> ID=pbar_scf7180000350377:hit:2506:cds:306;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker CDS 539842 540242 . + 1 >>>> ID=pbar_scf7180000350377:hit:2506:cds:307;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker CDS 542624 542798 . + 2 >>>> ID=pbar_scf7180000350377:hit:2506:cds:308;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker CDS 555823 556025 . 
+ 1 >>>> ID=pbar_scf7180000350377:hit:2506:cds:309;Parent=pbar_scf7180000350377:hit:2506; >>>> pbar_scf7180000350377 maker CDS 558609 558769 . + 2 >>>> ID=pbar_scf7180000350377:hit:2506:cds:310;Parent=pbar_scf7180000350377:hit:2506; >>>> >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>>> >>>> >>>> Barry Moore >>>> Research Scientist >>>> Dept. of Human Genetics >>>> University of Utah >>>> Salt Lake City, UT 84112 >>>> -------------------------------------------- >>>> (801) 585-3543 >>>> >>>> >>>> >>>> >>>> _______________________________________________ maker-devel mailing >>>> list maker-devel at box290.bluehost.com >>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ramonfallon at gmail.com Thu Mar 14 09:19:47 2013 From: ramonfallon at gmail.com (=?ISO-8859-1?Q?Ram=F3n_Fallon?=) Date: Thu, 14 Mar 2013 16:19:47 +0100 Subject: [maker-devel] 12core speed check Message-ID: Hi, I was trying to tweak some of our machines to maximise Mpich2/Maker (svn rev 997) throughput and describe one small set of results on this mailing list to allow sharing of experiences. I use the example input dataset "dpp_contig.fasta" with the original sequence repeated 125 times within the same file (under different names of course) to allow for a decent size run. This file totalled 4.019 megabases. I use the dpp_proteins.fasta and The maker_opts.ctl has "cpus=1" set as the docs recommend for MPI. Hardware is a standalone HP Proliant SL390 with two Intel X5675 @ 3.07GHz, totalling 12 cores with 192GB RAM and 1TB disk (local, no NFS) running Ubuntu 10.04 with 2.6.32-41 linux kernel commandline was "mpiexec -n <#cores> maker" within a dedicated directory containing all relevant files. 
#cores  time(mins)  Megabases/hr
1       27.00       8.93
2       126.25      1.91
4       42.57       5.66
6       25.42       9.49
8       18.60       12.96
10      16.67       14.47
12      13.98       17.24

I attach a png file with graph. The upshot of this particular experiment is that 2 processes show anomalous behaviour and that 6 processors are needed to gain an advantage on the 1 processor run, while 12 processors achieves a speed-up of nearly 2 on the 1 processor version.

I am now going to move on to a three node cluster with 2x 8core processors each (so I can go up to 48 processors), so will report back with higher core numbers. Any suggestions on further speed optimizations welcome.

Cheers / Ramón.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: twelvecore_spup.png
Type: image/png
Size: 25749 bytes
Desc: not available
URL:

From carsonhh at gmail.com Thu Mar 14 09:53:33 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 14 Mar 2013 11:53:33 -0400
Subject: [maker-devel] 12core speed check
In-Reply-To:
Message-ID:

I can give a similar setup a try as well to see if anything is amiss in the development version. The expected behavior is that 1 and 2 cores should have identical performance (as one process is always fully dedicated to communication).

--Carson

From: Ramón Fallon
Date: Thursday, 14 March, 2013 11:19 AM
To:
Subject: [maker-devel] 12core speed check

Hi,

I was trying to tweak some of our machines to maximise Mpich2/Maker (svn rev 997) throughput and describe one small set of results on this mailing list to allow sharing of experiences.

I use the example input dataset "dpp_contig.fasta" with the original sequence repeated 125 times within the same file (under different names of course) to allow for a decent size run. This file totalled 4.019 megabases. I use the dpp_proteins.fasta and The maker_opts.ctl has "cpus=1" set as the docs recommend for MPI.
Hardware is a standalone HP Proliant SL390 with two Intel X5675 @ 3.07GHz, totalling 12 cores with 192GB RAM and 1TB disk (local, no NFS) running Ubuntu 10.04 with 2.6.32-41 linux kernel

commandline was "mpiexec -n <#cores> maker" within a dedicated directory containing all relevant files.

#cores  time(mins)  Megabases/hr
1       27.00       8.93
2       126.25      1.91
4       42.57       5.66
6       25.42       9.49
8       18.60       12.96
10      16.67       14.47
12      13.98       17.24

I attach a png file with graph. The upshot of this particular experiment is that 2 processes show anomalous behaviour and that 6 processors are needed to gain an advantage on the 1 processor run, while 12 processors achieves a speed-up of nearly 2 on the 1 processor version.

I am now going to move on to a three node cluster with 2x 8core processors each (so I can go up to 48 processors), so will report back with higher core numbers. Any suggestions on further speed optimizations welcome.

Cheers / Ramón.

_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mnuhn at ebi.ac.uk Thu Mar 14 10:20:01 2013
From: mnuhn at ebi.ac.uk (Michael Nuhn)
Date: Thu, 14 Mar 2013 16:20:01 +0000
Subject: [maker-devel] master_datastore_index.log file shrinks.
Message-ID: <5141F8B1.7020808@ebi.ac.uk>

Hello!

I'm trying to keep track of the progress of maker (version 2.27) while it is running by looking at the master_datastore_index.log file every once in a while.

Sometimes the number of lines in it decreases. Just now it went down from more than two hundred to thirty seven.

When I start more instances of maker, the number of lines in it increases when they start. But sometimes I check and the number of lines has greatly reduced since the last time.
I'm afraid that the newer instances of maker are deleting the file and starting it from scratch instead of adding their progress to it. Is this a file locking issue I should be worried about? Cheers, Michael. From olaf.mueller at duke.edu Thu Mar 14 10:13:20 2013 From: olaf.mueller at duke.edu (Olaf Mueller) Date: Thu, 14 Mar 2013 12:13:20 -0400 Subject: [maker-devel] 12core speed check In-Reply-To: References: Message-ID: <5141F720.20502@duke.edu> The X5675 supports hyperthreading. Does i.e. "mpiexec -n 2 maker" use 2 physical cores or 2 threads of the 1st core? If the latter happens it would be interesting to see your series extended to -n 24. Cheers Olaf On 03/14/2013 11:19 AM, Ram?n Fallon wrote: > Hi, > > I was trying to tweak some of our machines to maximise Mpich2/Maker > (svn rev 997) throughput and describe one small set of results on > this mailing list to allow sharing of experiences. > > I use the example input dataset "dpp_contig.fasta" with the original > sequence repeated 125 times within the same file (under different > names of course) to allow for a decent size run. This file totalled > 4.019 megabases. I use the dpp_proteins.fasta and The maker_opts.ctl > has "cpus=1" set as the docs recommend for MPI. > > Hardware is a standalone HP Proliant SL390 with two Intel X5675 @ > 3.07GHz, totalling 12 cores with 192GB RAM and 1TB disk (local, no > NFS) running Ubuntu 10.04 with 2.6.32-41 linux kernel > > commandline was "mpiexec -n <#cores> maker" within a dedicated > directory containing all relevant files. > > #cores time(mins) Megabases/hr > 1 27.00 8.93 > 2 126.25 1.91 > 4 42.57 5.66 > 6 25.42 9.49 > 8 18.60 12.96 > 10 16.67 14.47 > 12 13.98 17.24 > > I attach a png file with graph. The upshot of this particular > experiment is that 2 processes show anomalous behaviour and that 6 > processors are needed to gain an advantage on the 1 processor run, > while 12 processors achieves a speed-up of nearly 2 on the 1 processor > version. 
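As a rough cross-check of the figures quoted above, here is a minimal Python sketch that recomputes throughput and speed-up from the reported wall-clock times. The names `timings`, `throughput_mb_per_hr`, `speedup` and `efficiency` are illustrative only and are not part of MAKER or MPICH; the 4.019 megabase input size is taken from the message.

```python
# Wall-clock minutes per MPI process count, copied from the table above.
GENOME_MB = 4.019  # total input size reported in the message

timings = {1: 27.00, 2: 126.25, 4: 42.57, 6: 25.42, 8: 18.60, 10: 16.67, 12: 13.98}


def throughput_mb_per_hr(minutes, genome_mb=GENOME_MB):
    """Megabases annotated per hour for a run that took `minutes`."""
    return genome_mb * 60.0 / minutes


def speedup(cores, baseline=1):
    """Speed-up of the `cores` run relative to the `baseline` run."""
    return timings[baseline] / timings[cores]


def efficiency(cores, baseline=1):
    """Speed-up divided by process count (1.0 would be ideal scaling)."""
    return speedup(cores, baseline) / cores


if __name__ == "__main__":
    for n in sorted(timings):
        print(f"{n:>2} procs: {throughput_mb_per_hr(timings[n]):6.2f} Mb/hr, "
              f"speed-up {speedup(n):.2f}, efficiency {efficiency(n):.2f}")
```

On these numbers the 12-process run comes out roughly 1.93 times faster than the single-process run, consistent with the "nearly 2" quoted above, and the 2-process run is slower than the baseline (speed-up below 1).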
> > I am now going to move on to a three node cluster with 2x 8core > processors each (so I can go up to 48 processors), so will report back > with higher core numbers. Any suggestions on further speed > optimizations welcome. > > Cheers / Ram?n. > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Thu Mar 14 10:21:47 2013 From: carsonhh at gmail.com (Carson Holt) Date: Thu, 14 Mar 2013 12:21:47 -0400 Subject: [maker-devel] master_datastore_index.log file shrinks. In-Reply-To: <5141F8B1.7020808@ebi.ac.uk> Message-ID: The file should only be deleted if there are no instances running and a new one starts. Then it rebuilds it. If it is being deleted while other instances are still active, then yes that is a lock issue. There are several other locks that should protect individual contigs while that particular lock is only protecting the datastore_index.log file. If any of the contig locks are not working you would start to see failures of contigs with weird errors that say there are missing files. Try dialling back on the number of simultaneous instances you start and instead use MPI or the -cpus option to get the parallelization boost. 
Alternatively you can also split up the input file and use the -base option so everything gets written to the same place (then you never have to worry about locks affecting individual contigs - as no single instance has access to all the contigs)

Example:
fasta_tool --chunks 5 maize_assembly.fasta
maker -g maize_assembly_0.fasta -base maize_assembly
maker -g maize_assembly_1.fasta -base maize_assembly
maker -g maize_assembly_2.fasta -base maize_assembly
maker -g maize_assembly_3.fasta -base maize_assembly
maker -g maize_assembly_4.fasta -base maize_assembly
maker -dsindex

Everything then gets written to maize_assembly.maker.output for all results. The last call to maker with the -dsindex flag then rebuilds the datastore_index.log file to match the original maize_assembly.fasta file

Thanks,
Carson

On 13-03-14 12:20 PM, "Michael Nuhn" wrote:

>Hello!
>
>I'm trying to keep track of the progress of maker (version 2.27) while
>it is running by looking at the master_datastore_index.log file every
>once in a while.
>
>Sometimes the number of lines in it decreases. Just now it went down
>from more than two hundred to thirty seven.
>
>When I start more instances of maker, the number of lines in it
>increases when they start. But sometimes I check and the number of lines
>has greatly reduced since the last time.
>
>I'm afraid that the newer instances of maker are deleting the file and
>starting it from scratch instead of adding their progress to it.
>
>Is this a file locking issue I should be worried about?
>
>Cheers,
>Michael.
>
>_______________________________________________
>maker-devel mailing list
>maker-devel at box290.bluehost.com
>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From mnuhn at ebi.ac.uk Thu Mar 14 10:49:19 2013
From: mnuhn at ebi.ac.uk (Michael Nuhn)
Date: Thu, 14 Mar 2013 16:49:19 +0000
Subject: [maker-devel] master_datastore_index.log file shrinks.
In-Reply-To:
References:
Message-ID: <5141FF8F.2050900@ebi.ac.uk>

Hello Carson!
Thanks for your quick response and your ideas. I'll give them a try. Cheers, Michael. On 03/14/2013 04:21 PM, Carson Holt wrote: > The file should only be deleted if there are no instances running and a > new one starts. Then it rebuilds it. If it is being deleted while other > instances are still active, then yes that is a lock issue. There are > several other locks that should protect individual contigs while that > particular lock is only protecting the datastore_index.log file. > > If any of the contig locks are not working you would start to see failures > of contigs with weird errors that say there are missing files. > > Try dialling back on the number of simultaneous instances you start and > instead use MPI or the -cpus option to get the parallelization boost. > Alternatively you can also split up the input file and use the -base > option so everything gets written to the same place (then you never have > to worry about locks affecting individual contigs - as no single instance > has access to all the contigs) > > Example: > fasta_tool --chunks 5 maize_assembly.fasta > maker -g maize_assembly_0.fasta -base maize_assembly > maker -g maize_assembly_1.fasta -base maize_assembly > > maker -g maize_assembly_2.fasta -base maize_assembly > > maker -g maize_assembly_3.fasta -base maize_assembly > > maker -g maize_assembly_4.fasta -base maize_assembly > > maker -dsindex > > Everything then gets written to maize_assembly.maker.output for all > results. The last call to maker with the -dsindex flag then rebuilds the > datastore_index.log file to match the original maize_assembly.fasta file > > > Thanks, > Carson > > > > > > On 13-03-14 12:20 PM, "Michael Nuhn" wrote: > >> Hello! >> >> I'm trying to keep track of the progress of maker (version 2.27) while >> it is running by looking at the master_datastore_index.log file every >> once in a while. >> >> Sometimes the number of lines in it decreases. Just now it went down >>from more than two hundred to thirty seven. 
>>
>> When I start more instances of maker, the number of lines in it
>> increases when they start. But sometimes I check and the number of lines
>> has greatly reduced since the last time.
>>
>> I'm afraid that the newer instances of maker are deleting the file and
>> starting it from scratch instead of adding their progress to it.
>>
>> Is this a file locking issue I should be worried about?
>>
>> Cheers,
>> Michael.
>>
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
>


From carsonhh at gmail.com Thu Mar 14 11:51:41 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 14 Mar 2013 13:51:41 -0400
Subject: [maker-devel] 12core speed check
In-Reply-To:
Message-ID:

Could you update to 998. It was a recent commit to the devel version that
caused a weird pause.

Thanks,
Carson


From: Ramón Fallon
Date: Thursday, 14 March, 2013 11:19 AM
To:
Subject: [maker-devel] 12core speed check

Hi,

I was trying to tweak some of our machines to maximise Mpich2/Maker (svn rev
997) throughput and describe one small set of results on this mailing list to
allow sharing of experiences.

I use the example input dataset "dpp_contig.fasta" with the original sequence
repeated 125 times within the same file (under different names of course) to
allow for a decent size run. This file totalled 4.019 megabases. I use the
dpp_proteins.fasta and The maker_opts.ctl has "cpus=1" set as the docs
recommend for MPI.

Hardware is a standalone HP Proliant SL390 with two Intel X5675 @ 3.07GHz,
totalling 12 cores with 192GB RAM and 1TB disk (local, no NFS) running Ubuntu
10.04 with 2.6.32-41 linux kernel

commandline was "mpiexec -n <#cores> maker" within a dedicated directory
containing all relevant files.

#cores  time(mins)  Megabases/hr
1        27.00       8.93
2       126.25       1.91
4        42.57       5.66
6        25.42       9.49
8        18.60      12.96
10       16.67      14.47
12       13.98      17.24

I attach a png file with graph.
The upshot of this particular experiment is that 2 processes show anomalous
behaviour and that 6 processors are needed to gain an advantage on the 1
processor run, while 12 processors achieves a speed-up of nearly 2 on the 1
processor version.

I am now going to move on to a three node cluster with 2x 8core processors
each (so I can go up to 48 processors), so will report back with higher core
numbers. Any suggestions on further speed optimizations welcome.

Cheers / Ramón.
_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Thu Mar 14 11:55:38 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 14 Mar 2013 13:55:38 -0400
Subject: [maker-devel] 12core speed check
In-Reply-To: <5141F720.20502@duke.edu>
Message-ID:

It should use 2 physical cores. Hyperthreading shouldn't come into play
unless you start more processes than there are physical cores. I haven't
seen any big performance advantage in most cases with hyperthreading on
linux machines. I find more often than not it just confuses students into
thinking there are free processors and then starting too many jobs.

--Carson


From: Olaf Mueller
Date: Thursday, 14 March, 2013 12:13 PM
To:
Subject: Re: [maker-devel] 12core speed check

The X5675 supports hyperthreading. Does i.e. "mpiexec -n 2 maker" use 2
physical cores or 2 threads of the 1st core? If the latter happens it would
be interesting to see your series extended to -n 24.

Cheers
Olaf

On 03/14/2013 11:19 AM, Ramón Fallon wrote:
> Hi,
>
> I was trying to tweak some of our machines to maximise Mpich2/Maker (svn rev
> 997) throughput and describe one small set of results on this mailing list to
> allow sharing of experiences.
>
> I use the example input dataset "dpp_contig.fasta" with the original sequence
> repeated 125 times within the same file (under different names of course) to
> allow for a decent size run. This file totalled 4.019 megabases. I use the
> dpp_proteins.fasta and The maker_opts.ctl has "cpus=1" set as the docs
> recommend for MPI.
>
> Hardware is a standalone HP Proliant SL390 with two Intel X5675 @ 3.07GHz,
> totalling 12 cores with 192GB RAM and 1TB disk (local, no NFS) running Ubuntu
> 10.04 with 2.6.32-41 linux kernel
>
> commandline was "mpiexec -n <#cores> maker" within a dedicated directory
> containing all relevant files.
>
> #cores  time(mins)  Megabases/hr
> 1        27.00       8.93
> 2       126.25       1.91
> 4        42.57       5.66
> 6        25.42       9.49
> 8        18.60      12.96
> 10       16.67      14.47
> 12       13.98      17.24
>
> I attach a png file with graph. The upshot of this particular experiment is
> that 2 processes show anomalous behaviour and that 6 processors are needed to
> gain an advantage on the 1 processor run, while 12 processors achieves a
> speed-up of nearly 2 on the 1 processor version.
>
> I am now going to move on to a three node cluster with 2x 8core processors
> each (so I can go up to 48 processors), so will report back with higher core
> numbers. Any suggestions on further speed optimizations welcome.
>
> Cheers / Ramón.
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From myandell at genetics.utah.edu Thu Mar 14 11:59:37 2013
From: myandell at genetics.utah.edu (Mark Yandell)
Date: Thu, 14 Mar 2013 17:59:37 +0000
Subject: [maker-devel] 12core speed check
In-Reply-To:
References: ,
Message-ID: <7A60AB257EFF2B48B1F4C814817EA05350ED9082@mxb2.hg.genetics.utah.edu>

Thanks Ramon. super interesting analysis!


Mark Yandell
Professor of Human Genetics
H.A. & Edna Benning Presidential Endowed Chair
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
ph:801-587-7707

________________________________________
From: maker-devel-bounces at yandell-lab.org [maker-devel-bounces at yandell-lab.org] on behalf of Carson Holt [carsonhh at gmail.com]
Sent: Thursday, March 14, 2013 11:51 AM
To: Ramón Fallon; maker-devel at yandell-lab.org
Subject: Re: [maker-devel] 12core speed check

Could you update to 998. It was a recent commit to the devel version that
caused a weird pause.

Thanks,
Carson


From: Ramón Fallon
Date: Thursday, 14 March, 2013 11:19 AM
To:
Subject: [maker-devel] 12core speed check

Hi,

I was trying to tweak some of our machines to maximise Mpich2/Maker (svn rev
997) throughput and describe one small set of results on this mailing list to
allow sharing of experiences.

I use the example input dataset "dpp_contig.fasta" with the original sequence
repeated 125 times within the same file (under different names of course) to
allow for a decent size run. This file totalled 4.019 megabases. I use the
dpp_proteins.fasta and The maker_opts.ctl has "cpus=1" set as the docs
recommend for MPI.

Hardware is a standalone HP Proliant SL390 with two Intel X5675 @ 3.07GHz,
totalling 12 cores with 192GB RAM and 1TB disk (local, no NFS) running Ubuntu
10.04 with 2.6.32-41 linux kernel

commandline was "mpiexec -n <#cores> maker" within a dedicated directory
containing all relevant files.

#cores  time(mins)  Megabases/hr
1        27.00       8.93
2       126.25       1.91
4        42.57       5.66
6        25.42       9.49
8        18.60      12.96
10       16.67      14.47
12       13.98      17.24

I attach a png file with graph. The upshot of this particular experiment is
that 2 processes show anomalous behaviour and that 6 processors are needed to
gain an advantage on the 1 processor run, while 12 processors achieves a
speed-up of nearly 2 on the 1 processor version.

I am now going to move on to a three node cluster with 2x 8core processors
each (so I can go up to 48 processors), so will report back with higher core
numbers. Any suggestions on further speed optimizations welcome.

Cheers / Ramón.
_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org


From daniel.quest at gmail.com Thu Mar 14 20:07:34 2013
From: daniel.quest at gmail.com (Dan Quest)
Date: Fri, 15 Mar 2013 02:07:34 +0000 (UTC)
Subject: [maker-devel] Invitation to connect on LinkedIn
Message-ID: <1487511280.7392755.1363313254244.JavaMail.app@ela4-app2322.prod>

LinkedIn
------------

I'd like to add you to my professional network on LinkedIn.

- Dan

Dan Quest
Senior Analyst Programmer at Mayo Clinic
Rochester, Minnesota Area

Confirm that you know Dan Quest:
https://www.linkedin.com/e/-m3y3hs-heapifdk-1i/isd/11686987554/Yo4-rOXB/?hs=false&tok=26pedbV21vJlE1

--
You are receiving Invitation to Connect emails. Click to unsubscribe:
http://www.linkedin.com/e/-m3y3hs-heapifdk-1i/vcG-iX3vwW9133a7MYTHsMyDds41ZeU5jWTF9LUs04/goo/maker-devel%40yandell-lab%2Eorg/20061/I3868510560_1/?hs=false&tok=24a30hi6RvJlE1

(c) 2012 LinkedIn Corporation. 2029 Stierlin Ct, Mountain View, CA 94043, USA.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ares711122 at gmail.com Thu Mar 14 21:13:55 2013
From: ares711122 at gmail.com (Hung-Wei Hsu)
Date: Fri, 15 Mar 2013 11:13:55 +0800
Subject: [maker-devel] ERROR: Could not obtain lock to format database
In-Reply-To:
References:
Message-ID:

You may find the error messages in the run log as attached.
Thanks a lot in advance.

Best regards,
Hung-Wei


2013/3/14 Carson Holt

> Could you check to make sure your hard drive is not full, whatever
> location you set as TMP= in the control files is not full (default is
> /tmp). Also make sure you do not set /tmp to an NFS mounted or a tmpfs
> location.
>
> Could you also send the full captured STDERR.
>
> Thanks,
> Carson
>
>
> From: Hung-Wei Hsu
> Date: Tuesday, 12 March, 2013 8:24 PM
> To:
> Subject: [maker-devel] ERROR: Could not obtain lock to format database
>
> Hi MAKER developers,
>
> I tried MAKER 2.27b on one E. coli scaffold sequence with uniprot protein
> database.
> I failed to run the analysis and got an error message as below.
>
> Could not obtain lock to format database at maker-2.27b/bin/../lib/GI.pm
>
> Any suggestions or help will be deeply appreciated.
>
> Best regards,
> Hung-Wei
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: run.log
Type: application/octet-stream
Size: 27206 bytes
Desc: not available
URL:

From ares711122 at gmail.com Thu Mar 14 21:35:09 2013
From: ares711122 at gmail.com (Hung-Wei Hsu)
Date: Fri, 15 Mar 2013 11:35:09 +0800
Subject: [maker-devel] ERROR: Could not obtain lock to format database
In-Reply-To:
References:
Message-ID:

The hard disk where I tried MAKER is about 2TB in size.
TMP was not set to an NFS mounted or a tmpfs location and was empty before analysis.
The hard disk where the TMP directory was located was about 2TB in size.
Thanks a lot in advance.

Best regards,
Hung-Wei


2013/3/15 Hung-Wei Hsu

> You may find the error messages in the run log as attached.
> Thanks a lot in advance.
>
> Best regards,
> Hung-Wei
>
>
> 2013/3/14 Carson Holt
>
>> Could you check to make sure your hard drive is not full, whatever
>> location you set as TMP= in the control files is not full (default is
>> /tmp). Also make sure you do not set /tmp to an NFS mounted or a tmpfs
>> location.
>>
>> Could you also send the full captured STDERR.
>>
>> Thanks,
>> Carson
>>
>>
>> From: Hung-Wei Hsu
>> Date: Tuesday, 12 March, 2013 8:24 PM
>> To:
>> Subject: [maker-devel] ERROR: Could not obtain lock to format database
>>
>> Hi MAKER developers,
>>
>> I tried MAKER 2.27b on one E. coli scaffold sequence with uniprot protein
>> database.
>> I failed to run the analysis and got an error message as below.
>>
>> Could not obtain lock to format database at maker-2.27b/bin/../lib/GI.pm
>>
>> Any suggestions or help will be deeply appreciated.
>>
>> Best regards,
>> Hung-Wei
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Fri Mar 15 12:06:21 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 15 Mar 2013 14:06:21 -0400
Subject: [maker-devel] ERROR: Could not obtain lock to format database
In-Reply-To:
Message-ID:

Were you by any chance running multiple instances of MAKER at the same time
in the same directory? It looks like two processes started to work on the
same contig (normally a first set of locks blocks this possibility, but
rarely they get past that step). Then when it got to a part where an
analysis is performed one properly failed when it realized that the other
had the lock.
In any case, it looks like it just retried and finished the contig in
question. So the snippet seems to indicate expected behavior. Do you see
the contig in question as being finished and having an output GFF3?

--Carson


From: Hung-Wei Hsu
Date: Thursday, 14 March, 2013 11:13 PM
To: Carson Holt
Cc:
Subject: Re: [maker-devel] ERROR: Could not obtain lock to format database

You may find the error messages in the run log as attached.
Thanks a lot in advance.

Best regards,
Hung-Wei


2013/3/14 Carson Holt
> Could you check to make sure your hard drive is not full, whatever location
> you set as TMP= in the control files is not full (default is /tmp). Also
> make sure you do not set /tmp to an NFS mounted or a tmpfs location.
>
> Could you also send the full captured STDERR.
>
> Thanks,
> Carson
>
>
> From: Hung-Wei Hsu
> Date: Tuesday, 12 March, 2013 8:24 PM
> To:
> Subject: [maker-devel] ERROR: Could not obtain lock to format database
>
> Hi MAKER developers,
>
> I tried MAKER 2.27b on one E. coli scaffold sequence with uniprot protein
> database.
> I failed to run the analysis and got an error message as below.
>
> Could not obtain lock to format database at maker-2.27b/bin/../lib/GI.pm
>
> Any suggestions or help will be deeply appreciated.
>
> Best regards,
> Hung-Wei
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ramonfallon at gmail.com Mon Mar 18 08:35:04 2013
From: ramonfallon at gmail.com (Ramón Fallon)
Date: Mon, 18 Mar 2013 15:35:04 +0100
Subject: [maker-devel] Fwd: 12core speed check
In-Reply-To:
References: <7A60AB257EFF2B48B1F4C814817EA05350ED9082@mxb2.hg.genetics.utah.edu>
Message-ID:

Hi!

I've tried again from two different machines, and I can't do a "svn co" nor
"svn update" on the malachite server.
Can you verify the server and the svn service are OK on your side?

Many thanks / Ramón.

On Fri, Mar 15, 2013 at 1:18 PM, Ramón Fallon wrote:

> Hi Mark and Carson,
>
> Many thanks for the comments and the speedy replies!
>
> Previously, I never had a problem connecting to the svn server on
> malachite.genetics.utah.edu, but this morning, I couldn't connect to
> update to rev 998.
>
> I'll try again later.
>
> Cheers / Ramón.
>
>
> On Thu, Mar 14, 2013 at 6:59 PM, Mark Yandell wrote:
>
>> Thanks Ramon. super interesting analysis!
>>
>>
>> Mark Yandell
>> Professor of Human Genetics
>> H.A. & Edna Benning Presidential Endowed Chair
>> Eccles Institute of Human Genetics
>> University of Utah
>> 15 North 2030 East, Room 2100
>> Salt Lake City, UT 84112-5330
>> ph:801-587-7707
>>
>> ________________________________________
>> From: maker-devel-bounces at yandell-lab.org [
>> maker-devel-bounces at yandell-lab.org] on behalf of Carson Holt [
>> carsonhh at gmail.com]
>> Sent: Thursday, March 14, 2013 11:51 AM
>> To: Ramón Fallon; maker-devel at yandell-lab.org
>> Subject: Re: [maker-devel] 12core speed check
>>
>> Could you update to 998. It was a recent commit to the devel version
>> that caused a weird pause.
>>
>> Thanks,
>> Carson
>>
>>
>> From: Ramón Fallon
>> Date: Thursday, 14 March, 2013 11:19 AM
>> To:
>> Subject: [maker-devel] 12core speed check
>>
>> Hi,
>>
>> I was trying to tweak some of our machines to maximise Mpich2/Maker (svn
>> rev 997) throughput and describe one small set of results on this mailing
>> list to allow sharing of experiences.
>>
>> I use the example input dataset "dpp_contig.fasta" with the original
>> sequence repeated 125 times within the same file (under different names of
>> course) to allow for a decent size run. This file totalled 4.019 megabases.
>> I use the dpp_proteins.fasta and The maker_opts.ctl has "cpus=1" set as
>> the docs recommend for MPI.
>>
>> Hardware is a standalone HP Proliant SL390 with two Intel X5675 @
>> 3.07GHz, totalling 12 cores with 192GB RAM and 1TB disk (local, no NFS)
>> running Ubuntu 10.04 with 2.6.32-41 linux kernel
>>
>> commandline was "mpiexec -n <#cores> maker" within a dedicated directory
>> containing all relevant files.
>>
>> #cores  time(mins)  Megabases/hr
>> 1        27.00       8.93
>> 2       126.25       1.91
>> 4        42.57       5.66
>> 6        25.42       9.49
>> 8        18.60      12.96
>> 10       16.67      14.47
>> 12       13.98      17.24
>>
>> I attach a png file with graph. The upshot of this particular experiment
>> is that 2 processes show anomalous behaviour and that 6 processors are
>> needed to gain an advantage on the 1 processor run, while 12 processors
>> achieves a speed-up of nearly 2 on the 1 processor version.
>>
>> I am now going to move on to a three node cluster with 2x 8core
>> processors each (so I can go up to 48 processors), so will report back with
>> higher core numbers. Any suggestions on further speed optimizations welcome.
>>
>> Cheers / Ramón.
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Mon Mar 18 08:51:37 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 18 Mar 2013 10:51:37 -0400
Subject: [maker-devel] Fwd: 12core speed check
In-Reply-To:
Message-ID:

For any users currently using the devel subversion repository. If you need
to update, please send me an e-mail to get information on how to switch over
to our new server.

Thanks,
Carson


From: Ramón Fallon
Date: Monday, 18 March, 2013 10:35 AM
To:
Subject: [maker-devel] Fwd: 12core speed check

Hi!

I've tried again from two different machines, and I can't do a "svn co" nor
"svn update" on the malachite server.

Can you verify the server and the svn service is OK on your side?

Many thanks / Ramón.

On Fri, Mar 15, 2013 at 1:18 PM, Ramón Fallon wrote:
> Hi Mark and Carson,
>
> Many thanks for the comments and the speedy replies!
>
> Previously, I never had a problem connecting to the svn server on
> malachite.genetics.utah.edu, but this morning, I couldn't connect to
> update to rev 998.
>
> I'll try again later.
>
> Cheers / Ramón.
>
>
> On Thu, Mar 14, 2013 at 6:59 PM, Mark Yandell wrote:
>> Thanks Ramon. super interesting analysis!
>>
>>
>> Mark Yandell
>> Professor of Human Genetics
>> H.A. & Edna Benning Presidential Endowed Chair
>> Eccles Institute of Human Genetics
>> University of Utah
>> 15 North 2030 East, Room 2100
>> Salt Lake City, UT 84112-5330
>> ph:801-587-7707
>>
>> ________________________________________
>> From: maker-devel-bounces at yandell-lab.org
>> [maker-devel-bounces at yandell-lab.org] on behalf of Carson Holt
>> [carsonhh at gmail.com]
>> Sent: Thursday, March 14, 2013 11:51 AM
>> To: Ramón Fallon; maker-devel at yandell-lab.org
>> Subject: Re: [maker-devel] 12core speed check
>>
>> Could you update to 998. It was a recent commit to the devel version that
>> caused a weird pause.
>>
>> Thanks,
>> Carson
>>
>>
>> From: Ramón Fallon
>> Date: Thursday, 14 March, 2013 11:19 AM
>> To:
>> Subject: [maker-devel] 12core speed check
>>
>> Hi,
>>
>> I was trying to tweak some of our machines to maximise Mpich2/Maker (svn rev
>> 997) throughput and describe one small set of results on this mailing list
>> to allow sharing of experiences.
>>
>> I use the example input dataset "dpp_contig.fasta" with the original sequence
>> repeated 125 times within the same file (under different names of course) to
>> allow for a decent size run. This file totalled 4.019 megabases. I use the
>> dpp_proteins.fasta and The maker_opts.ctl has "cpus=1" set as the docs
>> recommend for MPI.
>>
>> Hardware is a standalone HP Proliant SL390 with two Intel X5675 @ 3.07GHz,
>> totalling 12 cores with 192GB RAM and 1TB disk (local, no NFS) running Ubuntu
>> 10.04 with 2.6.32-41 linux kernel
>>
>> commandline was "mpiexec -n <#cores> maker" within a dedicated directory
>> containing all relevant files.
>>
>> #cores  time(mins)  Megabases/hr
>> 1        27.00       8.93
>> 2       126.25       1.91
>> 4        42.57       5.66
>> 6        25.42       9.49
>> 8        18.60      12.96
>> 10       16.67      14.47
>> 12       13.98      17.24
>>
>> I attach a png file with graph. The upshot of this particular experiment is
>> that 2 processes show anomalous behaviour and that 6 processors are needed to
>> gain an advantage on the 1 processor run, while 12 processors achieves a
>> speed-up of nearly 2 on the 1 processor version.
>>
>> I am now going to move on to a three node cluster with 2x 8core processors
>> each (so I can go up to 48 processors), so will report back with higher core
>> numbers. Any suggestions on further speed optimizations welcome.
>>
>> Cheers / Ramón.
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From hudarul at yahoo.com Mon Mar 18 14:13:21 2013
From: hudarul at yahoo.com (Hud Hud)
Date: Mon, 18 Mar 2013 13:13:21 -0700 (PDT)
Subject: [maker-devel] Maker-no such file or directory
Message-ID: <1363637601.24386.YahooMailNeo@web164901.mail.bf1.yahoo.com>

I have a problem with maker.
1. I am trying to work with the example data in the data directory, but I am
getting this kind of error. Can anyone help me?

error
$ maker
STATUS: Parsing control files...
dpp_contig.fasta (fasta file or fasta embeded in GFF3 file): No such file or directory at /home/Dorah/maker-2.27-beta/maker/bin/../lib/GI.pm line 186
--> rank=NA, hostname=NurKaiyisah

my maker_opts.ctl
genome=$home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta
est=$home/Dorah/maker-2.27-beta/maker/data/dpp_est.fasta
protein=$home/Dorah/maker-2.27-beta/maker/data/dpp_protein.fasta

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Hossein.Borhan at AGR.GC.CA Mon Mar 18 14:40:38 2013
From: Hossein.Borhan at AGR.GC.CA (Borhan, Hossein)
Date: Mon, 18 Mar 2013 16:40:38 -0400
Subject: [maker-devel] failed gene prediction
Message-ID: <7B64340A44B6634C814A22BCFA6179D5020887DA@onottaxms5.AGR.GC.CA>

Hi

I have tried maker on a fungus genome of 45 Mb with 1/3 being repeat rich.
It did not produce any prediction. I am not sure what is causing this.
Attached are the STDERR and opts.ctl.

I appreciate your help

Hossein
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: wa74-maker-stderr.log
Type: application/octet-stream
Size: 6325713 bytes
Desc: wa74-maker-stderr.log
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_opts.ctl
Type: application/octet-stream
Size: 5244 bytes
Desc: maker_opts.ctl
URL:

From carsonhh at gmail.com Mon Mar 18 14:44:41 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 18 Mar 2013 16:44:41 -0400
Subject: [maker-devel] Maker-no such file or directory
In-Reply-To: <1363637601.24386.YahooMailNeo@web164901.mail.bf1.yahoo.com>
Message-ID:

Does 'ls -al $home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta' show a
valid location? The error is just saying that the file location as written
in the maker_opts.ctl file does not exist.
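
[Editorial sketch] The check suggested above can also be scripted. The following is a minimal Python sketch, not part of MAKER: it reads control-file text, picks out the genome, est and protein entries, and reports any whose path does not exist. The function name missing_inputs is hypothetical. The sketch assumes one literal path per key and that '#' starts a comment. It does not expand shell variables, so an entry beginning with $home is reported as missing unless a directory with that literal name exists; whether MAKER itself expands variables in control files is not established in this thread.

```python
import os


def missing_inputs(ctl_text, keys=("genome", "est", "protein")):
    """Return {key: path} for the listed control-file keys whose path is absent.

    Assumptions (not taken from MAKER's source): one literal path per key,
    '#' starts a comment, and shell variables such as $HOME are NOT expanded.
    """
    missing = {}
    for raw_line in ctl_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        # Empty values are skipped: an unset option is not a missing file.
        if key in keys and value and not os.path.exists(value):
            missing[key] = value
    return missing
```

Run from the directory holding the control file, passing the contents of maker_opts.ctl to missing_inputs would list each entry that points nowhere, which is the same information the 'ls -al' test gives for a single path.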

--Carson


From: Hud Hud
Reply-To: Hud Hud
Date: Monday, 18 March, 2013 4:13 PM
To: "maker-devel at yandell-lab.org"
Subject: [maker-devel] Maker-no such file or directory

I have a problem with maker.
1. I am trying to work with the example data in the data directory, but I am
getting this kind of error. Can anyone help me?

error
$ maker
STATUS: Parsing control files...
dpp_contig.fasta (fasta file or fasta embeded in GFF3 file): No such file or directory at /home/Dorah/maker-2.27-beta/maker/bin/../lib/GI.pm line 186
--> rank=NA, hostname=NurKaiyisah

my maker_opts.ctl
genome=$home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta
est=$home/Dorah/maker-2.27-beta/maker/data/dpp_est.fasta
protein=$home/Dorah/maker-2.27-beta/maker/data/dpp_protein.fasta

_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Mon Mar 18 14:49:30 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 18 Mar 2013 16:49:30 -0400
Subject: [maker-devel] failed gene prediction
In-Reply-To: <7B64340A44B6634C814A22BCFA6179D5020887DA@onottaxms5.AGR.GC.CA>
Message-ID:

You didn't supply any evidence or HMM files for gene predictors. Just raw
assembly data by itself is insufficient for genome annotation.

Here is some nice documentation for running MAKER -->
http://gmod.org/wiki/MAKER_Tutorial_2012

Here is a nice overview of genome annotation in general -->
http://fasta.bioch.virginia.edu/cshl/pdf/12/ajm12/euk_genome_annotation_review.pdf

Once you've gone through the documentation and examples, if you come across
any questions just let us know.

Thanks,
Carson


From: "Borhan, Hossein"
Date: Monday, 18 March, 2013 4:40 PM
To:
Subject: [maker-devel] failed gene prediction

Hi

I have tried maker on a fungus genome of 45 Mb with 1/3 being repeat rich.
It did not produce any prediction.
I am not sure what is causing this.
Attached are the STDERR and opts.ctl.

I appreciate your help

Hossein
_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ares711122 at gmail.com Mon Mar 18 18:44:39 2013
From: ares711122 at gmail.com (Hung-Wei Hsu)
Date: Tue, 19 Mar 2013 08:44:39 +0800
Subject: [maker-devel] ERROR: Could not obtain lock to format database
In-Reply-To:
References:
Message-ID:

I made sure that I ran only one instance of MAKER at a time.
I only analyzed one contig for the test.
After MAKER was interrupted, I can't find any GFF3 output for this contig.
There are only a theVoidXXX directory and a run.log file.
I'm trying 2.26b with the same parameters for the same data.
Hopefully, it will work well.

Hung-Wei


2013/3/16 Carson Holt

> Were you by any chance running multiple instances of MAKER at the same
> time in the same directory? It looks like two processes started to work on
> the same contig (normally a first set of locks blocks this possibility,
> but rarely they get past that step). Then when it got to a part where an
> analysis is performed one properly failed when it realized that the other
> had the lock. In any case, it looks like it just retried and finished the
> contig in question. So the snippet seems to indicate expected behavior.
> Do you see the contig in question as being finished and having an output
> GFF3?
>
> --Carson
>
>
> From: Hung-Wei Hsu
> Date: Thursday, 14 March, 2013 11:13 PM
> To: Carson Holt
> Cc:
> Subject: Re: [maker-devel] ERROR: Could not obtain lock to format database
>
> You may find the error messages in the run log as attached.
> Thanks a lot in advance.
>
> Best regards,
> Hung-Wei
>
>
> 2013/3/14 Carson Holt
>
>> Could you check to make sure your hard drive is not full, whatever
>> location you set as TMP= in the control files is not full (default is
>> /tmp). Also make sure you do not set /tmp to an NFS mounted or a tmpfs
>> location.
>>
>> Could you also send the full captured STDERR.
>>
>> Thanks,
>> Carson
>>
>>
>> From: Hung-Wei Hsu
>> Date: Tuesday, 12 March, 2013 8:24 PM
>> To:
>> Subject: [maker-devel] ERROR: Could not obtain lock to format database
>>
>> Hi MAKER developers,
>>
>> I tried MAKER 2.27b on one E. coli scaffold sequence with uniprot protein
>> database.
>> I failed to run the analysis and got an error message as below.
>>
>> Could not obtain lock to format database at maker-2.27b/bin/../lib/GI.pm
>>
>> Any suggestions or help will be deeply appreciated.
>>
>> Best regards,
>> Hung-Wei
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mnuhn at ebi.ac.uk Tue Mar 19 06:12:32 2013
From: mnuhn at ebi.ac.uk (Michael Nuhn)
Date: Tue, 19 Mar 2013 12:12:32 +0000
Subject: [maker-devel] master_datastore_index.log file shrinks.
In-Reply-To: <5141FF8F.2050900@ebi.ac.uk>
References: <5141FF8F.2050900@ebi.ac.uk>
Message-ID: <51485630.6080701@ebi.ac.uk>

Hello Carson!

On 03/14/2013 04:49 PM, Michael Nuhn wrote:
>> Try dialling back on the number of simultaneous instances you start and
>> instead use MPI or the -cpus option to get the parallelization boost.
>> Alternatively you can also split up the input file and use the -base
>> option so everything gets written to the same place (then you never have
>> to worry about locks affecting individual contigs - as no single instance
>> has access to all the contigs)
>>
>> Example:
>> fasta_tool --chunks 5 maize_assembly.fasta
>> maker -g maize_assembly_0.fasta -base maize_assembly
>> maker -g maize_assembly_1.fasta -base maize_assembly
>> maker -g maize_assembly_2.fasta -base maize_assembly
>> maker -g maize_assembly_3.fasta -base maize_assembly
>> maker -g maize_assembly_4.fasta -base maize_assembly
>> maker -dsindex
>>
>> Everything then gets written to maize_assembly.maker.output for all
>> results. The last call to maker with the -dsindex flag then rebuilds the
>> datastore_index.log file to match the original maize_assembly.fasta file

I have tried this, split my genome into 50 files and run them as you
suggested above.

This worked well most of the time, but now I am getting locking issues
again. The working directory gets flooded with STACK.STACK.STACK.STACK
... files.

What I think is happening is that for some reason the maker instances
decide that they want to rebuild the index. This takes a lot of time
and this blocks even more instances wanting to lock the index files.
In the end most of the maker instances end up waiting.

I would like to try the following, but I don't know if this might cause
problems later on:

I would like to run all of the split sequence files as separate maker
projects as if they were independent genomes. In the end I'd merge all
the individual gff files using the gff3_merge script.

Do you see any reason why this wouldn't work?

Cheers,
Michael.


From Bob_Freeman at hms.harvard.edu Tue Mar 19 07:03:00 2013
From: Bob_Freeman at hms.harvard.edu (Freeman, Robert M.)
Date: Tue, 19 Mar 2013 09:03:00 -0400
Subject: [maker-devel] Failing to recoup all gff annotations, including ab-initio
Message-ID:

Carson et al.,

Thanks again for a great suite of tools! We're using MAKER now to generate gene models (and model fragments) for a ciliate, and we'll be using those models to generate a high-quality protein database for searches with mass spec.

I bootstrapped the process using the core set of proteins with CEGMA, then trained SNAP. After the final round of running MAKER, I get about 1100 evidence-based models and 34K ab-initio. And that's fine (for now). I am able to collect the fasta files for both transcripts and proteins (evidence-based and ab-initio) without problem.

My problem is that when I use the gff3_merge script, I only get annotations for the evidence-based models. I'm not sure why the ab-initio model annotations are not being collected. I've tried with and without the '-g' switch, but this doesn't seem to make a difference.

Thoughts?

Tx,
B

-----------------------------------------------------
Bob Freeman, Ph.D.
Acorn Worm Informatics, Kirschner lab
Dept of Systems Biology, Alpert 524
Harvard Medical School
200 Longwood Avenue
Boston, MA 02115
617/432.2294, vox

"Sorry I'm late. Oh, God, that sounded insincere. I'm late."
-- Karen Walker, from Will and Grace

From dsth at ebi.ac.uk Tue Mar 19 07:33:13 2013
From: dsth at ebi.ac.uk (Daniel Hughes)
Date: Tue, 19 Mar 2013 13:33:13 +0000
Subject: [maker-devel] master_datastore_index.log file shrinks.]
Message-ID:

Daniel S. T. Hughes M.Biochem (Hons; Oxford), Ph.D (Cambridge)
-------------------------------------------------------------------------------------
dsth at cantab.net
dsth at cpan.org

Hi Michael,

You're using the EBI cluster? I have to ask: is this all just a really elaborate way of avoiding the use of MPI, which works perfectly well on both the EBI and Sanger compute farms?
If you carry on in the direction you seem to be going, you're likely to end up with a considerable level of unnecessary overhead, and should possibly consider adapting the Ensembl genebuild pipeline to your specific needs.

Dan

From carsonhh at gmail.com Tue Mar 19 08:27:16 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 19 Mar 2013 10:27:16 -0400
Subject: [maker-devel] master_datastore_index.log file shrinks.]
In-Reply-To:
Message-ID:

Yes. If at all possible use MPI. It removes the overhead of locks, which happen per primary instance of MAKER. So one maker job using 1000 cpus via MPI will have one shared set of locks. 1000 serial instances of MAKER, on the other hand, would have 1000x the locks.

Alternatively, if you do need to continue without MPI for some reason, I just finished a devel version of MAKER that has a --no_locks option. You can never start two instances using the same input fasta when --no_locks is specified, but the splitting to use different input fastas I mentioned before in the example will still work fine.

I also have updated the indexing/reindexing, so if indexing failures happen, MAKER will switch between the current working directory and the TMP= directory from the maker_opts.ctl file so as to try different IO locations (i.e. NFS and non-NFS). Note you should never set TMP= in the control files to an NFS mounted location (it not only makes things a lot slower, but BerkeleyDB and SQLite will get frequent errors on NFS).
TMP= defaults to /tmp when not specified.

I'll send you download information in a separate e-mail. Try a regular MAKER run to see if the indexing/reindexing changes are sufficient before attempting the --no_locks option.

Thanks,
Carson

From carsonhh at gmail.com Tue Mar 19 08:38:00 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 19 Mar 2013 10:38:00 -0400
Subject: [maker-devel] master_datastore_index.log file shrinks.]
In-Reply-To:
Message-ID:

You can also talk to Eleanor Stanley at Sanger; she has a pre-release of MAKER 2.28 already installed and running on the Sanger cluster with OpenMPI.
Thanks,
Carson

From carsonhh at gmail.com Tue Mar 19 08:52:19 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 19 Mar 2013 10:52:19 -0400
Subject: [maker-devel] Failing to recoup all gff annotations, including ab-initio
In-Reply-To:
Message-ID:

Ab initio models without evidence support are not considered final models by default (newly trained ab initio predictors tend to have a very high false positive rate). If you really want the ab initio models without support to be upgraded, set keep_preds=1 in the maker_opts.ctl file. All ab initio models are also stored in the GFF3 as match/match_part features for reference purposes, not as gene/mRNA/exon/CDS.

Thanks,
Carson

From mnuhn at ebi.ac.uk Tue Mar 19 09:19:25 2013
From: mnuhn at ebi.ac.uk (Michael Nuhn)
Date: Tue, 19 Mar 2013 15:19:25 +0000
Subject: [maker-devel] master_datastore_index.log file shrinks.]
In-Reply-To:
References:
Message-ID: <514881FD.4020003@ebi.ac.uk>

Hello Carson!

On 03/19/2013 02:27 PM, Carson Holt wrote:
> Yes. If at all possible use MPI. It removes the overhead of locks
> which happen per primary instance of MAKER. So one maker job using 1000
> cpus via MPI will have one shared set of locks. 1000 serial instances
> of MAKER on the other hand would have 1000x the locks.

I don't know a thing about MPI.

I tried installing maker (2.27) with mpich-3.0.2, mpich2-1.4.1 and OpenMPI, and none of them worked for me.
I also tried the automatic installation that comes with maker, but it didn't work for me either.

If need be, I could spend time getting to the bottom of this, but there is no telling how long this would take me, so I'd rather not if there is an alternative.

Would the approach I outlined before work? (Treating the split files as separate genomes to annotate and then combining the gffs afterwards.)

I also like this approach because I would select a few contigs in the beginning which I would run on their own. They would complete early, and this way I would get a preview of the results of the run instead of having to wait for everything to complete.

It might also be more robust, because file locking issues would be confined to the instances working on a sequence chunk, while the rest of the instances could continue working.

Cheers,
Michael.

From carsonhh at gmail.com Tue Mar 19 09:02:22 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 19 Mar 2013 11:02:22 -0400
Subject: [maker-devel] master_datastore_index.log file shrinks.]
In-Reply-To: <514881FD.4020003@ebi.ac.uk>
Message-ID:

Try it with the --no_locks option then. Make sure to let one instance finish populating the mpi_blastdb directory before running other instances, as that is where most initial locking occurs.

I'll send you more details on how to install with OpenMPI, so you can give that a shot while your jobs are also running serially (so you don't lose time). Also, instead of 50 serial instances, you could try 10 with -cpus set to 5.

Thanks,
Carson

From dsth at ebi.ac.uk Tue Mar 19 09:13:51 2013
From: dsth at ebi.ac.uk (Daniel Hughes)
Date: Tue, 19 Mar 2013 15:13:51 +0000
Subject: [maker-devel] master_datastore_index.log file shrinks.]
In-Reply-To: <514881FD.4020003@ebi.ac.uk>
References: <514881FD.4020003@ebi.ac.uk>
Message-ID:

You really don't need to know anything about MPI. While MPI is itself pretty complex, I seem to recall maker uses the p2p subset alone, mainly to send serialised perl objects as C strings etc., for IPC across ad hoc infrastructure - but none of that is relevant, as Carson has done all the IPC debugging for you and its use should be transparent.
If it's failing, it's almost certainly because you've got discrepancies between the MPI libraries visible at compile-time vs. run-time, and you may need to force the dynamic linker to behave itself. The only other caveat on EBI infrastructure I can think of off the top of my head relates to cross-node MPI usage when going into the hundreds of processes, but I'm assuming you're not doing that? You need to be more specific about how it's failing.

dan

from me phone...

From carsonhh at gmail.com Tue Mar 19 09:22:22 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 19 Mar 2013 11:22:22 -0400
Subject: [maker-devel] master_datastore_index.log file shrinks.]
In-Reply-To:
Message-ID:

I have MAKER working under OpenMPI 1.4.3 (Intel compiled).

I had to set a couple of environment variables prior to setup. You would probably need to set these values as well. If your OpenMPI path was here, for example --> /software/openmpi-1.4.3/, run the following commands (path set accordingly) before even attempting maker setup.

export OMPI_MCA_mpi_warn_on_fork=0
export LD_PRELOAD=/software/openmpi-1.4.3/lib/libmpi.so:$LD_PRELOAD

These need to be set not only before compilation, but also before any run (so add them to your ~/.bashrc or ~/.bash_profile or any module load scripts).
The LD_PRELOAD statement needs to be set for any program using OpenMPI's shared libraries and not just MAKER, so it's normally a good idea to have that set system wide for all users. The details can be found in the OpenMPI documentation. Note that sometimes system library updates can break OpenMPI's shared libraries while not breaking OpenMPI itself, so you might also need to recompile OpenMPI if it has broken shared libraries.

Once you have those commands in place, run the perl Build.PL step. Say yes to install with MPI. Then run ./Build install

Thanks,
Carson

From dsth at ebi.ac.uk Tue Mar 19 09:26:02 2013
From: dsth at ebi.ac.uk (Daniel Hughes)
Date: Tue, 19 Mar 2013 15:26:02 +0000
Subject: [maker-devel] master_datastore_index.log file shrinks.]
In-Reply-To: References: <514881FD.4020003@ebi.ac.uk> Message-ID: oh and (1) it will work as long as evidence etc., is synchronous, (2) it will be really inefficient - be glad ebi doesn't use a by group compute time fair-share policy ;) Dan from me phone... On Mar 19, 2013 12:13 PM, "Daniel Hughes" wrote: > You really don't need to know anything about MPI. While MPI is itself > pretty complex, I seem to recall maker uses the p2p subset alone mainly to > send serialised perl objects as c strings etc., for IPC across ad hoc > infrastructure - but none of that is relevant as Carson has done all the > IPC debugging for you and its use should be transparent. If it's failing, > its almost certainly because you've got discrepencies between the mpi > libraries visible at compile-time vs. run-time and you may need to force > the dynamic linker to behave itself. The only other caveat on ebi > infrastructure i can think of off the top of my head relates to cross-node > MPI usage when going into the hundreds of processes but i'm assuming you > not doing that? You need to be more specific about how it's failing. > > dan > > from me phone... > On Mar 19, 2013 11:55 AM, "Michael Nuhn" wrote: > >> Hello Carson! >> >> On 03/19/2013 02:27 PM, Carson Holt wrote: >> >>> Yes. If at all possible use MPI. It removes the overhead of locks >>> which happen per primary instance of MAKER. So one maker job using 1000 >>> cpus via MPI will have one shared set of locks. 1000 serial instances >>> of MAKER on the other hand would have 1000x the locks. >>> >> >> I don't know a thing about MPI. >> >> I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and open >> mpi and none of them worked for me. I also tried the automatic installation >> that comes with maker, but it didn't work for me either. >> >> If need be, I could spend time getting to the bottom of this, but there >> is no telling how long this would take me so I'd rather not, if there is an >> alternative. 
>>
>> Would the approach I outlined before work? (Treating the split files as
>> separate genomes to annotate and then combine the gffs afterwards)
>>
>> I also like this approach, because I would select a few contigs in the
>> beginning which I would run on their own. They would complete early and
>> this way I would get a preview of the results of the run instead of having
>> to wait for everything to complete.
>>
>> It might also be more robust, because file locking issues would be
>> confined to the instances working on a sequence chunk, but the rest of the
>> instances could continue working.
>>
>> Cheers,
>> Michael.
>>
>>> Alternatively if you do need to continue without MPI for some reason, I
>>> just finished a devel version of MAKER that has a --no_locks option.
>>> You can never start two instances using the same input fasta when
>>> --no_locks is specified, but the splitting to use different input fastas
>>> I mentioned before in the example will still work fine.
>>>
>>> I also have updated the indexing/reindexing, so if indexing failures
>>> happen, MAKER will switch between the current working directory and the
>>> TMP= directory from the maker_opts.ctl file so as to try different IO
>>> locations (i.e. NFS and non-NFS). Note you should never set TMP= in the
>>> control files to an NFS-mounted location (it not only makes things a lot
>>> slower, but BerkeleyDB and SQLite will get frequent errors on NFS).
>>> TMP= defaults to /tmp when not specified.
>>>
>>> I'll send you download information in a separate e-mail. Try a regular
>>> MAKER run to see if the indexing/reindexing changes are sufficient
>>> before attempting the --no_locks option.
>>>
>>> Thanks,
>>> Carson
>>>
>>
>>

From mnuhn at ebi.ac.uk Tue Mar 19 09:54:34 2013
From: mnuhn at ebi.ac.uk (Michael Nuhn)
Date: Tue, 19 Mar 2013 15:54:34 +0000
Subject: [maker-devel] master_datastore_index.log file shrinks.]
In-Reply-To:
References:
Message-ID: <51488A3A.20106@ebi.ac.uk>

Hello Carson!

Thanks for the pointers. I'll give mpi another shot.

Cheers,
Michael.

On 03/19/2013 03:22 PM, Carson Holt wrote:
> I have MAKER working under OpenMPI 1.4.3 (intel compiled).
>
> I had to set a couple of environmental variables prior to setup. You would
> probably need to set these values as well. If your OpenMPI path was
> here for example --> /software/openmpi-1.4.3/, run the following commands
> (path set accordingly) before even attempting maker setup.
>
> export OMPI_MCA_mpi_warn_on_fork=0
> export LD_PRELOAD=/software/openmpi-1.4.3/lib/libmpi.so:$LD_PRELOAD
>
> These not only need to be set before compilation, but also before any run
> (so add them to your ~/.bashrc or ~/.bash_profile or any module load scripts,
> thanks). The LD_PRELOAD statement needs to be set for any program using
> OpenMPI's shared libraries and not just MAKER, so it's normally a good
> idea to have that set system wide for all users. The details can be found
> in the OpenMPI documentation. Note sometimes system library updates can
> break OpenMPI's shared libraries while not breaking OpenMPI itself, so you
> might also need to recompile OpenMPI if it has broken shared libraries.
>
> Once you have those commands in place, run the perl Build.PL step. Say yes
> to install with MPI. Then run ./Build install
>
> Thanks,
> Carson
>
>
>
> On 13-03-19 11:02 AM, "Carson Holt" wrote:
>
>> Try it with the no_locks option then. Make sure to let one instance
>> finish populating the mpi_blastdb directory before running other instances
>> as that is where most initial locking occurs.
>>
>> I'll send you more details on how to install with OpenMPI, so you can give
>> that a shot while your jobs are also running serially (so you don't lose
>> time). Also instead of 50 serial instances, you could try 10 with -cpus
>> set to 5.
>> >> Thanks, >> Carson >> >> >> >> On 13-03-19 11:19 AM, "Michael Nuhn" wrote: >> >>> Hello Carson! >>> >>> On 03/19/2013 02:27 PM, Carson Holt wrote: >>>> Yes. If at all possible use MPI. It removes the overhead of locks >>>> which happen per primary instance of MAKER. So one maker job using >>>> 1000 >>>> cpus via MPI will have one shared set of locks. 1000 serial instances >>>> of MAKER on the other hand would have 1000x the locks. >>> >>> I don't know a thing about MPI. >>> >>> I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and open >>> mpi and none of them worked for me. I also tried the automatic >>> installation that comes with maker, but it didn't work for me either. >>> >>> If need be, I could spend time getting to the bottom of this, but there >>> is no telling how long this would take me so I'd rather not, if there is >>> an alternative. >>> >>> Would the approach I outlined before work? (Treating the split files as >>> separate genomes to annotate and then combine the gffs afterwards) >>> >>> I also like this approach, because I would select a few contigs in the >>> beginning which I would run on their own. They would complete early and >>> this way I would get a preview of the results of the run instead of >>> having to wait for everything to complete. >>> >>> It might also be more robust, because file locking issues would be >>> confined to the instances working on a sequence chunk, but the rest of >>> the instances could continue working. >>> >>> Cheers, >>> Michael. >>> >>>> Alternatively if you do need to continue without MPI for some reason, I >>>> just finished a devel version of MAKER that has a --no_locks option. >>>> You can never start two instances using the same input fasta when >>>> --no_locks is specified, but the splitting to use different input >>>> fastas >>>> I mentioned before in the example will still work fine. 
>>>>
>>>> I also have updated the indexing/reindexing, so if indexing failures
>>>> happen, MAKER will switch between the current working directory and the
>>>> TMP= directory from the maker_opts.ctl file so as to try different IO
>>>> locations (i.e. NFS and non-NFS). Note you should never set TMP= in the
>>>> control files to an NFS-mounted location (it not only makes things a lot
>>>> slower, but BerkeleyDB and SQLite will get frequent errors on NFS).
>>>> TMP= defaults to /tmp when not specified.
>>>>
>>>> I'll send you download information in a separate e-mail. Try a regular
>>>> MAKER run to see if the indexing/reindexing changes are sufficient
>>>> before attempting the --no_locks option.
>>>>
>>>> Thanks,
>>>> Carson
>
>

From es9 at sanger.ac.uk Tue Mar 19 09:40:08 2013
From: es9 at sanger.ac.uk (Eleanor Stanley)
Date: Tue, 19 Mar 2013 15:40:08 +0000
Subject: [maker-devel] master_datastore_index.log file shrinks.]
In-Reply-To: <51488A3A.20106@ebi.ac.uk>
References: <51488A3A.20106@ebi.ac.uk>
Message-ID:

For the Sanger farm I have a wrapper script to run MPI maker so that the same environmental variables are forced to all nodes.

Eleanor

On 19 Mar 2013, at 15:54, Michael Nuhn wrote:

> Hello Carson!
>
> Thanks for the pointers. I'll give mpi another shot.
>
> Cheers,
> Michael.

--
The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.

From Bob_Freeman at hms.harvard.edu Tue Mar 19 10:18:11 2013
From: Bob_Freeman at hms.harvard.edu (Freeman, Robert M.)
Date: Tue, 19 Mar 2013 12:18:11 -0400
Subject: [maker-devel] Failing to recoup all gff annotations, including ab-initio
In-Reply-To:
References:
Message-ID: <06F15FF0-2384-4BDD-AD9B-9C1D0AB6370C@hms.harvard.edu>

Thanks, Carson. This explains the behavior I saw and will help us moving forward.

Best,
Bob

On Mar 19, 2013, at 10:52 AM, Carson Holt wrote:

Ab initio models without evidence support are not considered final models by default (newly trained ab initio predictors tend to have a very high false positive rate). If you really want the ab initio models without support to be upgraded, set keep_preds=1 in the maker_opts.ctl file. All ab initio models are also stored in the GFF3 as match/match_part features for reference purposes, not as gene/mRNA/exon/CDS.

Thanks,
Carson

From: "Freeman, Robert M."
Date: Tuesday, 19 March, 2013 9:03 AM
To: "maker-devel at yandell-lab.org"
Subject: [maker-devel] Failing to recoup all gff annotations, including ab-initio

Carson et al.,

Thanks again for a great suite of tools!

We're using MAKER now to generate gene models (and model fragments) for a ciliate, the models for which we'll be using to generate a high-quality protein database for searches with mass spec. I bootstrapped the process using the core set of proteins with CEGMA, then trained SNAP. After the final round of running MAKER, I get about 1100 evidence-based models and 34K ab-initio.
And that's fine (for now). I am able to collect the fasta files for both transcripts and proteins (evidence-based and ab-initio) without problem.

My problem is that when I use the gff3_merge script, I only get annotations for the evidence-based models. I'm not sure why the ab-initio model annotations are not being collected. I've tried with and without the '-g' switch, but this doesn't seem to make a difference.

Thoughts?

Tx,
B

_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

-----------------------------------------------------
Bob Freeman, Ph.D.
Acorn Worm Informatics, Kirschner lab
Dept of Systems Biology, Alpert 524
Harvard Medical School
200 Longwood Avenue
Boston, MA 02115
617/432.2294, vox

"Sorry I'm late. Oh, God, that sounded insincere. I'm late."
-- Karen Walker, from Will and Grace

From cjfields at illinois.edu Tue Mar 19 13:04:18 2013
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 19 Mar 2013 19:04:18 +0000
Subject: [maker-devel] Alternative start codons
Message-ID: <118F034CF4C3EF48A96F86CE585B94BF74DA507D@CHIMBX5.ad.uillinois.edu>

We had a user notice that MAKER is not observing alternative start codons for bacterial genomes. For instance, this predicted transcript:

>Xf_Mul_000007-RA transcript Name:"Protein of unknown function" offset:79 AED:0.42 eAED:1.00 QI:79|-1|0|1|-1|1|1|20|24
GTGGGATACAGGCCGCTGATCGCTGATGGCGCGTACCTGAAACTGCTGCTGGACTACTAC
GTTACAGTGCAGCCTTTGCATGCCGATTGGAAAGATCTATATATCATCGCTTGCGCTATT
ACAGCGGCTAAAAAGAGTCTTCAATTTGGCGTAATTCAGTCATTGGCGGGGTAG

Yields this protein sequence.
>Xf_Mul_000007-RA protein AED:0.42 eAED:1.00 QI:79|-1|0|1|-1|1|1|20|24
MPIGKIYISSLALLQRLKRVFNLA

I'm pretty sure I know what is going on, namely that MAKER is treating the 5' end as UTR and looking for the first ATG (there is one in the sequence above). Is there any way to change this behavior, though? For instance, allow alternative start codons like GTG/TTG?

chris

From hudarul at yahoo.com Tue Mar 19 13:08:55 2013
From: hudarul at yahoo.com (Hud Hud)
Date: Tue, 19 Mar 2013 12:08:55 -0700 (PDT)
Subject: [maker-devel] Maker-no such file or directory
In-Reply-To:
References: <1363637601.24386.YahooMailNeo@web164901.mail.bf1.yahoo.com>
Message-ID: <1363720135.24498.YahooMailNeo@web164901.mail.bf1.yahoo.com>

Hello everyone

I have some queries. I can't run MAKER locally, so can I use MWAS on my contigs? Since my contigs are too long to be run on MWAS, is it possible to combine the results after I upload and run the analysis on my contigs separately?

________________________________
From: Carson Holt
To: Hud Hud ; "maker-devel at yandell-lab.org"
Sent: Tuesday, March 19, 2013 4:44 AM
Subject: Re: [maker-devel] Maker-no such file or directory

Does 'ls -al $home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta' show a valid location? The error is just saying that the file location as written in the maker_opts.ctl file does not exist.

--Carson

From: Hud Hud
Reply-To: Hud Hud
Date: Monday, 18 March, 2013 4:13 PM
To: "maker-devel at yandell-lab.org"
Subject: [maker-devel] Maker-no such file or directory

I have a problem with maker.

1. I tried to work with the example data in the data directory, but I'm getting this kind of error. Can anyone help me?

error

$ maker
STATUS: Parsing control files...
dpp_contig.fasta (fasta file or fasta embeded in GFF3 file): No such file or directory at /home/Dorah/maker-2.27-beta/maker/bin/../lib/GI.pm line 186 --> rank=NA, hostname=NurKaiyisah my maker_opts.ctl genome=$home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta est=$home/Dorah/maker-2.27-beta/maker/data/dpp_est.fasta protein=$home/Dorah/maker-2.27-beta/maker/data/dpp_protein.fasta _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Tue Mar 19 13:30:09 2013 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 19 Mar 2013 15:30:09 -0400 Subject: [maker-devel] Maker-no such file or directory In-Reply-To: <1363720135.24498.YahooMailNeo@web164901.mail.bf1.yahoo.com> Message-ID: You can. It will be very slow though as MWAS only dedicates a single cpu per job. So with a 5Mb max per job submission it could take a very long time depending on the size of the assembly (emphasis on very long). --Carson From: Hud Hud Reply-To: Hud Hud Date: Tuesday, 19 March, 2013 3:08 PM To: "maker-devel at yandell-lab.org" Subject: Re: [maker-devel] Maker-no such file or directory Hello everyone I have some queries, i cant run MAKER locally, so can i use MWAS on my contigs, but since my contigs too long to be run on MWAS, is it possible to combine the results after i upload and run the analysis on my contigs separately... From: Carson Holt To: Hud Hud ; "maker-devel at yandell-lab.org" Sent: Tuesday, March 19, 2013 4:44 AM Subject: Re: [maker-devel] Maker-no such file or directory Does 'ls -al $home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta' show a valid location? The error is just saying that the file location as written in the maker_opts.ctl file does not exist. 
--Carson From: Hud Hud Reply-To: Hud Hud Date: Monday, 18 March, 2013 4:13 PM To: "maker-devel at yandell-lab.org" Subject: [maker-devel] Maker-no such file or directory I have some problem with maker 1. i try to work with the example data in data directory, but im having this kind of error..anyone can help me error $ maker STATUS: Parsing control files... dpp_contig.fasta (fasta file or fasta embeded in GFF3 file): No such file or directory at /home/Dorah/maker-2.27-beta/maker/bin/../lib/GI.pm line 186 --> rank=NA, hostname=NurKaiyisah my maker_opts.ctl genome=$home/Dorah/maker-2.27-beta/maker/data/dpp_contig.fasta est=$home/Dorah/maker-2.27-beta/maker/data/dpp_est.fasta protein=$home/Dorah/maker-2.27-beta/maker/data/dpp_protein.fasta _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Tue Mar 19 13:33:46 2013 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 19 Mar 2013 15:33:46 -0400 Subject: [maker-devel] Alternative start codons In-Reply-To: <118F034CF4C3EF48A96F86CE585B94BF74DA507D@CHIMBX5.ad.uillinois.edu> Message-ID: It could be changed. I imagine that this is a protein2genome or est2genome gene, as MAKER won't try and determine by itself the start and end if it comes from a gene predictor. --Carson On 13-03-19 3:04 PM, "Fields, Christopher J" wrote: >We had a user notice that MAKER is not observing alternative start codons >for bacterial genomes. 
For instance, this predicted transcript:
>
>>Xf_Mul_000007-RA transcript Name:"Protein of unknown function" offset:79 AED:0.42 eAED:1.00 QI:79|-1|0|1|-1|1|1|20|24
>GTGGGATACAGGCCGCTGATCGCTGATGGCGCGTACCTGAAACTGCTGCTGGACTACTAC
>GTTACAGTGCAGCCTTTGCATGCCGATTGGAAAGATCTATATATCATCGCTTGCGCTATT
>ACAGCGGCTAAAAAGAGTCTTCAATTTGGCGTAATTCAGTCATTGGCGGGGTAG
>
>Yields this protein sequence.
>
>>Xf_Mul_000007-RA protein AED:0.42 eAED:1.00 QI:79|-1|0|1|-1|1|1|20|24
>MPIGKIYISSLALLQRLKRVFNLA
>
>I'm pretty sure I know what is going on, namely that MAKER is treating
>the 5' end as UTR and looking for the first ATG (there is one in the
>sequence above). Is there any way to change this behavior, though? For
>instance, allow alternative start codons like GTG/TTG?
>
>chris
>_______________________________________________
>maker-devel mailing list
>maker-devel at box290.bluehost.com
>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From myandell at genetics.utah.edu Tue Mar 19 18:02:36 2013
From: myandell at genetics.utah.edu (Mark Yandell)
Date: Wed, 20 Mar 2013 00:02:36 +0000
Subject: [maker-devel] Maker2 gff file output
In-Reply-To:
References:
Message-ID: <7A60AB257EFF2B48B1F4C814817EA05350EDC688@mxb2.hg.genetics.utah.edu>

Hi Blake,

I've forwarded this on to the maker-devel list; they can help you more there.

Regarding your comment, 'When I view the output of many contigs in Apollo, there is many times where 3 or 4 models show close to identical gene structure, but the final maker output does not contain that gene call.': those calls are in the output files, but they are in a different multifasta file; they are the non-overlapping ab initio models. Another way is to set the config flag to allow MAKER to use unspliced EST and RNA-seq alignments as evidence.

cheers,

--mark

Mark Yandell
Professor of Human Genetics
H.A. & Edna Benning Presidential Endowed Chair
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
ph:801-587-7707

________________________________________
From: Blake Hovde [hovdebt at uw.edu]
Sent: Tuesday, March 19, 2013 2:35 PM
To: Mark Yandell
Subject: Maker2 gff file output

Hi Dr. Yandell,

I am currently running MAKER2 on a new algal genome and am running into a couple of problems that I would like your input on. The genome size is ~60Mb and is currently in ~3100 contigs.

First, I am having trouble doing multiple iterations of hmm training with SNAP due to the fact that I have so many gff output files in the datastore (one for each contig in my draft genome), not just the single gff output that seems to be in the examples and tutorials I have followed thus far. Is there a way to combine all of my gff files together to make use of the SNAP hmm training or re-annotation?

Second, using multiple lines of evidence (augustus, genemarkES, RNAseq data, and COGs based on homology searches) I am having a hard time getting a lot of maker gene calls. It seems that the calling is too stringent in many cases. When I view the output of many contigs in Apollo, there are many times where 3 or 4 models show close to identical gene structure, but the final maker output does not contain that gene call. Do you have any suggestions on how to lower the stringency of the MAKER output so that more genes will be called? In some cases I am getting fewer than 3000 gene calls in the final output, whereas an Augustus model trained on Chlamydomonas will return ~15000.

Thanks very much for your help!
Sincerely, Blake Hovde Graduate Student Department of Genome Sciences University of Washington From carsonhh at gmail.com Tue Mar 19 20:43:44 2013 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 19 Mar 2013 22:43:44 -0400 Subject: [maker-devel] Maker2 gff file output In-Reply-To: <7A60AB257EFF2B48B1F4C814817EA05350EDC688@mxb2.hg.genetics.utah.edu> Message-ID: >I am currently running MAKER2 on a new algal genome and am running >into a couple of problems that I would like your input on the genome >size is ~60Mb and is currently in ~3100 contigs. >First, I am having trouble doing multiple iterations of hmm training >with SNAP due to the fact that I have so many gff output files in the >datastore (1 for each contig in my draft genome). not just a single >gff output that seems to be in the examples and tutorials I have >followed thus far. Is there a way to combine all of my gff files >together to make use of the SNAP hmm training or re-annotation? Use the gff3_merge script in the .../maker/bin/ directory > >Second, Using multiple lines of evidence (augustus, genemarkES, RNAseq >data, and COGs based on homology searches) I am having a hard time >getting a lot of maker gene calls. It seems that the calling is too >stringent in many cases. When I view the output of many contigs in >Apollo, there is many times where 3 or 4 models show close to >identical gene structure, but the final maker output does not contain >that gene call. Do you have any suggestions on how to lower the >stringency of the MAKER output so that more genes will be called? In >some cases I am getting less than 3000 gene calls in the final output. >Where an Augustus model trained on Chlamydamonas will return ~15000. I agree with Mark. You may want to set single_exon=1 to accept single exon evidence, try increasing the depth of your protein evidence file as well, or if the genome is relatively gene dense, set keep_preds=1. 
On some genomes that are gene dense (fungi for example) ab initio predictors don't have that high a false positive rate, so this can be safe. However on more complex genomes doing so can produce more false positives than there are genes. Thanks, Carson On 13-03-19 8:02 PM, "Mark Yandell" wrote: >Hi Blake, > >I'be forwarded this onto the maker_devel list, they can help you more >there. > >regarding your comment g 'When I view the output of many contigs in >Apollo, there is many times where 3 or 4 models show close to identical >gene structure, but the final maker output does not contain that gene >call. ' Those calls are in the output files, but there are in a >different multifasta file; there are non-overalpping ab intio models. >Another way is to set the config flag to allow MAKEr to use unspliced EST >and RNA-seq alignments as evidence, > >I'be forwarded this onto the maker_devel list, they can help you more >there. > >cheers, > >--mark > > >Mark Yandell >Professor of Human Genetics >H.A. & Edna Benning Presidential Endowed Chair >Eccles Institute of Human Genetics >University of Utah >15 North 2030 East, Room 2100 >Salt Lake City, UT 84112-5330 >ph:801-587-7707 > >________________________________________ >From: Blake Hovde [hovdebt at uw.edu] >Sent: Tuesday, March 19, 2013 2:35 PM >To: Mark Yandell >Subject: Maker2 gff file output > >Hi Dr. Yandell, > >I am currently running MAKER2 on a new algal genome and am running >into a couple of problems that I would like your input on the genome >size is ~60Mb and is currently in ~3100 contigs. >First, I am having trouble doing multiple iterations of hmm training >with SNAP due to the fact that I have so many gff output files in the >datastore (1 for each contig in my draft genome). not just a single >gff output that seems to be in the examples and tutorials I have >followed thus far. Is there a way to combine all of my gff files >together to make use of the SNAP hmm training or re-annotation? 
> >Second, Using multiple lines of evidence (augustus, genemarkES, RNAseq >data, and COGs based on homology searches) I am having a hard time >getting a lot of maker gene calls. It seems that the calling is too >stringent in many cases. When I view the output of many contigs in >Apollo, there is many times where 3 or 4 models show close to >identical gene structure, but the final maker output does not contain >that gene call. Do you have any suggestions on how to lower the >stringency of the MAKER output so that more genes will be called? In >some cases I am getting less than 3000 gene calls in the final output. > Where an Augustus model trained on Chlamydamonas will return ~15000. > >Thanks very much for your help! > >Sincerely, >Blake Hovde >Graduate Student >Department of Genome Sciences >University of Washington > >_______________________________________________ >maker-devel mailing list >maker-devel at box290.bluehost.com >http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From Carson.Holt at oicr.on.ca Wed Mar 20 07:51:29 2013 From: Carson.Holt at oicr.on.ca (Carson Holt) Date: Wed, 20 Mar 2013 13:51:29 +0000 Subject: [maker-devel] AED calculations using the MAKER pipeline In-Reply-To: Message-ID: In the current MAKER download when using GFF3 passthrough there was an issue with everything being done at the very last step. This of course leads to a memory spike and a very slow last step. That seems to be similar to what you are describing. It should be resolved in what will become version 2.28. I can give you access to the pre-release code, so you can check that the issue is resolved for you. I'll send details in a separate e-mail. Also the ### will be printed after every ~100,000 bp of assembly processed by MAKER. You can ignore them, but they actually have a meaning in GFF3. Basically everything between two sets of ###'s are fully resolved. 
It allows programs that read GFF3 to parallelize file loading or just load sections of a file as they can rapidly identify "safe chunks". Without them the entire file must be loaded into memory in order to be certain that all feature parts are there (as there is no requirement for sorting or order in GFF3). log.child files will always be empty unless you run analysis like snap or blast. Thanks, Carson On 13-03-20 9:05 AM, "Krishnakumar, Vivek" wrote: >Hi, > >We have been using the MAKER pipeline here at JCVI to calculate AED >scores by feeding in our annotation set as `model_gff` and the protein >and EST evidence as `protein_gff` and `est_gff` respectively. Here is the >issue we are having: > >When running the above pipeline with protein2genome and est2genome >evidence generated earlier by MAKER, there are no problems calculating >the AED score. Normally this pipeline takes a little over 12 hours to >complete. > >But if we use our own evidence, AAT and Genewise aligned proteins for >`protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline >runs very very slow and the intermediary *.gff.ann file has many chunks >(separated by '###') that are completely empty. Our evidence in formatted >in the same way as est2genome or protein2genome (GFF file with >"expressed_sequence_match::match_part" or "protein_match::match_part" >features respectively) > >The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use the >default `max_dna_len` parameter used to split the large assemblies into >chunks. > >Investigating the master_datastore.log shows me that the scaffolds run >through without any issues and the chromosomes are still being processed. >For any of the chromosomes, investigating the 'run.log' file, one level >above 'theVoid' shows me how many "final.section" jobs were started and >how many finished. And in the case of all the chromosomes, it tells me >that everything that was started has finished. 
And the 'log.child.*' >files within `theVoid` are all empty. Also within `theVoid`, I'm noticing >that the "raw.section" and "evidence_*.gff" files are not empty. But one >thing that is surprising is that of all the "final.section" files, only >the one pertaining to the last chunk is very large (proportional to the >size of the evidnce), the rest are all exactly the same size (exactly 331 >bytes). > >I'm running MAKER in MPI mode spawning 48 processes on a high memory >machine with 64 available cores and 1TB of RAM. > >I hope I've been able to explain my situation clearly in this email. > >Any help is appreciated. >Thank you. > >Vivek From vKrishna at jcvi.org Wed Mar 20 07:05:55 2013 From: vKrishna at jcvi.org (Krishnakumar, Vivek) Date: Wed, 20 Mar 2013 09:05:55 -0400 Subject: [maker-devel] AED calculations using the MAKER pipeline Message-ID: Hi, We have been using the MAKER pipeline here at JCVI to calculate AED scores by feeding in our annotation set as `model_gff` and the protein and EST evidence as `protein_gff` and `est_gff` respectively. Here is the issue we are having: When running the above pipeline with protein2genome and est2genome evidence generated earlier by MAKER, there are no problems calculating the AED score. Normally this pipeline takes a little over 12 hours to complete. But if we use our own evidence, AAT and Genewise aligned proteins for `protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline runs very very slow and the intermediary *.gff.ann file has many chunks (separated by '###') that are completely empty. Our evidence in formatted in the same way as est2genome or protein2genome (GFF file with "expressed_sequence_match::match_part" or "protein_match::match_part" features respectively) The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use the default `max_dna_len` parameter used to split the large assemblies into chunks. 
Investigating the master_datastore.log shows me that the scaffolds run through without any issues and the chromosomes are still being processed. For any of the chromosomes, investigating the 'run.log' file, one level above 'theVoid' shows me how many "final.section" jobs were started and how many finished. And in the case of all the chromosomes, it tells me that everything that was started has finished. And the 'log.child.*' files within `theVoid` are all empty. Also within `theVoid`, I'm noticing that the "raw.section" and "evidence_*.gff" files are not empty. But one thing that is surprising is that of all the "final.section" files, only the one pertaining to the last chunk is very large (proportional to the size of the evidnce), the rest are all exactly the same size (exactly 331 bytes). I'm running MAKER in MPI mode spawning 48 processes on a high memory machine with 64 available cores and 1TB of RAM. I hope I've been able to explain my situation clearly in this email. Any help is appreciated. Thank you. Vivek From cdtown at jcvi.org Wed Mar 20 07:54:33 2013 From: cdtown at jcvi.org (Town, Christopher D.) Date: Wed, 20 Mar 2013 09:54:33 -0400 Subject: [maker-devel] AED calculations using the MAKER pipeline In-Reply-To: References: Message-ID: Thanks. Is there any way of guestimating when this final step might be completed. We are in a time crunch here to get this analysis finished and the data/annotation out. Best Chris -----Original Message----- From: Carson Holt [mailto:Carson.Holt at oicr.on.ca] Sent: Wednesday, March 20, 2013 9:51 AM To: Krishnakumar, Vivek; maker-devel at yandell-lab.org Cc: Town, Christopher D.; Tang, Haibao; Bidwell, Shelby; Rosen, Benjamin Subject: Re: AED calculations using the MAKER pipeline In the current MAKER download when using GFF3 passthrough there was an issue with everything being done at the very last step. This of course leads to a memory spike and a very slow last step. That seems to be similar to what you are describing. 
It should be resolved in what will become version 2.28. I can give you access to the pre-release code, so you can check that the issue is resolved for you. I'll send details in a separate e-mail. Also the ### will be printed after every ~100,000 bp of assembly processed by MAKER. You can ignore them, but they actually have a meaning in GFF3. Basically everything between two sets of ###'s are fully resolved. It allows programs that read GFF3 to parallelize file loading or just load sections of a file as they can rapidly identify "safe chunks". Without them the entire file must be loaded into memory in order to be certain that all feature parts are there (as there is no requirement for sorting or order in GFF3). log.child files will always be empty unless you run analysis like snap or blast. Thanks, Carson On 13-03-20 9:05 AM, "Krishnakumar, Vivek" wrote: >Hi, > >We have been using the MAKER pipeline here at JCVI to calculate AED >scores by feeding in our annotation set as `model_gff` and the protein >and EST evidence as `protein_gff` and `est_gff` respectively. Here is >the issue we are having: > >When running the above pipeline with protein2genome and est2genome >evidence generated earlier by MAKER, there are no problems calculating >the AED score. Normally this pipeline takes a little over 12 hours to >complete. > >But if we use our own evidence, AAT and Genewise aligned proteins for >`protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline >runs very very slow and the intermediary *.gff.ann file has many chunks >(separated by '###') that are completely empty. Our evidence in >formatted in the same way as est2genome or protein2genome (GFF file >with "expressed_sequence_match::match_part" or "protein_match::match_part" >features respectively) > >The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use >the default `max_dna_len` parameter used to split the large assemblies >into chunks. 
>
>Investigating the master_datastore.log shows me that the scaffolds run
>through without any issues and the chromosomes are still being processed.
>For any of the chromosomes, investigating the 'run.log' file, one level
>above 'theVoid' shows me how many "final.section" jobs were started and
>how many finished. And in the case of all the chromosomes, it tells me
>that everything that was started has finished. And the 'log.child.*'
>files within `theVoid` are all empty. Also within `theVoid`, I'm
>noticing that the "raw.section" and "evidence_*.gff" files are not
>empty. But one thing that is surprising is that of all the
>"final.section" files, only the one pertaining to the last chunk is
>very large (proportional to the size of the evidence), the rest are all
>exactly the same size (exactly 331 bytes).
>
>I'm running MAKER in MPI mode spawning 48 processes on a high memory
>machine with 64 available cores and 1TB of RAM.
>
>I hope I've been able to explain my situation clearly in this email.
>
>Any help is appreciated.
>Thank you.
>
>Vivek

From myandell at genetics.utah.edu Wed Mar 20 08:55:38 2013
From: myandell at genetics.utah.edu (Mark Yandell)
Date: Wed, 20 Mar 2013 14:55:38 +0000
Subject: [maker-devel] AED calculations using the MAKER pipeline
In-Reply-To:
References:
Message-ID: <7A60AB257EFF2B48B1F4C814817EA05350EDCE05@mxb2.hg.genetics.utah.edu>

Hi Vivek,

Sounds like it's maybe a problem with the protein2genome GFF file. Can you send us a sample file that is known to produce the problem?

cheers,
--mark

Mark Yandell
Professor of Human Genetics
H.A.
& Edna Benning Presidential Endowed Chair Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:801-587-7707 ________________________________________ From: maker-devel-bounces at yandell-lab.org [maker-devel-bounces at yandell-lab.org] on behalf of Krishnakumar, Vivek [vKrishna at jcvi.org] Sent: Wednesday, March 20, 2013 7:05 AM To: maker-devel at yandell-lab.org Cc: Tang, Haibao; Rosen, Benjamin; Town, Christopher D.; Bidwell, Shelby Subject: [maker-devel] AED calculations using the MAKER pipeline Hi, We have been using the MAKER pipeline here at JCVI to calculate AED scores by feeding in our annotation set as `model_gff` and the protein and EST evidence as `protein_gff` and `est_gff` respectively. Here is the issue we are having: When running the above pipeline with protein2genome and est2genome evidence generated earlier by MAKER, there are no problems calculating the AED score. Normally this pipeline takes a little over 12 hours to complete. But if we use our own evidence, AAT and Genewise aligned proteins for `protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline runs very very slow and the intermediary *.gff.ann file has many chunks (separated by '###') that are completely empty. Our evidence in formatted in the same way as est2genome or protein2genome (GFF file with "expressed_sequence_match::match_part" or "protein_match::match_part" features respectively) The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use the default `max_dna_len` parameter used to split the large assemblies into chunks. Investigating the master_datastore.log shows me that the scaffolds run through without any issues and the chromosomes are still being processed. For any of the chromosomes, investigating the 'run.log' file, one level above 'theVoid' shows me how many "final.section" jobs were started and how many finished. 
And in the case of all the chromosomes, it tells me that everything that was started has finished. And the 'log.child.*' files within `theVoid` are all empty. Also within `theVoid`, I'm noticing that the "raw.section" and "evidence_*.gff" files are not empty. But one thing that is surprising is that of all the "final.section" files, only the one pertaining to the last chunk is very large (proportional to the size of the evidnce), the rest are all exactly the same size (exactly 331 bytes). I'm running MAKER in MPI mode spawning 48 processes on a high memory machine with 64 available cores and 1TB of RAM. I hope I've been able to explain my situation clearly in this email. Any help is appreciated. Thank you. Vivek _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From myandell at genetics.utah.edu Wed Mar 20 08:57:17 2013 From: myandell at genetics.utah.edu (Mark Yandell) Date: Wed, 20 Mar 2013 14:57:17 +0000 Subject: [maker-devel] AED calculations using the MAKER pipeline In-Reply-To: References: , Message-ID: <7A60AB257EFF2B48B1F4C814817EA05350EDCE15@mxb2.hg.genetics.utah.edu> whoops. looks like carson has got this one already. Thanks! Mark Yandell Professor of Human Genetics H.A. & Edna Benning Presidential Endowed Chair Eccles Institute of Human Genetics University of Utah 15 North 2030 East, Room 2100 Salt Lake City, UT 84112-5330 ph:801-587-7707 ________________________________________ From: maker-devel-bounces at yandell-lab.org [maker-devel-bounces at yandell-lab.org] on behalf of Town, Christopher D. [cdtown at jcvi.org] Sent: Wednesday, March 20, 2013 7:54 AM To: Carson Holt; Krishnakumar, Vivek; maker-devel at yandell-lab.org Cc: Tang, Haibao; Rosen, Benjamin; Bidwell, Shelby Subject: Re: [maker-devel] AED calculations using the MAKER pipeline Thanks. Is there any way of guestimating when this final step might be completed. 
We are in a time crunch here to get this analysis finished and the data/annotation out. Best Chris -----Original Message----- From: Carson Holt [mailto:Carson.Holt at oicr.on.ca] Sent: Wednesday, March 20, 2013 9:51 AM To: Krishnakumar, Vivek; maker-devel at yandell-lab.org Cc: Town, Christopher D.; Tang, Haibao; Bidwell, Shelby; Rosen, Benjamin Subject: Re: AED calculations using the MAKER pipeline In the current MAKER download when using GFF3 passthrough there was an issue with everything being done at the very last step. This of course leads to a memory spike and a very slow last step. That seems to be similar to what you are describing. It should be resolved in what will become version 2.28. I can give you access to the pre-release code, so you can check that the issue is resolved for you. I'll send details in a separate e-mail. Also the ### will be printed after every ~100,000 bp of assembly processed by MAKER. You can ignore them, but they actually have a meaning in GFF3. Basically everything between two sets of ###'s are fully resolved. It allows programs that read GFF3 to parallelize file loading or just load sections of a file as they can rapidly identify "safe chunks". Without them the entire file must be loaded into memory in order to be certain that all feature parts are there (as there is no requirement for sorting or order in GFF3). log.child files will always be empty unless you run analysis like snap or blast. Thanks, Carson On 13-03-20 9:05 AM, "Krishnakumar, Vivek" wrote: >Hi, > >We have been using the MAKER pipeline here at JCVI to calculate AED >scores by feeding in our annotation set as `model_gff` and the protein >and EST evidence as `protein_gff` and `est_gff` respectively. Here is >the issue we are having: > >When running the above pipeline with protein2genome and est2genome >evidence generated earlier by MAKER, there are no problems calculating >the AED score. Normally this pipeline takes a little over 12 hours to >complete. 
> >But if we use our own evidence, AAT and Genewise aligned proteins for >`protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline >runs very very slow and the intermediary *.gff.ann file has many chunks >(separated by '###') that are completely empty. Our evidence in >formatted in the same way as est2genome or protein2genome (GFF file >with "expressed_sequence_match::match_part" or "protein_match::match_part" >features respectively) > >The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use >the default `max_dna_len` parameter used to split the large assemblies >into chunks. > >Investigating the master_datastore.log shows me that the scaffolds run >through without any issues and the chromosomes are still being processed. >For any of the chromosomes, investigating the 'run.log' file, one level >above 'theVoid' shows me how many "final.section" jobs were started and >how many finished. And in the case of all the chromosomes, it tells me >that everything that was started has finished. And the 'log.child.*' >files within `theVoid` are all empty. Also within `theVoid`, I'm >noticing that the "raw.section" and "evidence_*.gff" files are not >empty. But one thing that is surprising is that of all the >"final.section" files, only the one pertaining to the last chunk is >very large (proportional to the size of the evidnce), the rest are all >exactly the same size (exactly 331 bytes). > >I'm running MAKER in MPI mode spawning 48 processes on a high memory >machine with 64 available cores and 1TB of RAM. > >I hope I've been able to explain my situation clearly in this email. > >Any help is appreciated. >Thank you. 
> >Vivek _______________________________________________ maker-devel mailing list maker-devel at box290.bluehost.com http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From carsonhh at gmail.com Wed Mar 20 11:36:30 2013 From: carsonhh at gmail.com (Carson Holt) Date: Wed, 20 Mar 2013 13:36:30 -0400 Subject: [maker-devel] AED calculations using the MAKER pipeline In-Reply-To: Message-ID: On the few cases where I found this (if it is the same issue you are experiencing), it was very much dependent on the total size of the evidence database and the length of the contigs. For me it took about 25-50% longer, but used up 10-15x as much RAM (primarily because the contigs were very long > 50 Mb each). The issue was unnoticeable on the short contigs that are more typical of de novo annotation. Thanks, Carson On 13-03-20 9:54 AM, "Town, Christopher D." wrote: >Thanks. Is there any way of guestimating when this final step might be >completed. We are in a time crunch here to get this analysis finished and >the data/annotation out. > >Best > >Chris > >-----Original Message----- >From: Carson Holt [mailto:Carson.Holt at oicr.on.ca] >Sent: Wednesday, March 20, 2013 9:51 AM >To: Krishnakumar, Vivek; maker-devel at yandell-lab.org >Cc: Town, Christopher D.; Tang, Haibao; Bidwell, Shelby; Rosen, Benjamin >Subject: Re: AED calculations using the MAKER pipeline > >In the current MAKER download when using GFF3 passthrough there was an >issue with everything being done at the very last step. This of course >leads to a memory spike and a very slow last step. That seems to be >similar to what you are describing. It should be resolved in what will >become version 2.28. I can give you access to the pre-release code, so >you can check that the issue is resolved for you. I'll send details in a >separate e-mail. > >Also the ### will be printed after every ~100,000 bp of assembly >processed by MAKER. You can ignore them, but they actually have a >meaning in GFF3. 
>Basically everything between two sets of ###'s are fully resolved. It >allows programs that read GFF3 to parallelize file loading or just load >sections of a file as they can rapidly identify "safe chunks". Without >them the entire file must be loaded into memory in order to be certain >that all feature parts are there (as there is no requirement for sorting >or order in GFF3). > >log.child files will always be empty unless you run analysis like snap or >blast. > >Thanks, >Carson > > > > > > >On 13-03-20 9:05 AM, "Krishnakumar, Vivek" wrote: > >>Hi, >> >>We have been using the MAKER pipeline here at JCVI to calculate AED >>scores by feeding in our annotation set as `model_gff` and the protein >>and EST evidence as `protein_gff` and `est_gff` respectively. Here is >>the issue we are having: >> >>When running the above pipeline with protein2genome and est2genome >>evidence generated earlier by MAKER, there are no problems calculating >>the AED score. Normally this pipeline takes a little over 12 hours to >>complete. >> >>But if we use our own evidence, AAT and Genewise aligned proteins for >>`protein_gff` and PASA assembled ESTs for `est_gff`, the same pipeline >>runs very very slow and the intermediary *.gff.ann file has many chunks >>(separated by '###') that are completely empty. Our evidence in >>formatted in the same way as est2genome or protein2genome (GFF file >>with "expressed_sequence_match::match_part" or >>"protein_match::match_part" >>features respectively) >> >>The input to my pipeline is 8 chromosomes, ~2200 scaffolds and I use >>the default `max_dna_len` parameter used to split the large assemblies >>into chunks. >> >>Investigating the master_datastore.log shows me that the scaffolds run >>through without any issues and the chromosomes are still being processed. >>For any of the chromosomes, investigating the 'run.log' file, one level >>above 'theVoid' shows me how many "final.section" jobs were started and >>how many finished. 
And in the case of all the chromosomes, it tells me
>>that everything that was started has finished. And the 'log.child.*'
>>files within `theVoid` are all empty. Also within `theVoid`, I'm
>>noticing that the "raw.section" and "evidence_*.gff" files are not
>>empty. But one thing that is surprising is that of all the
>>"final.section" files, only the one pertaining to the last chunk is
>>very large (proportional to the size of the evidence), the rest are all
>>exactly the same size (exactly 331 bytes).
>>
>>I'm running MAKER in MPI mode spawning 48 processes on a high memory
>>machine with 64 available cores and 1TB of RAM.
>>
>>I hope I've been able to explain my situation clearly in this email.
>>
>>Any help is appreciated.
>>Thank you.
>>
>>Vivek

From ares711122 at gmail.com Thu Mar 21 18:08:45 2013
From: ares711122 at gmail.com (Hung-Wei Hsu)
Date: Fri, 22 Mar 2013 08:08:45 +0800
Subject: [maker-devel] Directory structure is too deep!
Message-ID:

Hi MAKER developers,

I found that the MAKER outputs for each contig are located in separate, deeply nested directories. Can MAKER collect these outputs in one simple directory so that the results can be easily examined?

Thanks a lot in advance.

Warmest regards,
Hung-Wei

From carsonhh at gmail.com Thu Mar 21 20:07:23 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 21 Mar 2013 22:07:23 -0400
Subject: [maker-devel] Directory structure is too deep!
In-Reply-To:
Message-ID:

You can use gff3_merge to collect them into a single file or, to keep them as separate files but in the same directory, just use the standard Linux copy command. Similarly you can use fasta_merge to collect the fasta files.
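For runs with a very large number of contigs, a shell glob may expand to more arguments than the system allows, so the copy step described here could instead be scripted. The following is only a minimal Python sketch under stated assumptions: the function name `collect_files` is hypothetical, and the glob pattern is supplied by the caller rather than being a documented MAKER layout.

```python
import glob
import os
import shutil


def collect_files(pattern, dest_dir):
    """Copy every file matching the glob pattern into one flat directory.

    The pattern is an assumption supplied by the caller; adjust it to the
    datastore layout of your own run.
    """
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for src in sorted(glob.glob(pattern)):
        dst = os.path.join(dest_dir, os.path.basename(src))
        # Refuse to overwrite silently if two contigs share a file name
        if os.path.exists(dst):
            raise FileExistsError(f"{dst} already exists, not overwriting")
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

Calling `collect_files(pattern, "results")` returns the list of copied paths, which makes it easy to check that the number of files matches the number of contigs you expect.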
Example:

> mkdir results
> cp *.maker.output/*_datastore/*/*/*.gff results/

Thanks,
Carson

From: Hung-Wei Hsu
Date: Thursday, 21 March, 2013 8:08 PM
To:
Subject: [maker-devel] Directory structure is too deep!

Hi MAKER developers,

I found that the MAKER outputs of each contigs were located in separate deep directory. Can MAKER collect these outputs in one simple directory so that these results can be easily examined?

Thanks a lot in advance.

Warmest regards,
Hung-Wei

From jason.stajich at gmail.com Fri Mar 22 00:12:55 2013
From: jason.stajich at gmail.com (Jason Stajich)
Date: Thu, 21 Mar 2013 20:12:55 -1000
Subject: [maker-devel] failed gene prediction
In-Reply-To:
References:
Message-ID: <59B5B965-7B15-449E-B42F-E41D4F448B6A@gmail.com>

For fungi, I've put up some of the gene prediction parameters that I've built or trained, if that is helpful for you.

https://github.com/hyphaltip/fungi-gene-prediction-params

In the absence of any ESTs or RNA-Seq I also recommend generating a starting training set with CEGMA first and then training your predictors from there, except for GeneMark.hmm, which seems to do okay with self-training.

Jason

On Mar 18, 2013, at 10:49 AM, Carson Holt wrote:

> You didn't supply any evidence or HMM files for gene predictors. Just raw assembly data by itself is insufficient for genome annotation.
>
> Here is some nice documentation for running MAKER --> http://gmod.org/wiki/MAKER_Tutorial_2012
> Here is a nice overview of genome annotation in general --> http://fasta.bioch.virginia.edu/cshl/pdf/12/ajm12/euk_genome_annotation_review.pdf
>
> Once you've gone through the documentation and examples, if you come across any questions just let us know.
>
> Thanks,
> Carson
>
>
> From: "Borhan, Hossein"
> Date: Monday, 18 March, 2013 4:40 PM
> To:
> Subject: [maker-devel] failed gene prediction
>
> Hi
>
> I have tried maker on a fungus genome of 45 mb with 1/3 being repeat rich. It did not produce any prediction. I am not sure what is causing this. Attached are the STDERR and opts.ctl. I appreciate your help
>
>
> Hossein

Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org

From ares711122 at gmail.com Fri Mar 22 01:52:25 2013
From: ares711122 at gmail.com (Hung-Wei Hsu)
Date: Fri, 22 Mar 2013 15:52:25 +0800
Subject: [maker-devel] Can MAKER analyze the viral genome?
Message-ID:

Hi MAKER developers,

I'm wondering if MAKER can deal with the viral genome. If yes, how do I set the running parameters?

Thanks.

Kind regards,
Hung-Wei

From carsonhh at gmail.com Sat Mar 23 17:42:39 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Sat, 23 Mar 2013 19:42:39 -0400
Subject: [maker-devel] Can MAKER analyze the viral genome?
In-Reply-To:
Message-ID:

You can set organism type to prokaryotic and use the protein2genome option for annotation. It's not a perfect match as it only allows for partial gene spatial overlap and not full gene within a gene like you can see in viruses.

Thanks,
Carson

From: Hung-Wei Hsu
Date: Friday, 22 March, 2013 3:52 AM
To:
Subject: [maker-devel] Can MAKER analyze the viral genome?

Hi MAKER developers,

I'm wondering if MAKER can deal with the viral genome.
If yes, how do I set the running parameters?

Thanks.

Kind regards,
Hung-Wei

From jjin01 at mail.rockefeller.edu Sat Mar 23 18:43:54 2013
From: jjin01 at mail.rockefeller.edu (Jingjing Jin)
Date: Sun, 24 Mar 2013 00:43:54 +0000
Subject: [maker-devel] maker running error
Message-ID:

Dear all,

When I run the maker, there is an error like this:

*** buffer overflow detected ***: /usr/bin/perl terminated
======= Backtrace: =========
/lib64/libc.so.6(__fortify_fail+0x37)[0x3582d01d47]
/lib64/libc.so.6[0x3582cffc30]
/lib64/libc.so.6[0x3582cff089]
/lib64/libc.so.6(__printf_fp+0x1531)[0x3582c4afa1]
/lib64/libc.so.6(_IO_vfprintf+0x11a7)[0x3582c45407]
/lib64/libc.so.6(__vsprintf_chk+0x9d)[0x3582cff12d]
/lib64/libc.so.6(__sprintf_chk+0x7f)[0x3582cff06f]
/usr/local/maker/lib/File/../../perl/lib/auto/Proc/ProcessTable/ProcessTable.so(OS_get_table+0x9bb)[0x7f328e8eb69b]
/usr/local/maker/lib/File/../../perl/lib/auto/Proc/ProcessTable/ProcessTable.so(XS_Proc__ProcessTable_table+0x182)[0x7f328e8ecc02]
/usr/lib64/perl5/CORE/libperl.so(Perl_pp_entersub+0x5a5)[0x35848a66d5]
/usr/lib64/perl5/CORE/libperl.so(Perl_runops_standard+0x16)[0x35848a49c6]
/usr/lib64/perl5/CORE/libperl.so(perl_run+0x338)[0x358484d0d8]
/usr/bin/perl(main+0xec)[0x400cac]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3582c1ecdd]
/usr/bin/perl[0x400af9]
======= Memory map: ========

Could anyone give me some suggestion about how to deal with this problem?

Thanks!

Jingjing
From carsonhh at gmail.com Sat Mar 23 21:04:49 2013
From: carsonhh at gmail.com (Carson Holt)
Date: Sat, 23 Mar 2013 23:04:49 -0400
Subject: [maker-devel] maker running error
In-Reply-To:
Message-ID:

Could you try maker version 2.27 from the website? Proc::ProcessTable may have problems on your system in accessing the process table. Version 2.27 tries to access the same information by first parsing the output of the standard 'df' command and only tries to access the process table directly if that fails.

Thanks,
Carson

From: Jingjing Jin
Date: Saturday, 23 March, 2013 8:43 PM
To: "maker-devel at yandell-lab.org"
Subject: [maker-devel] maker running error

Dear all,

When I run the maker, there is an error like this:

*** buffer overflow detected ***: /usr/bin/perl terminated
======= Backtrace: =========
/lib64/libc.so.6(__fortify_fail+0x37)[0x3582d01d47]
/lib64/libc.so.6[0x3582cffc30]
/lib64/libc.so.6[0x3582cff089]
/lib64/libc.so.6(__printf_fp+0x1531)[0x3582c4afa1]
/lib64/libc.so.6(_IO_vfprintf+0x11a7)[0x3582c45407]
/lib64/libc.so.6(__vsprintf_chk+0x9d)[0x3582cff12d]
/lib64/libc.so.6(__sprintf_chk+0x7f)[0x3582cff06f]
/usr/local/maker/lib/File/../../perl/lib/auto/Proc/ProcessTable/ProcessTable.so(OS_get_table+0x9bb)[0x7f328e8eb69b]
/usr/local/maker/lib/File/../../perl/lib/auto/Proc/ProcessTable/ProcessTable.so(XS_Proc__ProcessTable_table+0x182)[0x7f328e8ecc02]
/usr/lib64/perl5/CORE/libperl.so(Perl_pp_entersub+0x5a5)[0x35848a66d5]
/usr/lib64/perl5/CORE/libperl.so(Perl_runops_standard+0x16)[0x35848a49c6]
/usr/lib64/perl5/CORE/libperl.so(perl_run+0x338)[0x358484d0d8]
/usr/bin/perl(main+0xec)[0x400cac]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3582c1ecdd]
/usr/bin/perl[0x400af9]
======= Memory map: ========

Could anyone give me some suggestion about how to deal with this problem?

Thanks!
Jingjing

From mnuhn at ebi.ac.uk Mon Mar 25 06:18:11 2013
From: mnuhn at ebi.ac.uk (mnuhn)
Date: Mon, 25 Mar 2013 12:18:11 +0000
Subject: [maker-devel] master_datastore_index.log file shrinks.
In-Reply-To:
References:
Message-ID: <407ae892252062e886fb3855bb6bf74c@ebi.ac.uk>

Thanks, this works and mpi maker is running now.

Cheers,
Michael.

P.S.: If anyone is trying to reproduce this, I only had one directory in LD_PRELOAD and it didn't like the trailing colon, so I removed it to make it work:

export LD_PRELOAD=/software/openmpi-1.4.3/lib/libmpi.so

On 2013-03-19 15:22, Carson Holt wrote:
> I have MAKER working under OpenMPI 1.4.3 (intel compiled).
>
> I had to set a couple of environment variables prior to setup. You would
> probably need to set these values as well. If your OpenMPI path was
> here for example --> /software/openmpi-1.4.3/, run the following commands
> (path set accordingly) before even attempting maker setup.
>
> export OMPI_MCA_mpi_warn_on_fork=0
> export LD_PRELOAD=/software/openmpi-1.4.3/lib/libmpi.so:$LD_PRELOAD
>
> These not only need to be set before compilation, but also before any run
> (so add them to your ~/.bashrc or ~/.bash_profile or any module load scripts).
> The LD_PRELOAD statement needs to be set for any program using
> OpenMPI's shared libraries and not just MAKER, so it's normally a good
> idea to have that set system wide for all users. The details can be found
> in the OpenMPI documentation. Note sometimes system library updates can
> break OpenMPI's shared libraries while not breaking OpenMPI itself, so you
> might also need to recompile OpenMPI if it has broken shared libraries.
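The two settings quoted above can also be applied from a wrapper script rather than a shell profile. What follows is only a minimal Python sketch: the helper name `build_mpi_env` is hypothetical, and the library path is whatever your own OpenMPI installation uses. It joins the LD_PRELOAD entries so that an empty existing value does not leave the trailing colon mentioned in the P.S.

```python
import os


def build_mpi_env(libmpi_path, base_env=None):
    """Return a copy of the environment with the two OpenMPI settings applied."""
    env = dict(os.environ if base_env is None else base_env)
    # Silence OpenMPI's fork warning, as in the quoted instructions
    env["OMPI_MCA_mpi_warn_on_fork"] = "0"
    # Keep only non-empty entries so an unset or empty LD_PRELOAD
    # never produces a leading or trailing colon
    entries = [p for p in env.get("LD_PRELOAD", "").split(":") if p]
    if libmpi_path not in entries:
        entries.insert(0, libmpi_path)
    env["LD_PRELOAD"] = ":".join(entries)
    return env
```

The returned dictionary could then be passed as the `env` argument of `subprocess.run` when launching MAKER under `mpiexec`.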
>
> Once you have those commands in place, run the perl Build.PL step. Say yes
> to install with MPI. Then run ./Build install
>
> Thanks,
> Carson
>
>
>
> On 13-03-19 11:02 AM, "Carson Holt" wrote:
>
>>Try it with the no_locks option then. Make sure to let one instance
>>finish populating the mpi_blastdb directory before running other instances
>>as that is where most initial locking occurs.
>>
>>I'll send you more details on how to install with OpenMPI, so you can give
>>that a shot while your jobs are also running serially (so you don't lose
>>time). Also instead of 50 serial instances, you could try 10 with -cpus
>>set to 5.
>>
>>Thanks,
>>Carson
>>
>>
>>
>>On 13-03-19 11:19 AM, "Michael Nuhn" wrote:
>>
>>>Hello Carson!
>>>
>>>On 03/19/2013 02:27 PM, Carson Holt wrote:
>>>> Yes. If at all possible use MPI. It removes the overhead of locks
>>>> which happen per primary instance of MAKER. So one maker job using 1000
>>>> cpus via MPI will have one shared set of locks. 1000 serial instances
>>>> of MAKER on the other hand would have 1000x the locks.
>>>
>>>I don't know a thing about MPI.
>>>
>>>I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and open
>>>mpi and none of them worked for me. I also tried the automatic
>>>installation that comes with maker, but it didn't work for me either.
>>>
>>>If need be, I could spend time getting to the bottom of this, but there
>>>is no telling how long this would take me so I'd rather not, if there is
>>>an alternative.
>>>
>>>Would the approach I outlined before work? (Treating the split files as
>>>separate genomes to annotate and then combine the gffs afterwards)
>>>
>>>I also like this approach, because I would select a few contigs in the
>>>beginning which I would run on their own. They would complete early and
>>>this way I would get a preview of the results of the run instead of
>>>having to wait for everything to complete.
>>> >>>It might also be more robust, because file locking issues would be >>>confined to the instances working on a sequence chunk, but the rest >>> of >>>the instances could continue working. >>> >>>Cheers, >>>Michael. >>> >>>> Alternatively if you do need to continue without MPI for some >>>> reason, I >>>> just finished a devel version of MAKER that has a --no_locks >>>> option. >>>> You can never start two instances using the same input fasta >>>> when >>>> --no_locks is specified, but the splitting to use different input >>>>fastas >>>> I mentioned before in the example will still work fine. >>>> >>>> I also have updated the indexing/reindexing, so if indexing >>>> failures >>>> happen, MAKER will switch between the current working directory >>>> and the >>>> TMP= directory from the maker_opts.ctl file so as to try different >>>> IO >>>> locations (i.e. NFS and non-NFS). Note you should never set TMP= >>>> in >>>>the >>>> control files to an NFS mounted location (it not only makes things >>>> a >>>>lot >>>> slower, but BerkeleyDB and SQLite will get frequent errors on >>>> NFS). >>>> TMP= defaults to /tmp when not specified. >>>> >>>> I'll send you download information in a separate e-mail. Try a >>>> regular >>>> MAKER run to see if the indexing/reindexing changes are sufficient >>>> before attempting the --no_locks option. >>>> >>>> Thanks, >>>> Carson From lengjingmao at gmail.com Mon Mar 25 07:49:11 2013 From: lengjingmao at gmail.com (shaohua.fan) Date: Mon, 25 Mar 2013 14:49:11 +0100 Subject: [maker-devel] maker terminated strangely Message-ID: Hi Maker developers, I ran into a problem while using the MAKER 2.27 beta: the pipeline terminated in the middle of the process without any error message. The genome I am working with is a eukaryotic genome which consists of around 6000 scaffolds.
I combined de novo (Augustus and SNAP) and evidence-based (protein from a closely related species and transcriptome from the same species) approaches for the gene prediction (the genome is already repeat masked). The MPI-enabled MAKER (mpich2 version 1.5) was run on a cluster using SGE. I checked with the administrator of our cluster; there is no limit on SGE jobs. MAKER was run using mpiexec -n 48 /home/shafan/maker/bin/maker maker_opts.ctl maker_bopts.ctl maker_exe.ctl I attached my maker_opts.ctl; please let me know if you need any more information about this problem. Thanks a lot! Shaohua -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: maker_opts.ctl Type: application/octet-stream Size: 4519 bytes Desc: not available URL: From carsonhh at gmail.com Mon Mar 25 08:01:45 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 25 Mar 2013 10:01:45 -0400 Subject: [maker-devel] maker terminated strangely In-Reply-To: Message-ID: Could you send your captured standard error? That would contain messages that highlight the specific cause. Thanks, Carson From: "shaohua.fan" Date: Monday, 25 March, 2013 9:49 AM To: Subject: [maker-devel] maker terminated strangely Hi Maker developers, I ran into a problem while using the MAKER 2.27 beta: the pipeline terminated in the middle of the process without any error message. The genome I am working with is a eukaryotic genome which consists of around 6000 scaffolds. I combined de novo (Augustus and SNAP) and evidence-based (protein from a closely related species and transcriptome from the same species) approaches for the gene prediction (the genome is already repeat masked). The MPI-enabled MAKER (mpich2 version 1.5) was run on a cluster using SGE. I checked with the administrator of our cluster; there is no limit on SGE jobs.
MAKER was run using mpiexec -n 48 /home/shafan/maker/bin/maker maker_opts.ctl maker_bopts.ctl maker_exe.ctl I attached my maker_opts.ctl; please let me know if you need any more information about this problem. Thanks a lot! Shaohua -------------- next part -------------- An HTML attachment was scrubbed... URL: From lengjingmao at gmail.com Mon Mar 25 08:07:17 2013 From: lengjingmao at gmail.com (shaohua.fan) Date: Mon, 25 Mar 2013 15:07:17 +0100 Subject: [maker-devel] maker terminated strangely In-Reply-To: References: Message-ID: Hi Carson, Do you mean standard output from maker? If yes, I need to upload the file to our FTP server, since it is quite big, around 1.1 GB. Shaohua 2013/3/25 Carson Holt > Could you send your captured standard error? That would contain messages > that highlight the specific cause. > > Thanks, > Carson > > > From: "shaohua.fan" > Date: Monday, 25 March, 2013 9:49 AM > To: > Subject: [maker-devel] maker terminated strangely > > Hi Maker developers, > > I ran into a problem while using the MAKER 2.27 beta: the > pipeline terminated in the middle of the process without any error message. > > > The genome I am working with is a eukaryotic genome which consists of > around 6000 scaffolds. I combined de novo (Augustus and SNAP) and > evidence-based (protein from a closely related species and transcriptome from the > same species) approaches for the gene prediction (the genome is already repeat > masked). The MPI-enabled MAKER (mpich2 version 1.5) was run on a cluster > using SGE. I checked with the administrator of our cluster; there is no > limit on SGE jobs.
> > MAKER was run using mpiexec -n 48 /home/shafan/maker/bin/maker > maker_opts.ctl maker_bopts.ctl maker_exe.ctl > > I attached my maker_opts.ctl; please let me know if you need any > more information about this problem. > > Thanks a lot! > > Shaohua -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon Mar 25 08:07:45 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 25 Mar 2013 10:07:45 -0400 Subject: [maker-devel] master_datastore_index.log file shrinks. In-Reply-To: <407ae892252062e886fb3855bb6bf74c@ebi.ac.uk> Message-ID: Great news. I'm glad it's working. If you have more questions, just let me know. --Carson On 13-03-25 8:18 AM, "mnuhn" wrote: >Thanks, this works and mpi maker is running now. > >Cheers, >Michael. > >P.S.: > >If anyone is trying to reproduce this, I only had one directory in >LD_PRELOAD and it didn't like the trailing colon, so I removed it to >make it work: > >export LD_PRELOAD=/software/openmpi-1.4.3/lib/libmpi.so > >On 2013-03-19 15:22, Carson Holt wrote: >> I have MAKER working under OpenMPI 1.4.3 (Intel compiled). >> >> I had to set a couple of environment variables prior to setup. You >> would >> probably need to set these values as well. If your OpenMPI path >> was >> here for example --> /software/openmpi-1.4.3/, run the following >> commands >> (path set accordingly) before even attempting maker setup. >> >> export OMPI_MCA_mpi_warn_on_fork=0 >> export LD_PRELOAD=/software/openmpi-1.4.3/lib/libmpi.so:$LD_PRELOAD >> >> These not only need to be set before compilation, but also before any >> run >> (so add them to your ~/.bashrc or ~/.bash_profile or any module load >> scripts >> thanks).
The LD_PRELOAD statement needs to be set for any program >> using >> OpenMPI's shared libraries and not just MAKER, so it's normally a >> good >> idea to have that set system wide for all users. The details can be >> found >> in the OpenMPI documentation. Note sometimes system library updates >> can >> break OpenMPI's shared libraries while not breaking OpenMPI itself, >> so you >> might also need to recompile OpenMPI if it has broken shared >> libraries. >> >> Once you have those commands in place, run the perl Build.PL step. Say >> yes >> to install with MPI. Then run ./Build install >> >> Thanks, >> Carson >> >> >> >> On 13-03-19 11:02 AM, "Carson Holt" wrote: >> >>>Try it with the no_locks option then. Make sure to let one instance >>>finish populating the mpi_blastdb directory before running other >>>instances >>>as that is where most initial locking occurs. >>> >>>I'll send you more details on how to install with OpenMPI, so you can >>>give >>>that a shot while your jobs are also running serially (so you don't >>> lose >>>time). Also instead of 50 serial instances, you could try 10 with >>> -cpus >>>set to 5. >>> >>>Thanks, >>>Carson >>> >>> >>> >>>On 13-03-19 11:19 AM, "Michael Nuhn" wrote: >>> >>>>Hello Carson! >>>> >>>>On 03/19/2013 02:27 PM, Carson Holt wrote: >>>>> Yes. If at all possible use MPI. It removes the overhead of >>>>> locks >>>>> which happen per primary instance of MAKER. So one maker job >>>>> using >>>>>1000 >>>>> cpus via MPI will have one shared set of locks. 1000 serial >>>>> instances >>>>> of MAKER on the other hand would have 1000x the locks. >>>> >>>>I don't know a thing about MPI. >>>> >>>>I tried installing maker (2.2.7) with mpich-3.0.2, mpich2-1.4.1 and >>>> open >>>>mpi and none of them worked for me. I also tried the automatic >>>>installation that comes with maker, but it didn't work for me >>>> either.
>>>> >>>>If need be, I could spend time getting to the bottom of this, but >>>> there >>>>is no telling how long this would take me so I'd rather not, if >>>> there is >>>>an alternative. >>>> >>>>Would the approach I outlined before work? (Treating the split files >>>> as >>>>separate genomes to annotate and then combine the gffs afterwards) >>>> >>>>I also like this approach, because I would select a few contigs in >>>> the >>>>beginning which I would run on their own. They would complete early >>>> and >>>>this way I would get a preview of the results of the run instead of >>>>having to wait for everything to complete. >>>> >>>>It might also be more robust, because file locking issues would be >>>>confined to the instances working on a sequence chunk, but the rest >>>> of >>>>the instances could continue working. >>>> >>>>Cheers, >>>>Michael. >>>> >>>>> Alternatively if you do need to continue without MPI for some >>>>> reason, I >>>>> just finished a devel version of MAKER that has a --no_locks >>>>> option. >>>>> You can never start two instances using the same input fasta >>>>> when >>>>> --no_locks is specified, but the splitting to use different input >>>>>fastas >>>>> I mentioned before in the example will still work fine. >>>>> >>>>> I also have updated the indexing/reindexing, so if indexing >>>>> failures >>>>> happen, MAKER will switch between the current working directory >>>>> and the >>>>> TMP= directory from the maker_opts.ctl file so as to try different >>>>> IO >>>>> locations (i.e. NFS and non-NFS). Note you should never set TMP= >>>>> in >>>>>the >>>>> control files to an NFS mounted location (it not only makes things >>>>> a >>>>>lot >>>>> slower, but BerkeleyDB and SQLite will get frequent errors on >>>>> NFS). >>>>> TMP= defaults to /tmp when not specified. >>>>> >>>>> I'll send you download information in a separate e-mail.
Try a >>>>> regular >>>>> MAKER run to see if the indexing/reindexing changes are sufficient >>>>> before attempting the --no_locks option. >>>>> >>>>> Thanks, >>>>> Carson > From carsonhh at gmail.com Mon Mar 25 08:08:17 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 25 Mar 2013 10:08:17 -0400 Subject: [maker-devel] maker terminated strangely In-Reply-To: Message-ID: Yes. Thanks, Carson From: "shaohua.fan" Date: Monday, 25 March, 2013 10:07 AM To: Carson Holt Cc: Subject: Re: [maker-devel] maker terminated strangely Hi Carson, Do you mean standard output from maker? If yes, I need to upload the file to our FTP server, since it is quite big, around 1.1 GB. Shaohua 2013/3/25 Carson Holt > Could you send your captured standard error? That would contain messages that > highlight the specific cause. > > Thanks, > Carson > > > From: "shaohua.fan" > Date: Monday, 25 March, 2013 9:49 AM > To: > Subject: [maker-devel] maker terminated strangely > > Hi Maker developers, > > I ran into a problem while using the MAKER 2.27 beta: the > pipeline terminated in the middle of the process without any error message. > > > The genome I am working with is a eukaryotic genome which consists of > around 6000 scaffolds. I combined de novo (Augustus and SNAP) and > evidence-based (protein from a closely related species and transcriptome from the same > species) approaches for the gene prediction (the genome is already repeat masked). The > MPI-enabled MAKER (mpich2 version 1.5) was run on a cluster using SGE. I > checked with the administrator of our cluster; there is no limit on SGE > jobs. > > MAKER was run using mpiexec -n 48 /home/shafan/maker/bin/maker > maker_opts.ctl maker_bopts.ctl maker_exe.ctl > > I attached my maker_opts.ctl; please let me know if you need any more information > about this problem. > > Thanks a lot!
> > Shaohua -------------- next part -------------- An HTML attachment was scrubbed... URL: From ares711122 at gmail.com Mon Mar 25 20:50:52 2013 From: ares711122 at gmail.com (Hung-Wei Hsu) Date: Tue, 26 Mar 2013 10:50:52 +0800 Subject: [maker-devel] Why are some start positions minus in the gff result? Message-ID: Hi MAKER developers, I was able to run MAKER successfully and get the final GFF, but I found that some start positions in the GFF were negative. That led to an error in the GFF reader. Is this a bug? Could you please help to resolve this problem? Thanks a lot in advance. Best regards, Hung-Wei -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon Mar 25 21:24:01 2013 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 25 Mar 2013 23:24:01 -0400 Subject: [maker-devel] Why are some start positions minus in the gff result? In-Reply-To: Message-ID: I haven't seen that before, so could you package up the job (all input and control files) that generates this and send it to me? You're using maker's prokaryotic settings to try and get it to annotate viral genomes, correct? --Carson From: Hung-Wei Hsu Date: Monday, 25 March, 2013 10:50 PM To: Subject: [maker-devel] Why are some start positions minus in the gff result? Hi MAKER developers, I was able to run MAKER successfully and get the final GFF, but I found that some start positions in the GFF were negative. That led to an error in the GFF reader. Is this a bug? Could you please help to resolve this problem? Thanks a lot in advance. Best regards, Hung-Wei -------------- next part -------------- An HTML attachment was scrubbed...
URL: From hudarul at yahoo.com Sun Mar 31 14:02:04 2013 From: hudarul at yahoo.com (Hud Hud) Date: Sun, 31 Mar 2013 13:02:04 -0700 (PDT) Subject: [maker-devel] Help on error-Repeat masker Message-ID: <1364760124.37890.YahooMailNeo@web164901.mail.bf1.yahoo.com> Hello, I have a problem when running MAKER. I got the following error; what could be going wrong here? Thanks so much. setting up GFF3 output and fasta chunks doing repeat masking running repeat masker. #--------- command -------------# Widget::RepeatMasker: cd /tmp/maker_WOVHsi; /home/maker-2.27-beta/maker/exe/RepeatMasker/RepeatMasker /home/maker-2.27-beta/maker/data/contig.maker.output/contig_datastore/61/0D/contig172//theVoid.contig172/contig172.0.simple.rb -dir /home/maker-2.27-beta/maker/data/contig.maker.output/contig_datastore/61/0D/contig172//theVoid.contig172 -pa 1 -lib /tmp/maker_WOVHsi/b1piBcWHlH #-------------------------------# sh: /home/maker-2.27-beta/maker/exe/RepeatMasker/RepeatMasker: /u1/local/bin/perl: bad interpreter: Permission denied ERROR: RepeatMasker failed --> rank=NA, hostname=Homis ERROR: Failed while doing repeat masking ERROR: Chunk failed at level:0, tier_type:1 FAILED CONTIG:contig172 ERROR: Chunk failed at level:2, tier_type:0 FAILED CONTIG:172 examining contents of the fasta file and run log -------------- next part -------------- An HTML attachment was scrubbed... URL:
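A note on the "bad interpreter: Permission denied" line in the log above: the shell prints this when it cannot execute the program named on the script's first ("#!") line, here /u1/local/bin/perl. That usually means the path is not executable by the current user, or does not point at a usable perl on this machine. The following is only a minimal sketch for checking that, not something taken from MAKER or RepeatMasker; the function name check_shebang is a hypothetical helper, and it assumes a POSIX sh with head and sed available.

```shell
#!/bin/sh
# Hypothetical helper (not part of MAKER or RepeatMasker): print the
# interpreter named on a script's "#!" line and report whether the
# current user can execute it.
check_shebang() {
    script=$1
    # Only the first line of the script matters for the kernel.
    first_line=$(head -n 1 "$script")
    case $first_line in
        '#!'*) ;;
        *) echo "no shebang line in $script"; return 2 ;;
    esac
    # Drop the leading "#!" and optional spaces, then keep the first
    # word, which is the interpreter path (any arguments are ignored).
    interp=$(printf '%s\n' "$first_line" | sed -e 's/^#![[:space:]]*//' -e 's/[[:space:]].*$//')
    if [ -x "$interp" ]; then
        echo "ok: $interp is executable"
    else
        echo "bad interpreter: $interp is missing or not executable"
        return 1
    fi
}

# Example, using the path from the error message above:
# check_shebang /home/maker-2.27-beta/maker/exe/RepeatMasker/RepeatMasker
```

If the check reports a bad interpreter, comparing the reported path with the output of "command -v perl" shows which perl is actually available; the first line of the RepeatMasker script would then need to point at that one.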