From keith.decker at bayer.com Wed Jul 10 13:12:31 2019
From: keith.decker at bayer.com (DECKER, KEITH F [AG/1005])
Date: Wed, 10 Jul 2019 18:12:31 +0000
Subject: [maker-devel] Annotations from high confidence IsoSeq data
Message-ID:

I have a set of high-confidence, full-length transcripts generated via the PacBio IsoSeq pipeline.

Is there a recommended way to pass these transcripts to MAKER and ensure that the corresponding gene models are included in the final annotation? I also have assembled Illumina transcripts, but I'm trying to figure out how to give the IsoSeq transcripts a higher "weight".

The information contained in this e-mail is for the exclusive use of the intended recipient(s) and may be confidential, proprietary, and/or legally privileged. Inadvertent disclosure of this message does not constitute a waiver of any privilege. If you receive this message in error, please do not directly or indirectly use, print, copy, forward, or disclose any part of this message. Please also delete this e-mail and all copies and notify the sender. Thank you.

From cwarden at coh.org Mon Jul 15 11:25:59 2019
From: cwarden at coh.org (Charles Warden)
Date: Mon, 15 Jul 2019 16:25:59 +0000
Subject: [maker-devel] MAKER caught in loop
Message-ID:

Hi,

I am encountering some issues running MAKER on my Mac.

I've run MAKER in the past, and I was even able to run MAKER very quickly (I think less than 1 minute) for 1 of my 2 assemblies. For the successful run, I installed some headers from /Library/Developer/CommandLineTools/Packages/macOS_SDK_headers_for_macOS_10.14.pkg, and MAKER worked (but the files were not visible on my PC). Actually, on my Mac, I could see them from my terminal, but not through the Mac file browser, even after a restart.

I've attached a copy of my configuration files, and I am running MAKER with the following command:

/opt/maker/bin/maker -nolock -fix_nucleotides

I upgraded my OS within the past few months, but it seems like there is some issue with the permissions for the files being created. I tested running MAKER from my Mac's local hard drive (instead of the network folder). However, the run time is considerably longer than for the assembly that ran to completion (although that .gff file is not visible through several interfaces, for some reason).

For example, MAKER has been stuck on this step since last Friday (for a 45,013 bp sequence), whereas annotation for a larger sequence completed within minutes:

STATUS: Parsing control files...
WARNING: You have chosen to turn locking off which may create race conditions if running in parallel. You have been warned.
STATUS: Processing and indexing input FASTA files...
STATUS: Setting up database for any GFF3 input...
A data structure will be created for you at:
/Users/cwarden/Documents/MAKER/AC270418_270441.maker.output/AC270418_270441_datastore

To access files for individual sequences use the datastore index:
/Users/cwarden/Documents/MAKER/AC270418_270441.maker.output/AC270418_270441_master_datastore_index.log

STATUS: Now running MAKER...
examining contents of the fasta file and run log

--Next Contig--

#---------------------------------------------------------------------
Now starting the contig!!
SeqID: AC270418_270441
Length: 45015
#---------------------------------------------------------------------

Can you please help me troubleshoot this issue?

Thank You,
Charles

------------------------------------------------------------
-SECURITY/CONFIDENTIALITY WARNING-
This message and any attachments are intended solely for the individual or entity to which they are addressed.
This communication may contain information that is privileged, confidential, or exempt from disclosure under applicable law (e.g., personal health information, research data, financial information). Because this e-mail has been sent without encryption, individuals other than the intended recipient may be able to view the information, forward it to others or tamper with the information without the knowledge or consent of the sender. If you are not the intended recipient, or the employee or person responsible for delivering the message to the intended recipient, any dissemination, distribution or copying of the communication is strictly prohibited. If you received the communication in error, please notify the sender immediately by replying to this message and deleting the message and any accompanying files from your system. If, due to the security risks, you do not wish to receive further communications via e-mail, please reply to this message and inform the sender that you do not wish to receive further e-mail from the sender. (LCP301)
------------------------------------------------------------
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_bopts.ctl
Type: application/octet-stream
Size: 497 bytes
Desc: maker_bopts.ctl
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_exe.ctl
Type: application/octet-stream
Size: 352 bytes
Desc: maker_exe.ctl
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_opts.ctl
Type: application/octet-stream
Size: 1180 bytes
Desc: maker_opts.ctl
URL:

From jmartin at wustl.edu Mon Jul 15 19:58:30 2019
From: jmartin at wustl.edu (Martin, John)
Date: Tue, 16 Jul 2019 00:58:30 +0000
Subject: [maker-devel] Trouble using custom repeat library w/ Maker v2.31.10
Message-ID: <65c82584-b495-a2a3-1c74-9fb45af52372@wustl.edu>

Greetings,

I'm trying to use a custom repeat library in a maker run being performed on a ~400Mb worm genome. I've successfully tested my installation of maker v2.31.10 using a small set of 13 contigs from my full assembly (~100kb or less) using the 'model_org=all' setting for the RepeatMasker part of the maker_opts.ctl configuration. But when I try to feed maker my repeat library fasta file, maker errors out when trying to repeatmask because it's trying to run hmmpress on the fasta file I set as 'rmlib=...'

I've attached the 3 .ctl files I'm using. I am not setting any other command line arguments for maker; when I launch the jobs I just type:

maker

in the directory with those .ctl files. The specific error message I am getting appears inside each of the subdirectories of the datastore, inside hmmPress.log files:

Error: File format problem in trying to open HMM file /gscmnt/gc2732/mitrevalab/USDA_Zarlenga/t_colubriformis/180530_assembly_and_annotation_PacBio_sequel_data/SUPERNOVA_assembly/ANNOTATION/MAKER/maker_1/TMP_1/maker_sbWsNX/RM_416.MonJul152204492019/consensi.fa.classified.
Format tag is '>rnd-1_family-14#Unknown': unrecognized.
Current H3 format is 'HMMER3/f'. Previous H2/H3 formats also supported.

I am not sure why maker is trying to run hmmpress on the fasta file entered for the 'rmlib' setting, which the documentation says should be fasta. I think hmmpress should be used on hmm files. Can someone help me troubleshoot this problem?
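
The hmmPress.log message above shows a FASTA header being read as a format tag. As a minimal sketch, assuming the only distinction needed is between FASTA (first non-blank line starts with `>`) and a HMMER3 profile (first non-blank line starts with `HMMER3/`), a library file could be checked like this. The function names are hypothetical and are not part of MAKER, RepeatMasker, or HMMER:

```python
def sniff_library_format(lines):
    """Guess the format of a repeat library from its leading lines.

    Returns "fasta", "hmmer3", or "unknown". This is only a heuristic
    based on the first non-blank line.
    """
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # skip leading blank lines
        if stripped.startswith(">"):
            return "fasta"
        if stripped.startswith("HMMER3/"):
            return "hmmer3"
        return "unknown"
    return "unknown"  # empty input


def sniff_library_file(path):
    """Open a file and classify it with sniff_library_format."""
    with open(path, "r", errors="replace") as handle:
        return sniff_library_format(handle)


if __name__ == "__main__":
    # The tag quoted in the error message is an ordinary FASTA header.
    print(sniff_library_format([">rnd-1_family-14#Unknown\n", "ACGT\n"]))
    # A HMMER3 profile begins with a version tag such as "HMMER3/f".
    print(sniff_library_format(["HMMER3/f\n"]))
```

A "fasta" result for the file given as rmlib would suggest the library itself is in the documented format, and that the mismatch lies in which search engine the repeat-masking step was set up to use.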

Thanks,

John Martin

________________________________
The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: maker_bopts.ctl
URL:
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: maker_opts.ctl
URL:
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: maker_exe.ctl
URL:

From carsonhh at gmail.com Tue Jul 16 14:51:09 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 16 Jul 2019 13:51:09 -0600
Subject: [maker-devel] Trouble using custom repeat library w/ Maker v2.31.10
In-Reply-To: <65c82584-b495-a2a3-1c74-9fb45af52372@wustl.edu>
References: <65c82584-b495-a2a3-1c74-9fb45af52372@wustl.edu>
Message-ID:

Reinstall RepeatMasker manually. It has apparently been configured during its install to use HMMER, and it's possible that this was done incorrectly if it was installed by a package manager like Homebrew.

It's installed under /usr/local/RepeatMasker/ according to your maker_exe.ctl file.

--Carson

> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From carsonhh at gmail.com Tue Jul 16 15:41:40 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 16 Jul 2019 14:41:40 -0600
Subject: [maker-devel] Your message to maker-devel awaits moderator approval
In-Reply-To: <5685F7D3-347A-4309-B1DE-51476ECA1CD6@coh.org>
References: <5685F7D3-347A-4309-B1DE-51476ECA1CD6@coh.org>
Message-ID:

I'm glad you were able to find a solution. I do imagine it is a Mac OS X update issue, since its compatibility with Linux tools has gradually decreased with each update.

With respect to Google Groups, we just use it as an archive; new messages and replies have to go to the maker-devel e-mail list.

Thanks,
Carson

> On Jul 16, 2019, at 11:55 AM, Charles Warden wrote:
>
> Hi,
>
> I noticed that this was actually posted on-line:
>
> https://groups.google.com/forum/#!msg/maker-devel/VCC-BFpDXtU/Bx9H2BoqEAAJ
>
> I have to admit that I didn't realize this was being made public, and I don't seem to be able to respond from the Google Groups interface (with my personal Gmail account).
>
> However, if other people encounter a similar issue, I found a solution using a local Ubuntu Docker image (and the newer version of MAKER, version maker-3.01.02-beta).
>
> I'm not entirely sure why the Mac installation used to work but currently doesn't work, although my guess is that it may have something to do with the OS upgrade. Also, I don't think it actually worked for even 1 of the 2 assemblies before (since I was not able to find those hidden files, and it took longer to create a FASTA file with protein and transcript sequences). However, I currently have those FASTA files for both sequences.
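
For anyone checking whether a run like the one discussed in this thread actually finished, the run log points at a master_datastore_index.log file. The sketch below assumes that each line of that file holds three tab-separated fields (sequence ID, output directory, status) and that a completed sequence ends with the status FINISHED. Both points are assumptions to verify against your own file, and the function names and example entries are hypothetical:

```python
def latest_status(index_lines):
    """Map each sequence ID to the last status recorded for it.

    Assumes tab-separated lines: seq_id <TAB> directory <TAB> status.
    Lines that do not have exactly three fields are ignored.
    """
    status = {}
    for line in index_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            continue
        seq_id, _directory, state = fields
        status[seq_id] = state  # later lines overwrite earlier ones
    return status


def unfinished(index_lines):
    """Return sequence IDs whose most recent status is not FINISHED."""
    return [seq_id for seq_id, state in latest_status(index_lines).items()
            if state != "FINISHED"]


if __name__ == "__main__":
    # Made-up index contents; the IDs and directory names are illustrative.
    example = [
        "contig_A\texample_datastore/contig_A/\tSTARTED\n",
        "contig_A\texample_datastore/contig_A/\tFINISHED\n",
        "contig_B\texample_datastore/contig_B/\tSTARTED\n",
    ]
    print(unfinished(example))
```

A sequence that stays in this list long after a comparable one has finished is a reasonable candidate for closer inspection.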
>
> Thank You,
> Charles
>
> From: maker-devel on behalf of "maker-devel-owner at yandell-lab.org"
> Date: Monday, July 15, 2019 at 9:42 AM
> To: Charles Warden
> Subject: Your message to maker-devel awaits moderator approval

From carsonhh at gmail.com Tue Jul 16 15:43:56 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 16 Jul 2019 14:43:56 -0600
Subject: [maker-devel] Annotations from high confidence IsoSeq data
In-Reply-To:
References:
Message-ID:

MAKER doesn't assign weights to evidence types, but you can look into EVAL, which can take MAKER results and produce new models. If you gave your evidence as est=, then EVAL can consume that as well, since it will be in the GFF3.
MAKER3-beta have some integrated support for eval that you can look at using the maker_eval.ctl file to provide weights to EVAL. ?Carson > On Jul 10, 2019, at 12:12 PM, DECKER, KEITH F [AG/1005] wrote: > > I have a set of high confidence full length transcripts generated via the pacbio IsoSeq pipeline. > > Is there a recommended way to pass these transcripts to MAKER and ensure that the corresponding gene models are included in the final annotation? I also have assembled illumina transcripts, but I?m trying to figure out how to give the IsoSeq transcripts a higher ?weight?. > > > > The information contained in this e-mail is for the exclusive use of the intended recipient(s) and may be confidential, proprietary, and/or legally privileged. Inadvertent disclosure of this message does not constitute a waiver of any privilege. If you receive this message in error, please do not directly or indirectly use, print, copy, forward, or disclose any part of this message. Please also delete this e-mail and all copies and notify the sender. Thank you. > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From jmartin at wustl.edu Wed Jul 17 15:10:22 2019 From: jmartin at wustl.edu (Martin, John) Date: Wed, 17 Jul 2019 20:10:22 +0000 Subject: [maker-devel] Trouble using custom repeat library w/ Maker v2.31.10 In-Reply-To: References: <65c82584-b495-a2a3-1c74-9fb45af52372@wustl.edu> Message-ID: <166ea7d7-86cc-acf7-c7d6-9184dbf38aec@wustl.edu> Thanks Carson, I actually had a second, older installation of RepeatMasker that defaulted to wu-blast that I was able to use (by modifying maker_exe.ctl to point to it) and that looks like it fixed my problem. John On 7/16/19 2:51 PM, Carson Holt wrote: > Reinstall RepeatMasker Manually. 
It?s apparently been configured during it?s install to use HMMER, and it?s possible that it was done incorrectly if it was done by a package manager like homebrew. > > It?s installed under /usr/local/RepeatMasker/ according to your maker_exe.ctl file. > > ?Carson > > > >> On Jul 15, 2019, at 6:58 PM, Martin, John wrote: >> >> Greetings, >> >> I'm trying to use a custom repeat library in a maker run being >> performed on a ~400Mb worm genome. I've successfully tested my >> installation of maker v2.31.10 using a small set of 13 contigs from my >> full assembly (~100kb or less) using the 'model_org=all' setting for the >> RepeatMasker part of the maker_opts.ctl configuration. But when I try >> to feed maker my repeat library fasta file, maker errors out when trying >> to repeatmask because its trying to run hmmpress on the fasta file I set >> as 'rmlib=...' >> >> I've attached the 3 .ctl files I'm using. I am not setting any >> other command line arguments for maker, when I launch the jobs I just type: >> >> maker >> >> in the directory with those .ctl files. The specific error message I >> am getting appears inside each of the subdirectories of the datastore, >> inside hmmPress.log files: >> >> Error: File format problem in trying to open HMM file >> /gscmnt/gc2732/mitrevalab/ >> USDA_Zarlenga/t_colubriformis/180530_assembly_and_annotation_PacBio_sequel_data/ >> SUPERNOVA_assembly/ANNOTATION/MAKER/maker_1/TMP_1/maker_sbWsNX/RM_416.MonJul1522 >> 04492019/consensi.fa.classified. >> Format tag is '>rnd-1_family-14#Unknown': unrecognized. >> Current H3 format is 'HMMER3/f'. Previous H2/H3 formats also supported. >> >> I am not sure why maker is trying to run hmmpress on the fasta file >> entered for the 'rmlib' setting, which the documentation says should be >> fasta. I think hmmpress should be used on hmm files. Can someone help >> me troubleshoot this problem? 
>> >> >> Thanks, >> >> John Martin >> >> >> ________________________________ >> The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail. >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org ________________________________ The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail. From keith.decker at bayer.com Wed Jul 10 12:12:31 2019 From: keith.decker at bayer.com (DECKER, KEITH F [AG/1005]) Date: Wed, 10 Jul 2019 18:12:31 +0000 Subject: [maker-devel] Annotations from high confidence IsoSeq data Message-ID: I have a set of high confidence full length transcripts generated via the pacbio IsoSeq pipeline. Is there a recommended way to pass these transcripts to MAKER and ensure that the corresponding gene models are included in the final annotation? I also have assembled illumina transcripts, but I?m trying to figure out how to give the IsoSeq transcripts a higher ?weight?. The information contained in this e-mail is for the exclusive use of the intended recipient(s) and may be confidential, proprietary, and/or legally privileged. 
Inadvertent disclosure of this message does not constitute a waiver of any privilege. If you receive this message in error, please do not directly or indirectly use, print, copy, forward, or disclose any part of this message. Please also delete this e-mail and all copies and notify the sender. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From cwarden at coh.org Mon Jul 15 10:25:59 2019 From: cwarden at coh.org (Charles Warden) Date: Mon, 15 Jul 2019 16:25:59 +0000 Subject: [maker-devel] MAKER caught in loop Message-ID: Hi, I am encountering some issues running MAKER on my Mac. I?ve run MAKER in the past, and I was even able to run MAKER very quickly (I think less than 1 minute) for 1 of my 2 assemblies. For the successful run, I installed some headers from /Library/Developer/CommandLineTools/Packages/macOS_SDK_headers_for_macOS_10.14.pkg, and MAKER worked (but the files were not visible on my PC). Actually, on my Mac, I could see from my terminal, but not even through the Mac file browser (after a restart). I?ve attached a copy of my configuration files, and I am running MAKER with the following command: /opt/maker/bin/maker -nolock -fix_nucleotides I upgraded my OS within the past few months, but it seems like there is some issue with the permissions for the files being created. I testing running MAKER from my Mac?s local hard drive (instead of the network folder). However, the run time is considerably longer than the assembly that ran to completion (but those .gff file is not visuable through several interfaces, for some reason). For example, MAKER has been stuck on this step since last Friday (for a 45,013 bp sequence), whereas annotation for a larger sequence completed within minutes: STATUS: Parsing control files... WARNING: You have chosen to turn locking off which may create race conditions if running in parallel. You have been warned. STATUS: Processing and indexing input FASTA files... 
STATUS: Setting up database for any GFF3 input... A data structure will be created for you at: /Users/cwarden/Documents/MAKER/AC270418_270441.maker.output/AC270418_270441_datastore To access files for individual sequences use the datastore index: /Users/cwarden/Documents/MAKER/AC270418_270441.maker.output/AC270418_270441_master_datastore_index.log STATUS: Now running MAKER... examining contents of the fasta file and run log --Next Contig-- #--------------------------------------------------------------------- Now starting the contig!! SeqID: AC270418_270441 Length: 45015 #--------------------------------------------------------------------- Can you please help me troubleshoot this issue? Thank You, Charles ------------------------------------------------------------ -SECURITY/CONFIDENTIALITY WARNING- This message and any attachments are intended solely for the individual or entity to which they are addressed. This communication may contain information that is privileged, confidential, or exempt from disclosure under applicable law (e.g., personal health information, research data, financial information). Because this e-mail has been sent without encryption, individuals other than the intended recipient may be able to view the information, forward it to others or tamper with the information without the knowledge or consent of the sender. If you are not the intended recipient, or the employee or person responsible for delivering the message to the intended recipient, any dissemination, distribution or copying of the communication is strictly prohibited. If you received the communication in error, please notify the sender immediately by replying to this message and deleting the message and any accompanying files from your system. If, due to the security risks, you do not wish to receive further communications via e-mail, please reply to this message and inform the sender that you do not wish to receive further e-mail from the sender. 
(LCP301) ------------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: maker_bopts.ctl Type: application/octet-stream Size: 497 bytes Desc: maker_bopts.ctl URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: maker_exe.ctl Type: application/octet-stream Size: 352 bytes Desc: maker_exe.ctl URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: maker_opts.ctl Type: application/octet-stream Size: 1180 bytes Desc: maker_opts.ctl URL: From jmartin at wustl.edu Mon Jul 15 18:58:30 2019 From: jmartin at wustl.edu (Martin, John) Date: Tue, 16 Jul 2019 00:58:30 +0000 Subject: [maker-devel] Trouble using custom repeat library w/ Maker v2.31.10 Message-ID: <65c82584-b495-a2a3-1c74-9fb45af52372@wustl.edu> Greetings, I'm trying to use a custom repeat library in a maker run being performed on a ~400Mb worm genome. I've successfully tested my installation of maker v2.31.10 using a small set of 13 contigs from my full assembly (~100kb or less) using the 'model_org=all' setting for the RepeatMasker part of the maker_opts.ctl configuration. But when I try to feed maker my repeat library fasta file, maker errors out when trying to repeatmask because its trying to run hmmpress on the fasta file I set as 'rmlib=...' I've attached the 3 .ctl files I'm using. I am not setting any other command line arguments for maker, when I launch the jobs I just type: maker in the directory with those .ctl files. 
The specific error message I am getting appears inside each of the subdirectories of the datastore, inside hmmPress.log files: Error: File format problem in trying to open HMM file /gscmnt/gc2732/mitrevalab/ USDA_Zarlenga/t_colubriformis/180530_assembly_and_annotation_PacBio_sequel_data/ SUPERNOVA_assembly/ANNOTATION/MAKER/maker_1/TMP_1/maker_sbWsNX/RM_416.MonJul1522 04492019/consensi.fa.classified. Format tag is '>rnd-1_family-14#Unknown': unrecognized. Current H3 format is 'HMMER3/f'. Previous H2/H3 formats also supported. I am not sure why maker is trying to run hmmpress on the fasta file entered for the 'rmlib' setting, which the documentation says should be fasta. I think hmmpress should be used on hmm files. Can someone help me troubleshoot this problem? Thanks, John Martin ________________________________ The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: maker_bopts.ctl URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: maker_opts.ctl URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: maker_exe.ctl URL: From carsonhh at gmail.com Tue Jul 16 13:51:09 2019 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 16 Jul 2019 13:51:09 -0600 Subject: [maker-devel] Trouble using custom repeat library w/ Maker v2.31.10 In-Reply-To: <65c82584-b495-a2a3-1c74-9fb45af52372@wustl.edu> References: <65c82584-b495-a2a3-1c74-9fb45af52372@wustl.edu> Message-ID: Reinstall RepeatMasker Manually. It?s apparently been configured during it?s install to use HMMER, and it?s possible that it was done incorrectly if it was done by a package manager like homebrew. It?s installed under /usr/local/RepeatMasker/ according to your maker_exe.ctl file. ?Carson > On Jul 15, 2019, at 6:58 PM, Martin, John wrote: > > Greetings, > > I'm trying to use a custom repeat library in a maker run being > performed on a ~400Mb worm genome. I've successfully tested my > installation of maker v2.31.10 using a small set of 13 contigs from my > full assembly (~100kb or less) using the 'model_org=all' setting for the > RepeatMasker part of the maker_opts.ctl configuration. But when I try > to feed maker my repeat library fasta file, maker errors out when trying > to repeatmask because its trying to run hmmpress on the fasta file I set > as 'rmlib=...' > > I've attached the 3 .ctl files I'm using. I am not setting any > other command line arguments for maker, when I launch the jobs I just type: > > maker > > in the directory with those .ctl files. The specific error message I > am getting appears inside each of the subdirectories of the datastore, > inside hmmPress.log files: > > Error: File format problem in trying to open HMM file > /gscmnt/gc2732/mitrevalab/ > USDA_Zarlenga/t_colubriformis/180530_assembly_and_annotation_PacBio_sequel_data/ > SUPERNOVA_assembly/ANNOTATION/MAKER/maker_1/TMP_1/maker_sbWsNX/RM_416.MonJul1522 > 04492019/consensi.fa.classified. > Format tag is '>rnd-1_family-14#Unknown': unrecognized. > Current H3 format is 'HMMER3/f'. Previous H2/H3 formats also supported. 
> > I am not sure why maker is trying to run hmmpress on the fasta file > entered for the 'rmlib' setting, which the documentation says should be > fasta. I think hmmpress should be used on hmm files. Can someone help > me troubleshoot this problem? > > > Thanks, > > John Martin > > > ________________________________ > The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail. > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From carsonhh at gmail.com Tue Jul 16 14:41:40 2019 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 16 Jul 2019 14:41:40 -0600 Subject: [maker-devel] Your message to maker-devel awaits moderator approval In-Reply-To: <5685F7D3-347A-4309-B1DE-51476ECA1CD6@coh.org> References: <5685F7D3-347A-4309-B1DE-51476ECA1CD6@coh.org> Message-ID: I?m glad you were able to find a solution. I do imagine it is a Mac OSX update issue since it?s compatibility with linux tools has gradually decreased with each update. With respect to google docs, we just use it as an archive, new messages and replies have to go to the maker-devel e-mail list. Thanks, Carson > On Jul 16, 2019, at 11:55 AM, Charles Warden wrote: > > Hi, > > I noticed that this was actually posted on-line: > > https://groups.google.com/forum/#!msg/maker-devel/VCC-BFpDXtU/Bx9H2BoqEAAJ > > I have to admit that I didn?t realize this was being made public, and I don?t seem to be able to respond from the Google Groups interface (with my personal Gmail account). 
>
> However, if other people encounter a similar issue, I found a solution using a local Ubuntu Docker image (and the newer version of MAKER, version maker-3.01.02-beta).
>
> I'm not entirely sure why the Mac installation used to work but currently doesn't, although my guess is that it may have something to do with the OS upgrade. Also, I don't think it actually worked for even 1 of the 2 assemblies before (since I wasn't able to find those hidden files, and it took longer to create a FASTA file with protein and transcript sequences). However, I currently have those FASTA files for both sequences.
>
> Thank You,
> Charles
>
> From: maker-devel on behalf of "maker-devel-owner at yandell-lab.org"
> Date: Monday, July 15, 2019 at 9:42 AM
> To: Charles Warden
> Subject: Your message to maker-devel awaits moderator approval

From carsonhh at gmail.com Tue Jul 16 14:43:56 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 16 Jul 2019 14:43:56 -0600
Subject: [maker-devel] Annotations from high confidence IsoSeq data
In-Reply-To:
References:
Message-ID:

MAKER doesn't assign weights to evidence types, but you can look into EVAL, which can take MAKER results and produce new models. If you gave your evidence as est= then EVAL can consume that as well, since it will be in the GFF3. MAKER3-beta has some integrated support for EVAL that you can look at, using the maker_eval.ctl file to provide weights to EVAL.

--Carson

> On Jul 10, 2019, at 12:12 PM, DECKER, KEITH F [AG/1005] wrote:
>
> I have a set of high confidence full length transcripts generated via the pacbio IsoSeq pipeline.
>
> Is there a recommended way to pass these transcripts to MAKER and ensure that the corresponding gene models are included in the final annotation? I also have assembled illumina transcripts, but I'm trying to figure out how to give the IsoSeq transcripts a higher "weight".
From jmartin at wustl.edu Wed Jul 17 14:10:22 2019
From: jmartin at wustl.edu (Martin, John)
Date: Wed, 17 Jul 2019 20:10:22 +0000
Subject: [maker-devel] Trouble using custom repeat library w/ Maker v2.31.10
In-Reply-To:
References: <65c82584-b495-a2a3-1c74-9fb45af52372@wustl.edu>
Message-ID: <166ea7d7-86cc-acf7-c7d6-9184dbf38aec@wustl.edu>

Thanks Carson,

I actually had a second, older installation of RepeatMasker that defaulted to wu-blast that I was able to use (by modifying maker_exe.ctl to point to it), and that looks like it fixed my problem.

John

On 7/16/19 2:51 PM, Carson Holt wrote:
> Reinstall RepeatMasker manually. It's apparently been configured during its install to use HMMER, and it's possible that it was done incorrectly if it was done by a package manager like homebrew.
>
> It's installed under /usr/local/RepeatMasker/ according to your maker_exe.ctl file.
>
> --Carson
>
>> On Jul 15, 2019, at 6:58 PM, Martin, John wrote:
>>
>> Greetings,
>>
>>     I'm trying to use a custom repeat library in a maker run being
>> performed on a ~400Mb worm genome.  I've successfully tested my
>> installation of maker v2.31.10 using a small set of 13 contigs from my
>> full assembly (~100kb or less) using the 'model_org=all' setting for the
>> RepeatMasker part of the maker_opts.ctl configuration.  But when I try
>> to feed maker my repeat library fasta file, maker errors out when trying
>> to repeatmask because it's trying to run hmmpress on the fasta file I set
>> as 'rmlib=...'
>>
>>     I've attached the 3 .ctl files I'm using.  I am not setting any
>> other command line arguments for maker; when I launch the jobs I just type:
>>
>> maker
>>
>> in the directory with those .ctl files.  The specific error message I
>> am getting appears inside each of the subdirectories of the datastore,
>> inside hmmPress.log files:
>>
>> Error: File format problem in trying to open HMM file
>> /gscmnt/gc2732/mitrevalab/
>> USDA_Zarlenga/t_colubriformis/180530_assembly_and_annotation_PacBio_sequel_data/
>> SUPERNOVA_assembly/ANNOTATION/MAKER/maker_1/TMP_1/maker_sbWsNX/RM_416.MonJul1522
>> 04492019/consensi.fa.classified.
>> Format tag is '>rnd-1_family-14#Unknown': unrecognized.
>> Current H3 format is 'HMMER3/f'. Previous H2/H3 formats also supported.
>>
>>     I am not sure why maker is trying to run hmmpress on the fasta file
>> entered for the 'rmlib' setting, which the documentation says should be
>> fasta.  I think hmmpress should be used on hmm files.  Can someone help
>> me troubleshoot this problem?
>>
>>
>> Thanks,
>>
>> John Martin
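For anyone who hits the same hmmPress.log error, a quick sanity check along these lines may help separate "my rmlib file is wrong" from "RepeatMasker is configured for a different search engine". This is only a minimal sketch in Python and is not part of MAKER, RepeatMasker or HMMER. The helper names read_ctl and sniff_library_format are made up for illustration. It assumes control files use one 'key=value #comment' entry per line, as in the attached .ctl files, and that a HMMER3 profile begins with a 'HMMER3/' format tag, as the error message above suggests.

```python
from pathlib import Path


def read_ctl(path):
    """Parse a MAKER-style control file into a dict of option -> value.

    Assumption: one 'key=value #comment' entry per line; section lines
    that start with '#' carry no option and are skipped.
    """
    options = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.split("#", 1)[0].strip()  # drop any trailing comment
        if "=" in line:
            key, value = line.split("=", 1)
            options[key.strip()] = value.strip()
    return options


def sniff_library_format(path):
    """Guess the library format from its first non-blank line.

    Returns 'fasta' for a '>' header, 'hmmer3' for a 'HMMER3/' format
    tag, and 'unknown' otherwise (including an empty file).
    """
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                return "fasta"
            if line.startswith("HMMER3/"):
                return "hmmer3"
            return "unknown"
    return "unknown"


if __name__ == "__main__":
    opts = read_ctl("maker_opts.ctl")          # hypothetical local path
    exe = read_ctl("maker_exe.ctl")            # hypothetical local path
    print("RepeatMasker:", exe.get("RepeatMasker", ""))
    if opts.get("rmlib"):
        print("rmlib format:", sniff_library_format(opts["rmlib"]))
```

If the library reports as 'fasta' and hmmpress is still being run on it, that would be consistent with the explanation in this thread that the RepeatMasker installation itself was set up to use HMMER, rather than with a problem in the library file.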
From jmartin at wustl.edu Mon Jul 15 18:58:30 2019
From: jmartin at wustl.edu (Martin, John)
Date: Tue, 16 Jul 2019 00:58:30 +0000
Subject: [maker-devel] Trouble using custom repeat library w/ Maker v2.31.10
Message-ID: <65c82584-b495-a2a3-1c74-9fb45af52372@wustl.edu>

Greetings,

    I'm trying to use a custom repeat library in a maker run being
performed on a ~400Mb worm genome.  I've successfully tested my
installation of maker v2.31.10 using a small set of 13 contigs from my
full assembly (~100kb or less) using the 'model_org=all' setting for the
RepeatMasker part of the maker_opts.ctl configuration.  But when I try
to feed maker my repeat library fasta file, maker errors out when trying
to repeatmask because it's trying to run hmmpress on the fasta file I set
as 'rmlib=...'

    I've attached the 3 .ctl files I'm using.  I am not setting any
other command line arguments for maker; when I launch the jobs I just type:

maker

in the directory with those .ctl files.  The specific error message I
am getting appears inside each of the subdirectories of the datastore,
inside hmmPress.log files:

Error: File format problem in trying to open HMM file
/gscmnt/gc2732/mitrevalab/
USDA_Zarlenga/t_colubriformis/180530_assembly_and_annotation_PacBio_sequel_data/
SUPERNOVA_assembly/ANNOTATION/MAKER/maker_1/TMP_1/maker_sbWsNX/RM_416.MonJul1522
04492019/consensi.fa.classified.
Format tag is '>rnd-1_family-14#Unknown': unrecognized.
Current H3 format is 'HMMER3/f'. Previous H2/H3 formats also supported.

    I am not sure why maker is trying to run hmmpress on the fasta file
entered for the 'rmlib' setting, which the documentation says should be
fasta.  I think hmmpress should be used on hmm files.  Can someone help
me troubleshoot this problem?

Thanks,

John Martin

________________________________
The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail.
-------------- next part --------------
#-----BLAST and Exonerate Statistics Thresholds
blast_type=ncbi+ #set to 'ncbi+', 'ncbi' or 'wublast'

pcov_blastn=0.8 #Blastn Percent Coverage Threshold EST-Genome Alignments
pid_blastn=0.85 #Blastn Percent Identity Threshold EST-Genome Alignments
eval_blastn=1e-10 #Blastn eval cutoff
bit_blastn=40 #Blastn bit cutoff
depth_blastn=0 #Blastn depth cutoff (0 to disable cutoff)

pcov_blastx=0.5 #Blastx Percent Coverage Threshold Protein-Genome Alignments
pid_blastx=0.4 #Blastx Percent Identity Threshold Protein-Genome Alignments
eval_blastx=1e-06 #Blastx eval cutoff
bit_blastx=30 #Blastx bit cutoff
depth_blastx=0 #Blastx depth cutoff (0 to disable cutoff)

pcov_tblastx=0.8 #tBlastx Percent Coverage Threshold alt-EST-Genome Alignments
pid_tblastx=0.85 #tBlastx Percent Identity Threshold alt-EST-Genome Alignments
eval_tblastx=1e-10 #tBlastx eval cutoff
bit_tblastx=40 #tBlastx bit cutoff
depth_tblastx=0 #tBlastx depth cutoff (0 to disable cutoff)

pcov_rm_blastx=0.5 #Blastx Percent Coverage Threshold For Transposable Element Masking
pid_rm_blastx=0.4 #Blastx Percent Identity Threshold For Transposable Element Masking
eval_rm_blastx=1e-06 #Blastx eval cutoff for transposable element masking
bit_rm_blastx=30 #Blastx bit cutoff for transposable element masking

ep_score_limit=20 #Exonerate protein percent of maximal score threshold
en_score_limit=20 #Exonerate nucleotide percent of maximal score threshold
-------------- next part --------------
#-----Genome (these are always required)
genome=Tcol_final_assembly.minsize_2000.fna.PART.1 #genome sequence (fasta file or fasta embedded in GFF3 file)
organism_type=eukaryotic #eukaryotic or prokaryotic. Default is eukaryotic

#-----Re-annotation Using MAKER Derived GFF3
maker_gff= #MAKER derived GFF3 file
est_pass=0 #use ESTs in maker_gff: 1 = yes, 0 = no
altest_pass=0 #use alternate organism ESTs in maker_gff: 1 = yes, 0 = no
protein_pass=0 #use protein alignments in maker_gff: 1 = yes, 0 = no
rm_pass=0 #use repeats in maker_gff: 1 = yes, 0 = no
model_pass=0 #use gene models in maker_gff: 1 = yes, 0 = no
pred_pass=0 #use ab-initio predictions in maker_gff: 1 = yes, 0 = no
other_pass=0 #passthrough anything else in maker_gff: 1 = yes, 0 = no

#-----EST Evidence (for best results provide a file for at least one)
est= #set of ESTs or assembled mRNA-seq in fasta format
altest= #EST/cDNA sequence file in fasta format from an alternate organism
est_gff=/gscmnt/gc2732/mitrevalab/USDA/t_colubriformis/180530_assembly_and_annotation_PacBio_sequel_data/SUPERNOVA_assembly/ANNOTATION/MAKER/RNAseq_transcript_assembly_w_better_HiSat2_arguments_and_using_stringtie/Merged_StringTie_Transcripts.maker_gff3 #aligned ESTs or mRNA-seq from an external GFF3 file
altest_gff= #aligned ESTs from a closely related species in GFF3 format

#-----Protein Homology Evidence (for best results provide a file for at least one)
protein=/gscmnt/gc2732/mitrevalab/USDA/t_colubriformis/180530_assembly_and_annotation_PacBio_sequel_data/SUPERNOVA_assembly/ANNOTATION/BRAKER2/evidence_protein/All_9_close_nematode.protein.faa #protein sequence file in fasta format (i.e. from multiple organisms)
protein_gff= #aligned protein homology evidence from an external GFF3 file

#-----Repeat Masking (leave values blank to skip repeat masking)
model_org=all #select a model organism for RepBase masking in RepeatMasker
rmlib=/gscmnt/gc2732/mitrevalab/USDA/t_colubriformis/180530_assembly_and_annotation_PacBio_sequel_data/SUPERNOVA_assembly/ANNOTATION/REPEAT_LIBRARY_std_RepeatModeler_install_bsub_pa4_spanhosts1_n5_249Gb/RM_1.TueMay281756062019/consensi.fa.classified #provide an organism specific repeat library in fasta format for RepeatMasker
repeat_protein=/usr/local/maker/data/te_proteins.fasta #provide a fasta file of transposable element proteins for RepeatRunner
rm_gff= #pre-identified repeat elements from an external GFF3 file
prok_rm=0 #forces MAKER to repeatmask prokaryotes (no reason to change this), 1 = yes, 0 = no
softmask=1 #use soft-masking rather than hard-masking in BLAST (i.e. seg and dust filtering)

#-----Gene Prediction
snaphmm= #SNAP HMM file
gmhmm= #GeneMark HMM file
augustus_species=Trichostrongylus_colubriformis_assembly_rnaseq-train #Augustus gene prediction species model
fgenesh_par_file= #FGENESH parameter file
pred_gff=/gscmnt/gc2732/mitrevalab/USDA/t_colubriformis/180530_assembly_and_annotation_PacBio_sequel_data/SUPERNOVA_assembly/ANNOTATION/BRAKER2/braker2_run_rnaseq_only/braker/Trichostrongylus_colubriformis_assembly_rnaseq-train/augustus.hints.gff3 #ab-initio predictions from an external GFF3 file
model_gff= #annotated gene models from an external GFF3 file (annotation pass-through)
est2genome=0 #infer gene predictions directly from ESTs, 1 = yes, 0 = no
protein2genome=0 #infer predictions from protein homology, 1 = yes, 0 = no
trna=0 #find tRNAs with tRNAscan, 1 = yes, 0 = no
snoscan_rrna= #rRNA file to have Snoscan find snoRNAs
unmask=0 #also run ab-initio prediction programs on unmasked sequence, 1 = yes, 0 = no

#-----Other Annotation Feature Types (features MAKER doesn't recognize)
other_gff= #extra features to pass-through to final MAKER generated GFF3 file

#-----External Application Behavior Options
alt_peptide=C #amino acid used to replace non-standard amino acids in BLAST databases
cpus=1 #max number of cpus to use in BLAST and RepeatMasker (not for MPI, leave 1 when using MPI)

#-----MAKER Behavior Options
max_dna_len=100000 #length for dividing up contigs into chunks (increases/decreases memory usage)
min_contig=1 #skip genome contigs below this length (under 10kb are often useless)

pred_flank=200 #flank for extending evidence clusters sent to gene predictors
pred_stats=0 #report AED and QI statistics for all predictions as well as models
AED_threshold=1 #Maximum Annotation Edit Distance allowed (bound by 0 and 1)
min_protein=25 #require at least this many amino acids in predicted proteins
alt_splice=0 #Take extra steps to try and find alternative splicing, 1 = yes, 0 = no
always_complete=0 #extra steps to force start and stop codons, 1 = yes, 0 = no
map_forward=0 #map names and attributes forward from old GFF3 genes, 1 = yes, 0 = no
keep_preds=0 #Concordance threshold to add unsupported gene prediction (bound by 0 and 1)

split_hit=10000 #length for the splitting of hits (expected max intron size for evidence alignments)
single_exon=0 #consider single exon EST evidence when generating annotations, 1 = yes, 0 = no
single_length=250 #min length required for single exon ESTs if 'single_exon' is enabled
correct_est_fusion=0 #limits use of ESTs in annotation to avoid fusion genes

tries=2 #number of times to try a contig if there is a failure for some reason
clean_try=0 #remove all data from previous run before retrying, 1 = yes, 0 = no
clean_up=0 #removes theVoid directory with individual analysis files, 1 = yes, 0 = no
TMP=/gscmnt/gc2732/mitrevalab/USDA/t_colubriformis/180530_assembly_and_annotation_PacBio_sequel_data/SUPERNOVA_assembly/ANNOTATION/MAKER/maker_1/TMP_1 #specify a directory other than the system default temporary directory for temporary files
-------------- next part --------------
#-----Location of Executables Used by MAKER/EVALUATOR
makeblastdb=/usr/local/rmblast-2.9.0/makeblastdb #location of NCBI+ makeblastdb executable
blastn=/usr/local/ncbi-blast-2.9.0+/bin/blastn #location of NCBI+ blastn executable
blastx=/usr/local/rmblast-2.9.0/blastx #location of NCBI+ blastx executable
tblastx=/usr/local/ncbi-blast-2.9.0+/bin/tblastx #location of NCBI+ tblastx executable
formatdb=/gsc/pkg/bio/blast/blast-2.2.26/bin/formatdb #location of NCBI formatdb executable
blastall=/gsc/pkg/bio/blast/blast-2.2.26/bin/blastall #location of NCBI blastall executable
xdformat=/gsc/pkg/bio/wu-blast/blast2_2006-05-04/xdformat #location of WUBLAST xdformat executable
blasta=/gsc/pkg/bio/wu-blast/blast2_2006-05-04/blasta #location of WUBLAST blasta executable
RepeatMasker=/usr/local/RepeatMasker/RepeatMasker #location of RepeatMasker executable
exonerate=/usr/local/exonerate-2.2.0-x86_64/bin/exonerate #location of exonerate executable

#-----Ab-initio Gene Prediction Algorithms
snap=/usr/local/snap/snap #location of snap executable
gmhmme3=/usr/local/gm_et_linux_64/gmes_petap/gmhmme3 #location of eukaryotic genemark executable
gmhmmp= #location of prokaryotic genemark executable
augustus=/usr/local/augustus.2.5.5/bin/augustus #location of augustus executable
fgenesh= #location of fgenesh executable
tRNAscan-SE= #location of trnascan executable
snoscan= #location of snoscan executable

#-----Other Algorithms
probuild=/usr/local/gm_et_linux_64/gmes_petap/probuild #location of probuild executable (required for genemark)

From carsonhh at gmail.com Tue Jul 16 13:51:09 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 16 Jul 2019 13:51:09 -0600
Subject: [maker-devel] Trouble using custom repeat library w/ Maker v2.31.10
In-Reply-To: <65c82584-b495-a2a3-1c74-9fb45af52372@wustl.edu>
References: <65c82584-b495-a2a3-1c74-9fb45af52372@wustl.edu>
Message-ID:

Reinstall RepeatMasker manually. It's apparently been configured during its install to use HMMER, and it's possible that it was done incorrectly if it was done by a package manager like homebrew.

It's installed under /usr/local/RepeatMasker/ according to your maker_exe.ctl file.

--Carson

> On Jul 15, 2019, at 6:58 PM, Martin, John wrote:
>
> Greetings,
>
>     I'm trying to use a custom repeat library in a maker run being
> performed on a ~400Mb worm genome.  I've successfully tested my
> installation of maker v2.31.10 using a small set of 13 contigs from my
> full assembly (~100kb or less) using the 'model_org=all' setting for the
> RepeatMasker part of the maker_opts.ctl configuration.  But when I try
> to feed maker my repeat library fasta file, maker errors out when trying
> to repeatmask because it's trying to run hmmpress on the fasta file I set
> as 'rmlib=...'
>
>     I've attached the 3 .ctl files I'm using.  I am not setting any
> other command line arguments for maker; when I launch the jobs I just type:
>
> maker
>
> in the directory with those .ctl files.  The specific error message I
> am getting appears inside each of the subdirectories of the datastore,
> inside hmmPress.log files:
>
> Error: File format problem in trying to open HMM file
> /gscmnt/gc2732/mitrevalab/
> USDA_Zarlenga/t_colubriformis/180530_assembly_and_annotation_PacBio_sequel_data/
> SUPERNOVA_assembly/ANNOTATION/MAKER/maker_1/TMP_1/maker_sbWsNX/RM_416.MonJul1522
> 04492019/consensi.fa.classified.
> Format tag is '>rnd-1_family-14#Unknown': unrecognized.
> Current H3 format is 'HMMER3/f'. Previous H2/H3 formats also supported.
>
>     I am not sure why maker is trying to run hmmpress on the fasta file
> entered for the 'rmlib' setting, which the documentation says should be
> fasta.  I think hmmpress should be used on hmm files.  Can someone help
> me troubleshoot this problem?
> > > Thanks, > > John Martin > > > ________________________________ > The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail. > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From carsonhh at gmail.com Tue Jul 16 14:41:40 2019 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 16 Jul 2019 14:41:40 -0600 Subject: [maker-devel] Your message to maker-devel awaits moderator approval In-Reply-To: <5685F7D3-347A-4309-B1DE-51476ECA1CD6@coh.org> References: <5685F7D3-347A-4309-B1DE-51476ECA1CD6@coh.org> Message-ID: I?m glad you were able to find a solution. I do imagine it is a Mac OSX update issue since it?s compatibility with linux tools has gradually decreased with each update. With respect to google docs, we just use it as an archive, new messages and replies have to go to the maker-devel e-mail list. Thanks, Carson > On Jul 16, 2019, at 11:55 AM, Charles Warden wrote: > > Hi, > > I noticed that this was actually posted on-line: > > https://groups.google.com/forum/#!msg/maker-devel/VCC-BFpDXtU/Bx9H2BoqEAAJ > > I have to admit that I didn?t realize this was being made public, and I don?t seem to be able to respond from the Google Groups interface (with my personal Gmail account). > > However, if other people encounter a similar issue, I found a solution using a local Ubuntu Docker image (and the newer version of MAKER, version maker-3.01.02-beta). 
> > I?m not entirely sure why the Mac installation used to work but currently doesn?t work, although my guess is that it may have something to do with the OS upgrade. Also, I don?t think it actually worked for even 1 of the 2 assemblies before (since I wasn?t not able to find those hidden files, and it look longer to create a FASTA file with protein and transcript sequences). However, I currently have those FASTA files for both sequences. > > Thank You, > Charles > > From: maker-devel > on behalf of "maker-devel-owner at yandell-lab.org " > > Date: Monday, July 15, 2019 at 9:42 AM > To: Charles Warden > > Subject: Your message to maker-devel awaits moderator approval > > maker-devel > > > *SECURITY/CONFIDENTIALITY WARNING: > > This message and any attachments are intended solely for the individual or entity to which they are addressed. This communication may contain information that is privileged, confidential, or exempt from disclosure under applicable law (e.g., personal health information, research data, financial information). Because this e-mail has been sent without encryption, individuals other than the intended recipient may be able to view the information, forward it to others or tamper with the information without the knowledge or consent of the sender. If you are not the intended recipient, or the employee or person responsible for delivering the message to the intended recipient, any dissemination, distribution or copying of the communication is strictly prohibited. If you received the communication in error, please notify the sender immediately by replying to this message and deleting the message and any accompanying files from your system. If, due to the security risks, you do not wish to receive further communications via e-mail, please reply to this message and inform the sender that you do not wish to receive further e-mail from the sender. (LCP301) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From carsonhh at gmail.com Tue Jul 16 14:43:56 2019 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 16 Jul 2019 14:43:56 -0600 Subject: [maker-devel] Annotations from high confidence IsoSeq data In-Reply-To: References: Message-ID: MAKER doesn?t assign weights to evidence types, but you can look into EVAL which can take MAKER results and produce new models. If you gave your evidence as est= then EVAL can consume that as well since it will be in the GFF3. MAKER3-beta have some integrated support for eval that you can look at using the maker_eval.ctl file to provide weights to EVAL. ?Carson > On Jul 10, 2019, at 12:12 PM, DECKER, KEITH F [AG/1005] wrote: > > I have a set of high confidence full length transcripts generated via the pacbio IsoSeq pipeline. > > Is there a recommended way to pass these transcripts to MAKER and ensure that the corresponding gene models are included in the final annotation? I also have assembled illumina transcripts, but I?m trying to figure out how to give the IsoSeq transcripts a higher ?weight?. > > > > The information contained in this e-mail is for the exclusive use of the intended recipient(s) and may be confidential, proprietary, and/or legally privileged. Inadvertent disclosure of this message does not constitute a waiver of any privilege. If you receive this message in error, please do not directly or indirectly use, print, copy, forward, or disclose any part of this message. Please also delete this e-mail and all copies and notify the sender. Thank you. > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jmartin at wustl.edu Wed Jul 17 14:10:22 2019 From: jmartin at wustl.edu (Martin, John) Date: Wed, 17 Jul 2019 20:10:22 +0000 Subject: [maker-devel] Trouble using custom repeat library w/ Maker v2.31.10 In-Reply-To: References: <65c82584-b495-a2a3-1c74-9fb45af52372@wustl.edu> Message-ID: <166ea7d7-86cc-acf7-c7d6-9184dbf38aec@wustl.edu> Thanks Carson, I actually had a second, older installation of RepeatMasker that defaulted to wu-blast that I was able to use (by modifying maker_exe.ctl to point to it) and that looks like it fixed my problem. John On 7/16/19 2:51 PM, Carson Holt wrote: > Reinstall RepeatMasker Manually. It?s apparently been configured during it?s install to use HMMER, and it?s possible that it was done incorrectly if it was done by a package manager like homebrew. > > It?s installed under /usr/local/RepeatMasker/ according to your maker_exe.ctl file. > > ?Carson > > > >> On Jul 15, 2019, at 6:58 PM, Martin, John wrote: >> >> Greetings, >> >> I'm trying to use a custom repeat library in a maker run being >> performed on a ~400Mb worm genome. I've successfully tested my >> installation of maker v2.31.10 using a small set of 13 contigs from my >> full assembly (~100kb or less) using the 'model_org=all' setting for the >> RepeatMasker part of the maker_opts.ctl configuration. But when I try >> to feed maker my repeat library fasta file, maker errors out when trying >> to repeatmask because its trying to run hmmpress on the fasta file I set >> as 'rmlib=...' >> >> I've attached the 3 .ctl files I'm using. I am not setting any >> other command line arguments for maker, when I launch the jobs I just type: >> >> maker >> >> in the directory with those .ctl files. 
The specific error message I >> am getting appears inside each of the subdirectories of the datastore, >> inside hmmPress.log files: >> >> Error: File format problem in trying to open HMM file >> /gscmnt/gc2732/mitrevalab/ >> USDA_Zarlenga/t_colubriformis/180530_assembly_and_annotation_PacBio_sequel_data/ >> SUPERNOVA_assembly/ANNOTATION/MAKER/maker_1/TMP_1/maker_sbWsNX/RM_416.MonJul1522 >> 04492019/consensi.fa.classified. >> Format tag is '>rnd-1_family-14#Unknown': unrecognized. >> Current H3 format is 'HMMER3/f'. Previous H2/H3 formats also supported. >> >> I am not sure why maker is trying to run hmmpress on the fasta file >> entered for the 'rmlib' setting, which the documentation says should be >> fasta. I think hmmpress should be used on hmm files. Can someone help >> me troubleshoot this problem? >> >> >> Thanks, >> >> John Martin >> >> >> ________________________________ >> The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail. >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org ________________________________ The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. 
If you have received this email in error, please immediately notify the sender via telephone or return mail.

From keith.decker at bayer.com Wed Jul 10 12:12:31 2019
From: keith.decker at bayer.com (DECKER, KEITH F [AG/1005])
Date: Wed, 10 Jul 2019 18:12:31 +0000
Subject: [maker-devel] Annotations from high confidence IsoSeq data
Message-ID: 

I have a set of high confidence full length transcripts generated via the PacBio IsoSeq pipeline.

Is there a recommended way to pass these transcripts to MAKER and ensure that the corresponding gene models are included in the final annotation? I also have assembled Illumina transcripts, but I'm trying to figure out how to give the IsoSeq transcripts a higher "weight".

The information contained in this e-mail is for the exclusive use of the intended recipient(s) and may be confidential, proprietary, and/or legally privileged. Inadvertent disclosure of this message does not constitute a waiver of any privilege. If you receive this message in error, please do not directly or indirectly use, print, copy, forward, or disclose any part of this message. Please also delete this e-mail and all copies and notify the sender. Thank you.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From cwarden at coh.org Mon Jul 15 10:25:59 2019
From: cwarden at coh.org (Charles Warden)
Date: Mon, 15 Jul 2019 16:25:59 +0000
Subject: [maker-devel] MAKER caught in loop
Message-ID: 

Hi,

I am encountering some issues running MAKER on my Mac.

I've run MAKER in the past, and I was even able to run MAKER very quickly (I think less than 1 minute) for 1 of my 2 assemblies. For the successful run, I installed some headers from /Library/Developer/CommandLineTools/Packages/macOS_SDK_headers_for_macOS_10.14.pkg, and MAKER worked (but the files were not visible on my PC). Actually, on my Mac, I could see them from my terminal, but not through the Mac file browser (even after a restart).

I've attached a copy of my configuration files, and I am running MAKER with the following command:

/opt/maker/bin/maker -nolock -fix_nucleotides

I upgraded my OS within the past few months, but it seems like there is some issue with the permissions for the files being created.

I tested running MAKER from my Mac's local hard drive (instead of the network folder). However, the run time is considerably longer than for the assembly that ran to completion (although that .gff file is not visible through several interfaces, for some reason).

For example, MAKER has been stuck on this step since last Friday (for a 45,013 bp sequence), whereas annotation for a larger sequence completed within minutes:

STATUS: Parsing control files...
WARNING: You have chosen to turn locking off which may create race conditions if running in parallel. You have been warned.
STATUS: Processing and indexing input FASTA files...
STATUS: Setting up database for any GFF3 input...
A data structure will be created for you at:
/Users/cwarden/Documents/MAKER/AC270418_270441.maker.output/AC270418_270441_datastore

To access files for individual sequences use the datastore index:
/Users/cwarden/Documents/MAKER/AC270418_270441.maker.output/AC270418_270441_master_datastore_index.log

STATUS: Now running MAKER...
examining contents of the fasta file and run log

--Next Contig--

#---------------------------------------------------------------------
Now starting the contig!!
SeqID: AC270418_270441
Length: 45015
#---------------------------------------------------------------------

Can you please help me troubleshoot this issue?

Thank You,
Charles

------------------------------------------------------------
-SECURITY/CONFIDENTIALITY WARNING-
This message and any attachments are intended solely for the individual or entity to which they are addressed.
This communication may contain information that is privileged, confidential, or exempt from disclosure under applicable law (e.g., personal health information, research data, financial information). Because this e-mail has been sent without encryption, individuals other than the intended recipient may be able to view the information, forward it to others or tamper with the information without the knowledge or consent of the sender. If you are not the intended recipient, or the employee or person responsible for delivering the message to the intended recipient, any dissemination, distribution or copying of the communication is strictly prohibited. If you received the communication in error, please notify the sender immediately by replying to this message and deleting the message and any accompanying files from your system. If, due to the security risks, you do not wish to receive further communications via e-mail, please reply to this message and inform the sender that you do not wish to receive further e-mail from the sender. (LCP301) ------------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: maker_bopts.ctl Type: application/octet-stream Size: 497 bytes Desc: maker_bopts.ctl URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: maker_exe.ctl Type: application/octet-stream Size: 352 bytes Desc: maker_exe.ctl URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: maker_opts.ctl
Type: application/octet-stream
Size: 1180 bytes
Desc: maker_opts.ctl
URL: 

From jmartin at wustl.edu Mon Jul 15 18:58:30 2019
From: jmartin at wustl.edu (Martin, John)
Date: Tue, 16 Jul 2019 00:58:30 +0000
Subject: [maker-devel] Trouble using custom repeat library w/ Maker v2.31.10
Message-ID: <65c82584-b495-a2a3-1c74-9fb45af52372@wustl.edu>

Greetings,

I'm trying to use a custom repeat library in a maker run being performed on a ~400Mb worm genome. I've successfully tested my installation of maker v2.31.10 using a small set of 13 contigs from my full assembly (~100kb or less) using the 'model_org=all' setting for the RepeatMasker part of the maker_opts.ctl configuration. But when I try to feed maker my repeat library fasta file, maker errors out when trying to repeatmask because it's trying to run hmmpress on the fasta file I set as 'rmlib=...'

I've attached the 3 .ctl files I'm using. I am not setting any other command line arguments for maker, when I launch the jobs I just type:

maker

in the directory with those .ctl files. The specific error message I am getting appears inside each of the subdirectories of the datastore, inside hmmPress.log files:

Error: File format problem in trying to open HMM file
/gscmnt/gc2732/mitrevalab/
USDA_Zarlenga/t_colubriformis/180530_assembly_and_annotation_PacBio_sequel_data/
SUPERNOVA_assembly/ANNOTATION/MAKER/maker_1/TMP_1/maker_sbWsNX/RM_416.MonJul1522
04492019/consensi.fa.classified.
Format tag is '>rnd-1_family-14#Unknown': unrecognized.
Current H3 format is 'HMMER3/f'. Previous H2/H3 formats also supported.

I am not sure why maker is trying to run hmmpress on the fasta file entered for the 'rmlib' setting, which the documentation says should be fasta. I think hmmpress should be used on hmm files. Can someone help me troubleshoot this problem?
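The error above says the first line of the file ('>rnd-1_family-14#Unknown') was read as an HMM format tag, where 'HMMER3/f' was expected. A minimal sketch of a first-line check that tells the two formats apart is given below. The function names are hypothetical, and the rule "HMM files start with 'HMMER'" is an assumption based on the format tag quoted in the error message, not something taken from MAKER or RepeatMasker.

```python
def sniff_library_format(first_line: str) -> str:
    """Guess the format of a repeat library from its first non-blank line."""
    line = first_line.strip()
    if line.startswith(">"):
        return "fasta"  # FASTA records begin with a '>' header line
    if line.startswith("HMMER"):
        return "hmm"  # assumption: HMMER profile files begin with a tag such as 'HMMER3/f'
    return "unknown"


def sniff_file(path: str) -> str:
    """Apply sniff_library_format to the first non-blank line of a file."""
    with open(path) as handle:
        for line in handle:
            if line.strip():
                return sniff_library_format(line)
    return "unknown"  # empty file


if __name__ == "__main__":
    # The header line quoted in the hmmpress error is a FASTA header.
    print(sniff_library_format(">rnd-1_family-14#Unknown"))
    print(sniff_library_format("HMMER3/f"))
```

A check like this only confirms that the rmlib file itself is FASTA, as the documentation requires; it does not explain why hmmpress was invoked on it.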
Thanks,

John Martin

________________________________
The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail.
-------------- next part --------------
#-----BLAST and Exonerate Statistics Thresholds
blast_type=ncbi+ #set to 'ncbi+', 'ncbi' or 'wublast'

pcov_blastn=0.8 #Blastn Percent Coverage Threshold EST-Genome Alignments
pid_blastn=0.85 #Blastn Percent Identity Threshold EST-Genome Alignments
eval_blastn=1e-10 #Blastn eval cutoff
bit_blastn=40 #Blastn bit cutoff
depth_blastn=0 #Blastn depth cutoff (0 to disable cutoff)

pcov_blastx=0.5 #Blastx Percent Coverage Threshold Protein-Genome Alignments
pid_blastx=0.4 #Blastx Percent Identity Threshold Protein-Genome Alignments
eval_blastx=1e-06 #Blastx eval cutoff
bit_blastx=30 #Blastx bit cutoff
depth_blastx=0 #Blastx depth cutoff (0 to disable cutoff)

pcov_tblastx=0.8 #tBlastx Percent Coverage Threshold alt-EST-Genome Alignments
pid_tblastx=0.85 #tBlastx Percent Identity Threshold alt-EST-Genome Alignments
eval_tblastx=1e-10 #tBlastx eval cutoff
bit_tblastx=40 #tBlastx bit cutoff
depth_tblastx=0 #tBlastx depth cutoff (0 to disable cutoff)

pcov_rm_blastx=0.5 #Blastx Percent Coverage Threshold For Transposable Element Masking
pid_rm_blastx=0.4 #Blastx Percent Identity Threshold For Transposable Element Masking
eval_rm_blastx=1e-06 #Blastx eval cutoff for transposable element masking
bit_rm_blastx=30 #Blastx bit cutoff for transposable element masking

ep_score_limit=20 #Exonerate protein percent of maximal score threshold
en_score_limit=20 #Exonerate nucleotide percent of maximal score threshold
-------------- next part --------------
#-----Genome (these are always required)
genome=Tcol_final_assembly.minsize_2000.fna.PART.1 #genome sequence (fasta file or fasta embedded in GFF3 file)
organism_type=eukaryotic #eukaryotic or prokaryotic. Default is eukaryotic

#-----Re-annotation Using MAKER Derived GFF3
maker_gff= #MAKER derived GFF3 file
est_pass=0 #use ESTs in maker_gff: 1 = yes, 0 = no
altest_pass=0 #use alternate organism ESTs in maker_gff: 1 = yes, 0 = no
protein_pass=0 #use protein alignments in maker_gff: 1 = yes, 0 = no
rm_pass=0 #use repeats in maker_gff: 1 = yes, 0 = no
model_pass=0 #use gene models in maker_gff: 1 = yes, 0 = no
pred_pass=0 #use ab-initio predictions in maker_gff: 1 = yes, 0 = no
other_pass=0 #passthrough anything else in maker_gff: 1 = yes, 0 = no

#-----EST Evidence (for best results provide a file for at least one)
est= #set of ESTs or assembled mRNA-seq in fasta format
altest= #EST/cDNA sequence file in fasta format from an alternate organism
est_gff=/gscmnt/gc2732/mitrevalab/USDA/t_colubriformis/180530_assembly_and_annotation_PacBio_sequel_data/SUPERNOVA_assembly/ANNOTATION/MAKER/RNAseq_transcript_assembly_w_better_HiSat2_arguments_and_using_stringtie/Merged_StringTie_Transcripts.maker_gff3 #aligned ESTs or mRNA-seq from an external GFF3 file
altest_gff= #aligned ESTs from a closely related species in GFF3 format

#-----Protein Homology Evidence (for best results provide a file for at least one)
protein=/gscmnt/gc2732/mitrevalab/USDA/t_colubriformis/180530_assembly_and_annotation_PacBio_sequel_data/SUPERNOVA_assembly/ANNOTATION/BRAKER2/evidence_protein/All_9_close_nematode.protein.faa #protein sequence file in fasta format (i.e. from multiple organisms)
protein_gff= #aligned protein homology evidence from an external GFF3 file

#-----Repeat Masking (leave values blank to skip repeat masking)
model_org=all #select a model organism for RepBase masking in RepeatMasker
rmlib=/gscmnt/gc2732/mitrevalab/USDA/t_colubriformis/180530_assembly_and_annotation_PacBio_sequel_data/SUPERNOVA_assembly/ANNOTATION/REPEAT_LIBRARY_std_RepeatModeler_install_bsub_pa4_spanhosts1_n5_249Gb/RM_1.TueMay281756062019/consensi.fa.classified #provide an organism specific repeat library in fasta format for RepeatMasker
repeat_protein=/usr/local/maker/data/te_proteins.fasta #provide a fasta file of transposable element proteins for RepeatRunner
rm_gff= #pre-identified repeat elements from an external GFF3 file
prok_rm=0 #forces MAKER to repeatmask prokaryotes (no reason to change this), 1 = yes, 0 = no
softmask=1 #use soft-masking rather than hard-masking in BLAST (i.e. seg and dust filtering)

#-----Gene Prediction
snaphmm= #SNAP HMM file
gmhmm= #GeneMark HMM file
augustus_species=Trichostrongylus_colubriformis_assembly_rnaseq-train #Augustus gene prediction species model
fgenesh_par_file= #FGENESH parameter file
pred_gff=/gscmnt/gc2732/mitrevalab/USDA/t_colubriformis/180530_assembly_and_annotation_PacBio_sequel_data/SUPERNOVA_assembly/ANNOTATION/BRAKER2/braker2_run_rnaseq_only/braker/Trichostrongylus_colubriformis_assembly_rnaseq-train/augustus.hints.gff3 #ab-initio predictions from an external GFF3 file
model_gff= #annotated gene models from an external GFF3 file (annotation pass-through)
est2genome=0 #infer gene predictions directly from ESTs, 1 = yes, 0 = no
protein2genome=0 #infer predictions from protein homology, 1 = yes, 0 = no
trna=0 #find tRNAs with tRNAscan, 1 = yes, 0 = no
snoscan_rrna= #rRNA file to have Snoscan find snoRNAs
unmask=0 #also run ab-initio prediction programs on unmasked sequence, 1 = yes, 0 = no

#-----Other Annotation Feature Types (features MAKER doesn't recognize)
other_gff= #extra features to pass-through to final MAKER generated GFF3 file

#-----External Application Behavior Options
alt_peptide=C #amino acid used to replace non-standard amino acids in BLAST databases
cpus=1 #max number of cpus to use in BLAST and RepeatMasker (not for MPI, leave 1 when using MPI)

#-----MAKER Behavior Options
max_dna_len=100000 #length for dividing up contigs into chunks (increases/decreases memory usage)
min_contig=1 #skip genome contigs below this length (under 10kb are often useless)

pred_flank=200 #flank for extending evidence clusters sent to gene predictors
pred_stats=0 #report AED and QI statistics for all predictions as well as models
AED_threshold=1 #Maximum Annotation Edit Distance allowed (bound by 0 and 1)
min_protein=25 #require at least this many amino acids in predicted proteins
alt_splice=0 #Take extra steps to try and find alternative splicing, 1 = yes, 0 = no
always_complete=0 #extra steps to force start and stop codons, 1 = yes, 0 = no
map_forward=0 #map names and attributes forward from old GFF3 genes, 1 = yes, 0 = no
keep_preds=0 #Concordance threshold to add unsupported gene prediction (bound by 0 and 1)

split_hit=10000 #length for the splitting of hits (expected max intron size for evidence alignments)
single_exon=0 #consider single exon EST evidence when generating annotations, 1 = yes, 0 = no
single_length=250 #min length required for single exon ESTs if 'single_exon is enabled'
correct_est_fusion=0 #limits use of ESTs in annotation to avoid fusion genes

tries=2 #number of times to try a contig if there is a failure for some reason
clean_try=0 #remove all data from previous run before retrying, 1 = yes, 0 = no
clean_up=0 #removes theVoid directory with individual analysis files, 1 = yes, 0 = no
TMP=/gscmnt/gc2732/mitrevalab/USDA/t_colubriformis/180530_assembly_and_annotation_PacBio_sequel_data/SUPERNOVA_assembly/ANNOTATION/MAKER/maker_1/TMP_1 #specify a directory other than the system default temporary directory for temporary files
-------------- next part --------------
#-----Location of Executables Used by MAKER/EVALUATOR
makeblastdb=/usr/local/rmblast-2.9.0/makeblastdb #location of NCBI+ makeblastdb executable
blastn=/usr/local/ncbi-blast-2.9.0+/bin/blastn #location of NCBI+ blastn executable
blastx=/usr/local/rmblast-2.9.0/blastx #location of NCBI+ blastx executable
tblastx=/usr/local/ncbi-blast-2.9.0+/bin/tblastx #location of NCBI+ tblastx executable
formatdb=/gsc/pkg/bio/blast/blast-2.2.26/bin/formatdb #location of NCBI formatdb executable
blastall=/gsc/pkg/bio/blast/blast-2.2.26/bin/blastall #location of NCBI blastall executable
xdformat=/gsc/pkg/bio/wu-blast/blast2_2006-05-04/xdformat #location of WUBLAST xdformat executable
blasta=/gsc/pkg/bio/wu-blast/blast2_2006-05-04/blasta #location of WUBLAST blasta executable
RepeatMasker=/usr/local/RepeatMasker/RepeatMasker #location of RepeatMasker executable
exonerate=/usr/local/exonerate-2.2.0-x86_64/bin/exonerate #location of exonerate executable

#-----Ab-initio Gene Prediction Algorithms
snap=/usr/local/snap/snap #location of snap executable
gmhmme3=/usr/local/gm_et_linux_64/gmes_petap/gmhmme3 #location of eukaryotic genemark executable
gmhmmp= #location of prokaryotic genemark executable
augustus=/usr/local/augustus.2.5.5/bin/augustus #location of augustus executable
fgenesh= #location of fgenesh executable
tRNAscan-SE= #location of trnascan executable
snoscan= #location of snoscan executable

#-----Other Algorithms
probuild=/usr/local/gm_et_linux_64/gmes_petap/probuild #location of probuild executable (required for genemark)

From carsonhh at gmail.com Tue Jul 16 13:51:09 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 16 Jul 2019 13:51:09 -0600
Subject: [maker-devel] Trouble using custom repeat library w/ Maker v2.31.10
In-Reply-To: <65c82584-b495-a2a3-1c74-9fb45af52372@wustl.edu>
References: <65c82584-b495-a2a3-1c74-9fb45af52372@wustl.edu>
Message-ID: 

Reinstall RepeatMasker manually.
It's apparently been configured during its install to use HMMER, and it's possible that it was done incorrectly if it was done by a package manager like homebrew.

It's installed under /usr/local/RepeatMasker/ according to your maker_exe.ctl file.

--Carson

> On Jul 15, 2019, at 6:58 PM, Martin, John wrote:
>
> Greetings,
>
> I'm trying to use a custom repeat library in a maker run being
> performed on a ~400Mb worm genome. I've successfully tested my
> installation of maker v2.31.10 using a small set of 13 contigs from my
> full assembly (~100kb or less) using the 'model_org=all' setting for the
> RepeatMasker part of the maker_opts.ctl configuration. But when I try
> to feed maker my repeat library fasta file, maker errors out when trying
> to repeatmask because it's trying to run hmmpress on the fasta file I set
> as 'rmlib=...'
>
> I've attached the 3 .ctl files I'm using. I am not setting any
> other command line arguments for maker, when I launch the jobs I just type:
>
> maker
>
> in the directory with those .ctl files. The specific error message I
> am getting appears inside each of the subdirectories of the datastore,
> inside hmmPress.log files:
>
> Error: File format problem in trying to open HMM file
> /gscmnt/gc2732/mitrevalab/
> USDA_Zarlenga/t_colubriformis/180530_assembly_and_annotation_PacBio_sequel_data/
> SUPERNOVA_assembly/ANNOTATION/MAKER/maker_1/TMP_1/maker_sbWsNX/RM_416.MonJul1522
> 04492019/consensi.fa.classified.
> Format tag is '>rnd-1_family-14#Unknown': unrecognized.
> Current H3 format is 'HMMER3/f'. Previous H2/H3 formats also supported.
>
> I am not sure why maker is trying to run hmmpress on the fasta file
> entered for the 'rmlib' setting, which the documentation says should be
> fasta. I think hmmpress should be used on hmm files. Can someone help
> me troubleshoot this problem?
>
>
> Thanks,
>
> John Martin
>
>
> ________________________________
> The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail.
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From carsonhh at gmail.com Tue Jul 16 14:41:40 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 16 Jul 2019 14:41:40 -0600
Subject: [maker-devel] Your message to maker-devel awaits moderator approval
In-Reply-To: <5685F7D3-347A-4309-B1DE-51476ECA1CD6@coh.org>
References: <5685F7D3-347A-4309-B1DE-51476ECA1CD6@coh.org>
Message-ID: 

I'm glad you were able to find a solution. I do imagine it is a Mac OSX update issue, since its compatibility with linux tools has gradually decreased with each update.

With respect to Google Groups, we just use it as an archive; new messages and replies have to go to the maker-devel e-mail list.

Thanks,
Carson

> On Jul 16, 2019, at 11:55 AM, Charles Warden wrote:
>
> Hi,
>
> I noticed that this was actually posted on-line:
>
> https://groups.google.com/forum/#!msg/maker-devel/VCC-BFpDXtU/Bx9H2BoqEAAJ
>
> I have to admit that I didn't realize this was being made public, and I don't seem to be able to respond from the Google Groups interface (with my personal Gmail account).
>
> However, if other people encounter a similar issue, I found a solution using a local Ubuntu Docker image (and the newer version of MAKER, version maker-3.01.02-beta).
>
> I'm not entirely sure why the Mac installation used to work but currently doesn't work, although my guess is that it may have something to do with the OS upgrade. Also, I don't think it actually worked for even 1 of the 2 assemblies before (since I was not able to find those hidden files, and it took longer to create a FASTA file with protein and transcript sequences). However, I currently have those FASTA files for both sequences.
>
> Thank You,
> Charles
>
> From: maker-devel on behalf of "maker-devel-owner at yandell-lab.org"
> Date: Monday, July 15, 2019 at 9:42 AM
> To: Charles Warden
> Subject: Your message to maker-devel awaits moderator approval
>
> maker-devel
>
>
> *SECURITY/CONFIDENTIALITY WARNING:
>
> This message and any attachments are intended solely for the individual or entity to which they are addressed. This communication may contain information that is privileged, confidential, or exempt from disclosure under applicable law (e.g., personal health information, research data, financial information). Because this e-mail has been sent without encryption, individuals other than the intended recipient may be able to view the information, forward it to others or tamper with the information without the knowledge or consent of the sender. If you are not the intended recipient, or the employee or person responsible for delivering the message to the intended recipient, any dissemination, distribution or copying of the communication is strictly prohibited. If you received the communication in error, please notify the sender immediately by replying to this message and deleting the message and any accompanying files from your system. If, due to the security risks, you do not wish to receive further communications via e-mail, please reply to this message and inform the sender that you do not wish to receive further e-mail from the sender. (LCP301)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From carsonhh at gmail.com Tue Jul 16 14:43:56 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 16 Jul 2019 14:43:56 -0600
Subject: [maker-devel] Annotations from high confidence IsoSeq data
In-Reply-To: 
References: 
Message-ID: 

MAKER doesn't assign weights to evidence types, but you can look into EVM, which can take MAKER results and produce new models. If you gave your evidence as est= then EVM can consume that as well, since it will be in the GFF3. MAKER3-beta has some integrated support for EVM that you can look at, using the maker_evm.ctl file to provide weights to EVM.

--Carson

> On Jul 10, 2019, at 12:12 PM, DECKER, KEITH F [AG/1005] wrote:
>
> I have a set of high confidence full length transcripts generated via the PacBio IsoSeq pipeline.
>
> Is there a recommended way to pass these transcripts to MAKER and ensure that the corresponding gene models are included in the final annotation? I also have assembled Illumina transcripts, but I'm trying to figure out how to give the IsoSeq transcripts a higher "weight".
>
>
>
> The information contained in this e-mail is for the exclusive use of the intended recipient(s) and may be confidential, proprietary, and/or legally privileged. Inadvertent disclosure of this message does not constitute a waiver of any privilege. If you receive this message in error, please do not directly or indirectly use, print, copy, forward, or disclose any part of this message. Please also delete this e-mail and all copies and notify the sender. Thank you.
>
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
-------------- next part --------------
An HTML attachment was scrubbed...
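Carson's reply relies on the evidence supplied via est= ending up in MAKER's GFF3, where a downstream tool can read it. One way to confirm which evidence sources are actually present is to tally the source and type columns (columns 2 and 3) of the GFF3. The sketch below does that in plain Python. The function name and the example feature lines are hypothetical and for illustration only; they are not output from any run discussed in this thread.

```python
from collections import Counter


def count_features_by_source(gff3_lines):
    """Tally (source, type) pairs from the feature lines of a GFF3 file."""
    counts = Counter()
    for line in gff3_lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence section follows; no more features
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        columns = line.split("\t")
        if len(columns) != 9:
            continue  # not a well-formed 9-column feature line
        counts[(columns[1], columns[2])] += 1
    return counts


# Hypothetical feature lines, built column by column for readability.
example = [
    "##gff-version 3",
    "\t".join(["contig1", "est2genome", "expressed_sequence_match",
               "100", "900", ".", "+", ".", "ID=hit1"]),
    "\t".join(["contig1", "est2genome", "match_part",
               "100", "400", ".", "+", ".", "ID=hit1:p1;Parent=hit1"]),
    "\t".join(["contig1", "maker", "gene",
               "90", "950", ".", "+", ".", "ID=gene1"]),
]

for (source, feature_type), n in sorted(count_features_by_source(example).items()):
    print(source, feature_type, n)
```

To use it on a real file, pass an open file handle instead of the example list, since the function only needs an iterable of lines.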
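John's fix was to edit maker_exe.ctl so that the RepeatMasker entry points at a different installation. Before launching a long run after such an edit, it can help to confirm that every configured path exists and is executable. The sketch below assumes only the `key=value #comment` layout visible in the attached .ctl files; the function names are hypothetical, and this is not part of MAKER.

```python
import os


def parse_ctl(text):
    """Parse 'key=value #comment' lines into a dict; blank values are kept as ''."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop the trailing comment
        if not line or "=" not in line:
            continue  # skip blank lines and '#-----' section headers
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def missing_executables(settings):
    """Return keys whose non-empty path is not an existing executable file."""
    return sorted(
        key for key, path in settings.items()
        if path and not (os.path.isfile(path) and os.access(path, os.X_OK))
    )


if __name__ == "__main__":
    # Two lines in the same layout as the maker_exe.ctl attached to this thread.
    sample = (
        "#-----Location of Executables Used by MAKER/EVALUATOR\n"
        "RepeatMasker=/usr/local/RepeatMasker/RepeatMasker #location of RepeatMasker executable\n"
        "fgenesh= #location of fgenesh executable\n"
    )
    settings = parse_ctl(sample)
    print(settings)
    print(missing_executables(settings))  # result depends on the local machine
```

Entries left blank (such as fgenesh= above) are treated as "not configured" rather than missing, which matches how unused tools appear in the attached file.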