From xvazquezc at gmail.com  Tue Jun  2 00:57:26 2020
From: xvazquezc at gmail.com (Xabier Vázquez-Campos)
Date: Tue, 2 Jun 2020 16:57:26 +1000
Subject: [maker-devel] Maker2 changelog and Maker3
Message-ID:

Hi Carson,

I was looking for some Maker-related stuff (not important what) and I realised that there was a new release in April for Maker 2 and Maker 3 was not in beta anymore.

We have been using Maker 2.31.9 in our cluster for a few years now and while the new (2.31.11) includes the most recent change (maker_functional_fasta and maker_functional_gff handle newer UniProt/Swiss-Prot format. Also fix for newer genemark command line structure) I can't find what changed in 2.31.10 nor a general changelog. By the way, do you know when (version) the genemark command line structure changed? What about the Uniprot format?

Also, is there any fundamental difference aside of the integration of EVM in Maker3 vs Maker2? I found this entry from 2 years ago and wanted to check if this is still the case.
http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/2018-August/003279.html

Thank you,
Xabi

PS: is there a way to receive notifications for new releases? I didn't notice anything in the mail list

--
Xabier Vázquez-Campos, *PhD*
*Research Associate*
NSW Systems Biology Initiative
School of Biotechnology and Biomolecular Sciences
The University of New South Wales
Sydney NSW 2052 AUSTRALIA


From max_coyle at berkeley.edu  Tue Jun  2 23:27:14 2020
From: max_coyle at berkeley.edu (Maxwell C Coyle)
Date: Tue, 2 Jun 2020 22:27:14 -0700
Subject: [maker-devel] promiscuous mapping during annotation liftover
Message-ID: <5B012E69-B1AD-4852-9F38-FFA76A4E7289@berkeley.edu>

Hello!

I'm using Maker to map forward annotations to a new genome assembly in a species of choanoflagellate.
I followed the protocol from Campbell 2014, creating a transcript FASTA file from my old GFF3 file and using this as the only EST evidence. I set est2genome=1 and est_forward=1. Maker ran great, except it seems that many of the old transcripts are mapping to many places in the new assembly, up to 100+ times, so that my gene count has inflated from 11,624 to 25,905. Of the most highly multiply mapped genes, many but not all are rRNA genes.

I was wondering if there is a setting or tweak I can make so that each transcript maps uniquely to its best location in the new genome? Or maybe tweaking the exonerate stringency? This is not the final annotation step, as I will also be using RNA-seq evidence to improve my gene models after liftover.

Thanks for your help!

Best,
Max

Max Coyle
PhD Candidate, King Lab
Dept of Molecular and Cellular Biology, UC Berkeley


From carsonhh at gmail.com  Thu Jun 11 14:19:29 2020
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 11 Jun 2020 14:19:29 -0600
Subject: [maker-devel] promiscuous mapping during annotation liftover
In-Reply-To: <5B012E69-B1AD-4852-9F38-FFA76A4E7289@berkeley.edu>
References: <5B012E69-B1AD-4852-9F38-FFA76A4E7289@berkeley.edu>
Message-ID:

Mapping genes forward onto a new assembly is an iterative process. If a previous model maps to multiple locations, that is an indication that you accidentally annotated a repeat in the previous annotation set (e.g. transposons). This is especially true if you get 100 copies. You should simply remove that gene.

If it's just 2 or 3 copies, you can use the score column (see GFF3 format specification), which indicates the best match to the original copy, to keep one or the other (value is 0-100% recovery when est_forward=1 is set). You can compare neighboring genes between assemblies to see if one copy is simply a paralog.
If there are a lot of genes with 2 copies, you probably have high heterozygosity in the genome so maternal and paternal chromosomes are assembling independently (i.e. this means that both copies are real and are the exact same gene). If you decide a gene belongs on a specific contig after reviewing neighboring models and score, you can anchor it to a contig or region by adding maker_coor= to the fasta header (example: >transcriptA maker_coor=contig1:10000-11000; ).

Also consider removing models that have low scores. They may map to only one location, but if the % recovery is low, it may be best to just not try and recover the old model (a new gene prediction may prove much more accurate).

--Carson

> On Jun 2, 2020, at 11:27 PM, Maxwell C Coyle wrote:
>
> Hello!
>
> I'm using Maker to map forward annotations to a new genome assembly in a species of choanoflagellate. I followed the protocol from Campbell 2014, creating a transcript FASTA file from my old GFF3 file and using this as the only EST evidence. I set est2genome=1 and est_forward=1. The Maker ran great, except it seems that many of the old transcripts are mapping many places in the new assembly, up to 100+ times, so that my gene count has inflated from 11,624 to 25,905. Of the most highly multiply mapped genes, many but not all are rRNA genes.
>
> I was wondering if there is a setting or tweak I can make so that each transcript maps uniquely to its best location in the new genome? Or maybe tweaking the exonerate stringency? This is not the final annotation step, as I will also be using RNA-seq evidence to improve my gene models after liftover.
>
> Thanks for your help!
>
> Best,
> Max
>
> Max Coyle
> PhD Candidate, King Lab
> Dept of Molecular and Cellular Biology, UC Berkeley
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at yandell-lab.org
> http://yandell-lab.org/mailman/listinfo/maker-devel_yandell-lab.org


From sagnik at iastate.edu  Sun Jun 14 06:32:09 2020
From: sagnik at iastate.edu (Sagnik Banerjee)
Date: Sun, 14 Jun 2020 07:32:09 -0500
Subject: [maker-devel] Encountering problems while using SNAP models
Message-ID:

Hello,

I am using MAKER3 to generate annotation for Saccharomyces cerevisiae. In round1 I used assembled transcripts and the run was completed successfully. In the next round, I supplied gene models trained using SNAP. I keep getting an error that cites a problem with line 540 at snap.pm.

Can't locate object method "add_entry" via package "1" (perhaps you forgot to load "1"?) at /project/software/7/apps/maker3/3.01.03/bin/../lib/Widget/snap.pm line 540.
--> rank=15, hostname=ceres14-compute-54.scinet.local
ERROR: Failed while annotating transcripts
ERROR: Chunk failed at level:1, tier_type:4
FAILED CONTIG:II

The problem is somewhat similar to the one discussed in
http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/2018-January/003150.html.
I have attached the error file. I have tried executing the same previously but the run did not complete and the job was killed. I was inspecting the run closely and I found that MAKER3 stopped writing to the .error file after a few minutes. I am including the portion that was written to the .error file. I have attached the relevant files.

Could you please look into this?

Thank you.
Sagnik Banerjee
Graduate Research Assistant
Bioinformatics and Computational Biology
Department of Statistics
Dr. Carson Andorf's Lab
Agricultural Research Service (ARS) Research Participation Program
Oak Ridge Institute for Science and Education (ORISE)
437, Bessey Hall
Iowa State University

*"The moment I have realized God sitting in the temple of every human body, the moment I stand in reverence before every human being and see God in him - that moment I am free from bondage, everything that binds vanishes, and I am free" - Swami Vivekananda*

-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_opts.ctl
Type: application/octet-stream
Size: 5133 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sacce.hmm
Type: application/octet-stream
Size: 46562 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Error
Type: application/octet-stream
Size: 307562 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: round1.gff
Type: application/octet-stream
Size: 13906493 bytes
Desc: not available
URL:


From sagnik at iastate.edu  Thu Jun 25 09:08:22 2020
From: sagnik at iastate.edu (Sagnik Banerjee)
Date: Thu, 25 Jun 2020 10:08:22 -0500
Subject: [maker-devel] Encountering problems while using SNAP models
In-Reply-To:
References:
Message-ID:

Hello,

Could you find what's wrong with this?

Thank you.
Sagnik Banerjee
Graduate Research Assistant
Bioinformatics and Computational Biology
Department of Statistics
Dr. Carson Andorf's Lab
Agricultural Research Service (ARS) Research Participation Program
Oak Ridge Institute for Science and Education (ORISE)
437, Bessey Hall
Iowa State University

*"The moment I have realized God sitting in the temple of every human body, the moment I stand in reverence before every human being and see God in him - that moment I am free from bondage, everything that binds vanishes, and I am free" - Swami Vivekananda*

On Sun, Jun 14, 2020 at 7:32 AM Sagnik Banerjee wrote:

> Hello,
>
> I am using MAKER3 to generate annotation for Saccharomyces cerevisiae. In
> round1 I used assembled transcripts and the run was completed successfully.
> In the next round, I supplied gene models trained using SNAP. I keep
> getting an error that cites a problem with line 540 at snap.pm.
>
> Can't locate object method "add_entry" via package "1" (perhaps you forgot
> to load "1"?) at /project/software/7/apps/maker3/3.01.03/bin/../lib/Widget/snap.pm line 540.
>
> --> rank=15, hostname=ceres14-compute-54.scinet.local
>
> ERROR: Failed while annotating transcripts
>
> ERROR: Chunk failed at level:1, tier_type:4
>
> FAILED CONTIG:II
>
> The problem is somewhat similar to the one discussed in
> http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/2018-January/003150.html.
> I have attached the error file. I have tried executing the same previously
> but the run did not complete and the job was killed. I was inspecting the
> run closely and I found that MAKER3 stopped writing to the .error file
> after a few minutes. I am including the portion that was written to the
> .error file. I have attached the relevant files.
>
> Could you please look into this?
>
> Thank you.
> Sagnik Banerjee
> Graduate Research Assistant
> Bioinformatics and Computational Biology
> Department of Statistics
> Dr.
Carson Andorf's Lab
> Agricultural Research Service (ARS) Research Participation Program
> Oak Ridge Institute for Science and Education (ORISE)
> 437, Bessey Hall
> Iowa State University
>
> *"The moment I have realized God sitting in the temple of every human
> body, the moment I stand in reverence before every human being and see God
> in him - that moment I am free from bondage, everything that binds
> vanishes, and I am free" - Swami Vivekananda*


From wei.xiong at wur.nl  Thu Jun 18 05:19:38 2020
From: wei.xiong at wur.nl (Xiong, Wei)
Date: Thu, 18 Jun 2020 11:19:38 +0000
Subject: [maker-devel] Final Maker annotation extend ORF
Message-ID:

Dear Developer,

I have used MAKER to annotate my genome. The inputs for the MAKER run are pred_gff (de novo annotation by Augustus), the protein sequences of a closely related species, and a transcriptome assembled from RNA-seq using Trinity.

When I checked the output, I noticed that many of the predicted gene models extend into the regions flanking the ORF. I have checked the evidence I applied. I think the transcriptome data is the reason, because the RNA-seq does not contain only mature mRNA.

My question is, how can I properly make use of the mRNA while preventing this extension? Is there any way I can fix the current annotation instead of re-running the annotation?

Thank you for your help. I look forward to hearing from you!

Best regards and stay safe,

Wei Xiong
Ph.D. candidate | Wageningen University & Research
Plant Science Group | Biosystematics Group
Radix Building 107
Droevendaalsesteeg 1
6708 PB Wageningen
The Netherlands
E-mail: wei.xiong at wur.nl
URL: From kasuni91 at tamu.edu Sun Jun 21 13:24:12 2020 From: kasuni91 at tamu.edu (D Mudiyanselage Kasuni Daundasekara) Date: Sun, 21 Jun 2020 14:24:12 -0500 Subject: [maker-devel] Assistance on MAKER annotation Message-ID: Hello Yandell lab, We are trying to annotate a de novo plant genome. Originally, we annotated this genome (CAA, 1041 scaffolds) in 2016. Currently, we have updated the genome assembly of CAA (14 chromosomes + 737 scaffolds). We have a problem, there has been a loss in the number of genes in the new annotation (old genome=43289, new genome=38.633). Do you have any idea where these genes have been lost and any advice on how to get these genes back (some of these genes are really important, were well supported with AED and now are gone)? Thank you in advance. Best wishes, Kasuni Graduate student Pepper Lab Texas A & M University -------------- next part -------------- An HTML attachment was scrubbed... URL: From xvazquezc at gmail.com Tue Jun 2 00:57:26 2020 From: xvazquezc at gmail.com (=?UTF-8?Q?Xabier_V=C3=A1zquez=2DCampos?=) Date: Tue, 2 Jun 2020 16:57:26 +1000 Subject: [maker-devel] Maker2 changelog and Maker3 Message-ID: Hi Carson, I was looking for some Maker-related stuff (not important what) and I realised that there was a new release in April for Maker 2 and Maker 3 was not in beta anymore. We have been using Maker 2.31.9 in our cluster for a few years now and while the new (2.31.11) includes the most recent change (maker_functional_fasta and maker_functional_gff handle newer UniProt/Swiss-Prot format. Also fix for newer genemark command line structure) I can't find what changed in 2.31.10 nor a general changelog. By the way, do you know when (version) the genemark command line structure changed? What about the Uniprot format? Also, is there any fundamental difference aside of the integration of EVM in Maker3 vs Maker2? I found this entry from 2 years ago and wanted to check if this is still the case. 
http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/2018-August/003279.html Thank you, Xabi PS: is there a way to receive notifications for new releases? I didn't notice anything in the mail list -- Xabier V?zquez-Campos, *PhD* *Research Associate* NSW Systems Biology Initiative School of Biotechnology and Biomolecular Sciences The University of New South Wales Sydney NSW 2052 AUSTRALIA -------------- next part -------------- An HTML attachment was scrubbed... URL: From max_coyle at berkeley.edu Tue Jun 2 23:27:14 2020 From: max_coyle at berkeley.edu (Maxwell C Coyle) Date: Tue, 2 Jun 2020 22:27:14 -0700 Subject: [maker-devel] promiscuous mapping during annotation liftover Message-ID: <5B012E69-B1AD-4852-9F38-FFA76A4E7289@berkeley.edu> Hello! I?m using Maker to map forward annotations to a new genome assembly in a species of choanoflagellate. I followed the protocol from Campbell 2014, creating a transcript FASTA file from my old GFF3 file and using this as the only EST evidence. I set est2genome=1 and est_forward=1. The Maker ran great, except it seems that many of the old transcripts are mapping many places in the new assembly, up to 100+ times, so that my gene count has inflated from 11,624 to 25,905. Of the most highly multiply mapped genes, many but not all are rRNA genes. I was wondering if there is a setting or tweak I can make so that each transcript maps uniquely to its best location in the new genome? Or maybe tweaking the exonerate stringency? This is not the final annotation step, as I will also be using RNA-seq evidence to improve my gene models after liftover. Thanks for your help! Best, Max Max Coyle PhD Candidate, King Lab Dept of Molecular and Cellular Biology, UC Berkeley -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From carsonhh at gmail.com Thu Jun 11 14:19:29 2020 From: carsonhh at gmail.com (Carson Holt) Date: Thu, 11 Jun 2020 14:19:29 -0600 Subject: [maker-devel] promiscuous mapping during annotation liftover In-Reply-To: <5B012E69-B1AD-4852-9F38-FFA76A4E7289@berkeley.edu> References: <5B012E69-B1AD-4852-9F38-FFA76A4E7289@berkeley.edu> Message-ID: Mapping genes forward onto a new assembly is an iterative process. If a previous model maps to multiple locations that is an indication that you accidentally annotated a repeat in the previous annotation set (i.e. transposes, etc). This is especially true if you get 100 copies. You should simply remove that gene. If it?s just 2 or 3 copies you can use the score column (see GFF3 format specification) which indicates the best match to the original copy to keep one or the other (value is 0-100% recovery when est_forward=1 is set). You can compare neighboring genes between assemblies to see if one copy is simply a paralog. If there are a lot of genes with 2 copies, you probably have high heterozygosity in the genome so maternal and paternal chromosomes are assembling independently (i.e. this means that both copies are real and are the exact same gene). If you decide a gene belongs on a specific contig after reviewing neighboring models and score, you can anchor it to a contig or region by adding maker_coor= to the fasta header (example: >transcriptA maker_coor=contig1:10000-11000; ). Also consider removing models that have low scores. They may map to only one location, but if the % recovery is low, it may be best to just not try and recover the old model (a new gene prediction may prove much more accurate). ?Carson > On Jun 2, 2020, at 11:27 PM, Maxwell C Coyle wrote: > > Hello! > > I?m using Maker to map forward annotations to a new genome assembly in a species of choanoflagellate. I followed the protocol from Campbell 2014, creating a transcript FASTA file from my old GFF3 file and using this as the only EST evidence. 
I set est2genome=1 and est_forward=1. The Maker ran great, except it seems that many of the old transcripts are mapping many places in the new assembly, up to 100+ times, so that my gene count has inflated from 11,624 to 25,905. Of the most highly multiply mapped genes, many but not all are rRNA genes. > > I was wondering if there is a setting or tweak I can make so that each transcript maps uniquely to its best location in the new genome? Or maybe tweaking the exonerate stringency? This is not the final annotation step, as I will also be using RNA-seq evidence to improve my gene models after liftover. > > Thanks for your help! > > Best, > Max > > Max Coyle > PhD Candidate, King Lab > Dept of Molecular and Cellular Biology, UC Berkeley > > > > _______________________________________________ > maker-devel mailing list > maker-devel at yandell-lab.org > http://yandell-lab.org/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From sagnik at iastate.edu Sun Jun 14 06:32:09 2020 From: sagnik at iastate.edu (Sagnik Banerjee) Date: Sun, 14 Jun 2020 07:32:09 -0500 Subject: [maker-devel] Encountering problems while using SNAP models Message-ID: Hello, I am using MAKER3 to generate annotation for Saccharomyces cerevisiae. In round1 I used assembled transcripts and the run was completed successfully. In the next round, I supplied gene models trained using SNAP. I keep getting an error that cites a problem with line 540 at snap.pm. Can't locate object method "add_entry" via package "1" (perhaps you forgot to load "1"?) at /project/software/7/apps/maker3/3.01.03/bin/../lib/Widget/ snap.pm line 540. --> rank=15, hostname=ceres14-compute-54.scinet.local ERROR: Failed while annotating transcripts ERROR: Chunk failed at level:1, tier_type:4 FAILED CONTIG:II The problem is somewhat similar to the one discussed in http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/2018-January/003150.html. 
I have attached the error file. I have tried executing the same previously but the run did not complete and the job was killed. I was inspecting the run closely and I found that MAKER3 stopped writing to the .error file after a few minutes. I am including the portion that was written to the .error file. I have attached the relevant files. Could you please look into this? Thank you. Sagnik Banerjee Graduate Research Assistant Bioinformatics and Computational Biology Department of Statistics Dr. Carson Andorf's Lab Agricultural Research Service (ARS) Research Participation Program Oak Ridge Institute for Science and Education (ORISE) 437, Bessey Hall Iowa State University *"The moment I have realized God sitting in the temple of every human body, the moment I stand in reverence before every human being and see God in him - that moment I am free from bondage, everything that binds vanishes, and I am free" - Swami Vivekananda* ? -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: maker_opts.ctl Type: application/octet-stream Size: 5134 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: sacce.hmm Type: application/octet-stream Size: 46563 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Error Type: application/octet-stream Size: 307563 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: round1.gff Type: application/octet-stream Size: 13906494 bytes Desc: not available URL: From sagnik at iastate.edu Thu Jun 25 09:08:22 2020 From: sagnik at iastate.edu (Sagnik Banerjee) Date: Thu, 25 Jun 2020 10:08:22 -0500 Subject: [maker-devel] Encountering problems while using SNAP models In-Reply-To: References: Message-ID: Hello, Could you find what's wrong with this? Thank you. 
Sagnik Banerjee

On Sun, Jun 14, 2020 at 7:32 AM Sagnik Banerjee wrote:

> Hello,
>
> I am using MAKER3 to generate annotation for Saccharomyces cerevisiae. In
> round1 I used assembled transcripts and the run was completed successfully.
> In the next round, I supplied gene models trained using SNAP. I keep
> getting an error that cites a problem with line 540 at snap.pm.
>
> Can't locate object method "add_entry" via package "1" (perhaps you forgot
> to load "1"?) at /project/software/7/apps/maker3/3.01.03/bin/../lib/Widget/
> snap.pm line 540.
>
> --> rank=15, hostname=ceres14-compute-54.scinet.local
>
> ERROR: Failed while annotating transcripts
>
> ERROR: Chunk failed at level:1, tier_type:4
>
> FAILED CONTIG:II
>
> The problem is somewhat similar to the one discussed in
> http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/2018-January/003150.html.
> I have attached the error file. I have tried executing the same previously
> but the run did not complete and the job was killed. I was inspecting the
> run closely and I found that MAKER3 stopped writing to the .error file
> after a few minutes. I am including the portion that was written to the
> .error file. I have attached the relevant files.
>
> Could you please look into this?
>
> Thank you.
> Sagnik Banerjee

From wei.xiong at wur.nl Thu Jun 18 05:19:38 2020
From: wei.xiong at wur.nl (Xiong, Wei)
Date: Thu, 18 Jun 2020 11:19:38 +0000
Subject: [maker-devel] Final Maker annotation extend ORF

Dear Developer,

I have used MAKER to annotate my genome. The inputs for the MAKER run were pred_gff (de novo annotation by Augustus), the protein sequences of a closely related species, and a transcriptome assembled from RNA-seq data using Trinity.

When I checked the output, I noticed that many of the predicted gene models extend into the regions flanking the ORF. I have checked the evidence I supplied, and I think the transcriptome data is the cause, because the RNA-seq data do not contain only mature mRNA.

My question is: how can I properly make use of the mRNA evidence while preventing this extension? Is there any way I can fix the current annotation instead of re-running the annotation?

Thank you for your help. I look forward to hearing from you!

Best regards and stay safe,

Wei Xiong
Ph.D. candidate | Wageningen University & Research
Plant Science Group | Biosystematics Group
Radix Building 107
Droevendaalsesteeg 1
6708 PB Wageningen
The Netherlands
E-mail: wei.xiong at wur.nl

From kasuni91 at tamu.edu Sun Jun 21 13:24:12 2020
From: kasuni91 at tamu.edu (D Mudiyanselage Kasuni Daundasekara)
Date: Sun, 21 Jun 2020 14:24:12 -0500
Subject: [maker-devel] Assistance on MAKER annotation

Hello Yandell lab,

We are trying to annotate a de novo plant genome. We originally annotated this genome (CAA, 1041 scaffolds) in 2016. We have since updated the CAA genome assembly (14 chromosomes + 737 scaffolds).

We have a problem: the new annotation contains fewer genes than the old one (old genome = 43,289; new genome = 38,633). Do you have any idea where these genes were lost, and any advice on how to get them back? Some of these genes are really important and were well supported by AED, and now they are gone.

Thank you in advance.

Best wishes,
Kasuni
Graduate student
Pepper Lab
Texas A & M University