From eennadi at gmail.com Tue Sep 1 12:31:20 2020
From: eennadi at gmail.com (Emmanuel Nnadi)
Date: Tue, 1 Sep 2020 19:31:20 +0100
Subject: [maker-devel] Maker yields sequences without start and stop codon

I ran my annotation using ESTs from NCBI, transcriptome data, and Swiss-Prot data. I trained SNAP twice, after which I trained Augustus once. I tried to submit the annotated genome to NCBI but ran into problems I have not experienced before and do not know how to solve.

Some of the CDS features have invalid translations. Every CDS feature should have a valid start and stop codon, and should not have any internal stops. The only exception is if the CDS is partial at the end of a sequence or at an intron/exon boundary. If the CDS is partial, it must have the appropriate partial symbols. If a CDS feature does not have a valid translation (for example if there is a frameshift), please remove the CDS feature and annotate this with a single gene feature across the entire span. Include a note on the gene with a brief description. For example:

1    200    gene
            gene         phoA
            gene_desc    alkaline phosphatase
            locus_tag    OBB_0001
            note         nonfunctional due to frameshift

[3] There are 1075 gene features that are not associated with any other features (CDS, rRNA, etc.) and are not labelled as pseudogenes or as nonfunctional genes. Did you lose some of the annotation you intended to include?

Please, how can this be solved?

Thanks

Nnadi Nnaemeka Emmanuel, Ph.D
Department of Microbiology,
Faculty of Natural and Applied Science,
Plateau State University, Bokkos,
Plateau State, Nigeria.
+2348068124819
Publications: https://www.researchgate.net/profile/Emmanuel_Nnadi/publications

From 14zac2 at gmail.com Mon Sep 7 13:02:36 2020
From: 14zac2 at gmail.com (Zoe Clarke)
Date: Mon, 7 Sep 2020 15:02:36 -0400
Subject: [maker-devel] Issue setting up Maker for MPI on cluster

Hello!

I am having a bit of trouble running Maker on a compute cluster with MPI. I am using the Compute Canada cluster, which already had Maker installed with openmpi, but I don't think it was installed quite properly, as Maker crashes after 6 hours. I then decided to install Maker using an mpich that was available on the cluster but ran into an error. I figured this error may have been due to mpich not being set up with the shared libraries, so I installed a new version of mpich and configured Maker following the instructions on this thread:
http://gmod.827538.n3.nabble.com/WARNING-Multiple-MAKER-processes-td4040133.html

After following the instructions on this thread exactly, I am running into the exact same errors that I previously ran into with mpich, which are the permission errors pasted below. I'm thinking I might have to modify some permissions somewhere, but I'm not sure where. Do you know why this process is failing?

I am testing Maker using the command

  /home/zocla/mpich3/bin/mpiexec -n 3 maker --help

and getting the error below.

Thanks so much for your help!
Zoe

Argument "2.53_01" isn't numeric in numeric ge (>=) at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/forks.pm line 1570.
Argument "2.53_01" isn't numeric in numeric ge (>=) at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/forks.pm line 1570.
Argument "2.53_01" isn't numeric in numeric ge (>=) at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/forks.pm line 1570.
Can't create Inline validation file /scratch/zocla/maker_mympich/maker/perl/lib/auto/Parallel/Application/MPI/MPI.inl: Permission denied at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 236.
--> rank=NA, hostname=cedar1.cedar.computecanada.ca
 at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 256.
    Parallel::Application::MPI::_bind("/home/zocla/mpich3/bin/mpicc", "/home/zocla/mpich3/include", "/scratch/zocla/maker_mympich/maker/bin/../perl", "") called at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 213
    Parallel::Application::MPI::_load() called at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 74
    Parallel::Application::MPI::MPI_Init() called at /scratch/zocla/maker_mympich/maker/bin//maker line 265
--> rank=NA, hostname=cedar1.cedar.computecanada.ca
Can't create Inline validation file /scratch/zocla/maker_mympich/maker/perl/lib/auto/Parallel/Application/MPI/MPI.inl: Permission denied at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 236.
--> rank=NA, hostname=cedar1.cedar.computecanada.ca
 at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 256.
    Parallel::Application::MPI::_bind("/home/zocla/mpich3/bin/mpicc", "/home/zocla/mpich3/include", "/scratch/zocla/maker_mympich/maker/bin/../perl", "") called at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 213
    Parallel::Application::MPI::_load() called at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 74
    Parallel::Application::MPI::MPI_Init() called at /scratch/zocla/maker_mympich/maker/bin//maker line 265
--> rank=NA, hostname=cedar1.cedar.computecanada.ca
Can't create Inline validation file /scratch/zocla/maker_mympich/maker/perl/lib/auto/Parallel/Application/MPI/MPI.inl: Permission denied at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 236.
--> rank=NA, hostname=cedar1.cedar.computecanada.ca
 at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 256.
    Parallel::Application::MPI::_bind("/home/zocla/mpich3/bin/mpicc", "/home/zocla/mpich3/include", "/scratch/zocla/maker_mympich/maker/bin/../perl", "") called at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 213
    Parallel::Application::MPI::_load() called at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 74
    Parallel::Application::MPI::MPI_Init() called at /scratch/zocla/maker_mympich/maker/bin//maker line 265
--> rank=NA, hostname=cedar1.cedar.computecanada.ca
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 13
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES

From guerrer at uni-duesseldorf.de Tue Sep 15 08:35:18 2020
From: guerrer at uni-duesseldorf.de (Ricardo Nuno Ferreira Martins Guerreiro)
Date: Tue, 15 Sep 2020 16:35:18 +0200
Subject: [maker-devel] Abinitio annotation looses many genes from EST annotation
Message-ID: <7795e7973f118c719ef54cc11387f7fb@uni-duesseldorf.de>

Hello,

It appears to me that my ab initio SNAP + Augustus (trained by nextflow) results are worse than the first (est2genome and prot2genome) run. The gene number decreases by more than 6000, the opposite of what I expected. Additionally, my ab initio rerun also has fewer genes than a previous SNAP reannotation.

The AED plots all look more or less the same, but the number of genes decreases instead of increasing as I expected (image attached). I've been running my maker+nextflow pipeline with consistently good results; this species is now the exception.

Do you have any suggestions? Should I ditch ab initio training and use the evidence-based annotation? Or maybe just use the closest Augustus model instead of training my own?

Kind regards,
Ricardo

-------------- next part --------------
A non-text attachment was scrubbed...
Name: AEDs.png
Type: image/png
Size: 38352 bytes
Desc: not available

From wei.xiong at wur.nl Fri Sep 18 03:57:19 2020
From: wei.xiong at wur.nl (Xiong, Wei)
Date: Fri, 18 Sep 2020 09:57:19 +0000
Subject: [maker-devel] MAKER ERROR

Dear Colleague,

I encountered an error when I used MAKER to annotate my genome. It is a large plant genome (more than 3Gb). I included TE data (gff), one transcriptome dataset (fasta), and three protein sequence sets (fasta) for the homology annotation. There are 29 scaffolds in total. Three scaffolds failed with "ERROR: Failed while processing all repeats," while the other 26 finished successfully.

I have tried the following methods from the online forum. However, I still can't fix the error.

* Check the RepeatMasker configuration
* Check the maker_exe.ctl, and set an updated BLAST (ncbi-blast-2.7.1)
* replace .../maker/lib/Widget/RepeatMasker.pm
  http://gmod.827538.n3.nabble.com/MAKER-v3-ERROR-Failed-while-processing-all-repeats-td4059410.html
* increase the try_count in the maker_opt.ctl

Could you please help me to solve this problem? Thank you for reading my email. I look forward to hearing from you.

Best wishes and stay healthy,
Wei Xiong

PhD candidate | Wageningen University & Research
Plant Science Group | Biosystematics Group
Radix Building 107
Droevendaalsesteeg 1
6708 PB Wageningen
The Netherlands
Email: wei.xiong at wur.nl

From zoe.clarke at utoronto.ca Fri Sep 25 03:17:47 2020
From: zoe.clarke at utoronto.ca (Zoe Clarke)
Date: Fri, 25 Sep 2020 09:17:47 +0000
Subject: [maker-devel] map_forward and temporary storage questions

Hello!

I am currently running Maker on a 2.5GB genome that has already had a list of ~8000 genes very thoroughly annotated. My hope is to find and annotate the rest of the genes using ESTs and protein homology. However, I tested Maker on a single contig of my genome (there are ~20,000 contigs) and I can't find any of the genes from my original gtf file, even though I followed all of the instructions in this wiki:
http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Updating_annotations_in_light_of_new_data
(I entered the original gff under model_gff, and used map_forward=1).

I am worried this is because my gff3 file isn't formatted properly. Here are a few lines in my gff file as an example:

--------------------------------------------
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 transcript 1094446 1105585 . + . ID=DIMT1.1;geneID=DIMT1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1094446 1094521 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1094874 1094947 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1095459 1095545 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1097351 1097412 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1097492 1097585 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1097670 1097719 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1098957 1099080 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1099217 1099309 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1100870 1100934 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1101967 1102030 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1103784 1103890 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1105543 1105585 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1094446 1094521 . + 0 Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1094874 1094947 . + 2 Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1095459 1095545 . + 0 Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1097351 1097412 . + 0 Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1097492 1097585 . + 1 Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1097670 1097719 . + 0 Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1098957 1099080 . + 1 Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1099217 1099309 . + 0 Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1100870 1100934 . + 0 Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1101967 1102030 . + 1 Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1103784 1103890 . + 0 Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1105543 1105582 . + 1 Parent=DIMT1.1
--------------------------------------

This is from the contig I used as a test for Maker, and I can't find DIMT1.1 in the final gff file. At first I thought it might be because "geneID" is a listed attribute, but changing this to "Name" didn't help. Do you have any ideas why these genes might not be mapping forward? If it's something I can fix in the gff file, I am hoping I can fix it and use it for the second round of Maker after I have trained Snap.

Also, do you think a better quality annotation would result from Snap trained on this curated list of ~8000 genes (that has been expertly done) or on the round 1 output of Maker?

A final question: I am having storage issues with Maker, as it is currently taking up ~15TB of storage with temporary files. I am running Maker on a cluster, and whenever my submitted Maker job runs out of memory it fails, so I have to resubmit it about every hour, which leaves a lot of temporary folders (e.g. maker_x6V2y4) in my directory. I notice that some of these temporary files haven't been updated in days. Is it okay to delete them?

Thank you so much for your help!
Zoe ______________________________________ Zoe Clarke PhD candidate in Computational Biology at U of T Lab profile: http://baderlab.org/Zoe%20Clarke Personal website: https://zoe-clarke.weebly.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From eennadi at gmail.com Tue Sep 1 12:31:20 2020 From: eennadi at gmail.com (Emmanuel Nnadi) Date: Tue, 1 Sep 2020 19:31:20 +0100 Subject: [maker-devel] Maker yields sequences without start and stop codon Message-ID: I ran my annotation using ESTs from NCBI and transcriptome data and Swiss-prot data. I used SNAP trained twice after which Augustus was trained once. I tried to submit the annotated genome to NCBI but have some problems I have not experienced before and do not know how to solve. Some of the CDS features have invalid translations. Every CDS feature should have a valid start and stop codon, and should not have any internal stops. The only exception is if the CDS is partial at the end of a sequence or at an intron/exon boundary. If the CDS is partial, it must have the appropriate partial symbols. If a CDS feature does not have a valid translation (for example if there is a frameshift), please remove the CDS feature and annotate this with a single gene feature across the entire span. Include a note on the gene with a brief description. For example: 1 200 gene gene phoA gene_desc alkaline phosphatase locus_tag OBB_0001 note nonfunctional due to frameshift [3] There are 1075 gene features that are not associated with any other features (CDS, rRNA, etc.) and are not labelled as pseudogenes or as nonfunctional genes. Did you lose some of the annotation you intended to include. Please how can this be solved? Thanks Nnadi Nnaemeka Emmanuel,Ph.D Department of Microbiology, Faculty of Natural and Applied Science, Plateau State University, Bokkos, Plateau State, Nigeria. 
+2348068124819 Publications: https://www.researchgate.net/profile/Emmanuel_Nnadi/publications -------------- next part -------------- An HTML attachment was scrubbed... URL: From 14zac2 at gmail.com Mon Sep 7 13:02:36 2020 From: 14zac2 at gmail.com (Zoe Clarke) Date: Mon, 7 Sep 2020 15:02:36 -0400 Subject: [maker-devel] Issue setting up Maker for MPI on cluster Message-ID: Hello! I am having a bit of trouble running Maker on a compute cluster with MPI. I am using the Compute Canada cluster which actually had Maker installed with openmpi, but I don't think it was installed quite properly as Maker crashes after 6 hours. I then decided to install Maker using a mpich that was available on the cluster but ran into an error. I figured this error may have been due to mpich not being set up with the shared libraries, so I installed a new version of mpich and configured Maker following instructions on this thread: http://gmod.827538.n3.nabble.com/WARNING-Multiple-MAKER-processes-td4040133.html After following the instructions on this thread exactly, I am running into the exact same errors that I previously ran into with mpich, which are permission errors pasted below. I'm thinking I might have to modify some permissions somewhere, but I'm not sure where - do you know why this process is failing? I am testing Maker using the command /home/zocla/mpich3/bin/mpiexec -n 3 maker --help and getting the error below. Thanks so much for your help! Zoe Argument "2.53_01" isn't numeric in numeric ge (>=) at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/forks.pm line 1570. Argument "2.53_01" isn't numeric in numeric ge (>=) at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/forks.pm line 1570. Argument "2.53_01" isn't numeric in numeric ge (>=) at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/forks.pm line 1570. 
Can't create Inline validation file /scratch/zocla/maker_mympich/maker/perl/lib/auto/Parallel/Application/MPI/MPI.inl: Permission denied at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 236. --> rank=NA, hostname=cedar1.cedar.computecanada.ca at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 256. Parallel::Application::MPI::_bind("/home/zocla/mpich3/bin/mpicc", "/home/zocla/mpich3/include", "/scratch/zocla/maker_mympich/maker/bin/../perl", "") called at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 213 Parallel::Application::MPI::_load() called at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 74 Parallel::Application::MPI::MPI_Init() called at /scratch/zocla/maker_mympich/maker/bin//maker line 265 --> rank=NA, hostname=cedar1.cedar.computecanada.ca Can't create Inline validation file /scratch/zocla/maker_mympich/maker/perl/lib/auto/Parallel/Application/MPI/MPI.inl: Permission denied at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 236. --> rank=NA, hostname=cedar1.cedar.computecanada.ca at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 256. 
Parallel::Application::MPI::_bind("/home/zocla/mpich3/bin/mpicc", "/home/zocla/mpich3/include", "/scratch/zocla/maker_mympich/maker/bin/../perl", "") called at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 213 Parallel::Application::MPI::_load() called at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 74 Parallel::Application::MPI::MPI_Init() called at /scratch/zocla/maker_mympich/maker/bin//maker line 265 --> rank=NA, hostname=cedar1.cedar.computecanada.ca Can't create Inline validation file /scratch/zocla/maker_mympich/maker/perl/lib/auto/Parallel/Application/MPI/MPI.inl: Permission denied at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 236. --> rank=NA, hostname=cedar1.cedar.computecanada.ca at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 256. Parallel::Application::MPI::_bind("/home/zocla/mpich3/bin/mpicc", "/home/zocla/mpich3/include", "/scratch/zocla/maker_mympich/maker/bin/../perl", "") called at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 213 Parallel::Application::MPI::_load() called at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 74 Parallel::Application::MPI::MPI_Init() called at /scratch/zocla/maker_mympich/maker/bin//maker line 265 --> rank=NA, hostname=cedar1.cedar.computecanada.ca = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES = EXIT CODE: 13 = CLEANING UP REMAINING PROCESSES = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guerrer at uni-duesseldorf.de Tue Sep 15 08:35:18 2020 From: guerrer at uni-duesseldorf.de (Ricardo Nuno Ferreira Martins Guerreiro) Date: Tue, 15 Sep 2020 16:35:18 +0200 Subject: [maker-devel] Abinitio annotation looses many genes from EST annotation Message-ID: <7795e7973f118c719ef54cc11387f7fb@uni-duesseldorf.de> Hello, It appears to me that my abInitio SNAP + Augustus (trained by nexflow) results are worse than the first (est2genome and prot2genome) run. The gene number decreases by more than 6000, the opposite of what I expected. Additionally, my abInitio rerun also has less genes than a previous SNAP reannotation.. The AED plots all look more or less the same but the number of genes lowers instead of increasing as I expected (image in annex). I'be been running my maker+nextflow pipeline with consistently good results, this species is the exception now. Do you have any suggestion? Should I ditch AbInitio training and use the Evidence based annotation? Or maybe just use the closest Augustus model instead of training my own? Kind regards, Ricardo -------------- next part -------------- A non-text attachment was scrubbed... Name: AEDs.png Type: image/png Size: 38352 bytes Desc: not available URL: From wei.xiong at wur.nl Fri Sep 18 03:57:19 2020 From: wei.xiong at wur.nl (Xiong, Wei) Date: Fri, 18 Sep 2020 09:57:19 +0000 Subject: [maker-devel] MAKER ERROR Message-ID: Dear Colleague, I had encountered an ERROR when I used MAKER to annotate my genome. It is a large plant genome (more than 3Gb). I included a TE (gff) data, one Transcriptome data (fasta), and three protein sequences (fasta) for the homology annotation. There are, in total, 29 scaffolds. Three scaffolds faced the "ERROR: Failed while processing all repeats," while the other 26 finished successfully. I have tried the following methods from the online forum. However, I still can't fix the error. 
* Check the RepeatMasker configuration * Check the maker_exe.ctl, and set a updated BLAST (ncbi-blast-2.7.1) * replace .../maker/lib/Widget/RepeatMasker.pm http://gmod.827538.n3.nabble.com/MAKER-v3-ERROR-Failed-while-processing-all-repeats-td4059410.html * increase the try_count in the maker_opt.ctl Could you please help me to solve this problem? Thank you for reading my email. I look forward to hearing from you. Best wishes and stay healthy, Wei Xiong PhD candidate | Wageningen University & Research Plant Science Group | Biosystematics Group Radix Building 107 Droevendaalsesteeg 1 6708 PB Wageningen The Netherlands Email: wei.xiong at wur.nl -------------- next part -------------- An HTML attachment was scrubbed... URL: From zoe.clarke at utoronto.ca Fri Sep 25 03:17:47 2020 From: zoe.clarke at utoronto.ca (Zoe Clarke) Date: Fri, 25 Sep 2020 09:17:47 +0000 Subject: [maker-devel] map_forward and temporary storage questions Message-ID: Hello! I am currently running Maker on a 2.5GB genome that has already had a list of ~8000 genes very thoroughly annotated. My hope is to find and annotate the rest of the genes using ESTs and protein homology. However, I tested Maker on a single contig of my genome (there are ~20,000 contigs) and I can't find any of the genes from my original gtf file even though I followed all of the instructions in this wiki: http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Updating_annotations_in_light_of_new_data (I entered the original gff under model_gff, and used map_forward=1). I am worried this is because my gff3 file isn't formatted properly. Here are a few lines in my gff file as an example: -------------------------------------------- WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 transcript 1094446 1105585 . + . ID=DIMT1.1;geneID=DIMT1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1094446 1094521 97.75 + . Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1094874 1094947 97.75 + . 
Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1095459 1095545 97.75 + . Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1097351 1097412 97.75 + . Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1097492 1097585 97.75 + . Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1097670 1097719 97.75 + . Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1098957 1099080 97.75 + . Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1099217 1099309 97.75 + . Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1100870 1100934 97.75 + . Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1101967 1102030 97.75 + . Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1103784 1103890 97.75 + . Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1105543 1105585 97.75 + . Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1094446 1094521 . + 0 Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1094874 1094947 . + 2 Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1095459 1095545 . + 0 Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1097351 1097412 . + 0 Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1097492 1097585 . + 1 Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1097670 1097719 . + 0 Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1098957 1099080 . + 1 Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1099217 1099309 . + 0 Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1100870 1100934 . + 0 Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1101967 1102030 . + 1 Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1103784 1103890 . + 0 Parent=DIMT1.1 WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1105543 1105582 . 
+ 1 Parent=DIMT1.1 -------------------------------------- ?This is from the contig I used as a test for maker, and I can't find DIMT1.1 in the final gff file. At first I thought it might be because "geneID" is a listed attribute, but changing this to "Name" didn't help. Do you have any ideas why these genes might not be mapping forward? If it's something I can fix in the gff file, I am hoping I can fix it and use it for the second round of Maker after I have trained Snap. Also, do you think a better quality annotation would results from Snap trained from this curated list of ~8000 genes (that has been expertly done) or by the round 1 output of Maker? A final question: I am having memory storage issues with Maker, as it is currently taking up ~15TB of storage with temporary files. I am running Maker on a cluster and whenever my submitted Maker job runs out of memory it fails, so I have to resubmit it about every hour, which leaves a lot of temporary folders (e.g. maker_x6V2y4) in my directory. I notice that some of these temporary files haven't been updated in days - is it okay to delete them? Thank you so much for your help! Zoe ______________________________________ Zoe Clarke PhD candidate in Computational Biology at U of T Lab profile: http://baderlab.org/Zoe%20Clarke Personal website: https://zoe-clarke.weebly.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From eennadi at gmail.com Tue Sep 1 12:31:20 2020 From: eennadi at gmail.com (Emmanuel Nnadi) Date: Tue, 1 Sep 2020 19:31:20 +0100 Subject: [maker-devel] Maker yields sequences without start and stop codon Message-ID: I ran my annotation using ESTs from NCBI and transcriptome data and Swiss-prot data. I used SNAP trained twice after which Augustus was trained once. I tried to submit the annotated genome to NCBI but have some problems I have not experienced before and do not know how to solve. Some of the CDS features have invalid translations. 
Every CDS feature should have a valid start and stop codon, and should not have any internal stops. The only exception is if the CDS is partial at the end of a sequence or at an intron/exon boundary. If the CDS is partial, it must have the appropriate partial symbols. If a CDS feature does not have a valid translation (for example if there is a frameshift), please remove the CDS feature and annotate this with a single gene feature across the entire span. Include a note on the gene with a brief description. For example: 1 200 gene gene phoA gene_desc alkaline phosphatase locus_tag OBB_0001 note nonfunctional due to frameshift [3] There are 1075 gene features that are not associated with any other features (CDS, rRNA, etc.) and are not labelled as pseudogenes or as nonfunctional genes. Did you lose some of the annotation you intended to include. Please how can this be solved? Thanks Nnadi Nnaemeka Emmanuel,Ph.D Department of Microbiology, Faculty of Natural and Applied Science, Plateau State University, Bokkos, Plateau State, Nigeria. +2348068124819 Publications: https://www.researchgate.net/profile/Emmanuel_Nnadi/publications -------------- next part -------------- An HTML attachment was scrubbed... URL: From 14zac2 at gmail.com Mon Sep 7 13:02:36 2020 From: 14zac2 at gmail.com (Zoe Clarke) Date: Mon, 7 Sep 2020 15:02:36 -0400 Subject: [maker-devel] Issue setting up Maker for MPI on cluster Message-ID: Hello! I am having a bit of trouble running Maker on a compute cluster with MPI. I am using the Compute Canada cluster which actually had Maker installed with openmpi, but I don't think it was installed quite properly as Maker crashes after 6 hours. I then decided to install Maker using a mpich that was available on the cluster but ran into an error. 
I figured this error may have been due to mpich not being set up with the shared libraries, so I installed a new version of mpich and configured Maker following instructions on this thread: http://gmod.827538.n3.nabble.com/WARNING-Multiple-MAKER-processes-td4040133.html After following the instructions on this thread exactly, I am running into the exact same errors that I previously ran into with mpich, which are permission errors pasted below. I'm thinking I might have to modify some permissions somewhere, but I'm not sure where - do you know why this process is failing? I am testing Maker using the command /home/zocla/mpich3/bin/mpiexec -n 3 maker --help and getting the error below. Thanks so much for your help! Zoe Argument "2.53_01" isn't numeric in numeric ge (>=) at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/forks.pm line 1570. Argument "2.53_01" isn't numeric in numeric ge (>=) at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/forks.pm line 1570. Argument "2.53_01" isn't numeric in numeric ge (>=) at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/forks.pm line 1570. Can't create Inline validation file /scratch/zocla/maker_mympich/maker/perl/lib/auto/Parallel/Application/MPI/MPI.inl: Permission denied at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 236. --> rank=NA, hostname=cedar1.cedar.computecanada.ca at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 256. 
Parallel::Application::MPI::_bind("/home/zocla/mpich3/bin/mpicc", "/home/zocla/mpich3/include", "/scratch/zocla/maker_mympich/maker/bin/../perl", "") called at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 213 Parallel::Application::MPI::_load() called at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 74 Parallel::Application::MPI::MPI_Init() called at /scratch/zocla/maker_mympich/maker/bin//maker line 265 --> rank=NA, hostname=cedar1.cedar.computecanada.ca Can't create Inline validation file /scratch/zocla/maker_mympich/maker/perl/lib/auto/Parallel/Application/MPI/MPI.inl: Permission denied at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 236. --> rank=NA, hostname=cedar1.cedar.computecanada.ca at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 256. Parallel::Application::MPI::_bind("/home/zocla/mpich3/bin/mpicc", "/home/zocla/mpich3/include", "/scratch/zocla/maker_mympich/maker/bin/../perl", "") called at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 213 Parallel::Application::MPI::_load() called at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 74 Parallel::Application::MPI::MPI_Init() called at /scratch/zocla/maker_mympich/maker/bin//maker line 265 --> rank=NA, hostname=cedar1.cedar.computecanada.ca Can't create Inline validation file /scratch/zocla/maker_mympich/maker/perl/lib/auto/Parallel/Application/MPI/MPI.inl: Permission denied at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 236. --> rank=NA, hostname=cedar1.cedar.computecanada.ca at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 256. 
Parallel::Application::MPI::_bind("/home/zocla/mpich3/bin/mpicc", "/home/zocla/mpich3/include", "/scratch/zocla/maker_mympich/maker/bin/../perl", "") called at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 213
Parallel::Application::MPI::_load() called at /scratch/zocla/maker_mympich/maker/bin/../perl/lib/Parallel/Application/MPI.pm line 74
Parallel::Application::MPI::MPI_Init() called at /scratch/zocla/maker_mympich/maker/bin//maker line 265
--> rank=NA, hostname=cedar1.cedar.computecanada.ca
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 13
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guerrer at uni-duesseldorf.de Tue Sep 15 08:35:18 2020
From: guerrer at uni-duesseldorf.de (Ricardo Nuno Ferreira Martins Guerreiro)
Date: Tue, 15 Sep 2020 16:35:18 +0200
Subject: [maker-devel] Abinitio annotation looses many genes from EST annotation
Message-ID: <7795e7973f118c719ef54cc11387f7fb@uni-duesseldorf.de>

Hello,

It appears to me that my ab initio SNAP + Augustus (trained by nextflow) results are worse than the first (est2genome and prot2genome) run. The gene number decreases by more than 6000, the opposite of what I expected. Additionally, my ab initio rerun also has fewer genes than a previous SNAP reannotation. The AED plots all look more or less the same, but the number of genes decreases instead of increasing as I expected (image attached).

I've been running my maker+nextflow pipeline with consistently good results; this species is the exception now. Do you have any suggestions? Should I ditch ab initio training and use the evidence-based annotation? Or maybe just use the closest Augustus model instead of training my own?

Kind regards,
Ricardo

-------------- next part --------------
A non-text attachment was scrubbed...
Name: AEDs.png
Type: image/png
Size: 38352 bytes
Desc: not available
URL:

From wei.xiong at wur.nl Fri Sep 18 03:57:19 2020
From: wei.xiong at wur.nl (Xiong, Wei)
Date: Fri, 18 Sep 2020 09:57:19 +0000
Subject: [maker-devel] MAKER ERROR
Message-ID:

Dear Colleague,

I encountered an error when I used MAKER to annotate my genome. It is a large plant genome (more than 3 Gb). I included TE data (gff), one transcriptome dataset (fasta), and three protein sequences (fasta) for the homology annotation. There are 29 scaffolds in total. Three scaffolds failed with "ERROR: Failed while processing all repeats," while the other 26 finished successfully.

I have tried the following methods from the online forum. However, I still can't fix the error.

* Check the RepeatMasker configuration
* Check the maker_exe.ctl, and set an updated BLAST (ncbi-blast-2.7.1)
* Replace .../maker/lib/Widget/RepeatMasker.pm http://gmod.827538.n3.nabble.com/MAKER-v3-ERROR-Failed-while-processing-all-repeats-td4059410.html
* Increase the try_count in the maker_opt.ctl

Could you please help me to solve this problem? Thank you for reading my email. I look forward to hearing from you.

Best wishes and stay healthy,
Wei Xiong

PhD candidate | Wageningen University & Research
Plant Science Group | Biosystematics Group
Radix Building 107
Droevendaalsesteeg 1
6708 PB Wageningen
The Netherlands
Email: wei.xiong at wur.nl
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From zoe.clarke at utoronto.ca Fri Sep 25 03:17:47 2020
From: zoe.clarke at utoronto.ca (Zoe Clarke)
Date: Fri, 25 Sep 2020 09:17:47 +0000
Subject: [maker-devel] map_forward and temporary storage questions
Message-ID:

Hello!

I am currently running Maker on a 2.5GB genome that has already had a list of ~8000 genes very thoroughly annotated. My hope is to find and annotate the rest of the genes using ESTs and protein homology.
However, I tested Maker on a single contig of my genome (there are ~20,000 contigs) and I can't find any of the genes from my original gtf file even though I followed all of the instructions in this wiki: http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Updating_annotations_in_light_of_new_data (I entered the original gff under model_gff, and used map_forward=1). I am worried this is because my gff3 file isn't formatted properly. Here are a few lines in my gff file as an example:

--------------------------------------------
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 transcript 1094446 1105585 . + . ID=DIMT1.1;geneID=DIMT1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1094446 1094521 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1094874 1094947 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1095459 1095545 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1097351 1097412 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1097492 1097585 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1097670 1097719 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1098957 1099080 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1099217 1099309 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1100870 1100934 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1101967 1102030 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1103784 1103890 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 exon 1105543 1105585 97.75 + . Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1094446 1094521 . + 0 Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1094874 1094947 . + 2 Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1095459 1095545 . + 0 Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1097351 1097412 . + 0 Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1097492 1097585 . + 1 Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1097670 1097719 . + 0 Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1098957 1099080 . + 1 Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1099217 1099309 . + 0 Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1100870 1100934 . + 0 Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1101967 1102030 . + 1 Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1103784 1103890 . + 0 Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36 ovaltine_v0.13 CDS 1105543 1105582 . + 1 Parent=DIMT1.1
--------------------------------------

This is from the contig I used as a test for maker, and I can't find DIMT1.1 in the final gff file. At first I thought it might be because "geneID" is a listed attribute, but changing this to "Name" didn't help. Do you have any ideas why these genes might not be mapping forward? If it's something I can fix in the gff file, I am hoping I can fix it and use it for the second round of Maker after I have trained Snap.

Also, do you think a better quality annotation would result from Snap trained on this curated list of ~8000 genes (which has been expertly done) or on the round 1 output of Maker?

A final question: I am having memory storage issues with Maker, as it is currently taking up ~15TB of storage with temporary files. I am running Maker on a cluster and whenever my submitted Maker job runs out of memory it fails, so I have to resubmit it about every hour, which leaves a lot of temporary folders (e.g. maker_x6V2y4) in my directory. I notice that some of these temporary files haven't been updated in days - is it okay to delete them?

Thank you so much for your help!
Zoe
______________________________________
Zoe Clarke
PhD candidate in Computational Biology at U of T
Lab profile: http://baderlab.org/Zoe%20Clarke
Personal website: https://zoe-clarke.weebly.com/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
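
For the map_forward question in the message above, one thing that may be worth ruling out is the feature hierarchy of the input file. The sample records start at a "transcript" feature with no parent, and there is no "gene" record. Assuming MAKER's model_gff input expects the usual GFF3 gene -> mRNA -> exon/CDS hierarchy (an assumption here, not something confirmed in this thread), a quick summary of feature types and Parent links can show whether the file matches that layout. The sketch below is hypothetical helper code, not part of MAKER; the function names are made up for illustration, and the sample rows are rebuilt from the records quoted above with tab separators.

```python
import io
from collections import Counter


def parse_attributes(column9):
    """Split a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def summarize_gff3(handle):
    """Return (feature type counts, parentless feature counts, unresolved Parent values).

    Lines that do not have exactly 9 tab-separated columns are skipped, so a
    space-delimited file would produce empty results.
    """
    types = Counter()
    ids = set()
    records = []
    for line in handle:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue
        attrs = parse_attributes(cols[8])
        types[cols[2]] += 1
        if "ID" in attrs:
            ids.add(attrs["ID"])
        records.append((cols[2], attrs))
    # Features with no Parent attribute sit at the top of the hierarchy.
    top_level = Counter(ftype for ftype, attrs in records if "Parent" not in attrs)
    # Parent values that do not match any ID in the file.
    dangling = sorted({
        parent
        for _, attrs in records
        for parent in attrs.get("Parent", "").split(",")
        if parent and parent not in ids
    })
    return types, top_level, dangling


# A few of the records quoted in the message, rebuilt with tab separators.
SEQ, SRC = "WCK01_AAF20200214_F8-ctg36", "ovaltine_v0.13"
rows = [
    (SEQ, SRC, "transcript", "1094446", "1105585", ".", "+", ".", "ID=DIMT1.1;geneID=DIMT1"),
    (SEQ, SRC, "exon", "1094446", "1094521", "97.75", "+", ".", "Parent=DIMT1.1"),
    (SEQ, SRC, "exon", "1094874", "1094947", "97.75", "+", ".", "Parent=DIMT1.1"),
    (SEQ, SRC, "CDS", "1094446", "1094521", ".", "+", "0", "Parent=DIMT1.1"),
    (SEQ, SRC, "CDS", "1094874", "1094947", ".", "+", "2", "Parent=DIMT1.1"),
]
sample = "\n".join("\t".join(row) for row in rows) + "\n"

types, top_level, dangling = summarize_gff3(io.StringIO(sample))
print("feature types:", dict(types))
print("features without a Parent:", dict(top_level))
print("Parent values with no matching ID:", dangling)
```

If the only parentless feature type reported is "transcript", as with these sample rows, the file has no gene-level records. Whether that is what prevents the models from being carried forward is a guess that would need confirming against the MAKER documentation or by the list.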