From carsonhh at gmail.com Fri Jan 8 12:38:56 2021
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 8 Jan 2021 12:38:56 -0700
Subject: [maker-devel] maker-devel post from nanshangogo@gmail.com requires approval
In-Reply-To:
References:
Message-ID:

The result.maker.proteins.fasta and result.maker.transcripts.fasta files contain the filtered annotations, and the other files are for reference purposes (i.e. snap and augustus raw unfiltered output).

The non_overlapping_ab_initio files contain non-redundant ab initio predictions that do not overlap a final annotation (maker file). They were called by a predictor but rejected for lack of support. If you wanted to look for a potential missing gene model, that is where it would be.

--Carson

> On Dec 23, 2020, at 12:03 AM, maker-devel-owner at yandell-lab.org wrote:
>
> As list administrator, your authorization is requested for the
> following mailing list posting:
>
> List: maker-devel at yandell-lab.org
> From: nanshangogo at gmail.com
> Subject: Qusetion about the ab initio gene predictors result
> Reason: Post by non-member to a members-only list
>
> At your convenience, visit:
>
> http://yandell-lab.org/mailman/admindb/maker-devel_yandell-lab.org
>
> to approve or deny the request.
>
> From: nanshan yang
> Subject: Qusetion about the ab initio gene predictors result
> Date: December 23, 2020 at 12:03:35 AM MST
> To: maker-devel at yandell-lab.org
>
>
> Hi MAKER community:
> I have questions about the MAKER output files. I got results from the ab initio gene predictors (snap and augustus, run through maker), and after the fasta_merge step there are these fasta files:
> result.maker.augustus_masked.proteins.fasta
> result.maker.augustus_masked.transcripts.fasta
> result.maker.augustus.proteins.fasta
> result.maker.augustus.transcripts.fasta
> result.maker.non_overlapping_ab_initio.proteins.fasta
> result.maker.non_overlapping_ab_initio.transcripts.fasta
> result.maker.snap.proteins.fasta
> result.maker.snap_masked..transcripts.fasta
> result.maker.snap_masked..proteins.fasta
> result.maker.snap.transcripts.fasta
> result.maker.proteins.fasta
> result.maker.transcripts.fasta
> If I continue to analyze the fasta files, which fasta should I choose?
> Because I chose ab initio gene predictors, can the result.maker.non_overlapping_ab_initio*fasta files be used in the downstream analysis, or should I use result.maker.proteins.fasta?
> Thanks very much for any help or insights
>
>
>
> From: maker-devel-request at yandell-lab.org
> Subject: confirm 102e4151023f6d0466a7c780132e8ca169e86aaa
> Date: December 23, 2020 at 12:03:58 AM MST
>
>
> If you reply to this message, keeping the Subject: header intact,
> Mailman will discard the held message. Do this if the message is
> spam. If you reply to this message and include an Approved: header
> with the list password in it, the message will be approved for posting
> to the list. The Approved: header can also appear in the first line
> of the body of the reply.
>

From carsonhh at gmail.com Fri Jan 8 12:41:35 2021
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 8 Jan 2021 12:41:35 -0700
Subject: [maker-devel] Possible precedence issue with control flow operator at /apps/maker/3.01.03/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 805
In-Reply-To:
References:
Message-ID: <1E264E9F-C843-4275-9D48-B70131C98AA6@gmail.com>

The problem is the BioPerl or even Perl installation --> "Possible precedence issue with control flow operator at /apps/maker/3.01.03/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 805."

You may have a custom install of Perl without BerkeleyDB set up, or you may need to update BioPerl. If you installed MAKER with MiniConda, then it tried to set up a custom Perl and BioPerl that is broken. You probably need to reinstall MAKER. Using the system Perl may be the easiest.

--Carson

> On Dec 9, 2020, at 9:14 AM, Gumbi,Bonginkosi C wrote:
>
> Dear maker support team
>
> I need your help to troubleshoot this error. I don't know what I'm doing wrong, this is my first time annotating a genome. I have gone through almost three maker tutorials online but it's like the annotation doesn't generate the datastore folder and I don't know why because I have provided all the input files. I am running this analysis on the cluster platform. Below I have pasted the slurm script and the error message. Any suggestions and help would be highly appreciated.
>
> Slurm script
> #!/bin/bash
> #SBATCH --account=austin
> #SBATCH --job-name=maker
> #SBATCH --mail-type=ALL
> #SBATCH --mail-user=charlesgumbi at ufl.edu
> #SBATCH --mem=30gb
> #SBATCH --ntasks=1
> #SBATCH --cpus-per-task=2
> #SBATCH --time=48:00:00
> #SBATCH --output=maker%j.out
> #SBATCH --error=maker%j.err
> date;hostname;pwd
>
> #loading modules
> module purge
> module load maker/3.01.03
>
> #running maker
> maker -base natalensis -fix_nucleotides -dsindex maker_bopts.ctl maker_exe.ctl maker_opts.ctl
>
> #making gff3 files
> cd natalensis.maker.output
> gff3_merge -d natalensis.maker.output/natalensis_master_datastore_index.log
> fasta_merge -d natalensis.maker.output/natalensis_master_datastore_index.log.
>
>
> Error file
> Possible precedence issue with control flow operator at /apps/maker/3.01.03/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 805.
> STATUS: Parsing control files...
> STATUS: Processing and indexing input FASTA files...
> STATUS: Setting up database for any GFF3 input...
> A data structure will be created for you at:
> /blue/austin/bonginkosi.gumbi/mastomys/genome/wtdg/annotation/maker/natalensis.maker.output/natalensis_datastore
>
> To access files for individual sequences use the datastore index:
> /blue/austin/bonginkosi.gumbi/mastomys/genome/wtdg/annotation/maker/natalensis.maker.output/natalensis_master_datastore_index.log
>
> ERROR: The file 'natalensis.maker.output/natalensis_master_datastore_index.log' does not exist
> ERROR: The file 'natalensis.maker.output/natalensis_master_datastore_index.log' does not exist
> maker61784163.err (END)
>
>
> Humble regards
> charles

From carsonhh at gmail.com Fri Jan 8 12:46:34 2021
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 8 Jan 2021 12:46:34 -0700
Subject: [maker-devel] Maker failed to annotated whole gene
In-Reply-To:
References:
Message-ID: <8A08AB1B-CA46-4889-B225-47FC11C64D3E@gmail.com>

Look at the evidence and annotation in a browser (Apollo, IGV, etc). If the evidence is not bridging the exons, then that is why. Look at the evidence alignment to see if HSPs repeat the same alignment to multiple spots (looks like 4 exons but is essentially a duplication of identical HSPs). Look at the gene predictor performance across the genome to see if perhaps the gene predictor is not trained well. Zoom in on the evidence alignments in a browser: do you see long strings of NNNN, which mean parts of the assembly are missing so a working annotation cannot be made to bridge the exons? Are there exonerate alignments with canonical splice site support? Or there may be assembly errors generating early stop codons, so the gene predictor cannot create a single model covering the entire locus, but rather generates multiple broken loci.

--Carson

> On Nov 24, 2020, at 7:58 PM, Diana Moreno Santillán wrote:
>
> Hello,
> I noticed that some genes in my maker runs were annotated as fragmented pieces instead of a single gene.
>
> For example, for a gene composed of 4 exons, I was expecting to have the 4 exons concatenated in a single protein sequence. I performed annotations in several species and for some of them I have only one gene annotated, i.e. with the 4 exons merged in a single protein sequence. But for other species, with the same protein evidence and maker.ctl parameters, I got 4 "genes" instead of one. Actually, in the blast results it is pretty clear that they are parts of the same gene at different positions.
> This is an issue because I'm doing gene family expansion and contraction analysis, and the analysis detects this gene as expanded, as if it had 3 more copies, but in reality they are parts of the same gene.
>
> Have you seen this before? Could you help me find a solution?
>
> Thank you
> _______________________________________________
> maker-devel mailing list
> maker-devel at yandell-lab.org
> http://yandell-lab.org/mailman/listinfo/maker-devel_yandell-lab.org

From carsonhh at gmail.com Fri Jan 8 15:41:47 2021
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 8 Jan 2021 15:41:47 -0700
Subject: [maker-devel] processing of simple and complex repeats
In-Reply-To:
References:
Message-ID:

It cannot unless it matches exactly the GFF3 style produced by MAKER itself (including Name, Target, and other GFF3 attributes).

--Carson

> On Dec 24, 2020, at 9:29 AM, Santiago Revale wrote:
>
> Dear Maker developers,
>
> Can Maker distinguish between simple and complex repeats from a gff3 file of pre-aligned repeats?
>
> I'm trying to annotate a genome of a non-model Drosophila species and I've already generated a gff3 file with both simple and complex repeats for this species. I would like to use this gff3 file as input for Repeat Masking so Maker won't have to align repeats from any library. My maker_opts.ctl file looks like this:
>
> #-----Repeat Masking
> model_org=
> rmlib=
> repeat_protein=/path/to/te_proteins.fasta
> rm_gff=/path/to/Dato_genome.Dato-first.full_mask.out.reformat.gff3
> prok_rm=0
> softmask=1
>
> By using softmask=1 I understand that Maker will softmask only low complexity repeats (while complex ones will be hardmasked). My question is whether Maker can distinguish between simple and complex repeats from the gff3 file in order to softmask only simple repeats.
> Also, do you think it would be better to only include complex repeats in the gff3 file and let Maker find simple repeats on its own by using model_org=simple?
>
> Thank you very much in advance.
>
> Best regards,
> Santiago

From michele.vidotto at gmail.com Thu Jan 21 06:30:21 2021
From: michele.vidotto at gmail.com (Michele Vidotto)
Date: Thu, 21 Jan 2021 14:30:21 +0100
Subject: [maker-devel] Issues with locking in MPI mode
Message-ID:

Dear all,

as reported in the subject, I'm having issues with the locking mechanism of MAKER when it runs in parallel mode through MPI.
I'm using maker version 3.01.03, but the same happens on my system when I build and install version 2.31.11.
All prerequisites were installed in a conda environment. Perl was installed from the anaconda channel in version 5.26.2. Hard-coded paths to the compilers were fixed.

Necessary perl modules were installed via cpanm:
"DBD::SQLite", "DBI", "Error", "Error::Simple", "File::NFSLock", "File::Which", "forks", "forks::shared", "Inline", "Inline::C", "IO::All", "IO::Prompt", "LWP::Simple", "Perl::Unsafe::Signals", "PerlIO::gzip", "Proc::Simple", "URI::Escape", "DBD::Pg"

Additional libraries and components were installed via conda:
- gcc_linux-64=7.3.0
- gxx_linux-64=7.3.0
- openmpi=4.1.0
- zlib=1.2.11
- libdb=6.1.26
- expat=2.2.9
- libxml2=2.9.10
- exonerate=2.4.0
- snoscan=1.0
- rapsearch=2.24

Other components were installed manually.
MAKER compiles and installs with no errors, but when I execute the program via MPI with:

# to avoid OPEN MPI segmentation fault
export THREADS_DAEMON_MODEL=1
mpiexec -mca btl ^openib -n 1 \
maker \
-force \
-cpus 8 \
--fix_nucleotides \
maker_opts.ctl \
maker_bopts.ctl \
maker_exe.ctl

it always ends up with the following error:

STATUS: Parsing control files...
ERROR: The directory is locked. Perhaps by an instance of MAKER.
--> rank=NA, hostname=april.corp.igatechnology.com
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[19321,1],0]
Exit code: 10
--------------------------------------------------------------------------

If I look inside the *.maker.output directory, a lock file remains: .NFSLock.gi_lock.NFSLock
If instead I run maker with the -nolock flag, MAKER runs with no problems at all.
My filesystem is OneFS from Isilon, exported to a virtual server through the NFSv4 protocol.
Looking at the code, MAKER uses the File::NFSLock Perl module for locking. This module fails some tests when installed on my system with cpanm:

# Failed test at t/300_bl_sh.t line 115.
Shared locks not running simultaneously at t/300_bl_sh.t line 116, <$rd3> line 18.
# Looks like your test exited with 4 just after 27.
t/300_bl_sh.t ..... Dubious, test returned 4 (wstat 1024, 0x400)
Failed 47/73 subtests
t/400_kill.t ...... ok
t/410_die.t ....... ok
t/420_crash.t ..... ok
t/430_taint.t ..... ok
Test Summary Report
-------------------
t/300_bl_sh.t (Wstat: 1024 Tests: 27 Failed: 1)
Failed test: 27
Non-zero exit status: 4
Parse errors: Bad plan.
You planned 73 tests but ran 27.

But anyway, I was able to install it with the --notest flag.
Do you have any idea how I can overcome my problem and have MAKER run in parallel with MPI?

Thanks in advance,
---
Michele Vidotto
mailto: michele.vidotto at gmail.com

From bonginkosi.gumbi at ufl.edu Tue Jan 19 09:44:56 2021
From: bonginkosi.gumbi at ufl.edu (Gumbi,Bonginkosi C)
Date: Tue, 19 Jan 2021 16:44:56 +0000
Subject: [maker-devel] ERROR: Could not determine if RepBase is installed
Message-ID:

Dear Maker support team

Thank you so much for developing maker and making it available for researchers like me. I am attempting to annotate a genome using your prestigious program. I am running the pipeline on a University server. However, I'm halted by this error: "ERROR: Could not determine if RepBase is installed". I have googled it and read some of the solutions posted by other researchers, such as this one https://github.com/bioconda/bioconda-recipes/issues/16501 and many more. However, this solution didn't work for me since I'm running Maker on the server and I do not have authority to reinstall or update programs. I have talked with the University computing team about this error and unfortunately, as I suspected, the University does not have a RepBase license, which is quite weird for such a big institute. But anyway, one of the solutions on this page is to use NCBI WindowMasker instead of RepeatMasker, which requires the RepBase subscription.

My questions are:
1. Is it possible to use NCBI WindowMasker within the maker pipeline?
2. If yes, how do I specify in the Maker control files (.exe/opts/bopts) that I would like to use NCBI WindowMasker instead of RepeatMasker?
3. If no to my first question, is there any alternative approach within the Maker pipeline that I can use to annotate repeats besides RepeatMasker/RepBase?

Any help would be highly appreciated.
Humble regards
Bongie

From guerrer at uni-duesseldorf.de Fri Jan 22 05:42:25 2021
From: guerrer at uni-duesseldorf.de (Ricardo Nuno Ferreira Martins Guerreiro)
Date: Fri, 22 Jan 2021 13:42:25 +0100
Subject: [maker-devel] Find evidence for specific gene annotation (without any genome browser!)
Message-ID: <6ba7a71ce9970fb0dbf072a61df0f18f@uni-duesseldorf.de>

Hello,

Simple question: How do I know which evidence generated one specific gene? Or even, which evidence is used for its AED calculation?

How do I do this if the gene is predicted directly by a protein2genome alignment? And how when it's an Augustus annotation?

I want to do this without JBrowse, which is a huge waste of time. I want the original name in my input protein set, not the maker name.

Cheers,
Ricardo

From rzhang12 at ncsu.edu Thu Jan 28 08:58:22 2021
From: rzhang12 at ncsu.edu (Ran Zhang)
Date: Thu, 28 Jan 2021 15:58:22 -0000
Subject: [maker-devel] Questions about repeat masking in MAKER
Message-ID: <7B6875AE-4D87-4D7A-9040-933CA3A08397@ncsu.edu>

Hi,

I am using maker to do repeat masking and genome annotation. So far I have an assembly plus EST and protein evidence. What I did was use maker to initialize the three control files and edit only maker_opts.ctl. I set the genome, est and protein inputs. I also set est2genome=1 and protein2genome=1. But the result is always wrong. I am using the high performance cluster at our university, which already has maker installed with MPI. The error looks like this.
I am a little confused about whether maker can do repeat masking automatically for me. Do I have to run RepeatModeler/RepeatMasker first and then run maker with repeat masking skipped? Or do I have to make a specific repeat library, and is this also a required file (although the tutorial said only 3 files are required)?

Thanks a lot!
Ran

setting up GFF3 output and fasta chunks
parseTagData: ID field not to EMBL spec "SNAP-OL2 repeatmasker; DNA; ???; BP. " from DE RepbaseID: SNAP-OL2XX at /usr/local/apps/maker/v2.31.10/exe/RepeatMasker/RepeatMasker line 7611.
doing repeat masking
running repeat masker.
#--------- command -------------#
Widget::RepeatMasker:
cd /tmp/maker_joy_Oe; /usr/local/apps/maker/v2.31.10/exe/RepeatMasker/RepeatMasker /gpfs_common/share03/bonelllin/rzhang12/maker1/f2.renamed.maker.output/f2.renamed_datastore/04/3F/bviridis90138//theVoid.bviridis90138/0/bviridis90138.0.all.rb -species all -dir /gpfs_common/share03/bonelllin/rzhang12/maker1/f2.renamed.maker.output/f2.renamed_datastore/04/3F/bviridis90138//theVoid.bviridis90138/0 -pa 1
#-------------------------------#
parseTagData: ID field not to EMBL spec "SNAP-OL2 repeatmasker; DNA; ???; BP. " from DE RepbaseID: SNAP-OL2XX at /usr/local/apps/maker/v2.31.10/exe/RepeatMasker/RepeatMasker line 7611.
ERROR: RepeatMasker failed
--> rank=12, hostname=n3i4-13
ERROR: Failed while doing repeat masking
ERROR: Chunk failed at level:0, tier_type:1
FAILED CONTIG:bviridis90124
ERROR: Chunk failed at level:2, tier_type:0
FAILED CONTIG:bviridis90124
parseTagData: ID field not to EMBL spec "SNAP-OL2 repeatmasker; DNA; ???; BP. " from DE RepbaseID: SNAP-OL2XX at /usr/local/apps/maker/v2.31.10/exe/RepeatMasker/RepeatMasker line 7611.
parseTagData: ID field not to EMBL spec "SNAP-OL2 repeatmasker; DNA; ???; BP. " from DE RepbaseID: SNAP-OL2XX
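
[Archive note] When a MAKER run under MPI interleaves output from many ranks, as in the log above, it can help to pull out just the contigs that were reported as failed. The following is a minimal Python sketch, not part of MAKER: it assumes only that failures appear in the captured log as "FAILED CONTIG:<name>", exactly as in the excerpt above. The function name failed_contigs and the idea of counting repeats per contig are illustrative choices, not anything MAKER provides.

```python
import re
from collections import Counter

# Matches the "FAILED CONTIG:<name>" marker seen in the log excerpt.
# The contig name is taken to be the run of non-whitespace characters
# that follows the colon (an assumption based on that excerpt only).
FAILED_RE = re.compile(r"FAILED CONTIG:\s*(\S+)")


def failed_contigs(log_text):
    """Return a Counter mapping contig name -> number of failure markers."""
    return Counter(FAILED_RE.findall(log_text))


# Example with two lines copied from the log excerpt.
sample = (
    "ERROR: Chunk failed at level:0, tier_type:1\n"
    "FAILED CONTIG:bviridis90124\n"
    "ERROR: Chunk failed at level:2, tier_type:0\n"
    "FAILED CONTIG:bviridis90124\n"
)
for name, count in sorted(failed_contigs(sample).items()):
    print(name, count)
```

Reading the whole stderr file into a string and passing it to failed_contigs gives one entry per contig, which makes it easier to see whether every contig fails at the repeat masking step (pointing at the RepeatMasker installation) or only a few do.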
URL: From carsonhh at gmail.com Fri Jan 8 12:38:56 2021 From: carsonhh at gmail.com (Carson Holt) Date: Fri, 8 Jan 2021 12:38:56 -0700 Subject: [maker-devel] maker-devel post from nanshangogo@gmail.com requires approval In-Reply-To: References: Message-ID: The result.maker.proteins.fasta and result.maker.transcripts.fasta files contain the filtered annotations, and the other files are for reference purposes (i.e. snap and augustus raw unfiltered output). The non_overlapping_ab_initio files contain non-redundant ab initio predictions that do not overlap a final annotation (maker file). They were called by a predictor but rejected for lack of support. If you wanted to look for a potential missing gene model, that is where it would be. ?Carson > On Dec 23, 2020, at 12:03 AM, maker-devel-owner at yandell-lab.org wrote: > > As list administrator, your authorization is requested for the > following mailing list posting: > > List: maker-devel at yandell-lab.org > From: nanshangogo at gmail.com > Subject: Qusetion about the ab initio gene predictors result > Reason: Post by non-member to a members-only list > > At your convenience, visit: > > http://yandell-lab.org/mailman/admindb/maker-devel_yandell-lab.org > > to approve or deny the request. 
> > From: nanshan yang > Subject: Qusetion about the ab initio gene predictors result > Date: December 23, 2020 at 12:03:35 AM MST > To: maker-devel at yandell-lab.org > > > Hi MAKER community : > I have questions about MAKER output files.I get result from ab initio gene predictors which use snap and augustus by maker,and after fasta_merge step,there are some fasta files as: > result.maker.augustus_masked.proteins.fasta > result.maker.augustus_masked.transcripts.fasta > result.maker.augustus.proteins.fasta > result.maker.augustus.transcripts.fasta > result.maker.non_overlapping_ab_initio.proteins.fasta > result.maker.non_overlapping_ab_initio.transcripts.fasta > result.maker.snap.proteins.fasta > result.maker.snap_masked..transcripts.fasta > result.maker.snap_masked..proteins.fasta > result.maker.snap.transcripts.fasta > result.maker.proteins.fasta > result.maker.transcripts.fasta > if i continue to analysis the fasta files,which fasta should i choose? > because i choose ab initio gene predictors,so the result .maker.non_overlapping_ab_initio*fasta can be uesed into the downstream analysis?or the result.maker.proteins.fasta > Thanks verymuch for any help or insights > > > > From: maker-devel-request at yandell-lab.org > Subject: confirm 102e4151023f6d0466a7c780132e8ca169e86aaa > Date: December 23, 2020 at 12:03:58 AM MST > > > If you reply to this message, keeping the Subject: header intact, > Mailman will discard the held message. Do this if the message is > spam. If you reply to this message and include an Approved: header > with the list password in it, the message will be approved for posting > to the list. The Approved: header can also appear in the first line > of the body of the reply. > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 1376 bytes Desc: not available URL: From carsonhh at gmail.com Fri Jan 8 12:41:35 2021 From: carsonhh at gmail.com (Carson Holt) Date: Fri, 8 Jan 2021 12:41:35 -0700 Subject: [maker-devel] Possible precedence issue with control flow operator at /apps/maker/3.01.03/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 805 In-Reply-To: References: Message-ID: <1E264E9F-C843-4275-9D48-B70131C98AA6@gmail.com> The problem is the bioperl or even perl installation ?> ?Possible precedence issue with control flow operator at /apps/maker/3.01.03/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 805." You may have a custom install of perl without BerkleyDB setup, or need to update BioPerl. If you installed MAKER with MiniConda, then it tried to setup a custom perl and BioPerl that is broken. You probably need to reinstall MAKER. Using the system perl may be the easiest. ?Carson > On Dec 9, 2020, at 9:14 AM, Gumbi,Bonginkosi C wrote: > > ?Dear maker support team > > I need your help to troubleshoot this error. I don't know what I'm doing wrong, this is my first time annotating a genome. I have gone through almost three maker tutorials online but it's like the annotation doesn't generate the datastore folder and I don't know why because I have provided all the input files. I am running this analysis on the cluster platform. Below I have pasted the slurm script and the error message. Any suggestions and help would be highly appropriated. 
> > Slurm script > #!/bin/bash > #SBATCH --account=austin > #SBATCH --job-name=maker > #SBATCH --mail-type=ALL > #SBATCH --mail-user=charlesgumbi at ufl.edu > #SBATCH --mem=30gb > #SBATCH --ntasks=1 > #SBATCH --cpus-per-task2 > #SBATCH --time=48:00:00 > #SBATCH --output=maker%j.out > #SBATCH --error=maker%j.err > date;hostname;pwd > > #loading modules > module purge > module load maker/3.01.03 > > #runing maker > maker -base natalensis -fix_nucleotides -dsindex maker_bopts.ctl maker_exe.ctl maker_opts.ctl > > #making gff3 files > cd natalensis.maker.output > gff3_merge -d natalensis.maker.output/natalensis_master_datastore_index.log > fasta_merge -d natalensis.maker.output/natalensis_master_datastore_index.log. > > > Error file > Possible precedence issue with control flow operator at /apps/maker/3.01.03/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 805. > STATUS: Parsing control files... > STATUS: Processing and indexing input FASTA files... > STATUS: Setting up database for any GFF3 input... > A data structure will be created for you at: > /blue/austin/bonginkosi.gumbi/mastomys/genome/wtdg/annotation/maker/natalensis.maker.output/natalensis_datastore > > To access files for individual sequences use the datastore index: > /blue/austin/bonginkosi.gumbi/mastomys/genome/wtdg/annotation/maker/natalensis.maker.output/natalensis_master_datastore_index.log > > ERROR: The file 'natalensis.maker.output/natalensis_master_datastore_index.log' does not exist > ERROR: The file 'natalensis.maker.output/natalensis_master_datastore_index.log' does not exist > maker61784163.err (END) > > > Humble regards > charles -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 1376 bytes Desc: not available URL: From carsonhh at gmail.com Fri Jan 8 12:46:34 2021 From: carsonhh at gmail.com (Carson Holt) Date: Fri, 8 Jan 2021 12:46:34 -0700 Subject: [maker-devel] Maker failed to annotated whole gene In-Reply-To: References: Message-ID: <8A08AB1B-CA46-4889-B225-47FC11C64D3E@gmail.com> Look at the evidence and annotation in a browser (Apollo, IGV, etc). If the evidence is not bridging the exons, then that is why. Look at the evidence alignment to see if HSP?s repeat the same alignment to multiple spots (looks like 4 exons but is essentially a duplication of identical HSPs). Look at the gene predictor performance across the genome to see if perhaps the gene predictor is not trained well. Zoom in on the evidence alignments in a browser, do you see long strings of NNNN that mean parts of the assembly are missing so a working annotation cannot be made to bridge the exons. Are there exonerate alignments with canonical splice site support. Or there may be assembly errors generating early stop codons, so the gene predictor cannot create a single model covering the entire locus, but rather generates multiple broken loci. ?Carson > On Nov 24, 2020, at 7:58 PM, Diana Moreno Santill?n wrote: > > Hello, > I noticed that some genes on my maker runs were annotated like fragmented pieces instead of a single gene. > > For example, for a gene composed by 4 exons, I was expecting to have the 4 exons concatenated in a single protein sequence. I performed annotations in several species and for some of them I have only one gene annotated, i.e with the 4 exons merged in a single protein sequence. But for other species, with the same protein evidence and maker.ctl parameters I got 4 "genes" evidence instead of one. Actually on the blast results is pretty clear how they are part of the same gene at different positions. 
This is an issue because I'm doing gene families expansions and contraction and the analysis detects as this gene is being expanded, as it has 3 more copies, but in reality they are part of the same gene. > > Have you seen this before? Could you help me to seek for a solution? > > Thank you > _______________________________________________ > maker-devel mailing list > maker-devel at yandell-lab.org > http://yandell-lab.org/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 1376 bytes Desc: not available URL: From carsonhh at gmail.com Fri Jan 8 15:41:47 2021 From: carsonhh at gmail.com (Carson Holt) Date: Fri, 8 Jan 2021 15:41:47 -0700 Subject: [maker-devel] processing of simple and complex repeats In-Reply-To: References: Message-ID: It cannot unless it matches exactly the GFF3 style produced by MAKER itself (including Name, Target, and other GFF3 attributes). ?Carson > On Dec 24, 2020, at 9:29 AM, Santiago Revale wrote: > > Dear Maker developers, > > Can Maker distinguish between simple and complex repeats from a gff3 file of pre-aligned repeats? > > I'm trying to annotate a genome of a non-model Drosophila species and I've already generated a gff3 file with both simple and complex repeats for this species. I would like to use this gff3 file as input for Repeat Masking so Maker won't have to align repeats from any library. My maker_opts.ctl file looks like this: > > #-----Repeat Masking > model_org= > rmlib= > repeat_/path/to/te_proteins.fasta > rm_gff=/path/to/Dato_genome.Dato-first.full_mask.out.reformat.gff3 > prok_rm=0 > softmask=1 > > By using softmask=1 I understand that Maker will softmask only low complexity repeats (while complex ones will be hardmasked). My question is whether Maker can distinguish between simple and complex repeats from the gff3 file in order to softmask only simple repeats. 
Also, do you think it would be better to only include complex repeats in the gff3 file and let Maker find simple repeats on its own by using model_org=simple? > > Thank you very much in advance. > > Best regards, > Santiago > > _______________________________________________ > maker-devel mailing list > maker-devel at yandell-lab.org > http://yandell-lab.org/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 1376 bytes Desc: not available URL: From michele.vidotto at gmail.com Thu Jan 21 06:30:21 2021 From: michele.vidotto at gmail.com (Michele Vidotto) Date: Thu, 21 Jan 2021 14:30:21 +0100 Subject: [maker-devel] Issues with locking in MPI mode Message-ID: Dear all, as reported in the subject I'm having issues with locking mechanism of MAKER when it is runs in parallel-mode through mpi. I'm using maker version 3.01.03 but the same happens in my system when I build and install version 2.31.11. All prerequisites were installed in a conda environment. Perl was installed from anaconda channel in version 5.26.2. Hard-coded paths to the compilers were fixed. Necessary perl modules were installed via cpanm: "DBD::SQLite", "DBI", "Error", "Error::Simple", "File::NFSLock", "File::Which", "forks", "forks::shared", "Inline", "Inline::C", "IO::All", "IO::Prompt", "LWP::Simple" "Perl::Unsafe::Signals", "PerlIO::gzip", "Proc::Simple", "URI::Escape", "DBD::Pg" additional libraries and components were installed via conda - gcc_linux-64=7.3.0 - gxx_linux-64=7.3.0 - openmpi=4.1.0 - zlib=1.2.11 - libdb=6.1.26 - expat=2.2.9 - libxml2=2.9.10 - exonerate=2.4.0 - snoscan=1.0 - rapsearch=2.24 other components were installed manually. 
MAKER compiles and installs with no errors, but when I execute the program via MPI with:

# to avoid Open MPI segmentation fault
export THREADS_DAEMON_MODEL=1
mpiexec -mca btl ^openib -n 1 \
maker \
-force \
-cpus 8 \
--fix_nucleotides \
maker_opts.ctl \
maker_bopts.ctl \
maker_exe.ctl

it always ends up with the following error:

STATUS: Parsing control files...
ERROR: The directory is locked. Perhaps by an instance of MAKER.
--> rank=NA, hostname=april.corp.igatechnology.com
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

Process name: [[19321,1],0]
Exit code: 10
--------------------------------------------------------------------------

If I look inside the *.maker.output directory, a lock file remains:

.NFSLock.gi_lock.NFSLock

If instead I run maker with the -nolock flag, MAKER runs with no problems at all.

My filesystem is oneFS from ISILON, exported to a virtual server through the nfs4 protocol. Looking at the code, MAKER uses the File::NFSLock Perl module for locking. This module fails some tests when installed on my system with cpanm:

# Failed test at t/300_bl_sh.t line 115.
Shared locks not running simultaneously at t/300_bl_sh.t line 116, <$rd3> line 18.
# Looks like your test exited with 4 just after 27.
t/300_bl_sh.t ..... Dubious, test returned 4 (wstat 1024, 0x400)
Failed 47/73 subtests
t/400_kill.t ...... ok
t/410_die.t ....... ok
t/420_crash.t ..... ok
t/430_taint.t ..... ok

Test Summary Report
-------------------
t/300_bl_sh.t (Wstat: 1024 Tests: 27 Failed: 1)
Failed test: 27
Non-zero exit status: 4
Parse errors: Bad plan.
You planned 73 tests but ran 27.

But anyway, I was able to install it with the --notest flag.

Do you have any idea how I can overcome this problem and have MAKER run in parallel with MPI?

Thanks in advance,

---
Michele Vidotto
mailto: michele.vidotto at gmail.com

From bonginkosi.gumbi at ufl.edu Tue Jan 19 09:44:56 2021
From: bonginkosi.gumbi at ufl.edu (Gumbi,Bonginkosi C)
Date: Tue, 19 Jan 2021 16:44:56 +0000
Subject: [maker-devel] ERROR: Could not determine if RepBase is installed
Message-ID:

Dear Maker support team

Thank you so much for developing maker and making it available for researchers like me. I am attempting to annotate a genome using your prestigious program. I am running the pipeline on a University server. However, I'm halted by this error: "ERROR: Could not determine if RepBase is installed". I have googled it and read some of the solutions posted by other researchers, such as this one https://github.com/bioconda/bioconda-recipes/issues/16501 and many more. However, this solution didn't work for me since I'm running Maker on the server and I do not have authority to reinstall or update programs. I have talked with the University computing team about this error and unfortunately, as I suspected, the University does not have a RepBase license, which is quite weird for such a big institute. But anyway, one of the solutions on this page is to use NCBI WindowMasker instead of RepeatMasker, which requires the RepBase subscription.

My questions are:

1. Is it possible to use NCBI WindowMasker within the maker pipeline?
2. If yes, how do I specify in the Maker control files (.exe/opts/bopts) that I would like to use NCBI WindowMasker instead of RepeatMasker?
3. If no to my first question, is there any alternative approach within the Maker pipeline that I can use to annotate repeats besides RepeatMasker/RepBase?

Any help would be highly appreciated.
Humble regards
Bongie

From guerrer at uni-duesseldorf.de Fri Jan 22 05:42:25 2021
From: guerrer at uni-duesseldorf.de (Ricardo Nuno Ferreira Martins Guerreiro)
Date: Fri, 22 Jan 2021 13:42:25 +0100
Subject: [maker-devel] Find evidence for specific gene annotation (without any genome browser!)
Message-ID: <6ba7a71ce9970fb0dbf072a61df0f18f@uni-duesseldorf.de>

Hello,

Simple question: How do I know which evidence generated one specific gene? Or even, which evidence is used for its AED calculation?

How do I do this if the gene is predicted directly by a prot2genome alignment? And how when it's an Augustus annotation?

I want to do this without JBrowse, which is a huge waste of time. I want the original name in my input protein set, not the maker name.

Cheers,
Ricardo

From rzhang12 at ncsu.edu Thu Jan 28 08:58:22 2021
From: rzhang12 at ncsu.edu (Ran Zhang)
Date: Thu, 28 Jan 2021 15:58:22 -0000
Subject: [maker-devel] Questions about repeat masking in MAKER
Message-ID: <7B6875AE-4D87-4D7A-9040-933CA3A08397@ncsu.edu>

Hi,

I am using maker to do repeat masking and genome annotation. So far I have the assembly, EST and protein evidence. What I did was use maker to initialize the three control files and edit only maker_opts.ctl. I put in the genome, EST and proteins. I also set est2genome=1 and protein2genome=1. But the run always fails. I used the high performance cluster at our university, which already has maker installed with MPI. The error is shown below.
I am a little confused: shouldn't maker be able to do repeat masking automatically for me? Do I have to run RepeatModeler/RepeatMasker first and then run maker with repeat masking skipped? Or do I have to build a specific repeat library, and is this also a required file (although the tutorial said only 3 files are required)?

Thanks a lot!

Ran

setting up GFF3 output and fasta chunks
parseTagData: ID field not to EMBL spec "SNAP-OL2 repeatmasker; DNA; ???; BP. " from DE RepbaseID: SNAP-OL2XX at /usr/local/apps/maker/v2.31.10/exe/RepeatMasker/RepeatMasker line 7611.
doing repeat masking
running repeat masker.
#--------- command -------------#
Widget::RepeatMasker:
cd /tmp/maker_joy_Oe; /usr/local/apps/maker/v2.31.10/exe/RepeatMasker/RepeatMasker /gpfs_common/share03/bonelllin/rzhang12/maker1/f2.renamed.maker.output/f2.renamed_datastore/04/3F/bviridis90138//theVoid.bviridis90138/0/bviridis90138.0.all.rb -species all -dir /gpfs_common/share03/bonelllin/rzhang12/maker1/f2.renamed.maker.output/f2.renamed_datastore/04/3F/bviridis90138//theVoid.bviridis90138/0 -pa 1
#-------------------------------#
parseTagData: ID field not to EMBL spec "SNAP-OL2 repeatmasker; DNA; ???; BP. " from DE RepbaseID: SNAP-OL2XX at /usr/local/apps/maker/v2.31.10/exe/RepeatMasker/RepeatMasker line 7611.
ERROR: RepeatMasker failed
--> rank=12, hostname=n3i4-13
ERROR: Failed while doing repeat masking
ERROR: Chunk failed at level:0, tier_type:1
FAILED CONTIG:bviridis90124

ERROR: Chunk failed at level:2, tier_type:0
FAILED CONTIG:bviridis90124

parseTagData: ID field not to EMBL spec "SNAP-OL2 repeatmasker; DNA; ???; BP. " from DE RepbaseID: SNAP-OL2XX at /usr/local/apps/maker/v2.31.10/exe/RepeatMasker/RepeatMasker line 7611.
parseTagData: ID field not to EMBL spec "SNAP-OL2 repeatmasker; DNA; ???; BP. " from DE RepbaseID: SNAP-OL2XX
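
When a log like the one above comes from an MPI run, messages from several ranks are interleaved, so it can help to first tally which contigs actually failed. The following is only a minimal sketch in Python, not part of MAKER: the function name failed_contigs is hypothetical, and the only assumption about the log is that failures are reported as "FAILED CONTIG:<name>", as in the excerpt above.

```python
import re
from collections import Counter

# Matches the "FAILED CONTIG:<name>" lines seen in the log excerpt above.
# This pattern is taken from that excerpt only; other MAKER versions may
# format the message differently.
FAILED_CONTIG = re.compile(r"FAILED CONTIG:\s*(\S+)")


def failed_contigs(log_text):
    """Return a Counter mapping contig name -> number of FAILED CONTIG lines."""
    return Counter(FAILED_CONTIG.findall(log_text))


if __name__ == "__main__":
    # Sample text copied from the log in the message above.
    sample = (
        "ERROR: RepeatMasker failed\n"
        "ERROR: Chunk failed at level:0, tier_type:1\n"
        "FAILED CONTIG:bviridis90124\n"
        "ERROR: Chunk failed at level:2, tier_type:0\n"
        "FAILED CONTIG:bviridis90124\n"
    )
    for name, count in failed_contigs(sample).items():
        print(name, count)
```

A contig reported more than once, as bviridis90124 is here, shows up as a single entry with its count, which makes it easier to see whether every contig fails at the repeat masking step or only a few.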