From qlian003 at ucr.edu Wed Jan 3 19:52:26 2018
From: qlian003 at ucr.edu (Qihua Liang)
Date: Wed, 3 Jan 2018 17:52:26 -0800
Subject: [maker-devel] questions on master_datastore_index.log file
Message-ID:

Dear Maker Develop Team,

I have successfully run MAKER several times before. But I came across a strange thing a few days ago when I ran MAKER again on a different assembly with the same input files and settings.

I saw the message "Maker is now finished!!!" but got an empty GFF3 and no fasta files. I then checked master_datastore_index.log and saw a lot of "failed" entries, followed by "retry" and then "failed" again. What does this mean? Since I used the same inputs as in previous successful runs, could you provide some instructions on how to debug and solve it?

Thank you so much
Qihua

From o.k.torresen at ibv.uio.no Thu Jan 4 07:21:28 2018
From: o.k.torresen at ibv.uio.no (Ole Kristian Tørresen)
Date: Thu, 4 Jan 2018 13:21:28 +0000
Subject: [maker-devel] Names/IDs from pred_gff are included in final gff
Message-ID: <76613C81-ED9B-45F2-B84B-B60BC1D4D972@ibv.uio.no>

Hi,
as far as I can see, names or IDs of features in GFFs given to pred_gff are included in the final output as the name of the feature. As far as I understand, this is not expected behaviour (it is for model_gff). This is with MAKER 2.31.9.

I have these settings:
map_forward=0
keep_preds=1

I thought that map_forward had to be 1 to get the names from the old GFFs. Can you replicate this?

Thank you.

Sincerely,
Ole K. Tørresen

From d.ence at ufl.edu Thu Jan 4 08:16:42 2018
From: d.ence at ufl.edu (Ence,daniel)
Date: Thu, 4 Jan 2018 14:16:42 +0000
Subject: [maker-devel] questions on master_datastore_index.log file
In-Reply-To:
References:
Message-ID:

Hi,

Before we can give any help to debug it, we need the error messages. These should be in the same file that the "maker is finished" message is in.
Look for the first error message (the one closest to the top of the file) and send that to the mailing list.

Thanks,
Daniel

From qlian003 at ucr.edu Thu Jan 4 15:36:18 2018
From: qlian003 at ucr.edu (Qihua Liang)
Date: Thu, 4 Jan 2018 13:36:18 -0800
Subject: [maker-devel] questions on master_datastore_index.log file
In-Reply-To:
References:
Message-ID:

Hi Ence,

When I searched for "E/error" in the output file, here is what first showed up:

Process::MpiChunk::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415
eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407
Error::subs::try(CODE(0x502bbb0), HASH(0x5007788)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 4269
Process::MpiChunk::_go(Process::MpiChunk=HASH(0x50a1a18), "flow", HASH(0x50ad0f0), 2, 0) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 378
Process::MpiChunk::_flow(Process::MpiChunk=HASH(0x50a1a18), HASH(0x50ad0f0), 2, 0, Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 318
Process::MpiTiers::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415
eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407
Error::subs::try(CODE(0x50a9348), HASH(0x4ff0ec0)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 338
Process::MpiTiers::_next_level(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 179

Is this what you may need?

Qihua

From carsonhh at gmail.com Fri Jan 5 21:22:56 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 5 Jan 2018 20:22:56 -0700
Subject: [maker-devel] questions on master_datastore_index.log file
In-Reply-To:
References:
Message-ID:

That's the stack trace. The error is going to be a few lines further back. It would be best to get a few hundred lines right around the area you are showing.

--Carson

From carsonhh at gmail.com Tue Jan 16 12:15:29 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 16 Jan 2018 11:15:29 -0700
Subject: [maker-devel] Names/IDs from pred_gff are included in final gff
In-Reply-To: <76613C81-ED9B-45F2-B84B-B60BC1D4D972@ibv.uio.no>
References: <76613C81-ED9B-45F2-B84B-B60BC1D4D972@ibv.uio.no>
Message-ID: <8CB421A6-3CB0-4539-B55A-D3F4CA61D0AD@gmail.com>

pred_gff will maintain its name in the match/match_part feature, as the information is pulled directly from the input GFF3. But any pred_gff feature that becomes a final model will be renamed to something like "scaffold_1517-pred_gff_GeneMark.hmm-gene-0.6-mRNA-1" unless you specify map_forward=1 to maintain the original name.

--Carson

From o.k.torresen at ibv.uio.no Wed Jan 17 11:52:13 2018
From: o.k.torresen at ibv.uio.no (Ole Kristian Tørresen)
Date: Wed, 17 Jan 2018 17:52:13 +0000
Subject: [maker-devel] Names/IDs from pred_gff are included in final gff
In-Reply-To: <8CB421A6-3CB0-4539-B55A-D3F4CA61D0AD@gmail.com>
References: <76613C81-ED9B-45F2-B84B-B60BC1D4D972@ibv.uio.no> <8CB421A6-3CB0-4539-B55A-D3F4CA61D0AD@gmail.com>
Message-ID: <583A84D5-B979-4C2F-B262-2D55A6F55B56@ibv.uio.no>

Ok, but I have an entry in the final gff like this:

ID=maker-GmG20150304_scaffold_2371-pred_gff_maker-gene-0.125-mRNA-1;Parent=maker-GmG20150304_scaffold_2371-pred_gff_maker-gene-0.125;Name=ENSGMOT00000000668.1;_AED=0.00;_eAED=0.00;_QI=819|1|1|1|1|1|4|112|726;score=89.75616

(The name is derived from a pred_gff entry which is the result of mapping an old annotation to the new assembly.) This is then called

ENSGMOT00000000668.1 protein AED:0.00 eAED:0.00 QI:819|1|1|1|1|1|4|112|726

in the proteins.fasta file. Which is unfortunate, because it apparently mapped to 12 places in the assembly.

I have set map_forward=0, but keep_preds=1 (filtering on domain presence and AED score later). This file and another one (the result of genemark_gtf2gff3) are not input to MAKER as match/match_part, but as gene/exon/CDS/mRNA. Could that be the issue?
Ole

From qlian003 at ucr.edu Sat Jan 6 17:09:55 2018
From: qlian003 at ucr.edu (Qihua Liang)
Date: Sat, 6 Jan 2018 15:09:55 -0800
Subject: [maker-devel] questions on master_datastore_index.log file
In-Reply-To:
References:
Message-ID: <0BECB285-BB11-4F46-B6D7-072640F311B2@ucr.edu>

Hi Carson,

I am pasting more lines of the error messages. I noticed this error:

ERROR: Can't open seq file: /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/query.masked.gff.seq

The name of this "ScsGwly" sequence is ">ScsGwly_6124;HRSCAF=6247". Is it the sequence naming that makes the temp file name look weird?
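For reference, `%3B` is the percent-encoded form of `;` (and `%2E`, seen elsewhere in the log, of `.`), so the odd-looking directory name is simply the sequence ID with some characters escaped. A minimal sketch of that mapping, assuming ordinary URL-style percent-encoding; MAKER's own escaping routine is not shown in this thread and may treat other characters differently:

```python
from urllib.parse import quote, unquote

seq_id = "ScsGwly_6124;HRSCAF=6247"

# Leaving '=' unescaped mirrors the paths in the log above; that choice is an
# assumption made for this illustration, not MAKER's documented behaviour.
escaped = quote(seq_id, safe="=")

print(escaped)           # ScsGwly_6124%3BHRSCAF=6247
print(unquote(escaped))  # ScsGwly_6124;HRSCAF=6247
```

The escaped name is a legal file name, so the escaping alone does not explain a missing file.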
Thanks
Qihua

#--------- command -------------#
Widget::blastx:
/24-2/home/qliang/0.soft/maker/bin/../exe/blast/bin/blastx -db /tmp/maker_nJDkCL/te_proteins%2Efasta.mpi.10.9 -query /tmp/maker_nJDkCL/0/ScsGwly_5932%3BHRSCAF=6050.0 -num_alignments 10000 -num_descriptions 10000 -evalue 1e-06 -dbsize 300 -searchsp 500000000 -num_threads 1 -seg yes -soft_masking true -lcase_masking -show_gis -out /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/D7/4A/ScsGwly_5932%3BHRSCAF=6050//theVoid.ScsGwly_5932%3BHRSCAF=6050/0/ScsGwly_5932%3BHRSCAF=6050.0.te_proteins%2Efasta.repeatrunner.temp_dir/te_proteins%2Efasta.mpi.10.9.repeatrunner
#-------------------------------#
deleted:0 hits
collecting blastx repeatmasking
processing all repeats
in cluster::shadow_cluster...
...finished clustering.
preparing masked sequence
preparing ab-inits
running snap.
#--------- command -------------#
Widget::snap:
/24-2/home/qliang/0.soft/maker/exe/snap/snap /home/qliang/cowpea/annotation/09.tingting/4.Abintio/2.CEGMA/3.maker/maker1.hmm/maker1.snap.hmm /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0 > /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0.maker1%2Esnap%2Ehmm.snap
#-------------------------------#
scoring....decoding.10.20.30.40.50.60.70.80.90.100 done
scoring....decoding.10.20.30.40.50.60.70.80.90.100 done
running augustus.
#--------- command -------------#
Widget::augustus:
/usr/local/augustus.2.7/bin/augustus --species=cowpea_new --UTR=off /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0 > /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0.cowpea_new.augustus
#-------------------------------#
deleted:0 hits
collecting blastx repeatmasking
processing all repeats
in cluster::shadow_cluster...
...finished clustering.
deleted:0 hits
doing blastx repeats
running blast search.
#--------- command -------------#
Widget::blastx:
/24-2/home/qliang/0.soft/maker/bin/../exe/blast/bin/blastx -db /tmp/maker_mvdRkd/te_proteins%2Efasta.mpi.10.6 -query /tmp/maker_mvdRkd/0/chr10.75 -num_alignments 10000 -num_descriptions 10000 -evalue 1e-06 -dbsize 300 -searchsp 500000000 -num_threads 1 -seg yes -soft_masking true -lcase_masking -show_gis -out /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/09/chr10//theVoid.chr10/7/chr10.75.te_proteins%2Efasta.repeatrunner.temp_dir/te_proteins%2Efasta.mpi.10.6.repeatrunner
#-------------------------------#
doing blastx repeats
re reading blast report.
/24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/0/ScsGwly_6124%3BHRSCAF=6247.0.te_proteins%2Efasta.repeatrunner
deleted:0 hits
doing blastx repeats
doing blastx repeats
doing blastx repeats
doing blastx repeats
doing blastx repeats
doing blastx repeats
doing blastx repeats
doing blastx repeats
doing blastx repeats
collecting blastx repeatmasking
processing all repeats
in cluster::shadow_cluster...
...finished clustering.
ERROR: Can't open seq file: /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/query.masked.gff.seq
No such file or directory

at /24-2/home/qliang/0.soft/maker/bin/../lib/Dumper/GFF/GFFV3.pm line 199.
Dumper::GFF::GFFV3::finalize(Dumper::GFF::GFFV3=HASH(0x5000ab8)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 700
Process::MpiChunk::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415
eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407
Error::subs::try(CODE(0x502bbb0), HASH(0x5007788)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 4269
Process::MpiChunk::_go(Process::MpiChunk=HASH(0x50a1a18), "flow", HASH(0x50ad0f0), 2, 0) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 378
Process::MpiChunk::_flow(Process::MpiChunk=HASH(0x50a1a18), HASH(0x50ad0f0), 2, 0, Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 318
Process::MpiTiers::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415
eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407
Error::subs::try(CODE(0x50a9348), HASH(0x4ff0ec0)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 338
Process::MpiTiers::_next_level(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 179
Process::MpiTiers::next_chunk(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 286
Process::MpiTiers::run_all(Process::MpiTiers=HASH(0x4fb3350), 0) called at /home/qliang/0.soft/maker/bin/maker line 695
--> rank=NA, hostname=H4
ERROR: Failed while builing masking tiers
--> rank=NA, hostname=H4
--> rank=NA, hostname=H4
ERROR: Can not get next level
running genemark.
#--------- command -------------#
Widget::genemark:
/24-2/home/qliang/0.soft/PerlPackages/ActivePerl-5.22/bin/perl-static /24-2/home/qliang/0.soft/maker/bin/../lib/Widget/genemark/gmhmm_wrap -m /home/qliang/cowpea/annotation/05.CEGMA/2.genemask/output/gmhmm.mod -g /24-2/home/qliang/0.soft/makerPackages/gm_et_linux_64/gmes_petap/gmhmme3 -p /24-2/home/qliang/0.soft/makerPackages/gm_et_linux_64/gmes_petap/probuild -o /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_nomask.0.gmhmm%2Emod.genemark /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_nomask.0
#-------------------------------#
FAILED CONTIG:ScsGwly_6124;HRSCAF=6247

examining contents of the fasta file and run log



--Next Contig--

#---------------------------------------------------------------------
Now starting the contig!!
SeqID: ScsGwly_6140;HRSCAF=6263
Length: 1247
#---------------------------------------------------------------------

From carsonhh at gmail.com Tue Jan 9 11:14:05 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 9 Jan 2018 10:14:05 -0700
Subject: [maker-devel] questions on master_datastore_index.log file
In-Reply-To: <0BECB285-BB11-4F46-B6D7-072640F311B2@ucr.edu>
References: <0BECB285-BB11-4F46-B6D7-072640F311B2@ucr.edu>
Message-ID: <0E5E8721-E814-4BA5-891B-B1C312BC0D4A@gmail.com>

Your contig names may create issues, specifically the ";" character, but you should also remove the "=" character. However, I believe your problem may be IO. If you are running under MPI or are running multiple jobs, one of the machines may have that location unmounted, the disk may be full, you may have hit a system file quota limit, or the IO load may be slowing writes so much that the file is not actually finished being written when MAKER tries to read it. If IO load is the issue, then you just need to run fewer processes.
The other possibilities would mean you need to make space, fix the mount, or raise any quotas on your systems.

Thanks,
Carson

From qlian003 at ucr.edu Tue Jan 9 12:10:49 2018
From: qlian003 at ucr.edu (Qihua Liang)
Date: Tue, 9 Jan 2018 10:10:49 -0800
Subject: [maker-devel] questions on master_datastore_index.log file
In-Reply-To: <0E5E8721-E814-4BA5-891B-B1C312BC0D4A@gmail.com>
References: <0BECB285-BB11-4F46-B6D7-072640F311B2@ucr.edu> <0E5E8721-E814-4BA5-891B-B1C312BC0D4A@gmail.com>
Message-ID: <87A06F3B-82C1-4B21-906E-69DC1308DEC6@ucr.edu>

Hi Carson,

I just checked with the system administrator and we think the disk space is fine. I also ran another attempt with far fewer processors a few days ago and had the same issue. Maybe I will try renaming the contigs to see how a new attempt works? Or do you have any other suggestions?

Thank you!
Qihua
>> preparing masked sequence >> preparing ab-inits >> running snap. >> #--------- command -------------# >> Widget::snap: >> /24-2/home/qliang/0.soft/maker/exe/snap/snap /home/qliang/cowpea/annotation/09.tingting/4.Abintio/2.CEGMA/3.maker/maker1.hmm/maker1.snap.hmm >> /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0 > /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0.maker1%2Esnap%2Eh >> mm.snap >> #-------------------------------# >> scoring....decoding.10.20.30.40.50.60.70.80.90.100 done >> scoring....decoding.10.20.30.40.50.60.70.80.90.100 done >> running augustus. >> #--------- command -------------# >> Widget::augustus: >> /usr/local/augustus.2.7/bin/augustus --species=cowpea_new --UTR=off /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0 > /tmp/maker >> _nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0.cowpea_new.augustus >> #-------------------------------# >> deleted:0 hits >> collecting blastx repeatmasking >> processing all repeats >> in cluster::shadow_cluster... >> ...finished clustering. >> deleted:0 hits >> doing blastx repeats >> running blast search. >> #--------- command -------------# >> Widget::blastx: >> /24-2/home/qliang/0.soft/maker/bin/../exe/blast/bin/blastx -db /tmp/maker_mvdRkd/te_proteins%2Efasta.mpi.10.6 -query /tmp/maker_mvdRkd/0/chr10.75 -num_alignments 10000 -num_descriptions 10000 -evalue 1e-06 -dbsize 300 -searchsp 500000000 -num_threads 1 -seg yes -soft_masking true -lcase_masking -show_gis -out /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/09/chr10//theVoid.chr10/7/chr10.75.te_proteins%2Efasta.repeatrunner.temp_dir/te_proteins%2Efasta.mpi.10.6.repeatrunner >> #-------------------------------# >> doing blastx repeats >> re reading blast report. 
>> /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/0/ScsGwly_6124%3BHRSCAF=6247.0.te_proteins%2Efasta.repeatrunner >> deleted:0 hits >> doing blastx repeats >> doing blastx repeats >> doing blastx repeats >> doing blastx repeats >> doing blastx repeats >> doing blastx repeats >> doing blastx repeats >> doing blastx repeats >> doing blastx repeats >> collecting blastx repeatmasking >> processing all repeats >> in cluster::shadow_cluster... >> ...finished clustering. >> ERROR: Can't open seq file: /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/query.masked.gff.seq >> No such file or directory >> >> at /24-2/home/qliang/0.soft/maker/bin/../lib/Dumper/GFF/GFFV3.pm line 199. >> Dumper::GFF::GFFV3::finalize(Dumper::GFF::GFFV3=HASH(0x5000ab8)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 700 >> Process::MpiChunk::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 >> eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 >> Error::subs::try(CODE(0x502bbb0), HASH(0x5007788)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 4269 >> Process::MpiChunk::_go(Process::MpiChunk=HASH(0x50a1a18), "flow", HASH(0x50ad0f0), 2, 0) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 378 >> Process::MpiChunk::_flow(Process::MpiChunk=HASH(0x50a1a18), HASH(0x50ad0f0), 2, 0, Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 318 >> Process::MpiTiers::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 >> eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 >> Error::subs::try(CODE(0x50a9348), HASH(0x4ff0ec0)) called at 
/24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 338 >> Process::MpiTiers::_next_level(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 179 >> Process::MpiTiers::next_chunk(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 286 >> Process::MpiTiers::run_all(Process::MpiTiers=HASH(0x4fb3350), 0) called at /home/qliang/0.soft/maker/bin/maker line 695 >> --> rank=NA, hostname=H4 >> ERROR: Failed while builing masking tiers >> --> rank=NA, hostname=H4 >> --> rank=NA, hostname=H4 >> ERROR: Can not get next level >> running genemark. >> #--------- command -------------# >> Widget::genemark: >> /24-2/home/qliang/0.soft/PerlPackages/ActivePerl-5.22/bin/perl-static /24-2/home/qliang/0.soft/maker/bin/../lib/Widget/genemark/gmhmm_wrap -m /home/qliang/cowpea/annotation/05.CEGMA/2.genemask/output/gmhmm.mod -g /24-2/home/qliang/0.soft/makerPackages/gm_et_linux_64/gmes_petap/gmhmme3 -p /24-2/home/qliang/0.soft/makerPackages/gm_et_linux_64/gmes_petap/probuild -o /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_nomask.0.gmhmm%2Emod.genemark /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_nomask.0 >> #-------------------------------# >> FAILED CONTIG:ScsGwly_6124;HRSCAF=6247 >> >> examining contents of the fasta file and run log >> >> >> >> --Next Contig-- >> >> #--------------------------------------------------------------------- >> Now starting the contig!! >> SeqID: ScsGwly_6140;HRSCAF=6263 >> Length: 1247 >> #--------------------------------------------------------------------- >> >> >> >> >>> On Jan 5, 2018, at 7:22 PM, Carson Holt > wrote: >>> >>> That?s the stack trace. The error is going to be a few lines further back. It would be best to get a few hundred lines right around the area you are showing. 
>>> >>> ?Carson >>> >>>> On Jan 4, 2018, at 2:36 PM, Qihua Liang > wrote: >>>> >>>> Hi Ence, >>>> >>>> When I searched for ?E/error? in the output file, here is what first showed up: >>>> Process::MpiChunk::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 >>>> eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 >>>> Error::subs::try(CODE(0x502bbb0), HASH(0x5007788)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 4269 >>>> Process::MpiChunk::_go(Process::MpiChunk=HASH(0x50a1a18), "flow", HASH(0x50ad0f0), 2, 0) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 378 >>>> Process::MpiChunk::_flow(Process::MpiChunk=HASH(0x50a1a18), HASH(0x50ad0f0), 2, 0, Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 318 >>>> Process::MpiTiers::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 >>>> eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 >>>> Error::subs::try(CODE(0x50a9348), HASH(0x4ff0ec0)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 338 >>>> Process::MpiTiers::_next_level(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 179 >>>> >>>> Is this what you may need? >>>> >>>> Qihua >>>> >>>>> On Jan 4, 2018, at 6:16 AM, Ence,daniel > wrote: >>>>> >>>>> Hi, Before we can give any help to debug it, we need the error messages. These should be in the same file that the ?maker is finished? message is in. Look for the first error message (the one closest to the top of the file) and send that to the mailing list. >>>>> >>>>> Thanks, >>>>> Daniel >>>>> >>>>> >>>>>> On Jan 3, 2018, at 8:52 PM, Qihua Liang > wrote: >>>>>> >>>>>> Dear Maker Develop Team, >>>>>> >>>>>> I have successfully run Maker for several times before. 
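A minimal sketch of the contig renaming discussed in this thread, assuming the only goal is to remove characters such as ";" and "=" from the sequence IDs in the assembly FASTA before rerunning. The function names and the set of allowed characters are illustrative assumptions, not part of MAKER, and only the ID (the first token of each header) is changed.

```python
import re

# Characters outside this set are replaced. The set itself is an assumption,
# chosen so that ';' and '=' (and anything else needing escaping) disappear.
_UNSAFE = re.compile(r"[^A-Za-z0-9_.-]")


def sanitize_id(seq_id):
    """Replace every character outside [A-Za-z0-9_.-] with an underscore."""
    return _UNSAFE.sub("_", seq_id)


def rename_fasta(lines):
    """Yield FASTA lines with sanitized sequence IDs.

    Sequence lines pass through unchanged. A ValueError is raised if two
    different original IDs would collapse to the same sanitized ID.
    """
    seen = {}  # sanitized ID -> original ID
    for line in lines:
        if not line.startswith(">"):
            yield line
            continue
        parts = line[1:].rstrip("\n").split(None, 1)
        if not parts:
            raise ValueError("FASTA header with no sequence ID")
        new_id = sanitize_id(parts[0])
        if new_id in seen and seen[new_id] != parts[0]:
            raise ValueError(
                f"{parts[0]!r} and {seen[new_id]!r} both map to {new_id!r}"
            )
        seen[new_id] = parts[0]
        description = " " + parts[1] if len(parts) > 1 else ""
        yield f">{new_id}{description}\n"
```

With this sketch, "ScsGwly_6124;HRSCAF=6247" becomes "ScsGwly_6124_HRSCAF_6247". Any GFF3 or other evidence files that refer to the old IDs would need the same mapping applied.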
From carsonhh at gmail.com Wed Jan 10 13:05:03 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 10 Jan 2018 12:05:03 -0700
Subject: [maker-devel] questions on master_datastore_index.log file
In-Reply-To: <87A06F3B-82C1-4B21-906E-69DC1308DEC6@ucr.edu>
References: <0BECB285-BB11-4F46-B6D7-072640F311B2@ucr.edu>
 <0E5E8721-E814-4BA5-891B-B1C312BC0D4A@gmail.com>
 <87A06F3B-82C1-4B21-906E-69DC1308DEC6@ucr.edu>
Message-ID: <36B45AA1-3D02-4E83-9EF8-85D56C4D3020@gmail.com>

The error is saying exactly that the file MAKER just created does not exist. The only time we ever see this is when using network-mounted locations under heavy IO load. Most network storage options use asynchronous IO, which means the system returns success on a file operation before it actually completes. So it can report that it finished writing a file before the file actually exists. If you then try to open the file right away, it doesn't really exist yet and everything fails. But that only happens under heavy IO (lots of things going on in that mount location). So if you are getting persistent failures, you may want to try a different work directory, or get your IT staff to troubleshoot the IO load in the directory you are using.

--Carson
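To illustrate the race described here, the following is a hedged sketch of a polling helper that waits for a freshly written file to become visible. It is not MAKER code and MAKER is not known to expose such a hook; the name `wait_for_file` and its defaults are assumptions. It can help confirm that a mount shows delayed visibility, but it does not fix the underlying IO load.

```python
import os
import time


def wait_for_file(path, timeout=30.0, interval=0.5):
    """Poll until `path` exists and is non-empty, or `timeout` seconds pass.

    Returns True if the file became visible in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            if os.stat(path).st_size > 0:
                return True
        except FileNotFoundError:
            pass  # not visible yet; keep polling until the deadline
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

If a file written by one process only becomes visible to another after a noticeable delay, that points to the storage rather than to the inputs, and moving the work directory is the more robust remedy.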
From arsilan324 at gmail.com Thu Jan 11 08:15:31 2018
From: arsilan324 at gmail.com (Muhammad Arslan)
Date: Thu, 11 Jan 2018 15:15:31 +0100
Subject: [maker-devel] GFF3 to .tbl
Message-ID:

Dear Madam or Sir,

I am writing to ask whether there is any way to make a .tbl file from a MAKER-generated GFF3 file. I need this because I am trying to submit the annotation to NCBI. If there is another solution, please advise.

Thank you very much!
Arslan

--
Muhammad Arslan
PhD Student / Guest Scientist
Department of Environmental Biotechnology
Helmholtz Centre for Environmental Research - UFZ
Permoserstraße 15, 04318 Leipzig, Germany
Phone +49 341 235 1696
muhammad.arslan at ufz.de, www.ufz.de

From carsonhh at gmail.com Fri Jan 19 16:46:26 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 19 Jan 2018 15:46:26 -0700
Subject: [maker-devel] GFF3 to .tbl
In-Reply-To:
References:
Message-ID: <93BD3F52-1D76-465A-94EE-80D616BB72A6@gmail.com>

Try GAG: https://genomeannotation.github.io/GAG/

--Carson

From qwzhang0601 at gmail.com Mon Jan 22 11:23:34 2018
From: qwzhang0601 at gmail.com (Quanwei Zhang)
Date: Mon, 22 Jan 2018 12:23:34 -0500
Subject: [maker-devel] name of gene model
Message-ID:

Hello:

Would you please explain how the genes were named? Do similar names indicate sequence similarity (e.g., maker-Contig3217-snap-gene-35.13-mRNA-1, maker-Contig3217-snap-gene-35.14-mRNA-1)?

maker-Contig2667-augustus-gene-266.22-mRNA-1;
maker-Contig2667-snap-gene-266.5-mRNA-1;

maker-Contig3217-snap-gene-35.13-mRNA-1;
maker-Contig3217-snap-gene-35.14-mRNA-1;
maker-Contig3217-snap-gene-35.15-mRNA-1;
maker-Contig3217-snap-gene-35.16-mRNA-1

Thank you

Best
Quanwei

From carsonhh at gmail.com Mon Jan 22 11:29:26 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 22 Jan 2018 10:29:26 -0700
Subject: [maker-devel] name of gene model
In-Reply-To:
References:
Message-ID: <499FF8DC-C277-484B-AEC9-EE7A35090615@gmail.com>

The only info in the name is the source program of the model (i.e. snap/augustus). The numbers are just meaningless iterators.

--Carson

From yincl2013 at 126.com Tue Jan 23 09:01:17 2018
From: yincl2013 at 126.com (Chuanlin Yin)
Date: Tue, 23 Jan 2018 23:01:17 +0800 (GMT+08:00)
Subject: [maker-devel] maker-3.01.02-beta run error
Message-ID: <165b9e3f.a94e.16123899e8a.Coremail.yincl2013@126.com>

Dear Sir/Madam,

Recently, when I tried to use maker-3.01.02-beta for genome annotation, it failed with the following error:

Can't call method "add_entry" without a package or object reference at /gpfs/bioinformatics/software/maker-3.01.02-beta/bin/../lib/Widget/snap.pm line 540.
--> rank=NA, hostname=c01n02
ERROR: Failed while annotating transcripts
ERROR: Chunk failed at level:1, tier_type:4
FAILED CONTIG:002369F_pilon_obj

Could you explain why this happened? Any reply would be much appreciated. Thanks.
Best regards!
Showky

From Emily.Giroux at inspection.gc.ca Tue Jan 23 14:35:06 2018
From: Emily.Giroux at inspection.gc.ca (Giroux, Emily (CFIA/ACIA))
Date: Tue, 23 Jan 2018 20:35:06 +0000
Subject: [maker-devel] maker pipeline 2nd round updating augustus
Message-ID: <397E3572255740488AA9993F4D41A3B036B588CB@ONOTTAXES2.AGR.GC.CA>

Hi,

I completed a first round of Maker, followed by snap and BUSCO to train augustus. I then placed the newly trained species-specific augustus files in the augustus species directory and used them for my second round of maker.

What I'm wondering now is whether I should repeat this process after completing round 2 of maker: use BUSCO to retrain the augustus files again, replace the species-specific files from round 1 with those from round 2, and use these as input for a third round of maker.

Thank you very much,
Emily

From patrick.tranvan at unil.ch Thu Jan 25 08:46:27 2018
From: patrick.tranvan at unil.ch (Patrick Tran Van)
Date: Thu, 25 Jan 2018 14:46:27 +0000
Subject: [maker-devel] Adding NR functional annotation
Message-ID: <1516891629951.7595@unil.ch>

Hi,

Can you please update maker_functional_gff and maker_functional_fasta to make them compatible with the NR database?

Thanks,
Patrick

From marni at cs.au.dk Thu Jan 25 04:26:04 2018
From: marni at cs.au.dk (Marni Tausen)
Date: Thu, 25 Jan 2018 10:26:04 +0000
Subject: [maker-devel] Maker run problems - BLAST makeblastdb failed
Message-ID: <72D3C07A-D1A6-4759-B8D2-13EBE8DD7982@birc.au.dk>

Hey,

I have a problem getting maker to run. I've tried installing the pipeline on three separate systems: CentOS 6 (cluster), Mac OS X 10.12.6, and CentOS 7.
With each of them I run into problems at the RepeatMasker step, with this error message:

#---------------------------------------------------------------------
Now starting the contig!!
SeqID: chr0
Length: 38046352
#---------------------------------------------------------------------


setting up GFF3 output and fasta chunks
doing repeat masking
running repeat masker.
#--------- command -------------#
Widget::RepeatMasker:
cd makertmp/maker_DMTHbJ; /Users/PM/maker/exe/RepeatMasker/RepeatMasker /Users/PM/GENEANNOTATION/TrR.v5.maker.output/TrR.v5_datastore/82/7E/chr0//theVoid.chr0/0/chr0.0.all.rb -species all -dir /Users/PM/GENEANNOTATION/TrR.v5.maker.output/TrR.v5_datastore/82/7E/chr0//theVoid.chr0/0 -pa 1
#-------------------------------#
doing blastx repeats
formating database...
#--------- command -------------#
Widget::formater:
/Users/PM/maker/bin/../exe/lblast/bin/makeblastdb -dbtype prot -in makertmp/maker_DMTHbJ/0/blastprep/te_proteins%2Efasta.mpi.10.0
#-------------------------------#
BLAST options error: File makertmp/maker_DMTHbJ/0/blastprep/te_proteins%2Efasta.mpi.10.0 does not exist
ERROR: /Users/PM/maker/bin/../exe/lblast/bin/makeblastdb failed in Widget::formater
--> rank=NA, hostname=d24834.local
ERROR: Failed while doing blastx repeats
ERROR: Chunk failed at level:1, tier_type:1
FAILED CONTIG:chr0

ERROR: Chunk failed at level:2, tier_type:0
FAILED CONTIG:chr0

examining contents of the fasta file and run log

The Maker version that was installed is 2.31.9, and it was built using the ./Build commands.

However, the links for exonerate (2.2.0) and repeatmasker (repbase) (latest version) seem to be broken, as they always returned connection errors.

So I manually installed both of those programs and linked maker to them.

I've attached the config files and the script used to run maker.

Cheers,
Marni Tausen

-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_bopts.ctl
Type: application/octet-stream
Size: 1413 bytes
Desc: maker_bopts.ctl
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_exe.ctl
Type: application/octet-stream
Size: 1277 bytes
Desc: maker_exe.ctl
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_opts.ctl
Type: application/octet-stream
Size: 4554 bytes
Desc: maker_opts.ctl
URL:

From mmokrejs at gmail.com Thu Jan 25 10:05:45 2018
From: mmokrejs at gmail.com (Martin MOKREJŠ)
Date: Thu, 25 Jan 2018 17:05:45 +0100
Subject: [maker-devel] Maker run problems - BLAST makeblastdb failed
In-Reply-To: <72D3C07A-D1A6-4759-B8D2-13EBE8DD7982@birc.au.dk>
References: <72D3C07A-D1A6-4759-B8D2-13EBE8DD7982@birc.au.dk>
Message-ID:

Hi Marni,
do not use spaces in your filenames and directory names. I think that is your issue:

te_proteins%2Efasta.mpi.10.0

Martin

From carsonhh at gmail.com Thu Jan 25 15:20:21 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 25 Jan 2018 14:20:21 -0700
Subject: [maker-devel] Maker run problems - BLAST makeblastdb failed
In-Reply-To: <72D3C07A-D1A6-4759-B8D2-13EBE8DD7982@birc.au.dk>
References: <72D3C07A-D1A6-4759-B8D2-13EBE8DD7982@birc.au.dk>
Message-ID: <2AC145CF-6954-4740-BA88-A7ABBBC841D0@gmail.com>

You set TMP=makertmp. That is likely not a true locally mounted location (i.e. it's network mounted). In that case you will hit a race condition where files you just created don't become readable for a few milliseconds to seconds after creation under heavy IO load. Alternatively, it is locally mounted but only exists on a single node, and you are running on a cluster (nodes cannot access each other's local storage).

Unless your cluster setup has a specific location for locally mounted temporary scratch space, you should not set TMP=. Just let it default to /tmp, which is almost always locally mounted.
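
As a rough illustration of the advice above, here is a minimal sketch for checking whether a candidate temporary directory sits on a network filesystem. It assumes a Linux node with GNU coreutils stat (the flags differ on macOS/BSD). The helper names is_network_fs and check_tmp_dir and the list of filesystem type names are hypothetical and not part of MAKER, and the list is certainly not exhaustive.

```shell
#!/bin/sh
# Sketch only: helper names and the type list are illustrative assumptions.

# Return 0 if the given filesystem type name looks network mounted.
is_network_fs() {
    case "$1" in
        nfs*|cifs|smb*|lustre|gpfs|beegfs|afs|ceph) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the filesystem type of a directory and flag network mounts.
# Uses GNU coreutils "stat -f -c %T"; falls back to "unknown" if that fails.
check_tmp_dir() {
    dir=$1
    fstype=$(stat -f -c %T "$dir" 2>/dev/null) || fstype=unknown
    if is_network_fs "$fstype"; then
        echo "$dir: $fstype (network mounted; avoid for TMP=)"
    else
        echo "$dir: $fstype"
    fi
}

# Check the system default that MAKER falls back to when TMP= is left empty.
check_tmp_dir /tmp
```

If the directory you intended for TMP= is reported as network mounted, leaving TMP= empty in maker_opts.ctl so that the default is used is the simpler choice.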
--Carson

From carsonhh at gmail.com Thu Jan 25 15:29:37 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 25 Jan 2018 14:29:37 -0700
Subject: [maker-devel] maker pipeline 2nd round updating augustus
In-Reply-To: <397E3572255740488AA9993F4D41A3B036B588CB@ONOTTAXES2.AGR.GC.CA>
References: <397E3572255740488AA9993F4D41A3B036B588CB@ONOTTAXES2.AGR.GC.CA>
Message-ID: <2D069310-1BFC-4C30-98B5-739FC90A732B@gmail.com>

Don't use BUSCO to train for the second round. There is a bias in the models it produces, toward conserved genes that tend to be short and intron poor. You will want to avoid this bias in the second round, so use a broad selection of gene models instead. Use the maker2zff script to select gene models for training (examples of doing this can be found on the maker tutorial wiki). Then use this script to convert ZFF to GenBank format to train Augustus --> https://github.com/hyphaltip/genome-scripts/blob/master/gene_prediction/zff2augustus_gbk.pl

This is a nice guide to training Augustus using GenBank format input --> https://vcru.wisc.edu/simonlab/bioinformatics/programs/augustus/docs/tutorial2015/training.html

--Carson

From carsonhh at gmail.com Thu Jan 25 15:33:33 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 25 Jan 2018 14:33:33 -0700
Subject: [maker-devel] maker-3.01.02-beta run error
In-Reply-To: <165b9e3f.a94e.16123899e8a.Coremail.yincl2013@126.com>
References: <165b9e3f.a94e.16123899e8a.Coremail.yincl2013@126.com>
Message-ID:

Because of where that error occurred, it may be a snowball error (i.e. the result of another error upstream that is the real failure). Could you look back in the data to see if there is a failure further back? Perhaps include your entire STDERR log.

Thanks,
Carson

From qwzhang0601 at gmail.com Fri Jan 26 10:40:32 2018
From: qwzhang0601 at gmail.com (Quanwei Zhang)
Date: Fri, 26 Jan 2018 11:40:32 -0500
Subject: [maker-devel] map the transcripts back onto the genome using "est2genome=1"
Message-ID:

Hello:

I am trying to annotate a new NMR genome assembly. Since a gene annotation for the old version of the NMR genome is available from NCBI, I tried to map the published RefSeq transcripts onto the new genome with "est2genome=1". But I found that quite a few genes were lost during mapping.

I then did another test to check the mapping itself: I mapped the published RefSeq transcripts onto the old genome (the same version as the published gene annotation) using maker with "est2genome=1". Still, quite a few genes were lost during the mapping. Below are the BUSCO results for the gene annotations, which assess annotation completeness with single-copy orthologs. As you can see, even if we only consider the single-copy orthologs, 4% were still not mapped back to the genome.

Do you have any comments on this? Also, could you please suggest how to get more of the published gene annotation to map back to the same genome assembly through "est2genome=1"? Attached is the maker_opts.ctl file I used for the mapping.

Many thanks.

# BUSCO results using the published gene annotation
C:99.3%[S:33.3%,D:66.0%],F:0.3%,M:0.4%,n:4104
4077 Complete BUSCOs (C)
1367 Complete and single-copy BUSCOs (S)
2710 Complete and duplicated BUSCOs (D)
14 Fragmented BUSCOs (F)
13 Missing BUSCOs (M)
4104 Total BUSCO groups searched

# BUSCO results using gene models after mapping by maker2
C:93.4%[S:36.5%,D:56.9%],F:2.6%,M:4.0%,n:4104
3830 Complete BUSCOs (C)
1496 Complete and single-copy BUSCOs (S)
2334 Complete and duplicated BUSCOs (D)
105 Fragmented BUSCOs (F)
169 Missing BUSCOs (M)
4104 Total BUSCO groups searched

Best
Quanwei

-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_opts.ctl
Type: application/octet-stream
Size: 4733 bytes
Desc: not available
URL:

From qwzhang0601 at gmail.com Fri Jan 26 17:16:50 2018
From: qwzhang0601 at gmail.com (Quanwei Zhang)
Date: Fri, 26 Jan 2018 18:16:50 -0500
Subject: [maker-devel] gene annotation for a better genome
In-Reply-To:
References: <5AFEDD05-DF02-463F-A6EE-1619A9BB968D@gmail.com>
Message-ID:

Hi Carson:

Thank you for your previous suggestions. I have done the annotation accordingly: I first mapped the transcripts from the old assembly to the new assembly by setting "est2genome=1", and then updated the models with new predictions.

Besides mapping with "est2genome=1", do you think it is a good idea to do a separate mapping with the proteins of the old assembly (setting "protein2genome=1")? I would then provide both mapping GFF files (i.e., the GFFs from mapping transcripts and proteins separately) and update them with new predictions and evidence support. I am considering this because I found that certain genes were not mapped to the new assembly, but they can be mapped by protein orthologs.

Thank you.

Best
Quanwei

2017-10-24 18:26 GMT-04:00 Carson Holt :

> Yes. If you use est2genome it will just align the model, and then find the
> longest ORF. So it is a quick way to just align old models to the new
> assembly. Alternatively you can just do de novo annotation.
>
> --Carson
>
>
>
> On Oct 24, 2017, at 10:54 AM, Quanwei Zhang wrote:
>
> Dear Carson:
>
> Thank you again for your suggestions. I just get the new genome assembly
> of NMR and start to do gene annotation.
> I understand your ideas about this.
> But can I simply use the old genome transcripts as transcript evidence, and
> just follow the standard Maker2 pipeline? I set est2genome=1 and provide
> the mRNA sequences in the fasta format for the first round training of SNAP.
>
> For transcripts I have the following choices. I think the first choice is
> more reliable and better, right?
> (1) There are about 60,000 RefSeq transcripts from NCBI. So I downloaded
> those sequences in fasta format.
> (2) We have the raw data of RNA-seq from 11 tissues, we can do assembly by
> trinity for each sample and then get the transcripts. But I think most of
> the RNA-seq should have been submitted to NCBI.
>
> BTW, if we use the RefSeq data from NCBI, we can download the mRNA
> sequences, coding sequences or protein sequences. I wonder which type of
> data are the best to train the SNAP? For Augustus, we will use BUSCO to
> train it.
>
> Many thanks.
>
> Best
> Quanwei
>
>
>
>
> 2017-09-29 12:36 GMT-04:00 Carson Holt :
>
>> You can try using the est2genome=1 option to map the old models forward
>> onto the new assembly as if they were ESTs (add a line that says
>> est_forward=1 to the control file to maintain old naming and set est= to
>> the old model transcript file). Then provide the final models as a pred_gff
>> for a subsequent run (i.e. a traditional MAKER run where you are annotating
>> the new assembly with transcript and protein evidence and ab initio
>> predictors). Don't supply the old models to est= on that run.
>>
>> The idea behind doing it this way is:
>> 1. You need to get old models onto the new assembly so coordinates will
>> change. So by doing it this way, you will at least be able to move many
>> models forward based on homology.
>> 2. By providing the models to pred_gff on a subsequent MAKER run, you are
>> just letting old models compete against new annotations.
>> They will be
>> rejected if they have no evidence support, or can be kept if they score
>> better than alternate models from SNAP/Augustus. That way you have the
>> chance to integrate old models while at the same time rejecting some old
>> models that have no evidence overlap.
>>
>> --Carson
>>
>>
>> > On Sep 28, 2017, at 6:05 AM, Quanwei Zhang wrote:
>> >
>> > Hello:
>> >
>> > Recently, we got a new version of NMR genome, whose genome had been
>> > assembled and annotated a few years ago. We can download the gene
>> > annotation from NCBI.
>> >
>> > Now we want to annotate the new genome using Maker2 pipeline. I wonder
>> > how can I fully make use of existing annotations. On the other hand, since
>> > the previous genome is not very well assemblies, some genes annotation
>> > maybe false positives. I hope those false positive genes in previous
>> > annotation won't mislead Maker2 for current gene annotation.
>> >
>> > Do you have any suggestions. Thanks
>> >
>> > Best
>> > Quanwei

From carsonhh at gmail.com Mon Jan 29 12:23:06 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 29 Jan 2018 11:23:06 -0700
Subject: [maker-devel] gene annotation for a better genome
In-Reply-To:
References: <5AFEDD05-DF02-463F-A6EE-1619A9BB968D@gmail.com>
Message-ID: <753F1840-4874-4C0D-80F7-59E1A1579884@gmail.com>

You can set both est2genome=1 and protein2genome=1. You can also set est_forward=1 to get the names from the old models (you have to add it as it's not already there).
If you want to try to force an alignment to a specific location, you can also add maker_coor=chr2:1-3000 to the fasta header comment line to have maker only allow alignments within a specific region (chr2:1-3000 in the example).

--Carson

From qwzhang0601 at gmail.com Mon Jan 29 13:57:42 2018
From: qwzhang0601 at gmail.com (Quanwei Zhang)
Date: Mon, 29 Jan 2018 14:57:42 -0500
Subject: [maker-devel] gene annotation for a better genome
In-Reply-To: <753F1840-4874-4C0D-80F7-59E1A1579884@gmail.com>
References: <5AFEDD05-DF02-463F-A6EE-1619A9BB968D@gmail.com> <753F1840-4874-4C0D-80F7-59E1A1579884@gmail.com>
Message-ID:

Dear Carson:

Thank you for your reply. Do you mean setting est2genome=1 and protein2genome=1 in one round, or doing the two mappings in separate rounds?

I will then provide the GFF files from mapping the transcripts and proteins to "pred_gff". Besides the GFF from such mapping, I am also considering providing a GFF file obtained from a regular de novo annotation by maker2, and then updating gene models from those GFFs.

Here is why I am considering this. Suppose at location 1 there is a gene model gA from mapping transcripts and proteins. If I then try to update those gene models in the second round of maker, maker cannot change the internal exons of gA (so it cannot replace it).
However, if I provide both the GFF from mapping transcripts and the GFF from maker de novo annotation, and another gene model gA' (from de novo annotation) was predicted by maker at the same location, maker will compare gA and gA' and select the one with the higher score, right? In this way we can replace a mapped gene model with a model predicted by maker if the predicted one has stronger evidence support. Right?

Thank you.

Best
Quanwei

From admin at genome.arizona.edu Mon Jan 29 17:08:54 2018
From: admin at genome.arizona.edu (admin at genome.arizona.edu)
Date: Mon, 29 Jan 2018 16:08:54 -0700
Subject: [maker-devel] MPI selection
Message-ID: <77cfb864-4de1-a9af-aeea-9d3e7cf45ce5@genome.arizona.edu>

Hi, we now have three versions of MPI installed on our cluster: OpenMPI, MPICH, and MVAPICH2.
Since we have infiniband, MVAPICH2 is working best with MPI test programs. MPICH should support infiniband too, but currently there are some seg faults with it that we are trying to resolve.

On our cluster we have a ~/.mpi-selection file which allows users to pick the MPI installation to use, and sets the appropriate PATH and LD_LIBRARY_PATH variables.

I am looking through the Maker MPI instructions, and it seems that a certain mpicc and mpi.h must be chosen during installation. So if Maker was originally installed with MPICH, would I have to reinstall it if users want to use MVAPICH2? Or is there a config file somewhere I can update so I don't have to reinstall Maker? Or does nothing need to be done, and we can rely on the PATH and LD_LIBRARY_PATH variables pointing to the correct mpicc and libmpi.so (mpi.h is in the include directory)?

Thanks

From yuejiaxing at gmail.com Tue Jan 30 10:32:04 2018
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Tue, 30 Jan 2018 17:32:04 +0100
Subject: [maker-devel] Is it possible to extract the GFF3 file for the raw gene models predicted by est2genome and protein2genome?
Message-ID:

Hello,

I enabled the est2genome and protein2genome options for Maker-3.00.0-beta in my particular case. I was wondering if it is possible to extract the gene models predicted by snap, augustus, est2genome, and protein2genome separately. Using the gff_merge command, I think I can extract some gene models for each case but not all, especially for the est2genome and protein2genome sets (e.g. those labeled with "maker-chr*-exonerate_est2genome-gene" and "maker-chr*-exonerate_protein2genome-gene").

Thanks in advance!

Best,
Jia-Xing

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Tue Jan 30 10:47:39 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 30 Jan 2018 09:47:39 -0700
Subject: [maker-devel] MPI selection
In-Reply-To: <77cfb864-4de1-a9af-aeea-9d3e7cf45ce5@genome.arizona.edu>
References: <77cfb864-4de1-a9af-aeea-9d3e7cf45ce5@genome.arizona.edu>
Message-ID: <34C36A98-A87F-4B28-8E05-FCD412CFEBEA@gmail.com>

The libraries used by MVAPICH2, Intel MPI, and OpenMPI to access infiniband have a known bug. For performance reasons, infiniband libraries use registered memory in a way that makes it impossible to do system calls to external programs under MPI (doing so results in seg faults). MAKER has to call out to external programs like BLAST, exonerate, etc., so it triggers this bug. The infiniband bug is well known, and unfortunately will not be fixed because fixing it causes infiniband to lose some advertised features like direct memory access.

As a workaround, OpenMPI and Intel MPI allow you to disable infiniband libraries via command line flags and use IP over infiniband instead (i.e. they let you drop infiniband features on demand so that your code will run). However, MVAPICH2 does not provide the same option. As a result you cannot use MAKER, or any MPI program that does system calls to external programs, with MVAPICH2 (it results in seg faults).
But you can use all other MPI flavors with the appropriate flags detailed below: #For OpenMPI, use as follows (the example assumes ib0 is your ip over infiniband adapter) export LD_PRELOAD=/path/to/openmpi/libmpi.so mpiexec --mca btl vader,tcp,self --mca btl_tcp_if_include ib0 --mca btl_openib_want_fork_support 1 --mca mpi_warn_on_fork 0 maker #For Intel MPI set these environmental variables before launch export I_MPI_FABRICS='shm:tcp' export I_MPI_HYDRA_IFACE='ib0' mpiexec maker #For MPICH, nothing is needed as the Infiniband libraries are always disabled, but you can specifically tell it to use the ib0 adapter as the communicator mpiexec -iface ib0 maker ?Carson > On Jan 29, 2018, at 4:08 PM, admin at genome.arizona.edu wrote: > > Hi, we have now three versions of MPI installed on our cluster, OpenMPI, MPICH, and MVAPICH2. Since we have infiniband, the MVAPICH2 is working best with MPI test programs. MPICH should support infiniband too but currently there are some seg faults with that we are trying to resolve. > > On our cluster we have ~/.mpi-selection file which allows users to pick the MPI installation to use, and sets appropriate PATH and LD_LIBRARY_PATH variables. > > I am looking through the Maker MPI instructions, and it seems that a certain mpicc and mpi.h must be chosen during installation. So if originally, Maker was installed with MPICH, then would I have to reinstall it if users want to use MVAPICH2? Or is there config file somewhere I can update so I don't have to reinstall Maker? Or does nothing need to be done and we can rely on PATH and LD_LIBRARY_PATH variables pointing to correct mpicc and libmpi.so (mpi.h is in include directory)? 
>
> Thanks
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From carsonhh at gmail.com Tue Jan 30 10:54:05 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 30 Jan 2018 09:54:05 -0700
Subject: [maker-devel] gene annotation for a better genome
In-Reply-To:
References: <5AFEDD05-DF02-463F-A6EE-1619A9BB968D@gmail.com> <753F1840-4874-4C0D-80F7-59E1A1579884@gmail.com>
Message-ID: <921EBAEF-13E3-4175-90A2-8F41651F95C9@gmail.com>

You can set both simultaneously. est2genome will almost always be picked first since it will match better than the protein alignment (i.e. it matches at UTRs).

--Carson

> On Jan 29, 2018, at 12:57 PM, Quanwei Zhang wrote:
>
> Dear Carson:
>
> Thank you for your reply. Do you mean set est2genome=1 and protein2genome=1 in one round or do such mapping in two separate rounds?
>
> So I will provide gff files by mapping the transcripts and proteins to "pred_gff". Besides the gff from such mapping, I am also considering to provide a gff file obtained from a regular de novo annotation by maker2. And then update gene models from those gff.
>
> Here is the reason why I consider this. Suppose at location 1 there is a gene model gA by mapping transcripts and proteins. Then if I try to update those gene models in the second round of maker, maker can not change internal exons of gA (so can not replace it). However, if I provide both the gff by mapping transcripts and gff by maker de novo annotation, then if another gene model gA' (by de novo annotation) was predicted by maker at the same location, maker will compare gA and gA' and select the one with higher score, right? By this way we can replace a mapping gene model with predicted model by maker if the predicted one have stronger evidence support. Right?
>
> Thank you.
> > Best > Quanwei > > > > 2018-01-29 13:23 GMT-05:00 Carson Holt >: > You can set both est2genome=1 and protein2genome=1. You can also set est_forward=1 to get the names from the old models (you have to add it as it?s not already there). If you want to try and force an alignment to a specifc location, you can also add maker_coor=chr2:1-3000 to the fasta header comment line to have maker only alow alignments within a specific region (chr2:1-3000 in the example). > > ?Carson > > >> On Jan 26, 2018, at 4:16 PM, Quanwei Zhang > wrote: >> >> Hi Carson: >> >> Thank you for your previous suggestions. I have done the annotation according to your suggestions. I firstly mapped the transcripts from old assembly to the new assembly by setting "est2genome=1", and then update the models by new predictions. >> >> Besides mapping by "est2genome=1" , do you think it is a good idea to do a separate mapping by proteins of old assembly (setting "protein2genome=1")? And then I provide both mapping GFF files (i.e., mapping GFF by transcripts and proteins, separately) and update them with new predictions and evidence support? Why I am trying to do this is because I found for certain genes they were not mapped to the new assembly but they can be mapped by protein orthologs. >> >> Thank you. >> >> Best >> Quanwei >> >> 2017-10-24 18:26 GMT-04:00 Carson Holt >: >> Yes. If you use est2genome it will just align the model, and then find the longest ORF. So it is a quick way to jsut align old models to the new assembly. Alternatively you can just do de novo annotation. >> >> ?Carson >> >> >> >>> On Oct 24, 2017, at 10:54 AM, Quanwei Zhang > wrote: >>> >>> Dear Carson: >>> >>> Thank you again for your suggestions. I just get the new genome assembly of NMR and start to do gene annotation. I understand you ideas about this. But can I simply use the old genome transcripts as transcript evidence, and just following the standard Maker2 pipeline? 
I set est2genome=1 and provide the mRNA sequences in the fasta format for the first round training of SNAP. >>> >>> For transcripts I have the following choices. I think the first choice is more reliable and better, right? >>> (1) There are about 60,000 RefSeq transcripts from NCBI. So I downloaded those sequences in fasta format. >>> (2) We have the raw data of RNA-seq from 11 tissues, we can do assembly by trinity for each sample and then get the transcripts. But I think most of the RNA-seq should have been submitted to NCBI. >>> >>> BTW, if we use the RefSeq data from NCBI, we can download the mRNA sequences, coding sequences or protein sequences. I wonder which type of data are the best to train the SNAP? For Augustus, we will use BUSCO to train it. >>> >>> Many thanks. >>> >>> Best >>> Quanwei >>> >>> >>> >>> >>> 2017-09-29 12:36 GMT-04:00 Carson Holt >: >>> You can try using the est2genome=1 option to map the old models forward onto the new assembly as if they were ESTs (add a line that says est_forward=1 to the control file to maintain old naming and set est=1 to the old model transcript file). Then provide the final models as a pred_gff for a subsuquent run (i.e. a traditional MAKER run where you are annotating the new assembly with transcript and protein evidence and ab initio predictors). Don?t supply the old models to est= on that run. >>> >>> The idea behind doing it this way is: >>> 1. You need to get old models onto the new assembly so coordinates will change. So by doing it this way, you will at least be able to move many models forward based on homology. >>> 2. By providing the models to pred_gff on a subsequent MAKER run, you are just letting old models compete against new annotations. They will be rejected if they have no evidence support, or can be kept if they score better than alternate models from SNAP/Augustus. That way you have the chance to integrate old models while at the same time rejecting some old models that have no evidence overlap. 
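
The two-pass approach in the list above can be sketched as two separate sets of maker_opts.ctl settings. This is only an illustrative fragment, not a tested configuration: every file name is a placeholder, est_forward=1 is the extra line the message says to add by hand, and the ab initio settings in the second pass (snaphmm, augustus_species) are assumptions about a typical setup rather than anything stated in the thread.

```
#-----Pass 1 (its own control file): map old models forward as if they were ESTs
genome=new_assembly.fasta            #placeholder file name
est=old_model_transcripts.fasta      #placeholder: transcripts of the old models
est2genome=1
est_forward=1                        #added by hand to keep the old names

#-----Pass 2 (a second control file): let mapped models compete with new annotations
genome=new_assembly.fasta
est=rnaseq_transcripts.fasta         #placeholder: real transcript evidence, not the old models
protein=proteins.fasta               #placeholder protein evidence
pred_gff=pass1_models.gff            #placeholder: final models from pass 1
snaphmm=trained.hmm                  #placeholder, assumed predictor setup
augustus_species=my_species          #placeholder, assumed predictor setup
est2genome=0
```

Old models without evidence support would then be rejected in pass 2, as described in point 2 above.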
>>> >>> ?Carson >>> >>> >>> > On Sep 28, 2017, at 6:05 AM, Quanwei Zhang > wrote: >>> > >>> > Hello: >>> > >>> > Recently, we got a new version of NMR genome, whose genome had been assembled and annotated a few years ago. We can download the gene annotation from NCBI. >>> > >>> > Now we want to annotate the new genome using Maker2 pipeline. I wonder how can I fully make use of existing annotations. On the other hand, since the previous genome is not very well assemblies, some genes annotation maybe false positives. I hope those false positive genes in previous annotation won't mislead Maker2 for current gene annotation. >>> > >>> > Do you have any suggestions. Thanks >>> > >>> > Best >>> > Quanwei >>> > _______________________________________________ >>> > maker-devel mailing list >>> > maker-devel at box290.bluehost.com >>> > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >>> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Tue Jan 30 10:57:01 2018 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 30 Jan 2018 09:57:01 -0700 Subject: [maker-devel] Is it possible to extract the GFF3 file for the raw gene models predicted by est2genome and protein2genome? In-Reply-To: References: Message-ID: <9771EB42-8A80-49D8-9A21-67406860FD4F@gmail.com> You can just grep on the name. Although est2genome and protein2genome should only be used for initial training, as they are almost always guaranteed to be partial and should be disabled once you have trained gene predictors that can build complete models. ?Carson > On Jan 30, 2018, at 9:32 AM, Jia-Xing Yue wrote: > > Hello, > > I enabled the est2genome and protein2genome option for Maker-3.00.0-beta in my particular case. I was wondering if it is possible to extract the gene models predicted by snap, augustus, est2genome, and protein2genome respectively. 
> > By using the gff_merge command, I think I can extract some gene models for each cases but not all, especially for the est2genome and protein2genome set (e.g. those labeled with "maker-chr*-exonerate_est2genome-gene" and "maker-chr*-exonerate_protein2genome-gene"). > > Thanks in advance! > > Best, > Jia-Xing > > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From yuejiaxing at gmail.com Tue Jan 30 11:03:34 2018 From: yuejiaxing at gmail.com (Jia-Xing Yue) Date: Tue, 30 Jan 2018 18:03:34 +0100 Subject: [maker-devel] Is it possible to extract the GFF3 file for the raw gene models predicted by est2genome and protein2genome? In-Reply-To: <9771EB42-8A80-49D8-9A21-67406860FD4F@gmail.com> References: <9771EB42-8A80-49D8-9A21-67406860FD4F@gmail.com> Message-ID: Dear Carson, Thanks for the quick response! Could you elaborate a bit on on "grep on the name". Do you mean just grep all the lines in the gff_merge output with "est2genome" and "protein2genome" in column 3? In that case, what I got is the alignments rather than the gene model guessed by Maker based on the alignment, right? Thanks! Best, Jia-Xing On Tue, Jan 30, 2018 at 5:57 PM, Carson Holt wrote: > You can just grep on the name. Although est2genome and protein2genome > should only be used for initial training, as they are almost always > guaranteed to be partial and should be disabled once you have trained gene > predictors that can build complete models. > > ?Carson > > > On Jan 30, 2018, at 9:32 AM, Jia-Xing Yue wrote: > > > > Hello, > > > > I enabled the est2genome and protein2genome option for Maker-3.00.0-beta > in my particular case. I was wondering if it is possible to extract the > gene models predicted by snap, augustus, est2genome, and protein2genome > respectively. 
> >
> > By using the gff_merge command, I think I can extract some gene models for each cases but not all, especially for the est2genome and protein2genome set (e.g. those labeled with "maker-chr*-exonerate_est2genome-gene" and "maker-chr*-exonerate_protein2genome-gene").
> >
> > Thanks in advance!
> >
> > Best,
> > Jia-Xing
> >
> > _______________________________________________
> > maker-devel mailing list
> > maker-devel at box290.bluehost.com
> > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
>

--
Jia-Xing Yue

Population Genomics and Complex Traits Group
Tour Pasteur 8eme etage
Faculté de Médecine
Institute for Research on Cancer and Aging, Nice (IRCAN)
CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
28 Avenue de Valombrose
06107 NICE Cedex 2
France

Twitter: @iAmphioxus
Personal website: http://www.iamphioxus.org/
Lab website: https://litilab.wordpress.com/
Yeast Population Reference Panel: https://yjx1217.github.io/Yeast_PacBio_2016/welcome/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Tue Jan 30 11:06:27 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 30 Jan 2018 10:06:27 -0700
Subject: [maker-devel] Is it possible to extract the GFF3 file for the raw gene models predicted by est2genome and protein2genome?
In-Reply-To:
References: <9771EB42-8A80-49D8-9A21-67406860FD4F@gmail.com>
Message-ID: <335E2942-4FCA-4F3C-A488-06116F6B7604@gmail.com>

MAKER models will all have "maker" in the source column. Everything else is a reference alignment (not a model). But you can grep on the gene name. If it is sourced from SNAP, it will have snap in the name, and the same is true for augustus, est2genome, protein2genome, etc.

--Carson

> On Jan 30, 2018, at 10:03 AM, Jia-Xing Yue wrote:
>
> Dear Carson,
>
> Thanks for the quick response! Could you elaborate a bit on "grep on the name".
Do you mean just grep all the lines in the gff_merge output with "est2genome" and "protein2genome" in column 3? In that case, what I got is the alignments rather than the gene model guessed by Maker based on the alignment, right? > > > Thanks! > > Best, > Jia-Xing > > > > On Tue, Jan 30, 2018 at 5:57 PM, Carson Holt > wrote: > You can just grep on the name. Although est2genome and protein2genome should only be used for initial training, as they are almost always guaranteed to be partial and should be disabled once you have trained gene predictors that can build complete models. > > ?Carson > > > On Jan 30, 2018, at 9:32 AM, Jia-Xing Yue > wrote: > > > > Hello, > > > > I enabled the est2genome and protein2genome option for Maker-3.00.0-beta in my particular case. I was wondering if it is possible to extract the gene models predicted by snap, augustus, est2genome, and protein2genome respectively. > > > > By using the gff_merge command, I think I can extract some gene models for each cases but not all, especially for the est2genome and protein2genome set (e.g. those labeled with "maker-chr*-exonerate_est2genome-gene" and "maker-chr*-exonerate_protein2genome-gene"). > > > > Thanks in advance! > > > > Best, > > Jia-Xing > > > > > > > > _______________________________________________ > > maker-devel mailing list > > maker-devel at box290.bluehost.com > > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > > > > -- > Jia-Xing Yue > > Population Genomics and Complex Traits Group > Tour Pasteur 8eme etage > Facult? de M?decine > Institute for Research on Cancer and Aging, Nice (IRCAN) > CNRS UMR 7284 - INSERM U 1081 - Universit? 
C?te d?Azur (UCA) > 28 Avenue de Valombrose > 06107 NICE Cedex 2 > France > > Twitter: @iAmphioxus > Personal website: http://www.iamphioxus.org/ > Lab website: https://litilab.wordpress.com/ > Yeast Population Reference Panel: https://yjx1217.github.io/Yeast_PacBio_2016/welcome/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From admin at genome.arizona.edu Tue Jan 30 11:24:05 2018 From: admin at genome.arizona.edu (admin at genome.arizona.edu) Date: Tue, 30 Jan 2018 10:24:05 -0700 Subject: [maker-devel] MPI selection In-Reply-To: <34C36A98-A87F-4B28-8E05-FCD412CFEBEA@gmail.com> References: <77cfb864-4de1-a9af-aeea-9d3e7cf45ce5@genome.arizona.edu> <34C36A98-A87F-4B28-8E05-FCD412CFEBEA@gmail.com> Message-ID: <4825e452-aab6-aa13-ebc7-3d3d1832cc60@genome.arizona.edu> Carson Holt wrote on 01/30/2018 09:47 AM: > The libraries used by MVAPICH2, Intel MPI, and OpenMPI to access infiniband have a known bug. For performance reasons, infiniband libraries use registered memory in a way that makes it impossible to do system calls to external programs under MPI (doing so results in seg faults). MAKER has to call out to external programs like BLAST, exonerate, etc., so it triggers this bug. > The infiniband bug is well known, and unfortunately will not be fixed because fixing it causes infiniband to lose some advertised features like direct memory access. Well that stinks! Maybe that's why we got such a good deal on new-old-stock infiniband equipment! Still it has allowed us to use full speed of our NFS RAIDs, which has been nice. I will try with using ib0, the speed is still about 10Gb, but I was under the impression using IPoIB would cause packet loss or other problems... Thanks for clearing that up. So is there a fabric/protocol you would recommend for clusters running maker? 
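
Carson's suggestion in the gene-model thread (final models carry "maker" in the source column, and the gene name records which predictor or aligner the model came from) might be sketched roughly as follows. The function name extract_models and the file name in the usage comment are made up for illustration, and the match is a plain substring test on the attributes column, so it can also pick up lines whose Name or Parent merely mentions the tag.

```shell
# Print lines from a merged MAKER GFF3 that are final models (column 2 is
# "maker") and whose attributes column mentions the given tag.
# Comment lines and any trailing FASTA section have no tab-separated
# column 2 equal to "maker", so they are skipped automatically.
extract_models() {
    tag="$1"   # e.g. snap, augustus, est2genome, protein2genome
    gff="$2"   # merged GFF3 file
    awk -F '\t' -v tag="$tag" '$2 == "maker" && index($9, tag) > 0' "$gff"
}

# Hypothetical usage (all.gff is a placeholder for your merged output):
#   extract_models est2genome all.gff > est2genome_models.gff
#   extract_models snap all.gff > snap_models.gff
```

Lines with est2genome or protein2genome in the source column instead of "maker" are the alignments themselves and are deliberately left out.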
From yuejiaxing at gmail.com Tue Jan 30 11:24:22 2018 From: yuejiaxing at gmail.com (Jia-Xing Yue) Date: Tue, 30 Jan 2018 12:24:22 -0500 Subject: [maker-devel] Is it possible to extract the GFF3 file for the raw gene models predicted by est2genome and protein2genome? Message-ID: Dear Carson, Yes that's what I did actually. But it seems that I only got much fewer gene models for est2genome and protein2genome in this way than I would expect. I have turned on EVM for my maker run. Could this explain the low numbers of est2genome and protein2genome models that I got? Thx! Best, Jia-Xing Sent from my Nokia Lumia 920 ------------------------------ From: Carson Holt Sent: ?30/?01/?2018 18:06 To: Jia-Xing Yue Cc: maker-devel at yandell-lab.org List Subject: Re: [maker-devel] Is it possible to extract the GFF3 file for the raw gene models predicted by est2genome and protein2genome? MAKER models will al have ?maker? in the source column. Everything else is a reference alignment (not a model). But you can grep on the gene name. If it is sourced from SNAP, it will have snap in the name, and the same is true for augustus, est2genome, protein2genome, etc. ?Carson On Jan 30, 2018, at 10:03 AM, Jia-Xing Yue wrote: Dear Carson, Thanks for the quick response! Could you elaborate a bit on on "grep on the name". Do you mean just grep all the lines in the gff_merge output with "est2genome" and "protein2genome" in column 3? In that case, what I got is the alignments rather than the gene model guessed by Maker based on the alignment, right? Thanks! Best, Jia-Xing On Tue, Jan 30, 2018 at 5:57 PM, Carson Holt wrote: > You can just grep on the name. Although est2genome and protein2genome > should only be used for initial training, as they are almost always > guaranteed to be partial and should be disabled once you have trained gene > predictors that can build complete models. 
> > ?Carson > > > On Jan 30, 2018, at 9:32 AM, Jia-Xing Yue wrote: > > > > Hello, > > > > I enabled the est2genome and protein2genome option for Maker-3.00.0-beta > in my particular case. I was wondering if it is possible to extract the > gene models predicted by snap, augustus, est2genome, and protein2genome > respectively. > > > > By using the gff_merge command, I think I can extract some gene models > for each cases but not all, especially for the est2genome and > protein2genome set (e.g. those labeled with "maker-chr*-exonerate_est2genome-gene" > and "maker-chr*-exonerate_protein2genome-gene"). > > > > Thanks in advance! > > > > Best, > > Jia-Xing > > > > > > > > _______________________________________________ > > maker-devel mailing list > > maker-devel at box290.bluehost.com > > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > -- Jia-Xing Yue Population Genomics and Complex Traits Group Tour Pasteur 8eme etage Facult? de M?decine Institute for Research on Cancer and Aging, Nice (IRCAN) CNRS UMR 7284 - INSERM U 1081 - Universit? C?te d?Azur (UCA) 28 Avenue de Valombrose 06107 NICE Cedex 2 France Twitter: @iAmphioxus Personal website: http://www.iamphioxus.org/ Lab website: https://litilab.wordpress.com/ Yeast Population Reference Panel: https://yjx1217.github.io/Yeast_PacBio_2016/welcome/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Tue Jan 30 11:37:59 2018 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 30 Jan 2018 10:37:59 -0700 Subject: [maker-devel] MPI selection In-Reply-To: <4825e452-aab6-aa13-ebc7-3d3d1832cc60@genome.arizona.edu> References: <77cfb864-4de1-a9af-aeea-9d3e7cf45ce5@genome.arizona.edu> <34C36A98-A87F-4B28-8E05-FCD412CFEBEA@gmail.com> <4825e452-aab6-aa13-ebc7-3d3d1832cc60@genome.arizona.edu> Message-ID: MAKER does not really move a lot of data with MPI, it?s just moving around command lines and small variables. 
So not getting full infiniband performance will not hurt you. I doubt you will see any issues using ib0.

For MPI flavor, I get the best performance with Intel MPI followed by OpenMPI. Overall you will find that MAKER is IO bound as opposed to CPU or communications bound. So pointing it at your best performing network based storage will be the greatest performance factor (if you have Lustre storage, point it there for example). Pull back on job size and count if other users have issues accessing the disk (too many jobs can bring NFS to its knees).

The one suggestion I have as far as job size is to keep job sizes under 200 CPU cores. Over that, you will get better performance by splitting up datasets and submitting multiple jobs. Also MAKER keeps a log of its progress, so you can kill jobs or restart failed jobs, and they pick up right where they left off.

--Carson

> On Jan 30, 2018, at 10:24 AM, admin at genome.arizona.edu wrote:
>
> Carson Holt wrote on 01/30/2018 09:47 AM:
> > The libraries used by MVAPICH2, Intel MPI, and OpenMPI to access infiniband have a known bug. For performance reasons, infiniband libraries use registered memory in a way that makes it impossible to do system calls to external programs under MPI (doing so results in seg faults). MAKER has to call out to external programs like BLAST, exonerate, etc., so it triggers this bug.
> > The infiniband bug is well known, and unfortunately will not be fixed because fixing it causes infiniband to lose some advertised features like direct memory access.
>
> Well that stinks! Maybe that's why we got such a good deal on new-old-stock infiniband equipment! Still it has allowed us to use full speed of our NFS RAIDs, which has been nice. I will try with using ib0, the speed is still about 10Gb, but I was under the impression using IPoIB would cause packet loss or other problems...
>
> Thanks for clearing that up. So is there a fabric/protocol you would recommend for clusters running maker?
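
The advice above to split large datasets into several smaller jobs could look something like the sketch below. The helper name split_fasta and all file names are invented for illustration, and the MAKER options in the usage comment (-g for the genome file, -base for the output base name) are given from memory of MAKER's help text, so check them against maker -h before relying on them. The core count is an arbitrary value under the 200-core guideline.

```shell
# Split a multi-FASTA file into N batch files, assigning whole records
# round-robin so that no sequence is cut in half.
split_fasta() {
    fasta="$1"    # input multi-FASTA
    n="$2"        # number of batches
    prefix="$3"   # output files are named <prefix>.<i>.fa
    awk -v n="$n" -v prefix="$prefix" '
        /^>/ { out = sprintf("%s.%d.fa", prefix, count % n); count++ }
        out != "" { print > out }
    ' "$fasta"
}

# Hypothetical usage: four batches, each run as its own MPI job.
#   split_fasta genome.fa 4 batch
#   for f in batch.*.fa; do
#       mpiexec -n 100 maker -g "$f" -base "$(basename "$f" .fa)"
#   done
```

On a shared cluster each mpiexec line would more likely go into its own scheduler submission rather than a serial loop, and the per-batch outputs would be merged afterwards.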
> > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From qlian003 at ucr.edu Wed Jan 3 18:52:26 2018 From: qlian003 at ucr.edu (Qihua Liang) Date: Wed, 3 Jan 2018 17:52:26 -0800 Subject: [maker-devel] questions on master_datastore_index.log file Message-ID: Dear Maker Develop Team, I have successfully run Maker for several times before. But I came across a strange thing days ago when I ran Maker again on a different assembly with the same input files and settings. I saw the message of "Maker is now finished!!!? but got empty GFF3 and no fasta files. And then I checked the master_datastore_index.log and realized that there are a lot of ?failed?s and ?retry?s and ?failed? again. What does this mean? Since I used same inputs as previous successful runs, could you provide some instructions on how to debug and solve it? Thank you so much Qihua From o.k.torresen at ibv.uio.no Thu Jan 4 06:21:28 2018 From: o.k.torresen at ibv.uio.no (=?utf-8?B?T2xlIEtyaXN0aWFuIFTDuHJyZXNlbg==?=) Date: Thu, 4 Jan 2018 13:21:28 +0000 Subject: [maker-devel] Names/IDs from pred_gff are included in final gff Message-ID: <76613C81-ED9B-45F2-B84B-B60BC1D4D972@ibv.uio.no> Hi, as far as I can see, names or IDs of features in gffs given to pred_gff is included in the final output as the name of the feature. As far as I can understand, this is not expected behaviour (it is for model_gff). This is with MAKER 2.31.9. I have these settings: map_forward=0 keep_preds=1 I thought that map_forward had to be 1 to get the names for the old GFFs. Can you replicate this? Thank you. Sincerely, Ole K. 
T?rresen From d.ence at ufl.edu Thu Jan 4 07:16:42 2018 From: d.ence at ufl.edu (Ence,daniel) Date: Thu, 4 Jan 2018 14:16:42 +0000 Subject: [maker-devel] questions on master_datastore_index.log file In-Reply-To: References: Message-ID: Hi, Before we can give any help to debug it, we need the error messages. These should be in the same file that the ?maker is finished? message is in. Look for the first error message (the one closest to the top of the file) and send that to the mailing list. Thanks, Daniel > On Jan 3, 2018, at 8:52 PM, Qihua Liang wrote: > > Dear Maker Develop Team, > > I have successfully run Maker for several times before. But I came across a strange thing days ago when I ran Maker again on a different assembly with the same input files and settings. > > I saw the message of "Maker is now finished!!!? but got empty GFF3 and no fasta files. And then I checked the master_datastore_index.log and realized that there are a lot of ?failed?s and ?retry?s and ?failed? again. What does this mean? Since I used same inputs as previous successful runs, could you provide some instructions on how to debug and solve it? > > Thank you so much > Qihua > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > https://urldefense.proofpoint.com/v2/url?u=http-3A__box290.bluehost.com_mailman_listinfo_maker-2Ddevel-5Fyandell-2Dlab.org&d=DwIGaQ&c=pZJPUDQ3SB9JplYbifm4nt2lEVG5pWx2KikqINpWlZM&r=12jzlNvGVD0AlPJ4E7cTlw1Dvu6n9cb4kMCobJ28XPs&m=nUDCP_0kFOhDYlTHgOpWtf_zdL77aQFeQwYOGIQwP8c&s=9Z4T1hdtxOyIjpn6f70qhrQRuGsZxXdV-oLSJF1zGkY&e= From qlian003 at ucr.edu Thu Jan 4 14:36:18 2018 From: qlian003 at ucr.edu (Qihua Liang) Date: Thu, 4 Jan 2018 13:36:18 -0800 Subject: [maker-devel] questions on master_datastore_index.log file In-Reply-To: References: Message-ID: Hi Ence, When I searched for ?E/error? 
in the output file, here is what first showed up: Process::MpiChunk::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 Error::subs::try(CODE(0x502bbb0), HASH(0x5007788)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 4269 Process::MpiChunk::_go(Process::MpiChunk=HASH(0x50a1a18), "flow", HASH(0x50ad0f0), 2, 0) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 378 Process::MpiChunk::_flow(Process::MpiChunk=HASH(0x50a1a18), HASH(0x50ad0f0), 2, 0, Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 318 Process::MpiTiers::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 Error::subs::try(CODE(0x50a9348), HASH(0x4ff0ec0)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 338 Process::MpiTiers::_next_level(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 179 Is this what you may need? Qihua > On Jan 4, 2018, at 6:16 AM, Ence,daniel wrote: > > Hi, Before we can give any help to debug it, we need the error messages. These should be in the same file that the ?maker is finished? message is in. Look for the first error message (the one closest to the top of the file) and send that to the mailing list. > > Thanks, > Daniel > > >> On Jan 3, 2018, at 8:52 PM, Qihua Liang wrote: >> >> Dear Maker Develop Team, >> >> I have successfully run Maker for several times before. But I came across a strange thing days ago when I ran Maker again on a different assembly with the same input files and settings. >> >> I saw the message of "Maker is now finished!!!? but got empty GFF3 and no fasta files. 
And then I checked the master_datastore_index.log and realized that there are a lot of "failed"s and "retry"s and "failed" again. What does this mean? Since I used same inputs as previous successful runs, could you provide some instructions on how to debug and solve it?
>>
>> Thank you so much
>> Qihua
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com
>> https://urldefense.proofpoint.com/v2/url?u=http-3A__box290.bluehost.com_mailman_listinfo_maker-2Ddevel-5Fyandell-2Dlab.org&d=DwIGaQ&c=pZJPUDQ3SB9JplYbifm4nt2lEVG5pWx2KikqINpWlZM&r=12jzlNvGVD0AlPJ4E7cTlw1Dvu6n9cb4kMCobJ28XPs&m=nUDCP_0kFOhDYlTHgOpWtf_zdL77aQFeQwYOGIQwP8c&s=9Z4T1hdtxOyIjpn6f70qhrQRuGsZxXdV-oLSJF1zGkY&e=
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Fri Jan 5 20:22:56 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 5 Jan 2018 20:22:56 -0700
Subject: [maker-devel] questions on master_datastore_index.log file
In-Reply-To:
References:
Message-ID:

That's the stack trace. The error is going to be a few lines further back. It would be best to get a few hundred lines right around the area you are showing.

--Carson

> On Jan 4, 2018, at 2:36 PM, Qihua Liang wrote:
>
> Hi Ence,
>
> When I searched for "E/error"
in the output file, here is what first showed up: > Process::MpiChunk::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 > eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 > Error::subs::try(CODE(0x502bbb0), HASH(0x5007788)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 4269 > Process::MpiChunk::_go(Process::MpiChunk=HASH(0x50a1a18), "flow", HASH(0x50ad0f0), 2, 0) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 378 > Process::MpiChunk::_flow(Process::MpiChunk=HASH(0x50a1a18), HASH(0x50ad0f0), 2, 0, Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 318 > Process::MpiTiers::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 > eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 > Error::subs::try(CODE(0x50a9348), HASH(0x4ff0ec0)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 338 > Process::MpiTiers::_next_level(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 179 > > Is this what you may need? > > Qihua > >> On Jan 4, 2018, at 6:16 AM, Ence,daniel > wrote: >> >> Hi, Before we can give any help to debug it, we need the error messages. These should be in the same file that the ?maker is finished? message is in. Look for the first error message (the one closest to the top of the file) and send that to the mailing list. >> >> Thanks, >> Daniel >> >> >>> On Jan 3, 2018, at 8:52 PM, Qihua Liang > wrote: >>> >>> Dear Maker Develop Team, >>> >>> I have successfully run Maker for several times before. But I came across a strange thing days ago when I ran Maker again on a different assembly with the same input files and settings. >>> >>> I saw the message of "Maker is now finished!!!? but got empty GFF3 and no fasta files. 
And then I checked the master_datastore_index.log and realized that there are a lot of "failed"s and "retry"s and "failed" again. What does this mean? Since I used same inputs as previous successful runs, could you provide some instructions on how to debug and solve it?
>>>
>>> Thank you so much
>>> Qihua
>>> _______________________________________________
>>> maker-devel mailing list
>>> maker-devel at box290.bluehost.com
>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__box290.bluehost.com_mailman_listinfo_maker-2Ddevel-5Fyandell-2Dlab.org&d=DwIGaQ&c=pZJPUDQ3SB9JplYbifm4nt2lEVG5pWx2KikqINpWlZM&r=12jzlNvGVD0AlPJ4E7cTlw1Dvu6n9cb4kMCobJ28XPs&m=nUDCP_0kFOhDYlTHgOpWtf_zdL77aQFeQwYOGIQwP8c&s=9Z4T1hdtxOyIjpn6f70qhrQRuGsZxXdV-oLSJF1zGkY&e=
>>
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Tue Jan 16 11:15:29 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 16 Jan 2018 11:15:29 -0700
Subject: [maker-devel] Names/IDs from pred_gff are included in final gff
In-Reply-To: <76613C81-ED9B-45F2-B84B-B60BC1D4D972@ibv.uio.no>
References: <76613C81-ED9B-45F2-B84B-B60BC1D4D972@ibv.uio.no>
Message-ID: <8CB421A6-3CB0-4539-B55A-D3F4CA61D0AD@gmail.com>

pred_gff will maintain its name in the match/match_part feature as the information is pulled directly from the input GFF3. But any pred_gff feature that becomes a final model will be renamed to something like "scaffold_1517-pred_gff_GeneMark.hmm-gene-0.6-mRNA-1" unless you specify map_forward=1 to maintain the original name.

--Carson

> On Jan 4, 2018, at 6:21 AM, Ole Kristian Tørresen wrote:
>
> Hi,
> as far as I can see, names or IDs of features in gffs given to pred_gff is included in the final output as the name of the feature.
As far as I can understand, this is not expected behaviour (it is for model_gff). This is with MAKER 2.31.9. > > I have these settings: > map_forward=0 > keep_preds=1 > > I thought that map_forward had to be 1 to get the names for the old GFFs. Can you replicate this? > > Thank you. > > Sincerely, > Ole K. T?rresen > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From o.k.torresen at ibv.uio.no Wed Jan 17 10:52:13 2018 From: o.k.torresen at ibv.uio.no (=?utf-8?B?T2xlIEtyaXN0aWFuIFTDuHJyZXNlbg==?=) Date: Wed, 17 Jan 2018 17:52:13 +0000 Subject: [maker-devel] Names/IDs from pred_gff are included in final gff In-Reply-To: <8CB421A6-3CB0-4539-B55A-D3F4CA61D0AD@gmail.com> References: <76613C81-ED9B-45F2-B84B-B60BC1D4D972@ibv.uio.no> <8CB421A6-3CB0-4539-B55A-D3F4CA61D0AD@gmail.com> Message-ID: <583A84D5-B979-4C2F-B262-2D55A6F55B56@ibv.uio.no> Ok, but I have an entry in the final gff like this: ID=maker-GmG20150304_scaffold_2371-pred_gff_maker-gene-0.125-mRNA-1;Parent=maker-GmG20150304_scaffold_2371-pred_gff_maker-gene-0.125;Name=ENSGMOT00000000668.1;_AED=0.00;_eAED=0.00;_QI=819|1|1|1|1|1|4|112|726;score=89.75616 (The name is derived from a pred_gff entry which is the results of mapping an old annotation to the new assembly). This is then called ENSGMOT00000000668.1 protein AED:0.00 eAED:0.00 QI:819|1|1|1|1|1|4|112|726 in the proteins.fasta file. Which is unfortunate, because it apparently mapped 12 places in the assembly. I have set map_forward=0, but keep_preds=1 (filtering on domain presence and AED score later). This and another file (result of genemark_gtf2gff3), is not input as match/match_part to MAKER, but with gene/exon/CDS/mRNA. Could that be the issue? 
Ole

From qlian003 at ucr.edu Sat Jan 6 16:09:55 2018
From: qlian003 at ucr.edu (Qihua Liang)
Date: Sat, 6 Jan 2018 15:09:55 -0800
Subject: [maker-devel] questions on master_datastore_index.log file
In-Reply-To:
References:
Message-ID: <0BECB285-BB11-4F46-B6D7-072640F311B2@ucr.edu>

Hi Carson,

I am pasting more lines of the error messages. I noticed this error: "ERROR: Can't open seq file: /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/query.masked.gff.seq". The name of that sequence is ">ScsGwly_6124;HRSCAF=6247". Is it the sequence naming that makes the temp file name look weird?

Thanks
Qihua

#--------- command -------------#
Widget::blastx:
/24-2/home/qliang/0.soft/maker/bin/../exe/blast/bin/blastx -db /tmp/maker_nJDkCL/te_proteins%2Efasta.mpi.10.9 -query /tmp/maker_nJDkCL/0/ScsGwly_5932%3BHRSCAF=6050.0 -num_alignments 10000 -num_descriptions 10000 -evalue 1e-06 -dbsize 300 -searchsp 500000000 -num_threads 1 -seg yes -soft_masking true -lcase_masking -show_gis -out /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/D7/4A/ScsGwly_5932%3BHRSCAF=6050//theVoid.ScsGwly_5932%3BHRSCAF=6050/0/ScsGwly_5932%3BHRSCAF=6050.0.te_proteins%2Efasta.repeatrunner.temp_dir/te_proteins%2Efasta.mpi.10.9.repeatrunner
#-------------------------------#
deleted:0 hits
collecting blastx repeatmasking
processing all repeats
in cluster::shadow_cluster...
...finished clustering.
preparing masked sequence
preparing ab-inits
running snap.
#--------- command -------------#
Widget::snap:
/24-2/home/qliang/0.soft/maker/exe/snap/snap /home/qliang/cowpea/annotation/09.tingting/4.Abintio/2.CEGMA/3.maker/maker1.hmm/maker1.snap.hmm /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0 > /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0.maker1%2Esnap%2Ehmm.snap
#-------------------------------#
scoring....decoding.10.20.30.40.50.60.70.80.90.100 done
scoring....decoding.10.20.30.40.50.60.70.80.90.100 done
running augustus.
#--------- command -------------#
Widget::augustus:
/usr/local/augustus.2.7/bin/augustus --species=cowpea_new --UTR=off /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0 > /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0.cowpea_new.augustus
#-------------------------------#
deleted:0 hits
collecting blastx repeatmasking
processing all repeats
in cluster::shadow_cluster...
...finished clustering.
deleted:0 hits
doing blastx repeats
running blast search.
#--------- command -------------#
Widget::blastx:
/24-2/home/qliang/0.soft/maker/bin/../exe/blast/bin/blastx -db /tmp/maker_mvdRkd/te_proteins%2Efasta.mpi.10.6 -query /tmp/maker_mvdRkd/0/chr10.75 -num_alignments 10000 -num_descriptions 10000 -evalue 1e-06 -dbsize 300 -searchsp 500000000 -num_threads 1 -seg yes -soft_masking true -lcase_masking -show_gis -out /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/09/chr10//theVoid.chr10/7/chr10.75.te_proteins%2Efasta.repeatrunner.temp_dir/te_proteins%2Efasta.mpi.10.6.repeatrunner
#-------------------------------#
doing blastx repeats
re reading blast report.
/24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/0/ScsGwly_6124%3BHRSCAF=6247.0.te_proteins%2Efasta.repeatrunner
deleted:0 hits
doing blastx repeats
doing blastx repeats
doing blastx repeats
doing blastx repeats
doing blastx repeats
doing blastx repeats
doing blastx repeats
doing blastx repeats
doing blastx repeats
collecting blastx repeatmasking
processing all repeats
in cluster::shadow_cluster...
...finished clustering.
ERROR: Can't open seq file: /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/query.masked.gff.seq
No such file or directory

at /24-2/home/qliang/0.soft/maker/bin/../lib/Dumper/GFF/GFFV3.pm line 199.
Dumper::GFF::GFFV3::finalize(Dumper::GFF::GFFV3=HASH(0x5000ab8)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 700
Process::MpiChunk::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415
eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407
Error::subs::try(CODE(0x502bbb0), HASH(0x5007788)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 4269
Process::MpiChunk::_go(Process::MpiChunk=HASH(0x50a1a18), "flow", HASH(0x50ad0f0), 2, 0) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 378
Process::MpiChunk::_flow(Process::MpiChunk=HASH(0x50a1a18), HASH(0x50ad0f0), 2, 0, Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 318
Process::MpiTiers::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415
eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407
Error::subs::try(CODE(0x50a9348), HASH(0x4ff0ec0)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 338
Process::MpiTiers::_next_level(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 179
Process::MpiTiers::next_chunk(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 286
Process::MpiTiers::run_all(Process::MpiTiers=HASH(0x4fb3350), 0) called at /home/qliang/0.soft/maker/bin/maker line 695
--> rank=NA, hostname=H4
ERROR: Failed while builing masking tiers
--> rank=NA, hostname=H4
--> rank=NA, hostname=H4
ERROR: Can not get next level
running genemark.
#--------- command -------------#
Widget::genemark:
/24-2/home/qliang/0.soft/PerlPackages/ActivePerl-5.22/bin/perl-static /24-2/home/qliang/0.soft/maker/bin/../lib/Widget/genemark/gmhmm_wrap -m /home/qliang/cowpea/annotation/05.CEGMA/2.genemask/output/gmhmm.mod -g /24-2/home/qliang/0.soft/makerPackages/gm_et_linux_64/gmes_petap/gmhmme3 -p /24-2/home/qliang/0.soft/makerPackages/gm_et_linux_64/gmes_petap/probuild -o /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_nomask.0.gmhmm%2Emod.genemark /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_nomask.0
#-------------------------------#
FAILED CONTIG:ScsGwly_6124;HRSCAF=6247

examining contents of the fasta file and run log



--Next Contig--

#---------------------------------------------------------------------
Now starting the contig!!
SeqID: ScsGwly_6140;HRSCAF=6263
Length: 1247
#---------------------------------------------------------------------

From carsonhh at gmail.com Tue Jan 9 10:14:05 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 9 Jan 2018 10:14:05 -0700
Subject: [maker-devel] questions on master_datastore_index.log file
In-Reply-To: <0BECB285-BB11-4F46-B6D7-072640F311B2@ucr.edu>
References: <0BECB285-BB11-4F46-B6D7-072640F311B2@ucr.edu>
Message-ID: <0E5E8721-E814-4BA5-891B-B1C312BC0D4A@gmail.com>

Your contig names may create issues, specifically the ';' character, but you should also remove the '=' character. However, I believe your problem may be IO. If you are running under MPI or are running multiple jobs, one of the machines may have that location unmounted, the disk may be full, you may have hit a system file quota limit, or the IO load may be slowing things down so much that the file has not actually finished being written when MAKER tries to read it. If IO load is the issue, then you just need to run fewer processes. The other possibilities would mean you need to make space, fix the mount, or raise any quotas on your systems.

Thanks,
Carson
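
The contig renaming suggested in this thread (removing ';' and '=' from names such as "ScsGwly_6124;HRSCAF=6247") could be sketched as below. This is an illustration only and not part of MAKER: the function names, the choice of '_' as the replacement, and the decision to replace every character outside letters, digits, '.', '_' and '-' are all assumptions.

```python
import re

# Characters kept as-is; everything else (including ';' and '=') becomes '_'.
# The allowed set is an assumption for illustration, not a MAKER requirement.
_UNSAFE = re.compile(r"[^A-Za-z0-9._-]")


def sanitize_id(seq_id):
    """Return seq_id with every unsafe character replaced by '_'."""
    return _UNSAFE.sub("_", seq_id)


def sanitize_fasta(lines):
    """Rewrite FASTA header lines; return (new_lines, {new_id: old_id})."""
    out, mapping = [], {}
    for line in lines:
        if line.startswith(">"):
            header = line[1:].rstrip("\n")
            parts = header.split(None, 1)  # sequence ID, optional description
            old_id = parts[0] if parts else ""
            new_id = sanitize_id(old_id)
            # Two different original names must not collapse into one new name.
            if new_id in mapping and mapping[new_id] != old_id:
                raise ValueError("renaming collision on %r" % new_id)
            mapping[new_id] = old_id
            desc = " " + parts[1] if len(parts) > 1 else ""
            out.append(">" + new_id + desc + "\n")
        else:
            out.append(line)  # sequence lines pass through unchanged
    return out, mapping
```

To use it on a file, read the assembly with open(...).readlines(), write the returned lines to a new FASTA, and keep the returned mapping so the original names can be restored later. Any GFF3 or other input that refers to the old names would need the same renaming.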
but got empty GFF3 and no fasta files. And then I checked the master_datastore_index.log and realized that there are a lot of ?failed?s and ?retry?s and ?failed? again. What does this mean? Since I used same inputs as previous successful runs, could you provide some instructions on how to debug and solve it? >>>>> >>>>> Thank you so much >>>>> Qihua >>>>> _______________________________________________ >>>>> maker-devel mailing list >>>>> maker-devel at box290.bluehost.com >>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__box290.bluehost.com_mailman_listinfo_maker-2Ddevel-5Fyandell-2Dlab.org&d=DwIGaQ&c=pZJPUDQ3SB9JplYbifm4nt2lEVG5pWx2KikqINpWlZM&r=12jzlNvGVD0AlPJ4E7cTlw1Dvu6n9cb4kMCobJ28XPs&m=nUDCP_0kFOhDYlTHgOpWtf_zdL77aQFeQwYOGIQwP8c&s=9Z4T1hdtxOyIjpn6f70qhrQRuGsZxXdV-oLSJF1zGkY&e= >>>> >>> >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From qlian003 at ucr.edu Tue Jan 9 11:10:49 2018 From: qlian003 at ucr.edu (Qihua Liang) Date: Tue, 9 Jan 2018 10:10:49 -0800 Subject: [maker-devel] questions on master_datastore_index.log file In-Reply-To: <0E5E8721-E814-4BA5-891B-B1C312BC0D4A@gmail.com> References: <0BECB285-BB11-4F46-B6D7-072640F311B2@ucr.edu> <0E5E8721-E814-4BA5-891B-B1C312BC0D4A@gmail.com> Message-ID: <87A06F3B-82C1-4B21-906E-69DC1308DEC6@ucr.edu> Hi Carson, I just check with the system administrator and we think the disk space should be working fine. And actually I also ran another attempt with much fewer processors days ago and I am having the same issues. Maybe I will try renaming the contig names to see how the new attempt works? Or any other suggestions? Thank you! Qihua > On Jan 9, 2018, at 9:14 AM, Carson Holt wrote: > > Your contig names may create issues. Specifically the ?;? 
character, but you should also remove the ?=? character. However, I believe your problem may be IO. If you are running under MPI or are running multiple jobs, the disk one of the machines may have that location unmounted, it may be full, you may have hit a system file quota limit, or the IO load is slowing it is not actually finished writing the file when MAKER tries to read it. If IO load, is the issue, then you just need to run fewer processes. The other possibilities would mean you need to make space, fix the mount, or raise any quotas on your systems. > > Thanks, > Carson > > On Jan 6, 2018, at 4:09 PM, Qihua Liang > wrote: > >> Hi Carson, >> >> I am pasting more lines of error messages. I notice an error of "ERROR: Can't open seq file: /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/query.masked.gff.seq?, the seq name of ?ScsGwly? is ">ScsGwly_6124;HRSCAF=6247?, is it because of the seq naming that makes the temp file name weird? >> >> Thanks >> Qihua >> >> #--------- command -------------# >> Widget::blastx: >> /24-2/home/qliang/0.soft/maker/bin/../exe/blast/bin/blastx -db /tmp/maker_nJDkCL/te_proteins%2Efasta.mpi.10.9 -query /tmp/maker_nJDkCL/0/ScsG >> wly_5932%3BHRSCAF=6050.0 -num_alignments 10000 -num_descriptions 10000 -evalue 1e-06 -dbsize 300 -searchsp 500000000 -num_threads 1 -seg yes >> -soft_masking true -lcase_masking -show_gis -out /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/D7/4 >> A/ScsGwly_5932%3BHRSCAF=6050//theVoid.ScsGwly_5932%3BHRSCAF=6050/0/ScsGwly_5932%3BHRSCAF=6050.0.te_proteins%2Efasta.repeatrunner.temp_dir/te_ >> proteins%2Efasta.mpi.10.9.repeatrunner >> #-------------------------------# >> deleted:0 hits >> collecting blastx repeatmasking >> processing all repeats >> in cluster::shadow_cluster... >> ...finished clustering. 
>> preparing masked sequence >> preparing ab-inits >> running snap. >> #--------- command -------------# >> Widget::snap: >> /24-2/home/qliang/0.soft/maker/exe/snap/snap /home/qliang/cowpea/annotation/09.tingting/4.Abintio/2.CEGMA/3.maker/maker1.hmm/maker1.snap.hmm >> /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0 > /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0.maker1%2Esnap%2Eh >> mm.snap >> #-------------------------------# >> scoring....decoding.10.20.30.40.50.60.70.80.90.100 done >> scoring....decoding.10.20.30.40.50.60.70.80.90.100 done >> running augustus. >> #--------- command -------------# >> Widget::augustus: >> /usr/local/augustus.2.7/bin/augustus --species=cowpea_new --UTR=off /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0 > /tmp/maker >> _nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0.cowpea_new.augustus >> #-------------------------------# >> deleted:0 hits >> collecting blastx repeatmasking >> processing all repeats >> in cluster::shadow_cluster... >> ...finished clustering. >> deleted:0 hits >> doing blastx repeats >> running blast search. >> #--------- command -------------# >> Widget::blastx: >> /24-2/home/qliang/0.soft/maker/bin/../exe/blast/bin/blastx -db /tmp/maker_mvdRkd/te_proteins%2Efasta.mpi.10.6 -query /tmp/maker_mvdRkd/0/chr10.75 -num_alignments 10000 -num_descriptions 10000 -evalue 1e-06 -dbsize 300 -searchsp 500000000 -num_threads 1 -seg yes -soft_masking true -lcase_masking -show_gis -out /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/09/chr10//theVoid.chr10/7/chr10.75.te_proteins%2Efasta.repeatrunner.temp_dir/te_proteins%2Efasta.mpi.10.6.repeatrunner >> #-------------------------------# >> doing blastx repeats >> re reading blast report. 
>> /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/0/ScsGwly_6124%3BHRSCAF=6247.0.te_proteins%2Efasta.repeatrunner >> deleted:0 hits >> doing blastx repeats >> doing blastx repeats >> doing blastx repeats >> doing blastx repeats >> doing blastx repeats >> doing blastx repeats >> doing blastx repeats >> doing blastx repeats >> doing blastx repeats >> collecting blastx repeatmasking >> processing all repeats >> in cluster::shadow_cluster... >> ...finished clustering. >> ERROR: Can't open seq file: /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/query.masked.gff.seq >> No such file or directory >> >> at /24-2/home/qliang/0.soft/maker/bin/../lib/Dumper/GFF/GFFV3.pm line 199. >> Dumper::GFF::GFFV3::finalize(Dumper::GFF::GFFV3=HASH(0x5000ab8)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 700 >> Process::MpiChunk::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 >> eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 >> Error::subs::try(CODE(0x502bbb0), HASH(0x5007788)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 4269 >> Process::MpiChunk::_go(Process::MpiChunk=HASH(0x50a1a18), "flow", HASH(0x50ad0f0), 2, 0) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 378 >> Process::MpiChunk::_flow(Process::MpiChunk=HASH(0x50a1a18), HASH(0x50ad0f0), 2, 0, Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 318 >> Process::MpiTiers::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 >> eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 >> Error::subs::try(CODE(0x50a9348), HASH(0x4ff0ec0)) called at 
/24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 338 >> Process::MpiTiers::_next_level(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 179 >> Process::MpiTiers::next_chunk(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 286 >> Process::MpiTiers::run_all(Process::MpiTiers=HASH(0x4fb3350), 0) called at /home/qliang/0.soft/maker/bin/maker line 695 >> --> rank=NA, hostname=H4 >> ERROR: Failed while builing masking tiers >> --> rank=NA, hostname=H4 >> --> rank=NA, hostname=H4 >> ERROR: Can not get next level >> running genemark. >> #--------- command -------------# >> Widget::genemark: >> /24-2/home/qliang/0.soft/PerlPackages/ActivePerl-5.22/bin/perl-static /24-2/home/qliang/0.soft/maker/bin/../lib/Widget/genemark/gmhmm_wrap -m /home/qliang/cowpea/annotation/05.CEGMA/2.genemask/output/gmhmm.mod -g /24-2/home/qliang/0.soft/makerPackages/gm_et_linux_64/gmes_petap/gmhmme3 -p /24-2/home/qliang/0.soft/makerPackages/gm_et_linux_64/gmes_petap/probuild -o /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_nomask.0.gmhmm%2Emod.genemark /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_nomask.0 >> #-------------------------------# >> FAILED CONTIG:ScsGwly_6124;HRSCAF=6247 >> >> examining contents of the fasta file and run log >> >> >> >> --Next Contig-- >> >> #--------------------------------------------------------------------- >> Now starting the contig!! >> SeqID: ScsGwly_6140;HRSCAF=6263 >> Length: 1247 >> #--------------------------------------------------------------------- >> >> >> >> >>> On Jan 5, 2018, at 7:22 PM, Carson Holt > wrote: >>> >>> That?s the stack trace. The error is going to be a few lines further back. It would be best to get a few hundred lines right around the area you are showing. 
>>> >>> ?Carson >>> >>>> On Jan 4, 2018, at 2:36 PM, Qihua Liang > wrote: >>>> >>>> Hi Ence, >>>> >>>> When I searched for ?E/error? in the output file, here is what first showed up: >>>> Process::MpiChunk::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 >>>> eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 >>>> Error::subs::try(CODE(0x502bbb0), HASH(0x5007788)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 4269 >>>> Process::MpiChunk::_go(Process::MpiChunk=HASH(0x50a1a18), "flow", HASH(0x50ad0f0), 2, 0) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 378 >>>> Process::MpiChunk::_flow(Process::MpiChunk=HASH(0x50a1a18), HASH(0x50ad0f0), 2, 0, Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 318 >>>> Process::MpiTiers::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 >>>> eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 >>>> Error::subs::try(CODE(0x50a9348), HASH(0x4ff0ec0)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 338 >>>> Process::MpiTiers::_next_level(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 179 >>>> >>>> Is this what you may need? >>>> >>>> Qihua >>>> >>>>> On Jan 4, 2018, at 6:16 AM, Ence,daniel > wrote: >>>>> >>>>> Hi, Before we can give any help to debug it, we need the error messages. These should be in the same file that the ?maker is finished? message is in. Look for the first error message (the one closest to the top of the file) and send that to the mailing list. >>>>> >>>>> Thanks, >>>>> Daniel >>>>> >>>>> >>>>>> On Jan 3, 2018, at 8:52 PM, Qihua Liang > wrote: >>>>>> >>>>>> Dear Maker Develop Team, >>>>>> >>>>>> I have successfully run Maker for several times before. 
But I came across a strange thing days ago when I ran Maker again on a different assembly with the same input files and settings.
>>>>>>
>>>>>> I saw the message of "Maker is now finished!!!" but got empty GFF3 and no fasta files. And then I checked the master_datastore_index.log and realized that there are a lot of "failed"s and "retry"s and "failed" again. What does this mean? Since I used same inputs as previous successful runs, could you provide some instructions on how to debug and solve it?
>>>>>>
>>>>>> Thank you so much
>>>>>> Qihua

From carsonhh at gmail.com  Wed Jan 10 12:05:03 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 10 Jan 2018 12:05:03 -0700
Subject: [maker-devel] questions on master_datastore_index.log file
In-Reply-To: <87A06F3B-82C1-4B21-906E-69DC1308DEC6@ucr.edu>
References: <0BECB285-BB11-4F46-B6D7-072640F311B2@ucr.edu> <0E5E8721-E814-4BA5-891B-B1C312BC0D4A@gmail.com> <87A06F3B-82C1-4B21-906E-69DC1308DEC6@ucr.edu>
Message-ID: <36B45AA1-3D02-4E83-9EF8-85D56C4D3020@gmail.com>

The error is saying exactly that the file MAKER just created does not exist. The only time we ever see this is when using network mounted locations under heavy IO load.
Most network storage options use asynchronous IO, which means the system reports success on a file operation before it has actually completed. So it can say it finished writing a file before the file actually exists. If you try to open the file right away, it doesn't exist yet and everything fails. But that only happens under heavy IO (lots of things going on in that mount location). So if you are getting persistent failures you may want to try a different work directory, or get your IT to troubleshoot IO load in the directory you are using.

--Carson

> On Jan 9, 2018, at 11:10 AM, Qihua Liang wrote:
>
> Hi Carson,
>
> I just checked with the system administrator and we think the disk space should be working fine. And actually I also ran another attempt with much fewer processors days ago and I am having the same issues.
>
> Maybe I will try renaming the contig names to see how the new attempt works? Or any other suggestions?
>
> Thank you!
> Qihua
>
>> On Jan 9, 2018, at 9:14 AM, Carson Holt wrote:
>>
>> Your contig names may create issues. Specifically the ";" character, but you should also remove the "=" character. However, I believe your problem may be IO. If you are running under MPI or are running multiple jobs, one of the machines may have that location unmounted, the disk may be full, you may have hit a system file quota limit, or the IO load is slowing things down so much that the file is not actually finished writing when MAKER tries to read it. If IO load is the issue, then you just need to run fewer processes. The other possibilities would mean you need to make space, fix the mount, or raise any quotas on your systems.
>>
>> Thanks,
>> Carson
>>
>> On Jan 6, 2018, at 4:09 PM, Qihua Liang wrote:
>>
>>> Hi Carson,
>>>
>>> I am pasting more lines of error messages.
I notice an error of "ERROR: Can't open seq file: /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/query.masked.gff.seq". The seq name of "ScsGwly" is ">ScsGwly_6124;HRSCAF=6247". Is it because of the seq naming that the temp file name comes out weird?
>>>
>>> Thanks
>>> Qihua
>>>
>>> #--------- command -------------#
>>> Widget::blastx:
>>> /24-2/home/qliang/0.soft/maker/bin/../exe/blast/bin/blastx -db /tmp/maker_nJDkCL/te_proteins%2Efasta.mpi.10.9 -query /tmp/maker_nJDkCL/0/ScsGwly_5932%3BHRSCAF=6050.0 -num_alignments 10000 -num_descriptions 10000 -evalue 1e-06 -dbsize 300 -searchsp 500000000 -num_threads 1 -seg yes -soft_masking true -lcase_masking -show_gis -out /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/D7/4A/ScsGwly_5932%3BHRSCAF=6050//theVoid.ScsGwly_5932%3BHRSCAF=6050/0/ScsGwly_5932%3BHRSCAF=6050.0.te_proteins%2Efasta.repeatrunner.temp_dir/te_proteins%2Efasta.mpi.10.9.repeatrunner
>>> #-------------------------------#
>>> deleted:0 hits
>>> collecting blastx repeatmasking
>>> processing all repeats
>>> in cluster::shadow_cluster...
>>> ...finished clustering.
>>> preparing masked sequence
>>> preparing ab-inits
>>> running snap.
>>> #--------- command -------------#
>>> Widget::snap:
>>> /24-2/home/qliang/0.soft/maker/exe/snap/snap /home/qliang/cowpea/annotation/09.tingting/4.Abintio/2.CEGMA/3.maker/maker1.hmm/maker1.snap.hmm /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0 > /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0.maker1%2Esnap%2Ehmm.snap
>>> #-------------------------------#
>>> scoring....decoding.10.20.30.40.50.60.70.80.90.100 done
>>> scoring....decoding.10.20.30.40.50.60.70.80.90.100 done
>>> running augustus.
>>> #--------- command -------------# >>> Widget::augustus: >>> /usr/local/augustus.2.7/bin/augustus --species=cowpea_new --UTR=off /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0 > /tmp/maker >>> _nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0.cowpea_new.augustus >>> #-------------------------------# >>> deleted:0 hits >>> collecting blastx repeatmasking >>> processing all repeats >>> in cluster::shadow_cluster... >>> ...finished clustering. >>> deleted:0 hits >>> doing blastx repeats >>> running blast search. >>> #--------- command -------------# >>> Widget::blastx: >>> /24-2/home/qliang/0.soft/maker/bin/../exe/blast/bin/blastx -db /tmp/maker_mvdRkd/te_proteins%2Efasta.mpi.10.6 -query /tmp/maker_mvdRkd/0/chr10.75 -num_alignments 10000 -num_descriptions 10000 -evalue 1e-06 -dbsize 300 -searchsp 500000000 -num_threads 1 -seg yes -soft_masking true -lcase_masking -show_gis -out /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/09/chr10//theVoid.chr10/7/chr10.75.te_proteins%2Efasta.repeatrunner.temp_dir/te_proteins%2Efasta.mpi.10.6.repeatrunner >>> #-------------------------------# >>> doing blastx repeats >>> re reading blast report. >>> /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/0/ScsGwly_6124%3BHRSCAF=6247.0.te_proteins%2Efasta.repeatrunner >>> deleted:0 hits >>> doing blastx repeats >>> doing blastx repeats >>> doing blastx repeats >>> doing blastx repeats >>> doing blastx repeats >>> doing blastx repeats >>> doing blastx repeats >>> doing blastx repeats >>> doing blastx repeats >>> collecting blastx repeatmasking >>> processing all repeats >>> in cluster::shadow_cluster... >>> ...finished clustering. 
>>> ERROR: Can't open seq file: /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/query.masked.gff.seq >>> No such file or directory >>> >>> at /24-2/home/qliang/0.soft/maker/bin/../lib/Dumper/GFF/GFFV3.pm line 199. >>> Dumper::GFF::GFFV3::finalize(Dumper::GFF::GFFV3=HASH(0x5000ab8)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 700 >>> Process::MpiChunk::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 >>> eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 >>> Error::subs::try(CODE(0x502bbb0), HASH(0x5007788)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 4269 >>> Process::MpiChunk::_go(Process::MpiChunk=HASH(0x50a1a18), "flow", HASH(0x50ad0f0), 2, 0) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 378 >>> Process::MpiChunk::_flow(Process::MpiChunk=HASH(0x50a1a18), HASH(0x50ad0f0), 2, 0, Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 318 >>> Process::MpiTiers::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 >>> eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 >>> Error::subs::try(CODE(0x50a9348), HASH(0x4ff0ec0)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 338 >>> Process::MpiTiers::_next_level(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 179 >>> Process::MpiTiers::next_chunk(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 286 >>> Process::MpiTiers::run_all(Process::MpiTiers=HASH(0x4fb3350), 0) called at /home/qliang/0.soft/maker/bin/maker line 695 >>> --> rank=NA, hostname=H4 >>> ERROR: Failed while builing masking 
tiers
>>> --> rank=NA, hostname=H4
>>> --> rank=NA, hostname=H4
>>> ERROR: Can not get next level
>>> running genemark.
>>> #--------- command -------------#
>>> Widget::genemark:
>>> /24-2/home/qliang/0.soft/PerlPackages/ActivePerl-5.22/bin/perl-static /24-2/home/qliang/0.soft/maker/bin/../lib/Widget/genemark/gmhmm_wrap -m /home/qliang/cowpea/annotation/05.CEGMA/2.genemask/output/gmhmm.mod -g /24-2/home/qliang/0.soft/makerPackages/gm_et_linux_64/gmes_petap/gmhmme3 -p /24-2/home/qliang/0.soft/makerPackages/gm_et_linux_64/gmes_petap/probuild -o /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_nomask.0.gmhmm%2Emod.genemark /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_nomask.0
>>> #-------------------------------#
>>> FAILED CONTIG:ScsGwly_6124;HRSCAF=6247
>>>
>>> examining contents of the fasta file and run log
>>>
>>> --Next Contig--
>>>
>>> #---------------------------------------------------------------------
>>> Now starting the contig!!
>>> SeqID: ScsGwly_6140;HRSCAF=6263
>>> Length: 1247
>>> #---------------------------------------------------------------------
>>>
>>>> On Jan 5, 2018, at 7:22 PM, Carson Holt wrote:
>>>>
>>>> That's the stack trace. The error is going to be a few lines further back. It would be best to get a few hundred lines right around the area you are showing.
>>>>
>>>> --Carson
>>>>
>>>>> On Jan 4, 2018, at 2:36 PM, Qihua Liang wrote:
>>>>>
>>>>> Hi Ence,
>>>>>
>>>>> When I searched for "E/error"
in the output file, here is what first showed up:
>>>>> Process::MpiChunk::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415
>>>>> eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407
>>>>> Error::subs::try(CODE(0x502bbb0), HASH(0x5007788)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 4269
>>>>> Process::MpiChunk::_go(Process::MpiChunk=HASH(0x50a1a18), "flow", HASH(0x50ad0f0), 2, 0) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 378
>>>>> Process::MpiChunk::_flow(Process::MpiChunk=HASH(0x50a1a18), HASH(0x50ad0f0), 2, 0, Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 318
>>>>> Process::MpiTiers::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415
>>>>> eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407
>>>>> Error::subs::try(CODE(0x50a9348), HASH(0x4ff0ec0)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 338
>>>>> Process::MpiTiers::_next_level(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 179
>>>>>
>>>>> Is this what you may need?
>>>>>
>>>>> Qihua
>>>>>
>>>>>> On Jan 4, 2018, at 6:16 AM, Ence,daniel wrote:
>>>>>>
>>>>>> Hi, Before we can give any help to debug it, we need the error messages. These should be in the same file that the "maker is finished" message is in. Look for the first error message (the one closest to the top of the file) and send that to the mailing list.
>>>>>>
>>>>>> Thanks,
>>>>>> Daniel
>>>>>>
>>>>>>> On Jan 3, 2018, at 8:52 PM, Qihua Liang wrote:
>>>>>>>
>>>>>>> Dear Maker Develop Team,
>>>>>>>
>>>>>>> I have successfully run Maker for several times before. But I came across a strange thing days ago when I ran Maker again on a different assembly with the same input files and settings.
>>>>>>>
>>>>>>> I saw the message of "Maker is now finished!!!" but got empty GFF3 and no fasta files. And then I checked the master_datastore_index.log and realized that there are a lot of "failed"s and "retry"s and "failed" again. What does this mean? Since I used same inputs as previous successful runs, could you provide some instructions on how to debug and solve it?
>>>>>>>
>>>>>>> Thank you so much
>>>>>>> Qihua

From arsilan324 at gmail.com  Thu Jan 11 07:15:31 2018
From: arsilan324 at gmail.com (Muhammad Arslan)
Date: Thu, 11 Jan 2018 15:15:31 +0100
Subject: [maker-devel] GFF3 to .tbl
Message-ID:

Dear Madam or Sir,

I am writing this email to inquire whether there is any way to make a .tbl file from a MAKER-generated GFF3 file. This is required since I am trying to submit the annotation to NCBI. If there is any other solution for this, please advise accordingly.

Thank you very much!
Arslan

--
Muhammad Arslan
PhD Student / Guest Scientist
Department of Environmental Biotechnology

Helmholtz Centre for Environmental Research - UFZ
Permoserstraße 15, 04318 Leipzig, Germany
Phone +49 341 235 1696
muhammad.arslan at ufz.de, www.ufz.de

From carsonhh at gmail.com  Fri Jan 19 15:46:26 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 19 Jan 2018 15:46:26 -0700
Subject: [maker-devel] GFF3 to .tbl
In-Reply-To:
References:
Message-ID: <93BD3F52-1D76-465A-94EE-80D616BB72A6@gmail.com>

Try GAG --> https://genomeannotation.github.io/GAG/

--Carson

> On Jan 11, 2018, at 7:15 AM, Muhammad Arslan wrote:
>
> Dear Madam or Sir,
>
> I am writing this email to inquire whether there is any way to make a .tbl file from a MAKER-generated GFF3 file. This is required since I am trying to submit the annotation to NCBI. If there is any other solution for this, please advise accordingly.
>
> Thank you very much!
> Arslan

From qwzhang0601 at gmail.com  Mon Jan 22 10:23:34 2018
From: qwzhang0601 at gmail.com (Quanwei Zhang)
Date: Mon, 22 Jan 2018 12:23:34 -0500
Subject: [maker-devel] name of gene model
Message-ID:

Hello:

Would you please explain how the genes are named? Do similar names indicate sequence similarity (e.g., maker-Contig3217-snap-gene-35.13-mRNA-1, maker-Contig3217-snap-gene-35.14-mRNA-1)?

maker-Contig2667-augustus-gene-266.22-mRNA-1;
maker-Contig2667-snap-gene-266.5-mRNA-1;

maker-Contig3217-snap-gene-35.13-mRNA-1;
maker-Contig3217-snap-gene-35.14-mRNA-1;
maker-Contig3217-snap-gene-35.15-mRNA-1;
maker-Contig3217-snap-gene-35.16-mRNA-1

Thank you

Best
Quanwei
From carsonhh at gmail.com  Mon Jan 22 10:29:26 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 22 Jan 2018 10:29:26 -0700
Subject: [maker-devel] name of gene model
In-Reply-To:
References:
Message-ID: <499FF8DC-C277-484B-AEC9-EE7A35090615@gmail.com>

The only info in the name is the source program of the model (i.e. snap/augustus). The numbers are just meaningless iterators.

--Carson

> On Jan 22, 2018, at 10:23 AM, Quanwei Zhang wrote:
>
> Hello:
>
> Would you please explain how the genes are named? Do similar names indicate sequence similarity (e.g., maker-Contig3217-snap-gene-35.13-mRNA-1, maker-Contig3217-snap-gene-35.14-mRNA-1)?
>
> maker-Contig2667-augustus-gene-266.22-mRNA-1;
> maker-Contig2667-snap-gene-266.5-mRNA-1;
>
> maker-Contig3217-snap-gene-35.13-mRNA-1;
> maker-Contig3217-snap-gene-35.14-mRNA-1;
> maker-Contig3217-snap-gene-35.15-mRNA-1;
> maker-Contig3217-snap-gene-35.16-mRNA-1
>
> Thank you
>
> Best
> Quanwei

From yincl2013 at 126.com  Tue Jan 23 08:01:17 2018
From: yincl2013 at 126.com (Chuanlin Yin)
Date: Tue, 23 Jan 2018 23:01:17 +0800 (GMT+08:00)
Subject: [maker-devel] maker-3.01.02-beta run error
Message-ID: <165b9e3f.a94e.16123899e8a.Coremail.yincl2013@126.com>

Dear Mr/Ms,

Recently, when I tried to use maker-3.01.02-beta for genome annotation, the run failed with the following error:

Can't call method "add_entry" without a package or object reference at /gpfs/bioinformatics/software/maker-3.01.02-beta/bin/../lib/Widget/snap.pm line 540.
--> rank=NA, hostname=c01n02
ERROR: Failed while annotating transcripts
ERROR: Chunk failed at level:1, tier_type:4
FAILED CONTIG:002369F_pilon_obj

Could you explain why this happened?

Any reply is much appreciated. Thanks.
Best regards!
Showky

From Emily.Giroux at inspection.gc.ca  Tue Jan 23 13:35:06 2018
From: Emily.Giroux at inspection.gc.ca (Giroux, Emily (CFIA/ACIA))
Date: Tue, 23 Jan 2018 20:35:06 +0000
Subject: [maker-devel] maker pipeline 2nd round updating augustus
Message-ID: <397E3572255740488AA9993F4D41A3B036B588CB@ONOTTAXES2.AGR.GC.CA>

Hi,

I completed a first round of Maker, followed by snap and BUSCO to train augustus. I then placed the newly-trained species-specific augustus files in the augustus species directory and used this for my second round of maker.

What I'm wondering now is whether I should repeat this process after completing round 2 of maker: use BUSCO to retrain the augustus files again, replace the species-specific libraries from round 1 with those from round 2, and use these as input for my third round of maker.

Thank you very much,

Emily

From patrick.tranvan at unil.ch  Thu Jan 25 07:46:27 2018
From: patrick.tranvan at unil.ch (Patrick Tran Van)
Date: Thu, 25 Jan 2018 14:46:27 +0000
Subject: [maker-devel] Adding NR functional annotation
Message-ID: <1516891629951.7595@unil.ch>

Hi,

Can you please update maker_functional_gff and maker_functional_fasta to make them compatible with the NR database?

Thanks,
Patrick

From marni at cs.au.dk  Thu Jan 25 03:26:04 2018
From: marni at cs.au.dk (Marni Tausen)
Date: Thu, 25 Jan 2018 10:26:04 +0000
Subject: [maker-devel] Maker run problems - BLAST makeblastdb failed
Message-ID: <72D3C07A-D1A6-4759-B8D2-13EBE8DD7982@birc.au.dk>

Hey,

I have a problem getting maker to run.

I've tried installing the pipeline on three separate systems: CentOS 6 (cluster), Mac OS X 10.12.6 and CentOS 7.
With each of them I run into problems at the RepeatMasker step, with the error message:

#---------------------------------------------------------------------
Now starting the contig!!
SeqID: chr0
Length: 38046352
#---------------------------------------------------------------------

setting up GFF3 output and fasta chunks
doing repeat masking
running repeat masker.
#--------- command -------------#
Widget::RepeatMasker:
cd makertmp/maker_DMTHbJ; /Users/PM/maker/exe/RepeatMasker/RepeatMasker /Users/PM/GENEANNOTATION/TrR.v5.maker.output/TrR.v5_datastore/82/7E/chr0//theVoid.chr0/0/chr0.0.all.rb -species all -dir /Users/PM/GENEANNOTATION/TrR.v5.maker.output/TrR.v5_datastore/82/7E/chr0//theVoid.chr0/0 -pa 1
#-------------------------------#
doing blastx repeats
formating database...
#--------- command -------------#
Widget::formater:
/Users/PM/maker/bin/../exe/lblast/bin/makeblastdb -dbtype prot -in makertmp/maker_DMTHbJ/0/blastprep/te_proteins%2Efasta.mpi.10.0
#-------------------------------#
BLAST options error: File makertmp/maker_DMTHbJ/0/blastprep/te_proteins%2Efasta.mpi.10.0 does not exist
ERROR: /Users/PM/maker/bin/../exe/lblast/bin/makeblastdb failed in Widget::formater
--> rank=NA, hostname=d24834.local
ERROR: Failed while doing blastx repeats
ERROR: Chunk failed at level:1, tier_type:1
FAILED CONTIG:chr0

ERROR: Chunk failed at level:2, tier_type:0
FAILED CONTIG:chr0

examining contents of the fasta file and run log

The Maker version that was installed is 2.31.9, and it was built using the ./Build commands.

However, the links for exonerate (2.2.0) and repeatmasker (repbase) (latest version) seem to be broken, as they always returned connection errors.

So I manually installed both of those programs and linked maker to them.

I've attached the config files and the script used to run maker.

Cheers,
Marni Tausen

-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_bopts.ctl
Type: application/octet-stream
Size: 1413 bytes
Desc: maker_bopts.ctl
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_exe.ctl
Type: application/octet-stream
Size: 1277 bytes
Desc: maker_exe.ctl
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_opts.ctl
Type: application/octet-stream
Size: 4554 bytes
Desc: maker_opts.ctl
URL:

From mmokrejs at gmail.com  Thu Jan 25 09:05:45 2018
From: mmokrejs at gmail.com (Martin MOKREJŠ)
Date: Thu, 25 Jan 2018 17:05:45 +0100
Subject: [maker-devel] Maker run problems - BLAST makeblastdb failed
In-Reply-To: <72D3C07A-D1A6-4759-B8D2-13EBE8DD7982@birc.au.dk>
References: <72D3C07A-D1A6-4759-B8D2-13EBE8DD7982@birc.au.dk>
Message-ID:

Hi Marni,
do not use spaces in your filenames and directory names. I think that is your issue: te_proteins%2Efasta.mpi.10.0
Martin

From carsonhh at gmail.com  Thu Jan 25 14:20:21 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 25 Jan 2018 14:20:21 -0700
Subject: [maker-devel] Maker run problems - BLAST makeblastdb failed
In-Reply-To: <72D3C07A-D1A6-4759-B8D2-13EBE8DD7982@birc.au.dk>
References: <72D3C07A-D1A6-4759-B8D2-13EBE8DD7982@birc.au.dk>
Message-ID: <2AC145CF-6954-4740-BA88-A7ABBBC841D0@gmail.com>

You set TMP=makertmp. That is likely not a true locally mounted location (i.e. it's network mounted). In that case you will hit a race condition where, under heavy IO load, files you just created don't become readable for a few milliseconds to seconds after creation. Alternatively it is locally mounted, but only exists on a single node and you are running on a cluster (other nodes cannot access a node's local storage). Unless your cluster setup has a specific location for locally mounted temporary scratch space, you should not set TMP=. Just let it default to /tmp, which is almost always locally mounted.
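
One way to check the point above before rerunning is to look up which mount a candidate TMP directory lives on. What follows is only a minimal sketch in Python and is not part of MAKER. It assumes a Linux-style mount table (the /proc/mounts layout: device, mount point, filesystem type, options). The helper names `fs_type_for` and `is_network_mounted`, the sample mount table, and the list of network filesystem types are all made up for illustration, and the list is not exhaustive.

```python
import os

# Illustrative, not exhaustive: filesystem types that are usually network mounted.
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs", "smbfs", "lustre", "gpfs",
                    "beegfs", "ceph", "glusterfs"}

# Hypothetical mount table in the /proc/mounts layout, made up for this example.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime 0 0
fileserver:/export/home /home nfs4 rw,relatime 0 0
"""


def fs_type_for(path, mounts_text):
    """Return (mount_point, fs_type) for the longest mount point containing path."""
    path = os.path.abspath(path)
    best = ("", "unknown")
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # Compare with trailing slashes so that /home does not match /homework.
        if (path + "/").startswith(prefix) and len(mount_point) > len(best[0]):
            best = (mount_point, fs_type)
    return best


def is_network_mounted(path, mounts_text):
    """True if path sits on one of the network filesystem types listed above."""
    return fs_type_for(path, mounts_text)[1] in NETWORK_FS_TYPES


if __name__ == "__main__":
    # Report the mount point and filesystem type for two candidate TMP locations.
    for candidate in ("/home/pm/makertmp", "/tmp"):
        mount_point, fs_type = fs_type_for(candidate, SAMPLE_MOUNTS)
        print(candidate, mount_point, fs_type,
              is_network_mounted(candidate, SAMPLE_MOUNTS))
```

On a real Linux node the contents of /proc/mounts could be passed in place of the sample text. If the directory turns out to be network mounted, leaving TMP unset so that it defaults to /tmp is probably the simpler choice, as described above.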
--Carson

> On Jan 25, 2018, at 3:26 AM, Marni Tausen wrote:
>
> Hey,
>
> I have a problem getting maker to run.
>
> I've tried installing the pipeline on three separate systems. CentOS 6 (cluster), Mac OS X 10.12.6 and on CentOS 7.
>
> With each of them I run into problems with Repeatmasker step with the error message:
>
> #---------------------------------------------------------------------
> Now starting the contig!!
> SeqID: chr0
> Length: 38046352
> #---------------------------------------------------------------------
>
> setting up GFF3 output and fasta chunks
> doing repeat masking
> running repeat masker.
> #--------- command -------------#
> Widget::RepeatMasker:
> cd makertmp/maker_DMTHbJ; /Users/PM/maker/exe/RepeatMasker/RepeatMasker /Users/PM/GENEANNOTATION/TrR.v5.maker.output/TrR.v5_datastore/82/7E/chr0//theVoid.chr0/0/chr0.0.all.rb -species all -dir /Users/PM/GENEANNOTATION/TrR.v5.maker.output/TrR.v5_datastore/82/7E/chr0//theVoid.chr0/0 -pa 1
> #-------------------------------#
> doing blastx repeats
> formating database...
> #--------- command -------------#
> Widget::formater:
> /Users/PM/maker/bin/../exe/lblast/bin/makeblastdb -dbtype prot -in makertmp/maker_DMTHbJ/0/blastprep/te_proteins%2Efasta.mpi.10.0
> #-------------------------------#
> BLAST options error: File makertmp/maker_DMTHbJ/0/blastprep/te_proteins%2Efasta.mpi.10.0 does not exist
> ERROR: /Users/PM/maker/bin/../exe/lblast/bin/makeblastdb failed in Widget::formater
> --> rank=NA, hostname=d24834.local
> ERROR: Failed while doing blastx repeats
> ERROR: Chunk failed at level:1, tier_type:1
> FAILED CONTIG:chr0
>
> ERROR: Chunk failed at level:2, tier_type:0
> FAILED CONTIG:chr0
>
> examining contents of the fasta file and run log
>
> The Maker version that was installed is 2.31.9, and it was build using the ./Build commands.
>
> However the links for exonerate (2.2.0) and repeatmasker (repbase) (latest version) seem to be broken, as they always returned connection errors.
>
> So I manually installed both of those programs and linked maker to them.
>
> I've attached the config files and the script used to run maker.
>
> Cheers,
> Marni Tausen

From carsonhh at gmail.com  Thu Jan 25 14:29:37 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 25 Jan 2018 14:29:37 -0700
Subject: [maker-devel] maker pipeline 2nd round updating augustus
In-Reply-To: <397E3572255740488AA9993F4D41A3B036B588CB@ONOTTAXES2.AGR.GC.CA>
References: <397E3572255740488AA9993F4D41A3B036B588CB@ONOTTAXES2.AGR.GC.CA>
Message-ID: <2D069310-1BFC-4C30-98B5-739FC90A732B@gmail.com>

Don't use BUSCO to train for the second round. There is a bias in the models it produces, because it works from conserved genes that tend to be short and intron poor. You will want to avoid this bias in the second round, so use a broad selection of gene models instead.

Use the maker2zff script to select gene models for training (examples of doing this can be found on the maker tutorial wiki).

Then use this script to convert ZFF to GenBank format to train Augustus --> https://github.com/hyphaltip/genome-scripts/blob/master/gene_prediction/zff2augustus_gbk.pl

This is a nice guide to training Augustus using GenBank format input --> https://vcru.wisc.edu/simonlab/bioinformatics/programs/augustus/docs/tutorial2015/training.html

--Carson

> On Jan 23, 2018, at 1:35 PM, Giroux, Emily (CFIA/ACIA) wrote:
>
> Hi,
>
> I completed a first round of Maker, followed by snap and BUSCO to train augustus. I then placed the newly-trained species-specific augustus files in the augustus species directory and used this for my second round of maker.
>
> What I'm wondering now is whether I should repeat this process after completing round 2 of maker, and follow this with using BUSCO to retrain the augustus files again and replace the previous species-specific libraries from round 1 with those from round 2 and use these as input for my third round of maker.
>
> Thank-you very much,
>
> Emily

From carsonhh at gmail.com  Thu Jan 25 14:33:33 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 25 Jan 2018 14:33:33 -0700
Subject: [maker-devel] maker-3.01.02-beta run error
In-Reply-To: <165b9e3f.a94e.16123899e8a.Coremail.yincl2013@126.com>
References: <165b9e3f.a94e.16123899e8a.Coremail.yincl2013@126.com>
Message-ID:

Because of where that error occurred, it may be a snowball error (i.e. a result of another error upstream that is the real failure). Could you look back in the output to see if there is a failure further back? Perhaps include your entire STDERR log.

Thanks,
Carson

> On Jan 23, 2018, at 8:01 AM, Chuanlin Yin wrote:
>
> Dear Mr/Ms,
>
> Recently, when i want to use maker-3.01.02-beta for genome annotation. I had failed for the following error:
>
> Can't call method "add_entry" without a package or object reference at /gpfs/bioinformatics/software/maker-3.01.02-beta/bin/../lib/Widget/snap.pm line 540.
> --> rank=NA, hostname=c01n02
> ERROR: Failed while annotating transcripts
> ERROR: Chunk failed at level:1, tier_type:4
> FAILED CONTIG:002369F_pilon_obj
>
> Could you explain why it happened!
>
> Much appreciated for any replies. Thanks.
>
> Best regards!
> Showky

From qwzhang0601 at gmail.com  Fri Jan 26 09:40:32 2018
From: qwzhang0601 at gmail.com (Quanwei Zhang)
Date: Fri, 26 Jan 2018 11:40:32 -0500
Subject: [maker-devel] map the transcripts back onto the genome using "est2genome=1"
Message-ID:

Hello:

I am trying to annotate a new NMR genome assembly. Since a gene annotation for the old version of NMR is available from NCBI, I tried to map the published RefSeq transcripts onto the genome with "est2genome=1". But I found quite a few genes were lost during mapping.

Then I did another test to check how well the mapping with "est2genome=1" works. I mapped the published RefSeq transcripts onto the old genome (the same version the published gene annotation was made for) by maker with "est2genome=1". I still found that quite a few genes were lost during the mapping. Below are the BUSCO results, which assess annotation completeness with single-copy orthologs. As you can see, even if we only consider the single-copy orthologs, 4% were still not mapped back to the genome.

Do you have any comments on this? Besides, would you please give us some suggestions for getting more of the published gene annotation to map back to the same genome assembly through "est2genome=1"? Attached is the maker_opts.ctl file I used for the mapping.

Many thanks.

# this is the BUSCOs results using the published gene annotation
C:99.3%[S:33.3%,D:66.0%],F:0.3%,M:0.4%,n:4104
4077 Complete BUSCOs (C)
1367 Complete and single-copy BUSCOs (S)
2710 Complete and duplicated BUSCOs (D)
14 Fragmented BUSCOs (F)
13 Missing BUSCOs (M)
4104 Total BUSCO groups searched

#this is the BUSCOs results using gene models after mapping by maker2.
C:93.4%[S:36.5%,D:56.9%],F:2.6%,M:4.0%,n:4104
3830 Complete BUSCOs (C)
1496 Complete and single-copy BUSCOs (S)
2334 Complete and duplicated BUSCOs (D)
105 Fragmented BUSCOs (F)
169 Missing BUSCOs (M)
4104 Total BUSCO groups searched

Best
Quanwei

-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_opts.ctl
Type: application/octet-stream
Size: 4733 bytes
Desc: not available
URL:

From qwzhang0601 at gmail.com Fri Jan 26 16:16:50 2018
From: qwzhang0601 at gmail.com (Quanwei Zhang)
Date: Fri, 26 Jan 2018 18:16:50 -0500
Subject: [maker-devel] gene annotation for a better genome
In-Reply-To:
References: <5AFEDD05-DF02-463F-A6EE-1619A9BB968D@gmail.com>
Message-ID:

Hi Carson:

Thank you for your previous suggestions. I have done the annotation according to them: I first mapped the transcripts from the old assembly to the new assembly by setting "est2genome=1", and then updated the models with new predictions.

Besides mapping with "est2genome=1", do you think it is a good idea to do a separate mapping with the proteins of the old assembly (setting "protein2genome=1")? I would then provide both mapping GFF files (i.e., the GFFs from mapping transcripts and proteins separately) and update them with new predictions and evidence support. The reason I am trying this is that I found certain genes were not mapped to the new assembly by transcripts, but they can be mapped by protein orthologs.

Thank you.

Best
Quanwei

2017-10-24 18:26 GMT-04:00 Carson Holt:

> Yes. If you use est2genome it will just align the model, and then find the longest ORF. So it is a quick way to just align old models to the new assembly. Alternatively you can just do de novo annotation.
>
> --Carson
>
>
>
> On Oct 24, 2017, at 10:54 AM, Quanwei Zhang wrote:
>
> Dear Carson:
>
> Thank you again for your suggestions. I just got the new genome assembly of NMR and started to do gene annotation.
> I understand your ideas about this. But can I simply use the old genome transcripts as transcript evidence, and just follow the standard Maker2 pipeline? I set est2genome=1 and provide the mRNA sequences in fasta format for the first round of SNAP training.
>
> For transcripts I have the following choices. I think the first choice is more reliable and better, right?
> (1) There are about 60,000 RefSeq transcripts from NCBI, so I downloaded those sequences in fasta format.
> (2) We have the raw RNA-seq data from 11 tissues; we can do an assembly with trinity for each sample and then get the transcripts. But I think most of the RNA-seq should already have been submitted to NCBI.
>
> BTW, if we use the RefSeq data from NCBI, we can download the mRNA sequences, coding sequences or protein sequences. I wonder which type of data is best for training SNAP? For Augustus, we will use BUSCO to train it.
>
> Many thanks.
>
> Best
> Quanwei
>
>
>
>
> 2017-09-29 12:36 GMT-04:00 Carson Holt:
>
>> You can try using the est2genome=1 option to map the old models forward onto the new assembly as if they were ESTs (add a line that says est_forward=1 to the control file to maintain old naming and set est= to the old model transcript file). Then provide the final models as a pred_gff for a subsequent run (i.e. a traditional MAKER run where you are annotating the new assembly with transcript and protein evidence and ab initio predictors). Don't supply the old models to est= on that run.
>>
>> The idea behind doing it this way is:
>> 1. You need to get old models onto the new assembly so coordinates will change. So by doing it this way, you will at least be able to move many models forward based on homology.
>> 2. By providing the models to pred_gff on a subsequent MAKER run, you are just letting old models compete against new annotations.
>> They will be rejected if they have no evidence support, or can be kept if they score better than alternate models from SNAP/Augustus. That way you have the chance to integrate old models while at the same time rejecting some old models that have no evidence overlap.
>>
>> --Carson
>>
>>
>> > On Sep 28, 2017, at 6:05 AM, Quanwei Zhang wrote:
>> >
>> > Hello:
>> >
>> > Recently, we got a new version of the NMR genome, which had been assembled and annotated a few years ago. We can download the gene annotation from NCBI.
>> >
>> > Now we want to annotate the new genome using the Maker2 pipeline. I wonder how I can make full use of the existing annotations. On the other hand, since the previous genome is not very well assembled, some gene annotations may be false positives. I hope those false positive genes in the previous annotation won't mislead Maker2 for the current gene annotation.
>> >
>> > Do you have any suggestions? Thanks
>> >
>> > Best
>> > Quanwei
>> > _______________________________________________
>> > maker-devel mailing list
>> > maker-devel at box290.bluehost.com
>> > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>
>>
>
>

From carsonhh at gmail.com Mon Jan 29 11:23:06 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 29 Jan 2018 11:23:06 -0700
Subject: [maker-devel] gene annotation for a better genome
In-Reply-To:
References: <5AFEDD05-DF02-463F-A6EE-1619A9BB968D@gmail.com>
Message-ID: <753F1840-4874-4C0D-80F7-59E1A1579884@gmail.com>

You can set both est2genome=1 and protein2genome=1. You can also set est_forward=1 to get the names from the old models (you have to add it as it's not already there).
If you want to try to force an alignment to a specific location, you can also add maker_coor=chr2:1-3000 to the fasta header comment line to have maker only allow alignments within a specific region (chr2:1-3000 in the example).

--Carson

From qwzhang0601 at gmail.com Mon Jan 29 12:57:42 2018
From: qwzhang0601 at gmail.com (Quanwei Zhang)
Date: Mon, 29 Jan 2018 14:57:42 -0500
Subject: [maker-devel] gene annotation for a better genome
In-Reply-To: <753F1840-4874-4C0D-80F7-59E1A1579884@gmail.com>
References: <5AFEDD05-DF02-463F-A6EE-1619A9BB968D@gmail.com> <753F1840-4874-4C0D-80F7-59E1A1579884@gmail.com>
Message-ID:

Dear Carson:

Thank you for your reply. Do you mean setting est2genome=1 and protein2genome=1 in one round, or doing the two mappings in separate rounds?

So I will provide the gff files from mapping the transcripts and proteins to "pred_gff". Besides the gff from such mapping, I am also considering providing a gff file obtained from a regular de novo annotation by maker2, and then updating the gene models from those gff files.

Here is the reason why I am considering this. Suppose at location 1 there is a gene model gA from mapping transcripts and proteins. If I then try to update those gene models in the second round of maker, maker cannot change the internal exons of gA (so it cannot replace it).
However, if I provide both the gff from mapping transcripts and the gff from maker de novo annotation, and another gene model gA' (from de novo annotation) was predicted by maker at the same location, maker will compare gA and gA' and select the one with the higher score, right? In this way we can replace a mapped gene model with a model predicted by maker if the predicted one has stronger evidence support. Right?

Thank you.

Best
Quanwei

From admin at genome.arizona.edu Mon Jan 29 16:08:54 2018
From: admin at genome.arizona.edu (admin at genome.arizona.edu)
Date: Mon, 29 Jan 2018 16:08:54 -0700
Subject: [maker-devel] MPI selection
Message-ID: <77cfb864-4de1-a9af-aeea-9d3e7cf45ce5@genome.arizona.edu>

Hi, we now have three versions of MPI installed on our cluster: OpenMPI, MPICH, and MVAPICH2.
Since we have infiniband, the MVAPICH2 is working best with MPI test programs. MPICH should support infiniband too but currently there are some seg faults with that we are trying to resolve. On our cluster we have ~/.mpi-selection file which allows users to pick the MPI installation to use, and sets appropriate PATH and LD_LIBRARY_PATH variables. I am looking through the Maker MPI instructions, and it seems that a certain mpicc and mpi.h must be chosen during installation. So if originally, Maker was installed with MPICH, then would I have to reinstall it if users want to use MVAPICH2? Or is there config file somewhere I can update so I don't have to reinstall Maker? Or does nothing need to be done and we can rely on PATH and LD_LIBRARY_PATH variables pointing to correct mpicc and libmpi.so (mpi.h is in include directory)? Thanks From yuejiaxing at gmail.com Tue Jan 30 09:32:04 2018 From: yuejiaxing at gmail.com (Jia-Xing Yue) Date: Tue, 30 Jan 2018 17:32:04 +0100 Subject: [maker-devel] Is it possible to extract the GFF3 file for the raw gene models predicted by est2genome and protein2genome? Message-ID: Hello, I enabled the est2genome and protein2genome option for Maker-3.00.0-beta in my particular case. I was wondering if it is possible to extract the gene models predicted by snap, augustus, est2genome, and protein2genome respectively. By using the gff_merge command, I think I can extract some gene models for each cases but not all, especially for the est2genome and protein2genome set (e.g. those labeled with "maker-chr*-exonerate_est2genome-gene" and "maker-chr*-exonerate_protein2genome-gene"). Thanks in advance! Best, Jia-Xing -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From carsonhh at gmail.com Tue Jan 30 09:47:39 2018 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 30 Jan 2018 09:47:39 -0700 Subject: [maker-devel] MPI selection In-Reply-To: <77cfb864-4de1-a9af-aeea-9d3e7cf45ce5@genome.arizona.edu> References: <77cfb864-4de1-a9af-aeea-9d3e7cf45ce5@genome.arizona.edu> Message-ID: <34C36A98-A87F-4B28-8E05-FCD412CFEBEA@gmail.com> The libraries used by MVAPICH2, Intel MPI, and OpenMPI to access infiniband have a known bug. For performance reasons, infiniband libraries use registered memory in a way that makes it impossible to do system calls to external programs under MPI (doing so results in seg faults). MAKER has to call out to external programs like BLAST, exonerate, etc., so it triggers this bug. The infiniband bug is well known, and unfortunately will not be fixed because fixing it causes infiniband to lose some advertised features like direct memory access. As a work around OpenMPI and Intel MPI allow you to disable infiniband libraries via command line flags and use IP over infiniband instead (i.e. they let you drop infiniband features on demand so that your code will run). However MVAPICH2 does not provide the same option. As a result you cannot use MAKER or any MPI program that does system calls to external programs with MVAPICH2 (it results in seg faults). 
But you can use all other MPI flavors with the appropriate flags detailed below: #For OpenMPI, use as follows (the example assumes ib0 is your ip over infiniband adapter) export LD_PRELOAD=/path/to/openmpi/libmpi.so mpiexec --mca btl vader,tcp,self --mca btl_tcp_if_include ib0 --mca btl_openib_want_fork_support 1 --mca mpi_warn_on_fork 0 maker #For Intel MPI set these environmental variables before launch export I_MPI_FABRICS='shm:tcp' export I_MPI_HYDRA_IFACE='ib0' mpiexec maker #For MPICH, nothing is needed as the Infiniband libraries are always disabled, but you can specifically tell it to use the ib0 adapter as the communicator mpiexec -iface ib0 maker ?Carson > On Jan 29, 2018, at 4:08 PM, admin at genome.arizona.edu wrote: > > Hi, we have now three versions of MPI installed on our cluster, OpenMPI, MPICH, and MVAPICH2. Since we have infiniband, the MVAPICH2 is working best with MPI test programs. MPICH should support infiniband too but currently there are some seg faults with that we are trying to resolve. > > On our cluster we have ~/.mpi-selection file which allows users to pick the MPI installation to use, and sets appropriate PATH and LD_LIBRARY_PATH variables. > > I am looking through the Maker MPI instructions, and it seems that a certain mpicc and mpi.h must be chosen during installation. So if originally, Maker was installed with MPICH, then would I have to reinstall it if users want to use MVAPICH2? Or is there config file somewhere I can update so I don't have to reinstall Maker? Or does nothing need to be done and we can rely on PATH and LD_LIBRARY_PATH variables pointing to correct mpicc and libmpi.so (mpi.h is in include directory)? 
> > Thanks > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From carsonhh at gmail.com Tue Jan 30 09:54:05 2018 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 30 Jan 2018 09:54:05 -0700 Subject: [maker-devel] gene annotation for a better genome In-Reply-To: References: <5AFEDD05-DF02-463F-A6EE-1619A9BB968D@gmail.com> <753F1840-4874-4C0D-80F7-59E1A1579884@gmail.com> Message-ID: <921EBAEF-13E3-4175-90A2-8F41651F95C9@gmail.com> You can set both simultaneously. est2genome will almost always be picked first since it will match better thatn the protein alignment (i.e. it matches at UTRs). ?Carson > On Jan 29, 2018, at 12:57 PM, Quanwei Zhang wrote: > > Dear Carson: > > Thank you for your reply. Do you mean set est2genome=1 and protein2genome=1 in one round or do such mapping in two separate rounds? > > So I will provide gff files by mapping the transcripts and proteins to "pred_gff". Besides the gff from such mapping, I am also considering to provide a gff file obtained from a regular de novo annotation by maker2. And then update gene models from those gff. > > Here is the reason why I consider this. Suppose at location 1 there is a gene model gA by mapping transcripts and proteins. Then if I try to update those gene models in the second round of maker, maker can not change internal exons of gA (so can not replace it). However, if I provide both the gff by mapping transcripts and gff by maker de novo annotation, then if another gene model gA' (by de novo annotation) was predicted by maker at the same location, maker will compare gA and gA' and select the one with higher score, right? By this way we can replace a mapping gene model with predicted model by maker if the predicted one have stronger evidence support. Right? > > Thank you. 
> > Best > Quanwei > > > > 2018-01-29 13:23 GMT-05:00 Carson Holt >: > You can set both est2genome=1 and protein2genome=1. You can also set est_forward=1 to get the names from the old models (you have to add it as it?s not already there). If you want to try and force an alignment to a specifc location, you can also add maker_coor=chr2:1-3000 to the fasta header comment line to have maker only alow alignments within a specific region (chr2:1-3000 in the example). > > ?Carson > > >> On Jan 26, 2018, at 4:16 PM, Quanwei Zhang > wrote: >> >> Hi Carson: >> >> Thank you for your previous suggestions. I have done the annotation according to your suggestions. I firstly mapped the transcripts from old assembly to the new assembly by setting "est2genome=1", and then update the models by new predictions. >> >> Besides mapping by "est2genome=1" , do you think it is a good idea to do a separate mapping by proteins of old assembly (setting "protein2genome=1")? And then I provide both mapping GFF files (i.e., mapping GFF by transcripts and proteins, separately) and update them with new predictions and evidence support? Why I am trying to do this is because I found for certain genes they were not mapped to the new assembly but they can be mapped by protein orthologs. >> >> Thank you. >> >> Best >> Quanwei >> >> 2017-10-24 18:26 GMT-04:00 Carson Holt >: >> Yes. If you use est2genome it will just align the model, and then find the longest ORF. So it is a quick way to jsut align old models to the new assembly. Alternatively you can just do de novo annotation. >> >> ?Carson >> >> >> >>> On Oct 24, 2017, at 10:54 AM, Quanwei Zhang > wrote: >>> >>> Dear Carson: >>> >>> Thank you again for your suggestions. I just get the new genome assembly of NMR and start to do gene annotation. I understand you ideas about this. But can I simply use the old genome transcripts as transcript evidence, and just following the standard Maker2 pipeline? 
I set est2genome=1 and provide the mRNA sequences in the fasta format for the first round training of SNAP. >>> >>> For transcripts I have the following choices. I think the first choice is more reliable and better, right? >>> (1) There are about 60,000 RefSeq transcripts from NCBI. So I downloaded those sequences in fasta format. >>> (2) We have the raw data of RNA-seq from 11 tissues, we can do assembly by trinity for each sample and then get the transcripts. But I think most of the RNA-seq should have been submitted to NCBI. >>> >>> BTW, if we use the RefSeq data from NCBI, we can download the mRNA sequences, coding sequences or protein sequences. I wonder which type of data are the best to train the SNAP? For Augustus, we will use BUSCO to train it. >>> >>> Many thanks. >>> >>> Best >>> Quanwei >>> >>> >>> >>> >>> 2017-09-29 12:36 GMT-04:00 Carson Holt >: >>> You can try using the est2genome=1 option to map the old models forward onto the new assembly as if they were ESTs (add a line that says est_forward=1 to the control file to maintain old naming and set est=1 to the old model transcript file). Then provide the final models as a pred_gff for a subsuquent run (i.e. a traditional MAKER run where you are annotating the new assembly with transcript and protein evidence and ab initio predictors). Don?t supply the old models to est= on that run. >>> >>> The idea behind doing it this way is: >>> 1. You need to get old models onto the new assembly so coordinates will change. So by doing it this way, you will at least be able to move many models forward based on homology. >>> 2. By providing the models to pred_gff on a subsequent MAKER run, you are just letting old models compete against new annotations. They will be rejected if they have no evidence support, or can be kept if they score better than alternate models from SNAP/Augustus. That way you have the chance to integrate old models while at the same time rejecting some old models that have no evidence overlap. 
>>> >>> ?Carson >>> >>> >>> > On Sep 28, 2017, at 6:05 AM, Quanwei Zhang > wrote: >>> > >>> > Hello: >>> > >>> > Recently, we got a new version of NMR genome, whose genome had been assembled and annotated a few years ago. We can download the gene annotation from NCBI. >>> > >>> > Now we want to annotate the new genome using Maker2 pipeline. I wonder how can I fully make use of existing annotations. On the other hand, since the previous genome is not very well assemblies, some genes annotation maybe false positives. I hope those false positive genes in previous annotation won't mislead Maker2 for current gene annotation. >>> > >>> > Do you have any suggestions. Thanks >>> > >>> > Best >>> > Quanwei >>> > _______________________________________________ >>> > maker-devel mailing list >>> > maker-devel at box290.bluehost.com >>> > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>> >>> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Tue Jan 30 09:57:01 2018 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 30 Jan 2018 09:57:01 -0700 Subject: [maker-devel] Is it possible to extract the GFF3 file for the raw gene models predicted by est2genome and protein2genome? In-Reply-To: References: Message-ID: <9771EB42-8A80-49D8-9A21-67406860FD4F@gmail.com> You can just grep on the name. Although est2genome and protein2genome should only be used for initial training, as they are almost always guaranteed to be partial and should be disabled once you have trained gene predictors that can build complete models. ?Carson > On Jan 30, 2018, at 9:32 AM, Jia-Xing Yue wrote: > > Hello, > > I enabled the est2genome and protein2genome option for Maker-3.00.0-beta in my particular case. I was wondering if it is possible to extract the gene models predicted by snap, augustus, est2genome, and protein2genome respectively. 
>
> By using the gff_merge command, I think I can extract some gene models for each cases but not all, especially for the est2genome and protein2genome set (e.g. those labeled with "maker-chr*-exonerate_est2genome-gene" and "maker-chr*-exonerate_protein2genome-gene").
>
> Thanks in advance!
>
> Best,
> Jia-Xing

From yuejiaxing at gmail.com Tue Jan 30 10:03:34 2018
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Tue, 30 Jan 2018 18:03:34 +0100
Subject: [maker-devel] Is it possible to extract the GFF3 file for the raw gene models predicted by est2genome and protein2genome?
In-Reply-To: <9771EB42-8A80-49D8-9A21-67406860FD4F@gmail.com>
References: <9771EB42-8A80-49D8-9A21-67406860FD4F@gmail.com>
Message-ID:

Dear Carson,

Thanks for the quick response! Could you elaborate a bit on "grep on the name"? Do you mean just grepping all the lines in the gff_merge output with "est2genome" and "protein2genome" in column 3? In that case, what I get is the alignments rather than the gene models that MAKER guessed based on the alignments, right?

Thanks!

Best,
Jia-Xing

--
Jia-Xing Yue
Population Genomics and Complex Traits Group
Tour Pasteur 8eme etage
Faculté de Médecine
Institute for Research on Cancer and Aging, Nice (IRCAN)
CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
28 Avenue de Valombrose
06107 NICE Cedex 2
France

Twitter: @iAmphioxus
Personal website: http://www.iamphioxus.org/
Lab website: https://litilab.wordpress.com/
Yeast Population Reference Panel: https://yjx1217.github.io/Yeast_PacBio_2016/welcome/

From carsonhh at gmail.com Tue Jan 30 10:06:27 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 30 Jan 2018 10:06:27 -0700
Subject: [maker-devel] Is it possible to extract the GFF3 file for the raw gene models predicted by est2genome and protein2genome?
In-Reply-To:
References: <9771EB42-8A80-49D8-9A21-67406860FD4F@gmail.com>
Message-ID: <335E2942-4FCA-4F3C-A488-06116F6B7604@gmail.com>

MAKER models will all have "maker" in the source column. Everything else is a reference alignment (not a model). But you can grep on the gene name. If it is sourced from SNAP, it will have snap in the name, and the same is true for augustus, est2genome, protein2genome, etc.

--Carson

From admin at genome.arizona.edu Tue Jan 30 10:24:05 2018
From: admin at genome.arizona.edu (admin at genome.arizona.edu)
Date: Tue, 30 Jan 2018 10:24:05 -0700
Subject: [maker-devel] MPI selection
In-Reply-To: <34C36A98-A87F-4B28-8E05-FCD412CFEBEA@gmail.com>
References: <77cfb864-4de1-a9af-aeea-9d3e7cf45ce5@genome.arizona.edu> <34C36A98-A87F-4B28-8E05-FCD412CFEBEA@gmail.com>
Message-ID: <4825e452-aab6-aa13-ebc7-3d3d1832cc60@genome.arizona.edu>

Carson Holt wrote on 01/30/2018 09:47 AM:
> The libraries used by MVAPICH2, Intel MPI, and OpenMPI to access infiniband have a known bug. For performance reasons, infiniband libraries use registered memory in a way that makes it impossible to do system calls to external programs under MPI (doing so results in seg faults). MAKER has to call out to external programs like BLAST, exonerate, etc., so it triggers this bug.
> The infiniband bug is well known, and unfortunately will not be fixed because fixing it causes infiniband to lose some advertised features like direct memory access.

Well that stinks! Maybe that's why we got such a good deal on new-old-stock infiniband equipment! Still, it has allowed us to use the full speed of our NFS RAIDs, which has been nice. I will try using ib0; the speed is still about 10Gb, but I was under the impression that using IPoIB would cause packet loss or other problems...

Thanks for clearing that up. So is there a fabric/protocol you would recommend for clusters running maker?
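
A minimal sketch of the "grep on the gene name" approach from Carson's reply above, assuming a merged GFF3 (for example from gff_merge) in which final models carry "maker" in the source column and the predictor name appears somewhere in the ID/Parent attributes. The function name, the label list, and the output file names are illustrative assumptions and are not part of MAKER.

```python
import sys
from collections import defaultdict

# Labels taken from the thread; their exact spelling inside real IDs is an assumption.
LABELS = ("snap", "augustus", "est2genome", "protein2genome")


def split_maker_models(lines, labels=LABELS):
    """Group GFF3 rows whose source column is 'maker' by the predictor named in column 9."""
    groups = defaultdict(list)
    for line in lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence section follows; no more feature rows
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[1] != "maker":
            continue  # skip malformed rows and evidence alignments
        attrs = cols[8].lower()
        for label in labels:
            if label in attrs:
                groups[label].append(line.rstrip("\n"))
                break
    return groups


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        for label, rows in split_maker_models(handle).items():
            # Hypothetical output naming: one GFF3 fragment per predictor label.
            with open(f"{label}.maker_models.gff3", "w") as out:
                out.write("\n".join(rows) + "\n")
```

Rows are matched on column 9 as a whole, so child features (mRNA, exon, CDS) should follow their gene through the Parent attribute. Rows with any other source column are treated as evidence alignments and skipped, in line with the reply above.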
From yuejiaxing at gmail.com Tue Jan 30 10:24:22 2018
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Tue, 30 Jan 2018 12:24:22 -0500
Subject: [maker-devel] Is it possible to extract the GFF3 file for the raw gene models predicted by est2genome and protein2genome?
Message-ID:

Dear Carson,

Yes, that's what I did actually. But it seems that I got far fewer est2genome and protein2genome gene models this way than I would expect. I have turned on EVM for my maker run. Could this explain the low numbers of est2genome and protein2genome models that I got?

Thx!

Best,
Jia-Xing

Sent from my Nokia Lumia 920

From carsonhh at gmail.com Tue Jan 30 10:37:59 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 30 Jan 2018 10:37:59 -0700
Subject: [maker-devel] MPI selection
In-Reply-To: <4825e452-aab6-aa13-ebc7-3d3d1832cc60@genome.arizona.edu>
References: <77cfb864-4de1-a9af-aeea-9d3e7cf45ce5@genome.arizona.edu> <34C36A98-A87F-4B28-8E05-FCD412CFEBEA@gmail.com> <4825e452-aab6-aa13-ebc7-3d3d1832cc60@genome.arizona.edu>
Message-ID:

MAKER does not really move a lot of data with MPI; it's just moving around command lines and small variables. So not getting full infiniband performance will not hurt you. I doubt you will see any issues using ib0. For MPI flavor, I get the best performance with Intel MPI followed by OpenMPI.

Overall you will find that MAKER is IO bound as opposed to CPU or communications bound. So pointing it at your best performing network-based storage will be the greatest performance factor (if you have Lustre storage, point it there, for example). Pull back on job size and count if other users have issues accessing the disk (too many jobs can bring NFS to its knees).

The one suggestion I have as far as job size is to keep job sizes under 200 CPU cores. Over that, you will get better performance by splitting up datasets and submitting multiple jobs. Also, MAKER keeps a log of its progress, so you can kill jobs or restart failed jobs, and they pick up right where they left off.

--Carson
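
One way to act on the advice above about keeping each job under roughly 200 cores and splitting the dataset: a sketch that divides a multi-FASTA assembly into batches of similar total length, so that each batch can be submitted as its own job. The batch count, function names, and output file names are hypothetical, and nothing here calls MAKER itself.

```python
import heapq
import sys


def read_fasta(lines):
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def balance_batches(records, n_batches):
    """Greedy longest-first assignment so batches hold similar total sequence length."""
    heap = [(0, i) for i in range(n_batches)]  # (running total length, batch index)
    batches = [[] for _ in range(n_batches)]
    for header, seq in sorted(records, key=lambda r: len(r[1]), reverse=True):
        total, i = heapq.heappop(heap)  # currently lightest batch
        batches[i].append((header, seq))
        heapq.heappush(heap, (total + len(seq), i))
    return batches


if __name__ == "__main__" and len(sys.argv) == 3:
    fasta_path, n_batches = sys.argv[1], int(sys.argv[2])
    with open(fasta_path) as handle:
        batches = balance_batches(read_fasta(handle), n_batches)
    for i, batch in enumerate(batches):
        # Hypothetical naming scheme: batch_0.fasta, batch_1.fasta, ...
        with open(f"batch_{i}.fasta", "w") as out:
            for header, seq in batch:
                out.write(f">{header}\n{seq}\n")
```

Balancing on total sequence length rather than contig count is a design choice made here on the assumption that runtime tracks sequence length more closely than the number of contigs; that assumption is not something stated in the thread.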

From qlian003 at ucr.edu Wed Jan 3 18:52:26 2018
From: qlian003 at ucr.edu (Qihua Liang)
Date: Wed, 3 Jan 2018 17:52:26 -0800
Subject: [maker-devel] questions on master_datastore_index.log file
Message-ID:

Dear Maker Develop Team,

I have successfully run Maker several times before. But I came across a strange thing days ago when I ran Maker again on a different assembly with the same input files and settings.

I saw the message "Maker is now finished!!!" but got an empty GFF3 and no fasta files. Then I checked the master_datastore_index.log and realized that there are a lot of "failed"s and "retry"s and "failed" again. What does this mean? Since I used the same inputs as in previous successful runs, could you provide some instructions on how to debug and solve it?

Thank you so much
Qihua

From o.k.torresen at ibv.uio.no Thu Jan 4 06:21:28 2018
From: o.k.torresen at ibv.uio.no (Ole Kristian Tørresen)
Date: Thu, 4 Jan 2018 13:21:28 +0000
Subject: [maker-devel] Names/IDs from pred_gff are included in final gff
Message-ID: <76613C81-ED9B-45F2-B84B-B60BC1D4D972@ibv.uio.no>

Hi,
as far as I can see, names or IDs of features in GFFs given to pred_gff are included in the final output as the name of the feature. As far as I understand, this is not expected behaviour (it is for model_gff). This is with MAKER 2.31.9.

I have these settings:
map_forward=0
keep_preds=1

I thought that map_forward had to be 1 to get the names from the old GFFs. Can you replicate this?

Thank you.

Sincerely,
Ole K. Tørresen

From d.ence at ufl.edu Thu Jan 4 07:16:42 2018
From: d.ence at ufl.edu (Ence,daniel)
Date: Thu, 4 Jan 2018 14:16:42 +0000
Subject: [maker-devel] questions on master_datastore_index.log file
In-Reply-To:
References:
Message-ID:

Hi,

Before we can give any help to debug it, we need the error messages. These should be in the same file that the "maker is finished" message is in. Look for the first error message (the one closest to the top of the file) and send that to the mailing list.

Thanks,
Daniel

From qlian003 at ucr.edu Thu Jan 4 14:36:18 2018
From: qlian003 at ucr.edu (Qihua Liang)
Date: Thu, 4 Jan 2018 13:36:18 -0800
Subject: [maker-devel] questions on master_datastore_index.log file
In-Reply-To:
References:
Message-ID:

Hi Ence,

When I searched for "E/error" in the output file, here is what first showed up:

Process::MpiChunk::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415
eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407
Error::subs::try(CODE(0x502bbb0), HASH(0x5007788)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 4269
Process::MpiChunk::_go(Process::MpiChunk=HASH(0x50a1a18), "flow", HASH(0x50ad0f0), 2, 0) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 378
Process::MpiChunk::_flow(Process::MpiChunk=HASH(0x50a1a18), HASH(0x50ad0f0), 2, 0, Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 318
Process::MpiTiers::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415
eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407
Error::subs::try(CODE(0x50a9348), HASH(0x4ff0ec0)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 338
Process::MpiTiers::_next_level(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 179

Is this what you may need?

Qihua

From carsonhh at gmail.com Fri Jan 5 20:22:56 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 5 Jan 2018 20:22:56 -0700
Subject: [maker-devel] questions on master_datastore_index.log file
In-Reply-To:
References:
Message-ID:

That's the stack trace. The error is going to be a few lines further back. It would be best to get a few hundred lines right around the area you are showing.

--Carson

From carsonhh at gmail.com Tue Jan 16 11:15:29 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 16 Jan 2018 11:15:29 -0700
Subject: [maker-devel] Names/IDs from pred_gff are included in final gff
In-Reply-To: <76613C81-ED9B-45F2-B84B-B60BC1D4D972@ibv.uio.no>
References: <76613C81-ED9B-45F2-B84B-B60BC1D4D972@ibv.uio.no>
Message-ID: <8CB421A6-3CB0-4539-B55A-D3F4CA61D0AD@gmail.com>

pred_gff will maintain its name in the match/match_part feature as the information is pulled directly from the input GFF3. But any pred_gff feature that becomes a final model will be renamed to something like "scaffold_1517-pred_gff_GeneMark.hmm-gene-0.6-mRNA-1" unless you specify map_forward=1 to maintain the original name.

--Carson

From o.k.torresen at ibv.uio.no Wed Jan 17 10:52:13 2018
From: o.k.torresen at ibv.uio.no (Ole Kristian Tørresen)
Date: Wed, 17 Jan 2018 17:52:13 +0000
Subject: [maker-devel] Names/IDs from pred_gff are included in final gff
In-Reply-To: <8CB421A6-3CB0-4539-B55A-D3F4CA61D0AD@gmail.com>
References: <76613C81-ED9B-45F2-B84B-B60BC1D4D972@ibv.uio.no> <8CB421A6-3CB0-4539-B55A-D3F4CA61D0AD@gmail.com>
Message-ID: <583A84D5-B979-4C2F-B262-2D55A6F55B56@ibv.uio.no>

Ok, but I have an entry in the final gff like this:

ID=maker-GmG20150304_scaffold_2371-pred_gff_maker-gene-0.125-mRNA-1;Parent=maker-GmG20150304_scaffold_2371-pred_gff_maker-gene-0.125;Name=ENSGMOT00000000668.1;_AED=0.00;_eAED=0.00;_QI=819|1|1|1|1|1|4|112|726;score=89.75616

(The name is derived from a pred_gff entry which is the result of mapping an old annotation to the new assembly.)

This is then called
ENSGMOT00000000668.1 protein AED:0.00 eAED:0.00 QI:819|1|1|1|1|1|4|112|726
in the proteins.fasta file. That is unfortunate, because it apparently mapped to 12 places in the assembly.

I have set map_forward=0, but keep_preds=1 (filtering on domain presence and AED score later). This and another file (the result of genemark_gtf2gff3) are not input to MAKER as match/match_part, but as gene/exon/CDS/mRNA. Could that be the issue?
Ole > On 16 Jan 2018, at 19:15, Carson Holt wrote: > > pred_gff will maintain it?s name in the match/match_part feature as the information is pulled directly from the input GFF3. But any pred_gff feature that becomes a final model will be renamed to something like ?scaffold_1517-pred_gff_GeneMark.hmm-gene-0.6-mRNA-1? unless you specify map_forward=1 to maintain the original name. > > ?Carson > > > >> On Jan 4, 2018, at 6:21 AM, Ole Kristian T?rresen wrote: >> >> Hi, >> as far as I can see, names or IDs of features in gffs given to pred_gff is included in the final output as the name of the feature. As far as I can understand, this is not expected behaviour (it is for model_gff). This is with MAKER 2.31.9. >> >> I have these settings: >> map_forward=0 >> keep_preds=1 >> >> I thought that map_forward had to be 1 to get the names for the old GFFs. Can you replicate this? >> >> Thank you. >> >> Sincerely, >> Ole K. T?rresen >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > From qlian003 at ucr.edu Sat Jan 6 16:09:55 2018 From: qlian003 at ucr.edu (Qihua Liang) Date: Sat, 6 Jan 2018 15:09:55 -0800 Subject: [maker-devel] questions on master_datastore_index.log file In-Reply-To: References: Message-ID: <0BECB285-BB11-4F46-B6D7-072640F311B2@ucr.edu> Hi Carson, I am pasting more lines of error messages. I notice an error of "ERROR: Can't open seq file: /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/query.masked.gff.seq?, the seq name of ?ScsGwly? is ">ScsGwly_6124;HRSCAF=6247?, is it because of the seq naming that makes the temp file name weird? 
Thanks Qihua #--------- command -------------# Widget::blastx: /24-2/home/qliang/0.soft/maker/bin/../exe/blast/bin/blastx -db /tmp/maker_nJDkCL/te_proteins%2Efasta.mpi.10.9 -query /tmp/maker_nJDkCL/0/ScsG wly_5932%3BHRSCAF=6050.0 -num_alignments 10000 -num_descriptions 10000 -evalue 1e-06 -dbsize 300 -searchsp 500000000 -num_threads 1 -seg yes -soft_masking true -lcase_masking -show_gis -out /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/D7/4 A/ScsGwly_5932%3BHRSCAF=6050//theVoid.ScsGwly_5932%3BHRSCAF=6050/0/ScsGwly_5932%3BHRSCAF=6050.0.te_proteins%2Efasta.repeatrunner.temp_dir/te_ proteins%2Efasta.mpi.10.9.repeatrunner #-------------------------------# deleted:0 hits collecting blastx repeatmasking processing all repeats in cluster::shadow_cluster... ...finished clustering. preparing masked sequence preparing ab-inits running snap. #--------- command -------------# Widget::snap: /24-2/home/qliang/0.soft/maker/exe/snap/snap /home/qliang/cowpea/annotation/09.tingting/4.Abintio/2.CEGMA/3.maker/maker1.hmm/maker1.snap.hmm /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0 > /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0.maker1%2Esnap%2Eh mm.snap #-------------------------------# scoring....decoding.10.20.30.40.50.60.70.80.90.100 done scoring....decoding.10.20.30.40.50.60.70.80.90.100 done running augustus. #--------- command -------------# Widget::augustus: /usr/local/augustus.2.7/bin/augustus --species=cowpea_new --UTR=off /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0 > /tmp/maker _nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0.cowpea_new.augustus #-------------------------------# deleted:0 hits collecting blastx repeatmasking processing all repeats in cluster::shadow_cluster... ...finished clustering. deleted:0 hits doing blastx repeats running blast search. 
#--------- command -------------# Widget::blastx: /24-2/home/qliang/0.soft/maker/bin/../exe/blast/bin/blastx -db /tmp/maker_mvdRkd/te_proteins%2Efasta.mpi.10.6 -query /tmp/maker_mvdRkd/0/chr10.75 -num_alignments 10000 -num_descriptions 10000 -evalue 1e-06 -dbsize 300 -searchsp 500000000 -num_threads 1 -seg yes -soft_masking true -lcase_masking -show_gis -out /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/09/chr10//theVoid.chr10/7/chr10.75.te_proteins%2Efasta.repeatrunner.temp_dir/te_proteins%2Efasta.mpi.10.6.repeatrunner #-------------------------------# doing blastx repeats re reading blast report. /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/0/ScsGwly_6124%3BHRSCAF=6247.0.te_proteins%2Efasta.repeatrunner deleted:0 hits doing blastx repeats doing blastx repeats doing blastx repeats doing blastx repeats doing blastx repeats doing blastx repeats doing blastx repeats doing blastx repeats doing blastx repeats collecting blastx repeatmasking processing all repeats in cluster::shadow_cluster... ...finished clustering. ERROR: Can't open seq file: /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/query.masked.gff.seq No such file or directory at /24-2/home/qliang/0.soft/maker/bin/../lib/Dumper/GFF/GFFV3.pm line 199. 
Dumper::GFF::GFFV3::finalize(Dumper::GFF::GFFV3=HASH(0x5000ab8)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 700 Process::MpiChunk::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 Error::subs::try(CODE(0x502bbb0), HASH(0x5007788)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 4269 Process::MpiChunk::_go(Process::MpiChunk=HASH(0x50a1a18), "flow", HASH(0x50ad0f0), 2, 0) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 378 Process::MpiChunk::_flow(Process::MpiChunk=HASH(0x50a1a18), HASH(0x50ad0f0), 2, 0, Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 318 Process::MpiTiers::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 Error::subs::try(CODE(0x50a9348), HASH(0x4ff0ec0)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 338 Process::MpiTiers::_next_level(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 179 Process::MpiTiers::next_chunk(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 286 Process::MpiTiers::run_all(Process::MpiTiers=HASH(0x4fb3350), 0) called at /home/qliang/0.soft/maker/bin/maker line 695 --> rank=NA, hostname=H4 ERROR: Failed while builing masking tiers --> rank=NA, hostname=H4 --> rank=NA, hostname=H4 ERROR: Can not get next level running genemark. 
#--------- command -------------#
Widget::genemark:
/24-2/home/qliang/0.soft/PerlPackages/ActivePerl-5.22/bin/perl-static /24-2/home/qliang/0.soft/maker/bin/../lib/Widget/genemark/gmhmm_wrap -m /home/qliang/cowpea/annotation/05.CEGMA/2.genemask/output/gmhmm.mod -g /24-2/home/qliang/0.soft/makerPackages/gm_et_linux_64/gmes_petap/gmhmme3 -p /24-2/home/qliang/0.soft/makerPackages/gm_et_linux_64/gmes_petap/probuild -o /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_nomask.0.gmhmm%2Emod.genemark /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_nomask.0
#-------------------------------#
FAILED CONTIG:ScsGwly_6124;HRSCAF=6247

examining contents of the fasta file and run log

--Next Contig--

#---------------------------------------------------------------------
Now starting the contig!!
SeqID: ScsGwly_6140;HRSCAF=6263
Length: 1247
#---------------------------------------------------------------------

> On Jan 5, 2018, at 7:22 PM, Carson Holt wrote:
>
> That's the stack trace. The error is going to be a few lines further back. It would be best to get a few hundred lines right around the area you are showing.
>
> --Carson

From carsonhh at gmail.com Tue Jan 9 10:14:05 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 9 Jan 2018 10:14:05 -0700
Subject: [maker-devel] questions on master_datastore_index.log file
In-Reply-To: <0BECB285-BB11-4F46-B6D7-072640F311B2@ucr.edu>
References: <0BECB285-BB11-4F46-B6D7-072640F311B2@ucr.edu>
Message-ID: <0E5E8721-E814-4BA5-891B-B1C312BC0D4A@gmail.com>

Your contig names may create issues, specifically the ';' character, but you should also remove the '=' character. However, I believe your problem may be IO. If you are running under MPI or are running multiple jobs, one of the machines may have that location unmounted, the disk may be full, you may have hit a system file quota limit, or the IO load may be so heavy that the file is not actually finished being written when MAKER tries to read it. If IO load is the issue, then you just need to run fewer processes. The other possibilities would mean you need to make space, fix the mount, or raise any quotas on your systems.

Thanks,
Carson
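The renaming suggested above can be done before rerunning MAKER. The following is a minimal Python sketch, not part of MAKER: it rewrites each FASTA header so that the ID contains only letters, digits, '_', '.', and '-'. The whitelist and the function names sanitize_id and sanitize_fasta are assumptions made for illustration, and any evidence or annotation files keyed on the old names would need the same mapping applied.

```python
import re

# Characters treated as safe in a sequence ID. This whitelist is an assumption
# for illustration, not a rule taken from the MAKER documentation.
UNSAFE = re.compile(r"[^A-Za-z0-9_.-]")


def sanitize_id(seq_id):
    """Replace every character outside the whitelist with an underscore."""
    return UNSAFE.sub("_", seq_id)


def sanitize_fasta(lines):
    """Yield FASTA lines with the ID on each header line rewritten.

    Only the first whitespace-delimited token after '>' is kept, so any
    free-text description on the header line is dropped.
    Raises ValueError if two different IDs collapse to the same new name.
    """
    seen = {}  # new name -> original name
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            old = fields[0] if fields else ""
            new = sanitize_id(old)
            if seen.setdefault(new, old) != old:
                raise ValueError(f"{old!r} and {seen[new]!r} both map to {new!r}")
            yield f">{new}\n"
        else:
            yield line


# Example using a contig name from this thread.
example = [">ScsGwly_6124;HRSCAF=6247\n", "ACGT\n"]
print("".join(sanitize_fasta(example)), end="")
```

The collision check matters because replacing both ';' and '=' with the same character could, in principle, merge two distinct names.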
From qlian003 at ucr.edu Tue Jan 9 11:10:49 2018
From: qlian003 at ucr.edu (Qihua Liang)
Date: Tue, 9 Jan 2018 10:10:49 -0800
Subject: [maker-devel] questions on master_datastore_index.log file
In-Reply-To: <0E5E8721-E814-4BA5-891B-B1C312BC0D4A@gmail.com>
References: <0BECB285-BB11-4F46-B6D7-072640F311B2@ucr.edu> <0E5E8721-E814-4BA5-891B-B1C312BC0D4A@gmail.com>
Message-ID: <87A06F3B-82C1-4B21-906E-69DC1308DEC6@ucr.edu>

Hi Carson,

I just checked with the system administrator, and we think the disk space is fine. I also ran another attempt with far fewer processors a few days ago and hit the same issue.

Maybe I will try renaming the contigs to see how a new attempt works? Or do you have any other suggestions?

Thank you!
Qihua
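To see how many contigs are affected across attempts, the statuses recorded in master_datastore_index.log can be tallied. The following is a minimal Python sketch, assuming each line of that file holds three tab-separated columns (sequence ID, datastore path, status) and that the last line for a contig reflects its final state. The sample lines and the function names are made up for illustration.

```python
from collections import Counter


def last_status_per_contig(lines):
    """Map each contig ID to the last status recorded for it.

    Assumes three tab-separated columns per line: ID, datastore path, status.
    Lines that do not have exactly three columns are ignored.
    """
    status = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) == 3:
            seq_id, _path, state = fields
            status[seq_id] = state
    return status


def summarize(lines):
    """Count how many contigs ended in each status."""
    return Counter(last_status_per_contig(lines).values())


# Hypothetical sample lines, not taken from a real run.
sample = [
    "contigA\tmap_datastore/AA/01/contigA/\tSTARTED\n",
    "contigA\tmap_datastore/AA/01/contigA/\tFAILED\n",
    "contigB\tmap_datastore/BB/02/contigB/\tSTARTED\n",
    "contigB\tmap_datastore/BB/02/contigB/\tFINISHED\n",
    "contigA\tmap_datastore/AA/01/contigA/\tRETRY\n",
    "contigA\tmap_datastore/AA/01/contigA/\tFAILED\n",
]
for state, count in sorted(summarize(sample).items()):
    print(state, count)
```

If every contig ends in a failed state, an environment problem (storage, naming) seems more likely than a problem with any single sequence.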
From carsonhh at gmail.com Wed Jan 10 12:05:03 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 10 Jan 2018 12:05:03 -0700
Subject: [maker-devel] questions on master_datastore_index.log file
In-Reply-To: <87A06F3B-82C1-4B21-906E-69DC1308DEC6@ucr.edu>
References: <0BECB285-BB11-4F46-B6D7-072640F311B2@ucr.edu> <0E5E8721-E814-4BA5-891B-B1C312BC0D4A@gmail.com> <87A06F3B-82C1-4B21-906E-69DC1308DEC6@ucr.edu>
Message-ID: <36B45AA1-3D02-4E83-9EF8-85D56C4D3020@gmail.com>

The error is saying exactly that the file MAKER just created does not exist. The only time we ever see this is when using network mounted locations under heavy IO load. Most network storage options use asynchronous IO, which means the system returns success on file operations before they actually complete. So they can report that they finished writing a file before it actually exists. If you then try to open it right away, it doesn't really exist yet and everything fails. But that only happens if there is heavy IO (lots of things going on in that mount location). So if you are getting persistent failures, you may want to try a different work directory, or get your IT to troubleshoot IO load in the directory you are using.

--Carson
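The race described above, where a write reports success before the file is visible to a reader, can be illustrated outside MAKER. The following is a minimal Python sketch, assuming a hypothetical helper wait_for_file that polls until a freshly written file exists and is non-empty. It is not how MAKER handles the situation, and the timeout and interval values are arbitrary.

```python
import os
import time


def wait_for_file(path, timeout=30.0, interval=0.5):
    """Poll until `path` exists and is non-empty, or until `timeout` seconds pass.

    Returns True if the file became visible in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            if os.path.getsize(path) > 0:
                return True
        except OSError:
            pass  # not visible yet, or a transient error on the mount
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A check like this is mainly useful as a diagnostic. If a file written by one process takes noticeably longer to appear in the network work directory than on local disk, that would be consistent with the IO explanation and worth reporting to the storage administrators.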
>>>>>>> >>>>>>> I saw the message of "Maker is now finished!!!? but got empty GFF3 and no fasta files. And then I checked the master_datastore_index.log and realized that there are a lot of ?failed?s and ?retry?s and ?failed? again. What does this mean? Since I used same inputs as previous successful runs, could you provide some instructions on how to debug and solve it? >>>>>>> >>>>>>> Thank you so much >>>>>>> Qihua >>>>>>> _______________________________________________ >>>>>>> maker-devel mailing list >>>>>>> maker-devel at box290.bluehost.com >>>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__box290.bluehost.com_mailman_listinfo_maker-2Ddevel-5Fyandell-2Dlab.org&d=DwIGaQ&c=pZJPUDQ3SB9JplYbifm4nt2lEVG5pWx2KikqINpWlZM&r=12jzlNvGVD0AlPJ4E7cTlw1Dvu6n9cb4kMCobJ28XPs&m=nUDCP_0kFOhDYlTHgOpWtf_zdL77aQFeQwYOGIQwP8c&s=9Z4T1hdtxOyIjpn6f70qhrQRuGsZxXdV-oLSJF1zGkY&e= >>>>>> >>>>> >>>>> _______________________________________________ >>>>> maker-devel mailing list >>>>> maker-devel at box290.bluehost.com >>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >>>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From arsilan324 at gmail.com Thu Jan 11 07:15:31 2018 From: arsilan324 at gmail.com (Muhammad Arslan) Date: Thu, 11 Jan 2018 15:15:31 +0100 Subject: [maker-devel] GFF3 to .tbl Message-ID: Dear Madam or Sir, I am writing this email to inquire if there is any way to make .tbl file from maker generated GFF3 file? This is required since I am trying to submit the annotation to NCBI. If there is any other solution for this, please advise accordingly. Thank you very much! 
Arslan -- --------------------------------------------------------------------------------------------*Muhammad Arslan* PhD Student / Guest Scientist Department of Environmental Biotechnology Helmholtz Centre for Environmental Research - UFZ Permoserstra?e 15, 04318 Leipzig, Germany Phone +49,341,235 <+49%20341%20235> 1696, muhammad.arslan at ufz.de , www.ufz.de Registered Office / Registered Office: Leipzig Register court / Registration Office: Amtsgericht Leipzig Commercial register Nr./Trade Register No .: B 4703 Chairman / Chairman of the Supervisory Board: MinDirig Wilfried Kraus Scientific Director / Scientific Managing Director: Prof. Georg Teutsch Administrative Managing Director / Administrative Managing Director: Prof. Dr. Heike Grassmann -------------------------------------------------------------------------------------------- *SAVE PAPER - Please do not print this e-mail unless absolutely necessary* -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Fri Jan 19 15:46:26 2018 From: carsonhh at gmail.com (Carson Holt) Date: Fri, 19 Jan 2018 15:46:26 -0700 Subject: [maker-devel] GFF3 to .tbl In-Reply-To: References: Message-ID: <93BD3F52-1D76-465A-94EE-80D616BB72A6@gmail.com> Try GAG ?> https://genomeannotation.github.io/GAG/ ?Carson > On Jan 11, 2018, at 7:15 AM, Muhammad Arslan wrote: > > Dear Madam or Sir, > > I am writing this email to inquire if there is any way to make .tbl file from maker generated GFF3 file? This is required since I am trying to submit the annotation to NCBI. If there is any other solution for this, please advise accordingly. > > Thank you very much! 
> Arslan > > -- > -------------------------------------------------------------------------------------------- > Muhammad Arslan > PhD Student / Guest Scientist > Department of Environmental Biotechnology > > Helmholtz Centre for Environmental Research - UFZ > Permoserstra?e 15, 04318 Leipzig, Germany > Phone +49,341,235 1696, > muhammad.arslan at ufz.de , www.ufz.de > > Registered Office / Registered Office: Leipzig > Register court / Registration Office: Amtsgericht Leipzig > Commercial register Nr./Trade Register No .: B 4703 > Chairman / Chairman of the Supervisory Board: MinDirig Wilfried Kraus > Scientific Director / Scientific Managing Director: > Prof. Georg Teutsch > Administrative Managing Director / Administrative Managing Director: > Prof. Dr. Heike Grassmann > > > -------------------------------------------------------------------------------------------- > SAVE PAPER - Please do not print this e-mail unless absolutely necessary > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From qwzhang0601 at gmail.com Mon Jan 22 10:23:34 2018 From: qwzhang0601 at gmail.com (Quanwei Zhang) Date: Mon, 22 Jan 2018 12:23:34 -0500 Subject: [maker-devel] name of gene model Message-ID: Hello: Would you please explain how the genes were named? Do similar names indicate sequence similarities (e.g., maker-Contig3217-snap-gene-35.13-mRNA-1, maker-Contig3217-snap-gene-35.14-mRNA-1)? maker-Contig2667-augustus-gene-266.22-mRNA-1; maker-Contig2667-snap-gene-266.5-mRNA-1; maker-Contig3217-snap-gene-35.13-mRNA-1; maker-Contig3217-snap-gene-35.14-mRNA-1; maker-Contig3217-snap-gene-35.15-mRNA-1; maker-Contig3217-snap-gene-35.16-mRNA-1 Thank you Best Quanwei -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From carsonhh at gmail.com Mon Jan 22 10:29:26 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 22 Jan 2018 10:29:26 -0700
Subject: [maker-devel] name of gene model
In-Reply-To:
References:
Message-ID: <499FF8DC-C277-484B-AEC9-EE7A35090615@gmail.com>

The only info in the name is the source program of the model (i.e. snap/augustus). The numbers are just meaningless iterators.

--Carson

> On Jan 22, 2018, at 10:23 AM, Quanwei Zhang wrote:
>
> Hello:
>
> Would you please explain how the genes were named? Do similar names indicate sequence similarities (e.g., maker-Contig3217-snap-gene-35.13-mRNA-1, maker-Contig3217-snap-gene-35.14-mRNA-1)?
>
> maker-Contig2667-augustus-gene-266.22-mRNA-1;
> maker-Contig2667-snap-gene-266.5-mRNA-1;
>
> maker-Contig3217-snap-gene-35.13-mRNA-1;
> maker-Contig3217-snap-gene-35.14-mRNA-1;
> maker-Contig3217-snap-gene-35.15-mRNA-1;
> maker-Contig3217-snap-gene-35.16-mRNA-1
>
> Thank you
>
> Best
> Quanwei
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From yincl2013 at 126.com Tue Jan 23 08:01:17 2018
From: yincl2013 at 126.com (Chuanlin Yin)
Date: Tue, 23 Jan 2018 23:01:17 +0800 (GMT+08:00)
Subject: [maker-devel] maker-3.01.02-beta run error
Message-ID: <165b9e3f.a94e.16123899e8a.Coremail.yincl2013@126.com>

Dear Mr/Ms,

Recently, when I tried to use maker-3.01.02-beta for genome annotation, the run failed with the following error:

Can't call method "add_entry" without a package or object reference at /gpfs/bioinformatics/software/maker-3.01.02-beta/bin/../lib/Widget/snap.pm line 540.
--> rank=NA, hostname=c01n02
ERROR: Failed while annotating transcripts
ERROR: Chunk failed at level:1, tier_type:4
FAILED CONTIG:002369F_pilon_obj

Could you explain why this happened?

Any reply would be much appreciated. Thanks.
Best regards!
Showky

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Emily.Giroux at inspection.gc.ca Tue Jan 23 13:35:06 2018
From: Emily.Giroux at inspection.gc.ca (Giroux, Emily (CFIA/ACIA))
Date: Tue, 23 Jan 2018 20:35:06 +0000
Subject: [maker-devel] maker pipeline 2nd round updating augustus
Message-ID: <397E3572255740488AA9993F4D41A3B036B588CB@ONOTTAXES2.AGR.GC.CA>

Hi,

I completed a first round of Maker, followed by snap and BUSCO to train augustus. I then placed the newly-trained species-specific augustus files in the augustus species directory and used this for my second round of maker.

What I'm wondering now is whether I should repeat this process after completing round 2 of maker: use BUSCO to retrain augustus again, replace the species-specific libraries from round 1 with those from round 2, and use these as input for my third round of maker.

Thank you very much,

Emily

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From patrick.tranvan at unil.ch Thu Jan 25 07:46:27 2018
From: patrick.tranvan at unil.ch (Patrick Tran Van)
Date: Thu, 25 Jan 2018 14:46:27 +0000
Subject: [maker-devel] Adding NR functional annotation
Message-ID: <1516891629951.7595@unil.ch>

Hi,

Can you please update maker_functional_gff and maker_functional_fasta to make them compatible with the NR database?

Thanks,

Patrick

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From marni at cs.au.dk Thu Jan 25 03:26:04 2018
From: marni at cs.au.dk (Marni Tausen)
Date: Thu, 25 Jan 2018 10:26:04 +0000
Subject: [maker-devel] Maker run problems - BLAST makeblastdb failed
Message-ID: <72D3C07A-D1A6-4759-B8D2-13EBE8DD7982@birc.au.dk>

Hey,

I have a problem getting maker to run.

I've tried installing the pipeline on three separate systems: CentOS 6 (cluster), Mac OS X 10.12.6, and CentOS 7.
With each of them I run into problems with Repeatmasker step with the error message: #--------------------------------------------------------------------- Now starting the contig!! SeqID: chr0 Length: 38046352 #--------------------------------------------------------------------- setting up GFF3 output and fasta chunks doing repeat masking running repeat masker. #--------- command -------------# Widget::RepeatMasker: cd makertmp/maker_DMTHbJ; /Users/PM/maker/exe/RepeatMasker/RepeatMasker /Users/PM/GENEANNOTATION/TrR.v5.maker.output/TrR.v5_datastore/82/7E/chr0//theVoid.chr0/0/chr0.0.all.rb -species all -dir /Users/PM/GENEANNOTATION/TrR.v5.maker.output/TrR.v5_datastore/82/7E/chr0//theVoid.chr0/0 -pa 1 #-------------------------------# doing blastx repeats formating database... #--------- command -------------# Widget::formater: /Users/PM/maker/bin/../exe/lblast/bin/makeblastdb -dbtype prot -in makertmp/maker_DMTHbJ/0/blastprep/te_proteins%2Efasta.mpi.10.0 #-------------------------------# BLAST options error: File makertmp/maker_DMTHbJ/0/blastprep/te_proteins%2Efasta.mpi.10.0 does not exist ERROR: /Users/PM/maker/bin/../exe/lblast/bin/makeblastdb failed in Widget::formater --> rank=NA, hostname=d24834.local ERROR: Failed while doing blastx repeats ERROR: Chunk failed at level:1, tier_type:1 FAILED CONTIG:chr0 ERROR: Chunk failed at level:2, tier_type:0 FAILED CONTIG:chr0 examining contents of the fasta file and run log The Maker version that was installed is 2.31.9, and it was build using the ./Build commands. However the links for exonerate (2.2.0) and repeatmasker (repbase) (latest version) seem to be broken, as they always returned connection errors. So I manually installed both of those programs and linked maker to them. I?ve attached the config files and the script used to run maker. Cheers, Marni Tausen -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: maker_bopts.ctl
Type: application/octet-stream
Size: 1413 bytes
Desc: maker_bopts.ctl
URL:

-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_exe.ctl
Type: application/octet-stream
Size: 1277 bytes
Desc: maker_exe.ctl
URL:

-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_opts.ctl
Type: application/octet-stream
Size: 4554 bytes
Desc: maker_opts.ctl
URL:

From mmokrejs at gmail.com Thu Jan 25 09:05:45 2018
From: mmokrejs at gmail.com (Martin MOKREJŠ)
Date: Thu, 25 Jan 2018 17:05:45 +0100
Subject: [maker-devel] Maker run problems - BLAST makeblastdb failed
In-Reply-To: <72D3C07A-D1A6-4759-B8D2-13EBE8DD7982@birc.au.dk>
References: <72D3C07A-D1A6-4759-B8D2-13EBE8DD7982@birc.au.dk>
Message-ID:

Hi Marni,

do not use spaces in your filenames and directory names. I think that is your issue:

te_proteins%2Efasta.mpi.10.0

Martin

From carsonhh at gmail.com Thu Jan 25 14:20:21 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 25 Jan 2018 14:20:21 -0700
Subject: [maker-devel] Maker run problems - BLAST makeblastdb failed
In-Reply-To: <72D3C07A-D1A6-4759-B8D2-13EBE8DD7982@birc.au.dk>
References: <72D3C07A-D1A6-4759-B8D2-13EBE8DD7982@birc.au.dk>
Message-ID: <2AC145CF-6954-4740-BA88-A7ABBBC841D0@gmail.com>

You set TMP=makertmp. That is likely not a true locally mounted location (i.e. it's network mounted), in which case you will hit a race condition where files you just created don't become readable for a few milliseconds to seconds after creation under heavy I/O load. Alternatively, it is locally mounted but only exists on a single node, and you are running on a cluster (nodes cannot access each other's local storage). Unless your cluster setup has a specific location for locally mounted temporary scratch space, you should not set TMP=. Just let it default to /tmp, which is almost always locally mounted.
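
[A minimal sketch of what the advice above would look like in maker_opts.ctl. The scratch path in the commented-out alternative is hypothetical and is not taken from the thread; use whatever node-local scratch location your cluster documents, if it has one.]

```
#-----maker_opts.ctl (fragment)
TMP= #left empty so MAKER falls back to the system default temporary directory (normally /tmp)
#TMP=/local/scratch/myuser #hypothetical node-local scratch path; only set this if such a location exists on every node
```

[With TMP left empty, the temporary directories in the run log should appear under /tmp/maker_XXXXXX, as in the augustus and genemark commands quoted earlier in this archive, rather than under a relative path such as makertmp/.]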
?Carson > On Jan 25, 2018, at 3:26 AM, Marni Tausen wrote: > > Hey, > > I have a problem getting maker to run. > > I?ve tried installing the pipeline on three separate systems. CentOS 6 (cluster), Mac OS X 10.12.6 and on CentOS 7. > > With each of them I run into problems with Repeatmasker step with the error message: > > #--------------------------------------------------------------------- > Now starting the contig!! > SeqID: chr0 > Length: 38046352 > #--------------------------------------------------------------------- > > > setting up GFF3 output and fasta chunks > doing repeat masking > running repeat masker. > #--------- command -------------# > Widget::RepeatMasker: > cd makertmp/maker_DMTHbJ; /Users/PM/maker/exe/RepeatMasker/RepeatMasker /Users/PM/GENEANNOTATION/TrR.v5.maker.output/TrR.v5_datastore/82/7E/chr0//theVoid.chr0/0/chr0.0.all.rb -species all -dir /Users/PM/GENEANNOTATION/TrR.v5.maker.output/TrR.v5_datastore/82/7E/chr0//theVoid.chr0/0 -pa 1 > #-------------------------------# > doing blastx repeats > formating database... > #--------- command -------------# > Widget::formater: > /Users/PM/maker/bin/../exe/lblast/bin/makeblastdb -dbtype prot -in makertmp/maker_DMTHbJ/0/blastprep/te_proteins%2Efasta.mpi.10.0 > #-------------------------------# > BLAST options error: File makertmp/maker_DMTHbJ/0/blastprep/te_proteins%2Efasta.mpi.10.0 does not exist > ERROR: /Users/PM/maker/bin/../exe/lblast/bin/makeblastdb failed in Widget::formater > --> rank=NA, hostname=d24834.local > ERROR: Failed while doing blastx repeats > ERROR: Chunk failed at level:1, tier_type:1 > FAILED CONTIG:chr0 > > ERROR: Chunk failed at level:2, tier_type:0 > FAILED CONTIG:chr0 > > examining contents of the fasta file and run log > > The Maker version that was installed is 2.31.9, and it was build using the ./Build commands. > > However the links for exonerate (2.2.0) and repeatmasker (repbase) (latest version) seem to be broken, as they always returned connection errors. 
>
> So I manually installed both of those programs and linked maker to them.
>
> I've attached the config files and the script used to run maker.
>
> Cheers,
> Marni Tausen
>
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Thu Jan 25 14:29:37 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 25 Jan 2018 14:29:37 -0700
Subject: [maker-devel] maker pipeline 2nd round updating augustus
In-Reply-To: <397E3572255740488AA9993F4D41A3B036B588CB@ONOTTAXES2.AGR.GC.CA>
References: <397E3572255740488AA9993F4D41A3B036B588CB@ONOTTAXES2.AGR.GC.CA>
Message-ID: <2D069310-1BFC-4C30-98B5-739FC90A732B@gmail.com>

Don't use BUSCO to train for the second round. The models it produces are biased toward conserved genes, which tend to be short and intron poor, and you will want to avoid this bias in the second round. You want to use a broad selection of gene models instead.

Use the maker2zff script to select gene models for training (examples on doing this can be found on the maker tutorial wiki).

Then use this script to convert ZFF to GenBank format to train Augustus --> https://github.com/hyphaltip/genome-scripts/blob/master/gene_prediction/zff2augustus_gbk.pl

This is a nice guide to training Augustus using GenBank format input --> https://vcru.wisc.edu/simonlab/bioinformatics/programs/augustus/docs/tutorial2015/training.html

--Carson

> On Jan 23, 2018, at 1:35 PM, Giroux, Emily (CFIA/ACIA) wrote:
>
> Hi,
>
> I completed a first round of Maker, followed by snap and BUSCO to train augustus. I then placed the newly-trained species-specific augustus files in the augustus species directory and used this for my second round of maker.
> > What I?m wondering now is whether I should repeat this process after completeing round 2 of maker, and follow this with using BUSCO to retrain the augustus files again and replace the previous species-specific libraries from round 1 with those from round 2 and use these as input for my third round of maker. > > Thank-you very much, > > Emily > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Thu Jan 25 14:33:33 2018 From: carsonhh at gmail.com (Carson Holt) Date: Thu, 25 Jan 2018 14:33:33 -0700 Subject: [maker-devel] maker-3.01.02-beta run error In-Reply-To: <165b9e3f.a94e.16123899e8a.Coremail.yincl2013@126.com> References: <165b9e3f.a94e.16123899e8a.Coremail.yincl2013@126.com> Message-ID: Because of where that error occurred, it may be a snowball error (i.e. a result of another error upstream that is the real failure). Could you look back in the data to see if there is a failure further back? Perhaps include your entire STDERR log. Thanks, Carson > On Jan 23, 2018, at 8:01 AM, Chuanlin Yin wrote: > > Dear Mr/Ms? > > Recently?when i want to use maker-3.01.02-beta for genome annotation. I had failed for the following error: > > Can't call method "add_entry" without a package or object reference at /gpfs/bioinformatics/software/maker-3.01.02-beta/bin/../lib/Widget/snap.pm line 540. > --> rank=NA, hostname=c01n02 > ERROR: Failed while annotating transcripts > ERROR: Chunk failed at level:1, tier_type:4 > FAILED CONTIG:002369F_pilon_obj > > Could you explain why it happened! > > Much appreciated for any replies. Thanks. > > Best regards! 
> Showky
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From qwzhang0601 at gmail.com Fri Jan 26 09:40:32 2018
From: qwzhang0601 at gmail.com (Quanwei Zhang)
Date: Fri, 26 Jan 2018 11:40:32 -0500
Subject: [maker-devel] map the transcripts back onto the genome using "est2genome=1"
Message-ID:

Hello:

I am trying to annotate a new NMR genome assembly. Since a gene annotation for the old NMR assembly is available from NCBI, I tried to map the published RefSeq transcripts onto the new genome with "est2genome=1". But I found that quite a few genes were lost during mapping.

I then ran another test to check how well mapping with "est2genome=1" works. I mapped the published RefSeq transcripts onto the old genome (the same version used for the published gene annotation) using maker with "est2genome=1". Still, quite a few genes were lost during the mapping. Below are the BUSCO results for the gene annotations (BUSCO assesses *annotation completeness with single-copy orthologs*). As you can see, even if we only consider the single-copy orthologs, 4% still did not map back to the genome.

Do you have any comments on this? Also, could you give us some suggestions for getting more of the published gene annotation to map back to the same genome assembly through "est2genome=1"? Attached is the maker_opts.ctl file I used for the mapping. Many thanks.

# These are the BUSCO results using the published gene annotation.
C:99.3%[S:33.3%,D:66.0%],F:0.3%,M:0.4%,n:4104
4077 Complete BUSCOs (C)
1367 Complete and single-copy BUSCOs (S)
2710 Complete and duplicated BUSCOs (D)
14 Fragmented BUSCOs (F)
13 Missing BUSCOs (M)
4104 Total BUSCO groups searched

# These are the BUSCO results using the gene models after mapping by maker2.
C:93.4%[S:36.5%,D:56.9%],F:2.6%,M:4.0%,n:4104
3830 Complete BUSCOs (C)
1496 Complete and single-copy BUSCOs (S)
2334 Complete and duplicated BUSCOs (D)
105 Fragmented BUSCOs (F)
169 Missing BUSCOs (M)
4104 Total BUSCO groups searched

Best
Quanwei

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_opts.ctl
Type: application/octet-stream
Size: 4734 bytes
Desc: not available
URL:

From qwzhang0601 at gmail.com Fri Jan 26 16:16:50 2018
From: qwzhang0601 at gmail.com (Quanwei Zhang)
Date: Fri, 26 Jan 2018 18:16:50 -0500
Subject: [maker-devel] gene annotation for a better genome
In-Reply-To:
References: <5AFEDD05-DF02-463F-A6EE-1619A9BB968D@gmail.com>
Message-ID:

Hi Carson:

Thank you for your previous suggestions. I have done the annotation as you suggested: I first mapped the transcripts from the old assembly to the new assembly by setting "est2genome=1", and then updated the models with new predictions.

Besides mapping with "est2genome=1", do you think it is a good idea to do a separate mapping with the proteins of the old assembly (setting "protein2genome=1")? I would then provide both mapping GFF files (i.e., the GFFs from mapping transcripts and proteins separately) and update them with new predictions and evidence support. I am considering this because I found that certain genes were not mapped to the new assembly but can be mapped by protein orthologs.

Thank you.

Best
Quanwei

2017-10-24 18:26 GMT-04:00 Carson Holt :

> Yes. If you use est2genome it will just align the model, and then find the
> longest ORF. So it is a quick way to just align old models to the new
> assembly. Alternatively you can just do de novo annotation.
>
> --Carson
>
>
>
> On Oct 24, 2017, at 10:54 AM, Quanwei Zhang wrote:
>
> Dear Carson:
>
> Thank you again for your suggestions. I just get the new genome assembly
> of NMR and start to do gene annotation.
I understand you ideas about this. > But can I simply use the old genome transcripts as transcript evidence, and > just following the standard Maker2 pipeline? I set est2genome=1 and provide > the mRNA sequences in the fasta format for the first round training of SNAP. > > For transcripts I have the following choices. I think the first choice is > more reliable and better, right? > (1) There are about 60,000 RefSeq transcripts from NCBI. So I downloaded > those sequences in fasta format. > (2) We have the raw data of RNA-seq from 11 tissues, we can do assembly by > trinity for each sample and then get the transcripts. But I think most of > the RNA-seq should have been submitted to NCBI. > > BTW, if we use the RefSeq data from NCBI, we can download the mRNA > sequences, coding sequences or protein sequences. I wonder which type of > data are the best to train the SNAP? For Augustus, we will use BUSCO to > train it. > > Many thanks. > > Best > Quanwei > > > > > 2017-09-29 12:36 GMT-04:00 Carson Holt : > >> You can try using the est2genome=1 option to map the old models forward >> onto the new assembly as if they were ESTs (add a line that says >> est_forward=1 to the control file to maintain old naming and set est=1 to >> the old model transcript file). Then provide the final models as a pred_gff >> for a subsuquent run (i.e. a traditional MAKER run where you are annotating >> the new assembly with transcript and protein evidence and ab initio >> predictors). Don?t supply the old models to est= on that run. >> >> The idea behind doing it this way is: >> 1. You need to get old models onto the new assembly so coordinates will >> change. So by doing it this way, you will at least be able to move many >> models forward based on homology. >> 2. By providing the models to pred_gff on a subsequent MAKER run, you are >> just letting old models compete against new annotations. 
They will be >> rejected if they have no evidence support, or can be kept if they score >> better than alternate models from SNAP/Augustus. That way you have the >> chance to integrate old models while at the same time rejecting some old >> models that have no evidence overlap. >> >> ?Carson >> >> >> > On Sep 28, 2017, at 6:05 AM, Quanwei Zhang >> wrote: >> > >> > Hello: >> > >> > Recently, we got a new version of NMR genome, whose genome had been >> assembled and annotated a few years ago. We can download the gene >> annotation from NCBI. >> > >> > Now we want to annotate the new genome using Maker2 pipeline. I wonder >> how can I fully make use of existing annotations. On the other hand, since >> the previous genome is not very well assemblies, some genes annotation >> maybe false positives. I hope those false positive genes in previous >> annotation won't mislead Maker2 for current gene annotation. >> > >> > Do you have any suggestions. Thanks >> > >> > Best >> > Quanwei >> > _______________________________________________ >> > maker-devel mailing list >> > maker-devel at box290.bluehost.com >> > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon Jan 29 11:23:06 2018 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 29 Jan 2018 11:23:06 -0700 Subject: [maker-devel] gene annotation for a better genome In-Reply-To: References: <5AFEDD05-DF02-463F-A6EE-1619A9BB968D@gmail.com> Message-ID: <753F1840-4874-4C0D-80F7-59E1A1579884@gmail.com> You can set both est2genome=1 and protein2genome=1. You can also set est_forward=1 to get the names from the old models (you have to add it as it?s not already there). 
If you want to try to force an alignment to a specific location, you can also add maker_coor=chr2:1-3000 to the fasta header comment line to have maker only allow alignments within a specific region (chr2:1-3000 in the example).

--Carson

> On Jan 26, 2018, at 4:16 PM, Quanwei Zhang wrote:
>
> Hi Carson:
>
> Thank you for your previous suggestions. I have done the annotation according to your suggestions. I first mapped the transcripts from the old assembly to the new assembly by setting "est2genome=1", and then updated the models with new predictions.
>
> Besides mapping with "est2genome=1", do you think it is a good idea to do a separate mapping with the proteins of the old assembly (setting "protein2genome=1")? I would then provide both mapping GFF files (i.e., the GFFs from mapping transcripts and proteins separately) and update them with new predictions and evidence support. I am trying this because I found that certain genes were not mapped to the new assembly, but they can be mapped by protein orthologs.
>
> Thank you.
>
> Best
> Quanwei
>
> 2017-10-24 18:26 GMT-04:00 Carson Holt:
> Yes. If you use est2genome it will just align the model, and then find the longest ORF. So it is a quick way to just align old models to the new assembly. Alternatively you can just do de novo annotation.
>
> --Carson
>
>> On Oct 24, 2017, at 10:54 AM, Quanwei Zhang wrote:
>>
>> Dear Carson:
>>
>> Thank you again for your suggestions. I just got the new genome assembly of NMR and have started the gene annotation. I understand your ideas about this. But can I simply use the old genome transcripts as transcript evidence, and just follow the standard Maker2 pipeline? I set est2genome=1 and provide the mRNA sequences in fasta format for the first round of SNAP training.
>>
>> For transcripts I have the following choices. I think the first choice is more reliable and better, right?
>> (1) There are about 60,000 RefSeq transcripts from NCBI. So I downloaded those sequences in fasta format.
>> (2) We have the raw RNA-seq data from 11 tissues; we can assemble each sample with trinity and then get the transcripts. But I think most of the RNA-seq should have been submitted to NCBI.
>>
>> BTW, if we use the RefSeq data from NCBI, we can download the mRNA sequences, coding sequences or protein sequences. I wonder which type of data is best for training SNAP? For Augustus, we will use BUSCO to train it.
>>
>> Many thanks.
>>
>> Best
>> Quanwei
>>
>> 2017-09-29 12:36 GMT-04:00 Carson Holt:
>> You can try using the est2genome=1 option to map the old models forward onto the new assembly as if they were ESTs (add a line that says est_forward=1 to the control file to maintain old naming and set est= to the old model transcript file). Then provide the final models as a pred_gff for a subsequent run (i.e. a traditional MAKER run where you are annotating the new assembly with transcript and protein evidence and ab initio predictors). Don't supply the old models to est= on that run.
>>
>> The idea behind doing it this way is:
>> 1. You need to get old models onto the new assembly so coordinates will change. So by doing it this way, you will at least be able to move many models forward based on homology.
>> 2. By providing the models to pred_gff on a subsequent MAKER run, you are just letting old models compete against new annotations. They will be rejected if they have no evidence support, or can be kept if they score better than alternate models from SNAP/Augustus. That way you have the chance to integrate old models while at the same time rejecting some old models that have no evidence overlap.
>>
>> --Carson
>>
>> > On Sep 28, 2017, at 6:05 AM, Quanwei Zhang wrote:
>> >
>> > Hello:
>> >
>> > Recently, we got a new version of the NMR genome, which had been assembled and annotated a few years ago. We can download the gene annotation from NCBI.
>> >
>> > Now we want to annotate the new genome using the Maker2 pipeline. I wonder how I can make full use of the existing annotations. On the other hand, since the previous genome was not very well assembled, some gene annotations may be false positives. I hope those false positive genes in the previous annotation won't mislead Maker2 for the current gene annotation.
>> >
>> > Do you have any suggestions? Thanks
>> >
>> > Best
>> > Quanwei
>> > _______________________________________________
>> > maker-devel mailing list
>> > maker-devel at box290.bluehost.com
>> > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From qwzhang0601 at gmail.com Mon Jan 29 12:57:42 2018
From: qwzhang0601 at gmail.com (Quanwei Zhang)
Date: Mon, 29 Jan 2018 14:57:42 -0500
Subject: [maker-devel] gene annotation for a better genome
In-Reply-To: <753F1840-4874-4C0D-80F7-59E1A1579884@gmail.com>
References: <5AFEDD05-DF02-463F-A6EE-1619A9BB968D@gmail.com> <753F1840-4874-4C0D-80F7-59E1A1579884@gmail.com>
Message-ID:

Dear Carson:

Thank you for your reply. Do you mean setting est2genome=1 and protein2genome=1 in one round, or doing the two mappings in separate rounds?

So I will provide the gff files from mapping the transcripts and proteins to "pred_gff". Besides the gff from such mapping, I am also considering providing a gff file obtained from a regular de novo annotation by maker2, and then updating the gene models from those gff files.

Here is the reason why I am considering this. Suppose at location 1 there is a gene model gA obtained by mapping transcripts and proteins. If I try to update those gene models in the second round of maker, maker cannot change the internal exons of gA (so it cannot replace it).
However, if I provide both the gff from mapping transcripts and the gff from maker de novo annotation, then if another gene model gA' (from de novo annotation) is predicted by maker at the same location, maker will compare gA and gA' and select the one with the higher score, right? In this way we can replace a mapped gene model with a model predicted by maker if the predicted one has stronger evidence support. Right?

Thank you.

Best
Quanwei

From admin at genome.arizona.edu Mon Jan 29 16:08:54 2018
From: admin at genome.arizona.edu (admin at genome.arizona.edu)
Date: Mon, 29 Jan 2018 16:08:54 -0700
Subject: [maker-devel] MPI selection
Message-ID: <77cfb864-4de1-a9af-aeea-9d3e7cf45ce5@genome.arizona.edu>

Hi, we now have three versions of MPI installed on our cluster: OpenMPI, MPICH, and MVAPICH2.
Since we have infiniband, MVAPICH2 is working best with MPI test programs. MPICH should support infiniband too, but currently there are some seg faults with it that we are trying to resolve.

On our cluster we have a ~/.mpi-selection file which allows users to pick the MPI installation to use, and sets the appropriate PATH and LD_LIBRARY_PATH variables.

I am looking through the Maker MPI instructions, and it seems that a certain mpicc and mpi.h must be chosen during installation. So if Maker was originally installed with MPICH, would I have to reinstall it if users want to use MVAPICH2? Or is there a config file somewhere I can update so I don't have to reinstall Maker? Or does nothing need to be done, and we can rely on the PATH and LD_LIBRARY_PATH variables pointing to the correct mpicc and libmpi.so (mpi.h is in the include directory)?

Thanks

From yuejiaxing at gmail.com Tue Jan 30 09:32:04 2018
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Tue, 30 Jan 2018 17:32:04 +0100
Subject: [maker-devel] Is it possible to extract the GFF3 file for the raw gene models predicted by est2genome and protein2genome?
Message-ID:

Hello,

I enabled the est2genome and protein2genome options for Maker-3.00.0-beta in my particular case. I was wondering if it is possible to extract the gene models predicted by snap, augustus, est2genome, and protein2genome respectively.

By using the gff_merge command, I think I can extract some gene models for each case but not all, especially for the est2genome and protein2genome sets (e.g. those labeled with "maker-chr*-exonerate_est2genome-gene" and "maker-chr*-exonerate_protein2genome-gene").

Thanks in advance!

Best,
Jia-Xing
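A minimal sketch of the control-file changes Carson describes in the "gene annotation for a better genome" thread (est2genome=1, protein2genome=1, plus an est_forward=1 line that has to be added by hand because it is not in the file by default). The helper name set_ctl_opt is hypothetical and not part of MAKER; the option names come from the thread, and the control file is assumed to consist of plain key=value lines. Note that the sketch drops any trailing comment on a line it rewrites.

```shell
# set_ctl_opt FILE KEY VALUE  (hypothetical helper, not shipped with MAKER)
# Rewrite the "KEY=..." line of a key=value control file, or append
# "KEY=VALUE" when the key is absent (as Carson says est_forward is).
set_ctl_opt() {
    ctl_file=$1
    ctl_key=$2
    ctl_value=$3
    ctl_tmp=$(mktemp) || return 1
    awk -v k="$ctl_key" -v v="$ctl_value" '
        index($0, k "=") == 1 { print k "=" v; found = 1; next }  # replace existing line
        { print }                                                 # keep everything else
        END { if (!found) print k "=" v }                         # append if missing
    ' "$ctl_file" > "$ctl_tmp" && cat "$ctl_tmp" > "$ctl_file"
    ctl_status=$?
    rm -f "$ctl_tmp"
    return "$ctl_status"
}
```

For the mapping run this could be called three times on the options control file (usually maker_opts.ctl), once each for est2genome, protein2genome, and est_forward, with the value 1.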
From carsonhh at gmail.com Tue Jan 30 09:47:39 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 30 Jan 2018 09:47:39 -0700
Subject: [maker-devel] MPI selection
In-Reply-To: <77cfb864-4de1-a9af-aeea-9d3e7cf45ce5@genome.arizona.edu>
References: <77cfb864-4de1-a9af-aeea-9d3e7cf45ce5@genome.arizona.edu>
Message-ID: <34C36A98-A87F-4B28-8E05-FCD412CFEBEA@gmail.com>

The libraries used by MVAPICH2, Intel MPI, and OpenMPI to access infiniband have a known bug. For performance reasons, infiniband libraries use registered memory in a way that makes it impossible to do system calls to external programs under MPI (doing so results in seg faults). MAKER has to call out to external programs like BLAST, exonerate, etc., so it triggers this bug.

The infiniband bug is well known, and unfortunately will not be fixed because fixing it causes infiniband to lose some advertised features like direct memory access.

As a workaround, OpenMPI and Intel MPI allow you to disable the infiniband libraries via command line flags and use IP over infiniband instead (i.e. they let you drop infiniband features on demand so that your code will run). However, MVAPICH2 does not provide the same option. As a result you cannot use MAKER, or any MPI program that does system calls to external programs, with MVAPICH2 (it results in seg faults). But you can use all other MPI flavors with the appropriate flags detailed below:

#For OpenMPI, use as follows (the example assumes ib0 is your IP over infiniband adapter)
export LD_PRELOAD=/path/to/openmpi/libmpi.so
mpiexec --mca btl vader,tcp,self --mca btl_tcp_if_include ib0 --mca btl_openib_want_fork_support 1 --mca mpi_warn_on_fork 0 maker

#For Intel MPI set these environmental variables before launch
export I_MPI_FABRICS='shm:tcp'
export I_MPI_HYDRA_IFACE='ib0'
mpiexec maker

#For MPICH, nothing is needed as the infiniband libraries are always disabled, but you can specifically tell it to use the ib0 adapter as the communicator
mpiexec -iface ib0 maker

--Carson

From carsonhh at gmail.com Tue Jan 30 09:54:05 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 30 Jan 2018 09:54:05 -0700
Subject: [maker-devel] gene annotation for a better genome
In-Reply-To:
References: <5AFEDD05-DF02-463F-A6EE-1619A9BB968D@gmail.com> <753F1840-4874-4C0D-80F7-59E1A1579884@gmail.com>
Message-ID: <921EBAEF-13E3-4175-90A2-8F41651F95C9@gmail.com>

You can set both simultaneously. est2genome will almost always be picked first since it will match better than the protein alignment (i.e. it matches at UTRs).

--Carson

From carsonhh at gmail.com Tue Jan 30 09:57:01 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 30 Jan 2018 09:57:01 -0700
Subject: [maker-devel] Is it possible to extract the GFF3 file for the raw gene models predicted by est2genome and protein2genome?
In-Reply-To:
References:
Message-ID: <9771EB42-8A80-49D8-9A21-67406860FD4F@gmail.com>

You can just grep on the name. Note, though, that est2genome and protein2genome should only be used for initial training: the models they produce are almost always guaranteed to be partial, so both options should be disabled once you have trained gene predictors that can build complete models.

--Carson

From yuejiaxing at gmail.com Tue Jan 30 10:03:34 2018
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Tue, 30 Jan 2018 18:03:34 +0100
Subject: [maker-devel] Is it possible to extract the GFF3 file for the raw gene models predicted by est2genome and protein2genome?
In-Reply-To: <9771EB42-8A80-49D8-9A21-67406860FD4F@gmail.com>
References: <9771EB42-8A80-49D8-9A21-67406860FD4F@gmail.com>
Message-ID:

Dear Carson,

Thanks for the quick response! Could you elaborate a bit on "grep on the name"? Do you mean just grep all the lines in the gff_merge output with "est2genome" and "protein2genome" in column 3? In that case, what I get is the alignments rather than the gene models guessed by Maker based on the alignments, right?

Thanks!

Best,
Jia-Xing

--
Jia-Xing Yue
Population Genomics and Complex Traits Group
Tour Pasteur 8eme etage
Faculté de Médecine
Institute for Research on Cancer and Aging, Nice (IRCAN)
CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
28 Avenue de Valombrose
06107 NICE Cedex 2
France

Twitter: @iAmphioxus
Personal website: http://www.iamphioxus.org/
Lab website: https://litilab.wordpress.com/
Yeast Population Reference Panel: https://yjx1217.github.io/Yeast_PacBio_2016/welcome/

From carsonhh at gmail.com Tue Jan 30 10:06:27 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 30 Jan 2018 10:06:27 -0700
Subject: [maker-devel] Is it possible to extract the GFF3 file for the raw gene models predicted by est2genome and protein2genome?
In-Reply-To:
References: <9771EB42-8A80-49D8-9A21-67406860FD4F@gmail.com>
Message-ID: <335E2942-4FCA-4F3C-A488-06116F6B7604@gmail.com>

MAKER models will all have "maker" in the source column. Everything else is a reference alignment (not a model). But you can grep on the gene name. If it is sourced from SNAP, it will have snap in the name, and the same is true for augustus, est2genome, protein2genome, etc.

--Carson

From admin at genome.arizona.edu Tue Jan 30 10:24:05 2018
From: admin at genome.arizona.edu (admin at genome.arizona.edu)
Date: Tue, 30 Jan 2018 10:24:05 -0700
Subject: [maker-devel] MPI selection
In-Reply-To: <34C36A98-A87F-4B28-8E05-FCD412CFEBEA@gmail.com>
References: <77cfb864-4de1-a9af-aeea-9d3e7cf45ce5@genome.arizona.edu> <34C36A98-A87F-4B28-8E05-FCD412CFEBEA@gmail.com>
Message-ID: <4825e452-aab6-aa13-ebc7-3d3d1832cc60@genome.arizona.edu>

Carson Holt wrote on 01/30/2018 09:47 AM:
> The libraries used by MVAPICH2, Intel MPI, and OpenMPI to access infiniband have a known bug. For performance reasons, infiniband libraries use registered memory in a way that makes it impossible to do system calls to external programs under MPI (doing so results in seg faults). MAKER has to call out to external programs like BLAST, exonerate, etc., so it triggers this bug.
> The infiniband bug is well known, and unfortunately will not be fixed because fixing it causes infiniband to lose some advertised features like direct memory access.

Well that stinks! Maybe that's why we got such a good deal on new-old-stock infiniband equipment! Still, it has allowed us to use the full speed of our NFS RAIDs, which has been nice. I will try using ib0; the speed is still about 10Gb, but I was under the impression that using IPoIB would cause packet loss or other problems...

Thanks for clearing that up. So is there a fabric/protocol you would recommend for clusters running maker?
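Carson's "grep on the name" advice in the GFF3 thread above can be sketched as a small shell filter. It keeps lines whose source column (column 2) is "maker" and whose attribute column mentions a tag such as snap, augustus, est2genome, or protein2genome. The function name maker_models_from is hypothetical, and the sketch assumes that child features (mRNA, exon, CDS) carry the gene name in their ID or Parent attributes, so that they are selected together with the gene line.

```shell
# maker_models_from TAG [GFF...]  (hypothetical helper)
# Print MAKER model lines (source column == "maker") whose attributes
# mention TAG, e.g. snap, augustus, est2genome, protein2genome.
# Reads standard input when no file is given.
maker_models_from() {
    tag=$1
    shift
    awk -F '\t' -v tag="$tag" '
        /^##FASTA/ { exit }               # stop before any embedded sequence section
        /^#/       { next }               # skip comments and directives
        $2 == "maker" && index($9, tag)   # model line whose attributes name TAG
    ' "$@"
}
```

A call such as `maker_models_from est2genome merged.gff` would then list only the models named after est2genome, where merged.gff stands for the gff_merge output (the file name is an assumption). A plain substring match is used, so a tag like snap also matches names containing snap_masked.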
From yuejiaxing at gmail.com Tue Jan 30 10:24:22 2018
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Tue, 30 Jan 2018 12:24:22 -0500
Subject: [maker-devel] Is it possible to extract the GFF3 file for the raw gene models predicted by est2genome and protein2genome?
Message-ID:

Dear Carson,

Yes, that's what I did actually. But it seems that I got far fewer gene models for est2genome and protein2genome in this way than I would expect. I have turned on EVM for my maker run. Could this explain the low numbers of est2genome and protein2genome models that I got?

Thx!

Best,
Jia-Xing

Sent from my Nokia Lumia 920

From carsonhh at gmail.com Tue Jan 30 10:37:59 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 30 Jan 2018 10:37:59 -0700
Subject: [maker-devel] MPI selection
In-Reply-To: <4825e452-aab6-aa13-ebc7-3d3d1832cc60@genome.arizona.edu>
References: <77cfb864-4de1-a9af-aeea-9d3e7cf45ce5@genome.arizona.edu> <34C36A98-A87F-4B28-8E05-FCD412CFEBEA@gmail.com> <4825e452-aab6-aa13-ebc7-3d3d1832cc60@genome.arizona.edu>
Message-ID:

MAKER does not really move a lot of data with MPI; it's just moving around command lines and small variables. So not getting full infiniband performance will not hurt you. I doubt you will see any issues using ib0. For MPI flavor, I get the best performance with Intel MPI followed by OpenMPI.

Overall you will find that MAKER is IO bound as opposed to CPU or communications bound. So pointing it at your best performing network based storage will be the greatest performance factor (if you have Lustre storage, point it there for example). Pull back on job size and count if other users have issues accessing the disk (too many jobs can bring NFS to its knees).

The one suggestion I have as far as job size is to keep job sizes under 200 CPU cores. Over that, you will get better performance by splitting up datasets and submitting multiple jobs. Also, MAKER keeps a log of its progress, so you can kill jobs or restart failed jobs, and they pick up right where they left off.

--Carson
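Carson's suggestion to split up datasets rather than run a single job above roughly 200 CPU cores can be sketched as follows. This is only one way to do it and not a MAKER feature: the hypothetical split_fasta helper deals the sequences of a multi-FASTA file round-robin into N smaller files, each of which could then be submitted as the genome input of a separate MAKER job. It keeps N output files open at once, so N is assumed to be small.

```shell
# split_fasta FASTA N PREFIX  (hypothetical helper)
# Deal the records of a multi-FASTA file round-robin into N files
# named PREFIX.1.fa ... PREFIX.N.fa.
split_fasta() {
    awk -v n="$2" -v prefix="$3" '
        /^>/ { out = sprintf("%s.%d.fa", prefix, (count++ % n) + 1) }  # new record: pick next file
        out != "" { print > out }                                      # write header and sequence lines
    ' "$1"
}
```

Round-robin assignment keeps the number of sequences per chunk even, but not necessarily the number of bases; for assemblies with a few very large scaffolds a size-aware split may balance the jobs better.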
> > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From qlian003 at ucr.edu Wed Jan 3 18:52:26 2018 From: qlian003 at ucr.edu (Qihua Liang) Date: Wed, 3 Jan 2018 17:52:26 -0800 Subject: [maker-devel] questions on master_datastore_index.log file Message-ID: Dear Maker Develop Team, I have successfully run Maker several times before. But I came across a strange thing days ago when I ran Maker again on a different assembly with the same input files and settings. I saw the message "Maker is now finished!!!" but got an empty GFF3 and no fasta files. And then I checked the master_datastore_index.log and realized that there are a lot of "failed"s and "retry"s and "failed" again. What does this mean? Since I used the same inputs as previous successful runs, could you provide some instructions on how to debug and solve it? Thank you so much Qihua From o.k.torresen at ibv.uio.no Thu Jan 4 06:21:28 2018 From: o.k.torresen at ibv.uio.no (Ole Kristian Tørresen) Date: Thu, 4 Jan 2018 13:21:28 +0000 Subject: [maker-devel] Names/IDs from pred_gff are included in final gff Message-ID: <76613C81-ED9B-45F2-B84B-B60BC1D4D972@ibv.uio.no> Hi, as far as I can see, names or IDs of features in gffs given to pred_gff are included in the final output as the name of the feature. As far as I can understand, this is not expected behaviour (it is for model_gff). This is with MAKER 2.31.9. I have these settings: map_forward=0 keep_preds=1 I thought that map_forward had to be 1 to get the names for the old GFFs. Can you replicate this? Thank you. Sincerely, Ole K. 
Tørresen From d.ence at ufl.edu Thu Jan 4 07:16:42 2018 From: d.ence at ufl.edu (Ence,daniel) Date: Thu, 4 Jan 2018 14:16:42 +0000 Subject: [maker-devel] questions on master_datastore_index.log file In-Reply-To: References: Message-ID: Hi, Before we can give any help to debug it, we need the error messages. These should be in the same file that the "maker is finished" message is in. Look for the first error message (the one closest to the top of the file) and send that to the mailing list. Thanks, Daniel > On Jan 3, 2018, at 8:52 PM, Qihua Liang wrote: > > Dear Maker Develop Team, > > I have successfully run Maker several times before. But I came across a strange thing days ago when I ran Maker again on a different assembly with the same input files and settings. > > I saw the message "Maker is now finished!!!" but got an empty GFF3 and no fasta files. And then I checked the master_datastore_index.log and realized that there are a lot of "failed"s and "retry"s and "failed" again. What does this mean? Since I used the same inputs as previous successful runs, could you provide some instructions on how to debug and solve it? > > Thank you so much > Qihua > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > https://urldefense.proofpoint.com/v2/url?u=http-3A__box290.bluehost.com_mailman_listinfo_maker-2Ddevel-5Fyandell-2Dlab.org&d=DwIGaQ&c=pZJPUDQ3SB9JplYbifm4nt2lEVG5pWx2KikqINpWlZM&r=12jzlNvGVD0AlPJ4E7cTlw1Dvu6n9cb4kMCobJ28XPs&m=nUDCP_0kFOhDYlTHgOpWtf_zdL77aQFeQwYOGIQwP8c&s=9Z4T1hdtxOyIjpn6f70qhrQRuGsZxXdV-oLSJF1zGkY&e= From qlian003 at ucr.edu Thu Jan 4 14:36:18 2018 From: qlian003 at ucr.edu (Qihua Liang) Date: Thu, 4 Jan 2018 13:36:18 -0800 Subject: [maker-devel] questions on master_datastore_index.log file In-Reply-To: References: Message-ID: Hi Ence, When I searched for "E/error" 
in the output file, here is what first showed up: Process::MpiChunk::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 Error::subs::try(CODE(0x502bbb0), HASH(0x5007788)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 4269 Process::MpiChunk::_go(Process::MpiChunk=HASH(0x50a1a18), "flow", HASH(0x50ad0f0), 2, 0) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 378 Process::MpiChunk::_flow(Process::MpiChunk=HASH(0x50a1a18), HASH(0x50ad0f0), 2, 0, Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 318 Process::MpiTiers::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 Error::subs::try(CODE(0x50a9348), HASH(0x4ff0ec0)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 338 Process::MpiTiers::_next_level(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 179 Is this what you may need? Qihua > On Jan 4, 2018, at 6:16 AM, Ence,daniel wrote: > > Hi, Before we can give any help to debug it, we need the error messages. These should be in the same file that the ?maker is finished? message is in. Look for the first error message (the one closest to the top of the file) and send that to the mailing list. > > Thanks, > Daniel > > >> On Jan 3, 2018, at 8:52 PM, Qihua Liang wrote: >> >> Dear Maker Develop Team, >> >> I have successfully run Maker for several times before. But I came across a strange thing days ago when I ran Maker again on a different assembly with the same input files and settings. >> >> I saw the message of "Maker is now finished!!!? but got empty GFF3 and no fasta files. 
And then I checked the master_datastore_index.log and realized that there are a lot of "failed"s and "retry"s and "failed" again. What does this mean? Since I used the same inputs as previous successful runs, could you provide some instructions on how to debug and solve it? >> >> Thank you so much >> Qihua >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> https://urldefense.proofpoint.com/v2/url?u=http-3A__box290.bluehost.com_mailman_listinfo_maker-2Ddevel-5Fyandell-2Dlab.org&d=DwIGaQ&c=pZJPUDQ3SB9JplYbifm4nt2lEVG5pWx2KikqINpWlZM&r=12jzlNvGVD0AlPJ4E7cTlw1Dvu6n9cb4kMCobJ28XPs&m=nUDCP_0kFOhDYlTHgOpWtf_zdL77aQFeQwYOGIQwP8c&s=9Z4T1hdtxOyIjpn6f70qhrQRuGsZxXdV-oLSJF1zGkY&e= > From carsonhh at gmail.com Fri Jan 5 20:22:56 2018 From: carsonhh at gmail.com (Carson Holt) Date: Fri, 5 Jan 2018 20:22:56 -0700 Subject: [maker-devel] questions on master_datastore_index.log file In-Reply-To: References: Message-ID: That's the stack trace. The error itself will be a few lines further back. It would be best to get a few hundred lines right around the area you are showing. --Carson > On Jan 4, 2018, at 2:36 PM, Qihua Liang wrote: > > Hi Ence, > > When I searched for "E/error" 
in the output file, here is what first showed up: > Process::MpiChunk::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 > eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 > Error::subs::try(CODE(0x502bbb0), HASH(0x5007788)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 4269 > Process::MpiChunk::_go(Process::MpiChunk=HASH(0x50a1a18), "flow", HASH(0x50ad0f0), 2, 0) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 378 > Process::MpiChunk::_flow(Process::MpiChunk=HASH(0x50a1a18), HASH(0x50ad0f0), 2, 0, Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 318 > Process::MpiTiers::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 > eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 > Error::subs::try(CODE(0x50a9348), HASH(0x4ff0ec0)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 338 > Process::MpiTiers::_next_level(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 179 > > Is this what you may need? > > Qihua > >> On Jan 4, 2018, at 6:16 AM, Ence,daniel > wrote: >> >> Hi, Before we can give any help to debug it, we need the error messages. These should be in the same file that the ?maker is finished? message is in. Look for the first error message (the one closest to the top of the file) and send that to the mailing list. >> >> Thanks, >> Daniel >> >> >>> On Jan 3, 2018, at 8:52 PM, Qihua Liang > wrote: >>> >>> Dear Maker Develop Team, >>> >>> I have successfully run Maker for several times before. But I came across a strange thing days ago when I ran Maker again on a different assembly with the same input files and settings. >>> >>> I saw the message of "Maker is now finished!!!? but got empty GFF3 and no fasta files. 
And then I checked the master_datastore_index.log and realized that there are a lot of "failed"s and "retry"s and "failed" again. What does this mean? Since I used the same inputs as previous successful runs, could you provide some instructions on how to debug and solve it? >>> >>> Thank you so much >>> Qihua >>> _______________________________________________ >>> maker-devel mailing list >>> maker-devel at box290.bluehost.com >>> https://urldefense.proofpoint.com/v2/url?u=http-3A__box290.bluehost.com_mailman_listinfo_maker-2Ddevel-5Fyandell-2Dlab.org&d=DwIGaQ&c=pZJPUDQ3SB9JplYbifm4nt2lEVG5pWx2KikqINpWlZM&r=12jzlNvGVD0AlPJ4E7cTlw1Dvu6n9cb4kMCobJ28XPs&m=nUDCP_0kFOhDYlTHgOpWtf_zdL77aQFeQwYOGIQwP8c&s=9Z4T1hdtxOyIjpn6f70qhrQRuGsZxXdV-oLSJF1zGkY&e= >> > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From carsonhh at gmail.com Tue Jan 16 11:15:29 2018 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 16 Jan 2018 11:15:29 -0700 Subject: [maker-devel] Names/IDs from pred_gff are included in final gff In-Reply-To: <76613C81-ED9B-45F2-B84B-B60BC1D4D972@ibv.uio.no> References: <76613C81-ED9B-45F2-B84B-B60BC1D4D972@ibv.uio.no> Message-ID: <8CB421A6-3CB0-4539-B55A-D3F4CA61D0AD@gmail.com> pred_gff will maintain its name in the match/match_part feature, as the information is pulled directly from the input GFF3. But any pred_gff feature that becomes a final model will be renamed to something like "scaffold_1517-pred_gff_GeneMark.hmm-gene-0.6-mRNA-1" unless you specify map_forward=1 to maintain the original name. --Carson > On Jan 4, 2018, at 6:21 AM, Ole Kristian Tørresen wrote: > > Hi, > as far as I can see, names or IDs of features in gffs given to pred_gff are included in the final output as the name of the feature. 
As far as I can understand, this is not expected behaviour (it is for model_gff). This is with MAKER 2.31.9. > > I have these settings: > map_forward=0 > keep_preds=1 > > I thought that map_forward had to be 1 to get the names for the old GFFs. Can you replicate this? > > Thank you. > > Sincerely, > Ole K. Tørresen > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From o.k.torresen at ibv.uio.no Wed Jan 17 10:52:13 2018 From: o.k.torresen at ibv.uio.no (Ole Kristian Tørresen) Date: Wed, 17 Jan 2018 17:52:13 +0000 Subject: [maker-devel] Names/IDs from pred_gff are included in final gff In-Reply-To: <8CB421A6-3CB0-4539-B55A-D3F4CA61D0AD@gmail.com> References: <76613C81-ED9B-45F2-B84B-B60BC1D4D972@ibv.uio.no> <8CB421A6-3CB0-4539-B55A-D3F4CA61D0AD@gmail.com> Message-ID: <583A84D5-B979-4C2F-B262-2D55A6F55B56@ibv.uio.no> Ok, but I have an entry in the final gff like this: ID=maker-GmG20150304_scaffold_2371-pred_gff_maker-gene-0.125-mRNA-1;Parent=maker-GmG20150304_scaffold_2371-pred_gff_maker-gene-0.125;Name=ENSGMOT00000000668.1;_AED=0.00;_eAED=0.00;_QI=819|1|1|1|1|1|4|112|726;score=89.75616 (The name is derived from a pred_gff entry which is the result of mapping an old annotation to the new assembly). This is then called ENSGMOT00000000668.1 protein AED:0.00 eAED:0.00 QI:819|1|1|1|1|1|4|112|726 in the proteins.fasta file, which is unfortunate, because it apparently mapped to 12 places in the assembly. I have set map_forward=0, but keep_preds=1 (filtering on domain presence and AED score later). This file and another (the result of genemark_gtf2gff3) are not input to MAKER as match/match_part, but as gene/exon/CDS/mRNA. Could that be the issue? 
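To check how often a carried-over name such as the one in the entry above appears in a final GFF3, the attribute column can be parsed and the Name values counted. The following is a minimal sketch that assumes standard GFF3 (nine tab-separated columns, `key=value` pairs separated by `;`); the helper names `parse_gff3_attributes` and `count_mrna_names` are hypothetical and not part of MAKER.

```python
from collections import Counter
from urllib.parse import unquote


def parse_gff3_attributes(column9):
    """Turn a GFF3 attribute string into a dict of key -> unescaped value."""
    attrs = {}
    for field in column9.strip().split(";"):
        if not field:
            continue
        key, _, value = field.partition("=")
        attrs[key] = unquote(value)  # GFF3 values may be percent-encoded
    return attrs


def count_mrna_names(gff_lines):
    """Count how many mRNA features share each Name attribute."""
    counts = Counter()
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) == 9 and cols[2] == "mRNA":
            name = parse_gff3_attributes(cols[8]).get("Name")
            if name:
                counts[name] += 1
    return counts
```

Names with a count above one would be the ones that end up duplicated in the proteins.fasta headers.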
Ole > On 16 Jan 2018, at 19:15, Carson Holt wrote: > > pred_gff will maintain its name in the match/match_part feature, as the information is pulled directly from the input GFF3. But any pred_gff feature that becomes a final model will be renamed to something like "scaffold_1517-pred_gff_GeneMark.hmm-gene-0.6-mRNA-1" unless you specify map_forward=1 to maintain the original name. > > --Carson > > > >> On Jan 4, 2018, at 6:21 AM, Ole Kristian Tørresen wrote: >> >> Hi, >> as far as I can see, names or IDs of features in gffs given to pred_gff are included in the final output as the name of the feature. As far as I can understand, this is not expected behaviour (it is for model_gff). This is with MAKER 2.31.9. >> >> I have these settings: >> map_forward=0 >> keep_preds=1 >> >> I thought that map_forward had to be 1 to get the names for the old GFFs. Can you replicate this? >> >> Thank you. >> >> Sincerely, >> Ole K. Tørresen >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > From qlian003 at ucr.edu Sat Jan 6 16:09:55 2018 From: qlian003 at ucr.edu (Qihua Liang) Date: Sat, 6 Jan 2018 15:09:55 -0800 Subject: [maker-devel] questions on master_datastore_index.log file In-Reply-To: References: Message-ID: <0BECB285-BB11-4F46-B6D7-072640F311B2@ucr.edu> Hi Carson, I am pasting more lines of the error messages. I noticed the error "ERROR: Can't open seq file: /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/query.masked.gff.seq". The sequence name for "ScsGwly" is ">ScsGwly_6124;HRSCAF=6247". Is it the sequence naming that makes the temp file name weird? 
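One way to test the naming question raised here is to rerun on a copy of the assembly whose headers contain only plain characters. The sketch below is an assumption-laden illustration, not a MAKER utility: the function names are hypothetical, the set of "safe" characters (letters, digits, `_`, `.`, `-`) is a conservative guess, and anything after the first whitespace in a header is dropped.

```python
import re


def sanitize_id(seq_id):
    """Replace every character outside [A-Za-z0-9_.-] with an underscore."""
    return re.sub(r"[^A-Za-z0-9_.-]", "_", seq_id)


def sanitize_fasta(lines):
    """Return (new_lines, mapping) with header IDs rewritten.

    Sequence lines pass through unchanged. The mapping old_id -> new_id
    can be used to rename the same IDs in any GFF3 evidence files.
    """
    new_lines, mapping, used = [], {}, set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            old_id = fields[0] if fields else ""
            new_id = sanitize_id(old_id)
            if new_id in used:
                # Two headers ended up with the same ID; refuse to continue.
                raise ValueError(f"renaming {old_id!r} collides on {new_id!r}")
            used.add(new_id)
            mapping[old_id] = new_id
            new_lines.append(f">{new_id}\n")
        else:
            new_lines.append(line)
    return new_lines, mapping
```

With the header from this thread, `sanitize_id("ScsGwly_6124;HRSCAF=6247")` gives `ScsGwly_6124_HRSCAF_6247`, so the datastore paths would no longer need the `%3B` escape.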
Thanks Qihua #--------- command -------------# Widget::blastx: /24-2/home/qliang/0.soft/maker/bin/../exe/blast/bin/blastx -db /tmp/maker_nJDkCL/te_proteins%2Efasta.mpi.10.9 -query /tmp/maker_nJDkCL/0/ScsG wly_5932%3BHRSCAF=6050.0 -num_alignments 10000 -num_descriptions 10000 -evalue 1e-06 -dbsize 300 -searchsp 500000000 -num_threads 1 -seg yes -soft_masking true -lcase_masking -show_gis -out /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/D7/4 A/ScsGwly_5932%3BHRSCAF=6050//theVoid.ScsGwly_5932%3BHRSCAF=6050/0/ScsGwly_5932%3BHRSCAF=6050.0.te_proteins%2Efasta.repeatrunner.temp_dir/te_ proteins%2Efasta.mpi.10.9.repeatrunner #-------------------------------# deleted:0 hits collecting blastx repeatmasking processing all repeats in cluster::shadow_cluster... ...finished clustering. preparing masked sequence preparing ab-inits running snap. #--------- command -------------# Widget::snap: /24-2/home/qliang/0.soft/maker/exe/snap/snap /home/qliang/cowpea/annotation/09.tingting/4.Abintio/2.CEGMA/3.maker/maker1.hmm/maker1.snap.hmm /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0 > /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0.maker1%2Esnap%2Eh mm.snap #-------------------------------# scoring....decoding.10.20.30.40.50.60.70.80.90.100 done scoring....decoding.10.20.30.40.50.60.70.80.90.100 done running augustus. #--------- command -------------# Widget::augustus: /usr/local/augustus.2.7/bin/augustus --species=cowpea_new --UTR=off /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0 > /tmp/maker _nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0.cowpea_new.augustus #-------------------------------# deleted:0 hits collecting blastx repeatmasking processing all repeats in cluster::shadow_cluster... ...finished clustering. deleted:0 hits doing blastx repeats running blast search. 
#--------- command -------------# Widget::blastx: /24-2/home/qliang/0.soft/maker/bin/../exe/blast/bin/blastx -db /tmp/maker_mvdRkd/te_proteins%2Efasta.mpi.10.6 -query /tmp/maker_mvdRkd/0/chr10.75 -num_alignments 10000 -num_descriptions 10000 -evalue 1e-06 -dbsize 300 -searchsp 500000000 -num_threads 1 -seg yes -soft_masking true -lcase_masking -show_gis -out /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/09/chr10//theVoid.chr10/7/chr10.75.te_proteins%2Efasta.repeatrunner.temp_dir/te_proteins%2Efasta.mpi.10.6.repeatrunner #-------------------------------# doing blastx repeats re reading blast report. /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/0/ScsGwly_6124%3BHRSCAF=6247.0.te_proteins%2Efasta.repeatrunner deleted:0 hits doing blastx repeats doing blastx repeats doing blastx repeats doing blastx repeats doing blastx repeats doing blastx repeats doing blastx repeats doing blastx repeats doing blastx repeats collecting blastx repeatmasking processing all repeats in cluster::shadow_cluster... ...finished clustering. ERROR: Can't open seq file: /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/query.masked.gff.seq No such file or directory at /24-2/home/qliang/0.soft/maker/bin/../lib/Dumper/GFF/GFFV3.pm line 199. 
Dumper::GFF::GFFV3::finalize(Dumper::GFF::GFFV3=HASH(0x5000ab8)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 700 Process::MpiChunk::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 Error::subs::try(CODE(0x502bbb0), HASH(0x5007788)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 4269 Process::MpiChunk::_go(Process::MpiChunk=HASH(0x50a1a18), "flow", HASH(0x50ad0f0), 2, 0) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 378 Process::MpiChunk::_flow(Process::MpiChunk=HASH(0x50a1a18), HASH(0x50ad0f0), 2, 0, Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 318 Process::MpiTiers::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 Error::subs::try(CODE(0x50a9348), HASH(0x4ff0ec0)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 338 Process::MpiTiers::_next_level(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 179 Process::MpiTiers::next_chunk(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 286 Process::MpiTiers::run_all(Process::MpiTiers=HASH(0x4fb3350), 0) called at /home/qliang/0.soft/maker/bin/maker line 695 --> rank=NA, hostname=H4 ERROR: Failed while builing masking tiers --> rank=NA, hostname=H4 --> rank=NA, hostname=H4 ERROR: Can not get next level running genemark. 
#--------- command -------------# Widget::genemark: /24-2/home/qliang/0.soft/PerlPackages/ActivePerl-5.22/bin/perl-static /24-2/home/qliang/0.soft/maker/bin/../lib/Widget/genemark/gmhmm_wrap -m /home/qliang/cowpea/annotation/05.CEGMA/2.genemask/output/gmhmm.mod -g /24-2/home/qliang/0.soft/makerPackages/gm_et_linux_64/gmes_petap/gmhmme3 -p /24-2/home/qliang/0.soft/makerPackages/gm_et_linux_64/gmes_petap/probuild -o /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_nomask.0.gmhmm%2Emod.genemark /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_nomask.0 #-------------------------------# FAILED CONTIG:ScsGwly_6124;HRSCAF=6247 examining contents of the fasta file and run log --Next Contig-- #--------------------------------------------------------------------- Now starting the contig!! SeqID: ScsGwly_6140;HRSCAF=6263 Length: 1247 #--------------------------------------------------------------------- > On Jan 5, 2018, at 7:22 PM, Carson Holt wrote: > > That?s the stack trace. The error is going to be a few lines further back. It would be best to get a few hundred lines right around the area you are showing. > > ?Carson > >> On Jan 4, 2018, at 2:36 PM, Qihua Liang > wrote: >> >> Hi Ence, >> >> When I searched for ?E/error? 
in the output file, here is what first showed up: >> Process::MpiChunk::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 >> eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 >> Error::subs::try(CODE(0x502bbb0), HASH(0x5007788)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 4269 >> Process::MpiChunk::_go(Process::MpiChunk=HASH(0x50a1a18), "flow", HASH(0x50ad0f0), 2, 0) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 378 >> Process::MpiChunk::_flow(Process::MpiChunk=HASH(0x50a1a18), HASH(0x50ad0f0), 2, 0, Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 318 >> Process::MpiTiers::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 >> eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 >> Error::subs::try(CODE(0x50a9348), HASH(0x4ff0ec0)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 338 >> Process::MpiTiers::_next_level(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 179 >> >> Is this what you may need? >> >> Qihua >> >>> On Jan 4, 2018, at 6:16 AM, Ence,daniel > wrote: >>> >>> Hi, Before we can give any help to debug it, we need the error messages. These should be in the same file that the ?maker is finished? message is in. Look for the first error message (the one closest to the top of the file) and send that to the mailing list. >>> >>> Thanks, >>> Daniel >>> >>> >>>> On Jan 3, 2018, at 8:52 PM, Qihua Liang > wrote: >>>> >>>> Dear Maker Develop Team, >>>> >>>> I have successfully run Maker for several times before. But I came across a strange thing days ago when I ran Maker again on a different assembly with the same input files and settings. >>>> >>>> I saw the message of "Maker is now finished!!!? 
but got an empty GFF3 and no fasta files. And then I checked the master_datastore_index.log and realized that there are a lot of "failed"s and "retry"s and "failed" again. What does this mean? Since I used the same inputs as previous successful runs, could you provide some instructions on how to debug and solve it? >>>> >>>> Thank you so much >>>> Qihua >>>> _______________________________________________ >>>> maker-devel mailing list >>>> maker-devel at box290.bluehost.com >>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__box290.bluehost.com_mailman_listinfo_maker-2Ddevel-5Fyandell-2Dlab.org&d=DwIGaQ&c=pZJPUDQ3SB9JplYbifm4nt2lEVG5pWx2KikqINpWlZM&r=12jzlNvGVD0AlPJ4E7cTlw1Dvu6n9cb4kMCobJ28XPs&m=nUDCP_0kFOhDYlTHgOpWtf_zdL77aQFeQwYOGIQwP8c&s=9Z4T1hdtxOyIjpn6f70qhrQRuGsZxXdV-oLSJF1zGkY&e= >>> >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > From carsonhh at gmail.com Tue Jan 9 10:14:05 2018 From: carsonhh at gmail.com (Carson Holt) Date: Tue, 9 Jan 2018 10:14:05 -0700 Subject: [maker-devel] questions on master_datastore_index.log file In-Reply-To: <0BECB285-BB11-4F46-B6D7-072640F311B2@ucr.edu> References: <0BECB285-BB11-4F46-B6D7-072640F311B2@ucr.edu> Message-ID: <0E5E8721-E814-4BA5-891B-B1C312BC0D4A@gmail.com> Your contig names may create issues, specifically the ";" character, but you should also remove the "=" character. However, I believe your problem may be IO. If you are running under MPI or are running multiple jobs, one of the machines may have that disk location unmounted, the disk may be full, you may have hit a system file quota limit, or the IO load may be slowing things down so much that the file is not actually finished being written when MAKER tries to read it. If IO load is the issue, then you just need to run fewer processes. 
The other possibilities would mean you need to make space, fix the mount, or raise any quotas on your systems. Thanks, Carson > On Jan 6, 2018, at 4:09 PM, Qihua Liang wrote: > > Hi Carson, > > I am pasting more lines of error messages. I notice an error of "ERROR: Can't open seq file: /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/query.masked.gff.seq?, the seq name of ?ScsGwly? is ">ScsGwly_6124;HRSCAF=6247?, is it because of the seq naming that makes the temp file name weird? > > Thanks > Qihua > > #--------- command -------------# > Widget::blastx: > /24-2/home/qliang/0.soft/maker/bin/../exe/blast/bin/blastx -db /tmp/maker_nJDkCL/te_proteins%2Efasta.mpi.10.9 -query /tmp/maker_nJDkCL/0/ScsG > wly_5932%3BHRSCAF=6050.0 -num_alignments 10000 -num_descriptions 10000 -evalue 1e-06 -dbsize 300 -searchsp 500000000 -num_threads 1 -seg yes > -soft_masking true -lcase_masking -show_gis -out /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/D7/4 > A/ScsGwly_5932%3BHRSCAF=6050//theVoid.ScsGwly_5932%3BHRSCAF=6050/0/ScsGwly_5932%3BHRSCAF=6050.0.te_proteins%2Efasta.repeatrunner.temp_dir/te_ > proteins%2Efasta.mpi.10.9.repeatrunner > #-------------------------------# > deleted:0 hits > collecting blastx repeatmasking > processing all repeats > in cluster::shadow_cluster... > ...finished clustering. > preparing masked sequence > preparing ab-inits > running snap. 
> #--------- command -------------# > Widget::snap: > /24-2/home/qliang/0.soft/maker/exe/snap/snap /home/qliang/cowpea/annotation/09.tingting/4.Abintio/2.CEGMA/3.maker/maker1.hmm/maker1.snap.hmm > /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0 > /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0.maker1%2Esnap%2Eh > mm.snap > #-------------------------------# > scoring....decoding.10.20.30.40.50.60.70.80.90.100 done > scoring....decoding.10.20.30.40.50.60.70.80.90.100 done > running augustus. > #--------- command -------------# > Widget::augustus: > /usr/local/augustus.2.7/bin/augustus --species=cowpea_new --UTR=off /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0 > /tmp/maker > _nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_masked.0.cowpea_new.augustus > #-------------------------------# > deleted:0 hits > collecting blastx repeatmasking > processing all repeats > in cluster::shadow_cluster... > ...finished clustering. > deleted:0 hits > doing blastx repeats > running blast search. > #--------- command -------------# > Widget::blastx: > /24-2/home/qliang/0.soft/maker/bin/../exe/blast/bin/blastx -db /tmp/maker_mvdRkd/te_proteins%2Efasta.mpi.10.6 -query /tmp/maker_mvdRkd/0/chr10.75 -num_alignments 10000 -num_descriptions 10000 -evalue 1e-06 -dbsize 300 -searchsp 500000000 -num_threads 1 -seg yes -soft_masking true -lcase_masking -show_gis -out /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/09/chr10//theVoid.chr10/7/chr10.75.te_proteins%2Efasta.repeatrunner.temp_dir/te_proteins%2Efasta.mpi.10.6.repeatrunner > #-------------------------------# > doing blastx repeats > re reading blast report. 
> /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/0/ScsGwly_6124%3BHRSCAF=6247.0.te_proteins%2Efasta.repeatrunner > deleted:0 hits > doing blastx repeats > doing blastx repeats > doing blastx repeats > doing blastx repeats > doing blastx repeats > doing blastx repeats > doing blastx repeats > doing blastx repeats > doing blastx repeats > collecting blastx repeatmasking > processing all repeats > in cluster::shadow_cluster... > ...finished clustering. > ERROR: Can't open seq file: /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/query.masked.gff.seq > No such file or directory > > at /24-2/home/qliang/0.soft/maker/bin/../lib/Dumper/GFF/GFFV3.pm line 199. > Dumper::GFF::GFFV3::finalize(Dumper::GFF::GFFV3=HASH(0x5000ab8)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 700 > Process::MpiChunk::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 > eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 > Error::subs::try(CODE(0x502bbb0), HASH(0x5007788)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 4269 > Process::MpiChunk::_go(Process::MpiChunk=HASH(0x50a1a18), "flow", HASH(0x50ad0f0), 2, 0) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiChunk.pm line 378 > Process::MpiChunk::_flow(Process::MpiChunk=HASH(0x50a1a18), HASH(0x50ad0f0), 2, 0, Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 318 > Process::MpiTiers::__ANON__() called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 415 > eval {...} called at /24-2/home/qliang/0.soft/maker/bin/../lib/Error.pm line 407 > Error::subs::try(CODE(0x50a9348), HASH(0x4ff0ec0)) called at 
/24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 338
> Process::MpiTiers::_next_level(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 179
> Process::MpiTiers::next_chunk(Process::MpiTiers=HASH(0x4fb3350)) called at /24-2/home/qliang/0.soft/maker/bin/../lib/Process/MpiTiers.pm line 286
> Process::MpiTiers::run_all(Process::MpiTiers=HASH(0x4fb3350), 0) called at /home/qliang/0.soft/maker/bin/maker line 695
> --> rank=NA, hostname=H4
> ERROR: Failed while builing masking tiers
> --> rank=NA, hostname=H4
> --> rank=NA, hostname=H4
> ERROR: Can not get next level
> running genemark.
> #--------- command -------------#
> Widget::genemark:
> /24-2/home/qliang/0.soft/PerlPackages/ActivePerl-5.22/bin/perl-static /24-2/home/qliang/0.soft/maker/bin/../lib/Widget/genemark/gmhmm_wrap -m /home/qliang/cowpea/annotation/05.CEGMA/2.genemask/output/gmhmm.mod -g /24-2/home/qliang/0.soft/makerPackages/gm_et_linux_64/gmes_petap/gmhmme3 -p /24-2/home/qliang/0.soft/makerPackages/gm_et_linux_64/gmes_petap/probuild -o /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_nomask.0.gmhmm%2Emod.genemark /tmp/maker_nJDkCL/ScsGwly_5932%3BHRSCAF=6050.abinit_nomask.0
> #-------------------------------#
> FAILED CONTIG:ScsGwly_6124;HRSCAF=6247
>
> examining contents of the fasta file and run log
>
> --Next Contig--
>
> #---------------------------------------------------------------------
> Now starting the contig!!
> SeqID: ScsGwly_6140;HRSCAF=6263
> Length: 1247
> #---------------------------------------------------------------------

From qlian003 at ucr.edu Tue Jan 9 11:10:49 2018
From: qlian003 at ucr.edu (Qihua Liang)
Date: Tue, 9 Jan 2018 10:10:49 -0800
Subject: [maker-devel] questions on master_datastore_index.log file
In-Reply-To: <0E5E8721-E814-4BA5-891B-B1C312BC0D4A@gmail.com>
References: <0BECB285-BB11-4F46-B6D7-072640F311B2@ucr.edu> <0E5E8721-E814-4BA5-891B-B1C312BC0D4A@gmail.com>
Message-ID: <87A06F3B-82C1-4B21-906E-69DC1308DEC6@ucr.edu>

Hi Carson,

I just checked with the system administrator, and we think the disk space is fine. I also ran another attempt with far fewer processors a few days ago, and I am having the same issue.

Should I try renaming the contigs to see whether a new attempt works? Or do you have any other suggestions?

Thank you!
Qihua

> On Jan 9, 2018, at 9:14 AM, Carson Holt wrote:
>
> Your contig names may create issues. Specifically the ";"
character, but you should also remove the "=" character. However, I believe your problem may be IO. If you are running under MPI or are running multiple jobs, the disk on one of the machines may have that location unmounted, it may be full, you may have hit a system file quota limit, or the IO load may be so heavy that the file is not actually finished being written when MAKER tries to read it. If IO load is the issue, then you just need to run fewer processes. The other possibilities would mean you need to make space, fix the mount, or raise any quotas on your systems.
>
> Thanks,
> Carson
>
> On Jan 6, 2018, at 4:09 PM, Qihua Liang wrote:
>
>> Hi Carson,
>>
>> I am pasting more lines of error messages. I notice an error of "ERROR: Can't open seq file: /24-2/home/qliang/cowpea/annotation/22.dovetail.assembly/map.maker.output/map_datastore/ED/F1/ScsGwly_6124%3BHRSCAF=6247//theVoid.ScsGwly_6124%3BHRSCAF=6247/query.masked.gff.seq". The name of the "ScsGwly" sequence is ">ScsGwly_6124;HRSCAF=6247". Is it the sequence naming that makes the temp file name weird?
>>
>> Thanks
>> Qihua

From carsonhh at gmail.com Wed Jan 10 12:05:03 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 10 Jan 2018 12:05:03 -0700
Subject: [maker-devel] questions on master_datastore_index.log file
In-Reply-To: <87A06F3B-82C1-4B21-906E-69DC1308DEC6@ucr.edu>
References: <0BECB285-BB11-4F46-B6D7-072640F311B2@ucr.edu> <0E5E8721-E814-4BA5-891B-B1C312BC0D4A@gmail.com> <87A06F3B-82C1-4B21-906E-69DC1308DEC6@ucr.edu>
Message-ID: <36B45AA1-3D02-4E83-9EF8-85D56C4D3020@gmail.com>

The error is saying exactly that the file MAKER just created does not exist. The only time we ever see this is when using network mounted locations under heavy IO load.
Most network storage options use asynchronous IO, which means the system returns success on file operations before they actually complete. So they can say they finished writing a file before it actually exists. If you then try to open it right away, it doesn't really exist yet and everything fails. But that only happens if there is heavy IO (lots of things going on in that mount location). So if you are getting persistent failures, you may want to try a different work directory, or get your IT to troubleshoot IO load in the directory you are using.

--Carson

From arsilan324 at gmail.com Thu Jan 11 07:15:31 2018
From: arsilan324 at gmail.com (Muhammad Arslan)
Date: Thu, 11 Jan 2018 15:15:31 +0100
Subject: [maker-devel] GFF3 to .tbl

Dear Madam or Sir,

I am writing to ask whether there is any way to make a .tbl file from a MAKER-generated GFF3 file. I need this because I am trying to submit the annotation to NCBI. If there is any other solution for this, please advise accordingly.

Thank you very much!
Arslan

--
Muhammad Arslan
PhD Student / Guest Scientist
Department of Environmental Biotechnology
Helmholtz Centre for Environmental Research - UFZ
Permoserstraße 15, 04318 Leipzig, Germany
Phone +49 341 235 1696
muhammad.arslan at ufz.de, www.ufz.de

From carsonhh at gmail.com Fri Jan 19 15:46:26 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Fri, 19 Jan 2018 15:46:26 -0700
Subject: [maker-devel] GFF3 to .tbl
Message-ID: <93BD3F52-1D76-465A-94EE-80D616BB72A6@gmail.com>

Try GAG --> https://genomeannotation.github.io/GAG/

--Carson

From qwzhang0601 at gmail.com Mon Jan 22 10:23:34 2018
From: qwzhang0601 at gmail.com (Quanwei Zhang)
Date: Mon, 22 Jan 2018 12:23:34 -0500
Subject: [maker-devel] name of gene model

Hello:

Would you please explain how the genes are named? Do similar names indicate sequence similarity (e.g., maker-Contig3217-snap-gene-35.13-mRNA-1 and maker-Contig3217-snap-gene-35.14-mRNA-1)?

maker-Contig2667-augustus-gene-266.22-mRNA-1;
maker-Contig2667-snap-gene-266.5-mRNA-1;

maker-Contig3217-snap-gene-35.13-mRNA-1;
maker-Contig3217-snap-gene-35.14-mRNA-1;
maker-Contig3217-snap-gene-35.15-mRNA-1;
maker-Contig3217-snap-gene-35.16-mRNA-1

Thank you

Best
Quanwei

From carsonhh at gmail.com Mon Jan 22 10:29:26 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 22 Jan 2018 10:29:26 -0700
Subject: [maker-devel] name of gene model
Message-ID: <499FF8DC-C277-484B-AEC9-EE7A35090615@gmail.com>

The only info in the name is the source program of the model (i.e. snap/augustus). The numbers are just meaningless iterators.

--Carson

From yincl2013 at 126.com Tue Jan 23 08:01:17 2018
From: yincl2013 at 126.com (Chuanlin Yin)
Date: Tue, 23 Jan 2018 23:01:17 +0800 (GMT+08:00)
Subject: [maker-devel] maker-3.01.02-beta run error
Message-ID: <165b9e3f.a94e.16123899e8a.Coremail.yincl2013@126.com>

Dear Mr/Ms,

Recently, when I tried to use maker-3.01.02-beta for genome annotation, the run failed with the following error:

Can't call method "add_entry" without a package or object reference at /gpfs/bioinformatics/software/maker-3.01.02-beta/bin/../lib/Widget/snap.pm line 540.
--> rank=NA, hostname=c01n02
ERROR: Failed while annotating transcripts
ERROR: Chunk failed at level:1, tier_type:4
FAILED CONTIG:002369F_pilon_obj

Could you explain why this happened? Any reply would be much appreciated. Thanks.
Best regards!
Showky

From Emily.Giroux at inspection.gc.ca Tue Jan 23 13:35:06 2018
From: Emily.Giroux at inspection.gc.ca (Giroux, Emily (CFIA/ACIA))
Date: Tue, 23 Jan 2018 20:35:06 +0000
Subject: [maker-devel] maker pipeline 2nd round updating augustus
Message-ID: <397E3572255740488AA9993F4D41A3B036B588CB@ONOTTAXES2.AGR.GC.CA>

Hi,

I completed a first round of Maker, followed by snap and BUSCO to train augustus. I then placed the newly trained species-specific augustus files in the augustus species directory and used them for my second round of maker. What I'm wondering now is whether I should repeat this process after completing round 2 of maker: use BUSCO to retrain the augustus files again, replace the species-specific files from round 1 with those from round 2, and use these as input for my third round of maker.

Thank you very much,
Emily

From patrick.tranvan at unil.ch Thu Jan 25 07:46:27 2018
From: patrick.tranvan at unil.ch (Patrick Tran Van)
Date: Thu, 25 Jan 2018 14:46:27 +0000
Subject: [maker-devel] Adding NR functional annotation
Message-ID: <1516891629951.7595@unil.ch>

Hi,

Can you please update maker_functional_gff and maker_functional_fasta to make them compatible with the NR database?

Thanks,
Patrick

From marni at cs.au.dk Thu Jan 25 03:26:04 2018
From: marni at cs.au.dk (Marni Tausen)
Date: Thu, 25 Jan 2018 10:26:04 +0000
Subject: [maker-devel] Maker run problems - BLAST makeblastdb failed
Message-ID: <72D3C07A-D1A6-4759-B8D2-13EBE8DD7982@birc.au.dk>

Hey,

I have a problem getting maker to run. I've tried installing the pipeline on three separate systems: CentOS 6 (cluster), Mac OS X 10.12.6, and CentOS 7.
With each of them I run into problems at the RepeatMasker step, with the error message:

#---------------------------------------------------------------------
Now starting the contig!!
SeqID: chr0
Length: 38046352
#---------------------------------------------------------------------

setting up GFF3 output and fasta chunks
doing repeat masking
running repeat masker.
#--------- command -------------#
Widget::RepeatMasker:
cd makertmp/maker_DMTHbJ; /Users/PM/maker/exe/RepeatMasker/RepeatMasker /Users/PM/GENEANNOTATION/TrR.v5.maker.output/TrR.v5_datastore/82/7E/chr0//theVoid.chr0/0/chr0.0.all.rb -species all -dir /Users/PM/GENEANNOTATION/TrR.v5.maker.output/TrR.v5_datastore/82/7E/chr0//theVoid.chr0/0 -pa 1
#-------------------------------#
doing blastx repeats
formating database...
#--------- command -------------#
Widget::formater:
/Users/PM/maker/bin/../exe/lblast/bin/makeblastdb -dbtype prot -in makertmp/maker_DMTHbJ/0/blastprep/te_proteins%2Efasta.mpi.10.0
#-------------------------------#
BLAST options error: File makertmp/maker_DMTHbJ/0/blastprep/te_proteins%2Efasta.mpi.10.0 does not exist
ERROR: /Users/PM/maker/bin/../exe/lblast/bin/makeblastdb failed in Widget::formater
--> rank=NA, hostname=d24834.local
ERROR: Failed while doing blastx repeats
ERROR: Chunk failed at level:1, tier_type:1
FAILED CONTIG:chr0

ERROR: Chunk failed at level:2, tier_type:0
FAILED CONTIG:chr0

examining contents of the fasta file and run log

The Maker version that was installed is 2.31.9, and it was built using the ./Build commands.

However the links for exonerate (2.2.0) and repeatmasker (repbase) (latest version) seem to be broken, as they always returned connection errors.

So I manually installed both of those programs and linked maker to them.

I've attached the config files and the script used to run maker.

Cheers,
Marni Tausen
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_bopts.ctl
Type: application/octet-stream
Size: 1413 bytes
Desc: maker_bopts.ctl
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_exe.ctl
Type: application/octet-stream
Size: 1277 bytes
Desc: maker_exe.ctl
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_opts.ctl
Type: application/octet-stream
Size: 4554 bytes
Desc: maker_opts.ctl
URL:

From mmokrejs at gmail.com Thu Jan 25 09:05:45 2018
From: mmokrejs at gmail.com (Martin MOKREJŠ)
Date: Thu, 25 Jan 2018 17:05:45 +0100
Subject: [maker-devel] Maker run problems - BLAST makeblastdb failed
In-Reply-To: <72D3C07A-D1A6-4759-B8D2-13EBE8DD7982@birc.au.dk>
References: <72D3C07A-D1A6-4759-B8D2-13EBE8DD7982@birc.au.dk>
Message-ID:

Hi Marni,
do not use spaces in your filenames and directory names. I think that is your issue: te_proteins%2Efasta.mpi.10.0
Martin

From carsonhh at gmail.com Thu Jan 25 14:20:21 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 25 Jan 2018 14:20:21 -0700
Subject: [maker-devel] Maker run problems - BLAST makeblastdb failed
In-Reply-To: <72D3C07A-D1A6-4759-B8D2-13EBE8DD7982@birc.au.dk>
References: <72D3C07A-D1A6-4759-B8D2-13EBE8DD7982@birc.au.dk>
Message-ID: <2AC145CF-6954-4740-BA88-A7ABBBC841D0@gmail.com>

You set TMP=makertmp. That is likely not a true locally mounted location (i.e. it's network mounted). In that case you will hit a race condition where files you just created don't become readable for a few milliseconds to seconds after creation under heavy IO load. Alternatively, it is locally mounted but only exists on a single node, and you are running on a cluster (nodes cannot access each other's local storage). Unless your cluster setup has a specific location for locally mounted temporary scratch space, you should not set TMP=. Just let it default to /tmp, which is almost always locally mounted.
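
One rough way to check whether a directory such as makertmp sits on a network mount is to look it up in the mount table. The sketch below is not part of MAKER; it assumes a Linux system with a /proc/mounts-style table (so it would not work as-is on Mac OS X), and the list of network file system types is illustrative rather than complete. All function names are hypothetical.

```python
import os
import posixpath

# File system types that are usually network-mounted.
# Illustrative and incomplete; extend it for your own cluster.
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs", "smbfs", "lustre", "gpfs",
                    "beegfs", "ceph", "glusterfs", "afs"}


def parse_mounts(mounts_text):
    """Parse /proc/mounts-style text into (mount_point, fs_type) pairs."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def fs_type_for(path, mounts):
    """Return the fs type of the longest mount point that contains path."""
    path = posixpath.normpath(path) + "/"
    best_prefix, best_type = "", None
    for mount_point, fs_type in mounts:
        prefix = mount_point.rstrip("/") + "/"
        # ">=" lets a later mount on the same point shadow an earlier one
        if path.startswith(prefix) and len(prefix) >= len(best_prefix):
            best_prefix, best_type = prefix, fs_type
    return best_type


def looks_network_mounted(path, mounts):
    """True if path appears to live on one of the listed network file systems."""
    return fs_type_for(path, mounts) in NETWORK_FS_TYPES


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as handle:
        mounts = parse_mounts(handle.read())
    tmp_dir = os.path.abspath("makertmp")  # the relative TMP from the report
    print(tmp_dir, fs_type_for(tmp_dir, mounts),
          looks_network_mounted(tmp_dir, mounts))
```

A result of True only suggests that the race condition described above is possible; it does not cover the second case, where the directory is local but exists on a single node.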

--Carson

From carsonhh at gmail.com Thu Jan 25 14:29:37 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 25 Jan 2018 14:29:37 -0700
Subject: [maker-devel] maker pipeline 2nd round updating augustus
In-Reply-To: <397E3572255740488AA9993F4D41A3B036B588CB@ONOTTAXES2.AGR.GC.CA>
References: <397E3572255740488AA9993F4D41A3B036B588CB@ONOTTAXES2.AGR.GC.CA>
Message-ID: <2D069310-1BFC-4C30-98B5-739FC90A732B@gmail.com>

Don't use BUSCO to train for the second round; there is a bias in the models it produces toward conserved genes, which tend to be short and intron poor. You will want to avoid this bias in the second round. You want to use a broad selection of gene models instead.

Use the maker2zff script to select gene models for training (examples of doing this can be found on the maker tutorial wiki). Then use this script to convert ZFF to GenBank format to train Augustus -> https://github.com/hyphaltip/genome-scripts/blob/master/gene_prediction/zff2augustus_gbk.pl

This is a nice guide to training Augustus using GenBank format input -> https://vcru.wisc.edu/simonlab/bioinformatics/programs/augustus/docs/tutorial2015/training.html

--Carson

From carsonhh at gmail.com Thu Jan 25 14:33:33 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 25 Jan 2018 14:33:33 -0700
Subject: [maker-devel] maker-3.01.02-beta run error
In-Reply-To: <165b9e3f.a94e.16123899e8a.Coremail.yincl2013@126.com>
References: <165b9e3f.a94e.16123899e8a.Coremail.yincl2013@126.com>
Message-ID:

Because of where that error occurred, it may be a snowball error (i.e. a result of another error upstream that is the real failure). Could you look back in the data to see if there is a failure further back? Perhaps include your entire STDERR log.

Thanks,
Carson

From qwzhang0601 at gmail.com Fri Jan 26 09:40:32 2018
From: qwzhang0601 at gmail.com (Quanwei Zhang)
Date: Fri, 26 Jan 2018 11:40:32 -0500
Subject: [maker-devel] map the transcripts back onto the genome using "est2genome=1"
Message-ID:

Hello:

I am trying to annotate a new NMR genome assembly. Since the gene annotation is available for the old version of NMR from NCBI, I tried to map the published RefSeq transcripts onto the genome with "est2genome=1". But I found that quite a few genes were lost during mapping.

Then I did another test to check how well the mapping with "est2genome=1" works. I mapped the published RefSeq transcripts onto the old genome (the same version as the published gene annotation) using maker with "est2genome=1". Still, I found that quite a few genes were lost during the mapping. Below I show the BUSCO results for the gene annotations, which assess *annotation completeness with single-copy orthologs*. As you can see, even if we only consider the single-copy orthologs, 4% still did not map back to the genome.

Do you have any comments on this? Besides, could you please give us some suggestions for getting more of the published gene annotation to map back to the same genome assembly through "est2genome=1"?

Attached is the maker_opts.ctl file I used for the mapping.

Many thanks.

# these are the BUSCO results using the published gene annotation
C:99.3%[S:33.3%,D:66.0%],F:0.3%,M:0.4%,n:4104
4077 Complete BUSCOs (C)
1367 Complete and single-copy BUSCOs (S)
2710 Complete and duplicated BUSCOs (D)
14 Fragmented BUSCOs (F)
13 Missing BUSCOs (M)
4104 Total BUSCO groups searched

# these are the BUSCO results using gene models after mapping by maker2
C:93.4%[S:36.5%,D:56.9%],F:2.6%,M:4.0%,n:4104
3830 Complete BUSCOs (C)
1496 Complete and single-copy BUSCOs (S)
2334 Complete and duplicated BUSCOs (D)
105 Fragmented BUSCOs (F)
169 Missing BUSCOs (M)
4104 Total BUSCO groups searched

Best
Quanwei
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_opts.ctl
Type: application/octet-stream
Size: 4734 bytes
Desc: not available
URL:

From qwzhang0601 at gmail.com Fri Jan 26 16:16:50 2018
From: qwzhang0601 at gmail.com (Quanwei Zhang)
Date: Fri, 26 Jan 2018 18:16:50 -0500
Subject: [maker-devel] gene annotation for a better genome
In-Reply-To:
References: <5AFEDD05-DF02-463F-A6EE-1619A9BB968D@gmail.com>
Message-ID:

Hi Carson:

Thank you for your previous suggestions. I have done the annotation according to your suggestions. I first mapped the transcripts from the old assembly to the new assembly by setting "est2genome=1", and then updated the models with new predictions.

Besides mapping with "est2genome=1", do you think it is a good idea to do a separate mapping with the proteins of the old assembly (setting "protein2genome=1")? I would then provide both mapping GFF files (i.e., the GFFs from mapping transcripts and proteins separately) and update them with new predictions and evidence support. I am trying this because I found that certain genes were not mapped to the new assembly but can be mapped by protein orthologs.

Thank you.

Best
Quanwei

2017-10-24 18:26 GMT-04:00 Carson Holt:

> Yes. If you use est2genome it will just align the model, and then find the longest ORF. So it is a quick way to just align old models to the new assembly. Alternatively you can just do de novo annotation.
>
> --Carson
>
> On Oct 24, 2017, at 10:54 AM, Quanwei Zhang wrote:
>
> Dear Carson:
>
> Thank you again for your suggestions. I just got the new genome assembly of NMR and started to do gene annotation.
> I understand your ideas about this. But can I simply use the old genome transcripts as transcript evidence, and just follow the standard Maker2 pipeline? I set est2genome=1 and provide the mRNA sequences in fasta format for the first round of SNAP training.
>
> For transcripts I have the following choices. I think the first choice is more reliable and better, right?
> (1) There are about 60,000 RefSeq transcripts from NCBI. So I downloaded those sequences in fasta format.
> (2) We have the raw RNA-seq data from 11 tissues; we can do an assembly with trinity for each sample and then get the transcripts. But I think most of the RNA-seq should have been submitted to NCBI.
>
> BTW, if we use the RefSeq data from NCBI, we can download the mRNA sequences, coding sequences or protein sequences. I wonder which type of data is best for training SNAP? For Augustus, we will use BUSCO to train it.
>
> Many thanks.
>
> Best
> Quanwei
>
> 2017-09-29 12:36 GMT-04:00 Carson Holt:
>
>> You can try using the est2genome=1 option to map the old models forward onto the new assembly as if they were ESTs (add a line that says est_forward=1 to the control file to maintain old naming, and set est= to the old model transcript file). Then provide the final models as a pred_gff for a subsequent run (i.e. a traditional MAKER run where you are annotating the new assembly with transcript and protein evidence and ab initio predictors). Don't supply the old models to est= on that run.
>>
>> The idea behind doing it this way is:
>> 1. You need to get old models onto the new assembly so coordinates will change. So by doing it this way, you will at least be able to move many models forward based on homology.
>> 2. By providing the models to pred_gff on a subsequent MAKER run, you are just letting old models compete against new annotations.
>> They will be rejected if they have no evidence support, or can be kept if they score better than alternate models from SNAP/Augustus. That way you have the chance to integrate old models while at the same time rejecting some old models that have no evidence overlap.
>>
>> --Carson
>>
>> > On Sep 28, 2017, at 6:05 AM, Quanwei Zhang wrote:
>> >
>> > Hello:
>> >
>> > Recently, we got a new version of the NMR genome, which had been assembled and annotated a few years ago. We can download the gene annotation from NCBI.
>> >
>> > Now we want to annotate the new genome using the Maker2 pipeline. I wonder how I can make full use of the existing annotations. On the other hand, since the previous genome is not very well assembled, some gene annotations may be false positives. I hope those false positive genes in the previous annotation won't mislead Maker2 in the current gene annotation.
>> >
>> > Do you have any suggestions? Thanks
>> >
>> > Best
>> > Quanwei

From carsonhh at gmail.com Mon Jan 29 11:23:06 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 29 Jan 2018 11:23:06 -0700
Subject: [maker-devel] gene annotation for a better genome
In-Reply-To:
References: <5AFEDD05-DF02-463F-A6EE-1619A9BB968D@gmail.com>
Message-ID: <753F1840-4874-4C0D-80F7-59E1A1579884@gmail.com>

You can set both est2genome=1 and protein2genome=1. You can also set est_forward=1 to get the names from the old models (you have to add it as it's not already there).
If you want to try and force an alignment to a specific location, you can also add maker_coor=chr2:1-3000 to the fasta header comment line to have maker only allow alignments within a specific region (chr2:1-3000 in the example).

--Carson

From qwzhang0601 at gmail.com Mon Jan 29 12:57:42 2018
From: qwzhang0601 at gmail.com (Quanwei Zhang)
Date: Mon, 29 Jan 2018 14:57:42 -0500
Subject: [maker-devel] gene annotation for a better genome
In-Reply-To: <753F1840-4874-4C0D-80F7-59E1A1579884@gmail.com>
References: <5AFEDD05-DF02-463F-A6EE-1619A9BB968D@gmail.com> <753F1840-4874-4C0D-80F7-59E1A1579884@gmail.com>
Message-ID:

Dear Carson:

Thank you for your reply. Do you mean setting est2genome=1 and protein2genome=1 in one round, or doing the two mappings in separate rounds?

So I will provide the GFF files from mapping the transcripts and proteins to "pred_gff". Besides the GFF from such mapping, I am also considering providing a GFF file obtained from a regular de novo annotation by maker2, and then updating gene models from all of those GFFs.

Here is why I am considering this. Suppose at location 1 there is a gene model gA from mapping transcripts and proteins. If I try to update those gene models in the second round of maker, maker cannot change internal exons of gA (so it cannot replace it). However, if I provide both the GFF from mapping transcripts and the GFF from maker de novo annotation, and another gene model gA' (from de novo annotation) was predicted by maker at the same location, maker will compare gA and gA' and select the one with the higher score, right? This way we can replace a mapped gene model with a model predicted by maker if the predicted one has stronger evidence support. Right?

Thank you.

Best
Quanwei

From admin at genome.arizona.edu Mon Jan 29 16:08:54 2018
From: admin at genome.arizona.edu (admin at genome.arizona.edu)
Date: Mon, 29 Jan 2018 16:08:54 -0700
Subject: [maker-devel] MPI selection
Message-ID: <77cfb864-4de1-a9af-aeea-9d3e7cf45ce5@genome.arizona.edu>

Hi, we now have three versions of MPI installed on our cluster: OpenMPI, MPICH, and MVAPICH2.
Since we have infiniband, MVAPICH2 is working best with MPI test programs. MPICH should support infiniband too, but currently there are some seg faults with it that we are trying to resolve.

On our cluster we have a ~/.mpi-selection file which allows users to pick the MPI installation to use, and sets the appropriate PATH and LD_LIBRARY_PATH variables.

I am looking through the Maker MPI instructions, and it seems that a certain mpicc and mpi.h must be chosen during installation. So if Maker was originally installed with MPICH, would I have to reinstall it if users want to use MVAPICH2? Or is there a config file somewhere I can update so I don't have to reinstall Maker? Or does nothing need to be done, and we can rely on the PATH and LD_LIBRARY_PATH variables pointing to the correct mpicc and libmpi.so (mpi.h is in the include directory)?

Thanks

From yuejiaxing at gmail.com Tue Jan 30 09:32:04 2018
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Tue, 30 Jan 2018 17:32:04 +0100
Subject: [maker-devel] Is it possible to extract the GFF3 file for the raw gene models predicted by est2genome and protein2genome?
Message-ID:

Hello,

I enabled the est2genome and protein2genome options for Maker-3.00.0-beta in my particular case. I was wondering if it is possible to extract the gene models predicted by snap, augustus, est2genome, and protein2genome respectively. By using the gff_merge command, I think I can extract some gene models for each case but not all, especially for the est2genome and protein2genome sets (e.g. those labeled with "maker-chr*-exonerate_est2genome-gene" and "maker-chr*-exonerate_protein2genome-gene").

Thanks in advance!

Best,
Jia-Xing
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Tue Jan 30 09:47:39 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 30 Jan 2018 09:47:39 -0700
Subject: [maker-devel] MPI selection
In-Reply-To: <77cfb864-4de1-a9af-aeea-9d3e7cf45ce5@genome.arizona.edu>
References: <77cfb864-4de1-a9af-aeea-9d3e7cf45ce5@genome.arizona.edu>
Message-ID: <34C36A98-A87F-4B28-8E05-FCD412CFEBEA@gmail.com>

The libraries used by MVAPICH2, Intel MPI, and OpenMPI to access infiniband have a known bug. For performance reasons, infiniband libraries use registered memory in a way that makes it impossible to do system calls to external programs under MPI (doing so results in seg faults). MAKER has to call out to external programs like BLAST, exonerate, etc., so it triggers this bug. The infiniband bug is well known, and unfortunately will not be fixed because fixing it causes infiniband to lose some advertised features like direct memory access.

As a work around OpenMPI and Intel MPI allow you to disable infiniband libraries via command line flags and use IP over infiniband instead (i.e. they let you drop infiniband features on demand so that your code will run). However MVAPICH2 does not provide the same option. As a result you cannot use MAKER or any MPI program that does system calls to external programs with MVAPICH2 (it results in seg faults).
But you can use all other MPI flavors with the appropriate flags detailed below:

#For OpenMPI, use as follows (the example assumes ib0 is your IP over infiniband adapter)
export LD_PRELOAD=/path/to/openmpi/libmpi.so
mpiexec --mca btl vader,tcp,self --mca btl_tcp_if_include ib0 --mca btl_openib_want_fork_support 1 --mca mpi_warn_on_fork 0 maker

#For Intel MPI set these environmental variables before launch
export I_MPI_FABRICS='shm:tcp'
export I_MPI_HYDRA_IFACE='ib0'
mpiexec maker

#For MPICH, nothing is needed as the Infiniband libraries are always disabled, but you can specifically tell it to use the ib0 adapter as the communicator
mpiexec -iface ib0 maker

--Carson

From carsonhh at gmail.com Tue Jan 30 09:54:05 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 30 Jan 2018 09:54:05 -0700
Subject: [maker-devel] gene annotation for a better genome
In-Reply-To:
References: <5AFEDD05-DF02-463F-A6EE-1619A9BB968D@gmail.com> <753F1840-4874-4C0D-80F7-59E1A1579884@gmail.com>
Message-ID: <921EBAEF-13E3-4175-90A2-8F41651F95C9@gmail.com>

You can set both simultaneously. est2genome will almost always be picked first since it will match better than the protein alignment (i.e. it matches at UTRs).

--Carson

> On Jan 29, 2018, at 12:57 PM, Quanwei Zhang wrote:
>
> Dear Carson:
>
> Thank you for your reply. Do you mean set est2genome=1 and protein2genome=1 in one round or do such mapping in two separate rounds?
>
> So I will provide gff files by mapping the transcripts and proteins to "pred_gff". Besides the gff from such mapping, I am also considering to provide a gff file obtained from a regular de novo annotation by maker2. And then update gene models from those gff.
>
> Here is the reason why I consider this. Suppose at location 1 there is a gene model gA by mapping transcripts and proteins. Then if I try to update those gene models in the second round of maker, maker can not change internal exons of gA (so can not replace it).
> > Best > Quanwei > > > > 2018-01-29 13:23 GMT-05:00 Carson Holt >: > You can set both est2genome=1 and protein2genome=1. You can also set est_forward=1 to get the names from the old models (you have to add it as it?s not already there). If you want to try and force an alignment to a specifc location, you can also add maker_coor=chr2:1-3000 to the fasta header comment line to have maker only alow alignments within a specific region (chr2:1-3000 in the example). > > ?Carson > > >> On Jan 26, 2018, at 4:16 PM, Quanwei Zhang > wrote: >> >> Hi Carson: >> >> Thank you for your previous suggestions. I have done the annotation according to your suggestions. I firstly mapped the transcripts from old assembly to the new assembly by setting "est2genome=1", and then update the models by new predictions. >> >> Besides mapping by "est2genome=1" , do you think it is a good idea to do a separate mapping by proteins of old assembly (setting "protein2genome=1")? And then I provide both mapping GFF files (i.e., mapping GFF by transcripts and proteins, separately) and update them with new predictions and evidence support? Why I am trying to do this is because I found for certain genes they were not mapped to the new assembly but they can be mapped by protein orthologs. >> >> Thank you. >> >> Best >> Quanwei >> >> 2017-10-24 18:26 GMT-04:00 Carson Holt >: >> Yes. If you use est2genome it will just align the model, and then find the longest ORF. So it is a quick way to jsut align old models to the new assembly. Alternatively you can just do de novo annotation. >> >> ?Carson >> >> >> >>> On Oct 24, 2017, at 10:54 AM, Quanwei Zhang > wrote: >>> >>> Dear Carson: >>> >>> Thank you again for your suggestions. I just get the new genome assembly of NMR and start to do gene annotation. I understand you ideas about this. But can I simply use the old genome transcripts as transcript evidence, and just following the standard Maker2 pipeline? 
>>> I set est2genome=1 and provide the mRNA sequences in fasta format for the first round of SNAP training.
>>>
>>> For transcripts I have the following choices. I think the first choice is more reliable and better, right?
>>> (1) There are about 60,000 RefSeq transcripts from NCBI. So I downloaded those sequences in fasta format.
>>> (2) We have the raw RNA-seq data from 11 tissues; we can do an assembly with trinity for each sample and then get the transcripts. But I think most of the RNA-seq should have been submitted to NCBI.
>>>
>>> BTW, if we use the RefSeq data from NCBI, we can download the mRNA sequences, coding sequences or protein sequences. I wonder which type of data is best for training SNAP? For Augustus, we will use BUSCO to train it.
>>>
>>> Many thanks.
>>>
>>> Best
>>> Quanwei
>>>
>>> 2017-09-29 12:36 GMT-04:00 Carson Holt:
>>> You can try using the est2genome=1 option to map the old models forward onto the new assembly as if they were ESTs (add a line that says est_forward=1 to the control file to maintain old naming, and set est= to the old model transcript file). Then provide the final models as a pred_gff for a subsequent run (i.e. a traditional MAKER run where you are annotating the new assembly with transcript and protein evidence and ab initio predictors). Don't supply the old models to est= on that run.
>>>
>>> The idea behind doing it this way is:
>>> 1. You need to get old models onto the new assembly, so coordinates will change. By doing it this way, you will at least be able to move many models forward based on homology.
>>> 2. By providing the models to pred_gff on a subsequent MAKER run, you are just letting old models compete against new annotations. They will be rejected if they have no evidence support, or can be kept if they score better than alternate models from SNAP/Augustus. That way you have the chance to integrate old models while at the same time rejecting some old models that have no evidence overlap.
>>>
>>> --Carson
>>>
>>> > On Sep 28, 2017, at 6:05 AM, Quanwei Zhang wrote:
>>> >
>>> > Hello:
>>> >
>>> > Recently, we got a new version of the NMR genome, which had been assembled and annotated a few years ago. We can download the gene annotation from NCBI.
>>> >
>>> > Now we want to annotate the new genome using the Maker2 pipeline. I wonder how I can make full use of the existing annotations. On the other hand, since the previous genome is not very well assembled, some gene annotations may be false positives. I hope those false positive genes in the previous annotation won't mislead Maker2 for the current gene annotation.
>>> >
>>> > Do you have any suggestions? Thanks
>>> >
>>> > Best
>>> > Quanwei

From carsonhh at gmail.com Tue Jan 30 09:57:01 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 30 Jan 2018 09:57:01 -0700
Subject: [maker-devel] Is it possible to extract the GFF3 file for the raw gene models predicted by est2genome and protein2genome?
In-Reply-To:
References:
Message-ID: <9771EB42-8A80-49D8-9A21-67406860FD4F@gmail.com>

You can just grep on the name. Note, though, that est2genome and protein2genome should only be used for initial training: the models they produce are almost always guaranteed to be partial, so both should be disabled once you have trained gene predictors that can build complete models.

--Carson

> On Jan 30, 2018, at 9:32 AM, Jia-Xing Yue wrote:
>
> Hello,
>
> I enabled the est2genome and protein2genome options for Maker-3.00.0-beta in my particular case. I was wondering if it is possible to extract the gene models predicted by snap, augustus, est2genome, and protein2genome respectively.
>
> By using the gff_merge command, I think I can extract some gene models for each case but not all, especially for the est2genome and protein2genome sets (e.g. those labeled with "maker-chr*-exonerate_est2genome-gene" and "maker-chr*-exonerate_protein2genome-gene").
>
> Thanks in advance!
>
> Best,
> Jia-Xing

From yuejiaxing at gmail.com Tue Jan 30 10:03:34 2018
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Tue, 30 Jan 2018 18:03:34 +0100
Subject: [maker-devel] Is it possible to extract the GFF3 file for the raw gene models predicted by est2genome and protein2genome?
In-Reply-To: <9771EB42-8A80-49D8-9A21-67406860FD4F@gmail.com>
References: <9771EB42-8A80-49D8-9A21-67406860FD4F@gmail.com>
Message-ID:

Dear Carson,

Thanks for the quick response! Could you elaborate a bit on "grep on the name"? Do you mean just grep all the lines in the gff_merge output with "est2genome" and "protein2genome" in column 3? In that case, what I get is the alignments rather than the gene models guessed by Maker based on the alignments, right?

Thanks!

Best,
Jia-Xing

--
Jia-Xing Yue

Population Genomics and Complex Traits Group
Tour Pasteur 8eme etage
Faculté de Médecine
Institute for Research on Cancer and Aging, Nice (IRCAN)
CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
28 Avenue de Valombrose
06107 NICE Cedex 2
France

Twitter: @iAmphioxus
Personal website: http://www.iamphioxus.org/
Lab website: https://litilab.wordpress.com/
Yeast Population Reference Panel: https://yjx1217.github.io/Yeast_PacBio_2016/welcome/

From carsonhh at gmail.com Tue Jan 30 10:06:27 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 30 Jan 2018 10:06:27 -0700
Subject: [maker-devel] Is it possible to extract the GFF3 file for the raw gene models predicted by est2genome and protein2genome?
In-Reply-To:
References: <9771EB42-8A80-49D8-9A21-67406860FD4F@gmail.com>
Message-ID: <335E2942-4FCA-4F3C-A488-06116F6B7604@gmail.com>

MAKER models will all have "maker" in the source column. Everything else is a reference alignment (not a model). But you can grep on the gene name. If a model is sourced from SNAP, it will have snap in the name, and the same is true for augustus, est2genome, protein2genome, etc.

--Carson

From admin at genome.arizona.edu Tue Jan 30 10:24:05 2018
From: admin at genome.arizona.edu (admin at genome.arizona.edu)
Date: Tue, 30 Jan 2018 10:24:05 -0700
Subject: [maker-devel] MPI selection
In-Reply-To: <34C36A98-A87F-4B28-8E05-FCD412CFEBEA@gmail.com>
References: <77cfb864-4de1-a9af-aeea-9d3e7cf45ce5@genome.arizona.edu> <34C36A98-A87F-4B28-8E05-FCD412CFEBEA@gmail.com>
Message-ID: <4825e452-aab6-aa13-ebc7-3d3d1832cc60@genome.arizona.edu>

Carson Holt wrote on 01/30/2018 09:47 AM:
> The libraries used by MVAPICH2, Intel MPI, and OpenMPI to access infiniband have a known bug. For performance reasons, infiniband libraries use registered memory in a way that makes it impossible to do system calls to external programs under MPI (doing so results in seg faults). MAKER has to call out to external programs like BLAST, exonerate, etc., so it triggers this bug.
> The infiniband bug is well known, and unfortunately will not be fixed because fixing it causes infiniband to lose some advertised features like direct memory access.

Well, that stinks! Maybe that's why we got such a good deal on new-old-stock infiniband equipment! Still, it has allowed us to use the full speed of our NFS RAIDs, which has been nice. I will try using ib0. The speed is still about 10Gb, but I was under the impression that using IPoIB would cause packet loss or other problems...

Thanks for clearing that up. So is there a fabric/protocol you would recommend for clusters running maker?
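
As a recap of the launch settings quoted in this thread, the sketch below gathers them into one shell helper. The function name maker_mpi_cmd and its argument order are made up for illustration and are not part of MAKER; the flags, the environment variables, and the /path/to/openmpi/libmpi.so placeholder are copied from Carson's message, with the export lines written as one-line VAR=value prefixes. The adapter defaults to ib0, which is only the name assumed in that message. The helper prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of MAKER): print the launch command for one
# MPI flavor, using only the settings listed in this thread.
# Usage: maker_mpi_cmd FLAVOR [IFACE]   (IFACE defaults to ib0)
maker_mpi_cmd() {
    mmc_iface=${2:-ib0}
    case "$1" in
        openmpi)
            # Restrict OpenMPI to the vader, tcp and self transports and
            # route TCP over the given adapter.
            echo "LD_PRELOAD=/path/to/openmpi/libmpi.so mpiexec --mca btl vader,tcp,self --mca btl_tcp_if_include $mmc_iface --mca btl_openib_want_fork_support 1 --mca mpi_warn_on_fork 0 maker"
            ;;
        intel)
            # Shared memory within a node, TCP between nodes.
            echo "I_MPI_FABRICS=shm:tcp I_MPI_HYDRA_IFACE=$mmc_iface mpiexec maker"
            ;;
        mpich)
            echo "mpiexec -iface $mmc_iface maker"
            ;;
        mvapich2)
            # Per the thread, MVAPICH2 offers no way to turn infiniband off.
            echo "MVAPICH2 cannot disable infiniband; MAKER seg faults under it" >&2
            return 1
            ;;
        *)
            echo "unknown MPI flavor: $1" >&2
            return 2
            ;;
    esac
}
```

For example, maker_mpi_cmd intel ib0 prints the Intel MPI variant, and maker_mpi_cmd mvapich2 returns a non-zero status so a job script can stop early.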
From yuejiaxing at gmail.com Tue Jan 30 10:24:22 2018
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Tue, 30 Jan 2018 12:24:22 -0500
Subject: [maker-devel] Is it possible to extract the GFF3 file for the raw gene models predicted by est2genome and protein2genome?
Message-ID:

Dear Carson,

Yes, that's what I did actually. But it seems that I got far fewer gene models for est2genome and protein2genome this way than I would expect. I have turned on EVM for my maker run. Could this explain the low numbers of est2genome and protein2genome models that I got?

Thx!

Best,
Jia-Xing

Sent from my Nokia Lumia 920

From carsonhh at gmail.com Tue Jan 30 10:37:59 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 30 Jan 2018 10:37:59 -0700
Subject: [maker-devel] MPI selection
In-Reply-To: <4825e452-aab6-aa13-ebc7-3d3d1832cc60@genome.arizona.edu>
References: <77cfb864-4de1-a9af-aeea-9d3e7cf45ce5@genome.arizona.edu> <34C36A98-A87F-4B28-8E05-FCD412CFEBEA@gmail.com> <4825e452-aab6-aa13-ebc7-3d3d1832cc60@genome.arizona.edu>
Message-ID:

MAKER does not really move a lot of data with MPI; it's just moving around command lines and small variables.
So not getting full infiniband performance will not hurt you. I doubt you will see any issues using ib0. For MPI flavor, I get the best performance with Intel MPI, followed by OpenMPI.

Overall you will find that MAKER is IO bound as opposed to CPU or communications bound. So pointing it at your best-performing network-based storage will be the greatest performance factor (if you have Lustre storage, point it there, for example). Pull back on job size and count if other users have issues accessing the disk (too many jobs can bring NFS to its knees).

The one suggestion I have as far as job size is to keep job sizes under 200 CPU cores. Over that, you will get better performance by splitting up datasets and submitting multiple jobs. Also, MAKER keeps a log of its progress, so you can kill jobs or restart failed jobs, and they pick up right where they left off.

--Carson
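
The advice above about splitting up datasets can be sketched in shell as follows. This is only an illustration under stated assumptions: the function name split_fasta, the output naming (PREFIX.001.fa, PREFIX.002.fa, ...), and the round-robin assignment of sequences are hypothetical choices, not something MAKER prescribes. Round-robin balances the number of sequences per chunk, not their total length.

```shell
#!/bin/sh
# Hypothetical helper (not part of MAKER): distribute the records of a
# multi-FASTA file round-robin into N chunk files.
# Usage: split_fasta INPUT.fa N PREFIX
split_fasta() {
    awk -v n="$2" -v prefix="$3" '
        # A header line starts a new record: pick the next chunk file in turn.
        /^>/ { out = sprintf("%s.%03d.fa", prefix, (count++ % n) + 1) }
        # Write header and sequence lines to the current chunk file;
        # anything before the first header is skipped.
        out  { print > out }
    ' "$1"
}
```

Each chunk file can then be used as the genome for its own, smaller MAKER job, for example split_fasta genome.fa 4 chunk to prepare four jobs (genome.fa is a placeholder name).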